inet parse IP addresses optimisations #10925

Open
NelsonVides wants to merge 8 commits into erlang:master from NelsonVides:feat/inet_parse_binary_input

@NelsonVides
Contributor

There are several related but possibly independent changes:

  • The first, and likely uncontroversial, change is the first commit: it optimises, for example, inet_parse:ipv4strict_address/1, with a 4.2x time and a 6.8x memory improvement.
  • The next change is to accept binaries. Far too often I'm doing inet:parse_address(binary_to_list(Addr)), and so does Elixir almost universally. The input often comes from JSON, from a file, or from some HTTP server library, and it is a binary practically all the time. There's also uri_string, which takes binaries and likewise converts them to lists before parsing the IP address.
  • Related to the previous one is applying optimisations to the binary branch for IPv4, making it ~50% faster than going through the to_list conversion. I couldn't easily apply a similar optimisation to the IPv6 case, so that one remains implemented as simply converting to a list internally.
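
For context, the workaround mentioned in the second point can be sketched as a small wrapper. This is only an illustration: the module and function names are hypothetical and not part of OTP or of this PR, and it relies solely on the existing `inet:parse_address/1`, which takes a charlist.

```erlang
-module(parse_bin_demo).
-export([parse/1]).

%% Hypothetical helper, not part of OTP: accept a binary or a charlist
%% and delegate to inet:parse_address/1, converting binaries first.
parse(Addr) when is_binary(Addr) ->
    inet:parse_address(binary_to_list(Addr));
parse(Addr) when is_list(Addr) ->
    inet:parse_address(Addr).
```

The intermediate list built by `binary_to_list/1` is the allocation that the binary branch proposed here is meant to avoid.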

From benchee:

IPv4:
```
Name                                                                             ips        average  deviation         median         99th %
1 charlist -> ipv4strict_address/1 (optimised)                                6.70 M      149.21 ns   ±134.65%         125 ns         333 ns
0 charlist -> inet_parse_baseline:ipv4strict_address/1                        1.60 M      624.44 ns   ±654.66%         583 ns         917 ns

Comparison:
1 charlist -> ipv4strict_address/1 (optimised)                                6.70 M
0 charlist -> inet_parse_baseline:ipv4strict_address/1                        1.60 M - 4.18x slower +475.23 ns

Memory usage statistics:

Name                                                                      Memory usage
1 charlist -> ipv4strict_address/1 (optimised)                                 0.63 KB
0 charlist -> inet_parse_baseline:ipv4strict_address/1                         4.24 KB - 6.79x memory usage +3.62 KB
```

IPv6:
```
Name                                                                             ips        average  deviation         median         99th %
1 charlist -> inet_parse:ipv6strict_address/1                                 1.50 M      667.36 ns   ±717.89%         584 ns        1292 ns
0 charlist -> inet_parse_baseline:ipv6strict_address/1                        0.94 M     1063.79 ns   ±409.54%         958 ns        2250 ns

Comparison:
1 charlist -> inet_parse:ipv6strict_address/1                                 1.50 M
0 charlist -> inet_parse_baseline:ipv6strict_address/1                        0.94 M - 1.59x slower +396.44 ns

Memory usage statistics:

Name                                                                      Memory usage
1 charlist -> inet_parse:ipv6strict_address/1                                  3.91 KB
0 charlist -> inet_parse_baseline:ipv6strict_address/1                         7.28 KB - 1.86x memory usage +3.37 KB
```
@github-actions
Contributor

github-actions bot commented Mar 26, 2026

CT Test Results

    4 files    175 suites   2h 41m 46s ⏱️
4 219 tests 3 752 ✅ 465 💤 2 ❌
5 293 runs  4 682 ✅ 609 💤 2 ❌

For more details on these failures, see this check.

Results for commit 970fdeb.

@NelsonVides NelsonVides force-pushed the feat/inet_parse_binary_input branch 4 times, most recently from 1718c73 to a38ad67 Compare March 26, 2026 10:58
Comment on lines +576 to +578
%% Single-pass charlist parser for strict IPv4 addresses.
%% Four functions, one per octet — no packed accumulator, no dot
%% counter, no bit unpacking.
Contributor

Hm, 4 functions of which 3 do almost exactly the same thing and the last one only slightly different, and yet another 4 similar ones for binaries... this looks pretty verbose and repetitive 😅

Contributor Author

Yeap, totally, just more performant as it is all tail-recursive and accumulation happens in parameters, which go to x-registers 🤷🏽‍♂️
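
To make the shape under discussion concrete, here is a minimal sketch of a single-pass charlist parser with one function per octet. It is not the code from this PR: the module name is hypothetical, and it assumes plain decimal octets only (leading zeros are simply read as decimal), which may differ from what inet_parse accepts.

```erlang
-module(ipv4_sketch).
-export([parse/1]).

%% Every octet must start with a digit; the running value of each
%% octet is carried as a function argument, never in a data structure.
parse([C | Cs]) when C >= $0, C =< $9 -> a(Cs, C - $0);
parse(_) -> {error, einval}.

%% First octet: accumulate digits in A until the dot.
a([C | Cs], A) when C >= $0, C =< $9, A * 10 + (C - $0) =< 255 ->
    a(Cs, A * 10 + (C - $0));
a([$., C | Cs], A) when C >= $0, C =< $9 -> b(Cs, A, C - $0);
a(_, _) -> {error, einval}.

%% Second octet.
b([C | Cs], A, B) when C >= $0, C =< $9, B * 10 + (C - $0) =< 255 ->
    b(Cs, A, B * 10 + (C - $0));
b([$., C | Cs], A, B) when C >= $0, C =< $9 -> c(Cs, A, B, C - $0);
b(_, _, _) -> {error, einval}.

%% Third octet.
c([X | Cs], A, B, C) when X >= $0, X =< $9, C * 10 + (X - $0) =< 255 ->
    c(Cs, A, B, C * 10 + (X - $0));
c([$., X | Cs], A, B, C) when X >= $0, X =< $9 -> d(Cs, A, B, C, X - $0);
c(_, _, _, _) -> {error, einval}.

%% Fourth octet: the only one that may end the input.
d([X | Cs], A, B, C, D) when X >= $0, X =< $9, D * 10 + (X - $0) =< 255 ->
    d(Cs, A, B, C, D * 10 + (X - $0));
d([], A, B, C, D) -> {ok, {A, B, C, D}};
d(_, _, _, _, _) -> {error, einval}.
```

Every call is a tail call and the result tuple is built once at the end, which is the property the reply above refers to; the repetition across a/2 to d/5 is equally visible.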

Contributor

You're taking this to the extremes I see 😜 How about putting some of it in macros then, instead of literally writing it over and over again? Inlined functions may be an alternative, too, not sure.
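
One modest way macros could trim the repetition is to factor out the guard tests and the accumulation step. The sketch below does that for binary input; it is an assumption about what such a refactoring might look like, not the change that was eventually pushed, and the module and macro names are hypothetical.

```erlang
-module(ipv4_bin_sketch).
-export([parse/1]).

%% Hypothetical guard-safe macros shared by all four octet functions.
-define(IS_DIGIT(C), (C >= $0 andalso C =< $9)).
-define(NEXT(Acc, C), (Acc * 10 + (C - $0))).

parse(<<C, Rest/binary>>) when ?IS_DIGIT(C) -> a(Rest, C - $0);
parse(_) -> {error, einval}.

a(<<C, Rest/binary>>, A) when ?IS_DIGIT(C), ?NEXT(A, C) =< 255 ->
    a(Rest, ?NEXT(A, C));
a(<<$., C, Rest/binary>>, A) when ?IS_DIGIT(C) -> b(Rest, A, C - $0);
a(_, _) -> {error, einval}.

b(<<C, Rest/binary>>, A, B) when ?IS_DIGIT(C), ?NEXT(B, C) =< 255 ->
    b(Rest, A, ?NEXT(B, C));
b(<<$., C, Rest/binary>>, A, B) when ?IS_DIGIT(C) -> c(Rest, A, B, C - $0);
b(_, _, _) -> {error, einval}.

c(<<X, Rest/binary>>, A, B, C) when ?IS_DIGIT(X), ?NEXT(C, X) =< 255 ->
    c(Rest, A, B, ?NEXT(C, X));
c(<<$., X, Rest/binary>>, A, B, C) when ?IS_DIGIT(X) ->
    d(Rest, A, B, C, X - $0);
c(_, _, _, _) -> {error, einval}.

%% Last octet: only the empty binary terminates successfully.
d(<<X, Rest/binary>>, A, B, C, D) when ?IS_DIGIT(X), ?NEXT(D, X) =< 255 ->
    d(Rest, A, B, C, ?NEXT(D, X));
d(<<>>, A, B, C, D) -> {ok, {A, B, C, D}};
d(_, _, _, _, _) -> {error, einval}.
```

The four functions remain, but each clause shrinks to its essentials; generating the functions themselves from a macro is a further step this sketch does not attempt.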

Contributor Author

Couldn't figure out how to generate that with macros easily, do you have any suggestion? 🤔

Contributor

I'll take a look on monday 👍

Contributor

@NelsonVides I made a PR on your branch behind this PR 😉

Contributor Author

Thank you! That looks absolutely great, pushed here too! 🎉

Contributor Author

For the record, adding that here too, it is also even faster: NelsonVides#2 (review) 🎉

Extend `inet_parse` address parsers to accept UTF-8 binaries via
`binary_to_list/1` delegation. Extend inet_SUITE so existing parse
vectors also run on `list_to_binary` equivalents, add combined
parse_address/parse_strict_address smoke tests (including /2), and
assert ntoa round-trips with binary input.
From benchee:
```
Name                                                                             ips        average  deviation         median         99th %
1 charlist -> ipv4strict_address/1 (optimised)                                5.47 M      182.95 ns  ±1875.09%         167 ns         375 ns
3 binary -> inet_parse:ipv4strict_address/1 (optimised)                       5.03 M      198.75 ns  ±2008.92%         167 ns         333 ns
2 binary -> binary_to_list -> ipv4strict_address/1                            3.30 M      302.85 ns  ±1927.57%         250 ns         666 ns

Comparison:
1 charlist -> ipv4strict_address/1 (optimised)                                5.47 M
3 binary -> inet_parse:ipv4strict_address/1 (optimised)                       5.03 M - 1.09x slower +15.80 ns
2 binary -> binary_to_list -> ipv4strict_address/1                            3.30 M - 1.66x slower +119.91 ns

Memory usage statistics:

Name                                                                      Memory usage
1 charlist -> ipv4strict_address/1 (optimised)                                 0.63 KB
3 binary -> inet_parse:ipv4strict_address/1 (optimised)                        1.02 KB - 1.63x memory usage +0.39 KB
2 binary -> binary_to_list -> ipv4strict_address/1                             2.05 KB - 3.27x memory usage +1.42 KB
```

Note that I couldn't optimise the IPv6 path similarly and therefore just
decided to leave it as is and not complicate code too much.
@NelsonVides NelsonVides force-pushed the feat/inet_parse_binary_input branch from a38ad67 to 385efe5 Compare March 26, 2026 13:36
@rickard-green rickard-green added the team:PS Assigned to OTP team PS label Mar 30, 2026
@NelsonVides
Contributor Author

It would be good to get the first point merged into the next release, be it a patch or OTP29, as it is simply a substantial performance improvement. The other two might need more discussion; I'd like to have them in OTP29, but if the merge window is already too tight I'm happy to split the PR and get just the first change in for now :)

@NelsonVides NelsonVides force-pushed the feat/inet_parse_binary_input branch from d32d884 to 45736cc Compare April 13, 2026 13:31
@juhlig
Contributor

juhlig commented Apr 15, 2026

On a side note @RaimoNiskanen @IngelaAndin...

I noticed that the (relaxed) parse_ipv4_address (maybe also parse_ipv6_address, I didn't look into that) parses in surprising ways. That is, it takes "relaxed" pretty far. This is mentioned in the documentation only superficially and matter-of-factly.

In the given string, octets are interpreted as decimal, but also as octal when starting with a 0, or as hex when starting with 0x. These can also be mixed, so something like "12.023.0x56.7" is possible and results in {12,19,86,7}. The "leading 0 means octal" reading is especially problematic here, as it is ambiguous and could also be understood as "the octet is in decimal, with superfluous leading 0s". This is compounded by the fact that after the leading 0 or 0x indicating octal or hex, the actual number can have further leading 0s, whereas decimal numbers cannot, as any leading 0 there would lead to interpretation as octal.
Reading up on this, AFAICS there is no actual standard about how leading 0s are supposed to be interpreted, and the general recommendation is to flat-out reject it (which is what parse_ipv4strict_address does, but more on that later).
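
The prefix rule described above can be sketched for a single field as follows. This only mimics the described behaviour and is not OTP's implementation; the module and function names are hypothetical, and range checks against 255 are deliberately left out.

```erlang
-module(relaxed_field).
-export([field/1]).

%% Hypothetical helper: pick the radix from the prefix of one field.
%% "0x" means hex, any other leading "0" means octal, else decimal.
field("0x" ++ Hex) when Hex =/= [] -> to_int(Hex, 16);
field([$0 | Oct]) when Oct =/= [] -> to_int(Oct, 8);
field(Dec) -> to_int(Dec, 10).

%% list_to_integer/2 raises badarg on digits outside the radix.
to_int(Str, Base) ->
    try {ok, list_to_integer(Str, Base)}
    catch error:badarg -> {error, einval}
    end.
```

Under this rule `field("017")` gives `{ok,15}` rather than 17, and `field("08")` is an error because 8 is not an octal digit, which is exactly the kind of surprise a reader expecting "decimal with padding" would run into.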

More confusion is added by the fact that trailing (but not leading) octets can be conflated into one number (which is also subject to the 0/0x prefixing) as long as it does not exceed the range left by the leading octets (if any). All of the following (and their 0-prefixed octal and 0x-prefixed, possibly mixed, equivalents) are accepted as valid and are parsed to {12, 34, 56, 78}:

  • "203569230"
  • "12.2242638"
  • "12.34.14414"
  • "12.34.56.78"
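
The arithmetic behind these four spellings can be sketched like this: the last part fills however many bytes the leading parts leave free. The helper below takes already-parsed integers and is a hypothetical illustration of the described behaviour, not the inet_parse code.

```erlang
-module(relaxed_join).
-export([join/1]).

%% Parts is a list of 1 to 4 non-negative integers. Leading parts are
%% single octets; the last part is spread over the remaining bytes.
join(Parts) when length(Parts) >= 1, length(Parts) =< 4 ->
    {Lead, [Last]} = lists:split(length(Parts) - 1, Parts),
    Bits = 8 * (4 - length(Lead)),
    LeadOk = lists:all(fun(P) -> P >= 0 andalso P =< 255 end, Lead),
    case LeadOk andalso Last >= 0 andalso Last < (1 bsl Bits) of
        true ->
            %% Big-endian encoding of Last yields the trailing octets.
            Tail = binary_to_list(<<Last:Bits>>),
            {ok, list_to_tuple(Lead ++ Tail)};
        false ->
            {error, einval}
    end;
join(_) ->
    {error, einval}.
```

For instance, 14414 is 56 * 256 + 78, so `join([12, 34, 14414])` and `join([12, 34, 56, 78])` produce the same tuple.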

I think this opens up a moderately wide field of mistakes, especially since the documentation does not go into any of this. And while there are of course the strict variants, I argue that without further information, a user would intuitively reach for the relaxed ("my input may be kinda off, but the parser will do the right thing") variants, which are also less verbose.

P.S.: What the "relaxed" in the documentation would suggest to me is something like "my input may be kinda sloppy, some spaces, (leading zeros), trailing dots - but the parser will set it (that!) right", while what the actual implementation does is interpret the input in non-obvious decidedly-different ways 🤷‍♂️
