What is the IPv4address ABNF rule for?


(Philip Potter) #1

This might seem like a question with an obvious answer, but hear me out:

It so happens that any valid IPv4 address will already parse successfully as a reg-name. So why do we need a separate IPv4address rule? What benefit does it give us? (Alternatively: can we remove it and simplify the ABNF, possibly with a comment explaining where IPv4 addresses are handled?)

(I discovered this because I’m slowly unit testing my way through implementing remote import parsing in dhall-golang, and was surprised to discover my test for http://127.0.0.1/foo passed without me having to implement it…)
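To make the overlap concrete, here is a minimal sketch that checks hosts against the reg-name rule as RFC 3986 defines it (`*( unreserved / pct-encoded / sub-delims )`). The `regName` pattern and the `isRegName` helper are names made up for this illustration and are not taken from dhall-golang; the regular expression is a hand translation of the ABNF and assumes the Dhall grammar's reg-name is the same as the RFC's.

```go
package main

import (
	"fmt"
	"regexp"
)

// regName is a hand-written translation of RFC 3986's
//   reg-name = *( unreserved / pct-encoded / sub-delims )
// (hypothetical name, for illustration only).
var regName = regexp.MustCompile(`^(?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})*$`)

// isRegName reports whether host matches the reg-name rule.
func isRegName(host string) bool {
	return regName.MatchString(host)
}

func main() {
	// Digits and dots are all "unreserved" characters, so a dotted quad
	// is accepted by reg-name without any IPv4-specific rule.
	for _, h := range []string{"127.0.0.1", "example.com", "[::1]"} {
		fmt.Printf("%-12s reg-name=%v\n", h, isRegName(h))
	}
}
```

Since every character of a dotted quad is in the unreserved set, a parser that only implements reg-name will accept `127.0.0.1` as a host, which matches the behaviour described above.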


(Philip Potter) #2

Huh, after digging a bit deeper, it seems this is inherited from RFC 3986 (section 3.2.2), which acknowledges the ambiguity:

The syntax rule for host is ambiguous because it does not completely
distinguish between an IPv4address and a reg-name. In order to
disambiguate the syntax, we apply the “first-match-wins” algorithm:
If host matches the rule for IPv4address, then it should be
considered an IPv4 address literal and not a reg-name.
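The first-match-wins rule quoted above can be sketched roughly as follows. This is an illustration, not how any Dhall implementation classifies hosts: `classifyHost`, `ipv4Address`, `regName`, and the returned labels are all hypothetical names, and the patterns are hand translations of the RFC 3986 `IPv4address`, `dec-octet`, and `reg-name` rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// decOctet translates RFC 3986's dec-octet rule (0-255, no leading zeros).
const decOctet = `(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])`

// ipv4Address translates IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet.
var ipv4Address = regexp.MustCompile(`^(?:` + decOctet + `\.){3}` + decOctet + `$`)

// regName translates reg-name = *( unreserved / pct-encoded / sub-delims ).
var regName = regexp.MustCompile(`^(?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})*$`)

// classifyHost applies first-match-wins: try IPv4address before reg-name.
// The string labels are made up for this sketch.
func classifyHost(host string) string {
	switch {
	case ipv4Address.MatchString(host):
		return "IPv4address"
	case regName.MatchString(host):
		return "reg-name"
	default:
		return "other"
	}
}

func main() {
	for _, h := range []string{"127.0.0.1", "256.0.0.1", "example.com"} {
		fmt.Printf("%-12s %s\n", h, classifyHost(h))
	}
}
```

Note that a host such as `256.0.0.1` fails the `IPv4address` rule (256 is not a valid dec-octet) but still matches reg-name, so under this scheme it would be treated as a name to resolve rather than rejected.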

In the context of RFC 3986, the important distinction is that a reg-name needs to be resolved to an address, but an IPv4address doesn’t. For Dhall implementations, though, I'd mostly hope we’re delegating the actual name lookup to a third-party library that interprets and fetches the URL, so the Dhall syntax itself doesn’t need this complexity.

That said, I can see an argument to leave the ABNF as similar to the original RFC 3986 BNF as possible…


(Gabriel Gonzalez) #3

See this related issue: