Currently, the standard requires implementations to canonicalize imports at import resolution time. This means that, for example, a raw URL import of https://example.com/foo/./bar should result in a web request to https://example.com/foo/bar, and this behaviour is checked by the tests/import/success/unit/asLocation/RemoteCanonicalize* tests.
I would like to change this behaviour: I think we should only canonicalize when chaining imports, not for all import resolution. I have two reasons for this:
- it is much easier for me to implement in dhall-golang
- it feels more correct
Easier to implement
Currently, dhall-golang only canonicalizes at chaining time, not at resolution time. This is easier, because Go’s url library only provides functionality to do what we call “canonicalization” in its ResolveReference method. There is no URL.Canonicalize() method.
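To illustrate the difference, here is a minimal sketch using Go's net/url package. The helper names parseOnly and chain are hypothetical (they are not dhall-golang functions), and the parent URL and relative reference are made-up examples. It shows that parsing alone leaves dot segments untouched, while ResolveReference removes them as part of resolving a relative reference:

```go
package main

import (
	"fmt"
	"net/url"
)

// mustParse is a hypothetical helper that panics on malformed input,
// which keeps the example short.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

// parseOnly parses a URL and serializes it again, with no chaining step.
// url.Parse does not remove "." or ".." segments from the path.
func parseOnly(raw string) string {
	return mustParse(raw).String()
}

// chain resolves the relative reference child against the absolute URL
// parent. ResolveReference removes dot segments while merging the paths.
func chain(parent, child string) string {
	return mustParse(parent).ResolveReference(mustParse(child)).String()
}

func main() {
	fmt.Println(parseOnly("https://example.com/foo/./bar"))
	// prints: https://example.com/foo/./bar

	fmt.Println(chain("https://example.com/foo/baz.dhall", "./bar/../qux.dhall"))
	// prints: https://example.com/foo/qux.dhall
}
```

So an implementation built on net/url gets canonicalization for free at chaining time, but would have to write its own dot-segment removal to canonicalize a standalone URL at resolution time.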
More correct
Now, I’m in danger of motivated reasoning here (see previous section), but I think it is more correct to only canonicalize when chaining, not at all times when resolving.
I’d argue that my proposal is more correct from two points of view:
- it’d be unusual for a user to input a URL containing dotted segments in their dhall source, but I’d expect them to know what they’re doing and for this to be respected
- RFC3986, which defines the process of removing dot segments in URLs, only does so in the context of resolving a relative reference (i.e. what we call “import chaining”). (This explains why the Go standard library only offers it via ResolveReference, as described above.)
A minor related point is that our import chaining doesn’t quite match RFC3986 reference resolution, and the test RemoteCanonicalize4A.dhall demonstrates how: we don’t strip leading .. segments from URLs, but we should.
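For reference, here is a small sketch of what RFC3986 reference resolution does with surplus leading .. segments, again using Go's ResolveReference. The base URL and references are the examples from RFC3986 section 5.4, not the contents of RemoteCanonicalize4A.dhall, and resolveRef is a hypothetical helper name. It shows RFC3986 behaviour as implemented by Go, not what the Dhall standard currently specifies:

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveRef resolves ref against base following RFC 3986 section 5.2,
// as implemented by (*url.URL).ResolveReference.
func resolveRef(base, ref string) string {
	b, err := url.Parse(base)
	if err != nil {
		panic(err)
	}
	r, err := url.Parse(ref)
	if err != nil {
		panic(err)
	}
	return b.ResolveReference(r).String()
}

func main() {
	const base = "http://a/b/c/d;p?q" // base URI used in RFC 3986 section 5.4

	// A ".." that stays within the path removes one segment.
	fmt.Println(resolveRef(base, "../g"))
	// prints: http://a/b/g

	// ".." segments that would climb above the root are dropped.
	fmt.Println(resolveRef(base, "../../../g"))
	// prints: http://a/g
}
```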
Thoughts?