Platform-specific import types

A discussion we’re currently having around our JVM implementation https://github.com/travisbrown/dhallj is how to handle imports from the JVM classpath. When running on the JVM you effectively have two parallel filesystems, the local filesystem and the classpath, and imports are potentially ambiguous between the two.

We’ve come up with a few options for solving this and would like to seek feedback here.

Option 1: a (very small) extension to the spec

We would allow the user to write let import = cp:/absolute/path/to/import in ... This closely mirrors the syntax of env imports and requires the user to specify an absolute path to avoid classloader-related ambiguity. I believe the semantics would be much the same as those of an absolute-path local import (including referential sanity), just with a different mechanism for resolving the content of the file.

Option 2

No extension to the spec. We would require the user to pass some kind of discriminating function prefix : Path -> FileSystem. The obvious advantage is that it doesn’t require extending the spec. The disadvantages are the user experience and the fact that it doesn’t really help you if the same prefix exists on both the local and classpath filesystems (unlikely though this may be!).
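
For illustration, here is a minimal Python sketch of such a discriminating function (the FileSystem enum and the /com/acme prefix are purely hypothetical, not part of any implementation):

    from enum import Enum
    from pathlib import PurePosixPath

    class FileSystem(Enum):
        LOCAL = "local"
        CLASSPATH = "classpath"

    def prefix(path: PurePosixPath) -> FileSystem:
        # Route anything under an agreed application prefix to the
        # classpath; everything else goes to the local filesystem.
        if str(path).startswith("/com/acme/"):
            return FileSystem.CLASSPATH
        return FileSystem.LOCAL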

Option 3

Something else we haven’t thought of! Suggestions?

My gut feeling is that option 1 is better here. It doesn’t sit well with me having the same syntax for two different kinds of import, and then using some hack later on to actually decide what kind of import it really is.

Note that you don’t need to change the spec to allow this: I think a conforming implementation can have extensions, such as this one, which would otherwise be a syntax error. But it doesn’t make sense for non-JVM implementations to support classpath imports.

I’d also lean towards Option 1 (except with the minor suggestion to use classpath for the scheme). The main reason is that code written to use the classpath import can be made compatible with non-JVM implementations by resolving the imports first (e.g. by running the code through dhall resolve).

Another thing that will help is to treat these imports as “local” (for the purpose of the referential sanity check); that way people won’t use them in shared packages that are imported remotely. My mental model is “local imports = application code” and “remote imports = library code”, so classifying classpath imports as local will help restrict them to application code, which will limit potential compatibility issues.
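
For illustration, the referential sanity classification reduces to a check along these lines (a minimal sketch; the names are hypothetical):

    def referentially_sane(parent_is_remote: bool, child_is_remote: bool) -> bool:
        # A remote expression may only import other remote expressions;
        # treating classpath: imports as local means remotely imported
        # packages can never depend on them.
        return child_is_remote or not parent_is_remote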

For reference, there is some prior art in Spring for classpath: as a URI scheme.

Many thanks for your input @Gabriel439 and @philandstuff! classpath:/... it is 🙂

Shouldn’t we change the spec so that an implementation that doesn’t support a scheme doesn’t fail but simply ignores the import?

Something like

let MyLib = classpath:/com/acme/MyApp/MyLib.dhall
            ? /usr/share/lib/dhall/MyApp/MyLib.dhall
            ? http://acme.com/MyApp/MyLib.dhall

should be possible, and parsable. (I can’t think of a use case, though.)
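
For what it’s worth, a resolver could treat an unsupported scheme as an ordinary resolution failure, so the ? alternatives above still get a chance. A minimal Python sketch (all names are hypothetical):

    class UnsupportedScheme(Exception):
        pass

    def resolve_alternatives(candidates, resolvers):
        # Try each `?`-separated alternative in order; an unsupported
        # scheme is just another failure, not a hard error.
        error = None
        for scheme, target in candidates:
            resolver = resolvers.get(scheme)
            if resolver is None:
                error = UnsupportedScheme(scheme)
                continue
            try:
                return resolver(target)
            except Exception as exc:
                error = exc
        if error is None:
            error = RuntimeError("no alternatives given")
        raise error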

I’m implementing a custom import scheme for pydhall. One issue is the CBOR encoding of the scheme.

We don’t want a custom scheme encoding to clash with dhall standard schemes.

Option 1

Add a requirement in the spec that any implementation-specific scheme must be encoded as a number greater than 1000.

Pros:

  • trivial to implement

Cons:

  • what if pydhall, which already uses 1001 for pydhall+classpath:, wants to support dhallj’s classpath:, which happens to be encoded as 1001 too?

Option 2

Encode custom schemes as CBOR strings

Pros:

  • no name clashes (especially if we enforce namespacing of custom schemes, as in pydhall+)

Cons:

  • It’s easy in Python to use a dynamic type for the CBOR-encoded scheme (an integer for standard schemes, a string for a custom scheme). It may be more complex for statically typed languages.

Option 3

Use a standardized (string -> integer) hash. This combines Options 1 and 2. Note that we don’t even need to compute the hash at runtime; it can be hard-coded in implementations.
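
For illustration, assuming (hypothetically) the spec standardized on the first 8 bytes of SHA-256, offset to stay clear of the standard scheme numbers, the hash could look like:

    import hashlib

    def scheme_tag(scheme: str) -> int:
        # Hypothetical standardized hash: the offset keeps custom tags
        # from colliding with the small standard scheme numbers.
        digest = hashlib.sha256(scheme.encode("utf-8")).digest()
        return 1000 + int.from_bytes(digest[:8], "big")

    # Implementations could then hard-code the resulting constants:
    PYDHALL_CLASSPATH = scheme_tag("pydhall+classpath")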

@TimWSpence how do you encode classpath imports?

As a side note, shouldn’t we require that custom schemes be namespaced by <implem_code_name>+, as in pydhall+classpath?

@lisael our implementation is fairly ad hoc for the moment and comes with a warning that the semantics are not finalized. For now, I’ve encoded classpath imports as 23, since (from what I remember) that is the highest integer that fits in a CBOR tiny field, i.e. in the single initial byte. Obviously, in theory that could cause problems as Dhall expands, but it would have to add a lot of import types to do so.
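
For reference, the size difference is easy to check with a CBOR library such as Python’s cbor2: integers 0 through 23 fit in the single initial byte, while 24 and above need at least one extra byte.

    import cbor2

    assert len(cbor2.dumps(23)) == 1  # fits in the initial byte
    assert len(cbor2.dumps(24)) == 2  # needs an additional byte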

Compatibility between extensions for different languages is a very good question. I have no idea right now what the best solution is!

Option 4

Assign a new binary label to custom imports so that they are not decoded as standard imports, and store the scheme as a string. They must follow the same chaining and caching rules as regular imports.

cbor_encode(i: custom_import) = [29, hash(i), import_type(i), "scheme", ...]
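
A Python sketch of that encoding, taking the label 29 and the field order straight from the pseudocode above (none of this is standardized):

    import cbor2

    def encode_custom_import(integrity, import_type, scheme: str, *parts) -> bytes:
        # Label 29 keeps custom imports distinct from standard imports
        # (label 24); the scheme travels as a plain string.
        return cbor2.dumps([29, integrity, import_type, scheme, *parts])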

@lisael: I believe all of the import schemes are valid URIs, so what if we only require that a custom import is a valid URI? That would allow us to preserve more structure in the encoding than just a string.

@Gabriel439 if I understand correctly, your suggested encoding is

[24, null, 0, 8, "my+scheme://a/path"]

or (more consistent with http imports, where the path is broken down into components)

[24, null, 0, 8, "my+scheme", "a", "path"]

It’s clean, but it changes the decoding logic. With standard imports we rely on the first and fourth elements to find the implementation of an import (Python classes, in pydhall’s case). It’s not a big deal to add another dispatching routine based on the scheme element.
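
Roughly, assuming the first encoding above (and treating scheme number 8 for custom imports as a placeholder, not anything standardized), that extra routine might look like:

    import cbor2

    STANDARD_RESOLVERS = {}  # keyed by the standard scheme numbers
    CUSTOM_RESOLVERS = {}    # keyed by scheme name, e.g. "my+scheme"

    def dispatch_import(encoded: bytes):
        label, integrity, mode, scheme, *parts = cbor2.loads(encoded)
        assert label == 24, "not an import expression"
        if scheme == 8:  # placeholder tag for custom, URI-shaped imports
            name = parts[0].split("://", 1)[0]
            return CUSTOM_RESOLVERS[name]
        return STANDARD_RESOLVERS[scheme]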

I’ll open a specification issue for this.

@lisael: Yeah, the idea was that existing standard import schemes would still be encoded in the same way, and custom schemes would use a different, more weakly-structured encoding.