Supporting transitive imports from private repositories?


#1

This is a continuation on Deep imports and environment variables and Import chaining headers

tl;dr let’s say I have two private github repositories, timbertson/base and timbertson/app. Base is… some shared stuff, and app builds upon it.

When I import timbertson/base/package.dhall from timbertson/app/package.dhall, I will need to pass auth headers including my $GITHUB_AUTH environment variable.

That all works fine when I’m importing locally, but comes unstuck when I want to import app/package.dhall from github directly - I can’t evaluate the header expression since it can’t access any environment variables.

One fallback which works in both local and remote contexts is:

-- base-impl.dhall
  (https://raw.githubusercontent.com/timbertson/base/COMMIT_ID/package.dhall using ./github-headers.dhall)
  ? ../../../base/COMMIT_ID/package.dhall

and then importing that (to freeze it):

-- base.dhall
./base-impl.dhall sha256:(...)

But it’s got quite a few usability issues:

  • I can’t test the (local) path is right without actually pushing it and importing the result
  • I need to repeat the COMMIT_ID in both parts, and keep the paths in sync
  • it confuses tooling (e.g. I have a script to bump a commit for github imports, and it’ll only understand the first import)
  • I need two different files for a single import
  • the error messages for fallbacks tend to be more confusing, since one of the two errors is “expected” but users aren’t always aware of which one or why. And even when they are it’s still a screenful of error message to wade through.

So I’m wondering if there’s a modification to the the “same origin” policy which could support a single import working in both contexts.

One idea would be that for a remote file (imported from the web), any transitive imports with the same origin implicitly use the same headers (even if it’s not a relative import). I think it would also need to ignore explicit headers in the case where a remote import happens to contain a fully-qualified import for the same origin, which is definitely a little surprising.

I imagine headers are only really used for auth in practice though, so this slightly surprising behaviour is likely to be the most useful? I don’t believe it weakens dhall’s security posture, since it only affects behaviour which is already within the same origin (and can already use relative imports).


#2

I was a little confused by the 2nd-to-last paragraph. This part I agree with:

One idea would be that for a remote file (imported from the web), any transitive imports with the same origin implicitly use the same headers (even if it’s not a relative import).

Specifically, I agree with the idea that we should permit forwarding headers even for non-relative imports as long as we know it shares the same origin (i.e… by comparing the domain).

However, I didn’t understand this part because it sounded like it was saying the opposite:

I think it would also need to ignore explicit headers in the case where a remote import happens to contain a fully-qualified import for the same origin, which is definitely a little surprising.


#3

Yeah that is a bit confusing. What I meant is, say you have this file:

-- package.dhall
https://example.com/dependency.dhall using ./headers.dhall

Generally, the value of ./headers.dhall is used for the request for dependency.dhall. And that would remain true when this file is imported from the filesystem, or from any other origin.

But I’m proposing that in the case where the current file was imported from the same origin (https://example.com), the ./headers.dhall expression would not be evaluated at all. It would be ignored, and instead the headers that were used to import this file (https://example.com/package.dhall) would be used to fetch dependency.dhall.

The rationale being that ./headers.dhall references $GITHUB_TOKEN, and that won’t evaluate if it’s a remote import.

i.e. it’s not enough to say “propagate the same headers for a request to the same origin”. We also need to say “don’t even attempt to evaluate a using expression for an import with the same origin (because it will likely fail)”.


#4

This makes me wonder if we’re operating at the wrong abstraction layer. It seems like what might work better is some Dhall mapping from domain names to custom headers, rather than specifying that information in-band within the configuration files. In other words, something like a Dhall analog of a ~/.netrc file:

toMap
  { `raw.githubusercontent.com` = ./github-headers.dhall
  }

I believe @philandstuff proposed an idea like this a while back as a way to simplify the standardization logic around custom headers.


#5

Yeah, I agree with that - the information really relates to the host far more than any particular import.

If we’re spitballing, I think the dotfile makes sense but it would also be great to be able to override this in an envvar. We have workflows that require shuffling auth tokens over e.g. docker invocations, and mounting a file onto ~/.config/dhall/netrc or something is definitely more awkward than being able to export a DHALL_NETRC env.


#6

Yeah, I might suggest something like this pseudo-Dhall code for resolving the custom headers file:

  env:DHALL_HEADERS
? "${env:XDG_CONFIG_HOME as Text}/dhall/headers.dhall"  -- Not valid Dhall
? ~/.config/dhall/headers.dhall"