Cookbook request - simple Dhall pinning with nix-shell

ari-becker · February 3, 2020, 8:53am

I’m currently trying to improve the portability of some of my scripts that use dhall-haskell, which uses Nix build expressions. nix-shell seems like a great way to do so, but as a Nix neophyte, I’m starting to struggle with the transition from “nixpkgs is out of date” -> “OK, let me just use the Nix expressions in the dhall-haskell repository” -> “… well how do I do that?”

Consider a starting point:

script.sh:

#! /usr/bin/env bash

dhall version

And an initial attempt to transition to using nix-shell:

script.sh:

#! /usr/bin/env nix-shell
#! nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs-channels/archive/nixos-19.09.tar.gz
#! nix-shell deps.nix -i bash

dhall version

deps.nix:

with import <nixpkgs> {};
let pinned-dhall = dhall;
in runCommand "my-dhall-script" {
  buildInputs = [
    bash
    pinned-dhall
  ];
} ""

But really, instead of pinned-dhall = dhall, I should have some kind of derivation, e.g. pinned-dhall = mkDerivation... with the arguments to mkDerivation pinned to a specific tag of dhall-haskell somewhere? Is there an easy/simple way to use the release binaries in a platform-independent way (such that the script will work for both OS X-using colleagues and Linux-using colleagues)?

Gabriel439 · February 7, 2020, 5:28am

@ari-becker: The trick here is to use “import from derivation” (i.e. import Nix code from the output of a build product). The dhall-haskell repository comes with Nix derivations that you can reuse to build dhall, so once you fetch the desired revision of dhall-haskell with Nix, you can have Nix build the Nix code included within the repository, like this:

let
  dhall-haskell-src = builtins.fetchTarball {
    url    = "https://github.com/dhall-lang/dhall-haskell/archive/29f6ab8b9c3b32634f4d24c5be2fdf22184b25fc.tar.gz";
    sha256 = "1ik4r5n3b1wv7ixy89jh6z0qbjfrkd3bnkbny0g4pxahhczhqn7d";
  };

  dhall-haskell = import "${dhall-haskell-src}";  # This is the "import from derivation"

in
  dhall-haskell.dhall

To update the desired revision of dhall-haskell, change the url field passed to builtins.fetchTarball. You can obtain the corresponding sha256 field by running:

$ nix-prefetch-url --unpack "https://github.com/dhall-lang/dhall-haskell/archive/${REVISION}.tar.gz"
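
As a minimal sketch of that update step, the URL construction and the prefetch can be wrapped in two small shell functions. The function names dhall_haskell_url and prefetch_dhall_haskell are hypothetical helpers introduced here for illustration; only nix-prefetch-url and the archive URL pattern come from the post above.

```shell
# Hypothetical helpers (not part of dhall-haskell or Nix).

# Print the GitHub archive URL for a dhall-haskell revision (commit hash or tag).
dhall_haskell_url() {
  printf 'https://github.com/dhall-lang/dhall-haskell/archive/%s.tar.gz\n' "$1"
}

# Print the sha256 that belongs next to the updated url in the Nix expression.
# Requires nix-prefetch-url on PATH.
prefetch_dhall_haskell() {
  nix-prefetch-url --unpack "$(dhall_haskell_url "$1")"
}
```

For example, `prefetch_dhall_haskell 29f6ab8b9c3b32634f4d24c5be2fdf22184b25fc` would fetch the revision used above and print its hash.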

ari-becker · February 7, 2020, 12:32pm

Fantastic! That was a big help; it let me write:

./dhall.sh:

#! /usr/bin/env nix-shell
#! nix-shell ./shell.nix -i bash

dhall version

./shell.nix:

let
  nixpkgs = import (
    let 
      version = "19.09";
    in builtins.fetchTarball {
      name   = "nixpkgs-${version}";
      url    = "https://github.com/NixOS/nixpkgs/archive/${version}.tar.gz";
      sha256 = "0mhqhq21y5vrr1f30qd2bvydv4bbbslvyzclhw0kdxmkgg3z4c92";
    }
  ) {};

  dhall-haskell = import ( 
    let
      version = "1.29.0";
    in builtins.fetchTarball {
      name   = "dhall-haskell-${version}";
      url    = "https://github.com/dhall-lang/dhall-haskell/archive/${version}.tar.gz";
      sha256 = "1gsmzvnzv663jd72mvjpj8r76y1c8fz2wf99q2qcvnqq56qxjili";
    }
  );

in nixpkgs.mkShell {
  buildInputs = [
    dhall-haskell.dhall
  ];
}

But when I try to run the script, I get the following error while Nix is resolving the inputs for dhall:

Config file path source is default config file.
Config file /build/src/.cabal/config not found.
Writing default configuration to /build/src/.cabal/config
Warning: Cannot run preprocessors. Run 'configure' command first.
Building source dist for dhall-1.29.0...
./dhall-lang/tests/alpha-normalization/success/unit/: getDirectoryContents:openDirStream: does not exist (No such file or directory)
builder for '/nix/store/3p87j5j4ywfsshbyh4zx70k5i4pq3d38-dhall-sdist.drv' failed with exit code 1

I’m at a complete loss as to how to fix this, given that a) I imagine that the Nix build for 1.29.0 is known to be good, considering that it’s been out for nearly one month now, and b) the whole point of Nix is to get reproducible builds… right?

Gabriel439 · February 8, 2020, 4:07am

@ari-becker: Actually, the recipe I gave you didn’t work either (I didn’t wait to check whether the build succeeded).

The issue here is that the dhall-haskell repository depends on a git submodule in order for tests to pass. You can either (A) disable tests or (B) include the submodule so that tests pass.

To include the submodule, the trick is to use pkgs.fetchFromGitHub with the fetchSubmodules option set to true. builtins.fetchGit does not yet support fetchSubmodules (but there is a PR to fix that: https://github.com/NixOS/nix/pull/3166).

I got this version to work:

let
  nixpkgs = import (
    let 
      version = "19.09";
    in builtins.fetchTarball {
      name   = "nixpkgs-${version}";
      url    = "https://github.com/NixOS/nixpkgs/archive/${version}.tar.gz";
      sha256 = "0mhqhq21y5vrr1f30qd2bvydv4bbbslvyzclhw0kdxmkgg3z4c92";
    }
  ) {};

  dhall-haskell = import (
    let
      version = "1.29.0";
    in nixpkgs.fetchFromGitHub {
      owner           = "dhall-lang";
      repo            = "dhall-haskell";
      rev             = version;
      fetchSubmodules = true;
      sha256          = "0vvxwr0nw9wifh5yxh9wam68bn07pcwz1zh5lpvmv82jh706k49z";
    }
  );

in nixpkgs.mkShell {
  buildInputs = [
    dhall-haskell.dhall
  ];
}

ari-becker · February 9, 2020, 12:54pm

So now I’m trying to run it and it seems to take forever to download binaries from https://cache.dhall-lang.org/, to the point where it appears to be stuck. Is there rate limiting or something else that might be affecting it?

(using the cache settings in dhall-haskell's README)

Gabriel439 · February 9, 2020, 4:25pm

@ari-becker: Not that I’m aware of. The only thing I can think of is that the download for GHC may be larger than other downloads. If you can tell me which /nix/store/… path it is stuck on I can debug things further.

ari-becker · February 10, 2020, 7:31am

@Gabriel439 I did a bit of research: I looked at the NixOps definition and looked up where the server is located. Let’s just say that I’m not surprised that network performance between California and Tel Aviv (ten time zones apart) isn’t so great.

I guess then that the feedback I have to offer is this:

a) does Hydra on that build server not already build and cache the binaries themselves? I’m wondering why importing the dhall derivation is forcing a rebuild from the Haskell dependencies - being able to download the dhall binary directly from cache.dhall-lang.org instead of fetching each of the component dependencies for a rebuild would greatly speed up the process.

b) what do you think about pushing the release binaries to Cachix as part of the dhall-haskell release process? The main value-add here (apart from reducing load on a server you’re paying for out of pocket) is that Cachix now serves from Cloudflare’s CDN (so global performance will be much faster) and that Cachix is free for open-source projects.

If I can find some free time (and that’s a big if…) then I’ll look more into the current release process and maybe come up with a PR for this.

ari-becker · February 10, 2020, 4:06pm

I finally got the Nix expression above to build - with one small change. I got hit by #1135 - I’m not sure how to apply the #1159 fix, so I ended up changing the second line of the shebang to #! nix-shell ./shell.nix -i bash --option build-use-sandbox false. This isn’t ideal as the tests in dhall-lang which cause the failure refer to the master branch of dhall-lang, not the version of dhall which is used in the Git submodule, but it’s good enough to compile for now.

This is going to be a really big help in my company. Even though (especially because?) we are a small company, our engineers are running machines that aren’t centrally managed (and some run OS X and some run Linux, at that), so different engineers had different versions of various command-line tools installed (not just dhall-haskell) or none at all (particularly when on-boarding). Being able to tell engineers “please just install Nix” and then have all of the scripts just work, even when the scripts’ dependencies are updated out from underneath them, is deeply underappreciated in the industry.

Of course, to really get things going we’ll probably have to set up a local binary cache. Which is a difficult sell for a small company… so I guess we’ll see.

Gabriel439 · February 10, 2020, 4:11pm

@ari-becker: The answers to your two caching-related questions are:

(A) hydra.dhall-lang.org only keeps binaries for the last revision of master and each pull request branch. It does not keep older revisions in cache, mainly to conserve space (for cost reasons; the instance currently costs me $40 / month to host)
(B) I’d be fine adding the binaries to cachix and I can make that part of the release process. The script I use for most of the release process is here:

https://github.com/dhall-lang/dhall-haskell/blob/fe7ebeab61bc01d9d48ee78bba7e68ff7f6bc698/scripts/release.sh

… so if you get to this before I do then that is the place to contribute the cachix-related steps

Gabriel439 · February 11, 2020, 5:58am

@ari-becker: Alright, I uploaded Linux binaries for version 1.29.0 to cachix and you can find the cache here:

https://dhall.cachix.org/

I also updated the release script so that I don’t forget for upcoming releases.

ari-becker · February 11, 2020, 7:24am

Ooh fantastic! Thank you!

Profpatsch · February 11, 2020, 11:57am

A different approach is using Justin’s https://github.com/justinwoo/easy-dhall-nix, which packages the static release binaries (~3MB per tool) as nix expressions.

This won’t help you if you need the Haskell library, but if the static tools are enough it’s wonderful and won’t ever break (no moving parts).

ari-becker · February 11, 2020, 4:36pm

@Profpatsch yes, I’ve seen that repository before. I think I’m reluctant to use it because it reduces dependency discipline compared to using upstream directly, for (probably) not enough benefit. It also sort of goes against the “Nix way”, in which compiling from source is only supposed to be a fallback in case the binaries aren’t found in a cache, and in general release binaries should be cached and not retrieved directly.

As a concrete (albeit minor) example… that repo doesn’t have tags pointing to upstream dhall versions, and the install script creates a bash completion but not a zsh completion.

Getting the release binaries into Cachix is probably the best of all worlds. If I really had to make a choice between a) get everybody to build everything from scratch b) direct everybody to hammer Gabriel’s $40/month-out-of-pocket-and-goodness-of-his-heart server c) copy the MIT-licensed code from the easy-dhall-nix repo into the shell.nix we’re using for the scripts, well, I’d choose C. But D, tell people to just install cachix and run cachix use dhall first, and then just use upstream, sounds like the best option to me.
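
Option D could be sketched as a small onboarding helper along these lines. The function name use_dhall_cache is hypothetical, and the sketch assumes cachix is already installed; the only command taken from this thread is `cachix use dhall`.

```shell
# Hypothetical onboarding helper for option D.
# Returns non-zero, without changing anything, if cachix is not installed.
use_dhall_cache() {
  if ! command -v cachix >/dev/null 2>&1; then
    echo "cachix not found on PATH; install it first" >&2
    return 1
  fi
  # Registers the dhall binary cache (https://dhall.cachix.org) with Nix.
  cachix use dhall
}
```

A colleague would then run something like `use_dhall_cache && ./dhall.sh` once, after which the nix-shell script should be able to pull cached binaries instead of building from source.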

Profpatsch · February 13, 2020, 2:57am

Idk, I’d rather use official release binaries than have another third party between me and my build results (plus cachix use dhall modifies your nix.conf, so it’s a mixed bag really). But of course that’s for everybody to decide themselves.
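
To see what such a change looks like, one rough approach is to print the binary-cache settings of a nix.conf file before and after running `cachix use dhall`. The helper name show_cache_settings is hypothetical, and which nix.conf gets edited (for example a per-user file or /etc/nix/nix.conf) is an assumption that depends on the installation; substituters and trusted-public-keys are the Nix setting names the sketch looks for.

```shell
# Hypothetical helper: print the binary-cache related settings from a
# nix.conf-style file whose path is given as the first argument.
show_cache_settings() {
  grep -E '^[[:space:]]*(extra-)?(substituters|trusted-public-keys)[[:space:]]*=' "$1"
}
```

Running `show_cache_settings /etc/nix/nix.conf` (or the per-user file) before and after makes it easy to diff exactly which lines were added.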

Gabriel439 · February 13, 2020, 3:28am

@Profpatsch @ari-becker: I also prefer “Nix code + cachix” over the easy-*-nix approach, because the former documents how the cached binary was built. 95% of the time you only need the pristine, pre-cached build product, but for the 5% of the time when you need to make a deep modification, being able to build from source is a lifesaver.

ari-becker · February 13, 2020, 3:01pm

@Profpatsch there’s a more academic argument to be made here about software security. If you receive a binary without a way to verify how it was built, it increases the likelihood that an attacker will compromise the distributed binary. The mere presence of a mechanism to verify after-the-fact that distributed binaries are legitimately built, even if such a mechanism is not used in daily usage, may be enough to create an analogue of “herd immunity” to dissuade attackers from compromising the binaries in the first place due to the assumed presence of a minority of users who are doing diligent testing (it’s not a perfect analogy because here the minority protects the majority, but it’s the same idea).

If it interests you, see e.g. https://defuse.ca/triangle-of-secure-code-delivery.htm
Nix builds + binary CDN caches get pretty close.

Most software delivery engineering isn’t anywhere near the mostly-academic ideal though so I get where the “official release binaries are good enough” viewpoint is coming from.

Profpatsch · February 14, 2020, 6:39pm

yeah, this attack vector is extremely academic and based on a fairly primitive trust model.

this however is a good and practical argument. That’s why I restricted to “if you don’t need the haskell library” above.