Some insights on making a Dhall-based application

ear7h · December 28, 2021, 7:52am

Hello!

For the past couple weeks I’ve been working a build system that uses Dhall as the base language, and I wanted to share my experience to anyone who might want to similarly use Dhall as a front end for their application. in this post I’ll briefly explain why I chose to integrate Dhall into my project, but focus on the interfaces between Haskell and Dhall, and Dhall and the user.

My goal was to create a more flexible and composable version of make. With make you get caching and incremental builds for free, but deviating from the conventional directory structure (separate build or cache directories, or conflicting names between source and intermediate files) or dynamic behavior requires even more specialized external tools like cmake and poorly defined/non-obvious conditionals. Dhall provides well defined semantics and FP lends itself to DRY-ing.

Dhall to Haskell

The back end is written in Haskell and handles all the IO operations. The core of the back end is the Item type:

data Command = MkCommand
    { cmd :: String
    , args :: [String]
    }

data Item
    = Generated
        { results :: [String]
        , command :: Command
        , prereqs :: [Item]
        }
    | Source String

An Item is either a source file or it can be generated. A generated file can depend on other Items, which get built first and also trigger rebuilds of dependents when changed. This is a natural translation of the concepts of a build system into Haskell. Ideally, I’d like to express my build instructions in a similar recursive manner. While Dhall doesn’t allow this type to be translated directly, the docs provide a helpful guide for working around this limitation but only on the Dhall side of things. This doesn’t exactly solve my problem, but it does get me one step closer. The linear structure could be tokens, such as those output by a lexer, and then I’d just parse those tokens back into a recursive structure in Haskell. This is what a token looks like:

data Token
    = TOpen
    | TClose
    | TGenerated [String] Command
    | TSource String

TOpen and TClose are like parenthesis which mark the beginning and end of a TGenerated's prereqs. For example the following token list:

[ TGenerated ["hello2.txt"] (MkCommand "cp" ["hello1.txt", "hello2.txt"])
, TOpen
, TGenerated ["hello1.txt"] (MkCommand "touch" ["hello1.txt"])
, TOpen
, TClose
, TClose
]

Would be parsed as:

(Generated
  ["hello2.txt"]
  (MkCommand "cp" ["hello1.txt", "hello2.txt"])
  [ Generated ["hello1.txt"] (MkCommand "touch" ["hello1.txt"]) [] ]
)

The current parser is hand-written but in the future, or in other similar projects, I think I could actually use a parser library like happy or megaparsec.

User to Dhall

My implementation for a recursive type is slightly different, and less general than what the guide I linked above suggests. It suggests a function generic over the linear output type, however I have made it static for my own convenience for the time being (it only needs to output tokens, but I might experiment with outputting a bash script in the future). In Dhall the Item type looks like this:

-- ./prelude.dhall
let Item
    : Type
    = ∀ (MkItem :
        { source : Text → List Token
        , generated :
          { results : List Text
          , command : Command
          , prereqs : List (List Token)
          } → List Token
        }
      ) → List Token

Once again, the link above provides a good explanation of the pattern, but the short version is that an Item is a function that takes record of functions that can linearize a source item and a generated item and returns the linearization. Creating such a type is unwieldy, so I’ve also provided the following helper functions:

-- ./prelude.dhall
let generated
  : { results : List Text
    , command : Command
    , prereqs : List Item
    } → Item
  = ...

let source
  : Text → Item
  = ...

Using them allows writing Items in a recursive way:

-- ./default.dhall
let B = ./prelude.dhall
let cmd = λ(cmd : Text) → λ(args : List Text) → { cmd, args } : B.Command
in B.generated
  { results = [ "hello2.txt" ]
  , command = cmd "cp" ["hello1.txt", "hello2.txt"]
  , prereqs =
    [ B.source "hello1.txt"
    ]
  }

This pattern is also present in the standard Dhall prelude’s XML package, which is where I first familiarized myself with it.

Lastly, to generate the tokens so that the Haskell program can understand it, there’s a build function. It takes an Item and passes the record functions to it, so that it generates a List Token:

-- ./prelude.dhall
let build
    : Item → List Token

It can be used like so:

-- buildsys.dhall, the buildsys3 binary looks for a file with this name
let B = ./prelude.dhall
in { default = B.build ./default.dhall }

Conclusion

Recursive types come up often in many problems and while Dhall can help simplify interaction with such problems, some extra effort is required by tool authors to linearize the problem.

Thanks for reading, if you’re curious about using the build system you can check out the examples directory or the source code to my website which was my latest inspiration for making a better build system.

infogulch · January 15, 2022, 6:26am

I wonder if a better approach would be to use a nix build system, where all the nix definitions are written in and validated by dhall.