Normalized form in Dhall - limits and roadmap


Hi! I’ve been having a chat with Fabrizio about dhall and he recommended I post some of the questions here to the broader community. I’m looking forward to getting to know y’all!

I am wondering what the limits of Dhall’s normalized form are for functions, and whether there is a roadmap for future development. For example, normalization of arithmetic is not yet supported in some cases:

PS C:\Users\MikeSolomon\devel\random-scripts\dhall> "\(n : Natural) -> [ n, n + 3 ]" | .\dhall.exe hash    
PS C:\Users\MikeSolomon\devel\random-scripts\dhall> "\(n : Natural) -> [ n, n + 1 + 2 ]" | .\dhall.exe hash

While it is supported in other cases:

PS C:\Users\MikeSolomon\devel\random-scripts\dhall> "\(n : Natural) -> 1 + 1" | .\dhall.exe hash
PS C:\Users\MikeSolomon\devel\random-scripts\dhall> "\(n : Natural) -> 2" | .\dhall.exe hash

The background is that I’m doing an experiment using Dhall to compare pure functions, and I would like any functions that have the same behavior to have the same hash (extensionality). This is of course quite hard to do, and impossible for a Turing-complete language, but since Dhall is not Turing-complete I thought this sort of thing may be worth discussing and knowing more about.

Thanks in advance for any thoughts/info about this!

Function hash changing even though function outcome does not

Hi Mike!

Dhall’s hashes are derived like this: For an input expression e, the hash is just

sha256 (binary-encode (α-normalize (β-normalize e)))

In this case you’re already running into the limits of β-normalization:

$ echo '\(n : Natural) -> n + 1 + 1' | dhall normalize
λ(n : Natural) → n + 1 + 1

n + 1 + 1 is parsed as ((n + 1) + 1) and the standard doesn’t provide any rules to simplify that expression further.
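To make that concrete, here’s a toy Python model of the simplifications the standard provides for Natural `+` (the tuple AST and function names are made up for illustration). With only these rules, `(n + 1) + 1` is stuck, because the left operand of the outer `+` is not a literal:

```python
# Sketch of the standard's Natural `+` normalization rules over a toy AST:
# ("var", name) is a free variable like `n`, ("nat", k) a Natural literal

def norm_plus(l, r):
    if l == ("nat", 0):
        return r                      # 0 + x → x
    if r == ("nat", 0):
        return l                      # x + 0 → x
    if l[0] == "nat" and r[0] == "nat":
        return ("nat", l[1] + r[1])   # literal + literal → literal
    return ("add", l, r)              # otherwise: stuck

def beta(e):
    if e[0] == "add":
        return norm_plus(beta(e[1]), beta(e[2]))
    return e

# n + 1 + 1 parses as (n + 1) + 1:
expr = ("add", ("add", ("var", "n"), ("nat", 1)), ("nat", 1))
print(beta(expr))
# No rule fires: the result is still ('add', ('add', ('var', 'n'),
# ('nat', 1)), ('nat', 1)), whereas `n + 2` encodes differently,
# so the two hashes differ even though the functions agree everywhere.
```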

Gabriel once proposed additional simplifications for associative operations, but that was ultimately rejected. (I’ll have to reread that thread to understand why, though…)


Super useful! Classic problem - I did a little bit of work with/on the Z3 solver, and these are exactly the kinds of questions that come up there. Ultimately, unless you’re using something like Coq, everything comes down to heuristics, and even with Coq there needs to be a human in the loop for all but the most trivial proofs.

The most useful thing, IMO, would be a statement in the docs about what is and isn’t doable for hashing and normalization. I skimmed the standard, but unfortunately I couldn’t build a mental map of the limits of hashing and normalization from that first pass. I’d need to spend a lot more time familiarizing myself with the standard to grok those limits.


It looks like I might have contributed to the closure of that PR by pointing out that getting into this is a slippery slope. The number of rules we might have to add is not unbounded (as is the case with Turing-complete languages), but it is rather large (combinatorial in the number of primitives we have, I guess?), and standardizing them the way we currently standardize things could add a big burden on bootstrapping implementations.

I suspect this is also why we didn’t move forward with your #688. As I noted there, though, I think very powerful extensionality is well worth pursuing, but we should first find a way to encode rules (possibly in Dhall itself?) in such a way that implementations can just generate code to handle them.
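One possible shape for that, sketched in Python — everything here (the tuple patterns, the rules table, the matching engine) is hypothetical, not an actual proposal: encode each simplification as a pattern/rewrite pair in a data table, and let every implementation load the table and apply (or compile) the rules generically, instead of hand-standardizing each rule in prose.

```python
# Hypothetical "rules as data" sketch over a toy AST.
# Pattern strings starting with "?" are metavariables.

RULES = [
    # Associativity/constant-folding rule: (x + a) + b → x + (a + b)
    (("add", ("add", "?x", ("nat", "?a")), ("nat", "?b")),
     lambda b: ("add", b["?x"], ("nat", b["?a"] + b["?b"]))),
]

def match(pat, expr, binds):
    # Structurally match `expr` against `pat`, recording metavariable bindings
    if isinstance(pat, str) and pat.startswith("?"):
        binds[pat] = expr
        return True
    if isinstance(pat, tuple):
        return (isinstance(expr, tuple) and len(pat) == len(expr)
                and all(match(p, e, binds) for p, e in zip(pat, expr)))
    return pat == expr

def rewrite(expr):
    # Apply the rule table bottom-up until no rule fires
    if isinstance(expr, tuple):
        expr = tuple(rewrite(e) for e in expr)
    for pat, action in RULES:
        binds = {}
        if match(pat, expr, binds):
            return rewrite(action(binds))
    return expr

# (n + 1) + 2  →  n + 3 under the rule above:
print(rewrite(("add", ("add", ("var", "n"), ("nat", 1)), ("nat", 2))))
# → ('add', ('var', 'n'), ('nat', 3))
```

The appeal of this shape is that only the small generic engine would need to be standardized and reimplemented per language; the (large, combinatorial) rule set would live in one shared table.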