Function hash changing even though function outcome does not


#1

Continuing a bit on the subject posted on this thread, I have run into two cases where a function produces identical outcomes for all input but has different hashes. Both came up when I was refactoring dhall code, which gave me the spooks as I was pretty sure my logic hadn’t changed, so I had to run several examples a la quickcheck just to convince myself it hadn’t.

Case 1

let Natural/equal : Natural -> Natural -> Bool =
      https://prelude.dhall-lang.org/Natural/equal
      sha256:7f108edfa35ddc7cebafb24dc073478e93a802e13b5bc3fd22f4768c9b066e60
let neq: Natural -> Natural -> Bool = \(n0: Natural) -> \(n1: Natural) -> Natural/equal n0 n1
in neq

yields a hash sha256:7f108edfa35ddc7cebafb24dc073478e93a802e13b5bc3fd22f4768c9b066e60, whereas

let Natural/equal : Natural -> Natural -> Bool =
      https://prelude.dhall-lang.org/Natural/equal
      sha256:7f108edfa35ddc7cebafb24dc073478e93a802e13b5bc3fd22f4768c9b066e60
let neq: Natural -> Natural -> Bool = \(n0: Natural) -> \(n1: Natural) -> Natural/equal n1 n0
in neq

yields a hash sha256:9e0e952e097a2a974ca4ed3e65bcf395cda2bb54382c8ecc3963bca50bfff124.

Case 2

let Natural/equal : Natural -> Natural -> Bool =
      https://prelude.dhall-lang.org/Natural/equal
      sha256:7f108edfa35ddc7cebafb24dc073478e93a802e13b5bc3fd22f4768c9b066e60
let User: Type = { id : Natural, name : Text }
let userEq: User -> User -> Bool = \(u0: User) -> \(u1: User) -> Natural/equal u0.id u1.id
in userEq

Yields a hash of sha256:9dd6204f24c38fec828b454492d37086018b2781b2189f5e270d824055a2fe20 whereas

let Natural/equal : Natural -> Natural -> Bool =
      https://prelude.dhall-lang.org/Natural/equal
      sha256:7f108edfa35ddc7cebafb24dc073478e93a802e13b5bc3fd22f4768c9b066e60
let User: Type = { id : Natural, name : Text, email : Text }
let userEq: User -> User -> Bool = \(u0: User) -> \(u1: User) -> Natural/equal u0.id u1.id
in userEq

yields a hash of sha256:9c5e711166f46d9c69f6fc1d2cf4dd2dc7c14495f9ae60259fa27f468c1cc1f0.

Comments

The two cases are different, and perhaps this is intended behavior, but it caught me by surprise so maybe there is a way to document it? In general, like my first post, it is getting at issues of extensionality in Dhall. Basically, I’m trying to avoid experimenting with heavier solutions like z3 or coq for verifying that certain configurations haven’t changed, and if I can bend dhall to do this that’d be great as my configurations are in dhall anyway and it’d prevent me from using another tool/language.

Thanks for your help!


#2

Hi!

In your first example, the body of the function did change the order of the application of the arguments

TLDR; the order of the application matters.

Long explanation:

Let’s look at the alpha-normalization of your function and then build the hash:

\(n0: Natural) -> \(n1: Natural) -> Natural/equal n0 n1

normalizes to

\(_: Natural) -> \(_: Natural) -> Natural/equal _@2 _@1

while

\(n0: Natural) -> \(n1: Natural) -> Natural/equal n1 n0

normalizes to

\(_: Natural) -> \(_: Natural) -> Natural/equal _@1 _@2

Given this judgement:

encode(f₀) = f₁   encode(a₀) = a₁   encode(b₀) = b₁   …
───────────────────────────────────────────────────────────────────
encode(f₀ a₀ b₀ …) = [ 0, f₁, a₁, b₁, … ]

the binary encodings are respectively:

[0, encode(Natural/equal), encode(_@2), encode(_@1)] 

and

[0, encode(Natural/equal), encode(_@1), encode(_@2)] 

Given this judgment:

 ───────────────────  ; n < 2^64
encode(_@n) = n

the binary encodings are:

[0, encode(Natural/equal), 2, 1]

and

[0, encode(Natural/equal), 1, 2]

… which are not equal.

Concerning the second example, here’s the relevant standard: https://github.com/dhall-lang/dhall-lang/blob/master/standard/binary.md#functions, in the judgement

encode(A₀) = A₁   encode(b₀) = b₁
───────────────────────────────────────────────  ; x ≠ "_"
encode(λ(x : A₀) → b₀) = [ 1, "x", A₁, b₁ ]

The type of a parameter is part of the binary representation of a function (and consequently part of the resulting hash). Here the type changed from { id : Natural, name : Text } to { id : Natural, name : Text, email : Text }.

Note that this would be mitigated when (if?) row polymorphism lands in Dhall.

EDIT: fix flipped DeBruijn indices


#3

For the first example maybe the needed feature is that the normalization re-order arguments in a "normal order’ (yet to be defined!) in a function application iff the function is commutative… I’m not sure it’s desirable, practicable or even possible in all cases.


#4

Neither of them are necessarily desirable, it depends what people want from the language. The front page of dhall-lang.org says:

Fearlessly refactor

Need to clean up a big mess? Move fast without breaking things by leaning on Dhall’s tooling.

Refactoring something mission-critical? Use Dhall’s support for semantic hashes to guarantee that a refactor is behavior-preserving

To me, the two outcomes I talk about above go against that stated goal. It could be of course because the language lacks the features at the present time, which is fine, but if it’s the intent of the community not to add those features, then the value prop on the website should change to reflect the intent of the community/developers.


#5

@mikesol So would you expect Dhall to provide some kind of semantic equality test that never falsely detects two functions to be inequal when their behaviour is in fact the same? I’m not very familiar with the theory, but I suspect this is a hard problem (as in “NP-hard”).

We could of course add more β-normalization rules, but I don’t believe we can ever achieve that kind of perfect semantic equality testing. We might be able to fix the issue you reported earlier, but even that comes with a tradeoff in complexity and probably performance.

Why don’t you take a stab at adding more normalization rules to fix your particular issues, and maybe propose standardizing them? You can always maintain your own fork of dhall with stronger normalization rules, of course…


#6

Thanks so much for the thorough response!

would you expect Dhall to provide some kind of semantic equality test that never falsely detects two functions to be inequal when their behaviour is in fact the same?

Yes, the claim on the website: “Use Dhall’s support for semantic hashes to guarantee that a refactor is behavior-preserving” makes me belive that Dhall will provide some kind of semantic equality test that never falsely detects two functions to be inequal when their behavior is in fact the same. This could be, of course, because I’m misreading the claim on the website.

I’m not very familiar with the theory, but I suspect this is a hard problem (as in “NP-hard”).

You’re right - it’s really really hard. I’ve worked on this a fair bit making automated proofs with z3 and a little bit using automatic proving strategies in coq, but even in those tools break sometimes. That said, when the domain is circumscribed (ie only unary functions that accept integers in a certain range) it is possible to be able to make assertions about the equivalency of any two valid statements, which is pretty cool.

I think the most important thing is advertising - I have no strong opinions about what Dhall should be as I haven’t used it enough to have an opinion, but I was (and still am) attracted to the idea of the value prop as advertised on the website. If the value prop is aspirational, that’s great and I’m excited to see the project move in that direction, and I could even lend a hand to integrating z3 into dhall as an automatic prover, but before that I’d need to understand if that’s what people actually want out of the language or if it’s not really a core desired feature to most folks.


#7

I can agree for the first example. It’s a matter of finding the sweet spot between usability and internal complexity (both in standardization and implementation terms). In this case, the complexity is somewhat huge, I believe, as does @sjakobi. Another solution would be to mark some functions with the commutative property and add exceptional normalization rules for those, which is much more simple to standardize and implement, but is ugly and adds maintenance burden to Dhall implementations.

But I beg to differ for the second example. The whole point of hashing is that two equal hashes denotes equivalent objects (in the mathematical sense of equivalent). Two functions that don’t have the same domain of application can’t be equivalent.

This may be solved by row-polymorphism where the function would become (hypothetical syntax):

let isEq : {id: Natural, ρ} -> {id: Natural, ρ} -> Bool = ....

and then, isEq can be called with anything that is a record and has an id: Natural fields. In your refactoring it would keep the same hash, indeed.


#8

For the second example, I understand what you mean. I did not realize that row polymorphism did not exist in dhall, and it’s been really helpful to read the GitHub issues about that. I’ll leave it on the thread but rescind it from the discussion, so we can zoom in on the first example.


#9

@mikesol: You’re probably already aware of this, but for the benefit of others reading this: function equivalence is undecidable in System F, so Dhall cannot guarantee that two extensionally equal expressions produce the same hash. Dhall can detect that a certain types expressions are equivalent (specifically expressions that are αβ-equivalent), but cannot detect other types of equivalences.

The accurate claim is that if the hash doesn’t change then the refactor is behavior-preserving. The converse is not necessarily true; in other words, if a refactor is behavior-preserving (which can easily be interpreted to mean extensionally equal) then the hash might still change, as you noted.

In response to this, I have a pull request up to reword the claim to be less prone to misinterpretation:

Regarding the suggestion of possibly using z3 (or another SMT solver) for more powerful equivalence checking, I believe we’re unlikely to go in that direction, although I can’t speak for the other implementations. My reasoning is that any change to equivalence checking behavior needs to be implemented by all existing and future implementations, and currently the trend is towards minimizing the complexity of the implementation to facilitate new language bindings. For example, a simpler proposal to detect equivalence of associative operators was rejected:

… and while decisions like that can be relitigated I haven’t seen a strong demand for more powerful equivalence checking from our users.


#10

Thanks for this explanation! As I’m outside of the community and don’t have my finger on the pulse of the language’s design and implementation, your explanation is really helpful. I see how it’s undesirable to enforce that type of maintenance burden across multiple implementations.

I read over your PR, and while I think it’s generous, I also agree with @sjakobi that I’m probably the outlier and most people likely interpret it as you intended it. So I don’t think you need to change it - my guess is that, on the off-chance anyone makes a similar misinterpretation, this thread has all the info someone would need to be informed.

Thanks for your work on Dhall - it’s been really fun getting to know the language!