Implementing Vector in dhall (or, should we add dependent sums?)

I thought I’d have a go at the new dependent type features and do the first thing everyone does in a dependently-typed language, implement fixed-length vectors.

I wanted to implement Vector n a as a record containing a List a and evidence that the given list has length n. Something like:

{ content  = [ 1, 2 ]
, evidence = assert : List/length Natural [ 1, 2 ] ≡ 2
}

However, this encoding is tricky because the same value appears at the term and type levels. The type of this expression is:

{ content : List Natural
, evidence : 2 ≡ 2
}

but I want something more like

∀(l : List Natural) →
  { content : List Natural
  , evidence : List/length Natural l ≡ 2
  }

Essentially, I think I’m reaching for a dependent sum type but dhall doesn’t have one.

So… two questions. First, am I correct? Is there no way to implement Vector from List with Dhall’s current feature set?

Second, would people consider adding dependent sums to the language?

If you don’t care about ergonomics you can get pretty close.

Usually the way I reason through these things is to reason about how it would be implemented in pure calculus. In this case, we can define Vector inductively, like this (similar to the trick for defining recursive code in Dhall):

let Vector
    : Natural → Type → Type
    =   λ(n : Natural)
      → λ(a : Type)
      →   ∀(Vector : Natural → Type)
        → ∀(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
        → ∀(Nil : Vector 0)
        → Vector n

let Cons
    : ∀(n : Natural) → ∀(a : Type) → a → Vector n a → Vector (n + 1) a
    =   λ(n : Natural)
      → λ(a : Type)
      → λ(x : a)
      → λ(v : Vector n a)
      → λ(Vector : Natural → Type)
      → λ(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
      → λ(Nil : Vector 0)
      → Cons n x (v Vector Cons Nil)

let Nil
    : ∀(a : Type) → Vector 0 a
    =   λ(a : Type)
      → λ(Vector : Natural → Type)
      → λ(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
      → λ(Nil : Vector 0)
      → Nil

in  Cons 1 Text "ABC" (Cons 0 Text "DEF" (Nil Text))

In other words, this is a Boehm-Berarducci encoding of a length-indexed linked list.

That actually type-checks and normalizes just fine. But then if you try to implement a type-safe head, you will get stuck when implementing the case for handling Nil (see the TODO in the code):

  λ(TODO : ∀(a : Type) → a)
→ let Vector
      : Natural → Type → Type
      =   λ(n : Natural)
        → λ(a : Type)
        →   ∀(Vector : Natural → Type)
          → ∀(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
          → ∀(Nil : Vector 0)
          → Vector n
  
  let Cons
      : ∀(n : Natural) → ∀(a : Type) → a → Vector n a → Vector (n + 1) a
      =   λ(n : Natural)
        → λ(a : Type)
        → λ(x : a)
        → λ(v : Vector n a)
        → λ(Vector : Natural → Type)
        → λ(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
        → λ(Nil : Vector 0)
        → Cons n x (v Vector Cons Nil)
  
  let Nil
      : ∀(a : Type) → Vector 0 a
      =   λ(a : Type)
        → λ(Vector : Natural → Type)
        → λ(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
        → λ(Nil : Vector 0)
        → Nil
  
  let Vector/head
      : ∀(n : Natural) → ∀(a : Type) → Vector (n + 1) a → a
      =   λ(n : Natural)
        → λ(a : Type)
        → λ(v : Vector (n + 1) a)
        → v
            (λ(_ : Natural) → a)
            (λ(m : Natural) → λ(x : a) → λ(v : a) → x)
            (TODO a)
  
  in  Vector/head 1 Text (Cons 1 Text "ABC" (Cons 0 Text "DEF" (Nil Text)))

… buuuuuuuut it actually type-checks and normalizes fine and produces the desired result (albeit under the assumption of the TODO which is logically equivalent to assuming a false axiom):

$ dhall --file ./Vector.dhall 
λ(TODO : ∀(a : Type) → a) → "ABC"

The interesting part is how we get rid of the dependence on TODO.

We can get even closer by doing this:

  λ(TODO : ∀(a : Type) → a)
→ let Vector
      : Natural → Type → Type
      =   λ(n : Natural)
        → λ(a : Type)
        →   ∀(Vector : Natural → Type)
          → ∀(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
          → ∀(Nil : Vector 0)
          → Vector n
  
  let Cons
      : ∀(n : Natural) → ∀(a : Type) → a → Vector n a → Vector (n + 1) a
      =   λ(n : Natural)
        → λ(a : Type)
        → λ(x : a)
        → λ(v : Vector n a)
        → λ(Vector : Natural → Type)
        → λ(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
        → λ(Nil : Vector 0)
        → Cons n x (v Vector Cons Nil)
  
  let Nil
      : ∀(a : Type) → Vector 0 a
      =   λ(a : Type)
        → λ(Vector : Natural → Type)
        → λ(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
        → λ(Nil : Vector 0)
        → Nil
  
  let Vector/head
      : ∀(n : Natural) → ∀(a : Type) → Vector (n + 1) a → a
      =   λ(n : Natural)
        → λ(a : Type)
        → λ(v : Vector (n + 1) a)
        → v
            -- The following line is the interesting part!
            (λ(m : Natural) → if Natural/isZero m then {} else a)
            (λ(m : Natural) → λ(x : a) → λ(v : a) → x)
            {=}
  
  in  Vector/head 1 Text (Cons 1 Text "ABC" (Cons 0 Text "DEF" (Nil Text)))

Here we say that the case for non-empty lists needs to return an a, but the case for empty lists can return anything at all (so we pick {}, which is trivial to return). Here the expected type our fold on Vectors depends on the list index, so we can have the two branches of the fold return different types.

This almost works, except that if is restricted to terms so you can’t yet have if expressions on types. We could lift that restriction, but it’s unclear to me if that is sound or not:

$ dhall --file ./Vector2.dhall 

Use "dhall --explain" for detailed errors

TODO : ∀(a : Type) → a
n : Natural
a : Type
v :   ∀(Vector : Natural → Type)
    → ∀(Cons : ∀(m : Natural) → a → Vector m → Vector (m + 1))
    → ∀(Nil : Vector 0)
    → Vector (n + 1)
m : Natural

Error: ❰if❱ branch is not a term

36│                               if Natural/isZero m then {} else a

(stdin):36:31
1 Like

Sorry, I forgot to tie this back to your original question about the more ergonomic encoding you proposed.

So, to do this for the more ergonomic encoding it’s basically the exact same problem. You would need to essentially support type-level operations on Naturals, such as a type-level Natural/fold (in general) along with other useful things like type-level if.

Supporting type-level equivalents of term-level operations usually entails adding something like universe polymorphism to the language, because that means you can have a single function (i.e. Natural/fold) that works at the term level, the type level, the kind level, etc. So universe polymorphism is the real feature you are looking for to power this sort of dependently typed programming.

4 Likes

This would be nice to have. Has there been any consideration of adding the feature to Dhall?

I’ve been exploring how to encode a family of types to represent various fixed-lengths of bits (Bools), which is really a specialization of the Vector idea. Currently I generate several of them using an external script, but I’d really prefer to implement a package that lets users dynamically generate the appropriate type for a given bit width and then define functions in terms of this family of types, like bit indexing. I hoped this would be possible because it wouldn’t require general recursion; simple primitive recursion should suffice.

To hopefully get closer to something like this I’ve implemented a simple encoding of Peano numbers. I figured maybe I could implement a Natural -> Peano constructor that recursively applies succ and decrementing the Natural until Natural/isZero and then use the Peano encoding to help construct the appropriate number of iterative applications of \(t : Type) -> { _1 : t, _2 : t }. Unfortunately, I rapidly bump into the Type vs Kind mismatches, e.g. when using Natural/build. If I had access to a Kind-ed analog of Natural/build that would also help; universe polymorphism might not be essential for this.

Universe polymorphism is possible to add to the language, it’s mainly that it’s quite a bit of work (since every language feature would need to be made universe-polymorphic, such as if expressions) and it increases the implementation complexity