Deep / nested default structures to collate multiple values

timbertson · April 22, 2021, 2:04am

I have a potential use case for dhall with some distributed sources.

That is, I don’t have a single expression which simply pulls in the types it uses and produces a single result. Rather I have multiple component expressions from different sources, and I want to bring them together into a cohesive type.

Concretely, imagine I want to put a project.dhall in each of my repos, and have a central repo importing all of these files and producing a List Project output expression.

Maybe it looks like this:

let allProjects = [
, https://example.com/project1/project.dhall
, https://example.com/project2/project.dhall
] : List Project

As soon as Project gains a new field, you must update project1 and project2 before this expression will typecheck. But updating multiple repositories in lockstep is… not really feasible.

Using schemas, we can deal with this problem at the top level, by filling in defaults missing from any individual project:

let allProjects = [
, Project::( https://example.com/project1/project.dhall )
, Project::( https://example.com/project2/project.dhall )
] : List Project.Type

This way, we can add any top-level properties to Project as long as we have defaults for all the new fields, and each project can keep using an older version of the ProjectMeta with fewer fields.
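As a minimal, self-contained sketch of that top-level completion: the records project1 and project2 below are hypothetical local stand-ins for the remote project.dhall files, and the two-field Project schema is invented for illustration.

```dhall
-- A schema: a record with a Type and a default, as expected by `::`
let Project =
      { Type = { name : Text, slackChannel : Optional Text }
      , default.slackChannel = None Text
      }

-- Hypothetical "older" project that predates the slackChannel field
let project1 = { name = "Foo" }

-- Hypothetical project that already sets every field
let project2 = { name = "Bar", slackChannel = Some "#bar" }

-- Project::r is shorthand for (Project.default // r) : Project.Type
in  [ Project::project1, Project::project2 ] : List Project.Type
```

Both entries typecheck against Project.Type even though project1 never mentions slackChannel.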

The problem comes with nested fields. I wrote up a more complex example with a nested Project type, and I found a way to merge a “subset” project (p1) into a full Project type:

let DeploymentTool = < Unknown | Spinnaker | Chef >

let Deployment =
      { Type = { tool : DeploymentTool, link : Optional Text }
      , default = { tool = DeploymentTool.Unknown, link = None Text }
      }

let Project =
      { Type =
          { team : { name : Text, slackChannel : Optional Text }
          , project :
              { name : Text
              , slackChannel : Optional Text
              , deployment : Deployment.Type
              }
          }
      , default =
        { team.slackChannel = None Text
        , project =
          { slackChannel = None Text, deployment = Deployment.default }
        }
      }

let p1 =
      { team.name = "Astronauts"
      , project = { name = "Foo", deployment.tool = DeploymentTool.Spinnaker }
      }

in    [     Project.default
        //  p1
        //  { team = Project.default.team // (p1.team ? {=})
            , project =
                    Project.default.project
                //  (p1.project ? {=})
                //  { deployment =
                            Project.default.project.deployment
                        //  (p1.project.deployment ? {=})
                    }
            }
      ]
    : List Project.Type

It’s… pretty exhausting to do this manual deep merging, and it can’t be abstracted. The crux of it is that I can’t mention the type of p1 or any of its subfields, so I need to use inline expressions (like a ? {=}) in order to coerce a subset of some known record type into the full type.
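For anyone wondering why the per-level merging is needed at all: // is a shallow, right-biased merge, so a nested record on the right replaces the nested defaults wholesale instead of being combined with them. A small sketch, with made-up field names (tier is not part of the example above):

```dhall
-- Hypothetical nested defaults
let defaults = { project = { slackChannel = None Text, tier = 1 } }

-- A "subset" value that only sets one nested field
let p1 = { project = { name = "Foo" } }

-- Shallow merge: the whole `project` record comes from p1
in  defaults // p1
-- normalizes to { project = { name = "Foo" } }
```

The slackChannel and tier defaults are gone, which is why each nesting level has to be merged by hand.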

So, thoughts:

Is this something we want to do with dhall? It feels like a weakness that the types make it difficult to aggregate disparate expressions which have compatible but not identical types (i.e. a subset).

How could we better support this?

Recursive merge operator

… with this, I could simply do Project.default /*\ p1 :: Project.Type to extend the toplevel version to work for arbitrarily nested types.

I know we’ve talked about this before, and the conclusion was that the with syntax is usually better. But with only really works as a literal expression; it doesn’t help in more abstract cases like this.

The ability to talk about a (recursive) subset type

Thinking here about making this easier to abstract. If (in the above code) I could pull out some of the sub-parts into functions, that would make things much more manageable.

I think this would mean being able to take an expression which is “a subset of { a: Natural, b: Natural }”. This seems like a pretty huge and complex language feature, but maybe there’s some super-restrictive form which could work.

Such an expression would imply that:

You can’t reference a field without using the fallback operator: x.a is invalid, but x.a ? 1 is valid.
You must produce a well-known type (i.e. you can accept subsets, but not produce them). So basically, most functions will return defaultValue // subsetValue to ensure that all fields are defined.

Neither of these really deals with the issue of enums, which I’m closing my eyes and ignoring for now.

Current workarounds

The main one I can think of is going via dhall-to-json | json-to-dhall to take advantage of the latter tool’s deep merging and field lenience, which is hopefully not the best we can do.

Perhaps, though, if we don’t want to make it a language feature, there could be a dhall-match command which takes a value and a schema, with all the same coercion behaviours as json-to-dhall but without a round trip through JSON?

Gabriel439 · April 22, 2021, 3:47pm

In Roadmap for improved Kubernetes support I briefly suggested the idea of changing the behavior of the :: operator to desugar to with expressions instead of //. In other words, Project::{ foo.bar = 1, baz = True } would desugar to (Project.default with foo.bar = 1 with baz = True) : Project.Type.

I think that is essentially the same spirit as the /*\ operator you propose, except reusing the :: operator for that purpose. Would that solve your use case?
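To make the idea concrete, here is a rough sketch of what that desugaring would amount to for the p1 example from the first post, written out by hand with the existing with keyword. It does not assume the proposed :: behaviour exists; the schema definitions are copied from the example above.

```dhall
let DeploymentTool = < Unknown | Spinnaker | Chef >

let Deployment =
      { Type = { tool : DeploymentTool, link : Optional Text }
      , default = { tool = DeploymentTool.Unknown, link = None Text }
      }

let Project =
      { Type =
          { team : { name : Text, slackChannel : Optional Text }
          , project :
              { name : Text
              , slackChannel : Optional Text
              , deployment : Deployment.Type
              }
          }
      , default =
        { team.slackChannel = None Text
        , project =
          { slackChannel = None Text, deployment = Deployment.default }
        }
      }

-- Each `with` updates one nested path and leaves sibling defaults intact
in    ( Project.default
        with team.name = "Astronauts"
        with project.name = "Foo"
        with project.deployment.tool = DeploymentTool.Spinnaker
      )
    : Project.Type
```

Note that with takes literal field paths, so this sketch spells out p1’s fields inline rather than merging an imported p1 value.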

timbertson · April 24, 2021, 2:48am

I think it would, yeah! I hadn’t seen that idea.

Seems like it couldn’t change the result of any existing uses of the shallow version, so that’s nice.