I have a potential use case for dhall with some distributed sources.
That is, I don’t have a single expression which simply pulls in the types it uses and produces a single result. Rather I have multiple component expressions from different sources, and I want to bring them together into a cohesive type.
Concretely, imagine I want to put a project.dhall in each of my repos, and have a central repo importing all of these files and producing a List Project output expression.
Maybe it looks like this:
let allProjects = [
  , https://example.com/project1/project.dhall
  , https://example.com/project2/project.dhall
  ] : List Project
As soon as Project gains a new field, you must update project1 and project2 before this expression will typecheck. But updating multiple repositories in lockstep is… not really feasible.
Using schemas, we can deal with this problem at the top level, by filling in defaults missing from any individual project:
let allProjects = [
  , Project::(https://example.com/project1/project.dhall)
  , Project::(https://example.com/project2/project.dhall)
  ] : List Project.Type
This way, we can add whatever top-level properties we like to Project. As long as we have defaults for all new fields, each project can use an older version of the Project type with fewer fields.
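To make the schema approach concrete, here is a minimal sketch. The inline records project1 and project2 are hypothetical stand-ins for the remote project.dhall imports, and owner is an invented example of a field that was added later with a default:

```dhall
-- Hypothetical central schema; `owner` plays the role of a newly added field.
let Project =
      { Type = { name : Text, owner : Optional Text }
      , default = { owner = None Text }
      }

-- Stand-in for an older project.dhall that predates the `owner` field.
let project1 = { name = "project1" }

-- Stand-in for a project.dhall that already sets the new field.
let project2 = { name = "project2", owner = Some "team-a" }

-- `Project::r` is shorthand for `(Project.default // r) : Project.Type`,
-- so the missing `owner` in project1 is filled in from the defaults.
in  [ Project::project1, Project::project2 ] : List Project.Type
```

This only reaches one level deep: `//` replaces a nested record wholesale instead of merging it.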
The problem comes with nested fields. I wrote up a more complex example with a nested Project type, and I found a way to merge a “subset” project (p1) into a full Project type:
let DeploymentTool = < Unknown | Spinnaker | Chef >
let Deployment =
      { Type = { tool : DeploymentTool, link : Optional Text }
      , default = { tool = DeploymentTool.Unknown, link = None Text }
      }
let Project =
      { Type =
          { team : { name : Text, slackChannel : Optional Text }
          , project :
              { name : Text
              , slackChannel : Optional Text
              , deployment : Deployment.Type
              }
          }
      , default =
        { team.slackChannel = None Text
        , project =
          { slackChannel = None Text, deployment = Deployment.default }
        }
      }
let p1 =
      { team.name = "Astronauts"
      , project = { name = "Foo", deployment.tool = DeploymentTool.Spinnaker }
      }
in  [     Project.default
      //  p1
      //  { team = Project.default.team // (p1.team ? {=})
          , project =
                  Project.default.project
              //  (p1.project ? {=})
              //  { deployment =
                          Project.default.project.deployment
                      //  (p1.project.deployment ? {=})
                  }
          }
    ]
  : List Project.Type
It’s… pretty exhausting to do this manual deep merging, and it can’t be abstracted. The crux of it is that I can’t mention the type of p1 or any of its subfields, so I need to use inline expressions (like a ? {=}) in order to coerce a subset of some known record type into the full type.
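A small sketch of why this resists abstraction, using a hypothetical fillTeam helper and a cut-down teamDefault record (neither is part of the example above). A function has to spell out its argument type exactly, so it only accepts one particular subset:

```dhall
-- Cut-down defaults for the `team` sub-record (illustrative only).
let teamDefault = { slackChannel = None Text }

-- Hypothetical helper: the argument type must be written out in full,
-- and Dhall has no record subtyping, so only this exact shape is accepted.
let fillTeam = \(t : { name : Text }) -> teamDefault // t

-- Works: the argument has exactly the type { name : Text }.
-- By contrast, `fillTeam { name = "Astronauts", slackChannel = Some "#astro" }`
-- is rejected by the typechecker, even though it is also a valid subset.
in  fillTeam { name = "Astronauts" }
```

Covering every possible subset this way would need one helper per combination of fields, which is why the merging ends up written inline.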
So, thoughts:
Is this something we want to do with dhall? It feels like a weakness that the types make it difficult to aggregate disparate expressions which have compatible but not identical types (i.e. a subset).
How could we better support this?
Recursive merge operator
… with this, I could simply do Project.default /*\ p1 :: Project.Type to extend the toplevel version to work for arbitrarily nested types.
I know we’ve talked about this before, and the with keyword is usually better. But that only really works in a literal expression; it doesn’t help in more abstract cases like this.
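For comparison, here is a sketch of what the existing recursive merge operator /\ does today, using made-up records (tool is plain Text here to keep it short). As far as I understand it, /\ merges nested records but rejects any collision between non-record fields, so it only helps when the defaults and the subset never set the same leaf:

```dhall
-- Illustrative defaults that deliberately do NOT set `deployment.tool`.
let defaults =
      { team.slackChannel = None Text
      , project = { slackChannel = None Text, deployment.link = None Text }
      }

-- Illustrative subset value supplying the remaining fields.
let p1 =
      { team.name = "Astronauts"
      , project = { name = "Foo", deployment.tool = "Spinnaker" }
      }

-- Nested records are merged field by field at every level.
-- If `defaults` also set `project.deployment.tool`, this would be a
-- type error (field collision) rather than a right-biased override.
in  defaults /\ p1
```

A right-biased variant of this merge is what the operator proposed above would add.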
The ability to talk about a (recursive) subset type
Thinking here about making this easier to abstract. If (in the above code) I could pull out some of the sub-parts into functions, that would make things much more manageable.
I think this would mean being able to take an expression which is “a subset of { a: Natural, b: Natural }”. This seems like a pretty huge and complex language feature, but maybe there’s some super-restrictive form which could work.
Such an expression would imply that:
- you can’t reference a field without using the fallback operator: x.a is invalid, but x.a ? 1 is valid.
- you must produce a well-known type (i.e. you can accept subsets, but not produce them). So basically, most functions will return defaultValue // subsetValue to ensure that all fields are defined.
Neither of these ideas really deals with the issue of enums, which I’m closing my eyes and ignoring for now.
Current workarounds
The main one I can think of is going via dhall-to-json | json-to-dhall to take advantage of the latter’s deep merging and field lenience, which is hopefully not the best we can do.
Perhaps, though, if we don’t want to make it a language feature, there could be a dhall-match command which takes a value and a schema, with all the same coercion behaviours as json-to-dhall but without a roundtrip through JSON?