Pointers to profiling tools for dhall-haskell?


(Ari Becker) #1

Context: https://github.com/dhall-lang/dhall-kubernetes/issues/54#issuecomment-476202026

As Dhall has become more and more central to my workflow, it’s become a bit of a gateway drug for Haskell, as I want to get to a point where I can become an active contributor. I’m currently on chapter 5 of the Haskell book, but it’ll take me a while to get through it.

In the meantime, I have some expressions that take an ungodly amount of time to process (more than ten minutes just to get to a type error), but unfortunately I can’t share them publicly just yet. So I want to start profiling them, get a sense of whether there’s any low-hanging fruit, and maybe even submit a PR.

As I’m new to the Haskell world: can somebody point me toward tools that could help with profiling? Or anything else that would be useful to somebody heading in this direction?

Much appreciated

Edit: final parsing time came in at almost 40 minutes (2018 MacBook Pro)… ouch :face_with_thermometer:


(Gabriel Gonzalez) #2

The first step is obtaining the default profiling output (i.e. a .prof file). If you are using cabal then I believe you only need to run these commands within the dhall Haskell project to build a dhall executable that supports profiling:

$ cabal configure --enable-profiling
$ cabal build
$ time cabal run dhall -- +RTS -p <<< './example.dhall'

That will generate a dhall.prof file within your current working directory.

That file is a plain-text file that is useful on its own, but in my experience it’s structured in a way that’s easy to misinterpret, so I prefer a tool called profiteur, which takes a .prof file as input and produces a graphical user interface for exploring time and space costs as a visual tree:

If you install the profiteur tool then you use it by running:

$ profiteur dhall.prof

which writes a dhall.prof.html file that you can open in a browser.
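If the default cost centres in the .prof output aren’t fine-grained enough to localize the problem, GHC also lets you insert manual cost centres with SCC annotations. A minimal sketch (the function and cost-centre name here are made up for illustration):

```haskell
module Main where

-- Hypothetical example: wrapping a suspect expression in a named cost
-- centre so it shows up as its own line in the .prof output when the
-- program is compiled and run with profiling enabled.
expensive :: Int -> Integer
expensive n = {-# SCC "expensive" #-} sum [1 .. fromIntegral n]

main :: IO ()
main = print (expensive 1000000)
```

Without profiling enabled the annotation is simply ignored, so it’s safe to leave in while iterating.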

I did this a few months ago when investigating performance issues for the following issue:

… and what I learned so far is that the bottleneck is type-checking: if you omit the type-checking step then normalization alone is much quicker. The cost was also specifically in type-checking records or unions (it showed up in Map-related operations), so there is probably something algorithmically dumb I’m doing when processing them (like unnecessarily normalizing things multiple times).
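To make the “normalizing multiple times” suspicion concrete, here is a generic Haskell sketch of that failure mode (toy functions, not actual dhall internals):

```haskell
module Main where

-- Toy stand-in for an expensive normalization pass (deliberately O(n))
normalize :: Int -> Int
normalize n = sum [1 .. n]

-- Anti-pattern: the shared value is re-normalized once per record field
slowCheck :: [Int] -> Int -> Int
slowCheck fields shared = sum [ f + normalize shared | f <- fields ]

-- Fix: normalize once up front and reuse the result
fastCheck :: [Int] -> Int -> Int
fastCheck fields shared =
  let n = normalize shared
  in  sum [ f + n | f <- fields ]

main :: IO ()
main = print (slowCheck [1, 2, 3] 10, fastCheck [1, 2, 3] 10)
```

Both functions return the same result, but in a profile the slow variant would attribute most of its time to `normalize`, which is the kind of signature worth looking for in the dhall .prof output.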

You might also want to use that same issue for initial optimization tests, because it only takes ~30 seconds to iterate on changes there (compared to 40 minutes for the example you gave).


(Gabriel Gonzalez) #3

Also, I forgot to mention that the dhall project has detailed instructions for getting started with the various Haskell build tools (i.e. cabal/stack/cabal+Nix):


(Vanessa McHale) #4

FWIW I would use

cabal new-build -p --enable-profiling

rather than

cabal build

when possible.

You might even be able to do

cabal new-install dhall -p --enable-profiling

Then you can run

dhall +RTS -h -p <<< './Pipeline.dhall'

and you’ll get a heap profile (a .hp file, which you can render to PostScript with GHC’s hp2ps tool) and a .prof file that lists the functions that consumed the most time.

There’s also threadscope, but I don’t think you need that yet :stuck_out_tongue: