Pawel Szulc

Getting acquainted with Lens (part 1)

December 15, 2020 · 5 min read
The content of this blog post and its source code are available on GitHub. The repository is divided into small commits, so you can follow along if you prefer jumping straight into the code.
This post is based on a talk I gave at Haskell.Love 2020.
I want to personally thank @cateroxl for proof-reading and helping me fix all the typos I've created while writing this post. @cateroxl you rock!

Introduction

In this post we will explore the concept of a Lens, or more concretely, the Lens library. We will look at three questions:

  1. What problem are they trying to solve?
  2. How can we use them?
  3. How are they implemented?

What problem are they trying to solve?

It is my observation that any newcomer to the Haskell ecosystem who gets excited about the language stumbles upon two limitations of its design, namely:

  1. Record syntax
  2. Strings

These two are surprising to newcomers, especially because both record syntax and strings are considered a no-brainer in almost every other programming language. String representation in Haskell deserves a blog post of its own. Today we want to focus on record syntax and understand its limitations and why it is such a source of frustration. And frustrating it is indeed. Below are a few quotes scraped from the Internet to back that claim.

“The record system is a continual source of pain” - Stephen Diehl

“What is your least favorite thing about Haskell? Records are still tedious” - 2018 State of Haskell Survey

“Shitty records.” - Someone on reddit

An example

Someone famous once said “Talk is cheap, show me the code”. In that spirit, let’s explore an example project in which those problems are clearly visible.

We will use the latest version of GHC Haskell:

and a standard cabal project:

Imagine we are writing a tool allowing conference organizers to maintain their events. We have a datatype Conference:

where Organizer is:

and Speaker is:

Organizer has a Name and Address. Name is a simple record with firstName and lastName:

and Address encapsulates street, city and country:

Now we just need an example of a conference organizer, a value that we can play with in the REPL. While creating this blog post, I could not miss the opportunity to pay tribute to Oli Makhasoeva, one of the best conference organizers on the planet and the mastermind behind such events as Haskell Love and Scala Love.

Let’s create a value of type Organizer called oli:
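A minimal sketch of the data model described above could look like the following. The field names firstName, lastName, street, city, country, name, contact and speakers come from the text; everything else (the Contact type, email, ready, speakerName, organizer, and the placeholder address and email values) is an assumption made for illustration.

```haskell
data Name = Name
  { firstName :: String
  , lastName  :: String
  } deriving (Show, Eq)

data Address = Address
  { street  :: String
  , city    :: String
  , country :: String
  } deriving (Show, Eq)

-- Assumed: a Contact record grouping the address with an email.
data Contact = Contact
  { address :: Address
  , email   :: String
  } deriving (Show, Eq)

data Organizer = Organizer
  { name    :: Name
  , contact :: Contact
  } deriving (Show, Eq)

-- Assumed fields: the text does not spell out what a Speaker contains.
data Speaker = Speaker
  { speakerName :: Name
  , ready       :: Bool
  } deriving (Show, Eq)

-- Assumed field name `organizer`; `speakers` appears later in the text.
data Conference = Conference
  { organizer :: Organizer
  , speakers  :: [Speaker]
  } deriving (Show, Eq)

-- Placeholder contact details, not real data.
oli :: Organizer
oli = Organizer
  { name = Name { firstName = "Oli", lastName = "Makhasoeva" }
  , contact = Contact
      { address = Address
          { street  = "Example Street 1"
          , city    = "Example City"
          , country = "Example Country"
          }
      , email = "oli@example.com"
      }
  }
```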

Fetching values from records

We can observe that both name and contact are in fact accessor functions that allow us to retrieve values from records:

This can look even nicer if we use the & operator from Data.Function:

Here we see that & is simply reversed function application: instead of providing the function name and then the argument (as we would normally do):

we provide first the argument and then the function name:

This allows us to change the previous call to name from:

to:

It does not seem like much at first, but this approach composes nicely when you want to read a value from a deeply nested record:
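The steps above can be sketched as follows. This is a minimal, self-contained example; the Contact type, the email field, and the placeholder values are assumptions, while name, contact, and & from Data.Function are taken from the text.

```haskell
import Data.Function ((&))

data Name = Name { firstName :: String, lastName :: String }
  deriving (Show, Eq)

data Address = Address { street :: String, city :: String, country :: String }
  deriving (Show, Eq)

-- Assumed shape of the contact information.
data Contact = Contact { address :: Address, email :: String }
  deriving (Show, Eq)

data Organizer = Organizer { name :: Name, contact :: Contact }
  deriving (Show, Eq)

-- Placeholder values, not real data.
oli :: Organizer
oli = Organizer
  (Name "Oli" "Makhasoeva")
  (Contact (Address "Example Street 1" "Example City" "Example Country")
           "oli@example.com")

-- Ordinary application: function first, then argument.
organizerName :: Name
organizerName = name oli

-- The same call with (&): argument first, then function.
organizerName' :: Name
organizerName' = oli & name

-- (&) is left-associative, so accessors chain from the outermost
-- record towards the innermost field.
organizerCity :: String
organizerCity =
  oli
    & contact
    & address
    & city
```

Since & has type a -> (a -> b) -> b, each step feeds the value on its left into the accessor on its right.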

This resembles the dot-like record access available in other languages. Correctly formatted, it makes accessing deeply nested values a pleasant experience.

But when we make a slight modification to our code, so that Organizer is no longer the only record with a name field:

the code suddenly stops compiling:

There is one trick we can use to make it work… Have you guessed it? Yes, this is GHC: we can always add a language extension:

And things compile again!

But when we try to use the name function:

The compiler gives us a quick reality check:

It seems that we can define multiple records with the same field name, but we are not allowed to use the resulting accessor function.
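A minimal sketch of this situation, assuming simplified records in which name is a plain String (the email and ready fields are assumptions too). With DuplicateRecordFields the two declarations compile, and construction and pattern matching still work because the constructor says which record is meant; the bare selector call is left commented out because GHC reports it as ambiguous (the exact message depends on the GHC version).

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

-- Simplified, assumed record shapes sharing the field `name`.
data Organizer = Organizer { name :: String, email :: String }
data Speaker   = Speaker   { name :: String, ready :: Bool }

-- Construction is fine: the constructor disambiguates the field.
oli :: Organizer
oli = Organizer { name = "Oli Makhasoeva", email = "oli@example.com" }

pawel :: Speaker
pawel = Speaker { name = "Pawel", ready = True }

-- Using the selector as a plain function is what breaks:
--
--   ambiguous :: String
--   ambiguous = name oli   -- rejected: which `name` is meant?

-- Pattern matching also names the constructor, so it is accepted.
organizerName :: Organizer -> String
organizerName (Organizer { name = n }) = n

speakerName :: Speaker -> String
speakerName (Speaker { name = n }) = n
```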

Is that it? Are we done?

Before we give up, there is fortunately one more trick we can use. It’s called OverloadedLabels, but it requires a bunch of other extensions to be enabled. We enable them for the whole project by modifying the cabal file (along with the DuplicateRecordFields that we’ve used before):

What OverloadedLabels gives us is a typeclass IsLabel from GHC.OverloadedLabels:

We define instances of it for a particular type a and a Symbol (think of it as a type-level String). As an example, we can create an instance for Speaker and "encodepanda":

And now we can use it as we would normally use any other typeclass:

But OverloadedLabels comes with nice syntactic sugar: we can reference the symbol directly by prefixing it with a hash (#) and let type inference do the rest of the work for us.

Underneath, it desugars to a function call to fromLabel.
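Put together, the mechanism can be sketched like this. The Speaker fields and the value returned by the instance are assumptions; IsLabel and fromLabel come from GHC.OverloadedLabels, and the set of pragmas is one combination that should be sufficient rather than necessarily the exact list used in the project.

```haskell
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedLabels      #-}
{-# LANGUAGE TypeApplications      #-}

import GHC.OverloadedLabels (IsLabel (..))

-- Assumed fields, for illustration only.
data Speaker = Speaker
  { speakerName :: String
  , ready       :: Bool
  } deriving (Show, Eq)

-- The class exported by GHC.OverloadedLabels is essentially:
--
--   class IsLabel (x :: Symbol) a where
--     fromLabel :: a

-- An instance tying the label "encodepanda" to a Speaker value.
instance IsLabel "encodepanda" Speaker where
  fromLabel = Speaker { speakerName = "Pawel", ready = True }

-- Used like any other typeclass method, picking the Symbol
-- with a type application.
viaClass :: Speaker
viaClass = fromLabel @"encodepanda"

-- The same thing with the label syntax; the type signature tells
-- GHC which instance to select.
viaLabel :: Speaker
viaLabel = #encodepanda
```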

Leveraging OverloadedLabels for record access

We can now take OverloadedLabels for a spin to see if it can help us with our issue. As a reminder, the problem at hand is the fact that (even though we have DuplicateRecordFields turned on) we cannot use the duplicated accessor functions:

The idea is to provide a separate IsLabel instance for "name" for each of the accessor functions called name:

It is surprising that even though we are reusing the name functions to implement those instances, the compiler is suddenly happy. In other words, we’ve built up this boilerplate to work around the fact that different accessor functions with the same name could not be used, and yet the workaround itself makes use of those very functions. Why this happens is probably a good topic for a different post. For now let’s ignore this surprising effect and celebrate the fact that we can finally write a function like this:
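A minimal sketch of the idea, under a few assumptions: the records are simplified (email and ready are made-up fields), the helper functions organizerFirstName and speakerLastName are hypothetical, and the instances extract the field by pattern matching instead of reusing the name selector as described above, so that the sketch does not depend on how a given GHC version disambiguates duplicate selectors.

```haskell
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedLabels      #-}

import Data.Function ((&))
import GHC.OverloadedLabels (IsLabel (..))

data Name = Name { firstName :: String, lastName :: String }
  deriving (Show, Eq)

-- Both records have a field called `name`.
data Organizer = Organizer { name :: Name, email :: String }
data Speaker   = Speaker   { name :: Name, ready :: Bool }

-- One instance per record: the label "name" at type Organizer -> Name...
instance IsLabel "name" (Organizer -> Name) where
  fromLabel (Organizer { name = n }) = n

-- ...and the same label at type Speaker -> Name.
instance IsLabel "name" (Speaker -> Name) where
  fromLabel (Speaker { name = n }) = n

-- #name now resolves to the right accessor based on the types around it.
organizerFirstName :: Organizer -> String
organizerFirstName o = o & #name & firstName

speakerLastName :: Speaker -> String
speakerLastName s = s & #name & lastName
```

Note that the instances are keyed on the full function type, so GHC needs to know both the argument and the result type before it can pick one; here firstName and lastName pin the result down to Name.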

So yes, if looking at how accessing records is encoded in Haskell makes you want to do the following

I honestly don’t blame you. But remember, this is only half of the story: we still need to figure out how to set values in records.

Setting values

Imagine we have a value of type Conference:

We also have two speakers who would love to be part of this conference:

At first, setting a new value does not seem to be scary.

conference { speakers = [ pawel, marcin ] }

It seems pretty reasonable - it reuses syntax for the creation of a value. Small, nice, compact. However, the moment you try to do something a bit more complicated, things get really messy really quickly.

For example, something as simple as marking all speakers of a given conference as not ready:

Or modifying an organizer’s email address using a function:

In both examples we have to spell out every tiny detail of what the new value should look like. If that is not imperative programming, I don’t know what is.
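To give a feel for it, here is a minimal sketch of both updates using nothing but record update syntax. The record shapes are simplified assumptions (only speakers is named in the text), and markAllNotReady and modifyOrganizerEmail are hypothetical helper names.

```haskell
-- Simplified, assumed record shapes.
data Speaker = Speaker { speakerName :: String, ready :: Bool }
  deriving (Show, Eq)

data Contact = Contact { email :: String, city :: String }
  deriving (Show, Eq)

data Organizer = Organizer { organizerName :: String, contact :: Contact }
  deriving (Show, Eq)

data Conference = Conference { organizer :: Organizer, speakers :: [Speaker] }
  deriving (Show, Eq)

-- Mark every speaker as not ready: read the list, rebuild each
-- element, and put the new list back into the conference.
markAllNotReady :: Conference -> Conference
markAllNotReady conf =
  conf { speakers = map (\s -> s { ready = False }) (speakers conf) }

-- Apply a function to the organizer's email: every level of nesting
-- has to be unpacked on the way in and rebuilt on the way out.
modifyOrganizerEmail :: (String -> String) -> Conference -> Conference
modifyOrganizerEmail f conf =
  conf
    { organizer = (organizer conf)
        { contact = (contact (organizer conf))
            { email = f (email (contact (organizer conf))) }
        }
    }
```

The second function touches a single String, yet it has to mention organizer and contact twice each.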

Now what?

Is that it? Is there no hope for us? Where do we go from here? You will have to wait for Part 2 to get your questions answered. But fear not, you might get it before Christmas yet :) Stay tuned!