12: Gergő Érdi

_Wouter Swierstra:_ So welcome to the next episode of the Haskell Interlude. I'm

Dr. Wouter Swierstra.

_Andres Löh:_ And I'm Andres Löh.

_WS:_ And today our guest is Gergő Érdi. Gergő has an interesting path into

Haskell taking many twists and turns. We'll talk a little bit about that. And

discuss his recent book on implementing retro computers using Haskell.

So, welcome! And our guest today is Gergő Érdi and Gergő is famous for many

things, which we'll talk about. So both his work at Standard Chartered, his

research and the papers that he's published, and a recent book on retro

computing using Haskell. So Gergő, maybe we can start by saying something about

how you got interested in Haskell.

_Gergő Érdi:_ Right. Well so um, my connection to computers in general,

basically started back in the early Eighty's like I think everyone in my

generation. So I grew up on the Commodore 64 where I was programming in BASIC

and for some reason I never got interested in machine level computing. So like

you know, using Assembly and machine code. Everyone was basically forced to do

that on these machines because the machines were so underpowered and there were

no good high level languages for using most of the hardware. But for some reason

I really don't know why I never made that leap. So it was just BASIC and then I

tried finding some more featureful BASIC dialects.

But this was like you know, when I was a kid. But this theme of going upwards

instead of downwards seemed to be persistent and then I started using Linux. And

I started looking for some real programming languages. I used C for a bit and

C++ and then I went in the Lisp direction for a while.

And I had this, I guess, the stereotypical experience of using Lisp of suddenly

realizing that there's so much more out there than what is thought possible with

programming languages so I did Lisp for a bit.

So I used Lisp for a while but I didn't really use it. So there was no like

actual tangible end product of that unlike C++ where I got involved in some open

source project. But I really enjoyed Lisp and then I started my formal Computer

Science studies. I played around with Lisp some more but then I also started

working at Intentional Software. And there we were using C# for for all the

stuff that was not and possible in our in-house languages. So the whole idea was

that it was a language workbench where you'd make your own languages and we were

dogfooding it. So we were making our own languages for solving the problem of

giving you a tool of making your own languages but the bedrock of it was in C#.

So I programmed in C# but that was not really something that interested me.

_WS:_ But this is um, ah just to get the timeline right? So this was BASIC and

messing around in Commodore 64 in high school or whatever age up to eighteen

maybe, and then getting drawn into Linux a little bit and then you studied in

Hungary, right?

_GE:_ Yeah, so okay so the timeline was basically that says C64 was yeah when I

was a kid and then high school I but high school I I was doing I was mostly

using C++ for for open source GUI apps.

So I got involved in in Gnome and then I had this really big detour of six years

of medical school and during that I was basically contributing to open source as

a kind of keeping my sanity.

So I really didn't enjoy my time in in that school.

_WS:_ So the medical degree wasn't for you. But you kept programming

nonetheless.

_GE:_ Yeah, so I I was programming like, I always liked programming. It was

something that I regarded as like a hobby that would be cool to do on the side

but it's not necessarily something I took that I would want to do

professionally.

And so I yeah so I trained to be a doctor and then halfway through I realized

that is just not for me. But of course by then, you know, because of the sunk

cost fallacy, I decided "Oh yeah, I'm not going to waste the last 3 years. I'm

going to put in the next 3 ones and then at least get a degree out of it". So I

did that.

And then um, of course after you decide that you don't ever want anything with

it, the remaining small motivation just goes away. So it was super hard from

that point on and ah contributing to open source was a good way of doing

something fun on the side that I actually enjoy and still you know uses my brain

for something. And then basically there was like one year of overlap where I

started computer science while I was doing med school. But then I had to skip a

year at CS because basically, at least the way it's in Hungary, the sixth year

of medical school is that you're doing like six times two months of different

clinics. So you can't meaningfully go to university at the same time.

Because you you're you're almost like working. I mean you're not working but

your schedule is as if you were working in hospitals. So then I went to, I

started studying CS here in Budapest. And then I guess it was around that time

when I started picking up Lisp. And then I started working at Intentional, while

because I recognized that I still have energy left just going to University just

doing the the Masters. So I went to work at Intentional Software and there like

I said we were using C# and our in-house languages.

And there, that was a wonderful ah crowd of of coworkers there. It was like a

tiny team and everyone had formal Computer science background. And we were

working on this project that was not really well managed and there was no good

idea of what the product should be. There was no good way of "How is this ever

going to make money?". But we weren't in there for that. We were there for the

for the fun of it, for solving interesting problems.

And you know back then DSLs were not a given. It was not something that like

these days right? If you say you're working on the way of ah making DSLs or

"We're exploring the ideas behind Domain Specific Languages or Language-Oriented

Programming", these days you would say "Okay, well so what? So what is it that

you're bringing to the table?". But back then these ideas were much fresher.

_WS:_ So back then, what years are we talking about?

_GE:_ Well, I joined intentional around 2004 or five but this has been going on

for a while. So but I don't think it's particularly interesting to the

listeners. I mean I can tell you about Intentional Software for like the whole

story of that. But what's relevant for this is that there were these really

interesting people there and some of them were interested in Functional

Programming, some of them less so.

I was doing Lisp but I started really feeling the need for, well, Types,

basically. Maybe I didn't realize at that point that it was Types that I wanted

but I definitely wanted something that could make my programs more manageable.

So I have ah programs that I can't go back to because I have absolutely, like,

even though I have ideas on how to, like, there are features I would like add.

So for example, like when I started learning Haskell I wrote a, basically like

a, clean interpreter in Lisp if you want to be fancy about it. So all it does is

it implements a graph rewrite based reduction. Ah and it was written for me to

understand lazy evaluation, basically.

And I, that's something where I will, well I probably don't care enough anymore.

But like a couple of years afterwards, I would have loved to go back and do a

proper type checker for it because currently it is untyped. And I tried that and

I realized that I have absolutely no idea how it works I have absolutely no idea

if I change something here, what else is going to collapse in on itself. So this

was this was my main problem with this.

And then, so Haskell, I think, I bounced off Haskell the first time I

encountered it. It was before my university days I think, in the early 2000s. I

found something on the internet, some tutorial on Haskell and I started reading

it but I bounced off it pretty hard. I just couldn't really connect to it and so

this time was like in second half of 2000s I guess. I picked it up and because I

it was more something that I was searching for. It wasn't something that just I

just stumbled up on it was something that matched what I was looking for. And so

the reason I brought up Intentional is two reasons. One is that ah later on when

we we talk about Electronics and Hardware. It was one of the the reasons that I

got involved in in playing around it. But the other one is that basically my

interests and and and wanting this Functional Programming and didn't gel well

with the direction that Intentional was going in. Because even though some of my

coworkers were interested in in Functional Programming, the higher ups were not.

And this is so I have this story which is probably not going to be appropriate

for the podcast. But I'm going to tell you guys anyways and then then you can

just guy that so basically 1 day so happened it so but I had this coworker and

mine and we were using LINQ but abusing LINQ really to to to write things in a

Functional way right? And we probably were going overboard with it because we

had this huge processes that were really just composition of LINQ operators

instead of writing your proper C# and so um.

_WS:_ Um, yeah, so LINQ is, just for our listeners, is this is the library that

and from Microsoft for doing is it SQL queries in a functional way or...

_GE:_ Originally it was for SQL queries. But you can use it as just for like

in-memory ah stream processing. Basically you could say that it's a Monadic DSL

for the kind of data manipulation that you do in SQL but it's not necessarily

backed by SQL and so we were just using it for lazy stream manipulations. And

then so one day I go into the office and you know I put the latest version of

the code and this was we were using subversion. So it was not git so you know it

was like a global, like, the branch that we were using. So I pull and I see that

in the codes and you know, not in a commit message, not in an email, not in an

issue in the bug tracker, but in the code, in the code that this coworker of

mine that I wrote. The lead engineer added this comment on top of one of our

functions which said verbatim "If you want to program in Lisp, go start your own

fucking company!". And this one this made it to the base right?

So basically I realized, Okay? Though if I like that the things that I want to

play around with I'm not going to be able to do that there. And also, I at that

time, I was nearing graduation so like okay well as soon as I get my degree I am

basically becoming mobile. I don't need to stick around so I can start looking

in the wider world and not just whatever is interesting locally. So by the end I

did my Masters on an alternative Type checker for Haskell like languages, a

compositional type checker. I can talk about that if you guys are interested. So

that was like, the final kick to me in deciding that Functional Programming and

in particular Haskell is what I want to do.

So then shortly before graduation, I started looking around for Haskell jobs. So

I found this Ad from Standard Chartered which I didn't know anything about.

_AL:_ I can briefly interrupt and just ask, I mean like, because we were at the

point where you were at the end of your your Masters and you were just fond of

this, did you, did you not consider just staying in academia?

_GE:_ Not so. Not really. I mean, so now I'm thinking that maybe I should have,

but so my perspective at that point was a bit limited on the possibilities of

that to be honest. So I would definitely wouldn't have wanted to stick around at

my University. And I was not involved in the the the academic world so I didn't,

like, I didn't really know anyone at any other University. So I had this one

example in front of me and that was a negative example. There was example that

time, "Okay, you don't want to do this. You don't want to get stuck in something

like that". And so because of that, I was like "Okay, I'm not going to try to

stay here". If I go somewhere else and it's similar. Okay, maybe it's going to

professionally more interesting but financially is probably going to be quite

poor. I should probably just go out in industry and try to find something that's

fun to do there.

_AL:_ Yeah sure. I mean why not? And then probably also because you had your

medical degree already, you just thought that you've spent enough time at a

university already.

_GE:_ That's at that point I was like 11 years in my studies. I'm really looking

forward to just retiring and probably and going back to academia is what I would

like to do now. So my perspective has done a full hundred eighty degree turn.

And I hope that once I finish my working life, I hope I can go back and do a PhD

in my area of Computer Science. But that's in the future. It's not like I have

any plans about it but I have completely changed. Working 10 years in the

industry has completely soured me on the idea of working in industry.

_AL:_ So back to the point where I interrupted you, and try to, like, sketch

these these ten years in industry a little bit more. And we were just telling us

about how you found Standard Chartered...

_GE:_ Yeah, so I literally just just stumbled upon their ad in the industrial

users of functional programming or something like that. That's a thing right?

that there's something like that.

Yeah, and then there was this web site where they had these these job ads and I

didn't know anything about Standard Chartered that they're not present in in

Central Europe. I never heard of them and the the job ad was basically for

either London or Singapore. So I applied as a complete nobody. No one knew me. I

had no networking at that point. And basically I was told that so okay so we did

like the technical interview.

I talked to the hiring manager, who was Neville Dwyer who Lennart also mentioned

on his appearance on this show and he was doing so he was in the process of

setting up the the second Haskell team at Standard Chartered. So they already

had the core team that Lennart talked about at length and at this point they

started building the team that would do like the last mile development of

getting all this nice technology to the hands of the the traders. And so at that

point it was really up in the air whether I would end up in London or Singapore.

The idea was that I would have to be in Singapore for nine months. Because we

would like to build up the team. Everyone would be there and then so that's what

that's for about nine months. But about 6 months in we would start the

discussion on where everyone wants to be.

And so the the team was just starting out at that point. I think I was the

second or third hire and I didn't know anything about Singapore. I didn't know

anything about Standard Chartered but I knew that using Haskell in finance

sounded like an interesting thing. And so I did the interview and everything.

The timing was perfect because what happened is that by the time I, so HR was

notoriously slow at Standard Chartered at that point.

And I was through with the interviews and everything, I got the offer

everything, but I was set to start like four months later. And I was on a one

month notice at Intentional. So my plan was to basically just you know, resign

one month or one and a half months before I would start in in Singapore. That

was the plan. But what happened is that the founder of Intentional Software

decided that they finally want to start making money on this whole enterprise.

So they hired a CEO. And then he, like, you know, "what do you do when you're a

new CEO to a company that's not making any money?". You go through the various

offices you kind of get a sense of who's doing what and then you try to get rid

of most of the people right? So that's exactly what happened.

So the Budapest office got, half of us got laid off and that meant that I got

severance of, you know, I don't know, several months, like, 3 months severance

pay even though I was already, I already had the contract that Standard

Chartered signed. I was just waiting to put in my resignation. And that was

great because what happened because of that was that I had this two month or

three month window of not doing anything. So I already finished my thesis at

that point. I had to defend it, but the actual writing was done. I had the move

to Singapore lined up and and planned out but other than that it was just smooth

sailing. So I started like cooking a lunch every day and stuff like that because

I had all the time in the world. It was very idyllic. And so then I went to

Standard Chartered and true to his word, of course 6 months in Neville asks me

if I want to move to the London office for longer term.

But by that point I was completely all in on staying in Singapore. I really took

to it very quickly and I really liked it, so I decided that this is the place I

want to stay.

And basically everything worked out so well that 10 years later, I'm still, I'm

not in the same team, but I'm still at Standard Chartered and basically doing

the same or working on the same project as when I joined.

_WS:_ Um, so what do you like about Singapore that you want to stay there? I

mean not everyone moves from Hungary to Singapore.

_GE:_ So the thing about Singapore is that it's very easy to like it. I'm not

sure you can love it but liking it is very easy and it's a very convenient place

to live if you're working.

Like, I don't plan on staying there for the rest of my life. I don't plan on

retiring there. But for the foreseeable future I like living there and and

working from there. I mean these last 2 years were a bit different.

On one hand the Coronavirus responses of Singapore was, I think, very good. On

the other hand, the traveling situation, the fact that even though we're on a

working visa both me and my partner, we still can't necessarily go back. So

basically there's an ad hoc process that you have to go through to get a

pre-approval for your re-entry into the country. So that's not fun. So right now

is the first time we've traveled since 2020 March because of this. But on the

other hand, once you're in, I think their response has been really good.

Like, I mean, the numbers back them up. If you actually look at the case numbers

and number of deaths.

_WS:_ Yeah, so one interesting thing to notice though is that we've had this

talk about going back and forth between industry and academia a little bit, but

you've managed to publish quite a few papers at places like the Haskell

Symposium, despite apparently you working in industry full-time.

_GE:_ Well, I think there's only one paper that I actually managed to get

published. That's like a real paper and but yeah, so during my time in industry

I tried to keep up doing all the fun stuff that I used to do beside work. So I

was thinking that just because my work is now in the same area, that shouldn't

mean that I give up on, and also doing it for, the fun of it because of course

there is a world of difference in freedom in what you do on your own and what

you do as an employee even if it's roughly, you know, you say "Oh, it's Haskell.

It's Haskell, what's the difference?". But of course it's can. It's can be

wildly different.

And so the real paper the one that I actually got published at the Haskell

Symposium was the the Pattern Synonyms one. So that one came about first as the

actual work. So I implemented Pattern Synonyms in GHC. Because I was just

browsing the GHC bug database just looking for something fun to do.

_WS:_ To do in your spare time after programming Haskell forty hours a week,

it's you know, you come home and then you want to do some more so you look for

GHC feature requests :-)

_GE:_ Ironically yes :-)

_WS:_ As one does. Yeah okay :-)

_AL:_ Is every GHC feature request is a paper? :-)

_GE:_ Yeah, so that's exactly what happened afterwards. So once I did the the

work, GHC language features or language addons do, in fact, have a tendency to

become papers. But I didn't come up with the idea of the paper. I think it was

either Richard or Simon. I can't remember which of the two who contacted me

about this. And then I realized "Okay, that's my chance. Because if I can't get

a paper done with Simon Peyton Jones and Richard Eisenberg and Matt Pickering,

like if, if I can't write a paper with these three people standing next to me,

then what chance do I have of ever getting anything done?". So of course I

jumped at the opportunity and wrote the paper. And it even had the, it was the

whole experience. So there was even the going home from the office instead of

going for lunch on the last day before submission because there was still like

half a paragraph too long for to given size...

So it was It was a real paper submission even though the work was done years

before that. And because of that paper I went to to ICFP to present it at the

Haskell Symposium. And that was like a whole new thing for me. I've never been

to an academic conference before and this was the one in Nara. So beautiful

place I think like the best introduction I could have gotten to the world of

conferences and like half a day in and I was already realizing that "Oh, This is

great!". Like this was missing from my life, I want to do this. But I don't know

what doing it would mean, so I just started attending them and started talking

to people and and meeting a lot of people who were only names on papers before

that. Because I tried keeping up with the literature but it's so much better if

you can put faces and put stories and discussions next to the names.

_AL:_ But you haven't just, like, continued the pattern then and then look for

the next feature request and...:-)

_GE:_ Ah there's still stuff to do even on Pattern Synonyms and that did the one

thing that I think would be a substantial addition would be Value Parameters

instead of Pattern Parameters. So these would be basically allowing you to pass

arguments to View Patterns when you use a Pattern Synonym and that's something

that I think is interesting enough.

Yeah, that's interesting enough that one could be worth a try.

_WS:_ What would be an example?

_GE:_ Like you mentioned that you make a Pattern Synonym that checks if

something is... so okay, here's an example. Say you have a Pattern Synonym which

looks up a key in a dictionary and gives you the result of that. So the scrutiny

type is a dictionary and the thing that you're matching on is the result of

looking up something in the dictionary.

And if it's just a value then the pattern matches. So you can do that today by

breaking a Pattern Synonym that uses a View Pattern in its definition. But what

if the key that you want to look up is something you want to be a parameter to

the Pattern Synonym? So then you would want to pass this thing as a parameter to

Pattern Synonym, but it's not a sub-pattern right? It's a value. It's the key

that you want to look up so it's a completely different thing.

And I don't know what the syntax should be for passing it to Pattern Synonym,

because Pattern Synonyms are already quite heavy, especially on the syntax of

their types.

_AL:_ Yeah, there is this infamous, like, the double constraints for the GADT

Pattern Synonyms where you have Given and Wanted constraints or whatever they're

called and you have a double constraint and and nobody can ever, nobody I know

can ever remember and which order for it to come :-)

_GE:_ I think I have a, either some Github issue or a Stack Overflow question

where I had a problem where the solution was that actually they are the other

way around. So even I don't know which one is which.

_AL:_ I think it was even changed at some point right? I mean I think there was

a GHC version where it had it the one way around and then it was flipped around

which I think is a subtly better order. But I mean it certainly added to the

confusion.

_GE:_ But that was around the time when this thing was actually added to GHC,

right?

_AL:_ Yeah, yeah, that was in the, like in the first few commits they were

added. Yeah, but anyway I mean I do think it makes sense as you're saying. I

mean so with the with View Patterns. The expression on the pattern language are

actually intertwined right? So you have, like expressions occurring as part of

patterns and currently Pattern Synonyms can only abstract over other patterns

and I think it would be useful to have abstraction over values. I mean another

thing that I sometimes want but I don't even have the idea of how this could be

realized as sort of Pattern Synonyms that could be computed somehow, which is

also sort of in the direction of abstracting over both patterns and over

function arguments that you...

_GE:_ Ooh, but that's ah, that's a whole different direction. That sounds like

something that Connor McBride could have ideas around because I remember he had

some talk in Oxford where he had to talk about, basically saying that pattern

and matching should be an effect should be regarded as an effect just like

anything else and we should be able to to have higher order if you have higher

order effects. We should have higher order pattern matching and should do to

make something whose effect is either matching or not and then you can keep

going and try other matches and that's probably there. But I don't think Haskell

is suited for these ideas.

_AL:_ Yeah, possibly not, possibly not.

_GE:_ And the other pattern synonym feature that I remember seeing but never

getting the round to implementing would be the associated pattern synonyms. So

basically having type plus polymorphic pattern synonyms.

_AL:_ But I think that's, in my perspective, going into the same direction as

computed pattern synonyms because in a way like the Typeclass resolution

mechanism is a form of computation even though it's a relatively...

_GE:_ Yeah, but it's static right? It's all at compile times. So.

_AL:_ Yeah, that's fair. I think, I mean that would be a good first step.

_WS:_ Yeah, that would be interesting right? Because then you could have things

like abstract data types perhaps, right? where you define a Typeclass together

with some interesting or elimination principle or pattern matching without

exposing the implementation. That would be, that'd be quite amusing I think.

_AL:_ Yeah, I mean, this is always something that, I mean, I've been working for

a while on this data type generic programming stuff. Of course and then the one

thing that is occurring there is these data type generic data types. So you can

have like tries that are indexed by multiple key types or zippers that are like

depending on the structure of the type. And if you define them generically as

abstract data types then it works kind of okay but they're never fully first

class because, like you can, you can define functions that work generically over

them. But you cannot pattern match on them without revealing their like sort of

strange internal generic structure. And there you would really want something

like um, yeah, associated pattern synonyms or computed pattern synonyms. Yeah,

that would be would be cool. But I mean I've never, I've never, I've never

figured out how to do this. I mean I'm not sure whether but you have so...

_GE:_ But so the nice thing about the implementation in GHC is that ultimately

Pattern Synonyms are compiled into normal functions. So there was no Core

changes needed for Pattern synonyms and of course at that level because is just

a function. You can have any, any way that you can currently come up with a

function. You can come up with a pattern synonym the same way. So if, if you

have Typeclass methods, and you do of course in Haskell, then you should be able

to use the same mechanism to have Pattern Synonyms that are, that come from a

Typeclass because it's just another function that takes a dictionary as an

argument. It's just, it's really all in the front end where this has to be

worked out. But at least we know what it should elaborate to in Core. Unlike for

computed pattern synonyms for I don't have that great an idea of what it should

do...

_WS:_ Okay, um, so let's talk about your book. So you recently self-published a

book called "Retro computing with Clash - Haskell for FPGA hardware design" and

that's not a typical topic for a book on Functional Programming. So can you,

maybe, say something about how this came about?

_GE:_ Right. So this goes back to my days at Intentional Software. Some of my

coworkers there were interested in electronics like, the old school Hobby

Electronics stuff. The pre-microcontrollers stuff. And I had all this curiosity

and the time but no knowledge of this stuff. I got and pulled into this, and,

you know, started building breadboard circuits of LEDs and whatnot. And that was

fun! And then we started, because we had all this this free time at the office

that the company was so mismanaged that we didn't have much to do that.

So we started this fun project of designing a CPU that would use Brainfuck as

its machine language, designing it from discrete logic gate components. And this

was, so we were using the wrong tools for the wrong job because what we ended up

doing is using this Java GUI program called Logisim where you can simulate

digital circuit by drawing it out. So you know you take your mouse and you click

on a palette of components and you drag it onto this grid and you know you then

draw the wiring between them. And the reason I'm explaining it in so much

excruciating detail is because working with this tool was exactly like that.

You had this feeling that you're doing nothing else but just putting miles on

your mouse. But it was fun to you know it gave you simulation immediately. It

was fun to to play around with it. But what it was lacking was abstraction. And

we built, basically of us went off and built our own version of this CPU and it

all kind of worked. But of course there was no way to actually implementing that

on real hardware with the kind of approach that we were using. So even though it

had no abstractions but Logisim at least had buses and I remember at one point

where we were looking at this design where we were iterating on simplifying it

with the aim of at some point building it in hardware. And then, you know, one

of us pointed at a particular line like a single line on the diagram and said

that "But you guys do realize that this one line would require 32 soldering

points total if we were to do this on on the real board because it's actually a

16 wide bus?". And you know, it just, it was never going to happen. We were

never getting to get to the point of soldering this monstrosity together. And it

was just something that we kind of abandoned. And I kept it at the back of my

mind. And then much later, I was already in Singapore, I think it was 2012 or 13

when I started playing around with Lava, just to get a feel of FPGAs and get to

the point of "Okay, well, what does this even look like?".

_WS:_ Yeah, so Lava is this domain specific language for hardware design in

Haskell and it's implemented like a deeply embedded language. So there's this

data type for circuits where you say you have various gates and so forth and

then you use Haskell to kind of programmatically assemble circuits in this data

type. Is that right?

_GE:_ Yes, so basically you're using Haskell as a macro language in the Lava

approach. So you write the Haskell program that imports Lava as a library and

your main function is going to be something that when run it will write a file

containing the hardware description language representation of your circuit. So

that's the Lava model in contrast the Clash that I'm going to get into in just a

second. But so for Lava, so I started playing around with it. Kansas Lava

specifically because there is, Lava is like a whole family of languages. There

are several implementations and I think at that point Kansas Lava seemed like

the least bit rotten version on Hackage. And so I started playing around with

that. I got an FPGA dev board that had lots of user friendly IO. I didn't want

to get something where I would also need to get a soldering iron and start

messing around with that stuff. So this was the this was the board that had push

buttons and seven segment displays and VGA outfitted everything on the board so

that was great for that. And started using Lava. And so the first project of

course that I wanted to do was the Brainfuck CPU because this was at that point,

you know, like, four years just eating at my brain at the the back of it and

"Oh, but you should go back to it at some point".

So of course there was a long road even to getting to that point I had to figure

out how to blink an LED, how to show, like, a single number on the 7 segment

display and all that. But it kind of worked out I made the the Brainfuck CPU to

work pretty much based on the original design and so that went well. And then my

next idea was to build the CHIP-8 implementation.

And the reason I'm bringing these up is because if you look at the table of

contents of my book, we're doing pretty much the same projects, but in clash,

because this really was my journey originally, just in the context of Lava. And

then why doing the CHIP-8 is, there was one point where I realized that

basically Kansas Lava, like no one is really using Kansas Lava. And the reason

that I found out about this is because I found that if you use XOR in your

circuit then it simulated as XOR but it's compiled as OR. And to me that was a

sign that yeah, if no one has noticed this, it's not some exotic operation

right? It's it's one of the basic two input binary operations. And if no one has

noticed it, that must mean that no one is using Kansas Lava. Yeah, yeah, that's

what yeah, ah.

_WS:_ Yeah, so need not be in anger right?

_GE:_ Probably I should have picked on this fact from having to become the

maintainer of Kansas Lava to even compile it on a modern enough GHC back then.

So I had to fix a bunch of stuff and then Andy Gill talked to me about maybe I

should just put my versions on Hackage under the real package name. So basically

I became the maintainer of Kansas Lava and then I fixed this bug and maybe there

was a bunch of other small things. But there was no, so there were no deep bugs.

The XOR one really was just the case of some copy paste programming in some

compilation template. And that's the complete opposite of my experience with

clash that we're going to get into next.

This was Lava and then I got as far as building a Commodore PET so that was a

6502 CPU and text-based screen and enough peripheral driver chips that the

keyboard would work. So it wasn't like the complete machine I never got to like

Disk I/O or anything like that. But at least you could type in BASIC programs on

the PET. And that one had a very interesting bug that I spent one and a half

months of just afternoons and nights trying to debug. So you turned on this

Commodore PET and you typed in a BASIC program that would print an infinite

loop. You know, like the classic BASIC 10 print something 20 go to 10. And you

would run it, and after a couple of seconds it would break. I mean break in the

sense of as if you you pressed something on the keyboard to break the running

program to break out of the loop. But instead of saying that "Okay, you broke

out", it printed the nonsensical BASIC error message something or some illegal

quantity error something.

So the problem of course is that you have this machine that you can interact

with by inputting BASIC programs and you see the output as video stream going

out of VGA but the actual bug is going to be somewhere in your CPU definition.

So you're so many layers removed from where the bug could be versus how you can

interact and what you can observe. So you need to somehow cut through this. So I

needed a way to simulate the whole machine, because initially you have

absolutely no idea where to start.

The only thing you know is that if you type in this program and you keep it

running long enough then something bad happens. So for starters, you need to

simulate the whole machine end to end. And one thing that happened a lot was

that it turned out that if I type in the BASIC program automatically, so

basically if I change my computer so that the keyboard driver instead of

handling the real Keyboard, it just streams the key presses required to type in

this program, that was enough to trigger. At least there was no interactivity

needed. You could just boot up this modified PET that types in its own program

like in an 80s hacker movie where you know you can see the things appearing on

screen letter by letter. So it was still crashing. So I started running it in

various HDL simulators and I couldn't find anything that was both fast enough

and flexible enough that I could program it to try to stop at the right point.

Because exactly again because of this big distance between where the bug occurs

and...

I found a, there was like a a free software simulator. I can't remember the name

of it but there was like a GCC frontend. Basically so's say took HDL and used

the GCC compiler pipeline to compile that into simulation. And that gave good

enough performance if your observations were small enough, but it scaled really

bad with the the number of signals that you were observing. And I found no way

of specifying the signals that I'm interested in statically. I had to do it

programmatically, basically. I could, like, dynamically recognize that these are

the signals that I'm interested in.

And the reason I'm telling this whole story is because, back in University, we

had a course that used Ada, (is that high pronounced, Ada?) as the programming

language for object oriented programming I think? And I didn't expect to ever

use that knowledge. And then this GCC Frontend happened to be written in Ada.

And I had to hack it to hack my own signal filtering stuff into it. And that was

a great feeling of of using this forgotten knowledge of this language that I

have never interacted with just for this one purpose and then never touching it

again.

And so the reason that the whole thing took one and a half months was because,

well, first of all, a lot of it had to run overnight just to get, so you know,

simulating this couple of seconds of runtime. And then I had to try to find out

"Okay, so what are the relevant things? What are the relevant things, how much

of the machine itself I can just cut out?".

And it turned out that the error was that, so on these old computers, you

usually have the the video system connected directly to the CPU for timing

purposes. So you have an interrupt request that is triggered by the video

subsystem, for example, at the end of each frame. So that the program can switch

over to doing computational stuff and not have to worry about what is in, like

having updated the screen memory contents by the time the the video subsystem

would read from that. And basically, that's how you write a game on these

systems, right? You would time everything from this interrupt.

And so what happened is that the so so this interrupt came in and, it would

happen, you know, sixty times a second. And in four seconds, so that would be

roughly two hundred plus times. And one of those times it hit the CPU while it

was running this BASIC program of, you know, just printing, looping a print. It

would hit the CPU just as it was entering another interrupt handler. And that

one I think was the software triggered one because on the 6502 you can

interrupt, you can trigger an interrupt with a `break` instruction. And so what

happened is that the least common multiple of the the BASIC interpreter loop

doing its thing and the video subsystem drawing the screen, the least common

multiplier worked out to requiring two hundred plus iterations to finally hit at

the one cycle where there was a bug in my implementation of entering an

interrupt handler in the CPU. So that's why it took so long to to track this

down. But finally fixing it was a great feeling. And of course afterwards I

could test it much easier than having to wait for several seconds because I knew

what it was.

_WS:_ Yeah, so I mean, what you, but this was still implemented with Lava or was

this already and yeah, but then you decided at some point to switch to clash

which is this other domain specific language for hardware. But it's very

different of course.

_GE:_ Yes, so the way that happened is, the reason I never finished the PET

fully, the reason I never did the the disk drive and the the I/O chip that

implements, like, serial communication and stuff was because it was becoming

very unwieldy to do these chips in Lava. So for the CPU itself I managed to to

build up enough, like, a library of some utility functions that made it somewhat

bearable to use it. But the big problem for me with Lava is that because it's a

DSL where you're producing this single big term, like, under the hood, what

you're doing is, you're producing this one big term of a circuit representation.

You can't add your own types into what goes on in that term. So you can't write

a circuit that pattern matches on a data type that you define yourself in

Haskell.Even though you know what it should mean because you know that, so you

can pass values around in Lava. Lava already has a concept of representing

values as fixed size bit sequences. So you could, like, you know how to pattern

match on it.

And so what I ended up doing is, I built some utility functions where,

basically, we pass it a total function from all the possible values of your data

and it would exhaustively apply it on every possible value and build a circuit

that was like this branching structure.

But that's of course exactly what you want Lava itself to do for you. And so,

basically it get to the point where it was just not fun anymore to try to finish

the PET.

And because I was doing it for fun, I just dropped it for a while and I started

doing something else. I can't remember what the next thing was, but I always had

some harebrained project that was fun to work on.

So I dropped it and then several years later at a Singapore Functional

Programming meetup Ellie Hermaszewska was presenting about Clash. Which I think

she used even professionally at Myrtle.

But her presentation was like this introductory thing of "Okay, so this is clash

and this is how you would do something in it". And now as a complete

coincidence, the example that she used was a Brainfuck CPU, so of course that

that immediately connected with me.

But basically I was looking through through and her slides and her presentation

and I realized that Clash seemed to solve exactly the problems that I had with

Lava. So what is Clash? So Clash is a GHC backend that emits hardware

description. So the difference is that like in Lava you run your program to get

your circuit description. In Clash, you just compile it, and the result of the

compilation is, instead of getting a like an `x86` executable, you get the

hardware description.

And what this allows is that every type you define in your program is also of

course available to a GHC backend. So Clash has absolutely no problems

translating, pattern matching on your type into the appropriate circuit. On the

other hand, there are some repetitive structures that you can easily express in

Lava by just writing a normal term-level Haskell program that generates like a

macro, that structure. But if you do it in Clash and you want to write it

statically, you have to either do some type level programming to achieve that or

hope that the Clash simplifier will pick up on it and be able to flatten it into

a fixed size circuit.

And there's one situation for me where that broke out in Clash. In my book,

there is one chapter where we have to use Template Haskell to, so you know, in

Lava we could use Haskell. In Clash, we have to use Template Haskell to do this

macroing because there's one thing that I just couldn't get working statically

with Clash. And it's this longstanding open ticket now on the Clash bug tracker.

But when it works, Clash works really really well.

And you know, after my experience with Lava it was just such a, like you know,

you use a tool and you build up this repository of frustrations you have with it

and then something else comes along and it seems to address exactly those

problems.

_AL:_ Yeah, so just for my understanding. So you're saying Clash is essentially

a GHC backend. So um, it starts at the level of GHC Core. If that so, it still

uses the normal GHC optimizations or does it insert it's own optimization path?

_GE:_ So Core, of course, can still be recursive but you can't have recursive

hardware. So Clash does its own transformation after, like on Core's, it takes

the Core output from GHC, but then basically tries to supercompile it into

something that's fixed size.

_AL:_ Right. Does it first let GHC do its thing and then takes Core or does it

take Core immediately after basically type checking and desugaring?

_GE:_ No no, no no, it lets, I believe it lets GHC do its own thing.

_AL:_ Because I mean, I'm a little bit worried that this approach gives you

potentially far less control over what kind of stuff you're ultimately left

with. Is that not an issue? I mean if, like I mean, GHC's optimizer can be

unpredictable at times right? And, I mean, at least if you're, if you're having

an embedded DSL, you're computing some value of a data type. And that's the

value that you're computing, so I mean the semantics is entirely clear.

Whereas here you have what feels like this black box in between which can do

really really much good, but potentially also um, quite some bad things. And but

I mean, I don't really know, I haven't, I haven't tried to use Clash so you seem

to say it's not an issue in practice.

_GE:_ Well, okay so, well, I can't really claim that it's not an issue in

practice because the kind of projects I've been doing were not really dependent

on squeezing out the last drop of performance or the last drop of of gate count

from the FPGAs I was targeting. So this is really a question, I guess, for the

Clash developers.

At the high level, Clash's model is that the, how the pure functions correspond

to circuits is a, like I should put it, like basically if, if you have a pure

function, it is going to turn into a circuit where the input, where the

arguments become the input lines to it and the result become the output lines.

I find in that sense there is a mental correspondence. And and Clash is never

going to, like, insert registers for you. So at that level there is a

correspondence between the Clash code you write and the the HDL you get at the

end. But inside a function, the combinational circuits, how they get optimized,

I think, I think you have a very good point in that the optimization can mean

that you might not get exactly the the circuit that you want. And but like okay,

so like if, if I have a circuit, like, if I have Clash code where I say that

"Okay, so this circuit, it adds up these two eight bit unsigned numbers", like

that already you could say "Okay well, but I don't know what that corresponds to

because is it going to be like a normal you know, 8 times Propagated Carry

Circuit or it is going to be some kind of tree structure where there's less

propagation delay?".

And so I think there is a level of level of abstraction where you can say that

what you write in Clash corresponds to what you get in the HDL is probably

higher than use cases that you might imagine in your question could be happy

with. And I guess in that case, you have to look at what exactly goes on in in

the optimizer. But there is a level at which there is a direct correspondence

and that correspondence is, like I said, that your pure functions will

correspond to the combinational circuits and the registers will only be there if

you yourself in your Clash program have put in registers.

_WS:_ Yeah. So in your book I think one thing which impressed me and this has

come up tangentially a few times is, I could imagine doing this up to the point

of having the basic instruction set working, right? But, I mean what you do is

not just the basic instruction set -- it's the parallel I/O, it's handling the

graphics, the interrupts that you mentioned, it's the, it's the complete story

of how to get one of these chips running or chipsets running on an FPGA. So I'm

kind of surprised, well maybe surprise isn't the right word, that it's still

fun. If it's a hobby project, like, is that, is that still fun to do all that

nitty gritty stuff?

_GE:_ So I think the large part of the fun comes when you make these components

and because they are correct, you can then use them in place of the original

components.

And this is something that I try to stress in in the book in several places. For

example, in one of the chapters, we've written a Space Invaders arcade machine.

And it's a surprisingly short chapter because by that point in the book, we have

already written an Intel 8080 CPU and the real original Space Invaders arcade

machine uses an Intel 8080 CPU plus some very simple peripheral chips. So if we

implement those peripheral chips which is not a lot of work because the most

complicated one is the video chip. But by that point we have also implemented

similar video systems for other computers. So at that point we can just take the

CPU as, we just import it as a Haskell library, the definition of the CPU. We

write these very simple to implement peripheral chips because they are simple to

implement because, you know Clash gives you all these nice tools of abstraction

that Haskell has.

So you can reuse a lot of the code that you wrote for a different machine. So

you put it all together and because we have written a complete Intel 8080 CPU,

we can just take the original arcade machine's firmware and run it on our

machine. And we get Space Invaders! And it almost feels like you get it for free

because you didn't have to do that much Space Invader specific coding and yet

the end result is this full machine that you can interact with.

So I think, and that's the reason that the book is structured in such a way that

we build up some reusable components. But then there's a project chapter where

we do nothing groundbreaking, we just take whatever we've put so far, and we

assemble them in a way that out pops a full computer or a full circuit, because

some of the projects are not even computers, and so I think there's a ton of fun

in seeing programs that people have written 40 years ago for the original system

and seeing them run for the first time. And you know, I'm not talking about

Space Invaders, like even something that, just, like any 8080 program right,

that you can run on your cpu and have something that you can observe from the

outside and it's working.

So taking that, and knowing that the person who wrote that didn't write that

knowing the quirks of your implementation, and yet it still works, I think

that's the satisfying part.

_WS:_ Yeah, maybe. Tastes vary, I guess. So another question I have is: are you

familiar with the book "The Elements of Computing Systems" I think it's called?

It's um, it's sometimes called "NAND to Tetris". So I think it's one of my

favorite CS books, right? So what the book does is, it starts by saying "Here is

like an AND and an OR gate and here's how we can implement something else some

other simple gate using these things". And then it says "Oh we can now do adders

and then we can do memory and we can have a little CPU". And then each chapter,

it builds up this new level of abstraction, right? And then it goes up. And then

to a little instruction set and then a little language which compiles into that

instruction set and then you know the end chapter is that you write a little

Tetris game in this mini programming language. And then, but you know, kind of

in principle, how it gets, goes all the way down to, back down to zeros and

ones.

So my question is, from a, I mean, this is fun if you're interested in playing

Tetris, but I mean, we're programming language geeks. Of course, so could you

not imagine that once you've built up this little microprocessor and that the

next thing that you want to do is a little kind of Haskell compiler and then

have a little kind of Clash dialect which works on Intel 8080 microprocessors?

So you can bootstrap your own programming language ecosystem from the ground up,

right?

_GE:_ Well, at that point you also run into the limitation that, like I said at

the very beginning of this, that I never really went downwards back in my eight

bit days. So I couldn't really write an efficient and non-trivial program for

these old eight bit micros. And I guess that's one of the reasons that I wanted

to implement these full systems that match real world systems. That way I didn't

have to care about the software that runs on them. I could take the library of

existing software. But yeah, sure it would be a fun project. So, so very good,

from NAND to Tetris you would have NAND to Clash or NAND to Haskell or...

_WS:_ Yeah, and then back right? Where you could then, so Lennart always says

that compiler is only ever mature if you have the compiler kind of eating it

your own dog food and using it to compile yourself. This is kind of taking that

idea to the next level where you have an entire architecture plus the target

compiler for that architecture plus everything is self-hosting.

_GE:_ Yeah but there is a shortcut to doing that though which would be to

implement, you know, not something like an 8080 but like a RISC-V in Clash. And

I know there are existing small RISC-V variants in Clash. Because then you could

take something like real Haskell, real GHC compile that to RISC-V which is

probably a lot simpler than trying to build an a implementation and then use

Clash on that. So, so basically, like, you want to meet in the middle and I say

instead of pushing Haskell all the way down to an eight bit micro, you should

lift the the hardware to, like, RISC-V and then you have an easier time. Even,

like a simple version of the RISC-V, and have an easier time meeting in the

middle.

_WS:_ Okay, maybe that can be a next book, I guess. Then there's something to

look forward to.

_AL:_ Perhaps before we wrap up, like, how was it working with Clash? Clash is

still, in itself, if I understand correctly, a project that is an active

development and it is perhaps to some extent experimental but it has a sort of

enthusiastic core of contributors of which you're probably now a part of. Do you

want to say something about that?

_GE:_ Yes, so when I started, when I had the idea for the book, I didn't start

writing the book.I started writing the programs that would make the backbone of

the book. So idea was that I would have all the programming done upfront before

I write the first sentence of the first chapter. And in that process I found, of

course, bugs in Clash and that there were like several weekends where my only

output was just Clash bug reports. But I tried to make them as informative and

self-contained and small as possible and they really liked that, because most of

their users are like proprietary chip developers who can't share their code. So

they would have a problem, but they can't just say "Oh, and by the way, I've cut

it down into this twenty line example and put it in your public bug tracker for

everyone to see and come up with ideas", whereas that's exactly what I was

doing. So that's how I got involved in the actual Clash project because I just

kept hitting these these bugs. And then when I decided that this is too good not

to share with the world, like, I need to write this book or at least I need,

either I need to write like a long series of blog posts or something but this is

too good not to tell the world. So I'm gonna write a book about it.

So when I decided that I'm going to write this book, of course I talked to

Christiaan Baaij, the Clash main developer. And this was back in Berlin 2019

ICFP, I told him that I'm thinking of doing this book, and he was very

enthusiastic about it. You know, at that point I was just another Clash

contributor, but he really liked the idea of the book. And I've been telling

people that I've basically been getting commercial level support from them in

the sense of, you know, like I would have some bug on a Saturday and by Sunday I

would have a fix from it.

And there was at one point, but so the big bug, the one that I mentioned before

that I eventually needed to use Template Haskell to work around. So because no

one really understood what was going wrong there. It was just a Clash program

that completely blew up when you tried to compile it and you ended up with this,

you know, huge circuit that kept growing and growing and growing until it was

even, yeah, like, it was so, so the growth rate was scaling exponentially with

the input. And as it was growing, of course it hits some limit of either your

computers RAM or you just kill it because it's been running for hours. But

basically, it was never going to go anywhere and that bug no one understood why

that was going, what why this was happening or what was going on there. And for

that one at some point I ended up getting I think three or four person weeks

from a QBayLogic. I ended up getting and three or four person weeks of debugging

from QBayLogic on that. Which was great, and like, it didn't fix the issue but

at least we got a great understanding of what exactly was going on and there is

a long-term plan of, you know, just using a partial evaluator instead of the the

current simplifier in Clash that would solve that issue. It's just that the

implementation is not ready yet. So I had a wonderful time with the Clash

developers and the Clash community on this. And I hope that people who read the

book will will become as enthusiastic about Clash as I got when I started

playing around with it.

_WS:_ Okay, so on that note, I just want to thank Gergő again for joining us

today and sharing his adventures in retro computing and using Haskell and

learning Haskell. And thanks again, that was a lot of fun!

_AL:_ Yeah, thank you very much.

_GE:_ Yeah, thanks for thanks for having me! This was this was great!

Transcript source: Provided by creator in RSS feed: download file

Episode description

Transcript