Lisp in 99 lines of C and how to write one yourself [pdf] (github.com/robert-van-engelen)
289 points by jstanley on July 14, 2022 | 68 comments


Along similar lines to the posted project TinyLisp, I'd like to mention uLisp.

Lisp for microcontrollers - Arduino, Adafruit M0/M4, Micro:bit, ESP8266/32, RISC-V, and Teensy 4.x boards - http://www.ulisp.com/

In particular, its smallest variant (~800 LoC).

> uLisp Zero is a pared-down version of uLisp, capable of running in 8 Kbytes of program memory with 1 Kbyte of RAM

https://github.com/technoblogy/ulisp-zero/blob/master/uLispZ...

---

And I can't help but mention Make-a-Lisp and its C implementation (~350 lines).

https://github.com/kanaka/mal/blob/master/impls/c/stepA_mal....


I recently got back into microcontrollers and embedded electronics as a hobby, and uLisp caught my eye. However, as ubiquitous as MicroPython is in guides and projects, there aren't really any resources for uLisp beyond the website. Can you point me to any such resources if you know of some?


I don't know of any resources for learning/using uLisp, other than the website itself - which does have many interesting articles - and maybe the discussion forum.

http://forum.ulisp.com/

---

Arduino is probably better overall for your purpose: the language is more approachable for learning, and there's a huge, active community.

https://docs.arduino.cc/


All these "small lisps / lisp-in-go / lisp-to-xyz" are great and I do applaud the exercise and what is produced, and I also appreciate many of of these projects are for learning or fun projects.

BUT if by chance you or your project is one of the more "serious lisps or next-lisp-vibes", hear this:

Before you even think about speed or benchmarks (size, spec, speed, lang), remember we live in a connected world that is multi-core.

First prize for your home-brew lisp, the thing that would make me sit up and take notice, is if it has very easy net libraries built in (http client + server, websocket, SSE, JSON, etc.). I think that is one of the things Go (yes, I know not everyone likes it, but the real-world usage is real) got right: the std lib was very "modern", at least to the point of real-world usability. Multicore is a close second prize :)

Network libs (stupid-easy ones) and the concurrency concepts are the very first things I look for in a lang or lisp derivative.

My 2cents :)


Why do you need to build them in when you can just load your favorite libraries that provide these functions with https://www.quicklisp.org/ ? Especially for HTTP, the great libraries by Fukamachi: https://github.com/fukamachi ; for parallel processing: https://lparallel.org/ ; etc.

I'm very grateful that Common Lisp does not version up (like Python), but you can always load a new or newer version of a library with no impact on your core production code. (Such as a rewrite when the language gets a new version - this never happens with Common Lisp.)


>Why do you need to build them

Many of the projects I referred to are not "QL Compatible"

[0] https://www.ale-lang.org/

[1] https://github.com/nukata/lisp-in-go

[2] https://github.com/pixie-lang/pixie

[3] https://github.com/candid82/joker (Clojure like)

[4] https://janet-lang.org/ (more Clojure like than lisp like, but still brilliant and very polished )


Multicore is not common at all in embedded, so there is no point in supporting that in this type of project. And even if you are in "fat" embedded (Raspi etc.), you might want to use just one core anyway so that your system can do other things efficiently (or use less power).

For interpreters like this, what you should be looking for is not direct support for this or that, but how easy it is to write bindings for that interpreter. In my opinion, what Lua offers is the yardstick there.
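
For reference, here is roughly what that Lua yardstick looks like: exposing a single C function to scripts via the standard Lua 5.x C API. This is just a minimal sketch; the function name and the script string are made up for illustration.

    #include <stdio.h>
    #include <lua.h>
    #include <lauxlib.h>
    #include <lualib.h>

    /* A C function made callable from Lua: takes a number, returns its square. */
    static int l_square(lua_State *L) {
        double x = luaL_checknumber(L, 1);   /* read argument 1 from the Lua stack */
        lua_pushnumber(L, x * x);            /* push the result */
        return 1;                            /* one return value */
    }

    int main(void) {
        lua_State *L = luaL_newstate();
        luaL_openlibs(L);
        lua_pushcfunction(L, l_square);
        lua_setglobal(L, "square");             /* callable from Lua as square(x) */
        luaL_dostring(L, "print(square(7))");   /* prints 49 */
        lua_close(L);
        return 0;
    }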

Oh, and don't stop at FFI support. FFI is IMO generally the worst option as it introduces a lot of overhead both at programming time and/or at runtime.

I have a personal interpreter - it's not Lisp-based but Forth-based (rule 48: every time Lisp is mentioned on a forum, someone mentions Forth, and vice versa) - that is very easy to extend with libraries. I actually use it sometimes to experiment with libs. I have dabbled with most of the examples you give (and more).

For tiny interactive interpreters aimed at embedded (otherwise for most people there's no point in the "smallest possible thing that can..."), what you should be looking for is not spoon-fed features, but how much control you have over the runtime. Well, that's my way of thinking for non-embedded stuff too.


I did not mean multicore solely for the embedded product space; I aimed my wishes a little too broad, it seems :)

There are many such projects that look fantastic, embedded and non-embedded, that quickly fail real-world use, even hobby'ish usage :/ EDIT: FAIL as in it doesn't come with nice or easy-to-use network libs and I have to roll my own.

Also, that is sort of my point: I don't want to extend/build libraries in 2022, I want to build connected programs and hit the ground running.


I guess it depends on what you do. I usually work at a low level, so making lib interfaces myself is acceptable, but at a higher level one usually has to pile libs on top of libs to get anything done, because that's the way things are done today (complexity is perceived as a non-issue as long as you can rely on some lib), especially on desktop and especially with most things related to "the web". DIY, even just for language bindings, would take long, be exhausting, and probably not be worth it for a prototype or a throwaway script.


Yeah, probably. A lisper can wish though :)


Hard agree.

Have you found any frameworks to be particularly good at this for C/C++?


Maybe libuv? It's used by Node and Julia. It's not a full standard library by any means, but it would give you the event loop with async networking/IO/threading.
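
For a taste, here is about the smallest libuv program that exercises the event loop, a one-shot timer; the networking handles follow the same init/start/callback pattern (sketch only, not a full server):

    #include <stdio.h>
    #include <uv.h>

    static void on_timer(uv_timer_t *handle) {
        printf("tick\n");
        uv_timer_stop(handle);           /* no more active handles: uv_run returns */
    }

    int main(void) {
        uv_loop_t *loop = uv_default_loop();
        uv_timer_t timer;
        uv_timer_init(loop, &timer);
        uv_timer_start(&timer, on_timer, 1000, 0);   /* fire once after 1000 ms */
        uv_run(loop, UV_RUN_DEFAULT);                /* blocks until no work is left */
        uv_loop_close(loop);
        return 0;
    }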


When I did something similar the hard part was implementing continuations efficiently, a feature that a lot of these small lisp interpreters lack.


Any pointer to how that's done (in the context of an "easy" interpreter like this)?


The "PLAI" book explains how to do it trough a rewrite/desugaring transformation: [0]

    The heart of [a CPS] transformation is [...] to turn every one-argument function, f, into one with an extra argument. This extra argument is the continuation, which represents the rest of the computation. The continuation is itself a function of one argument. This argument takes the value that would have been returned by f and passes it to the rest of the computation. f, instead of returning a value, instead passes the value it would have returned to its continuation.

    CPS is a general transformation, which we can apply to any program. Because it’s a program transformation, we can think of it as a special kind of desugaring: in particular, instead of transforming programs from a larger language to a smaller one (as macros do), or from one language to entirely another (as compilers do), it transforms programs within the same language: from the full language to a more restricted version that obeys the [... CPS pattern]. As a result, we can reuse an evaluator for the full language to also evaluate programs in the CPS subset.
There's a more terse description on this site too [1]. And I have to agree with this:

   [...] there are many methods for transforming into CPS [... but] learning CPS transformation remains difficult.
Also I remember stumbling upon this article [2] which kind of gave me an excuse to not pay too much attention to CPS :-) (...but not really :-p, mostly the subject got too hairy for me and I lost interest :-p). If I remember right, the article argues in favor of "delimited continuations" as opposed to implementing full general call/cc.

0: http://cs.brown.edu/courses/cs173/2012/book/Control_Operatio...

1: https://matt.might.net/articles/cps-conversion/

2: https://okmij.org/ftp/continuations/against-callcc.html
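
If it helps make the quote from [0] concrete, here is the "one extra continuation argument" idea hand-translated into plain C, with a continuation modeled as a code pointer plus the value it captures, since C has no closures. A sketch with made-up names, not taken from the book:

    #include <stdio.h>
    #include <stdlib.h>

    /* A continuation: a code pointer plus whatever it closes over. */
    typedef struct cont {
        void (*fn)(int value, struct cont *self);
        int n;                 /* value captured by the "multiply by n" step */
        struct cont *next;     /* the rest of the computation */
    } cont;

    static void mul_step(int value, cont *self) {
        cont *next = self->next;
        int n = self->n;
        free(self);
        next->fn(n * value, next);      /* pass the result onward, never return it */
    }

    /* CPS factorial: instead of returning n!, it hands n! to k. */
    static void fact_cps(int n, cont *k) {
        if (n == 0) {
            k->fn(1, k);
        } else {
            cont *step = malloc(sizeof *step);
            step->fn = mul_step;
            step->n = n;
            step->next = k;
            fact_cps(n - 1, step);      /* tail call: no value ever comes back */
        }
    }

    static void print_result(int value, cont *self) {
        (void)self;
        printf("%d\n", value);
    }

    int main(void) {
        cont done = { print_result, 0, NULL };
        fact_cps(5, &done);             /* prints 120 */
        return 0;
    }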


This doesn't go into detail about how to make stackful continuations work (and indeed when it is necessary to make a new stack). If you're working in C, you need to use inline assembly for this. setjmp and longjmp are not stackful.


CEK[1] machines are a very clean way of implementing continuations, or other features that can be expressed in terms of them, in a language with no support for lambdas or closures. They are also reasonably fast. I implemented a simple Unlambda[2] interpreter in Haskell using the CEK/CESK style and it was ~20% faster and more memory efficient than the fastest C interpreter I could find.

[1] https://matt.might.net/articles/cek-machines/

[2] http://www.madore.org/~david/programs/unlambda/


Trampolines


as long as your arguments never spill, you can also make an asm insert macro that does a jump instead of a call


The purpose being 'rename + goto'?


the purpose being to be able to use generic C function bodies as continuations without having to trampoline the stack


Funnily enough, I've been writing my own Lisp in C recently. Interesting to see how it is done by someone skilled. The boxing of expression pointers into a single float is really cool.
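
For anyone who hasn't seen the trick: tinylisp NaN-boxes everything into a double, stuffing a tag and an ordinal into the payload bits of a NaN. Roughly like this; the tag values and helper names below are made up for illustration, not the paper's exact encoding:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Non-numbers are encoded as NaNs: the upper 16 bits hold a tag,
       the lower bits hold an index into the cell/atom heap. */
    #define TAG_ATOM 0x7ff8u   /* quiet NaN + our tag bits */
    #define TAG_CONS 0x7ff9u

    static double box(uint16_t tag, uint32_t ordinal) {
        uint64_t bits = ((uint64_t)tag << 48) | ordinal;
        double d;
        memcpy(&d, &bits, sizeof d);    /* type-pun via memcpy to stay well-defined */
        return d;
    }

    static uint16_t tag_of(double d) {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        return (uint16_t)(bits >> 48);
    }

    static uint32_t ordinal_of(double d) {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        return (uint32_t)bits;
    }

    int main(void) {
        double v = box(TAG_CONS, 42);   /* a "pointer" that is still just a double */
        printf("tag=%x ord=%u\n", tag_of(v), ordinal_of(v));
        printf("plain 3.14 has tag %x, below the NaN range, so numbers need no boxing\n",
               tag_of(3.14));
        return 0;
    }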


Just look at ECL then


I have read many papers on Lisp, including McCarthy's. This one clarified things for me that I found confusing in most of the others. Perhaps because I've never been proficient with math, and having code with step-by-step explanations is easier for me to follow.

Obviously, tinyLisp is a pedagogical exercise and not an environment for use in production. There are many Common Lisp and Scheme implementations that are suitable for production systems.


It annoys me when people talk about what McCarthy did, and credit to him features that only appeared later and were borrowed from elsewhere.

For example this article brings up static typing as one of Lisp's innovations. In fact that first appeared in ALGOL-60 and didn't make its way into Lisp until 1962.

And, more egregiously, the article lists closures. In fact the first Lisp with closures was Scheme in the 1970s. And it didn't invent them - the idea appeared in Pascal first. (Though because it lacked true garbage collection, Pascal only allowed you to pass closures into functions, not return them. This greatly limited their utility.)

To demonstrate how much those features are not core to Lisp, one of the most widely used Lisp applications is emacs. But it did not get support for lexical scope or closures until 2012!


Static typing isn't mentioned at all in the paper.

The article talks about "functions as first-class objects (closures) with static scoping". Those appeared as a general construct in Scheme and only later in Lisp. The idea of functions and an environment is a bit older (see for example Landin's ISWIM from 1964 -> https://en.wikipedia.org/wiki/SECD_machine ). The idea thus did not appear in Pascal first, if at all. "first class closures" means that functions can be freely passed to functions, returned from functions and assigned/reassigned in data structures. Scheme for example can store functions in vectors & lists and return those from functions. PASCAL did not have anything like that in 1970 (The programming language Pascal, Niklaus Wirth, 1970 https://www.research-collection.ethz.ch/handle/20.500.11850/... ).

I would also doubt that PASCAL in 1970 had downward closures. One could pass functions & procedures as arguments (not as part of other data structures), but could one use non-local variables? There seemed to be a restriction that recursive functions could only use local variables. I would then not expect this to be much different for passed functions.


But Scheme dates from 1975.


Correct; among Lisp dialects, Scheme had general lexically scoped functions as first-class objects first, in '75. Lisp got it later. Scheme was first implemented on top of Maclisp at that time.

https://en.wikisource.org/wiki/Scheme:_An_Interpreter_for_Ex...

Pascal didn't have anything like that.


It doesn't mention static typing at all. It only mentions static scoping. In fact, the article credits dynamic typing to Lisp.

Also, the article doesn't credit any specific features to McCarthy. It credits him with Lisp itself and the idea that both code and data can be represented as lists, and then it credits Lisp with the invention of certain features, like first-class functions, garbage collection, etc. I don't think the article intends to imply some transitivity of credit.


There are mails from McCarthy's team discussing upward funargs and lexical scoping in '59 (maybe '58, I don't recall). The very first to do it publicly might have been ALGOL, but it's not clear who made it first.


> #define I unsigned

> #define L double

Gross. I appreciate the overall idea of the article, but have some taste, and learn to touch type so you aren't tempted into such atrocities.

Maybe this is how a lisp programmer writes C, but as a C programmer, this is offensively disgusting. Jesus, at least use a typedef rather than abusing the preprocessor. Not that single-letter Fortran-esque synonyms of existing types are any sort of approximation of a good idea.
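
For comparison, the typedef version of the same thing, which stays inside the type system instead of rewriting tokens (same single-letter names as the paper, kept only for illustration):

    /* What a typedef buys over the macros: it obeys scope and shows up
       in compiler diagnostics as a real type name. */
    typedef unsigned I;   /* instead of: #define I unsigned */
    typedef double   L;   /* instead of: #define L double   */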


It says 99 lines, and it theoretically is, but is it really 99 lines if it has 119 semicolons?

In the main function,

   nil = box(NIL,0); err = atom("ERR"); tru = atom("#t"); env = pair(tru,tru,nil);
This should probably be 4 lines of code, but it all appears on one line in the source code. '99 lines' of code is technically true, but using number of lines as a measure of code complexity becomes less useful once you start packing multiple distinct things into one line like this.


Disguising complexity as brevity only obfuscates. It detracts from the simplicity because there's a huge opportunity for clarity that's being run away from.

There's a huge difference between conceptual purity ass packing bytes


Sorry it was late, I meant "and packing bytes"


This, to me, is the crux of the zealotry for Lisp and its variants. A high level language, with functions as first class citizens, that can not only run on minimal hardware but be programmed on minimal hardware in less than 100-1000 lines of C code.

The other arguments (for side-effect isolation or for the "brain bending" nature of functional languages), while valid, pale in comparison to the above reason.

In my opinion, Javascript pretty much took the best parts of this philosophy (scriptable/run-time languages, first class functions, etc.) without the more cumbersome aspects to create a happy compromise between a functional and imperative programming language.

I do wonder if the newer generation of programmers has the same reverence for Lisp and its variants as the last generation.


Implementing a tiny Scheme interpreter made me appreciate JavaScript more.


Do you feel the same way about Forth? There you will find your answer.


Anyone have any input on:

https://www.buildyourownlisp.com/

It's been in my bookmarks for a long time but I've never really had the time to start it. The "who is this for" page says:

"This book is for anyone wanting to learn C, or who has once wondered how to build their own programming language.".

Well, I'm fairly competent in C (but not great) but would like to get a glimpse of what it's like to build my own language. Is it worth the time?


If you want to make a Lisp specifically, I'd suggest starting with MAL: https://github.com/kanaka/mal#mal---make-a-lisp

If you want to build a programming language in general, I strongly recommend https://craftinginterpreters.com/


Love Crafting Interpreters. I’ve used principles from this book even outside of programming language development. It’s wonderful.


I personally had fun going through Build Your Own Lisp and learned quite a lot. I really liked the writing style and images.

That said, it is not a good way to learn either how to write a Lisp or how to use C the proper way. Some of the stuff is very idiosyncratic or even questionable.

Again, for me it was fine because I also used other sources to complement it, and it got me interested and thinking about various topics. Just beware that it teaches things that will hold you back if you ever become serious about implementing a Lisp that is more than a toy.

If you are already quite OK with C and don't need much hand-holding, I would suggest going through MAL instead.

See here for a critique: https://gist.github.com/no-defun-allowed/7e3e238c959e27d4919...


Any ideas on how to add a type system so one can do Julia-style multiple dispatch?

(define abs (Complex n) (* n (bar n)))

(define abs (Int n) (max n (- n)))


Sure, it's not too hard.

You need a new kind of function--let's call it a generic function.

A regular function jumps to a user-defined body when called.

A generic function is a bit more complicated: instead of jumping to a user-defined body, it jumps to a standard dispatch routine. The dispatch routine computes some discriminator (typically a sequence of types) from the passed arguments, looks up the discriminator to find a matching function body (typically user-defined), and jumps to that. You just need to implement the lookup routine and the discriminator computation and package them up in a structure that represents a generic function.

When user source code applies a generic function to some arguments, your Lisp uses the discriminator function to compute the discriminator, looks up that discriminator using the lookup routine to find the right function body, then passes the argument values to the function body.

There are some more details you can mess with, depending on what semantics you want to support. For example, do you want flat dispatch, with a one-to-one mapping from argument types to methods? Or do you want inheritance, so that you can define methods that are applicable to a type and all its subtypes?

To put it another way, do you want to be able to define "(add x y)" once for all x and y? Then you need inheritance. Or are you satisfied to explicitly write a separate method for each and every distinct combination of x and y for all possible types? If you're okay with that, then flat dispatch will do fine.

Flat dispatch is much simpler to implement. Basically, you can just compute the type of each argument and look up the sequence of types in a hash table. When you define a method:

    (define add ((x SmallInteger)(y SmallInteger))
      ...)
you put the body of that definition in a hash table under the key `(SmallInteger SmallInteger)`. When user code calls `(add 2 3)`, your discriminator turns `(2 3)` into `(SmallInteger SmallInteger)` and Bob's your uncle. Look it up, get function body, profit!
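
A rough C sketch of that flat-dispatch table, with a string key built from the argument type names. All names here are hypothetical, just to show the shape of the lookup:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef double (*method2)(double, double);

    /* The "generic function" is just this table plus the dispatch below. */
    static struct entry { char *key; method2 fn; } methods[64];
    static int nmethods;

    static char *make_key(const char *t1, const char *t2) {
        char buf[64];
        snprintf(buf, sizeof buf, "%s/%s", t1, t2);
        return strdup(buf);              /* strdup is POSIX; copy by hand if needed */
    }

    static void defmethod(const char *t1, const char *t2, method2 fn) {
        methods[nmethods].key = make_key(t1, t2);
        methods[nmethods++].fn = fn;
    }

    static method2 dispatch(const char *t1, const char *t2) {
        char *key = make_key(t1, t2);
        method2 fn = NULL;
        for (int i = 0; i < nmethods; i++)
            if (strcmp(methods[i].key, key) == 0) { fn = methods[i].fn; break; }
        free(key);
        return fn;                       /* NULL means "no applicable method" */
    }

    static double add_small_small(double x, double y) { return x + y; }

    int main(void) {
        defmethod("SmallInteger", "SmallInteger", add_small_small);
        /* (add 2 3): the discriminator is the pair of argument type names */
        method2 m = dispatch("SmallInteger", "SmallInteger");
        if (m) printf("%g\n", m(2, 3));  /* prints 5 */
        return 0;
    }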

The downside of this approach is that you have to explicitly write a definition like that for each and every separate numeric type that you want to support.

Inheritance fixes that problem; you can do this instead:

    (define add ((x Number)(y Number))
      ...)
...and now your method body applies to every kind of argument that is a subtype of Number. The user of your Lisp thanks you.

This solution comes at considerable cost to you, the implementor, though. You can't just compute the types and look them up in a hash table anymore. If there's no entry for `(SmallInteger SmallInteger)`, you need to check for `(Integer SmallInteger)`, and `(SmallInteger Integer)`, and `(Integer Integer)`, and `(Real SmallInteger)`, and `(SmallInteger Real)`, and so on and so forth for all permutations of all possible subtypes of Number.

In addition, if you support multiple inheritance as well as multiple dispatch, then the supertypes of each type form a graph. You have to work out how to linearize that graph for each argument so that your dispatch routine always sees the same deterministic sequence of supertypes every time, so that your dispatch algorithm will be deterministic. The choice of linearization algorithm has significant consequences for how reasonable and comprehensible your dispatch results are. If you choose a bad one then your users will spend a lot of time trying to figure out why they get totally unexpected results all the time.

So flat dispatch is WAY less work to implement, though it does come at some cost in drudgery for users of your Lisp. I guess if you mean for your Lisp to be used by other people for real work, then it's worth considering support for inheritance, but if you're just working on a hobby Lisp to learn more about how implementation works, then flat dispatch is probably the way to go.

Also, it's a valid option to just decide you don't want to mess with the whole inheritance can of worms, so users will just have to live with flat dispatch. There are some relatively simple things you can do to make it a little more convenient--for example, supporting default dispatch (that is, methods that automatically apply when the arguments don't match any defined method).


Ages ago I used http://norvig.com/lispy.html to write a Lisp in C#. Pretty easy to do the basic mechanics of the language, somewhat more work to do some of the more advanced features; it was fun to play around with!


Something I can't work out is list dot operators. Can someone point me to information on this? I did look, but I suspect I'm bad at searching... any pointers would be greatly appreciated!


Are you talking about "(a . b)"? The dot is not an operator. The dot is just there to denote a "pair". Linked lists in Lisp are formed with pairs. So "(a b c)" is just shorthand notation for "(a . (b . (c . null)))". To sum it up: it's just syntax.
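
In a C implementation those pairs are typically just a two-field cell. A sketch (tinylisp itself NaN-boxes cells into doubles, so the struct and names here are only illustrative):

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct cell {
        const char *atom;         /* non-NULL: this node is an atom */
        struct cell *car, *cdr;   /* otherwise: it is a pair */
    } cell;

    static cell *atom(const char *s) {
        cell *c = calloc(1, sizeof *c);
        c->atom = s;
        return c;
    }

    static cell *cons(cell *a, cell *d) {
        cell *c = calloc(1, sizeof *c);
        c->car = a;
        c->cdr = d;
        return c;
    }

    int main(void) {
        /* (a b c) is literally (a . (b . (c . nil))); nil is NULL here */
        cell *lst = cons(atom("a"), cons(atom("b"), cons(atom("c"), NULL)));
        for (cell *p = lst; p; p = p->cdr)
            printf("%s ", p->car->atom);    /* prints: a b c */
        printf("\n");
        return 0;
    }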


How many lines of Lisp is C?


Ooh, an excuse to link one of my favorite short stories!

https://aphyr.com/posts/353-rewriting-the-technical-intervie...


I reread these often, and they delight me.

The full series:

Reversing the technical interview: https://aphyr.com/posts/340-reversing-the-technical-intervie...

Hexing the technical interview: https://aphyr.com/posts/341-hexing-the-technical-interview

Typing the technical interview: https://aphyr.com/posts/342-typing-the-technical-interview

Rewriting the Technical Interview: https://aphyr.com/posts/353-rewriting-the-technical-intervie...

Unifying the Technical Interview: https://aphyr.com/posts/354-unifying-the-technical-interview


I wish I understood any of what they did in lisp. How does one learn?


I wish I knew :)

I picked up "Common Lisp: a Gentle Introduction to Symbolic Computation" after seeing it highly recommended here, but I haven't actually worked through it yet.


A lot more than the number of lines of Lisp in Lisp.


Are you proposing writing a replacement for gcc in lisp? If the goal is to have executable machine code then the turtles must end somewhere.


Already been done, meet GNU Mes: https://www.gnu.org/software/mes/

It's a C compiler in Scheme, and a Scheme interpreter in C, each of which can bootstrap the other. The GNU project is using it to reduce the size of already-existing binary code you need to have (and trust) to bootstrap a GNU system from source.


How about a Lisp that can parse the hardware specification of a chip and produce its own compiler?


This was a long dream of mine.


Hasn't sectorlisp rendered all these things superfluous?


sectorlisp has lowered the bar (in the limbo sense) for golf-a-lisp. I wouldn't agree that sectorlisp has rendered size competitions for lisp superfluous, but, damn, it's hard to beat.

So, yes, everybody can now relax if they want to write a small and self-contained lisp. Unless that's your jam, you don't need to worry about the bytes. Just make it work, and keep it simple.

The question becomes "what does this lisp implementation add". Well, compared to sectorlisp, it's a breeze to read. It's really well documented, too (to be fair, sectorlisp's writeup is a great read).


Nothing can really make creating and learning something superfluous


That helped more than the ton of classes and books I took on programming languages, period. Blessed be :D


They could have removed all the line breaks. Yes, you can do it in fewer lines!


We changed the URL from https://raw.githubusercontent.com/Robert-van-Engelen/tinylis... to one that doesn't auto-download the pdf.

The project page is https://github.com/Robert-van-Engelen/tinylisp/.


"CMC Threat Intelligence" flags that PDF as malicious.

https://www.virustotal.com/gui/url/081fd12ade3ff10b2812f630f...


Weird. 1 out of 87 vendors. I wonder what they found that others didn't.


Would be kinda cool if they just spotted that it contained a full LISP interpreter and assumed from that that it might be malicious because in some earlier case they came across a malicious PDF that obscured its payload with a custom LISP interpreter.


These days, it's common for antivirus to flag software as malicious because it looks unfamiliar. This one may have flagged the PDF as malicious because it contains something resembling source code, but the virus scanner cannot identify the code.


Anybody have an image or text copy?


https://archive.ph/zcTJs/image

You can make one from any webpage by prepending archive.ph to the URL.



