Brave New World

17/11/2016

Update: Finally figured out the layout after digging a bit more into the sources, it’s a QWERTY-UK (see devices/rpi2/uspi/include/uspios.h). Looks like I’ll have to modify the bundled USPI library to include a QWERTY-US layout before I can make any progress on keyboard remapping in Lisp…

I believe I’ve found an even greater time sink than writing Lisp interpreters for fun. Long time ago, I’ve read an encouraging blog post on the future of the LispM, not expecting to find an implementation of the ideas presented therein. Turns out I was wrong about that. Meet Interim OS!

In case you’re wondering why you should possibly care:

  • Small and readable codebase (most of the code is device drivers for the Raspberry Pi)
  • Simple to hack on
  • Plan9-style APIs
  • Minimal Lisp dialect
  • Runs on your favorite desktop OS in hosted mode, that is, safely contained to a terminal with the ability to spawn graphical windows
  • Runs on bare metal (Raspberry Pi 2)

Getting it to run in hosted mode is simple enough, so I won’t explain it here. Booting on bare metal however is a different story, so here we go:

$ git clone https://github.com/mntmn/interim
$ cp interim/docs/interim-0.1.0-rpi2.tgz ./
$ bsdtar -xf interim-0.1.0-rpi2.tgz # cry me a river
% mkdir /media/boot
% mount /dev/sdXN /media/boot
% cp release-rpi2/* /media/boot/
% rm /media/boot/cmdline.txt
% umount /media/boot
  • Plug in the SDHC card, a HDMI monitor and a USB keyboard
  • Optionally: Plug in a network cable and/or a USB mouse
  • Power up

You’ll be greeted by a “Welcome to Interim OS” and dropped into a promptless shell. If you’re unlucky, the chosen resolution may be unreadable, so feel free to retry this process a few times. The keyboard layout is hardcoded and somewhere between QWERTY-US and QWERTZ-DE, something I intend to fix soon. For basic usage instructions, type (bytes->str (load "/sd/hello.txt")) and hit the enter key. Happy hacking!


On Minimalism

26/10/2016

Update: There is a CL merge request that supports SBCL among other implementations, as expected it beats the picolisp implementation in speed by a factor of 2x. Still, I’m fine with being second place :)

I’ve implemented MAL for the third time by now, this time in PicoLisp, a language priding itself on its implementation simplicity. While it clearly is a Lisp dialect, it has foregone a good amount of classic Lisp design choices in favor of terse code. Despite this, there are practical inclusions for writing application software, like the GUI system and a distributed database implementation with a Prolog-style query language. Other interesting features are an unobtrusive OOP system, a FFI for C and Java, live debugging utilities, pattern matching and more in a 1MiB tarball.

You might wonder why I’d be up for learning yet another Lisp dialect, after having learned Emacs Lisp, Clojure and Scheme. Furthermore, Scheme already claims to take minimalism as language design principle and of course there are more obscure Lisp dialects, like Arc and newLISP. I can only blame a friend who told me about this fascinating talk given by the PicoLisp author demonstrating the abilities of his programming language. The overall picture my friend painted was that of a bizarro world where a crazy German ignored all established rules for a Lisp dialect, walking the fine line between insanity and practicability. Most surprisingly though was that he used his own invention to write business applications and succeeded in making a living off it. Naturally I was intrigued and kept PicoLisp on my backlog of things to play with.

My implementation is a bit smaller than the Emacs Lisp one, is the first one to actually make use of GNU readline and went for a purely Lisp tokenizer as I couldn’t figure out how to use PCRE for this task. It also appears to be the fastest one out of all Lisp family implementations. This might change though once the “clisp” implementation gains support for using SBCL instead of CLISP…

Regarding oddities, here’s an incomplete list:

  • No lambda. You pass a quoted list instead.
  • Quote returns more than only its first argument. However (quote 1 2 3) is equivalent to '(1 2 3) so you’ll most likely not ever notice…
  • No macros. Functions can instead not evaluate their arguments, your job is to evaluate them as needed.
  • No strings. String syntax creates so-called “transient” symbols. Additionally to doubling as string replacement, these are not equal to other symbols with the same contents and are therefore used to avoid name clashes in macro-like functions.
  • No implicit closures. If you need to capture something from the environment, you must do this explicitly and have the choice between mutable and immutable ones.
  • Unknown symbols evaluate to NIL instead of throwing an exception.
  • Indentation (and even pretty-printing) is a lot easier than in classic Lisp dialects as it’s basically about increasing the depth by three spaces for each level instead of lining up parentheses.
  • Closing parentheses that do not belong to the current line are separated by spaces. For convenience’s sake, a closing bracket is interpreted as “super paren” and closes all remaining parentheses.
  • The style guide has an interesting solution to the problem of local variables potentially shadowing built-in functions: Capitalized identifiers!
  • NIL is not only equivalent to the empty list, but to the empty string as well. This bit me when wrapping readline as an empty string couldn’t be discerned from NULL with the naïve approach…
  • Error handling is close to non-existing. If you screw up things too much, the interpreter will segfault on you. This is not considered a bug.
  • Identifiers for built-in functions are very short and at times cryptic[1]. Clojure got nothing on that!
[1]read does not parse a string into a S-expression, str does. The result of this cannot be handed to eval either, you’ll need to run it instead. I could go on with this for a while…

Small-time Patching

22/09/2016

Update: Added the improved backtrace

Update: Merged!

Today #emacs reminded me of an oddity in Emacs I’ve sort of learned to live with: Backtraces are, well, see for yourself:

Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p t)
  +(1 t)
  eval((+ 1 t) nil)
  eval-expression((+ 1 t) nil)
  call-interactively(eval-expression nil nil)
  command-execute(eval-expression)

I can live with errors printed as a list. What I can’t live with is none of the call stack lines being printed as a S-Expression… To fix this, one must dive a bit deeper than usual as only the debugger porcelain is implemented in Emacs Lisp. Its workhorse is backtrace in eval.c:

DEFUN ("backtrace", Fbacktrace, Sbacktrace, 0, 0, "",
       doc: /* Print a trace of Lisp function calls currently active.
Output stream used is value of `standard-output'.  */)
  (void)
{
  union specbinding *pdl = backtrace_top ();
  Lisp_Object tem;
  Lisp_Object old_print_level = Vprint_level;

  if (NILP (Vprint_level))
    XSETFASTINT (Vprint_level, 8);

  while (backtrace_p (pdl))
    {
      write_string (backtrace_debug_on_exit (pdl) ? "* " : "  ");
      if (backtrace_nargs (pdl) == UNEVALLED)
        {
          Fprin1 (Fcons (backtrace_function (pdl), *backtrace_args (pdl)),
                  Qnil);
          write_string ("\n");
        }
      else
        {
          tem = backtrace_function (pdl);
          Fprin1 (tem, Qnil);   /* This can QUIT.  */
          write_string ("(");
          {
            ptrdiff_t i;
            for (i = 0; i < backtrace_nargs (pdl); i++)
              {
                if (i) write_string (" ");
                Fprin1 (backtrace_args (pdl)[i], Qnil);
              }
          }
          write_string (")\n");
        }
      pdl = backtrace_next (pdl);
    }

  Vprint_level = old_print_level;
  return Qnil;
}

Despite the rather unusual look and weird naming, it’s not too hard to find the culprit. Most of the time is spent inside a loop that walks through the call stack, accesses the top-most function and args with backtrace_function and backtrace_args, prints a lisp object with Fprin1 (which is just another way to use prin1 from C code) and writes out normal strings with write_string. Qnil refers to the global nil symbol, tem is a naming convention for a temporary variable. It should be sufficient to print the opening paren first, then the function, a space and proceed normally from that point:

tem = backtrace_function (pdl);
write_string ("(");
Fprin1 (tem, Qnil); /* This can QUIT.  */
write_string (" ");

Time to recompile Emacs with make, boot the binary with src/emacs -Q, and trigger a backtrace with M-: (+ 1 t). Unfortunately that does not yield a prettier backtrace yet, but rather a Search failed: "\n debug(". As the debugger is busted, I had to resort to ag to find where exactly the breakage occurs. It’s not too hard to fix, mind you, all you have to do is to patch debugger-setup-buffer in debug.el to search for "\n (debug" instead.

The result is the following teensy patch:

diff --git a/lisp/emacs-lisp/debug.el b/lisp/emacs-lisp/debug.el
index 22a3f39..4020620 100644
--- a/lisp/emacs-lisp/debug.el
+++ b/lisp/emacs-lisp/debug.el
@@ -279,7 +279,7 @@ That buffer should be current already."
   (goto-char (point-min))
   (delete-region (point)
             (progn
-              (search-forward "\n  debug(")
+              (search-forward "\n  (debug")
               (forward-line (if (eq (car args) 'debug)
                                      ;; Remove debug--implement-debug-on-entry
                                      ;; and the advice's `apply' frame.
diff --git a/src/eval.c b/src/eval.c
index 72facd5..e32e7a1 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -3409,8 +3409,9 @@ Output stream used is value of `standard-output'.  */)
       else
    {
      tem = backtrace_function (pdl);
-     Fprin1 (tem, Qnil);   /* This can QUIT.  */
      write_string ("(");
+     Fprin1 (tem, Qnil);   /* This can QUIT.  */
+     write_string (" ");
      {
        ptrdiff_t i;
        for (i = 0; i < backtrace_nargs (pdl); i++)

And an IMHO vastly improved backtrace:

Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p t)
  (debug error (wrong-type-argument number-or-marker-p t))
  (+ 1 t)
  (eval (+ 1 t) nil)
  (eval-expression (+ 1 t) nil)
  (funcall-interactively eval-expression (+ 1 t) nil)
  (call-interactively eval-expression nil nil)
  (command-execute eval-expression)

Not sure whether to bother submitting this… Let me know what you think!


Haunted by Hermann Zapf

21/08/2016

Linux gives me, for worse or better, the ability to coerce nearly every application into picking any font. The respective component in its font stack is the aptly named Fontconfig software. This is good because some applications insist on fonts I don’t like and bad because sometimes my configuration does the wrong thing, resulting in bizarre text rendering problems.

One of those has plagued me for months. When using i3lock, the unlock indicator used a swirly, cursive font from the TeX Gyre project instead of my default sans-serif font. I’ve eventually identified it as TeX Gyre Chorus which appears to be a libre variant of ITC Zapf Chancery. Why would i3lock ever get the idea to use that of all the fonts? It didn’t make much sense, the mere act of making TeX fonts available to the rest of the system shouldn’t do something this drastic…

Eventually I’ve had the idea to use the FC_DEBUG environment variable with 4096 as value, this revealed that the query was empty or in other words, no font was set at all. For some inexplicable reason, fc-match "" returns something entirely different.

Thanks to Cairo and its simple API, it didn’t take me long to figure out a fix. While it recommends to use Fontconfig directly for serious usage, I’d rather not. Just see for yourself at FcPatternBuild and recoil in horror.


Lisping the ChucKian Way

10/08/2016

I did it again! For the uninitiated, ChucK is a special-purpose programming language intended to be used for sound synthesis. While working on a college assignment, it occurred me that the language had just the minimum amount of features to be leveraged for my second MAL implementation (previously): Console/File I/O, an object system reminiscent of both Java and C++, regular expressions and arrays. How hard could it possibly be to write the probably most advanced ChucK program in existence?

This turned out to be a good deal more annoying than my first implementation. The major problem besides the language giving you minimal support for non-audio programming was that clearly nobody else did use it for a project of this size. I came to this conclusion after reporting a number of bugs that should have been absolutely obvious to anybody using the standard library for anything else than creating music.

Modules

I didn’t expect the first obstacle to be the very act of loading code from another file, something that should surely be supported well for a system also used in live programming. It turns out that live programming is a very stretchable term; what they do is more akin to hotswapping invidual code units in terms of files which is cute, but not what I need. There is a Machine.add(file) facility available from code, however it does not immediately load code from the specified file and instead post-pones loading after the current file has been loaded up.

I’ve pondered whether to create a file solely consisting of these instructions via make, then decided against it in favor of an upfront loading approach where a runner script extracts “magic” comments indicating the dependencies and boots chuck with all of them in that order and the file they originate from. While this isn’t ideal[1], it works surprisingly well.

I/O and time

Getting user input was also tricky. Initially I tried out the HID example just to find out that it would only work for special input devices as opposed to console input. Therefore I tried out the ConsoleInput class which gives a “hacked” thing to read from. Due to it having some oddities such as not accepting C-d (and therefore only being terminable with C-c which brings down the entire process), behaving incorrectly when nested and adding an extra space to the prompt, I hacked my own thing together with the KBHit class, an ASCII table and some flushing to standard output.

Speaking of printing, you would expect the manual to tell you how one does that… Instead you are told about the debug output syntax <<< foo >>>;, I’ve had to consult the VERSIONS file to learn that one can send strings to chout. Another wonderful gotcha was that due to some RtAudio bug, you can get spurious errors about your audio stream still running which mess up the prompt. The only way to get rid of them is starting the process with --silent which just runs everything as fast as possible resulting in 100% CPU usage. Fun.

Finally, I’ve hunted for a way to measure time for the performance tests, but learned that ChucK’s notion of it is more about coordination of sound. In silent mode, it is therefore useless. However not all hope is lost as you can shell out to date and retrieve its output in a hacky way: Std.system(command) doesn’t return anything useful, so you need to redirect to a file and read from it instead…

Exceptions

ChucK is somewhere between a traditional compiled and interpreted language. I did run into a good amount of exceptions, but was surprised that I couldn’t throw, create or catch any. This is saddening me as it forces one to return errors and check for them very often, be it in form of integers and out parameters (hello, C!) or a dedicated error object.

Functions

I’m pretty used to have at least some way to pass functions around and expected ChucK to be the same considering that the debug syntax had a way of printing functions. However the language doesn’t have any function type, so while you can coerce one into an Object, it won’t do you any good. Therefore I went for the C# solution and implemented functors, that is, classes with a call(args) method which one can instantiate and pass around. Yuck.

OO

While one can do Java-like OOP, it is severely limited. Considering that there are no generics, no super, no interfaces, no unions, no casting to arrays, no self-references, no automatic boxing or boxed primitives, an impoverished static keyword and no private, the resulting code is clumsy, yet has a certain air to it which I’d call “ChucKian”, like a few other people did on the chuck-users mailing list. The most impressive collection I’ve found so far is LicK.

Types

There is a strict distinction between reference and value types which require using either the @=> or => operators. While the manual insists strings are reference types, one can use => just fine on them…

While there are no hashmaps, one can use an array with string keys and store or retrieve objects of the array type. What doesn’t work though is retrieving or even iterating over the keys. For this reason I wrote a terrible hash table implementation using regular arrays.

Other

I’m not impressed by the compiler. It doesn’t catch things like missing returns, blatant type system abuse and OOP mistakes that result in segfaults or thrown assertions. While many hate Java backtraces, not having any is worse. Scoping is most certainly not lexical and forced me to pick more unique names in a few places.

Summary

I’ve dragged out this project far too long, but have been happy to learn that it is possible to implement MAL in a language as weird as ChucK. Next time I’ll hopefully pick a more featureful language, like SuperCollider

[1]The most obvious reason is that files may not contain cyclical dependencies, a less obvious one is that due to the approach of at most one class per file, one ends up with comically large command lines.