Highlight Text Manually In LaTeX Slides

29/04/2018

Sometimes I have a text snippet where a splash of color would explain things far more easily than creating a visualization or using a laser pointer. Imagine something like a backtrace, with different parts highlighted in different colors. This is pretty easy to do, even when using LaTeX indirectly, like when compiling it from Org in Emacs.

The following trick relies on using the minted package for highlighting. It supports an option for embedding LaTeX escapes into your code snippets. The documentation shows off a mathematical formula in a comment, but we can do far more, like using the \textcolor command from the xcolor package to insert colored text. Have a silly example:

\setminted{escapeinside=||}
\definecolor{green}{HTML}{218A21}

...

\begin{minted}[]{text}
|\textcolor{red}{RR}|
|\textcolor{green}{GG}|
|\textcolor{blue}{BB}|
\end{minted}

In an Org file you’d have to do a bit less typing (assuming you customized Org to always use minted):

#+LATEX_HEADER: \setminted{escapeinside=||}
#+LATEX: \definecolor{green}{HTML}{218A21}

...

#+BEGIN_SRC text
|\textcolor{red}{RR}|
|\textcolor{green}{GG}|
|\textcolor{blue}{BB}|
#+END_SRC

Bridging the Ancient and the Modern

23/03/2018

I tried out some new social networks lately. Mastodon I quite like (it’s like what I’ve wanted Twitter to be), Discord, not so sure. So, if you’ve wondered about my reduced presence on IRC, that’s why.

Writing an IRC bot is one of the classic programming exercises that can be done in pretty much every programming language offering you some way to open TCP sockets and manipulate strings. I started doing one in Emacs Lisp a long time ago, although not from scratch (rather by leveraging an existing IRC client) and wondered whether there is anything to learn from doing the equivalent with a “modern” IM platform like Discord. Could it still be done from scratch? What else is different about it?

First I had to find a meaningful thing for the bot to do. I chose Eliza, the classic demonstration of a chatter bot that managed to fool people into having prolonged conversations with it. The version I’m familiar with is M-x doctor, which is part of Emacs. So, first of all, I wrote some code to interface with that command in a REPL-style fashion. A companion shell script boots up Emacs in batch mode for interfacing with the doctor from the shell. Much like in the original mode, you terminate your input by pressing RET twice. This is an intentional design decision to allow for multi-line input as seen on Discord (but absent from IRC, where you could get away with making it single-line input).

I briefly entertained the thought of writing the rest of the bot from scratch in Emacs Lisp, but abandoned it after learning that I’d need to use websockets with zlib compression to subscribe and respond to incoming messages. While there is an existing library for websocket support, I’d rather not figure out its nitty-gritty details, let alone work around its lack of zlib compression. It doesn’t help that Discord’s official API docs are inconclusive and fail to answer questions such as how you can set your current status (and more importantly, why it fails to update). So, an officially recommended Discord library it is.

The choice of which one it’s going to be depended on whether the programming language it’s implemented in allowed me to communicate with my shell script. I tried out discord.js first, battled a fair bit with Node.js, but gave up eventually. There doesn’t seem to be a way to spawn a child process and read from / write to its stdout / stdin pipes as you see fit. Instead you can only add a callback for the process output, and good luck if you want to figure out what piece of output corresponds to the input you wrote earlier. This is why I went for discordrb instead, wrote some glue code for subprocess communication and started figuring out their API to react to incoming messages.
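The kind of blocking request/response exchange with a child process described above can be sketched as follows. This is not the bot's actual glue code (that was written in Ruby); it is a minimal Python illustration. The child program is a hypothetical stand-in, and the convention that the child also ends its reply with an empty line is an assumption made for the sketch.

```python
import subprocess
import sys

# Hypothetical stand-in for the real child program: it collects lines until
# an empty one arrives, then prints a one-line reply followed by an empty line.
CHILD_SOURCE = r'''
import sys
pending = []
for line in sys.stdin:
    line = line.rstrip("\n")
    if line:
        pending.append(line)
    else:
        print("You said: " + " / ".join(pending))
        print()
        sys.stdout.flush()
        pending = []
'''


class Conversation:
    """Talk to a child process in strict request/response turns."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def ask(self, message):
        # An empty line marks the end of our input (assumed protocol).
        self.proc.stdin.write(message.rstrip("\n") + "\n\n")
        self.proc.stdin.flush()
        reply = []
        # Block until the child ends its answer the same way.
        while True:
            line = self.proc.stdout.readline()
            if line in ("", "\n"):  # EOF or terminating empty line
                break
            reply.append(line.rstrip("\n"))
        return "\n".join(reply)

    def close(self):
        # Closing stdin lets the child's read loop finish.
        self.proc.stdin.close()
        self.proc.stdout.close()
        self.proc.wait()
```

Because every call to ask() waits for the terminating empty line, each reply is tied to the input that produced it. A message that itself contains an empty line would end the turn early, so a real bot would need to strip those first.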

There are a few lessons to be learned from their API:

  • Allow adding as many event handlers as you want for specific events, with convenience options for narrowing down to commonly needed situations (like messages starting with a prefix)
  • Inside these event handlers, provide an object containing all context you’d need, including information about who to respond to
  • Keep the bot alive when an event handler errors out
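The three lessons above can be sketched as a tiny dispatcher. This is an illustration only, not discordrb's implementation; the names Dispatcher, MessageEvent, start_with and respond are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class MessageEvent:
    """Everything a handler needs, including where to send the answer."""
    author: str
    channel: str
    content: str
    replies: List[str] = field(default_factory=list)

    def respond(self, text: str) -> None:
        # A real bot would send this over the network; we just record it.
        self.replies.append(f"#{self.channel} @{self.author}: {text}")


class Dispatcher:
    def __init__(self) -> None:
        self._handlers: List[Tuple[Optional[str], Callable]] = []
        self.errors: List[Exception] = []

    def message(self, start_with: Optional[str] = None):
        """Register a handler, optionally only for messages with a prefix."""
        def register(handler: Callable) -> Callable:
            self._handlers.append((start_with, handler))
            return handler
        return register

    def dispatch(self, event: MessageEvent) -> None:
        for prefix, handler in self._handlers:
            if prefix is not None and not event.content.startswith(prefix):
                continue
            try:
                handler(event)
            except Exception as exc:
                # A failing handler is recorded, the bot keeps running.
                self.errors.append(exc)


bot = Dispatcher()
seen: List[str] = []


@bot.message(start_with="!doctor ")
def consult(event: MessageEvent) -> None:
    event.respond("Why do you say " + event.content[len("!doctor "):] + "?")


@bot.message()
def broken(event: MessageEvent) -> None:
    raise RuntimeError("handler bug")


@bot.message()
def log(event: MessageEvent) -> None:
    seen.append(event.content)
```

Dispatching a message runs every matching handler in registration order, so the log handler still sees the message even though the one before it raised.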

Now, to test the bot I needed to unleash it on a server. It turns out that, unlike on IRC, bot accounts are handled specially. You must:

  • Register an application and obtain an ID for authorization purposes
  • Enable a bot account and obtain an authorization token
  • Generate a URL for inviting the bot to a server
  • Share that URL with someone wielding the necessary permissions
  • Hope for the best and wait for them to invite the bot
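The URL generation step can be sketched like this. The endpoint and query parameters reflect Discord's OAuth2 bot authorization flow as I understand it, so check them against the current official documentation before relying on them. The client ID below is a placeholder, and bot_invite_url is a name made up for this example.

```python
from urllib.parse import urlencode


def bot_invite_url(client_id: str, permissions: int = 0) -> str:
    """Build an invite link for a bot account (assumed URL format)."""
    query = urlencode({
        "client_id": client_id,      # the application ID from registration
        "scope": "bot",
        "permissions": permissions,  # integer bitfield of requested permissions
    })
    return "https://discord.com/api/oauth2/authorize?" + query


# Placeholder ID, not a real application.
print(bot_invite_url("123456789012345678"))
```

Note that the authorization token from the second step does not appear in the URL; only the application ID does.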

This meant that I had to create my own test server to check whether my code worked at all. For this reason I haven’t been able to get it running on the server I was invited to. If you want to, you can try it on your own server; the sources are on GitHub as always.


Cryptopals

03/03/2018

Solving the cryptopals crypto challenges was easily the most fun I’ve had programming. If you happen to work on public-facing code that relies on cryptography, by all means do these challenges. There is no crazy math involved[1] and the only prerequisite is that you’re familiar with any programming language. My solutions can be found on GitHub and include notes on each exercise, some of which spoil the puzzle bits.

I’ve learned the following while completing the original set of 48 exercises:

  • You should under no circumstances use ECB as cipher mode
  • Padding is a crucial thing to get right, both when attacking cryptographic systems and when implementing them
  • An attacker can bitflip your ciphertexts into anything they want and the only thing you can do about it is to check whether they’ve been tampered with before decrypting (like with a MAC or signature)
  • Do not ever reuse a nonce or you’ll weaken your crypto drastically
  • It’s much easier to exploit a sidechannel than to attack the cryptographic primitive
  • Overly detailed error messages can form a sidechannel
  • Do not seed an RNG with the current time, use your system’s CSPRNG instead
  • Do not use MT19937 for cryptographic purposes, given enough observation its next values can be predicted
  • Do not reuse your key as IV
  • Don’t invent your own MAC scheme, it may be susceptible to length extension attacks
  • Even something like a timing leak can form an exploitable sidechannel that circumvents the cryptographic system
  • Diffie-Hellman is susceptible to MITM attacks
  • Make sure to verify the parameters in asymmetric protocols for values that make the shared secret predictable and abort when encountering one
  • Don’t do textbook RSA, padding is crucial
  • Do not use low exponents with RSA
  • Do not use PKCS#1 v1.5 padding with RSA
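To make the point about time-seeded RNGs concrete, here is a small sketch. It is not one of the challenge solutions; the token format, the one-hour search window and the function names are assumptions made for the example. Python's random module uses MT19937, so a token derived from a timestamp seed can be reproduced by anyone who can guess roughly when it was issued.

```python
import random
import secrets


def weak_token(timestamp: int) -> str:
    """Token from MT19937 seeded with a guessable value. Do not do this."""
    rng = random.Random(timestamp)
    return "%032x" % rng.getrandbits(128)


def recover_seed(token: str, approx_time: int, window: int = 3600):
    """Try every timestamp around approx_time until the token is reproduced."""
    for guess in range(approx_time - window, approx_time + window + 1):
        if weak_token(guess) == token:
            return guess
    return None


def strong_token() -> str:
    """Token from the operating system's CSPRNG."""
    return secrets.token_hex(16)
```

The search space is only a few thousand seeds, so the brute force finishes almost instantly. The token from the secrets module has no seed to guess.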

One more thing that doesn’t fit into a short sentence. You’ve most certainly heard the advice “Don’t implement your own crypto”. This advice isn’t the whole truth because it doesn’t explain what exactly “your own crypto” means. Cryptography in software consists of primitives that are put together to achieve something useful, such as a hash function combined with a secret key to form an HMAC. These primitives may be considered safe in isolation, but that doesn’t mean their combination will be equally safe. These combinations are called cryptographic systems and the security of one relies upon making sure none of its invariants are violated. Therefore, creating your own cryptosystem out of stock crypto primitives also counts as “your own crypto” and is rightfully considered dangerous. Your best bet is to use a vetted library that has been designed so that it’s hard to use incorrectly, such as libsodium.
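As a small illustration of preferring a vetted construction over a hand-assembled one, the sketch below uses the HMAC implementation from Python's standard library rather than something like sha256(key + message), which is exposed to length extension. This is a stand-in example, not libsodium; the function names sign and verify are made up here.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag with the standard library's implementation."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag in constant time to avoid leaking where a mismatch occurs."""
    return hmac.compare_digest(sign(key, message), tag)


key = secrets.token_bytes(32)  # key from the system CSPRNG
tag = sign(key, b"amount=10")
```

Verification happens before the message is acted upon, and the comparison uses hmac.compare_digest instead of == so that the timing does not depend on how many bytes match.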

[1] The challenges like to emphasize that it’s only 9th grader math. This is almost correct: you’ll want to look up basic statistics (which I’ve had in 12th grade) and modular arithmetic (which I’ve had at college).

Hand-crafted Uberjars

26/02/2018

While making a MIDI REPL I ran into the problem of providing people interested in trying it out something self-contained so that they wouldn’t have to recreate my dev setup. The solution for this is making a JAR file containing the .class files of your project and all of its dependencies. There are tools for this purpose such as Ant and Maven, but I couldn’t figure out how to make them work for me, so I decided to take a closer look at what happens behind the scenes and created a simple Makefile.

A JAR file is just a ZIP archive following certain rules. To be runnable it must contain a manifest (stored as META-INF/MANIFEST.MF, which the jar tool generates from a plain text file such as manifest.txt) and class files inside directories mirroring the package structure. The only difference between a regular JAR and an Uberjar is that the latter will also include the class files of its dependencies. Tools for creating them will have to extract the class files from all JAR files involved and combine them into a directory tree before creating a new JAR file containing all required class files. Things can get ugly if your dependencies share the same package prefix (such as org.foo and org.bar) or if a dependency exists in multiple versions (such as org.foo depending on npm.leftpad-0.0.1 and org.bar depending on npm.leftpad-0.0.2)[1]. I don’t even attempt to deal with these.

The manifest is a text file following a fixed format. The only thing you can get wrong here is the entry point, which must be the name of a class containing a static main method. It’s required so that java -jar my.jar knows where to look; however, the entry point can be overridden by running java -cp my.jar <classname> instead. This is useful for debugging and allows you to add other dependency JARs to the classpath that you haven’t put into your Uberjar yet. Just change the argument to -cp to be a colon-separated list of JARs (semicolon-separated on Windows).
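A minimal sketch of both ideas follows: generating the manifest text and building the debugging command line. It is written in Python rather than as part of the Makefile, and the class and JAR names in the usage note are hypothetical. Two format details matter: every manifest line, including the last one, has to end with a newline, and lines are limited to 72 bytes (longer ones need continuation lines, which this sketch simply rejects).

```python
import os
from typing import List


def make_manifest(main_class: str) -> str:
    """Return manifest text declaring the entry point."""
    lines = ["Manifest-Version: 1.0", f"Main-Class: {main_class}"]
    for line in lines:
        if len(line.encode("utf-8")) > 72:
            # Simplification: real manifests wrap long lines instead.
            raise ValueError("manifest line longer than 72 bytes: " + line)
    # The trailing newline is required, otherwise the last entry is ignored.
    return "\n".join(lines) + "\n"


def debug_command(main_class: str, jars: List[str]) -> List[str]:
    """Build a java invocation that overrides the entry point via -cp."""
    # os.pathsep is ':' on Unix-like systems and ';' on Windows.
    return ["java", "-cp", os.pathsep.join(jars), main_class]
```

For example, make_manifest("midi.Repl") yields the two-line manifest for a hypothetical midi.Repl class, and debug_command("midi.Repl", ["my.jar", "dep.jar"]) gives the argument list for running it with an extra dependency on the classpath.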

The dependencies are unzipped into a temporary directory. The jar tool supports changing the working directory so that you can switch to that directory and add the extracted directories without any prefix. That’s everything necessary to create a runnable JAR!
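The same merge can be sketched without a temporary directory by copying archive entries directly. This is an illustrative Python version, not the Makefile from the project, and build_uberjar is a name made up for it. It assumes that dropping each input's META-INF directory is acceptable (that loses things like service files) and resolves duplicate class names by letting the first JAR win, which is exactly the conflict case the text says is not handled properly.

```python
import zipfile
from typing import List

MANIFEST_NAME = "META-INF/MANIFEST.MF"


def build_uberjar(output_path: str, manifest_text: str,
                  input_jars: List[str]) -> List[str]:
    """Merge the entries of input_jars into one JAR with a fresh manifest."""
    seen = set()
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as out:
        # Write the manifest first, which some tools expect.
        out.writestr(MANIFEST_NAME, manifest_text)
        seen.add(MANIFEST_NAME)
        for jar_path in input_jars:
            with zipfile.ZipFile(jar_path) as jar:
                for info in jar.infolist():
                    name = info.filename
                    # Skip directory entries and the inputs' own metadata.
                    if info.is_dir() or name.startswith("META-INF/"):
                        continue
                    if name in seen:
                        continue  # duplicate entry: first JAR wins
                    seen.add(name)
                    out.writestr(name, jar.read(name))
    return sorted(seen)
```

Listing the project's own JAR first in input_jars makes sure its classes take precedence over any identically named entries in the dependencies.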

[1] The way to deal with them is using a custom class loader, as demonstrated by yet another product in the problem space.

Design Is Hard

20/10/2017

This isn’t about the pixel pushing kind of design, but the engineering one. Given a problem domain, what choices do you make to create a tool that enables its user to effectively interact with another object? More importantly, how do you deal with choices that are hard to rectify afterwards? While this is going to be a rant, the subject is one of my more popular Emacs packages, Shackle. I thought the 1.0.0 release of it, with a new debugging facility to make troubleshooting easier, is just the right moment to ponder a bit about the choices I made and why I regret some of them.

You may wonder “Wait, what is wrong with Shackle? It has over a hundred stars on GitHub, a few thousand downloads on MELPA, dozens of people using it in their init files and a handful of people recommending it to others.”. While all of this is true, it’s not all roses. I occasionally get issues from users that don’t understand it at all and I can’t really blame them. There is a fundamental mismatch going on here because all this package does is hijack the display-buffer-alist variable to invent a similar, but not quite as powerful mechanism on top of it. It’s an inherently leaky abstraction which makes for less than ideal debugging: If it ever breaks down, you’ll have to understand both the abstraction and the underlying code it’s built upon.

This project started off with me not understanding how to use this variable at all. In hindsight, this should have been the first warning signal: If you can’t fully understand the problem, don’t expect to solve it in a satisfactory manner. There are a few glaring problems with display-buffer-alist:

  • The docstring for it is hard to parse. If a newbie asks how to customize the display of a certain buffer and is directed to that variable, I couldn’t blame them for just giving up on this altogether.
  • It isn’t clear how to display a buffer in a certain way. I’ve found only one example in the elisp manual so far and it’s more about display-buffer than display-buffer-alist.
  • Conditions may be buffer names and functions, but not major modes. This is rather annoying as it means you’ll have to write a function to check the major mode yourself. While this is far from fool-proof (the code setting up the buffer may enable the desired major mode only after displaying it), it works in many cases.
  • If your customization of display-buffer-alist contains a call to a function that errors out, the display of that buffer will fail. This is particularly annoying if you have a catch-all rule there that prevents the source debugger window from appearing, something I mostly ran into while developing Shackle. While you can use M-: (setq display-buffer-alist nil), it’s relatively annoying to do so.
  • The default behavior is rather inscrutable and mostly, but not only determined by display-buffer-fallback-action. Worse, some packages rely on the default behavior just to fail with customizations to display-buffer-alist.

Now, does Shackle do better? Well, it does in some ways while being worse in others:

  • Conditions are interpreted as buffer names (if a string) or modes (if a symbol) or a list of either. While this is convenient, the original design had the issue of making it impossible to match by regex or use a custom function. So I added a :regex modifier to the action (which is just wrong because it changes all of them to match by regex) and made a list starting with :custom be interpreted as a function, which isn’t nice either. Judging by GitHub’s search there are about three users of this functionality, with the most prolific one being doom.
  • Shackle tries to be easier to understand with regards to actions by abolishing the alist approach and instead going for a flat plist. There is no hierarchy whatsoever, which turned out to be a mistake: people didn’t understand that there were keywords with mutually-exclusive behavior, keywords that modified other keywords and keywords that work universally. I’ve had feature requests where I was asked to allow combining keywords more flexibly, to explain how the whole thing works and, most surprisingly, to provide a grammar of the implemented language. The latter found its way into the README and is more confusing than helpful IMO. If you want to understand the behavior, you’re best off heading to the source. I consider this to be the ultimate proof of failing at its design.
  • It’s way harder to shoot yourself in the foot. In case you do, you can always bail out with M-x shackle-mode and revert to vanilla Emacs behavior.
  • The mere act of enabling Shackle will subtly change the default behavior of displaying buffers. The reason for this is shackle--display-buffer-popup-window which tries to do something sensible, but will never behave like the original.
  • I’ve added a feature that doesn’t display a window differently, but rather modifies window parameters. Admittedly it makes things more convenient because you’d otherwise need a second package to achieve the same effect, but it’s the main reason why displaying buffers that aren’t meant to be selected has weird side effects.
  • Debugging Shackle not working as expected is rather tricky. In the best case you’ll need to look at the source code of a package to check whether it’s using display-buffer or a function using it internally (like pop-to-buffer, pop-to-buffer-same-window, switch-to-buffer-other-window, etc.). In the worst case you’ll need to debug the part of the package displaying such windows or Shackle itself while it tries matching conditions and applying actions. I’ve added a tracing mode to make the former easier, but the inherent leaky abstraction remains.
  • While Shackle stayed mostly the same, Emacs gained new capabilities for display-buffer-alist. There isn’t nearly as much reason for using Shackle now, other than laziness. Other people reached the same conclusion that it’s worth investing some of your time in customizing display-buffer-alist.

The bottom line is that I’m not happy with Shackle’s design, but am wise enough to keep it as is and not make any more invasive changes. My happiness (or lack thereof) isn’t worth risking the happiness of its users.