Metaprogramming: From C Preprocessing to Elixir Macros

Miguel Palhas

Developers have a love-hate relationship with metaprogramming. On the one hand, it’s a powerful tool to build reusable code. On the other hand, it can quickly become hard to understand and maintain.

I like to think of it as salt. It’s pretty handy on many occasions, but use just a little too much of it, and you’re left with an unenjoyable dish.

Also, large doses of either of them can lead to increased blood pressure. 😅

However, metaprogramming has come a long way since its early days. While I still try not to overuse it, it has become more useful and easier to work with. Let’s see how it evolved.

C/C++

If we go back a few decades, to a time when programming languages were closer to the metal, the C/C++ preprocessor was one of the few options we had to do something close to metaprogramming.

This preprocessor was literally what the name suggests: a pass that would run through the C source, process specific directives (such as #define and #if), and output a final version of the code to the compiler. This final version could change based on some criteria. It would look something like this:

#define FOO 1

#if FOO == 1
#define MSG "Hello, World"
#else
#define MSG "Goodbye, World"
#endif

#include <stdio.h>

int main() {
  printf(MSG);
  return 0;
}

This program would always print "Hello, World". As you may guess, changing the FOO definition to 0 and recompiling the program would cause it to print "Goodbye, World" instead.

These preprocessor directives would often be used to create code targeting specific platforms or architectures. For example, you could set different behaviors for your program when compiled to target Windows systems than when targeting Linux systems. The two resulting binaries would have only the code that was relevant to that specific platform, and thus wouldn’t need to perform runtime checks for these conditions. These savings in storage and runtime performance could often be significant.

However, if you have any C experience at all, you know how dangerous it is just in vanilla form. Now add a lot of preprocessing behavior on top of that, and it quickly becomes quite hard to manage. So it wouldn’t be advisable to use it for much more than small configurations, most of the time.

Ruby

With better technology and higher-level scripting languages came the possibility of more elaborate styles of programming. In Ruby particularly, metaprogramming proved to be a powerful, yet scary feature.

The way this works in Ruby is based on the idea that code is nothing more than a string of text, interpreted and executed by the Ruby environment.

Since Ruby is interpreted at runtime, there’s no requirement of having the entire codebase compiled upfront. Ruby allows you to dynamically define instance methods on classes.

Also, due to the way Ruby classes and instances are constructed internally, you can even define methods for individual instances rather than the entire class!

PS: Further reading on Ruby Classes here.

class Foo
  def hello1
    puts "Hello from a regular method"
  end

  [:hello2, :hello3].each do |f|
    define_method f do
      puts "Hello from a dynamically-defined #{f} method"
    end
  end
end

foo = Foo.new

foo.define_singleton_method(:hello4) { puts "Hello only from this instance of Foo" }

foo.hello1
foo.hello2
foo.hello3
foo.hello4

Ruby is also pretty lax when it comes to editing existing code, even from the standard library. This is valid Ruby:

array = [1, 2, 3]

# will print out 3
puts array.size

class Array
  def size
    "Hello"
  end
end

# will now print out "Hello"
puts array.size

Don’t do that, though! It will most likely break your program and is a bad practice overall.

Last but not least, Ruby has some powerful ways of handling unexpected method calls, such as the method_missing callback:

array = [1, 2, 3]

class Array
  def method_missing(method, *args)
    puts "#{method} method not found"

    if method == :sise
      puts "Did you intend to type size instead?"
    end
  end
end

puts array.sise

Overall, these abilities were a big game-changer for me when I first learned about them. They enabled me to think about my codebase in a whole new way and improve it in the process.

There were some issues, though. You know what they say: with great power comes great responsibility.

Several Ruby libraries used and abused these metaprogramming mechanisms to create their own Domain Specific Languages. In the long run, this overuse would result in problems similar to those of the C/C++ days: codebases that are difficult to understand and maintain.

Elixir took, in my opinion, yet another step forward in the right direction here…

Elixir ❤️

Here, metaprogramming is built into the language’s core in a much more powerful way. Whereas Ruby allowed you to define methods dynamically, or even generate a string and evaluate it as code (the old eval method that we all hate), Elixir allows you to mess with the Abstract Syntax Tree (AST) itself.

This is done through the quote macro:

iex> expr = quote do
  "Hello, " <> "World"
end

Trying out the above code, you’ll find that the string concatenation doesn’t get executed directly. Instead of a final string, you end up with an AST expression that describes your code:

{:<>, [context: Elixir, import: Kernel], ["Hello, ", "World"]}

Those familiar with Polish Notation may quickly identify that this is equivalent to the string concatenation code from above. So by quoting some code, you get an AST description of that code, which you can then use across the rest of your codebase.
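To make the prefix-notation point a bit more concrete, here is a minimal sketch. The module name AstShape is made up for illustration. It quotes a simple arithmetic expression, takes the resulting tuple apart, and then turns the AST back into source text with Macro.to_string/1:

```elixir
defmodule AstShape do
  # Hypothetical helper: returns the AST for `1 + 2` without evaluating it.
  def sum_ast do
    quote do
      1 + 2
    end
  end
end

# A call is a three-element tuple: {operator, metadata, arguments}.
{operator, _metadata, arguments} = AstShape.sum_ast()
IO.inspect(operator)   # prints :+
IO.inspect(arguments)  # prints [1, 2]

# Macro.to_string/1 renders the AST back as regular infix Elixir code.
IO.puts(Macro.to_string(AstShape.sum_ast()))  # prints 1 + 2
```

The operator comes first and its operands follow, which is exactly the Polish Notation shape. The metadata in the middle varies between Elixir versions, so the sketch ignores it.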

You can then start to reason about your code as if it were a data structure (which it is… an AST), and perform operations to transform it.
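As a rough sketch of what such a transformation could look like, the snippet below walks an AST with Macro.prewalk/2 and swaps every addition for a multiplication. AstRewrite and plus_to_times are names I made up for this example, not anything from the standard library:

```elixir
defmodule AstRewrite do
  # Hypothetical helper: replaces every `+` call in the AST with a `*` call.
  def plus_to_times(ast) do
    Macro.prewalk(ast, fn
      {:+, meta, args} -> {:*, meta, args}
      other -> other
    end)
  end
end

original = quote do: 2 + 3
rewritten = AstRewrite.plus_to_times(original)

# The rewritten tree reads as a different program...
IO.puts(Macro.to_string(rewritten))  # prints 2 * 3

# ...and evaluates as one, too.
{result, _binding} = Code.eval_quoted(rewritten)
IO.inspect(result)  # prints 6
```

Nothing here touches strings or regular expressions: the code is pattern-matched and rebuilt like any other tuple.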

Let’s modify things a little bit:

iex> expr = quote do
  "Hello, " <> name
end

Now our expression uses a dynamic name instead. However, where does that name come from? We don’t have that variable defined anywhere, but it is still syntactically correct:

{:<>, [context: Elixir, import: Kernel], ["Hello, ", {:name, [], Elixir}]}

However, it will fail to execute, which we can test by using Code.eval_quoted/3:

iex> Code.eval_quoted(expr)
** (CompileError) nofile:1: undefined function name/0
    (elixir) lib/code.ex:590: Code.eval_quoted/3
    test.ex:5: (file)

Let’s now create a second AST definition:

definition = quote do
  name = "Miguel"
end

This second expression describes the assignment of a variable called name. However, remember, nothing is actually being assigned yet: we’re just creating the AST for that operation.

We can combine these two expressions into a single one:

final_code = quote do
  unquote(definition)
  unquote(expr)
end

This ends up having the same result as if we had typed:

name = "Miguel"
"Hello, " <> name
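One way to check that claim is to evaluate the combined AST. The sketch below rebuilds the same three quotes inside a module (Greeting is a name I picked for the example) and hands the result to Code.eval_quoted/3:

```elixir
defmodule Greeting do
  # Hypothetical wrapper: builds the combined AST, a definition
  # followed by the expression that uses it.
  def final_code do
    definition =
      quote do
        name = "Miguel"
      end

    expr =
      quote do
        "Hello, " <> name
      end

    quote do
      unquote(definition)
      unquote(expr)
    end
  end
end

# Evaluating the combined AST now succeeds, because `name` is
# assigned before it is read.
{result, _binding} = Code.eval_quoted(Greeting.final_code())
IO.puts(result)  # prints Hello, Miguel
```

Both quotes are created in the same context, so the two occurrences of name refer to the same variable once they are stitched together.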

However, notice we never had to abandon the Elixir syntax and rules while doing so. We’re writing Elixir that writes Elixir!

This is heavily used internally within Elixir’s core. Whenever you define a function, or write a simple if expression, you’re invoking macros that take the AST of your code and expand it into lower-level constructs. Speaking of which…

Elixir’s Macros

Many of Elixir’s features are written with macros, and many of the common constructs you use could be rewritten as macros of your own. Let’s take, for instance, the unless macro (which already exists in the language’s core) and define it ourselves:

defmodule Foo do
  defmacro custom_unless(condition, do: do_clause, else: else_clause) do
    quote do
      if !unquote(condition) do
        unquote(do_clause)
      else
        unquote(else_clause)
      end
    end
  end

  defmacro custom_unless(condition, do: do_clause) do
    quote do
      Foo.custom_unless(unquote(condition), do: unquote(do_clause), else: nil)
    end
  end
end

defmodule Bar do
  require Foo

  Foo.custom_unless true, do: IO.puts("not true"), else: IO.puts("true")
end

Our custom_unless macro takes in the AST of a condition, along with the ASTs of a do clause and an optional else clause. The code it generates checks for the opposite of the condition (it runs whatever code was given as the condition and inverts the resulting boolean), then executes the code given for either the do or the else clause, depending on the result.
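The example above runs the macro in a module body with a literal true. As a small sketch of the same idea with a runtime value, here is a trimmed-down copy of the macro called from inside a function. UnlessSketch, my_unless, UnlessDemo and describe are all names invented for this illustration:

```elixir
defmodule UnlessSketch do
  # Same idea as custom_unless above, reduced to a single clause.
  defmacro my_unless(condition, do: do_clause, else: else_clause) do
    quote do
      if !unquote(condition) do
        unquote(do_clause)
      else
        unquote(else_clause)
      end
    end
  end
end

defmodule UnlessDemo do
  require UnlessSketch

  # The macro call is expanded at compile time into a plain `if`;
  # `value > 10` is only evaluated when describe/1 runs.
  def describe(value) do
    UnlessSketch.my_unless(value > 10, do: "small", else: "large")
  end
end

IO.puts(UnlessDemo.describe(3))   # prints small
IO.puts(UnlessDemo.describe(42))  # prints large
```

Because the macro receives code rather than values, only the branch that is selected at runtime gets executed.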

However, the fun part about Elixir is that, since even the basic constructs such as if clauses are often built using macros themselves, we can better embed our macros in the language. In other words, after defining our macro, this is also working Elixir code:

defmodule Bar do
  # importing instead of requiring allows us to call the macro directly,
  # without the Foo. prefix
  import Foo

  custom_unless true do
    IO.puts("not true")
  else
    IO.puts("true")
  end
end

This works because the interpretation of a multiline if/else block in Elixir is not much more than syntactic sugar for:

if condition, do: something, else: something_else
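If you want to see that equivalence for yourself, you can quote both forms and compare the resulting ASTs. This is only a sketch: SugarCheck and its functions are hypothetical names, and the metadata is stripped before comparing because it is not part of the code’s structure:

```elixir
defmodule SugarCheck do
  # Removes metadata so that only the structure of the AST is compared.
  def strip_meta(ast) do
    Macro.prewalk(ast, fn node ->
      Macro.update_meta(node, fn _meta -> [] end)
    end)
  end

  # The multiline block form.
  def block_form do
    quote do
      if true do
        :yes
      else
        :no
      end
    end
  end

  # The one-line keyword form.
  def keyword_form do
    quote do
      if true, do: :yes, else: :no
    end
  end
end

block = SugarCheck.strip_meta(SugarCheck.block_form())
keyword = SugarCheck.strip_meta(SugarCheck.keyword_form())

IO.inspect(block)             # prints {:if, [], [true, [do: :yes, else: :no]]}
IO.inspect(block == keyword)  # prints true
```

In both cases the do and else clauses arrive as a plain keyword list, which is exactly what the custom_unless clauses pattern match on.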

Conclusion

Hopefully, this has been a useful walkthrough of how metaprogramming evolved over time, especially for Elixir developers who may not know the full power of their language, or its history.

Guest author Miguel is a professional over-engineer at Portuguese-based Subvisual. He works mostly with Ruby, Elixir, DevOps, and Rust. He likes building fancy keyboards and playing excessive amounts of online chess.
