dnl -*- html -*-
define(__timestamp, 2022-02-08 08:52:53)dnl
define(__title, `in which five different paths lead to methods')dnl
define(__id, 197)dnl
include(header.html)

<p>I recently made a change in a codebase I've been working on which illustrated an interesting trade-off around modeling in software. The project was <a href="https://git.sr.ht/~technomancy/taverner">Taverner</a>, an IRC server written in <a href="https://fennel-lang.org">Fennel</a>.</p>

<p>In particular it had to do with the way that channels were modeled. A channel is basically a "chat room"; it's just something users can join which lets them send messages to anyone else who's also in the channel. In many languages you would define a Channel class which has a bunch of methods like join, part, send, etc. Fennel doesn't have classes, but there are a few different alternatives available<sup><a href="#fn1">1</a></sup>.</p>

<h4>Take 1: Module-based methods</h4>

<p>The most obvious approach is to have a <tt>channel</tt> module which just exports the functions that would have been methods along with a constructor function:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">send</span> [{<span class="keyword">:</span> buffer} nick ...]
  (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

(<span class="keyword">fn</span> <span class="function-name">join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&as</span> ch} nick conn]
  (<span class="keyword">tset</span> members nick conn)
  (send ch <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

<span class="comment-delimiter">;; </span><span class="comment">defined before part, which calls it</span>
(<span class="keyword">fn</span> <span class="function-name">empty?</span> [{<span class="keyword">:</span> members}]
  (<span class="keyword">=</span> nil (<span class="builtin">next</span> members)))

(<span class="keyword">fn</span> <span class="function-name">part</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&as</span> ch} nick ?cmd]
  (<span class="keyword">tset</span> members nick nil)
  (send ch nil (<span class="keyword">..</span> <span class="string">":"</span> nick) (<span class="keyword">or</span> ?cmd <span class="builtin">:PART</span>) name)
  (<span class="keyword">when</span> (empty? ch)
    (<span class="type">ch.remove</span>)))

(<span class="keyword">fn</span> <span class="function-name">flush</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> buffer}]
  (<span class="keyword">each</span> [nick conn (<span class="builtin">pairs</span> members)]
    (<span class="keyword">each</span> [_ [sender msg] (<span class="builtin">ipairs</span> buffer)]
      (<span class="keyword">when</span> (<span class="keyword">not=</span> nick sender)
        (conn<span class="builtin">:send</span> (<span class="keyword">..</span> msg <span class="string">"\r\n"</span>)))))
  (<span class="keyword">while</span> (<span class="builtin">next</span> buffer)
    (<span class="type">table.remove</span> buffer)))

(<span class="keyword">fn</span> <span class="function-name">member-names</span> [{<span class="keyword">:</span> members}]
  (<span class="keyword">icollect</span> [k (<span class="builtin">pairs</span> members)] k))

(<span class="keyword">fn</span> <span class="function-name">member?</span> [{<span class="keyword">:</span> members} nick]
  (<span class="keyword">not=</span> nil (<span class="keyword">.</span> members nick)))

(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name state]
  {<span class="keyword">:</span> name
   <span class="builtin">:members</span> {}
   <span class="builtin">:buffer</span> []
   <span class="builtin">:remove</span> <span class="keyword">#</span>(<span class="keyword">tset</span> <span class="type">state.channels</span> name nil)})

{<span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush <span class="keyword">:</span> empty? <span class="keyword">:</span> member-names <span class="keyword">:</span> member? <span class="keyword">:</span> make-channel}</pre>

<p>There's nothing particularly clever going on here, which I believe is a big strength. Everything is obvious. The <tt>make-channel</tt> function acts as a constructor, while every other function in the module takes a channel as its first argument, so you can write code like <tt>(channel.join ch client.nick client.conn)</tt> where <tt>ch</tt> is a channel table you got from calling the constructor.</p>

<p>The biggest downside here is that it lacks <em>encapsulation</em>. All the data for a channel is exposed in the table that gets passed around to other modules, and it isn't clear which fields are safe to use and which are implementation details which might change later on. In a small codebase maybe this is no problem, but as the codebase grows and changes over time, it becomes more difficult to know what effect a given change might have in a different part of it.</p>

<h4>Take 2: Closure-based methods</h4>

<p>There's an old saying that "closures are a poor man's objects" and "objects are a poor man's closures".
Keeping internal data private by exporting functions whose scope closes over the internal data is one of the oldest tricks in the book:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name server-state]
  (<span class="keyword">let</span> [members {}
        buffer []]
    (<span class="keyword">fn</span> <span class="function-name">flush</span> []
      (<span class="keyword">each</span> [nick conn (<span class="builtin">pairs</span> members)]
        (<span class="keyword">each</span> [_ [sender msg] (<span class="builtin">ipairs</span> buffer)]
          (<span class="keyword">when</span> (<span class="keyword">not=</span> nick sender)
            (conn<span class="builtin">:send</span> (<span class="keyword">..</span> msg <span class="string">"\r\n"</span>)))))
      (<span class="keyword">while</span> (<span class="builtin">next</span> buffer)
        (<span class="type">table.remove</span> buffer)))
    (<span class="keyword">fn</span> <span class="function-name">send</span> [nick ...]
      (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))
    (<span class="keyword">fn</span> <span class="function-name">join</span> [nick conn]
      (<span class="keyword">tset</span> members nick conn)
      (send <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))
    (<span class="keyword">fn</span> <span class="function-name">part</span> [nick ?cmd]
      (<span class="keyword">tset</span> members nick nil)
      (send nil (<span class="keyword">..</span> <span class="string">":"</span> nick) (<span class="keyword">or</span> ?cmd <span class="builtin">:PART</span>) name)
      (<span class="keyword">when</span> (<span class="keyword">=</span> nil (<span class="builtin">next</span> members)) <span class="comment">; last one out turns off the lights</span>
        (<span class="keyword">tset</span> <span class="type">server-state.channels</span> name nil)))
    {<span class="keyword">:</span> name <span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush
     <span class="builtin">:empty?</span> <span class="keyword">#</span>(<span class="keyword">=</span> nil (<span class="builtin">next</span> members))
     <span class="builtin">:member-names</span> <span class="keyword">#</span>(<span class="keyword">icollect</span> [k (<span class="builtin">pairs</span> members)] k)
     <span class="builtin">:member?</span> <span class="keyword">#</span>(<span class="keyword">not=</span> nil (<span class="keyword">.</span> members <span class="keyword">$</span>))}))

{<span class="keyword">:</span> make-channel}</pre>

<p>Now the module only exports one thing: the <tt>make-channel</tt> function, which returns a table that you can think of as if it were an instance of a Channel class. It has functions inside the table which act like methods would. This makes the interface of the channel very clear and well-defined. If you want to do anything with a channel, you have to use one of the functions in the channel table.
You can change anything about the internals, and as long as you update everything in the <tt>make-channel</tt> function, you know you won't break something elsewhere. In a word, it's encapsulated.</p>

<p>But there is one very serious downside to this style<sup><a href="#fn2">2</a></sup>: reloading the code would only affect new channels; existing ones would keep the same old code as before since only the module gets the new functions<sup><a href="#fn3">3</a></sup>. In a normal program I might put up with this, even though I really love reloading. But in a long-running IRC server, it's really not a good idea! Getting everyone to leave a channel so you can recreate it with the new version of the code is extremely disruptive. I absolutely need the ability to fix bugs and add features while the server is running without disrupting the users, and that means that as nice as this code feels, it's not going to cut it. How can we get both encapsulation and reloadability?</p>

<h4>Take 3: Metatable-based methods</h4>

<p>Lua tables have one feature which gives them an extraordinary amount of flexibility: metatables. There's a lot you can do with metatables, but for the purposes of this code the most important thing is that you can set an <tt>__index</tt> field on them which provides a fallback for when you try to look up a field which does not exist in the table. This lets us keep doing method lookups using the module table directly (allowing reloads) while also keeping the data itself out of the table which is exposed as the public interface:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">send</span> [{<span class="keyword">:</span> buffer} nick ...]
  (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

(<span class="keyword">fn</span> <span class="function-name">join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&as</span> ch} nick conn]
  (<span class="keyword">tset</span> members nick conn)
  (send ch <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

<span class="comment-delimiter">;; </span><span class="comment">... all the methods are the same as the first version</span>

(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name state]
  (<span class="keyword">let</span> [public {<span class="keyword">:</span> name}
        channel-state {<span class="builtin">:members</span> {}
                       <span class="builtin">:buffer</span> []
                       <span class="builtin">:remove</span> <span class="keyword">#</span>(<span class="keyword">tset</span> <span class="type">state.channels</span> name nil)}]
    (<span class="builtin">setmetatable</span> public {<span class="builtin">:__index</span> channel-state})
    (<span class="builtin">setmetatable</span> channel-state {<span class="builtin">:__index</span> (<span class="builtin">require</span> <span class="builtin">:channel</span>)})
    public))

{<span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush <span class="keyword">:</span> empty? <span class="keyword">:</span> member-names <span class="keyword">:</span> member? <span class="keyword">:</span> make-channel}</pre>

<p>This looks nice! It's very clear what the public fields are (only the channel's <tt>name</tt>) and the private fields are attached using the first metatable.
But if we copied the method functions directly into the <tt>channel-state</tt> table <em>in the constructor</em>, we would have the same reload problem as the previous version: a reload would change the module after we had already pulled the functions out of it, and existing channels wouldn't see the new values. Because of that, we use <em>the module itself</em> as the <tt>__index</tt> of the second metatable, so every method lookup goes through the module table at call time.</p>

<p>There's one big downside to this compared to the previous version: it lacks transparency.</p>

<pre>>> (local {: make-channel} (require :channel))
>> (local ch (make-channel "#mychannel" state))
>> ch
{:name "#mychannel"} ; wait, where are the methods?
>> ch.join
#&lt;function: 0x55c7d468f0&gt; ; but it's found if you ask for it directly
>> (ch:join client.nick client.conn) ; and this works fine!
</pre>

<p>The functions are found (via <tt>__index</tt>) when you go look them up, but they do not show up otherwise. This is a common problem with using metatables; they can lead to surprising, unpredictable behavior. While there is a workaround to this (the <tt>__pairs</tt> metamethod) it's error-prone and does not work on all versions of the Lua runtime; Lua 5.1, for instance, does not support it. Personally I try to avoid metatables unless the downsides of the alternatives are too great. But what other options are there?</p>

<h4>Take 4: Class-based methods</h4>

<p>Just because Lua and Fennel don't have classes as part of the language doesn't mean you can't use classes; metatables give you the flexibility to construct your own class system if that's what you really want.
The <a href="https://github.com/kikito/middleclass">middleclass</a> library is one of the most popular implementations of this for Lua, which means of course that we can use it from Fennel too:</p>

<pre class="code">(<span class="keyword">local</span> <span class="variable-name">class</span> (<span class="builtin">require</span> <span class="builtin">:middleclass</span>))
(<span class="keyword">local</span> <span class="variable-name">Channel</span> (class <span class="builtin">:Channel</span>))

(<span class="keyword">fn</span> <span class="function-name">Channel.send</span> [{<span class="keyword">:</span> buffer} nick ...]
  (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

(<span class="keyword">fn</span> <span class="function-name">Channel.join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&as</span> ch} nick conn]
  (<span class="keyword">tset</span> members nick conn)
  (ch<span class="builtin">:send</span> <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

<span class="comment-delimiter">;; </span><span class="comment">the methods are the same as before</span>

(<span class="keyword">fn</span> <span class="function-name">Channel.initialize</span> [self name state]
  (<span class="keyword">set</span> <span class="type">self.name</span> name)
  (<span class="keyword">set</span> <span class="type">self.members</span> {})
  (<span class="keyword">set</span> <span class="type">self.buffer</span> [])
  (<span class="keyword">set</span> <span class="type">self.remove</span> <span class="keyword">#</span>(<span class="keyword">tset</span> <span class="type">state.channels</span> name nil)))

Channel</pre>

<p>If you're used to Java or Ruby or another class-based language, this may look comfortingly familiar to you.
You define a class, and you give it methods. You invoke them using <tt>(ch:join client.nick client.conn)</tt> notation. But how does it fare on the encapsulation and reloadability fronts?</p>

<pre>>> (Channel:new "mychannel" {})
{:buffer {}
 :class {:__declaredMethods {:__tostring #&lt;function: 0x55c7c28450&gt;
                             :empty? #&lt;function: 0x55c7b56660&gt;
                             :flush #&lt;function: 0x55c7bd4aa0&gt;
                             :initialize #&lt;function: 0x55c7b711c0&gt;
                             :isInstanceOf #&lt;function: 0x55c7d45000&gt;
                             :join #&lt;function: 0x55c7d36da0&gt;
                             :member-names #&lt;function: 0x55c7c06720&gt;
                             :member? #&lt;function: 0x55c7f06a80&gt;
                             :part #&lt;function: 0x55c7b710b0&gt;
                             :send #&lt;function: 0x55c7f03e40&gt;}
         :__instanceDict @3{:__index @3{...}
                            :__tostring #&lt;function: 0x55c7c28450&gt;
                            :empty? #&lt;function: 0x55c7b56660&gt;
                            :flush #&lt;function: 0x55c7bd4aa0&gt;
                            :initialize #&lt;function: 0x55c7b711c0&gt;
                            :isInstanceOf #&lt;function: 0x55c7d45000&gt;
                            :join #&lt;function: 0x55c7d36da0&gt;
                            :member-names #&lt;function: 0x55c7c06720&gt;
                            :member? #&lt;function: 0x55c7f06a80&gt;
                            :part #&lt;function: 0x55c7b710b0&gt;
                            :send #&lt;function: 0x55c7f03e40&gt;}
         :name "Channel"
         :static {:allocate #&lt;function: 0x55c7c0a930&gt;
                  :include #&lt;function: 0x55c7b8a5d0&gt;
                  :isSubclassOf #&lt;function: 0x55c7d44f70&gt;
                  :new #&lt;function: 0x55c7d44b40&gt;
                  :subclass #&lt;function: 0x55c7b8a590&gt;
                  :subclassed #&lt;function: 0x55c7af5600&gt;}
         :subclasses {}}
 :members {}
 :name "mychannel"
 :remove #&lt;function: 0x55c7d2e570&gt;}
</pre>

<p>Yikes! That's a lot of ... stuff. The methods are just dumped straight into a nested table inside the instance itself (twice, for some reason?) and the fields are not encapsulated away at all. The middleclass wiki has <a href="https://github.com/kikito/middleclass/wiki/Private-stuff">some suggestions for how to keep data private</a> but they are quite inconvenient compared to simply using closures. On top of that, the printed representation of the instance is very cluttered and messy.
Overall it's not clear that we gain much from this approach beyond a sense of familiarity for people who come from certain other languages.</p>

<h4>Take 5: Reloadable, encapsulated methods</h4>

<p>So far the closure version from take 2 has appealed to me the most; the tight encapsulation there just feels <em>so tidy</em>. What if we could go back to that but do something about the reloading? Well, there is actually one other concern we haven't touched on with reloading yet, and it leads us to our solution.</p>

<p>When you reload, you're bringing a new version of a module into play in a system that's already running. When your program is a server, that means that you've got "in-flight" connections with active users of your program. What happens when you add a function that expects some new fields that didn't exist when your users initially connected? For example, let's say we add a ban list to the channels. This data wasn't included in the existing channels, but now you need to check it when you join:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">:</span> banned <span class="keyword">&as</span> ch} nick conn]
  (<span class="builtin">assert</span> (<span class="keyword">not</span> (<span class="type">lume.find</span> (<span class="keyword">or</span> banned []) nick)) <span class="string">"Cannot join channel; banned."</span>)
  (<span class="keyword">tset</span> members nick conn)
  (send ch <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))</pre>

<p>You could code defensively and make sure that every single reference to the field is wrapped in an <tt>or</tt>, but that's a drag. You're sure to miss one. And do you really want that check sticking around in your codebase forever?
What we really want here is something like Erlang's upgrade process<sup><a href="#fn4">4</a></sup> for when it <a href="https://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving">hot loads a new module</a>. Here we provide an <tt>upgrade</tt> function which takes the existing table and replaces its contents with the closures from the new version:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name server-state ?members ?buffer ?banned]
  (<span class="keyword">let</span> [members (<span class="keyword">or</span> ?members {})
        banned (<span class="keyword">or</span> ?banned [])
        buffer (<span class="keyword">or</span> ?buffer [])]
    (<span class="keyword">fn</span> <span class="function-name">send</span> [nick ...]
      (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))
    (<span class="keyword">fn</span> <span class="function-name">join</span> [nick conn]
      (<span class="builtin">assert</span> (<span class="keyword">not</span> (<span class="type">lume.find</span> banned nick)) <span class="string">"Cannot join channel; banned."</span>)
      (<span class="keyword">tset</span> members nick conn)
      (send <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))
    <span class="comment-delimiter">;; </span><span class="comment">... the methods are all the same as the closure-based version</span>
    (<span class="keyword">fn</span> <span class="function-name">upgrade</span> [self new-make]
      (<span class="keyword">each</span> [k v (<span class="builtin">pairs</span> (new-make name server-state members buffer banned))]
        (<span class="keyword">tset</span> self k v)))
    {<span class="keyword">:</span> name <span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush <span class="keyword">:</span> empty? <span class="keyword">:</span> member-names <span class="keyword">:</span> member?
     <span class="keyword">:</span> upgrade}))

{<span class="keyword">:</span> make-channel}</pre>

<p>We've extended the constructor to accept all the state fields as optional arguments (the ones beginning with a question mark), allowing you to build a new version of an existing channel by passing the existing state on in. The <tt>upgrade</tt> function does exactly this with the private data it's closed over. We'll need to modify the server's reload command to call <tt>upgrade</tt> on every one of the channels with the new constructor as its second argument. The <tt>upgrade</tt> function calls the new constructor to get an updated version of the channel, then it takes all these new functions from it and drops them into the existing channel, seamlessly upgrading it in-place without dropping any connections. Any currently-running code which had access to the old channel now can see all the new methods from the new constructor. It's the best of both worlds, and it didn't require sacrificing encapsulation. Best of all, it only took a few lines of code to accomplish.</p>

<p>But I do want to stress that each of these five approaches is a trade-off, and none of them is universally wrong. If you're not writing a server that keeps live connections open, it might not make sense to care about hot-loading upgrades.
If you're writing a program that launches, prints its output, and immediately exits, you might not care about reloading, and the second approach is probably fine. If you've got a high tolerance for weird/unexpected behavior, maybe metatables are fine. If serialization is important to you, the first one might come out ahead. Even though the class-based approach is my least favorite, it could suit some projects if the people working on the codebase have a background in object-oriented languages and aren't comfortable changing their style. Context is <em>everything</em>.</p>

<p>All in all I have to say that writing an IRC server has been a lot of fun and not as difficult as I expected it to be. At this point my code is only 366 lines but it supports channels, private messages, channel operators, bans, kicks, listing, and more. Writing an IRC bot is of course easier (a simple one is under a hundred lines) but a server could be a good project if you're looking for a little more of a challenge when picking up a new language.</p>

<hr>

<p>[<a name="fn1">1</a>] If you don't know Fennel, you can probably still follow along if you understand scope and closures; the main things to know are that <tt>fn</tt> declares a function, the curly brackets in the argument list are used to pull fields out of a table argument, curly brackets outside the argument list are used to make tables, <tt>:colon-style</tt> is string shorthand, and <tt>#(+ 2 $)</tt> is shorthand for a function that adds 2 to its argument.</p>

<p>[<a name="fn2">2</a>] Another smaller problem with this approach is that closures cannot be serialized, so if you had to save off a channel, you can't just take the channel table and write it out to disk. This isn't an issue in Taverner, but it could be for other things which could be modeled this way.</p>

<p>[<a name="fn3">3</a>] This is because in the Lua runtime used by Fennel, modules are the unit of reloading.
Reloading a module involves taking the module table, emptying it out, re-executing the module's file, and pouring the resulting fields back into the original table, meaning that any existing code which had access to the module table can see the new fields. I <a href="/189">wrote about this in more detail</a> in a previous blog post.</p>

<p>[<a name="fn4">4</a>] Of course, Erlang's version is much more sophisticated; it allows the old and new versions of the module to both exist simultaneously, moving processes over when it detects an opportune time to call the upgrade function. Since we don't have to worry about concurrency in Lua, it's much simpler.</p>

include(footer.html)