dnl -*- html -*-
define(__timestamp, 2022-02-08 08:52:53)dnl
define(__title, `in which five different paths lead to methods')dnl
define(__id, 197)dnl
include(header.html)
<p>I recently made a change in a codebase I've been working on which
  illustrated an interesting trade-off around modeling in software. The
  project was <a href="https://git.sr.ht/~technomancy/taverner">Taverner</a>,
  an IRC server written in <a href="https://fennel-lang.org">Fennel</a>.</p>

<p>In particular it had to do with the way that channels were
  modeled. A channel is basically a "chat room"; it's just something that
  users can join which lets them send messages to everyone else who's
  also in the channel. In many languages you would define a Channel
  class which has a bunch of methods like join, part, send,
  etc. Fennel doesn't have classes, but there are a few different
  alternatives available<sup><a href="#fn1">1</a></sup>.</p>

<h4>Take 1: Module-based methods</h4>

<p>The most obvious approach is to have a <tt>channel</tt> module
  which just exports the functions that would have been methods along
  with a constructor function:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">send</span> [{<span class="keyword">:</span> buffer} nick ...]
  (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

(<span class="keyword">fn</span> <span class="function-name">join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&amp;as</span> ch} nick conn]
  (<span class="keyword">tset</span> members nick conn)
  (send ch <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

(<span class="keyword">fn</span> <span class="function-name">part</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&amp;as</span> ch} nick ?cmd]
  (<span class="keyword">tset</span> members nick nil)
  (send ch nil (<span class="keyword">..</span> <span class="string">":"</span> nick) (<span class="keyword">or</span> ?cmd <span class="builtin">:PART</span>) name)
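  (<span class="keyword">when</span> (<span class="keyword">=</span> nil (<span class="builtin">next</span> members)) <span class="comment">; inline check, since empty? is only defined further down</span>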
    (<span class="type">ch.remove</span>)))

(<span class="keyword">fn</span> <span class="function-name">flush</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> buffer}]
  (<span class="keyword">each</span> [nick conn (<span class="builtin">pairs</span> members)]
    (<span class="keyword">each</span> [_ [sender msg] (<span class="builtin">ipairs</span> buffer)]
      (<span class="keyword">when</span> (<span class="keyword">not=</span> nick sender)
        (conn<span class="builtin">:send</span> (<span class="keyword">..</span> msg <span class="string">"\r\n"</span>)))))
  (<span class="keyword">while</span> (<span class="builtin">next</span> buffer)
    (<span class="type">table.remove</span> buffer)))

(<span class="keyword">fn</span> <span class="function-name">empty?</span> [{<span class="keyword">:</span> members}] (<span class="keyword">=</span> nil (<span class="builtin">next</span> members)))
(<span class="keyword">fn</span> <span class="function-name">member-names</span> [{<span class="keyword">:</span> members}] (<span class="keyword">icollect</span> [k (<span class="builtin">pairs</span> members)] k))
(<span class="keyword">fn</span> <span class="function-name">member?</span> [{<span class="keyword">:</span> members} nick] (<span class="keyword">not=</span> nil (<span class="keyword">.</span> members nick)))

(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name state]
  {<span class="keyword">:</span> name <span class="builtin">:members</span> {} <span class="builtin">:buffer</span> []
   <span class="builtin">:remove</span> <span class="keyword">#</span>(<span class="keyword">tset</span> <span class="type">state.channels</span> name nil)})

{<span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush
 <span class="keyword">:</span> empty? <span class="keyword">:</span> member-names <span class="keyword">:</span> member?
 <span class="keyword">:</span> make-channel}</pre>

<p>There's nothing particularly clever going on here, which I believe
  is a big strength. Everything is obvious. The <tt>make-channel</tt>
  function acts as a constructor, while every other function in the
  module takes a channel as its first argument, so you can write code
  like <tt>(channel.join ch client.nick client.conn)</tt>
  where <tt>ch</tt> is a channel table you got from calling the
  constructor.</p>

<p>The biggest downside here is that it
  lacks <em>encapsulation</em>. All the data for a channel is exposed
  in the table that gets passed around to other modules, and it isn't
  clear which fields are safe to use and which are implementation
  details which might change later on. In a small codebase maybe this
  is no problem, but as it grows and changes over time, it will
  make it more difficult to know what effect a given change might
  have in a different part of the codebase.</p>
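
<p>As a rough illustration of that risk, here is a sketch of what
  code elsewhere could do with such a table. The constructor is a
  trimmed copy of the one above; the <tt>state</tt> table and
  the <tt>mallory</tt> entry are made up for the example:</p>

<pre class="code">;; trimmed copy of the take 1 constructor and one of its functions
(fn make-channel [name state]
  {: name :members {} :buffer []
   :remove #(tset state.channels name nil)})

(fn member? [{: members} nick] (not= nil (. members nick)))

(local state {:channels {}})
(local ch (make-channel "#fennel" state))

;; Nothing stops outside code from writing to the internals directly.
;; This bypasses join, so no JOIN message is ever queued.
(tset ch.members :mallory {:send (fn [])})

(print (member? ch :mallory)) ; true
(print (length ch.buffer))    ; 0</pre>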

<h4>Take 2: Closure-based methods</h4>

<p>There's an old saying that "closures are a poor man's objects" and
  "objects are a poor man's closures". Keeping internal data private
  by exporting functions whose scope closes over the internal data is
  one of the oldest tricks in the book:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name server-state]
  (<span class="keyword">let</span> [members {}
        buffer []]

    (<span class="keyword">fn</span> <span class="function-name">flush</span> []
      (<span class="keyword">each</span> [nick conn (<span class="builtin">pairs</span> members)]
        (<span class="keyword">each</span> [_ [sender msg] (<span class="builtin">ipairs</span> buffer)]
          (<span class="keyword">when</span> (<span class="keyword">not=</span> nick sender)
            (conn<span class="builtin">:send</span> (<span class="keyword">..</span> msg <span class="string">"\r\n"</span>)))))
      (<span class="keyword">while</span> (<span class="builtin">next</span> buffer)
        (<span class="type">table.remove</span> buffer)))

    (<span class="keyword">fn</span> <span class="function-name">send</span> [nick ...]
      (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

    (<span class="keyword">fn</span> <span class="function-name">join</span> [nick conn]
      (<span class="keyword">tset</span> members nick conn)
      (send <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

    (<span class="keyword">fn</span> <span class="function-name">part</span> [nick ?cmd]
      (<span class="keyword">tset</span> members nick nil)
      (send <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) (<span class="keyword">or</span> ?cmd <span class="builtin">:PART</span>) name)
      (<span class="keyword">when</span> (<span class="keyword">=</span> nil (<span class="builtin">next</span> members)) <span class="comment">; last one out turns off the lights
</span>        (<span class="keyword">tset</span> <span class="type">server-state.channels</span> name nil)))

    {<span class="keyword">:</span> name <span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush
     <span class="builtin">:empty?</span> <span class="keyword">#</span>(<span class="keyword">=</span> nil (<span class="builtin">next</span> members))
     <span class="builtin">:member-names</span> <span class="keyword">#</span>(<span class="keyword">icollect</span> [k (<span class="builtin">pairs</span> members)] k)
     <span class="builtin">:member?</span> <span class="keyword">#</span>(<span class="keyword">not=</span> nil (<span class="keyword">.</span> members <span class="keyword">$</span>))}))

{<span class="keyword">:</span> make-channel}</pre>

<p>Now the module only exports one thing: the <tt>make-channel</tt>
  function, which returns a table that you can think of as if it were
  an instance of a Channel class. It has functions inside the table
  which act like methods would. This makes the interface of the
  channel very clear and well-defined. If you want to do anything
  with a channel, you have to use one of the functions in the channel
  table. You can change anything about the internals, and as long as
  you update everything in the <tt>make-channel</tt> function, you
  know you won't break something elsewhere. In a word, it's
  encapsulated.</p>
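
<p>A minimal sketch of what that looks like from the caller's side,
  using a trimmed copy of the constructor that keeps
  only <tt>join</tt> and <tt>member?</tt>. The empty table standing in
  for a connection is an assumption made for the example:</p>

<pre class="code">;; trimmed copy of the take 2 constructor
(fn make-channel [name]
  (let [members {}]
    {: name
     :join (fn [nick conn] (tset members nick conn))
     :member? #(not= nil (. members $))}))

(local ch (make-channel "#fennel"))

;; called with a dot rather than a colon: there is no self argument
(ch.join :alice {})

(print (ch.member? :alice)) ; true
(print ch.members)          ; nil: the table lives only inside the closure</pre>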

<p>But there is one very serious downside to this
  style<sup><a href="#fn2">2</a></sup>: reloading the code would only
  affect new channels; existing ones would keep the same old code as
  before since only the module gets the new
  functions<sup><a href="#fn3">3</a></sup>. In a normal program I
  might put up with this, even though I really love reloading. But in
  a long-running IRC server, it's really not a good idea!  Getting everyone to
  leave a channel so you can recreate a version of it which has the
  new version of the code is extremely disruptive. I absolutely need the
  ability to fix bugs and add features while the server is running
  without disrupting the users, and that meant as nice as this code
  feels, it's not going to cut it. How can we get both encapsulation and
  reloadability?</p>
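
<p>To make the reload problem concrete, here is a small simulation
  that does not touch the real module system. The
  names <tt>make-v1</tt> and <tt>make-v2</tt> are hypothetical
  stand-ins for the constructor before and after a reload, and
  the <tt>channel</tt> table plays the part of the module:</p>

<pre class="code">(fn make-v1 [] {:describe (fn [] "old code")})
(fn make-v2 [] {:describe (fn [] "new code")})

(local channel {:make-channel make-v1})  ; the module table
(local existing (channel.make-channel))  ; created before the reload
(set channel.make-channel make-v2)       ; the reload replaces the module's field
(local fresh (channel.make-channel))     ; created after the reload

(print (existing.describe)) ; old code
(print (fresh.describe))    ; new code</pre>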

<h4>Take 3: Metatable-based methods</h4>

<p>Lua tables have one feature which gives them an extraordinary
  amount of flexibility: metatables. There's a lot you can do with
  metatables, but for the purposes of this code the most important
  thing is that you can set an <tt>__index</tt> field on them which
  provides a fallback for when you try to look up a field
  which does not exist in the table. This lets us keep doing method
  lookups using the module table directly (allowing reloads) while also
  keeping the data itself out of the table which is exposed as the
  public interface:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">send</span> [{<span class="keyword">:</span> buffer} nick ...]
  (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

(<span class="keyword">fn</span> <span class="function-name">join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&amp;as</span> ch} nick conn]
  (<span class="keyword">tset</span> members nick conn)
  (send ch <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

<span class="comment-delimiter">;; </span><span class="comment">... all the methods are the same as the first version
</span>
(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name state]
  (<span class="keyword">let</span> [public {<span class="keyword">:</span> name}
        channel-state {<span class="builtin">:members</span> {} <span class="builtin">:buffer</span> []
                       <span class="builtin">:remove</span> <span class="keyword">#</span>(<span class="keyword">tset</span> <span class="type">state.channels</span> name nil)}]
    (<span class="builtin">setmetatable</span> public {<span class="builtin">:__index</span> channel-state})
    (<span class="builtin">setmetatable</span> channel-state {<span class="builtin">:__index</span> (<span class="builtin">require</span> <span class="builtin">:channel</span>)})
    public))

{<span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush
 <span class="keyword">:</span> empty? <span class="keyword">:</span> member-names <span class="keyword">:</span> member?
 <span class="keyword">:</span> make-channel}</pre>

<p>This looks nice! It's very clear what the public fields are
  (only the channel's <tt>name</tt>) and the private fields are attached
  using the first metatable. But if we put the method functions
  directly into the <tt>channel-state</tt> table <em>during the
  constructor</em> we would have the same reload problem as the previous
  version where the module containing the methods would change after
  we already pulled the functions out of it, and we wouldn't see the
  new values. Because of that, we use <em>the module itself</em> as
  the <tt>__index</tt> of the state table's own metatable.</p>
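
<p>The two-step lookup can be tried in isolation. This is only a
  sketch: the <tt>methods</tt> table stands in for the module, and
  the <tt>greet</tt> function is invented for the example:</p>

<pre class="code">(local methods {:greet (fn [self] (.. "hi from " self.name))})
(local private {:members {}})
(local public {:name "#fennel"})

(setmetatable public {:__index private})
(setmetatable private {:__index methods})

(print (public:greet))         ; hi from #fennel
(print (rawget public :greet)) ; nil: nothing was copied into public

;; Swapping a field in methods, as a reload would, is seen by existing tables.
(set methods.greet (fn [self] (.. "hello from " self.name)))
(print (public:greet))         ; hello from #fennel</pre>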

<p>There's one big downside to this compared to the previous version;
  it lacks transparency:</p>

<pre>>> (local {: make-channel} (require :channel))
>> (local ch (make-channel "#mychannel" state))
>> ch
{:name "#mychannel"} ; wait, where are the methods?
>> ch.join
#&lt;function: 0x55c7d468f0&gt; ; but it's found if you ask for it directly
>> (ch:join client.nick client.conn) ; and this works fine!
</pre>

<p>The functions are found (via <tt>__index</tt>) when you go look them up, but they do not show
  up otherwise. This is a common problem with using metatables; they
  can lead to surprising, unpredictable behavior. While there is a
  workaround to this (the <tt>__pairs</tt> metamethod) it's error-prone
  and does not work on all versions of the Lua runtime. Personally I try to avoid
  metatables unless the downsides of the alternatives are too
  great. But what other options are there?</p>

<h4>Take 4: Class-based methods</h4>

<p>Just because Lua and Fennel don't have classes as part of the
  language doesn't mean you can't use classes; metatables give you the
  flexibility to construct your own class system if that's what you
  really want. The <a href="https://github.com/kikito/middleclass">middleclass</a>
  library is one of the most popular implementations of this for Lua,
  which means of course that we can use it from Fennel too:</p>

<pre class="code">(<span class="keyword">local</span> <span class="variable-name">class</span> (<span class="builtin">require</span> <span class="builtin">:middleclass</span>))

(<span class="keyword">local</span> <span class="variable-name">Channel</span> (class <span class="builtin">:Channel</span>))

(<span class="keyword">fn</span> <span class="function-name">Channel.send</span> [{<span class="keyword">:</span> buffer} nick ...]
  (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

(<span class="keyword">fn</span> <span class="function-name">Channel.join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">&amp;as</span> ch} nick conn]
  (<span class="keyword">tset</span> members nick conn)
  (ch<span class="builtin">:send</span> <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

<span class="comment-delimiter">;; </span><span class="comment">the methods are the same as before
</span>
(<span class="keyword">fn</span> <span class="function-name">Channel.initialize</span> [self name state]
  (<span class="keyword">set</span> <span class="type">self.name</span> name)
  (<span class="keyword">set</span> <span class="type">self.members</span> {})
  (<span class="keyword">set</span> <span class="type">self.buffer</span> [])
  (<span class="keyword">set</span> <span class="type">self.remove</span> <span class="keyword">#</span>(<span class="keyword">tset</span> <span class="type">state.channels</span> name nil)))

Channel</pre>

<p>If you're used to Java or Ruby or another class-based language,
  this may look comfortingly familiar to you. You define a class, and
  you give it methods. You invoke them using <tt>(ch:join client.nick
  client.conn)</tt> notation. But how does it fare on the encapsulation
  and reloadability fronts?</p>

<pre>>> (Channel:new "mychannel" {})
{:buffer {}
 :class {:__declaredMethods {:__tostring #&lt;function: 0x55c7c28450&gt;
                             :empty? #&lt;function: 0x55c7b56660&gt;
                             :flush #&lt;function: 0x55c7bd4aa0&gt;
                             :initialize #&lt;function: 0x55c7b711c0&gt;
                             :isInstanceOf #&lt;function: 0x55c7d45000&gt;
                             :join #&lt;function: 0x55c7d36da0&gt;
                             :member-names #&lt;function: 0x55c7c06720&gt;
                             :member? #&lt;function: 0x55c7f06a80&gt;
                             :part #&lt;function: 0x55c7b710b0&gt;
                             :send #&lt;function: 0x55c7f03e40&gt;}
         :__instanceDict @3{:__index @3{...}
                            :__tostring #&lt;function: 0x55c7c28450&gt;
                            :empty? #&lt;function: 0x55c7b56660&gt;
                            :flush #&lt;function: 0x55c7bd4aa0&gt;
                            :initialize #&lt;function: 0x55c7b711c0&gt;
                            :isInstanceOf #&lt;function: 0x55c7d45000&gt;
                            :join #&lt;function: 0x55c7d36da0&gt;
                            :member-names #&lt;function: 0x55c7c06720&gt;
                            :member? #&lt;function: 0x55c7f06a80&gt;
                            :part #&lt;function: 0x55c7b710b0&gt;
                            :send #&lt;function: 0x55c7f03e40&gt;}
         :name "Channel"
         :static {:allocate #&lt;function: 0x55c7c0a930&gt;
                  :include #&lt;function: 0x55c7b8a5d0&gt;
                  :isSubclassOf #&lt;function: 0x55c7d44f70&gt;
                  :new #&lt;function: 0x55c7d44b40&gt;
                  :subclass #&lt;function: 0x55c7b8a590&gt;
                  :subclassed #&lt;function: 0x55c7af5600&gt;}
         :subclasses {}}
 :members {}
 :name "mychannel"
 :remove #&lt;function: 0x55c7d2e570&gt;}
</pre>

<p>Yikes! That's a lot of ... stuff. The methods are just dumped
  straight into a nested table inside the instance itself (twice, for
  some reason?) and the fields are not encapsulated away at all. The
  middleclass wiki
  has <a href="https://github.com/kikito/middleclass/wiki/Private-stuff">some
  suggestions for how to keep data private</a> but they are quite
  inconvenient compared to simply using closures. On top of that, the
  printed representation of the instance is very cluttered and
  messy. Overall it's not clear that we gain much from this approach
  beyond a sense of familiarity for people who come from certain other
  languages.</p>

<h4>Take 5: Reloadable, encapsulated methods</h4>

<p>So far the closure version from take 2 has appealed to me the most;
  the tight encapsulation there just feels <em>so tidy</em>. What
  if we could go back to that but do something about the reloading? Well, there is
  actually one other concern we haven't touched on with reloading
  yet, and it leads us to our solution.</p>

<p>When you reload, you're bringing a new version of a module into
  play in a system that's already running. When your program is a
  server, that means that you've got "in-flight" connections with
  active users of your program. What happens when you add a function
  that expects some new fields that didn't exist when your users
  initially connected? For example, let's say we add a ban list to the
  channels. This data wasn't included in the existing channels, but now
  you need to check it when you join:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">join</span> [{<span class="keyword">:</span> members <span class="keyword">:</span> name <span class="keyword">:</span> banned <span class="keyword">&amp;as</span> ch} nick conn]
  (<span class="builtin">assert</span> (<span class="keyword">not</span> (<span class="type">lume.find</span> (<span class="keyword">or</span> banned []) nick)) <span class="string">"Cannot join channel; banned."</span>)
  (<span class="keyword">tset</span> members nick conn)
  (send ch <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))</pre>

<p>You could code defensively and make sure that every single
  reference to the field is wrapped in an <tt>or</tt>, but that's a
  drag. You're sure to miss one. And do you really want that check sticking
  around in your codebase forever? What we really want here
  is something like Erlang's upgrade process<sup><a href="#fn4">4</a></sup> for when it
  <a href="https://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving">hot
    loads a new module</a>. Here we provide an <tt>upgrade</tt> function
  which takes the existing table and replaces its contents with the closures from the new version:</p>

<pre class="code">(<span class="keyword">fn</span> <span class="function-name">make-channel</span> [name server-state ?members ?buffer ?banned]
  (<span class="keyword">let</span> [members (<span class="keyword">or</span> ?members {})
        banned (<span class="keyword">or</span> ?banned [])
        buffer (<span class="keyword">or</span> ?buffer [])]

    (<span class="keyword">fn</span> <span class="function-name">send</span> [nick ...]
      (<span class="type">table.insert</span> buffer [nick (<span class="type">table.concat</span> [...] <span class="string">" "</span>)]))

    (<span class="keyword">fn</span> <span class="function-name">join</span> [nick conn]
      (<span class="builtin">assert</span> (<span class="keyword">not</span> (<span class="type">lume.find</span> banned nick))
              <span class="string">"Cannot join channel; banned."</span>)
      (<span class="keyword">tset</span> members nick conn)
      (send <span class="string">""</span> (<span class="keyword">..</span> <span class="string">":"</span> nick) <span class="builtin">:JOIN</span> name))

    <span class="comment-delimiter">;; </span><span class="comment">... the methods are all the same as the closure-based version
</span>
    (<span class="keyword">fn</span> <span class="function-name">upgrade</span> [self new-make]
      (<span class="keyword">each</span> [k v (<span class="builtin">pairs</span> (new-make name server-state members buffer banned))]
        (<span class="keyword">tset</span> self k v)))

    {<span class="keyword">:</span> name <span class="keyword">:</span> send <span class="keyword">:</span> join <span class="keyword">:</span> part <span class="keyword">:</span> flush
     <span class="keyword">:</span> empty? <span class="keyword">:</span> member-names <span class="keyword">:</span> member?
     <span class="keyword">:</span> upgrade}))

{<span class="keyword">:</span> make-channel}</pre>

<p>We've extended the constructor to accept all the state fields as
  optional arguments (the ones beginning with a question mark),
  allowing you to build a new version of an existing channel by
  passing the existing state on in. The <tt>upgrade</tt> function does
  exactly this with the private data it's closed over. We'll need to
  modify the server's reload command to call <tt>upgrade</tt> on every one of
  the channels with the new constructor as its second
  argument. The <tt>upgrade</tt> function calls the new constructor to
  get an updated version of the channel, then it takes all these new
  functions from it and drops them into the existing channel,
  seamlessly upgrading it in-place without dropping any
  connections. Any currently-running code which had access to the old channel
  now can see all the new methods from the new constructor. It's the
  best of both worlds, and it didn't require sacrificing
  encapsulation. Best of all, it only took a few lines of code to
  accomplish.</p>
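
<p>The change to the reload command might look something like the
  following sketch. The <tt>upgrade-channels</tt> helper is
  hypothetical rather than taken from Taverner, and it assumes
  that <tt>state.channels</tt> maps channel names to the tables
  returned by the constructor:</p>

<pre class="code">(fn upgrade-channels [state new-make-channel]
  ;; upgrade only writes into each channel table, never into
  ;; state.channels, so iterating with pairs here is safe
  (each [_ ch (pairs state.channels)]
    (ch:upgrade new-make-channel)))

;; In the reload command, once the channel module has been reloaded:
;; (upgrade-channels state (. (require :channel) :make-channel))</pre>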

<p>But I do want to stress that all five of these approaches are
  just trade-offs, and none of them is universally wrong. If you're not
  writing a server that keeps live connections open, it might not make
  sense to care about hot-loading upgrades. If you're writing a
  program that launches, prints its output, and immediately exits, you
  might not care about reloading, and the second approach is probably
  fine. If you've got a high tolerance for weird/unexpected behavior,
  maybe metatables are fine. If serialization is important to you, the
  first one might come out ahead. Even though the class-based approach is
  my least favorite, it could suit some projects if the
  people working on the codebase have a background in object-oriented
  languages and aren't comfortable changing their style. Context
  is <em>everything</em>.</p>

<p>All in all I have to say that writing an IRC server has been a lot
  of fun and not as difficult as I expected it to be. At this point my
  code is only 366 lines but it supports channels, private messages,
  channel operators, bans, kicks, listing, and more. Writing an IRC bot is
  of course easier (a simple one is under a hundred lines) but this
  could be good if you're looking for a little more of a challenge
  when picking up a new language.</p>

<hr>

<p>[<a name="fn1">1</a>] If you don't know Fennel, you can probably
  still follow along if you understand scope and closures; the main
  things to know are that <tt>fn</tt> declares a function, the curly
  brackets in the argument list are used to pull fields out of a table
  argument, curly brackets outside the argument list are used to
  make tables, <tt>:colon-style</tt> is string shorthand, and <tt>#(+
  2 $)</tt> is shorthand for a function that adds 2 to its
  argument.</p>

<p>[<a name="fn2">2</a>] Another smaller problem with this approach is
  that closures cannot be serialized, so if you had to save off a
  channel, you can't just take the channel table and write it out to
  disk. This isn't an issue in Taverner, but it could be for other
  things which could be modeled this way.</p>

<p>[<a name="fn3">3</a>] This is because in the Lua runtime used by
  Fennel, modules are the unit of reloading. Reloading a module
  involves taking the module table, emptying it out, re-executing the
  module's file, and pouring the resulting fields back into the
  original table, meaning that any existing code which had access to
  the module table can see the new fields. I <a href="/189">wrote
  about this in more detail</a> in a previous blog post.</p>

<p>[<a name="fn4">4</a>] Of course, Erlang's version is much more
  sophisticated; it allows the old and new versions of the module to
  both exist simultaneously, moving processes over when it detects an
  opportune time to call the upgrade function. Since we don't have
  to worry about concurrency in Lua, it's much simpler.</p>

include(footer.html)

Generated by Phil Hagelberg using scpaste at Thu Mar 3 13:10:13 2022 PST.