June 2026

GenServer Patterns in Elixir and the Same Idea in Ruby

Most of my production work runs on Elixir and OTP. But before Phoenix, I shipped Ruby in production: Rails APIs, background workers, gem integrations. When I reach for a GenServer today, I am not thinking "Elixir magic." I am thinking: one worker, private state, messages in a mailbox, callers wait for a reply or fire-and-forget. That model is not owned by the BEAM. Ruby can express it too, if you pick the right primitives.

This post walks through a minimal token-bucket rate limiter twice: once with GenServer, once with concurrent-ruby's Concurrent::Async. The reference repos are published on GitHub (links below): same algorithm, same API shape, two runtimes.

Why a token bucket?

A rate limiter is the smallest interesting stateful server. It holds a token count and a last-refill timestamp. Callers ask synchronously: may I proceed? Operators can reset asynchronously. Twenty concurrent callers must never drain more than capacity, which is exactly the guarantee a serial message loop provides.
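The refill arithmetic behind that state is small enough to work by hand. A minimal sketch, assuming a hypothetical pure helper named refill that takes the elapsed time as an argument instead of reading a clock:

```ruby
# Pure token-bucket refill: add elapsed * rate tokens, never exceed capacity.
# `refill` is an illustrative helper, not a function from the reference repos.
def refill(tokens, capacity:, rate:, elapsed_sec:)
  [capacity.to_f, tokens + elapsed_sec * rate].min
end

# Capacity 5, refill rate 2 tokens/sec, bucket currently holding 1 token.
refill(1.0, capacity: 5, rate: 2.0, elapsed_sec: 1.0)   # 1 + 1 * 2 = 3.0
refill(1.0, capacity: 5, rate: 2.0, elapsed_sec: 10.0)  # capped at 5.0
```

Both implementations in this post do the same arithmetic and then store the new timestamp alongside the new token count.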

Elixir: GenServer

GenServer is OTP's generic server behaviour. You implement init/1, handle_call/3, and handle_cast/2. The runtime gives you a dedicated BEAM process, a mailbox, and strict serialisation of messages, so no mutex is required inside your callbacks.

lib/rate_limiter.ex | GenServer implementation

defmodule RateLimiter do
  @moduledoc """
  Token-bucket rate limiter implemented as a GenServer.
  One BEAM process owns the bucket state.
  """

  use GenServer

  def start_link(opts \\ []) do
    name = Keyword.get(opts, :name, __MODULE__)
    GenServer.start_link(__MODULE__, opts, name: name)
  end

  def allow?(server, _key \\ nil), do: GenServer.call(server, :allow?)
  def stats(server), do: GenServer.call(server, :stats)
  def reset(server), do: GenServer.cast(server, :reset)

  @impl true
  def init(opts) do
    capacity = Keyword.get(opts, :capacity, 5)
    refill_rate = Keyword.get(opts, :refill_rate, 1.0)

    {:ok,
     %{
       capacity: capacity,
       tokens: capacity * 1.0,
       refill_rate: refill_rate,
       last_refill_ms: System.monotonic_time(:millisecond)
     }}
  end

  @impl true
  def handle_call(:allow?, _from, state) do
    state = refill(state)

    if state.tokens >= 1.0 do
      {:reply, true, %{state | tokens: state.tokens - 1.0}}
    else
      {:reply, false, state}
    end
  end

  @impl true
  def handle_call(:stats, _from, state) do
    state = refill(state)
    {:reply, Map.take(state, [:capacity, :tokens, :refill_rate]), state}
  end

  @impl true
  def handle_cast(:reset, state) do
    now = System.monotonic_time(:millisecond)
    {:noreply, %{state | tokens: state.capacity * 1.0, last_refill_ms: now}}
  end

  defp refill(%{tokens: tokens, capacity: cap, refill_rate: rate, last_refill_ms: last} = state) do
    now = System.monotonic_time(:millisecond)
    elapsed_sec = (now - last) / 1000.0
    new_tokens = min(cap * 1.0, tokens + elapsed_sec * rate)
    %{state | tokens: new_tokens, last_refill_ms: now}
  end
end

Client API usage

{:ok, _pid} = RateLimiter.start_link(name: MyLimiter, capacity: 5, refill_rate: 2.0)

RateLimiter.allow?(MyLimiter, "user-123")  # synchronous call: true | false
RateLimiter.stats(MyLimiter)               # %{capacity: 5, tokens: 4.0, ...}
RateLimiter.reset(MyLimiter)               # async cast: :ok

GenServer.call blocks the caller until handle_call returns {:reply, value, new_state}. GenServer.cast posts a message and returns immediately; the server handles it when it reaches the front of the mailbox. That is the Erlang gen_server contract, unchanged since the '90s.

Ruby: Concurrent::Async

The concurrent-ruby gem documents Async as loosely based on Erlang's gen_server, without supervision or linking. You include the module, call super() in initialize, and route work through await (synchronous) or async (fire-and-forget) proxies. Each object gets an executor thread; method calls are queued and processed one at a time. That is a mailbox.

lib/rate_limiter.rb | Concurrent::Async implementation

# frozen_string_literal: true

require "concurrent-ruby"

class RateLimiter
  include Concurrent::Async

  def initialize(capacity: 5, refill_rate: 1.0)
    super()
    @capacity = capacity
    @tokens = capacity.to_f
    @refill_rate = refill_rate
    @last_refill = monotonic_now
  end

  def allow?(_key = nil)
    refill!
    return false if @tokens < 1.0

    @tokens -= 1.0
    true
  end

  def stats
    refill!
    { capacity: @capacity, tokens: @tokens, refill_rate: @refill_rate }
  end

  def reset
    @tokens = @capacity.to_f
    @last_refill = monotonic_now
    :ok
  end

  def call_allow?(key = nil)
    await.allow?(key).value
  end

  def cast_reset
    async.reset
  end

  private

  def refill!
    now = monotonic_now
    elapsed = now - @last_refill
    @tokens = [@capacity.to_f, @tokens + (elapsed * @refill_rate)].min
    @last_refill = now
  end

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

Client API usage

limiter = RateLimiter.new(capacity: 5, refill_rate: 2.0)

limiter.call_allow?("user-123")  # synchronous: blocks for reply
limiter.await.stats.value        # read bucket without consuming
limiter.cast_reset               # async cast: returns immediately

One Ruby detail worth calling out: await returns a Concurrent::IVar, not the bare value. The small call_allow? helper unwraps .value, the same moment a GenServer client unblocks with a reply. async.reset is the cast: enqueue and return.
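To make the mailbox idea concrete without any gem, here is a rough stdlib-only sketch. TinyActor is a hypothetical class written for this illustration; it is not how concurrent-ruby implements Async. One thread drains one Queue, call blocks on a private reply queue, and cast enqueues and returns. Each request block returns [reply, new_state], loosely mirroring {:reply, value, new_state}.

```ruby
# Hypothetical stdlib-only actor: one thread drains one Queue (the mailbox).
# No error handling: a request that raises kills the loop and blocks callers.
class TinyActor
  def initialize(state)
    @state = state
    @mailbox = Queue.new
    @thread = Thread.new { run }
  end

  # Synchronous: enqueue the request, then block until the reply arrives.
  def call(&request)
    reply = Queue.new
    @mailbox << [request, reply]
    reply.pop
  end

  # Asynchronous: enqueue and return immediately; nobody waits for a result.
  def cast(&request)
    @mailbox << [request, nil]
    :ok
  end

  private

  # Serial message loop: only this thread ever reads or writes @state.
  def run
    loop do
      request, reply = @mailbox.pop
      result, @state = request.call(@state)
      reply << result if reply
    end
  end
end

counter = TinyActor.new(0)
counter.cast { |n| [nil, n + 10] }  # queued first, returns :ok at once
counter.call { |n| [n, n] }         # queued second, so it sees the cast's update
```

The ordering guarantee comes from the queue being first-in, first-out and from a single consumer thread, which is the property the rate limiter relies on.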

Side-by-side mapping

Elixir use GenServer -> Ruby include Concurrent::Async
Elixir GenServer.start_link/3 -> Ruby RateLimiter.new (spawns actor thread via super)
Elixir GenServer.call/2 (synchronous) -> Ruby limiter.await.method, then IVar.value
Elixir GenServer.cast/2 (asynchronous) -> Ruby limiter.async.method
Elixir handle_call/3 -> Ruby instance method invoked on actor thread
Elixir handle_cast/2 -> Ruby instance method invoked on actor thread (no reply)
Elixir process mailbox -> Ruby serialized method queue on executor thread
Elixir isolated process heap -> Ruby thread + discipline (do not share mutable refs)

Threads, OS processes, and BEAM processes

This is where comparisons often go wrong: conflating three different units of concurrency, two of which happen to share the name "process".

The chart above is illustrative, not a benchmark of our rate-limiter repos. It shows orders of magnitude: BEAM processes stay tiny per worker, Ruby threads carry a heavier per-thread cost, and OS processes buy isolation with RAM and spawn time. That is the trade space you are navigating when you pick GenServer vs Concurrent::Async vs Puma workers.

BEAM process (Elixir)

Not an OS process. Thousands fit in one OS process.
Isolated heap and garbage collection per process: a crash does not corrupt neighbours.
Preemptive scheduling across many schedulers (one per core by default).
Communication only via copying messages (for large binaries, ref-counted, but the discipline is still message passing).

Ruby Thread (MRI)

OS thread, but the Global VM Lock (GVL, commonly called the GIL) means only one thread executes Ruby bytecode at a time.
Excellent for I/O-bound actors (network, disk, sleep), and fine for a rate limiter, whose per-call work is a handful of arithmetic operations.
Poor choice for CPU-heavy parallel Ruby on many cores; consider Process.fork, a process pool, or JRuby/TruffleRuby.
Shared memory model: if you pass a mutable Hash into an actor and mutate it elsewhere, you have a data race. GenServer makes this hard to do by accident; Ruby makes it easy.
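The shared-reference hazard in the last point is easy to reproduce. A small sketch, using two hypothetical classes (Holder and SafeHolder) invented for this illustration: one keeps the reference it was handed, the other takes a frozen copy at the boundary. Note that dup is shallow, so nested structures would need a deeper copy.

```ruby
# Keeps the caller's object: outside mutations leak in.
class Holder
  attr_reader :config

  def initialize(config)
    @config = config
  end
end

# Takes a private, frozen (shallow) copy at the boundary instead.
class SafeHolder
  attr_reader :config

  def initialize(config)
    @config = config.dup.freeze
  end
end

shared = { limit: 5 }
leaky = Holder.new(shared)
safe = SafeHolder.new(shared)

shared[:limit] = 500   # mutation from "elsewhere"

leaky.config[:limit]   # 500: the holder observed the outside write
safe.config[:limit]    # 5: the copy is unaffected
```

On the BEAM the copy happens whenever a message is sent; in Ruby it is a convention you have to apply yourself.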

OS process (Ruby Process.spawn / fork)

True isolation like the BEAM: separate memory, separate GIL.
Heavyweight: slower spawn, higher RAM, harder IPC (pipes, Redis, DB).
Common pattern in MRI for CPU parallelism (e.g. Puma workers, Sidekiq processes).
Complementary to Concurrent::Async: actors inside a worker, processes across workers.
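The "separate memory, harder IPC" trade-off can be seen in a few lines. A sketch assuming a Unix-like system where Process.fork is available; count_in_child is a hypothetical helper for this illustration. The child increments its own copy of a variable and has to send the result back over a pipe.

```ruby
# Forks a child that increments its copy of `counter` and reports it via a pipe.
# Returns [parent_value, child_value]. Requires a platform that supports fork.
def count_in_child(counter)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    counter += 1                 # changes the child's copy only
    writer.write(counter.to_s)
    writer.close
    exit!(0)                     # skip at_exit handlers inherited from the parent
  end

  writer.close
  child_value = reader.read.to_i # blocks until the child closes its end
  reader.close
  Process.wait(pid)

  [counter, child_value]         # the parent's counter was never touched
end

parent_value, child_value = count_in_child(0)  # 0 in the parent, 1 from the child
```

Everything that crosses the process boundary has to be serialised, here as a string, which is the cost the bullet list above refers to.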

The mental model transfers. The fault-tolerance guarantees do not. OTP supervision (restart a crashed GenServer with a strategy, let it take down a subtree) has no first-class equivalent in concurrent-ruby. That is the honest ceiling on "Ruby can do GenServer." It can do the messaging pattern. It cannot do the reliability layer without you building it.
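To give a sense of what "building it" involves, here is a deliberately tiny restart loop. TinySupervisor is a hypothetical class for this illustration, not a concurrent-ruby API, and it covers only one idea from OTP: rerun a crashed worker until a restart budget runs out. There are no strategies, no child trees, and no restart-intensity time window.

```ruby
# Hypothetical minimal supervisor: reruns the worker block when it raises,
# up to max_restarts times, then lets the error propagate to the caller.
class TinySupervisor
  attr_reader :restarts

  def initialize(max_restarts: 3, &worker)
    @max_restarts = max_restarts
    @worker = worker
    @restarts = 0
  end

  # Runs the worker on its own thread and blocks until it returns normally
  # or the restart budget is exhausted (in which case the error is re-raised).
  def run
    Thread.new do
      begin
        @worker.call(@restarts)
      rescue StandardError
        raise if @restarts >= @max_restarts
        @restarts += 1
        retry
      end
    end.value
  end
end

supervisor = TinySupervisor.new(max_restarts: 3) do |attempt|
  raise "boom" if attempt < 2   # crash on the first two attempts
  :recovered
end

supervisor.run       # :recovered
supervisor.restarts  # 2
```

Even this much leaves open what state a restarted worker should begin with, which OTP answers by calling init/1 again.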

Proving concurrency safety

Both repos include the same test: twenty concurrent callers, capacity three, so exactly three should succeed. Elixir uses Task.async_stream; Ruby uses Thread.new. Same assertion, different scheduler.

test/rate_limiter_test.exs (excerpt)

test "concurrent callers never exceed capacity", %{name: name} do
  results =
    1..20
    |> Task.async_stream(fn _ -> RateLimiter.allow?(name) end, max_concurrency: 20)
    |> Enum.map(fn {:ok, allowed?} -> allowed? end)

  assert Enum.count(results, & &1) == 3
end

test/test_rate_limiter.rb (excerpt)

def test_concurrent_callers_never_exceed_capacity
  limiter = RateLimiter.new(capacity: 3, refill_rate: 1.0)

  results = Array.new(20) do
    Thread.new { limiter.call_allow? }
  end.map(&:value)

  assert_equal 3, results.count(true)
end

What is genuinely the same

"Do not communicate by sharing memory; share memory by communicating."
One serial worker owns mutable state, with no locks inside the bucket logic.
Synchronous vs asynchronous client APIs map cleanly (call vs cast).
The algorithm (token refill, capacity ceiling) is the same in both, nearly line for line.

What is different (and matters in production)

Supervision and restart strategies: OTP native; Ruby DIY.
Process isolation: BEAM per-actor GC; Ruby shared VM.
Back-pressure and observability: Telemetry, OTP releases, :sys.get_status/1 vs logging and custom metrics.
Distribution: Node clustering is built in; Ruby typically needs Redis, Kafka, or gRPC.
concurrent-ruby-edge adds ErlangActor and Channel for closer BEAM semantics, but they are edge APIs. Async in the main gem is the pragmatic GenServer-shaped choice.

Run the proof-of-concepts

Elixir

git clone https://github.com/ijunaid8989/rate-limiter-elixir.git
cd rate-limiter-elixir
mix test

Ruby

git clone https://github.com/ijunaid8989/rate-limiter-ruby.git
cd rate-limiter-ruby
bundle install
bundle exec ruby test/test_rate_limiter.rb
ruby bin/demo

Closing thought

I reach for Elixir when I want the pattern and the platform guarantees together: lightweight processes, supervision, hot code upgrades in the right deployment. I reach for Ruby when the ecosystem, team, or integration surface demands it, and I still model concurrent state as actors, not shared mutable singletons. Knowing both runtimes means choosing the guarantee you actually need, not the syntax you used last week.

Reference implementations: rate-limiter-elixir and rate-limiter-ruby on GitHub. Pair them, star them, cite them in your next architecture review when someone says "we can't do that, we're on Ruby."