juxtkeepcat - a Clojure transducer

Recently at work I wrote a Clojure transducer chain that took a collection of messages from a queue, then transformed and filtered them in several ways before grouping them into a map.

At one stage of the transducer, each input message needed to be sent to two different functions, which acted as triggers. To keep this post general, let's say that (trigger-1 msg) returns an event map if the message causes the trigger to fire, and nil otherwise. The same is true for (trigger-2 msg).

Each message could produce two, one, or zero events depending on the two triggers' responses. Any events returned should be propagated downstream through the transducer, and the original message can be discarded.
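
To make the three cases concrete, here is a small sketch with made-up triggers. The message keys (:id, :amount, :region), the event maps, and the firing conditions are all assumptions for illustration; the real triggers are not shown in this post. clojure.core/juxt is used only to display both responses side by side.

```clojure
;; Hypothetical triggers: the message shape and event keys are invented
;; for this example and do not come from the real system.
(defn trigger-1 [msg]
  (when (> (:amount msg) 100)
    {:event :large-amount :id (:id msg)}))

(defn trigger-2 [msg]
  (when (= :eu (:region msg))
    {:event :eu-region :id (:id msg)}))

(def messages
  [{:id 1 :amount 250 :region :eu}   ; fires both triggers
   {:id 2 :amount 250 :region :us}   ; fires trigger-1 only
   {:id 3 :amount 10  :region :us}]) ; fires neither

;; Raw responses of both triggers for each message
(map (juxt trigger-1 trigger-2) messages)
;; => ([{:event :large-amount, :id 1} {:event :eu-region, :id 1}]
;;     [{:event :large-amount, :id 2} nil]
;;     [nil nil])
```

The goal of the stage described here is to pass only the non-nil event maps downstream, flattened, without the intermediate vectors that juxt builds.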

One input turning into many is usually a job for mapcat. However, mapcat requires its function to return a collection, meaning there will be a small collection allocation on each pass. Lately I have been increasingly paranoid about the abundance of wasteful little allocations in Clojure programs that could be avoided with just a little effort and mechanical sympathy.

In the specific business case I knew that my count of triggers at this step was unlikely to ever go beyond two. So I wrote up a little transducer that works a little like clojure.core/juxt, a little like clojure.core/keep and also like clojure.core/mapcat.

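(defn- juxtkeepcat
  [f g]
  (fn [rf]
    (fn
      ([] (rf))
      ([acc] (rf acc))
      ([acc x]
       (let [fx  (f x)
             gx  (g x)
             ;; emit f's result first, if there is one
             acc (if (some? fx) (rf acc fx) acc)]
         ;; skip g's result if it is nil, or if the downstream step has
         ;; already signalled early termination with `reduced`
         (if (or (reduced? acc) (nil? gx))
           acc
           (rf acc gx)))))))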


;; usage
(sequence (juxtkeepcat trigger-1 trigger-2) messages)

It's not much! But this code can execute a million times a day on our system, so offering a little relief to the garbage collector seems worth the price of defining a custom transducer. To illustrate the alternative, the mapcat version might look something like:

(->> messages
     (sequence (mapcat (fn [msg]
                         (let [e1 (trigger-1 msg)
                               e2 (trigger-2 msg)]
                           (cond-> []
                             e1 (conj e1)
                             e2 (conj e2)))))))

If we wanted juxtkeepcat to support more than two functions, we would have to modify the source to accept more arguments. Many functions in clojure.core (e.g. map) have unrolled arities for two, three, or four arguments as well as a variadic arity to support higher argument counts. We could do the same here, but a variadic version would mostly defeat the purpose of writing this transducer, which is to reduce collection allocations, since a collection would need to be allocated to hold the varargs.
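
As a rough sketch of what the unrolled approach could look like, here is a version with fixed arities for two and three functions and no variadic arity. This is only one possible shape, not code from our system. Each arity also checks reduced? before calling the downstream reducing function again, so it stops emitting once a later stage such as take or take-while asks for early termination.

```clojure
(defn juxtkeepcat
  "Sketch: unrolled arities for two and three functions. There is no
  variadic arity, so no argument seq is allocated."
  ([f g]
   (fn [rf]
     (fn
       ([] (rf))
       ([acc] (rf acc))
       ([acc x]
        (let [fx  (f x)
              gx  (g x)
              acc (if (some? fx) (rf acc fx) acc)]
          (if (or (reduced? acc) (nil? gx))
            acc
            (rf acc gx)))))))
  ([f g h]
   (fn [rf]
     (fn
       ([] (rf))
       ([acc] (rf acc))
       ([acc x]
        (let [fx  (f x)
              gx  (g x)
              hx  (h x)
              ;; thread the accumulator through each non-nil result,
              ;; stopping as soon as downstream returns a reduced value
              acc (if (some? fx) (rf acc fx) acc)
              acc (if (or (reduced? acc) (nil? gx)) acc (rf acc gx))]
          (if (or (reduced? acc) (nil? hx))
            acc
            (rf acc hx))))))))

;; usage with three throwaway predicates standing in for triggers
(into []
      (juxtkeepcat #(when (even? %) [:even %])
                   #(when (pos? %) [:pos %])
                   #(when (zero? %) [:zero %]))
      [-1 0 2])
;; => [[:even 0] [:zero 0] [:even 2] [:pos 2]]
```

The cost is repetition: every extra arity is another near-copy of the step function, which is the same trade-off clojure.core makes for its unrolled functions.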

December 11, 2023