Code Observation: Clojure's Destructuring

Table of Contents
  1. Hypothesis
  2. Experiment
  3. Observations
    1. Sequential Data Structures
      1. Single Item Retrieval
      2. Sub-sequence Retrieval
      3. Nested Sequential Destructuring
      4. Detour: Lazy Sequences
    2. Associative Data Structures
      1. Nested Associative Destructuring
      2. Keyword Arguments: When Sequences become Maps
      3. Sequential : nth :: Associative : get
    3. Destructuring Contexts
  4. Conclusions
  5. Acknowledgements

Hypothesis

Clojure's destructuring helps us establish bindings to items in Clojure collections in a concise, intuitive way if we already understand how to use Clojure's data structures and core functions that work with them.

Here, concise means that destructuring employs fewer expressions than would be required using only Clojure's core functions for retrieving parts of data structures.

Here, intuitive means that Clojure's implementation and our own intuition are aligned. If they are not now, they should be by the experiment's end.

Experiment

Environment details:
  • Clojure: Version 1.10.3
  • JDK: OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.10, mixed mode, sharing)
  • System: Linux 5.11.0-18-generic #19-Ubuntu SMP Fri May 7 14:22:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

The forms below have been evaluated at Clojure's REPL. Lines printed to STDOUT are prefixed with ; (out).

Evaluation results are collapsed by default. You're encouraged to run the examples at your own REPL or mentally determine the outcomes before peeking.

Observations

We will explore how the syntax of Clojure's destructuring allows us to replace invocations of core Clojure functions. First, we will consider the depth of destructuring support specific to sequential and associative data structures. As we progress, we'll note what is common between those categories, as well as what other Clojure features have related syntax. Finally, we'll review the breadth of binding contexts in which destructuring is supported.

Sequential Data Structures

Sequential data structures maintain an ordering of their items. Destructuring them, therefore, allows us to retrieve items in a specified order.

We will use the following coll in all of the sequential examples that follow.

(def coll (map (comp char (partial + 65)) (range 26)))
;;=> #'user/coll

coll
;;=> (\A \B \C \D \E \F \G \H \I \J \K \L \M \N \O \P \Q \R \S \T \U \V \W \X \Y \Z)


;; By using a seq, rather than a concrete list or vector,
;; we demonstrate which functions do or do not require
;; a particular concrete collection.
(type coll)
;;=> clojure.lang.LazySeq

Single Item Retrieval

The following functions and macros in Clojure's core API allow you to pull sequential data structures apart:

  • Positional: first, fnext, last, nth, rand-nth, second
  • Positional for nested sequences: ffirst
  • Specific to certain data structures: aget, peek
  • Based on truthiness: some, when-first
  • A Clojure set, when invoked, returns the value supplied if it's contained in the set, else nil

Sequential destructuring gives us a way to pull values out of a sequence without having to use these directly, though destructuring does not support all of the behaviors of all of these functions.

Expand the section below if you need a reminder of how these functions behave.

Examples for single-item retrieval in Clojure's Sequence API
;; Basic operations:
(first coll)    ; \A
(second coll)   ; \B
(last coll)     ; \Z
(rand-nth coll) ; \K

;; In absence of cadadadr and friends,
;; a few built-in compositions of the above:
(ffirst [coll coll])           ; \A
(ffirst [(reverse coll) coll]) ; \Z
(fnext coll)                   ; \B

;; Efficient retrieval based on
;; concrete persistent data structures:
(try (peek coll) (catch Exception e (.getMessage e)))
;;=> "class clojure.lang.LazySeq cannot be cast to class clojure.lang.IPersistentStack..."
(peek (apply list coll))  ; \A
(peek (into () coll))     ; \Z
(peek (vec coll))         ; \Z
(nth (vec coll) 2)        ; \C
(nth (apply list coll) 2) ; \C
(nth (into () coll) 2)    ; \X

;; Sets as functions of their members:
(#{\A} \A)        ; \A
(#{\A} \B)        ; nil
(some #{\A} coll) ; \A

;; Utility macro for conditional binding of `(first coll)`:
(when-first [letter coll]
  letter)
;;=> \A

(when-first [letter [false]]
  letter)
;;=> false

(when-first [letter ()]
  letter)
;;=> nil

If we consider that literal vectors are used by defn and let for ordered bindings, it should not surprise us that the syntax for destructuring sequential data structures also employs literal vectors.

(let [[?] coll]
  ?)
Result
\A
(let [[_ ?] coll]
  ?)
Result
\B
(let [[_ _ ?] coll]
  ?)
Result
\C

The left-hand side of each binding—which is normally just a Clojure symbol—is treated as sequential destructuring when a literal vector is used. The [?], [_ ?], and [_ _ ?] forms bind ? to the value of (nth coll 0), (nth coll 1), and (nth coll 2), respectively.

How far into the sequence can we go?

(let [[__0 __1 __2 __3 __4 __5 __6 __7 __8 __9
       _10 _11 _12 _13 _14 _15 _16 _17 _18 _19
       _20 _21 _22 _23 _24 ?] 
      coll]
  ?)
Result
\Z

What happens if we go past the end of a sequence?

(let [[_ _ ?] (take 2 coll)]
  ?)
Result
nil

Important Observation: Destructuring binds nil rather than throwing exceptions when it can't find what we're asking for.
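
One corollary is worth a quick sketch: because a missing position is bound to nil, destructuring by itself cannot distinguish a sequence that is too short from one that really contains nil at that position. The names short-pair, nil-pair, and second-item below are made up for this illustration.

```clojure
;; Hypothetical data, for illustration only.
(def short-pair [:a])      ; has no second item
(def nil-pair   [:a nil])  ; second item is explicitly nil

(defn second-item
  "Returns the second item of xs using sequential destructuring."
  [xs]
  (let [[_ ?] xs]
    ?))

(second-item short-pair) ;;=> nil
(second-item nil-pair)   ;;=> nil

;; If the difference matters, check the size explicitly.
(count short-pair) ;;=> 1
(count nil-pair)   ;;=> 2
```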

Before progressing to more advanced examples, let's macroexpand our let to understand what Clojure functions are used:

(macroexpand
 '(let [[?1 ?2] coll]
    [?1 ?2]))
Result
(let*
 [vec__17860 coll
  ?1 (clojure.core/nth vec__17860 0 nil)
  ?2 (clojure.core/nth vec__17860 1 nil)]
 [?1 ?2])

So Clojure's core nth function powers sequential destructuring. Whereas nth supplied with only a coll and index will throw an exception when that index is out of bounds, if a third not-found value is passed, then nth will return that instead:

(try (nth "HAL" 3) (catch Exception e (.getMessage e)))
;;=> "String index out of range: 3"

(nth "HAL" 3 ::not-found)
;;=> :user/not-found

We can further deduce that since destructuring leverages nth, any data structure that nth accepts can be destructured in the same way. In the examples that follow, remember that into uses conj to build new collections, so lists and vectors made with into will have opposite orderings.

(let [[?] (reduce str coll)] ?)  ; \A
(let [[?] (into [] coll)] ?)     ; \A
(let [[?] (into () coll)] ?)     ; \Z
(let [[?] (into-array coll)] ?)  ; \A
(let [[_k v] (first {:A \A})] v) ; \A
(let [[?] (into (clojure.lang.PersistentQueue/EMPTY) coll)] ?) ; \A
(let [[?] (java.util.ArrayList. coll)] ?) ; \A

(let [[?] nil] ?) ; nil

Remember that Clojure's sets are not sequential data structures and cannot be used with nth or sequential destructuring.
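
To make that concrete, here is a small sketch of what happens when a set lands on the right-hand side of a purely positional destructuring expression. The names letters and set-error are hypothetical, and converting the set with seq is just one possible workaround.

```clojure
;; Hypothetical one-item set, for illustration only.
(def letters #{\A})

;; Positional destructuring expands to `nth`, which rejects sets.
(def set-error
  (try
    (let [[?] letters] ?)
    (catch UnsupportedOperationException e
      (.getMessage e))))

set-error
;;=> "nth not supported on this type: PersistentHashSet"

;; Converting to a seq first works, but a set's iteration order is not
;; something to rely on, so this is only predictable for one-item sets.
(let [[?] (seq letters)] ?)
;;=> \A
```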

Expand the section below to see the salient aspects of the implementation of nth which reveal the kinds of values it supports.

Underlying Java implementation of nth
// From src/jvm/clojure/lang/RT.java

static public Object nth(Object coll, int n){
	if(coll instanceof Indexed)
		return ((Indexed) coll).nth(n);
	return nthFrom(Util.ret1(coll, coll = null), n);
}

static Object nthFrom(Object coll, int n){
	if(coll == null)
		return null;
	else if(coll instanceof CharSequence)
		return Character.valueOf(((CharSequence) coll).charAt(n));
	else if(coll.getClass().isArray())
		return Reflector.prepRet(coll.getClass().getComponentType(),Array.get(coll, n));
	else if(coll instanceof RandomAccess)
		return ((List) coll).get(n);
	else if(coll instanceof Matcher)
		return ((Matcher) coll).group(n);

	else if(coll instanceof Map.Entry) {
		Map.Entry e = (Map.Entry) coll;
		if(n == 0)
			return e.getKey();
		else if(n == 1)
			return e.getValue();
		throw new IndexOutOfBoundsException();
	}

	else if(coll instanceof Sequential) {
		ISeq seq = RT.seq(coll);
		coll = null;
		for(int i = 0; i <= n && seq != null; ++i, seq = seq.next()) {
			if(i == n)
				return seq.first();
		}
		throw new IndexOutOfBoundsException();
	}
	else
		throw new UnsupportedOperationException(
				"nth not supported on this type: " + coll.getClass().getSimpleName());
}

Sub-sequence Retrieval

Now that we understand how to create bindings for individual items, let's progress to extracting sub-sequences. The following functions and macros in Clojure's core API allow you to extract one or more sub-sequences from a sequential data structure:

  • Positional: butlast, drop, drop-last, next, nnext, nthnext, nthrest, random-sample, rest, split-at, subs, subvec, take, take-last, take-nth
  • Positional for nested sequences: nfirst
  • Specific to certain data structures: pop
  • Based on truthiness: drop-while, split-with, take-while

We'll see that destructuring can take the place of some of these core functions, too. If you need a refresher on these forms, expand the section below.

Examples for sub-sequence retrieval in Clojure's Sequence API
;; Basic operations:
(take 3 coll)                ; (\A \B \C)
(drop 23 coll)               ; (\X \Y \Z)
(count (take 1000000 coll))  ; 26
(take 3 (next coll))         ; (\B \C \D)
(take 3 (rest coll))         ; (\B \C \D)
(take-last 3 coll)           ; (\X \Y \Z)
(take-last 3 (butlast coll)) ; (\W \X \Y)
(drop-last 23 coll)          ; (\A \B \C)
(second (split-at 23 coll))  ; (\X \Y \Z)

(subs (reduce str coll) 13 15) ; "NO"
(subvec (vec coll) 23) ; [\X \Y \Z]

;; In absence of cadadadr and friends,
;; a few built-in compositions of the above:
(take 3 (nfirst [coll coll])) ; (\B \C \D)
(take 3 (nnext coll))         ; (\C \D \E)
(nthnext coll 23) ; (\X \Y \Z)
(nthrest coll 23) ; (\X \Y \Z)

;; Efficient retrieval based on
;; concrete persistent data structures:
(try (pop coll) (catch Exception e (.getMessage e)))
;;=> "class clojure.lang.LazySeq cannot be cast to class clojure.lang.IPersistentStack..."
(take-last 3 (pop (vec coll)))   ; (\W \X \Y)
(take 3 (pop (into () coll)))    ; (\Y \X \W)
(take 3 (pop (apply list coll))) ; (\B \C \D)
(take 3 (pop (into (clojure.lang.PersistentQueue/EMPTY) coll))) ; (\B \C \D)

;; Based on predicate functions:
(drop-while #(< (int %) 88) coll)          ; (\X \Y \Z)
(second (split-with #(< (int %) 88) coll)) ; (\X \Y \Z)
(take-while #(< (int %) 68) coll)          ; (\A \B \C)

;; Interesting extras:
(take-nth 8 coll) ; (\A \I \Q \Y)
(random-sample 0.1 coll)
; Seven executions:
; (\K \M \X)
; ()
; (\Q \T \Z)
; (\E \F \G \O)
; (\B \F \G \H \W \X)
; (\H \Q \S \W \Y)
; (\K \W \Y)

Recall that defn supports a variable number of arguments: inserting a & followed by a symbol binds that symbol to a list of any extra arguments passed to our function. It should not surprise us, then, that the same syntax is used to bind a variable number of final items when destructuring:

(let [[_ & ?rest] coll]
  ?rest)
Result
; (\B \C \D \E \F \G \H \I \J \K \L \M \N \O \P \Q \R \S \T \U \V \W \X \Y \Z)
(let [[_ _ & ?rest] coll]
  ?rest)
Result
; (\C \D \E \F \G \H \I \J \K \L \M \N \O \P \Q \R \S \T \U \V \W \X \Y \Z)

As with varargs support in defn, we cannot attempt to add any additional positional bindings after the & and its symbol. The following does not compile because macroexpansion fails:

(try
  (macroexpand '(let [[_ & ?rest ?another] coll] ?another))
  (catch Throwable t
    (.getMessage (.getCause t))))
;;=> "Call to clojure.core/let did not conform to spec."

Let's take a look at a successful macroexpansion to see how & usage translates to Clojure functions:

(macroexpand
 '(let [[_1 _2 & ?rest] coll]
    ?rest))
Result
(let*
 [vec__17698 coll
  seq__17699 (clojure.core/seq vec__17698)
  first__17700 (clojure.core/first seq__17699)
  seq__17699 (clojure.core/next seq__17699)
  _1 first__17700
  first__17700 (clojure.core/first seq__17699)
  seq__17699 (clojure.core/next seq__17699)
  _2 first__17700
  ?rest seq__17699]
 ?rest)

Based on the number of single-item retrievals at the beginning of our destructuring expression, a corresponding number of calls to first and next are used to establish the positional and final ?rest bindings.
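
As a rough cross-check of that reading, the sketch below writes the same binding twice: once with destructuring and once by hand with seq and next. The function names and the letters vector are invented for this comparison, and the hand-written version only approximates the real expansion.

```clojure
;; Hypothetical data, for illustration only.
(def letters [\A \B \C \D])

;; With destructuring.
(defn rest-after-two [xs]
  (let [[_ _ & more] xs]
    more))

;; Roughly what the expansion does, written by hand.
(defn rest-after-two-by-hand [xs]
  (let [s    (seq xs)
        s    (next s)   ; skip the first item
        more (next s)]  ; skip the second item
    more))

(rest-after-two letters)         ;;=> (\C \D)
(rest-after-two-by-hand letters) ;;=> (\C \D)

;; Because `next` is used, an exhausted sequence yields nil, not ().
(rest-after-two [\A \B])         ;;=> nil
```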

In our examples so far we have used an already-bound coll, but for extra concision we can bind a new collection and destructure parts of it in one destructuring expression. For this feature, destructuring shares syntax with require, which expects the keyword :as followed by an alias. Here is an example that employs all three aspects we've observed:

(let [[?1 ?2 & ?rest :as alphabet] (map (comp char (partial + 65)) (range 26))]
  [?1 ?2 (count ?rest) (count alphabet)])
Result
[\A \B 24 26]

And as a final test that we understand how this works, let's look at the macroexpansion of this last example:

(macroexpand
  '(let [[?1 ?2 & ?rest :as alphabet] (map (comp char (partial + 65)) (range 26))]
     [?1 ?2 (count ?rest) (count alphabet)]))
Result
(let*
 [vec__17703 (map (comp char (partial + 65)) (range 26))
  seq__17704 (clojure.core/seq vec__17703)
  first__17705 (clojure.core/first seq__17704)
  seq__17704 (clojure.core/next seq__17704)
  ?1 first__17705
  first__17705 (clojure.core/first seq__17704)
  seq__17704 (clojure.core/next seq__17704)
  ?2 first__17705
  ?rest seq__17704
  alphabet vec__17703]
 [?1 ?2 (count ?rest) (count alphabet)])

Note our alphabet binding is the final one in the expansion.
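
One practical consequence of that expansion is that :as is bound to the original value, not to the seq used for the positional bindings. The following sketch uses a hypothetical describe function to show that the value keeps its concrete type.

```clojure
(defn describe
  "Hypothetical helper: returns the pieces bound by a sequential
  destructuring expression that uses & and :as."
  [s]
  (let [[?first & ?rest :as whole] s]
    {:first ?first
     :rest  ?rest
     :whole whole}))

(describe "abc")
;;=> {:first \a, :rest (\b \c), :whole "abc"}

;; `whole` is still the value we passed in, not a seq of it.
(string? (:whole (describe "abc")))   ;;=> true
(vector? (:whole (describe [1 2 3]))) ;;=> true
```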

Nested Sequential Destructuring

If we can destructure a flat sequence, can we apply the same technique to a nested one? Note that we wrap coll in a vector in this example:

(let [[[_ _ ?]] [coll]]
  ?)
Result
\C

Let's pull this apart and ensure we understand what the three [[[ are doing:

(let [ [ [_ _ ?] ] [coll] ]
;;   ^ ^ ^
;;   | | |__Destructure `coll`, taking the 3rd item, which is `\C`
;;   | |
;;   | |__Destructure `[coll]`, taking the 1st item which is `coll`
;;   |
;;   |__The outer [ ... ] for all `let` bindings
  ?)

We can nest destructuring expressions as deeply as needed:

(let [[[[[[[[[[[[[[[[[[[[[[[[[[[[?]]]]]]]]]]]]]]]]]]]]]]]]]]] [[[[[[[[[[[[[[[[[[[[[[[[[[coll]]]]]]]]]]]]]]]]]]]]]]]]]]]
  ?)
Result
\A

We can also use & and :as at any depth:

(let [[[_ & ?rest-1 :as alpha1] [_ _ _ _ & ?rest-2 :as alpha2]] [coll coll]]
  [?rest-1 ?rest-2 (count alpha1) (count alpha2)])
Result
[(\B \C \D \E \F \G \H \I \J \K \L \M \N \O \P \Q \R \S \T \U \V \W \X \Y \Z)
 (\E \F \G \H \I \J \K \L \M \N \O \P \Q \R \S \T \U \V \W \X \Y \Z)
 26
 26]

Be wary that such nested destructuring quickly becomes difficult to read.
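
One way to keep things readable, sketched here with a made-up grid value and made-up function names, is to destructure one level per binding instead of nesting everything in a single expression. Both functions return the same result; which one reads better is a judgment call.

```clojure
;; Hypothetical nested data, for illustration only.
(def grid [[:a :b :c]
           [:d :e :f]])

;; One compound destructuring expression.
(defn compound [g]
  (let [[[_ top-middle] [_ _ bottom-right]] g]
    [top-middle bottom-right]))

;; The same bindings, one level of destructuring at a time.
(defn stepwise [g]
  (let [[top-row bottom-row] g
        [_ top-middle]       top-row
        [_ _ bottom-right]   bottom-row]
    [top-middle bottom-right]))

(compound grid) ;;=> [:b :f]
(stepwise grid) ;;=> [:b :f]
```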

Detour: Lazy Sequences

Let's re-review the macroexpansion of one of our earlier examples:

(macroexpand '(let [[_ _ ?] coll] ?))
Result
(let*
 [vec__16915 coll
  _ (clojure.core/nth vec__16915 0 nil)
  _ (clojure.core/nth vec__16915 1 nil)
  ? (clojure.core/nth vec__16915 2 nil)]
 ?)

Note the multiple bindings of _. Except for select functions that return lazy sequences, Clojure as a language is eagerly evaluated. Binding to the underscore symbol does not have any special behavior that avoids evaluation of the next form; it's just a widespread convention that informs the human reader to ignore the binding. Each binding in the above macroexpanded code will be evaluated, so keep this in mind if you are destructuring lazy sequences that might trigger side effects.

Consider the following example, which destructures a lazy sequence built by mapping a side-effecting function over range:

(let [[_ _ ?] (map #(do (println %) %) (range 1 100))]
  ?)
Result
; (out) 1
; (out) 2
; (out) 3
; (out) 4
; (out) 5
; (out) 6
; (out) 7
; (out) 8
; (out) 9
; (out) 10
; (out) 11
; (out) 12
; (out) 13
; (out) 14
; (out) 15
; (out) 16
; (out) 17
; (out) 18
; (out) 19
; (out) 20
; (out) 21
; (out) 22
; (out) 23
; (out) 24
; (out) 25
; (out) 26
; (out) 27
; (out) 28
; (out) 29
; (out) 30
; (out) 31
; (out) 32
3

If we are retrieving only the third item from the lazy seq built on (range 1 100), why were the first 32 items of our lazy sequence evaluated? Because many of Clojure's lazy sequences, including this one, are chunked for performance reasons, and that chunk size is 32:

// From Clojure's src/jvm/clojure/lang/RT.java
private static final int CHUNK_SIZE = 32;

We can implement completely lazy sequences (each item realized individually) by combining lazy-seq with cons as follows:

(defn fully-lazy-range-with-side-effect [n]
  (lazy-seq
    (cons (do (println n) n)
          (fully-lazy-range-with-side-effect (inc n)))))

(let [[_ _ ?] (fully-lazy-range-with-side-effect 1)]
  ?)
Result
; (out) 1
; (out) 2
; (out) 3
3

As a general principle, it is perilous to combine side effects with lazy collections. Destructuring does not shield you from those concerns. In fact, because it is presented as end-user syntax that obscures the exact evaluation, it is all the more important to remember that every binding in a destructuring expression is evaluated.
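
If you would rather measure than read printed lines, the sketch below counts realizations with an atom. Everything here (realized-count, chunked, one-at-a-time) is a hypothetical helper written for this article's examples, not part of Clojure's API.

```clojure
(defn realized-count
  "Destructures the third item of (make-seq on-realize) and returns
  [third-item number-of-items-realized]."
  [make-seq]
  (let [counter (atom 0)
        [_ _ ?] (make-seq #(swap! counter inc))]
    [? @counter]))

;; Chunked: `map` over `range` realizes a whole chunk at once.
(defn chunked [on-realize]
  (map (fn [n] (on-realize) n) (range 1 100)))

;; Unchunked: one item per `lazy-seq` step.
(defn one-at-a-time [on-realize]
  (letfn [(step [n]
            (lazy-seq
              (cons (do (on-realize) n)
                    (step (inc n)))))]
    (step 1)))

(realized-count chunked)       ;;=> [3 32]
(realized-count one-at-a-time) ;;=> [3 3]
```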

Associative Data Structures

Clojure's associative data structures provide efficient means to insert and retrieve key-value pairs. Destructuring them, therefore, also focuses on specifying keys whose values we want out of the collection.

We'll use the following card for all of the examples that follow:

;; Note the different key types.
(def card {:card/suit :spade
           :card/rank :queen
           :id 42
           'special? false
           "code" "QS"
           [:game :hearts] :tactic/avoid})

Although a Clojure map supports arbitrary key types, destructuring provides special support for keys that are keywords, strings, or symbols.

(let [{:keys [id]} card]
  id)
Result
42
(let [{:strs [code]} card]
  code)
Result
"QS"
(let [{:syms [special?]} card]
  special?)
Result
false

If you have a map with a mixture of different key types like our chaotic card example, you can specify :keys, :syms, and :strs in a single destructuring expression.
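
A minimal sketch of that combination follows. To keep it self-contained it uses a trimmed-down, hypothetical mixed-card map rather than the full card defined above.

```clojure
;; Hypothetical map with keyword, symbol, and string keys.
(def mixed-card {:id 42
                 'special? false
                 "code" "QS"})

(defn mixed-keys
  "Binds one value per key type in a single destructuring expression."
  [m]
  (let [{:keys [id]
         :strs [code]
         :syms [special?]} m]
    [id code special?]))

(mixed-keys mixed-card)
;;=> [42 "QS" false]
```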

In addition to the simple :id entry, our card also has namespace-qualified keyword keys. We have two techniques we can apply to such keywords:

(let [{:keys [card/suit card/rank id]} card]
  [suit rank id])
Result
[:spade :queen 42]
(let [{:keys [id]
       :card/keys [suit rank]} card]
  [suit rank id])
Result
[:spade :queen 42]

The :syms feature supports the same syntax for namespace-qualified symbols.
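
For illustration, here is how both techniques might look with symbol keys. The sym-card map and the two function names are invented for this sketch.

```clojure
;; Hypothetical map keyed by namespace-qualified symbols.
(def sym-card {'card/suit :spade
               'card/rank :queen})

;; Namespace written next to each name.
(defn suit-and-rank-inline [m]
  (let [{:syms [card/suit card/rank]} m]
    [suit rank]))

;; Namespace stated once, on the :syms key itself.
(defn suit-and-rank-shared-ns [m]
  (let [{:card/syms [suit rank]} m]
    [suit rank]))

(suit-and-rank-inline sym-card)    ;;=> [:spade :queen]
(suit-and-rank-shared-ns sym-card) ;;=> [:spade :queen]
```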

Each approach has its own ergonomic considerations. Some people like to keep the namespace and name of keywords side-by-side as in the first example for easier textual search; others value the DRYer approach shown in the second example.

Tangent about a weird corner of :keys destructuring

I had been working with Clojure for over 10 years before discovering that you can use either symbols or keywords with :keys destructuring. Note the :id in this example:

;; Personal annotation: This works, but please don't do this.
(let [{:keys [:id]} card]
  id)
;;=> 42

As far as I am aware, symbols universally represent bindings in every other Clojure context; this one corner of destructuring is the exception. Using keywords for bindings adds unnecessary mental overhead for those reading code, who naturally expect to find symbols in binding contexts. Thankfully, this same loophole does not work with :strs.

In situations where a key name is problematic (e.g., the name is confusing in the context in which it is destructured) or when a key is not a keyword, string, or symbol, then we have access to another form of associative destructuring:

(let [{api-id "code"} card]
  api-id)
Result
"QS"

This formulation can prove confusing at first, but it's closely related to the syntax of sequential destructuring we explored first. For sequential destructuring, we were able to rely on implicit "keys" of the positions 0, 1, 2, etc. in the sequence. For maps, we have to be explicit about the key whose value we want to bind to a symbol, in this case "code".

We can do this with as many keys as we like:

(let [{internal-id :id
       api-id "code"} card]
  [internal-id api-id])
Result
[42 "QS"]

And with keys that aren't keywords, strings, or symbols:

(let [{tactic [:game :hearts]} card]
  tactic)
Result
:tactic/avoid

Associative destructuring supports the same :as syntax that sequential destructuring does:

(let [{:card/keys [suit rank]
       :as card}
      {:card/suit :spade
       :card/rank :queen
       :id 42
       'special? false
       "code" "QS"
       [:game :hearts] :tactic/avoid}]
  [suit rank (count card)])
Result
[:spade :queen 6]

As with sequential destructuring, if we attempt to extract a value at a key that is not present in an associative data structure, our binding will be nil rather than throwing an exception:

;; Oops, I meant rank
(let [{:keys [card/suit card/value id]} card]
  [suit value])
Result
[:spade nil]

Associative destructuring takes this nil support one step further by allowing us to define default values other than nil by supplying an :or entry in our destructuring expression:

(let [{:card/keys [suit rank wild?]
       :or {suit :heart
            wild? false}} card]
  [suit rank wild?])
Result
[:spade :queen false]

The :or can provide defaults to all symbols bound within the destructuring expression, not just those specified with :keys:

(let [{:keys [card/suit card/rank card/wild?]
       another :absent-field
       :or {suit :heart
            wild? false
            another "default"}} card]
  [suit rank wild? another])
Result
[:spade :queen false "default"]

Note that :or is about supplying defaults for bindings. If a symbol named in :or is not also bound within the destructuring expression, its default is ignored and the symbol stays unbound:

(try
  (eval '(let [{:keys [id] :or {suit :heart}} card]
           [id suit]))
  (catch Exception e
    (.getMessage (.getCause e))))
;;=> "Unable to resolve symbol: suit in this context"
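
A related subtlety follows from the fact that defaults are supplied through get's not-found argument: a default applies only when the key is absent, not when the key is present with a nil value. The suit-or-default function below is a hypothetical helper that demonstrates this.

```clojure
(defn suit-or-default
  "Hypothetical helper: binds :suit with a default of :heart."
  [m]
  (let [{:keys [suit]
         :or {suit :heart}} m]
    suit))

(suit-or-default {})            ;;=> :heart (key absent, default used)
(suit-or-default {:suit nil})   ;;=> nil    (key present, default ignored)
(suit-or-default {:suit :club}) ;;=> :club
```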

Nested Associative Destructuring

We can use destructuring to dive as deeply into associative data structures as we did with sequential ones. We'll use the following card+ definition in this section:

(def card+ {:suit {:id :spade
                   :color :black}
            :rank {:id :queen
                   :value {:hearts ##-Inf
                           :blackjack 10}}
            :games [:hearts :blackjack :poker]})

Since the vectors passed to :keys, :strs, and :syms can only contain plain names, not further destructuring forms, we'll have to use the other syntax for associative destructuring if we want to create bindings of nested values:

(let [{{:keys [color]} :suit} card+]
  color)
Result
:black

Here we used {symbol :a-key} for the outer binding and :keys for the inner binding. If we wanted to go deeper:

(let [{{{:keys [blackjack]} :value} :rank} card+]
  blackjack)
Result
10

Let's tease apart the back-to-back {{{ to ensure we understand:

(let [{ { {:keys [blackjack]} :value} :rank} card+]
;;    ^ ^ ^
;;    | | |__Grab the value of the `:blackjack` key, equivalent to `(get-in card+ [:rank :value :blackjack])`
;;    | |
;;    | |__Grab the value of the `:value` key, equivalent to `(get-in card+ [:rank :value])`
;;    |
;;    |__Grab the value of the `:rank` key in `card+`
  blackjack)
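
The get-in equivalence mentioned in those comments can be checked directly. The sketch below uses a smaller, hypothetical nested-card map so that it stands alone, and it also shows that missing intermediate maps produce nil in both styles.

```clojure
;; Hypothetical nested map, for illustration only.
(def nested-card {:rank {:id :queen
                         :value {:blackjack 10}}})

(defn blackjack-value [card]
  (let [{{{:keys [blackjack]} :value} :rank} card]
    blackjack))

(blackjack-value nested-card)                  ;;=> 10
(get-in nested-card [:rank :value :blackjack]) ;;=> 10

;; Like get-in, missing intermediate maps yield nil, not an exception.
(blackjack-value {})                  ;;=> nil
(get-in {} [:rank :value :blackjack]) ;;=> nil
```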

Any combination of destructuring at any level is supported, including a combination of associative and sequential destructuring wherever appropriate data structures are found:

(let [[{id-1 :id
        :keys [foo]
        :or {foo "bar"}
        {suit :id} :suit}

       {id-2 :id
        [_ _ third-game] :games}
       :as cards]
      [(assoc card+ :id 1) (assoc card+ :id 2)]]
  [id-1 id-2 foo suit third-game (count cards)])
Result
[1 2 "bar" :spade :poker 2]

Let's consider the macroexpansion of this non-trivial example to see what destructuring is buying us:

(macroexpand
  '(let [[{id-1 :id
           :keys [foo]
           :or {foo "bar"}
           {suit :id} :suit}

          {id-2 :id
           [_ _ third-game] :games}
          :as cards]
         [(assoc card+ :id 1) (assoc card+ :id 2)]]
     [id-1 id-2 foo suit third-game (count cards)]))
Result
(let*
  [vec__18794 [(assoc card+ :id 1) (assoc card+ :id 2)]
   map__18797 (clojure.core/nth vec__18794 0 nil)
   map__18797 (if (clojure.core/seq? map__18797)
                (clojure.lang.PersistentHashMap/create
                  (clojure.core/seq map__18797))
                map__18797)
   id-1 (clojure.core/get map__18797 :id)
   map__18798 (clojure.core/get map__18797 :suit)
   map__18798 (if (clojure.core/seq? map__18798)
                (clojure.lang.PersistentHashMap/create
                  (clojure.core/seq map__18798))
                map__18798)
   suit (clojure.core/get map__18798 :id)
   foo (clojure.core/get map__18797 :foo "bar")
   map__18799 (clojure.core/nth vec__18794 1 nil)
   map__18799 (if (clojure.core/seq? map__18799)
                (clojure.lang.PersistentHashMap/create
                  (clojure.core/seq map__18799))
                map__18799)
   id-2 (clojure.core/get map__18799 :id)
   vec__18800 (clojure.core/get map__18799 :games)
   _ (clojure.core/nth vec__18800 0 nil)
   _ (clojure.core/nth vec__18800 1 nil)
   third-game (clojure.core/nth vec__18800 2 nil)
   cards vec__18794]
  [id-1 id-2 foo suit third-game (count cards)])

Even casting aside the gensymed names, there is a clearer syntactic relationship between the shape of our data and our bindings to that data when we use destructuring.

Keyword Arguments: When Sequences become Maps

Before wrapping up associative destructuring, let's briefly explore Clojure's support for keyword arguments to functions and how we can leverage that when destructuring.

You can write Clojure functions that take required positional arguments followed by optional "keyword arguments". Inserting a & marks the beginning of variable arguments in a function definition, and if those variable arguments are of an even number, you can treat that sequence as a map when destructuring.

Keyword arguments in upcoming Clojure 1.11

Starting with Clojure version 1.11.0-alpha, callers of functions that accept keyword arguments can provide either an even number of values (as described above) or a single map.

See this post for the initial announcement and examples.

Keyword-argument destructuring is used most often with function definitions, but works in other destructuring contexts too:

(defn f [a b & {:keys [alpha]}]
  alpha)

(=
 (f nil nil :alpha 42)

 (let [[& {:keys [alpha]}] [:alpha 42]]
   alpha)

 42)
Result
true

Whether you prefer keyword arguments or explicit maps of options, it's important to understand that this support is built into the language.
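
As a slightly fuller sketch, keyword arguments combine naturally with :or defaults. The connect function, its option names, and the host string below are all hypothetical. The last call shows one way to pass an existing map of options on Clojure versions that only accept the flattened key-value form.

```clojure
(defn connect
  "Hypothetical function: one required positional argument followed
  by optional keyword arguments with defaults."
  [host & {:keys [port timeout]
           :or {port 80
                timeout 1000}}]
  [host port timeout])

(connect "db.example")             ;;=> ["db.example" 80 1000]
(connect "db.example" :port 8080)  ;;=> ["db.example" 8080 1000]

;; Flatten an options map into key-value arguments with apply.
(apply connect "db.example" (mapcat identity {:timeout 5}))
;;=> ["db.example" 80 5]
```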

Sequential : nth :: Associative : get

When analyzing sequential destructuring above, we reviewed its macroexpansion early in our experiment, in part to explore the various data types that we could use. For associative destructuring, we haven't dwelled on the implementation, instead focusing all of our attention on destructuring Clojure's maps.

If you review the macroexpansion of the associative destructuring examples above, you'll find that associative destructuring is powered by get, both for retrieving values from maps and providing defaults via :or.

As with nth, we can consider what values Clojure's get accepts to reveal what other types of data we can use with associative destructuring. Two caveats apply, however: (1) I personally have only ever encountered maps being used with associative destructuring, and (2) because get accepts nearly every type of data and returns nil by default, you can put almost anything on the right-hand side of an associative destructuring expression.

Consider these examples and determine for yourself whether they provide an intuitive means of destructuring:

;; Strings work with both types of destructuring
(let [[?a _ ?c] "abc"
      {?b 1
       ?d 3} "abc"]
  [?a ?b ?c ?d])
Result
[\a \b \c nil]
;; Sets are backed by maps.
;; They are _not_ sequential.
(let [{:keys [a c]} #{:a :b :c}
      {:syms [b d]
       :or   {d :some-default}} #{'a 'b 'c}]
  [a b c d])
Result
[:a b :c :some-default]
;; String and array access return nil
;; if the index requested is out of range.
(let [{?b 1
       ?d 3} (into-array [\a \b \c])]
  [?b ?d])
Result
[\b nil]
;; `get` returns `nil` if lookup isn't implemented
(let [{:keys [a b]} (Object.)]
  [a b])
Result
[nil nil]

While we can explain why these examples behave as they do, in practice the presence of an associative destructuring expression signals to the Clojure programmer that a map is on the right-hand side. You should use naming or code comments to highlight any departure from this norm.
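
One departure from the norm that can occasionally be handy is picking items out of a vector by index, since get works on vectors with integer keys. This is only a sketch with a hypothetical first-and-third function. Note that it applies to vectors specifically: as the macroexpansions above show, a list or seq on the right-hand side is instead converted to a map of key-value pairs.

```clojure
(defn first-and-third
  "Hypothetical helper: uses associative destructuring with
  integer keys to pick items out of a vector."
  [v]
  (let [{a 0
         c 2} v]
    [a c]))

(first-and-third [:x :y :z]) ;;=> [:x :z]
(first-and-third [:x])       ;;=> [:x nil]
```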

For completeness, the Java implementation that underlies get is available in the collapsed section below.

Underlying Java implementation of get
// From src/jvm/clojure/lang/RT.java

static public Object get(Object coll, Object key){
	if(coll instanceof ILookup)
		return ((ILookup) coll).valAt(key);
	return getFrom(coll, key);
}

static Object getFrom(Object coll, Object key){
	if(coll == null)
		return null;
	else if(coll instanceof Map) {
		Map m = (Map) coll;
		return m.get(key);
	}
	else if(coll instanceof IPersistentSet) {
		IPersistentSet set = (IPersistentSet) coll;
		return set.get(key);
	}
	else if(key instanceof Number && (coll instanceof String || coll.getClass().isArray())) {
		int n = ((Number) key).intValue();
		if(n >= 0 && n < count(coll))
			return nth(coll, n);
		return null;
	}
	else if(coll instanceof ITransientSet) {
		ITransientSet set = (ITransientSet) coll;
		return set.get(key);
	}

	return null;
}

Destructuring Contexts

Now that we have explored the depth of destructuring support for sequential and associative data structures, it is time to consider the various places that destructuring can be used.

So far we have limited our observations to the let macro. Function definition via fn and defn also supports destructuring expressions in the parameter declaration vector:

(defn letters [[_ _ ? :as coll]]
  [? (count coll)])

(letters coll)
Result
[\C 26]
(defn card-details
  [{:card/keys [suit rank] :as card}]
  (str (name rank) " of " (name suit) "s"))

(card-details card)
Result
"queen of spades"

Be aware that tooling that shows function signatures will display your destructuring expressions, so be wary of complex destructuring in function parameter declarations. Alternatively, you can specify separate, simpler :arglists metadata for such functions, with the obvious caveat that that metadata requires manual editing if you change your function's signature.
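
A minimal sketch of the :arglists approach follows. The function name card-summary and the simplified signature ([card]) are choices made for this example; the destructuring in the real parameter vector is unaffected.

```clojure
(defn card-summary
  "Hypothetical function whose advertised signature is simpler
  than its actual parameter vector."
  {:arglists '([card])}
  [{:card/keys [suit rank]}]
  (str (name rank) " of " (name suit) "s"))

;; Tooling that reads :arglists sees the simplified signature.
(:arglists (meta #'card-summary))
;;=> ([card])

(card-summary {:card/suit :spade :card/rank :queen})
;;=> "queen of spades"
```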

Destructuring is so central to the way in which Clojure programmers break collections apart that it is built into many core macros and is even provided as a public core function: clojure.core/destructure. This allows you to expose the same expressive destructuring that core macros have in your own macros that establish bindings.

The destructure function takes a vector of bindings and compiles them such that destructuring expressions are expanded:

(destructure '[[?] coll])
Result
[vec__17432 coll
 ? (clojure.core/nth vec__17432 0 nil)]
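
To show how this might be used in a macro of your own, here is a deliberately tiny sketch. The macro name my-let is hypothetical; it simply passes its binding vector through destructure and hands the result to let*, which is enough to support both sequential and associative destructuring.

```clojure
(defmacro my-let
  "Hypothetical macro: a bare-bones `let` that supports destructuring
  by delegating to clojure.core/destructure."
  [bindings & body]
  `(let* ~(destructure bindings) ~@body))

(my-let [[a b & more] [1 2 3 4]
         {:keys [x]}  {:x 10}]
  [a b more x])
;;=> [1 2 (3 4) 10]
```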

In what other core Clojure contexts can we use destructuring? Both let and loop are defined using the destructure function directly, and any macro that uses either of those to establish bindings provides end-user support for destructuring. Here's an example showing how for supports destructuring of its left-hand bindings, i.e., each individual item from the right-hand collection:

(for [[?1 ?2 ?3] [coll (next coll)]]
  [?1 ?2 ?3])
;;=> ([\A \B \C] [\B \C \D])

In addition to fn, let, loop, and for, experiment with doseq, if-let, and when-let to test your understanding of destructuring in these contexts.
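
As a starting point for that experimentation, here are a few small sketches. All function names are invented for these examples. Pay particular attention to if-let, which tests the whole right-hand value before destructuring it, not the individual bindings.

```clojure
;; loop: the destructuring is re-applied on every `recur`.
(defn sum-all [xs]
  (loop [[x & more] xs
         total 0]
    (if (nil? x)
      total
      (recur more (+ total x)))))

(sum-all [1 2 3]) ;;=> 6

;; if-let: truthiness is checked on the whole value.
(defn first-or-none [xs]
  (if-let [[x] xs]
    [:some x]
    :none))

(first-or-none [nil]) ;;=> [:some nil]
(first-or-none [])    ;;=> [:some nil] (an empty vector is truthy)
(first-or-none nil)   ;;=> :none

;; doseq: each map entry is destructured into a key and a value.
(defn print-entries [m]
  (with-out-str
    (doseq [[k v] m]
      (print k v))))

(print-entries {:a 1}) ;;=> ":a 1"
```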

Conclusions

Destructuring not only provides a more concise way to bind names to parts of sequential and associative data structures than the equivalent code using nth, first, next, get, etc., but also mirrors the shape of the data being destructured. It becomes a visual aid to the person trying to internalize the shape of data in your program.

Because it is a more concise notation, however, it does require the reader of your code to slow down. Whereas you might quickly scan a series of simple first and next calls, destructuring expressions require careful character-by-character reading. The more you practice destructuring, the more literal space you reserve in your source code for handling the essential complexities of your program.

In my own experience, the following areas require additional care, as mentioned throughout this article:

  • If you have a deeply-nested data structure that you need to destructure at multiple levels, choosing to create a few intermediate bindings with simpler destructuring expressions can be easier to read than one complex, compound destructuring expression.
  • Including complex destructuring in the declaration of a function—whose signature will be viewed in smaller auto-complete popups or mini-buffers—can make a function's signature difficult to understand.

If you encounter eye-poppingly complex destructuring expressions in the wild, pass them quoted to clojure.core/destructure to observe exactly how they translate to simple Clojure functions. Destructuring is a testament to the power of Clojure's design, which provides a closed but rich set of data structures whose literal syntax can be employed in novel ways to extend the language cohesively.

Acknowledgements

I want to thank Eric Caspary for his thoughtful review and substantial feedback on a draft of this article. Thanks also go to Michael Zavarella for his destructuring guide on Clojure.org.


Tags: clojure code-observation pattern-matching destructuring

Copyright © 2024 Daniel Gregoire