Purely functional list concatenation, xs ++ ys
in Haskell syntax, is well known to be linear time
in the length of the first input and constant time in the length of the second, i.e. xs ++ ys
is
O(length xs). This leads to quadratic complexity if we have a bunch of left associated uses of
concatenation.
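For instance, a left fold over (++), as in the hypothetical sketch below (the name leftAssoc is mine), re-traverses the entire accumulated prefix at every step, giving quadratic time overall.

leftAssoc :: [[a]] -> [a]
leftAssoc = foldl (++) []
-- leftAssoc [xs1, xs2, xs3] = (([] ++ xs1) ++ xs2) ++ xs3
-- Each (++) walks the whole prefix built so far, so the total work is
-- quadratic in the length of the result.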
The ancient trick to resolve this is to, instead of producing lists, produce list-to-list functions à la [a] -> [a] or ShowS = String -> String = [Char] -> [Char]. "Concatenation" of "lists" represented this way is just function composition, which is a constant-time operation. We can lift a list xs to this representation via the section (xs ++). This will still lead to O(length xs) amount of work to apply this function, but a composition of such functions applied to a list will always result in a fully right-associated expression even if the function compositions aren't right associated.
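As a small sketch of the technique, using the Prelude's ShowS (the helper name fromString and the example are mine, not standard):

fromString :: String -> ShowS
fromString xs = (xs ++)

-- "Concatenation" is (.), which is constant time. No matter how the
-- compositions below are associated, applying the result to a final tail
-- yields a fully right-associated chain of (++).
example :: String
example = (((fromString "a" . fromString "b") . fromString "c") . fromString "d") ""
-- = "a" ++ ("b" ++ ("c" ++ ("d" ++ "")))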
In the last several years, it has become popular to refer to this technique as “difference lists”. Often no justification is given for this name. When it is given, it is usually a reference to the idea of difference lists in logic programming. Unfortunately, other than both techniques giving rise to efficient concatenation, they have almost no similarities.
To start, I want to do a deeper analysis of the “functional lists” approach, because I think what it is doing is a bit misunderstood and, consequently, oversold1. Let’s see how we would model this approach in an OO language without higher-order functions, such as early Java. I’ll use strings for simplicity, but it would be exactly the same for generic lists.
interface PrependTo {
  String prependTo(String end);
}
class Compose implements PrependTo {
  private PrependTo left;
  private PrependTo right;
  public Compose(PrependTo left, PrependTo right) {
    this.left = left; this.right = right;
  }
  public String prependTo(String end) {
    return this.left.prependTo(this.right.prependTo(end));
  }
}
class Prepend implements PrependTo {
  private String s;
  public Prepend(String s) { this.s = s; }
  public String prependTo(String end) {
    return this.s + end;
  }
}
This is just a straight, manual implementation of closures for (.) and (++) (specialized to strings). Other lambdas not of the above two forms would lead to other implementations of PrependTo. Let's say, however, that these are the only two forms that actually occur, which is mostly true in Haskell practice. Then another view on this OO code (to escape back to FP) is that it is an OOP encoding of the algebraic data type:
data PrependTo = Compose PrependTo PrependTo | Prepend String
prependTo :: PrependTo -> String -> String
prependTo (Compose left right) end = prependTo left (prependTo right end)
prependTo (Prepend s) end = s ++ end
We could have also arrived at this by defunctionalizing a typical example of the technique. Modulo
some very minor details (that could be resolved by using the Church-encoded version of this), this
does accurately reflect what’s going on in the technique. Compose
is clearly constant time. Less
obviously, applying these functional lists requires traversing this tree of closures – made
into an explicit tree here. In fact, this reveals that this representation could require arbitrarily
large amounts of work for a given size of output. This is due to the fact that prepending an empty
string doesn’t increase the output size but still increases the size of the tree. In practice,
it’s a safe assumption that, on average, at least one character will be prepended per leaf of the
tree which makes the overhead proportional to the size of the output.
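To make the pathological case concrete, here is a hypothetical worst case: a large tree, every leaf of which prepends the empty string.

emptyHeavy :: PrependTo
emptyHeavy = foldr Compose (Prepend "") (replicate 1000000 (Prepend ""))
-- prependTo emptyHeavy "done" is just "done", but computing it traverses
-- a tree of about a million nodes: lots of work for zero characters of output.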
This tree representation is arguably better than the “functional list” representation. It’s less
flexible for producers, but that’s arguably a good thing because we didn’t really want arbitrary
String -> String
functions. It’s more flexible for consumers. For example, getting the head of
the list is a relatively efficient operation compared to applying a “functional list” and taking
the head of the result even in an eager language. (Laziness makes both approaches comparably
efficient.) Getting the last element is just the same for the tree version, but, even with laziness,
is much worse for the functional version. More to the point, this concrete representation allows
the concatenation function to avoid adding empty nodes to the tree whereas (.)
can’t pattern
match on whether a function is the identity function or not.
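For instance, here is a sketch of both points using the PrependTo type above (the helper names are mine): a smart concatenation that drops empty nodes, and a head function that doesn't materialize the whole string first.

-- Concatenation that skips empty nodes; (.) can't do this because it
-- can't inspect whether its arguments are the identity function.
compose :: PrependTo -> PrependTo -> PrependTo
compose (Prepend "") right = right
compose left (Prepend "") = left
compose left right = Compose left right

-- The head of the represented list, found by walking the tree rather than
-- by building the full String and taking its head.
headPT :: PrependTo -> Maybe Char
headPT (Prepend (c:_)) = Just c
headPT (Prepend "") = Nothing
headPT (Compose left right) = case headPT left of
  Just c -> Just c
  Nothing -> headPT right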
This view makes it very clear what the functional version is doing.
List append is the archetypal example of a Prolog program due to the novelty of its “invertibility”.
append([], Ys, Ys).
append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).
For our purposes, viewing this as a function of the first two arguments, this is exactly the usual functional implementation of list concatenation with exactly the same problems. We could, of course, encode the defunctionalized version of the functional approach into (pure) Prolog. This would produce:
prepend_to(compose(Xs, Ys), End, Zs) :- prepend_to(Ys, End, End2), prepend_to(Xs, End2, Zs).
prepend_to(prepend(Xs), End, Zs) :- append(Xs, End, Zs).
(I’ll be ignoring the issues that arise due to Prolog’s untyped nature.)
However, this being a logic programming language means we have additional tools available to use that functional languages lack. Namely, unification variables. For an imperative (destructive) implementation of list concatenation, the way we’d support efficient append of linked lists is we’d keep pointers to the start and end of the list. To append two lists, we’d simply use the end pointer of the first to update the end of the first list to point at the start of the second. We’d then return a pair consisting of the start pointer of the first and the end pointer of the second.
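A minimal sketch of that imperative idea, written in Haskell with IORefs to stay in this post's language (the names and the exact representation are mine):

import Data.IORef

-- A mutable cons cell; the tail is a mutable pointer.
data Cell a = Nil | Cons a (IORef (Cell a))

-- A list is a pair of pointers: to the first cell, and to the tail field
-- of the last cell, i.e. the hole where more cells can be attached.
data MList a = MList { front :: IORef (Cell a), back :: IORef (Cell a) }

singletonM :: a -> IO (MList a)
singletonM x = do
  end <- newIORef Nil
  start <- newIORef (Cons x end)
  pure (MList start end)

-- Destructive append: make the end of the first list point at the start of
-- the second, and return the first's front paired with the second's back.
appendM :: MList a -> MList a -> IO (MList a)
appendM xs ys = do
  startOfYs <- readIORef (front ys)
  writeIORef (back xs) startOfYs
  pure (MList (front xs) (back ys))

toListM :: MList a -> IO [a]
toListM xs = go =<< readIORef (front xs)
  where
    go Nil = pure []
    go (Cons x rest) = (x :) <$> (go =<< readIORef rest)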
This is exactly how Prolog difference lists work, except instead of pointers, we use unification
variables which are more principled. Concretely, we represent a list as a pair of lists, but the
second list will be represented by an unbound unification variable and the first list contains
that same unification variable as a suffix. This pair is often represented using the infix operator ("functor" in Prolog terminology) -, e.g. Xs - Ys. We could use diff(Xs, Ys) or some other name; - isn't a built-in operator, it's just a binary constructor essentially.
At the level of logic, there are no unification variables. The constraints above mean that Xs - Ys
is a list Xs
which contains Ys
as a suffix.
The name “difference list” is arguably motivated by the definition of concatenation in this representation.
concat(Xs - Ys, Ys - Zs, Xs - Zs).
This looks a lot like |Xs - Ys + Ys - Zs = Xs - Zs|. If the suffix component of the first argument is unbound, like it's supposed to be, then this is a constant-time operation of binding that component to Ys. If it is bound, then we need to unify, which, in the worst case, is O(length Ys) where the length is up to either nil or an unbound variable tail2.
We also have the unit of concat, i.e. the empty list, via3:
empty(Xs - Xs).
See the footnote, but this does in some way identify Xs - Ys with the "difference" of Xs and Ys.
We get back to a “normal” list via:
to_list(Xs - [], Xs).
% or more generally,
prepend_to(Xs - Ys, Ys, Xs).
to_list is a constant-time operation, no matter what. Note that to_list binds the suffix component of the difference list. This means that the first input no longer meets our condition to be a difference list. In other words, to_list (and prepend_to) consumes the difference list. More precisely, it constrains the possible suffixes the list could have.
Indeed, any operation that binds the suffix component of a difference list consumes it. For example,
concat
consumes its first argument.
Of course, it still makes logical sense to work with the difference list when its suffix component
is bound, it’s just that its operational interpretation is different. More to the point, given a
difference list, you cannot prepend it (via prepend_to
or concat
) to two different lists to get
two different results.
Converting from a list does require traversing the list since we need to replace the nil node, i.e.
[]
, with a fresh unbound variable. Luckily, this is exactly what append
does.
from_list(Xs, Ys - Zs) :- append(Xs, Zs, Ys).
from_list also suggests this "difference list" idea. If all of Xs, Ys, and Zs are ground terms, then from_list(Xs, Ys - Zs) holds when append(Xs, Zs, Ys) holds, and exactly when, provided our invariants are maintained, i.e. that Zs is a suffix of Ys. Writing these relations more functionally and writing append as addition, we'd have:
\[\mathtt{from\_list}(Xs) = Ys - Zs \iff Xs + Zs = Ys\]
If we did want to “duplicate” a difference list, we’d essentially need to convert it to a (normal)
list with to_list
, and then we could use from_list
multiple times on that result. This would,
of course, still consume the original difference list. We’d also be paying O(length Xs) for every
duplicate, including to replace the one we just consumed4.
That said, we can prepend a list to a difference list without consuming it. We can perform other actions with the risk of (partially) consuming the list, e.g. indexing into the list. Indexing into the list would force the list to be at least a certain length, but still allow prepending to any list that will result in a final list at least that long.
I’ll start the comparison with a massive discrepancy that we will ignore going forward. Nothing
enforces that a value of type ShowS
actually just appends something to its input. We could use
abstract data type techniques or the defunctionalized version to avoid this. To be fair, difference
lists also need an abstraction barrier to ensure their invariants, though their failure modes are
different. A difference list can’t change what it is based on what it is prepended to.
| Functional Representation | Difference Lists |
| --- | --- |
| constant-time concatenation | constant-time concatenation |
| constant-time conversion from a list (though you pay for it later) | O(n) conversion from a list |
| persistent | non-persistent, requires linear use |
| represented by a tree of closures | represented by a pair of a list and a unification variable |
| O(n) (or worse!) conversion to a list | constant-time conversion to a list |
| defunctionalized version can be implemented in pretty much any language | requires at least single-assignment variables |
| unclear connection to being the difference of two lists (which two lists?) | mathematical, if non-obvious, connection to being the difference of two (given) lists |
As an illustration of the difference between persistent and non-persistent uses, the function:
double f = f . f
is a perfectly sensible function on ShowS
values that behaves exactly as you’d expect. On the
other hand:
double(In, Out) :- concat(In, In, Out).
is nonsense that will fail the occurs check (if it is enabled, otherwise it will create a cyclic
list) except for when In
is the empty difference list.
I hope I’ve illustrated that the functional representation is not just not difference lists, but is, in fact, wildly different from difference lists.
This functional representation is enshrined into Haskell via the ShowS
type and related functions,
but I’d argue the concrete tree representation is actually clearer and better. The functional
representation is more of a cute trick that allows us to reuse existing functions. Really, ShowS
should have been an abstract type.
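A sketch of what that might look like (the module and function names are hypothetical), hiding the representation behind an abstraction barrier so that only prepend-like functions can be constructed:

module DString (DString, fromString, toString, append) where

-- The constructor is not exported, so clients can't wrap arbitrary
-- String -> String functions, only ones assembled from these operations.
newtype DString = DString (String -> String)

fromString :: String -> DString
fromString s = DString (s ++)

append :: DString -> DString -> DString
append (DString f) (DString g) = DString (f . g)

toString :: DString -> String
toString (DString f) = f ""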
Difference lists are an interesting example of how imperative ideas can be incorporated into a declarative language. That said, difference lists come with some of the downsides of an imperative approach, namely the lack of persistence.
As far as I’m aware, there isn’t an unambiguous and widely accepted name for this functional representation. Calling it “functional lists” or something like that is, in my opinion, very ambiguous and potentially misleading. I think the lack of a good name for this is why “difference lists” started becoming popular. As I’ve argued, using “difference list” in this context is even more misleading and confusing.
If people really want a name, one option might be “delta list”. I don’t think this term is used. It keeps the intuitive idea that the functional representation represents some “change” to a list, a collection of deltas that will all be applied at once, but it doesn’t make any false reference to difference lists. I’m not super into this name; I just want something that isn’t “difference list” or otherwise misleading.
To be clear, it's still much, much better than using plain concatenation.↩︎
Such a length relation couldn’t be written in pure Prolog but can in actual Prolog.↩︎
For those algebraically minded, this almost makes concat
and empty
into another
monoid except concat
is partial, but such a partial monoid is just a category! In other words,
we have a category whose objects are lists and whose homsets are, at most, singletons containing
Xs - Ys
for Hom(Xs, Ys). If we maintain our invariant that we have Xs - Ys
only when Ys
is a
suffix of Xs
, this thin category is exactly the category corresponding to the reflexive,
transitive “has suffix” relation. We could generalize this to any monoid via a “factors through”
relation, i.e. |\mathrm{Hom}(m, n)| is inhabited if and only if |\exists p. m = pn| which you can
easily prove is a reflexive, transitive relation given the monoid axioms. However, for a general
monoid, we can have a (potentially) non-thin category by saying |p \in \mathrm{Hom}(m,n)| if and
only if |m = pn|. The category will be thin if and only if the monoid is cancellative. This is
exactly the slice category of the monoid viewed as a one-object category.↩︎
Again, in actual Prolog, we could make a duplicate without consuming the original, though it would still take O(length Xs) time using the notion of length mentioned before.↩︎