Skip to content

SparseArrays

Kal Ahmed edited this page Sep 26, 2016 · 1 revision

Home > Collections > Sparse Arrays

Sparse Arrays

Sparse Arrays are data structures that provide array like behaviour while being potentially far more memory efficient when the arrays are sparsely populated i.e. when most slots in the array have the default value.

Like a normal array they have a fixed length which is defined at the time they are instantiated however they only allocate memory for the array as and when needed and typically do so such that for sparsely populated arrays they only need a fraction of the memory of an equivalent array. This makes them ideal backing storage for algorithms that rely on potentially large but sparsely populated arrays such as Bloom Filters which the library also provides.

Usage

The generic ISparseArray<T> interface provides just two methods, an integer indexed property for getting and settings values and a Length property for getting the length of the array. It also extends IEnumerable<T> so the array can be enumerated like any other .NET array/collection.

// Creating a sparse array
ISparseArray<String> array = new LinkedSparseArray(10000);

// Get the value at a given index
String value = array[3456];

// Set the value at a given index
array[6789] = "Example";

// Get the length of the array
int length = array.Length;

// Enumerating the array
foreach (String entry in array)
{
  // Do something with the value
}

Implementations

The library provides three different of the ISparseArray<T> each with slightly different performance characteristics.

Here we use n to refer to the length of the array i.e. the maximum capacity and m to refer to the number of non-empty slots.

Implementation Backing Storage Memory Usage Insert & Lookup
LinkedSparseArray<T> Linked List O(m) O(m)
BlockSparseArray<T> Array of k blocks of size l (user configurable) where each block is a T[] which is lazily instantiated O(k'l) where k' is number of instantiated blocks O(1)
BinarySparseArray<T> Binary tree O(m) O(log m)

You can see each has quite different performance and memory characteristics so the implementation you use will depend very much on how your application uses the data structure.

Clone this wiki locally