MarginaliaSearch/code/libraries/coded-sequence
Viktor Lofgren 5461634616 (doc) Add readme.md for coded-sequence library
This commit introduces a readme.md file to document the functionality and usage of the coded-sequence library. It covers the Elias Gamma code support, how sequences are encoded, and methods the library offers to query sequences, iterate over values, access data, and decode sequences.
2024-06-24 14:28:51 +02:00

The coded-sequence library offers tools for encoding sequences of integers with a variable-length encoding.

The Elias Gamma code is supported: https://en.wikipedia.org/wiki/Elias_gamma_coding
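As a rough illustration of the code itself (not of this library's internals), the sketch below gamma-codes a positive integer into a string of '0' and '1' characters: a value whose binary form has N bits is written as N-1 zeros followed by those N bits. The class name `EliasGammaSketch` and its methods are hypothetical and exist only for this example; a real implementation would pack bits into a buffer rather than build strings.

```java
// Illustrative sketch only: EliasGammaSketch is a hypothetical name,
// not part of the coded-sequence library.
class EliasGammaSketch {

    /** Encodes a positive integer as a string of '0' and '1' characters. */
    public static String encode(int value) {
        if (value < 1) {
            throw new IllegalArgumentException("Elias gamma codes positive integers only");
        }
        String binary = Integer.toBinaryString(value);
        // N-1 zeros announce the bit length, then the N-bit binary form follows
        return "0".repeat(binary.length() - 1) + binary;
    }

    /** Decodes a single gamma-coded value from the start of the bit string. */
    public static int decode(String bits) {
        int zeros = 0;
        while (bits.charAt(zeros) == '0') {
            zeros++;
        }
        // the value occupies zeros + 1 bits, starting at the first '1'
        return Integer.parseInt(bits.substring(zeros, 2 * zeros + 1), 2);
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 2, 5, 10}) {
            System.out.println(v + " -> " + encode(v));
        }
    }
}
```

Small values get short codes (1 is a single bit), which is why the scheme suits sequences whose encoded numbers tend to be small.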

The GammaCodedSequence class stores a sequence of ascending non-negative integers in a byte buffer. The encoding also stores the length of the sequence (as a gamma-coded value), which is used in decoding.
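One plausible way to lay out such a sequence is sketched below; it is an assumption made for illustration and may differ from the byte layout GammaCodedSequence actually uses. The sketch writes the value count first, then the difference between each value and its predecessor, all gamma-coded. Because the gamma code cannot represent zero, the sketch stores count + 1 and starts the running value at -1, and it assumes the values are strictly ascending. `DeltaGammaSketch` and all of its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: the layout (count + 1, then deltas) is an
// assumption and not necessarily the format used by GammaCodedSequence.
class DeltaGammaSketch {

    /** Gamma-codes a positive integer as a string of '0' and '1' characters. */
    public static String gamma(int value) {
        if (value < 1) {
            throw new IllegalArgumentException("value must be positive");
        }
        String binary = Integer.toBinaryString(value);
        return "0".repeat(binary.length() - 1) + binary;
    }

    /** Encodes strictly ascending, non-negative values as one bit string. */
    public static String encode(int... values) {
        StringBuilder bits = new StringBuilder();
        bits.append(gamma(values.length + 1)); // +1 so an empty sequence is codable
        int previous = -1;                     // makes the first delta at least 1
        for (int v : values) {
            if (v <= previous) {
                throw new IllegalArgumentException("values must be strictly ascending and non-negative");
            }
            bits.append(gamma(v - previous));
            previous = v;
        }
        return bits.toString();
    }

    /** Reads one gamma-coded value at pos[0] and advances pos[0] past it. */
    public static int readGamma(String bits, int[] pos) {
        int zeros = 0;
        while (bits.charAt(pos[0] + zeros) == '0') {
            zeros++;
        }
        int start = pos[0] + zeros;
        int end = start + zeros + 1;
        pos[0] = end;
        return Integer.parseInt(bits.substring(start, end), 2);
    }

    /** Decodes a bit string produced by encode() back into its values. */
    public static List<Integer> decode(String bits) {
        int[] pos = {0};
        int count = readGamma(bits, pos) - 1;
        List<Integer> values = new ArrayList<>(count);
        int previous = -1;
        for (int i = 0; i < count; i++) {
            previous += readGamma(bits, pos);
            values.add(previous);
        }
        return values;
    }

    public static void main(String[] args) {
        String bits = encode(1, 3, 4, 7, 10);
        System.out.println(bits.length() + " bits: " + bits);
        System.out.println(decode(bits));
    }
}
```

Storing the count up front lets a decoder know how many deltas to read without relying on the buffer's length, which matters when the final byte is padded.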

Sequences are encoded with the GammaCodedSequence.of() method, which requires a temporary buffer to work in.

// allocate a temporary buffer to work in; it is reused
// for all operations and does not hold the final result
ByteBuffer workArea = ByteBuffer.allocate(1024);

// create a new GammaCodedSequence with the given values
var gcs = GammaCodedSequence.of(workArea, 1, 3, 4, 7, 10);

The GammaCodedSequence class provides methods to query the sequence, iterate over the values, and access the underlying binary representation.

// query the sequence 
int valueCount = gcs.valueCount();
int bufferSize = gcs.bufferSize();

// iterate over the values
IntIterator iter = gcs.iterator();
IntList values = gcs.values();

// access the underlying data (e.g. for writing)
byte[] bytes = gcs.bytes();
ByteBuffer buffer = gcs.buffer();

The GammaCodedSequence class also provides methods to decode a sequence from a byte buffer or byte array.

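// decode the data from a buffer or a byte array
// (start and end are not defined in this snippet; they are assumed
// to mark the bounds of the encoded data within the buffer)
var decodedGcs1 = new GammaCodedSequence(buffer);
var decodedGcs2 = new GammaCodedSequence(buffer, start, end);
var decodedGcs3 = new GammaCodedSequence(bytes);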