
The coded-sequence library offers tools for encoding sequences of integers with a variable-length encoding.

The Elias Gamma code is supported: https://en.wikipedia.org/wiki/Elias_gamma_coding
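To make the code concrete: the Elias gamma codeword for a positive integer x is floor(log2 x) zero bits followed by the binary digits of x. The sketch below illustrates this on strings of '0' and '1' characters for readability. The class EliasGammaSketch and its methods are hypothetical names made up for this example; they are not part of the coded-sequence library, which works on byte buffers rather than strings.

```java
// Illustrative sketch only; not the coded-sequence implementation.
final class EliasGammaSketch {

    /** Encodes a positive integer as a string of '0' and '1' characters. */
    static String encode(int value) {
        if (value < 1) {
            throw new IllegalArgumentException("Elias gamma encodes integers >= 1");
        }
        // The binary form of a positive integer always starts with '1'
        String binary = Integer.toBinaryString(value);
        // Prefix with (number of binary digits - 1) zeros
        return "0".repeat(binary.length() - 1) + binary;
    }

    /** Decodes a single codeword that starts at index 0 of the bit string. */
    static int decode(String bits) {
        int zeros = 0;
        while (bits.charAt(zeros) == '0') {
            zeros++;
        }
        // The next (zeros + 1) bits hold the value in binary
        return Integer.parseInt(bits.substring(zeros, 2 * zeros + 1), 2);
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 2, 5, 10}) {
            String code = encode(v);
            System.out.println(v + " -> " + code + " -> " + decode(code));
        }
    }
}
```

A codeword for x is 2 * floor(log2 x) + 1 bits long, so small values are cheap: 1 takes a single bit, while 10 takes seven.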

The GammaCodedSequence class stores a sequence of ascending non-negative integers in a byte buffer. The encoding also stores the length of the sequence (as a gamma-coded value), which is used in decoding.
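One common way to combine a length prefix with a gamma code for an ascending sequence is to write the count first and then the gaps between consecutive values. The sketch below shows that idea on bit strings. The exact layout is an assumption made for illustration (count, then gaps, each stored as gamma(x + 1) so that zero can be represented); this readme does not specify the bit layout GammaCodedSequence uses, and GapCodingSketch is a hypothetical class, not part of the library.

```java
import java.util.Arrays;

// Illustrative sketch only. The layout used here is an assumption and may
// differ from what GammaCodedSequence actually writes.
final class GapCodingSketch {

    /** Elias gamma codeword for a positive integer, as a bit string. */
    static String gamma(int value) {
        String binary = Integer.toBinaryString(value);
        return "0".repeat(binary.length() - 1) + binary;
    }

    /** Encodes the count, then each gap to the previous value. */
    static String encode(int... ascending) {
        StringBuilder bits = new StringBuilder(gamma(ascending.length + 1));
        int previous = 0;
        for (int v : ascending) {
            // Gaps are >= 0 for ascending input; +1 keeps the argument >= 1
            bits.append(gamma(v - previous + 1));
            previous = v;
        }
        return bits.toString();
    }

    /** Reads the count, then rebuilds the values by summing the gaps. */
    static int[] decode(String bits) {
        int[] pos = {0}; // current read position, shared with readGamma
        int count = readGamma(bits, pos) - 1;
        int[] values = new int[count];
        int previous = 0;
        for (int i = 0; i < count; i++) {
            previous += readGamma(bits, pos) - 1;
            values[i] = previous;
        }
        return values;
    }

    /** Reads one gamma codeword at pos[0] and advances pos[0] past it. */
    private static int readGamma(String bits, int[] pos) {
        int zeros = 0;
        while (bits.charAt(pos[0] + zeros) == '0') {
            zeros++;
        }
        int start = pos[0] + zeros;
        int end = start + zeros + 1;
        pos[0] = end;
        return Integer.parseInt(bits.substring(start, end), 2);
    }

    public static void main(String[] args) {
        String bits = encode(1, 3, 4, 7, 10);
        System.out.println(bits.length() + " bits: " + bits);
        System.out.println(Arrays.toString(decode(bits)));
    }
}
```

Under this assumed layout the five values 1, 3, 4, 7, 10 fit in 24 bits, compared with 160 bits as plain 32-bit integers. Storing the count up front is what lets a decoder know when to stop reading.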

Sequences are encoded with the GammaCodedSequence.of() method, which requires a temporary buffer to work in.

```java
// Allocate a temporary buffer to work in. It is reused for all
// operations and will not hold the final result.
ByteBuffer workArea = ByteBuffer.allocate(1024);

// Create a new GammaCodedSequence with the given values
var gcs = GammaCodedSequence.of(workArea, 1, 3, 4, 7, 10);
```

The GammaCodedSequence class provides methods to query the sequence, iterate over the values, and access the underlying binary representation.

```java
// Query the sequence
int valueCount = gcs.valueCount();
int bufferSize = gcs.bufferSize();

// Iterate over the values
IntIterator iter = gcs.iterator();
IntList values = gcs.values();

// Access the underlying data (e.g. for writing)
byte[] bytes = gcs.bytes();
ByteBuffer buffer = gcs.buffer();
```

The GammaCodedSequence class also provides methods to decode a sequence from a byte buffer or byte array.

```java
// Decode the data
var decodedGcs1 = new GammaCodedSequence(buffer);
var decodedGcs2 = new GammaCodedSequence(buffer, start, end);
var decodedGcs3 = new GammaCodedSequence(bytes);
```