MarginaliaSearch/code/libraries/btree

BTree

This package contains a small library for creating and reading a static b-tree stored as an implicit, pointer-less data structure. It supports both binary indices (i.e. sets) and key-value mappings whose entry size is an arbitrary multiple of the key size, with the data interlaced with the keys in the leaf nodes. This is a fairly low-level data structure.
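To illustrate what "interlaced" means here, the following is a minimal sketch and not the library's code. It assumes each entry occupies `entrySize` consecutive words with the key in the first word, and that entries are sorted by key. The class `InterlacedLeafSketch` and the method `findEntryOffset` are hypothetical names made up for this example.

```java
// Hypothetical sketch: binary search over key-value entries interlaced in one array.
// Assumes the key is the first word of each entry and entries are sorted by key.
class InterlacedLeafSketch {

    /** Returns the word offset of the entry with the given key, or -1 if absent. */
    static long findEntryOffset(long[] data, int entrySize, long key) {
        int low = 0;
        int high = data.length / entrySize - 1; // index of the last entry

        while (low <= high) {
            int mid = (low + high) >>> 1;
            long midKey = data[mid * entrySize]; // step over the interlaced data words

            if (midKey < key) {
                low = mid + 1;
            } else if (midKey > key) {
                high = mid - 1;
            } else {
                return (long) mid * entrySize;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three entries of size 2: (key, value) pairs stored back to back
        long[] leaf = { 10, 100, 20, 200, 30, 300 };

        int offset = (int) findEntryOffset(leaf, 2, 20);
        System.out.println(offset + " " + leaf[offset + 1]); // prints "2 200"
    }
}
```

With `entrySize` set to 1 the same search degenerates to a plain sorted set of keys, which corresponds to the binary-index case.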

The b-trees are specified through a BTreeContext which contains information about the data and index layout.
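As a rough illustration of why the layout needs to know both the block size and a maximum layer count, the sketch below estimates how many index layers a static tree needs. It rests on assumptions not taken from this library: every block holds `blockSizeWords` keys, and each index key points to exactly one block in the layer below. The names `ImplicitLayoutSketch` and `indexLayers` are hypothetical.

```java
// Hypothetical sketch: how many index layers sit above the data in a static,
// implicit b-tree, assuming one index key per block in the layer below.
class ImplicitLayoutSketch {

    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /** Number of index layers needed above numEntries single-word keys. */
    static int indexLayers(long numEntries, int blockSizeWords) {
        int layers = 0;
        long blocks = ceilDiv(numEntries, blockSizeWords); // blocks at the data level

        // Add a layer on top until a single block covers everything beneath it
        while (blocks > 1) {
            blocks = ceilDiv(blocks, blockSizeWords);
            layers++;
        }
        return layers;
    }

    public static void main(String[] args) {
        System.out.println(indexLayers(400, 512));       // prints 0
        System.out.println(indexLayers(1_000_000, 512)); // prints 2
    }
}
```

Because the tree is static, every block's position follows from this arithmetic, so no child pointers need to be stored.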

The b-trees are written through a BTreeWriter and read with a BTreeReader.

Demo

BTreeContext ctx = new BTreeContext(
        4,  // num layers max
        1,  // entry size, 1 = the leaf node has just the key
        BTreeBlockSize.BS_4096); // page size

// Allocate a memory area to work in, see the array library for how to do this with files
LongArray array = LongArray.allocate(8192);

// Write a btree at offset 123 in the area
long[] items = new long[400];
BTreeWriter writer = new BTreeWriter(array, ctx);
final int offsetInFile = 123;

long btreeSize = writer.write(offsetInFile, items.length, slice -> {
    // here we *must* write items.length * (entry size) words into slice,
    // and the items must already be sorted!

    for (int i = 0; i < items.length; i++) {
        slice.set(i, items[i]);
    }
});

// Read the BTree

BTreeReader reader = new BTreeReader(array, ctx, offsetInFile);
reader.findEntry(items[0]);

Useful Resources

YouTube: Abdul Bari, 10.2 B Trees and B+ Trees. How they are useful in Databases. This isn't exactly the design implemented in this library, but it is very well presented and a good refresher.