MarginaliaSearch/code/index/index-forward

Forward Index

The forward index contains a mapping from document id to various forms of document metadata.

In practice, the forward index consists of two files, an id file and a data file.

The id file contains a sorted list of document ids. The data file contains the metadata for each document id as fixed-size records, stored in the same order as the id file.

Each record contains a binary encoded DocumentMetadata object, as well as an HtmlFeatures bitmask.
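As a rough illustration of how the two files relate, the sketch below models the id file and the data file as plain in-memory `long` arrays. It is not the project's actual reader: the class name, the two-words-per-record layout, and the offsets are assumptions made for the example, and the real record size and encoding may differ. A lookup is a binary search over the sorted ids, followed by a read at a fixed offset in the data.

```java
import java.util.Arrays;

public class ForwardIndexLookupSketch {
    // ASSUMPTION: each record is two 64-bit words, metadata first, then the
    // feature bitmask. The real layout is not specified in this readme.
    static final int ENTRY_SIZE = 2;
    static final int METADATA_OFFSET = 0;
    static final int FEATURES_OFFSET = 1;

    private final long[] ids;   // stand-in for the id file (sorted document ids)
    private final long[] data;  // stand-in for the data file (fixed-size records)

    public ForwardIndexLookupSketch(long[] sortedIds, long[] data) {
        if (data.length != sortedIds.length * ENTRY_SIZE) {
            throw new IllegalArgumentException("data must hold one record per id");
        }
        this.ids = sortedIds;
        this.data = data;
    }

    /** Position of docId in the id array, or -1 if it is absent. */
    public int indexOf(long docId) {
        int idx = Arrays.binarySearch(ids, docId);
        return idx >= 0 ? idx : -1;
    }

    /** Encoded metadata word for docId, or 0 if the document is unknown. */
    public long getMetadata(long docId) {
        int idx = indexOf(docId);
        return idx < 0 ? 0L : data[idx * ENTRY_SIZE + METADATA_OFFSET];
    }

    /** Feature bitmask for docId, or 0 if the document is unknown. */
    public int getHtmlFeatures(long docId) {
        int idx = indexOf(docId);
        return idx < 0 ? 0 : (int) data[idx * ENTRY_SIZE + FEATURES_OFFSET];
    }

    public static void main(String[] args) {
        long[] ids = {10L, 42L, 99L};
        long[] data = {0xA1L, 0x1L, 0xB2L, 0x4L, 0xC3L, 0x0L};
        var index = new ForwardIndexLookupSketch(ids, data);

        System.out.println(Long.toHexString(index.getMetadata(42L))); // prints "b2"
        System.out.println(index.getHtmlFeatures(42L));               // prints "4"
        System.out.println(index.getMetadata(7L));                    // prints "0"
    }
}
```

Because the records are fixed-size, the position of an id in the id file is all that is needed to locate its record. No separate offset table is required.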

Unlike the reverse index, the forward index is not split into two tiers. The data is in the same order as in the source data, and the full set of document IDs is assumed to fit in memory, so the index is relatively easy to construct.
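One way to read the construction described above is as a two-pass process: first collect and sort all document ids in memory, then walk the source data in its original order and write each record into the slot that matches its id. The sketch below shows that idea under stated assumptions. `SourceDoc`, `Built`, and the two-word record layout are hypothetical names and choices made for illustration, not the project's actual converter, and document ids are assumed to be unique.

```java
import java.util.Arrays;

public class ForwardIndexBuildSketch {
    // ASSUMPTION: two 64-bit words per record (metadata, then feature bitmask).
    static final int ENTRY_SIZE = 2;

    /** Hypothetical row of source data; not a class from the project. */
    public record SourceDoc(long docId, long metadata, int htmlFeatures) {}

    /** Hypothetical result holder standing in for the id file and data file. */
    public record Built(long[] ids, long[] data) {}

    public static Built build(SourceDoc[] source) {
        // Pass 1: gather every document id in memory and sort them.
        long[] ids = Arrays.stream(source)
                .mapToLong(SourceDoc::docId)
                .sorted()
                .toArray();

        long[] data = new long[ids.length * ENTRY_SIZE];

        // Pass 2: read the source in its own order and place each record
        // at the slot given by its id's position in the sorted id array.
        for (SourceDoc doc : source) {
            int slot = Arrays.binarySearch(ids, doc.docId());
            data[slot * ENTRY_SIZE] = doc.metadata();
            data[slot * ENTRY_SIZE + 1] = doc.htmlFeatures();
        }
        return new Built(ids, data);
    }

    public static void main(String[] args) {
        SourceDoc[] source = {
                new SourceDoc(42L, 0xB2L, 4),
                new SourceDoc(10L, 0xA1L, 1),
        };
        Built built = build(source);
        System.out.println(Arrays.toString(built.ids()));  // prints "[10, 42]"
        System.out.println(Arrays.toString(built.data())); // prints "[161, 1, 178, 4]"
    }
}
```

Since the whole id array fits in memory, the second pass needs only one sequential read of the source and no external sorting of the records themselves.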

Central Classes