MarginaliaSearch/code/processes/index-constructor-process

The index construction process is responsible for creating the indexes used by the search engine.

There are three types of indexes:

  • The forward index, which maps documents to words.
  • The full reverse index, which maps words to documents and includes all words.
  • The priority reverse index, which maps words to documents but includes only the most "important" words (such as those appearing in the title, or with especially high TF-IDF scores).
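
To make the relationship between the three indexes concrete, here is a minimal in-memory sketch. It is not how the actual modules store or build their indexes; the class `IndexSketch`, the method `invert`, and the sample documents are all hypothetical, and "important" words are simply supplied by hand rather than derived from titles or TF-IDF scores.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not part of the Marginalia code base.
public class IndexSketch {

    // Inverts a document -> words mapping into a word -> documents mapping.
    static Map<String, SortedSet<Integer>> invert(Map<Integer, List<String>> docToWords) {
        Map<String, SortedSet<Integer>> wordToDocs = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> entry : docToWords.entrySet()) {
            for (String word : entry.getValue()) {
                wordToDocs.computeIfAbsent(word, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return wordToDocs;
    }

    public static void main(String[] args) {
        // Forward index: document id -> every word in the document.
        Map<Integer, List<String>> forward = Map.of(
                1, List.of("marginalia", "search", "engine"),
                2, List.of("search", "index"));

        // Assumed subset of "important" words per document (e.g. title words).
        Map<Integer, List<String>> important = Map.of(
                1, List.of("marginalia"),
                2, List.of("index"));

        // Full reverse index covers all words; priority reverse index only the subset.
        Map<String, SortedSet<Integer>> fullReverse = invert(forward);
        Map<String, SortedSet<Integer>> priorityReverse = invert(important);

        System.out.println(fullReverse.get("search"));             // prints [1, 2]
        System.out.println(priorityReverse.containsKey("search")); // prints false
    }
}
```

In this sketch the priority reverse index is just the full reverse index built from a smaller input, which is why a common word such as "search" is found in the full index but absent from the priority one.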

This is a very lightweight module that delegates the actual work to the modules:

Their respective readme files contain more information about the indexes themselves and how they are constructed.

The process is glued together within IndexConstructorMain, which is the only class of interest in this module.