MarginaliaSearch/code/process-models/processed-data
Viktor Lofgren 0894822b68 (converter) Add position information to serialized document data
This is not hooked in yet, and the term metadata is still left intact.  It should probably shrink to a smaller representation (byte?) with the upcoming removal of the position mask.
2024-05-28 14:18:03 +02:00
java/nu/marginalia (converter) Add position information to serialized document data 2024-05-28 14:18:03 +02:00
test/nu/marginalia/io/processed (converter) Add position information to serialized document data 2024-05-28 14:18:03 +02:00
build.gradle (converter) Add position information to serialized document data 2024-05-28 14:18:03 +02:00
readme.md (docs) Begin un-fucking the docs after refactoring 2024-02-27 21:22:21 +01:00

The processed-data package contains models and logic for reading and writing Parquet files that hold the output of the converting-process.

Main models:

Since Parquet is a column-based format, some of the readable models are projections that read only a subset of the columns in the input file.
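
To illustrate the idea of a projection, here is a minimal sketch in plain Java. It does not use parquet-floor or any of this package's classes: `ToyColumnStore`, `KeywordsProjection`, `sampleStore` and `readKeywords` are hypothetical names made up for this example, and the in-memory map only stands in for a column-oriented file. The point is that a projection asks for just the columns it needs, so the remaining columns are never touched.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ProjectionSketch {

    /** Hypothetical projection: only the url and keywords columns of a document row. */
    public record KeywordsProjection(String url, List<String> keywords) {}

    /** Toy stand-in for a column-oriented file; each column is stored as its own list. */
    public static class ToyColumnStore {
        private final Map<String, List<Object>> columns;

        /** Records which columns were actually requested by readers. */
        public final Set<String> columnsRead = new LinkedHashSet<>();

        public ToyColumnStore(Map<String, List<Object>> columns) {
            this.columns = columns;
        }

        public List<Object> column(String name) {
            columnsRead.add(name);
            return columns.get(name);
        }
    }

    /** Builds a small store with three columns and two rows (made-up data). */
    public static ToyColumnStore sampleStore() {
        return new ToyColumnStore(Map.of(
                "url", List.<Object>of("https://example.com/a", "https://example.com/b"),
                "title", List.<Object>of("Page A", "Page B"),
                "keywords", List.<Object>of(List.of("foo", "bar"), List.of("baz"))));
    }

    /** Reads the projection: only the "url" and "keywords" columns are requested. */
    public static List<KeywordsProjection> readKeywords(ToyColumnStore store) {
        List<Object> urls = store.column("url");
        List<Object> keywords = store.column("keywords");

        List<KeywordsProjection> rows = new ArrayList<>();
        for (int i = 0; i < urls.size(); i++) {
            @SuppressWarnings("unchecked")
            List<String> rowKeywords = (List<String>) keywords.get(i);
            rows.add(new KeywordsProjection((String) urls.get(i), rowKeywords));
        }
        return rows;
    }

    public static void main(String[] args) {
        ToyColumnStore store = sampleStore();
        readKeywords(store).forEach(System.out::println);
        // The "title" column was never requested by the projection.
        System.out.println("columns read: " + store.columnsRead);
    }
}
```

In a real Parquet reader the saving comes from skipping the on-disk data of the columns that are not requested. The toy store above only records which columns were asked for.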

See Also

third-party/parquet-floor