MarginaliaSearch/libraries
2023-03-06 18:32:13 +01:00

Libraries

These are libraries that are not strongly coupled to the search engine.

  • The array library is for memory-mapping large memory areas, which Java supports poorly. It's designed to be easily replaced once Java's Foreign Function and Memory API is released.
  • The btree library offers a static BTree implementation based on the array library.
  • language-processing contains primitives for sentence extraction and POS-tagging.
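To illustrate the problem the array library addresses: a single `MappedByteBuffer` is indexed by `int`, so one mapping cannot exceed `Integer.MAX_VALUE` bytes. A common workaround is to split the file into several mappings and address them with a `long` index. The following is a minimal sketch of that idea only; the class name `SegmentedLongArray`, the segment size, and the method names are assumptions made for this example and are not taken from the array library.

```java
import java.io.IOException;
import java.nio.LongBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: a long-indexed array of longs backed by several mapped segments. */
public class SegmentedLongArray {
    // Each segment holds 2^20 longs (8 MiB). The size is an arbitrary choice for illustration.
    private static final int SEGMENT_BITS = 20;
    private static final int SEGMENT_SIZE = 1 << SEGMENT_BITS;
    private static final int SEGMENT_MASK = SEGMENT_SIZE - 1;

    private final LongBuffer[] segments;
    private final long size;

    public SegmentedLongArray(Path file, long size) throws IOException {
        this.size = size;
        int segmentCount = (int) ((size + SEGMENT_SIZE - 1) >>> SEGMENT_BITS);
        segments = new LongBuffer[segmentCount];

        // The mappings stay valid after the channel is closed.
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            for (int i = 0; i < segmentCount; i++) {
                long start = (long) i * SEGMENT_SIZE;                // first element of this segment
                long length = Math.min(SEGMENT_SIZE, size - start);  // last segment may be shorter
                segments[i] = channel
                        .map(FileChannel.MapMode.READ_WRITE, start * Long.BYTES, length * Long.BYTES)
                        .asLongBuffer();
            }
        }
    }

    public long get(long index) {
        // High bits select the segment, low bits select the slot within it.
        return segments[(int) (index >>> SEGMENT_BITS)].get((int) (index & SEGMENT_MASK));
    }

    public void set(long index, long value) {
        segments[(int) (index >>> SEGMENT_BITS)].put((int) (index & SEGMENT_MASK), value);
    }

    public long size() {
        return size;
    }
}
```

Hiding the segmentation behind `get(long)` and `set(long, long)` keeps callers unaware of how the memory is mapped, which is the kind of boundary that would let the backing implementation be swapped out later.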

Micro libraries

  • easy-lsh is a simple locality-sensitive hash for document deduplication.
  • guarded-regex makes predicated regular expressions clearer.
  • big-string offers seamless string compression.
  • random-write-funnel is a tool for reducing write amplification when constructing large files out of order.
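As an illustration of what a predicated regular expression can look like, the sketch below pairs a cheap string check (the guard) with a compiled pattern, and only runs the regex when the guard passes. This is an assumption about the general technique, not the guarded-regex API: the class `GuardedPattern` and its factory methods `startsWith` and `contains` are hypothetical names chosen for this example.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Hypothetical sketch: a regex that is only evaluated when a cheap guard predicate passes. */
public class GuardedPattern implements Predicate<String> {
    private final Predicate<String> guard;
    private final Pattern pattern;

    private GuardedPattern(Predicate<String> guard, String regex) {
        this.guard = guard;
        this.pattern = Pattern.compile(regex);
    }

    /** Run the regex only on inputs that start with the given prefix. */
    public static GuardedPattern startsWith(String prefix, String regex) {
        return new GuardedPattern(s -> s.startsWith(prefix), regex);
    }

    /** Run the regex only on inputs that contain the given substring. */
    public static GuardedPattern contains(String substring, String regex) {
        return new GuardedPattern(s -> s.contains(substring), regex);
    }

    @Override
    public boolean test(String input) {
        // Short-circuit: the regex engine is skipped entirely when the guard fails.
        return guard.test(input) && pattern.matcher(input).matches();
    }
}
```

With this shape, a call such as `GuardedPattern.startsWith("http", "https?://[a-z.]+/.*")` states the precondition next to the expression, instead of leaving it as a separate `if` at each call site.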

The rest

  • misc is random bits and bobs that didn't fit anywhere else.