mirror of
https://github.com/MarginaliaSearch/MarginaliaSearch.git
synced 2025-02-24 05:18:58 +00:00
Contains converter-like extraction jobs that operate on crawled data to produce export files.
Important classes
- AtagExporter - extracts anchor texts from the crawled data.
- FeedExporter - tries to find RSS/Atom feeds within the crawled data.
- TermFrequencyExporter - exports term frequencies, the 'TF' part of TF-IDF.
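As a rough illustration of what a term-frequency export involves, the sketch below counts how often each term occurs across a handful of documents. It is not the actual TermFrequencyExporter code: the class name `TermFrequencySketch`, the method `countTerms`, and the naive lowercase-and-split tokenization are all assumptions made for this example, and the real exporter may tokenize, normalize, and count differently.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Marginalia codebase.
final class TermFrequencySketch {

    /** Counts the total number of occurrences of each term across all documents. */
    static Map<String, Integer> countTerms(List<String> documents) {
        Map<String, Integer> counts = new HashMap<>();
        for (String document : documents) {
            // Naive tokenization (an assumption): lowercase, then split on
            // any run of characters that are neither letters nor digits.
            String[] tokens = document.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
            for (String token : tokens) {
                if (!token.isEmpty()) {
                    // Add 1 to the term's running count, starting from 1 if unseen.
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

Calling `TermFrequencySketch.countTerms(List.of("The cat sat.", "the CAT ran"))` would give a count of 2 for both `the` and `cat`, and 1 for `sat` and `ran`. An export job would then write such a map out to a file for later use in ranking.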