mirror of
https://github.com/MarginaliaSearch/MarginaliaSearch.git
synced 2025-02-24 05:18:58 +00:00
Converting Process
The converting process reads crawl data and extracts the information to be fed into the index, such as keywords, metadata, URLs, and descriptions.
Central Classes
- ConverterMain orchestrates the conversion process.
- DocumentProcessor converts a single document.
  - HtmlDocumentProcessorPlugin has HTML-specific logic for processing a document and its keywords, and identifies features such as whether the document uses JavaScript.
  - PlainTextDocumentProcessorPlugin has plain text-specific logic related to a document...
- DomainProcessor converts each document and generates domain-wide metadata such as link graphs.
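
The relationship between the document processor and its format-specific plugins can be sketched roughly as below. This is a minimal illustration and not the actual code: every class name ending in `Sketch`, the `isApplicable`/`process` methods, and the naive regex-based keyword extraction are assumptions made for the example. The real classes in `src` have their own signatures and far more involved logic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical result type; the real converter produces much richer output.
final class ProcessedDocSketch {
    final String url;
    final List<String> keywords;
    final boolean hasJavascript;

    ProcessedDocSketch(String url, List<String> keywords, boolean hasJavascript) {
        this.url = url;
        this.keywords = keywords;
        this.hasJavascript = hasJavascript;
    }
}

// Hypothetical plugin contract: one implementation per document format.
interface ProcessorPluginSketch {
    boolean isApplicable(String contentType);
    ProcessedDocSketch process(String url, String body);
}

final class KeywordsSketch {
    // Naive keyword extraction: lowercase, split on non-alphanumerics, de-duplicate.
    static List<String> extract(String text) {
        return Arrays.stream(text.toLowerCase().split("[^a-z0-9]+"))
                .filter(w -> !w.isEmpty())
                .distinct()
                .collect(Collectors.toList());
    }
}

final class HtmlPluginSketch implements ProcessorPluginSketch {
    public boolean isApplicable(String contentType) {
        return contentType.startsWith("text/html");
    }

    public ProcessedDocSketch process(String url, String body) {
        // Feature detection: does the page contain a script tag?
        boolean hasJs = body.toLowerCase().contains("<script");
        // Crude tag stripping for illustration only; a real implementation would use an HTML parser.
        String text = body.replaceAll("(?is)<script.*?</script>", " ")
                          .replaceAll("<[^>]+>", " ");
        return new ProcessedDocSketch(url, KeywordsSketch.extract(text), hasJs);
    }
}

final class PlainTextPluginSketch implements ProcessorPluginSketch {
    public boolean isApplicable(String contentType) {
        return contentType.startsWith("text/plain");
    }

    public ProcessedDocSketch process(String url, String body) {
        return new ProcessedDocSketch(url, KeywordsSketch.extract(body), false);
    }
}

// Dispatches a single document to the first plugin that accepts its content type.
final class DocumentProcessorSketch {
    private final List<ProcessorPluginSketch> plugins;

    DocumentProcessorSketch(List<ProcessorPluginSketch> plugins) {
        this.plugins = plugins;
    }

    Optional<ProcessedDocSketch> process(String url, String contentType, String body) {
        for (ProcessorPluginSketch plugin : plugins) {
            if (plugin.isApplicable(contentType)) {
                return Optional.of(plugin.process(url, body));
            }
        }
        return Optional.empty(); // no plugin handles this format
    }
}

final class ConverterSketchDemo {
    public static void main(String[] args) {
        DocumentProcessorSketch processor = new DocumentProcessorSketch(
                Arrays.asList(new HtmlPluginSketch(), new PlainTextPluginSketch()));

        processor.process("https://example.com/", "text/html",
                        "<html><script>var x;</script><p>Hello search world</p></html>")
                .ifPresent(doc -> System.out.println(doc.keywords + " js=" + doc.hasJavascript));
    }
}
```

Keeping the format-specific logic behind a small plugin interface means the document processor itself only has to pick a plugin, and a new format can be supported by adding another implementation to the list.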