MarginaliaSearch/code/services-core
Viktor Lofgren c73e43f5c9 (recrawl) Mitigate recrawl-before-load footgun
In the scenario where an operator

* Performs a new crawl from spec
* Doesn't load the data into the index
* Recrawls the data

The recrawl will not find the domains in the database, and the crawl log will be overwritten with an empty file,
irrecoverably losing the crawl log and making the crawl data impossible to load!

To mitigate the impact of similar problems, the change saves a backup of the old crawl log, and also complains when this happens.

More specific to this exact scenario, the parquet-loaded domains are also preemptively inserted into the domain database at the start of the crawl. This should help the DbCrawlSpecProvider find them regardless of loaded state.

This may seem a bit redundant, but losing crawl data is arguably the worst type of disaster scenario for this software, so the extra caution is merited.
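
The backup-and-complain mitigation described above can be illustrated with a minimal sketch. This is not the code from the commit: the class name `CrawlLogBackup`, the method names, and the `.bak` suffix are all hypothetical, chosen only to show the general idea of copying the old log aside before overwriting it and warning when the replacement is empty.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical illustration; names and the ".bak" suffix are assumptions,
// not taken from the Marginalia codebase.
public class CrawlLogBackup {

    /**
     * Copies an existing log to a sibling "<name>.bak" file.
     * Returns the backup path, or null if there was no log to back up.
     */
    public static Path backupBeforeOverwrite(Path crawlLog) throws IOException {
        if (!Files.exists(crawlLog)) {
            return null;
        }
        Path backup = crawlLog.resolveSibling(crawlLog.getFileName() + ".bak");
        Files.copy(crawlLog, backup, StandardCopyOption.REPLACE_EXISTING);
        return backup;
    }

    /**
     * Overwrites the log with the given lines, backing up the old log first
     * and warning when an existing log is about to be replaced by an empty one.
     */
    public static void writeLog(Path crawlLog, List<String> lines) throws IOException {
        Path backup = backupBeforeOverwrite(crawlLog);
        if (backup != null && lines.isEmpty()) {
            System.err.println("WARNING: replacing " + crawlLog
                    + " with an empty log; old log kept at " + backup);
        }
        // Files.write creates the file or truncates an existing one by default.
        Files.write(crawlLog, lines);
    }
}
```

With this shape, even if a recrawl produces an empty log, the previous log remains recoverable from the backup file.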
2024-02-18 09:23:20 +01:00
assistant-service (*) install script for deploying Marginalia outside the codebase 2024-01-11 12:40:03 +01:00
control-service (sideload) Clean up the sideloading code 2024-02-17 14:32:36 +01:00
executor-service (recrawl) Mitigate recrawl-before-load footgun 2024-02-18 09:23:20 +01:00
index-service (index-journal) Improve documentation and code quality 2024-02-15 10:51:49 +01:00
query-service (client) Refactor GrpcStubPool to handle error states 2024-02-17 14:42:26 +01:00
readme.md (refactor) Move search service into services-satellite 2023-10-09 13:40:01 +02:00

Core Services

The core services constitute the main functionality of the search engine, relatively agnostic to the Marginalia application.

  • The index-service contains the indexes; it answers questions about which documents contain which terms.

  • The query-service interprets queries and delegates work to index-service.

  • The control-service provides an operator's user interface, and is responsible for orchestrating the various processes of the system.

  • The assistant-service helps the search service with spelling suggestions and other peripheral functionality.