# Crawling Process

The crawling process downloads HTML documents and saves them into per-domain snapshots.
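To make the fetch-and-snapshot flow concrete, here is a minimal sketch of that loop. The `DomainSnapshotSketch` class, the example URLs, the one-file-per-domain naming, and the record format are all illustrative assumptions, not the crawler's actual classes or storage layout; the real process uses a more structured format and far more careful error handling.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Minimal sketch: fetch a domain's URLs and append the HTML bodies
 *  to a single snapshot file for that domain. Class names, file
 *  naming, and record format are hypothetical, for illustration only. */
public class DomainSnapshotSketch {
    private static final HttpClient client = HttpClient.newHttpClient();

    public static void main(String[] args) throws IOException, InterruptedException {
        String domain = "www.example.com"; // hypothetical domain
        List<String> urls = List.of(
                "https://www.example.com/",
                "https://www.example.com/about");

        Path snapshot = Path.of(domain + ".snapshot"); // one file per domain
        for (String url : urls) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());

            // Record the URL, status code, and body; a real crawler would
            // also keep headers and use a structured on-disk format.
            String record = "== " + url + " " + response.statusCode() + "\n"
                    + response.body() + "\n";
            Files.writeString(snapshot, record,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);

            Thread.sleep(1000); // be polite: pace requests to the same host
        }
    }
}
```

Grouping fetched documents by domain like this keeps each host's pages together on disk, which simplifies per-domain rate limiting during the crawl and per-domain processing afterwards.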

## Central Classes

## See Also