MarginaliaSearch/code/process-models
Viktor Lofgren 5329968155 (crawler) Update CrawlingThenConvertingIntegrationTest
This commit updates CrawlingThenConvertingIntegrationTest with additional tests for invalid, redirecting, and blocked domains. Improvements have also been made to filter out irrelevant entries in ParquetSerializableCrawlDataStream.
2023-12-15 21:04:06 +01:00
crawl-spec (*) WIP Add node affinity to EC_DOMAIN 2023-10-19 17:48:34 +02:00
crawling-model (crawler) Update CrawlingThenConvertingIntegrationTest 2023-12-15 21:04:06 +01:00
processed-data (*) Refactor GeoIP-related code 2023-12-10 17:30:43 +01:00
work-log (build) Move unit test configuration to root build.gradle 2023-10-04 12:46:22 +02:00