MarginaliaSearch/code/process-models
Viktor Lofgren 3113b5a551 (warc) Filter WarcResponses based on X-Robots-Tags
There really is no fantastic place to put this logic, but we need to remove entries with an X-Robots-Tag header whose value indicates the page does not want to be crawled by Marginalia.
2023-12-16 15:58:27 +01:00
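The filtering described in the commit message above could look roughly like the following. This is a minimal sketch, not the code from the commit: the class name `XRobotsTagFilter`, the user-agent token in `BOT_NAME`, and the choice to treat only `noindex` and `none` as blocking directives are all assumptions made for illustration. It relies on the general X-Robots-Tag convention that a header value may be prefixed with a bot name followed by a colon, in which case it applies to that bot only.

```java
import java.util.List;
import java.util.Locale;

class XRobotsTagFilter {
    // Hypothetical user-agent token; the token the real crawler matches on is an assumption.
    static final String BOT_NAME = "search.marginalia.nu";

    /** Returns true if no X-Robots-Tag header value forbids keeping the response. */
    public static boolean isAllowed(List<String> xRobotsTagValues) {
        for (String value : xRobotsTagValues) {
            if (!isValueAllowed(value)) {
                return false;
            }
        }
        return true;
    }

    /** Evaluates a single header value such as "noindex, nofollow" or "somebot: noindex". */
    public static boolean isValueAllowed(String headerValue) {
        String value = headerValue.trim().toLowerCase(Locale.ROOT);
        String directives = value;

        int colon = value.indexOf(':');
        if (colon >= 0) {
            String prefix = value.substring(0, colon).trim();
            // "unavailable_after: <date>" is a directive that contains a colon, not a bot name.
            if (!prefix.equals("unavailable_after")) {
                if (!prefix.equals(BOT_NAME)) {
                    return true; // addressed to some other bot, so it does not apply to us
                }
                directives = value.substring(colon + 1);
            }
        }

        // Assumption: only "noindex" and "none" cause the entry to be dropped.
        for (String part : directives.split(",")) {
            String directive = part.trim();
            if (directive.equals("noindex") || directive.equals("none")) {
                return false;
            }
        }
        return true;
    }
}
```

With these assumptions, a response carrying `X-Robots-Tag: noindex, nofollow` would be dropped, while one carrying `X-Robots-Tag: googlebot: noindex` would be kept, since that value is addressed to a different bot.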
crawl-spec (*) WIP Add node affinity to EC_DOMAIN 2023-10-19 17:48:34 +02:00
crawling-model (warc) Filter WarcResponses based on X-Robots-Tags 2023-12-16 15:58:27 +01:00
processed-data (*) Refactor GeoIP-related code 2023-12-10 17:30:43 +01:00
work-log (build) Move unit test configuration to root build.gradle 2023-10-04 12:46:22 +02:00