MarginaliaSearch/code/processes/crawling-process/java/nu/marginalia/crawl/retreival
Viktor Lofgren ecb5eedeae (crawler, EXPERIMENT) Disable content type probing and use Accept header instead
There's reason to think this may speed up crawling quite significantly, and the benefits of the probing aren't quite there.
2024-09-30 14:53:01 +02:00
revisit (crawler) Refactor 2024-09-23 17:51:07 +02:00
sitemap (crawler) Refactor 2024-09-23 17:51:07 +02:00
CrawlDataReference.java (wip) Extract and encode spans data 2024-07-27 11:44:13 +02:00
CrawlDelayTimer.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
CrawlerRetreiver.java (crawler, EXPERIMENT) Disable content type probing and use Accept header instead 2024-09-30 14:53:01 +02:00
CrawlerWarcResynchronizer.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
DomainCrawlFrontier.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
DomainProber.java (crawler) Refactor boundary between CrawlerRetreiver and HttpFetcherImpl 2024-09-24 15:08:22 +02:00