MarginaliaSearch/code/processes/crawling-process/java/nu/marginalia/crawl/retreival
Viktor Lofgren e9854f194c (crawler) Refactor
* Restructure the code to make a bit more sense
* Store full headers in crawl data
* Fix bug in Retry-After header handling: the value was assumed to be in milliseconds and then clamped to a lower bound of 500 ms, so it was almost always handled wrong
2024-09-23 17:51:07 +02:00
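The Retry-After fix described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the class name `RetryAfterSketch`, both method names, the `MAX_DELAY` cap, and the fallback parameter are hypothetical and chosen only for illustration. The sketch assumes the header carries the delay-seconds form; the HTTP-date form is not handled and falls back to the default.

```java
import java.time.Duration;

public class RetryAfterSketch {
    // Lower bound mentioned in the commit message.
    static final Duration MIN_DELAY = Duration.ofMillis(500);
    // Hypothetical upper bound so a hostile server cannot stall the crawler.
    static final Duration MAX_DELAY = Duration.ofSeconds(60);

    /** Reconstruction of the described bug: the value is read as milliseconds,
     *  then clamped upward to 500 ms, so "5" (seconds) becomes half a second. */
    static Duration buggyDelay(String headerValue) {
        long millis = Long.parseLong(headerValue.trim());
        return Duration.ofMillis(Math.max(millis, MIN_DELAY.toMillis()));
    }

    /** One possible fix: read the value as whole seconds and cap it.
     *  Missing, negative, or non-numeric values yield the fallback. */
    static Duration fixedDelay(String headerValue, Duration fallback) {
        if (headerValue == null) {
            return fallback;
        }
        try {
            long seconds = Long.parseLong(headerValue.trim());
            if (seconds < 0) {
                return fallback;
            }
            Duration delay = Duration.ofSeconds(seconds);
            return delay.compareTo(MAX_DELAY) > 0 ? MAX_DELAY : delay;
        } catch (NumberFormatException e) {
            // HTTP-date form (or garbage): not parsed in this sketch.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(buggyDelay("5"));                        // PT0.5S
        System.out.println(fixedDelay("5", Duration.ofSeconds(1))); // PT5S
    }
}
```

With a header of `Retry-After: 5`, the buggy reading waits 500 ms while the corrected reading waits 5 seconds, which is why typical small values were almost always handled wrong.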
revisit (crawler) Refactor 2024-09-23 17:51:07 +02:00
sitemap (crawler) Refactor 2024-09-23 17:51:07 +02:00
CrawlDataReference.java (wip) Extract and encode spans data 2024-07-27 11:44:13 +02:00
CrawlDelayTimer.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
CrawlerRetreiver.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
CrawlerWarcResynchronizer.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
DomainCrawlFrontier.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
DomainProber.java (crawler) Refactor 2024-09-23 17:51:07 +02:00