MarginaliaSearch/code/processes/crawling-process/java/nu/marginalia/crawl/fetcher
Viktor Lofgren bae44497fe (crawler) Add a new system property crawler.maxFetchSize
This gives the same upper limit to the live crawler and the big crawler. The live crawler rejects items that are too large, while the big crawler truncates them at that limit.
2024-12-30 15:10:11 +01:00
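
The two behaviours described in the commit message above (reject versus truncate at a shared limit) might look roughly like the sketch below. It is not taken from the repository: the class name, method names, and the default value are hypothetical, and it assumes the property holds a byte count, which the commit message does not state. Only the property name `crawler.maxFetchSize` comes from the source.

```java
import java.io.IOException;
import java.io.InputStream;

public class MaxFetchSizeSketch {
    // Hypothetical fallback; the real default (and its unit) is not given in the commit message.
    static final int ASSUMED_DEFAULT_BYTES = 1024 * 1024;

    /** Reads the shared limit from the system property, falling back to the assumed default. */
    public static int maxFetchSize() {
        return Integer.getInteger("crawler.maxFetchSize", ASSUMED_DEFAULT_BYTES);
    }

    /** Truncating policy: returns at most {@code limit} bytes and ignores the rest of the stream. */
    public static byte[] readTruncating(InputStream in, int limit) throws IOException {
        return in.readNBytes(limit);
    }

    /** Rejecting policy: fails if the stream holds more than {@code limit} bytes. */
    public static byte[] readRejecting(InputStream in, int limit) throws IOException {
        byte[] data = in.readNBytes(limit);
        // One extra read tells us whether anything is left beyond the limit.
        if (in.read() != -1) {
            throw new IOException("Body exceeds " + limit + " bytes");
        }
        return data;
    }
}
```

With this shape, both crawlers could call `maxFetchSize()` and differ only in which read method they use. The limit would be set at startup with something like `-Dcrawler.maxFetchSize=...` on the JVM command line.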
socket (chore) Remove lombok 2024-11-11 21:14:38 +01:00
warc (crawler) Add a new system property crawler.maxFetchSize 2024-12-30 15:10:11 +01:00
ContentTags.java (crawler) Do not remove W/-prefix on weak e-tags 2024-12-27 20:56:42 +01:00
Cookies.java (crawler) Refactor 2024-09-23 17:51:07 +02:00
HttpFetcher.java (chore) Remove lombok 2024-11-11 21:14:38 +01:00
HttpFetcherImpl.java (crawler) Correct content type probing to only run on URLs that are suspected to be binary 2024-12-26 14:13:17 +01:00
SitemapRetriever.java (crawler) Refactor 2024-09-23 17:51:07 +02:00