MarginaliaSearch/code/processes/crawling-process/java/nu/marginalia/crawl
Viktor Lofgren e7d4bcd872 (crawler) Use the probe-result to reduce the likelihood of crawling both http and https
This should drastically reduce the number of fetched documents on many domains
2024-04-22 15:36:43 +02:00
retreival (crawler) Use the probe-result to reduce the likelihood of crawling both http and https 2024-04-22 15:36:43 +02:00
spec (refac) Remove src/main from all source code paths. 2024-02-23 16:13:40 +01:00
warc (refac) Remove src/main from all source code paths. 2024-02-23 16:13:40 +01:00
AbortMonitor.java (refac) Remove src/main from all source code paths. 2024-02-23 16:13:40 +01:00
CrawlerMain.java (crawler/converter) Remove legacy junk from parquet migration 2024-04-22 12:34:28 +02:00
CrawlerModule.java (refac) Remove src/main from all source code paths. 2024-02-23 16:13:40 +01:00