MarginaliaSearch/code/processes/live-crawling-process/java/nu/marginalia/livecrawler
2024-12-23 23:31:03 +01:00
LiveCrawlDataSet.java (live-crawler) Keep track of bad URLs 2024-11-22 00:55:46 +01:00
LiveCrawlerMain.java (live-crawler) Flag live crawled documents with a special keyword 2024-12-10 13:42:10 +01:00
LiveCrawlerModule.java (refac) Move export tasks to a process and clean up process initialization for all ProcessMainClass descendents 2024-11-21 16:00:09 +01:00
SimpleLinkScraper.java (live-crawler) Limit concurrent accesses per domain using DomainLocks from main crawler 2024-12-23 23:31:03 +01:00