Mirror of https://github.com/MarginaliaSearch/MarginaliaSearch.git, synced 2025-02-24 05:18:58 +00:00
Nepenthes has been doing the rounds on social media, so this change adds an easy detection and mitigation mechanism for this type of trap, as sadly not all webmasters set up their robots.txt correctly. The out-of-the-box crawl limits will also deal with this type of attack, but this fix is faster. |
||
---|---|---|
fetcher | ||
logic | ||
retreival | ||
warc | ||
CrawlerMain.java | ||
CrawlerModule.java | ||
DomainStateDb.java |