Mirror of https://github.com/MarginaliaSearch/MarginaliaSearch.git (synced 2025-02-24 13:19:02 +00:00)
Crawling Process
The crawling process downloads HTML documents and saves them into per-domain snapshots.
Central Classes
- CrawlerMain orchestrates the crawling.
- CrawlerRetreiver visits known addresses from a domain and downloads each document.
- HttpFetcher fetches a URL.
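
The division of labour between the three classes above can be illustrated with a minimal sketch. It is not the project's actual code: `CrawlSketch`, `Fetcher`, `Retriever`, `crawlDomain` and `crawlAll` are hypothetical names invented for this example, the "snapshot" is an in-memory map rather than a file on disk, and the real classes will have different signatures and many more concerns (robots.txt, rate limiting, retries, and so on).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CrawlSketch {
    /** Hypothetical stand-in for the fetcher role: fetch a single URL. */
    interface Fetcher {
        /** Returns the document body, or null if the fetch failed. */
        String fetch(String url);
    }

    /** Hypothetical stand-in for the retriever role: visit one domain's known URLs. */
    static class Retriever {
        private final Fetcher fetcher;

        Retriever(Fetcher fetcher) {
            this.fetcher = fetcher;
        }

        /** Builds a url -> body map, standing in for one per-domain snapshot. */
        Map<String, String> crawlDomain(List<String> knownUrls) {
            Map<String, String> snapshot = new LinkedHashMap<>();
            for (String url : knownUrls) {
                String body = fetcher.fetch(url);
                if (body != null) {
                    snapshot.put(url, body); // failed fetches are skipped
                }
            }
            return snapshot;
        }
    }

    /** Hypothetical stand-in for the orchestrator role: one snapshot per domain. */
    static Map<String, Map<String, String>> crawlAll(
            Map<String, List<String>> urlsByDomain, Fetcher fetcher) {
        Retriever retriever = new Retriever(fetcher);
        Map<String, Map<String, String>> snapshots = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : urlsByDomain.entrySet()) {
            snapshots.put(entry.getKey(), retriever.crawlDomain(entry.getValue()));
        }
        return snapshots;
    }

    public static void main(String[] args) {
        // A fake fetcher so the sketch runs without network access.
        Fetcher fake = url -> url.endsWith("/missing") ? null : "<html>" + url + "</html>";

        Map<String, List<String>> urls = new LinkedHashMap<>();
        urls.put("example.com",
                List.of("https://example.com/", "https://example.com/missing"));

        Map<String, Map<String, String>> snapshots = crawlAll(urls, fake);
        System.out.println(snapshots.get("example.com").keySet());
    }
}
```

Keeping the fetcher behind an interface, as assumed here, lets the per-domain loop be exercised with a fake fetcher instead of real HTTP traffic.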