mirror of https://github.com/MarginaliaSearch/MarginaliaSearch.git
synced 2025-02-23 21:18:58 +00:00
The WARC specification says the records should transparently remove compression. This was not done, leading to the WARC typically being a bit of a gzip-Matryoshka.

| Name |
|---|
| crawl-spec |
| crawling-model |
| processed-data |
| work-log |
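
The "gzip-Matryoshka" described in the commit message above can be illustrated with a small sketch: a body that was already gzip-compressed gets stored inside a container that is compressed again, so a reader has to decompress twice to reach the actual payload. This is not the repository's implementation. The helper name `unwrap_gzip_layers` and its `max_layers` parameter are hypothetical, and the sample data is made up for illustration.

```python
import gzip
import zlib


def unwrap_gzip_layers(data: bytes, max_layers: int = 8) -> tuple[bytes, int]:
    """Strip nested gzip layers (hypothetical helper, for illustration only).

    Returns the innermost payload and the number of layers removed.
    """
    layers = 0
    # A gzip stream starts with the magic bytes 0x1f 0x8b.
    while layers < max_layers and data[:2] == b"\x1f\x8b":
        try:
            data = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            # Looked like gzip but could not be decoded; keep the bytes as they are.
            break
        layers += 1
    return data, layers


# Simulate the situation from the commit message: a compressed response body
# stored inside a record that is itself compressed.
body = b"<html>hello</html>"
nested = gzip.compress(gzip.compress(body))

payload, layers = unwrap_gzip_layers(nested)
print(layers, payload)
```

Sniffing magic bytes is only a heuristic: a payload that is legitimately a gzip file would be unwrapped too, so a real writer would more plausibly decode once, based on the response's declared `Content-Encoding`, before storing the record.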