MarginaliaSearch/code/process-models/crawling-model/src
Viktor Lofgren 22c8fb3f59 (crawler) Fix a bug where reference copies of crawl data were written without etag and last-modified
This commit also adds a band-aid to ParquetSerializableCrawlDataStream to fetch these from the 304-entity. This can be removed in a few months.
2024-01-18 16:02:27 +01:00
main/java (crawler) Fix a bug where reference copies of crawl data were written without etag and last-modified 2024-01-18 16:02:27 +01:00
test/java/nu/marginalia/crawling (warc) Add fields for etags and last-modified headers to the new crawl data formats 2023-12-18 17:45:54 +01:00