MarginaliaSearch/code/processes/converting-process/src
Viktor Lofgren dd8fb04886 (converter) Add sizeloadSizeAdvice field to several ProcessedDomain
To keep the memory footprint manageable, the sideloaders don't populate the documents list in ProcessedDomain. As a result, the code that estimates knownUrls etc. sets those values to zero, which hurts the ranking of sideloaded domains. This change populates them with a bullshit value within a sane ballpark, ensuring that these domains show up in the rankings.
2023-12-19 18:37:51 +01:00
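The fallback described in the commit message can be sketched roughly as below. This is a minimal illustration and not the code from the commit: the class `DomainSizeSketch` and the method `estimateKnownUrls` are hypothetical names, and the only identifiers taken from the message are the `documents` list, `knownUrls`, and the `sizeloadSizeAdvice` field. It assumes the advice is a nullable integer that is consulted only when no documents are available.

```java
import java.util.List;

// Hypothetical stand-in for ProcessedDomain, reduced to the two fields
// the commit message talks about. Not the real Marginalia class.
final class DomainSizeSketch {
    // Sideloaders leave this unpopulated to keep memory usage down.
    final List<String> documents;
    // Rough size hint supplied by a sideloader; null when none was given (assumption).
    final Integer sizeloadSizeAdvice;

    DomainSizeSketch(List<String> documents, Integer sizeloadSizeAdvice) {
        this.documents = documents;
        this.sizeloadSizeAdvice = sizeloadSizeAdvice;
    }

    // Hypothetical estimator: count documents when they exist, otherwise
    // fall back to the advice so the estimate is not stuck at zero.
    int estimateKnownUrls() {
        if (documents != null && !documents.isEmpty()) {
            return documents.size();
        }
        if (sizeloadSizeAdvice != null) {
            return sizeloadSizeAdvice;
        }
        return 0;
    }
}
```

With this shape, a sideloaded domain with no documents and an advice of 5000 would report 5000 known URLs instead of zero, while a regularly crawled domain keeps its real document count.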
main/java/nu/marginalia/converting (converter) Add sizeloadSizeAdvice field to several ProcessedDomain 2023-12-19 18:37:51 +01:00
test (warc) Add fields for etags and last-modified headers to the new crawl data formats 2023-12-18 17:45:54 +01:00