MarginaliaSearch/code/processes/converting-process/java/nu/marginalia/converting/sideload
Viktor Lofgren d968801dc1 (converter) Drop feed data from SlopDomainRecord
Also remove feed extraction from converter.  This is the crawler's responsibility now.
2024-12-26 17:57:08 +01:00
dirtree (chore) Remove lombok 2024-11-11 21:14:38 +01:00
encyclopedia (encyclopedia-sideloader) Add test suite and clean up urlencoding logic 2024-11-26 13:34:15 +01:00
reddit (converter) Refactor sideloaders to improve feature handling and keyword logic 2024-12-11 16:01:38 +01:00
stackexchange (converter) Drop feed data from SlopDomainRecord 2024-12-26 17:57:08 +01:00
warc (chore) Remove lombok 2024-11-11 21:14:38 +01:00
SideloaderProcessing.java (model) Remove deprecated fields from CrawledDocument and CrawledDomain 2024-11-20 15:27:05 +01:00
SideloadSource.java (refac) Remove src/main from all source code paths. 2024-02-23 16:13:40 +01:00
SideloadSourceFactory.java (crawler) Reintroduce content type probing and clean out bad content type data from the existing crawl sets 2024-12-11 17:01:52 +01:00