MarginaliaSearch/code/processes/converting-process/java/nu/marginalia/converting/sideload
Viktor Lofgren ff17473105 Fix UTF-8 URL normalization issue in sideloader.
Normalize URLs by replacing en-dash with hyphen to prevent encoding errors. This ensures correct handling of a small subset of articles with improperly normalized UTF-8 paths. Added `normalizeUtf8` method to address this issue.

Fixes issue #109.
2024-11-25 14:25:47 +01:00
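
A minimal sketch of the normalization this commit message describes: replacing the en-dash (U+2013) with an ASCII hyphen before the URL is processed further. The method name `normalizeUtf8` comes from the commit message; its signature, the class name `UrlNormalizeSketch`, and the sample path are assumptions for illustration, and the actual method in the encyclopedia sideloader may differ.

```java
// Illustrative only: not the code from the MarginaliaSearch repository.
class UrlNormalizeSketch {

    // Assumed signature; the commit message names the method but does not show it.
    static String normalizeUtf8(String url) {
        // Replace every EN DASH (U+2013) with HYPHEN-MINUS (U+002D).
        return url.replace('\u2013', '-');
    }

    public static void main(String[] args) {
        // Hypothetical article path containing an en-dash.
        String path = "Michelson\u2013Morley_experiment";
        System.out.println(normalizeUtf8(path));
    }
}
```

Strings that contain no en-dash pass through unchanged, so a method like this could be applied to every article path without affecting the ones that are already normalized.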
dirtree (chore) Remove lombok 2024-11-11 21:14:38 +01:00
encyclopedia Fix UTF-8 URL normalization issue in sideloader. 2024-11-25 14:25:47 +01:00
reddit (chore) Remove use of deprecated STR.-style string templates 2024-11-11 18:02:28 +01:00
stackexchange (chore) Remove lombok 2024-11-11 21:14:38 +01:00
warc (chore) Remove lombok 2024-11-11 21:14:38 +01:00
SideloaderProcessing.java (model) Remove deprecated fields from CrawledDocument and CrawledDomain 2024-11-20 15:27:05 +01:00
SideloadSource.java (refac) Remove src/main from all source code paths. 2024-02-23 16:13:40 +01:00
SideloadSourceFactory.java (refac) Remove src/main from all source code paths. 2024-02-23 16:13:40 +01:00