Viktor Lofgren
8c963bd4ba
(feeds) Remove Content-Encoding: gzip from feed fetcher
We don't support decompressing gzip, so requesting it only produces errors whenever the server honors it.
2024-12-18 22:23:44 +01:00
Viktor Lofgren
6a079c1c75
(feeds) Add per-domain throttling for feed fetcher.
2024-12-18 22:06:46 +01:00
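The commit subject above does not say how the throttling works. As an illustration only, here is a minimal sketch of one common approach: enforce a minimum interval between fetches to the same domain. The class name DomainThrottle, the reserve method, and the interval value are all hypothetical and not taken from the repository.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: spaces out requests to the same domain by a fixed minimum interval.
final class DomainThrottle {
    private final long minIntervalMillis;
    // Earliest time (epoch millis) at which the next fetch for each domain may start.
    private final Map<String, Long> nextAllowed = new HashMap<>();

    DomainThrottle(long minIntervalMillis) {
        if (minIntervalMillis < 0) {
            throw new IllegalArgumentException("interval must be non-negative");
        }
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Reserves the next fetch slot for the domain and returns how many
     * milliseconds the caller should wait before using it.
     * The current time is passed in so the logic stays deterministic.
     */
    synchronized long reserve(String domain, long nowMillis) {
        long earliest = Math.max(nowMillis, nextAllowed.getOrDefault(domain, nowMillis));
        nextAllowed.put(domain, earliest + minIntervalMillis);
        return earliest - nowMillis;
    }
}
```

A caller would sleep for the returned delay before issuing the request; requests to other domains are unaffected.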
Viktor Lofgren
2dc9f2e639
(feeds) Make feed XML parsing more lenient
... by consuming BOM markers and leading whitespace.
2024-12-18 17:18:41 +01:00
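As a rough illustration of what consuming BOM markers and leading whitespace can look like, here is a minimal sketch that trims them from already-decoded feed text before it reaches the XML parser. The class and method names are hypothetical, and the actual change may operate on the byte stream instead.

```java
// Hypothetical sketch: removes leading byte order marks and whitespace
// so that the XML declaration (if any) is the first thing the parser sees.
final class LenientXmlPrefix {
    private LenientXmlPrefix() {}

    /** Returns the input without any leading U+FEFF characters or whitespace. */
    static String stripLeadingNoise(String xml) {
        int start = 0;
        while (start < xml.length()) {
            char c = xml.charAt(start);
            if (c == '\uFEFF' || Character.isWhitespace(c)) {
                start++;
            } else {
                break;
            }
        }
        return xml.substring(start);
    }
}
```

Strict XML parsers typically reject a document whose XML declaration is preceded by other characters, which is why trimming the prefix makes parsing more forgiving.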
Viktor Lofgren
b66fb9caf6
(feeds) Improve error handling in the feed fetcher.
2024-12-18 17:02:13 +01:00
Viktor Lofgren
923ebbac81
(feeds) Add logic to handle URI fragments in feed items
Introduced a method to decide whether to retain URI fragments in feed items based on their uniqueness. Enhanced FeedItem processing to conditionally strip fragments to maintain clean URLs where applicable.
2024-11-23 16:38:56 +01:00
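One way to read "based on their uniqueness" is: strip fragments unless doing so would make distinct item URLs collide. The sketch below implements that reading under that assumption. FragmentPolicy and its methods are hypothetical names, and the real FeedItem logic may differ.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: keep fragments only when they are what distinguishes items.
final class FragmentPolicy {
    private FragmentPolicy() {}

    /** Returns the URL up to, but not including, the first '#'. */
    static String stripFragment(String url) {
        int idx = url.indexOf('#');
        return idx < 0 ? url : url.substring(0, idx);
    }

    /** True if removing fragments would merge URLs that are currently distinct. */
    static boolean shouldKeepFragments(List<String> urls) {
        Set<String> original = new HashSet<>(urls);
        Set<String> stripped = new HashSet<>();
        for (String url : original) {
            stripped.add(stripFragment(url));
        }
        return stripped.size() < original.size();
    }

    /** Applies the policy to all item URLs of a single feed. */
    static List<String> normalize(List<String> urls) {
        if (shouldKeepFragments(urls)) {
            return List.copyOf(urls);
        }
        List<String> result = new ArrayList<>(urls.size());
        for (String url : urls) {
            result.add(stripFragment(url));
        }
        return result;
    }
}
```

Under this reading, a feed whose items all point at one changelog page with different anchors keeps its fragments, while a feed that merely decorates otherwise unique URLs gets clean links.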
Viktor Lofgren
4d23fe6261
(feeds) Simplify RSS User-Agent header
Removed the redundant "RSS Feed Fetcher" suffix from the User-Agent header in the FeedFetcherService. This should keep the feed fetcher from triggering bot mitigation that accepts the regular UA string.
2024-11-21 16:43:56 +01:00
Viktor Lofgren
c728a1e2f2
(rss) Add endpoint for extracting URLs changed within a timespan.
2024-11-18 14:59:32 +01:00
Viktor Lofgren
d874d76a09
(rss) Add an endpoint that can be used for identifying when RSS data has changed
2024-11-18 14:22:17 +01:00
Viktor Lofgren
a2bc9a98c0
(feed) Use the message queue to permit the feeds service to tell the calling actor when it's finished
2024-11-10 17:45:20 +01:00
Viktor Lofgren
e24a98390c
(feed) Update API to allow specifying clean vs refresh update
Move the logic deciding which operation to perform into the actor, updating its state graph to incorporate a counter that runs a clean update once in a blue moon.
2024-11-09 18:43:47 +01:00
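The counter described above can be pictured as follows. This is only a sketch of the idea: the class name CleanUpdateCounter, the Mode enum, and the interval are hypothetical, and the commit does not state how often the clean update actually runs or how the actor's state graph stores the count.

```java
// Hypothetical sketch: picks REFRESH for most runs and CLEAN every Nth run.
final class CleanUpdateCounter {
    enum Mode { CLEAN, REFRESH }

    private final int cleanEvery;   // assumed interval, not taken from the source
    private int runsSinceClean = 0;

    CleanUpdateCounter(int cleanEvery) {
        if (cleanEvery < 1) {
            throw new IllegalArgumentException("cleanEvery must be at least 1");
        }
        this.cleanEvery = cleanEvery;
    }

    /** Advances the counter and returns the mode for the run that is about to start. */
    Mode next() {
        runsSinceClean++;
        if (runsSinceClean >= cleanEvery) {
            runsSinceClean = 0;
            return Mode.CLEAN;
        }
        return Mode.REFRESH;
    }
}
```

Keeping the decision in the caller means the feeds service only has to honor the requested mode and needs no scheduling state of its own.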
Viktor Lofgren
a293266ccd
(feed) Wipe the feeds db and start over from system URLs periodically.
2024-11-09 18:17:16 +01:00
Viktor Lofgren
d774c39031
(feeds) Reduce log spam
2024-11-09 17:56:43 +01:00
Viktor Lofgren
ab17af99da
(feeds) Refresh the feed db using the previous db, when it is available.
2024-11-09 17:56:43 +01:00
Viktor Lofgren
b0ac3c586f
(feeds) Correct parallelism using SimpleBlockingThreadPool
2024-11-09 17:56:43 +01:00
Viktor Lofgren
139fa85b18
(feeds) Add working heartbeat tracking progress
2024-11-09 17:56:43 +01:00
Viktor Lofgren
bfeb9a4538
(feeds) Retire feedlot the feed bot, move RSS capture into the live-capture service
2024-11-09 17:56:43 +01:00