Commit Graph

11 Commits

Author SHA1 Message Date
Viktor Lofgren
fbba392491 (live-capture) Send a UA-string from the browserless fetcher as well
The change also introduces a somewhat convoluted wiremock test to intercept and verify that these headers are in fact sent
2025-02-04 13:36:49 +01:00
Viktor Lofgren
55aeb03c4a (feeds) Replace rssreader based parsing with a custom jsoup based rss parser
This solves some issues with the rssreader based parser, which was very picky about the XML being valid.  Jsoup is much more lenient when parsing malformed XML.
2025-01-09 18:29:55 +01:00
Viktor Lofgren
faa589962f (live-capture) Browserless now requires a token 2025-01-09 14:51:11 +01:00
Viktor Lofgren
2315fdc731 (search) Vendor rssreader and modify it to be able to consume the nlnet atom feed
Also dial down the logging a bit for the rssreader package.
2025-01-06 17:58:50 +01:00
Viktor Lofgren
0b65164f60 (chore) Fix broken test 2025-01-01 18:06:29 +01:00
Viktor Lofgren
3bc99639a0 (feed-fetcher) Make feed fetcher requests conditional
Add `If-None-Match` and `If-Modified-Since` headers as appropriate to the feed fetcher's requests.  On well-configured web servers, this should short-circuit the request and reduce the amount of bandwidth and processing that is necessary.

A new table was added to the FeedDb to hold one etag per domain.

If-Modified-Since semantics are based on the creation date for the feed database, which should serve as a cutoff date for the earliest update we can have received.

This completes the changes for Issue #136.
2024-12-27 15:10:15 +01:00
Viktor Lofgren
41a59dcf45 (feed) Sanitize illegal HTML entities out of the feed XML before parsing 2024-12-25 14:53:28 +01:00
Viktor Lofgren
b66fb9caf6 (feeds) Improve error handling in the feed fetcher. 2024-12-18 17:02:13 +01:00
Viktor Lofgren
c728a1e2f2 (rss) Add endpoint for extracting URLs changed withing a timespan. 2024-11-18 14:59:32 +01:00
Viktor Lofgren
d874d76a09 (rss) Add an endpoint that can be used for identifying when RSS data has changed 2024-11-18 14:22:17 +01:00
Viktor Lofgren
23cce0c78a Add a new function 'Live Capture' for on-demand screenshot capture
The screenshots are requested by the site-service, and triggered via the site-info view.
2024-09-27 13:46:34 +02:00