Commit Graph

2716 Commits

Author SHA1 Message Date
Viktor
5b93a0e633
Update ROADMAP.md 2025-01-12 20:38:11 +01:00
Viktor
08fb0e5efe
Update ROADMAP.md 2025-01-12 20:37:43 +01:00
Viktor
bcf67782ea
Update ROADMAP.md 2025-01-12 20:37:09 +01:00
Viktor Lofgren
ef3f175ede (search) Don't clobber the search query URL with default values 2025-01-10 15:57:30 +01:00
Viktor Lofgren
bbe4b5d9fd Revert experimental changes 2025-01-10 15:52:02 +01:00
Viktor Lofgren
c67a635103 (search, experimental) Add a few debugging tracks to the search UI 2025-01-10 15:44:44 +01:00
Viktor Lofgren
20b24133fb (search, experimental) Add a few debugging tracks to the search UI 2025-01-10 15:34:48 +01:00
Viktor Lofgren
f2567677e8 (index-client) Clean up index client code
Improve error handling.  This should be a relatively rare case, but we don't want one bad index partition to blow up the entire query.
2025-01-10 15:17:07 +01:00
Viktor Lofgren
bc2c2061f2 (index-client) Clean up index client code
This should have the rpc stream reception be performed in parallel in separate threads, rather blocking sequentially in the main thread, hopefully giving a slight performance boost.
2025-01-10 15:14:42 +01:00
Viktor Lofgren
1c7f5a31a5 (search) Further reduce the number of db queries by adding more caching to DbDomainQueries. 2025-01-10 14:17:29 +01:00
Viktor Lofgren
59a8ea60f7 (search) Further reduce the number of db queries by adding more caching to DbDomainQueries. 2025-01-10 14:15:22 +01:00
Viktor Lofgren
aa9b1244ea (search) Reduce the number of db queries a bit by caching data that doesn't change too often 2025-01-10 13:56:04 +01:00
Viktor Lofgren
2d17233366 (search) Reduce the number of db queries a bit by caching data that doesn't change too often 2025-01-10 13:53:56 +01:00
Viktor Lofgren
b245cc9f38 (search) Reduce the number of db queries a bit by caching data that doesn't change too often 2025-01-10 13:46:19 +01:00
Viktor Lofgren
6614d05bdf (db) Make db pool size configurable 2025-01-09 20:20:51 +01:00
Viktor Lofgren
55aeb03c4a (feeds) Replace rssreader based parsing with a custom jsoup based rss parser
This solves some issues with the rssreader based parser, which was very picky about the XML being valid.  Jsoup is much more lenient when parsing malformed XML.
2025-01-09 18:29:55 +01:00
Viktor Lofgren
faa589962f (live-capture) Browserless now requires a token 2025-01-09 14:51:11 +01:00
Viktor Lofgren
c7edd6b39f (live-capture) Browserless now requires a token 2025-01-09 14:46:05 +01:00
Viktor Lofgren
79da622e3b (search) Update front page with new banner about move 2025-01-08 21:38:19 +01:00
Viktor Lofgren
3da8337ba6 (feeds) Add system property for exporting fetched feeds to a slop table for debugging 2025-01-08 20:49:16 +01:00
Viktor Lofgren
a32d230f0a (special) Trigger deployment 2025-01-08 20:07:54 +01:00
Viktor Lofgren
3772bfd387 (query) Fix handling of optional ranking parameters 2025-01-08 17:11:22 +01:00
Viktor Lofgren
02a7900d1a (search) Correct search-in-title toggle in search UI 2025-01-08 16:51:10 +01:00
Viktor Lofgren
a1fb92468f (refac) Remove ResultRankingParameters, QueryLimits class and use protobuf classes directly instead
This is primarily to make the code a bit easier to reason about, and will reduce the level of indirection and data copying in the search-servi->query-service->index-service communication chain.
2025-01-08 16:15:57 +01:00
Viktor Lofgren
b7f0a2a98e (search-service) Fix metrics for errors and request times
This was previously in place, but broke during the jooby migration.
2025-01-08 14:10:43 +01:00
Viktor Lofgren
5fb76b2e79 (search-service) Fix metrics for errors and request times
This was previously in place, but broke during the jooby migration.
2025-01-08 14:06:03 +01:00
Viktor Lofgren
ad8c97f342 (search-service) Begin replacement of the crawl queue mechanism with node_affinity flagging
Previously a special db table was used to hold domains slated for crawling, but this is deprecated, and instead now each domain has a node_affinity flag that decides its indexing state, where a value of -1 indicates it shouldn't be crawled, a value of 0 means it's slated for crawling by the next index partition to be crawled, and a positive value means it's assigned to an index partition.

The change set also adds a test case validating the modified behavior.
2025-01-08 13:25:56 +01:00
Viktor Lofgren
dc1b6373eb (search-service) Clean up readme 2025-01-08 13:04:39 +01:00
Viktor Lofgren
983d6d067c (search-service) Add indexing indicator to sibling domains listing 2025-01-08 12:58:34 +01:00
Viktor Lofgren
a84a06975c (ranking-params) Add disable penalties flag to ranking params
This will help debugging ranking issues.  Later it may be added to some filters.
2025-01-08 00:16:49 +01:00
Viktor Lofgren
d2864c13ec (query-params) Add additional permitted query params 2025-01-07 20:21:44 +01:00
Viktor Lofgren
03ba53ce51 (legacy-search) Update nav bar with correct links 2025-01-07 17:44:52 +01:00
Viktor Lofgren
d4a6684931 (specialization) Soften length requirements for wiki-specialized documents (incl. cppreference) 2025-01-07 15:53:25 +01:00
Viktor
6f0485287a
Merge pull request #145 from MarginaliaSearch/cppreference_fixes
Cppreference fixes
2025-01-07 15:43:19 +01:00
Viktor Lofgren
59e2dd4c26 (specialization) Soften length requirements for wiki-specialized documents (incl. cppreference) 2025-01-07 15:41:30 +01:00
Viktor Lofgren
ca1807caae (specialization) Add new specialization for cppreference.com
Give this reference website some synthetically generated tokens to improve the likelihood of a good match.
2025-01-07 15:41:05 +01:00
Viktor Lofgren
26c20e18ac (keyword-extraction) Soften constraints on keyword patterns, allowing for longer segmented words 2025-01-07 15:20:50 +01:00
Viktor Lofgren
7c90b6b414 (query) Don't blindly make tokens containing a colon into a non-ranking advice term 2025-01-07 15:18:05 +01:00
Viktor Lofgren
b63c54c4ce (search) Update opensearch.xml to point to non-redirecting domains. 2025-01-07 00:23:09 +01:00
Viktor Lofgren
fecd2f4ec3 (deploy) Add legacy search service to deploy script 2025-01-07 00:21:13 +01:00
Viktor Lofgren
39e420de88 (search) Add wayback machine link to siteinfo 2025-01-06 20:33:10 +01:00
Viktor Lofgren
dc83619861 (rssreader) Further suppress logging 2025-01-06 20:20:37 +01:00
Viktor Lofgren
87d1c89701 (search) Add listing of sibling subdomains to site overview 2025-01-06 20:17:36 +01:00
Viktor Lofgren
a42a7769e2 (leagacy-search) Remove legacy paperdoll class 2025-01-06 20:17:36 +01:00
Viktor
202bda884f
Update readme.md
Add note about installing tailwindcss via npm
2025-01-06 18:35:13 +01:00
Viktor Lofgren
2315fdc731 (search) Vendor rssreader and modify it to be able to consume the nlnet atom feed
Also dial down the logging a bit for the rssreader package.
2025-01-06 17:58:50 +01:00
Viktor Lofgren
b5469bd8a1 (search) Turn relative feed URLs absolute when dealing with RSS/Atom item URLs 2025-01-06 16:56:24 +01:00
Viktor Lofgren
6a6318d04c (search) Add separate websiteUrl property to legacy service 2025-01-06 16:26:08 +01:00
Viktor Lofgren
55933f8d40 (search) Ensure we respect old URL contracts
/explore/random should be equivalent to /explore
2025-01-06 16:20:53 +01:00
Viktor
be6382e0d0
Merge pull request #127 from MarginaliaSearch/serp-redesign
Web UI redesign
2025-01-06 16:08:14 +01:00