Commit Graph

17 Commits

Author SHA1 Message Date
Viktor Lofgren
dfd19b5eb9 (index) Reduce the number of abstractions around result ranking
The change also restructures the internal API a bit, moving resultsFromDomain from RpcRawResultItem into RpcDecoratedResultItem, as the previous order was driving complexity in the code that generates these objects, and the consumer side of things puts all this data in the same object regardless.
2024-07-16 08:18:54 +02:00
Viktor Lofgren
4a8afa6b9f (index, WIP) Position data partially integrated with forward and reverse indexes.
There's no graceful way of doing this in small commits, pushing to avoid the risk of data loss.
2024-06-06 12:54:52 +02:00
Viktor Lofgren
d01d9fa670 (search) Add screenreader-specific notification remark about when search results start. 2024-05-04 11:41:06 +02:00
Viktor Lofgren
a53a32f006 (search) Spell out website problems with "atomic elements" instead of having a hover that's inaccessible with keyboard navigation 2024-05-04 11:41:05 +02:00
Viktor Lofgren
3548d54cf6 (search) Add a screenreader-only alert when the search filters are updated to make it easier to understand what happens. 2024-05-04 11:41:04 +02:00
Viktor Lofgren
2840d9d403 (search) Add screenreader-only positions count text to search results 2024-05-04 11:41:03 +02:00
Viktor Lofgren
2ad0bfda1e (*) Fix boot orchestration for the services
This corrects an annoying bug that had the system crash and burn on first start-up due to a race condition in service initialization, where the services were attempting to access the database before it was properly migrated.

A fix was in principle already in place, but it was running too late and did not prevent attempts to access the as-yet uninitialized database.  Move the first boot check into the MainClass instead of the Service constructor.

The change also adds more appropriate docker dependencies to the services to fix rare errors resolving the hostname of the database.
2024-05-01 12:39:48 +02:00
Viktor Lofgren
4772e0b59d (service) Deprecate /public prefix on HTTP
Before the gRPC migration, the system would serve both public and internal requests over HTTP, but distinguish the two using path prefixes and a few HTTP Headers (X-Public, X-Context) added by the reverse proxy to prevent misconfigurations.

Since internal requests meaningfully no longer use HTTP, this convention is just an obstacle now, adding the need to always run the system behind a reverse proxy that rewrites the paths.

The change removes the path prefix, and updates the docker templates to reflect the change.  This will require a migration for existing systems.
2024-04-30 14:46:18 +02:00
Viktor Lofgren
6690e9bde8 (service) Ensure the service discovery starts early
This is necessary as we use zookeeper to orchestrate first-time startup of the services, to ensure that the database is properly migrated by the control service before anything else is permitted to start.
2024-04-25 15:08:33 +02:00
Viktor Lofgren
6efc0f21fe (index) Clean up data model
The change set cleans up the data model for the term-level data.  This used to contain a bunch of fields with document-level metadata.  This data-duplication means a larger memory footprint and worse memory locality.

The ranking code is also modified to not accept SearchResultKeywordScores, but rather CompiledQueryLong and CqDataInts containing only the term metadata and the frequency information needed for ranking.  This is again an effort to improve memory locality.
2024-04-24 14:44:39 +02:00
Viktor Lofgren
4fb86ac692 (search) Fix outdated assumptions about the results
We no longer break the query into "sets" of search terms and need to adapt the code to not use this assumption.

For the API service, we'll simulate the old behavior to keep the API stable.

For the search service, we'll introduce a new way of calculating positions through tree aggregation.
2024-04-24 14:44:38 +02:00
Viktor Lofgren
6cba6aef3b (minor) Remove dead code 2024-04-24 14:44:38 +02:00
Viktor Lofgren
a3a6d6292b (qs, index) New query model integrated with index service.
Seems to work, tests are green and initial testing finds no errors.  Still a bit untested, committing WIP as-is because it would suck to lose weeks of work due to a drive failure or something.
2024-04-24 14:44:38 +02:00
Viktor Lofgren
46423612e3 (refac) Merge service-discovery and service modules
Also adds a few tests to the server/client code.
2024-03-03 10:49:23 +01:00
Viktor Lofgren
8d0af9548b (search) Bot mitigation
Add the ability to indicate to the search service that a request is malicious, and to poison the results by providing randomly reorered old results instead.
2024-02-27 21:22:19 +01:00
Viktor Lofgren
427f3e922f (index) Retire count operation, clean up index code. 2024-02-27 21:22:17 +01:00
Viktor Lofgren
1d34224416 (refac) Remove src/main from all source code paths.
Look, this will make the git history look funny, but trimming unnecessary depth from the source tree is a very necessary sanity-preserving measure when dealing with a super-modularized codebase like this one.

While it makes the project configuration a bit less conventional, it will save you several clicks every time you jump between modules.  Which you'll do a lot, because it's *modul*ar.  The src/main/java convention makes a lot of sense for a non-modular project though.  This ain't that.
2024-02-23 16:13:40 +01:00