Commit Graph

277 Commits

Author SHA1 Message Date
Viktor Lofgren
4668b1ddcb (build) Java 22 and its consequences has been a disaster for Marginalia Search
Roll back to JDK 21 for now, and make Java version configurable in the root build.gradle

The project has run into no less than three distinct show-stopping bugs in JDK22, across multiple vendors, and gradle still doesn't fully support it, meaning you need multiple JDK versions installed.
2024-04-24 13:54:04 +02:00
Viktor Lofgren
b6d365bacd (index) Clean up data model
The change set cleans up the data model for the term-level data.  This used to contain a bunch of fields with document-level metadata.  This data-duplication means a larger memory footprint and worse memory locality.

The ranking code is also modified to not accept SearchResultKeywordScores, but rather CompiledQueryLong and CqDataInts containing only the term metadata and the frequency information needed for ranking.  This is again an effort to improve memory locality.
2024-04-15 16:04:07 +02:00
Viktor Lofgren
002afca1c5 (sys) Upgrade to JDK22
This also entails upgrading JIB to 3.4.1 and Lombok to 1.18.32.
2024-03-21 14:33:27 +01:00
Viktor Lofgren
0bd3365c24 (convert) Initial integration of segmentation data into the converter's keyword extraction logic 2024-03-19 14:28:42 +01:00
Viktor Lofgren
d8f4e7d72b (qs) Retire NGramBloomFilter, integrate new segmentation model instead 2024-03-19 10:42:09 +01:00
Viktor Lofgren
57e6a12d08 (registry) Correct registerMonitor() behavior
The previous behavior would listen to too many changes, and based on zookeeper and not curator assumptions about behavior, add an additional monitor on each invocation of each monitor, (which always trigger on service state changes), leading to each monitor re-registering and effectively doubling monitors in numbers whenever a service stopped or started, which in turn meant a lot of bizarre thrashing behavior even on changes in services that don't explicitly talk to each other.

This re-registering behavior is no longer done.
2024-03-06 12:22:15 +01:00
Viktor Lofgren
46423612e3 (refac) Merge service-discovery and service modules
Also adds a few tests to the server/client code.
2024-03-03 10:49:23 +01:00
Viktor Lofgren
144f967dbf (misc) Tweak pool sizes 2024-02-28 16:23:02 +01:00
Viktor Lofgren
b31c9bb726 (docs) Update process docs 2024-02-28 15:21:33 +01:00
Viktor Lofgren
c0820b5e5c (docs) Update service docs 2024-02-28 15:19:31 +01:00
Viktor Lofgren
65b8a1d5d9 (grpc) Reduce error spam 2024-02-28 14:44:48 +01:00
Viktor Lofgren
a0648844fb (grpc) Reduce error spam 2024-02-28 14:35:29 +01:00
Viktor Lofgren
c4a27003c6 (docs) Fix formatting 2024-02-28 14:22:57 +01:00
Viktor Lofgren
86bbc1043e (service) Clean up thread pool creation 2024-02-28 14:06:32 +01:00
Viktor Lofgren
a8ec59eb75 (conf) Add migration warning when ZOOKEEPER_HOSTS is not set. 2024-02-28 12:09:38 +01:00
Viktor Lofgren
9f1649636e Clean up documentation and rename domain-links to link-graph 2024-02-28 11:40:39 +01:00
Viktor Lofgren
3a65fe8917 Add offload executor to GrpcChannelPoolFactory 2024-02-27 22:08:39 +01:00
Viktor Lofgren
e696fd9e92 (docs) Begin un-fucking the docs after refactoring 2024-02-27 21:22:21 +01:00
Viktor Lofgren
eaf836dc66 (service/grpc) Reduce thread count
Netty and GRPC by default spawns an incredible number of threads on high-core CPUs, which amount to a fair bit of RAM usage.

Add custom executors that throttle this behavior.
2024-02-27 21:22:21 +01:00
Viktor Lofgren
dbf64b0987 (logs) Add the option for json logging 2024-02-27 21:22:20 +01:00
Viktor Lofgren
ff0ef1eebc (cleanup) Minor cleanups 2024-02-24 15:33:56 +01:00
Viktor Lofgren
1d34224416 (refac) Remove src/main from all source code paths.
Look, this will make the git history look funny, but trimming unnecessary depth from the source tree is a very necessary sanity-preserving measure when dealing with a super-modularized codebase like this one.

While it makes the project configuration a bit less conventional, it will save you several clicks every time you jump between modules.  Which you'll do a lot, because it's *modul*ar.  The src/main/java convention makes a lot of sense for a non-modular project though.  This ain't that.
2024-02-23 16:13:40 +01:00
Viktor Lofgren
2201b1a506 (refac) Clean up code issues 2024-02-23 11:39:19 +01:00
Viktor Lofgren
5cdb07023b (refac) Clean up unused imports 2024-02-23 11:27:20 +01:00
Viktor Lofgren
6357d30ea0 Clean up docs 2024-02-22 19:53:20 +01:00
Viktor Lofgren
8d4ef982d0 Clean up docs 2024-02-22 19:37:59 +01:00
Viktor Lofgren
4740156cfa Clean up docs 2024-02-22 18:18:58 +01:00
Viktor Lofgren
085137ca63 * Extract the index functionality 2024-02-22 17:31:25 +01:00
Viktor Lofgren
66c1281301 (zk-registry) epic jak shaving WIP
Cleaning out a lot of old junk from the code, and one thing lead to another...

* Build is improved, now constructing docker images with 'jib'.  Clean build went from 3 minutes to 50 seconds.
* The ProcessService's spawning is smarter.  Will now just spawn a java process instead of relying on the application plugin's generated outputs.
* Project is migrated to GraalVM
* gRPC clients are re-written with a neat fluent/functional style. e.g.
```channelPool.call(grpcStub::method)
              .async(executor) // <-- optional
              .run(argument);
```
This change is primarily to allow handling ManagedChannel errors, but it turned out to be a pretty clean API overall.
* For now the project is all in on zookeeper
* Service discovery is now based on APIs and not services.  Theoretically means we could ship the same code either a monolith or a service mesh.
* To this end, began modularizing a few of the APIs so that they aren't strongly "living" in a service.  WIP!

Missing is documentation and testing, and some more breaking apart of code.
2024-02-22 14:01:23 +01:00
Viktor Lofgren
73947d9eca (zk-registry) Filter out phantom addresses in the registry
The change adds a hostname validation step to remove endpoints from the ZkServiceRegistry when they do not resolve.  This is a scenario that primarily happens when running in docker, and the entire system is started and stopped.
2024-02-20 18:09:11 +01:00
Viktor Lofgren
a69c0b2718 (grpc-client) Fix warmup crash
The warmup would sometimes crash during a cold start-up, because it could not get an API.  Changed the warmup to just create a GrpcSingleNodeChannelPool for the node.
2024-02-20 18:03:57 +01:00
Viktor Lofgren
6c764bceeb (doc) Update documentation for service-discovery 2024-02-20 16:09:49 +01:00
Viktor Lofgren
273aeb7bae (doc) Update documentation with new gRPC service setup 2024-02-20 16:06:05 +01:00
Viktor Lofgren
d185858266 (minor) Add missing query parameter to ServiceEndpoint.toURL 2024-02-20 15:49:43 +01:00
Viktor Lofgren
453bd6064b (minor) Add warm-up to GrpcMultiNodeChannelPool to speed up the initial messages
Without doing this, connections would be created lazily, which is probably never desirable.
2024-02-20 15:45:16 +01:00
Viktor Lofgren
ee8e0497ae (refac) Move service discovery injection to a separate guice module 2024-02-20 15:41:04 +01:00
Viktor Lofgren
30bdb4b4e9 (config) Clean up service configuration for IP addresses
Adds new ways to configure the bind and external IP addresses for a service.  Notably, if the environment variable WMSA_IN_DOCKER is present, the system will grab the HOSTNAME variable and announce that as the external address in the service registry.

The default bind address is also changed to be 0.0.0.0 only if WMSA_IN_DOCKER is present, otherwise 127.0.0.1; as this is a more secure default.
2024-02-20 14:22:48 +01:00
Viktor Lofgren
2ee492fb74 (gRPC) Bind gRPC services to an interface
By default gRPC it magically decides on an interface.  The change will explicitly tell it what to use.
2024-02-20 14:22:47 +01:00
Viktor Lofgren
36a5c8b44c (cleanup) Clean up code 2024-02-20 14:22:47 +01:00
Viktor Lofgren
07b625c58d (query-client) Add support for fault-tolerant requests to single node services
Adding a method importantCall that will retry a failing request on each route until it succeeds or the routes run out.
2024-02-20 14:16:05 +01:00
Viktor Lofgren
746a865106 (client) Fix handling of channel refreshes
The previous code made an incorrect assumption that all routes refer to the same node, and would overwrite the route list on each update.  This lead to storms of closing and opening channels whenever an update was received.

The new code is correctly aware that we may talk to multiple nodes.
2024-02-20 14:14:09 +01:00
Viktor
f85ec28a16
Merge branch 'master' into service-discovery 2024-02-20 11:44:12 +01:00
Viktor Lofgren
0307c55f9f (refac) Zookeeper for service-discovery, kill service-client lib (WIP)
To avoid having to either hard-code or manually configure service addresses (possibly several dozen), and to reduce the project's dependency on docker to deal with routing and discovery, the option to use [Zookeeper](https://zookeeper.apache.org/) to manage services and discovery has been added.

A service registry interface was added, with a Zookeeper implementation and a basic implementation that only works on docker and hard-codes everything.

The last remaining REST service, the assistant-service, has been migrated to gRPC.

This also proved a good time to clear out primordial technical debt from the root of the codebase.  The 'service-client' library has been taken behind the barn and given a last farewell.  It's replaced by a small library for managing gRPC channels.

Since it's no longer used by anything, RxJava has been removed as a dependency from the project.

Although the current state seems reasonably stable, this is a work-in-progress commit.
2024-02-20 11:41:14 +01:00
Viktor
d05c916491
Merge pull request #80 from MarginaliaSearch/ranking-algorithms
Clean up domain ranking code
2024-02-18 09:52:34 +01:00
Viktor Lofgren
e61e7f44b9 (blacklist) Delay startup of blacklist
To help services start faster, the blacklist will no longer block until it's loaded.  If such a behavior is desirable, a method was added to explicitly wait for the data.
2024-02-18 09:23:20 +01:00
Viktor Lofgren
f9b6ac03c6 (api) Clean up incorrect error handling in GrpcChannelPool 2024-02-18 08:45:35 +01:00
Viktor Lofgren
296ccc5f8e (blacklist) Clean up blacklist impl
The domain blacklist blocked the start-up of each process that injected it, adding like 30 seconds to the start-up time in prod.

This change moves the loading to a separate thread entirely.  For threads or processes that require the blacklist to be definitely loaded, a helper method was added that blocks until that time.
2024-02-18 08:16:48 +01:00
Viktor Lofgren
92717a4832 (client) Refactor GrpcStubPool to handle error states
Refactored the GRPC Stub Pool for better handling of channel SHUTDOWN state. Any disconnected channels are now re-created before returning the stub.

The class was also renamed to GrpcChannelPool, as we no longer pool the stubs.
2024-02-17 14:42:26 +01:00
Viktor Lofgren
9ec262ae00 (domain-ranking) Integrate new ranking logic
The change deprecates the 'algorithm' field from the domain ranking set configuration.  Instead, the algorithm will be chosen based on whether influence domains are provided, and whether similarity data is present.
2024-02-16 20:22:01 +01:00
Viktor Lofgren
b15f47d80e (db) Retire the EC_DOMAIN_LINK table
Retire the EC_DOMAIN_LINK table as the data has been migrated off into a file instead.
2024-02-08 15:52:30 +01:00