Viktor Lofgren
81815f3e0a
(qs, index) New query model integrated with index service.
...
Seems to work, tests are green and initial testing finds no errors. Still a bit untested, committing WIP as-is because it would suck to lose weeks of work due to a drive failure or something.
2024-04-04 20:17:58 +02:00
Viktor Lofgren
87bb93e1d4
(qs, WIP) Fix edge cases in query compilation
...
This addresses the relatively common case where the graph consists of two segments, such as x y, z w; in this case we want an output like (x_y) (z w | z_w) | x y (z_w). The generated output does somewhat pessimize a few other cases, but this one is arguably more important.
2024-03-29 12:40:27 +01:00
Viktor Lofgren
e596c929ac
(qs, WIP) Clean up dead code
2024-03-28 16:37:23 +01:00
Viktor Lofgren
9852b0e609
(qs, WIP) Tidy it up a bit
2024-03-28 14:18:26 +01:00
Viktor Lofgren
51b0d6c0d3
(qs, WIP) Tidy it up a bit
2024-03-28 14:09:17 +01:00
Viktor Lofgren
15391c7a88
(qs, WIP) Tidy it up a bit
2024-03-28 13:54:30 +01:00
Viktor Lofgren
fe62593286
(qs, WIP) Break up code and tidy it up a bit
2024-03-28 13:26:54 +01:00
Viktor Lofgren
4cc11e183c
(qs, WIP) Fix output determinism, fix tests
2024-03-28 13:11:26 +01:00
Viktor Lofgren
f82ebd7716
(WIP) Query rendering finally beginning to look like it works
2024-03-28 13:01:21 +01:00
Viktor Lofgren
bd0704d5a4
(*) Fix JDK22 migration issues
...
A few bizarre build errors cropped up when migrating to JDK22. Not at all sure what caused them, but they were easy to mitigate.
2024-03-21 14:33:27 +01:00
Viktor Lofgren
002afca1c5
(sys) Upgrade to JDK22
...
This also entails upgrading JIB to 3.4.1 and Lombok to 1.18.32.
2024-03-21 14:33:27 +01:00
Viktor Lofgren
a4b810f511
WIP
2024-03-21 14:33:26 +01:00
Viktor Lofgren
0bd3365c24
(convert) Initial integration of segmentation data into the converter's keyword extraction logic
2024-03-19 14:28:42 +01:00
Viktor Lofgren
d8f4e7d72b
(qs) Retire NGramBloomFilter, integrate new segmentation model instead
2024-03-19 10:42:09 +01:00
Viktor Lofgren
afc047cd27
(control) GUI for exporting segmentation data from a wikipedia zim
2024-03-18 13:45:23 +01:00
Viktor Lofgren
00ef4f9803
(WIP) Partial integration of new query expansion code into the query-service
2024-03-18 13:16:49 +01:00
Viktor Lofgren
07e4d7ec6d
(WIP) Improve data extraction from wikipedia data
2024-03-18 13:16:00 +01:00
Viktor Lofgren
8ae1f08095
(WIP) Implement first take of new query segmentation algorithm
2024-03-12 13:12:50 +01:00
Viktor Lofgren
57e6a12d08
(registry) Correct registerMonitor() behavior
...
The previous behavior listened to too many changes. Based on assumptions that hold for ZooKeeper but not for Curator, it also added an additional monitor on each invocation of each monitor. Since monitors always trigger on service state changes, every monitor re-registered itself, effectively doubling the number of monitors whenever a service stopped or started. This in turn caused a lot of bizarre thrashing, even on changes in services that don't explicitly talk to each other.
This re-registering is no longer done.
2024-03-06 12:22:15 +01:00
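The doubling described in 57e6a12d08 can be illustrated with a toy model. This is only a sketch: `MonitorRegistry`, `fireStateChange()` and `simulate()` are hypothetical names invented for the illustration, and no real ZooKeeper or Curator API is used. It assumes listeners stay registered after firing, so a callback that re-registers itself on every invocation adds a duplicate each time.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a registry whose listeners persist after they fire.
final class MonitorRegistry {
    private final List<Runnable> monitors = new ArrayList<>();

    void registerMonitor(Runnable monitor) {
        monitors.add(monitor);
    }

    // Every registered monitor triggers on every service state change.
    void fireStateChange() {
        // Iterate over a snapshot so callbacks may register during dispatch.
        for (Runnable monitor : new ArrayList<>(monitors)) {
            monitor.run();
        }
    }

    int monitorCount() {
        return monitors.size();
    }

    // Returns the number of monitors after the given number of state changes.
    static int simulate(boolean reRegisterOnInvoke, int stateChanges) {
        MonitorRegistry registry = new MonitorRegistry();
        Runnable[] self = new Runnable[1];
        self[0] = () -> {
            // Buggy variant: treat the listener as one-shot and re-register.
            if (reRegisterOnInvoke) {
                registry.registerMonitor(self[0]);
            }
        };
        registry.registerMonitor(self[0]);
        for (int i = 0; i < stateChanges; i++) {
            registry.fireStateChange();
        }
        return registry.monitorCount();
    }
}
```

With re-registration enabled the count doubles on every state change, while the corrected variant stays at a single monitor.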
Viktor Lofgren
46423612e3
(refac) Merge service-discovery and service modules
...
Also adds a few tests to the server/client code.
2024-03-03 10:49:23 +01:00
Viktor Lofgren
29bf473d74
(encyclopedia) Add URL encoding to path element
...
This prevents corruption of the links to the sideloaded encyclopedia data when the article path contains characters that are not valid in a URL.
2024-03-01 17:28:09 +01:00
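The kind of fix described in 29bf473d74 might look roughly like the sketch below. The class name, method names and base URL are assumptions made for illustration, not taken from the codebase. It relies on the standard `URLEncoder`, which produces form encoding, so the `+` it emits for spaces is rewritten to `%20` to suit a path element.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for building links to sideloaded encyclopedia articles.
final class EncyclopediaLinks {

    // Percent-encodes a single path element such as an article name.
    static String encodePathElement(String element) {
        // URLEncoder encodes a space as '+', which is only correct in form
        // data. A literal '+' in the input is already encoded as %2B, so
        // replacing the remaining '+' characters with %20 is safe.
        return URLEncoder.encode(element, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Joins a base URL and an article name into a link.
    static String articleUrl(String baseUrl, String articleName) {
        return baseUrl + "/" + encodePathElement(articleName);
    }
}
```

Encoding the element as a whole also escapes `/`, so an article name containing a slash stays a single path element.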
Viktor Lofgren
9689f3faee
(domain-info) Fix incorrect array indexing
2024-02-29 18:56:09 +01:00
Viktor Lofgren
93fa58c93d
(domain-info) Fix incorrect array indexing
...
Using the id instead of idx when addressing the ranksArray caused exceptions.
2024-02-29 17:54:23 +01:00
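The id/idx mix-up fixed in 93fa58c93d can be sketched as follows. Only the name `ranksArray` comes from the commit message; the class, the lookup map and the fallback value are assumptions for illustration. The point is that a sparse domain id is not a valid position in a densely packed array and has to be translated to an index first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for per-domain ranks stored in a dense array.
final class DomainRanks {
    private final Map<Integer, Integer> idToIdx = new HashMap<>();
    private final double[] ranksArray;

    DomainRanks(int[] domainIds, double[] ranks) {
        if (domainIds.length != ranks.length) {
            throw new IllegalArgumentException("ids and ranks must have equal length");
        }
        ranksArray = ranks.clone();
        for (int idx = 0; idx < domainIds.length; idx++) {
            idToIdx.put(domainIds[idx], idx);
        }
    }

    // Returns the rank for a domain id, or 0.0 for an unknown domain.
    double rankOf(int domainId) {
        Integer idx = idToIdx.get(domainId);
        if (idx == null) {
            return 0.0;
        }
        // Address the array by idx. Using ranksArray[domainId] would read the
        // wrong slot or throw ArrayIndexOutOfBoundsException for large ids.
        return ranksArray[idx];
    }
}
```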
Viktor Lofgren
186a98cc99
(doc) Fix wonky bullet lists
2024-02-28 17:43:05 +01:00
Viktor Lofgren
9993f265ca
(doc) Remove irrelevant text
2024-02-28 17:40:05 +01:00
Viktor Lofgren
144f967dbf
(misc) Tweak pool sizes
2024-02-28 16:23:02 +01:00
Viktor Lofgren
b31c9bb726
(docs) Update process docs
2024-02-28 15:21:33 +01:00
Viktor Lofgren
c0820b5e5c
(docs) Update service docs
2024-02-28 15:19:31 +01:00
Viktor Lofgren
65b8a1d5d9
(grpc) Reduce error spam
2024-02-28 14:44:48 +01:00
Viktor Lofgren
a0648844fb
(grpc) Reduce error spam
2024-02-28 14:35:29 +01:00
Viktor Lofgren
c4a27003c6
(docs) Fix formatting
2024-02-28 14:22:57 +01:00
Viktor Lofgren
41abd8982f
(math) Clean up error handling
2024-02-28 14:19:50 +01:00
Viktor Lofgren
86bbc1043e
(service) Clean up thread pool creation
2024-02-28 14:06:32 +01:00
Viktor Lofgren
9a045a0588
(index) Clean up index code
2024-02-28 13:09:47 +01:00
Viktor Lofgren
9415539b38
(docs) Update docs
2024-02-28 12:25:19 +01:00
Viktor Lofgren
84bab2783d
(docs) Fix fake news in docs
2024-02-28 12:16:45 +01:00
Viktor Lofgren
d78e9e715f
(misc) Fix broken tests
2024-02-28 12:12:43 +01:00
Viktor Lofgren
a8ec59eb75
(conf) Add migration warning when ZOOKEEPER_HOSTS is not set.
2024-02-28 12:09:38 +01:00
Viktor Lofgren
20fc0ef13c
(gradle) Add task alias 'docker' for 'jibDockerBuild'
...
The change also moves the jib boilerplate to an include.
2024-02-28 11:59:15 +01:00
Viktor Lofgren
9f1649636e
Clean up documentation and rename domain-links to link-graph
2024-02-28 11:40:39 +01:00
Viktor Lofgren
3a65fe8917
Add offload executor to GrpcChannelPoolFactory
2024-02-27 22:08:39 +01:00
Viktor Lofgren
99a6e56e99
(index-client) Increase thread count in index client
...
The thread count should be a fair bit larger than the number of index nodes.
2024-02-27 22:00:29 +01:00
Viktor Lofgren
e696fd9e92
(docs) Begin un-fucking the docs after refactoring
2024-02-27 21:22:21 +01:00
Viktor Lofgren
c943954bb4
(domain-info) Reduce memory usage
2024-02-27 21:22:21 +01:00
Viktor Lofgren
eaf836dc66
(service/grpc) Reduce thread count
...
Netty and gRPC by default spawn an incredible number of threads on high-core-count CPUs, which amounts to a fair bit of RAM usage.
Add custom executors that throttle this behavior.
2024-02-27 21:22:21 +01:00
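A throttled executor of the kind eaf836dc66 describes could be sketched like this. `BoundedExecutors`, the cap parameter and the 30-second idle timeout are assumptions chosen for illustration, not values from the codebase. The idea is to cap the pool at a fixed size instead of letting it scale with the core count.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical factory for executors that do not scale with core count.
final class BoundedExecutors {

    // Uses the core count up to a fixed cap, and never fewer than one thread.
    static int cappedPoolSize(int availableCores, int maxThreads) {
        return Math.max(1, Math.min(availableCores, maxThreads));
    }

    static ThreadPoolExecutor newBoundedPool(int maxThreads) {
        int size = cappedPoolSize(Runtime.getRuntime().availableProcessors(), maxThreads);
        // Excess work waits in the unbounded queue instead of spawning threads.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                size, size, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        // Let idle threads exit so an unused pool holds no thread stacks.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

An executor like this could then be handed to the gRPC channel or server builder through its `executor(...)` setter, in place of the default.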
Viktor Lofgren
dbf64b0987
(logs) Add the option for json logging
2024-02-27 21:22:20 +01:00
Viktor Lofgren
8d0af9548b
(search) Bot mitigation
...
Add the ability to indicate to the search service that a request is malicious, and to poison the results by providing randomly reordered old results instead.
2024-02-27 21:22:19 +01:00
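The result poisoning described in 8d0af9548b can be sketched as below. `ResultPoisoner`, the `flaggedMalicious` parameter and the way stale results are supplied are all assumptions for illustration. The commit message does not say how requests are flagged or where the old results come from.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical selector that serves shuffled stale results to flagged requests.
final class ResultPoisoner {

    static <T> List<T> selectResults(boolean flaggedMalicious,
                                     List<T> freshResults,
                                     List<T> staleResults,
                                     Random rng) {
        if (!flaggedMalicious) {
            return freshResults;
        }
        // Copy before shuffling so the stored stale results keep their order.
        List<T> poisoned = new ArrayList<>(staleResults);
        Collections.shuffle(poisoned, rng);
        return poisoned;
    }
}
```

Passing the `Random` in makes the shuffle reproducible in tests, while a flagged client still receives a plausible-looking but useless result list.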
Viktor Lofgren
67aa20ea2c
(array) Attempting to debug strange errors
2024-02-27 21:22:18 +01:00
Viktor Lofgren
5604e9f531
(query) Bump query length, see what happens :P
2024-02-27 21:22:17 +01:00
Viktor Lofgren
1a51ec2d69
(index) Index optimization
2024-02-27 21:22:17 +01:00