Viktor Lofgren
199c459697
(*) Add node-affinity to services, processes and file storage.
2023-10-10 12:32:22 +02:00
Viktor Lofgren
61288c5e68
(service, client) First steps towards multiple nodedness
2023-10-09 22:13:27 +02:00
Viktor Lofgren
6319b8ef51
(api-service) Improved testability, always set content type to application/json
2023-10-09 15:39:34 +02:00
Viktor Lofgren
397a85eaa4
(query-service) Apply blacklisting to search results
2023-10-09 15:18:53 +02:00
Viktor Lofgren
3889c4bdd9
(refactor) Remove features-search and update documentation
2023-10-09 15:12:30 +02:00
Viktor Lofgren
c899f1cb85
(docs) Update documentation to reflect new query service
2023-10-09 14:56:59 +02:00
Viktor Lofgren
d8956c51d0
(refactor) Remove api:search-api
Application services should not have an API, but purely act as clients
to the core services (which should always have an API).
2023-10-09 14:42:33 +02:00
Viktor Lofgren
c0e61d4c87
(refactor) Move search service into services-satellite
2023-10-09 13:40:01 +02:00
Viktor Lofgren
97e17282ab
(query-service) Move query parsing from search-service to the new query service.
2023-10-09 13:27:44 +02:00
Viktor Lofgren
94c882af7d
(query-service) Provide delegate of IndexApi's query functionality.
This is an intermediate step in the process of introducing the query-service as a proxy between search and index.
2023-10-08 22:22:26 +02:00
Viktor Lofgren
89c6d85f2f
(query-service) Create new empty 'query-service' service
2023-10-08 17:31:50 +02:00
Viktor Lofgren
cf366c602f
(search) Refactor SearchQueryIndexService in preparation for feature extraction.
Prefer working on DecoratedSearchResultItem over UrlDetails.
2023-10-08 17:15:41 +02:00
Viktor Lofgren
77ccab7d80
(index) Move linkdb to index from search.
This makes index complete in the sense that you can deploy an index instance and build a completely separate application on top of it, without having to go through the Marginalia-laden search service.
2023-10-08 16:48:35 +02:00
Viktor Lofgren
f51ba63742
(search) Remove dead file
2023-10-07 21:05:06 +02:00
Viktor Lofgren
9044518be5
(search) Fix broken link to git repo
2023-10-07 19:43:22 +02:00
Viktor Lofgren
9e0367eef4
(search) Filter blacklisted items in API query service as well
2023-10-07 16:16:04 +02:00
Viktor Lofgren
235bb6c1b9
(control) Administrative QOL improvement, GUI for banning spam
2023-10-07 15:45:50 +02:00
Viktor Lofgren
49344d7ea8
(control) Administrative QOL improvement, GUI for banning spam
2023-10-07 15:43:18 +02:00
Viktor Lofgren
1b418d77ff
(search) We got some new IP ranges to work with for the crawler
2023-10-07 13:41:55 +02:00
Viktor Lofgren
80cc302627
(search) We can't claim to be on PC hardware anymore...
2023-10-07 11:49:29 +02:00
Viktor
8e1abc3f10
(index-reverse) Parallel construction of the reverse indexes. (#52)
* (index-reverse) Parallel construction of the reverse indexes.
* (array) Remove wasteful calculation of numDistinct before merging two sorted arrays.
* (index-reverse) Force changes to disk on close, reduce logging.
* (index-reverse) Clean up merging process and add back logging
* (run) Add a conservative default for INDEX_CONSTRUCTION_PROCESS_OPTS's parallelism as it eats a lot of RAM
* (index-reverse) Better logging during processing
* (array) 2GB+ compatible write() function
* (index-reverse) We are logging like Bolsonaro and I will not have it.
* (reverse-index) Self-diagnostics
* (btree) Fix bug in btree reader to do with large data sizes
2023-10-07 10:00:00 +02:00
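The "(array)" item above mentions dropping a numDistinct calculation before merging two sorted arrays. The commit itself is not shown here, so the following is only a minimal sketch of the general idea, not the project's code: merge into a buffer sized for the worst case, skip duplicates during the merge, and trim at the end, so no separate counting pass is needed. The class and method names (SortedMerge, mergeDistinct) are hypothetical.

```java
import java.util.Arrays;

public class SortedMerge {
    /**
     * Merges two ascending-sorted arrays into one ascending array of
     * distinct values. Hypothetical helper, not taken from the repository.
     */
    public static long[] mergeDistinct(long[] a, long[] b) {
        // Worst case: no value is shared, so a.length + b.length slots suffice.
        long[] out = new long[a.length + b.length];
        int i = 0, j = 0, n = 0;

        while (i < a.length || j < b.length) {
            long next;
            if (j >= b.length || (i < a.length && a[i] <= b[j])) {
                next = a[i++];
            } else {
                next = b[j++];
            }
            // Duplicates are dropped here instead of being counted up front.
            if (n == 0 || out[n - 1] != next) {
                out[n++] = next;
            }
        }
        // Trim the unused tail once the real size is known.
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        long[] merged = mergeDistinct(new long[]{1, 3, 5}, new long[]{3, 4});
        System.out.println(Arrays.toString(merged));
    }
}
```

The trade-off in this sketch is a temporarily oversized buffer in exchange for a single pass over the inputs.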
Viktor Lofgren
c51159672e
(build) Move unit test configuration to root build.gradle
2023-10-04 12:46:22 +02:00
Viktor Lofgren
405300b4b2
(control) Fix bug where finishing one process ad hoc task would remove all other tasks from the db
2023-10-04 11:44:31 +02:00
Viktor Lofgren
40768e935b
(test) Removing /tmp-guardrails as it doesn't hold in CI
2023-10-02 16:52:59 +02:00
Viktor Lofgren
d160954080
(index) Two useful debug endpoints
2023-09-24 19:39:48 +02:00
Viktor Lofgren
14372e0ef0
(index) Slightly reduce alloc churn
2023-09-24 19:36:14 +02:00
Viktor Lofgren
03bffa27ac
(search) Add combined id to the search result HTML
2023-09-24 19:34:35 +02:00
Viktor Lofgren
028b5a4f0d
(minor performance) Reduce GC churn in index
2023-09-24 12:12:08 +02:00
Viktor Lofgren
1bd146fb8e
(minor) Remove dead code
2023-09-24 10:55:20 +02:00
Viktor Lofgren
5f6c3da7a4
(index) Add close methods on the index readers so they clean up their mmaps
2023-09-24 10:54:23 +02:00
Viktor Lofgren
dbe9235f3a
(*) Upgrade to JDK21 with preview enabled.
... also move some common configuration into the root build.gradle-file.
Support for JDK21 in Lombok is a bit sketchy at the moment, but it seems to work. This upgrade is kind of important as the new index construction really benefits from Arena-based lifecycle control over off-heap memory.
2023-09-24 10:38:59 +02:00
Viktor Lofgren
d78569986b
(backups) Fix bug where backup service would zero the linkdb when restoring.
2023-09-22 18:34:34 +02:00
Viktor Lofgren
95323e6caa
(backups) Support restoring multi-source load data
2023-09-22 18:34:17 +02:00
Viktor Lofgren
f809d22fc6
(loader) Support simultaneous loading of multiple processed data sets
2023-09-22 13:14:58 +02:00
Viktor Lofgren
70aa04c047
(converter, stackexchange-xml) Add the ability to sideload stackexchange data
2023-09-21 12:48:33 +02:00
Viktor Lofgren
f8050816ac
(search) Skip LSH deduplication for details with a zero LSH, so that calculating this hash can be optional.
2023-09-21 12:47:02 +02:00
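As an illustration of the behaviour described in the commit above, here is a minimal sketch of near-duplicate filtering that leaves zero-hash items alone. It is not the project's implementation: the Result record, the deduplicate method, and the Hamming-distance threshold are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class LshDedup {
    /** Hypothetical search result carrying a 64-bit locality-sensitive hash. */
    public record Result(String url, long lsh) {}

    /**
     * Keeps the first result of each near-duplicate group. A zero hash is
     * treated as "not calculated", so such results are never dropped and
     * never cause other results to be dropped.
     */
    public static List<Result> deduplicate(List<Result> results, int maxDistance) {
        List<Result> kept = new ArrayList<>();
        for (Result r : results) {
            if (r.lsh() == 0 || !hasNearDuplicate(kept, r, maxDistance)) {
                kept.add(r);
            }
        }
        return kept;
    }

    private static boolean hasNearDuplicate(List<Result> kept, Result candidate, int maxDistance) {
        for (Result k : kept) {
            if (k.lsh() == 0) {
                continue; // no hash to compare against
            }
            // Hamming distance between the two hashes.
            if (Long.bitCount(k.lsh() ^ candidate.lsh()) <= maxDistance) {
                return true;
            }
        }
        return false;
    }
}
```

Without the zero check, every result lacking a hash would look identical to every other such result, and all but the first would be discarded.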
Viktor Lofgren
9b385ec7cc
(converter) Make it possible to sideload documents from a directory tree
2023-09-17 14:35:06 +02:00
Viktor Lofgren
5c040f7a46
(crawl-spec) Parquetify crawl spec
* Crawl-specs are now parquet files
* Deprecate the crawl-job-extractor tool
2023-09-17 09:41:34 +02:00
Viktor Lofgren
5e5aaf9a7e
(converter, control) Re-enable sideloading encyclopedia data
2023-09-14 12:12:07 +02:00
Viktor Lofgren
07d7507ac6
(control-service) Move Actions up in storage-details
Papercut fix. If a file storage area has a lot of files, you have to scroll down a long way to get to the actions otherwise.
2023-09-02 15:41:55 +02:00
Viktor Lofgren
9e185e80ce
(control-service) Add timestamp to file storages.
2023-09-02 14:01:04 +02:00
Viktor Lofgren
d31d8ec5b0
(index) Log keyword ids in hex format
2023-09-01 15:40:24 +02:00
Viktor Lofgren
2b00cd632d
(process) Propagate environment JVM params to the index constructor
2023-09-01 15:39:42 +02:00
Viktor Lofgren
764e7d1315
(index) Add more comprehensive integration tests for the index service.
2023-08-30 10:37:24 +02:00
Viktor Lofgren
e4d7958379
(control) ProcessLivenessMonitorActor shouldn't reap tasks based on service instance liveness
2023-08-29 18:19:04 +02:00
Viktor Lofgren
3f288e264b
(minor) Clean up dead endpoints
2023-08-29 17:04:54 +02:00
Viktor Lofgren
dd593c292c
(loader) Minor optimizations and bugfixes.
* Reduce memory churn in LoaderIndexJournalWriter, fix bug with keyword mappings as well
* Remove remains of OldDomains
* Ensure LOADER_PROCESS_OPTS gets fed to the processes
* LinkdbStatusWriter no longer executes the batch after every added item once 100 items have been added
2023-08-29 15:37:52 +02:00
Viktor Lofgren
39c1857c61
(heartbeat, reverse-index) Better heartbeat mocking, improved heartbeats for reverse index construction.
2023-08-29 13:07:55 +02:00
Viktor Lofgren
c57a2d0dc3
(control-service) Remove old index journal files when restoring a backup.
2023-08-29 11:58:01 +02:00
Viktor Lofgren
6525b16e1f
(minor) Improved logging and error messages
2023-08-28 19:53:55 +02:00