MarginaliaSearch

mirror of https://github.com/MarginaliaSearch/MarginaliaSearch.git synced 2025-02-24 05:18:58 +00:00

Author	SHA1	Message	Date
Viktor Lofgren	cf7f84f033	(rank) Reduce the impact of domain rank bonus, and only apply it to cancel out negative penalties, never to increase the ranking	2024-12-10 22:04:12 +01:00
Viktor Lofgren	c97c66a41c	(ranking) Reduce the verbatim score multiplier	2024-11-28 13:37:11 +01:00
Viktor Lofgren	e5db3f11e1	(chore) Clean up some of the uglier delomboking artifacts	2024-11-15 13:57:20 +01:00
Viktor Lofgren	9f47ce8d15	(chore) Remove lombok. There are likely some instances of delombok gore with this commit.	2024-11-11 21:14:38 +01:00
Viktor Lofgren	2ee58f4bc9	(index) Adjust ranking parameters to dial down the importance of tcfProximity and firstPosition	2024-09-29 15:33:12 +02:00
Viktor Lofgren	4a0356e26f	(search-service) Add pagination support to the search GUI	2024-09-25 14:26:49 +02:00
Viktor Lofgren	73f973cc06	(search-query) Add pagination to search query API and the direct query-service interface	2024-09-25 14:20:59 +02:00
Viktor Lofgren	28e7c8e5e0	Increase temporal bias weight to give the recent results filter a bit more recency	2024-09-17 18:11:40 +02:00
Viktor Lofgren	50ec922c2b	(index) Fix broken index tests. Also cleaned up the tests to be less fragile to ranking algorithm changes.	2024-09-10 10:23:46 +02:00
Viktor Lofgren	bb5d946c26	(index, EXPERIMENTAL) Clean up ranking code	2024-08-29 11:34:23 +02:00
Viktor Lofgren	4fbcc02f96	(index) Adjust sensible defaults for ranking parameters	2024-08-25 11:24:16 +02:00
Viktor Lofgren	9aa8f13731	(index) Remove tcfAvgDist ranking parameter. This is captured by tcfProximity already.	2024-08-25 11:20:19 +02:00
Viktor Lofgren	0999f07320	(search-query) Add new ranking parameters for proximity and verbatim matches	2024-08-25 10:34:12 +02:00
Viktor Lofgren	5d2b455572	(search) Clean up inconsistent usage of MathClient in SearchOperator. Also clean up SearchOperator and adjacent code.	2024-08-24 10:39:31 +02:00
Viktor Lofgren	9eb1f120fc	(index) Repair positions bitmask for search result presentation	2024-08-22 11:28:23 +02:00
Viktor Lofgren	03d5dec24c	(*) Refactor termCoherences and rename them to phrase constraints.	2024-08-15 11:02:19 +02:00
Viktor Lofgren	016a4c62e1	(index) Bugs and error fixes, chasing and fixing mystery results that did not contain all relevant keywords	2024-08-10 09:51:03 +02:00
Viktor Lofgren	2e89b55593	(wip) Repair qdebug utility and show new ranking details	2024-08-09 12:57:25 +02:00
Viktor Lofgren	7babdb87d5	(index) Remove intermediate models	2024-08-07 10:10:44 +02:00
Viktor Lofgren	8462e88b8f	(index) Add min-dist factor and adjust rankings	2024-08-03 13:07:00 +02:00
Viktor Lofgren	b316b55be9	(index) Experimental initial integration of document spans into index	2024-07-30 12:01:53 +02:00
Viktor Lofgren	aebb2652e8	(wip) Extract and encode spans data. Refactoring keyword extraction to extract spans information. Modifying the intermediate storage of converted data to use the new slop library, which allows for easier storage of ad-hoc binary data like spans and positions. This is a bit of a katamari damacy commit that ended up dragging along a bunch of other fairly tangentially related changes that are hard to break out into separate commits after the fact. Will push as-is to get back to being able to do more isolated work.	2024-07-27 11:44:13 +02:00
Viktor Lofgren	dfd19b5eb9	(index) Reduce the number of abstractions around result ranking. The change also restructures the internal API a bit, moving resultsFromDomain from RpcRawResultItem into RpcDecoratedResultItem, as the previous order was driving complexity in the code that generates these objects, and the consumer side of things puts all this data in the same object regardless.	2024-07-16 08:18:54 +02:00
Viktor Lofgren	ad3857938d	(search-api, ranking) Update with new ranking parameters. Adding new ranking parameters to the API and routing them through the system, in order to permit integration of the new position data with the ranking algorithm. The change also cleans out several parameters that no longer filled any function.	2024-07-15 04:49:40 +02:00
Viktor Lofgren	6973712480	(query) Tidy up code	2024-06-26 13:40:06 +02:00
Viktor Lofgren	95b9af92a0	(index) Implement working optional TermCoherences	2024-06-26 12:22:06 +02:00
Viktor Lofgren	9d00243d7f	(index) Partial re-implementation of position constraints	2024-06-24 15:55:54 +02:00
Viktor Lofgren	36160988e2	(index) Integrate positions data with indexes WIP. This change integrates the new positions data with the forward and reverse indexes. The ranking code is still only partially re-written.	2024-06-10 15:09:06 +02:00
Viktor Lofgren	89aae93e60	(*) Lift jetty and guava-dependencies	2024-05-23 14:20:01 +02:00
Viktor Lofgren	4668b1ddcb	(build) Java 22 and its consequences has been a disaster for Marginalia Search. Roll back to JDK 21 for now, and make Java version configurable in the root build.gradle. The project has run into no less than three distinct show-stopping bugs in JDK22, across multiple vendors, and gradle still doesn't fully support it, meaning you need multiple JDK versions installed.	2024-04-24 13:54:04 +02:00
Viktor Lofgren	ed250f57f2	(ranking) Set regularMask correctly	2024-04-19 14:31:57 +02:00
Viktor Lofgren	e92c25f7e0	(ranking) Cleanup	2024-04-19 14:13:12 +02:00
Viktor Lofgren	41782a0ab5	(index) Fix TCF bug where the ngram terms would be considered instead of the regular ones due to a logical derp	2024-04-19 12:19:26 +02:00
Viktor Lofgren	9b06433b82	(qs) Additional info in query debug UI	2024-04-19 12:18:53 +02:00
Viktor Lofgren	def607d840	(qs) Additional info in query debug UI	2024-04-19 11:46:27 +02:00
Viktor Lofgren	2b811fb422	(qs) Basic query debug feature	2024-04-19 11:00:56 +02:00
Viktor Lofgren	36cc62c10c	(proto) Improve handling of omitted parameters	2024-04-18 10:47:12 +02:00
Viktor Lofgren	7641a02f31	(query) Update ranking parameters with new variables for bm25 ngrams and tcf mutual jaccard. The change also makes it so that as long as the values are defaults, they don't need to be sent over the wire and decoded.	2024-04-18 10:36:15 +02:00
Viktor Lofgren	f52457213e	(index) Split ngram and regular keyword bm25 calculation and add ngram score as a bonus	2024-04-17 14:05:02 +02:00
Viktor Lofgren	599e719ad4	(index) Fix priority search terms. This functionality fell into disrepair some while ago. It's supposed to allow non-mandatory search terms that boost the ranking if they are present in the document.	2024-04-15 16:44:08 +02:00
Viktor Lofgren	b6d365bacd	(index) Clean up data model. The change set cleans up the data model for the term-level data. This used to contain a bunch of fields with document-level metadata. This data-duplication means a larger memory footprint and worse memory locality. The ranking code is also modified to not accept SearchResultKeywordScores, but rather CompiledQueryLong and CqDataInts containing only the term metadata and the frequency information needed for ranking. This is again an effort to improve memory locality.	2024-04-15 16:04:07 +02:00
Viktor Lofgren	ed73d79ec1	(qs) Clean up parsing code using new record matching	2024-04-11 17:36:08 +02:00
Viktor Lofgren	fcdc843c15	(search) Fix outdated assumptions about the results. We no longer break the query into "sets" of search terms and need to adapt the code to not use this assumption. For the API service, we'll simulate the old behavior to keep the API stable. For the search service, we'll introduce a new way of calculating positions through tree aggregation.	2024-04-07 12:09:44 +02:00
Viktor Lofgren	ae7c760772	(index) Clean up new index query code	2024-04-05 13:30:49 +02:00
Viktor Lofgren	81815f3e0a	(qs, index) New query model integrated with index service. Seems to work, tests are green and initial testing finds no errors. Still a bit untested, committing WIP as-is because it would suck to lose weeks of work due to a drive failure or something.	2024-04-04 20:17:58 +02:00
Viktor Lofgren	002afca1c5	(sys) Upgrade to JDK22. This also entails upgrading JIB to 3.4.1 and Lombok to 1.18.32.	2024-03-21 14:33:27 +01:00
Viktor Lofgren	46423612e3	(refac) Merge service-discovery and service modules. Also adds a few tests to the server/client code.	2024-03-03 10:49:23 +01:00
Viktor Lofgren	427f3e922f	(index) Retire count operation, clean up index code.	2024-02-27 21:22:17 +01:00
Viktor Lofgren	9429bf5c45	(index) Clean up	2024-02-27 21:22:17 +01:00
Viktor Lofgren	fc00701a1e	(index) Experimental refactoring of the indexing functionality	2024-02-25 11:05:10 +01:00


54 Commits