Viktor Lofgren
|
185b79f2a5
|
(converter) Fix bug where sideloaded reddit content was erroneously categorized as wiki-generated.
|
2024-09-01 11:30:25 +02:00 |
|
Viktor Lofgren
|
8d0f9652c7
|
(crawler) Correct RSS-sitemap behavior
|
2024-08-31 11:38:34 +02:00 |
|
Viktor Lofgren
|
5353805cc6
|
(crawler) Correct RSS-sitemap behavior
|
2024-08-31 11:37:09 +02:00 |
|
Viktor Lofgren
|
5407da5650
|
(crawler) Grab favicons as part of root sniff
|
2024-08-31 11:32:56 +02:00 |
|
Viktor Lofgren
|
b1bfe6f76e
|
(control) New view for domains
Add capability to assign domains, and bulk-add new domains.
|
2024-08-30 17:06:48 +02:00 |
|
Viktor Lofgren
|
74e25370ca
|
(control) New view for domains
Still a work in progress, but at this point it can be used for viewing domains.
|
2024-08-29 15:40:40 +02:00 |
|
Viktor Lofgren
|
bb5d946c26
|
(index, EXPERIMENTAL) Clean up ranking code
|
2024-08-29 11:34:23 +02:00 |
|
Viktor Lofgren
|
abab5bdc8a
|
(index, EXPERIMENTAL) Evaluate using Varint instead of GCS for position data
|
2024-08-26 14:20:39 +02:00 |
|
Viktor Lofgren
|
30bf845c81
|
(index) Speed up minDist calculations by excluding large lists
|
2024-08-26 13:04:15 +02:00 |
|
Viktor Lofgren
|
77efce0673
|
(paper-doll) Fix compilation
|
2024-08-26 12:51:29 +02:00 |
|
Viktor Lofgren
|
67a98fb0b0
|
(coded-sequence) Handle weird legacy HTML that puts everything in a heading
|
2024-08-26 12:49:15 +02:00 |
|
Viktor Lofgren
|
7d471ec30d
|
(coded-sequence) Evaluate new minDist implementation
|
2024-08-26 12:45:11 +02:00 |
|
Viktor Lofgren
|
f3182a9264
|
(coded-sequence) Evaluate new minDist implementation
|
2024-08-26 12:02:37 +02:00 |
|
Viktor Lofgren
|
805cb5ad58
|
(coded-sequence) Correct behavior of findIntersections
|
2024-08-25 14:54:17 +02:00 |
|
Viktor Lofgren
|
fdf05cedae
|
(index) Optimize DocumentSpan.countIntersections
|
2024-08-25 14:12:30 +02:00 |
|
Viktor Lofgren
|
9c5f463775
|
(index) Optimize DocumentSpan.countIntersections
|
2024-08-25 13:59:11 +02:00 |
|
Viktor Lofgren
|
893fae6d59
|
(index) Optimize DocumentSpan.countIntersections
|
2024-08-25 13:51:43 +02:00 |
|
Viktor Lofgren
|
5660f291af
|
(index) Optimize DocumentSpan.countIntersections
|
2024-08-25 13:43:29 +02:00 |
|
Viktor Lofgren
|
efd56efc63
|
(index) Optimize SequenceOperations.minDistance
|
2024-08-25 13:28:06 +02:00 |
|
Viktor Lofgren
|
d94373f4b1
|
(index) Optimize calculatePositionsMask
|
2024-08-25 13:24:37 +02:00 |
|
Viktor Lofgren
|
0d01a48260
|
(index) Optimize SequenceOperations
|
2024-08-25 13:19:37 +02:00 |
|
Viktor Lofgren
|
00ab2684fa
|
(index) Optimize SequenceOperations
|
2024-08-25 13:17:38 +02:00 |
|
Viktor Lofgren
|
a5585110a6
|
(index) Optimize SequenceOperations
|
2024-08-25 13:16:31 +02:00 |
|
Viktor Lofgren
|
965c89798e
|
(index) Optimize DocumentSpan
|
2024-08-25 12:44:33 +02:00 |
|
Viktor Lofgren
|
982b03382b
|
(index) Optimize DocumentSpan
|
2024-08-25 12:31:15 +02:00 |
|
Viktor Lofgren
|
24b805472a
|
(index) Evaluate performance implications of decoding GCS early
|
2024-08-25 12:23:09 +02:00 |
|
Viktor Lofgren
|
6ce029b317
|
(index) Remove vestigial parameter
|
2024-08-25 12:14:12 +02:00 |
|
Viktor Lofgren
|
63e5b0ab18
|
(index) Correct weightedCounts calculations
|
2024-08-25 12:06:56 +02:00 |
|
Viktor Lofgren
|
6dda2c2d83
|
(coded-sequence) Reduce allocations in GCS.values()
|
2024-08-25 12:06:31 +02:00 |
|
Viktor Lofgren
|
3fb3c0b92e
|
(index) Optimize ranking calculations
|
2024-08-25 11:56:11 +02:00 |
|
Viktor Lofgren
|
aa2c960b74
|
(index) Optimize ranking calculations
|
2024-08-25 11:53:44 +02:00 |
|
Viktor Lofgren
|
4fbcc02f96
|
(index) Adjust sensible defaults for ranking parameters
|
2024-08-25 11:24:16 +02:00 |
|
Viktor Lofgren
|
9aa8f13731
|
(index) Remove tcfAvgDist ranking parameter
This is captured by tcfProximity already
|
2024-08-25 11:20:19 +02:00 |
|
Viktor Lofgren
|
65bee366dc
|
(index) Try harmonic mean for avgMinDist
|
2024-08-25 11:11:52 +02:00 |
|
Viktor Lofgren
|
53700e6667
|
(index) Try harmonic mean for avgMinDist
|
2024-08-25 11:08:41 +02:00 |
|
Viktor Lofgren
|
7f498e10b7
|
(index) Adjust proximity score
|
2024-08-25 11:01:35 +02:00 |
|
Viktor Lofgren
|
6eb0f13411
|
(index) Adjust handling of full phrase matches to prioritize full query matches over large partial matches
|
2024-08-25 10:54:04 +02:00 |
|
Viktor Lofgren
|
773377fe84
|
(index) Correct handling of full phrase match group
|
2024-08-25 10:48:34 +02:00 |
|
Viktor Lofgren
|
4372c8c835
|
(index) Give ranking components more consistent names
|
2024-08-25 10:44:27 +02:00 |
|
Viktor Lofgren
|
099133bdbc
|
(index) Fix verbatim match score after moving full phrase group to a separate entity
|
2024-08-25 10:43:35 +02:00 |
|
Viktor Lofgren
|
b09e2dbeb7
|
(build) Fix dependency churn from testcontainers
Apparently you need to pull in commons-codec now in order to run testcontainers, through spooky action at a distance.
|
2024-08-25 10:35:48 +02:00 |
|
Viktor Lofgren
|
96bcf03ad5
|
(index) Address broken tests
They are still broken, but less so.
|
2024-08-25 10:34:36 +02:00 |
|
Viktor Lofgren
|
0999f07320
|
(search-query) Add new ranking parameters for proximity and verbatim matches
|
2024-08-25 10:34:12 +02:00 |
|
Viktor Lofgren
|
5d2b455572
|
(search) Clean up inconsistent usage of MathClient in SearchOperator
Also clean up SearchOperator and adjacent code
|
2024-08-24 10:39:31 +02:00 |
|
Viktor Lofgren
|
ea75ddc0e0
|
(search) Absorb SearchQueryIndexService into SearchOperator, and clean up SearchOperator
|
2024-08-22 11:50:52 +02:00 |
|
Viktor Lofgren
|
2db0e446cb
|
(search) Absorb SearchQueryIndexService into SearchOperator, and clean up SearchOperator
|
2024-08-22 11:49:29 +02:00 |
|
Viktor Lofgren
|
557bdaa694
|
(search) Clean up SearchQueryIndexService and surrounding code
|
2024-08-22 11:45:28 +02:00 |
|
Viktor Lofgren
|
9eb1f120fc
|
(index) Repair positions bitmask for search result presentation
|
2024-08-22 11:28:23 +02:00 |
|
Viktor Lofgren
|
266d6e4bea
|
(slop) Replace SlopPageRef<T> with SlopTable.Ref<T>
|
2024-08-21 10:13:49 +02:00 |
|
Viktor Lofgren
|
e4c97a91d8
|
(*) Comment clarity
|
2024-08-21 10:12:00 +02:00 |
|