Commit Graph

12 Commits

Author SHA1 Message Date
Viktor Lofgren
12590d3449 (index-reverse) Added compression to priority index
The priority index documents file can be trivially compressed to a large degree.

Compression schema:
```
00b -> diff docord (E gamma)
01b -> diff domainid (E delta) + (1 + docord) (E delta)
10b -> rank (E gamma) + domainid,docord (raw)
11b -> 30 bit size header, followed by 1 raw doc id (61 bits)
```
2024-07-11 16:13:23 +02:00
Viktor Lofgren
19163fa883 (array) Clean up the Array library
IntArray gets the YAGNI axe.   The array library had two implementations, one for longs which was used, and one for ints, which only ever saw bit rot.   Removing the latter, as all it ever did was clutter up the codebase and add technical debt.  If we need int arrays, we fork LongArray again (or add int capabilities to it)

Also cleaning up the interfaces, removing layers of redundant abstractions and adding javadocs.

Finally adding sz=2 specializations to the quick- and insertion sort algorithms.  It seems the JIT isn't optimizing these particularly well, this is an attempt to help it out a bit.
2024-05-18 13:23:06 +02:00
Viktor Lofgren
650f3843bb (array) Clean up search function jungle
Retire search functions that weren't used, including the native implementations.  Drop confusing suffixes on search function names.  Search functions no longer encode search misses as negative values.

Replaced binary search function with a branchless version that is much faster.

Cleaned up benchmark code.
2024-05-17 14:31:02 +02:00
Viktor Lofgren
9e766bc056 (array) Clean up search function jungle
Retire search functions that weren't used, including the native implementations.  Drop confusing suffixes on search function names.  Search functions no longer encode search misses as negative values.

Replaced binary search function with a branchless version that is much faster.

Cleaned up benchmark code.
2024-05-17 14:30:06 +02:00
Viktor Lofgren
48aff52e00 (array) Increase LongArray on-heap alignment to 16 bytes
This primarily affects benchmarks, making performance more consistent for the 128 bit operations, as the system mostly works with memory mapped data.
2024-05-16 19:12:36 +02:00
Viktor Lofgren
3549be216f (array, experimental) Documentation for native algos 2024-05-14 17:43:05 +02:00
Viktor Lofgren
55a7c1db00 (array, experimental) Call C++ helper methods to do some low level stuff a bit faster than is possible with Java 2024-05-14 12:54:14 +02:00
Viktor Lofgren
deaba0152d (index) Explicitly free LongQueryBuffers 2024-04-16 19:23:00 +02:00
Viktor Lofgren
ae7c760772 (index) Clean up new index query code 2024-04-05 13:30:49 +02:00
Viktor Lofgren
81815f3e0a (qs, index) New query model integrated with index service.
Seems to work, tests are green and initial testing finds no errors.  Still a bit untested, committing WIP as-is because it would suck to lose weeks of work due to a drive failure or something.
2024-04-04 20:17:58 +02:00
Viktor Lofgren
67aa20ea2c (array) Attempting to debug strange errors 2024-02-27 21:22:18 +01:00
Viktor Lofgren
1d34224416 (refac) Remove src/main from all source code paths.
Look, this will make the git history look funny, but trimming unnecessary depth from the source tree is a very necessary sanity-preserving measure when dealing with a super-modularized codebase like this one.

While it makes the project configuration a bit less conventional, it will save you several clicks every time you jump between modules.  Which you'll do a lot, because it's *modul*ar.  The src/main/java convention makes a lot of sense for a non-modular project though.  This ain't that.
2024-02-23 16:13:40 +01:00