Commit Graph

  • 4d29581ea4 (crawler) Introduce absolute upper limit to crawl depth growth Viktor Lofgren 2024-07-16 14:40:45 +0200
  • 0b31c4cfbb (coded-sequence) Replace GCS usage with an interface Viktor Lofgren 2024-07-16 14:37:50 +0200
  • 5c098005cc (index) Fix broken test Viktor Lofgren 2024-07-16 12:37:59 +0200
  • ae87e41cec (index) Fix rare BitReader.takeWhileZero bug Viktor Lofgren 2024-07-16 11:03:56 +0200
  • dfd19b5eb9 (index) Reduce the number of abstractions around result ranking Viktor Lofgren 2024-07-15 05:18:10 +0200
  • 8ed5b51a32 Merge branch 'master' into term-positions Viktor 2024-07-15 07:05:31 +0200
  • 9d0e5dee02 Fix gitignore issue .so files not to be ignored correctly. Viktor Lofgren 2024-07-15 05:18:10 +0200
  • ffd970036d (term-frequency) Fix concurrency issues in SentenceExtractor and TermFrequencyExporter Viktor Lofgren 2024-07-15 05:15:30 +0200
  • fa162698c2 (term-frequency) Fix concurrency issues in SentenceExtractor and TermFrequencyExporter Viktor Lofgren 2024-07-15 05:15:30 +0200
  • ad3857938d (search-api, ranking) Update with new ranking parameters Viktor Lofgren 2024-07-15 04:49:28 +0200
  • 179a6002c2 (coded-sequence) Add a callback for re-filling underlying buffer Viktor Lofgren 2024-07-12 23:50:28 +0200
  • d28fc86956 (index-prio) Add fuzz test for prio index Viktor Lofgren 2024-07-11 19:22:36 +0200
  • 6303977e9c (index-prio) Fail louder when size is 0 in PrioDocIdsTransformer Viktor Lofgren 2024-07-11 19:22:05 +0200
  • 97695693f2 (index-prio) Don't increment readItems counter when the output buffer is full Viktor Lofgren 2024-07-11 19:21:36 +0200
  • 1ab875a75d (test) Correcting flaky tests Viktor Lofgren 2024-07-11 16:12:02 +0200
  • 31881874a9 (coded-sequence) Correct indicator of next-value Viktor Lofgren 2024-07-11 16:11:20 +0200
  • f090f0101b (index-construction) Gather up preindex writes Viktor Lofgren 2024-07-10 23:18:06 +0200
  • 9881cac2da (index-reader) Correctly handle negative offset values Viktor Lofgren 2024-07-10 23:17:30 +0200
  • 12590d3449 (index-reverse) Added compression to priority index Viktor Lofgren 2024-07-10 18:34:07 +0200
  • abf7a8d78d (coded-sequence) Correct implementation of Elias gamma Viktor Lofgren 2024-07-10 14:28:28 +0200
  • ecfe17521a (coded-sequence) Correct implementation of Elias gamma Viktor Lofgren 2024-07-09 17:27:53 +0200
  • 0d29e2a39d (index-reverse) Entry Sources reset() their LongQueryBuffer Viktor Lofgren 2024-07-09 01:39:40 +0200
  • 12a2ab93db (actor) Improve error messages for convert-and-load Viktor Lofgren 2024-07-08 19:19:30 +0200
  • d90bd340bb (index-reverse) Removing btree indexes from prio documents file Viktor Lofgren 2024-07-08 17:20:17 +0200
  • 21afe94096 (index-reverse) Don't use 128 bit merge function for prio index Viktor Lofgren 2024-07-07 21:36:10 +0200
  • fa36689597 (index-reverse) Simplify priority index Viktor Lofgren 2024-07-06 16:12:29 +0200
  • 85c99ae808 (index-reverse) Split index construction into separate packages for full and priority index Viktor Lofgren 2024-07-06 15:44:47 +0200
  • a4ecd5f4ce (minor) Fix non-compiling test due to previous refactor Viktor Lofgren 2024-07-06 15:11:43 +0200
  • 6401a513d7 (crawl) Fix onsubmit confirm dialog for single-site recrawl Viktor Lofgren 2024-07-05 17:21:03 +0200
  • d86926be5f (crawl) Add new functionality for re-crawling a single domain Viktor Lofgren 2024-07-05 15:31:47 +0200
  • a6b03a66dc (crawl) Reduce Charset.forName() object churn Viktor Lofgren 2024-07-04 20:49:07 +0200
  • d023e399d2 (index) Remove unnecessary allocations in journal reader Viktor Lofgren 2024-07-04 15:24:53 +0200
  • e8ab1e14e0 (keyword-extraction) Update upper limit to number of positions per word Viktor Lofgren 2024-07-02 20:52:32 +0200
  • a6e15cb338 (keyword-extraction) Update upper limit to number of positions per word Viktor Lofgren 2024-06-30 22:46:56 +0200
  • 4fbb863a10 (keyword-extraction) Add upper limit to number of positions per word Viktor Lofgren 2024-06-30 22:41:38 +0200
  • 6ee4d1eb90 (keyword) Increase the work area for position encoding Viktor Lofgren 2024-06-28 16:42:39 +0200
  • 738e0e5fed (process) Add option for automatic profiling Viktor Lofgren 2024-06-27 13:58:36 +0200
  • 0e4dd3d76d (minor) Remove accidentally committed debug printf Viktor Lofgren 2024-06-27 13:40:53 +0200
  • 10fe5a78cb (log) Prevent tests from trying to log to file Viktor Lofgren 2024-06-27 13:19:48 +0200
  • 975b8ae2e9 (minor) Tidy code Viktor Lofgren 2024-06-27 13:15:31 +0200
  • 935234939c (test) Add query parsing to IntegrationTest Viktor Lofgren 2024-06-27 13:15:20 +0200
  • 87e38e6181 (search-query) refac: Move query factory Viktor Lofgren 2024-06-27 13:14:47 +0200
  • f73fc8dd57 (search-query) Fix end-inclusion bug in QWordGraphIterator Viktor Lofgren 2024-06-27 13:13:42 +0200
  • 3faa5bf521 (search-query) Tidy up QueryGRPCService and IndexClient Viktor Lofgren 2024-06-26 14:03:30 +0200
  • 6973712480 (query) Tidy up code Viktor Lofgren 2024-06-26 13:40:06 +0200
  • 02df421c94 (*) Trim the stopwords list Viktor Lofgren 2024-06-26 12:22:57 +0200
  • 95b9af92a0 (index) Implement working optional TermCoherences Viktor Lofgren 2024-06-26 12:22:06 +0200
  • 8ee64c0771 (index) Correct TermCoherence requirements Viktor Lofgren 2024-06-25 22:18:10 +0200
  • b805f6daa8 (gamma) Fix readCount() behavior in EGC Viktor Lofgren 2024-06-25 22:17:54 +0200
  • dae22ccbe0 (test) Integration test from crawl->query Viktor Lofgren 2024-06-25 22:17:26 +0200
  • 9d00243d7f (index) Partial re-implementation of position constraints Viktor Lofgren 2024-06-24 15:55:54 +0200
  • 5461634616 (doc) Add readme.md for coded-sequence library Viktor Lofgren 2024-06-24 14:28:51 +0200
  • 40bca93884 (gamma) Minor clean-up Viktor Lofgren 2024-06-24 13:56:43 +0200
  • b798f28443 (journal) Fixing journal encoding Viktor Lofgren 2024-06-24 13:56:27 +0200
  • fff2ce5721 (gamma) Correctly decode zero-length sequences Viktor Lofgren 2024-06-24 13:10:56 +0200
  • 69f88255e9 Merge pull request #101 from MarginaliaSearch/security-scan Viktor 2024-06-17 13:18:36 +0200
  • 08ff79827e Merge branch 'master' into security-scan Viktor 2024-06-17 13:18:25 +0200
  • 67703e2274 (run) Update install.sh with stronger warnings against non-docker install. Viktor Lofgren 2024-06-17 13:15:15 +0200
  • d0d6bb173c (control) Fix warc data http status filter default value Viktor Lofgren 2024-06-17 12:40:25 +0200
  • 54caf17107 (docs) Amend install instructions for non-docker install Viktor Lofgren 2024-06-16 10:22:07 +0200
  • 2168b7cf7d (docs) Update docs with clearer references to the full guide Viktor Lofgren 2024-06-16 10:01:19 +0200
  • 90744433c9 Merge branch 'master' into security-scan Viktor Lofgren 2024-06-13 13:14:47 +0200
  • 5371f078f7 Merge pull request #102 from jaseemabid/jabid/macos-build Viktor 2024-06-12 14:45:03 +0200
  • 0dd14a4bd0 Specify C++ standard in build command Jaseem Abid 2024-06-12 12:46:15 +0100
  • 9974b31a09 Don't track build files(libcpp.so) with git Jaseem Abid 2024-06-12 12:45:49 +0100
  • 0ffbbaf4b9 (crawler) Update WARC builder to use SHA-256 for digests Viktor Lofgren 2024-06-12 09:14:12 +0200
  • 6839415a0b (crawler) Fetch TLS instead of SSL context Viktor Lofgren 2024-06-12 09:07:54 +0200
  • 55f3ac4846 (atags) Fix duckdb SQL injection Viktor Lofgren 2024-06-12 09:05:57 +0200
  • 801cf4b5da (search) Fix bad practice usage of innerHTML to set what should be text content. Viktor Lofgren 2024-06-12 08:59:40 +0200
  • e0459d0c0d (build) Upgrade parquet dependencies to 1.14.0 Viktor Lofgren 2024-06-12 08:57:22 +0200
  • 23759a7243 (loader) Correctly clamp document size Viktor Lofgren 2024-06-10 18:29:14 +0200
  • 55b2b7636b (loader) Correctly load the positions column in the keyword projection Viktor Lofgren 2024-06-10 18:27:15 +0200
  • 36160988e2 (index) Integrate positions data with indexes WIP Viktor Lofgren 2024-06-10 15:09:06 +0200
  • 9f982a0c3d (index) Integrate positions file properly Viktor Lofgren 2024-06-06 16:45:42 +0200
  • dcbec9414f (index) Fix non-compiling tests Viktor Lofgren 2024-06-06 16:35:09 +0200
  • a07cf1ba93 (array/cpp) Update gitignore to properly exclude libcpp.so Viktor Lofgren 2024-06-06 13:05:59 +0200
  • 4a8afa6b9f (index, WIP) Position data partially integrated with forward and reverse indexes. Viktor Lofgren 2024-06-06 12:54:52 +0200
  • bb06cc9ff3 Merge pull request #98 from samstorment/ThemeSwitcher Viktor 2024-06-06 12:51:19 +0200
  • 9c06f446fb (search) Styling tweaks. Make the filter button near the top right corener a bit bigger so it's easier to press on mobile Sam Storment 2024-06-05 19:55:17 -0500
  • 2d076cbd67 (search) move data-has-js attribute from body to html element Sam Storment 2024-06-05 18:20:33 -0500
  • fb2eef24d6 Handle themeing when javascript is disabled. Hide the theme select and fallback to dark media query instead of data-theme attribute Sam Storment 2024-06-03 14:15:35 -0500
  • e2f68d9ccf Add a theme select to the header that lets users toggle their theme independent of their OS theme Sam Storment 2024-06-02 21:02:52 -0500
  • d4f4d751c0 Merge remote-tracking branch 'origin/master' Viktor Lofgren 2024-06-02 16:30:41 +0200
  • b4eac2516e (crawler) Send "Accept"-headers when fetching documents, also indicate we prefer English results Viktor Lofgren 2024-06-02 16:30:34 +0200
  • 4435f6245c Merge pull request #94 from samstorment/search-dark-theme Viktor 2024-06-02 16:21:52 +0200
  • 9b922af075 (converter) Amend existing modifications to use gamma coded positions lists Viktor Lofgren 2024-05-30 14:20:36 +0200
  • 0112ae725c (gamma) Implement a small library for Elias gamma coding an integer sequence Viktor Lofgren 2024-05-30 14:17:23 +0200
  • 619392edf9 (keywords) Add position information to keywords Viktor Lofgren 2024-05-28 16:54:53 +0200
  • 0894822b68 (converter) Add position information to serialized document data Viktor Lofgren 2024-05-28 14:18:03 +0200
  • 206a7ce6c1 Merge remote-tracking branch 'origin/master' Viktor Lofgren 2024-05-28 14:15:57 +0200
  • a69ab311c7 (qword) Fix tests that broke due to stopword removal Viktor Lofgren 2024-05-28 14:15:45 +0200
  • a61327fa0b Update ROADMAP.md Viktor 2024-05-24 13:57:50 +0200
  • 6985ab762a (query) Improve handling of stopwords in queries Viktor Lofgren 2024-05-23 20:50:55 +0200
  • 0e8300979b (search) Update the no result text to request bug reports. Viktor Lofgren 2024-05-23 20:18:16 +0200
  • 0b60411e5f (query) Bugfix stopword issue Viktor Lofgren 2024-05-23 20:15:14 +0200
  • f83f777fff (converter) Experimental support for searching by URL Viktor Lofgren 2024-05-23 17:10:57 +0200
  • 89aae93e60 (*) Lift jetty and guava-dependencies Viktor Lofgren 2024-05-23 14:20:01 +0200
  • 65b74f9cab (registry) Fix broken test Viktor Lofgren 2024-05-23 14:15:01 +0200
  • 7543e98035 Merge branch 'MarginaliaSearch:master' into search-dark-theme Sam Storment 2024-05-22 18:06:37 -0500
  • 59ec70eb73 (*) Clean up code related to crawl parquet inspection Viktor Lofgren 2024-05-22 12:55:08 +0200