Commit Graph

9 Commits

Author SHA1 Message Date
Viktor Lofgren
365229991b (control) Improve pagination for crawl data inspector 2024-05-21 19:44:48 +02:00
Viktor Lofgren
959a8e29ee (control) Improve pagination for crawl data inspector 2024-05-21 19:27:25 +02:00
Viktor Lofgren
197c82acd4 (control) Add filter functionality for crawl data inspector 2024-05-21 19:05:44 +02:00
Viktor Lofgren
9539fdb53c (control) Clean up UX for crawl data inspector 2024-05-21 18:27:24 +02:00
Viktor Lofgren
17dc00d05f (control) Partial implementation of inspection utility for crawl data
Uses DuckDB and range queries to read the Parquet files directly from the index partitions.

The UX is a bit rough but is in working order.
2024-05-20 18:02:46 +02:00
Viktor Lofgren
7d1cafc070 (control) Add skip link for navigation in control GUI 2024-05-04 12:36:44 +02:00
Viktor Lofgren
4021a0ae98 (search) Add en-US language tags to all templates 2024-05-04 11:40:59 +02:00
Viktor Lofgren
afc047cd27 (control) GUI for exporting segmentation data from a Wikipedia ZIM 2024-03-18 13:45:23 +01:00
Viktor Lofgren
1d34224416 (refac) Remove src/main from all source code paths.
Look, this will make the git history look funny, but trimming unnecessary depth from the source tree is a very necessary sanity-preserving measure when dealing with a super-modularized codebase like this one.

While it makes the project configuration a bit less conventional, it will save you several clicks every time you jump between modules.  Which you'll do a lot, because it's *modul*ar.  The src/main/java convention makes a lot of sense for a non-modular project though.  This ain't that.
2024-02-23 16:13:40 +01:00