Big String

Microlibrary that offers string compression. This is useful when tens of thousands of HTML documents have to be held in memory during conversion. XML has been described as the opposite of a compression scheme; HTML shares that verbosity and, as a result, compresses ridiculously well.
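
To get a feel for how well repetitive markup compresses, here is a minimal sketch using the JDK's DEFLATE implementation. It is not the code this library uses (the readme does not say which algorithm BigString relies on); the class name HtmlCompressionSketch and the generated sample document are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class, not part of the big-string library.
class HtmlCompressionSketch {

    /** Compresses raw bytes with DEFLATE from java.util.zip. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reverses compress(); malformed input is reported as an unchecked exception. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid DEFLATE data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Builds a deliberately repetitive HTML table to stand in for a real page. */
    static String sampleHtml(int rows) {
        StringBuilder sb = new StringBuilder("<html><body><table>\n");
        for (int i = 0; i < rows; i++) {
            sb.append("<tr><td class=\"cell\">row ").append(i).append("</td></tr>\n");
        }
        return sb.append("</table></body></html>").toString();
    }

    public static void main(String[] args) {
        String html = sampleHtml(1000);
        byte[] raw = html.getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        String restored = new String(decompress(packed), StandardCharsets.UTF_8);
        System.out.printf("raw=%d bytes, compressed=%d bytes, round trip ok=%b%n",
                raw.length, packed.length, restored.equals(html));
    }
}
```

Real pages are less regular than this generated table, so the ratio seen here should be read as an upper bound rather than a typical figure.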

Configuration

If the Java system property 'bigstring.disabled' is set to true, the BigString class will not compress strings.
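
A system property like this is usually passed on the JVM command line as -Dbigstring.disabled=true. The sketch below shows the programmatic equivalent using only standard JDK calls. The helper compressionDisabled() is hypothetical and only mirrors how such a flag could be read; the readme does not say when the library reads the property, so if it is read once at class-load time, a programmatic setting would have to happen before BigString is first used.

```java
// Hypothetical demo class, not part of the big-string library.
class BigStringFlagSketch {

    /** Reads the flag the way a library might: true only if the property equals "true". */
    static boolean compressionDisabled() {
        return Boolean.getBoolean("bigstring.disabled");
    }

    public static void main(String[] args) {
        System.out.println("disabled before: " + compressionDisabled());

        // Equivalent to starting the JVM with -Dbigstring.disabled=true
        System.setProperty("bigstring.disabled", "true");

        System.out.println("disabled after: " + compressionDisabled());
    }
}
```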

Demo

List<BigString> manyBigStrings = new ArrayList<>();

for (var file : files) {
    // BigString.encode may or may not compress the string
    // depending on its size
    manyBigStrings.add(BigString.encode(readFile(file)));
}

for (var bs : manyBigStrings) {
    String decompressedString = bs.decompress();
    byte[] bytes = bs.getBytes();
    int len = bs.getLength();
}
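
The size-dependent behaviour mentioned in the demo comment can be pictured as a factory that picks between a plain and a compressed representation. The following is a sketch under stated assumptions, not the library's implementation: SketchString, Plain, Packed, isCompressed() and the 64-character threshold are all invented for illustration, and gzip is used only because it ships with the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-in for a BigString-like type; not the library's code. */
interface SketchString {
    int THRESHOLD = 64; // assumed cutoff in characters, chosen for illustration

    String decode();
    boolean isCompressed();

    /** Short strings are kept as-is; longer ones are compressed. */
    static SketchString encode(String value) {
        return value.length() > THRESHOLD ? new Packed(value) : new Plain(value);
    }

    /** Stores the string unchanged. */
    final class Plain implements SketchString {
        private final String value;

        Plain(String value) { this.value = value; }

        public String decode() { return value; }
        public boolean isCompressed() { return false; }
    }

    /** Holds the string as gzip-compressed UTF-8 bytes. */
    final class Packed implements SketchString {
        private final byte[] data;

        Packed(String value) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(value.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            // The gzip stream is closed at this point, so the trailer has been written.
            data = out.toByteArray();
        }

        public String decode() {
            try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
                return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public boolean isCompressed() { return true; }
    }
}
```

Skipping compression for short inputs avoids paying the fixed overhead of a compressed stream when there is little to gain, which is one plausible reason for a size check of this kind.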