Commit Graph

8 Commits

Author SHA1 Message Date
Viktor Lofgren
f655ec5a5c (*) Refactor GeoIP-related code
In this commit, GeoIP-related classes are refactored and relocated to a common library as they are shared across multiple services.

The crawler is refactored to enable the GeoIpBlocklist to use the new GeoIpDictionary as the base of its decisions.

The converter is modified ot query this data to add a geoip:-keyword to documents to permit limiting a search to the country of the hosting server.

The commit also adds due BY-SA attribution in the search engine footer for the source of the IP geolocation data.
2023-12-10 17:30:43 +01:00
Viktor Lofgren
84b4158555 (minor) Fix broken test 2023-12-10 14:39:20 +01:00
Viktor Lofgren
91dd45cf64 (search) IP and IP geolocation in site info view
This commit also fixes a bug in the loader where the IP field wouldn't always populate as intended, and refactors the DomainInformationService to use significantly fewer SQL queries.
2023-12-09 20:06:55 +01:00
Viktor Lofgren
c51159672e (build) Move unit test configuration to root build.gradle 2023-10-04 12:46:22 +02:00
Viktor Lofgren
dbe9235f3a (*) Upgrade to JDK21 with preview enabled.
... also move some common configuration into the root build.gradle-file.

Support for JDK21 in lombok is a bit sketchy at the moment, but it seems to work.  This upgrade is kind of important as the new index construction really benefits from Arena based lifecycle control over off-heap memory.
2023-09-24 10:38:59 +02:00
Viktor Lofgren
35996d0adb (docs) Update the documentation up-to-date information 2023-09-14 11:33:36 +02:00
Viktor Lofgren
24b4606f96 (converter,loader) Converter outputs parquet files instead of compressed json. 2023-09-13 16:13:41 +02:00
Viktor Lofgren
064bc5ee76 (processed-data) New parquet-serializable models for converter output 2023-09-11 14:08:40 +02:00