Viktor Lofgren
|
2f8488610a
|
(loader) Fix bug where trailing deferred domain meta inserts weren't executed
|
2023-07-31 14:23:23 +02:00 |
|
Viktor Lofgren
|
d3f01bd171
|
(crawler, converter) Remove monkey patched gson from dependencies
|
2023-07-29 19:18:12 +02:00 |
|
Viktor Lofgren
|
77d5e39fe0
|
Make processed data Serializable
|
2023-07-28 18:11:19 +02:00 |
|
Viktor Lofgren
|
bca4bbb6c8
|
(*) Refactor MQ and MQSM
|
2023-07-17 13:57:32 +02:00 |
|
Viktor Lofgren
|
8b74e3aa0d
|
(*) File Storage WIP
|
2023-07-14 17:08:10 +02:00 |
|
Viktor Lofgren
|
bd2c3855ed
|
Add bits and keywords for generator classes (docs, forum, wiki).
|
2023-06-23 21:35:28 +02:00 |
|
Viktor Lofgren
|
b5ef67ed28
|
Categorize generators by type
This is a great quality signal!
Add the type as document bitflags by category.
|
2023-06-22 16:04:37 +02:00 |
|
Viktor Lofgren
|
7326ba74fe
|
Tweak pub date heuristics so they mostly get the 'historyofphilosophy.net' case right.
Use the HTML standard for plausibility checks in the more guesswork-like heuristics. Add more class names to look for date strings.
|
2023-06-20 14:15:05 +02:00 |
|
Viktor Lofgren
|
266ad2e4de
|
Re-introduce monkey patched GSON to make converter run better.
|
2023-06-19 17:58:19 +02:00 |
|
Viktor Lofgren
|
2eb972dea1
|
Remove unrelated code, break tools into their own directory.
|
2023-03-17 16:03:11 +01:00 |
|
Viktor Lofgren
|
449471a076
|
Yet more restructuring. Improved search result ranking.
|
2023-03-16 21:35:54 +01:00 |
|
Viktor Lofgren
|
d82532b7f1
|
More restructuring, big bug fixes in keyword extraction.
|
2023-03-13 17:39:53 +01:00 |
|