Viktor Lofgren
b6511fbfe2
(converter) Add AnchorTextKeywords to EncyclopediaMarginaliaNuSideloader processing
...
The commit updates EncyclopediaMarginaliaNuSideloader to include the AnchorTextKeywords in processing documents, aiding search result relevance.
It also removes old test-related functionality, along with a large but fairly useless test that was previously used to debug a specific problem and had become a detriment to overall code quality.
2023-12-09 15:20:52 +01:00
Viktor Lofgren
eccb12b366
(control) Fix spurious state detection in control-side actors
...
A race condition was found where processing actors would sometimes skip a step: when invoking ExecutorRemoteActor.getState(), they would get the last 'OK' actor state from a previous run of the actor!
To avoid this, the trigger method now returns the message ID (negative if an error occurred) instead of a boolean. That ID is passed to getState so that only messages pertaining to the present or future runs are selected.
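A minimal sketch of the idea, not the actual ExecutorRemoteActor code: the class name ActorStateSketch, the recordState helper, and the in-memory TreeMap standing in for the message queue are all assumptions made for illustration. It shows how filtering on the trigger's message ID keeps a stale 'OK' from an earlier run out of the result.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical stand-in for the actor's message log; IDs increase monotonically.
class ActorStateSketch {
    private final TreeMap<Long, String> statesById = new TreeMap<>();
    private long nextId = 1;

    // Records a state message, as a running actor would when it reports progress.
    long recordState(String state) {
        long id = nextId++;
        statesById.put(id, state);
        return id;
    }

    // Starts a new run. Returns the trigger's message ID, or a negative value on error.
    long trigger(boolean simulateError) {
        if (simulateError) {
            return -1;
        }
        return nextId++;
    }

    // Returns the most recent state recorded after the given trigger ID,
    // ignoring anything left over from earlier runs.
    Optional<String> getState(long triggerId) {
        Map.Entry<Long, String> latest = statesById.tailMap(triggerId, false).lastEntry();
        return latest == null ? Optional.empty() : Optional.of(latest.getValue());
    }
}
```

With a boolean return value there is no ID to filter on, so the caller can only ask for the latest state overall, which is where the stale 'OK' came from.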
2023-12-09 12:50:05 +01:00
Viktor Lofgren
d0982e7ba5
(converter) Add error handling and lazy load external domain links
...
The converter was not properly initializing the external links for each domain, causing an NPE in conversion. The links need to be loaded lazily, since we don't know the domain we're processing until we've seen it in the crawl data.
Also refactored a few things to make converter bugs easier to find, and to make finding the related domain through the SerializableCrawlData interface less awkward.
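A rough sketch of the lazy-loading pattern described here, under the assumption that the links can be fetched by a function of the domain name. LazyExternalLinksSketch, linksFor, and the loader function are hypothetical names, not the converter's real API.

```java
import java.util.Set;
import java.util.function.Function;

// Hypothetical holder that defers loading until the domain is known.
class LazyExternalLinksSketch {
    private final Function<String, Set<String>> loader;
    private Set<String> externalLinks; // stays null until the first document is seen
    int loadCount = 0;                 // exposed only so the sketch can be checked

    LazyExternalLinksSketch(Function<String, Set<String>> loader) {
        this.loader = loader;
    }

    // Loads the links on first use, once the crawl data has revealed the domain.
    Set<String> linksFor(String domain) {
        if (externalLinks == null) {
            externalLinks = loader.apply(domain);
            loadCount++;
        }
        return externalLinks;
    }
}
```

The point of the sketch is only that nothing is dereferenced before the domain has been observed, which is the condition that produced the NPE.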
2023-12-09 12:33:39 +01:00
Viktor Lofgren
fc30da0d48
(converter) Add academia recognition to DomainProcessor
...
The code now includes an additional function in the DomainProcessor class that checks if a domain is associated with academia. A domain is considered academic if it has the ".edu" TLD, or if it matches a regex pattern for domains like *.ac.ccTld or *.edu.ccTld.
If these conditions are met, the search term "special:academia" is added to the domain.
The existing academia search filter uses personalized pagerank to select academia-adjacent domains, but it isn't working very well. The hope is that filtering on domain names will be more effective, and that it can supplant the ranking-based approach.
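The domain-name rule could look roughly like the sketch below. The exact regex used in DomainProcessor is not given in the commit message, so the pattern here (a two-letter country code after ".ac." or ".edu.") is an assumption, as are the names AcademiaDomainSketch, isAcademicDomain, and keywordsFor.

```java
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper; the real check lives in DomainProcessor.
class AcademiaDomainSketch {
    // Assumed pattern: names ending in ".ac.<cc>" or ".edu.<cc>", with a two-letter country code.
    private static final Pattern ACADEMIA_CC = Pattern.compile(".*\\.(ac|edu)\\.[a-z]{2}");

    static boolean isAcademicDomain(String domain) {
        String d = domain.toLowerCase(Locale.ROOT);
        return d.endsWith(".edu") || ACADEMIA_CC.matcher(d).matches();
    }

    // Returns the extra search keywords for the domain, if any.
    static List<String> keywordsFor(String domain) {
        return isAcademicDomain(domain) ? List.of("special:academia") : List.of();
    }
}
```

Under these assumptions, names such as mit.edu, www.ox.ac.uk and unam.edu.mx are tagged, while education.com is not.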
2023-12-08 20:31:34 +01:00
Viktor Lofgren
156c067f79
(search) Fix mobile issues with browse feature
2023-12-05 21:28:50 +01:00
Viktor Lofgren
b33b013d41
(search) Fix broken script tag
...
Apparently it can't be called suggestions.js...?
2023-12-05 20:29:13 +01:00
Viktor Lofgren
e74e2f705f
(search) Fix broken script tag
...
suggestions.js became something else.
2023-12-05 20:20:07 +01:00
Viktor Lofgren
2e438847fc
(search) Optimize related domains queries
...
In the future this logic probably needs to move into a separate
service, as it's still quite slow to load. But this fixes response
times and DoS potential of the previous version.
2023-12-05 20:12:03 +01:00
Viktor Lofgren
9301c47d93
(search) Optimize related domains queries
2023-12-05 14:42:03 +01:00
Viktor Lofgren
20ec58b07f
(search) Remove layout-breakingly long URLs from the similar domains view.
...
They're almost all .onion URLs anyway, not really the space we're looking to peer into.
2023-12-05 13:58:15 +01:00
Viktor Lofgren
98983c1015
(search) Hopefully fix race condition that leaves the response with no Content-type header
2023-12-05 13:52:36 +01:00
Viktor Lofgren
67195592c6
(search) Hopefully fix race condition that leaves the response with no Content-type header
2023-12-05 13:48:42 +01:00
Viktor
21abfc6424
Merge pull request #61 from MarginaliaSearch/new-look
...
Design Revamp For search.marginalia.nu
2023-12-05 13:28:54 +01:00
Viktor Lofgren
d1e88df71e
(search) Cleaning up the code a bit
2023-12-05 13:26:05 +01:00
Viktor Lofgren
f36cfe34ab
(search) Hackery to get a more balanced view
2023-12-04 22:50:39 +01:00
Viktor Lofgren
8a1934008c
(search) Merge similar sites results with the info view.
...
WIP: This commit needs to be cleaned up.
2023-12-04 22:10:24 +01:00
Viktor Lofgren
b41bb9cfcf
(search) Use a Ξ for mobile button title instead of "Filters".
...
Makes it easier to distinguish from the search button.
2023-12-03 16:33:25 +01:00
Viktor Lofgren
d58324bbef
(search) Clean up filters menu a bit, improve accessibility.
2023-12-02 18:05:30 +01:00
Viktor Lofgren
cbbd45d3e5
(search) Clean up filters menu a bit, improve accessibility.
2023-12-02 18:01:03 +01:00
Viktor Lofgren
b89633ae4b
(search) Don't render a filter button on mobile when there are no filters to be presented.
2023-12-02 17:23:45 +01:00
Viktor Lofgren
96357e9bfd
(search) Fix typeahead suggestions, as well as improve mobile and desktop UX in small ways.
2023-12-02 17:06:40 +01:00
Viktor Lofgren
d530c3096f
(search) GUI tweaks to make the new interface not fall apart on mobile/chrome
2023-12-02 17:06:40 +01:00
Viktor Lofgren
ae0c1c3f2d
(control) Adjust search result margins for better visual density
2023-12-02 17:06:40 +01:00
Viktor Lofgren
0cc2564380
(search) CSS tweaks
2023-12-02 17:06:40 +01:00
Viktor Lofgren
38d20022ad
(search) Fix script loading for mobile support
2023-12-02 17:06:40 +01:00
Viktor Lofgren
280132dad0
(search) Fix script loading for mobile support
2023-12-02 17:06:40 +01:00
Viktor Lofgren
61de4e2789
(search) Retain filter options when performing a new search from the input field
2023-12-02 17:06:40 +01:00
Viktor Lofgren
f9d3455320
(search) Reduce visual weight of search results
2023-12-02 17:06:40 +01:00
Viktor Lofgren
2ff64c3c12
(search) New toggle for reducing tracking
2023-12-02 17:06:40 +01:00
Viktor Lofgren
902f235b5b
(search) Integrate 'similar' tab in site info.
2023-12-02 17:06:40 +01:00
Viktor Lofgren
97d43a6fa2
(search) Revamp browse results with new look.
2023-12-02 17:06:40 +01:00
Viktor Lofgren
9bc65ff0ca
(search) Desaturate search result titles according to rank
2023-12-02 17:06:40 +01:00
Viktor Lofgren
6cd6a615fd
(search) Add data-filter to body as a data attribute
...
For future shenanigans ;D
2023-12-02 17:06:40 +01:00
Viktor Lofgren
5639f0653d
(search) Rename SearchProfile.name into filterId
...
Avoids a foot-gun caused by a name clash with the Enum method name(), which returns the Java name of the enumeration value.
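To illustrate the clash, here is a small sketch with made-up enum constants and IDs (the real SearchProfile values are not shown in this log). Enum.name() always returns the constant's Java identifier, so a field that is also called name is easy to confuse with it; a distinct field name such as filterId removes the ambiguity.

```java
// Hypothetical enum; the constants and filter IDs are illustrative only.
enum SearchProfileSketch {
    NO_FILTER("default"),
    ACADEMIA("academia");

    // Named filterId rather than name, so it cannot be mistaken for Enum.name().
    final String filterId;

    SearchProfileSketch(String filterId) {
        this.filterId = filterId;
    }

    // Looks up a profile by its filter ID, falling back to NO_FILTER.
    static SearchProfileSketch fromFilterId(String filterId) {
        for (SearchProfileSketch profile : values()) {
            if (profile.filterId.equals(filterId)) {
                return profile;
            }
        }
        return NO_FILTER;
    }
}
```

Here SearchProfileSketch.ACADEMIA.name() yields "ACADEMIA", while SearchProfileSketch.ACADEMIA.filterId yields "academia".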
2023-12-02 17:06:40 +01:00
Viktor Lofgren
251174c9a2
(search) Update front page with new look
2023-12-02 17:06:40 +01:00
Viktor Lofgren
42ea87d637
(search) Update conversion results, error page, and dictionary results with new CSS.
2023-12-02 17:06:40 +01:00
Viktor Lofgren
7c8a60b8cf
(search) Site info view is mostly done
...
Also optimize the rendering a bit to avoid having to allocate huge string buffers, writing directly to Spark's response instead.
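The rendering change can be pictured with the sketch below, which streams markup to a java.io.Writer instead of building one large string first. It deliberately uses a plain Writer rather than Spark's response object, and the class and method names are assumptions; it also skips HTML escaping for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical renderer; writes each fragment as soon as it is produced.
class StreamingRenderSketch {
    static void render(List<String> rows, Writer out) {
        try {
            out.write("<ul>");
            for (String row : rows) {
                // No intermediate buffer holding the whole page; each row goes straight out.
                out.write("<li>");
                out.write(row); // real code would need to HTML-escape this value
                out.write("</li>");
            }
            out.write("</ul>");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real service the Writer would wrap the HTTP response stream, so memory use stays roughly constant regardless of page size.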
2023-12-02 17:06:40 +01:00
Viktor Lofgren
2f4500be5a
(search) New frontend look
2023-12-02 17:06:40 +01:00
Viktor Lofgren
fa7534a362
(search) Remove dead code
2023-12-02 17:06:40 +01:00
Viktor Lofgren
a258f0af7a
(search) Refactor search parameters to include query
2023-12-02 17:06:40 +01:00
Viktor Lofgren
01621c6344
(renderer) Make helpers configurable on a by-service basis.
2023-12-02 17:06:40 +01:00
Viktor Lofgren
c7934342a6
(control) Automatic recrawl
2023-12-02 17:06:24 +01:00
Viktor Lofgren
f5c324c06b
(minor) Fix broken test
2023-12-01 17:44:39 +01:00
Viktor Lofgren
f615cf2391
(convert) Loosen up the rules enforcement for documents that have external links.
2023-12-01 17:44:29 +01:00
Viktor Lofgren
c984a97262
(docs) Update crawling.md
2023-11-30 21:53:56 +01:00
Viktor Lofgren
a02c06a837
(docs) Update sideloading-howto.md
2023-11-30 21:51:03 +01:00
Viktor Lofgren
21d6aa421c
(docs) Update setup instructions
2023-11-30 21:44:29 +01:00
Viktor Lofgren
e5d274fe1c
(docs) Improve architectural documentation
2023-11-30 21:38:57 +01:00
Viktor Lofgren
166a391eae
(docs) Improve architectural documentation for the crawler.
2023-11-30 21:30:57 +01:00
Viktor Lofgren
5fb24bb27f
(docs) Improve architectural documentation for the converter.
2023-11-30 20:43:22 +01:00