Done
Done 2022-01-30
- generate better default thumbnail on the fly (/)
Done 2022-01-19
- public API gateway (/)
Done 2022-01-16
- overhaul CSS of MEMEX (/)
Done 2022-01-15
- Improved random (/)
INSERT INTO EC_RANDOM_DOMAINS SELECT DISTINCT(EC_DOMAIN.ID) FROM EC_DOMAIN_NEIGHBORS A INNER JOIN EC_DOMAIN_NEIGHBORS B ON B.NEIGHBOR_ID=A.DOMAIN_ID INNER JOIN EC_DOMAIN_NEIGHBORS C ON C.NEIGHBOR_ID=B.DOMAIN_ID INNER JOIN EC_DOMAIN ON A.DOMAIN_ID=EC_DOMAIN.ID WHERE C.DOMAIN_ID IN (SELECT ID FROM EC_DOMAIN WHERE URL_PART IN (secret-sauce)) AND EC_DOMAIN.STATE>=0;
Done 2022-01-14
- Dark Mode (/)
- Screengrabs by domain (/)
- Revise exploration mode (/)
- Improve keyboard navigation (/)
Done 2022-01-12
- Search redesign (/)
- Fixed dictionary corruption bug (/)
Done 2022-01-04
- Improve site:-query QOL (/)
- Fix byte folder bug (/)
- refactor EC_URL (/)
ALTER TABLE EC_URL MODIFY COLUMN PROTO ENUM('http', 'https', 'gemini') NOT NULL;
-- put visit-metadata in separate table (/)
Done 2021-12-03
- fix bug in language detection (/)
-- re-fetching some pages (/)
Done 2021-12-02
- new approach for query rewriting (/)
Done 2021-11-14
- make site:-queries return a dummy entry when no site information is available (/)
Done 2021-11-11
- hybridized ordering of domains on reindex, F(previous rank, previous quality). (/)
- mark documents with audio, video, object tags (/)
Done 2021-11-10
- car service <2021-11-18> (/)
Done 2021-10-30
- Add auto redirects for guesswork rss/atom/feed-requests to /log/feed.xml (/)
Done 2021-10-29
- investigate extracting more keywords (/)
-- textrank (/)
-- tf-idf (x)
-- sideload additional keywords for most popular sites (/)
Done 2021-10-12
- refactor index converter (/)
- clean up code garbage (/)
Done 2021-10-05
- trial more vanilla PageRank approach as a tertiary algorithm (/)
- fix a search result prioritization bug for mixed rankings (/)
- fix search interface for firefox on android (x)
It is reportedly broken
-- figure out how to replicate this problem (x)
- fix potential DoS where certain search queries with a large number of common but mutually exclusive terms would take forever to process. (/)
test query: generic stores underground unusual
Done 2021-10-03
- prioritize n-gram matches over word matches (/)
- show informative error page when the index server reboots (/)
Done 2021-10-02
- Personalized Page Rank (/)
- Duelling Algorithms (/)
Done 2021-10-30
- Launch October Update (/)
Done 2021-09-26
- fix broken search use-cases (/)
-- c language (/)
-- 67 chevy (/)
-- 68000 (/)
-- c# (/)
-- @twitterhandle (/)
-- #hashtag (/)
- trial tar based archiving to save the poor ext4 fs (/)
- use words to tag document format etc (/)
- dynamic re-bucketing based on something like (/)
SELECT DEST.URL_PART,EXP(DEST.QUALITY)*SUM(EXP(SOURCE.QUALITY)) AS Q from EC_DOMAIN DEST INNER JOIN EC_DOMAIN_LINK ON DEST.ID=DEST_DOMAIN_ID INNER JOIN EC_DOMAIN SOURCE ON SOURCE.ID=SOURCE_DOMAIN_ID WHERE DEST.INDEXED>0 GROUP BY DEST_DOMAIN_ID
Done 2021-09-19
- Fix several indexing bugs that hid relevant search results (/)
Done 2021-09-17
- Added search profiles (/)
Done 2021-09-16
- Rephrased an error message that some people took to mean they weren't speaking a proper language (/)
Done 2021-09-15
- Using in-site domain link-names to add search terms (/)
- Fixed buggy default content-type (/)
- Even more aggressive unicode language detection (/)
Done 2021-09-11
- Status flag for domains (/)
Indexed, Active, Blocked
- Improve topic detection (/)
Done 2021-09-09
- Tuned search results to demote very short results (/)
Done 2021-09-08
- Encyclopedia tries harder to find the right article if the case match isn't exact (/)
Done 2021-09-06
- Breaking changes for next Index-rebuild (/)
-- Change writer bucket scaling to 1/4 (/)
-- Move protocol and port from EdgeDomain to EdgeURL (/)
-- Change database schemas to reflect (/)
-- ISO-8859-1/UTF-8 charset sniffer (/)
-- Fixed a bug that would occasionally cause the crawler to re-index the same working set multiple times (/)
Done 2021-09-02
- improve edge-director throughput (/)
- give edge-director state for semi-blocking tasks (/)
Done 2021-08-31
- optimize URL index size (/)
Done 2021-08-28
- clean up gemini navigation (/)
- Atom feed for HTTPS and Gemini (/)
Done 2021-08-27
- Feed gemini server with rendered gmi-content (/)
-- Output the content (/)
-- Generate feeds (/)
-- Make the gemini server read it (x)
-- Switch over (/)
Done 2021-08-26
- Absorb gemini server into WMSA (/)
Done 2021-08-25
- wildcard domain for marginalia.nu (/)
-- move memex to memex-subdomain (/)
- feeds on FEED pragma (/)
Done 2021-08-24
- Top nav bar overhaul (/)
Done 2021-08-23
- add marker for which files are todo files (/)
Added %%%/pragmas for toggling behavior
-- Added template helpers for consuming pragmas (/)
-- Used to improve topic pages (/)
- Fixes for git (/)
Done 2021-08-22
- File manager (/)
-- Delete (/)
-- Delete Empty Dir (x)
-- Move/Rename (/)
--- System for tombstones/redirects (/)
- Edit for / does not work (/)
Needed better support for non-normalized URLs, e.g. //index.gmi
- Backlinks for index (/)
Done 2021-08-21
- Git Integration (/)
-- Use commit hooks to trigger pull (/)
https://git-scm.com/book/uz/v2/Appendix-B%3A-Embedding-Git-in-your-Applications-JGit
- Recursive directory watch (/)
- Two column layout (/)
Done 2021-08-20
- Overhaul MEMEX navigation (/)
-- Navigation bar (/)
-- Generate site map (x)
-- Editing (/)
--- Add update-root link (/)
- Tombstones aren't generated properly on-delete (/)
The tombstone db wasn't properly reloaded after being updated.
- Just write static files to disk instead of using an intermediary backend server. (/)
-- Use alias directive to set different root for memex path. (/)
-- Content-type is finicky (/)
I want to serve html-wrapped .gmi and .html
location ~* \.(gmi|png)$ {
    types {
        text/html gmi;
        text/html png;
    }
}
Done 2021-08-19
- Move away from statically generated HTML forms in memex (/)
- Fix stability of podcast scraper (/)
- Get crawling up again (/)
-- Monitoring (/)
--- Extraction (/)
--- Status page (/)
-- Scraper config (/)
-- DNS cache (?)
-- IP Block CDNs (/)
--- Parse CIDR (/)
Apache Commons Net's SubnetUtils seems to do the job, although it can't deal with IPv6 :-/
--- CloudFlare (/)
173.245.48.0/20 103.21.244.0/22 103.22.200.0/22 103.31.4.0/22 141.101.64.0/18 108.162.192.0/18 190.93.240.0/20 188.114.96.0/20 197.234.240.0/22 198.41.128.0/17 162.158.0.0/15 172.64.0.0/13 131.0.72.0/22 104.16.0.0/13 104.24.0.0/14 2400:cb00::/32 2606:4700::/32 2803:f800::/32 2405:b500::/32 2405:8100::/32 2a06:98c0::/29 2c0f:f248::/32
--- Fastly (/)
23.235.32.0/20 43.249.72.0/22 103.244.50.0/24 103.245.222.0/23 103.245.224.0/24 104.156.80.0/20 146.75.0.0/17 151.101.0.0/16 157.52.64.0/18 167.82.0.0/17 167.82.128.0/20 167.82.160.0/20 167.82.224.0/20 172.111.64.0/18 185.31.16.0/22 199.27.72.0/21 199.232.0.0/16
- Refactor task management (/)
-- Fix prepend (/)
-- Add tests (/)
- Refactor Floyd-Steinberg ditherer (/)
- Todo move-to-done function puts header last in #Done (/)
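Re: the "Parse CIDR" item above. A rough sketch of a CIDR check that handles both IPv4 and IPv6 using only the JDK, since SubnetUtils stops at IPv4. This is an illustration, not what the crawler actually runs; the class name CidrMatcher and its methods are made up for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper (not part of WMSA): tests whether an IP literal lies inside a CIDR block. */
final class CidrMatcher {
    private final byte[] network;
    private final int prefixBits;

    CidrMatcher(String cidr) {
        String[] parts = cidr.split("/");
        if (parts.length != 2) throw new IllegalArgumentException("Not a CIDR block: " + cidr);
        this.network = parse(parts[0]);
        this.prefixBits = Integer.parseInt(parts[1]);
        if (prefixBits < 0 || prefixBits > network.length * 8) {
            throw new IllegalArgumentException("Bad prefix length: " + cidr);
        }
    }

    private static byte[] parse(String ipLiteral) {
        try {
            // For a literal IP address, getByName only validates the format (no DNS lookup)
            return InetAddress.getByName(ipLiteral).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not an IP literal: " + ipLiteral, e);
        }
    }

    boolean contains(String ipLiteral) {
        byte[] addr = parse(ipLiteral);
        if (addr.length != network.length) return false; // IPv4 block vs IPv6 address or vice versa

        // Compare the whole bytes covered by the prefix
        int fullBytes = prefixBits / 8;
        for (int i = 0; i < fullBytes; i++) {
            if (addr[i] != network[i]) return false;
        }

        // Compare the remaining high bits of the next byte, if any
        int remainingBits = prefixBits % 8;
        if (remainingBits == 0) return true;
        int mask = (0xFF << (8 - remainingBits)) & 0xFF;
        return (addr[fullBytes] & mask) == (network[fullBytes] & mask);
    }
}
```

Usage would be along the lines of new CidrMatcher("2606:4700::/32").contains("2606:4700::1"), with one matcher per line of the CloudFlare and Fastly lists.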
Done 2021-08-16
- Pictures-in-HTML (/)
-- Implement compression via Floyd-Steinberg dithering (/)
https://encyclopedia.marginalia.nu/wiki/Floyd%E2%80%93Steinberg_dithering
http://image4j.sourceforge.net/javadoc/index.html?net/sf/image4j/util/ConvertUtil.html
--- Ensure 4 bit (/)
--- On upload (/)
--- Convert existing stuff on-read (x)
-- Render image views (/)
--- Add to index (/)
-- Upload form (/)
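For reference, a minimal sketch of Floyd-Steinberg dithering down to 4 bits (16 grey levels), as in the "Ensure 4 bit" item above. It works on a plain greyscale array rather than a real image, and it is not the ditherer used in WMSA; the class and method names are invented for the example.

```java
/** Hypothetical sketch (not the WMSA ditherer): Floyd-Steinberg dithering to 16 grey levels. */
final class DitherSketch {

    /** Input: grey values 0..255 indexed [y][x]. Output: palette indices 0..15. */
    static int[][] ditherTo4Bit(int[][] grey) {
        int h = grey.length;
        int w = grey[0].length;

        // Work on a copy so accumulated error does not touch the input
        double[][] work = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                work[y][x] = grey[y][x];
            }
        }

        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double old = Math.max(0, Math.min(255, work[y][x]));
                int level = (int) Math.round(old / 17.0); // 255 / 15 = 17 per step
                out[y][x] = level;

                // Spread the quantization error over the not-yet-visited neighbours
                double err = old - level * 17.0;
                if (x + 1 < w) work[y][x + 1] += err * 7 / 16;
                if (y + 1 < h) {
                    if (x > 0) work[y + 1][x - 1] += err * 3 / 16;
                    work[y + 1][x] += err * 5 / 16;
                    if (x + 1 < w) work[y + 1][x + 1] += err * 1 / 16;
                }
            }
        }
        return out;
    }
}
```

Each output value fits in a nibble, so two pixels can be packed per byte before compression.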
Done 2021-08-15
- CSS fixes for mobile (/)
-- text align for tasks (/)
-- indent overflowed tasks (/)
- Fix CME (/)
ERROR 2021-08-14 16:36:39,467 RxCachedThreadScheduler-2 MemexMain : Uncaught exception
java.util.ConcurrentModificationException: null
    at java.util.HashMap.forEach(HashMap.java:1428) ~[?:?]
    at nu.marginalia.wmsa.memex.MemexData.forEach(MemexData.java:51) ~[WMSA-1628951793.jar:?]
    at nu.marginalia.wmsa.memex.Memex.reRender(Memex.java:49) ~[WMSA-1628951793.jar:?]
    at io.reactivex.rxjava3.core.Scheduler$PeriodicDirectTask.run(Scheduler.java:566) ~[WMSA-1628951793.jar:?]
    at io.reactivex.rxjava3.core.Scheduler$Worker$PeriodicTask.run(Scheduler.java:513) ~[WMSA-1628951793.jar:?]
    at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:65) [WMSA-1628951793.jar:?]
    at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:56) [WMSA-1628951793.jar:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
    at java.lang.Thread.run(Thread.java:832) [?:?]
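The trace above shows HashMap.forEach failing inside a periodic re-render task, which usually means the map was changed while it was being iterated. One common remedy is to back the store with a ConcurrentHashMap. The sketch below only illustrates that idea; DocStoreSketch is a made-up stand-in for MemexData, and this may not be how the bug was actually fixed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for a document store; not the actual MemexData class. */
final class DocStoreSketch {
    // ConcurrentHashMap iteration is weakly consistent: it never throws
    // ConcurrentModificationException, even if entries change mid-iteration.
    private final Map<String, String> docs = new ConcurrentHashMap<>();

    void put(String url, String body) {
        docs.put(url, body);
    }

    int size() {
        return docs.size();
    }

    /** Safe to call from a periodic task while other threads keep calling put(). */
    void forEach(BiConsumer<String, String> action) {
        docs.forEach(action);
    }
}
```

The trade-off is that an iteration may or may not see entries added while it runs, which should be acceptable for a re-render that repeats periodically anyway.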
Done 2021-08-14
- Automatic TODO task categorization (/)
- Login API on separate service (/)
-- Set up service (/)
-- Route requests (/)
- Fix header auto-location (/)
- Display top tasks in index (/)
Done 2021-08-10
-- + in URLs? (/)
proxy_pass with / forces nginx to parse the url (why?)
Bad:
proxy_pass http://127.0.0.1:5025/public/wiki/;
Good:
rewrite ^ $request_uri;
rewrite ^/(.*) /public/$1 break;
return 400;
proxy_pass http://127.0.0.1:5025$uri;
- Encyclopedia (/)
-- Search API (/)
-- code tags (/)
Done 2021-08-06
- Memex (/)
-- GemtextParser (/)
-- Service skeleton (/)
-- Link extraction (/)
-- Rendering (/)
--- Stylesheet (/)
-- Metadata (-)
-- Updates (/)
--- API (/)
--- Form (/)
Done 2021-08-04
- Service Lockdown (/)
-- X-Public header in code (/)
-- Move endpoints (/)
--- Resource Store (/)
--- Search (/)
--- Assistant (/)
-- Update clients (/)
--- Resource Store (/)
--- Search Service (/)
--- Assistant (-)
-- Update nginx (/)
-- Update links on website (/)
- Tune wiki archive fs (/)
sudo tune2fs -O ^dir_index /dev/nvme0n1p2
- marginalia.nu:9999 "BBS" (/)
Done 2021-08-03
- encyclopedia.marginalia.nu (/)
- Verify automatic backup of git (/)
- Reddit frontend (/)
-- Scraper: (/)
-- API: Marginalia 2: (/)
- Wiki (/)
-- on Optane (/)
-- fix Hildegard of Bingen (/)
- Block bots on nginx (/)
https://kb.linuxlove.xyz/nginx-badbotblocker.html
Done 2021-08-02
- Install Optane (/)
-- Migrate MariaDB (/)
- Wiki (/)
-- redirects (/)
-- top notices (/)
- Bucket4J rate limiting (/)
- Service Monitoring (/)
Done 2021-08-01
- Update Cert (/)
- Backups for git (/)
Done 2021-07-30
- Load Wikidata from ZIM (/)
- Migrate Server to Debian Buster (/)
Done 2021-07-28
- Update description generation algorithm (/)
-- Recalculate descriptions (...) (/)
- Wiki data (/)
-- Load data (/)
-- Wrap wikipedia (/)
-- ZIM? (-)
-- Wikipedia Cleaner (/)
Done 2021-07-27
- Spell checker service? (/)
https://github.com/wolfgarbe/SymSpell
- Calculations (/)
-- Detection (/)
-- Parser (/)
-- Unit conversion (/)
--- Temperature (/)
--- Distance (/)
--- Weight (/)
--- Area (/)
--- Volume (/)
Done 2021-07-26
- Save websites to disk? (/)
-- GZipped (/)
-- XFS (?)
- Local backlinks in GMI (/)
-- Parse GMI for links and titles (/)
-- Create tags system (/)
- Use prime sizing for HashMap! (/)
-- How to find primes (/)
- Arbitrary size HashMap (/)
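On "How to find primes" above: for table sizes in the usual range, plain trial division is fast enough. A small sketch, assuming the goal is the smallest prime at or above a requested capacity; PrimeSizing and its methods are hypothetical names, not the ones used in the actual hash map.

```java
/** Hypothetical helper: finds the smallest prime >= n, e.g. for sizing a hash table. */
final class PrimeSizing {

    /** Trial division by 2 and then by odd numbers up to sqrt(n). */
    static boolean isPrime(long n) {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (long d = 3; d * d <= n; d += 2) {
            if (n % d == 0) return false;
        }
        return true;
    }

    /** Smallest prime that is greater than or equal to n. */
    static long nextPrime(long n) {
        long candidate = Math.max(n, 2);
        while (!isPrime(candidate)) {
            candidate++;
        }
        return candidate;
    }
}
```

With a prime table size, bucket = hash mod size spreads keys evenly even when the hashes share a common factor, which is the usual argument for prime sizing.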
Done 2021-07-25
- Syntax for orgmode + GMI in kate (/)
Use /usr/share/kde4/apps/katepart/syntax/markdown.xml
Done 2021-07-23
- Dictionary analysis in scraping (/)
It seems viable to estimate the language of a document based on the overlap with an N-most-common-words dictionary. Threshold 0.05 ok?
-- English (/)
-- Swedish (/)
-- Latin (/)
- Clean up tests (/)
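A sketch of the dictionary-overlap idea from the note above. It assumes "overlap" means the fraction of a document's words that appear in the common-words set, which is one possible reading and not necessarily the measure used in the scraper. The class name, method names, and the tiny word list in the usage note are made up.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of language estimation by overlap with a common-words dictionary. */
final class LanguageGuessSketch {

    /** Fraction of the words in text that occur in commonWords (assumed definition of overlap). */
    static double overlap(String text, Set<String> commonWords) {
        // Split on anything that is not a letter; keeps letters such as å, ä, ö intact
        String[] words = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");

        long total = Arrays.stream(words).filter(w -> !w.isEmpty()).count();
        if (total == 0) return 0.0;

        long hits = Arrays.stream(words)
                .filter(w -> !w.isEmpty())
                .filter(commonWords::contains)
                .count();
        return (double) hits / total;
    }

    /** True if the overlap reaches the threshold, e.g. the 0.05 floated above. */
    static boolean looksLike(String text, Set<String> commonWords, double threshold) {
        return overlap(text, commonWords) >= threshold;
    }
}
```

With a toy set such as Set.of("the", "and", "of", "to", "a", "in"), the text "The cat and the dog" has three of five words in the set. A real dictionary would hold the N most common words per language, one set each for English, Swedish and Latin.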
Done 2021-07-22
GZip Compression stats:
63% old 21% new
- Hash map (/)
-- Contiguous memory bins (/)
- Key Folding (/)
-- For strings (/)
-- For integers (/)
-- For dates (x)
- Debian Desktop (/)
-- Docker (/)
-- Java 14 (/)
-- IntelliJ (/)
-- Code (/)
-- Gradle (/)
-- OrgMode (/)
Done 2021-07-21
- Bugfix: Domain Resolution (/)
Done 2021-07-20
- Index Changes (/)
-- Remove Junk Logging (/)
-- Split Query (/)
-- Implement in Frontend (/)
- Dictionary Service (/)
-- Add Index To Table (/)
-- Populate test db (/)
-- Build tests (/)
-- Integrate into frontend (/)
- Site Information (/)
-- Fetch (/)
-- 404 (/)