Syntax
This is a keyword-based search engine. When you enter multiple search terms, the search engine attempts to match documents in which the terms occur in close proximity.
Search terms can be excluded by prefixing them with a hyphen.
While the search engine at present does not allow full text search, quotes can be used to specifically search for names or terms in the title. Using quotes will also cause the search engine to be as literal as possible in interpreting the query.
Parentheses can be used to add terms to the query without giving weight to the terms when ranking the search results.
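The rules above can be sketched as a small helper that assembles a query string. This is only an illustration: `build_query` and its parameter names are hypothetical and not part of the search engine, and the exact placement of parentheses around a term is inferred from the description above.

```python
def build_query(terms, excluded=(), phrases=(), unweighted=()):
    """Compose a query string from the syntax elements described above.

    terms      -- plain keywords, matched when they occur close together
    excluded   -- terms to exclude, each prefixed with a hyphen
    phrases    -- names or title terms to match literally, wrapped in quotes
    unweighted -- terms added in parentheses so they do not affect ranking
    """
    parts = list(terms)
    parts += [f'"{phrase}"' for phrase in phrases]
    parts += [f"({term})" for term in unweighted]
    parts += [f"-{term}" for term in excluded]
    return " ".join(parts)


query = build_query(
    terms=["plato", "republic"],
    excluded=["aristotle"],
    phrases=["allegory of the cave"],
    unweighted=["philosophy"],
)
print(query)  # plato republic "allegory of the cave" (philosophy) -aristotle
```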
Samples
Language Limitations
The search engine currently does not support any languages other than English.
Support for other languages is planned, but not available right now. Adding support for additional languages and making it work well is somewhat time-consuming, and poor support for a language won't make anyone happy.
Webmaster Information
If you wish to add your website to the index, follow the instructions in this git repository. If you do not want to mess with git, you can also email kontakt@marginalia.nu with the domain name.
The search engine's crawler uses the user-agent string search.marginalia.nu, and requests come from the IPs indicated in https://search.marginalia.nu/crawler-ips.txt.
If you do not want your website to be crawled, the search engine respects robots.txt. In case of questions, bug reports or concerns, email kontakt@marginalia.nu.
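As a rough way to check an opt-out before publishing it, the sketch below parses a robots.txt with Python's standard `urllib.robotparser`. It assumes the crawler selects robots.txt groups by the user-agent string given above. Python's parser is not the crawler's own, so treat the result as an approximation. The helper `is_allowed` and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only this search engine's crawler.
ROBOTS_TXT = """\
User-agent: search.marginalia.nu
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.com/page.html"
print(is_allowed(ROBOTS_TXT, "search.marginalia.nu", page))  # False
print(is_allowed(ROBOTS_TXT, "otherbot", page))              # True
```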
Special Keywords
Keyword | Meaning |
---|---|
site:example.com | Display site information about example.com |
site:example.com keyword | Search example.com for keyword |
browse:example.com | Show similar websites to example.com |
ip:127.0.0.1 | Search documents hosted at 127.0.0.1 |
links:example.com | Search documents linking to example.com |
tld:edu keyword | Search documents with the top-level domain edu |
?tld:edu keyword | Prefer but do not require results with the top-level domain edu. This syntax is also possible for links:..., ip:... and site:... |
q>5 | The amount of JavaScript and modern features is at least 5 (on a scale of 0 to 25) |
q<5 | The amount of JavaScript and modern features is at most 5 (on a scale of 0 to 25) |
year>2005 | (beta) The document was ostensibly published in or after 2005 |
year=2005 | (beta) The document was ostensibly published in 2005 |
year<2005 | (beta) The document was ostensibly published in or before 2005 |
rank>50 | The ranking of the website is at least 50 in a span of 1 - 255 |
rank<50 | The ranking of the website is at most 50 in a span of 1 - 255 |
count>10 | The search term must appear in at least 10 results from the domain |
count<10 | The search term must appear in at most 10 results from the domain |
format:html5 | Filter documents using the HTML5 standard. These are typically modern websites. |
format:xhtml | Filter documents using the XHTML standard |
format:html123 | Filter documents using the HTML standards 1, 2, and 3. These are typically very old websites. |
generator:wordpress | Filter documents with the specified generator, in this case wordpress |
file:zip | Filter documents containing a link to a zip file (most file-endings work) |
file:audio | Filter documents containing a link to an audio file |
file:video | Filter documents containing a link to a video file |
file:archive | Filter documents containing a link to a compressed archive |
file:document | Filter documents containing a link to a document |
-special:media | Filter out documents with audio or video tags |
-special:scripts | Filter out documents with JavaScript |
-special:affiliate | Filter out documents with likely Amazon affiliate links |
-special:tracking | Filter out documents with analytics or tracking code |
-special:cookies | Filter out documents with cookies |
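
A few of the keywords in the table take numeric ranges or an optional `?` prefix. The sketch below shows one way to build such tokens while checking the documented limits. The helper names `range_filter` and `prefer` are hypothetical, and combining several special keywords in one query is an assumption that the table does not confirm.

```python
# Value ranges taken from the table above; year and count have no stated bounds.
BOUNDS = {"q": (0, 25), "rank": (1, 255)}
# Keywords the table says accept the "?" (prefer, do not require) prefix.
PREFERABLE = ("tld:", "links:", "ip:", "site:")


def range_filter(name, op, value):
    """Build a token such as 'year>2005' or 'q<5', checking documented limits."""
    if name not in ("q", "year", "rank", "count"):
        raise ValueError(f"unknown range keyword: {name}")
    # The table only lists '=' for the year keyword.
    allowed_ops = ("<", ">", "=") if name == "year" else ("<", ">")
    if op not in allowed_ops:
        raise ValueError(f"operator {op!r} is not listed for {name}")
    if name in BOUNDS:
        low, high = BOUNDS[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return f"{name}{op}{value}"


def prefer(token):
    """Turn a hard filter such as 'tld:edu' into a preference ('?tld:edu')."""
    if not token.startswith(PREFERABLE):
        raise ValueError(f"'?' is not documented for {token!r}")
    return "?" + token


query = " ".join([
    "linux",
    prefer("tld:edu"),
    range_filter("year", ">", 2005),
    range_filter("q", "<", 5),
    "-special:tracking",
])
print(query)  # linux ?tld:edu year>2005 q<5 -special:tracking
```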