The irritant-free web
There are many irritants on the web.
Number one is spam. Not inbox spam, but web spam. To a human eye these pages are obviously spam, but most of them are designed solely to show up in search engine results. They often contain fragments or sentences extracted from genuine pages and strung together to create the illusion of a page, one that will fool a computer into believing it is legitimate content. Or they are blatant copies of original text with a few links peppered through, which peddle drugs or ask you to join an online casino.
This kind of page is notoriously difficult to filter out using algorithms that analyze the text. The smarter the algorithm gets, the smarter the spammer becomes. This is why search engines mostly rely on ranking pages by the kinds of links that point to them. The thinking is that no one with a legitimate website will intentionally link to a spam page. Only other spam pages and automated bot-spam content would. This works. It is why you've likely never seen a real spam page when you search for things.
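As a rough illustration of link-based ranking, here is a minimal Python sketch in the spirit of PageRank. It is a simplification, not how any particular search engine (or ht3) ranks pages. The function name `link_rank`, the damping value and the tiny example graph are all made up for this example.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by their inbound links (simplified PageRank-style iteration).

    `links` maps each page name to the list of pages it links to.
    All names and values here are hypothetical, chosen for illustration.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the score spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical example: the spam page links out, but nothing links to it.
links = {
    "blog": ["recipes", "tutorial"],
    "recipes": ["tutorial"],
    "tutorial": ["blog"],
    "spam": ["blog", "tutorial"],
}

ranks = link_rank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page that nobody links to ends up with only the baseline score, so it sorts last. The same mechanism would push down an honest page that simply has no inbound links yet.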
But this link-based ranking is susceptible to other kinds of manipulation, the prime one being SEO. When businesses and major media outlets vie for your attention, they stoop to SEO tactics to get to the top of search engine results. It's no surprise that most search engine results are SEO spam.
This ranking system also penalizes small websites that aren't linked to by an existing network of authoritative sites. This is the larger problem. It makes the web a hierarchical system, where an individual who puts up good content but doesn't stoop to SEO or social media flooding drowns in the vast web of noise, never to be discovered.
And the web is vast. It will only get bigger, compounding this issue.
The underlying cause is content that is created to make money. The intention of such content is not to edify you. Or to share something cool. It is to pander to you, in whichever way possible, to keep your attention for as long as possible. Attention translates to revenue via ads. Or paywalls. Paywalls aren't better. They create lock-in. How many sites is a person going to pay to read? This again shrinks the web that you can access.
The ht3 index contains pages written by individuals, not armies of "content creators". Pages that are written with the intention to share knowledge, to show off something cool. Pages that enable creativity and make us wonder, question, learn and gain insight. This does mean that the websites you can think of off the top of your head aren't indexed. Popular news media, ad-heavy sites and sites with monetized content in general are not indexed.
The index contains mostly tech-related pages. This does not mean there are only tech pages. Search for "baking bread" or "Watercolors" and I hope you'll be pleasantly surprised by the results.
Finally, please bear in mind that ht3 is the work of one individual, and disagreements about what qualifies as good content are to be expected. Zombo.com is alive and well even today. My joy may be your irritant.
– Varun A
Please send feedback, questions and suggestions to firstname.lastname@example.org