The first well-documented content search engine was named Archie and was introduced on September 10, 1990. It was a tool for indexing FTP (File Transfer Protocol) archives, and it allowed users to find specific files of interest. Archie was created by Alan Emtage, Bill Heelan and J. Peter Deutsch while they were computer science students at McGill University in Montreal, Canada. The program built a searchable database of file names by downloading the directory listings of all files located on public anonymous FTP sites.
In 1991, Gopher was created by Mark McCahill of the University of Minnesota. Gopher was a TCP/IP application layer protocol. It was a highly structured alternative to the World Wide Web but was quickly surpassed by HTTP. Gopher maintained a structured hierarchy of information that paved the way for the early search engines “Veronica” and “Jughead”. However, as of 1993 there was still no search engine for the World Wide Web, which operated over HTTP. The primary method of exploring the public content of the web at this time was through manually maintained lists of web resources. On September 2, 1993, Oscar Nierstrasz of the University of Geneva released a program written in Perl that scanned existing manually curated lists of websites and rewrote them into a standard format. The program, named W3Catalog, did not meet all three criteria that define a search engine (crawling, indexing, and searching), but it clearly paved the way. W3Catalog is now known as the web’s first “primitive” search engine.
In June 1993, the first web robot was released. Created by Matthew Gray of MIT, the program was called the World Wide Web Wanderer and was built to measure the size of the web. The index of sites that the program created was called “Wandex”, but since this index was not searchable, the Wanderer cannot be considered a search engine. Gray reportedly went on to work for Google.
JumpStation was created in December 1993 by Jonathon Fletcher and was likely the first search engine to use a web robot to index the web. JumpStation included a built-in web form for querying its index and thus became the first tool to offer all three features of a web search engine (crawling, indexing, and searching). Due to resource limitations of the hardware that JumpStation ran on, its index stored only the titles and headings of the webpages it discovered.
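The three features named above (crawling, indexing, and searching) can be illustrated with a minimal sketch that, like JumpStation, indexes only titles and headings. This is not JumpStation's actual code: the `PAGES` dictionary, the example URLs, and the names `TitleHeadingParser`, `crawl`, and `search` are hypothetical, and an in-memory dictionary stands in for fetching pages over the network.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re

# Hypothetical in-memory "web": URL -> HTML. A real robot would fetch pages over HTTP.
PAGES = {
    "http://a.example/": (
        "<html><head><title>Gopher Guide</title></head><body>"
        "<h1>Using Gopher</h1><p>Body text about Veronica.</p>"
        "<a href='http://b.example/'>next</a></body></html>"
    ),
    "http://b.example/": (
        "<title>FTP Archives</title><h2>Archie search</h2>"
        "<p>Gopher is mentioned only in the body here.</p>"
        "<a href='http://a.example/'>back</a>"
    ),
}

class TitleHeadingParser(HTMLParser):
    """Collects text inside <title> and <h1>..<h6>, plus outgoing links."""
    KEEP = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while inside a title or heading element
        self.text = []      # text fragments found in titles and headings
        self.links = []     # href values of <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.KEEP:
            self.depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.KEEP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.text.append(data)

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def crawl(start_url, pages=PAGES):
    """Crawl breadth-first from start_url; return an index of word -> set of URLs."""
    index = defaultdict(set)
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:    # unknown page: nothing to index
            continue
        parser = TitleHeadingParser()
        parser.feed(html)
        parser.close()
        for word in tokenize(" ".join(parser.text)):
            index[word].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

def search(index, query):
    """Return the sorted URLs whose titles or headings contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return sorted(results)
```

Calling `search(crawl("http://a.example/"), "gopher")` in this sketch matches only the first page, because the second page mentions the word only in its body text. That limitation is the trade-off of storing titles and headings alone.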
The convenience of searching for a query term across the full text of indexed webpages was first introduced in 1994 with the launch of a new search engine called WebCrawler. The popularity and effectiveness of WebCrawler’s full-text search marked the beginning of the search engine craze, a period that we are still living in today. Between 1994 and 1996, a variety of full-text search engines became available on the web, including Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista.
RankDex was developed in 1996 by Robin Li of IDD Information Services. This search engine was the first to explore site-scoring page ranking methods similar to those used by today’s complex search engines such as Google. Patents for the RankDex search methods were obtained in 1999, and the methods were eventually used to launch the Chinese company Baidu in 2000. As of early 2017, Baidu is the fourth most trafficked website according to the Alexa Internet rankings.
The seeds for the single most popular website of all time were planted in 1998 with the publication of two landmark research papers by Sergey Brin and Lawrence Page. These papers, entitled “The Anatomy of a Large-Scale Hypertextual Web Search Engine” and “The PageRank Citation Ranking: Bringing Order to the Web”, described a search engine named Google.
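The idea behind the PageRank paper named above, that a page is important when important pages link to it, can be sketched as a simple power iteration. This is a simplified illustration rather than Google's implementation: the function name `pagerank`, the four-page link graph, and the even redistribution of rank from pages without outgoing links are assumptions made for the example. The damping factor of 0.85 is the value the paper describes as typical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to about 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}    # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical link graph: "c" is linked to by every other page, "d" by none.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
example_ranks = pagerank(example_links)
```

In this example graph, `c` ends up with the highest score because every other page links to it, while `d`, which no page links to, keeps only the baseline share.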