Aliweb

ALIWEB (Archie Like Indexing for the WEB) is considered the first Web search engine, as its predecessors were either built with different purposes (the Wanderer, Gopher) or indexed resources other than web pages, such as FTP archives and Gopher menus (Archie, Veronica and Jughead).

First announced in November 1993[2] by developer Martijn Koster while working at Nexor, and presented in May 1994[3] at the First International Conference on the World Wide Web at CERN in Geneva, ALIWEB preceded WebCrawler by several months.[4]

ALIWEB allowed users to submit the locations of index files on their sites,[4][5] which enabled the search engine to include their web pages along with user-written page descriptions and keywords. This empowered webmasters to define the terms that would lead users to their pages, and avoided the need for crawling robots (e.g. the Wanderer, JumpStation), which consumed bandwidth. As relatively few people submitted their sites, ALIWEB was not very widely used.
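
Site index files were plain-text records listing fields such as a title, URI, description and keywords. As a minimal sketch (assuming a simple "Field: value" record layout in the IAFA-template style ALIWEB used; the sample record and path are hypothetical), parsing one entry might look like this in Python:

    # Hypothetical ALIWEB-style index record in a "Field: value" layout;
    # real index files followed the IAFA template format.
    record = """Template-Type: DOCUMENT
    Title:        ALIWEB
    URI:          /aliweb/aliweb.html
    Description:  Information about the ALIWEB service
    Keywords:     aliweb, indexing, web"""

    # Parse each "Field: value" line into a dict a search engine could store.
    entry = {}
    for line in record.splitlines():
        field, _, value = line.partition(":")
        entry[field.strip()] = value.strip()

    print(entry["Title"])     # ALIWEB
    print(entry["Keywords"])  # aliweb, indexing, web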

Martijn Koster, who was also instrumental in the creation of the Robots Exclusion Standard,[6][7] detailed the background and objectives of ALIWEB with an overview of its functions and framework in the paper he presented at CERN.[3]

Koster is not associated with a commercial website which uses the aliweb name.[8]

Aliweb
Type of site: Web search engine
Website: www.aliweb.com at the Wayback Machine (archived February 9, 1998)
Launched: May 1994
Current status: Active and accepting suggestions for link inclusion[1]

References

  1. ^ http://www.aliweb.com/; see the foot of the page: "Want to suggest a site for inclusion in AliLinks? Is there a link that is not catagorized right or is not working? E-mail questions and comments about this service to: webmaster@aliweb.com"
  2. ^ Martijn Koster (30 November 1993). "ANNOUNCEMENT: ALIWEB (Archie-Like Indexing for the WEB)". comp.infosystems.
  3. ^ a b "List of PostScript files for the WWW94 advance proceedings". First International Conference on the World-Wide Web. June 1994. Title: "Aliweb - Archie-Like Indexing in the Web." Author: Martijn Koster. Institute: NEXOR Ltd., UK. PostScript, Size: 213616, Printed: 10 pages
  4. ^ a b Chris Sherman (3 December 2002). "Happy Birthday, Aliweb!". Search Engine Watch. Archived from the original on 2006-10-17. Retrieved 2007-01-03.
  5. ^ Wes Sonnenreich (1997). "A History of Search Engines". John Wiley & Sons website.
  6. ^ Martijn Koster. "Robots Exclusion". robotstxt.org.
  7. ^ Martijn Koster. "Robots in the Web: threat or treat?". Reprinted with permission from ConneXions, The Interoperability Report, Volume 9, No. 4, April 1995. Archived from the original on 2007-01-02. Retrieved 2007-01-03.
  8. ^ Martijn Koster. "Historical Web Services: ALIWEB". Martijn Koster's Historical Web Services page. Archived from the original on 2007-01-16. Note that I have nothing to do with aliweb.com. It appears some marketing company has taken the old aliweb code and data, and are using it as a site for advertising purposes. Their search results are worthless. Their claim to have trademarked "aliweb" I have been unable to confirm in patent searches. My recommendation is that you avoid them.
First International Conference on the World-Wide Web

The First International Conference on the World-Wide Web (also known as WWW1) was the first-ever conference about the World Wide Web, and the first meeting of what became the International World Wide Web Conference. It was held from May 25 to 27, 1994 in Geneva, Switzerland. The conference had 380 participants, who were accepted out of 800 applicants. It has been referred to as the "Woodstock of the Web".

The event was organized by Robert Cailliau, a computer scientist who had helped to develop the original WWW specification, and was hosted by CERN. Cailliau had lobbied for the conference inside CERN, and at conferences like the ACM Hypertext Conference in 1991 (in San Antonio) and 1993 (in Seattle). After returning from the Seattle conference, he announced the first World Wide Web Conference. Coincidentally, the NCSA announced their "Mosaic and the Web" conference 23 hours later.

JumpStation

JumpStation was the first WWW search engine that behaved, and appeared to the user, the way current web search engines do. It started indexing on 12 December 1993 and was announced on the Mosaic "What's New" webpage on 21 December 1993. It was hosted at the University of Stirling in Scotland.

It was written by Jonathon Fletcher, from Scarborough, England, who graduated from the University of Stirling with a first class honours degree in Computing Science in the summer of 1992 and was subsequently employed there as a systems administrator. He has since been called the "father of the search engine". JumpStation's development was discontinued when Fletcher left the University in late 1994, having failed to persuade any investors, including the University of Stirling, to back his idea financially. At that point the database contained 275,000 entries spanning 1,500 servers.

JumpStation used document titles and headings to index the web pages found, searched using a simple linear scan, and did not provide any ranking of results. However, JumpStation had the same basic shape as Google Search in that it used an index built solely by a web robot, searched this index using keyword queries entered by the user on a web form whose location was well known, and presented its results as a list of URLs that matched those keywords.
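
As a rough sketch of that shape (not Fletcher's actual code; the index entries below are made up), a title-and-heading index searched linearly for keyword matches might look like:

    # Toy JumpStation-style index: (URL, title/heading text) pairs.
    index = [
        ("http://example.ac.uk/cs/", "Department of Computing Science"),
        ("http://example.org/www94/", "First International WWW Conference"),
    ]

    def search(query):
        """Linear scan: return URLs whose indexed text contains every
        query term. No ranking is applied, matching JumpStation."""
        terms = query.lower().split()
        return [url for url, text in index
                if all(t in text.lower() for t in terms)]

    print(search("www conference"))  # ['http://example.org/www94/']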

List of websites founded before 1995

Of the thousands of websites founded prior to 1995, those appearing here are listed for one or more of the following reasons:

They still exist (albeit in some cases with different names).

They made contributions to the history of the World Wide Web.

They helped to shape certain modern Web content, such as webcomics and weblogs.

Martijn Koster

Martijn Koster (born c. 1970) is a Dutch software engineer noted for his pioneering work on Internet searching.

Koster created Aliweb, the Internet's first web search engine, which was announced in November 1993 while he was working at Nexor and presented in May 1994 at the First International Conference on the World Wide Web. Koster also developed ArchiePlex, a search engine for FTP sites that pre-dates the Web, and CUSI, a simple tool that allowed users to query different search engines in quick succession, useful in the early days of search when services returned varying results.

Koster also created the Robots Exclusion Standard.
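
The standard lets a site publish a robots.txt file telling crawlers which paths to skip. A minimal illustration (with hypothetical rules and URLs), checked using Python's standard-library parser:

    from urllib.robotparser import RobotFileParser

    # Hypothetical rules in the robots.txt format the standard defines.
    rules = """User-agent: *
    Disallow: /private/
    """

    rp = RobotFileParser()
    rp.parse(rules.splitlines())
    print(rp.can_fetch("ExampleBot", "http://example.com/private/a.html"))  # False
    print(rp.can_fetch("ExampleBot", "http://example.com/index.html"))      # True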

Meta element

Meta elements are tags used in HTML and XHTML documents to provide structured metadata about a Web page.

They are part of a web page's head section. Multiple meta elements with different attributes can be used on the same page. Meta elements can be used to specify the page description, keywords and any other metadata not provided through the other head elements and attributes.

The meta element has two uses: either to emulate the use of an HTTP response header field, or to embed additional metadata within the HTML document.

With HTML up to and including HTML 4.01 and XHTML, there were four valid attributes: content, http-equiv, name and scheme. Under HTML5 there are now five valid attributes, charset having been added. http-equiv is used to emulate an HTTP header, and name to embed metadata. The value, in either case, is contained in the content attribute, which is the only required attribute unless charset is given. charset indicates the character set of the document, and is available in HTML5.

Such elements must be placed as tags in the head section of an HTML or XHTML document.
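
For illustration (a minimal sketch; the sample document is hypothetical, not tied to any real page), the attributes can be read with Python's standard-library HTML parser:

    from html.parser import HTMLParser

    # A hypothetical document head using the charset, name and http-equiv forms.
    doc = """<html><head>
    <meta charset="utf-8">
    <meta name="description" content="An early web search engine.">
    <meta name="keywords" content="search, index, web">
    <meta http-equiv="refresh" content="30">
    </head><body></body></html>"""

    class MetaCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.metas = []  # one attribute dict per meta element

        def handle_starttag(self, tag, attrs):
            if tag == "meta":
                self.metas.append(dict(attrs))

    collector = MetaCollector()
    collector.feed(doc)
    for meta in collector.metas:
        print(meta)
    # {'charset': 'utf-8'}
    # {'name': 'description', 'content': 'An early web search engine.'}
    # ...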

In the context of search optimization, the two element types most commonly discussed are:

Title tags

Meta description

Nexor

Nexor Limited is a privately held company based in Nottingham, providing products and services to safeguard government, defence and critical national infrastructure computer systems. It was originally known as X-Tel Services Limited.

Search engine indexing

Search engine indexing collects, parses, and stores data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. An alternate name for the process in the context of search engines designed to find web pages on the Internet is web indexing.

Popular engines focus on the full-text indexing of online, natural language documents. Media types such as video, audio, and graphics are also searchable.

Meta search engines reuse the indices of other services and do not store a local index, whereas cache-based search engines permanently store the index along with the corpus. Unlike full-text indices, partial-text services restrict the depth indexed to reduce index size. Larger services typically perform indexing at a predetermined time interval due to the required time and processing costs, while agent-based search engines index in real time.
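
As a minimal illustration of full-text indexing (a toy sketch, not any particular engine's design; the two documents are made up), an inverted index maps each term to the set of documents containing it:

    from collections import defaultdict

    docs = {
        "doc1": "aliweb was an early web search engine",
        "doc2": "webcrawler provided full text search",
    }

    # Build the inverted index: term -> set of document ids.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)

    def lookup(query):
        """Return the documents containing every term in the query."""
        postings = [index[term] for term in query.lower().split()]
        return set.intersection(*postings) if postings else set()

    print(lookup("search engine"))  # {'doc1'}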

Search engine optimization

Search engine optimization (SEO) is the process of increasing the quality and quantity of website traffic by increasing the visibility of a website or a web page to users of a web search engine.

SEO refers to the improvement of unpaid results (known as "natural" or "organic" results), and excludes the purchase of paid placement.

SEO may target different kinds of search, including image search, video search, academic search, news search, and industry-specific vertical search engines.

Optimizing a website may involve editing its content, adding content, and modifying HTML and associated coding, both to increase its relevance to specific keywords and to remove barriers to the indexing activities of search engines. Promoting a site to increase the number of backlinks, or inbound links, is another SEO tactic. By May 2015, mobile search had surpassed desktop search.

As an Internet marketing strategy, SEO considers how search engines work, the computer-programmed algorithms that dictate search engine behavior, what people search for, the actual search terms or keywords typed into search engines, and which search engines are preferred by their targeted audience. SEO is performed because a website will receive more visitors from a search engine the higher it ranks in the search engine results page (SERP). These visitors can then be converted into customers.

SEO differs from local search engine optimization in that the latter focuses on optimizing a business's online presence so that its web pages are displayed by search engines when a user enters a local search for its products or services. The former is more focused on national or international searches.

Search engine technology

A search engine is an information retrieval software program that discovers, crawls, transforms, and stores information for retrieval and presentation in response to user queries. A search engine normally consists of four components: a search interface, a crawler (also known as a spider or bot), an indexer, and a database. The crawler traverses a document collection, deconstructs document text, and assigns surrogates for storage in the search engine index. Online search engines also store images, link data, and metadata for the document.
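
As a sketch of the crawler component (illustrative only; the page markup and URLs below are hypothetical), extracting the links a crawler would follow from fetched HTML can be done with Python's standard library:

    from html.parser import HTMLParser
    from urllib.parse import urljoin

    class LinkExtractor(HTMLParser):
        """Collect the absolute URLs of anchors a crawler would enqueue."""
        def __init__(self, base_url):
            super().__init__()
            self.base_url = base_url
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                href = dict(attrs).get("href")
                if href:
                    self.links.append(urljoin(self.base_url, href))

    page = '<html><body><a href="/about.html">About</a></body></html>'
    extractor = LinkExtractor("http://example.com/")
    extractor.feed(page)
    print(extractor.links)  # ['http://example.com/about.html']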

Timeline of web search engines

This page provides a full timeline of web search engines, starting from the Archie search engine in 1990. It is complementary to the history of web search engines page that provides more qualitative detail on the history.

W3Catalog

W3 Catalog was a very early web search engine, first released on September 2, 1993 by developer Oscar Nierstrasz at the University of Geneva.

The engine was initially named jughead, but was later renamed. Unlike later search engines, which attempted to index the web by crawling the accessible content of websites, W3 Catalog exploited the fact that many high-quality, manually maintained lists of web resources were already available. It simply mirrored these pages, reformatted the contents into individual entries, and provided a Perl-based front end to enable dynamic querying.

At the time, CGI did not yet exist, so W3 Catalog was implemented in Perl as an extension to Tony Sanders' Plexus web server.

W3 Catalog was retired on December 8, 1996, though the engine was re-activated at some later, unknown date.

WebCrawler

WebCrawler is a web search engine, and is the oldest surviving search engine on the web today. For many years, it operated as a metasearch engine. WebCrawler was the first web search engine to provide full text search.

Web mining

Web mining is the application of data mining techniques to discover patterns from the World Wide Web. As the name suggests, this is information gathered by mining the web. It makes use of automated tools to reveal and extract data from servers and web documents, and it permits organizations to access both structured and unstructured data from browser activities, server logs, website and link structure, page content and other sources.

The goal of Web structure mining is to generate a structural summary of a website and its pages. Technically, Web content mining mainly focuses on the structure within a document, while Web structure mining tries to discover the link structure of the hyperlinks at the inter-document level. Based on the topology of the hyperlinks, Web structure mining categorizes web pages and generates information such as the similarity and relationships between different websites.
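
One simple, classical measure of that similarity is co-citation: two pages are considered related when many of the same pages link to both. A toy sketch (with made-up pages, not any specific engine's algorithm):

    # Outgoing links of some hypothetical pages.
    links = {
        "a.html": {"x.html", "y.html"},
        "b.html": {"x.html", "y.html", "z.html"},
        "c.html": {"z.html"},
    }

    # Invert the graph: for each target, the set of pages citing it.
    cited_by = {}
    for source, targets in links.items():
        for target in targets:
            cited_by.setdefault(target, set()).add(source)

    def cocitation(p, q):
        """Number of pages linking to both p and q."""
        return len(cited_by.get(p, set()) & cited_by.get(q, set()))

    print(cocitation("x.html", "y.html"))  # 2 (both cited by a.html and b.html)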

Web structure mining can also take another direction: discovering the structure of the Web document itself. This type of structure mining can be used to reveal the structure (schema) of Web pages, which is useful for navigation and makes it possible to compare and integrate Web page schemas. It also facilitates introducing database techniques for accessing information in Web pages by providing a reference schema.

Web search engine

A web search engine or Internet search engine is a software system that is designed to carry out web search (Internet search), which means to search the World Wide Web in a systematic way for particular information specified in a web search query. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of web pages, images, videos, infographics, articles, research papers and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler.

Internet content that is not capable of being searched by a web search engine is generally described as the deep web.
