Search engines: What are they?
A search engine is one of the most important tools for finding information on the web. Search engines are giant automated cataloguing and retrieval systems. They typically hold large databases of web pages and other information found on the net. When a user submits a query, these databases are scanned and the matching results are displayed.
The utility of search engines in web search lies in the fact that they are repositories of large amounts of information that can be searched very conveniently using keywords. Natural language searching and the many advanced features now available make search engines very attractive. It is no wonder, then, that 78% of users start their search for any information with a search engine.
Before we go any further, let's clarify the difference between a search engine and a web directory. Search engines, as we have learnt, are automated, software-based systems that aid web search. Directories are generally human-built indexes in which web sites are visited and catalogued by people. In a directory, web sites are placed in categories and subcategories. These categories, or taxonomies, are specific to each directory.
Users can look for relevant sites in a directory either by drilling down through the categories to the subject they are interested in or by searching with keywords. The most famous commercial directory is Yahoo. There are hundreds of directories available, and we will discuss them later. A point to note here is that most search engines (SEs) have now started to mix their web results with some kind of directory results.
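The two ways of using a directory described above, browsing by category and searching by keyword, can be illustrated with a small sketch. This is not how any particular directory is implemented; the categories, site names and descriptions below are all hypothetical and exist only to show the idea.

```python
# Hypothetical directory: category -> subcategory -> list of (site, description).
# Every name here is made up for illustration.
DIRECTORY = {
    "Health": {
        "Medicine": [("medsite.example", "medical reference articles")],
        "Fitness": [("fitsite.example", "exercise and training plans")],
    },
    "Computers": {
        "Internet": [("netsite.example", "guides to web search tools")],
    },
}


def browse(category, subcategory):
    """Return the sites filed under one category and subcategory."""
    entries = DIRECTORY.get(category, {}).get(subcategory, [])
    return [site for site, _description in entries]


def keyword_search(keyword):
    """Return (category path, site) pairs whose description contains the keyword."""
    keyword = keyword.lower()
    hits = []
    for category, subcategories in DIRECTORY.items():
        for subcategory, entries in subcategories.items():
            for site, description in entries:
                if keyword in description.lower():
                    hits.append((f"{category} > {subcategory}", site))
    return hits


medicine_sites = browse("Health", "Medicine")
search_hits = keyword_search("search")
```

Browsing depends entirely on where a human editor filed the site, while the keyword search only sees the short descriptions the editors wrote.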
There is also a category of search engines that are not crawler based. They accept money to list your web site. These paid-listing search engines will be dealt with in greater detail in a subsequent section.
For the moment, let's look at search engine internals a little more closely:
What goes into making a search engine
A bird's-eye view of how a search engine works helps the searcher weigh the relative merits of different search engines, and compare them with other search resources.
Search engines as we know them are, put simply, divided into three parts: the crawler, the index and the ranking software.
Let's take them one by one.
Search Engine Crawler: A traditional database is populated using conventional data entry methods. Not so on the web. Search engines send a program called a spider, or crawler, out onto the wild web. The spider visits a web site once it hears of it (through links on other sites pointing to it, a listing in some directory, or direct submission to the SE) and starts to crawl around. Following the links in the site, it digests as much information about the web site as it has been programmed to collect. Crawlers are thus the suppliers of the search engine's inventory (the index).
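The link-following behaviour described above can be sketched in a few lines. This is a minimal illustration, not a real crawler: instead of fetching pages over the network it walks a made-up in-memory "web" (the `TOY_WEB` dictionary, with hypothetical URLs), and the breadth-first visiting order is an assumption chosen for simplicity.

```python
from collections import deque

# Hypothetical miniature web: each URL maps to (page text, outgoing links).
# These URLs and pages are invented for illustration only.
TOY_WEB = {
    "http://a.example/": ("welcome to site a",
                          ["http://a.example/about", "http://b.example/"]),
    "http://a.example/about": ("about site a", ["http://a.example/"]),
    "http://b.example/": ("site b sells books", ["http://c.example/"]),
    "http://c.example/": ("site c reviews books", []),
    "http://d.example/": ("nobody links here", []),
}


def crawl(seed_urls, max_pages=100):
    """Breadth-first crawl from the seed URLs; returns {url: page text}."""
    fetched = {}
    queue = deque(seed_urls)
    while queue and len(fetched) < max_pages:
        url = queue.popleft()
        if url in fetched or url not in TOY_WEB:
            continue  # already visited, or the page does not exist
        text, links = TOY_WEB[url]
        fetched[url] = text
        queue.extend(links)  # follow the links found on this page
    return fetched


pages = crawl(["http://a.example/"])
```

Note that `http://d.example/` is never collected: no page links to it and it was not submitted as a seed, so the spider never "hears of it".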
Search Engine Index: The index, or catalogue, is the huge book that contains a copy of every web page that has been spidered. In other words, it is the search engine's database. Once a site has been crawled, it is added to the index. Each search engine refreshes and updates this database regularly, on a schedule of its own choosing.
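As a rough sketch of what such a database might look like, the example below builds an inverted index, a common structure that maps each word to the pages containing it. The text above only says the index stores copies of pages; the inverted index is an assumption added here to show how stored pages can be made quick to search. The page URLs and contents are hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


# Hypothetical output of a crawl: {url: page text}.
crawled_pages = {
    "http://b.example/": "site b sells books",
    "http://c.example/": "site c reviews books",
}

index = build_index(crawled_pages)
pages_about_books = index["books"]
```

Looking up a keyword is then a single dictionary access rather than a scan through every stored page.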
Search Engine Ranking Software: This is the secret sauce that largely explains why different crawler-based search engines return different results for the same query. In other words, this is the area where search engines differ most in the way they handle queries and display results. This part of the SE scans the database (index) for a specific user query and then ranks the matches found.
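To make the scan-then-rank step concrete, here is a deliberately naive sketch that scores each page by how many times the query words occur in it. This scoring rule is an assumption chosen for illustration; as noted above, each engine uses its own ranking method, and nothing here reflects any real engine's formula. The pages are hypothetical.

```python
def rank(query, pages):
    """Return URLs of matching pages, best match first.

    A page's score is the total number of occurrences of the query words.
    Ties are broken alphabetically by URL so the order is deterministic.
    """
    terms = query.lower().split()
    scores = {}
    for url, text in pages.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scores[url] = score
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical indexed pages: {url: page text}.
indexed_pages = {
    "http://x.example/": "books books and more books",
    "http://y.example/": "cheap books",
    "http://z.example/": "garden tools",
}

results = rank("books", indexed_pages)
```

Swapping in a different scoring rule changes the order of the results without touching the crawler or the index, which is why two engines with similar indexes can still answer the same query differently.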
This brings us to the variety found in search engines.
Types of search engines:
Search engines come in many flavors. From a searcher's point of view, it makes sense to know the major search engines in several categories, because an important part of searching well is knowing which tool to apply in which situation. For example, a medical search is likely to go better on medhunt.com than on, say, altavista.com.
Broadly, the major search engines can be divided into the following categories:
General search engines
Specialty search engines
Meta search engines
Paid search engines
Each of these types of search engine has its own utility and application. Let's take some examples to demonstrate.
The general search engine that is almost synonymous with the category is Google.
Among specialty search engines, there are engines dedicated to multimedia, medicine, blogs, law, industry, scientific papers…
Meta search engines are those that send a query to two or three other general search engines, gather the results, and then customize the format in which they are presented. Some prime examples are mamma.com, myway.com and metacrawler.com.
Paid search engines are a fast-growing category. Starting with overture.com, they too come in several varieties, ranging from engines that accept paid inclusion, such as Inktomi, to PPC (pay per click) engines such as Overture and Espotting.
Search utilities are specialized tools used for special applications. One example of such a utility is LexiBot, which scours the deep web for information.
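The way a meta search engine combines results can be sketched as follows. The two "engines" here are hypothetical stand-in functions returning fixed lists, not real services, and the merging rule (a page earns more points the higher each engine places it) is just one simple assumption among many possible ones.

```python
# Hypothetical underlying engines: each returns URLs, best result first.
def engine_a(query):
    return ["u1.example", "u2.example", "u3.example"]


def engine_b(query):
    return ["u2.example", "u4.example", "u1.example"]


def meta_search(query, engines, top_n=10):
    """Query every engine and merge the result lists into one ranking.

    Each engine awards points by position (first place gets the most),
    and the points a URL receives from all engines are added together.
    """
    scores = {}
    for engine in engines:
        results = engine(query)
        for position, url in enumerate(results):
            scores[url] = scores.get(url, 0) + (len(results) - position)
    ranked = sorted(scores, key=lambda url: (-scores[url], url))
    return ranked[:top_n]


merged = meta_search("books", [engine_a, engine_b])
```

A URL that both engines rank well, such as `u2.example` here, rises to the top of the merged list, while results found by only one engine sink lower.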