
3. How to Use a Search Engine Efficiently

Search sites are constantly being renovated, merged and tied into new collaborative agreements, so just keeping up with their changes would be a full-time job. If you want to use a specific search engine efficiently, you must therefore delve into the site's online help documentation and find out for yourself how the site works and how to construct effective search queries. The rest of this page outlines what kind of information you should look for in the help documentation.

A concise comparison of search engine features can be found in the Search Engine Features Chart.

3.1 Query Language Features

Unfortunately, almost every search engine uses a query language that differs slightly from the others. You should also be aware of the differences between Simple Search and Advanced (or Power) Search. For instance, when you type words into Alta Vista's Simple Search field, they are searched as if connected with an OR operator, and the results are ranked according to Alta Vista's algorithm, which you can't control. With the Advanced Search, however, you must type a Boolean operator between words and phrases, and you can control how the results are returned to you.

Default Operation

The default operation determines how the words you type into the search field are connected in the search query. If the default is AND, every document found must contain all the words in the search query. If the default is OR, each document must contain at least one of the words in the search query. Depending on the default operation, the number of documents found can vary considerably.
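The effect of the default operation can be illustrated with a small sketch. The three documents and the search() function below are invented for this example; a real search engine works on an index, not on raw text.

```python
# Toy document collection; the texts are invented for illustration.
DOCS = {
    1: "search engine review",
    2: "engine repair manual",
    3: "how to search the web",
}

def search(query, default="AND"):
    """Return the ids of documents matching the query under the given default operation."""
    terms = query.lower().split()
    # AND requires every term to be present, OR requires at least one.
    combine = all if default == "AND" else any
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if combine(term in text.lower().split() for term in terms)
    )

print(search("search engine", default="AND"))  # [1]
print(search("search engine", default="OR"))   # [1, 2, 3]
```

The same two words find one document with AND and all three with OR, which is why the result counts of different search sites are hard to compare.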

Boolean Searching

Boolean searching means that you can define how different search terms are connected with Boolean operators:

AND
match all words
OR
match any of the words
AND NOT
exclude documents that contain the following word
NEAR
match words that appear close to each other
( )
evaluate the expression inside the parentheses first

Every search site supports some of these Boolean operators, but not necessarily all of them. Some sites also require (or optionally allow) symbols as shorthand for certain operators. The most common symbols are shown below:

+
and
-
and not
&
and
|
or
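The Boolean operators described in this section correspond closely to set operations. The sketch below assumes a toy inverted index (a mapping from each word to the set of documents containing it); the index contents and the docs() helper are invented for illustration and do not describe any particular search site.

```python
# Toy inverted index: word -> set of document ids (invented data).
INDEX = {
    "search": {1, 3, 4},
    "engine": {1, 2, 4},
    "review": {1},
    "repair": {2},
}

def docs(word):
    """Return the set of documents containing the word (empty if unknown)."""
    return INDEX.get(word, set())

# search AND engine: documents containing both words
both = docs("search") & docs("engine")

# search OR engine: documents containing at least one of the words
either = docs("search") | docs("engine")

# engine AND NOT repair: documents with "engine" but without "repair"
without = docs("engine") - docs("repair")

# (search OR repair) AND engine: the parenthesised part is evaluated first
grouped = (docs("search") | docs("repair")) & docs("engine")

print(sorted(both), sorted(either), sorted(without), sorted(grouped))
```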

Proximity Searching

Proximity searching means that you can search for terms which are close to each other. The most common form is phrase search, which all major search engines offer: you type some words inside quotation marks, for instance "search engine review", and every document returned must contain this exact phrase. Alta Vista seems to be the only search site that offers the Boolean operator NEAR in its Advanced Search; it finds words which are not necessarily adjacent but within a few words of each other.
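The difference between phrase search and NEAR can be sketched as follows. Both functions are hypothetical helpers written for this page, and the max_distance default is an assumption; the actual window size is defined by the search site.

```python
def positions(words, term):
    """Return every index at which term occurs in the word list."""
    return [i for i, word in enumerate(words) if word == term]

def phrase_match(text, phrase):
    """True if the words of phrase occur in text consecutively and in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def near_match(text, a, b, max_distance=10):
    """True if a and b occur within max_distance words of each other, in any order."""
    words = text.lower().split()
    return any(
        abs(i - j) <= max_distance
        for i in positions(words, a.lower())
        for j in positions(words, b.lower())
    )

text = "a review of every major search engine"
print(phrase_match(text, "search engine"))         # True
print(phrase_match(text, "search engine review"))  # False
print(near_match(text, "review", "engine"))        # True
```

The second call fails even though all three words are present, because a phrase search also requires them to be adjacent and in the given order.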

Truncation

Truncation means that you can use a wildcard symbol to search for all words that share the same stem. For example, to search for optics, optical and opto-electronics you could use the search term opt*, where * is the wildcard symbol.
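A minimal sketch of truncation, using the shell-style wildcard matching in Python's standard fnmatch module as a stand-in for a search site's own wildcard handling. The vocabulary list and the expand() helper are invented for this example.

```python
from fnmatch import fnmatchcase

# Invented vocabulary standing in for the words a search site has indexed.
WORDS = ["optics", "optical", "opto-electronics", "option", "topology"]

def expand(pattern, vocabulary):
    """Return the vocabulary words matched by a wildcard pattern such as 'opt*'."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]

print(expand("opt*", WORDS))
# ['optics', 'optical', 'opto-electronics', 'option']
```

Note that "option" is matched as well: a short stem can pull in unrelated words, so it is worth making the stem as long as the search allows.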

Case Sensitivity

Case sensitivity means that words written in uppercase letters are interpreted differently from words written in lowercase letters. A search site that treats queries as case sensitive would give different results for the search terms bill and Bill, whereas a case-insensitive site would give exactly the same results for both terms.
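The bill/Bill example can be sketched like this. The three documents and the find() function are invented for illustration; how a real site treats mixed-case queries is described in its help pages.

```python
# Invented example documents.
DOCS = ["Bill Gates gave a talk", "please pay the bill", "BILL OF RIGHTS"]

def find(term, docs, case_sensitive):
    """Return the documents that contain term as a whole word."""
    if case_sensitive:
        return [d for d in docs if term in d.split()]
    # Case-insensitive: compare lowercased forms of query and document.
    return [d for d in docs if term.lower() in d.lower().split()]

print(find("Bill", DOCS, case_sensitive=True))   # ['Bill Gates gave a talk']
print(find("bill", DOCS, case_sensitive=True))   # ['please pay the bill']
print(find("bill", DOCS, case_sensitive=False))  # all three documents
```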

3.2 Limited Searches

Some search engines let you narrow your search with options. You can, for example, search only specific parts of a web page, such as the page title. This kind of searching is called field searching. Some sites also let you restrict your searches to a specific language, domain or date.

Field Searching

For example, Alta Vista lets you search the following specific fields of an HTML document. The meaning of each field is explained briefly, in most cases with an example.

anchor:<text>
Pages that contain the specified text inside a hyperlink. Example: "anchor:dancing" for pages with dancing in a link.
applet:<name>
Pages that contain the specified Java applet. Example: "applet:poker" for pages with a poker applet.
domain:<domainname>
Pages within the specified domain. Example: "domain:com" for pages with .com addresses.
host:<hostname>
Pages on the specified computer. Example: "host:www.bobdylan.com" for pages on the computer www.bobdylan.com.
image:<filename>
Pages that contain images with the specified filename. Example: "image:paris" for pages with a picture with "paris" in the name.
like:<URL>
Pages that are similar in topic to the page with the specified URL.
link:<URL>
Pages with a link to a page with the specified URL. Example: "link:www.whitehouse.gov" for pages that link to the White House homepage.
title:<text>
Pages that contain the specified text in the page title. Example: "title:movies" for pages with the word movies in the title.
url:<text>
Pages that contain the specified text within the URL. Example: "url:raging" for pages with the word raging in the URL.
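How a field query of the form field:value restricts matching can be sketched as follows. This is only an illustration of the idea, not Alta Vista's implementation: the page records, the matches() function and the choice of five fields are assumptions made for this example.

```python
from urllib.parse import urlparse

# Hypothetical page records; a real engine would extract these from the HTML.
PAGES = [
    {"url": "http://www.example.com/movies/index.html",
     "title": "Classic movies",
     "links": ["http://www.whitehouse.gov/"]},
    {"url": "http://www.example.org/dance.html",
     "title": "Dancing lessons",
     "links": []},
]

def matches(page, query):
    """Check one page against a single 'field:value' query."""
    field, _, value = query.partition(":")
    value = value.lower()
    host = urlparse(page["url"]).hostname or ""
    if field == "title":
        return value in page["title"].lower()
    if field == "url":
        return value in page["url"].lower()
    if field == "host":
        return host == value
    if field == "domain":
        return host.endswith("." + value)
    if field == "link":
        return any(value in link.lower() for link in page["links"])
    raise ValueError(f"unsupported field: {field}")

print([p["url"] for p in PAGES if matches(p, "domain:com")])
# ['http://www.example.com/movies/index.html']
```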

3.3 Stop Words

Stop words are common words such as "a", "the" and "and". When a search site uses stop words, the words on the site's "stop word list" are left out of the index when a document is indexed.
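Indexing with a stop word list can be sketched as below. The stop word set and the build_index() function are invented for this example; every site maintains its own list.

```python
# A small, invented stop word list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to"}

def build_index(docs):
    """Build an inverted index (word -> set of document ids), skipping stop words."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word in STOP_WORDS:
                continue  # stop words never enter the index
            index.setdefault(word, set()).add(doc_id)
    return index

index = build_index({1: "the history of the web", 2: "a web search"})
print(sorted(index))  # ['history', 'search', 'web']
```

Because "the", "of" and "a" never reach the index, a query consisting only of such words has nothing to match against.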

3.4 Sorting the Results

Typically, Internet search engines sort the results by "relevance", as determined by their proprietary relevance ranking algorithms. Some sites also let you arrange the results by date, alphabetically by title, or by root URL or host name.
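The sort orders mentioned above can be sketched on a list of result records. The records, the score values and the sort_results() function are invented for illustration; the relevance score in particular stands in for a ranking that real sites compute with algorithms they do not publish.

```python
from datetime import date
from urllib.parse import urlparse

# Hypothetical result records; titles, URLs, dates and scores are invented.
RESULTS = [
    {"title": "Web Search Tips", "url": "http://www.example.org/tips.html",
     "date": date(2000, 11, 2), "score": 0.71},
    {"title": "Boolean Queries", "url": "http://docs.example.com/boolean.html",
     "date": date(2000, 12, 1), "score": 0.64},
    {"title": "Alta Vista Guide", "url": "http://www.example.com/av.html",
     "date": date(1999, 5, 17), "score": 0.93},
]

def sort_results(results, order="relevance"):
    """Return the results sorted by relevance, date, title or host name."""
    keys = {
        "relevance": lambda r: -r["score"],           # highest score first
        "date": lambda r: -r["date"].toordinal(),     # newest first
        "title": lambda r: r["title"].lower(),        # alphabetical
        "host": lambda r: urlparse(r["url"]).hostname,
    }
    return sorted(results, key=keys[order])

for order in ("relevance", "date", "title", "host"):
    print(order, [r["title"] for r in sort_results(RESULTS, order)])
```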



This page was made as an assignment for the course Teletekniikan perusteet (Fundamentals of Telecommunications).
This page was last updated on 08.12.2000 at 23:25.
URL: http://www.netlab.tkk.fi/opetus/s38118/s00/tyot/28/use.shtml