
Words and Phrases


Here's how the PRS Document Indexer manages search text:

 

Capitalization

The Document Indexer doesn't distinguish between uppercase and lowercase letters. A search for HoLiDay returns every document that contains holiday, Holiday, or any other capitalization of the word.
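The case-insensitive comparison can be pictured with a minimal Python sketch. The function name `same_word` is hypothetical, and folding case on both sides is an assumption about how the Indexer achieves this behavior, not a description of its internals.

```python
def same_word(query: str, indexed_word: str) -> bool:
    """Compare two words while ignoring capitalization.

    Assumption: folding both sides to a common case approximates the
    Indexer's behavior; the real implementation is not documented here.
    """
    return query.casefold() == indexed_word.casefold()


if __name__ == "__main__":
    for word in ["holiday", "Holiday", "HOLIDAY", "holidays"]:
        print(word, same_word("HoLiDay", word))
```

Only the last word in the loop differs from the query in more than capitalization, so it is the only one that does not match.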

 

Words and Punctuation

The Indexer treats every document as a sequence of terms. A term in this context is any string of letters and digits delimited by punctuation, other non-alphanumeric characters, or white space (spaces, tabs, ends of lines).

 

To be a word, a string does not have to be spelled correctly or be included in any dictionary. All that is required is that someone typed it as a single word in a document. Thus, the following are words if they appear delimited in a document: 300ZX, 602e21, WWW, HTTP.

 

In some common constructs, non-alphanumeric characters are included in the term. The following examples are treated as single terms:

 

prshq.com

support@prshq.com

U.S.A

AT&T

25.4

 

Leading and trailing punctuation is always stripped, so C++ and .NET are stored as c and net.
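The term rules described above can be approximated with a short Python sketch. It is an illustration only: the name `index_terms` is hypothetical, and the set of characters allowed inside a term (period, at sign, ampersand) is an assumption inferred from the listed examples, since the full rule set is not given here. The sketch also handles only ASCII letters and digits.

```python
import re

# Assumption: only '.', '@' and '&' may join runs of letters and digits
# into a single term. This is inferred from the examples, not documented.
TERM_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[.@&][A-Za-z0-9]+)*")


def index_terms(text: str) -> list[str]:
    """Return the lowercased terms found in text, in document order.

    Because every match must start and end with a letter or digit,
    leading and trailing punctuation is dropped automatically.
    """
    return [match.group(0).lower() for match in TERM_PATTERN.finditer(text)]


if __name__ == "__main__":
    samples = ["prshq.com", "support@prshq.com", "U.S.A", "AT&T", "25.4",
               "C++", ".NET", "300ZX, 602e21, WWW, HTTP."]
    for sample in samples:
        print(repr(sample), "->", index_terms(sample))
```

Under these assumptions, C++ yields the single term c and .NET yields net, while each of the five construct examples stays a single term.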

 

Phrases

A phrase is a string of words that are contiguous in a document, although they may be separated by any amount of white space or punctuation. A phrase does not have to make sense grammatically; it just has to occur in a document as a contiguous sequence of words. For example:

 

President of the U.S.A. (4-word phrase)

http://www.election.digital.com (2-word phrase)
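Phrase matching can be sketched in Python as a search for one contiguous run of terms inside another. The names `terms` and `contains_phrase` are hypothetical, and the term pattern rests on the same assumption as the term rules (only period, at sign, and ampersand may sit inside a term); the Indexer's actual matching logic is not documented here.

```python
import re

# Assumption: same illustrative term pattern as for single terms.
TERM_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[.@&][A-Za-z0-9]+)*")


def terms(text: str) -> list[str]:
    """Split text into lowercased terms, ignoring white space and punctuation."""
    return [match.group(0).lower() for match in TERM_PATTERN.finditer(text)]


def contains_phrase(document: str, phrase: str) -> bool:
    """Return True if the phrase's terms occur contiguously in the document."""
    doc_terms = terms(document)
    wanted = terms(phrase)
    if not wanted:
        return False
    width = len(wanted)
    # Slide a window of the phrase's length across the document's terms.
    return any(doc_terms[i:i + width] == wanted
               for i in range(len(doc_terms) - width + 1))


if __name__ == "__main__":
    print(len(terms("President of the U.S.A.")))
    print(len(terms("http://www.election.digital.com")))
    print(contains_phrase("The President,   of\nthe U.S.A. spoke.",
                          "president of the U.S.A"))
```

With this pattern the two examples split into four and two terms respectively, matching the counts given above, and extra white space or punctuation between the words does not prevent a match.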