Differences between indexed and non-indexed searches

Surround SCM supports indexed and non-indexed searches for text in files. See Searching for text in files. Indexed searches return results more quickly and include binary files, but some searches, such as those for a single common word or for punctuation only, return results only in a non-indexed search.

The following information can help you understand more about indexed versus non-indexed searches.

Non-indexed searches

Non-indexed searches look for matches only in text files and use a line-by-line search. Binary files are not searched. When you search, Surround SCM opens each text file and scans it for the exact phrase you entered as the search term. These searches can be slow if the branch or repository you are searching contains thousands of files.
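
The following Python sketch illustrates the kind of line-by-line scan described above. It is not Surround SCM code. The function names, the NUL-byte check used to detect binary files, and the case-sensitive substring match are all assumptions made for illustration.

```python
from pathlib import Path


def looks_binary(path: Path, probe_size: int = 8192) -> bool:
    """Assumed heuristic: a file is binary if its first bytes contain a NUL byte."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(probe_size)


def non_indexed_search(root: Path, phrase: str) -> list[tuple[str, int]]:
    """Return (relative path, line number) pairs for lines containing the exact phrase."""
    matches = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if looks_binary(path):
            continue  # binary files are not searched
        # Every text file is opened and read, which is why large trees are slow.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if phrase in line:  # assumed: case-sensitive exact match
                    matches.append((path.relative_to(root).as_posix(), line_number))
    return matches
```

Because every file under the root is opened and read, the cost grows with the number and size of files, which is consistent with the slowdown noted for large branches and repositories.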

Indexed searches

Indexed searches look for matches in text and binary files. To perform indexed searches, the indexing server must be running and indexing must be turned on for each branch you want to search. See Indexing branches for optimized searches.

Indexed searches have two steps:

1. The indexing server performs a search engine-style search. The server looks for the individual words you entered as the search term to narrow the list of potential file matches and assigns a relevance value to each match.

2. Each text file included as a result from step 1 is opened and searched for the exact phrase you entered as the search term. This is a line-by-line search similar to a non-indexed search except the search is performed on a smaller subset of files, which makes the operation much faster.

The indexed search results contain binary files that pass step 1 and text files that pass step 2. Binary files in the results contain the words from the search term in close proximity, but may not contain an exact phrase match.
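
The two steps can be sketched in Python as follows. This is a simplified illustration, not the indexing server's implementation. The data layout (a dictionary of file name to extracted text and a binary flag), the function names, and the rule that a file must contain every word to pass step 1 are assumptions. Relevance values, proximity checks, and common-word handling are omitted.

```python
import re
from collections import defaultdict


def build_index(files):
    """Map each lowercase word to the names of the files that contain it.

    files: dict of name -> (extracted text, is_binary). Hypothetical layout.
    """
    index = defaultdict(set)
    for name, (text, _is_binary) in files.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def indexed_search(files, index, phrase):
    """Return the names of files that match the phrase, in sorted order."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return []
    # Step 1: use the index to narrow the candidates to files that
    # contain every individual word (assumed matching rule).
    candidates = set.intersection(*(index.get(word, set()) for word in words))
    results = []
    for name in sorted(candidates):
        text, is_binary = files[name]
        if is_binary:
            results.append(name)  # binary files only need to pass step 1
        elif any(phrase in line for line in text.splitlines()):
            results.append(name)  # Step 2: exact phrase found on a line
    return results
```

For example, with the phrase 'installer build', a text file containing only 'build the installer' passes step 1 but is dropped in step 2, while a binary file with the same words stays in the results.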

If indexing is turned on for a branch, you can still perform a non-indexed search. Select Do not use index when performing a search.

Comparison

| Support for: | Indexed search | Non-indexed search | More information |
| --- | --- | --- | --- |
| Binary files in search results | Yes | No | File types include Adobe PDF (.pdf), Microsoft Excel (.xls, .xlsx), Microsoft Word (.doc, .docx), Open Document Format (.odf), and Rich Text Format (.rtf). |
| Common words, such as 'for' and 'if', included in the search term | Only in a phrase | Yes | Common words, such as 'for', 'if', 'in', and 'or', are not indexed. Searching for 'the' finds results in a non-indexed search, but not in an indexed search. Searching for 'the executable' may find results in both types of searches. |
| Punctuation-only search term | Only when using wildcards | Yes | Punctuation is not indexed. Searching for '&&' does not return any results in an indexed search unless you use wildcards. |
| Relevance value to indicate how closely results match the search term | Yes | No | |
| Matching text beyond the first 100,000 lines in a file or first 1,000 characters in a line | Yes. File is included in results with a relevance value, but specific line numbers are not in results. | No | When searching text files, only the first 100,000 lines in each file and 1,000 characters in each line are searched. These limits help prevent Surround SCM from becoming unresponsive when searching large files. |
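
The line and character limits in the last row can be illustrated with a short Python sketch. The function and parameter names are hypothetical. How Surround SCM treats a phrase that straddles the 1,000-character boundary is not documented here, so the sketch simply ignores text past the limit.

```python
MAX_LINES = 100_000      # lines searched per file
MAX_LINE_CHARS = 1_000   # characters searched per line


def search_within_limits(lines, phrase, max_lines=MAX_LINES, max_chars=MAX_LINE_CHARS):
    """Return 1-based numbers of lines where the phrase occurs inside the searched region."""
    hits = []
    for line_number, line in enumerate(lines, start=1):
        if line_number > max_lines:
            break  # lines past the limit are never read
        if phrase in line[:max_chars]:  # text past the character limit is ignored
            hits.append(line_number)
    return hits
```

Stopping at a fixed number of lines and characters bounds the work done per file, which matches the stated goal of keeping searches responsive on large files.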

Common words

The following common words are not indexed.

  • a, an, and, are, as, at
  • be, but, by
  • for
  • if, in, into, is, it
  • no, not
  • of, on, or
  • such
  • that, the, their, then, there, these, they, this, to
  • was, will, with
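
As an illustration of how this list affects indexed searches, the following Python sketch removes common words and punctuation from a search term to show which words an index could still look up. The word list is copied from above. The function name and the tokenizing rule (lowercase runs of letters, digits, and underscores) are assumptions, not the indexing server's actual behavior.

```python
import re

# Common words that are not indexed, copied from the list above.
COMMON_WORDS = frozenset(
    """a an and are as at be but by for if in into is it no not of on or
    such that the their then there these they this to was will with""".split()
)


def indexable_terms(search_term):
    """Return the words in the search term that an index could look up."""
    # Punctuation is dropped because only word characters are kept.
    words = re.findall(r"[A-Za-z0-9_]+", search_term.lower())
    return [word for word in words if word not in COMMON_WORDS]
```

Under these assumptions, 'the' and '&&' leave no words to look up, while 'the executable' still leaves 'executable', which agrees with the examples in the comparison.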