You can mark parts of a document's content so that Funnelback does not index them or make them searchable. This is useful for excluding common navigation elements, headers and footers. There are two ways to do this:
No index expression in the collection configuration
Enter a regular expression; content that matches it will not be indexed. Matching content is also ignored when a web crawl decides whether two files are duplicates based on their extracted text.
Inserting special HTML comments into the document itself
For example:
... This section is indexed ...
<!--noindex-->
... This section is not indexed ...
<!--endnoindex-->
... This section is indexed ...
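To make the effect of both approaches concrete, the following Python sketch shows roughly what "not indexed" means for a piece of HTML. It is an illustration only, not Funnelback's implementation: the function names `strip_noindex_blocks` and `strip_matching`, the sample page, and the assumption that excluded text is simply dropped before indexing are all hypothetical.

```python
import re

# Matches everything from a <!--noindex--> marker up to the next
# <!--endnoindex--> marker, including newlines in between.
NOINDEX_BLOCK = re.compile(r"<!--noindex-->.*?<!--endnoindex-->", re.DOTALL)


def strip_noindex_blocks(html: str) -> str:
    """Return html with every noindex-marked region removed (hypothetical helper)."""
    return NOINDEX_BLOCK.sub("", html)


def strip_matching(text: str, expression: str) -> str:
    """Return text with every match of a no-index style expression removed.

    Assumption: the expression behaves like an ordinary regular expression
    applied to the document text.
    """
    return re.sub(expression, "", text, flags=re.DOTALL)


# Hypothetical sample page: a navigation bar wrapped in noindex comments.
page = (
    "<p>Welcome</p>"
    "<!--noindex--><nav>Home | About</nav><!--endnoindex-->"
    "<p>Main article</p>"
)
indexable = strip_noindex_blocks(page)
```

Here `indexable` keeps the two paragraphs and drops the navigation bar. `strip_matching` does the same job for an arbitrary expression, for instance one that matches a site-wide footer.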
Restricting results to specific directories
You can restrict search results to a specific directory in two ways:
Manually append "v:<path>" to your search query, for example:
search v:media
This restricts results to those that have the word "media" in the URL. Example result URLs for the query above:
If you would like to be more specific, for example so that results must exist below /media/, you can manually append the "scope" CGI parameter to the URL, for example:
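As a rough illustration of appending the parameter programmatically, the Python sketch below adds a "scope" parameter to an existing search URL. Only the parameter name "scope" comes from the text above. The helper name `add_scope`, the host `search.example.com`, the `query` parameter, and the form of the scope value are assumptions made for the example, so check them against your own installation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_scope(search_url: str, scope: str) -> str:
    """Return search_url with a scope parameter appended (hypothetical helper).

    Any scope parameter already present is replaced, so the function can be
    applied repeatedly without stacking duplicates.
    """
    parts = urlsplit(search_url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "scope"
    ]
    params.append(("scope", scope))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search URL and scope value, used for illustration only.
scoped_url = add_scope(
    "https://search.example.com/s/search.html?query=news",
    "www.example.com/media/",
)
```

`urlencode` percent-encodes the slashes in the scope value, which is the safe form for a query string. The existing parameters keep their original order.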
The reporting system can be configured to ignore a list of search terms, searches from certain IP addresses, or a combination of the two. This helps prevent unwanted spam searches from appearing in the Query Reports system. This is controlled from the reporting-blacklist.cfg file, accessible from the 'Administer' tab under 'Browse collection configuration files'. Example syntax: