
This section allows configuration of each field in the index. These fields are populated by data extracted from content that has been crawled and indexed by the system. Fields that are enabled by default in LucidWorks Search have been optimized for most types of content found in the enterprise today, and editing these settings should only be required in special circumstances.

Editing the List of Fields

LucidWorks Search provides default fields, with default settings, that are designed to index the supported document types and maximize their value to the system. However, this may not be the ideal configuration for your system. Some default LucidWorks fields have a special purpose and should only be removed or modified with care to avoid unexpected system behavior. These special fields are marked with a warning icon in the Fields screen and are discussed in the section on Special Fields.

The field list displays a summary of each field's settings; more settings can be seen and modified by clicking on the field name. See the Detailed Field Settings section below for more details on each possible setting.

The following settings are displayed in the main field list:

  • Type: shows the Field Type as defined in the LucidWorks schema.
  • Indexed: shows if the content in the field will be indexed by the system (and thus available for searching).
  • Index for Spell Checking: shows if the field will be used as a base for spell-checking user queries.
  • Stored: shows if the field will be stored by the system and could be used for display to users or for another purpose.
  • Facet: shows if the field will be used for faceting.
  • Include in Results: shows if the field will be used in display to users.
  • Highlight: shows if the field will be used in highlighting.

Note that dynamically generated fields (those created by dynamic field rules) are not displayed in the field list. To view those fields, use the Dynamic Fields screen, or add "include_dynamic=true" to the URL in your browser when looking at the Fields screen to see the defined fields and dynamic field rules together.

Take care when editing any field, as changes may have a deep impact on how content is stored and displayed to users. In some cases, content may need to be re-crawled; other changes may require a full system reindex before they take effect. Changing the following settings on any field requires a full system reindex:

  • Field Type
  • Indexed
  • Multi-Valued
  • Short Field Boost

Changing a field from Stored to not stored (or vice versa) will require re-crawling content and a system re-index.
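
The rules above can be captured in a small lookup, which may be useful when scripting field changes. This is a minimal sketch and not part of LucidWorks Search: the function name is hypothetical, and it assumes settings are identified by Fields API attribute names (field_type, indexed, multi_valued, short_field_boost, stored).

```python
# Settings whose change calls for a full system reindex, per the list above.
# The attribute-style names are an assumption made for this sketch.
FULL_REINDEX = {"field_type", "indexed", "multi_valued", "short_field_boost"}

# Settings whose change calls for re-crawling content and a reindex.
RECRAWL_AND_REINDEX = {"stored"}


def required_actions(changed_settings):
    """Return the follow-up actions needed for a collection of changed settings."""
    changed = set(changed_settings)
    actions = set()
    if changed & FULL_REINDEX:
        actions.add("reindex")
    if changed & RECRAWL_AND_REINDEX:
        actions.update({"recrawl", "reindex"})
    return actions
```

For example, required_actions(["facet"]) returns an empty set, because this sketch treats any setting not named in the rules above as needing no follow-up.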

New fields can be added using the New Field button at the top of the page.

Field names may contain only letters, digits, underscores and hyphens. Spaces and other punctuation symbols are not allowed, and field names are case sensitive.
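
The naming rule above can be checked before a field is created. The following is a minimal sketch; the function name is hypothetical, and restricting letters and digits to ASCII is an assumption, since the rule does not say whether non-ASCII letters are accepted.

```python
import re

# Letters, digits, underscores and hyphens only; at least one character.
# ASCII-only letters and digits are an assumption of this sketch.
_FIELD_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_field_name(name):
    """Return True if the whole name uses only the permitted characters."""
    return _FIELD_NAME.fullmatch(name) is not None
```

Because field names are case sensitive, "Title" and "title" both pass this check but would refer to two different fields.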

Detailed Field Settings

The following table lists each of the possible settings for a field and an explanation for each. Fields can be created with the Fields API; the attribute names are provided in the table below for your convenience. Some field configuration options are only available with the Fields API; those options are not listed below.

Parameter

Fields API Attribute Name

Description

Field Type

field_type

The field type setting controls how a field is analyzed. There are many options available, and more can be added by adding a new plugin to the schema.xml file. It is crucial to understand the underlying values for a field in order to correctly set its type. For full text fields, such as "title", "body", or "description", a text field type is generally the desired setting so individual words in the text are searchable. There are various text field types, most of which are language-specific. However, when a text field value is to be taken literally as-is (exact match only, or for faceting), the "string" type is likely the right choice.

There are also types for numeric data, including double, float, integer, and long (and variants of each suitable for sorting: sdouble, sfloat, sint, and slong). The date field accepts dates in the form "1995-12-31T23:59:59.999Z", with the fractional seconds optional, and trailing "Z" mandatory. Field types are defined in Solr's schema.xml file.

Indexed

indexed

An indexed field is searchable on the words (or exact value) as determined by the field type. Unindexed fields are useful to provide the search client with metadata for display. For example, URL may not be a valuable search term, but it is very valuable information to show users in their results list. For performance reasons, a best practice is to index as few fields as necessary to still give users a satisfactory search experience. If you change this setting, you must reindex all documents.

Facet

facet

Enables a field's terms (words for text types, or exact value for "string" type) to be returned to the search client. In the default LucidWorks search interface, faceted fields are displayed and made navigable to constrain the search results. A field must be indexed to be facetable. This setting can be changed without reindexing, as it is used at query time only.

Synonym Expansion

synonym_expansion

These settings are used only with the Lucid query parser for query-time handling of both synonyms and stop words. See Synonyms and Stop Words for more details.

Enable Stopword Handling

query_time_stopword_handling

This requires LucidWorks to apply the stop word list to queries that use this specific field. It does not enable stop words across the board, only for queries on this field (it may be most useful for 'body' fields, for example).

Search by Default

search_by_default

When enabled, every query in which the user has not named a specific field will search this field.

Use in "Find Similar"

use_in_find_similar

Controls whether this field is taken into consideration in find-similar/more-like-this computations. The field must be indexed for it to be used for find-similar. This setting can be changed without reindexing, as it is used at query time only.

Short Field Boost

short_field_boost

This relevancy boost compensates for text in short documents that have fewer opportunities for text matches and may otherwise rank lower in results than they should. Use 'moderate' for typical text fields such as the abstract or body of an article. Use 'high' for very short fields like title or keywords. Use 'none' for non-text fields. We strongly recommend that you follow changes to the short field boost with a full reindex.

Index Term Frequencies and Positions

Two parameters: omit_tf and omit_positions

This setting controls whether the number of times a term appears in a document (term frequency) and the position of terms relative to other terms in the document (position) will be indexed. This information is important for boosting and proximity searching. On text fields, term frequency and positions should generally be indexed, but the decision depends on how the field will be used for queries and/or display of results.

Three options are available:

  • none: no term frequency or position information will be stored in the index.
  • term frequencies: only the frequency of terms will be stored in the index.
  • term frequencies and positions: both frequency of terms and their positions will be stored.

Stored

stored

A field can be stored independently of indexing, and made available in the results sent to a search client. Reindexing is not necessary when changing the stored field flag, though fields in documents will remain as they were when they were originally indexed until they are reindexed.

Include in Results

include_in_results

Controls whether the stored field value is returned to the search client. A field must be stored to be included in results. This setting can be changed without reindexing, as it is used at query time only.

Highlight

highlight

Controls whether highlighted snippets of the stored field value are returned to the search client. A field must be stored to be highlighted. This setting can be changed without reindexing, as it is used at query time only.

Multi-Valued

multi_valued

Enable this if the document could have multiple values for a field, such as multiple categories or authors. We recommend that you reindex all documents after changing this setting.

Default Value

default_value

This sets a default value to be used if the field is empty in the document.

Copy This Field to Fields

copy_fields

This allows copying the values in this field to another field or fields for use in searching or results display. For example, by default the author field is copied to the author_display field, which is used to display author names in results lists.

Index for Spell Checking

index_for_spellcheck

This allows terms from this field to be used in creation of a spelling check index that will be created by default at the time of indexing. All fields selected for use in spell checking are combined into a single "spell" field for use in search suggestions.

Index for Auto-complete

index_for_autocomplete

This allows terms from this field to be used in creation of an auto-complete index that will be created by default at the time of indexing. All fields selected for use in auto-complete are combined into a single "autocomplete" field for use in search suggestions. If you change this setting, we recommend that you recreate the auto-complete index as described in Auto-Complete of User Queries.

Use for De-duplication

use_for_deduplication

This will use content indexed in this field for determining duplicated content and removing it from a user's result list.
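
Several settings in the table depend on others: Facet and Use in "Find Similar" require an indexed field, while Include in Results and Highlight require a stored field. The following is a minimal sketch of checking those dependencies before a field definition is submitted. The helper names are hypothetical, and the "name" key and the JSON serialization are assumptions about what a Fields API request might carry; consult the Fields API documentation for the actual request format.

```python
import json

# (setting, prerequisite) pairs taken from the descriptions in the table above.
PREREQUISITES = [
    ("facet", "indexed"),
    ("use_in_find_similar", "indexed"),
    ("include_in_results", "stored"),
    ("highlight", "stored"),
]


def check_field(settings):
    """Return a message for each setting enabled without its prerequisite."""
    problems = []
    for setting, prerequisite in PREREQUISITES:
        if settings.get(setting) and not settings.get(prerequisite):
            problems.append(f"{setting} requires {prerequisite}")
    return problems


def field_payload(name, **settings):
    """Serialize a field definition to JSON, refusing inconsistent settings.

    The "name" key is an assumption made for this sketch.
    """
    problems = check_field(settings)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps({"name": name, **settings}, sort_keys=True)
```

A call such as field_payload("category", field_type="string", indexed=True, facet=True) produces a JSON string, while requesting highlight=True without stored=True raises a ValueError naming the missing prerequisite.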

Field Configuration for Synonyms

Fields must be configured correctly for synonyms to work. If you expect synonyms to operate on a specific field, both "Search by Default" and "Enable Query Synonym Expansion" must be enabled for that field; otherwise, search results may not include all documents that contain the synonym terms. To achieve the broadest application of synonym matching, these settings are particularly important for the "text_all" field, which is configured this way by default. Unless you give a specific field in your query, LucidWorks will query for synonym terms only in those fields that are both enabled for default search and enabled for synonym expansion.
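
The rule for which fields receive synonym matching on unfielded queries can be illustrated with a short filter. This is a sketch only: the function name and the sample field settings are hypothetical, and it assumes each field's settings are available as a dictionary keyed by the Fields API attribute names search_by_default and synonym_expansion.

```python
def synonym_fields(fields):
    """Return names of fields enabled for both default search and synonym expansion."""
    return [
        name
        for name, settings in fields.items()
        if settings.get("search_by_default") and settings.get("synonym_expansion")
    ]


# Hypothetical settings, for illustration only.
example_fields = {
    "text_all": {"search_by_default": True, "synonym_expansion": True},
    "title": {"search_by_default": True, "synonym_expansion": False},
    "body": {"search_by_default": False, "synonym_expansion": True},
}
matching = synonym_fields(example_fields)
```

With these sample settings, only "text_all" qualifies, because it is the only field with both options enabled.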
