ContentExtractionSettings Class

Main document content extraction settings class.

Definition

Namespace: OpenDiscoverSDK.Interfaces.Settings
Assembly: OpenDiscoverSDK.Interfaces (in OpenDiscoverSDK.Interfaces.dll) Version: 2026.2.6.0 (2026.02.06)
C#
[DataContractAttribute]
public class ContentExtractionSettings
Inheritance
Object → ContentExtractionSettings
Derived

Remarks

An instance of this class is a required argument to the SDK API method ContentExtractorFactory.GetContentExtractor. It controls what type of content the IContentExtractor-derived extraction interfaces extract from documents.

Constructors

Properties

EmbeddedObjectExtraction Embedded document/attachment and embedded office media extraction setting.
EntityExtractionSettings Options for entity extraction in extracted text, metadata, and URLs.
ExtractionType Text and metadata extraction setting.
ExtractOfficeTrackedChanges If true, appends tracked-change information and text from office document formats that support tracked changes to the end of the document's extracted text; otherwise, tracked-change text is not appended.
Hashing Document hashing settings.
LanguageId Settings for language identification of extracted text.
LargeDocumentCritera Defines the "large" document criteria, in bytes, that determine which type of content extractor the content extractor factory returns for "large" unknown/unsupported formats and for "large" encoded text-based formats.
PdfDocument PDF document extraction settings.
TimeZoneAndEmail Settings for the document collection time zone, the related extracted DateTime metadata, and the DateTime display in extracted email text.
UnsupportedFiltering Settings for binary-to-text filtering of unsupported/unknown document file formats.
UseLargeDocumentUTF16Encoding Determines whether UTF-16 or UTF-8 encoding is used when writing 'large' (see LargeDocumentCritera) unknown/unsupported-format binary-to-text extracted text, or when re-encoding a 'large' encoded text file, to the provided Stream.
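
The interaction between LargeDocumentCritera and UseLargeDocumentUTF16Encoding can be illustrated with a small, self-contained sketch. It does not use the SDK: LargeDocumentSettingsSketch and LargeDocumentSketch are hypothetical stand-ins, the single long threshold, the "greater than or equal" comparison, and the bool type of the encoding flag are all assumptions, and the choice to write no byte order mark is made only to keep the example simple.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for illustration only; this is NOT the SDK's
// ContentExtractionSettings class.
public sealed class LargeDocumentSettingsSketch
{
    // Assumption: one size threshold in bytes (stands in for LargeDocumentCritera).
    public long LargeDocumentThresholdBytes { get; set; }

    // Mirrors the documented property name; assumed here to be a bool.
    public bool UseLargeDocumentUTF16Encoding { get; set; }
}

public static class LargeDocumentSketch
{
    // Assumed rule: a document is "large" when its size meets or exceeds the threshold.
    public static bool IsLarge(long sizeInBytes, LargeDocumentSettingsSketch settings)
        => sizeInBytes >= settings.LargeDocumentThresholdBytes;

    // Picks little-endian UTF-16 or UTF-8. Both are created without a byte order
    // mark; whether the SDK writes one is not stated in this page.
    public static Encoding SelectEncoding(LargeDocumentSettingsSketch settings)
        => settings.UseLargeDocumentUTF16Encoding
            ? (Encoding)new UnicodeEncoding(bigEndian: false, byteOrderMark: false)
            : new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

    // Writes text to a caller-provided stream in the selected encoding and
    // leaves the stream open for the caller.
    public static void WriteText(string text, Stream output, LargeDocumentSettingsSketch settings)
    {
        using (var writer = new StreamWriter(output, SelectEncoding(settings), 4096, leaveOpen: true))
        {
            writer.Write(text);
        }
    }
}
```

In this sketch, writing the three-character string "abc" puts six bytes on the stream when the flag is true (UTF-16) and three bytes when it is false (UTF-8), which is the storage trade-off the property controls for ASCII-heavy text.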

Methods

Equals Determines whether the specified object is equal to the current object.
(Inherited from Object)
GetHashCode Serves as the default hash function.
(Inherited from Object)
GetType Gets the Type of the current instance.
(Inherited from Object)
ToString Returns a string that represents the current object.
(Inherited from Object)

See Also