IDocumentContentExtractor Interface

Document content extractor interface.

Definition

Namespace: OpenDiscoverSDK.Interfaces.Extractors
Assembly: OpenDiscoverSDK.Interfaces (in OpenDiscoverSDK.Interfaces.dll) Version: 2026.2.6.0 (2026.02.06)
C#
public interface IDocumentContentExtractor : IContentExtractor, IDisposable
Implements
IContentExtractor, IDisposable

Remarks

This interface is used to extract content from documents (e.g., Microsoft Office, OpenDocument, PDF, XPS, and email) that are NOT container types such as archives (e.g., ZIP, 7z) or mailstores (e.g., PST, OST, MBOX).

Properties

ContentExtractorType The derived, actual content extractor interface type.
(Inherited from IContentExtractor)
Length Gets the document's length in bytes.
(Inherited from IContentExtractor)
SupportsChildrenExtraction If true, this content extractor supports attachment, embedded item, or container item extraction.
(Inherited from IContentExtractor)
SupportsContentHash Gets whether the specific document content extractor implementing this interface supports MD5ContentHash and SHA1ContentHash calculation.
SupportsDecryption If true, this content extractor supports decrypting password-protected documents.
(Inherited from IContentExtractor)
SupportsMetadataExtraction If true, this content extractor supports metadata extraction.
(Inherited from IContentExtractor)
SupportsTextExtraction If true, this content extractor supports text extraction.
(Inherited from IContentExtractor)
Tag Allows the user to associate an object with this content extractor.

Methods

Dispose Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
(Inherited from IDisposable)
ExtractContent Extracts the document's content.
OverrideContentExtractionSettings Allows for overriding the ContentExtractionSettings object used by an IContentExtractor instance that was returned by a call to OpenDiscoverSDK.ContentExtractorFactory.GetContentExtractor. See remarks for limitations.
(Inherited from IContentExtractor)
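
The capability properties and Dispose listed above suggest a common calling pattern: check the Supports* flags before requesting a feature, and wrap the extractor in a using block so it is released afterwards. The following is a minimal sketch of that pattern, not the SDK's own code. The two interfaces are local stand-ins that mirror only the member names shown on this page; their property types (bool, long, object) are assumptions, and FakeDocumentExtractor and CapabilitySketch are hypothetical helpers invented for illustration. ExtractContent is left out because its signature is not given here.

```csharp
using System;
using System.Collections.Generic;

// Stand-in declarations mirroring only the member names listed on this page.
// ASSUMPTION: the property types below are guesses; the real SDK may differ.
public interface IContentExtractor
{
    long Length { get; }
    bool SupportsChildrenExtraction { get; }
    bool SupportsDecryption { get; }
    bool SupportsMetadataExtraction { get; }
    bool SupportsTextExtraction { get; }
}

public interface IDocumentContentExtractor : IContentExtractor, IDisposable
{
    bool SupportsContentHash { get; }
    object Tag { get; set; }
}

// Hypothetical in-memory implementation so the sketch runs without the SDK.
public sealed class FakeDocumentExtractor : IDocumentContentExtractor
{
    public long Length => 2048;
    public bool SupportsChildrenExtraction => true;
    public bool SupportsDecryption => false;
    public bool SupportsMetadataExtraction => true;
    public bool SupportsTextExtraction => true;
    public bool SupportsContentHash => false;
    public object Tag { get; set; } = "untagged";

    // Records that Dispose was called; a real extractor would free resources.
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class CapabilitySketch
{
    // Returns the names of the features the extractor reports as supported.
    public static List<string> SupportedFeatures(IDocumentContentExtractor extractor)
    {
        var features = new List<string>();
        if (extractor.SupportsTextExtraction) features.Add("text");
        if (extractor.SupportsMetadataExtraction) features.Add("metadata");
        if (extractor.SupportsChildrenExtraction) features.Add("children");
        if (extractor.SupportsDecryption) features.Add("decryption");
        if (extractor.SupportsContentHash) features.Add("content hash");
        return features;
    }

    // Demonstrates the using pattern: Dispose runs when the block exits.
    public static void Run()
    {
        var fake = new FakeDocumentExtractor { Tag = "batch-42" };
        using (IDocumentContentExtractor extractor = fake)
        {
            Console.WriteLine(extractor.Tag + ": " + extractor.Length + " bytes, supports "
                + string.Join(", ", SupportedFeatures(extractor)));
        }
        Console.WriteLine("Disposed: " + fake.IsDisposed);
    }
}
```

Calling CapabilitySketch.Run() exercises the sketch end to end. With the real SDK, the extractor would presumably come from OpenDiscoverSDK.ContentExtractorFactory.GetContentExtractor rather than being constructed directly.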

Events

ContentExtractionHeartbeat Notification event that lets implementers of IContentExtractor know that content extraction is still in progress. See remarks.
(Inherited from IContentExtractor)

See Also