ILargeEncodedTextExtractor.ExtractContent Method

Extracts content from a "large" encoded text file and optionally writes the text of this file to the supplied stream as UTF-16 or UTF-8 (the Unicode encoding used depends on UseLargeDocumentUTF16Encoding).

Namespace: OpenDiscoverSDK.Interfaces.Extractors
Assembly: OpenDiscoverSDK.Interfaces (in OpenDiscoverSDK.Interfaces.dll) Version: 2025.4.6.0 (2025.4.6)

Syntax

DocumentContent ExtractContent(
	Stream textFileOutputStream
)

Parameters

textFileOutputStream Stream: The stream to which the text is written as UTF-16 or UTF-8. If not null, this stream should be a user-supplied FileStream object. If null, only the MD5BinaryHash and SHA1BinaryHash hashes of the "large" text file are calculated; sensitive item detection and language identification are not performed.

Return Value

DocumentContent
DocumentContent object.

Remarks

The only content that this extractor extracts is the MD5BinaryHash and SHA1BinaryHash hashes of the document. If the ExtractContent(Stream) argument textFileOutputStream is not null, this content extractor interface also writes the text of the file to the supplied stream as UTF-16 or UTF-8 (the Unicode encoding used depends on UseLargeDocumentUTF16Encoding).

This interface does not set the ExtractedText property. If textFileOutputStream is not null, it only writes the text to the supplied Stream as UTF-16 or UTF-8 (the Unicode encoding used depends on UseLargeDocumentUTF16Encoding).
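
The two calling modes described above (a FileStream for text output, or null for hashes only) could be sketched as follows. This is a minimal illustration, not SDK sample code: the DocumentContent and ILargeEncodedTextExtractor declarations are empty stand-ins that only mirror the signature shown under Syntax, so that the sketch compiles without the SDK. LargeTextExtraction, its two helpers, FakeExtractor, and Program are hypothetical names introduced here. How an extractor instance is obtained is not covered by this topic, so the helpers take one as a parameter.

```csharp
using System;
using System.IO;
using System.Text;

// Stand-ins (assumptions) so the sketch compiles on its own. The real types live in
// OpenDiscoverSDK.Interfaces and have more members than are shown here.
public class DocumentContent { }

public interface ILargeEncodedTextExtractor
{
    DocumentContent ExtractContent(Stream textFileOutputStream);
}

public static class LargeTextExtraction
{
    // Hypothetical helper: passes a FileStream so the extractor can write the text to outputPath.
    public static DocumentContent ExtractToFile(ILargeEncodedTextExtractor extractor, string outputPath)
    {
        if (extractor == null) throw new ArgumentNullException(nameof(extractor));

        // The parameter description recommends a user-supplied FileStream when the stream is not null.
        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
        {
            return extractor.ExtractContent(output);
        }
    }

    // Hypothetical helper: passes null, so no text is written to any stream.
    public static DocumentContent ExtractHashesOnly(ILargeEncodedTextExtractor extractor)
    {
        if (extractor == null) throw new ArgumentNullException(nameof(extractor));
        return extractor.ExtractContent(null);
    }
}

// Test double that exists only to make the sketch runnable. It writes a fixed UTF-8 string
// and does not compute hashes or reproduce any other SDK behavior.
public sealed class FakeExtractor : ILargeEncodedTextExtractor
{
    public DocumentContent ExtractContent(Stream textFileOutputStream)
    {
        if (textFileOutputStream != null)
        {
            byte[] bytes = Encoding.UTF8.GetBytes("sample text");
            textFileOutputStream.Write(bytes, 0, bytes.Length);
        }
        return new DocumentContent();
    }
}

public static class Program
{
    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        var extractor = new FakeExtractor();

        LargeTextExtraction.ExtractToFile(extractor, path);   // text is written to the file
        LargeTextExtraction.ExtractHashesOnly(extractor);     // nothing is written

        Console.WriteLine("Bytes written: " + new FileInfo(path).Length);
        File.Delete(path);
    }
}
```

Because ExtractedText is not set by this interface, a caller that needs the text would read it back from the output file, decoding it as UTF-16 or UTF-8 to match the UseLargeDocumentUTF16Encoding setting in effect.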

Reference

ILargeEncodedTextExtractor Interface

OpenDiscoverSDK.Interfaces.Extractors Namespace