DocumentContent Class

Represents extracted document content.
Inheritance Hierarchy
System.Object
  OpenDiscoverSDK.Interfaces.Content.DocumentContent

Namespace: OpenDiscoverSDK.Interfaces.Content
Assembly: OpenDiscoverSDK.Interfaces (in OpenDiscoverSDK.Interfaces.dll) Version: 2025.4.4.0 (2025.4.4)
Syntax
C#
[DataContractAttribute]
[KnownTypeAttribute(typeof(EmailDocumentContent))]
[KnownTypeAttribute(typeof(PdfDocumentContent))]
[KnownTypeAttribute(typeof(HtmlDocumentContent))]
[KnownTypeAttribute(typeof(ArchiveContent))]
[KnownTypeAttribute(typeof(MailStoreContent))]
[KnownTypeAttribute(typeof(DatabaseContent))]
[KnownTypeAttribute(typeof(BooleanProperty))]
[KnownTypeAttribute(typeof(DateTimeProperty))]
[KnownTypeAttribute(typeof(DoubleProperty))]
[KnownTypeAttribute(typeof(Int32Property))]
[KnownTypeAttribute(typeof(Int64Property))]
[KnownTypeAttribute(typeof(StringProperty))]
[KnownTypeAttribute(typeof(BooleanListProperty))]
[KnownTypeAttribute(typeof(DateTimeListProperty))]
[KnownTypeAttribute(typeof(DoubleListProperty))]
[KnownTypeAttribute(typeof(Int32ListProperty))]
[KnownTypeAttribute(typeof(Int64ListProperty))]
[KnownTypeAttribute(typeof(StringListProperty))]
public class DocumentContent

The DocumentContent type exposes the following members.

Constructors
Name | Description
DocumentContent() | Default constructor.
DocumentContent(IdResult) | Constructor that takes an IdResult.
Properties
Name | Description
Attributes | Document attributes. See DocumentAttributes for an enumeration of supported attributes.
ChildDocuments | Child documents (attachments and embedded items). See remarks for the special cases of archives (.7z, .zip, etc.), media images, and mail stores (.pst, .ost, .mbox, etc.).
CustomMetadata | Custom (user-defined) document metadata, stored as a dictionary that maps metadata field names to metadata field data.
EntityExtractionResult | Document entity item extraction result.
ErrorMessage | Gets or sets an error message associated with Result. This property is only set when Result is not Ok.
ErrorStackTrace | Error (exception) stack trace associated with ErrorMessage. This property is only set when Result is not Ok and an internal exception was caught.
ExtractedText | Extracted text. See remarks for limitations.
FileEntropy | Shannon entropy of the document's bytes.
FormatId | Document format identification result from prior file identification. This value was an input to the content extractor factory and is stored here for convenience.
HyperLinks | Document hyperlinks.
IsEmailType | If true, this document is an email document. Cast this DocumentContent object to EmailDocumentContent to get additional email-specific properties.
IsEncrypted | If true, the document is encrypted.
IsHtmlType | If true, this document is an HTML document. Cast this DocumentContent object to HtmlDocumentContent to get additional HTML-specific properties.
IsPdfType | If true, this document is a PDF document. Cast this DocumentContent object to PdfDocumentContent to get additional PDF-specific properties.
LanguageIdResults | Language identification results for the extracted text.
MD5BinaryHash | MD5 binary document hash (hash of all document bytes).
MD5ContentHash | MD5 content hash: a proprietary hash computed over only the content part of the document file format.
Metadata | Standard (non-user-defined) document metadata, stored as a dictionary that maps metadata field names to metadata field data.
Password | The password that decrypted the document, found by cycling through the supplied password list.
Result | Gets or sets the result of the content extraction. Check this value to see whether content extraction was successful.
SHA1BinaryHash | SHA-1 binary document hash (hash of all document bytes).
SHA1ContentHash | SHA-1 content hash: a proprietary hash computed over only the content part of the document file format.
SHA256BinaryHash | SHA-256 binary document hash (hash of all document bytes).
SHA256ContentHash | SHA-256 content hash: a proprietary hash computed over only the content part of the document file format.
TextSourceType | Gets or sets the method by which the document text (if any) was acquired.
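
FileEntropy and the binary hash properties are described as being computed over all of the document's bytes. The following is a minimal sketch of how such values could be reproduced independently with the standard .NET library, for example to cross-check a file on disk. It is not the SDK's implementation: the helper names (ByteStats, ShannonEntropy, Sha256Hex), the entropy unit (bits per byte, base-2 logarithm), and the lowercase hex encoding are all assumptions made for illustration, and the SDK may report these values in a different unit or format.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, not part of OpenDiscoverSDK.
public static class ByteStats
{
    // Shannon entropy H = -sum(p_i * log2(p_i)) over the 256 possible byte values.
    // Assumption: result is in bits per byte, so it ranges from 0.0 to 8.0.
    public static double ShannonEntropy(byte[] data)
    {
        if (data == null || data.Length == 0) return 0.0;

        // Count how often each byte value occurs.
        var counts = new long[256];
        foreach (byte b in data) counts[b]++;

        double entropy = 0.0;
        foreach (long count in counts)
        {
            if (count == 0) continue; // a zero-probability symbol contributes nothing
            double p = (double)count / data.Length;
            entropy -= p * Math.Log(p, 2.0);
        }
        return entropy;
    }

    // SHA-256 over all bytes, returned as a lowercase hex string.
    // Assumption: hex is only an illustrative encoding choice.
    public static string Sha256Hex(byte[] data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(data);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

A caller could pass the result of System.IO.File.ReadAllBytes to either helper. The content hashes (MD5ContentHash, SHA1ContentHash, SHA256ContentHash) are described as proprietary, so they cannot be reproduced this way.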
Methods
Name | Description
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object)
GetHashCode | Serves as the default hash function. (Inherited from Object)
GetType | Gets the Type of the current instance. (Inherited from Object)
ToString | Returns a string that represents the current object. (Inherited from Object)
Remarks
This class is also the base class for the specialized document classes EmailDocumentContent, HtmlDocumentContent, PdfDocumentContent, ArchiveContent, MailStoreContent, and DatabaseContent. These derived types carry additional extracted content specific to their document kind.
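
The relationship between the type flags and the derived classes can be sketched as follows. This is an illustration only: the classes declared at the top are simplified, hypothetical stand-ins that let the sketch compile without the SDK (the real types live in OpenDiscoverSDK.Interfaces.Content and have many more members), and ContentDispatch and Describe are invented names. The sketch checks both the flag and the runtime type before treating the object as a derived class.

```csharp
using System;

// Hypothetical stand-ins for the SDK types, reduced to what this sketch needs.
public class DocumentContent
{
    public bool IsEmailType { get; set; }
    public bool IsHtmlType { get; set; }
    public bool IsPdfType { get; set; }
}
public class EmailDocumentContent : DocumentContent { }
public class HtmlDocumentContent : DocumentContent { }
public class PdfDocumentContent : DocumentContent { }

// Hypothetical helper, not part of OpenDiscoverSDK.
public static class ContentDispatch
{
    // Returns a label for the specialized type, or "generic" when none applies.
    public static string Describe(DocumentContent content)
    {
        if (content == null) throw new ArgumentNullException(nameof(content));

        // "as" yields null instead of throwing when the runtime type does not match.
        var email = content as EmailDocumentContent;
        if (content.IsEmailType && email != null)
            return "email"; // email-specific properties would be read from 'email' here

        var html = content as HtmlDocumentContent;
        if (content.IsHtmlType && html != null)
            return "html";

        var pdf = content as PdfDocumentContent;
        if (content.IsPdfType && pdf != null)
            return "pdf";

        return "generic";
    }
}
```

Checking the runtime type in addition to the flag is a defensive choice in this sketch; the page itself only states that the object should be cast when the corresponding flag is true.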