DocumentContent Properties
The DocumentContent type exposes the following members.
Properties

| Name | Description |
|---|---|
| Attributes | Document attributes. See DocumentAttributes for an enumeration of supported attributes. |
| ChildDocuments | Child documents (attachments/embedded items). See remarks for the special cases of archives (.7z, .zip, etc.), media images, and mail stores (.pst, .ost, .mbox, etc.). |
| CustomMetadata | Custom (user-defined) document metadata, as a dictionary with metadata field names as keys and metadata field data as the corresponding values. |
| EntityExtractionResult | Document entity item extraction result. |
| ErrorMessage | Gets or sets an error message associated with Result. This property is only set when Result is not Ok. |
| ErrorStackTrace | Error (exception) stack trace associated with ErrorMessage. This property is only set when Result is not Ok and an internal exception was caught. |
| ExtractedText | Extracted text. See remarks for limitations. |
| FileEntropy | Shannon entropy of the document's bytes. |
| FormatId | Document format identification result from prior file identification. This value was an input to the content extractor factory and is stored here for convenience. |
| HyperLinks | Document hyperlinks. |
| IsEmailType | If true, this document is an email document. Cast this DocumentContent object to EmailDocumentContent to get additional email-specific properties. |
| IsEncrypted | True if the document is encrypted. |
| IsHtmlType | If true, this document is an HTML document. Cast this DocumentContent object to HtmlDocumentContent to get additional HTML-specific properties. |
| IsPdfType | If true, this document is a PDF document. Cast this DocumentContent object to PdfDocumentContent to get additional PDF-specific properties. |
| LanguageIdResults | Language identification results for the extracted text. |
| MD5BinaryHash | MD5 binary document hash (hash of all document bytes). |
| MD5ContentHash | MD5 content hash: a proprietary hash computed over only the content part of the document file format. |
| Metadata | Standard (non-user-defined) document metadata, as a dictionary with metadata field names as keys and metadata field data as the corresponding values. |
| Password | The password that decrypted the document, found by cycling through the supplied password list. |
| Result | Gets or sets the result of the content extraction. Check this value to see whether content extraction was successful. |
| SHA1BinaryHash | SHA-1 binary document hash (hash of all document bytes). |
| SHA1ContentHash | SHA-1 content hash: a proprietary hash computed over only the content part of the document file format. |
| SHA256BinaryHash | SHA-256 binary document hash (hash of all document bytes). |
| SHA256ContentHash | SHA-256 content hash: a proprietary hash computed over only the content part of the document file format. |
| TextSourceType | Gets or sets the method by which the document text (if any) was acquired. |
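
FileEntropy and the three binary hashes (MD5BinaryHash, SHA1BinaryHash, SHA256BinaryHash) are all described as functions of every byte in the document. The following Python sketch illustrates those definitions only; it is not the library's implementation. The function names `shannon_entropy` and `binary_hashes` are hypothetical, and the entropy unit (bits per byte, log base 2) is an assumption, since this page does not state the unit the library uses. The proprietary content hashes are not reproduced here.

```python
import hashlib
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, assumed here to be in bits per byte.

    Ranges from 0.0 (a single repeated byte value) to 8.0 (all 256 byte
    values equally frequent). Empty input is treated as 0.0 by convention.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    # H = sum over observed byte values of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def binary_hashes(data: bytes) -> dict:
    """Hex digests computed over all bytes of the input, unmodified."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


if __name__ == "__main__":
    sample = b"hello, world"
    print(f"entropy: {shannon_entropy(sample):.3f} bits/byte")
    for name, digest in binary_hashes(sample).items():
        print(f"{name}: {digest}")
```

Under these assumptions, compressed or encrypted data tends toward the upper end of the entropy range, while plain text scores noticeably lower. Because a binary hash covers every byte, two files with identical text but different container bytes produce different binary hashes, which is the case the separate content hashes appear intended to address.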