DocumentAttributes Enumeration

Document attributes give extra information about a document, such as whether it has hidden content, is password protected (encrypted), has macros, is an inline image (e.g., an image embedded inline in an email), or has external document references.

Definition

Namespace: OpenDiscoverSDK.Interfaces.Content
Assembly: OpenDiscoverSDK.Interfaces (in OpenDiscoverSDK.Interfaces.dll) Version: 2026.2.6.0 (2026.02.06)
C#
[DataContractAttribute]
public enum DocumentAttributes
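
The member values listed under Members are consecutive integers rather than distinct bits (for example, 1 | 6 equals 7, the value of Macros), so they cannot be safely combined with bitwise OR. The sketch below illustrates this with a local stand-in enum that copies four values from the member list. How the SDK itself exposes a document's attributes is not stated on this page; the `HashSet` and the `Has` helper are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in for illustration only; the real enum lives in
// OpenDiscoverSDK.Interfaces.Content. Values are copied from the member list.
public enum DocumentAttributes
{
    PasswordProtected = 1,
    DefaultPassword = 2,
    ArchiveItemsPasswordProtected = 6,
    Macros = 7,
}

public static class AttributeSetDemo
{
    // Hypothetical helper: true when the set contains the given attribute.
    public static bool Has(ISet<DocumentAttributes> attributes, DocumentAttributes value)
        => attributes.Contains(value);

    public static void Main()
    {
        // OR-ing two members collides with a third: 1 | 6 == 7, which is Macros.
        var combined = DocumentAttributes.PasswordProtected
                     | DocumentAttributes.ArchiveItemsPasswordProtected;
        Console.WriteLine(combined == DocumentAttributes.Macros);

        // A set keeps each attribute distinct (assumed container, not the SDK's API).
        var attributes = new HashSet<DocumentAttributes>
        {
            DocumentAttributes.PasswordProtected,
            DocumentAttributes.ArchiveItemsPasswordProtected,
        };
        Console.WriteLine(Has(attributes, DocumentAttributes.Macros));
    }
}
```

Holding the attributes in a set (or any collection) and testing membership avoids the false positive that the bitwise combination produces.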

Members

Member name (Value): Description

PasswordProtected (1): Document is password protected (encrypted). If the document is an archive, this flag means that the archive central directory is encrypted and information on archive items is not available unless decrypted with the password.
DefaultPassword (2): Document is password protected (encrypted) with the application's default password. Some documents, like Excel and PowerPoint, are encrypted with their respective application default password under certain scenarios. Opening the document in the application does not require the password (the application decrypts automatically), but extracting content with third-party software does.
ArchiveItemsPasswordProtected (6): Archive has password protected (encrypted) items.
Macros (7): Office document has macros.
Comments (8): Document has user comments or notes (i.e., non-metadata comments/descriptions).
CustomMetadata (9): Document has custom (user defined) metadata fields.
RevisionTracking (10): Document has revisions being tracked.
ExternalFileAttachments (11): Document has externally referenced attachments (for example, OneNote 2010 files can have external attachments stored as .onebin files).
Template (12): Document is a template.
Headers (13): Document has page or sheet headers (not set for any PowerPoint version or for OpenDocument spreadsheets).
Footers (14): Document has page or sheet footers (not set for any PowerPoint version or for OpenDocument spreadsheets).
OfficeLinkedContent (20): Office 2007 or newer document has externally linked content, either as hyperlinks or OLE linked files. Also supported for PDF and OpenDocument formats.
OfficeEmbeddedDocuments (21): Office document has embedded document(s) (applies to Microsoft Office and OpenDocument formats).
OfficeEmbeddedPictures (22): Office document has embedded picture(s) (applies to Microsoft Office and OpenDocument formats).
OfficeEmbeddedMedia (23): Office document has embedded media files (applies to Microsoft Office 2007+ and OpenDocument formats).
OfficePictureLinkedContent (24): Office 2007 or newer document has a linked picture.
OfficeExternDataConnections (25): Office 2007 or newer document has external data connections.
OfficeCustomXmlData (26): Office 2007 or newer document has custom XML data parts.
OfficeWebExtensionAddIns (27): Office 2007 or newer document has web extensions (e.g., task pane add-ins).
OfficeModernComments (28): Office 365 document has user 'modern comments', which allow assigning tasks in comment threads and other features.
HiddenText (35): Document has text characters or textboxes formatted as hidden.
WorkbookProtected (40): Workbook is protected.
WorkbookProtectedWorksheets (41): Workbook has protected worksheets.
WorkbookHiddenWorksheets (42): Workbook has hidden worksheets.
WorkbookVeryHiddenWorksheets (43): Workbook has very hidden worksheets.
WorksheetHiddenRows (44): Worksheet has hidden rows.
WorksheetHiddenColumns (45): Worksheet has hidden columns.
WorksheetAutoFilters (46): Worksheet has auto-filters.
WorksheetPivotTables (47): Worksheet has pivot tables.
WorkbookExternalWorkbookReferences (48): Workbook has external spreadsheet references.
WorksheetThreadedComments (49): Workbook has threaded comments (this applies to Excel for Office 365, which changed comments into threaded discussions).
PresentationHiddenSlides (60): Presentation document has hidden slides.
PresentationHasSpeakerNotes (61): Presentation document has speaker notes.
PdfPortfolio (70): PDF document is a PDF Portfolio/Package (after Acrobat 8.0, the term PDF Portfolio, rather than PDF Package, is used to describe any document that contains a collection dictionary).
PdfXFA (71): PDF document contains an XFA form.
PdfAcroForm (72): PDF document contains an AcroForm (non-XFA) form.
PdfHasFailedPages (73): PDF document contains one or more pages where text was not extracted due to a processing exception, or where the number of extracted text characters did not meet the PageExtractedTextCriteria number of characters.
DominoXmlHasEncryptedItems (80): Domino XML document (.dxl) has 'item' XML elements that are encrypted.
DominoXmlHasEncryptedAttachments (81): Domino XML document (.dxl) has one or more encrypted attachments.
DominoXmlHasNativeMimeFlag (82): Domino XML document (.dxl) has an 'item' XML element named $NoteHasNativeMime with a value of "1" (true).
DominoXmlHasNativeMimeElement (83): Domino XML document (.dxl) has an 'item' element with a <mime> child, i.e., the exported DXL data is primarily stored in the <mime> element.
DominoXmlHasNativeMimeBody (84): Domino XML document (.dxl) has an item 'body' element with 'rawitemdata' (RFC-822) formatted data.
OutlookEmailHasRefAttachment (200): Outlook email object has attachment(s) that are referenced by a fully qualified file system path (HasReferenceAttachment).
OutlookEmailHasRefOnlyAttachment (201): Outlook email object has attachment(s) that are referenced by a fully qualified path (HasReferenceOnlyAttachment).
OutlookEmailHasWebRefAttachment (202): Outlook email object has attachment(s) that are referenced by web API only (HasWebReferenceAttachment).
TextTruncatedToMaxAllowable (1000): Indicates that the extracted text exceeded the .NET string type's maximum total size of 2GB (1,073,741,791 UTF-16 characters) and was truncated to the maximum string size.
TextTruncatedForDocumentStore (1001): RESERVED. Indicates that the extracted text exceeded the document store's per-document text storage limit. This attribute is not set by Open Discover SDK. It is reserved for user processing workflows that export processed documents into a document database such as Elasticsearch, RavenDB, MongoDB, etc.
EntityDetectionScanLimited (1010): Indicates that the entity detection scan on extracted text was limited to a maximum number of bytes (in the case of binary blobs) or characters (in the case of 'large' encoded text files).

  • For "large" text files that exceed 200 million characters, only the first 200 million characters are scanned for entity items.
  • For "large" unsupported binary files (blobs) that exceed 100 million bytes (100MB), only the first 100 million bytes are scanned for entity items.

MaxNumberOfDocumentOcrPagesLimited (2000): RESERVED. Optical Character Recognition (OCR) of the document was limited to a user defined maximum number of document pages for OCR. This attribute only applies to Adobe PDFs and multi-page TIFF documents.
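
TextTruncatedForDocumentStore is described above as reserved for user workflows that export documents into a store with a per-document text limit. A minimal sketch of such a workflow step follows. The stand-in enum, the `FitToStoreLimit` helper, the `maxChars` parameter, and the use of a set to carry attributes are all hypothetical and are not part of Open Discover SDK.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in for illustration only; value copied from the member list.
public enum DocumentAttributes
{
    TextTruncatedForDocumentStore = 1001,
}

public static class DocumentStoreExport
{
    // Hypothetical helper: trims text to the store's per-document limit and
    // records the truncation in the caller's attribute set.
    // Assumes maxChars is non-negative.
    public static string FitToStoreLimit(
        string text, int maxChars, ISet<DocumentAttributes> attributes)
    {
        if (text.Length <= maxChars)
        {
            return text;
        }

        int cut = maxChars;
        // Avoid splitting a UTF-16 surrogate pair at the cut point.
        if (cut > 0 && char.IsHighSurrogate(text[cut - 1]))
        {
            cut--;
        }

        attributes.Add(DocumentAttributes.TextTruncatedForDocumentStore);
        return text.Substring(0, cut);
    }

    public static void Main()
    {
        var attributes = new HashSet<DocumentAttributes>();
        string stored = FitToStoreLimit("extracted text", 9, attributes);
        Console.WriteLine(stored);
        Console.WriteLine(attributes.Contains(DocumentAttributes.TextTruncatedForDocumentStore));
    }
}
```

Recording the attribute at the point of truncation lets downstream consumers of the store tell a complete document apart from one whose text was cut to fit.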

See Also