| Member name | Value | Description |
|---|---|---|
| PasswordProtected | 1 | Document is password protected (encrypted). If the document is an archive, this flag means that the archive central directory is encrypted and information on archive items is not available unless decrypted with a password. |
| DefaultPassword | 2 | Document is password protected (encrypted) with the application default password. Some documents, like Excel and PowerPoint, are encrypted with their respective application default password under certain scenarios. Opening the document in the application does not require the password (the application decrypts it automatically), but extracting content with third-party software does. |
| ArchiveItemsPasswordProtected | 6 | Archive has password protected (encrypted) items. |
| PersonName | 7 | Document has author, contributor, last-edited-by, or last-printed-by identifying metadata field(s) (does not include user-defined metadata fields). |
| Macros | 8 | Office document has macros. |
| Comments | 9 | Document has user comments or notes (i.e., non-metadata comments/descriptions). |
| CustomMetadata | 10 | Document has custom (user-defined) metadata fields. |
| RevisionTracking | 11 | Document has revisions being tracked. |
| ExternalFileAttachments | 12 | Document has externally referenced attachments (for example, OneNote 2010 files can have external .onebin attachments). |
| Template | 13 | Document is a template. |
| Headers | 14 | Document has page or sheet headers (not set for PowerPoint, all versions; not set for OpenDocument spreadsheets). |
| Footers | 15 | Document has page or sheet footers (not set for PowerPoint, all versions; not set for OpenDocument spreadsheets). |
| OfficeLinkedContent | 20 | Office 2007 or newer document has externally linked content, either as hyperlinks or OLE linked files. Also supported for PDF and OpenDocument formats. |
| OfficeEmbeddedDocuments | 21 | Office document has embedded document(s) (applies to Microsoft Office and OpenDocument formats). |
| OfficeEmbeddedPictures | 22 | Office document has embedded picture(s) (applies to Microsoft Office and OpenDocument formats). |
| OfficeEmbeddedMedia | 23 | Office document has embedded media files (applies to Microsoft Office 2007+ and OpenDocument formats). |
| OfficePictureLinkedContent | 24 | Office 2007 or newer document has a linked picture. |
| OfficeExternDataConnections | 25 | Office 2007 or newer document has external data connections. |
| OfficeCustomXmlData | 26 | Office 2007 or newer document has custom XML data parts. |
| OfficeWebExtensionAddIns | 27 | Office 2007 or newer document has web extensions (e.g., task pane add-ins). |
| OfficeModernComments | 28 | Office 365 document has user 'modern comments', which allow assigning tasks in comment threads and other features. |
| HiddenText | 35 | Document has text characters or textboxes formatted as hidden. |
| WorkbookProtected | 40 | Workbook is protected. |
| WorkbookProtectedWorksheets | 41 | Workbook has protected worksheets. |
| WorkbookHiddenWorksheets | 42 | Workbook has hidden worksheets. |
| WorkbookVeryHiddenWorksheets | 43 | Workbook has very hidden worksheets. |
| WorksheetHiddenRows | 44 | Worksheet has hidden rows. |
| WorksheetHiddenColumns | 45 | Worksheet has hidden columns. |
| WorksheetAutoFilters | 46 | Worksheet has auto-filters. |
| WorksheetPivotTables | 47 | Worksheet has pivot tables. |
| WorkbookExternalWorkbookReferences | 48 | Workbook has external spreadsheet references. |
| WorksheetThreadedComments | 49 | Workbook has threaded comments (applies to Excel for Office 365, which changed how comments work: comments are now threaded discussions). |
| PresentationHiddenSlides | 60 | Presentation document has hidden slides. |
| PresentationHasSpeakerNotes | 61 | Presentation document has speaker notes. |
| PdfPortfolio | 70 | PDF document is a PDF Portfolio/Package (after Acrobat 8.0, the term PDF Portfolio, rather than PDF Package, is used to describe any document that contains a collection dictionary). |
| PdfXFA | 71 | PDF document contains an XFA form. |
| PdfAcroForm | 72 | PDF document contains an AcroForm (non-XFA) form. |
| PdfHasFailedPages | 73 | PDF document contains one or more pages where text was not extracted due to a processing exception, or where the number of extracted text characters did not meet the `PageExtractedTextCriteria` number of characters. |
| DominoXmlHasEncryptedItems | 80 | Domino XML document (.dxl) has 'item' XML elements that are encrypted. |
| DominoXmlHasEncryptedAttachments | 81 | Domino XML document (.dxl) has one or more encrypted attachments. |
| DominoXmlHasNativeMimeFlag | 82 | Domino XML document (.dxl) has an 'item' XML element named `$NoteHasNativeMime` with a value of "1" (true). |
| DominoXmlHasNativeMimeElement | 83 | Domino XML document (.dxl) has an 'item' element with a `<mime>` child, i.e., the exported DXL data is primarily stored in the `<mime>` element. |
| DominoXmlHasNativeMimeBody | 84 | Domino XML document (.dxl) has an item 'body' element with 'rawitemdata' (RFC-822) formatted data. |
| OutlookEmailHasRefAttachment | 100 | Outlook email object has attachment(s) that are referenced by a fully qualified file system path (HasReferenceAttachment). |
| OutlookEmailHasRefOnlyAttachment | 101 | Outlook email object has attachment(s) that are referenced by a fully qualified path (HasReferenceOnlyAttachment). |
| OutlookEmailHasWebRefAttachment | 102 | Outlook email object has attachment(s) that are referenced by web API only (HasWebReferenceAttachment). |
| DetectedSocialSecurityNumber | 200 | Detected possible social security number(s) in extracted text or metadata. |
| DetectedIndividualTaxpayerIdNumber | 201 | Detected possible Individual Taxpayer Identification Number(s) (ITIN) in extracted text or metadata. An ITIN is a tax processing number only available for certain nonresident and resident aliens, their spouses, and dependents who cannot get a Social Security Number (SSN). |
| DetectedCreditCard | 202 | Detected possible credit card number(s) in extracted text or metadata. |
| DetectedBankAccount | 203 | Detected possible bank account number(s) in extracted text or metadata. |
| DetectedIBAN | 204 | Detected possible international bank account number(s) (IBAN) in extracted text or metadata. |
| DetectedInvestmentAccount | 205 | Detected possible investment account number(s) in extracted text or metadata. |
| DetectedEmailAddress | 206 | Detected possible email address(es) in extracted text or metadata. |
| DetectedEmailAddressAndName | 207 | Detected possible email address associated with a person's name. |
| DetectedEmailAddressAndIPAddress | 208 | Detected email address and associated IP address. |
| DetectedPhoneNumber | 209 | Detected possible phone number(s) in extracted text or metadata. |
| DetectedAddress | 210 | Detected full physical address(es) in extracted text or metadata. |
| DetectedDateOfBirth | 211 | Detected possible date(s) of birth in extracted text or metadata. |
| DetectedDriversLicense | 212 | Detected possible driver's license number(s) in extracted text or metadata. |
| DetectedPassport | 213 | Detected possible passport number(s) in extracted text or metadata. |
| DetectedMaidenName | 214 | Detected possible (mother's) maiden names in extracted text or metadata. |
| DetectedHealthCareNumberID | 215 | Detected possible health care insurance number/member ID in extracted text or metadata. |
| DetectedLicensePlateNumber | 216 | Detected possible vehicle license plate number in extracted text or metadata. |
| DetectedVehicleIdentificationNumber | 217 | Detected possible vehicle identification number (VIN) in extracted text or metadata. |
| DetectedSocialMediaAccount | 218 | Detected possible social media account name in extracted text or metadata. |
| DetectedCryptoCurrencyAddress | 219 | Detected possible cryptocurrency wallet address in extracted text or metadata. |
| DetectedIPv4Address | 230 | Detected IPv4 address(es) in extracted text, hyperlinks, or metadata. |
| DetectedIPv6Address | 231 | Detected IPv6 address(es) in extracted text, hyperlinks, or metadata. |
| DetectedMacAddress | 232 | Detected MAC address(es) in extracted text, hyperlinks, or metadata. |
| DetectedIMEINumber | 233 | Detected IMEI number in extracted text, hyperlinks, or metadata. |
| DetectedPassword | 270 | Detected possible password(s) in extracted text or metadata. |
| DetectedUsername | 271 | Detected possible login username(s) in extracted text or metadata. |
| DetectedNetworkName | 272 | Detected possible network, workstation, desktop, or computer name(s) in extracted text or metadata. |
| DetectedDatabaseCredential | 273 | Detected possible database credential(s) (or URL links to Azure, SharePoint, AWS, etc. storage) in extracted text or metadata. |
| DetectedMachineReadableZone | 274 | Detected machine-readable zone (MRZ) used by passports, immigration visas, travel documents, and driver's licenses. |
| DetectedCustomEntityItem | 300 | Detected user-defined custom entity in extracted text or metadata. |
| TextTruncatedToMaxAllowable | 1000 | Indicates that extracted text exceeded the .NET maximum string size of 2 GB (1,073,741,791 UTF-16 characters) and was truncated to the maximum string size. |
| TextTruncatedForDocumentStore | 1001 | RESERVED. Indicates that extracted text exceeded the document store's per-document text storage limit. This attribute is not set by Open Discover SDK. It is reserved for user processing workflows that export processed documents into a document database such as Elasticsearch, RavenDB, MongoDB, etc. |
| EntityDetectionScanLimited | 1010 | Indicates that the entity detection scan on extracted text was limited to a maximum number of bytes (in the case of binary blobs) or characters (in the case of 'large' encoded text files). For "large" text files that exceed 200 million characters, only the first 200 million characters are scanned for entity items. For "large" unsupported binary files (blobs) that exceed 100 million bytes (100 MB), only the first 100 million bytes are scanned for entity items. |
| MaxNumberOfDocumentOcrPagesLimited | 2000 | RESERVED. Optical Character Recognition (OCR) of the document was limited to a user-defined maximum number of document pages for OCR. This attribute only applies to Adobe PDFs and multi-page documents. |
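
The member values in the table are sequential, not powers of two, so they cannot be OR-combined into a single bitmask; a document presumably carries a collection of them. The sketch below illustrates one way to triage such a collection. It is a minimal, language-neutral illustration in Python, not the SDK's API: the `DocAttr` class, the `summarize` helper, and the grouping rules are all hypothetical, and only the member names and numeric values are taken from the table. The 200 to 300 range used for `Detected*` members is an observation from the table, not a documented guarantee.

```python
from enum import IntEnum


class DocAttr(IntEnum):
    """Hypothetical mirror of a few members from the table (names and values only)."""
    PasswordProtected = 1
    DefaultPassword = 2
    ArchiveItemsPasswordProtected = 6
    Macros = 8
    HiddenText = 35
    DetectedSocialSecurityNumber = 200
    DetectedCreditCard = 202
    DetectedIPv4Address = 230
    DetectedPassword = 270
    DetectedCustomEntityItem = 300
    TextTruncatedToMaxAllowable = 1000


# Members whose descriptions mention password protection / encryption.
ENCRYPTION_ATTRS = {
    DocAttr.PasswordProtected,
    DocAttr.DefaultPassword,
    DocAttr.ArchiveItemsPasswordProtected,
}


def is_detected_entity(attr: DocAttr) -> bool:
    # Assumption: in the table, every Detected* member falls within 200..300.
    return 200 <= attr <= 300


def summarize(attrs):
    """Group a document's attributes into three coarse buckets, sorted by value."""
    unique = sorted(set(attrs))
    return {
        "encrypted": [a.name for a in unique if a in ENCRYPTION_ATTRS],
        "detected_entities": [a.name for a in unique if is_detected_entity(a)],
        "other": [
            a.name
            for a in unique
            if a not in ENCRYPTION_ATTRS and not is_detected_entity(a)
        ],
    }


if __name__ == "__main__":
    doc_attrs = [DocAttr.Macros, DocAttr.DefaultPassword, DocAttr.DetectedCreditCard]
    print(summarize(doc_attrs))
```

Keeping the attributes as a set of distinct values, rather than a bitmask, matches the numbering in the table and makes membership checks such as "does this document need a password before content can be extracted" a simple set intersection.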