DocumentExchange
Supported DocumentExchange file formats (IdClassification.DocumentExchange). Document exchange formats are not specific to any one program, i.e., different applications can output these exchangeable document formats.
If a file format does not have a supported content extractor that extracts text, then a binary-to-text content extractor is optionally used (enabled by default) to extract UTF-8, UTF-16, Windows-1252, and ASCII text from the binary. In many cases, indexable text can be extracted from unknown document formats.
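The binary-to-text fallback described above can be illustrated with a minimal sketch. This is not the SDK's implementation: the function name `extract_text_runs` and the minimum run length of 4 are hypothetical choices made for illustration, and only ASCII and UTF-16LE runs are handled (multi-byte UTF-8 and Windows-1252 are left out for brevity).

```python
import re

# Hypothetical threshold: shorter runs are treated as noise, not text.
MIN_RUN = 4


def extract_text_runs(data: bytes, min_run: int = MIN_RUN) -> list[str]:
    """Return printable text runs found in raw bytes (ASCII first, then UTF-16LE)."""
    runs = []

    # Runs of printable ASCII bytes (space through tilde).
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_run)
    for match in ascii_pattern.finditer(data):
        runs.append(match.group().decode("ascii"))

    # Runs of printable ASCII characters stored as UTF-16LE
    # (each character byte is followed by a NUL byte).
    utf16_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_run)
    for match in utf16_pattern.finditer(data):
        runs.append(match.group().decode("utf-16-le"))

    return runs


if __name__ == "__main__":
    # Synthetic "unknown format" blob mixing binary noise with two text runs.
    blob = b"\x00\x01Hello World\xff\xfe" + "Invoice".encode("utf-16-le") + b"\x00\x02"
    for run in extract_text_runs(blob):
        print(run)
```

A scan of this kind recovers indexable strings even when the container format is unknown, at the cost of occasionally picking up noise that happens to look like text.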
File Format Id Enum Value | Text | Metadata | EmbeddedItem | ContentHash | Description |
|---|---|---|---|---|---|
X | X | Microsoft XPS (Open XML Paper Specification) (.xps). | |||
X | X | Microsoft XPS (Open XML Paper Specification) that is potentially corrupted. The format's zip container failed inspection (zip potentially truncated) and format had to be identified using an alternate means (.xps). | |||
X | X | X | Adobe Portable Document Format (PDF) (.pdf). | ||
X | X | X | Encrypted Adobe Portable Document Format (PDF) (.pdf). | ||
X | X | X | Adobe Portable Document Format (PDF) Portfolio. A PDF Portfolio contains multiple files assembled into an integrated PDF unit (.pdf). | ||
X | X | X | Encrypted Adobe Portable Document Format (PDF) Portfolio. A PDF Portfolio contains multiple files assembled into an integrated PDF unit (.pdf). | ||
X | X | X | Adobe Portable Document Format (PDF) XML Forms Architecture (XFA). An XFA PDF is an interactive and dynamic form created with AEM Forms Designer (.pdf). | ||
X | X | X | Encrypted Adobe Portable Document Format (PDF) XML Forms Architecture (XFA). An XFA PDF is an interactive and dynamic form created with AEM Forms Designer (.pdf). | ||
X | X | X | Adobe Portable Document Format (PDF) AcroForm. AcroForm is Adobe’s older interactive form technology (.pdf). | ||
X | X | X | Encrypted Adobe Portable Document Format (PDF) AcroForm. AcroForm is Adobe’s older interactive form technology (.pdf). | ||
Acrobat Forms Data Format (FDF) (.fdf). | |||||
Adobe XML Data Package (XDP) format. This format allows PDF and/or XFA content resources to be packaged within an XML container (.xdp). | |||||
X | Adobe XML Forms Data Format (XFDF) is a format for representing forms data and annotations in a PDF document. XFDF is an XML version of Forms Data Format (FDF) (.xfdf). | ||||
X | X | X | Microsoft Rich Text Format (.rtf). | ||
DjVu file format. This file format was designed primarily to store scanned documents but is also used as an eBook format and has been promoted as an alternative to PDF (.djv;.djvu). | |||||
Encrypted (secure) DjVu file format. This file format was designed primarily to store scanned documents but is also used as an eBook format and has been promoted as an alternative to PDF (.djv;.djvu). | |||||
Adobe PostScript (.ps). | |||||
Adobe Encapsulated PostScript (.eps;.epsf;.ps). | |||||
Encapsulated PostScript with a content preview image (usually a TIFF image) (.eps;.epsf;.epsi). | |||||
X | OASIS DocBook XML document for general and technical publishing (.xml). |