Other
Supported Other file formats (IdClassification.Other - Other miscellaneous types)
If a file format has no supported content extractor that extracts text, a binary-to-text content extractor is used instead (optional; enabled by default) to extract UTF-8, UTF-16, Windows-1252, and ASCII text from the binary data. In many cases, indexable text can be extracted from unknown document formats this way.
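The binary-to-text idea can be illustrated with a minimal sketch. It is not the library's extractor: the function name `extract_text_runs` and the `min_len` threshold are hypothetical, and the sketch only recovers runs of printable ASCII characters, stored either as single bytes (which are also valid UTF-8 and Windows-1252) or as UTF-16 little-endian code units.

```python
import re


def extract_text_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Collect printable text runs from arbitrary binary data (illustrative only)."""
    runs = []

    # Single-byte pass: runs of printable ASCII (space through tilde).
    # These bytes decode identically as ASCII, UTF-8, and Windows-1252.
    single_byte = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in single_byte.finditer(data):
        runs.append(match.group().decode("ascii"))

    # UTF-16LE pass: printable ASCII code points, each followed by a zero byte.
    utf16_le = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in utf16_le.finditer(data):
        runs.append(match.group().decode("utf-16-le"))

    return runs


sample = b"\x00\x01Hello World\xff\xfe" + "Secret".encode("utf-16-le") + b"\x00\x02ab\x03"
print(extract_text_runs(sample))
```

The minimum run length is a trade-off assumed here for illustration: a lower value recovers more text but also admits more noise from bytes that happen to fall in the printable range.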
File Format Id Enum Value | Text | Metadata | EmbeddedItem | ContentHash | Description |
|---|---|---|---|---|---|
File format that could not be identified. | |||||
Special identification for child files of containers (e.g., ZIP archives) that could not be extracted from their container and thus could not be identified. | |||||
Empty file (file has 0 bytes of data). | |||||
Internal ID Only: A Microsoft Outlook message attachment that is both hidden (Exchange property PidTagAttachmentHidden is 'true') and empty (no binary data). Identifying this type of file allows it to be excluded from extracted Outlook attachments. | |||||
X | Outlook email object attachment that is referenced only by a fully qualified file system path and whose file data is not contained in the message object. | ||||
X | Outlook email object attachment that is referenced only by a fully qualified file path and whose file data is not contained in the message (.msg) object. | ||||
X | Outlook email object attachment that is referenced only by a web API URL and whose file data is not contained in the message (.msg) object. | ||||
X | X | X | OLE2 compound file format of unknown application type. | ||
File is a valid Microsoft compound file format (OLE2) but has no storages or streams and a CLSID = 00000000-0000-0000-0000-000000000000. Although sometimes found as an embedded file, this file has no useful content. | |||||
File is a corrupted Microsoft compound file format (OLE2) and the specific file format type could not be determined. | |||||
X | OLE linked object compound file that usually contains only a "1Ole" stream. This object is found embedded in documents and describes a link to an external object such as an Excel or Word document. | ||||
X | Microsoft Office Theme (document theme) (.thmx). | ||||
X | Windows Shortcut file (Shell Link Binary File Format) (.lnk). | ||||
X | A special Windows 'shortcut' file that opens the Windows Settings panel (Windows 8 and above), which Windows 10 features in preference to the old Control Panel. Note: having this file type embedded in an Office 365 document is a security concern. | ||||
Jump List file, used by Windows 7, that lets a user quickly view items recently edited by a program pinned to the taskbar by right-clicking its icon (.automaticDestinations-ms). | |||||
Jump List file, used by Windows 7, that lets a user quickly view items recently edited by a program pinned to the taskbar by right-clicking its icon (.customDestinations-ms). | |||||
Temporary file that is usually hidden and is created by Microsoft Office when a previously saved Microsoft Office document is opened for editing, printing, or review. This temporary file is called the "owner file" and contains the user name of the person who opened the file. The file name begins with "~$" and the extension is the same as the original document. | |||||
Compound file formatted temporary file that is usually hidden and is created by Microsoft Office when a previously saved Microsoft Office document is opened for editing, printing, or review. This temporary file is called the "owner file" and contains the user name of the person who opened the file. The file name begins with "~$$" (for Visio) and the extension is the same as the original document with a prepended '~'. | |||||
Microsoft ActiveX control which is used to render HTML pages. This control may be found embedded in legacy Office documents and can be a security risk. | |||||
Windows clipboard. The clipboard is usually just an in-memory object, but it may sometimes be saved to a file with a .clp extension (.clp). | |||||
Windows Cardfile address book application (included with Microsoft Windows 1.0 through Windows Me and Windows NT 4) (.crd). | |||||
X | Windows thumbnail cache (or Thumbs.db format), a file format used by some versions of Microsoft Windows to store thumbnails of images (.db). | ||||
X | Windows Vista/7/8/10 thumbnail cache (.db). | ||||
Windows Vista/7/8/10 thumbnail cache index (.db). | |||||
Windows Help File (.hlp). | |||||
Microsoft Windows NT 4 (and later) Registry File (REGF) used to store system and application related data (.dat). | |||||
X | X | X | Microsoft Graph (originally known as Microsoft Chart) (.gra). | ||
Microsoft Equation. This is the earliest Microsoft Equation version. | |||||
X | Microsoft Equation Editor 2.0 format. This format is found embedded in Office 97-2003 documents. | ||||
X | Microsoft Equation Editor 3.0 format. This format is found embedded in Office 97-2003 documents. | ||||
X | X | X | Microsoft Photo Editor version 3.0 (image-editing application found in Microsoft Office 97–XP versions for Windows, classified as one of Microsoft Office Tools). This format is often found embedded in Office 97-2003 documents. | ||
Microsoft Clip Art Gallery embedded object. This format is found embedded in Office 97-2003 documents and is generally considered an unimportant embedded item (i.e., junk). | |||||
Microsoft WordArt embedded object. This format, found embedded in Microsoft Office documents, is decorative text that can be added to a document. | |||||
Microsoft Draw 1.01 (packaged with Office). | |||||
Microsoft Draw 98 (packaged with Office). | |||||
X | X | X | Microsoft VBA (Visual Basic for Applications) Project. This format is often found embedded in Microsoft Office documents. | ||
X | X | X | Metafile (.wmf) OLE2 compound file container. IPicture objects provide a language-neutral abstraction for bitmaps, icons, and metafiles. | ||
X | X | X | Device Independent Bitmap (.bmp) OLE2 compound file container. IPicture objects provide a language-neutral abstraction for bitmaps, icons, and metafiles. | ||
X | X | X | Enhanced Metafile (.emf) OLE2 compound file container. IPicture objects provide a language-neutral abstraction for bitmaps, icons, and metafiles. | ||
X | Microsoft "Outlook File Attachment" - an OLE (compound file) wrapper around an attachment payload. | ||||
Microsoft Exchange public folder shortcut (.xnk). | |||||
Windows Media Player compressed 'skin' file (.wmz). | |||||
X | X | Microsoft InfoPath file (initially released as part of Microsoft Office 2003). InfoPath is an application used for designing, distributing, filling and submitting electronic forms containing structured data (.xsn). | |||
OrgPlus organizational chart file (Insperity Business Services, L.P.) (.opx). | |||||
Mac OS Safari WebArchive file format (archived complete web pages) (.webarchive). | |||||
Micrografx Clip Art Index or Palette (.sbj). | |||||
InstallShield installation software "CAB" format, a successor to the InstallShield Z format (this format is not the same as the Microsoft Cabinet format) (.cab). | |||||
InstallShield installation software "Z" proprietary format (used in version 3 of InstallShield) (.z). | |||||
Microsoft Forms 2.0 Object Library Checkbox control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library ComboBox control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library CommandButton control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library Form control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library Frame control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML CheckBox control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Hidden control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Image control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Option control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Password control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Reset control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Select control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Submit control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Text control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML TextArea control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Image control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML Label control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library HTML ListBox control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library Multi-Page control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library OptionButton control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library ScrollBar control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library TabStrip control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library TextBox control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library ToggleButton control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Microsoft Forms 2.0 Object Library Spin Button control (embedded item found in Microsoft Office Documents). Not considered useful for content extraction. | |||||
Windows MiniDump file used for reporting application crash data (.dmp;.mdmp). | |||||
Apple Desktop Services Store, a file (hidden on macOS) that stores custom attributes of its containing folder (.DS_Store). |
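
A few of the descriptions above reduce to simple checks on file size, file name, or leading bytes. The sketch below shows that idea only; it is not how the library identifies files. The function `rough_classify` and its string labels are hypothetical and do not correspond to the enum values in the table. The one external fact assumed is the well-known 8-byte header signature of OLE2 compound files.

```python
from pathlib import PurePath

# Header signature that begins every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def rough_classify(name: str, data: bytes) -> str:
    """Return a coarse, hypothetical label for a file (not the library's enum)."""
    if len(data) == 0:
        # Matches "Empty file (file has 0 bytes of data)".
        return "empty"
    if PurePath(name).name.startswith("~$"):
        # Office "owner file" naming convention; also covers the "~$$" Visio case.
        return "office-owner-file"
    if data.startswith(OLE2_MAGIC):
        # A compound file whose application type is not examined further here.
        return "ole2-compound"
    return "unknown"


print(rough_classify("~$report.docx", b"\x05Alice"))
```

The checks run from cheapest to most specific, so an empty file is reported as empty regardless of its name.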