Archive
Supported archive file formats (IdClassification.Archive): container document formats, usually compressed, that contain other documents (e.g., ZIP, 7z, TAR, RAR).
If a file format has no supported content extractor that extracts text, a binary-to-text content extractor is optionally used (enabled by default) to extract UTF-8, UTF-16, Windows-1252, and ASCII text from the binary. In many cases, indexable text can be extracted from unknown document formats.
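As an illustration of the general idea behind binary-to-text extraction, the following is a minimal sketch, not the library's actual extractor. It scans raw bytes for runs of printable ASCII and for the same characters stored as UTF-16LE. The names `extract_text_runs` and `MIN_RUN` are hypothetical, the minimum run length of 4 is an assumption, and UTF-8 multi-byte sequences and Windows-1252 characters are deliberately left out to keep the example short.

```python
import re

# Hypothetical sketch: NOT the library's extractor, only an illustration.
MIN_RUN = 4  # assumed minimum run length; shorter runs are treated as noise

# Runs of printable ASCII bytes (space through tilde).
ASCII_RUN = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_RUN)
# Runs of printable ASCII characters stored as UTF-16LE (character byte + 0x00).
UTF16LE_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_RUN)


def extract_text_runs(data: bytes) -> list[str]:
    """Return printable text runs found in arbitrary binary data."""
    runs = [m.group().decode("ascii") for m in ASCII_RUN.finditer(data)]
    runs += [m.group().decode("utf-16-le") for m in UTF16LE_RUN.finditer(data)]
    return runs
```

A real extractor would also need to validate UTF-8 sequences, handle both UTF-16 byte orders, and filter runs that are printable but not meaningful text.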
File Format Id Enum Value | Text | Metadata | EmbeddedItem | ContentHash | Description |
|---|---|---|---|---|---|
X | Self extracting ZIP archive executable (.exe). | ||||
X | Self extracting RAR versions 2, 3, and 4 archive executable (.exe). | ||||
X | Self extracting RAR version 5 archive executable (.exe). | ||||
X | Self extracting 7z archive executable (.exe). | ||||
X | Self extracting LZH/LZA archive executable (.exe). | ||||
X | X | ZIP archive file format (supports lossless data compression) (.zip;.zipx). | |||
An empty ZIP archive file with no files (the ZIP contains only an "end of central directory" record) (.zip;.zipx). | |||||
X | X | ZIP split archive segment. This segment (volume) is the end segment of a ZIP split archive and contains the archive central directory (.zip;.zipx). | |||
ZIP split archive segment (volume). Usually has the following file name patterns: filename.zN; filename.zip.N; filename.zxN, where N = segment (volume) number and where N=01,02,03,... | |||||
X | X | RAR archive file format versions 2, 3, and 4 (.rar). | |||
X | X | RAR split archive file format versions 3 and 4. This is the first segment (volume) of the split archive (.rar). | |||
RAR split archive segment versions 3 and 4. This file is a segment (volume) of a RAR split archive. Usual file name pattern: "filename.partN.rar" where N = segment number and where N = 01, 02, 03, etc. (.rar) | |||||
X | X | RAR archive file format version 5 (.rar). | |||
X | X | RAR split archive file format version 5. This is the first segment (volume) of the split archive (.rar). | |||
RAR split archive segment version 5. This file is a segment (volume) of a RAR split archive. Usual file name pattern: "filename.partN.rar" where N = segment number and where N = 01, 02, 03, etc. (.rar) | |||||
X | X | RAR legacy archive file format (no format documentation is known) (.rar). | |||
X | X | RAR archive file format versions 3 or 4 with encrypted headers. For archives with encrypted headers, no archive metadata or archive item information is available until a password is applied (.rar). | |||
X | X | RAR split archive file format versions 3 or 4 with encrypted headers. This is the first segment (volume) of the split archive. For archives with encrypted headers, no archive metadata or archive item information is available until a password is applied (.rar). | |||
RAR split archive segment versions 3 or 4 with encrypted headers. This file is a segment (volume) of a RAR split archive. Usual file name pattern: "filename.partN.rar" where N = segment number and where N = 01, 02, 03, etc. For archives with encrypted headers, no archive metadata or archive item information is available until a password is applied (.rar). | |||||
X | X | RAR archive file format version 5 with encrypted headers. For archives with encrypted headers, no archive metadata or archive item information is available until a password is applied (.rar). | |||
X | X | RAR split archive file format version 5 with encrypted headers. This is the first segment (volume) of the split archive. For archives with encrypted headers, no archive metadata or archive item information is available until a password is applied (.rar). | |||
RAR split archive segment version 5 with encrypted headers. This file is a segment (volume) of a RAR split archive. Usual file name pattern: "filename.partN.rar" where N = segment number and where N = 001, 002, 003, etc. For archives with encrypted headers, no archive metadata or archive item information is available until a password is applied (.rar). | |||||
X | X | 7-Zip archive file format (supports lossless data compression) (.7z). | |||
X | X | 7-Zip archive file format with encrypted headers. For archives with encrypted headers, no archive metadata or archive item information is available without password being applied first (.7z). | |||
X | X | 7-Zip split archive file format. This is the first segment (volume) of the split archive. Split 7-Zip segments usually have the following file name pattern: "filename.7z.N", where N = segment number (volume#) (.7z). | |||
7-Zip split archive segment (volume). This is the second or a later segment of a 7-Zip split archive. Split 7-Zip segments usually have the following file name pattern: "filename.7z.N", where N = segment number (volume#) (.7z). | |||||
ARC archive format (.arc). | |||||
X | X | Unix compress - LZW archive file format (LZW is the algorithm of the widely used Unix file compression utility 'compress' and is also used in the GIF image format) (.Z). | |||
X | X | TAR archive file format (.tar). | |||
Stuffit archive format (.sit). | |||||
Stuffit X archive format (.sitx). | |||||
X | X | Lzh archive file format (.lzh). | |||
LZ4 compression stream format (.LZ4). | |||||
Lempel-Ziv style data compression stream using Finite State Entropy coding format (LZFSE) (.lzfse). | |||||
Zstandard compressed file format (.zst). | |||||
lzip compressed archive file format (.lz). | |||||
X | X | Gzip archive file format. Gzip normally is used to compress just single files (.gz;.tgz). | |||
X | X | Lzma raw archive format. | |||
X | X | RPM Package Manager (originally Red Hat Package Manager) software package format (.rpm). | |||
ZOO compressed archive (.zoo). Old and uncommon format. | |||||
X | X | ARJ (Archive by Robert Jung), proprietary archive file format (.arj). | |||
X | X | Bzip2 archive file format. Bzip2 only compresses single files and is not a file archiver (.bz2). | |||
X | X | Microsoft cabinet archive file format (.cab). | |||
X | X | Open Debian software package format. Debian packages are standard Unix ar archives (.deb, .udeb). | |||
X | X | eXtensible ARchive format (XAR), an open source archive file format introduced in Mac OS X 10.5 (.xar). | |||
X | X | xz archive file format (.xz). | |||
X | X | cpio archive file format, primarily installed on Unix-like computer operating systems (.cpio). | |||
BlackHole archive file format (proprietary ZipTV compression format) (.bh). | |||||
WinAce Compressed File (a proprietary compression algorithm format) (.ace). | |||||
X | X | Microsoft compiled HTML Help file format (.chm). | |||
Microsoft Help 2.x file format (not released as a general help platform) (.hxs). | |||||
HDF4 general purpose file format to store and organize large amounts of data (.hdf4;.h4;.hdf). | |||||
HDF5 version 1, general purpose file format to store and organize large amounts of data (.hdf5;.h5;.hdf). | |||||
HDF5 version 2, general purpose file format to store and organize large amounts of data (.hdf5;.h5;.hdf). | |||||
X | X | The 'archiver', mainly known as 'ar', is a Unix utility format that groups files as a single archive file (.ar;.a;.lib). |
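The split archive rows above describe the usual segment file name patterns for ZIP ("filename.zN", "filename.zip.N", "filename.zxN"), RAR ("filename.partN.rar"), and 7-Zip ("filename.7z.N"). The sketch below shows one way to recognize those names and pull out the base name and segment number. It is an assumption-laden illustration: `parse_segment_name` and the kind labels are hypothetical and not part of the library, and a file name match says nothing about the actual file content, which the library identifies from the binary itself.

```python
import re

# Patterns follow the naming conventions described in the table.
# The labels ("zip", "rar", "7z") and the function name are hypothetical.
SEGMENT_PATTERNS = [
    # filename.zN, filename.zxN, filename.zip.N
    ("zip", re.compile(r"^(?P<base>.+)\.(?:z|zx|zip\.)(?P<n>\d+)$", re.IGNORECASE)),
    # filename.partN.rar
    ("rar", re.compile(r"^(?P<base>.+)\.part(?P<n>\d+)\.rar$", re.IGNORECASE)),
    # filename.7z.N
    ("7z", re.compile(r"^(?P<base>.+)\.7z\.(?P<n>\d+)$", re.IGNORECASE)),
]


def parse_segment_name(name: str):
    """Return (kind, base_name, segment_number), or None if no pattern matches."""
    for kind, pattern in SEGMENT_PATTERNS:
        match = pattern.match(name)
        if match:
            return kind, match.group("base"), int(match.group("n"))
    return None
```

Grouping files by the returned kind and base name, then sorting by segment number, is one simple way to check that all volumes of a split archive are present before extraction.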