WordProcessing
Supported WordProcessing file formats (IdClassification.WordProcessing - Word processing document formats)
If a file format does not have a supported content extractor that extracts text, then by default (this behavior is optional) a binary-to-text content extractor is used to extract UTF-8, UTF-16, Windows-1252, and ASCII text from the binary. In many cases, indexable text can be extracted from unknown document formats.
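To illustrate the general idea behind a binary-to-text fallback, here is a minimal sketch that scans raw bytes for runs of printable characters. It is not the library's extractor: the function name `extract_text_runs` and the `min_len` parameter are hypothetical, and the sketch only handles ASCII and UTF-16LE, whereas the extractor described above also covers UTF-8 and Windows-1252.

```python
import re


def extract_text_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable text runs found in raw bytes, in file order.

    Hypothetical helper for illustration only (ASCII and UTF-16LE).
    """
    # Runs of printable ASCII bytes (space through tilde).
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as UTF-16LE: each byte followed by 0x00.
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [(m.start(), m.group().decode("ascii"))
             for m in ascii_run.finditer(data)]
    found += [(m.start(), m.group().decode("utf-16-le"))
              for m in utf16_run.finditer(data)]
    # Sort by byte offset so the text keeps its original order.
    return [text for _, text in sorted(found)]


sample = (b"\x00\x01Hello world\xff\xfe"
          + "Unicode text".encode("utf-16-le")
          + b"\x00\x02ab")
print(extract_text_runs(sample))  # ['Hello world', 'Unicode text']
```

Short runs such as the trailing `ab` are dropped because they fall below `min_len`, which is one simple way to keep binary noise out of an index.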
File Format Id Enum Value | Text | Metadata | EmbeddedItem | ContentHash | Description |
|---|---|---|---|---|---|
Microsoft Works Word Processor DOS versions 1-3 and version 2.0 for Windows (.wps). | |||||
Microsoft Works Word Processor version 3 for Windows (.wps) | |||||
Microsoft Works Word Processor version 4 for Windows (.wps) | |||||
X | Microsoft Word for DOS versions 1.0 - 4.0 (.doc). | ||||
X | Microsoft Word 5.0 for DOS (.doc). | ||||
X | Microsoft Word 5.5 for DOS (.doc). | ||||
X | Microsoft Word 6.0 for DOS (.doc). | ||||
X | Word for Windows 1.0 (version 1, 1989) | ||||
X | Word for Windows 2.0 (version 2, 1991) | ||||
X | X | X | Word for Windows 6.0 (version 6, 1993). Version numbers 3 through 5 were skipped to bring Windows version numbering in line with that of DOS. | ||
X | X | X | Microsoft Word 95 (version 7, 1995) | ||
X | X | X | Encrypted Microsoft Word 95 (version 7, 1995) | ||
X | X | X | X | Microsoft Word 97 (version 8, 1997) | |
X | X | X | X | Encrypted Microsoft Word 97 (version 8, 1997) | |
X | X | X | X | Microsoft Word 2000 (version 9, 1999) | |
X | X | X | X | Encrypted Microsoft Word 2000 (version 9, 1999) | |
X | X | X | X | Microsoft Word 2002 (version 10, 2001) | |
X | X | X | X | Encrypted Microsoft Word 2002 (version 10, 2001) | |
X | X | X | X | Microsoft Word 2003 (version 11, 2003) | |
X | X | X | X | Encrypted Microsoft Word 2003 (version 11, 2003) | |
X | X | X | X | Microsoft Word 2007 (version 12, 2006) | |
X | X | X | X | Microsoft Word 2007 macro-enabled document (version 12, 2006) | |
X | X | X | X | Microsoft Word 2007 document template (version 12, 2006) | |
X | X | X | X | Microsoft Word 2007 macro-enabled document template (version 12, 2006) | |
X | X | X | X | Microsoft Word 2010 (version 14, 2010) | |
X | X | X | X | Microsoft Word 2010 macro-enabled document (version 14, 2010) | |
X | X | X | X | Microsoft Word 2010 document template (version 14, 2010) | |
X | X | X | X | Microsoft Word 2010 macro-enabled document template (version 14, 2010) | |
X | X | X | X | Microsoft Word 2013 (version 15, 2013) | |
X | X | X | X | Microsoft Word 2013 macro-enabled document (version 15, 2013) | |
X | X | X | X | Microsoft Word 2013 document template (version 15, 2013) | |
X | X | X | X | Microsoft Word 2013 macro-enabled document template (version 15, 2013) | |
X | X | X | X | Microsoft Word 2016 (version 16, 2015) | |
X | X | X | X | Microsoft Word 2016 macro-enabled document (version 16, 2015) | |
X | X | X | X | Microsoft Word 2016 document template (version 16, 2015) | |
X | X | X | X | Microsoft Word 2016 macro-enabled document template (version 16, 2015) | |
X | X | X | X | Microsoft Word 2007 or higher that is potentially corrupted. The format's zip container failed inspection and the format had to be identified by alternate means (.docx). | |
X | X | X | X | Encrypted Microsoft Word 2007-2013 | |
X | X | X | X | Encrypted and information rights management protected (IRM) Microsoft Word 2007-2016 format. IRM (what Microsoft calls DRM) uses permissions and authorization to help prevent sensitive information from being printed, forwarded, or copied by authorized users, or accessed by unauthorized people. | |
X | X | X | Microsoft Word 2003 (version 11, 2003) saved as XML file (.xml). | ||
X | X | X | Microsoft Word 2007 (version 12, 2006) saved as XML file (.xml). | ||
X | X | Microsoft Word 2007 (version 12, 2006) saved as XML file (.xml). | |||
X | X | X | Microsoft Word saved as (MIME) MHTML (.mht;.mhtml). | ||
Microsoft Word 97-2003 compound file format corrupted. Unable to determine specific format version (.doc). | |||||
X | X | X | Microsoft Word Picture 6.0 metafile (usually embedded metafiles in Microsoft Word 6 documents) (.doc). | ||
X | X | X | Microsoft Word Picture 95 metafile (usually embedded metafiles in Microsoft Word 95 documents) (.doc). | ||
X | X | X | X | Microsoft Word Picture metafile (usually embedded metafiles in Microsoft Word 97-2003 documents) (.doc). | |
X | Microsoft Word 1.0 for Mac OS (.mcw;.clx;.doc; or no extension). | ||||
X | Microsoft Word 3.0 for Mac OS (.mcw;.clx;.doc; or no extension). | ||||
X | Microsoft Word 4.0 for Mac OS (.mcw;.clx;.doc; or no extension). | ||||
X | Microsoft Word 5.0 for Mac OS (.mcw;.clx;.doc; or no extension). | ||||
X | X | X | StarOffice Writer version 5.2. | ||
X | X | X | StarOffice Writer versions 6.0 and 7 (.sxw;.odt). | ||
Encrypted StarOffice Writer versions 6.0 and 7 (.sxw;.odt). | |||||
X | X | X | StarOffice Writer version 8.0 (.sxw;.odt). | ||
Encrypted StarOffice Writer version 8.0 (.sxw;.odt). | |||||
X | X | X | StarOffice Writer version 9.0 (.sxw;.odt). | ||
Encrypted StarOffice Writer version 9.0 (.sxw;.odt). | |||||
X | X | X | OpenOffice.org Writer versions 1.x by Sun Microsystems (.sxw;.odt). | ||
X | X | X | OpenOffice.org Writer Template versions 1.x by Sun Microsystems (.stw;.ott). | ||
X | X | X | OpenOffice.org Writer versions 2.x by Sun Microsystems (.sxw;.odt). | ||
X | X | X | OpenOffice.org Writer Template versions 2.x by Sun Microsystems (.stw;.ott). | ||
X | X | X | OpenOffice.org Writer versions 3.x by Sun Microsystems (Last version of OpenOffice.org until it became Oracle OpenOffice Writer 3.3) (.sxw;.odt). | ||
X | X | X | OpenOffice.org Writer Template versions 3.x by Sun Microsystems (Last version of OpenOffice.org until it became Oracle OpenOffice Writer 3.3) (.stw;.ott). | ||
X | X | X | Oracle OpenOffice Writer version 3.3 (Last version of Oracle OpenOffice until it became Apache OpenOffice) (.odt). | ||
X | X | X | Oracle OpenOffice Writer Template version 3.3 (Last version of Oracle OpenOffice until it became Apache OpenOffice) (.ott). | ||
X | X | X | Apache OpenOffice Writer version 3.4+ (First version of Apache OpenOffice which came from open-sourced Oracle OpenOffice 3.3) (.odt). | ||
X | X | X | Apache OpenOffice Writer Template version 3.4+ (First version of Apache OpenOffice which came from open-sourced Oracle OpenOffice 3.3) (.ott). | ||
X | X | X | Apache OpenOffice Writer version 4.x (.odt). | ||
X | X | X | Apache OpenOffice Writer Template version 4.x (.ott). | ||
X | X | X | LibreOffice Writer version 3.x (3.3 is the first version of LibreOffice after the fork from OpenOffice.org) (.odt). | ||
X | X | X | LibreOffice Writer Template version 3.x (3.3 is the first version of LibreOffice after the fork from OpenOffice.org) (.ott). | ||
X | X | X | LibreOffice Writer version 4.x (.odt). | ||
X | X | X | LibreOffice Writer Template version 4.x (.ott). | ||
X | X | X | LibreOffice Writer version 5.x (.odt). | ||
X | X | X | LibreOffice Writer Template version 5.x (.ott). | ||
X | X | X | LibreOffice Writer version 6.x (.odt). | ||
X | X | X | LibreOffice Writer Template version 6.x (.ott). | ||
X | X | X | OpenDocument Text version 1.0. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.odt;.fodt). | ||
X | X | X | Encrypted OpenDocument Text version 1.0. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.odt;.fodt). | ||
X | X | X | OpenDocument Text version 1.0. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.ott;.fott). | ||
X | X | X | Encrypted OpenDocument Text version 1.0. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.ott;.fott). | ||
X | X | X | OpenDocument Text version 1.1. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.odt;.fodt). | ||
X | X | X | Encrypted OpenDocument Text version 1.1. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.odt;.fodt). | ||
X | X | X | OpenDocument Text version 1.1. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.ott;.fott). | ||
X | X | X | Encrypted OpenDocument Text version 1.1. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.ott;.fott). | ||
X | X | X | OpenDocument Text version 1.2. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.odt;.fodt). | ||
X | X | X | Encrypted OpenDocument Text version 1.2. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.odt;.fodt). | ||
X | X | X | OpenDocument Text version 1.2. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.ott;.fott). | ||
X | X | X | Encrypted OpenDocument Text version 1.2. Developed by OASIS, based on the original OpenOffice document format. This is a generic format as Apache OpenOffice, LibreOffice, and other applications use this format (.ott;.fott). | ||
X | X | X | Embedded OpenDocument Text version 1.0. Developed by OASIS, based on the original OpenOffice document format. Embedded/extracted ODF documents that are missing their own "manifest.xml" and "mimetype" zip entries. They can result from improper extraction from their parent OpenDocument formats (.odt). | ||
X | X | X | Embedded OpenDocument Text version 1.1. Developed by OASIS, based on the original OpenOffice document format. Embedded/extracted ODF documents that are missing their own "manifest.xml" and "mimetype" zip entries. They can result from improper extraction from their parent OpenDocument formats (.odt). | ||
X | X | X | Embedded OpenDocument Text version 1.2. Developed by OASIS, based on the original OpenOffice document format. Embedded/extracted ODF documents that are missing their own "manifest.xml" and "mimetype" zip entries. They can result from improper extraction from their parent OpenDocument formats (.odt). | ||
Hancom HWP2 word processor file format version 2.0 (.hwp). | |||||
Hancom HWP2.1 word processor file format version 2.1 (.hwp). | |||||
X | X | Hancom HWP3 word processor file format version 3.0 (.hwp). | |||
X | X | X | X | Hancom HWP5 (Hanword) word processor file format version 5.0 (.hwp). | |
Encrypted Hancom HWP5 (Hanword) word processor file format version 5.0 (.hwp). | |||||
X | Hancom Word Processor Markup Language (HWPML) (.hml). | ||||
Ichitaro version 3 Japanese word processor produced by JustSystems (.jaw;.jbw;.jtw;.juw). | |||||
Ichitaro version 4 Japanese word processor produced by JustSystems (.jaw;.jbw;.jtw;.juw). | |||||
X | Ichitaro version 5 Japanese word processor produced by JustSystems (.jaw;.jbw;.jtw;.juw). | ||||
X | Ichitaro version 6 Japanese word processor produced by JustSystems (.jaw;.jbw;.jtw;.juw). | ||||
X | X | X | Ichitaro version 7 Japanese word processor produced by JustSystems (.jfw;.jvw;.jtw;.juw). | ||
X | X | X | Ichitaro version 8 Japanese word processor produced by JustSystems (.jtd;.jtdc;.jtt;.jttc). | ||
X | X | X | Ichitaro compressed version 8 Japanese word processor produced by JustSystems (.jtd;.jtdc;.jtt;.jttc). | ||
Novell PerfectWorks for Windows (.wpw). | |||||
X | X | X | Corel WordPerfect version 1.0 for Mac. | ||
X | X | X | Corel WordPerfect version 2.0 for Mac. | ||
X | X | X | Corel WordPerfect version 3.0 for Mac. | ||
X | X | X | Corel WordPerfect version 3.5e for Mac. | ||
X | X | X | Corel WordPerfect version 4.0 (.wp4;.wpf). | ||
X | X | X | Corel WordPerfect version 4.2 (.wp4;.wpf). | ||
X | X | X | Corel WordPerfect version 5.0 (.wp5;.wp). | ||
Encrypted Corel WordPerfect version 5.0 (.wp5;.wp). | |||||
X | X | X | Corel WordPerfect version 5.1 (.wp5;.wp). | ||
Encrypted Corel WordPerfect version 5.1 (.wp5;.wp). | |||||
X | X | X | Corel WordPerfect version 5.1 Far East (.wp5;.wp). | ||
X | X | X | Corel WordPerfect versions 6.0 to X8 (.wpd;.wp;.wp6;.wp7). | ||
Encrypted Corel WordPerfect versions 6.0 to X8 (.wpd;.wp;.wp6;.wp7). | |||||
X | X | X | Corel WordPerfect versions 6.0 to X8 saved in compound file format (.wp;.wp6;.wp7). | ||
Encrypted Corel WordPerfect versions 6.0 to X8 saved in compound file format (.wp;.wp6;.wp7). | |||||
Corel WordPerfect Template File. Used by Corel WordPerfect to create automated templates (.wpx). | |||||
X | X | X | Corel WordPerfect compound file version 6.1 (.wpd;.wp;.wp6). | ||
X | Microsoft Write (.wri). | ||||
XyWrite for DOS and Windows versions 1-4. The final version for DOS was 4.18 (1993); for Windows, 4.13 (.xy;.xy3;.xyp;.xy4;.xyw). | |||||
WordStar word processor version 5 (.wsd;.ws5;.ws). | |||||
WordStar word processor version 5.5 (.wsd;.ws5;.ws). | |||||
WordStar word processor version 6 (.wsd;.ws6;.ws). | |||||
WordStar word processor version 7 (.wsd;.ws7;.ws). | |||||
WordStar word processor version 2000 (version 1) (.wsd;.wsw;.ws). | |||||
Legacy (purchased by WordStar) for Windows (.chp). | |||||
WordStar for Windows (the last version of WordStar; an altered version of the Legacy word processor, released as WordStar in 1991) (.wsd). | |||||
X | X | Lotus Ami Pro (originally by Samna, Samna was purchased by Lotus Software in 1990) (.sam). | |||
Lotus Word Pro 97 word processor (based on Lotus Ami Pro) (.lwp). | |||||
Encrypted Lotus Word Pro 97 word processor (based on Lotus Ami Pro) (.lwp). | |||||
Lotus Word Pro 9 word processor (.lwp). | |||||
First Choice word processor (.doc). | |||||
First Choice word processor version 3.0 (.doc). | |||||
IBM DisplayWrite versions 3.0, 4.0, and 5.0 (.txt;.doc). | |||||
IBM DisplayWrite Final Form Text (FFT) (.fft;.txt;.doc). | |||||
IBM DisplayWrite Reversible Format Text (RFT) (.fft;.txt;.doc). | |||||
Symantec JustWrite word processor versions 1.0 and 2.0 (.jw). | |||||
MultiMate word processor version 3.3 - 3.6 (.dox;.doc). | |||||
MultiMate word processor version 4.0 (.dox;.doc). | |||||
Navy Data Interchange Format (DIF) is a historical word processor/spreadsheet standard format. | |||||
OfficeWriter word processor version 6.x (.wp). | |||||
Volkswriter word processor (.vw;.vw3;.vw4). | |||||
Wang IWP (.doc). | |||||
Enable WP 4 (.wpf;.en4). | |||||
Professional Write 1 (.pfs). | |||||
Professional Write 2 (.pfs). | |||||
Adobe FrameMaker document (all versions) (.fm). | |||||
Adobe FrameMaker Interchange Format document version 3.0 (.mif). | |||||
Adobe FrameMaker Interchange Format document version 4.0 (.mif). | |||||
Adobe FrameMaker Interchange Format document version 5.0 (.mif). | |||||
Adobe FrameMaker Interchange Format document version 5.5 (.mif). | |||||
Adobe FrameMaker Interchange Format document version 6.0 (.mif). | |||||
Adobe FrameMaker Interchange Format document (all versions) (.mif). | |||||
X | AbiWord Document (open-source word processor similar to Microsoft Word) (.abw). | ||||
X | AbiWord Document (open-source word processor similar to Microsoft Word) (.abw;.zabw;.gz). | ||||
AbiWord Document Template (open-source word processor similar to Microsoft Word) (.abw;.awt). | |||||
AbiWord Document Template (open-source word processor similar to Microsoft Word) (.abw;.awt;.zawt;.gz). | |||||
X | X | X | StarOffice Formula 5.x (.sxf;.sxm). | ||
X | X | X | StarOffice Math versions 6 (beta) to 7 (.sxm;.sxf). | ||
Encrypted StarOffice Math versions 6 (beta) to 7 (.sxm;.sxf). | |||||
X | X | X | OpenDocument Math (formula) document (.sxm;.odf). | ||
X | X | X | Encrypted OpenDocument Math (formula) document (.sxm;.odf). | ||
X | X | X | Embedded OpenDocument Math document. Embedded/extracted ODF documents are missing their own "META-INF/manifest.xml" and "mimetype" zip entries. They can result from improper extraction from their parent OpenDocument formats (.sxm;.odf). | ||
X | X | Apple iWork '05 - '09 Productivity Suite Pages word processor versions 1.0 - 4.0 (.pages;.pages.zip;.zip). | |||
Encrypted Apple iWork '05 - '09 Productivity Suite Pages word processor versions 1.0 - 4.0 (.pages;.pages.zip;.zip). | |||||
X** | Apple iWork 2013-2016 Productivity Suite Pages word processor versions 5.0 - 6.0 (.pages;.pages.zip;.zip). | ||||
Encrypted Apple iWork 2013-2016 Productivity Suite Pages word processor versions 5.0 - 6.0 (.pages;.pages.zip;.zip). | |||||
ClarisWorks Word Processor version 1 (.cwk). | |||||
ClarisWorks Word Processor versions 2-3 (.cwk). | |||||
ClarisWorks Word Processor version 4 (.cwk). | |||||
ClarisWorks Word Processor version 5 (.cwk). | |||||
AppleWorks Word Processor version 6 (originally ClarisWorks and was renamed AppleWorks after version 5) (.cwk). | |||||
X | X | X | Ability Write word processor format version 4.0-6.0 by Ability Plus Software (.aww). | ||
X | Scrivener word-processing and outliner designed for authors (XML format) (.scrivx). |
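
The "Embedded OpenDocument" rows above describe ODF documents that are missing their own "manifest.xml" and "mimetype" zip entries. As a rough illustration of how such a package could be recognized, the sketch below inspects the entry names of a zip container. The function name `odf_package_status` and its status labels are hypothetical, and using `content.xml` as the hint for an incomplete ODF package is an assumption of this sketch, not a description of how the library identifies these formats.

```python
import io
import zipfile


def odf_package_status(data: bytes) -> str:
    """Classify a zip-based OpenDocument candidate by its entries.

    Hypothetical helper; the status labels are illustrative only.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = set(zf.namelist())
    except zipfile.BadZipFile:
        return "not-a-zip"

    has_mimetype = "mimetype" in names
    has_manifest = "META-INF/manifest.xml" in names
    if has_mimetype and has_manifest:
        return "complete"
    # ODF packages keep the document body in content.xml, so its presence
    # without the package entries suggests an embedded or extracted document.
    if "content.xml" in names:
        return "embedded-or-incomplete"
    return "unknown"


def make_zip(entries: dict[str, str]) -> bytes:
    """Build an in-memory zip from a name -> text mapping (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, body in entries.items():
            zf.writestr(name, body)
    return buf.getvalue()


embedded = make_zip({"content.xml": "<office:document-content/>"})
print(odf_package_status(embedded))  # embedded-or-incomplete
```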