SourceCode

Supported SourceCode file formats (IdClassification.SourceCode - Software source code related document)

  • All entries in table below are supported for file format identification.
  • 'X' in "Text" column indicates text extraction is supported for the file format.
  • 'X**' in "Text" column indicates text extraction is supported BUT binary-to-text filtering is used on partially parsed document records.
  • 'X' in "Metadata" column indicates metadata extraction is supported for the file format.
  • 'X' in "EmbeddedItem" column indicates embedded item/attachment extraction is supported for the file format.
  • 'X' in "ContentHash" column indicates a content hash is supported for the file format (see MD5ContentHash and SHA1ContentHash).

If a file format has no supported content extractor that extracts text, a binary-to-text content extractor is optionally used (enabled by default) to extract UTF-8, UTF-16, Windows-1252, and ASCII text from the binary. In many cases, indexable text can be extracted from unknown document formats.
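
The binary-to-text idea can be sketched roughly as follows. This is not the library's implementation: it is a simplified illustration, similar in spirit to the Unix strings tool, that pulls runs of printable ASCII and of UTF-16LE-encoded ASCII out of arbitrary bytes. The function names and the min_len threshold are assumptions, and UTF-8 multi-byte and Windows-1252 handling are left out for brevity.

```python
import re


def extract_ascii_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes (space to tilde)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def extract_utf16le_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII characters stored as UTF-16LE.

    Each character is a printable byte followed by a zero byte.
    """
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Made-up sample: binary noise around one ASCII run and one UTF-16LE run.
    sample = (
        b"\x00\x01Hello, world\xff\xfe"
        + "Widget".encode("utf-16-le")
        + b"\x01\x02"
    )
    print(extract_ascii_runs(sample))
    print(extract_utf16le_runs(sample))
```

The minimum run length trades recall for noise: a lower threshold recovers short identifiers but also admits more random byte sequences that happen to look like text.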

SourceCode Supported File Formats

File Format Id Enum Value

Text

Metadata

EmbeddedItem

ContentHash

Description

JavaClass

Java class file containing compiled java byte code (.class).

JavaArchive

X

X

Java Archive file (.jar).

AndroidAppPackage

X

X

Android application package (variant of JAR file format) (.apk).

AndroidBinaryXML

Android application binary XML format (.xml).

iOSAppStorePackage

X

X

Apple iOS App Store Package (.ipa).

AppleNIB

Apple nib resource file. A nib file stores the user interfaces of iOS and Mac apps (.nib).

AppleAssetCatalog

Apple application compiled asset catalog (.car).

PythonCompiled27

Compiled Python script source code file version 2.7 (.pyc;.pyo).

PythonCompiled34

Compiled Python script source code file version 3.4 (.pyc;.pyo).

VisualStudioSolution

X

Microsoft Visual Studio solution file (.sln).

VisualStudioSolutionUserOptions

Microsoft Visual Studio solution user options file (contains per-user solution options) (.suo).

CSharpVisualStudioProject

X

Microsoft Visual Studio C# project file (.csproj).

VBVisualStudioProject

X

Microsoft Visual Studio Visual Basic project file (.vbproj).

CppVisualStudioProject

X

Microsoft Visual Studio C++ project file (.vcxproj).

CppVisualStudioProjectFilters

Microsoft Visual Studio C++ project filters file. It specifies where a file added to the solution is placed; for example, a .h file is put in the Header Files node (.vcxproj.filters).

VisualStudioPDB7

Microsoft Visual Studio Program Database (PDB) (debugger file) version 7 (.pdb).

VisualStudioResxFile

X

Microsoft Visual Studio XML resource data file (stores application specific data such as strings and objects inside XML tags) (.resx).

VisualStudioCompiledResourceFile

Microsoft Visual Studio compiled resource file (a temporary Visual Studio project build file that is a compiled version of an XML .resx file) (.resources).

EdisonDesignGroupPCH

Edison Design Group (EDG) C/C++ pre-compiled header file. It is used by the Visual Studio C/C++ IntelliSense parser and by other compilers; these files can become quite large (.ipch).

VisualStudioNuGetPackage

X

X

Microsoft Visual Studio NuGet Package (NuGet is the package manager for the Microsoft development platform including .NET.) (.nupkg).

VisualStudioCompileCacheTempFile

Microsoft Visual Studio temporary cache file generated during builds of .NET (C#/VB) projects (.cache).

Xaml

X

Extensible Application Markup Language (XAML) file format. XAML is used in Windows Presentation Foundation (WPF), Silverlight, Windows Workflow Foundation (WF), Windows Runtime XAML Framework, and Windows Store applications (.xaml).

Baml

Binary Application Markup Language (BAML) file. A BAML file is a compiled .NET XAML file associated with Windows Presentation Foundation (WPF), Silverlight, Windows Workflow Foundation (WF), Windows Runtime XAML Framework, and Windows Store applications (.baml).

TypeLibraryFile

Microsoft Type Library source code file (.tlb).

ColdFusionML

ColdFusion Markup Language (CFML) (.cfm;.cfc).

CSourceFile

X

C language source code file (.c).

CHeaderFile

X

C language source code header file (.h).

CppSourceFile

X

C++ language source code file (.cpp;.cxx;.cc;.c;.c++).

CppHeaderFile

X

C++ language source code header file (.hpp;.hxx;.hh;.h;.hp;.h++).

CSharpFile

X

C# language source code file (.cs).

BasicFile

X

BASIC language source code file (.bas).

VBNetFile

X

VB.NET language source code file (.vb).

GoFile

X

Go language source code file (.go).

ClojureFile

X

Clojure language source code file (.clj).

CoffeeFile

X

CoffeeScript source code file (.coffee).

GroovyFile

X

Groovy language source code file (.groovy).

JavaFile

X

Java language source code file (.java).

LuaFile

X

Lua language source code file (.lua).

ScalaFile

X

Scala language source code file (.scala).

JavaScriptFile

X

JavaScript language source code file (.js).

COBOLFile

X

COBOL language source code file (.cbl;.cob).

FortranFile

X

Fortran language source code file (.f;.for;.f77;.f90).

PHPScriptFile

X

PHP script source code file (.php;.php3;.php4).

PythonScriptFile

X

Python script source code file (.py).

CythonScriptFile

X

Cython script source code file. Cython is a superset of the Python programming language (.pyx;.pxd).

SQLFile

X

Structured Query Language (SQL) data/statement file (.sql).

VBScriptFile

X

VBScript source code file (.vbs).

MarkdownFile

X

Markdown source code file (.md;.markdown).

AdaFile

X

Ada source code file (.ada;.ads;.adb).

ActionScriptFile

X

ActionScript source code file (.as).

TypeScriptFile

X

TypeScript source code file (.ts;.tsx).

AppleScriptFile

X

AppleScript source code file (.applescript).

ASPFile

X

Active Server Page (ASP) source code file (.asp).

ASPNetFile

X

ASP.NET source code file (.aspx).

AssemblyFile

X

Assembly source code file (.asm).

LispFile

X

LISP source code file (.lisp;.lsp;.cl).

ErlangFile

X

Erlang source code file (.erl).

ForthFile

X

Forth source code file (.4th).

PascalFile

X

Pascal source code file (.pas;.pp;.inc).

RexxFile

X

Rexx source code file (.rexx).

RubyFile

X

Ruby source code file (.rb).

SmalltalkFile

X

Smalltalk source code file (.st).

YAMLFile

X

YAML source code file (.yml;.yaml).

XQueryFile

X

XQuery source code file (.xq;.xql;.xqm;.xqy;.xquery).

CGIScriptFile

X

Common Gateway Interface (CGI) script source code file (.cgi).

IDLFile

X

Microsoft Interface Definition Language source code file (.idl).

PerlFile

X

Perl script source code file (.plx;.pl;.perl).

TclFile

X

Tcl script source code file (.tcl;.tbc).

VHDLFile

X

VHDL source code file (.vhdl;.vhd).

HaskellFile

X

Haskell related source code file (.hs;.lhs;.cabal).

LispFlavoredErlangFile

X

Lisp Flavored Erlang (LFE) related source code file (.lfe;.hrl).

GradleBuildToolFile

X

Gradle build tool file (.gradle).

ObjectDefLanguageFile

X

Microsoft Object Definition Language (ODL) (.odl).

JSONiqFile

X

JSONiq query language source code file (.jq;.jqy).

ColdFusionScript

X

ColdFusion script language (CFScript) (.cfc).
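
Several extensions in the table appear under more than one format id (for example .c, .h, .cfc, .pyc, and .pyo), so an extension alone does not always determine the format. The sketch below indexes a small subset of the table by extension and reports the overlaps. The table does not say how the library resolves these cases; the function names are hypothetical and the code only surfaces the ambiguity.

```python
from collections import defaultdict

# A small subset of the table above: format id -> listed extensions.
FORMAT_EXTENSIONS = {
    "CSourceFile": [".c"],
    "CHeaderFile": [".h"],
    "CppSourceFile": [".cpp", ".cxx", ".cc", ".c", ".c++"],
    "CppHeaderFile": [".hpp", ".hxx", ".hh", ".h", ".hp", ".h++"],
    "ColdFusionML": [".cfm", ".cfc"],
    "ColdFusionScript": [".cfc"],
    "PythonCompiled27": [".pyc", ".pyo"],
    "PythonCompiled34": [".pyc", ".pyo"],
    "GoFile": [".go"],
}


def formats_by_extension(table: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert the table: extension -> format ids that list it, in table order."""
    index: dict[str, list[str]] = defaultdict(list)
    for format_id, extensions in table.items():
        for ext in extensions:
            index[ext.lower()].append(format_id)
    return dict(index)


def ambiguous_extensions(table: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return only the extensions that are listed under more than one format id."""
    return {
        ext: ids
        for ext, ids in formats_by_extension(table).items()
        if len(ids) > 1
    }


if __name__ == "__main__":
    for ext, ids in sorted(ambiguous_extensions(FORMAT_EXTENSIONS).items()):
        print(ext, "->", ", ".join(ids))
```

For an ambiguous extension, a caller would need evidence beyond the file name, such as the file's content, to pick one format id.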