HashingSettings.HashingType Property

Document hashing type.

Namespace: OpenDiscoverSDK.Interfaces.Settings
Assembly: OpenDiscoverSDK.Interfaces (in OpenDiscoverSDK.Interfaces.dll) Version: 2025.4.4.0 (2025.4.4)
Syntax
C#
[DataMemberAttribute]
public HashingType HashingType { get; set; }

Property Value

HashingType
Remarks

MD5/SHA-1 binary hashes are useful in determining if documents are identical at the byte level, i.e., if they are binary duplicates.
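The byte-level comparison described above can be illustrated with a minimal sketch. It uses .NET's own `System.Security.Cryptography.MD5` class rather than anything from OpenDiscoverSDK, and the class name `BinaryHashDemo` and helper `Md5Hex` are hypothetical names introduced only for this example; the SDK calculates its hashes internally.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class BinaryHashDemo
{
    // Returns the MD5 hash of the given bytes as an uppercase hex string.
    public static string Md5Hex(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            return BitConverter.ToString(md5.ComputeHash(data)).Replace("-", "");
        }
    }

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("quarterly report");
        byte[] exactCopy = (byte[])original.Clone();
        byte[] modified = Encoding.UTF8.GetBytes("quarterly report.");

        // Identical bytes produce identical hashes (binary duplicates).
        Console.WriteLine(Md5Hex(original) == Md5Hex(exactCopy));

        // A single added byte produces a different hash.
        Console.WriteLine(Md5Hex(original) == Md5Hex(modified));
    }
}
```

Because any byte-level difference changes the hash, two files holding the same content in different formats will not match under this kind of comparison.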

MD5/SHA-1 'content hashes' are useful in determining if email and other documents are identical with respect to content (e.g., for email: same sender, sent date, body text, attachments, etc.). By way of example, the exact same email may be stored as a MIME (.eml) file and as a Microsoft Outlook message (.msg) file. These two files will have different binary hashes because of the different file formats and internal data ordering/storage used, but with a proprietary content-based hash we are often able to determine that they are the same email (duplicates).

For email formats, the content hash is always calculated as long as the HashingType property is not equal to None. The BinaryAndContentHash value therefore affects only non-email documents: it enables content hashing for supported spreadsheet, word processing, and presentation formats. Calculating the content hash for large spreadsheets can be an expensive operation, so if you only need to extract metadata (see ExtractionType), consider setting this property to BinaryHashOnly.

Default property value: BinaryAndContentHash
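A minimal sketch of the metadata-only guidance above. Only the property name and the enum values (None, BinaryHashOnly, BinaryAndContentHash) come from this page. The sketch assumes that `HashingSettings` has a public parameterless constructor, that the `HashingType` enum is reachable through the namespace shown, and that the project references OpenDiscoverSDK.Interfaces.dll.

```csharp
using OpenDiscoverSDK.Interfaces.Settings;

// Assumption: HashingSettings can be constructed directly; the SDK may
// instead expose it through a parent settings object.
var hashingSettings = new HashingSettings
{
    // Metadata-only workflow: skip the potentially expensive content hash
    // for spreadsheet, word processing, and presentation documents.
    // Email content hashes are still calculated because the value is not None.
    HashingType = HashingType.BinaryHashOnly
};
```

Leaving the property at its default keeps content hashing enabled for the supported document formats.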
