
EntityExtractionSettings.DeduplicateEntityItems Property

If enabled (true), duplicate entity items are not included in the results. See Remarks.

Namespace: OpenDiscoverSDK.Interfaces.Settings.TextAnalytics
Assembly: OpenDiscoverSDK.Interfaces (in OpenDiscoverSDK.Interfaces.dll) Version: 2025.4.4.0 (2025.4.4)
Syntax
C#
[DataMemberAttribute]
public bool DeduplicateEntityItems { get; set; }

Property Value

Boolean
Remarks

The default value of this property is true, so only unique entity items are returned: for each set of duplicate items, only the first occurrence is returned. Setting this property to false overrides this behavior, and every detected entity item is returned. Be aware that doing so can return tens of thousands of entities (many of them duplicates) from large spreadsheets and other large documents.

When this property is set to true, items that have the same text value (Text) and the same EntityType as an earlier item are not included; only the first instance detected is returned in Items.
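
The rule above (same Text and same EntityType, first occurrence kept) can be illustrated with a minimal standalone sketch. It does not call the SDK and is not the SDK's implementation: EntityItemSketch and DeduplicationSketch are hypothetical names, EntityType is modeled as a plain string, and the case-sensitive text comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an SDK entity item; only Text and EntityType matter here.
public sealed record EntityItemSketch(string Text, string EntityType);

public static class DeduplicationSketch
{
    // Keeps the first occurrence of each (Text, EntityType) pair, preserving input order.
    // Assumption: comparison is ordinal and case-sensitive.
    public static List<EntityItemSketch> Deduplicate(IEnumerable<EntityItemSketch> items)
    {
        var seen = new HashSet<(string Text, string EntityType)>();
        var result = new List<EntityItemSketch>();
        foreach (var item in items)
        {
            // HashSet.Add returns false when the key is already present.
            if (seen.Add((item.Text, item.EntityType)))
            {
                result.Add(item);
            }
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var detected = new List<EntityItemSketch>
        {
            new("jane@example.com", "Email"),
            new("jane@example.com", "Email"), // dropped: same Text and EntityType
            new("555-0100", "Phone"),
            new("555-0100", "Fax"),           // kept: same Text, different EntityType
        };

        foreach (var item in DeduplicationSketch.Deduplicate(detected))
        {
            Console.WriteLine($"{item.EntityType}: {item.Text}");
        }
    }
}
```

In this sketch, the four detected items reduce to three: the second "Email" item is dropped, while the "Phone" and "Fax" items are both kept because their entity types differ.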

Some documents, such as spreadsheets, can contain large amounts of repeated sensitive item data. To avoid capturing every instance of the same sensitive item, set this property to true.

See Also