OpenDiscoverSDK.Platform Namespace

Open Discover® Platform for .NET offers software developers a higher-level implementation of document processing built entirely upon the Open Discover® SDK.

For developers who routinely need to identify and extract content from many thousands to millions of documents, Open Discover® Platform offers DocumentTaskEngine, a highly parallel .NET document processing class. It can process thousands of documents recursively* as a single task, or process a large archive or mail store either as a single standalone task or as a distributed and partitioned** task.

* Recursively means that a container's (or document's) child documents, and any attachments or embedded items contained within those child documents (and so on), are processed until no more child documents can be extracted from the parent/child hierarchy. In other words, the parent/child hierarchy of a container or document is completely unrolled and processed.
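The unrolling described in the footnote above can be pictured as a depth-first walk over a tree. The following is only a minimal sketch under stated assumptions: DocNode and HierarchyUnroller are hypothetical names made up for this illustration, they are not Open Discover® SDK or Platform types, and nothing here reflects how DocumentTaskEngine is actually implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a container or document; NOT an Open Discover type.
public sealed class DocNode
{
    public string Name { get; }
    public List<DocNode> Children { get; } = new List<DocNode>();

    public DocNode(string name, params DocNode[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

// Hypothetical helper that illustrates "completely unrolling" a hierarchy.
public static class HierarchyUnroller
{
    // Yields the root and every descendant exactly once, parents before
    // children. An explicit stack is used instead of call recursion so that
    // very deep hierarchies do not exhaust the call stack.
    public static IEnumerable<DocNode> Unroll(DocNode root)
    {
        var pending = new Stack<DocNode>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            DocNode current = pending.Pop();
            yield return current;

            // Push children in reverse so they are visited in original order.
            for (int i = current.Children.Count - 1; i >= 0; i--)
            {
                pending.Push(current.Children[i]);
            }
        }
    }
}
```

For example, an archive that contains an email with one attachment plus a loose text file would be visited as archive, email, attachment, text file; the walk ends only when no unvisited child remains.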

** Large archives and mail stores can be partitioned into multiple tasks, with multiple distributed instances of DocumentTaskEngine processing these partitions simultaneously for faster and more fault-tolerant processing. For example, a 50 GB Outlook PST with 200K email objects can be partitioned into 4 tasks, with each of 4 DocumentTaskEngine instances processing a partition of 50K email objects.
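The arithmetic behind the partitioning example above can be sketched as follows. This is an illustration only: PartitionPlanner and its Plan method are hypothetical names introduced here, not part of the Open Discover® API, and the sketch assumes that items in a mail store can be addressed by a contiguous zero-based index, which the source does not state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; NOT part of the Open Discover SDK or Platform.
public static class PartitionPlanner
{
    // Splits itemCount items into partitionCount contiguous ranges whose
    // sizes differ by at most one. Each range is returned as (Start, Count).
    public static List<(int Start, int Count)> Plan(int itemCount, int partitionCount)
    {
        if (itemCount < 0)
            throw new ArgumentOutOfRangeException(nameof(itemCount));
        if (partitionCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(partitionCount));

        var ranges = new List<(int Start, int Count)>(partitionCount);
        int baseSize = itemCount / partitionCount;   // size every range gets
        int remainder = itemCount % partitionCount;  // first ranges get one extra
        int start = 0;

        for (int i = 0; i < partitionCount; i++)
        {
            int count = baseSize + (i < remainder ? 1 : 0);
            ranges.Add((start, count));
            start += count;
        }

        return ranges;
    }
}
```

Calling PartitionPlanner.Plan(200000, 4) yields four ranges of 50,000 items each, starting at indexes 0, 50000, 100000, and 150000, which matches the 4 x 50K split in the example. Each range could then be handed to a separate engine instance, so a failure in one partition would not require reprocessing the others.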

Open Discover® Platform for .NET System Requirements:

  • Supported .NET versions – .NET 6 (x64 builds only)
  • Minimum supported client – Windows 7
  • Minimum supported server – Windows Server 2008 R2 [desktop apps | UWP apps]
  • Minimum hardware for acceptable performance – a hyper-threaded 4-core Intel i7 with 16 GB of RAM, or the equivalent architecture for servers.

Classes

  • DocumentTaskEngine (public class) – Provides functionality to extract content from hundreds to thousands of documents as a single task (see DocumentTaskSettings), or from "large" archives and mail store containers that deserve their own separate tasks. The DocumentTaskEngine is a highly parallel document extraction engine that completely unrolls and processes deep parent document/child document (attachments/embedded objects/media) hierarchies.

Delegates

  • NistDatabaseCreationUpdateDelegate (public delegate) – Delegate to update NIST National Software Reference Library (NSRL) Reference Data Set (RDS) database creation progress.
  • TaskCompletedHandler (public delegate) – Task completed delegate.
  • TaskFatalExceptionHandler (public delegate) – Task fatal error exception delegate.
  • TaskLogUpdatedHandler (public delegate) – Task log updated delegate.
  • TaskLongProcessingDocumentWarningHandler (public delegate) – Task long processing document warning delegate.