Open Discover® Platform for .NET offers software developers a higher-level implementation of document processing, built entirely on the Open Discover® SDK.
For developers who routinely need to identify and extract content from many thousands to millions of documents, Open Discover® Platform offers DocumentTaskEngine, a highly parallel .NET document processing class. It can process thousands of documents recursively* as a single task, or process large archives and mail stores either as a single standalone task or as a distributed, partitioned** task.
* Recursively means that a container's (or document's) child documents, and any attachments or embedded items within those child documents (and so on), are processed until no more child documents can be extracted from the parent/child hierarchy. In other words, the parent/child hierarchy of a container or document is completely unrolled and processed.
** Large archives and mail stores can be partitioned into multiple tasks, with a separate distributed instance of DocumentTaskEngine processing each partition simultaneously, for faster and more fault-tolerant processing. For example, a 50 GB Outlook PST with 200K email objects can be partitioned into 4 tasks, with 4 instances of DocumentTaskEngine each processing a partition of 50K email objects.
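The partition arithmetic in the example above can be sketched in a few lines of C#. This is only an illustration of how an item count might be divided into contiguous ranges. `PartitionPlanner` and `Plan` are hypothetical names that are not part of the Open Discover® API, and the sketch makes no claim about how DocumentTaskEngine actually assigns partitions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of the Open Discover API. It only illustrates
// the arithmetic of dividing a mail store's items into contiguous partitions.
public static class PartitionPlanner
{
    // Splits totalItems into partitionCount contiguous ranges whose sizes
    // differ by at most one item. Returns zero-based (Start, Count) pairs.
    public static List<(int Start, int Count)> Plan(int totalItems, int partitionCount)
    {
        if (totalItems < 0)
            throw new ArgumentOutOfRangeException(nameof(totalItems));
        if (partitionCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(partitionCount));

        var ranges = new List<(int Start, int Count)>(partitionCount);
        int baseSize = totalItems / partitionCount;
        int remainder = totalItems % partitionCount;
        int start = 0;

        for (int i = 0; i < partitionCount; i++)
        {
            // The first `remainder` partitions each take one extra item.
            int count = baseSize + (i < remainder ? 1 : 0);
            ranges.Add((start, count));
            start += count;
        }
        return ranges;
    }

    public static void Main()
    {
        // The example from the text: 200K email objects split across 4 tasks.
        foreach (var (start, count) in Plan(200_000, 4))
        {
            Console.WriteLine($"items {start}..{start + count - 1} ({count} items)");
        }
    }
}
```

For the 200K-object example this yields four ranges of 50,000 items each. When the item count does not divide evenly, the leading partitions receive one extra item, so no partition is more than one item larger than another.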
Open Discover® Platform for .NET System Requirements:
| Class | Description |
|---|---|
| DocumentTaskEngine | Provides functionality to extract content from hundreds to thousands of documents as a single task (see DocumentTaskSettings), or from "large" archives and mail store containers that deserve their own separate tasks. The DocumentTaskEngine is a highly parallel document extraction engine that completely unrolls and processes deep parent document/child document (attachments/embedded objects/media) hierarchies. |

| Delegate | Description |
|---|---|
| NistDatabaseCreationUpdateDelegate | Delegate to update NIST National Software Reference Library (NSRL) Reference Data Set (RDS) database creation progress. |
| TaskCompletedHandler | Task completed delegate. |
| TaskFatalExceptionHandler | Task fatal error exception delegate. |
| TaskLogUpdatedHandler | Task log updated delegate. |
| TaskLongProcessingDocumentWarningHandler | Task long processing document warning delegate. |