Text retrieval software developer dtSearch Corp has released version 7.72 of its core product line, which adds expanded proprietary document filters.
For customers who need only data parsing, conversion, and extraction, the dtSearch Engine (with native 64-bit and 32-bit APIs for Windows and Linux in C++, Java, and .NET through current versions) also makes the document filters available for separate OEM licensing.
The document filters support static web-ready data, with integrated image and text support for HTML, XML/XSL, and PDF. Through the dtSearch Spider, they also cover dynamic web-based data, with integrated image and text support for PHP, ASP.NET, SharePoint, and similar sources.
The dtSearch Engine APIs cover SQL-type databases, including the full text of BLOB data, and all products support Access, XBASE, XML, CSV, and similar formats. Microsoft Office documents are also supported, as are emails and email attachments, including nested attachments, in MS Exchange, Outlook (PST/MSG), Thunderbird (MBOX/EML), and other popular email types.
For all supported formats, the document filters support data parsing and optional extraction, as well as conversion to HTML for browser display with highlighted hits.
dtSearch enterprise and developer products can index over a terabyte of data in a single index, spanning multiple directories, emails and attachments, online data, and other databases. The products can create and search any number of indexes.
Indexed search time is typically less than a second, even across terabytes of data. The product line also supports highly concurrent, multithreaded searching.
For developers, the dtSearch Engine for Win & .NET and the dtSearch Engine for Linux SDKs make dtSearch instant searching and the document filters available for a wide range of Internet, Intranet, and other commercial applications. The document filters can be licensed together with searching or separately. The SDKs include native 64-bit and 32-bit C++, Java, and .NET (through current versions) APIs.