Tika in Action

Authors: Chris Mattmann, Jukka Zitting
ISBN: 1935182854
Published: 05 Oct 2011

The information trapped in text files, PDFs, and other digital content is a valuable asset that can be very difficult to discover and use. Apache Tika is an open-source toolkit that makes it easy for search engines, content management systems, and other applications to detect and extract content from digital documents in all major file formats.

Editorial Reviews

Tika in Action is a hands-on guide for developers working with search engines, content management systems, and similar applications who want to exploit the information locked in digital documents. It introduces the world of mining text and binary documents as well as other information sources. The book shows where Tika fits within this landscape and how readers can use Tika to build and extend applications. Its many case studies draw on real-world experience from domains ranging from search engines to digital asset management and scientific data processing.
