Apache Tika
Revision as of 20:43, 9 November 2014 by MediaWiki default (talk | contribs) (→Development Activity)
Description
A Java-based tool that identifies file formats using signatures and extracts metadata and text content from documents.
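As a rough illustration of what signature-based identification means, the sketch below matches the leading bytes of a file against a small table of well-known magic numbers. It is not Tika's API or implementation: the class name `SignatureSniffer`, the method `identify`, and the four-entry table are hypothetical and chosen only for this example. A real detector uses a far larger signature database and additional hints.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; not part of Apache Tika.
public class SignatureSniffer {
    // Tiny illustrative table: leading bytes (magic number) -> media type.
    private static final Map<byte[], String> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put(new byte[] {0x25, 0x50, 0x44, 0x46}, "application/pdf");  // "%PDF"
        SIGNATURES.put(new byte[] {(byte) 0x89, 0x50, 0x4E, 0x47}, "image/png"); // "\x89PNG"
        SIGNATURES.put(new byte[] {0x47, 0x49, 0x46, 0x38}, "image/gif");        // "GIF8"
        SIGNATURES.put(new byte[] {0x50, 0x4B, 0x03, 0x04}, "application/zip");  // "PK\x03\x04"
    }

    // Returns the media type of the first signature that prefixes the header,
    // or a generic binary type when nothing matches.
    public static String identify(byte[] header) {
        for (Map.Entry<byte[], String> entry : SIGNATURES.entrySet()) {
            if (startsWith(header, entry.getKey())) {
                return entry.getValue();
            }
        }
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdfHeader = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
        System.out.println(identify(pdfHeader));
    }
}
```

The table is checked in insertion order, so more specific signatures would need to be listed before shorter, more general ones if the table were extended.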
User Experiences
- Comparing how Apache Tika and DROID perform HTML identification: How much of the UK's HTML is valid?
- Apache Tika is a core component of the Web Archive Discovery indexer and profiler.
- A number of pages on the OPF Wiki mention Tika.
Development Activity
Release Feed
Issues Feed
- 2024-04-23 07:26:16
- ASF GitHub Bot updated a link from ASF GitHub Bot logged '10m' on ...
- by ASF GitHub Bot
- 2024-04-23 07:25:30
- ASF GitHub Bot changed the Labels to 'pull-request-available'...
- by ASF GitHub Bot
- 2024-04-23 07:25:29
- ASF GitHub Bot created a link from ASF GitHub Bot logged '10m' on ...
- by ASF GitHub Bot