Apache Tika
Revision as of 14:40, 2 July 2014 by Prwheatley
Description
Java-based tool for detecting and extracting metadata and text content from documents. Apache Tika is a core component of the Web Archive Discovery indexer and profiler.
User Experiences
- Comparing how Apache Tika and DROID perform HTML identification: How much of the UK's HTML is valid?
- A number of pages on the OPF Wiki mention Tika.
Development Activity
Release Feed
RSS feed: http://projects.apache.org/feeds/rss/tika.xml (request failed: 404 Not Found)
Activity Feed
- 2024-03-28 20:57:09
- ASF GitHub Bot updated a link from ASF GitHub Bot commented on
- by ASF GitHub Bot (https://issues.apache.org/jira/secure/ViewProfile.jspa?name=githubbot)
- 2024-03-28 20:55:58
- Joe McDonnell resolved Joe McDonnell commented on
commit 78a3723ad8251fbd786c276bd0f6b75a6b5bae...
- by Joe McDonnell (https://issues.apache.org/jira/secure/ViewProfile.jspa?name=joemcdonnell)
- 2024-03-28 20:54:53
- ASF GitHub Bot updated a link from ASF GitHub Bot commented on
- by ASF GitHub Bot (https://issues.apache.org/jira/secure/ViewProfile.jspa?name=githubbot)
- 2024-03-28 20:53:49
- Chandni Singh changed the Summary of