Apache Tika

From COPTR
Revision as of 17:30, 12 November 2013 by COPTR Bot (talk | contribs) (Trial import from script.)


Detects and extracts metadata and text content from documents.
Homepage: http://tika.apache.org/
License: Apache License, Version 2.0


Description

Java-based tool for detecting document types and extracting metadata and text content from documents.


User Experiences

Development Activity

Release Feed

Release RSS feed: http://projects.apache.org/feeds/rss/tika.xml (the feed failed to load: HTTP 404 Not Found).

Activity Feed
