Apache Tika

From COPTR
Detects and extracts metadata and text content from documents.
Homepage: http://tika.apache.org/
License: Apache License, Version 2.0


Description

Java-based tool for detecting and extracting metadata and text content from documents.


User Experiences

Development Activity




Release Feed


Activity Feed
