Tika

From COPTR
Revision as of 21:54, 3 November 2013 by COPTR Bot (talk | contribs) (Trial import from script.)

Summary

Purpose Detects and extracts metadata and text content from documents.
Homepage [1]
Source Code Repository [2]
License Apache License, Version 2.0
Debian Package

Description

Java-based tool for detecting and extracting metadata and text content from documents.
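To make "detecting" concrete, the sketch below shows the general idea of identifying a file's media type from its leading bytes (often called magic numbers). It is a minimal illustration using only the Java standard library and is not Tika's code or API: the class name `MagicSniffer`, its `detect` method, and the small signature table are all hypothetical and chosen for this example. Real detection in a tool such as Tika is considerably more thorough than a four-entry prefix table.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Apache Tika.
final class MagicSniffer {
    // Well-known leading byte signatures (assumed subset for the example).
    private static final byte[][] MAGIC = {
        "%PDF-".getBytes(StandardCharsets.US_ASCII),
        {(byte) 0x89, 'P', 'N', 'G'},
        "GIF8".getBytes(StandardCharsets.US_ASCII),
        {'P', 'K', 0x03, 0x04},
    };

    // Media type reported for the signature at the same index in MAGIC.
    private static final String[] TYPES = {
        "application/pdf",
        "image/png",
        "image/gif",
        "application/zip",
    };

    // Returns the media type whose signature prefixes the given bytes,
    // or the generic binary type when nothing matches.
    static String detect(byte[] head) {
        for (int i = 0; i < MAGIC.length; i++) {
            if (startsWith(head, MAGIC[i])) {
                return TYPES[i];
            }
        }
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(sample)); // prints "application/pdf"
    }
}
```

A prefix check like this cannot tell apart formats that share a container: an Office Open XML document and a plain ZIP archive both start with the same four bytes, which is one reason dedicated characterisation tools look beyond the first few bytes.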

User Experiences

Links to AQuA/SCAPE/Hackathon issues that use the tool:
* [SP:IS25 Web Content Characterisation]
* [SP:SO11 The Tika characterisation Tool]
* [SP:SO17 Web Archive Mime-Type detection workflow based on Droid and Apache Tika]

News Feeds

Release Feed

RSS feed that is updated when new releases occur:
{rss:max=7|url=http://projects.apache.org/feeds/rss/tika.xml}

Activity Feed

RSS feed that is updated when issue or code updates occur:
{rss:max=7|url=https://issues.apache.org/jira/activity?maxResults=10&streams=key+IS+TIKA}

Searching for Tika on OPF Labs

{search:query=Tika|type=page}