Web Archive Discovery

From COPTR
Revision as of 10:22, 14 February 2014 by MediaWiki default (talk | contribs) (Added initial webarchive-discovery outline)


Indexing and discovery tools for web archives.
Homepage: https://github.com/ukwa/webarchive-discovery
License: Mixed
Platforms: Java

Description

A full-text indexing system for web archives that uses Apache Solr as its search back-end. It supports both command-line and large-scale MapReduce (Hadoop) processing of ARC and WARC files. It also integrates file format analysis and scans for some known preservation risks.
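
To give a rough idea of what indexing a WARC file involves, the sketch below reads records from an uncompressed WARC stream and turns each one into a flat dictionary of the kind a search back-end could ingest. It is not the webarchive-discovery implementation, which is written in Java and is not shown on this page. The function names and the output field names (`id`, `url`, `crawl_date`, `record_type`, `content_length`, `text`) are assumptions made for illustration only. It also assumes uncompressed input, whereas WARC files in practice are often gzip-compressed.

```python
import io


def read_warc_records(stream):
    """Yield (headers, payload) pairs from an uncompressed WARC byte stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank separator lines between records
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"unexpected record start: {version!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # a blank line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The content block is exactly Content-Length bytes long.
        length = int(headers.get("Content-Length", "0"))
        payload = stream.read(length)
        yield headers, payload


def to_index_document(headers, payload):
    """Map one record to a flat dict; the field names are hypothetical."""
    return {
        "id": headers.get("WARC-Record-ID"),
        "url": headers.get("WARC-Target-URI"),
        "crawl_date": headers.get("WARC-Date"),
        "record_type": headers.get("WARC-Type"),
        "content_length": len(payload),
        "text": payload.decode("utf-8", errors="replace"),
    }


if __name__ == "__main__":
    # A single hand-written record, used only to exercise the sketch.
    body = b"Hello, archive!"
    sample = (
        b"WARC/1.0\r\n"
        b"WARC-Type: resource\r\n"
        b"WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000001>\r\n"
        b"WARC-Date: 2014-02-14T10:22:00Z\r\n"
        b"WARC-Target-URI: http://example.org/\r\n"
        b"Content-Type: text/plain\r\n"
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n\r\n"
        + body + b"\r\n\r\n"
    )
    for headers, payload in read_warc_records(io.BytesIO(sample)):
        print(to_index_document(headers, payload))
```

A real indexer would additionally need to handle compression, parse the HTTP response inside `response` records, extract text from formats such as HTML and PDF, and submit the resulting documents to the search server.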

User Experiences

  • Used by the UK Web Archive to provide access to their collections. More details TBA.

Development Activity
