Web Archive Discovery

From COPTR
Revision as of 14:16, 24 September 2018 by MediaWiki default (talk | contribs) (Added activity feeds.)


Indexing and discovery tools for web archives.
Homepage: https://github.com/ukwa/webarchive-discovery
License: Mixed
Platforms: Java

Description

A full-text indexing system that uses Apache Solr as its search back-end. It supports either command-line or large-scale MapReduce (Hadoop) processing of ARC and WARC files. It also integrates file format analysis and scans for some known preservation risks.
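Because the index lives in Apache Solr, a client can search it with an ordinary HTTP request to Solr's standard /select handler. The following is a minimal sketch of building such a request URL, not code from the project itself. The core URL (http://localhost:8983/solr/discovery), the class name, and the method name are hypothetical; only the q, rows, and wt parameters are standard Solr query parameters.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: builds a URL for Solr's standard /select request handler.
// The class and method names are illustrative and are not part of webarchive-discovery.
class SolrQuerySketch {

    // coreUrl:   base URL of a Solr core (assumed; depends on the deployment)
    // queryText: free-text query, sent as the standard "q" parameter
    // rows:      maximum number of results, sent as the standard "rows" parameter
    static String buildSelectUrl(String coreUrl, String queryText, int rows) {
        // URL-encode the query text so spaces and punctuation are safe in a query string
        String q = URLEncoder.encode(queryText, StandardCharsets.UTF_8);
        // wt=json asks Solr to return the response as JSON
        return coreUrl + "/select?q=" + q + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        // Hypothetical local core; replace with the URL of a real deployment
        String url = buildSelectUrl("http://localhost:8983/solr/discovery", "web archive", 5);
        System.out.println(url);
    }
}
```

The resulting URL can be fetched with any HTTP client. Which fields the query text is matched against depends on the Solr schema and request-handler defaults configured for the deployment.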

User Experiences

  • Used by the UK Web Archive to provide access to their collections. More details TBA.

Development Activity

All development activity is visible on GitHub: https://github.com/ukwa/webarchive-discovery

There is also a #webarchive-discovery channel on the IIPC Slack service. Contact https://twitter.com/NetPreserve for details.

Release Feed

The three most recent releases:

  • 2024-04-02 09:25:58: warc-discovery-3.3.1 (Revert of source_file_path), by GilHoggarth
  • 2023-06-02 11:04:22: warc-discovery-3.3.0, by anjackson
  • 2020-11-27 12:25:29: warc-discovery-3.1.0, by anjackson


Activity Feed

The five most recent commits:

  • 2024-08-09 10:57:54: Merge pull request #318 from lasztoth/langid-language-analyser (commit 40ce1635f79b8d9d13f3fa2a1577f0ca46aa8404), by GilHoggarth https://github.com/GilHoggarth
  • 2024-08-09 10:36:40: Added correct version of artifact (commit 380afa66e0d45e569f0dd2971c1a8039daa90402), by KGX747@MC212515.gouv.etat.lu
  • 2024-08-09 10:08:55: Update LanguageAnalyser.java (commit 170c8dfb3543159065af792cf226e2ea1726c852), by GilHoggarth https://github.com/GilHoggarth
  • 2024-08-09 10:02:40: Update pom.xml (commit 4def14555648fc24fd2a5e75f367b799e6759bf2), by GilHoggarth https://github.com/GilHoggarth
  • 2024-08-09 09:00:07: Merge pull request #317 from lasztoth/langid-language-analyser (commit 61e070adf7ebf91efd49a01928b8216b7019cd58), by GilHoggarth https://github.com/GilHoggarth

