WCT (Web Curator Tool)


Web Curator Tool (WCT) is a workflow management application for selective web archiving.
License: Apache License 2.0
Platforms: Apache Tomcat


Description

The Web Curator Tool (WCT) is a workflow management application for selective web archiving. WCT allows users to target websites that they wish to include in their collection, create and manage schedules to automatically harvest those sites, and package the collected files to easily submit them to a digital archive.

Provider

Developed by the National Library of New Zealand and the British Library, initiated by the International Internet Preservation Consortium. Currently maintained by Oakleigh Consulting Ltd.

Platform and interoperability

WCT is written in Java and designed to run in Apache Tomcat. It has been tested on Red Hat Linux, Solaris, and (to a lesser extent) Microsoft Windows. Three relational databases are officially supported: Oracle, MySQL, and PostgreSQL. The software incorporates all or part of several other open-source components, including Heritrix, Wayback, Acegi Security System, Apache Axis, Apache Commons Logging, Hibernate, Quartz, and the Spring Application Framework.

Functional notes

A key feature of the software is its ability to collect, store, and abide by harvest authorisations, i.e. permissions from copyright holders to download their material. WCT also supports separate administrative levels, so that those who set up harvests do not necessarily have the authority to start them. All material is captured in ARC format; because WCT incorporates Wayback, access within the system is not a problem. However, those who collect material with the ultimate goal of archiving it in a separate system must take the format into account.
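The interplay between harvest authorisations and administrative levels can be illustrated with a minimal sketch. This is not WCT's actual data model or API (WCT itself is written in Java); the names `Role`, `Authorisation`, `Target`, `approve`, and `may_harvest` are all hypothetical and chosen only to show the idea that a harvest runs when it has both a valid permission record and approval from a sufficiently privileged user.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum, auto
from typing import List


class Role(Enum):
    """Hypothetical administrative levels (not WCT's real role names)."""
    HARVEST_CREATOR = auto()   # may set up targets and schedules
    HARVEST_APPROVER = auto()  # may additionally approve a harvest to run


@dataclass
class Authorisation:
    """A permission from a copyright holder covering one URL prefix."""
    url_prefix: str
    granted_by: str
    valid_from: date
    valid_until: date

    def covers(self, url: str, on: date) -> bool:
        # The permission applies if the URL falls under the prefix
        # and the date lies inside the validity window.
        return url.startswith(self.url_prefix) and self.valid_from <= on <= self.valid_until


@dataclass
class Target:
    """A website selected for harvesting (hypothetical structure)."""
    seed_url: str
    authorisations: List[Authorisation] = field(default_factory=list)
    approved: bool = False


def approve(target: Target, role: Role) -> None:
    """Mark a target as approved; only the approver level may do this."""
    if role is not Role.HARVEST_APPROVER:
        raise PermissionError("this role may set up harvests but not approve them")
    target.approved = True


def may_harvest(target: Target, on: date) -> bool:
    """A harvest may start only if it is approved and covered by a permission."""
    return target.approved and any(
        a.covers(target.seed_url, on) for a in target.authorisations
    )
```

In this sketch, a user with the creator role can build a `Target` and attach `Authorisation` records, but `may_harvest` stays false until someone with the approver role calls `approve`, and it becomes false again once the permission's validity window has passed.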

Documentation and user support

The project site includes a well-written quick-start guide and user manual, although the manual contains some headings with no text beneath them. The site also includes a developer guide, published with release version 1.5.2. Links to the advertised wiki and FAQ sections are currently broken and forward to the SourceForge developer page instead. More information about the project can be found in an informative article published in Ariadne Issue 50. The primary forum for technical support appears to be an active “webcurator-users” mailing list. While bugs continue to be posted on the SourceForge bug/feature request tracker, the last addressed item was in February 2011.

Usability

WCT is specifically designed to be operated by non-technical users such as librarians, with a simple and relatively intuitive GUI. Installation, however, is difficult and will most likely require technical support.

Expertise required

Setup, especially if it includes links to an archival repository, requires system administration knowledge. Users must have a comprehensive understanding of their institution’s collections policies when designing harvests.

Standards compliance

WCT allows users to add basic Dublin Core metadata to the material.
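As a rough illustration of what basic Dublin Core metadata looks like, the sketch below builds a small XML record using the standard Dublin Core element namespace. It does not reproduce WCT's own storage or export format: the `record` wrapper element, the `dublin_core_record` helper, and the sample values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 basic Dublin Core elements (title, creator, date, ...).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialise a mapping of Dublin Core element names to values as XML.

    The enclosing <record> element is a placeholder for this example only.
    """
    record = ET.Element("record")
    for name, value in fields.items():
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(record, encoding="unicode")


# Hypothetical metadata for one harvested website.
xml_text = dublin_core_record({
    "title": "Example Organisation website",
    "creator": "Example Organisation",
    "identifier": "http://example.org/",
    "format": "application/x-arc",
})
print(xml_text)
```

Each key in the mapping becomes a `dc:`-prefixed element, so a curator's title, creator, and identifier entries travel with the harvested material in a form other systems can read.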

Influence and take-up

WCT is used by the National Library of New Zealand, the National Library of Norway, and the British Library. The SourceForge site lists nearly 8,300 downloads as of December 2011.

User Experiences

Development Activity

No information is available on the current funding status for development, although the SourceForge site’s bugtracker continues to list new entries and responses. WCT encourages developer participation, publishing a Developers Guide with the latest release.

All development activity is visible on GitHub: http://github.com/DIA-NZ/webcurator/commits

Release Feed

The last 3 releases:

2017-07-06 23:08:06
wct-v1.6.3 GA
by obrienben

2016-03-15 00:44:32
wct-v1.6.2 GA
by obrienben

2015-11-02 22:57:47
wct-v1.6.1 GA
by obrienben

Activity Feed

The last 5 commits:

2017-07-06 22:06:03
Refactor wct-store dependencies to match PR #6 (commit f92d8642cb0122127ddaf9173e08f40faf6d7e95)
by obrienben (https://github.com/obrienben)

2017-07-06 05:07:32
Removing requirement of RestrictHTMLSerialAgenciesToHTMLSerialTypes b… (commit c7feb4b2bbaa87310753cbcb2f38c6256892c325)
by obrienben (https://github.com/obrienben)

2017-03-23 01:25:17
readme markup fix. (commit fe7107ef3f3f3473e02b771f1f3ec9159539cd04)
by obrienben (https://github.com/obrienben)

2017-03-23 01:09:56
Updated readme for v1.6.3 (commit 98e99c416d499bc3c57a0d89b1c2e453471448e8)
by obrienben (https://github.com/obrienben)

2017-03-23 01:06:11
bugfixes - HTML Serials restriction toggle, and custom deposit fields… (commit f5b779a8f979619626588c867d4d244fda49fd0f)
by obrienben (https://github.com/obrienben)

Mailing List(s)

See here for more information.

Issues Feed

2013-06-26 00:24:07
#129 Profile tab returns a Null Pointer error
  • status: pending --> open-fixed
by Chris Mclean

2013-05-23 23:45:16
#80 Harvest result stays in Endorsed state when Archived
  • status: open --> closed-fixed
  • Group: --> Request estimate for v1.5
by Chris Mclean

2013-05-23 23:43:16
#14 Target and Instance anomolies
  • status: open --> closed-out-of-date
  • Group: --> Request estimate for v1.5
by Chris Mclean

2013-05-23 23:40:15
#63 Unable to archive TIs that had "manual intervention"
  • status: open --> closed-fixed
  • Group: --> Request estimate for v1.5
by Chris Mclean

2013-05-23 23:39:45
#70 Login
  • status: open --> closed-out-of-date
  • Group: --> Request estimate for v1.5
by Chris Mclean

