WCT (Web Curator Tool)
{{Infobox tool
|image=wct-banner-500x100.jpg
|purpose=Web Curator Tool (WCT) is a workflow management application for selective web archiving.
|homepage=https://webcuratortool.org/
|sourcecode=https://github.com/WebCuratorTool/webcurator
|license=[http://www.apache.org/licenses/LICENSE-2.0 Apache License 2.0]
|platforms=Apache Tomcat
|function=Metadata Processing, Web Capture
|content=Web
}}
{{Infobox tool details
|ohloh_id=webcurator
}}
Latest revision as of 16:38, 26 November 2021
Description
The Web Curator Tool (WCT) is a workflow management application for selective web archiving. WCT allows users to target websites that they wish to include in their collection, create and manage schedules to automatically harvest those sites, and package the collected files to easily submit them to a digital archive.
Provider
WCT was developed by the National Library of New Zealand and the British Library in a project initiated by the International Internet Preservation Consortium. It is currently maintained by Oakleigh Consulting Ltd.
Platform and interoperability
WCT was written in Java and designed to run in Apache Tomcat. It has been tested on Red Hat Linux, Solaris, and (to a lesser extent) Microsoft Windows. Three relational databases are officially supported: Oracle, MySQL and PostgreSQL. The software itself makes use of part or all of several other open-source components, including: Heritrix; Wayback; Acegi Security System; Apache Axis; Apache Commons Logging; Hibernate; Quartz; and Spring Application Framework.
Functional notes
An important feature of the software is its ability to collect, store, and abide by harvest authorisations, i.e. permissions from copyright holders to download their material. WCT also supports separate administrative levels, so that those who set up harvests do not necessarily have the authority to start them. All material is captured in ARC format; since WCT incorporates Wayback, access within the system is not a problem. However, those who collect material with the ultimate goal of archiving it in a separate system must take the format into account.
Documentation and user support
The project site includes a well-written quick-start guide and [http://webcurator.sourceforge.net/docs/1.5.2/Web%20Curator%20Tool%20User%20Manual%20(WCT%201.5.2).pdf user manual], although some headings in the manual have no corresponding text. The site also includes a developer guide, published with release version 1.5.2. Links to the advertised wiki and FAQ sections are currently broken and forward to the SourceForge developer page instead. More information about the project can be found in an informative [http://www.ariadne.ac.uk/issue50/beresford/ article] published in Ariadne Issue 50. The primary forum for technical support appears to be an active “webcurator-users” mailing list. While bugs continue to be posted on the SourceForge bug/feature request tracker, the last addressed item was in February 2011.
Usability
WCT is specifically designed to be operated by non-technical users such as librarians, with a simple and relatively intuitive GUI. Installation, however, is difficult and most likely requires tech support.
Expertise required
Setup, especially if it includes links to an archival repository, requires system administration knowledge. Users must have a comprehensive understanding of their institution’s collections policies when designing harvests.
Standards compliance
WCT allows users to add basic Dublin Core metadata to the material.
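As a rough illustration of what a basic Dublin Core record can look like, the following sketch builds one as XML with the Python standard library. It does not show how WCT itself stores or exports metadata; the wrapper element name `metadata`, the helper `build_dc_record`, and the field values are all hypothetical. Only the Dublin Core element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Return a minimal Dublin Core record as an XML string.

    `fields` maps Dublin Core element names (e.g. "title") to values.
    The <metadata> wrapper is a hypothetical container, not a WCT format.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, value in fields.items():
        elem = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        elem.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical example values for a harvested website.
record = build_dc_record({
    "title": "Example harvested site",
    "creator": "Example Organisation",
    "date": "2021-11-26",
})
print(record)
```

A record like this could accompany harvested material when it is submitted to a separate archive, provided the receiving system accepts Dublin Core in this form.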
Influence and take-up
WCT is used by the National Library of New Zealand, the National Library of Norway, and the British Library. The SourceForge site lists nearly 8,300 downloads as of December 2011.
User Experiences
Development Activity
No information is available on the current funding status for development, although the SourceForge site’s bug tracker continues to list new entries and responses. WCT encourages developer participation and publishes a Developers Guide with the latest release.
All development activity is visible on GitHub: http://github.com/DIA-NZ/webcurator/commits
Release Feed
The 3 most recent releases:
- 2020-03-17 15:17:26: v2.0.2, by hannakoppelaar
- 2019-04-03 20:29:16: wct-v2.0 GA (tag v2.0.1), by obrienben
- 2018-12-20 09:55:03: v2.0.0, by obrienben
Activity Feed
The 5 most recent commits:
- 2020-11-22 20:52:47: Merge pull request #118 from DIA-NZ/feature/117_copy_h1_to_h3_profile… (commit 61cb76ce045e1ca6ef64c8be46abac0d00e2da53), by obrienben (https://github.com/obrienben)
- 2019-07-26 10:22:29: Merge pull request #119 from DIA-NZ/docs/add-workshop-tutorials (commit 87606210df4abf8d2d72a01ea32c1e8d15967a18), by hannakoppelaar (https://github.com/hannakoppelaar)
- 2019-07-25 23:01:43: Tutorial page added to Support section in documentation, with 2019 II… (commit 2a4d3e8834c757f6123565c5c7800e80537fe9dc), by obrienben (https://github.com/obrienben)
- 2019-07-23 16:11:42: #117: Removed superfluous update statements (commit da83678ae708dd5970725a06bf6f1dd48254665f), by hannakoppelaar (https://github.com/hannakoppelaar)
- 2019-07-23 16:02:08: #117: Copy h1 fields to h3 fields in profile_overrides (MySQL upgrade… (commit d3cfc3ed9f60ed2a968d4aa83bb6cc6d69414e7b), by hannakoppelaar (https://github.com/hannakoppelaar)
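Feeds like the ones above are published in Atom format, so the most recent entries can be extracted with a few lines of code. The sketch below parses a small hand-written sample modelled on the release entries listed above rather than fetching the live feed. The helper name `latest_entries` is hypothetical, and the UTC timezone in the sample timestamps is an assumption, since the listing above gives no timezone.

```python
import xml.etree.ElementTree as ET

# Atom namespace in ElementTree's "{uri}tag" notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hand-written sample modelled on the release entries listed above.
# The "Z" (UTC) suffix is an assumption; the listing shows no timezone.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:github.com,2008:Repository/38218145/v2.0.0</id>
    <updated>2018-12-20T09:55:03Z</updated>
    <title>v2.0.0</title>
    <author><name>obrienben</name></author>
  </entry>
  <entry>
    <id>tag:github.com,2008:Repository/38218145/v2.0.2</id>
    <updated>2020-03-17T15:17:26Z</updated>
    <title>v2.0.2</title>
    <author><name>hannakoppelaar</name></author>
  </entry>
  <entry>
    <id>tag:github.com,2008:Repository/38218145/v2.0.1</id>
    <updated>2019-04-03T20:29:16Z</updated>
    <title>wct-v2.0 GA</title>
    <author><name>obrienben</name></author>
  </entry>
</feed>"""


def latest_entries(feed_xml, limit):
    """Return up to `limit` (updated, title, author) tuples, newest first."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(f"{ATOM}entry"):
        entries.append((
            entry.findtext(f"{ATOM}updated"),
            entry.findtext(f"{ATOM}title"),
            entry.findtext(f"{ATOM}author/{ATOM}name"),
        ))
    # Timestamps share one ISO 8601 layout, so string order is time order.
    entries.sort(key=lambda item: item[0], reverse=True)
    return entries[:limit]


for updated, title, author in latest_entries(SAMPLE_FEED, 3):
    print(f"{updated}  {title}  by {author}")
```

To check a live feed, the same function could be given the body of an HTTP response instead of the sample string; the fetching step is left out here so the example runs offline.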