https://coptr.digipres.org/api.php?action=feedcontributions&user=Prwheatley&feedformat=atomCOPTR - User contributions [en-gb]2024-03-29T02:02:37ZUser contributionsMediaWiki 1.35.14https://coptr.digipres.org/index.php?title=OpenRefine&diff=6016OpenRefine2022-11-30T11:54:46Z<p>Prwheatley: Created page with "{{Infobox tool |purpose=For dealing with messy data, cleaning it and transforming it |homepage=https://openrefine.org/ |sourcecode=https://github.com/OpenRefine/OpenRefine |li..."</p>
<hr />
<div>{{Infobox tool<br />
|purpose=For dealing with messy data, cleaning it and transforming it<br />
|homepage=https://openrefine.org/<br />
|sourcecode=https://github.com/OpenRefine/OpenRefine<br />
|license=Open source<br />
|function=Metadata Processing<br />
|content=Metadata, Research Data<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Geospatial&diff=6014Geospatial2022-11-30T09:06:36Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox content<br />
|definition=Tools that support the preservation of data describing geographical or topographical features, typically referred to as Geospatial data.<br />
}}<br />
==Available best practice guides==<br />
* Open Geospatial Consortium (OGC) - OGC Standards and Resources: https://www.ogc.org/standards<br />
* Artefactual Systems and the Digital Preservation Coalition - Preserving GIS: http://doi.org/10.7207/twgn21-16<br />
* Archaeology Data Service / Digital Antiquity - Guides to Good Practice, GIS Guide to Good Practice: https://guides.archaeologydataservice.ac.uk/g2gp/Gis_Toc<br />
* Preserving GIS, DPC Technology Watch Guidance Note: http://doi.org/10.7207/twgn21-16<br />
<br />
==Available tools for R programming language==<br />
* Duke University - R: Mapping and Geospatial: https://guides.library.duke.edu/r-geospatial</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=NT_(New_Tool)&diff=5988NT (New Tool)2022-11-16T09:28:51Z<p>Prwheatley: Created page with "{{Infobox tool |purpose=This is a presrvation tool |function=De-Duplication, Rendering }} {{Infobox tool details}} == Description == <!-- Describe the what the tool does, foc..."</p>
<hr />
<div>{{Infobox tool<br />
|purpose=This is a preservation tool<br />
|function=De-Duplication, Rendering<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=5899Wish list of tools to add to COPTR2022-03-10T12:12:14Z<p>Prwheatley: </p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*https://www.logipole.com/konvertor-en.htm<br />
*http://www.chumba.ch/chumbalum-soft/ms3d/index.html<br />
*https://www.embulk.org/<br />
*ISO16363 PTAB http://www.iso16363.org/<br />
*Guymager https://www.kali.org/tools/guymager/<br />
*Bulk Extractor https://www.kali.org/tools/bulk-extractor/<br />
*SoX http://sox.sourceforge.net/<br />
*IFIscripts https://ifiscripts.readthedocs.io/en/latest/index.html<br />
*Transformenator https://github.com/RetroFloppy/transformenator<br />
*Document Liberation Project https://www.documentliberation.org/projects/<br />
*https://github.com/kieranjol/IFIscripts<br />
*https://formats.kaitai.io/<br />
<br />
Some great data manipulation tools in this thread: https://twitter.com/SarahRBarsness/status/1452684578905337866?s=20<br />
*https://pandas.pydata.org/<br />
*https://www.tableau.com/products/prep<br />
*https://openrefine.org/<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Flashback_project_-_optical_media_imaging_workflow<br />
*Toc2Cue <br />
*BChunk<br />
*GenISOImage<br />
*Sophos AV Scan<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive<br />
*DocuTeam Feeder<br />
*cURL<br />
*Saxon<br />
*ClamAV<br />
<br />
==Tools that are obsolete==<br />
*Khtml2png<br />
*ArchiveFacebook<br />
*Pearl Crescent replaced with https://addons.mozilla.org/en-GB/firefox/addon/pagesaver-we/<br />
*Spadix Software is actually Rafabot and looks old<br />
*Storytracker ?<br />
*WAS (Web Archiving Service) - obsolete<br />
*WERA (Web ARchive Access)<br />
*Curate.Us<br />
*Find It! Keep It!<br />
*Heritrix plug-in for rich media capture<br />
*WAXToolbar</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=MediaWiki:Sidebar&diff=5884MediaWiki:Sidebar2022-03-02T12:19:51Z<p>Prwheatley: </p>
<hr />
<div><br />
* navigation<br />
** mainpage|COPTR Home<br />
** Workflow:Community_Owned_Workflows|Community Owned Workflows<br />
** recentchanges-url|recentchanges<br />
* Find tools<br />
** Tools_Grid|Tools grid<br />
** Lifecycle_Stages|By lifecycle stage<br />
** Tool_Functions|By function<br />
** Content_Types|By content type<br />
** File_Formats|By file format<br />
** Category:Tools|All tools<br />
* Help<br />
** About COPTR|About COPTR<br />
** Video_guides_to_using_COPTR|Video guides to using COPTR<br />
** helppage|Mediawiki help<br />
** https://en.wikipedia.org/wiki/Help:Cheatsheet|Wikitext cheat sheet<br />
** Data structures in COPTR|COPTR data structures<br />
* SEARCH<br />
* TOOLBOX<br />
* LANGUAGES</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=MediaWiki:Sidebar&diff=5883MediaWiki:Sidebar2022-03-02T12:19:33Z<p>Prwheatley: </p>
<hr />
<div><br />
* navigation<br />
** mainpage|COPTR Home<br />
** Workflow:Community_Owned_Workflows|Community Owned Workflows<br />
** recentchanges-url|recentchanges<br />
* Find tools<br />
** Tools_Grid|Tools grid<br />
** Lifecycle_Stages|By lifecycle stage<br />
** Tool_Functions|By function<br />
** Content_Types|By content type<br />
** File_Formats|By file format<br />
** Category:Tools|All tools<br />
* Help<br />
** About COPTR|About COPTR<br />
** Video_guides_to_using_COPTR|Video guide to using COPTR<br />
** helppage|Mediawiki help<br />
** https://en.wikipedia.org/wiki/Help:Cheatsheet|Wikitext cheat sheet<br />
** Data structures in COPTR|COPTR data structures<br />
* SEARCH<br />
* TOOLBOX<br />
* LANGUAGES</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=MediaWiki:Sidebar&diff=5882MediaWiki:Sidebar2022-03-02T12:19:09Z<p>Prwheatley: </p>
<hr />
<div><br />
* navigation<br />
** mainpage|COPTR Home<br />
** Workflow:Community_Owned_Workflows|Community Owned Workflows<br />
** recentchanges-url|recentchanges<br />
* Find tools<br />
** Tools_Grid|Tools grid<br />
** Lifecycle_Stages|By lifecycle stage<br />
** Tool_Functions|By function<br />
** Content_Types|By content type<br />
** File_Formats|By file format<br />
** Category:Tools|All tools<br />
* Help<br />
** About COPTR|About COPTR<br />
** Video guide to using COPTR<br />
** helppage|Mediawiki help<br />
** https://en.wikipedia.org/wiki/Help:Cheatsheet|Wikitext cheat sheet<br />
** Data structures in COPTR|COPTR data structures<br />
* SEARCH<br />
* TOOLBOX<br />
* LANGUAGES</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Form:Format&diff=5615Form:Format2022-02-18T17:24:37Z<p>Prwheatley: </p>
<hr />
<div><noinclude><br />
This is the "Format" form.<br />
To create a page with this form, enter the page name below;<br />
if a page with that name already exists, you will be sent to a form to edit that page.<br />
<br />
{{#forminput:form=Format}}<br />
<br />
</noinclude><includeonly><br />
<div id="wikiPreview" style="display: none; padding-bottom: 25px; margin-bottom: 25px; border-bottom: 1px solid #AAAAAA;"></div><br />
{{{for template|Infobox format}}}<br />
{| class="formtable"<br />
! Wikidata ID {{#info:The ID of the relevant wikidata item, which should begin with a "Q"}}: <br />
| {{{field|Wikidata ID}}}<br />
|-<br />
! File formats wiki ID {{#info: The last part of the URL after "http://fileformats.archiveteam.org/wiki/"}}: <br />
| {{{field|File formats wiki ID}}}<br />
|-<br />
! PRONOM PUID: <br />
| {{{field|PRONOM PUID}}}<br />
|}<br />
{{{end template}}}<br />
<br />
'''Free text:'''<br />
<br />
{{{standard input|free text|rows=10}}}<br />
</includeonly></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Category:3D&diff=5584Category:3D2022-02-15T10:29:35Z<p>Prwheatley: Created page with "{{Infobox content |definition=Tools that support the preservation of 3D data. }}"</p>
<hr />
<div>{{Infobox content<br />
|definition=Tools that support the preservation of 3D data.<br />
}}</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=3D&diff=55833D2022-02-15T10:27:57Z<p>Prwheatley: Created page with "{{Infobox content |definition=Tools that support the preservation of 3D data. }}"</p>
<hr />
<div>{{Infobox content<br />
|definition=Tools that support the preservation of 3D data.<br />
}}</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=MediaWiki:Sidebar&diff=5572MediaWiki:Sidebar2022-02-04T11:54:57Z<p>Prwheatley: </p>
<hr />
<div><br />
* navigation<br />
** mainpage|COPTR Home<br />
** Workflow:Community_Owned_Workflows|Community Owned Workflows<br />
** recentchanges-url|recentchanges<br />
* Find tools<br />
** Tools_Grid|Tools grid<br />
** Lifecycle_Stages|By lifecycle stage<br />
** Tool_Functions|By function<br />
** Content_Types|By content type<br />
** File_Formats|By file format<br />
** Category:Tools|All tools<br />
* Help<br />
** About COPTR|About COPTR<br />
** helppage|Mediawiki help<br />
** https://en.wikipedia.org/wiki/Help:Cheatsheet|Wikitext cheat sheet<br />
** Data structures in COPTR|COPTR data structures<br />
* SEARCH<br />
* TOOLBOX<br />
* LANGUAGES</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Yara&diff=5571Yara2022-02-04T11:46:00Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|image={{PAGENAMEE}}.png<br />
|purpose=YARA is a tool that allows the identification of files that match user-defined textual or binary patterns<br />
|homepage=https://plusvic.github.io/yara/<br />
|license=Apache 2.0<br />
|function=Content Profiling, Forensic<br />
|content=Binary Data, Document<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Yara<br />
|releases_rss=https://github.com/plusvic/yara/commits/master.atom<br />
}}<br />
== Description ==<br />
YARA is a tool that allows one to identify files that match user-defined textual or binary patterns. It is primarily aimed at malware researchers.<br />
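The idea of matching files against user-defined textual or binary patterns can be illustrated with a minimal sketch in plain Python. This is not YARA's rule language or matching engine; the rule names, the `RULES` dictionary and the `match_rules` function are hypothetical and exist only to show the concept.

```python
import re

# Hypothetical rule set for illustration only (not YARA syntax).
# Each rule name maps to a list of patterns; a rule matches if ANY
# of its patterns is found in the data.
RULES = {
    # Binary pattern given as hex digits: 25 50 44 46 is "%PDF".
    "pdf_header": [bytes.fromhex("25504446")],
    # Textual pattern, matched case-insensitively.
    "mentions_invoice": [re.compile(rb"invoice", re.IGNORECASE)],
}


def match_rules(data, rules=RULES):
    """Return the sorted names of all rules that match the given bytes."""
    matched = []
    for name, patterns in rules.items():
        for pattern in patterns:
            if isinstance(pattern, bytes):
                hit = pattern in data
            else:
                hit = pattern.search(data) is not None
            if hit:
                matched.append(name)
                break  # one matching pattern is enough for this rule
    return sorted(matched)


if __name__ == "__main__":
    print(match_rules(b"%PDF-1.4 ... Invoice 42"))
```

In practice the bytes would be read from the files under examination; the sketch keeps the input inline so it runs without any external data.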
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Brunnhilde&diff=5570Brunnhilde2022-02-04T11:44:33Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|image=Brunnhilde.png<br />
|purpose=Siegfried-based characterization of directories and disk images<br />
|homepage=https://github.com/tw4l/brunnhilde<br />
|license=MIT License<br />
|platforms=Linux, macOS, OS X<br />
|function=Content Profiling, Metadata Extraction, Appraisal<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Brunnhilde is a command-line utility that runs Siegfried against a specified directory or disk image, loads the results into a sqlite3 database, and queries the database to generate reports to aid in triage, arrangement, and description of digital archives. The program will also check for viruses unless specified otherwise, and will optionally run bulk_extractor against the given source. Reports include CSVs, a tree report, and a human-readable HTML summary of the directory or disk image. All outputs are placed into a new directory named after the identifier passed to Brunnhilde as the last argument. Brunnhilde is also capable of exporting files from logical disk images utilizing many file systems, including HFS+.<br />
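The load-then-query pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Brunnhilde's own code: the sample rows are invented, and the column names (`filename`, `filesize`, `format`, `mime`) are a simplified, hypothetical subset rather than the exact layout of Siegfried's output.

```python
import csv
import io
import sqlite3

# Invented sample of identification results; the column names are
# assumptions for illustration, not Siegfried's exact CSV layout.
SAMPLE_CSV = """filename,filesize,format,mime
docs/report.pdf,20480,Acrobat PDF 1.4,application/pdf
docs/notes.txt,512,Plain Text File,text/plain
images/scan.tif,1048576,Tagged Image File Format,image/tiff
docs/letter.pdf,10240,Acrobat PDF 1.4,application/pdf
"""


def load_results(csv_text):
    """Load identification rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files "
        "(filename TEXT, filesize INTEGER, format TEXT, mime TEXT)"
    )
    rows = [
        (r["filename"], int(r["filesize"]), r["format"], r["mime"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def format_report(conn):
    """Return (format, file count, total bytes), most common format first."""
    query = (
        "SELECT format, COUNT(*), SUM(filesize) FROM files "
        "GROUP BY format ORDER BY COUNT(*) DESC, format ASC"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for fmt, count, size in format_report(load_results(SAMPLE_CSV)):
        print(f"{fmt}: {count} file(s), {size} bytes")
```

Keeping the results in a database means each report (formats, MIME types, file sizes and so on) is just another query over the same table.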
<br />
Dependencies include Python (tested in 2.7 and 3.4+), Siegfried, ClamAV, bulk_extractor, Sleuth Kit, and HFSExplorer. All dependencies are already installed and compiled in the BitCurator environment.<br />
<br />
To install the command-line utility with pip: "pip install brunnhilde".<br />
<br />
For a GUI wrapper for Brunnhilde, see the [https://github.com/tw4l/brunnhilde-GUI Brunnhilde GUI Github repo].<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/tw4l/brunnhilde/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below are the last 3 releases:<br />
<rss max=3>https://github.com/tw4l/brunnhilde/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/tw4l/brunnhilde/commits/master.atom</rss></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=YT-DLP_(You_Tube_Download_P)&diff=5514YT-DLP (You Tube Download P)2021-12-09T15:52:08Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=Supports downloading YouTube videos; based on the now defunct YT-DL<br />
|sourcecode=https://github.com/yt-dlp/yt-dlp<br />
|function=Web Capture<br />
|content=Web, Video<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=YT-DLP_(You_Tube_Download_P)&diff=5513YT-DLP (You Tube Download P)2021-12-09T15:49:58Z<p>Prwheatley: Created page with "{{Infobox tool |purpose=Supports download of youtube videos, based on the now defunct YT-DL |sourcecode=https://github.com/yt-dlp/yt-dlp |function=Web Capture |content=Web }}..."</p>
<hr />
<div>{{Infobox tool<br />
|purpose=Supports downloading YouTube videos; based on the now defunct YT-DL<br />
|sourcecode=https://github.com/yt-dlp/yt-dlp<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Web&diff=5505Web2021-12-09T13:27:27Z<p>Prwheatley: </p>
<hr />
<div>Also see the [https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community#the-master-lists Web Archiving Community master list of software].<br />
<br />
{{Infobox content<br />
|definition=Tools that support the preservation of live web data and archived web data (such as ARC and WARC formats).<br />
}}</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Video_guides_to_using_COPTR&diff=5489Video guides to using COPTR2021-12-03T13:32:22Z<p>Prwheatley: </p>
<hr />
<div>These short video guides provide walkthroughs on how to use and get the most out of COPTR:<br />
*[https://youtu.be/ZI1ICLzzO-0 How to find preservation tools with COPTR]<br />
*[https://youtu.be/ITlmx2fwG6s How to contribute to COPTR]<br />
*[https://youtu.be/PQ4-Pb73VUY How to add a workflow to COW]<br />
<br />
These videos were funded by the SEADDA COST Action community and the Horizon 2020 Framework Programme of the European Union https://www.seadda.eu/</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Video_guides_to_using_COPTR&diff=5488Video guides to using COPTR2021-12-03T13:31:31Z<p>Prwheatley: Created page with "These short video guides provide walk throughs on how to use and get the most out of COPTR: *[https://youtu.be/ZI1ICLzzO-0 How to find preservation tools with COPTR] *[https:/..."</p>
<hr />
<div>These short video guides provide walkthroughs on how to use and get the most out of COPTR:<br />
*[https://youtu.be/ZI1ICLzzO-0 How to find preservation tools with COPTR]<br />
*[https://youtu.be/ITlmx2fwG6s How to contribute to COPTR]<br />
*[https://youtu.be/PQ4-Pb73VUY How to add a workflow to COW]</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=About_COPTR&diff=5487About COPTR2021-12-03T13:26:31Z<p>Prwheatley: </p>
<hr />
<div>__NOTOC__<br />
COPTR is a wiki-based registry of digital preservation tools. Its main aim is to help practitioners discover preservation tools that will help them tackle particular preservation challenges. It can be browsed and searched directly by practitioners, or queried by other systems via an API.<br />
<br />
Each tool in COPTR is categorised by the [[Lifecycle Stages|Lifecycle Stage]] it falls within, the (sub) [[Tool Functions|Function]] it performs and by the [[Content Types|Content Type]] that it operates on. It may also input and/or output specific [[File_Formats|file formats]]. For example, [[EpubCheck]] performs the Function [[Validation]], it operates on [[EBook]] Content and has [[EPUB]] as an input format. COPTR can help by enabling you to browse to tools that meet a particular need - perhaps relating to a specific [[Content Types|Content Type]] or [[Tool Functions|Function]].<br />
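The categorisation described above can be pictured as a small data structure. The following is a minimal sketch, not COPTR's actual schema: the `Tool` class, its field names and the `find_tools` helper are hypothetical, and the two sample records use only values that appear in COPTR entries (EpubCheck as described here, and the Function and Content values from the Yara infobox).

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    """Hypothetical, simplified view of a COPTR tool entry."""
    name: str
    functions: set = field(default_factory=set)
    content_types: set = field(default_factory=set)
    input_formats: set = field(default_factory=set)


def find_tools(tools, function=None, content_type=None):
    """Return sorted names of tools matching every filter that is given."""
    results = []
    for tool in tools:
        if function is not None and function not in tool.functions:
            continue
        if content_type is not None and content_type not in tool.content_types:
            continue
        results.append(tool.name)
    return sorted(results)


# Sample records for illustration.
REGISTRY = [
    Tool("EpubCheck", {"Validation"}, {"EBook"}, {"EPUB"}),
    Tool("Yara", {"Content Profiling", "Forensic"}, {"Binary Data", "Document"}),
]

if __name__ == "__main__":
    print(find_tools(REGISTRY, function="Validation"))
    print(find_tools(REGISTRY, content_type="Document"))
```

Browsing by Function or Content Type in the wiki corresponds to calling `find_tools` with a single filter; combining filters narrows the result.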
<br />
COPTR also holds [[Workflow:Community_Owned_Workflows|digital preservation workflows]]. Sharing a workflow is a great way to exchange digital preservation practice and enable conversations around tools and how they are used.<br />
<br />
[[Video guides to using COPTR|Watch these videos to find out more about how to use, and how to contribute to, COPTR]].<br />
<br />
===How can I get involved?===<br />
COPTR is a wiki, which means anyone can edit it and your contributions are very welcome. All changes are tracked, so if a mistake is made, it can easily be undone. So don't hold back - please dive in and help make the information in COPTR better. Here are some suggestions:<br />
*[[Add a tool to COPTR|Add a new tool to COPTR]]. Add a tool you're familiar with or check out this [[Wish list of tools to add to COPTR|wish list of tools to add to COPTR]].<br />
*[[Contribute your workflow to COW|Add a new workflow to COW]]. It doesn't have to be a finished workflow - a draft or proposal is still useful!<br />
*Correct or add more detail to existing entries - use the "Edit with form" option to do this. For example, you could add input or output formats. Ask for help (see below) if you need to create new [[File_Formats|File format]] entries.<br />
<br />
===What is the scope of COPTR?===<br />
COPTR captures basic, factual details about a tool, what it does, how to find more information (relevant URLs) and references to user experiences with the tool. The scope is a broad interpretation of the term "digital preservation". In other words, if a tool is useful in performing a digital preservation function such as those described in the OAIS model or the DCC lifecycle model, then it's within scope of this registry.<br />
<br />
*In scope: Characterisation, visualisation, rendering, migration, storage, fixity, access, delivery, search, web archiving, open source software ->everything in between<- commercial software.<br />
*Out of scope: Digitisation, file creation<br />
<br />
===Why does COPTR exist?===<br />
The digital preservation community had created a whole host of different tool registries that described preservation tools. Many people had also written blog posts about tools or organisations hosted web pages with lists of tools that tackled particular areas. There was lots of duplication between these lists and registries, but at the same time, each one often held tools that weren't described elsewhere. COPTR was built by bringing together and rationalising these other registries and lists. It was originally created as a small activity within the Jisc funded [http://wiki.opf-labs.org/display/SPR SPRUCE Project], and has been significantly developed since - most notably with an upgrade to Semantic Mediawiki in 2021.<br />
<br />
===Sustainability===<br />
The following arrangements are in place to ensure the sustainability of COPTR:<br />
<br />
*Hosting is kindly provided by the OPF.<br />
*Content is provided and maintained by you, the community! DPC, DDHN and POWRR have been active in coordinating and implementing maintenance and editathons.<br />
*Regular data dumps are provided to NDIIPP and arrangements are in place to provide a route to future hosting should OPF be unable to continue to provide this service.<br />
<br />
===Technical stuff===<br />
*[[Data structures in COPTR]] - information on the COPTR data model and its maintenance<br />
*[[How COPTR Works]] - information on the technical setup of COPTR<br />
*How to use the [[Using the COPTR data feed|COPTR data feed and API]]<br />
*[[Development plans]]<br />
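As a rough illustration of querying COPTR from another system, the sketch below builds a request URL for the standard MediaWiki `api.php` endpoint. It uses the core MediaWiki `feedcontributions` action with its `user` and `feedformat` parameters; the helper name `contributions_feed_url` is hypothetical, and which other actions COPTR exposes is not assumed here (see the data feed page linked above for the registry's own guidance).

```python
from urllib.parse import urlencode

# MediaWiki API endpoint for the COPTR wiki.
API_ENDPOINT = "https://coptr.digipres.org/api.php"


def contributions_feed_url(user, feed_format="atom"):
    """Build the URL of a user's contributions feed (hypothetical helper)."""
    params = {
        "action": "feedcontributions",
        "user": user,
        "feedformat": feed_format,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # The URL can then be fetched with any HTTP client.
    print(contributions_feed_url("Prwheatley"))
```

Building the query string with `urlencode` keeps user names containing spaces or other special characters correctly escaped.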
<br />
===Organizational Contributions===<br />
Organisations who would like to contribute the contents of their tool registry to COPTR should:<br />
<br />
*Merge your own tool registry data with the new community registry (COPTR can help with this)<br />
*Remove your own registry and agree not to set up any new project owned registries/lists<br />
*Link to COPTR or expose the COPTR data by utilising the COPTR data feed<br />
*Contribute any effort you have in adding new tools over time, to the community registry<br />
<br />
COPTR is aiming to concentrate information about digital preservation tools in one place, keep the information in the registry maintainable and current by pooling the community's effort in one place, and avoid the emergence of new registries.<br />
<br />
===Contact, feedback and questions===<br />
Contact [[User:Prwheatley|Paul Wheatley]] or [[User:Andy_Jackson|Andy Jackson]], or contribute feedback on [[Talk:Main_Page| Main page/discussion]].</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Xenu%27s_Link_Sleuth&diff=5485Xenu's Link Sleuth2021-11-26T17:00:30Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=The tool checks the hyperlinks on websites.<br />
|homepage=http://home.snafu.de/tilman/xenulink.html<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Xenu's Link Sleuth checks the hyperlinks of websites. Originally intended as a tool to find broken links for webmasters, the output can be used to guide a crawler in web archiving. The tool is written by Tilman Hausherr and is proprietary software available at no charge. <br />
[[File:Xenu.PNG]]<br />
<br />
Screenshot of Xenu's Link Sleuth checking the COPTR pages<br />
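The first step of any link check, collecting the hyperlinks on a page, can be sketched with Python's standard library. This is a minimal illustration and not how Xenu's Link Sleuth is implemented; the `LinkCollector` class and `extract_links` function are hypothetical names, and the step of requesting each URL to see whether it is broken is deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links such as "/index.php/Yara".
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for all hyperlinks in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = '<p><a href="/index.php/Yara">Yara</a></p>'
    print(extract_links(page, "https://coptr.digipres.org/"))
```

The resulting list of absolute URLs is the kind of output that could be handed to a crawler as seeds, as the description above suggests.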
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wayback_Machine&diff=5484Wayback Machine2021-11-26T17:00:15Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=The Wayback Machine is a powerful search and discovery tool for use with collections of Web site "snapshots" collected through Web harvesting, usually with Heritrix (ARC or WARC files).<br />
|homepage=http://netpreserve.org/open-wayback<br />
|license=GNU Lesser General Public License 2.1<br />
|platforms=Platform independent. Tomcat, an Apache.org Java-based Web server, is the only server under which Wayback has been tested and is known to work. There may be others, but they have not been tested nor are they supported.<br />
|function=Access, Discovery, Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=openwayback<br />
}}<br />
= Description =<br />
The Wayback Machine is a powerful search and discovery tool for use with collections of Web site "snapshots" collected through Web harvesting, usually with Heritrix (ARC or WARC files). Developed by Internet Archive. Written in Java.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WarcManager&diff=5483WarcManager2021-11-26T16:57:31Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=The WARC Manager is a web-based UI for managing and querying collections of web crawl data.<br />
|homepage=https://wiki.umiacs.umd.edu/adapt/index.php/WarcManager<br />
|platforms=Apache Tomcat, MySQL jdbc connector, context.xml, schema.sql, and the warc webapp<br />
|function=File Management, Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=WarcManager<br />
}}<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Web Crawl]]<br />
[[Category:Web]]<br />
<br />
<br />
= Description =<br />
The WARC Manager is a web-based UI for managing and querying collections of web crawl data. The WARC Manager allows libraries to easily locate pages, determine the completeness of a web collection, and view crawl statistics for a page or collection. The WARC Manager has been tested on collections containing over 177 million pages covering 37 million unique URLs. Developed by University of Maryland.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WAXToolbar&diff=5482WAXToolbar2021-11-26T16:56:49Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=WAXToolbar is a Firefox extension to help users with common tasks encountered surfing a web archive.<br />
|homepage=http://archive-access.sourceforge.net/projects/waxtoolbar/<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=WAXToolbar<br />
}}<br />
= Description =<br />
WAXToolbar is a Firefox extension to help users with common tasks encountered surfing a web archive. This extension depends on the open source Wayback Machine. Among the features of the WAX Toolbar is a search field for querying the Wayback Machine or for searching a full-text NutchWAX index (if one is available). You can also use the toolbar to switch between proxy-mode and the regular Internet; when in proxy-mode you can easily go back and forth in time. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=5481Wish list of tools to add to COPTR2021-11-26T16:56:33Z<p>Prwheatley: </p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*https://www.logipole.com/konvertor-en.htm<br />
*http://www.chumba.ch/chumbalum-soft/ms3d/index.html<br />
*https://www.embulk.org/<br />
*ISO16363 PTAB http://www.iso16363.org/<br />
*Guymager https://www.kali.org/tools/guymager/<br />
*Bulk Extractor https://www.kali.org/tools/bulk-extractor/<br />
*SoX http://sox.sourceforge.net/<br />
*MPlayer http://www.mplayerhq.hu/design7/info.html<br />
*IFIscripts https://ifiscripts.readthedocs.io/en/latest/index.html<br />
*Transformenator https://github.com/RetroFloppy/transformenator<br />
*Document Liberation Project https://www.documentliberation.org/projects/<br />
*https://github.com/kieranjol/IFIscripts<br />
*https://formats.kaitai.io/<br />
<br />
Some great data manipulation tools in this thread: https://twitter.com/SarahRBarsness/status/1452684578905337866?s=20<br />
*https://pandas.pydata.org/<br />
*https://www.tableau.com/products/prep<br />
*https://openrefine.org/<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Flashback_project_-_optical_media_imaging_workflow<br />
*Toc2Cue <br />
*BChunk<br />
*GenISOImage<br />
*Sophos AV Scan<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive<br />
*DocuTeam Feeder<br />
*cURL<br />
*Saxon<br />
*ClamAV<br />
<br />
==Tools that are obsolete==<br />
*Khtml2png<br />
*ArchiveFacebook<br />
*Pearl Crescent replaced with https://addons.mozilla.org/en-GB/firefox/addon/pagesaver-we/<br />
*Spadix Software is actually Rafabot and looks old<br />
*Storytracker ?<br />
*WAS (Web Archiving Service) - obsolete<br />
*WERA (Web ARchive Access)<br />
*Curate.Us<br />
*Find It! Keep It!<br />
*Heritrix plug-in for rich media capture<br />
*WAXToolbar</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=TubeKit&diff=5480TubeKit2021-11-26T16:55:44Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=TubeKit is a toolkit for creating YouTube crawlers.<br />
|homepage=https://www.tubekit.org/<br />
|license=Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License<br />
|platforms=Web based<br />
|function=Web Capture<br />
|content=Video<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=TubeKit<br />
}}<br />
= Description =<br />
TubeKit is a toolkit for creating YouTube crawlers. It allows one to build one's own crawler that can crawl YouTube based on a set of seed queries and collect up to 16 different attributes. TubeKit assists in all the phases of this process, from database creation to giving access to the collected data through browsing and searching interfaces. In addition to creating crawlers, TubeKit also provides several tools to collect a variety of data from YouTube, including video details and user profiles. Developed by UNC Chapel Hill. Written in PHP.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=RARC_(ARC_replicator)&diff=5479RARC (ARC replicator)2021-11-26T16:54:54Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=rARC is a distributed system that enables Internet users to provide storage space from their computers to replicate small parts of the archived data stored in the central repository of the Web archive.<br />
|homepage=http://arquivo-web.fccn.pt/about-the-archive/how-does-it-work/rarc/rarc-arc-replicator<br />
|function=Web Capture<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=rARC: ARC replicator<br />
}}<br />
= Description =<br />
rARC is a distributed system that enables Internet users to provide storage space from their computers to replicate small parts of the archived data stored in the central repository of the Web archive. It is being developed within the Portuguese Web Archive project. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Metaproducts&diff=5478Metaproducts2021-11-26T16:54:37Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=Metaproducts offers several commercial capture and off-line browsing tools.<br />
|homepage=http://www.metaproducts.com/<br />
|function=Web Capture<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Metaproducts<br />
}}<br />
= Description =<br />
Metaproducts offers several commercial capture and off-line browsing tools. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Heritrix_plug-in_for_rich_media_capture&diff=5477Heritrix plug-in for rich media capture2021-11-26T16:54:07Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=The Rich Media Capture module (RMC), developed in the LiWA (Living Web Archives) project, is designed to enhance the capturing capabilities of the crawler, with regards to different multimedia content types.<br />
|homepage=http://code.google.com/p/liwa-technologies/<br />
|function=Web Capture<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Heritrix plug-in for rich media capture<br />
}}<br />
= Description =<br />
The Rich Media Capture module (RMC), developed in the LiWA (Living Web Archives) project, is designed to enhance the capturing capabilities of the crawler with regard to different multimedia content types. The current version of Heritrix is mainly based on the HTTP/HTTPS protocols and cannot handle other content transfer protocols widely used for multimedia content, such as streaming. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=5476Wish list of tools to add to COPTR2021-11-26T16:53:55Z<p>Prwheatley: </p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*https://www.logipole.com/konvertor-en.htm<br />
*http://www.chumba.ch/chumbalum-soft/ms3d/index.html<br />
*https://www.embulk.org/<br />
*ISO16363 PTAB http://www.iso16363.org/<br />
*Guymager https://www.kali.org/tools/guymager/<br />
*Bulk Extractor https://www.kali.org/tools/bulk-extractor/<br />
*SoX http://sox.sourceforge.net/<br />
*MPlayer http://www.mplayerhq.hu/design7/info.html<br />
*IFIscripts https://ifiscripts.readthedocs.io/en/latest/index.html<br />
*Transformenator https://github.com/RetroFloppy/transformenator<br />
*Document Liberation Project https://www.documentliberation.org/projects/<br />
*https://github.com/kieranjol/IFIscripts<br />
*https://formats.kaitai.io/<br />
<br />
Some great data manipulation tools in this thread: https://twitter.com/SarahRBarsness/status/1452684578905337866?s=20<br />
*https://pandas.pydata.org/<br />
*https://www.tableau.com/products/prep<br />
*https://openrefine.org/<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Flashback_project_-_optical_media_imaging_workflow<br />
*Toc2Cue <br />
*BChunk<br />
*GenISOImage<br />
*Sophos AV Scan<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive<br />
*DocuTeam Feeder<br />
*cURL<br />
*Saxon<br />
*ClamAV<br />
<br />
==Tools that are obsolete==<br />
*Khtml2png<br />
*ArchiveFacebook<br />
*Pearl Crescent replaced with https://addons.mozilla.org/en-GB/firefox/addon/pagesaver-we/<br />
*Spadix Software is actually Rafabot and looks old<br />
*Storytracker ?<br />
*WAS (Web Archiving Service) - obsolete<br />
*WERA (Web ARchive Access)<br />
*Curate.Us<br />
*Find It! Keep It!<br />
*Heritrix plug-in for rich media capture</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Find_It!_Keep_It!&diff=5475Find It! Keep It!2021-11-26T16:53:04Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=Find It! Keep It! is a tool to save and organise web content.<br />
|homepage=http://www.ansemond.com/<br />
|function=Web Capture<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Find It! Keep It!<br />
}}<br />
= Description =<br />
Find It! Keep It! is a tool to save and organise web content. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=5474Wish list of tools to add to COPTR2021-11-26T16:52:48Z<p>Prwheatley: </p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*https://www.logipole.com/konvertor-en.htm<br />
*http://www.chumba.ch/chumbalum-soft/ms3d/index.html<br />
*https://www.embulk.org/<br />
*ISO16363 PTAB http://www.iso16363.org/<br />
*Guymager https://www.kali.org/tools/guymager/<br />
*Bulk Extractor https://www.kali.org/tools/bulk-extractor/<br />
*SoX http://sox.sourceforge.net/<br />
*MPlayer http://www.mplayerhq.hu/design7/info.html<br />
*IFIscripts https://ifiscripts.readthedocs.io/en/latest/index.html<br />
*Transformenator https://github.com/RetroFloppy/transformenator<br />
*Document Liberation Project https://www.documentliberation.org/projects/<br />
*https://github.com/kieranjol/IFIscripts<br />
*https://formats.kaitai.io/<br />
<br />
Some great data manipulation tools in this thread: https://twitter.com/SarahRBarsness/status/1452684578905337866?s=20<br />
*https://pandas.pydata.org/<br />
*https://www.tableau.com/products/prep<br />
*https://openrefine.org/<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Flashback_project_-_optical_media_imaging_workflow<br />
*Toc2Cue <br />
*BChunk<br />
*GenISOImage<br />
*Sophos AV Scan<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive<br />
*DocuTeam Feeder<br />
*cURL<br />
*Saxon<br />
*ClamAV<br />
<br />
==Tools that are obsolete==<br />
*Khtml2png<br />
*ArchiveFacebook<br />
*Pearl Crescent replaced with https://addons.mozilla.org/en-GB/firefox/addon/pagesaver-we/<br />
*Spadix Software is actually Rafabot and looks old<br />
*Storytracker ?<br />
*WAS (Web Archiving Service) - obsolete<br />
*WERA (Web ARchive Access)<br />
*Curate.Us<br />
*Find It! Keep It!</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=DIMAG&diff=5473DIMAG2021-11-26T16:51:46Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=A software suite supporting archives with preservation of digital information for eternity<br />
|homepage=https://dimag-wiki.la-bw.de<br />
|license=Proprietary<br />
|function=Access, File Format Migration, Metadata Extraction, Preservation System, Storage, Workflow<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
DIMAG is a software suite composed of several modules. DIMAG is jointly developed and used by a considerable number of archives in Germany, Austria and Switzerland. <br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Curate.Us&diff=5472Curate.Us2021-11-26T16:50:54Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=With a simple click of the mouse, you can create visually compelling clips and quotes of web content that are easily embedded in blog posts, email, forums, and websites.<br />
|homepage=https://secure.curate.us/content/front<br />
|function=Web Capture<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Curate.Us<br />
}}<br />
= Description =<br />
With a simple click of the mouse, you can create visually compelling clips and quotes of web content that are easily embedded in blog posts, email, forums, and websites. Use the Curate.Us bookmarklet, which is easily installed on your browser, or go to the website to create visual clips and formatted quotes of online content with attribution. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=5471Wish list of tools to add to COPTR2021-11-26T16:50:38Z<p>Prwheatley: </p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*https://www.logipole.com/konvertor-en.htm<br />
*http://www.chumba.ch/chumbalum-soft/ms3d/index.html<br />
*https://www.embulk.org/<br />
*ISO16363 PTAB http://www.iso16363.org/<br />
*Guymager https://www.kali.org/tools/guymager/<br />
*Bulk Extractor https://www.kali.org/tools/bulk-extractor/<br />
*SoX http://sox.sourceforge.net/<br />
*MPlayer http://www.mplayerhq.hu/design7/info.html<br />
*IFIscripts https://ifiscripts.readthedocs.io/en/latest/index.html<br />
*Transformenator https://github.com/RetroFloppy/transformenator<br />
*Document Liberation Project https://www.documentliberation.org/projects/<br />
*https://github.com/kieranjol/IFIscripts<br />
*https://formats.kaitai.io/<br />
<br />
Some great data manipulation tools in this thread: https://twitter.com/SarahRBarsness/status/1452684578905337866?s=20<br />
*https://pandas.pydata.org/<br />
*https://www.tableau.com/products/prep<br />
*https://openrefine.org/<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Flashback_project_-_optical_media_imaging_workflow<br />
*Toc2Cue <br />
*BChunk<br />
*GenISOImage<br />
*Sophos AV Scan<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive<br />
*DocuTeam Feeder<br />
*cURL<br />
*Saxon<br />
*ClamAV<br />
<br />
==Tools that are obsolete==<br />
*Khtml2png<br />
*ArchiveFacebook<br />
*Pearl Crescent replaced with https://addons.mozilla.org/en-GB/firefox/addon/pagesaver-we/<br />
*Spadix Software is actually Rafabot and looks old<br />
*Storytracker ?<br />
*WAS (Web Archiving Service) - obsolete<br />
*WERA (Web ARchive Access)<br />
*Curate.Us</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Webrecorder&diff=5470Webrecorder2021-11-26T16:47:00Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=Webrecorder is a hosted web archiving tool with which users can capture what they see as they browse websites and save that information (locally or to a free account)<br />
|homepage=https://webrecorder.io/<br />
|license=Webrecorder is an open-source software product (under the Apache License) and shared via [https://github.com/webrecorder/webrecorder GitHub]<br />
|platforms=Platform agnostic, operates via web browser<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
<br />
Webrecorder is a hosted web archiving tool with which users can capture what they see as they browse websites and save that information. Via a web browser Webrecorder collects content and data from web pages including: HTML, images, scripts, stylesheets, Flash, Java applets as well as video, audio and other elements used to make web pages and web apps. Webrecorder can capture dynamic web content that cannot be captured by most crawler-based web archiving tools. Webrecorder can record what you see when you are logged in to a social media profile (though it does not record site login credentials).<br />
<br />
One does not need to log in to use Webrecorder to capture web content if the intent is to download the captures right away (as a WARC file) and save them locally. Desktop software that can open a WARC file, such as Webrecorder Player [https://github.com/webrecorder/webrecorderplayer-electron], is needed to view web archives downloaded from Webrecorder. Webrecorder Player is available at no charge, and with this software you can view all the content contained in a WARC file without being connected to the internet. For continued access to archived content online, and to be able to add to a collection, it is necessary to log in to a free account, which comes with 5 GB of storage (at least as of fall 2017).<br />
<br />
Webrecorder is a project of Rhizome [https://rhizome.org/] under its digital preservation program.<br />
<br />
== User Experiences ==<br />
<br />
*Happy accidents: adventures in web preservation [https://anoldhanddigital.wordpress.com/2017/08/09/happy-accidents-adventures-in-web-preservation/]<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/webrecorder/webrecorder/commits<br />
<br />
<!-- <br />
=== Release Feed ===<br />
Below the last 3 release feeds:<br />
<rss max=3>https://github.com/webrecorder/webrecorder/releases.atom</rss><br />
--><br />
<br />
=== Activity Feed ===<br />
Below the last 5 commits:<br />
<rss max=5>https://github.com/webrecorder/webrecorder/commits/master.atom</rss><br />
<br />
<!-- Add the OpenHub.com ID for the tool, if known. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Webkit2png&diff=5469Webkit2png2021-11-26T16:46:40Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=webkit2png is a command line tool that creates png screenshots of webpages.<br />
|homepage=http://www.paulhammond.org/webkit2png/<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=webkit2png<br />
}}<br />
= Description =<br />
webkit2png is a command line tool that creates png screenshots of webpages. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WebShot&diff=5468WebShot2021-11-26T16:46:23Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=WebShot allows you to take screenshots of web pages and save them as full sized images or thumbnails.<br />
|homepage=http://www.websitescreenshots.com/<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=WebShot<br />
}}<br />
= Description =<br />
WebShot allows you to take screenshots of web pages and save them as full-sized images or thumbnails. Screenshot images can be output in the JPG, GIF, PNG, or BMP formats. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WebCite&diff=5467WebCite2021-11-26T16:46:05Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=WebCite is an on-demand web archiving service that takes snapshots of Internet-accessible digital objects at the behest of users, storing the data on their own servers and assigning unique identifiers to those instances of the material.<br />
|homepage=http://www.webcitation.org/<br />
|function=Persistent Identification, Web Capture, Citation and Impact Tracking<br />
|content=Research Data, Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=WebCite<br />
}}<br />
= Description =<br />
[http://www.webcitation.org/ WebCite] is an on-demand web archiving service that takes snapshots of Internet-accessible digital objects at the behest of users, storing the data on their own servers and assigning unique identifiers to those instances of the material. Its goal is to allow both scholars and academic publishers to have confidence that any web-based materials cited in their publications will remain available for future readers.<br />
====Provider====<br />
The WebCite Consortium, hosted at the University of Toronto / University Health Network&#39;s Centre for Global eHealth Innovation.<br />
====Licensing and cost====<br />
The service is freely available under a [http://creativecommons.org/licenses/by-nc-sa/2.5/ Creative Commons Attribution-NonCommercial-ShareAlike 2.5] License.<br />
WebCite asks that scholarly publishers wishing to automatically archive and identify new articles become members of the WebCite Consortium. Membership is currently a voluntary donation, based on publishing revenue and the number of webcitations assigned per year.<br />
====Development activity====<br />
The service is available as of January 2012.<br />
The project website does not advertise current development activity, although the service itself is still running. Most of the information on the website is out of date: it publicises developments due for release in 2008 that are not available on the site.<br />
====Platform and interoperability====<br />
WebCite is a web-based application, which renders it platform agnostic.<br />
====Functional notes====<br />
WebCite stores a copy of the target material&rsquo;s HTML, including any other associated files regardless of format. The crawl respects robot exclusion policies and firewalls.<br />
Users may initiate a web cache of materials they produce themselves before publishing. Authors may also submit XML files of already published articles, and WebCite will crawl the references and attempt to retrospectively archive cited webpages.<br />
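To illustrate what respecting robot exclusion policies means in practice, the following is a minimal sketch using Python's standard library. It is not WebCite's implementation; the robots.txt content, the user agent name and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt file as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = "example-archiver"  # hypothetical user agent name
    print(may_archive(ROBOTS_TXT, agent, "https://example.org/paper.html"))
    print(may_archive(ROBOTS_TXT, agent, "https://example.org/private/draft.html"))
```

An archiving crawler that honours these rules would skip any URL for which the check returns False.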
====Documentation and user support====<br />
Documentation consists of a [http://www.webcitation.org/doc/WebCiteBestPracticesGuide.pdf Technical Background and Best Practices Guide]. The site gives a contact email for the developer.<br />
====Usability====<br />
The archiving request interface is extremely straightforward. Individuals can also use a personalised bookmarklet to automatically request that materials be archived.<br />
====Expertise required====<br />
Basic knowledge of citation standards.<br />
====Standards compliance====<br />
WebCite, if requested, will incorporate an object&rsquo;s DOI into the webcitation; however, the service itself does not assign DOIs.<br />
====Influence and take-up====<br />
Nearly 200 journals belong to the consortium; no information is available on individual use.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Web_Scraper_Plus%2B&diff=5466Web Scraper Plus+2021-11-26T16:45:15Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=Web Scraper Plus+ takes data from the web and puts it into a spreadsheet or database.<br />
|homepage=http://www.velocityscape.com/Products/WebScraperPlus.aspx<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Web Scraper Plus+<br />
}}<br />
= Description =<br />
Web Scraper Plus+ takes data from the web and puts it into a spreadsheet or database. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Warrick&diff=5465Warrick2021-11-26T16:43:56Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|image=Warricklogo.gif<br />
|purpose=Warrick is a free utility for reconstructing (or recovering) a website from web archives.<br />
|homepage=https://github.com/oduwsdl/warrick<br />
|license=[http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html GNU General Public License 2+]<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Warrick<br />
|mailing_lists=https://groups.google.com/forum/#!topic/warrickrecovery/<br />
}}<br />
= Description =<br />
Warrick is a free utility for reconstructing (or recovering) a website when a back-up is not available. Warrick utilizes the Memento Framework to discover archived versions of resources from web archives. The resources are gathered to provide a single collection of files.<br />
====Provider====<br />
[http://www.harding.edu/fmccown/ Frank McCown] and [http://www.justinfbrunelle.com/ Justin Brunelle]<br />
====Licensing and cost====<br />
[http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html GNU General Public License 2+]<br />
====Standards compliance====<br />
Can use [http://mementoweb.org/about/ Memento] to retrieve archived web content.<br />
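As a rough illustration of Memento-based discovery, the sketch below parses a TimeMap in the link format described by the Memento specification (RFC 7089) and lists the archived copies it names. It is not Warrick's code; the sample TimeMap, the archive host and the function name are assumptions made for this example.

```python
import re
from email.utils import parsedate_to_datetime

# Matches one link-format entry: <uri> followed by ;name="value" parameters.
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')

# Hypothetical TimeMap for http://example.com/ (illustrative data only).
SAMPLE_TIMEMAP = (
    '<http://example.com/>; rel="original",\n'
    '<http://archive.example.org/timemap/http://example.com/>; rel="self"; '
    'type="application/link-format",\n'
    '<http://archive.example.org/20110101000000/http://example.com/>; '
    'rel="memento"; datetime="Sat, 01 Jan 2011 00:00:00 GMT",\n'
    '<http://archive.example.org/20100101000000/http://example.com/>; '
    'rel="first memento"; datetime="Fri, 01 Jan 2010 00:00:00 GMT"\n'
)


def list_mementos(timemap: str):
    """Return (datetime, uri) pairs for every memento entry, oldest first."""
    mementos = []
    for uri, raw_params in LINK_RE.findall(timemap):
        params = dict(PARAM_RE.findall(raw_params))
        # rel may hold several space-separated values, e.g. "first memento".
        if "memento" in params.get("rel", "").split() and "datetime" in params:
            mementos.append((parsedate_to_datetime(params["datetime"]), uri))
    return sorted(mementos)


if __name__ == "__main__":
    for captured_at, uri in list_mementos(SAMPLE_TIMEMAP):
        print(captured_at.isoformat(), uri)
```

A recovery tool could then pick, for each lost resource, the memento closest to the desired date and download it.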
<br />
= User Experiences =<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/oduwsdl/warrick/commits<br />
<br />
<br />
=== Activity Feed ===<br />
Below the last 5 commits:<br />
<rss max=5>https://github.com/oduwsdl/warrick/commits/master.atom</rss></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Warcit&diff=5464Warcit2021-11-26T16:43:15Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=Warcit converts directories, files and ZIP files to Web Archives (WARC)<br />
|homepage=https://pypi.org/project/warcit/<br />
|sourcecode=https://pypi.org/project/warcit/<br />
|license=Apache 2.0<br />
|platforms=Windows, Linux, Mac<br />
|language=English<br />
|formats_in=ZIP<br />
|formats_out=WARC<br />
|function=File Format Migration<br />
|content=Web<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
Warcit converts directories, files and ZIP files to Web Archives (WARC)<br />
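To give a sense of what converting a file to WARC involves, here is a minimal sketch that wraps a payload in a single WARC 1.0 "resource" record using only Python's standard library. It is not how Warcit is implemented; the function name, target URI and payload are hypothetical, and a real tool also handles MIME type detection, compression and indexing.

```python
import uuid
from datetime import datetime, timezone


def build_resource_record(target_uri: str, payload: bytes,
                          content_type: str = "application/octet-stream") -> bytes:
    """Wrap payload in one WARC/1.0 'resource' record and return it as bytes."""
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Date: {datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Header lines end with CRLF; a blank line separates header and block.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # Each record is terminated by two CRLFs.
    return head + payload + b"\r\n\r\n"


if __name__ == "__main__":
    record = build_resource_record(
        "http://www.example.com/index.html",  # hypothetical URI for the file
        b"<p>hi</p>",
        "text/html",
    )
    print(record.decode("utf-8"))
```

Concatenating one such record per file in a directory yields an uncompressed WARC file that replay tools can read.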
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=5463Wish list of tools to add to COPTR2021-11-26T16:40:41Z<p>Prwheatley: </p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*https://www.logipole.com/konvertor-en.htm<br />
*http://www.chumba.ch/chumbalum-soft/ms3d/index.html<br />
*https://www.embulk.org/<br />
*ISO16363 PTAB http://www.iso16363.org/<br />
*Guymager https://www.kali.org/tools/guymager/<br />
*Bulk Extractor https://www.kali.org/tools/bulk-extractor/<br />
*SoX http://sox.sourceforge.net/<br />
*MPlayer http://www.mplayerhq.hu/design7/info.html<br />
*IFIscripts https://ifiscripts.readthedocs.io/en/latest/index.html<br />
*Transformenator https://github.com/RetroFloppy/transformenator<br />
*Document Liberation Project https://www.documentliberation.org/projects/<br />
*https://github.com/kieranjol/IFIscripts<br />
*https://formats.kaitai.io/<br />
<br />
Some great data manipulation tools in this thread: https://twitter.com/SarahRBarsness/status/1452684578905337866?s=20<br />
*https://pandas.pydata.org/<br />
*https://www.tableau.com/products/prep<br />
*https://openrefine.org/<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Flashback_project_-_optical_media_imaging_workflow<br />
*Toc2Cue <br />
*BChunk<br />
*GenISOImage<br />
*Sophos AV Scan<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive<br />
*DocuTeam Feeder<br />
*cURL<br />
*Saxon<br />
*ClamAV<br />
<br />
==Tools that are obsolete==<br />
*Khtml2png<br />
*ArchiveFacebook<br />
*Pearl Crescent replaced with https://addons.mozilla.org/en-GB/firefox/addon/pagesaver-we/<br />
*Spadix Software is actually Rafabot and looks old<br />
*Storytracker ?<br />
*WAS (Web Archiving Service) - obsolete<br />
*WERA (Web ARchive Access)</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WCT_(Web_Curator_Tool)&diff=5462WCT (Web Curator Tool)2021-11-26T16:38:16Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|image=wct-banner-500x100.jpg<br />
|purpose=Web Curator Tool (WCT) is a workflow management application for selective web archiving.<br />
|homepage=https://webcuratortool.org/<br />
|sourcecode=https://github.com/WebCuratorTool/webcurator<br />
|license=[http://www.apache.org/licenses/LICENSE-2.0 Apache License 2.0]<br />
|platforms=Apache Tomcat<br />
|function=Metadata Processing, Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=webcurator<br />
}}<br />
= Description =<br />
The [http://webcurator.sourceforge.net/ Web Curator Tool] (WCT) is a workflow management application for selective web archiving. WCT allows users to target websites that they wish to include in their collection, create and manage schedules to automatically harvest those sites, and package the collected files to easily submit them to a digital archive.<br />
====Provider====<br />
Developed by the National Library of New Zealand and the British Library, initiated by the International Internet Preservation Consortium. Currently maintained by Oakleigh Consulting Ltd.<br />
====Platform and interoperability====<br />
WCT was written in Java and designed to run in Apache Tomcat. It has been tested on Red Hat Linux, Solaris, and (to a lesser extent) Microsoft Windows. Three relational databases are officially supported: Oracle, MySQL and PostgreSQL. The software itself makes use of part or all of several other open-source components, including: Heritrix; Wayback; Acegi Security System; Apache Axis; Apache Commons Logging; Hibernate; Quartz; and Spring Application Framework.<br />
====Functional notes====<br />
An important functionality of the software is the ability to collect, store, and abide by harvest authorisations, i.e. permissions to download from the copyright holders. WCT also creates separate administrative levels, so that those who set up the harvests do not necessarily have the authority to have the system actively begin them. All material is captured in ARC format; since WCT incorporates Wayback, access within the system is not a problem. However, those who collect material with the ultimate goal of archiving it in a separate system must take the format into account.<br />
====Documentation and user support====<br />
The project site includes a well-written quick-start guide and [http://webcurator.sourceforge.net/docs/1.5.2/Web%20Curator%20Tool%20User%20Manual%20(WCT%201.5.2).pdf user manual], although some headings in the manual lack the corresponding text. The site also includes a developer guide, published with release version 1.5.2. Links to the advertised wiki and FAQ sections are currently broken, forwarding to the SourceForge developer page instead. More information about the project can be found in an informative [http://www.ariadne.ac.uk/issue50/beresford/ article] published in Ariadne Issue 50. The primary forum for technical support appears to be an active &ldquo;webcurator-users&rdquo; mailing list. While bugs continue to be posted on the SourceForge bug/feature request tracker, the last addressed item was in February 2011.<br />
====Usability====<br />
WCT is specifically designed to be operated by non-technical users such as librarians, with a simple and relatively intuitive GUI. Installation, however, is difficult and most likely requires tech support.<br />
====Expertise required====<br />
Setup, especially if it includes links to an archival repository, requires system administration knowledge. Users must have a comprehensive understanding of their institution&rsquo;s collections policies when designing harvests.<br />
====Standards compliance====<br />
WCT allows users to add basic Dublin Core metadata to the material.<br />
====Influence and take-up====<br />
WCT is used by the National Library of New Zealand, the National Library of Norway, and the British Library. The SourceForge site lists nearly 8,300 downloads as of December 2011.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
No information is available on the current funding status for development, although the SourceForge site&rsquo;s bugtracker continues to list new entries and responses. WCT encourages developer participation, publishing a Developers Guide with the latest release. <br />
<br />
All development activity is visible on GitHub: http://github.com/DIA-NZ/webcurator/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below the last 3 release feeds:<br />
<rss max=3>https://github.com/DIA-NZ/webcurator/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below the last 5 commits:<br />
<rss max=5>https://github.com/DIA-NZ/webcurator/commits/master.atom</rss></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WCT_(Web_Curator_Tool)&diff=5461WCT (Web Curator Tool)2021-11-26T16:37:48Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|image=wct-banner-500x100.jpg<br />
|purpose=Web Curator Tool (WCT) is a workflow management application for selective web archiving.<br />
|homepage=https://webcuratortool.org/<br />
|sourcecode=https://github.com/WebCuratorTool/webcurator<br />
|license=[http://www.apache.org/licenses/LICENSE-2.0 Apache License 2.0]<br />
|platforms=Apache Tomcat<br />
|function=Metadata Processing, Web Crawl<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=webcurator<br />
}}<br />
= Description =<br />
The [http://webcurator.sourceforge.net/ Web Curator Tool] (WCT) is a workflow management application for selective web archiving. WCT allows users to target websites that they wish to include in their collection, create and manage schedules to automatically harvest those sites, and package the collected files to easily submit them to a digital archive.<br />
====Provider====<br />
Developed by the National Library of New Zealand and the British Library, initiated by the International Internet Preservation Consortium. Currently maintained by Oakleigh Consulting Ltd.<br />
====Platform and interoperability====<br />
WCT was written in Java and designed to run in Apache Tomcat. It has been tested on Red Hat Linux, Solaris, and (to a lesser extent) Microsoft Windows. Three relational databases are officially supported: Oracle, MySQL and PostgreSQL. The software itself makes use of part or all of several other open-source components, including: Heritrix; Wayback; Acegi Security System; Apache Axis; Apache Commons Logging; Hibernate; Quartz; and Spring Application Framework.<br />
====Functional notes====<br />
An important function of the software is the ability to collect, store, and abide by harvest authorisations, i.e. permissions to download from the copyright holders. WCT also creates separate administrative levels, so that those who set up the harvests do not necessarily have the authority to have the system actively begin them. All material is captured in ARC format; since WCT incorporates Wayback, access within the system is not a problem. However, those who collect material with the ultimate goal of archiving it in a separate system must take the format into account.<br />
====Documentation and user support====<br />
The project site includes a well-written quick-start guide and [http://webcurator.sourceforge.net/docs/1.5.2/Web%20Curator%20Tool%20User%20Manual%20(WCT%201.5.2).pdf user manual], although the manual includes section headings with no corresponding text. The site also includes a developer guide, published with release version 1.5.2. Links to the advertised wiki and FAQ sections are currently broken, forwarding to the SourceForge developer page instead. More information about the project can be found in an informative [http://www.ariadne.ac.uk/issue50/beresford/ article] published in Ariadne Issue 50. The primary forum for technical support appears to be an active &ldquo;webcurator-users&rdquo; mailing list. While bugs continue to be posted on the SourceForge bug/feature request tracker, the last addressed item was in February 2011.<br />
====Usability====<br />
WCT is specifically designed to be operated by non-technical users such as librarians, with a simple and relatively intuitive GUI. Installation, however, is difficult and most likely requires technical support.<br />
====Expertise required====<br />
Setup, especially if it includes links to an archival repository, requires system administration knowledge. Users must have a comprehensive understanding of their institution&rsquo;s collections policies when designing harvests.<br />
====Standards compliance====<br />
WCT allows users to add basic Dublin Core metadata to the material.<br />
====Influence and take-up====<br />
WCT is used by the National Library of New Zealand, the National Library of Norway, and the British Library. The SourceForge site lists nearly 8,300 downloads as of December 2011.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
No information is available on the current funding status for development, although the SourceForge site&rsquo;s bugtracker continues to list new entries and responses. WCT encourages developer participation, publishing a Developers Guide with the latest release. <br />
<br />
All development activity is visible on GitHub: http://github.com/DIA-NZ/webcurator/commits<br />
<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/DIA-NZ/webcurator/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/DIA-NZ/webcurator/commits/master.atom</rss></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=5460Wish list of tools to add to COPTR2021-11-26T16:36:05Z<p>Prwheatley: </p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*https://www.logipole.com/konvertor-en.htm<br />
*http://www.chumba.ch/chumbalum-soft/ms3d/index.html<br />
*https://www.embulk.org/<br />
*ISO16363 PTAB http://www.iso16363.org/<br />
*Guymager https://www.kali.org/tools/guymager/<br />
*Bulk Extractor https://www.kali.org/tools/bulk-extractor/<br />
*SoX http://sox.sourceforge.net/<br />
*MPlayer http://www.mplayerhq.hu/design7/info.html<br />
*IFIscripts https://ifiscripts.readthedocs.io/en/latest/index.html<br />
*Transformenator https://github.com/RetroFloppy/transformenator<br />
*Document Liberation Project https://www.documentliberation.org/projects/<br />
*https://github.com/kieranjol/IFIscripts<br />
*https://formats.kaitai.io/<br />
<br />
Some great data manipulation tools in this thread: https://twitter.com/SarahRBarsness/status/1452684578905337866?s=20<br />
*https://pandas.pydata.org/<br />
*https://www.tableau.com/products/prep<br />
*https://openrefine.org/<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Flashback_project_-_optical_media_imaging_workflow<br />
*Toc2Cue <br />
*BChunk<br />
*GenISOImage<br />
*Sophos AV Scan<br />
<br />
(note that the 4 tools below are referenced in this workflow. After they have been added, the links in the workflow should be updated so they point to the new tool entries) https://coptr.digipres.org/index.php/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive<br />
*DocuTeam Feeder<br />
*cURL<br />
*Saxon<br />
*ClamAV<br />
<br />
==Tools that are obsolete==<br />
*Khtml2png<br />
*ArchiveFacebook<br />
*Pearl Crescent replaced with https://addons.mozilla.org/en-GB/firefox/addon/pagesaver-we/<br />
*Spadix Software is actually Rafabot and looks old<br />
*Storytracker ?<br />
*WAS (Web Archiving Service) - obsolete</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WAS_(Web_Archiving_Service)&diff=5459WAS (Web Archiving Service)2021-11-26T16:35:45Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=The Web Archiving Service (WAS) is a Web-based curatorial tool that enables libraries and archivists to capture, curate, analyze, and preserve Web-based government and political information.<br />
|homepage=http://webarchives.cdlib.org/<br />
|platforms=Web-based. Javascript must be enabled in the user's browser. User must be able to install browser bookmarklets to use the "add sites while browsing" feature. Login and password are required.<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=WAS (Web Archiving Service)<br />
}}<br />
= Description =<br />
The Web Archiving Service (WAS) is a Web-based curatorial tool that enables libraries and archivists to capture, curate, analyze, and preserve Web-based government and political information. It was developed by the California Digital Library and is written in Java and Ruby on Rails.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=WARCreate&diff=5458WARCreate2021-11-26T16:35:16Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|image=Icon.png<br />
|purpose=Google Chrome browser extension for creating WARC files from web pages<br />
|homepage=https://warcreate.com<br />
|sourcecode=https://github.com/machawk1/warcreate<br />
|license=GPLv3<br />
|platforms=Cross-platform<br />
|language=JavaScript<br />
|formats_out=WARC<br />
|function=Data capture and Deposit, Personal Archiving, Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=warcreate<br />
}}<br />
== Description ==<br />
WARCreate is a browser extension for Google Chrome that preserves web pages viewed in the browser as WARC files stored on the user's local disk.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. --><br />
* [http://ws-dl.blogspot.com/2013/07/2013-07-10-warcreate-and-wail-warc.html 2013-07-10: WARCreate and WAIL: WARC, Wayback and Heritrix Made Easy]<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/machawk1/warcreate/commits<br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/machawk1/warcreate/commits/master.atom</rss><br />
<br />
<br />
<!-- Add the Ohloh.com ID for the tool, if known. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=UKWA_Access_API&diff=5457UKWA Access API2021-11-26T16:32:32Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|image={{PAGENAMEE}}.png<br />
|purpose=Web archives access API<br />
|homepage=https://github.com/ukwa/ukwa-access-api<br />
|license=Apache License 2.0<br />
|function=Persistent Identification, Service<br />
|content=Web<br />
}}<br />
{{Infobox tool details}}<br />
== Description ==<br />
UKWA Access API is an application that wraps the APIs for accessing UKWA content, including an ARK resolver.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --></div>Prwheatleyhttps://coptr.digipres.org/index.php?title=The_DeDuplicator_(Heritrix_add-on_module)&diff=5456The DeDuplicator (Heritrix add-on module)2021-11-26T16:32:07Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.<br />
|homepage=http://landsbokasafn.github.io/DeDuplicator/<br />
|function=De-Duplication, Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=The DeDuplicator (Heritrix add-on module)<br />
}}<br />
= Description =<br />
The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatleyhttps://coptr.digipres.org/index.php?title=Teleport&diff=5455Teleport2021-11-26T16:30:44Z<p>Prwheatley: </p>
<hr />
<div>{{Infobox tool<br />
|purpose=Teleport is a web crawling tool that enables offline browsing<br />
|homepage=http://www.tenmax.com/teleport/home.htm<br />
|function=Web Capture<br />
|content=Web<br />
}}<br />
{{Infobox tool details<br />
|ohloh_id=Tennyson Maxwell Information Systems<br />
}}<br />
= Description =<br />
Teleport, from Tennyson Maxwell Information Systems, offers a variety of features to support multithreaded retrieval, password-protected access, filtering, batch capture, and management of derived databases. <br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =</div>Prwheatley