Editing User:Andy Jackson/DigiPresHack

= Introduction =
== DigiPresHack Outline Proposal ==
The idea behind DigiPresHack is to have recurring events in hackathon style where we build up the information we need to do digital preservation better. These could be regular fixtures alongside conferences (iPres, IIPC GA, IDCC, etc.) but would also have a strong remote-participation element. There would be three main outcomes:
  
 
* More information documenting more formats, risks and other preservation issues.
 
 
That is, there would always be an educational/introductory strand to help people learn about the issues and how to contribute to the registries and data sources.
  
= Potential Events =
 
  
* [[User:Andy_Jackson/DigiPresHack@IIPC-GA-2015|DigiPresHack@IIPC-GA-2015]]
DigiPresHack: Formats & Tools Hackathon

Andrew N. Jackson

Themes: Preservation

To effectively capture and preserve the web, we need to understand the formats and protocols of the web, and the tools that can be used to manage them over time. This need has manifested itself in a range of digital preservation tool and format registries and test corpora, but so far these have been only a partial success. Many registries have been developed but have failed to take hold, and those that have succeeded are the ones that have sought to identify, support and recognise the individuals willing to contribute their time, effort and knowledge.

The idea of this DigiPresHack is to support those who wish to contribute in this area by providing a supportive environment and a clear framework for contribution. The hackathon would be a one-day workshop in the ‘unconference’ style. Suggested activities would include:
 
* Generating example test files for various formats, e.g.
** WARC and ARC files demonstrating the different de-duplication methods
** HTML files demonstrating particular features, ideally accompanied by screenshots to capture the results (e.g. using emulators for old browsers)
** Extending the Archival Acid Test suite: https://github.com/machawk1/archivalAcidTest
* Extending the PWG database, possibly combining it with the newly-developed PET tools (https://github.com/pericles-project/pet)
* Reviewing and/or adding web archiving tool information to COPTR (http://coptr.digipres.org/)
* Documenting difficult or particularly interesting/challenging formats (http://fileformats.archiveteam.org)
* Extending the aggregations and visualisations at http://www.digipres.org/ in order to see how far we’ve come
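
As an illustration of the WARC test-file activity above, the following is a minimal sketch (not a definitive implementation) of how a pair of records demonstrating digest-based de-duplication might be generated using only the Python standard library. The function names (warc_record, payload_digest, build_dedup_example), the example URI and the sample payload are assumptions made for this sketch. A real test corpus would more likely be produced with an established WARC library and checked against the WARC specification.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def warc_record(warc_type, target_uri, block, extra_headers=()):
    """Serialise one WARC/1.0 record: header lines, blank line, block, two CRLFs."""
    record_id = "<urn:uuid:%s>" % uuid.uuid4()
    headers = [
        ("WARC-Type", warc_type),
        ("WARC-Record-ID", record_id),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        *extra_headers,
        ("Content-Type", "application/http; msgtype=response"),
        ("Content-Length", str(len(block))),
    ]
    head = "WARC/1.0\r\n" + "".join("%s: %s\r\n" % h for h in headers)
    return record_id, head.encode("utf-8") + b"\r\n" + block + b"\r\n\r\n"


def payload_digest(payload):
    """SHA-1 digest in the Base32 form commonly seen in WARC headers."""
    return "sha1:" + base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")


def build_dedup_example(uri="http://example.org/"):
    """Return bytes holding a full response record plus a revisit record for it."""
    # Hypothetical capture: the same payload is seen twice at the same URI.
    payload = b"<html><body>Hello</body></html>"
    http_head = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
    digest = payload_digest(payload)

    # First capture: stored in full.
    first_id, first = warc_record(
        "response", uri, http_head + payload,
        [("WARC-Payload-Digest", digest)],
    )
    # Second capture: only the HTTP headers are stored, pointing at the first.
    _, revisit = warc_record(
        "revisit", uri, http_head,
        [
            ("WARC-Profile",
             "http://netpreserve.org/warc/1.0/revisit/identical-payload-digest"),
            ("WARC-Refers-To", first_id),
            ("WARC-Payload-Digest", digest),
        ],
    )
    return first + revisit
```

Writing the result of build_dedup_example() to a file gives a small, uncompressed example that could be extended to cover other de-duplication approaches, such as the server-not-modified revisit profile.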
 
To go ahead, this hackathon would require some additional funding to bring in appropriate individuals who could facilitate this event and who would not otherwise be able to attend. If possible, modest prizes for significant contributions could help build momentum. Ideally, we could use a webcast/hangout or similar to enable engagement by those who cannot attend.
  
  
== Potential Strands ==
  
=== Introductory Track ===
  
 
These are tasks that require only basic technical skills and a willingness to learn how to document findings. Participants would carry out basic tasks, such as creating test files and checking how they are rendered.
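
As a rough illustration of the kind of introductory task described above, here is a minimal sketch that writes one small HTML test file per feature, so that each file can be opened in different browsers and the rendering recorded. The feature list, the page template and the function name write_test_files are all assumptions chosen for illustration, not part of any existing test suite.

```python
from pathlib import Path

# Hypothetical features to exercise; each maps to a small HTML snippet.
FEATURES = {
    "marquee": "<marquee>Scrolling text</marquee>",
    "blink": "<blink>Blinking text</blink>",
    "table-layout": "<table border='1'><tr><td>A</td><td>B</td></tr></table>",
}

# Minimal page wrapper; the title records which feature is under test.
TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Test: {name}</title></head>
<body>
<p>Feature under test: {name}</p>
{snippet}
</body>
</html>
"""


def write_test_files(out_dir, features=FEATURES):
    """Write one HTML file per feature and return the paths, sorted by name."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, snippet in sorted(features.items()):
        path = out_dir / f"{name}.html"
        path.write_text(TEMPLATE.format(name=name, snippet=snippet), encoding="utf-8")
        written.append(path)
    return written
```

Keeping each file to a single feature makes it easier to attribute a rendering difference to that feature when documenting the result.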
 
The goal is to better understand formats and software dependencies, and to document genuine preservation risks.
  
=== Technical Track ===
  
 
Improving tools, creating new identification signatures in forms suitable for PRONOM, etc.
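
To give a flavour of what an identification signature does, the sketch below matches a few well-known magic numbers at fixed offsets from the start of a file. It is only an illustration under stated assumptions: the SIGNATURES table and the identify function are hypothetical, the labels are informal names rather than registry identifiers, and real PRONOM signatures are richer (variable offsets, wildcards, end-of-file anchors) and are written in PRONOM's own syntax rather than as Python tuples.

```python
# Each entry: (informal label, offset from start of file, magic bytes).
# Illustrative only; not PRONOM signature syntax and not registry identifiers.
SIGNATURES = [
    ("PNG image", 0, bytes.fromhex("89504e470d0a1a0a")),
    ("GIF image (87a)", 0, b"GIF87a"),
    ("GIF image (89a)", 0, b"GIF89a"),
    ("PDF document", 0, b"%PDF-"),
    ("WARC file (uncompressed)", 0, b"WARC/"),
]


def identify(data, signatures=SIGNATURES):
    """Return the labels of all signatures whose bytes appear at their offset."""
    return [
        label
        for label, offset, magic in signatures
        if data[offset:offset + len(magic)] == magic
    ]
```

For example, identify(open(path, "rb").read(16)) would be enough for the entries above, since every signature in this table fits within the first 16 bytes.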
