PRONOM Signature Development Utility



Output DROID compatible file format signature files using PRONOM syntax
Homepage: https://github.com/exponential-decay/signature-development-utility
License: Open source (see URL above)
Platforms: Golang + PHP + jQuery + JavaScript + text/html
Function: File Format Identification




Description

A utility for creating DROID-compatible signature files using PRONOM regular expression syntax. The tool outputs an XML format compatible with DROID 4 and later (including DROID 5 and 6). Three sequences can be combined to create a single file format signature. Signature files can be concatenated manually if more complex collections are required for testing.

Version 2: FFDev.info

Version 2 of the utility was released in October 2020. It adds support for container signature sequences. The first iteration is built on top of the original utility, which is already well proven.

Hosting

Version 2 is hosted on:

Version 1: PRONOM signature development utility

The original code for version 1 of the utility is still available in a branch on GitHub.

Hosting

Version 1 is hosted at exponentialdecay.co.uk and The National Archives, UK:

Using its Output

DROID

The output of the signature development utility can be uploaded directly to DROID via its 'Install Signature File' mechanism. Be sure to then select the new file under 'Preferences'. Container signatures need to be added via the user's /home/<username>/.droid6 directory. On Windows, the user's home folder can be found in the Users folder on the 'C:' drive.

Siegfried

Siegfried is a useful tool for testing because it can combine a new signature file from this utility with all the other signatures in the PRONOM corpus, as well as the many other signatures Siegfried supports.

Instructions for Linux

We use a utility called Roy to extend signature files. It is installed alongside Siegfried.

To extend a signature file with a custom signature file, we need a 'custom' folder where Roy can find the custom file. Our signature needs to sit on a path like the following:

/home/{username}/siegfried/custom/{custom-dev-sig}.xml
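As an illustration, the path above can be built and populated with a few lines of Python. This is a minimal sketch, assuming the Siegfried home folder sits directly under the user's home directory as shown; the function names `custom_signature_path` and `install_custom_signature` are hypothetical and not part of Siegfried or Roy.

```python
import shutil
from pathlib import Path
from typing import Optional


def custom_signature_path(signature_name: str, home: Optional[Path] = None) -> Path:
    """Return <home>/siegfried/custom/<signature_name>.

    Assumption: the Siegfried home folder is <home>/siegfried, as in the
    path shown above. Adjust if your installation uses another location.
    """
    base = home if home is not None else Path.home()
    return base / "siegfried" / "custom" / signature_name


def install_custom_signature(source: Path, home: Optional[Path] = None) -> Path:
    """Copy a signature file into the 'custom' folder, creating it if needed."""
    target = custom_signature_path(source.name, home)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target
```

Calling `install_custom_signature(Path("my-dev-sig.xml"))` would place a copy of a file downloaded from the utility where Roy is expected to look for it.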

Extending the signature file can then be completed in two stages:

1) roy harvest

This will download the PRONOM signature file reports. This is the most accurate way to run Siegfried. An alternative is to let Siegfried parse the DROID signature file, but this has been shown to lead to some inconsistencies where PRONOM and DROID do not reflect each other entirely.

2) roy build -extend {custom-dev-sig}.xml

Try Siegfried on a single file of the format we have created the signature for:

3) sf {custom-dev-format}.{ext}

The result will be:

---
siegfried   : 1.7.6
scandate    : 2017-10-25T23:35:31+13:00
signature   : default.sig
created     : 2017-10-25T23:23:59+13:00
identifiers : 
  - name    : 'pronom'
    details : 'DROID_SignatureFile_V91.xml; container-signature-20170330.xml; extensions: {custom-dev-sig}.xml'
---
filename : '{custom-dev-format}.{ext}'
filesize : 1492992
modified : 2017-10-22T16:02:13+13:00
errors   : 
matches  :
  - ns      : 'pronom'
    id      : 'dev/1'
    format  : '{custom-dev-format}'
    version : '1.0'
    mime    : 'application/{custom-dev-format}'
    basis   : 'extension match img; byte match at 1024, 32'
    warning : 
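
The three steps above can also be scripted. The following is a minimal sketch, assuming the binaries are invoked as lowercase `roy` and `sf` and are available on the PATH; the helper names `extension_commands` and `run_all` are hypothetical. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def extension_commands(signature_file: str, sample_file: str) -> List[List[str]]:
    """Build the command lines for harvest, build, and identification."""
    return [
        ["roy", "harvest"],                              # 1) download PRONOM reports
        ["roy", "build", "-extend", signature_file],     # 2) build with the custom signature
        ["sf", sample_file],                             # 3) identify a sample file
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence if a step fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder file names; replace them with your own signature and sample.
    run_all(extension_commands("my-dev-sig.xml", "sample.ext"))
```

Running the steps in a fixed order matters because the build step relies on the harvested reports, and identification relies on the rebuilt signature file.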

RDF

The tool's output suggests a potential RDF representation of a DROID signature (NB. this is a version 1 option only).

RDF from the utility looks as follows:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:sigdev="http://nationalarchives.gov.uk/preservation/sigdev/signature/" xmlns:bytes="http://nationalarchives.gov.uk/preservation/sigdev/signature/byteSequence/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <sigdev:DevelopmentSignature rdf:about="http://nationalarchives.gov.uk/preservation/sigdev/signature/ddc3ab7d-db41-49c5-b3fa-262ad83dd703">
    <rdfs:label>Development Signature</rdfs:label>
    <sigdev:version>1.0</sigdev:version>
    <sigdev:extension>ext</sigdev:extension>
    <sigdev:internetMediaType>text/x-test-signature</sigdev:internetMediaType>
    <sigdev:puid>dev/1</sigdev:puid>
    <sigdev:byteSequence>
      <rdf:Description rdf:about="http://nationalarchives.gov.uk/preservation/sigdev/signature/byteSequence/228c0626-3e0d-40bc-98d3-d897a21b20e1/1">
        <bytes:string rdf:datatype="http://nationalarchives.gov.uk/preservation/sigdev/signature/droidRegularExpression">255044462D312E34</bytes:string>
        <bytes:anchor>BOFoffset</bytes:anchor>
        <bytes:offset rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">0</bytes:offset>
        <bytes:maxOffset rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">0</bytes:maxOffset>
      </rdf:Description>
    </sigdev:byteSequence>
  </sigdev:DevelopmentSignature>
</rdf:RDF>

This can be used to generate a graph visualization of a signature, and provides another serialization of a DROID signature that could potentially be consumed by tools in the future.
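
As one example of consuming this serialization, the byte sequences can be read back with Python's standard library. This is a minimal sketch, not part of the utility: the sample is a trimmed copy of the RDF above, the function name `read_byte_sequences` is hypothetical, and decoding with `bytes.fromhex` only works for plain hexadecimal sequences such as this one, not for sequences that use PRONOM wildcards or other syntax.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes taken from the RDF output shown above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sigdev": "http://nationalarchives.gov.uk/preservation/sigdev/signature/",
    "bytes": "http://nationalarchives.gov.uk/preservation/sigdev/signature/byteSequence/",
}

# Trimmed copy of the example output, kept small for illustration.
SAMPLE_RDF = """<rdf:RDF
    xmlns:sigdev="http://nationalarchives.gov.uk/preservation/sigdev/signature/"
    xmlns:bytes="http://nationalarchives.gov.uk/preservation/sigdev/signature/byteSequence/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <sigdev:DevelopmentSignature rdf:about="http://nationalarchives.gov.uk/preservation/sigdev/signature/ddc3ab7d-db41-49c5-b3fa-262ad83dd703">
    <sigdev:puid>dev/1</sigdev:puid>
    <sigdev:byteSequence>
      <rdf:Description rdf:about="http://nationalarchives.gov.uk/preservation/sigdev/signature/byteSequence/228c0626-3e0d-40bc-98d3-d897a21b20e1/1">
        <bytes:string>255044462D312E34</bytes:string>
        <bytes:anchor>BOFoffset</bytes:anchor>
        <bytes:offset>0</bytes:offset>
        <bytes:maxOffset>0</bytes:maxOffset>
      </rdf:Description>
    </sigdev:byteSequence>
  </sigdev:DevelopmentSignature>
</rdf:RDF>"""


def read_byte_sequences(rdf_text: str) -> list:
    """Return one dict per byte sequence found in the RDF document."""
    root = ET.fromstring(rdf_text)
    sequences = []
    for desc in root.iterfind(".//sigdev:byteSequence/rdf:Description", NS):
        sequences.append({
            "string": desc.findtext("bytes:string", namespaces=NS),
            "anchor": desc.findtext("bytes:anchor", namespaces=NS),
            "offset": int(desc.findtext("bytes:offset", namespaces=NS)),
            "max_offset": int(desc.findtext("bytes:maxOffset", namespaces=NS)),
        })
    return sequences


if __name__ == "__main__":
    for seq in read_byte_sequences(SAMPLE_RDF):
        # Plain hex only: 255044462D312E34 decodes to b'%PDF-1.4'.
        print(seq["anchor"], seq["offset"], bytes.fromhex(seq["string"]))
```

The sequence in the example decodes to the ASCII string %PDF-1.4, anchored at offset 0 from the beginning of the file.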

User experiences

Additional reading

Development activity

All development activity is visible on GitHub: https://github.com/exponential-decay/signature-development-utility/commits

Activity feed

The last five commits:

2020-10-05 23:20:25
Removes the upper case default on container seqs (commit 66d3cd248f281abcb707197577679df50d8af8a5)
by ross-spencer https://github.com/ross-spencer

2020-10-02 01:35:28
Condense layout a little more (commit 85efed6a4b4ee353b7c807f61f7a4df3136476b0)
by ross-spencer https://github.com/ross-spencer

2020-10-02 01:35:09
Remove redundant code and reset fields (commit 8d0fe614e8d65245cc8bf82ab44bd38bcc3a9b30)
by ross-spencer https://github.com/ross-spencer

2020-09-29 02:03:40
Correct Variable form field handling (commit e5b11757ea71bdd8fcf7f9080ea932125aa475f4)
by ross-spencer https://github.com/ross-spencer

2020-09-28 14:13:43
Update filename output (commit d366c4d4a4aeebdd1777b4fe505e34faea3d0066)
by ross-spencer https://github.com/ross-spencer