Demystify

{{Infobox tool
|purpose=Format Identification Analysis and Reporting
|homepage=https://github.com/exponential-decay/demystify
|license=Open source (see URL above)
|platforms=sqlite + Python + text/html
|function=Metadata Extraction, Content Profiling, De-Duplication
}}
{{Infobox tool details}}
 
== Description ==
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. -->
Demystify (formerly the 'DROID Siegfried Sqlite Analysis Engine', renamed at the suggestion of Joshua Ng) is an engine for the analysis of [https://github.com/digital-preservation/droid DROID] CSV export files, [https://github.com/richardlehane/siegfried Siegfried] YAML export files, and Siegfried 'DROID compatible' output. The tool has three purposes: to break the exports into their components and store them in a table in a SQLite database; to create additional columns that augment the output where useful; and to query the SQLite database, outputting results in a readable form for analysis by researchers and archivists in digital preservation departments at memory institutions.
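
The three stages described above can be illustrated with a minimal Python sketch. It is not Demystify's own code: the table name, the derived <code>dir_name</code> column, and the sample rows are assumptions made for illustration, and only a small subset of DROID-style column names is used.

```python
import csv
import io
import sqlite3

# Illustrative rows only; the column subset is an assumption, not a full export.
SAMPLE_CSV = """NAME,FILE_PATH,TYPE,EXT,PUID,FORMAT_NAME
report.pdf,/data/report.pdf,File,pdf,fmt/18,Acrobat PDF 1.4
photo.jpg,/data/img/photo.jpg,File,jpg,fmt/43,JPEG File Interchange Format
notes.pdf,/data/notes.pdf,File,pdf,fmt/18,Acrobat PDF 1.4
mystery.bin,/data/mystery.bin,File,bin,,
"""


def load_export(conn, csv_text):
    """Stages 1 and 2: store each field in a table and add a derived column."""
    conn.execute(
        "CREATE TABLE files (name TEXT, file_path TEXT, type TEXT, "
        "ext TEXT, puid TEXT, format_name TEXT, dir_name TEXT)"
    )
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Derived column (hypothetical): the directory that contains the file.
        dir_name = row["FILE_PATH"].rsplit("/", 1)[0]
        conn.execute(
            "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?, ?)",
            (row["NAME"], row["FILE_PATH"], row["TYPE"], row["EXT"],
             row["PUID"], row["FORMAT_NAME"], dir_name),
        )
    conn.commit()


def format_frequency(conn):
    """Stage 3: count identified files per PUID, most frequent first."""
    return conn.execute(
        "SELECT puid, COUNT(*) AS n FROM files WHERE puid != '' "
        "GROUP BY puid ORDER BY n DESC, puid"
    ).fetchall()


def unidentified(conn):
    """Stage 3: list the paths of files that have no PUID."""
    rows = conn.execute(
        "SELECT file_path FROM files WHERE puid = '' ORDER BY file_path"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
load_export(conn, SAMPLE_CSV)
print(format_frequency(conn))
print(unidentified(conn))
```

Keeping the parsed export in a single table means each new report is just another query over the same data, which is the design the description suggests.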
  
The tool provides archivist definitions for each section of its output, and these definitions are customizable. It can also report statistics about files that may require further triage, or that may not be appropriate for long-term preservation under institutional rules, using a blacklist. In addition, it checks file names and directory names for non-ASCII characters and for characteristics that may cause problems across file systems, based on known Microsoft rules: http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx
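
As a rough illustration of the file name checks described above, the following Python sketch flags non-ASCII characters and a few of the naming rules from the Microsoft guidance. The function name <code>name_issues</code> and the wording of the messages are assumptions; Demystify's own checks and output may differ.

```python
# Characters Windows does not allow in file names.
RESERVED_CHARS = set('<>:"/\\|?*')

# Device names Windows reserves, with or without an extension.
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def name_issues(name):
    """Return a list of potential problems for one file or directory name."""
    issues = []
    if not name.isascii():
        issues.append("non-ascii character")
    if any(ch in RESERVED_CHARS for ch in name):
        issues.append("reserved character")
    if any(ord(ch) < 32 for ch in name):
        issues.append("control character")
    # "aux.txt" is as problematic as "AUX", so compare the part before the first dot.
    if name.split(".")[0].upper() in RESERVED_NAMES:
        issues.append("reserved name")
    if name.endswith((" ", ".")):
        issues.append("trailing space or period")
    return issues


for candidate in ["report.pdf", "café.txt", "aux.txt", "what?.doc", "draft."]:
    print(candidate, name_issues(candidate))
```

Returning a list instead of a single flag lets one name carry several issues at once, which is convenient when the results feed a triage report.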
  
 
The engine can generate a list of file paths for files that may present digital preservation risks (Rogues), or for files that look okay on the surface, i.e. on the basis of identification alone (Heroes). These listings can be used with [http://manpages.ubuntu.com/manpages/trusty/man1/rsync.1.html rsync] to isolate the two sets from one another so that they are easier to work with.
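
The following Python sketch shows one way such listings could be produced and handed to rsync. The classification rule, the record layout, and the directory names are hypothetical and are not taken from Demystify. The only rsync feature relied on is its <code>--files-from</code> option, which reads the paths to transfer, relative to the source directory, from a file.

```python
import os
import tempfile

# Hypothetical identification results: (path relative to the collection root,
# PUID or "" when unidentified, whether a file name problem was flagged).
RECORDS = [
    ("docs/report.pdf", "fmt/18", False),
    ("docs/mystery.bin", "", False),
    ("img/photo.jpg", "fmt/43", False),
    ("img/what?.doc", "fmt/40", True),
]


def split_heroes_rogues(records):
    """Assumed rule: unidentified or flagged files are Rogues, the rest Heroes."""
    heroes, rogues = [], []
    for path, puid, flagged in records:
        target = rogues if (not puid or flagged) else heroes
        target.append(path)
    return heroes, rogues


def write_listing(paths, listing_path):
    """Write one relative path per line, the layout --files-from expects."""
    with open(listing_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(paths) + "\n")


def rsync_command(listing_path, source_dir, target_dir):
    """Build (but do not run) an rsync call that copies only the listed files."""
    return ["rsync", "-av", f"--files-from={listing_path}", source_dir, target_dir]


heroes, rogues = split_heroes_rogues(RECORDS)
with tempfile.TemporaryDirectory() as tmp:
    rogues_list = os.path.join(tmp, "rogues.txt")
    write_listing(rogues, rogues_list)
    # Printed rather than executed, so the sketch has no side effects.
    print(" ".join(rsync_command(rogues_list, "collection/", "rogues/")))
```

The same functions can be called with the Heroes list and a different target directory to copy the complementary set.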
 
=== Demystify Lite ===
[https://ross-spencer.github.io/demystify-lite/ Demystify Lite] provides a PyScript/WASM implementation of Demystify's features. It runs entirely in the browser, for users who have DROID or Siegfried reports that they would like to see analyzed.
 
== User Experiences ==
**'''[2016-05-23]''' [http://openpreservation.org/blog/2016/05/23/whats-in-a-namespace-the-marriage-of-droid-and-siegfried-analysis/ The integration of Siegfried output for consistent and repeatable reporting.]
**'''[2016-05-24]''' [http://openpreservation.org/blog/2016/05/24/while-were-on-the-subject-a-few-more-points-of-interest-about-the-siegfrieddroid-analysis-tool/ Creating a multi-lingual consistent, digital preservation dialect and exploring alternative methods of format identification using Siegfried's capabilities.]
**'''[2022-05-09]''' [https://journal.code4lib.org/articles/16351 Fractal in detail: What information is in a file format identification report?]
 
== Development Activity ==
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. -->
All development activity is visible on GitHub: http://github.com/ross-spencer/demystify/commits
 
=== Release Feed ===
The last 3 releases:
<rss max=3>https://github.com/exponential-decay/demystify/releases.atom</rss>
 
=== Activity Feed ===
The last 5 commits:
<rss max=5>https://github.com/exponential-decay/demystify/commits/main.atom</rss>

<!-- Add the Ohloh.com ID for the tool, if known. -->
