DROID Siegfried Sqlite Analysis Engine

From COPTR



Format Identification Analysis and Reporting
Homepage: https://github.com/exponential-decay/demystify
License: Open source (see URL above)
Platforms: sqlite + Python + text/html
Function: Metadata Extraction, Content Profiling, De-Duplication




Description

Now known as "Demystify", with thanks to Joshua Ng for suggesting the new name. Demystify is an engine for analysing DROID CSV export files, Siegfried YAML export files, and Siegfried 'DROID compatible' output. The tool has three purposes: to break the exports into their components and store them in a table in a SQLite database; to create additional columns that augment the output where useful; and to query the SQLite database, outputting the results in a readable form for analysis by researchers and archivists in the digital preservation departments of memory institutions.
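The three steps described above (store, augment, query) can be illustrated with a minimal sketch. This is not Demystify's own code or schema: the table name `files`, the added `identified` column, the four-column sample export, and the function names are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; a real DROID CSV export carries more columns.
SAMPLE_CSV = (
    "FILE_PATH,NAME,PUID,FORMAT_NAME\n"
    "/data/a.pdf,a.pdf,fmt/18,Acrobat PDF 1.4\n"
    "/data/b.pdf,b.pdf,fmt/18,Acrobat PDF 1.4\n"
    "/data/c.bin,c.bin,,\n"
)


def load_export(conn, csv_text):
    """Store each CSV row in one table, adding an augmenting column."""
    conn.execute(
        "CREATE TABLE files ("
        "file_path TEXT, name TEXT, puid TEXT, format_name TEXT, "
        "identified INTEGER)"  # assumed extra column: 1 if a PUID was found
    )
    for row in csv.DictReader(io.StringIO(csv_text)):
        conn.execute(
            "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
            (
                row["FILE_PATH"],
                row["NAME"],
                row["PUID"],
                row["FORMAT_NAME"],
                1 if row["PUID"] else 0,
            ),
        )
    conn.commit()


def format_counts(conn):
    """Return (puid, count) pairs for identified files, most frequent first."""
    return conn.execute(
        "SELECT puid, COUNT(*) FROM files WHERE identified = 1 "
        "GROUP BY puid ORDER BY COUNT(*) DESC, puid"
    ).fetchall()


def unidentified_paths(conn):
    """Return the paths of files that have no format identification."""
    rows = conn.execute(
        "SELECT file_path FROM files WHERE identified = 0 ORDER BY file_path"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_export(connection, SAMPLE_CSV)
    print(format_counts(connection))
    print(unidentified_paths(connection))
```

Keeping the export in a single table means each report section can be a plain SQL query, which is the general idea the description points to.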

The tool provides archivist definitions for each section of the output; these definitions are customizable. It can also output statistics about files that may require further triage, or that may not be appropriate for long-term preservation under institutional rules, which are expressed as a blacklist. In addition, it analyses file names and directory names for non-ASCII characters and for characteristics that may cause problems across file systems, based on known Microsoft rules: http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx
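As a rough sketch of the kind of file name check described above, the following flags non-ASCII characters and a few of the Microsoft naming rules (reserved characters, reserved device names, and a trailing space or period). The function name `name_issues` and the issue labels are hypothetical, and the rule set is a small subset rather than the tool's actual checks.

```python
# Characters that Windows does not allow in file names.
RESERVED_CHARS = set('<>:"/\\|?*')

# Device names that Windows reserves, with or without an extension.
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def name_issues(name):
    """Return a list of labels describing potential problems with a name."""
    issues = []
    if not name.isascii():
        issues.append("non-ascii")
    if any(ch in RESERVED_CHARS for ch in name):
        issues.append("reserved-character")
    if any(ord(ch) < 32 for ch in name):
        issues.append("control-character")
    # "NUL.txt" is as problematic as "NUL", so compare the part before
    # the first period.
    if name.split(".", 1)[0].upper() in RESERVED_NAMES:
        issues.append("reserved-name")
    if name.endswith((" ", ".")):
        issues.append("trailing-space-or-period")
    return issues


if __name__ == "__main__":
    for candidate in ["report.txt", "caf\u00e9.txt", "NUL.txt", "what?.doc"]:
        print(candidate, name_issues(candidate))
```

The check takes a single file or directory name rather than a full path, since path separators are themselves among the reserved characters.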

The engine can generate a list of file paths for files that may present digital preservation risks (Rogues), or for files that look okay on the surface, i.e. on the basis of identification alone (Heroes). These listings can be used in conjunction with rsync to isolate the two sets from one another, making them more flexible to work with.
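One way to picture the Heroes and Rogues split is the sketch below. The rules used here (no PUID, an extension mismatch, or a PUID on an institution's flagged list) are assumptions chosen for illustration, not the tool's actual criteria, and the record keys and function names are hypothetical.

```python
def split_heroes_rogues(records, flagged_puids=frozenset()):
    """Partition file records into (heroes, rogues) lists of file paths.

    Assumed rule: a file is a rogue if it has no PUID, if its extension
    does not match its identification, or if its PUID is flagged.
    """
    heroes, rogues = [], []
    for record in records:
        puid = record.get("puid")
        is_rogue = (
            not puid
            or bool(record.get("extension_mismatch"))
            or puid in flagged_puids
        )
        (rogues if is_rogue else heroes).append(record["file_path"])
    return heroes, rogues


def write_listing(paths, destination):
    """Write one path per line, a layout rsync can read with --files-from."""
    with open(destination, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(path + "\n")


if __name__ == "__main__":
    sample = [
        {"file_path": "docs/a.pdf", "puid": "fmt/18"},
        {"file_path": "docs/b.bin", "puid": ""},
        {"file_path": "docs/c.doc", "puid": "fmt/40", "extension_mismatch": True},
    ]
    heroes, rogues = split_heroes_rogues(sample)
    write_listing(heroes, "heroes.txt")
    write_listing(rogues, "rogues.txt")
```

A listing written this way could then be passed to rsync's `--files-from` option, for example `rsync -av --files-from=rogues.txt SOURCE_DIR/ ROGUES_DIR/`, where the paths in the listing are read relative to the source directory and both directory names are placeholders.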

User Experiences

Development Activity

All development activity is visible on GitHub: http://github.com/ross-spencer/droid-sqlite-analysis/commits

Release Feed

The last 3 releases:

2022-06-05 20:34:15
demystify v2.0.0rc1 (tag v2.0.0rc1)
by ross-spencer
2022-06-05 14:03:06
v1.0.0 release candidate for Python 2 and 3 compatibility (tag v1.0.0)
by ross-spencer
2022-01-16 22:20:19
v0.6.7-BETA (tag v0.6.7-BETA)
by ross-spencer

Activity Feed

The last 5 commits:

2022-06-05 20:27:30
Up version for rc1 (commit e54cccfceb8a87df68c937f81dc92392673bafeb)
by ross-spencer https://github.com/ross-spencer
2022-06-05 20:26:07
Change package name for PyPi (commit 02d3e83bc37d0e377eea067474198cf4e15526b5)
by ross-spencer https://github.com/ross-spencer
2022-06-05 20:14:19
Add packaging to demystify (commit aafc72481b7d7276eac043fe7501fa3267c37c80)
by ross-spencer https://github.com/ross-spencer
2022-06-01 08:02:03
demystify.py - fixes --txt help (commit a018862c6e1e402aebf8d6dc45b318dce5bb2189)
by kieranjol https://github.com/kieranjol
2022-04-10 13:32:03
Up sqlitefid version to v2.0.2 (commit c8f527585f74d3cdeee71c112e4773e531df67ad)
by ross-spencer https://github.com/ross-spencer