DROID Siegfried Sqlite Analysis Engine


Analysis and automatic generation of summary information from DROID output
License: Open source (see URL above)
Platforms: SQLite + Python + text/HTML
Function: Metadata Extraction, Content Profiling, De-Duplication


Engine for analysis of DROID CSV export files, Siegfried YAML export files, and Siegfried 'DROID compatible' output. The tool has three purposes: to break the exports into their components and store them in a table in a SQLite database; to create additional columns that augment the output where useful; and to query the SQLite database, outputting results in a readable form for analysis by researchers and archivists in digital preservation departments at memory institutions.
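The three purposes described above can be illustrated with a minimal Python sketch. It is not the engine's actual code: the sample CSV is a heavily trimmed, hypothetical DROID-style export, and the table name `filedata`, the derived `identified` column, and the function names are assumptions made for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical, trimmed DROID-style export. A real export has more columns.
SAMPLE_CSV = """NAME,FILE_PATH,SIZE,TYPE,EXT,PUID
report.pdf,/data/report.pdf,1024,File,pdf,fmt/18
notes.txt,/data/notes.txt,12,File,txt,x-fmt/111
mystery.bin,/data/mystery.bin,2048,File,bin,
"""


def load_export(csv_text):
    """Break the export into columns and store the rows in a SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE filedata ("
        "name TEXT, file_path TEXT, size INTEGER, type TEXT, "
        "ext TEXT, puid TEXT, identified INTEGER)"
    )
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Additional (assumed) column: 1 if the file has a PUID, else 0.
        identified = 1 if row["PUID"] else 0
        conn.execute(
            "INSERT INTO filedata VALUES (?, ?, ?, ?, ?, ?, ?)",
            (row["NAME"], row["FILE_PATH"], int(row["SIZE"]),
             row["TYPE"], row["EXT"], row["PUID"], identified),
        )
    conn.commit()
    return conn


def unidentified_files(conn):
    """Query the database for paths of files with no identification."""
    cur = conn.execute(
        "SELECT file_path FROM filedata WHERE identified = 0 ORDER BY file_path"
    )
    return [r[0] for r in cur]


def puid_counts(conn):
    """Count identified files per PUID."""
    cur = conn.execute(
        "SELECT puid, COUNT(*) FROM filedata "
        "WHERE identified = 1 GROUP BY puid ORDER BY puid"
    )
    return dict(cur.fetchall())
```

Calling `load_export(SAMPLE_CSV)` and passing the connection to `unidentified_files` or `puid_counts` gives the kind of summary figures a report could be built from.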

The tool provides archivist definitions for each of the output sections; these definitions are customisable. It also outputs statistics about files that may require further triage, or that may not be appropriate for long-term preservation under institutional rules, in the form of a blacklist. In addition, it analyses file names and directory names for non-ASCII characters and for characteristics that may cause problems across file systems, based on known Microsoft rules: http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx
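The kind of file name analysis described above might look like the following sketch. It checks a single file or directory name (not a full path) against a subset of the Microsoft naming rules: reserved characters, control characters, reserved device names, and a trailing space or period. The function name `name_issues` and the issue labels are hypothetical, and the tool's own checks may differ.

```python
# Characters that the Microsoft naming rules reserve in file names.
RESERVED_CHARS = set('<>:"/\\|?*')

# Reserved device names (compared case-insensitively, ignoring any extension).
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def name_issues(name):
    """Return a list of labels describing potential problems with one name."""
    issues = []
    if not name.isascii():
        issues.append("non-ascii")
    if any(ch in RESERVED_CHARS for ch in name):
        issues.append("reserved-character")
    if any(ord(ch) < 32 for ch in name):
        issues.append("control-character")
    # "NUL.txt" is as problematic as "NUL", so compare the part before the first dot.
    if name.split(".")[0].upper() in RESERVED_NAMES:
        issues.append("reserved-name")
    if name.endswith((" ", ".")):
        issues.append("trailing-space-or-period")
    return issues
```

For example, `name_issues("café.txt")` reports a non-ASCII name, while `name_issues("NUL.txt")` reports a reserved device name.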

The engine can generate a list of file paths for files that may present digital preservation risks (Rogues) or for files that look okay on the surface, i.e. via identification alone (Heroes). These listings can be used in conjunction with rsync to isolate the two sets from one another, making them more flexible to work with.
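One possible way to combine such a listing with rsync is its `--files-from` option, which reads the paths to transfer from a file and interprets them relative to the source directory. The sketch below only prepares the listing and builds the command; it does not run rsync. The function names and all example paths are hypothetical, and the engine's own listing format may need different handling.

```python
import os


def listing_lines(paths, source_root):
    """Convert absolute file paths into paths relative to source_root."""
    return [os.path.relpath(p, source_root) for p in paths]


def write_listing(paths, source_root, listing_path):
    """Write one relative path per line, as expected by --files-from."""
    with open(listing_path, "w", encoding="utf-8") as fh:
        for line in listing_lines(paths, source_root):
            fh.write(line + "\n")


def rsync_command(listing_path, source_root, destination):
    """Build (but do not execute) an rsync command for the listed files."""
    return [
        "rsync",
        "-a",  # archive mode: preserve timestamps, permissions, etc.
        f"--files-from={listing_path}",
        source_root,
        destination,
    ]
```

A caller could write the Rogues listing with `write_listing`, then pass the result of `rsync_command` to `subprocess.run` to copy only those files into a separate triage directory.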

User Experiences

Development Activity

All development activity is visible on GitHub: http://github.com/ross-spencer/droid-sqlite-analysis/commits

Release Feed

The release feed is published at https://github.com/exponential-decay/droid-siegfried-sqlite-analysis-engine/releases.atom

Activity Feed

The last 5 commits:

2020-01-25 03:56:13 Create FUNDING.yml (114b1d08ea0d4dcb2bcf25a37f6e22d60f07a271), by ross-spencer https://github.com/ross-spencer
2019-07-02 20:33:20 Add testing framework (6c146a84fc27c76729dce51e5a3f7f37fc6ff9dc), by ross-spencer https://github.com/ross-spencer
2019-07-02 19:37:30 Add code of conduct (b569231b3285ad18b45878c8324f58973d09699c), by ross-spencer https://github.com/ross-spencer
2019-05-08 10:50:22 Update README.md (31e082f31ce71f3c03bec5aebc8f7073388ad68c), by ross-spencer https://github.com/ross-spencer
2019-04-13 19:15:59 Ensure SF exports are quoted CSV during ID (94b98d6078620132a53a08fd87adacd84e129ba1), by ross-spencer https://github.com/ross-spencer