Difference between pages "Library (xklb)" and "Sumfolder1"

From COPTR
(remove input formats)
 
m (3 revisions imported)
 
Line 1: Line 1:
 
{{Infobox tool
 
{{Infobox tool
|purpose=Media indexing multi-tool
+
|purpose=sumfolder1 is a utility for use within the archival and digital preservation community to generate checksums for file system directories, and to generate an overall "collection" checksum for a given set of files. The utility may be used in support of de-duplication at a directory/folder level.
|sourcecode=https://github.com/chapmanjacobd/library/
+
|homepage=https://pypi.org/project/sumfolder1/
|license=BSD 3-Clause
+
|sourcecode=https://github.com/ross-spencer/sumfolder1
|formats_out=DB
+
|license=GPL-3.0
|function=File Management, Quality Assurance, Web Capture
+
|cost=Free as in kittens (or a donation of 1 million dollars to an offshore account if you have the funding)
 +
|platforms=Python 3
 +
|function=Appraisal, De-Duplication, Fixity
 
}}
 
}}
 
{{Infobox tool details}}
 
{{Infobox tool details}}
Line 10: Line 12:
 
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. -->
 
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. -->
  
Web Capture subcommands:
+
sumfolder1 uses a DROID CSV output to generate checksums for file system directories and to generate an overall "collection" checksum for a given set of files. It can be used for fixity and de-duplication at the folder level.
 
 
* web-add: index open web directories using ffprobe and exifTool to fetch additional metadata from remote file headers (without downloading the full file) for later automated selective downloading.
 
* tube-add: index video site metadata via yt-dlp
 
* gallery-add: index image gallery site metadata via gallery-dl
 
* extract-links: extract links from within a webpage, even if the page uses ShadowDOM, postMessage, and nested frames
 
* links-add: build updatable link-scraping databases for paginated content
 
 
 
Local file management subcommands:
 
 
 
* fs-add: index local files with ffprobe, exifTool, and textract
 
* cluster-sort: sort lines of text by similarity (a common use for this is to identify similar file paths)
 
* merge-folders: merge file trees (similar to [https://github.com/chapmanjacobd/journal/blob/main/programming/linux/misconceptions.md#mv-src-vs-mv-src rclone move] but it will print detailed information about overwrites and trumps (future overwrites from multiple source folders) before moving anything)
 
* relmv: move but preserve parent folder information
 
* process-image: convert large images as scaled AVIF files as an alternative to file deletion
 
* process-ffmpeg: convert large video/audio files to AV1/Opus as an alternative to file deletion
 
 
 
Quality Assurance subcommands:
 
 
 
* media-check: check video and audio files for corruption by decoding small sections or the whole file
 
  
 
== User Experiences ==
 
== User Experiences ==
 
<!-- Add hotlinks to user experiences with the tool (e.g. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->
 
<!-- Add hotlinks to user experiences with the tool (e.g. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->
  
- [https://old.reddit.com/r/opendirectories/comments/1adbv4b/i_made_a_little_cli_opendirectory_scanner_tool/ Introducing webadd to the /r/opendirectories community]
+
* '''[2023-01-16]''' [https://openpreservation.org/blogs/what-is-the-checksum-of-a-directory/ What is the checksum of a directory? Using DROID reports and the concepts behind Merkle Trees to generate directory and collection checksums.]
 +
 
 +
== Development Activity ==
 +
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. -->
 +
All development activity is visible on GitHub: http://github.com/ross-spencer/sumfolder1/commits
 +
 +
=== Release Feed ===
 +
Up to 3 of the most recent releases are listed below:
 +
<rss max=3>https://github.com/ross-spencer/sumfolder1/releases.atom</rss>
 +
 +
=== Activity Feed ===
 +
The last 5 commits are listed below:
 +
<rss max=5>https://github.com/ross-spencer/sumfolder1/commits/main.atom</rss>

Latest revision as of 12:05, 28 March 2025



sumfolder1 is a utility for use within the archival and digital preservation community to generate checksums for file system directories, and to generate an overall "collection" checksum for a given set of files. The utility may be used in support of de-duplication at a directory/folder level.
Homepage: https://pypi.org/project/sumfolder1/
Source Code: https://github.com/ross-spencer/sumfolder1
License: GPL-3.0
Cost: Free as in kittens (or a donation of 1 million dollars to an offshore account if you have the funding)
Platforms: Python 3
Function: Appraisal, De-Duplication, Fixity




Description

sumfolder1 uses a DROID CSV output to generate checksums for file system directories and to generate an overall "collection" checksum for a given set of files. It can be used for fixity and de-duplication at the folder level.
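
The idea of a directory checksum can be illustrated with a minimal sketch. This is not sumfolder1's actual implementation: the function name `directory_checksums`, the in-memory mapping of file paths to hex digests (standing in for the file hashes a DROID report would supply), the choice of SHA-1, and the rule of hashing the sorted child digests are all assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def directory_checksums(file_hashes: dict[str, str]) -> dict[str, str]:
    """Return a checksum for every directory implied by the file paths.

    file_hashes maps a relative file path to that file's hex digest
    (hypothetical input; a real report would need to be parsed first).
    """
    # Map each directory to its direct children (files or subdirectories).
    children: dict[str, set[str]] = {}
    for path in file_hashes:
        node = PurePosixPath(path)
        for parent in node.parents:
            children.setdefault(str(parent), set()).add(str(node))
            node = parent

    # Files already have digests; directory digests are filled in on demand.
    checksums = dict(file_hashes)

    def node_digest(node: str) -> str:
        if node not in checksums:
            # Merkle-style step: hash the sorted digests of the children.
            # Names are ignored, so two folders with identical contents
            # but different file names receive the same checksum.
            child_digests = sorted(node_digest(c) for c in children[node])
            joined = "".join(child_digests).encode("ascii")
            checksums[node] = hashlib.sha1(joined).hexdigest()
        return checksums[node]

    return {directory: node_digest(directory) for directory in children}


if __name__ == "__main__":
    example = {
        "a/x.txt": "aa",
        "a/y.txt": "bb",
        "b/copy1.txt": "bb",
        "b/copy2.txt": "aa",
        "c/z.txt": "cc",
    }
    for directory, checksum in sorted(directory_checksums(example).items()):
        print(directory, checksum)
```

In this sketch the entry for `.` plays the role of the overall "collection" checksum, and directories `a` and `b` receive the same value because they hold files with the same digests, which is the property that makes folder-level de-duplication possible.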

User Experiences

Development Activity

All development activity is visible on GitHub: http://github.com/ross-spencer/sumfolder1/commits

Release Feed

Up to 3 of the most recent releases are listed below:

2023-05-21 10:36:45: v0.0.2, by ross-spencer
2023-05-20 17:58:37: v0.0.1, by ross-spencer

Activity Feed

The last 5 commits are listed below:

2024-03-25 09:27:30: Update Black (commit 717ae649ac3a8fe4d9f1e44769fde03c05eb1c2f), by ross-spencer https://github.com/ross-spencer
2024-03-25 09:26:12: Add Breitwieser to previous work (commit e444a444a35a2788996818f000f5ec0cc1bb14c1), by ross-spencer https://github.com/ross-spencer
2024-03-25 09:20:53: Update CI and pre-commit configuration (commit 78496e33a67cc1592b5b915830fdfacbcc9aae7a), by ross-spencer https://github.com/ross-spencer
2023-05-21 10:25:49: Remove sort_order from folder objects (commit 8dfefd3417d545b13b99dd11eb8afd321f79ad43), by ross-spencer https://github.com/ross-spencer
2023-01-16 13:27:03: sumfolder1 v0.0.1 (commit 486dcda53a55a747bf4b255dddee73f2c57bc9af), by ross-spencer https://github.com/ross-spencer