NSRL (National Software Reference Library)
Revision as of 15:03, 28 July 2015
Description
The National Software Reference Library (NSRL) is a large collection of software packages from various sources. Technical metadata about millions of files (including MD5 and SHA-1 hashes) is published every three months as the NSRL Reference Data Set (RDS).
The NSRL RDS is primarily used in forensic investigations to eliminate non-unique, irrelevant files, but it may also help archivists and digital curators of unstructured personal archives to automatically separate significant files from operating system and application components.
Downloads (http://www.nsrl.nist.gov/Downloads.htm) are available in several variants. The full RDS release is a 6 GB .iso file. A "minimal" set of 42,060,540 file hashes is a 2.77 GB .zip file, but it lists only one example of every file in the NSRL and so cannot be used to determine all possible sources of a file. (All sizes and quantities refer to RDS version 2.49 as of June 2015.) For simply filtering out non-significant files from a personal digital archive, the minimal set should be sufficient.
Details of the record format are available (http://www.nsrl.nist.gov/Documents/Data-Formats-of-the-NSRL-Reference-Data-Set-16.pdf).
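The filtering use case described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an official NSRL tool: it assumes the hash list has been extracted to a comma-separated text file whose header row contains a "SHA-1" column with hex-encoded hashes (check the record format documentation for the actual layout of your RDS version), and the function names load_known_sha1, sha1_of_file and split_archive are hypothetical.

```python
import csv
import hashlib
from pathlib import Path


def load_known_sha1(rds_csv_path, column="SHA-1"):
    """Read one hash column from an RDS-style CSV file into a set.

    Assumption: the file has a header row naming the column, and the
    hashes are hex strings. Both should be checked against the RDS
    record format documentation.
    """
    known = set()
    with open(rds_csv_path, newline="", encoding="latin-1") as handle:
        for row in csv.DictReader(handle):
            known.add(row[column].strip().upper())
    return known


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 of a file in chunks, as upper-case hex."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_archive(root, known_sha1):
    """Split the files under root into (significant, known) lists.

    A file counts as "known" when its SHA-1 appears in known_sha1,
    meaning it matches a file recorded in the reference data.
    """
    significant, known = [], []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            if sha1_of_file(path) in known_sha1:
                known.append(path)
            else:
                significant.append(path)
    return significant, known
```

Calling split_archive("my_archive", load_known_sha1("NSRLFile.txt")) would return the candidate significant files first and the files matching the reference data second (both paths here are placeholders). Holding all of the roughly 42 million hashes of the minimal set in an in-memory set is likely to need several gigabytes of RAM, so for the complete list an on-disk index, for example a database table keyed by hash, may be the more practical choice.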