Difference between revisions of "NSRL (National Software Reference Library)"

(Initial article)
 
(wording)
Line 24: Line 24:
 
The National Software Reference Library (NSRL) is a large collection of software packages from various sources. Technical metadata about millions of files (including MD5 and SHA-1 hashes) is published every three months as the NSRL Reference Data Set (RDS).  
 
  
- The NSRL RDS is primarily used in forensic investigations to eliminate non-unique, irrelevant files but may also be useful for archives and digital curators of unstructured personal archives to automatically separate important files from operating system and application parts.
+ The NSRL RDS is primarily used in forensic investigations to eliminate non-unique, irrelevant files but may also be useful for archives and digital curators of unstructured personal archives to automatically separate significant files from operating system and application parts.
  
- [http://www.nsrl.nist.gov/Downloads.htm Downloads] are available in several variants. The full RDS release is 6 GB in size (.iso file). A "minimal" set of 42,060,540 file hashes is only 2.77 GB in size (.zip file) but only lists one example of every file in the NSRL and cannot be used to determine all possible sources. (All sizes and quantities refer to RDS version 2.49 as of June 2015.) For just filtering out non-individual files from a personal digital archive, the minimal set should be sufficient.
+ [http://www.nsrl.nist.gov/Downloads.htm Downloads] are available in several variants. The full RDS release is 6 GB in size (.iso file). A "minimal" set of 42,060,540 file hashes is only 2.77 GB in size (.zip file) but only lists one example of every file in the NSRL and cannot be used to determine all possible sources. (All sizes and quantities refer to RDS version 2.49 as of June 2015.) For just filtering out non-significant files from a personal digital archive, the minimal set should be sufficient.
  
 
[http://www.nsrl.nist.gov/Documents/Data-Formats-of-the-NSRL-Reference-Data-Set-16.pdf Details of the record format] are available.  
 

Revision as of 15:03, 28 July 2015


NSRL (National Software Reference Library)
The NSRL provides a large data set of metadata on computer files, which can be used to identify the files and their provenance.
Homepage: http://www.nsrl.nist.gov/
License: Public domain
Platforms: Platform independent


Description

The National Software Reference Library (NSRL) is a large collection of software packages from various sources. Technical metadata about millions of files (including MD5 and SHA-1 hashes) is published every three months as the NSRL Reference Data Set (RDS).

The NSRL RDS is primarily used in forensic investigations to eliminate non-unique, irrelevant files from consideration. It may also help archives and digital curators working with unstructured personal archives to automatically separate significant files from operating system and application files.

[http://www.nsrl.nist.gov/Downloads.htm Downloads] are available in several variants. The full RDS release is a 6 GB .iso file. A "minimal" set of 42,060,540 file hashes is a 2.77 GB .zip file, but it lists only one example of each file in the NSRL, so it cannot be used to determine all possible sources of a file. (All sizes and quantities refer to RDS version 2.49 as of June 2015.) For simply filtering out non-significant files from a personal digital archive, the minimal set should be sufficient.

[http://www.nsrl.nist.gov/Documents/Data-Formats-of-the-NSRL-Reference-Data-Set-16.pdf Details of the record format] are available.
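
The filtering use case described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not an official NSRL tool: it assumes the RDS hash list is a comma-separated text file with a header row and a hash column named "SHA-1" (check the record format document for the actual layout of your release), and the function names are hypothetical.

```python
import csv
import hashlib
from pathlib import Path


def load_known_hashes(rds_csv, column="SHA-1"):
    """Read one hash column from an RDS-style CSV file into a set.

    The column name "SHA-1" is an assumption about the file layout.
    """
    known = set()
    # The encoding is an assumption; latin-1 accepts any byte sequence.
    with open(rds_csv, newline="", encoding="latin-1") as handle:
        for row in csv.DictReader(handle):
            known.add(row[column].strip().upper())
    return known


def sha1_of(path, chunk_size=1 << 20):
    """Compute the SHA-1 of a file in chunks, as uppercase hex."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_by_nsrl(root, known):
    """Partition the files under root into (matched, unmatched) lists.

    "matched" files have a hash in the known set, so they are likely
    operating system or application files; "unmatched" files are the
    candidates for significant, individual content.
    """
    matched, unmatched = [], []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            (matched if sha1_of(path) in known else unmatched).append(path)
    return matched, unmatched
```

A call such as `split_by_nsrl("my_archive", load_known_hashes("NSRLFile.txt"))` would return the two lists; both names here are placeholders. Holding tens of millions of hash strings in a Python set may need several gigabytes of memory, so for the full or minimal set a database or an on-disk index is probably a better fit than this in-memory sketch.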