NSRL (National Software Reference Library)

Revision as of 14:52, 28 July 2015 by Kramski (talk | contribs) (Initial article)

NSRL (National Software Reference Library)
The NSRL provides a large data set of metadata on computer files which can be used to identify the files and their provenance
License: Public domain
Platforms: Platform independent


The National Software Reference Library (NSRL) is a large collection of software packages from various sources. Technical metadata about millions of files (including MD5 and SHA-1 hashes) is published every three months as the NSRL Reference Data Set (RDS).

The NSRL RDS is primarily used in forensic investigations to eliminate non-unique, irrelevant files from examination. It may also be useful for archives and digital curators working with unstructured personal archives, who can use it to automatically separate important files from operating system and application components.

Downloads are available in several variants. The full RDS release is a 6 GB .iso file. A "minimal" set of 42,060,540 file hashes is only 2.77 GB (.zip file), but it lists just one example of each file in the NSRL and therefore cannot be used to determine all possible sources of a file. (All sizes and quantities refer to RDS version 2.49 as of June 2015.) For simply filtering non-individual files out of a personal digital archive, the minimal set should be sufficient.
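The filtering use case can be illustrated with a minimal Python sketch. It assumes the hash list has been unpacked to a CSV text file with a header row that contains a column named "SHA-1" holding hexadecimal hashes; that layout is an assumption and should be checked against the record format documentation for the release in use. The function names (load_known_hashes, sha1_of_file, split_archive) are hypothetical and are not part of any NSRL tooling.

```python
import csv
import hashlib
from pathlib import Path


def load_known_hashes(rds_csv, column="SHA-1"):
    """Read one hash column from an RDS-style CSV file into a set.

    Assumption: the file has a header row naming the column.
    """
    known = set()
    with open(rds_csv, newline="", encoding="utf-8", errors="replace") as handle:
        for row in csv.DictReader(handle):
            value = row.get(column)
            if value:
                # Normalise case so comparisons do not depend on hex casing.
                known.add(value.strip().upper())
    return known


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 of a file in chunks, returned as uppercase hex."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_archive(root, known):
    """Split the files under root into (listed, unlisted) path lists."""
    listed, unlisted = [], []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            target = listed if sha1_of_file(path) in known else unlisted
            target.append(path)
    return listed, unlisted
```

A call such as split_archive("my_archive", load_known_hashes("NSRLFile.txt")) would return the files whose hashes appear in the list and the remaining, presumably individual, files; both arguments here are placeholder names. Holding tens of millions of hashes in an in-memory set may require several gigabytes of RAM, so a database or an on-disk index may be more practical for the complete minimal set.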

Details of the record format are available.