Create or Receive (Acquire)

Lifecycle stage definition: Functions that support the DCC Lifecycle Stage defined as "Create data including administrative, descriptive, structural and technical metadata. Preservation metadata may also be added at the time of creation. Receive data, in accordance with documented collecting policies, from data creators, other archives, repositories or data centres, and if required assign appropriate metadata."
Lifecycle order: 1

Functions within this lifecycle stage

Appraisal: Tools that enable the assessment of content in order to decide on its relevance or appropriateness for preservation.
Data capture and Deposit: Tools that enable the capture and deposit of data.
Disk Imaging: Tools that enable the capture, viewing or extraction of contents of a disk image (which is a computer file containing the contents and structure of a disk volume or an entire data storage device, such as a hard drive or floppy disk).
File Copy: Tools that support the copying of files from one storage location to another, typically with facilities to verify the completeness of the copy and enable resumption of copying after an interruption.
OCR: Tools that support the generation of text from bitmap images, otherwise known as Optical Character Recognition.
Transfer: Tools that support transfer of packaged digital resources from one organization to another.
Web Crawl: Tools that support the capture of data from the world wide web, typically by "crawling" links between resources.
Web Snapshot: Tools that support the capture of a static snapshot of a web page.
Workflow and Lab Notebook Management: Tools that support the capture and management of research data as well as the details of the research activities which generated them.
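
The File Copy function above mentions verifying the completeness of a copy. As a rough illustration of that idea only, the following Python sketch copies a file and then compares checksums of the source and the destination. The function names `sha256_of` and `verified_copy` are hypothetical and are not taken from any tool listed on this page. SHA-256 is an assumed choice of algorithm, and resumption after an interruption is not covered.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(source: Path, destination: Path) -> str:
    """Copy source to destination, then confirm both files hash identically.

    Hypothetical helper for illustration. Returns the shared digest,
    or raises OSError if the copied file differs from the source.
    """
    expected = sha256_of(source)
    # copy2 copies the content and, where possible, file metadata such as timestamps.
    shutil.copy2(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise OSError(f"Checksum mismatch copying {source} to {destination}")
    return actual
```

Recording the returned digest alongside the copied file would allow the same check to be repeated later as a fixity check.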

Tools for this lifecycle stage

7-Zip is a file archiver with a high compression ratio, and encryption and fixity check capabilities
AFF Open Source Computer Forensics Software [Disk Imaging]: Tools for the creation of disk images, used in conjunction with the AFF open and extensible file format to store disk images and associated metadata.
ANTS (Archives Network Transfer System) [Transfer]: ANTS runs on a Windows desktop and is designed to package digital records with contextual metadata and transfer them to an institutional archives.
ArchiFiltre [File Management]: Overview of folder trees with fine diagrams
Archive-It [Web Crawl]: Archive-It is the leading web archiving service for collecting and accessing cultural heritage on the web. It is a service provided by the Internet Archive.
File Copy
BagIt API for Perl
ArchiveFacebook [Web Crawl]: ArchiveFacebook is a Firefox extension which allows individuals to save and manage Facebook web content.
Artivity [Data capture and Deposit; Workflow and Lab Notebook Management]: A tool for capturing contextual data produced during the creative process of artists and designers while working on a computer.
BIL (BagIt Library) [Fixity; File Copy]: BagIt Library is a Java software library that supports the creation, manipulation and validation of bags.
BagIt Transfer Utilities [Fixity; File Copy]: BagIt transfer Utilities are a collection of tools developed for the purpose of validation and transfer of bags.
GUI application to facilitate the creation and verification of BagIt bags.
CDRDAO (CDR Disk At Once) [Disk Imaging]: Cdrdao records audio or data CD-Rs in disk-at-once (DAO) mode based on a textual description of the CD contents.
CRunch [Workflow and Lab Notebook Management; Managing Active Research Data]: cRunch provides an infrastructure for exploratory data analysis with the statistical programming language and environment R
CloneCD [Disk Imaging]: CloneCD is the perfect tool to make backup copies of your music and data CDs, regardless of copy protection.
ContextMiner [Web Crawl; Metadata Processing]: ContextMiner is a framework to collect, analyze, and present the contextual information along with the data.
Cp Unix command [File Copy]: cp copies files (or, optionally, directories). Part of GNU coreutils.
Cryptcat [File Copy]: Cryptcat is a lightweight version of netcat with integrated transport encryption capabilities.
Curate.Us [Web Crawl]: With a simple click of the mouse, you can create visually compelling clips and quotes of web content that are easily embedded in blog posts, email, forums, and websites.
DART (Digital Archivist's Resource Tool) [Storage; File Management]: Provides both a GUI and a command-line interface for packaging files and uploading them to remote repositories.
DArcMail [Data capture and Deposit]: Processing and access to email accounts
DIMAG [Metadata Extraction; Preservation System; File Format Migration; Web Crawl]: A software suite supporting archives with preservation of digital information for eternity
DIMAG IngestList [Metadata Extraction]: Accompanies ingest process from donor to archive, logs process steps.
Dc3dd for computer forensics [Disk Imaging]: dc3dd is a patched version of GNU dd with a number of features useful for computer forensics.
Dcfldd [File Management; File Copy]: dcfldd is an enhanced version of GNU dd with features useful for forensics and security.
Dd Unix command [File Copy]: This page gives information on using the dd Unix command.
DeepArc [Web Crawl; File Format Migration]: Intended for preserving web sites from the back-end, this is a database-to-XML curation tool.
DiscImageChef [Metadata Extraction; Disk Imaging]: Media dump software and disc image manager
DiskFormatID [Disk Imaging; File Format Identification]: Identify floppy disk formats from KryoFlux stream files
Disktype [Metadata Extraction; Disk Imaging]: Tool for detecting the content format of a disk or disk image. It knows about common file systems, partition tables, and boot codes.
Quality Assurance
Document digitization workflow software
DriveImage XML [Disk Imaging]: DriveImage XML is an easy to use and reliable program for imaging and backing up partitions and logical drives.
Duke Data Accessioner [Metadata Processing; Disk Imaging]: Data Accessioner provides a graphical user interface to aid in migrating data from physical media to a dedicated file server, documenting the process and using MD5 checksums to identify any errors introduced in transfer.
EPADD [Metadata Processing; Metadata Extraction; Content Profiling]: ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
Easy CD-DA Extractor [File Format Migration; Metadata Extraction; Disk Imaging]: Easy CD-DA Extractor is CD Ripper, Music Converter, Audio Converter, Metadata Editor, and CD/DVD burning software.
Exact Audio Copy [File Format Migration; Metadata Extraction; Disk Imaging]: Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only.
Exactly [Transfer]: Packs data in BagIt bags and transfers them to/from remote location via FTP, SFTP
FC5025 [Data capture and Deposit]: Device Side Data's FC5025 USB 5.25" floppy controller plugs into any computer's USB port and enables you to attach a 5.25" floppy drive.
Generates and compares MD5 values stored in an XML file.
Find It! Keep It! [Web Crawl]: Find It! Keep It! is a tool to save and organise web content.
GImageReader [OCR]: A customisable GUI for Tesseract
GNU Wget [Web Crawl]: Non-interactive network downloader
GetDriveInfo2 [Disk Imaging]: GetDriveInfo2 is a Win32 program that examines the optical and removable media drives currently mounted on a computer, and returns information about those devices (in the case of optical devices it also returns information about any media currently mounted in the device).
Quality Assurance
Workflow Management Tool
HTTrack [Web Crawl]: HTTrack is a website copying utility.
Heritrix [Web Crawl]: Heritrix is an open-source web crawler, allowing users to target websites they wish to include in a collection and to harvest an instance of each site.
Heritrix plug-in for rich media capture [Web Crawl]: The Rich Media Capture module (RMC), developed in the LiWA (Living Web Archives) project, is designed to enhance the capturing capabilities of the crawler, with regards to different multimedia content types.
IMAGE [Disk Imaging]: IMAGE is a DOS application capable of generating either highly compressed or "flat" images for forensic analysis.
IMacros [Quality Assurance; Web Snapshot]: iMacros makes it easy to test web-based applications.
IsoBuster [Disk Imaging]: Recover data from CD, DVD, BD, HDD, Flash drive, USB stick, media card, SD and SSD.
Kepler [Workflow and Lab Notebook Management; Managing Active Research Data]: Kepler is a scientific workflow modelling and management system that enables users, regardless of programming experience, to set up data analysis pipelines.
Khtml2png [Web Snapshot]: khtml2png is a command line program to create screenshots of webpages.
Kraken [OCR]: Open Source turn-key OCR system forked from ocropus
KryoFlux [Disk Imaging]: Floppy disk controller software that accompanies a KryoFlux drive
LabTrove [Workflow and Lab Notebook Management; Managing Active Research Data]: LabTrove is a blogging platform specifically designed for use in a research environment.
Limb Processing [Metadata Processing]: Software for processing, enhancing and converting cultural heritage into digital cultural heritage
Metaproducts [Web Crawl]: Metaproducts offers several commercial capture and off-line browsing tools.
MyExperiment [Workflow and Lab Notebook Management; Managing Active Research Data; Academic Social Networking]: myExperiment is an online social networking service aimed at scientific researchers; the site fosters collaboration by allowing members to share scientific workflows, experiment plans, and other digital objects.
NetarchiveSuite [Web Crawl]: NetarchiveSuite is a web archiving software package designed to plan, schedule and run web harvests of parts of the Internet.
NumaHOP [Quality Assurance]: Platform for digitization projects management
NutchWAX [Web Crawl]: NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full text search.
OSFMount [Disk Imaging]: disk image file mounting
Optical-media-check [Disk Imaging]: Collates information into a CSV from log files for a batch optical media rip
PackageHandler [File Management; Metadata Processing; Personal Archiving]: View, create, edit, and validate Swiss archival packages
PageVault [Web Crawl]: pageVault supports the archiving of all unique responses generated by a web server.
Pagelyzer [Metadata Extraction; Quality Assurance; Web Crawl]: Suite of tools for detecting changes in web pages and their rendering
Paranoia [Disk Imaging]: "Use your CDROM drive to read audio tracks.... and have it actually work right!"
Pearl Crescent Page Saver [Web Snapshot]: Pearl Crescent Page Saver is an extension for Mozilla Firefox that lets you capture images of web pages, including Flash content.
Perma.cc [Personal Archiving; Web Snapshot]: A tool that captures, stores, plays-back and provides a new URL for web citation. Built and maintained at the Harvard Law School Library.
PhotoRescue [Disk Imaging; File Recovery]: PhotoRescue is a picture and data recovery solution for digital film - sd cards, compact flash, memory sticks, microdrive, etc.
Power ISO [Disk Imaging]: PowerISO is a powerful CD/DVD image file processing tool, which allows you to open, extract, create, edit, compress, encrypt, split and convert ISO files, and mount these files with internal virtual drive.
QPxTool [Disk Imaging]: With QPxTool you can measure the quality of CDs and DVDs.
RARC (ARC replicator) [Web Crawl]: rARC is a distributed system that enables Internet users to provide storage space from their computers to replicate small parts of the archived data stored in the central repository of the Web archive.
Metadata Extraction
Review, Appraisal, and Triage of Mail (RATOM) is software to assist archives and other collecting organizations with email analysis, selection, and appraisal tasks
SafeBack [Disk Imaging]: SafeBack is used to create mirror-image (bit-stream) backup files of hard disks or to make a mirror-image copy of an entire hard disk drive or partition.
Screen-scraper [Data capture and Deposit; Web Snapshot]: screen-scraper is a tool for extracting data from websites.
SiteStory [Web Crawl]: SiteStory is a transactional web archive. It archives resources of a web server it is associated with.
Snagit [Web Snapshot]: Snagit is screen capture software to create interesting training documents, collaborative design work, IT bug reports, and more.
Spadix software [Web Crawl]: Spadix Software can download websites from a starting URL, search engine results or web dirs, and is able to follow external links.
Storytracker [Web Crawl; Web Snapshot]: Tools for tracking stories on news homepages
Tabula [Data capture and Deposit]: Extract tabular data from PDF files
Workflow and Lab Notebook Management
Managing Active Research Data
Taverna is a scientific workflow management system designed to assemble, run, document and share sequences of web services and scripts.
Teleport [Web Crawl]: Teleport is a web crawling tool that enables offline browsing
File Copy
File Management
Performs file copying, whilst also logging and verifying accuracy and completeness by using checksums
Tesseract-ocr [OCR]: Open source OCR engine, accepting uncompressed TIFF files as input
The DeDuplicator (Heritrix add-on module) [Web Crawl]: The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.
Tree [Metadata Processing; File Management]: Tree displays the directory structure of a path or of the disk in a drive graphically.
TreeSize [File Management]: Manage disk space and scan your hard disks.
TubeKit [Web Crawl]: TubeKit is a toolkit for creating YouTube crawlers.
Tufts Submission-Agreement Builder Tool [Planning; Data capture and Deposit]: SABT is a web-based tool that guides records creators and records managers through the process of creating submission agreements, both for single transfers and for standing submissions.
Securely encrypts large amounts of files
Virtual CloneDrive [Disk Imaging]: Virtual CloneDrive works and behaves just like a physical CD/DVD drive, but it exists only virtually.
WARCreate [Web Crawl; Web Snapshot; Personal Archiving; Data capture and Deposit]: Google Chrome browser extension for creating WARC files from web pages
WAS (Web Archiving Service) [Web Crawl]: The Web Archiving Service (WAS) is a Web-based curatorial tool that enables libraries and archivists to capture, curate, analyze, and preserve Web-based government and political information.
WAXToolbar [Web Crawl]: WAXToolbar is a Firefox extension to help users with common tasks encountered surfing a web archive.
WCT (Web Curator Tool) [Metadata Processing; Web Crawl]: Web Curator Tool (WCT) is a workflow management application for selective web archiving.
WarcManager [Web Crawl; File Management]: The WARC Manager is a web-based UI for managing and querying collections of web crawl data.
Warrick [Web Crawl; Web Snapshot]: Warrick is a free utility for reconstructing (or recovering) a website from web archives.
Wayback Machine [Access; Web Crawl]: The Wayback Machine is a powerful search and discovery tool for use with collections of Web site "snapshots" collected through Web harvesting, usually with Heritrix (ARC or WARC files).
Web Scraper Plus+ [Web Snapshot]: Web Scraper Plus+ takes data from the web and puts it into a spreadsheet or database.
WebCite [Web Snapshot; Persistent Identification]: WebCite is an on-demand web archiving service that takes snapshots of Internet-accessible digital objects at the behest of users, storing the data on their own servers and assigning unique identifiers to those instances of the material.
... further results