Create or Receive (Acquire)

From COPTR
Lifecycle stage definition: Functions that support the DCC Lifecycle Stage defined as "Create data including administrative, descriptive, structural and technical metadata. Preservation metadata may also be added at the time of creation. Receive data, in accordance with documented collecting policies, from data creators, other archives, repositories or data centres, and if required assign appropriate metadata."
Lifecycle order: 1

Functions within this lifecycle stage

Function | Definition
Appraisal | Tools that enable the assessment of content in order to decide on its relevance or appropriateness for preservation.
Data capture and Deposit | Tools that enable the capture and deposit of data.
Disk Imaging | Tools that enable the capture, viewing or extraction of contents of a disk image (which is a computer file containing the contents and structure of a disk volume or an entire data storage device, such as a hard drive or floppy disk).
File Copy | Tools that support the copying of files from one storage location to another, typically with facilities to verify the completeness of the copy and enable resumption of copying after an interruption.
OCR | Tools that support the generation of text from bitmap images, otherwise known as Optical Character Recognition.
Transfer | Tools that support transfer of packaged digital resources from one organization to another.
Web Capture | Tools that support the capture of data from the world wide web, by "crawling" links between resources or other approaches.
Workflow and Lab Notebook Management | Tools that support the capture and management of research data as well as the details of the research activities which generated them.
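
The File Copy definition above mentions verifying the completeness of a copy. As a rough illustration of that idea only, the Python sketch below copies a file and compares SHA-256 checksums of source and destination. The function names `sha256_of` and `copy_and_verify` are hypothetical and are not taken from any tool listed on this page; individual tools may use other algorithms (for example MD5) or other verification methods.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy source to destination, then check that both checksums match.

    Hypothetical helper for illustration; raises OSError on a mismatch.
    """
    expected = sha256_of(source)
    shutil.copy2(source, destination)  # copy2 also tries to keep timestamps
    actual = sha256_of(destination)
    if actual != expected:
        raise OSError(f"checksum mismatch copying {source} to {destination}")
    return actual
```

Calling `copy_and_verify(Path("a.bin"), Path("b.bin"))` returns the shared digest on success. Resumption after an interruption, which the definition also mentions, is not covered by this sketch.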

Tools for this lifecycle stage

Tool | Function | Purpose
7-Zip | Rendering, Transfer, Fixity | 7-Zip is a file archiver with a high compression ratio, and encryption and fixity check capabilities
AFF Open Source Computer Forensics Software | Disk Imaging | Tools for the creation of disk images, used in conjunction with the AFF open and extensible file format to store disk images and associated metadata.
ANTS (Archives Network Transfer System) | Transfer | ANTS runs on a Windows desktop and is designed to package digital records with contextual metadata and transfer them to an institutional archives.
Aaru Data Preservation Suite | Backup, Disk Imaging, Metadata Extraction | Media dump software and disc image manager
ArchiFiltre | File Management, Appraisal | Overview of folder trees with fine diagrams
Archive-It | Service, Web Capture | Archive-It is the leading web archiving service for collecting and accessing cultural heritage on the web. It is a service provided by the Internet Archive.
Archive::BagIt | Fixity, File Copy | BagIt API for Perl
ArchiveFacebook | Web Capture | ArchiveFacebook is a Firefox extension which allows individuals to save and manage Facebook web content.
Artivity | Data capture and Deposit, Workflow and Lab Notebook Management | A tool for capturing contextual data produced during the creative process of artists and designers while working on a computer.
Autopsy Digital Forensics | Appraisal, Content Profiling, De-Duplication, Disk Imaging, Forensic | Open source, free digital forensics tool
BIL (BagIt Library) | Fixity, File Copy | BagIt Library is a Java software library that supports the creation, manipulation and validation of bags.
BagIt Transfer Utilities | Fixity, File Copy | BagIt Transfer Utilities are a collection of tools developed for the purpose of validation and transfer of bags.
Bagger | Fixity, Transfer | GUI application to facilitate the creation and verification of BagIt bags.
Brozzler | Web Capture | From GitHub (https://github.com/internetarchive/brozzler): Brozzler is a distributed web crawler that uses a real browser (Chrome or Chromium) to fetch pages and embedded URLs and to extract links. Brozzler is designed to work in conjunction with warcprox for web archiving.
CDRDAO (CDR Disk At Once) | Disk Imaging | Cdrdao records audio or data CD-Rs in disk-at-once (DAO) mode based on a textual description of the CD contents.
CINCH | Web Capture | CINCH (Capture INgest and CHecksum Tool) facilitates batch downloading and ingest of Internet-accessible documents and/or images to a central repository.
CRunch | Workflow and Lab Notebook Management, Managing Active Research Data | cRunch provides an infrastructure for exploratory data analysis with the statistical programming language and environment R
CloneCD | Disk Imaging | CloneCD is the perfect tool to make backup copies of your music and data CDs, regardless of copy protection.
ContextMiner | Metadata Processing, Web Capture | ContextMiner is a framework to collect, analyze, and present the contextual information along with the data.
Cp Unix command | File Copy | cp copies files (or, optionally, directories). Part of GNU coreutils.
Cryptcat | File Copy | Cryptcat is a lightweight version of netcat with integrated transport encryption capabilities.
Curate.Us | Web Capture | With a simple click of the mouse, you can create visually compelling clips and quotes of web content that are easily embedded in blog posts, email, forums, and websites.
DART (Digital Archivist's Resource Tool) | Storage, File Management, Fixity, Transfer | Provides both a GUI and a command-line interface for packaging files and uploading them to remote repositories.
DArcMail | Data capture and Deposit, Access, Appraisal | Processing and access to email accounts
DIMAG IngestList | Metadata Extraction, Transfer | Accompanies ingest process from donor to archive, logs process steps.
Dc3dd for computer forensics | Disk Imaging, Forensic | dc3dd is a patched version of GNU dd with a number of features useful for computer forensics.
Dcfldd | File Management, Forensic, File Copy | dcfldd is an enhanced version of GNU dd with features useful for forensics and security.
Dd Unix command | File Copy | This page gives information on using the dd Unix command.
DeepArc | File Format Migration, Web Capture | Intended for preserving web sites from the back-end, this is a database-to-XML curation tool.
DiskFormatID | Disk Imaging, File Format Identification | Identify floppy disk formats from KryoFlux stream files
Disktype | Metadata Extraction, Disk Imaging | Tool for detecting the content format of a disk or disk image. It knows about common file systems, partition tables, and boot codes.
DocuTeam Packer | Metadata Processing, Appraisal, Data capture and Deposit, File Management, Fixity | Creates and edits SIPs
Docworks | OCR, Workflow, Quality Assurance | Document digitization workflow software
Double Commander | Fixity, File Copy, Batch Rename, File Management, De-Duplication | Open source file manager with two panels side by side
DriveImage XML | Disk Imaging | DriveImage XML is an easy to use and reliable program for imaging and backing up partitions and logical drives.
Duke Data Accessioner | File Copy, File Format Identification, Metadata Extraction, Transfer, Validation | Data Accessioner provides a graphical user interface to aid in migrating data from physical media to a dedicated file server, documenting the process and using MD5 checksums to identify any errors introduced in transfer.
EPADD | Metadata Processing, Metadata Extraction, Content Profiling, Access, Appraisal | ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
Easy CD-DA Extractor | File Format Migration, Metadata Extraction, Disk Imaging | Easy CD-DA Extractor is CD Ripper, Music Converter, Audio Converter, Metadata Editor, and CD/DVD burning software.
Exact Audio Copy | File Format Migration, Metadata Extraction, Disk Imaging | Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only.
Exactly | Transfer | Packs data in BagIt bags and transfers them to or from a remote location via FTP or SFTP
FC5025 | Data capture and Deposit | Device Side Data's FC5025 USB 5.25" floppy controller plugs into any computer's USB port and enables you to attach a 5.25" floppy drive.
FCIV | Fixity, Transfer | Generates and compares MD5 values stored in an XML file.
Find It! Keep It! | Web Capture | Find It! Keep It! is a tool to save and organise web content.
FreeCommander | File Management, De-Duplication, Fixity, File Copy | Split-screen file manager with desirable extras
GImageReader | OCR | A customisable GUI for Tesseract
GNU Wget | Web Capture | Non-interactive network downloader
GetDriveInfo2 | Disk Imaging | GetDriveInfo2 is a Win32 program that examines the optical and removable media drives currently mounted on a computer, and returns information about those devices (in the case of optical devices it also returns information about any media currently mounted in the device).
Goobi | Workflow, OCR, Planning, Quality Assurance | Workflow Management Tool
HTTrack | Web Capture | HTTrack is a website copying utility.
Heritrix | Web Capture | Heritrix is an open-source web crawler, allowing users to target websites they wish to include in a collection and to harvest an instance of each site.
Heritrix plug-in for rich media capture | Web Capture | The Rich Media Capture module (RMC), developed in the LiWA (Living Web Archives) project, is designed to enhance the capturing capabilities of the crawler, with regards to different multimedia content types.
IMAGE | Disk Imaging | IMAGE is a DOS application capable of generating either highly compressed or "flat" images for forensic analysis.
IMacros | Quality Assurance, Web Capture | iMacros makes it easy to test web-based applications.
IsoBuster | Disk Imaging | Recover data from CD, DVD, BD, HDD, Flash drive, USB stick, media card, SD and SSD.
Kepler | Workflow and Lab Notebook Management, Managing Active Research Data | Kepler is a scientific workflow modelling and management system that enables users, regardless of programming experience, to set up data analysis pipelines.
Khtml2png | Web Capture | khtml2png is a command line program to create screenshots of webpages.
Kraken | OCR | Open Source turn-key OCR system forked from ocropus
KryoFlux | Disk Imaging | Floppy disk controller software that accompanies a KryoFlux drive
LabTrove | Workflow and Lab Notebook Management, Managing Active Research Data | LabTrove is a blogging platform specifically designed for use in a research environment.
Limb Processing | Metadata Processing, OCR | Software for processing, enhancing and converting cultural heritage into digital cultural heritage
Metaproducts | Web Capture | Metaproducts offers several commercial capture and off-line browsing tools.
MyExperiment | Workflow and Lab Notebook Management, Managing Active Research Data, Academic Social Networking, Workflow | myExperiment is an online social networking service aimed at scientific researchers; the site fosters collaboration by allowing members to share scientific workflows, experiment plans, and other digital objects.
NetarchiveSuite | Web Capture | NetarchiveSuite is a web archiving software package designed to plan, schedule and run web harvests of parts of the Internet.
NumaHOP | Quality Assurance, OCR | Platform for digitization projects management
NutchWAX | Web Capture | NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full text search.
OSFMount | Disk Imaging, Forensic | Disk image file mounting
Optical-media-check | Disk Imaging | Collates information into a CSV from log files for a batch optical media rip
PackageHandler | File Management, Metadata Processing, Validation, Personal Archiving, Appraisal | View, create, edit, and validate Swiss archival packages
PageVault | Web Capture | pageVault supports the archiving of all unique responses generated by a web server.
Paranoia | Disk Imaging | "Use your CDROM drive to read audio tracks.... and have it actually work right!"
Pearl Crescent Page Saver | Web Capture | Pearl Crescent Page Saver is an extension for Mozilla Firefox that lets you capture images of web pages, including Flash content.
PhotoRescue | Disk Imaging, File Recovery | PhotoRescue is a picture and data recovery solution for digital film - sd cards, compact flash, memory sticks, microdrive, etc.
Power ISO | Disk Imaging | PowerISO is a powerful CD/DVD image file processing tool, which allows you to open, extract, create, edit, compress, encrypt, split and convert ISO files, and mount these files with internal virtual drive.
QPxTool | Disk Imaging | With QPxTool you can measure the quality of CDs and DVDs.
RARC (ARC replicator) | Web Capture | rARC is a distributed system that enables Internet users to provide storage space from their computers to replicate small parts of the archived data stored in the central repository of the Web archive.
RATOM | Appraisal, Discovery, Metadata Extraction | Review, Appraisal, and Triage of Mail (RATOM) is software to assist archives and other collecting organizations with email analysis, selection, and appraisal tasks
SafeBack | Disk Imaging | SafeBack is used to create mirror-image (bit-stream) backup files of hard disks or to make a mirror-image copy of an entire hard disk drive or partition.
SafeMover | Data capture and Deposit, Transfer, Fixity | Python tool to support the overtly "safe" copying of files from one location to another. Uses fixity, and OS file system metadata.
Screen-scraper | Data capture and Deposit, Web Capture | screen-scraper is a tool for extracting data from websites.
SiteStory | Web Capture | SiteStory is a transactional web archive. It archives resources of a web server it is associated with.
Snagit | Data capture and Deposit | Snagit is screen capture software to create interesting training documents, collaborative design work, IT bug reports, and more.
Spadix software | Web Capture | Spadix Software can download websites from a starting URL, search engine results or web dirs, and is able to follow external links.
Storytracker | Web Capture | Tools for tracking stories on news homepages
TOMES (Transforming Online Mail with Embedded Semantics) | File Format Migration, Content Profiling, Metadata Processing, Data capture and Deposit | A package of open source tools for handling the preservation of government email records
Tabula | Data capture and Deposit | Extract tabular data from PDF files
Taverna | Workflow, Workflow and Lab Notebook Management, Managing Active Research Data | Taverna is a scientific workflow management system designed to assemble, run, document and share sequences of web services and scripts.
Teleport | Web Capture | Teleport is a web crawling tool that enables offline browsing
TeraCopy | Transfer, File Copy, File Management | Performs file copying, whilst also logging and verifying accuracy and completeness by using checksums
Tesseract-ocr | OCR | Open source OCR engine, accepting uncompressed TIFF files as input
The DeDuplicator (Heritrix add-on module) | De-Duplication, Web Capture | The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.
Tree | Metadata Processing, File Management, Appraisal | Tree displays the directory structure of a path or of the disk in a drive graphically.
TreeSize | File Management, Appraisal | Manage disk space and scan your hard disks.
TubeKit | Web Capture | TubeKit is a toolkit for creating YouTube crawlers.
Tufts Submission-Agreement Builder Tool | Planning, Data capture and Deposit | SABT is a web-based tool that guides records creators and records managers through the process of creating submission agreements, both for single transfers and for standing submissions.
UKWA GSuite Add-On | Appraisal, Validation | GSuite functions for people working with web archives. The functions use the Memento API (specifically the TimeGate) to look up whether a given archive holds a given URL. It currently supports checks against the UK Web Archive, the UK Government Web Archive, and the Internet Archive.
VeraCrypt | Transfer, Decryption | Securely encrypts large numbers of files
Virtual CloneDrive | Disk Imaging | Virtual CloneDrive works and behaves just like a physical CD/DVD drive, but it exists only virtually.
WARCreate | Personal Archiving, Data capture and Deposit, Web Capture | Google Chrome browser extension for creating WARC files from web pages
WAS (Web Archiving Service) | Web Capture | The Web Archiving Service (WAS) is a Web-based curatorial tool that enables libraries and archivists to capture, curate, analyze, and preserve Web-based government and political information.
WAXToolbar | Web Capture | WAXToolbar is a Firefox extension to help users with common tasks encountered surfing a web archive.
... further results