Difference between revisions of "Workflow:LAC Pre-Ingest Workflow"
Revision as of 20:39, 19 July 2021
Workflow Description
- Ensure that content has a persistent identifier (e.g., Registration Control Number; barcode for containers where physical storage media are stored)
- Confirm security classification of records and infrastructure that is required for processing information at this level of classification
- Create a processing workspace on the Pre-ingest server (or classified infrastructure) that conforms to a standardized directory structure which segregates transferred digital objects from metadata created during processing.
- Ensure that all appropriate metadata and metadata generating templates are copied to the processing workspace (MET subfolder)
- Complete a physical carrier inventory documenting all physical storage media involved in the transfer. Media must be numbered sequentially (e.g. "01" or "001").
- Review the Digital Processing Checklist and create a record/row for the content being processed. The Digital Processing Checklist serves as a master tracking sheet that documents information related to each Pre-Ingest project, indicates each step taken in the Pre-Ingest workflow, and shows where a project sits in that workflow.
- Write protect physical media when possible
- Insert the media into a drive or attach it to the computer. The LAC network will automatically scan for viruses; if a virus is detected, do not copy the data, and document the virus in the Pre-ingest Report.
- Review file format/directory structure of physical media at a high level
- If interactive CD/DVD content or authored AV content is detected, do not perform pre-ingest for the physical carriers containing that content. Follow workflow for authored AV content. Document this in the Physical Carrier Inventory spreadsheet
- Create sub-folders for physical media on the Pre-ingest server. If a physical carrier is blank or unreadable, do not create a subfolder for it
- Copy the content to the Pre-ingest server using SafeCopy. If issues occur, troubleshoot via SafeCopy or other checksum software like Fsum or MD5summer
- For any content that is unreadable, send to Digital Preservation for extraction. Record this in Physical Carrier Inventory
- Use TreeSize Pro to create a digital object listing of all digital objects successfully copied to LAC infrastructure
- Run DROID on the entire registration and create a DROID export (*.CSV) for all content
- Start populating the Pre-Ingest Report. Record Pre-Ingest volume, number of files, and number of folders
- Weed transitory objects (temporary files, system files, configuration files, Thumbs.db, program files) that are not required to render the information of long-term value
- In the Pre-Ingest Report, record if viruses were detected as part of the virus scan
- If authored audio or video files are present, add a note that the content must be processed according to a separate AV workflow.
- Non-authored AV material (i.e. unstructured AV material that has been copied to the Pre-Ingest server) can be processed in the same manner as other digital records.
- Use TreeSize Pro to identify:
  - Duplicated records existing within the corpus. Note the existence of duplicate records to be addressed as part of archival processing.
  - Container files. Extract any container files identified using software like WinZip.
  - Email file formats. Identified email files may require a stand-alone workflow for processing.
  - Data file formats.
  - Database file formats.
  - Website file formats. Consult with the Web-Archives and Social Media Program.
  - Zero-byte files. Use TreeSize Pro to move zero-byte files into the disposition folder.
  - Long file paths.
  - Empty folders.
- Triage the DROID report.
- Create a DROID report.xlsx, which can be filed in the package's metadata subfolder
  - Color code formats in the spreadsheet according to categories such as the following, as needed:
    - Green = OK as-is
    - Red = format unknown or non-standard; archivist to determine how to access prior to preservation
    - Grey = ineligible for selection (file format preservation issue)
  - Refer to the Guidelines for Transferring Information Resources of Enduring Value for information on file formats already encountered during processing
    - These guidelines document LAC's formal and informal file format policy decisions based on feedback and discussion with LAC Subject Matter Experts
  - Further research may be required if unrecognized or unknown file formats are encountered
- Identify password protected and encrypted files
  - Run software to identify files that are encrypted or password protected
  - Create a report in CSV format that reflects the results of the analysis
- Record Pre-Ingest metrics
  - Record the volume, number of files, and number of folders after pre-ingest in the Pre-Ingest Report
- Follow up with archival processing staff
  - Email archival processing staff to report that pre-ingest is complete
  - Provide a hyperlink to the Pre-Ingest Report or indicate that it is in the MET subfolder of the processing workspace
  - Flag anything else that is noteworthy, especially file formats requiring research by archival staff
  - If feasible or required, schedule a meeting with archival processing staff to review the Pre-Ingest analysis and findings
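The figures recorded in the Pre-Ingest Report (volume, number of files, number of folders) and some of the TreeSize Pro checks (zero-byte files, empty folders, long file paths) could be cross-checked with a short script. The following is only a sketch using the Python standard library and is not part of the documented LAC workflow, which relies on TreeSize Pro. The function name `survey_workspace`, the `WorkspaceSurvey` structure, and the 260-character path threshold are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass, field

# Assumption: 260 characters, the traditional Windows path limit.
# The workflow itself does not state what counts as a "long" path.
MAX_PATH_LENGTH = 260


@dataclass
class WorkspaceSurvey:
    """Hypothetical container for the metrics gathered in one pass."""
    total_bytes: int = 0
    file_count: int = 0
    folder_count: int = 0
    zero_byte_files: list = field(default_factory=list)
    empty_folders: list = field(default_factory=list)
    long_paths: list = field(default_factory=list)


def survey_workspace(root: str, max_path_length: int = MAX_PATH_LENGTH) -> WorkspaceSurvey:
    """Walk `root` and collect volume, counts, and items worth reviewing."""
    survey = WorkspaceSurvey()
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath != root:
            # Count every folder below the root, but not the root itself.
            survey.folder_count += 1
            if not dirnames and not filenames:
                survey.empty_folders.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            survey.file_count += 1
            survey.total_bytes += size
            if size == 0:
                survey.zero_byte_files.append(path)
            # Path length is measured on the absolute path.
            if len(os.path.abspath(path)) > max_path_length:
                survey.long_paths.append(path)
    return survey
```

Running `survey_workspace` on the processing workspace once after copying and again after weeding would give the two sets of figures to compare. The script only reports; it does not move or delete anything.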
Purpose, Context and Content
Pre-ingest is the technological review of digital records transferred to LAC. It involves technical appraisal using multiple software tools to automate the process. The goal of pre-ingest is to aid in creating a SIP that conforms to LAC’s preservation policies and that LAC has a reasonable chance of preserving in its repository and making accessible for the long term.
There are two major tasks for pre-ingest:
- Weed any digital records that should not have been transferred in the first place (e.g. configuration, development, temporary, or software files).
- Identify file format or other issues that need addressing prior to or during appraisal/selection/description by archival staff.
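The second task, flagging file format issues, corresponds to the Green/Red/Grey triage of the DROID export described in the workflow. The sketch below shows one way such a first pass might be automated before the spreadsheet is colour coded by hand. It assumes the DROID CSV export has `FILE_PATH`, `TYPE`, and `PUID` columns; the two PUID sets are placeholder values invented for illustration and do not represent LAC policy. Treating an identified format that appears in neither set as Red (needs archivist review) is also an assumption.

```python
import csv

# Placeholder PUID sets for illustration only; not LAC's actual decisions.
OK_PUIDS = {"fmt/18", "fmt/412"}
INELIGIBLE_PUIDS = {"x-fmt/411"}


def triage_row(row, ok_puids=OK_PUIDS, ineligible_puids=INELIGIBLE_PUIDS):
    """Return 'Green', 'Red', or 'Grey' for one row of a DROID export."""
    puid = (row.get("PUID") or "").strip()
    if not puid:
        return "Red"    # DROID could not identify the format
    if puid in ineligible_puids:
        return "Grey"   # ineligible for selection
    if puid in ok_puids:
        return "Green"  # OK as-is
    return "Red"        # identified but not yet assessed (assumption)


def triage_droid_export(lines):
    """Triage every non-folder row; `lines` is an open CSV file or a list of lines."""
    results = []
    for row in csv.DictReader(lines):
        # Assumption: folder rows are marked "Folder" in the TYPE column.
        if (row.get("TYPE") or "").strip().lower() == "folder":
            continue
        results.append((row.get("FILE_PATH", ""), triage_row(row)))
    return results
```

With a real export, the file would be opened with `open(path, newline="", encoding="utf-8")` and passed to `triage_droid_export`. The result is a starting point for the archivist's review, not a substitute for it.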