Difference between revisions of "Workflow:LAC Pre-Ingest Workflow"
Prwheatley (talk | contribs)
Revision as of 08:35, 29 April 2021
Workflow Description
- Review Digital Transfer Assessment Form (DTAF) and provide strategic advice to acquiring staff on potential challenges for pre-ingest, archival processing, and long-term digital preservation
- Transfer the content
- Ensure that the content has a persistent identifier (e.g., Registration Control Number; barcode for containers where physical storage media are stored)
- Confirm the security classification of the records and the infrastructure required to process information at that level of classification
- Create a processing workspace on the Pre-ingest server (or classified infrastructure). Note: The Pre-ingest server is a server location where processing workspaces are organized by a standardized directory structure
- Ensure that all appropriate metadata and metadata generating templates are copied to the processing workspace (MET subfolder)
- Complete a physical carrier inventory documenting all physical storage media involved in the transfer. Media must be numbered sequentially (e.g. "01" or "001").
- Review and complete Digital Processing Checklist
- Write protect physical media when possible
- Insert the media into a drive or attach it to a computer. The LAC network will automatically scan for viruses; if a virus is detected, do not copy the data, and document the virus in the Pre-Ingest Report
- Review file format/directory structure of physical media at a high level
- If interactive CD/DVD content or authored AV content is detected, do not perform pre-ingest for the physical carriers containing that content. Follow workflow for authored AV content. Document this in the Physical Carrier Inventory spreadsheet
- Create sub-folders for physical media on the Pre-ingest server. If a physical carrier is blank or unreadable, do not create a subfolder for it
- Copy the content to the Pre-ingest server using SafeCopy. If issues occur, troubleshoot via SafeCopy or other checksum software like Fsum or MD5summer
- For any content that is unreadable, send to Digital Preservation for extraction. Record this in Physical Carrier Inventory
- Use TreeSize Pro to create a digital object listing of all digital objects successfully copied to LAC infrastructure
- Run DROID on the entire registration and create a DROID export (*.CSV) for all content
- Start populating the Pre-Ingest Report. Record pre-ingest volume, number of files, and number of folders
- Weed transitory objects (temporary files, system files, configuration files, Thumbs.db, program files) that are not required to render the information of long-term value
- In the Pre-Ingest Report, record if viruses were detected as part of the virus scan
- If authored audio or video files are present, add a note that the content must be processed according to a separate AV workflow.
- Non-authored AV material (i.e. unstructured AV material that has been copied to the Pre-ingest server) can be processed in the same manner as other digital records.
- Use TreeSize Pro file search to identify any duplicated records existing within the corpus
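The workflow above relies on TreeSize Pro for the object listing and the duplicate search. Purely as an illustration of what those two steps compute, the following minimal Python sketch counts files, folders, and total volume under a processing workspace and groups byte-identical files by SHA-256 checksum. The function names (`summarize_workspace`, `find_duplicates`) and the choice of SHA-256 are assumptions made for this example; they are not part of LAC's tooling or of TreeSize Pro.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def summarize_workspace(root: Path) -> dict:
    """Count files, folders, and total bytes below root (root itself is not counted)."""
    files = folders = volume = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)
        for name in filenames:
            files += 1
            volume += (Path(dirpath) / name).stat().st_size
    return {"files": files, "folders": folders, "bytes": volume}


def find_duplicates(root: Path) -> list:
    """Group files with identical content; only groups of two or more are returned."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            by_hash[sha256_of(path)].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

The figures from `summarize_workspace` correspond to the volume, file count, and folder count recorded in the Pre-Ingest Report. The groups from `find_duplicates` are only candidates for review; deciding which copy to keep remains a manual step.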
Purpose, Context and Content
Pre-ingest is the technological review of digital records transferred to LAC. It involves technical appraisal using multiple software tools to automate the process. The goal of pre-ingest is to aid in creating a SIP that conforms to LAC’s preservation policies and that LAC has a reasonable chance of preserving in its repository and making accessible for the long term.
There are two major tasks for pre-ingest:
- Weed any digital records that should not have been transferred in the first place (e.g. configuration, development, temporary, and software files).
- Identify file format or other issues that need addressing prior to or during appraisal/selection/description by archival staff.
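As a rough illustration of the first task, the sketch below lists files whose names match a small set of transitory patterns so that staff can review them before weeding. The pattern list and the function name `weeding_candidates` are hypothetical and chosen only for this example; an actual weeding list would follow LAC's own criteria. The function only reports candidates and never deletes anything.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Illustrative patterns only (assumption): temporary, system,
# configuration, and program files, matched against lowercased names.
TRANSITORY_PATTERNS = ("thumbs.db", "*.tmp", "~$*", "*.ini", "*.exe", "*.dll")


def weeding_candidates(root: Path, patterns=TRANSITORY_PATTERNS) -> list:
    """Return files under root whose lowercased name matches any pattern."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        name = path.name.lower()
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            hits.append(path)
    return hits
```

Matching on file names alone is a coarse filter; a format identification report such as the DROID export gives a more reliable basis for the second task.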