Difference between revisions of "Workflow:LAC Pre-Ingest Workflow"

Revision as of 08:35, 29 April 2021

LAC Pre-Ingest Workflow
Status: Draft
Tools:
Input: Born digital content transferred to Library and Archives Canada
Output: Pre-ingest report
Organisation: Library and Archives Canada

Workflow Description

  1. Review Digital Transfer Assessment Form (DTAF) and provide strategic advice to acquiring staff on potential challenges for pre-ingest, archival processing, and long-term digital preservation
  2. Transfer the content
  3. Ensure that the content has a persistent identifier (e.g., Registration Control Number; barcode for containers where physical storage media are stored)
  4. Confirm security classification of records and infrastructure that is required for processing information at this level of classification
  5. Create a processing workspace on the Pre-ingest server (or classified infrastructure). Note: The Pre-ingest server is a server location where processing workspaces are organized by a standardized directory structure
  6. Ensure that all appropriate metadata and metadata generating templates are copied to the processing workspace (MET subfolder)
  7. Complete a physical carrier inventory documenting all physical storage media involved in the transfer. Media must be numbered sequentially (e.g. "01" or "001").
  8. Review and complete Digital Processing Checklist
  9. Write protect physical media when possible
  10. Insert the media into a drive or attach it to the computer, and let the anti-virus software run (the LAC network will automatically scan for viruses; if a virus is detected, do not copy the data, and document the virus in the Pre-Ingest Report)
  11. Review file format/directory structure of physical media at a high level
  12. If interactive CD/DVD content or authored AV content is detected, do not perform pre-ingest for the physical carriers containing that content. Follow workflow for authored AV content. Document this in the Physical Carrier Inventory spreadsheet
  13. Create sub-folders for physical media on the Pre-ingest server. If a physical carrier is blank or unreadable, do not create a subfolder for it
  14. Copy the content to the Pre-ingest server using SafeCopy. If issues occur, troubleshoot via SafeCopy or other checksum software like Fsum or MD5summer
  15. For any content that is unreadable, send to Digital Preservation for extraction. Record this in Physical Carrier Inventory
  16. Use TreeSize Pro to create a digital object listing of all digital objects successfully copied to LAC infrastructure
  17. Run DROID on the entire registration and create a DROID export (*.CSV) for all content
  18. Start populating the Pre-Ingest Report. Record pre-ingest volume, number of files, and number of folders
  19. Weed transitory objects (temporary files, system files, configuration files, Thumbs.db, program files) that are not required to render the information of long-term value
  20. In the Pre-Ingest Report, record if viruses were detected as part of the virus scan
  21. If authored audio or video files are present, add a note that the content must be processed according to a separate AV workflow.
  22. Non-authored AV material (i.e. unstructured AV material that has been copied to the Pre-ingest server) can be processed in the same manner as other digital records.
  23. Use TreeSize Pro file search to identify any duplicated records existing within the corpus
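The copy-and-verify idea behind step 14 can be sketched in a few lines. This is only an illustration of checksum-verified copying, not a description of how SafeCopy works internally; the function names, the manifest layout, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_root: Path, dest_root: Path) -> list[dict]:
    """Copy every file under source_root to dest_root and compare checksums.

    Returns one manifest row per file; rows with verified=False need follow-up.
    Assumes dest_root is not located inside source_root.
    """
    manifest = []
    for source in sorted(p for p in source_root.rglob("*") if p.is_file()):
        relative = source.relative_to(source_root)
        target = dest_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
        source_hash = sha256_of(source)
        target_hash = sha256_of(target)
        manifest.append({
            "path": relative.as_posix(),
            "sha256": source_hash,
            "verified": source_hash == target_hash,
        })
    return manifest
```

Any row with `verified` set to `False` would correspond to the troubleshooting case in step 14, and a file that cannot be read at all would raise an error, which corresponds to the extraction case in step 15.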


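Duplicate identification, as in the final workflow step, is commonly done by comparing file contents rather than file names. The sketch below shows one such approach (group by size, then by hash); it is not how TreeSize Pro is implemented, and the function name and the use of SHA-256 are assumptions for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: Path) -> list[list[str]]:
    """Group files under root whose contents are byte-identical."""
    # First pass: bucket files by size, which is cheap to obtain.
    by_size = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only files that share a size with another file.
    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate
        for path in same_size:
            # read_bytes loads the whole file; fine for a sketch, not for huge files
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path.relative_to(root).as_posix())

    return sorted(sorted(group) for group in by_hash.values() if len(group) > 1)
```

Each inner list holds the relative paths of one set of identical files, which could then be noted in the Pre-Ingest Report for archival staff to review.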

Purpose, Context and Content

Pre-ingest is the technological review of digital records transferred to LAC. It involves technical appraisal using multiple software tools to automate the process. The goal of pre-ingest is to aid in creating a SIP that conforms to LAC’s preservation policies and that LAC has a reasonable chance of preserving in its repository and making accessible for the long term.

There are two major tasks for pre-ingest:

  1. Weed any digital records that should not have been transferred in the first place (e.g. configuration, development, temporary, and software files).
  2. Identify file format or other issues that need addressing prior to or during appraisal/selection/description by archival staff.
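The first task can be partly automated by flagging likely transitory files for review. The following is a minimal sketch: the name and extension rules are hypothetical examples chosen for illustration, not LAC's actual weeding criteria, and the output is a list of candidates for a person to confirm rather than files to delete automatically.

```python
from pathlib import PurePath

# Hypothetical rules for illustration only; not LAC's actual weeding criteria.
TRANSITORY_NAMES = {"thumbs.db", "desktop.ini", ".ds_store"}
TRANSITORY_SUFFIXES = {".tmp", ".bak", ".ini", ".cfg", ".exe", ".dll"}


def is_transitory(path: str) -> bool:
    """Return True if the file name matches one of the illustrative rules."""
    name = PurePath(path).name.lower()
    # "~$" is the prefix Microsoft Office gives its temporary lock files.
    if name in TRANSITORY_NAMES or name.startswith("~$"):
        return True
    return PurePath(name).suffix in TRANSITORY_SUFFIXES


def weeding_candidates(paths: list[str]) -> list[str]:
    """List the paths flagged for human review before anything is removed."""
    return [p for p in paths if is_transitory(p)]
```

The input paths could come from a file listing or a DROID export; matching on names alone is deliberately conservative, since a file format report gives a more reliable basis for the second task.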


Evaluation/Review

Further Information