Workflow:Appraise email and other large, unstructured text collections

 
<!-- Describe what your workflow is for - i.e. what it is designed to achieve, what the organisational context of the workflow is, and what content it is designed to work with -->
 
 
Large, unorganized, unstructured collections with scattered sensitive content present challenges that structured, categorized, foldered, and form-based electronic records do not. Email gives context for decisions, and the fact that it is frequently the subject of FOIA requests indicates its documentary and evidentiary value. Knowing what to keep is a challenge. The U.S. NARA Capstone approach helps, but non-records exist within Capstone accounts and records exist in non-Capstone accounts. When content contains PII, it is also a challenge to determine who can access it, when, and by what means. As a result, these records end up ignored and deleted, or preserved but neglected. Either way, a valuable resource is lost or overlooked, awareness of its value declines, and the risk of format obsolescence grows. We have solutions for acquiring (Capstone & PST/MBOX/EPADD, etc.), reformatting (Emailchemy, etc.), indexing (Acrobat, etc.), preserving (Preservica, etc.), arranging (EPADD/Preservica, etc.), describing (EPADD/Preservica, etc.), and accessing (EPADD/Preservica, etc.) email. The purpose of this workflow is to provide a way to code and classify large, unstructured datasets at the discrete item level. To do this, we use predictive coding: a statistical model is built from training and control groups, which are representative samples of the collection, and is then used to find the content you want across the whole collection and to make decisions with high confidence.
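To make the idea of predictive coding concrete, the following is a minimal sketch only, not the tool or model this workflow actually uses. It assumes a simple multinomial naive Bayes classifier written with the Python standard library, and the labels ("record", "non-record"), the sample messages, and every function name are hypothetical. A reviewer-coded training group fits the model, a separately coded control group checks how well it performs, and the model is then applied to unreviewed items.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a real pipeline would do far more)."""
    return text.lower().split()

def train(labeled):
    """Fit a multinomial naive Bayes model from (text, label) pairs."""
    word_counts = {}        # label -> Counter of token frequencies
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()
    for text, label in labeled:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the most probable label, using Laplace-smoothed log probabilities."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # skip words never seen during training
                score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, control):
    """Fraction of control-group items whose predicted label matches the reviewer's."""
    hits = sum(1 for text, label in control if predict(model, text) == label)
    return hits / len(control)

# Hypothetical training group: items a reviewer has already coded.
training_set = [
    ("budget approval for the new records system", "record"),
    ("decision memo on the retention policy", "record"),
    ("meeting minutes and policy decision attached", "record"),
    ("lunch on friday anyone", "non-record"),
    ("happy birthday see you at the party", "non-record"),
    ("parking lot closed on friday", "non-record"),
]
# Hypothetical control group: coded by a reviewer but withheld from training.
control_set = [
    ("final decision on the budget policy", "record"),
    ("party on friday at lunch", "non-record"),
]

model = train(training_set)
control_accuracy = accuracy(model, control_set)

# Apply the model to items nobody has reviewed yet.
unreviewed = ["memo on retention decision", "see you at lunch"]
predictions = [predict(model, msg) for msg in unreviewed]
print(control_accuracy, predictions)
```

In practice the training and control groups would be far larger random samples, and the control-group result would be used to decide whether the model is reliable enough to code the rest of the collection or needs more training examples.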
 
  
 
==Evaluation/Review==
 
