Scout is a preservation watch system being developed within the SCAPE project. It provides an ontological knowledge base that centralizes the information needed to detect preservation risks and opportunities. It uses plugins to allow easy integration of new sources of information, such as file format registries, tools for characterization, migration and quality assurance, policies, human knowledge and others. The knowledge base can be easily browsed, and triggers can be installed to automatically notify users of new risks and opportunities. Examples of such notifications include: content failing to conform to defined policies, a format becoming obsolete, or new tools becoming available that can render your content.
We should worry about the preservation of digital content, and take preservation action, because content is at risk. Risk relates to the potential of losing something of value, weighed against the potential of gaining something of value. In digital preservation, the risk is that of losing long-term and continuous access to (or usability of) content by its intended users, weighed against the cost (or profit) of maintaining such access. Because this access must be both long-term and continuous, there must be a correspondingly continuous, long-term process that detects when content becomes misaligned with the requirements of the intended users. This process is preservation watch.
In practice, preservation watch becomes even more complex, as long-term access and continuous access are often conflicting requirements. To tackle this, an institution would normally define a "preservation format", which tries to fulfill the long-term access requirement, and create "access" or "dissemination" copies, which are optimized for the user community.
Monitoring whether content is aligned with the long-term and continuous access requirements, i.e. whether the selected preservation and access formats are still adequate, is a big endeavor that quickly becomes infeasible with large-scale content. Institutions are normally able to tackle the usual suspects, like images and text documents, but are unable to process the long tail of file formats that almost all institutions have.
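To make the long-tail problem concrete, the check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Scout itself works: the names `ACCEPTED_FORMATS`, `format_profile` and `formats_at_risk` are hypothetical, the policy is a made-up example, and file extensions stand in for the output of a real characterization tool.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical policy: formats the institution has chosen for preservation and access.
ACCEPTED_FORMATS = {"tiff", "pdf"}


def format_profile(paths):
    """Count files per format, using the lowercased file extension as a crude format id.

    Assumption: a real system would rely on a characterization tool rather than extensions.
    """
    return Counter(
        PurePath(p).suffix.lower().lstrip(".") or "unknown" for p in paths
    )


def formats_at_risk(profile, accepted):
    """Return (format, count) pairs not covered by the policy, most common first."""
    return [(fmt, n) for fmt, n in profile.most_common() if fmt not in accepted]
```

Calling `formats_at_risk(format_profile(paths), ACCEPTED_FORMATS)` on a list of file paths returns the formats that fall outside the policy, which in a large collection is typically where the long tail shows up.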