Content definition: Tools that support the creation, processing or display of data in the ALTO XML format. ALTO is mainly used for OCR data.

See this category's main page, ALTO format, for more information about this content type.

ALTO (Analyzed Layout and Text Object) is an XML Schema that defines technical metadata for describing the layout and content of physical text resources, such as the pages of a book or a newspaper. It most commonly serves as an extension schema used with METS. However, ALTO instances can also exist as standalone documents used independently of METS.

Each ALTO file contains a styles section that lists the paragraph and font styles used in the document. The layout section describes what is on the page. Each page is divided into several regions (print space, left margin, right margin, top margin and bottom margin), and for each region the file lists all objects that were detected inside it.
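The structure described above can be illustrated with a short Python sketch that uses only the standard library. The embedded sample is a heavily trimmed, hypothetical ALTO fragment (not schema-valid, and the namespace shown is just one ALTO version), and `summarize_alto` and `local_name` are illustrative helper names, not part of any ALTO tool. The sketch lists the declared styles and, for each page region, the words found inside it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed ALTO fragment for illustration only.
# Real files carry more required attributes (positions, sizes, etc.).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Styles>
    <TextStyle ID="TXT_0" FONTFAMILY="Times" FONTSIZE="10"/>
    <ParagraphStyle ID="PAR_0" ALIGN="Left"/>
  </Styles>
  <Layout>
    <Page ID="P1" WIDTH="2000" HEIGHT="3000">
      <TopMargin ID="P1_TM"/>
      <PrintSpace ID="P1_PS">
        <TextBlock ID="TB1">
          <TextLine ID="TL1">
            <String ID="S1" CONTENT="Hello"/>
            <SP/>
            <String ID="S2" CONTENT="world"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def summarize_alto(xml_text):
    """Return (styles, regions) for an ALTO document given as a string.

    styles  -- list of (style element name, ID) pairs from the Styles section
    regions -- dict mapping each page region name to the words detected in it
    """
    root = ET.fromstring(xml_text)
    styles = []
    regions = {}
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "Styles":
            styles = [(local_name(s.tag), s.get("ID")) for s in elem]
        elif name == "Page":
            # Direct children of Page are the regions (margins, print space).
            for region in elem:
                words = [
                    e.get("CONTENT")
                    for e in region.iter()
                    if local_name(e.tag) == "String"
                ]
                regions[local_name(region.tag)] = words
    return styles, regions


if __name__ == "__main__":
    styles, regions = summarize_alto(SAMPLE_ALTO)
    print("Styles:", styles)
    for region_name, words in regions.items():
        print(region_name, "->", " ".join(words))
```

Stripping the namespace prefix is a convenience assumed here so that the same code can read files written against different ALTO versions; a stricter tool would validate against the specific schema version instead. Note that this sketch keys regions by name, so it only suits single-page files.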

More information is available on the official ALTO website and the official ALTO GitHub site.

