Accessible PDF: A Glossary of Terms
Thursday, June 21, 2012
As PDF/UA moves towards publication as the international standard for accessible PDF, it seems like a good idea to provide non-technical definitions of some of the key terms users, developers and policy-makers will encounter as they learn more about PDF accessibility.
This post is intended to make some of the more common technical language of PDF accessibility itself accessible. If you seek further clarification, by all means, leave a comment and I'll update this glossary!
A feature in Adobe's Acrobat Professional software, "Add Tags" creates a structure tree and attempts to create tags in a logical order and with correct semantics. Add Tags can work fairly well on very simple documents, but the potential for errors increases with document complexity. Although "Add Tags" is the name of a specific feature in Adobe Acrobat Professional, other PDF creation applications, most notably Microsoft Office and OpenOffice, are also capable of creating tagged PDF files.
Items that you'd not expect to hear if the document were read aloud. Obvious examples of artifacts include the borders of table cells, background images, cosmetic shading and the like. Less obvious examples are page headers and footers. While wholly distinct from real content, artifacts sometimes provide important information to users, printed page numbers being the clearest case. PDF/UA specifies that conforming software be capable of reporting artifacts to the user.
A suite of tools for the creation, verification, correction and management of accessible PDF files. CommonLook is a brand of NetCentric Technologies, the company that owns the website you are reading right now.
Adobe's Acrobat Professional includes a "Content Panel" feature that allows users to review (and change) the reading order of the PDF.
The sequence of content as consumed by people (i.e., words, sentences, paragraphs, etc.). Logical order is critical to accessibility. It is not to be confused with reading order, which should be read as "computer reading order".
The sequence in which software (as opposed to humans) reads the page's content in order to display it. The reading order may or may not match the logical order, which takes precedence over mere "reading order" for accessibility purposes.
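For readers who find code easier to follow than prose, here is a minimal Python sketch of the difference between reading order and logical order. It is an illustration only and does not read a real PDF: the `ContentItem` class, its two position fields and the sample page are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    text: str
    stream_position: int   # hypothetical: where software encounters the item on the page
    logical_position: int  # hypothetical: where a human reader expects the item


# A made-up page whose content was stored in a different order than people read it.
page = [
    ContentItem("Footnote text", 0, 3),
    ContentItem("Chapter title", 1, 0),
    ContentItem("Second paragraph", 2, 2),
    ContentItem("First paragraph", 3, 1),
]


def reading_order(items):
    """Return the text in the order software processes the page."""
    return [item.text for item in sorted(items, key=lambda i: i.stream_position)]


def logical_order(items):
    """Return the text in the order a person would consume it."""
    return [item.text for item in sorted(items, key=lambda i: i.logical_position)]


print(reading_order(page))
print(logical_order(page))
```

On this sample page the two orders differ, so a tool that followed only the reading order would announce the footnote before the chapter title.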
Real content is the text and images required to comprehend the document. Page content that isn't necessary for understanding the document should be marked as an artifact.
Semantics define the purpose of content in terms of its relationship to other content. Some simple examples: headings are often used to group paragraphs of text together for organizational purposes. Lists contain a certain number of list items, and can be ordered (with numbers) or unordered (with bullets).
The "branches" and "leaves" of the structure tree, tags are the fundamental mechanism by which PDF content is made accessible.
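To make the "branches and leaves" picture concrete, here is a small Python sketch of a structure tree. It is a simplified model under stated assumptions, not how any PDF library represents tags: the `Tag` class and the `walk` helper are hypothetical, while the tag names used (Document, H1, P, L, LI) are standard PDF structure types.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    kind: str                                      # structure type, e.g. "H1", "P", "L", "LI"
    text: str = ""                                 # content attached to this tag, if any
    children: list = field(default_factory=list)   # nested tags (the "branches")


def walk(tag, depth=0):
    """Yield (depth, kind, text) for every tag, visiting parents before children."""
    yield depth, tag.kind, tag.text
    for child in tag.children:
        yield from walk(child, depth + 1)


# A made-up document: one heading, one paragraph, and a two-item list.
document = Tag("Document", children=[
    Tag("H1", "Glossary"),
    Tag("P", "Some key terms."),
    Tag("L", children=[Tag("LI", "Artifact"), Tag("LI", "Tag")]),
])

# Print the tree with indentation showing the nesting.
for depth, kind, text in walk(document):
    print("  " * depth + kind + (": " + text if text else ""))
```

Walking the tree from top to bottom visits the content in the order the tags are arranged, which is one way to picture how tags carry the logical order of a document.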
Adobe Acrobat Professional includes the Touch Up Reading Order Tool (TUROT) to facilitate modification of both the reading order and the logical order of content on a PDF page. While useful in some circumstances (typically with simple documents), users must exercise caution: if they aren't very careful, the TUROT will readily generate z-order errors, among other problems.
As a function of the page's reading order, text and images on a PDF page may be "behind" or "in front of" other objects, such as when text is overlaid on a photograph. PDF editing tools, including Acrobat's Add Tags and Touch Up Reading Order Tool, can create "z-order errors" in which the reading order of content is altered, damaging the page's appearance.