Defining "Heading" in HTML and PDF
Thursday, April 19, 2012
The term "Heading" has substantially different meanings depending on the technology context. This post explains that the "heading" concept in HTML 4 differs critically from the "heading" structure concept in PDF, with profound implications for how WCAG 2.0 applies to the Portable Document Format.
I've been spending time recently doing intensive research on WCAG 2.0 and how it applies to technologies that are not inherently "of the web". As always, my focus is on PDF.
Last week I wrote an article discussing whether headings are for navigation or decoration. It concluded that WCAG 2.0's mission can only be served in PDF if headings are logical, which boils down to a simple statement: skipping heading levels (H2 to H5, for example) is prohibited.
My argument, in brief, was that just as WCAG 2.0 requires attention to valid structural markup for lists and regulates how authors can use color (for example), so too, in PDF at least, must authors provide "logical heading levels" in the tags tree if they provide "structure" elements at all. Neither PDF nor PDF/UA requires the author to provide structure elements if the content does not include structure.
Heading levels in PDF that do not reflect a meaningful and navigable document structure cannot comply with WCAG 2.0 Success Criterion 1.3.1 or 2.4.3, both Level "A".
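To make the "no skipped levels" rule concrete, here is a minimal Python sketch that scans a flat sequence of tag names and reports every heading that jumps more than one level deeper than the heading before it (H2 to H5, for example). The function name `find_skipped_levels` and the list-of-strings input are assumptions made for illustration; a real checker would read the tags tree from the PDF itself. The sketch also assumes that moving back up by any number of levels (H5 to H2) is acceptable, and it does not check which level the document starts at.

```python
import re

def find_skipped_levels(tags):
    """Return (index, previous_level, level) for each numbered heading
    that is more than one level deeper than the heading before it."""
    skips = []
    previous = None
    for index, tag in enumerate(tags):
        match = re.fullmatch(r"H([1-6])", tag)
        if match is None:
            continue  # not a numbered heading (P, L, Table, ...)
        level = int(match.group(1))
        # Assumption: only jumps to a deeper level can skip; going back up is fine.
        if previous is not None and level > previous + 1:
            skips.append((index, previous, level))
        previous = level
    return skips

# Hypothetical tag sequence, in reading order.
tags = ["H1", "P", "H2", "P", "H5", "P", "H2", "H3"]
print(find_skipped_levels(tags))  # [(4, 2, 5)]
```

The single reported tuple says that the heading at index 4 is an H5 directly following an H2, which is exactly the kind of jump described above.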
As I continued to look at this question I realized that I was actually dealing with a more basic point of confusion: the casual assumption that an "<H2>" tag in PDF actually means the same thing as "<H2>" in HTML.
Headings in HTML and PDF, a Comparison
|Specification||Definition of <H> and <H1> through <H6>|
Unambiguously "importance"; use as a section heading is discussed.
PDF (ISO 32000-1:2008, 14.8.4.3.2, Table 335; official ISO version, or free authorized download from adobe.com (PDF))
- Structure type "H": "A label for a subdivision of a document’s content. It should be the first child of the division that it heads."
- Structure types "H1-H6": "Headings with specific levels, for use in conforming writers that cannot hierarchically nest their sections and thus cannot determine the level of a heading from its level of nesting."
Unambiguously "section heading" (no mention of "importance").
HTML 4, HTML 5 and PDF each have a distinct definition of "heading", but HTML 5 and PDF are far more similar than HTML 4 and PDF.
By definition, in PDF, the "H" and "Hn" tags denote "subdivisions" of content (we'll not get into "strongly" and "weakly" structured for now). "Subdivision," of course, may or may not mean the same as "importance," depending on the specific document, but either way, the concept is distinct from the HTML 4 notion of "heading".
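The ISO 32000 wording quoted above implies that an "H" tag gets its level from where it sits in the nesting of the content, while an "Hn" tag carries its level in its name. The Python sketch below illustrates that reading on a toy structure tree. Everything in it is hypothetical: `StructElem` is a stand-in written for this example, not a type from any PDF library, and the sketch assumes that the level of an "H" equals the number of enclosing "Sect" elements.

```python
from dataclasses import dataclass, field

HN_TAGS = {"H1", "H2", "H3", "H4", "H5", "H6"}

@dataclass
class StructElem:
    """Hypothetical stand-in for a PDF structure element."""
    tag: str
    title: str = ""
    kids: list = field(default_factory=list)

def heading_levels(elem, depth=0):
    """Yield (level, title) for every heading at or below elem, in document order."""
    if elem.tag == "H":
        # Assumption: the level of "H" is the number of enclosing Sect elements.
        yield depth, elem.title
    elif elem.tag in HN_TAGS:
        # "Hn" states its level explicitly, regardless of nesting.
        yield int(elem.tag[1]), elem.title
    child_depth = depth + 1 if elem.tag == "Sect" else depth
    for kid in elem.kids:
        yield from heading_levels(kid, child_depth)

# A small tree with nested sections, each headed by an "H" as its first child.
doc = StructElem("Document", kids=[
    StructElem("Sect", kids=[
        StructElem("H", "Introduction"),
        StructElem("P"),
        StructElem("Sect", kids=[
            StructElem("H", "Background"),
            StructElem("P"),
        ]),
    ]),
    StructElem("Sect", kids=[StructElem("H", "Conclusion")]),
])

print(list(heading_levels(doc)))
# [(1, 'Introduction'), (2, 'Background'), (1, 'Conclusion')]
```

Either way, what the tag expresses is a position in the document's subdivisions, not a ranking of importance.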
What logically and invariably follows
- WCAG 2.0 Technique H42 cannot be applied to PDF. This technique is specific to HTML 4.
- Eric Meyer's blog post (referenced in Technique H42) is HTML 4 specific and likewise does not apply to PDF. What's more, this post simply represents his "gut feeling" (he says it himself), hardly grounds for formal advice on conforming with normative text!
- WCAG 2.0 PDF Technique 9 (PDF9) states "In some technologies, headings are designed to convey logical hierarchy". This is misleading in a PDF-specific technique. According to ISO 32000-1:2008, PDF headings ONLY convey hierarchy: that is their function.
WCAG 2.0 informative text cannot arbitrarily redefine terms in existing normative standards (ISO 32000). "Headings" in PDF are what ISO 32000 says they are, period.
How to Achieve Success Criteria 1.3.1 and 2.4.3 in PDF
In PDF, the standard structure types are the AT-operable structure, content relationship and navigation mechanism of choice.
In PDF, Headings (H, Hn) are "structure elements", not indicators of "importance", although the two may overlap.
WCAG 2.0 Success Criterion 1.3.1 (Level A) requires "information, structure and relationships" be "programmatically determinable".
WCAG 2.0 Success Criterion 2.4.3 (Level A) requires in the case of sequentially navigable content that "navigation sequences... [are] in an order that preserves meaning and operability."
We can deduce that logical heading levels are Level A (in the cases of SC 1.3.1 and SC 2.4.3) and Level AA (in the case of SC 2.4.6) requirements for PDF files that use headings.
In practice this requirement will tend to matter more for longer documents (more than 5 pages, perhaps?) than for shorter ones. For short documents, headings used without attention to the logical ordering of heading levels are usually sufficient to provide basic navigation to the AT user.
PDF implementers can refer to ISO 14289-1, clause 7.4, for normative text detailing requirements for meeting WCAG 2.0 Success Criteria 1.3.1, 2.4.3, 2.4.6 and 2.4.10 in PDF documents that include structure elements.
REMINDER OF MY USUAL DISCLAIMER: This post represents my personal view and is not intended to represent in any way the official views of the US Committee for PDF/UA or ISO 32000.