June 8, 2017

Taking a Look at PDF Document Tagging

A basic PDF file has no logical structure: it is just a set of text and graphic elements positioned on a sequence of pages. However, PDF does support a logical structure tree that can be used to describe the underlying content of the document. Because this structure tree is defined separately from the text and graphic elements in the file, building it adds to the effort required to create the final document. Full-featured document composition or post-composition tools are needed to automate the process.

Why Tag?

Tags allow a screen reader to access the content of a PDF file and either read it aloud or display it on a refreshable braille device. Blind and partially sighted individuals use text-to-speech software to access electronic documents. Keyboard commands take the end user to the various elements in the file, such as headings, paragraphs and images. For example, a properly tagged graphic enables a screen reader to say “Graphic, Sun in Blue Sky”, where “Sun in Blue Sky” is the alternate text for the picture and “Graphic” comes from the tag in the document itself. Tags are central to meeting the PDF/UA standard and WCAG 2.0 Level AA, which define accessible documents.
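To make the “Graphic, Sun in Blue Sky” example concrete, here is a minimal Python sketch of a tag tree and how a reader might walk it. It is an illustration only, not how any particular screen reader or PDF library works: the StructElem class, the SPOKEN_ROLE wording and the announce and read_aloud helpers are all hypothetical names. Only the tag names (Document, H1, P, Figure) are standard PDF structure types.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical mapping from structure tag to the role that is spoken.
# Real screen readers choose their own wording.
SPOKEN_ROLE = {"Document": "", "H1": "Heading level 1", "P": "", "Figure": "Graphic"}


@dataclass
class StructElem:
    tag: str                       # standard structure type, e.g. "Figure"
    text: str = ""                 # text content of the element, if any
    alt: Optional[str] = None      # alternate text, used for images
    children: List["StructElem"] = field(default_factory=list)


def announce(elem: StructElem) -> str:
    """Combine the spoken role with the alternate text (or the text itself)."""
    role = SPOKEN_ROLE.get(elem.tag, elem.tag)
    content = elem.alt if elem.alt is not None else elem.text
    return ", ".join(part for part in (role, content) if part)


def read_aloud(elem: StructElem) -> List[str]:
    """Walk the tree depth-first, which is the logical reading order."""
    lines = []
    spoken = announce(elem)
    if spoken:
        lines.append(spoken)
    for child in elem.children:
        lines.extend(read_aloud(child))
    return lines


doc = StructElem("Document", children=[
    StructElem("H1", text="Weather Report"),
    StructElem("P", text="Clear skies are expected."),
    StructElem("Figure", alt="Sun in Blue Sky"),
])

for line in read_aloud(doc):
    print(line)
```

The point of the sketch is that the role (“Graphic”) and the description (“Sun in Blue Sky”) come from two different places: the tag and the alternate text attached to it. An untagged image offers neither.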

Tagging Challenges

Tagging each paragraph and each image of a document by hand is a labor-intensive process. It is difficult enough when the content is fixed, but when the content is personalized and variable the effort increases dramatically, because every variation has to be tagged correctly. Some form of system support is required to aid in this effort. Much of the information required for tagging is already present on the page, but it needs to be identified, extracted and copied to the tag tree and alternative text fields. With careful setup, rules can be established to handle all document variations, enabling a more automated process and reducing bottlenecks.
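As a rough illustration of rule-based tagging, the Python sketch below applies an ordered list of rules to the elements found on a page. Everything in it is an assumption made for the example: the PageElement and Rule classes, the font-size threshold, the “Page ” prefix test and the ALT_TEXT lookup are hypothetical and do not describe any specific product. The rules are written once and then applied to every personalized page.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class PageElement:
    kind: str               # "text" or "image"
    content: str            # the text itself, or an image identifier
    font_size: float = 0.0
    y: float = 0.0          # distance from the top of the page


@dataclass
class Rule:
    matches: Callable[[PageElement], bool]
    tag: str                # "Artifact" marks content a screen reader skips
    alt_text: Optional[Callable[[PageElement], str]] = None


# Hypothetical lookup of alternate text by image identifier.
ALT_TEXT = {"logo.png": "Company logo"}

# Rules are checked in order and the first match wins.
RULES = [
    Rule(lambda e: e.kind == "text" and e.content.startswith("Page "), "Artifact"),
    Rule(lambda e: e.kind == "image", "Figure", lambda e: ALT_TEXT.get(e.content, "")),
    Rule(lambda e: e.kind == "text" and e.font_size >= 16, "H1"),
    Rule(lambda e: e.kind == "text", "P"),
]


def tag_page(elements: List[PageElement]) -> List[Tuple[str, str, Optional[str]]]:
    """Return (tag, content, alt text) triples in top-to-bottom reading order."""
    tagged = []
    for elem in sorted(elements, key=lambda e: e.y):
        for rule in RULES:
            if rule.matches(elem):
                alt = rule.alt_text(elem) if rule.alt_text else None
                tagged.append((rule.tag, elem.content, alt))
                break
    return tagged


# One personalized page; elements arrive in arbitrary order.
page = [
    PageElement("text", "Dear Ms. Lee, your balance is $42.10.", 11, 120),
    PageElement("text", "Account Statement", 18, 40),
    PageElement("image", "logo.png", y=10),
    PageElement("text", "Page 1 of 3", 9, 780),
]

for tag, content, alt in tag_page(page):
    print(tag, "|", content, "|", alt)
```

Because the rules key on properties such as position and font size rather than on the text itself, the same rule set tags the page correctly whatever name and balance appear in the letter. A simple top-to-bottom sort stands in for reading order here; multi-column layouts would need something more careful.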

CrawfordTech’s PRO Designer provides an easy-to-use interface for tagging documents for accessibility. It allows you to configure tagging for all document elements, define the proper reading order, identify artifacts, and visually proof the output. With PRO Designer, virtually any existing print stream can be converted to Accessible PDF, Accessible HTML5 or other accessible formats.

We’ll be offering a webinar later this month on accessible formats and document tagging, and will discuss an exciting new offering we’ll be announcing soon – so stay tuned!