Introducing Editoria11y

Creating highly accessible Web content is complicated, and tends to start with a lot of training.

Familiar practices must be discouraged:

  • Tables should not be used to create fake columns
  • Running text should not be centered or justified
  • Visual-only formatting (font weight and size) should not be relied on to provide meaning to users, who may be hearing the content or viewing it in “reader mode”

New practices must be encouraged:

  • Images need contextually meaningful alternative text
  • Pages need real structure
  • Links should have meaningful titles

Certainly some of our trainings exist because we are teaching genuinely new concepts, but at some point the question needs to be asked: how much of the need for new skills comes from the concepts being new, and how much comes from us rolling out new expectations without updating our tools? Just because something is new does not mean it cannot be intuitive.

Lessons from the history of spelling

Phase 1: Blame the author

clay tablet with hundreds of conical markings in two columns
Tablet 16 of 24 of a 2,100-year-old Sumerian-Akkadian lexicon.
© Marie-Lan Nguyen / Wikimedia Commons

From the dawn of recorded history until the early 1980s, we blamed the author when they misspelled a word. The responsibility was on the author to learn to spell well.

How did someone improve their spelling? This is going to sound familiar:

  1. They went to trainings, under tutors or in a classroom.
  2. They were handed lists, guidelines and dictionaries to study, dating back to ancient cuneiform lexicons etched in clay.

Phase 2: Blame the dictionary

But something changed in the 1980s: let’s call it “spelling phase 2.” The most popular word-processing programs implemented spell check.

And what happened to the blame game? Almost overnight, the blame shifted to the tool:

  • Don’t know how to spell a word? Take a guess and hope a match is found. If it isn’t, blame the dictionary.
  • Use a real word the dictionary flags because it doesn’t know it? Grumble and decide whether to ignore the underline or right-click and add the word to the dictionary.

Did spell check fix all of our spelling? Certainly not. But it helped. And it shifted much of the burden and responsibility from the author to the machine.

This was not the end, though…

Phase 3: Blame the authoring tool

Recent years have brought another step forward; computers have started spelling for us:

  • Predictive algorithms try to guess the next word, the next phrase, maybe even a whole text message response.
  • Auto-format tools try to convert asterisks, numbers and letters to lists, try to convert shortcodes to special characters and emoji, etc.
  • Dictation tools try to recognize concepts to pick between homonyms.

And who gets blamed when this falls apart? The tool!

Social media aggregators love the embarrassing “That’s not what I meant!” screenshot.

Editoria11y is a step towards Phase 2

In our trainings, we talk a lot about testing, and introduce several quite good testing tools. None of these can catch everything: requirements like “avoid explaining concepts through chromatic or spatial references” still need to be checked by a human, and tools typically catch just under half of the actual issues a site might have. But the tools do a very good job at catching many common mistakes in editorial content.

Three problems, though:

  1. Most are manual tools. The content creator needs to have been taught to use them, needs to remember they exist when writing their content, and needs to remember to run them when they are done. Those three points of failure conspire to leave much (most?) content unchecked.
  2. They flag too much. They do not only flag mistakes the author just made; they also flag color issues the designer made, technical issues the developer made, structural elements shown for informational purposes, and so on. This raises the bar for trainings: authors not only need to understand the tool, they need to be taught to ignore a list of false positives or very real issues that are not their responsibility to fix today, while writing this particular article.
  3. The automated tools tend to have a crawl-the-site, visit-the-dashboard paradigm. Again: this assumes an author has been trained to visit the dashboard, and will remember to do so.

So Princeton set out to make something new: a website-integrated, automatic checker that would only look at common editorial mistakes.

Automatic is…tough.

  1. Automatic checkers need to be subtle: if they are too assertive, they interrupt and frustrate the author. If they are too polite, they can become easily-ignored background noise.
  2. Automatic checkers need to be highly performant: if the page shudders or hesitates as they run, authors will resent their presence and disable them (one way to schedule scans is sketched after this list).
  3. Automatic checkers need novice-friendly tips. For many authors, this will be the first time they hear about the tagging, structure and language that will make their content more accessible. Alerts need to be in simple, non-technical language, with practical advice for improving the content.
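
As a rough illustration of that second point, and not a description of Editoria11y’s internals, a checker could scan only when the browser is idle and collapse a burst of edits into a single re-scan:

```typescript
// Sketch only: a generic "stay out of the way" scheduling pattern, not
// Editoria11y's actual implementation. Scans run when the browser is idle,
// and a burst of edits triggers just one re-scan.

let scanQueued = false;

function queueScan(scan: () => void): void {
  if (scanQueued) return; // a scan is already waiting; don't pile up more
  scanQueued = true;

  const schedule =
    "requestIdleCallback" in window
      ? (cb: () => void) => window.requestIdleCallback(() => cb())
      : (cb: () => void) => window.setTimeout(cb, 500);

  schedule(() => {
    scanQueued = false;
    scan();
  });
}

function runAllChecks(): void {
  // ...run the individual content tests and update the toggle count here.
}

// Initial scan, then re-scan whenever the editable content changes.
queueScan(runAllChecks);
new MutationObserver(() => queueScan(runAllChecks)).observe(document.body, {
  childList: true,
  subtree: true,
  characterData: true,
});
```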

So we started with the closest tool we could find to these goals: Sa11y, out of Ryerson University. We took its test architecture and user-friendly information-panel-with-tooltips approach, and spent half a year adapting the tool so that tests always ran automatically, optimizing its performance, tweaking the tooltips and creating a long list of configuration options a developer could use to quickly adapt it to any platform.
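
To give a flavor of what such configuration can cover, here is a sketch in TypeScript; the option names are invented for illustration and are not Editoria11y’s actual API:

```typescript
// Hypothetical configuration sketch. These option names are made up for
// illustration and are not the library's real API.
const checkerConfig = {
  // Scan only the regions content editors actually control.
  checkRoots: "main, #content",
  // Skip theme chrome and third-party widgets the author cannot fix.
  ignoreElements: ".site-footer, .twitter-timeline",
  // Only run for people who can actually edit this page.
  runOnlyIf: () => document.body.classList.contains("user-logged-in"),
};
```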

Here is how it looks today:

an open tooltip on a link that has been wrapped around an image with no alt text, noting that this link will be silent for screen readers
Editoria11y panel flagging a link with no accessible text.

When an author is logged into their site, Editoria11y’s toggle indicates the number and type of issues (pass, manual checks needed, likely issues found) on each page. They can click to reveal inline highlighting and tips. If the error is new, because they are looking at a page they just edited, the panel pops open automatically.
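
One plausible way to implement that “only pop open when something is new” behavior, offered as an assumption rather than a description of Editoria11y’s actual code, is to remember each page’s last issue count and auto-open the panel only when the count grows:

```typescript
// Assumed sketch: remember how many issues this page had last time, and
// auto-open the panel only when new ones appear. Not necessarily how
// Editoria11y itself implements this behavior.
function maybeAutoOpenPanel(currentIssueCount: number, openPanel: () => void): void {
  const key = `a11y-issue-count:${window.location.pathname}`;
  const previousCount = Number(window.localStorage.getItem(key) ?? "0");

  if (currentIssueCount > previousCount) {
    openPanel(); // new problems since the author's last visit
  }

  window.localStorage.setItem(key, String(currentIssueCount));
}
```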

As of March 2021, this means more than 400 Princeton websites are automatically checking for the following (two of these checks are sketched in code after the list):

  • A good document outline
    • Skipped heading levels
    • Empty headings
    • Very long headings
    • Suspiciously short blockquotes that may actually be headings.
  • Text alternatives
    • Images with no alt text
    • Images with a filename as alt text
    • Images with very long alt text
    • Alt text that contains redundant text like “image of” or “photo of”
    • Images in links with alt text that appears to be describing the image instead of the link destination
    • Embedded visualizations that usually require a text alternative
  • Meaningful links
    • Links with no text
    • Links titled with a filename
    • Links titled with only generic text: “click here,” “learn more,” “download,” etc.
    • Links that open in a new window without an external link icon
  • General quality assurance
    • Making sure list formatting is used rather than asterisks, numbers and letters
    • LARGE QUANTITIES OF CAPS LOCK TEXT
    • Tables without headers and tables with document headers (“Header 3”) instead of table headers (<th>)
    • Links to PDFs and other documents, reminding the user to test the download for accessibility or provide an alternate, accessible format
    • Video embeds, reminding the user to add closed captions
    • Audio embeds, reminding the user to provide a transcript
    • Social media embeds, reminding the user to provide alt elements
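
As a rough sketch of what two of these checks might look like in TypeScript (simplified assumptions, not Editoria11y’s real tests or API):

```typescript
// Illustrative sketch only: the rules and names here are simplified
// assumptions, not Editoria11y's actual test logic.

interface Issue {
  element: Element;
  message: string;
}

// Flag <img> elements with no alt attribute at all. (An empty alt="" is a
// deliberate "decorative image" signal and is not flagged here.)
function findImagesMissingAlt(root: ParentNode): Issue[] {
  return Array.from(root.querySelectorAll("img:not([alt])")).map((img) => ({
    element: img,
    message: "Image has no alternative text.",
  }));
}

// Flag links whose visible text is nothing but a generic phrase.
const GENERIC_LINK_TEXT = ["click here", "learn more", "read more", "download", "here"];

function findGenericLinks(root: ParentNode): Issue[] {
  const issues: Issue[] = [];
  for (const link of Array.from(root.querySelectorAll("a[href]"))) {
    const text = (link.textContent ?? "").trim().toLowerCase().replace(/[.!?]+$/, "");
    if (GENERIC_LINK_TEXT.includes(text)) {
      issues.push({
        element: link,
        message: `Link text "${text}" does not describe the destination.`,
      });
    }
  }
  return issues;
}

// A full page scan is just the union of every check's results.
const issues = [...findImagesMissingAlt(document), ...findGenericLinks(document)];
console.log(`${issues.length} potential issues found.`);
```

In the real tool, each flagged element also gets an inline highlight and a plain-language tooltip; the sketch above covers only the detection half.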

We have released the JavaScript library to the community, created a turnkey Drupal integration, and started work on a turnkey WordPress integration.

Towards Phase 3: Accessible by Default

In many ways, Editoria11y is a stopgap solution. Most of the mistakes it catches were made because we placed new expectations on content authors (“tag your content with good structure”), but gave them tools that encouraged the opposite.

The path forward is to make accessible content creation easier, and as automatic as possible. What that looks like is not yet obvious, but probably includes things like:

  • Rethinking what options are available. For example: if we keep telling users not to justify text on the Web, is it time to retire that “justify text” button?
  • Rethinking the default options. For example: the default table in most content management systems does not have headers. The onus is on the user to know to click “properties” and add accessibility features; maybe it is time to reverse that?
  • Pondering predictive formatting and “ambient hinting.” If a user makes a whole line bold, can we suggest they make it a heading? If a user goes to select a heading, can we guess what heading levels they might want rather than giving them six options? Can we provide just-in-time training for difficult concepts?

Spell check as a roadmap

Phase | Spelling | Content Accessibility
1 | Dictionaries and manual checks | Guidelines and manual tools
2 | Automatic spell check | Editoria11y and future automatic accessibility checkers
3 | Predictive spelling | Predictive structuring