We thought this would meet an unmet need. Existing options all required training and diligent use to be useful, and experience had shown us that trying to train and monitor thousands of content authors across a large organization was not an efficient way to tackle the problem. For a program to succeed, we were confident it needed to have a high level of automation, providing polite “just in time training” to content authors when it detected common mistakes.
At Princeton, it largely reversed the direction of my communications. On platforms with it installed, we went from sending repetitive emails to content authors requesting alt text, link or heading improvements, to receiving questions from content authors about how to improve their writing to avoid it getting flagged so much.
Externally, the reaction was elation:
We took a “tease and link” approach with the v1 tips: a short explanation of the issue, and a link for more details.
This was an improvement on tools written for developers: the tips were short and clear. But the decision of where to link was always an issue. Several government and higher education users felt the need to modify the tips to remove links to Princeton’s documentation, which made for extra work for them. So we rewrote the default tips to be self-contained, with inline examples.
So we converted yellow warnings to “manual checks” to clarify their meaning, and added buttons to let authors dismiss alerts. “Ignored” alerts are only hidden for the person who dismisses them; “marked OK” alerts are hidden for all site editors. Site administrators can choose who has permission to do each.
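The difference between the two dismissal types can be sketched in a few lines. This is only an illustration of the rule described above, not Editoria11y's actual code: the `Dismissal` shape and the `visibleAlerts` function are hypothetical names invented for this example.

```typescript
// Hypothetical data model for illustration only; not Editoria11y's real API.
type DismissalKind = "ignored" | "markedOk";

interface Dismissal {
  alertId: string;
  kind: DismissalKind;
  dismissedBy: string; // the editor who clicked the button
}

// Returns the alert ids that a given viewer should still see.
function visibleAlerts(
  alertIds: string[],
  dismissals: Dismissal[],
  viewer: string
): string[] {
  return alertIds.filter((id) => {
    const hidden = dismissals.some(
      (d) =>
        d.alertId === id &&
        // "Marked OK" hides the alert for every site editor;
        // "ignored" hides it only for the person who dismissed it.
        (d.kind === "markedOk" || d.dismissedBy === viewer)
    );
    return !hidden;
  });
}
```

With one alert ignored by a single editor and another marked OK, that editor sees neither, while a colleague still sees the ignored one.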
Creating a dashboard
A common lament by site owners was that the tool was doing a great job of finding issues all over the site, but there was no way to view a site-wide report.
The rewrite adds an option to sync findings back to a site’s database. Site owners can now browse findings by page or issue, review which issues have been ignored or marked as OK, and restore an issue if they disagree!
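As a rough sketch of what such a dashboard does with synced findings, the snippet below groups records by page or by issue and restores a dismissed one. The `Finding` shape, `groupFindings` and `restore` are assumptions made for this example; the real storage schema is not described here.

```typescript
// Hypothetical record shape for a synced finding; the real schema may differ.
interface Finding {
  page: string;
  issue: string;
  status: "open" | "ignored" | "markedOk";
}

// Group findings by page or by issue so a site owner can browse either view.
function groupFindings(
  findings: Finding[],
  key: "page" | "issue"
): Record<string, Finding[]> {
  const groups: Record<string, Finding[]> = {};
  for (const finding of findings) {
    const groupName = finding[key];
    const bucket = groups[groupName] ?? [];
    bucket.push(finding);
    groups[groupName] = bucket;
  }
  return groups;
}

// Restoring a dismissed finding returns a copy that is open again.
function restore(finding: Finding): Finding {
  return { page: finding.page, issue: finding.issue, status: "open" };
}
```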
Under the hood
V2 was nearly a full rewrite. Some of the new features include:
Removing the jQuery dependency lets it be installed on more platforms, and provides a 2x performance improvement.
The checker can now be configured to look within multiple independent parts of the page (e.g., article contents and the page footer).
Many new parameters make platform integrations easier, such as auto-ignoring all issues on certain pages for users who are not responsible for that page’s content but may still want to view results on demand.
The checker now ships with multiple base themes, and parameters for customizing colors.
Creating highly accessible Web content is complicated, and tends to start with a lot of training.
Familiar practices must be discouraged:
Tables should not be used to create fake columns
Running text should not be centered or justified
Visual-only formatting (font weight and size) should not be relied on to provide meaning to users, who may be hearing the content or viewing it in “reader mode”
New practices must be encouraged:
Images need contextually meaningful alternative text
Pages need real structure
Links should have meaningful titles
Certainly some of our trainings are rolled out because we are teaching new concepts, but at some point the question needs to be asked: how much of the need for new skills is coming from the concepts being new, and how much is coming from us rolling out new expectations without updating our tools? Just because something is new does not mean it cannot be intuitive.
Lessons from the history of spelling
Phase 1: Blame the author
From the dawn of recorded history until the early 1980s, we blamed the author when they misspelled a word. The responsibility was on the author to learn to spell well.
How did someone improve their spelling? This is going to sound familiar:
They went to trainings, under tutors or in a classroom
They were handed lists, guidelines and dictionaries to study, dating back to ancient cuneiform lexicons etched in clay.
Phase 2: Blame the dictionary
But something changed in the 1980s: let’s call it “spelling phase 2.” The most popular word-processing programs implemented spell check.
And what happened to the blame game? Almost overnight, it shifted the blame to the tool:
Don’t know how to spell a word? Take a guess and hope a match is found. If it isn’t, blame the dictionary.
Use a real word the dictionary flags because it doesn’t know it? Grumble and decide whether to ignore the underline or right-click and add the word to the dictionary.
Did spell check fix all of our spelling? Certainly not. But it helped. And it shifted much of the burden and responsibility from the author to the machine.
This was not the end, though…
Phase 3: Blame the authoring tool
Recent years have brought another step forward. Computers have started spelling for us:
Predictive algorithms try to guess the next word, the next phrase, maybe even a whole text message response.
Auto-format tools try to convert asterisks, numbers and letters to lists, try to convert shortcodes to special characters and emoji, etc.
Dictation tools try to recognize concepts to pick between homonyms.
And who gets blamed when this falls apart? The tool!
In our trainings, we talk a lot about testing, and introduce several quite good testing tools. None of these can catch everything: concepts like “avoiding explaining concepts through chromatic or spatial references” need to be checked for by a human; tools typically catch just under half of the actual issues a site might have. But the tools do a very good job at catching many common mistakes in editorial content.
Three problems, though:
Most are manual tools. The content creator needs to have been taught to use them, needs to remember they exist when writing their content, and needs to remember to run them when they are done. Those three points of failure conspire to leave much (most?) content unchecked.
They flag too much. They do not only mark mistakes the author just made; they also mark color issues the designer made, technical issues the developer made, structural elements for informational purposes, etc. This raises the bar for trainings: authors do not only need to understand the tool, they need to be taught to ignore a list of false positives and very real issues that are not their responsibility to fix today, while writing this particular article.
The automated tools tend to have a crawl-the-site, visit-the-dashboard paradigm. Again: this assumes an author has been trained to visit the dashboard, and will remember to do so.
So Princeton set out to make something new: a website-integrated, automatic checker that would only look at common editorial mistakes.
Automatic checkers need to be subtle: if they are too assertive, they interrupt and frustrate the author. If they are too polite, they can become easily ignored background noise.
Automatic checkers need to be highly performant: if the page shudders or hesitates as they run, authors will resent their presence and disable them.
Automatic checkers need novice-friendly tips. For many authors, this will be the first time they hear about the tagging, structure and language that will make their content more accessible. Alerts need to be in simple, non-technical language, with practical advice for improving the content.
So we started with the closest tool we could find to these goals: Sa11y, out of Ryerson University. We took its test architecture and user-friendly information-panel-with-tooltips approach, and spent half a year adapting the tool so that tests always ran automatically, optimizing its performance, tweaking the tooltips and creating a long list of configuration options a developer could use to quickly adapt it to any platform.
Here is how it looks today:
When an author is logged into their site, Editoria11y’s toggle indicates the number and type of issues (pass, manual checks needed, likely issues found) on each page. They can click to reveal inline highlighting and tips. If the error is new, because they are looking at a page they just edited, the panel pops open automatically.
As of March 2021, this means more than 400 Princeton websites are automatically checking for:
Links to PDFs and other documents, reminding the user to test the download for accessibility or provide an alternate, accessible format
Video embeds, reminding the user to add closed captions
Audio embeds, reminding the user to provide a transcript
Social media embeds, reminding the user to provide alt elements
Towards Phase 3: Accessible by Default
In many ways, Editoria11y is a stopgap solution. Most of the mistakes it catches were made because we placed new expectations on content authors (“tag your content with good structure”), but gave them tools that encouraged the opposite.
The path forward is to make accessible content creation easier, and as automatic as possible. What that looks like is not yet obvious, but probably includes things like:
Rethinking what options are available. For example: if we keep telling users not to justify text on the Web, is it time to retire that “justify text” button?
Rethinking the default options. For example: the default table in most content management systems does not have headers. The onus is on the user to know to click “properties” and add accessibility features; maybe it is time to reverse that?
Pondering predictive formatting and “ambient hinting.” If a user makes a whole line bold, can we suggest they make it a heading? If a user goes to select a heading, can we guess what heading levels they might want rather than giving them six options? Can we provide just-in-time training for difficult concepts?
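To make the “ambient hinting” idea concrete, here is a minimal sketch of two such heuristics. Everything in it is an assumption for illustration: the `TextRun` model, the 120-character cutoff, and the choice of level 2 as the top heading (on the guess that level 1 is reserved for the page title) are not taken from any existing editor.

```typescript
// Hypothetical, simplified model of one line in a rich-text editor.
interface TextRun {
  text: string;
  bold: boolean;
}

// Hint that a line may be a heading when all of its visible text is bold
// and the line is short. The length cutoff is an arbitrary assumption.
function looksLikeHeading(runs: TextRun[], maxLength: number = 120): boolean {
  const visible = runs.filter((run) => run.text.trim().length > 0);
  if (visible.length === 0) return false;
  const lineText = runs.map((run) => run.text).join("").trim();
  return visible.every((run) => run.bold) && lineText.length <= maxLength;
}

// Offer only levels that keep the outline valid: anything from the top
// level down to one level deeper than the previous heading (at most 6).
function suggestedLevels(
  previousLevel: number | null,
  topLevel: number = 2
): number[] {
  if (previousLevel === null) return [topLevel];
  const deepest = Math.min(previousLevel + 1, 6);
  const levels: number[] = [];
  for (let level = topLevel; level <= deepest; level++) {
    levels.push(level);
  }
  return levels;
}
```

After a level 2 heading, for instance, this sketch would offer only levels 2 and 3 rather than all six, which removes the chance of skipping a level by accident.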
Spell check as a roadmap
Dictionaries and manual checks
Guidelines and manual tools
Automatic spell check
Editoria11y and future automatic accessibility checkers