I read comments online about “blind people” and “screenreaders”, and they often confuse where and how things happen: specifically, what is the job of the app, what is the job of the screenreader, and what is the job of the TTS voice – the speech synthesizer that speaks the content. And particularly how screenreaders handle emojis and other characters like © or Ø.
Screenreaders work with text. So your app’s job is to make sure there is text: text on buttons, text labelling input fields, text markup for diagrams. Though sometimes, because listening to text is really, really slow, the best text is no text at all – explicitly setting the text to “”. Screenreader users do not care that you have a decorative picture you generated with AI to make your article look nice; mark it up as “”, or they’ll have to sit and listen to “a woman wearing business attire looks concernedly at a piece of paper” when they want to get on and learn about investments. There is also some technical stuff here about whether your app will communicate the text to the screenreader correctly – probably fine with US English, but if you start using non-ASCII characters like Ø you may find there are problems, depending on the age of your app technology.
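To make that concrete, here is a minimal sketch of “making sure there is text” for a web app, using the plain DOM API in TypeScript. The element and file names are invented for illustration; the attributes (`aria-label`, `alt`) are standard, though how any given screenreader consumes them still varies:

```typescript
// A minimal DOM sketch of "make sure there is text" (illustrative names).

// A button whose visible content is an icon still needs a text label:
const saveButton = document.createElement("button");
saveButton.textContent = "💾";                  // visual-only
saveButton.setAttribute("aria-label", "Save");  // what the screenreader announces

// An input field needs a programmatic label, not just a nearby caption:
const nameInput = document.createElement("input");
nameInput.setAttribute("aria-label", "Full name");

// A purely decorative image should be explicit text-of-nothing:
const decoration = document.createElement("img");
decoration.src = "office-stock-photo.jpg";      // hypothetical file
decoration.alt = "";                            // tells screenreaders to skip it entirely
```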
The screenreader gets the text from the application. It may or may not do some work with this text, like deciding whether to add annotations like “this is bold” to help with understanding or editing, or whether to spell out acronyms like “The U N”, depending on what the user wants to do – editing a document is different from reading social media. And the user may have added their own custom rules, like “speak Alasdair like Alister”, because the user finds this often goes wrong.
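Here is a toy sketch of the kind of substitution pass a screenreader might run before handing text to the TTS. The rule format and the rules themselves are invented for illustration – real screenreaders have their own dictionary formats – but the idea is the same: plain text in, adjusted text out.

```typescript
// Sketch of screenreader-side text processing before the TTS hand-off.
// The rules here are invented for illustration.
const userPronunciationRules: Array<[RegExp, string]> = [
  [/\bAlasdair\b/g, "Alister"], // user's custom fix for a name the voice mangles
  [/\bUN\b/g, "U N"],           // spell out an acronym letter by letter
];

function preprocessForSpeech(text: string): string {
  return userPronunciationRules.reduce(
    (t, [pattern, replacement]) => t.replace(pattern, replacement),
    text,
  );
}

console.log(preprocessForSpeech("Alasdair works at the UN."));
// -> "Alister works at the U N."
```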
Then the screenreader hands the text to the TTS speech synthesizer. This in turn makes its own decisions about what to do with the text: should it pronounce “bow” as in “I show the bow” or as in “I took a bow”? Should it pronounce the dots in www.example.com? Each TTS voice has different rules, depending on the vendor or the individual voice. Given “1223”, one voice might say “one-thousand-two-hundred-and-twenty-three”, another “one two two three”, and a third “twelve twenty-three”. And “12:23” might or might not get a different result from “1223AD”.
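As a toy illustration of why the same four digits come out differently, here is a pattern-based normaliser of the kind a TTS front end might contain. The routing rules are invented, not any vendor’s actual behaviour:

```typescript
// Toy TTS text normaliser: the same digits take different routes
// depending on which pattern they match. Rules are invented.
const digitNames = ["zero", "one", "two", "three", "four",
                    "five", "six", "seven", "eight", "nine"];

function speakDigits(s: string): string {
  return [...s].map(d => digitNames[Number(d)]).join(" ");
}

function normalise(token: string): string {
  if (/^\d{1,2}:\d{2}$/.test(token)) {
    const [h, m] = token.split(":");
    return `${Number(h)} ${Number(m)}`;        // "12:23" -> read as a time
  }
  if (/^\d+AD$/.test(token)) {
    return speakDigits(token.slice(0, -2));    // "1223AD" -> maybe digit by digit
  }
  if (/^\d+$/.test(token)) {
    return Number(token).toLocaleString("en"); // "1223" -> "1,223", a cardinal
  }
  return token;                                // anything else passes through
}

console.log(normalise("12:23"));   // "12 23"
console.log(normalise("1223AD"));  // "one two two three"
console.log(normalise("1223"));    // "1,223"
```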
OK, those are the principles: your app needs to make text available, but the screenreader or the TTS engine may or may not handle it well. Naïve users might say “this does not work with screenreaders!”, but it depends on the app, the screenreader and the TTS engine.
An example. Let’s say there is a button with an emoji on it, like 🙂. Do screenreaders support this? Well, let’s see:
- The button has to provide the 🙂 as text: maybe it exposes the raw Unicode code point, maybe it is an image with text that says “smiley”, maybe it has only the picture (a code sketch follows this list).
- The screenreader decides what to do with it. Maybe it converts it to the text “smiley”, maybe it passes it straight through to the TTS engine, maybe it drops it because it is an English screenreader and a non-English character is probably junk punctuation, which you usually want to skip.
- The TTS engine decides what to do with it. Maybe it knows how to say 🙂, maybe it says “smiley”, maybe it says the French word for “smiley” because it is a French TTS voice, maybe it again throws it away because it knows to skip random.punctuation!so-that-it-sounds#better.
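To make the first bullet concrete, here are the three possibilities as DOM sketches in TypeScript (element and file names are illustrative):

```typescript
// 1. Raw code point: whatever the screenreader and TTS make of U+1F642.
const rawButton = document.createElement("button");
rawButton.textContent = "🙂";

// 2. Image with a text label: the screenreader reliably gets "smiley".
const labelledButton = document.createElement("button");
const icon = document.createElement("img");
icon.src = "smiley.png"; // hypothetical file
icon.alt = "smiley";
labelledButton.appendChild(icon);

// 3. Picture only, no text: the screenreader gets nothing to announce.
const silentButton = document.createElement("button");
silentButton.style.backgroundImage = "url(smiley.png)";
```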
So do screenreaders support emojis? Certainly they can! But they might not. You need to know where the emoji appears, which screenreader is in use, and which TTS voice. Blanket statements are generally wrong.
Finally, I expect that screenreaders five years ago would not have supported emojis and other characters, and that screenreaders now (2024) do, because technology and standardisation march on. So this article is probably redundant round about now – which is great, because I love emojis. ✊😜✅