WebbIE, a web browser for visually impaired people
A. King, G. Evans and P. Blenkhorn. Poster presented at the 2nd Cambridge Workshop on UNIVERSAL ACCESS and ASSISTIVE TECHNOLOGY (CWUAAT). 22-24 March 2004. Cambridge, UK.
Webpages are documents written in Hypertext Mark-up Language
(HTML) defined by the international web standards body, the World-Wide-Web
Consortium (W3C) (HTML 2003). HTML combines text information with
meaningful semantic mark-up of the text, for example "This is a header" or
"this is a list of information". An HTML client has responsibility for
rendering the content to the user in accordance with the mark-up. For example,
text indicated as
header text in HTML might be rendered in a large, bold font
in a visual client, spoken at a louder volume in an audio client or explicitly
labelled "Headline" in a text-only client. Having written a document in
HTML, the author can provide formatting information for a particular medium
using the related Cascading Style Sheets (CSS) (CSS 2003). These
contain information on how the author wishes the information to be presented
that may be used by the client, so a visual Stylesheet might instruct the
client to use a particular font or colour combination, and an audio Stylesheet
to use a particular voice or volume. This approach separates content (HTML)
from presentation (CSS). The end user should thus be able to view the
document as intended by the author (using their CSS) or to access the content
in a manner more desirable or practical for the user (using the HTML).
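The separation of content from presentation described above can be sketched in a few lines of Python. This is only an illustration: the Element class, the three render functions and the output formats are hypothetical and are not taken from any real client.

```python
from dataclasses import dataclass


@dataclass
class Element:
    tag: str   # semantic role taken from the mark-up, e.g. "h1" or "p"
    text: str  # the text content itself


def render_visual(el: Element) -> str:
    # A visual client might show a header in a large, bold font.
    if el.tag == "h1":
        return f"[bold, 24pt] {el.text}"
    return el.text


def render_audio(el: Element) -> str:
    # An audio client might speak a header at a louder volume.
    volume = "loud" if el.tag == "h1" else "normal"
    return f"<speak volume={volume}> {el.text}"


def render_text_only(el: Element) -> str:
    # A text-only client might label the header explicitly.
    if el.tag == "h1":
        return f"Headline: {el.text}"
    return el.text


header = Element("h1", "Weather report")
for render in (render_visual, render_audio, render_text_only):
    print(render(header))
```

The same Element is passed to every renderer: the mark-up says what the text is, and each client decides how to present it.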
In practice, however, the web is primarily a visual
medium, and the principle of separating content and presentation has not been
honoured. Presentation information has been included in HTML mark-up. This
has repercussions for access by users of non-visual clients. For example, HTML
tables, designed to contain information best presented in a tabular format (for
example, arrays of numbers), are often used as layout containers to position
content on the screen for the visual user. This can break up the normal flow
of a document when viewed in a linear manner, as with a screen reader, and frustrate
resizing and reformatting of content for the user. For example, enlarging text
trapped in a fixed-size area makes some of the text impossible to read. Dedicated
mechanisms for navigating real data tables are also frustrated.
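The effect of a layout table on linear reading can be shown with a small sketch. The table contents below are invented for illustration, and the two functions are hypothetical helpers rather than part of any screen reader.

```python
# A two-column layout table: navigation on the left, article text on the right.
layout = [
    ["Home", "Storm warning issued"],
    ["Products", "Heavy rain is expected"],
    ["Contact", "across the region tonight."],
]


def visual_columns(table):
    """What a sighted reader perceives: each column read as one block."""
    return [" ".join(row[c] for row in table) for c in range(len(table[0]))]


def linear_reading(table):
    """What a linear reader meets: cells in source order, row by row."""
    return " ".join(cell for row in table for cell in row)


print(visual_columns(layout))
print(linear_reading(layout))
```

Read column by column, the navigation and the article are separate blocks. Read row by row, as the cells appear in the HTML source, the navigation entries are interleaved with fragments of the article.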
The example above illustrates a more general problem of
the reliance on visually-meaningful formatting rather than correct HTML markup
to communicate semantic content. For example, instead of using the HTML
headline markup to indicate the page's main headline, the author makes the
headline text bold, centred, and a larger font size. This is a perfectly
recognisable convention for sighted users, but is not useful for non-sighted
users, who are forced to choose between losing this vital piece of semantic
information or trying to identify titles by guesswork. The corollary of this
problem is using semantic markup for purely visual effect, for example using
headline markup simply to create the visual effects normally associated
with headlines for text that is not headline text. This lack of semantic
information can cause severe problems for blind webpage users. Sighted users
can, on first seeing a webpage, quickly identify the salient features -
headline, navigation bars, main text content, advertising - and therefore the
meaningful content of a webpage and how to access it. Blind users can be
forced to move laboriously through the text of a webpage, perhaps starting with
a navigation bar with fifteen hypertext links, then some advertising copy,
until they encounter the content of the page which may or may not be useful to
them. This is a slow and frustrating process.
A second obstacle to the use of webpages by blind people
is the embedding of non-text content into HTML documents. The most obvious
example of this is the use of images, used not only to present pictures and
diagram graphics, but as a way to provide absolute layout of other components;
headings and text with the size and font desired by the author; and decorative
features such as borders and bullet points. Bitmap graphics contain no useful
information for visually-impaired people: at best, the meaning of the image can
be inferred from the filename (e.g. cat.jpg). HTML does
provide mechanisms for annotating this embedded content: the specification
states that an author can set the ALT attribute of an embedded content element
to describe the image as text or indicate that the content is of no value to a
non-sighted user (e.g. an image used as a background spacing element).
However, use of this attribute is not mandated by the HTML specification, and
even if ALT text is provided, there is no guarantee that its information
content is equivalent to the embedded content. There are other types of
embedded content, for example, Java applets or Macromedia Flash animations,
that are not rendered natively by the client browser but rely upon the action
of a supporting application, termed a plug-in, provided by the developer of the
content format. The accessibility of such content depends upon the supporting
application and the nature of the content. For example, Java applets prior to
Java version 1.2 displayed content using native operating system controls, such
as buttons or text fields, that are normally accessible to a screen reader. As
a result these applets are often usable. Java applets from Java 1.2 onward use
lightweight controls implemented purely within Java (Swing components) and
these are supported only by screen-readers such as JAWS which have been written
to take advantage of the new Java Accessibility API (Sun 2003). In
either case, if the Java applet is used to display non-accessible content, such
as animated images, the accessibility of the content interface is irrelevant.
A final accessibility problem originates with web browsing clients themselves. Designed to display content to sighted users, they typically paint content onto a canvas intended purely for viewing and lacking features like a caret, the ability to focus on text content, or access to ALT text content. Screen reader users of Microsoft Internet Explorer, for example, have no simple way to access the text content of any page: the application only allows focus to lie upon form elements and links, so the user cannot focus the screen reader upon the text content and have it read out to them. Inaccessible clients can be overcome by screen readers designed to address this problem directly, and the predominance of Internet Explorer has encouraged this, but the result can be a very complex user interface.
Most of the problems detailed so far can be ameliorated to
a great degree by the efforts of web authors to produce web pages that are
accessible to non-sighted users. The means of doing this are codified in a
number of standards, notably the Web Content Accessibility Guidelines from
the W3C (WCAG 2003), which provide a checklist of recommendations for
use by web authors such as "Don't rely on color alone". While disability
legislation and lobbying by pressure groups and individuals have made
accessibility a key factor in web design, it does not necessarily follow that
compliance with these standards results in an accessible website. If accessibility
is considered to be a matter of ticking the appropriate boxes rather than
addressing the likely needs of visually-impaired website users, then accessible
webpages are unlikely to result.
2 Existing solutions for blind people
Solutions to the problem of web accessibility fall into one of four categories: reliance on a conventional web browser and a screen reader; utilising the accessibility features of HTML and existing web clients; using transcoding proxy servers to convert webpage HTML into a more accessible format; and using a dedicated web browser.
2.1 Conventional web browser and screen reader
The web browser market is dominated by Microsoft's Internet Explorer (MSIE), which holds a 95% share (CNET 2002). It is therefore the de facto standard for web clients, and web authors frequently write HTML and DHTML code targeted at MSIE. Using MSIE and a screen reader or magnifier ensures that as many websites as possible will work for the user, in the sense that the functionality intended by the author will be available to the sighted user, and that the user interface will be common to sighted people - such as those providing technical support - with the obvious exception of the use of the assistive technology. The problems with this approach are the inaccessibility of content displayed by the browser and the complexity of the user interface already described. However, progress has been made by screen reader developers, notably Freedom Scientific's JAWS (JAWS 2003), in supporting MSIE and by extension the vast majority of web users. The resulting control interface can be, as noted, very complex.
2.2 Utilising HTML accessibility
The second approach takes advantage of the principle of HTML, separating content and presentation, and the native abilities of clients to present content in a way desirable to the user. Web clients permit the user to define their own presentation preferences, for example using a particular mix of colours (yellow on black is preferred by many visually-impaired people), fonts (Tiresias (Tiresias 2003) is designed to be very legible) and font sizes. Clients can also choose to ignore presentation dictates from web pages, stripping out decorative and confusing background images or preventing text from blinking (harmful to users with epilepsy (WCAG 2003)). These are all helpful approaches for visually-impaired people. The Mozilla browser allows the user to turn on a caret, overcoming the normal web browser canvas problem by providing a means to indicate to a screen reader the current content of interest. The problems with these approaches are that they fail to address a range of problems related to overly-complex interfaces (tables and page layout are generally still preserved, so the user must still search over the page for content of interest) and the needs of users without any degree of functional vision. The other practical problem is that users are required to specify their user preferences within the client, which is not common user behaviour and may not be possible in the user's environment, for example where the user is on a different computer or where user preferences are locked by their network policy.
2.3 Using a transcoding proxy server
The third approach places the solution between the author
and the client by running requested HTML pages through a transcoding proxy
server. Requests for webpages from servers are made not to the servers
themselves but to an intermediate server, a proxy, which fetches the page,
converts it according to a set of rules, and returns the converted page to the
requesting client. This process is employed for users of limited browsing
devices, such as mobile telephones, which cannot handle fully-featured webpages
and rely on proxy gateways to reduce the standard webpages into a limited
format supported by the telephone (Kennel 1996, Brown 2001).
Visually-impaired users can use the same approach: the proxy can be configured
to alter the HTML document to provide the font, font size, colour and other
settings desired by the user in much the same way as the use of the accessibility
features of a client. The advantage is that these can be set remotely, so the
client itself need not be amended by the user. The disadvantages relate to the
second-hand nature of the HTML document transmission. Page features, such as
client redirects, may not be supported by the proxy, and many websites assume
the use of a client directly and provide functionality based on this assumption
password-authenticated services to them). Finally, the processing performed by
the proxy server requires the server to have full access to the content of the
HTML document, which means that secure transmission protocols used in Internet
commerce such as HTTPS are unusable.
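The conversion step performed by such a proxy can be sketched as follows. This is a minimal illustration, assuming a single rule that injects the user's preferred font and colours as a style block; the function name transcode and its parameters are hypothetical, and a real proxy would also have to fetch the page and rewrite links.

```python
import re


def transcode(html: str, font: str, size_pt: int, fg: str, bg: str) -> str:
    """Return the page with a style block that enforces the user's settings."""
    rule = (
        f"* {{ font-family: {font} !important; "
        f"font-size: {size_pt}pt !important; "
        f"color: {fg} !important; background: {bg} !important; }}"
    )
    style = f"<style>{rule}</style>"
    # Place the rule at the end of the head so that it follows the author's
    # own stylesheets; if the page has no head, put it at the very start.
    if re.search(r"</head>", html, flags=re.IGNORECASE):
        return re.sub(
            r"</head>",
            lambda m: style + m.group(0),
            html,
            count=1,
            flags=re.IGNORECASE,
        )
    return style + html


page = "<html><head><title>t</title></head><body><p>Hi</p></body></html>"
print(transcode(page, "Tiresias", 18, "yellow", "black"))
```

Because the rule is applied on the proxy, the client needs no configuration. The sketch also makes the HTTPS limitation visible: the proxy can only rewrite a page whose content it can read.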
2.4 Using a dedicated web browser
The final approach is to use a dedicated web browser designed for visually-impaired or blind people. There are two tactics employed: the first, exemplified by the Home Page Reader from IBM, is a self-voicing application that provides a complete audio interface to web pages. The second is to render the content of a webpage as a text-only flat document and permit the user to access this accessible content using their normal assistive technology, typically a screen reader. This tactic is demonstrated by Webwizard from Baum and WebFormator from Frank Audiodata. Developing a dedicated web browser affords the maximum flexibility in approach, but requires the developer to take more responsibility for the presentation of web content. Although in theory a non-visual web browser is just as standard as a visual one presenting marked-up HTML, in practice the visual bias of the web means that alternative applications have to focus on providing access to resources designed for the sighted. The greater flexibility in approach has led to a number of different products which are worthy of examination.
IBM's Home Page Reader (HPR) (IBM 2003) is a standalone product that breaks down a web page into a linear array of items which can be moved through by the user and are voiced as they are encountered. The user can select the granularity of the array, from letters upwards. Links are presented in a different voice (female rather than male) to distinguish them: the ability to present information like this is an advantage of developing a self-voicing application. The default setting presents the page as an array of structural mark-up elements: list items, headers, and paragraphs. This is a good level of resolution for well-constructed web pages, since it allows the user to immediately access the document via a reasonable number of segments which reflect the semantic meaning known to the document author. Less well-designed web pages where mark-up is used for visual presentation are presented less successfully, since there is less scope for inferring the semantic meaning of particular items of content from the mark-up.
BrookesTalk (Zajicek 1998) is another self-voicing web browser that employs a similar approach to HPR. In addition, it attempts to address the problem of communicating to the end user the semantic content of a page by providing summaries and keywords obtained by analysing the structure of the web page. Zajicek reports that blind users did not find the summary information of use because it was regarded as inaccurate: certainly, interpreting a page for its important semantic meaning is a very difficult computing problem.
Asakawa et al.'s talking web browser (Asakawa 2002) focused on the problem
of communicating semantic information about
content to the user. It utilised a number of different auditory and tactile
interfaces to communicate structural information derived from analysis of the
HTML of a web page. Assuming that visual users used grouping of similar
elements as a vital part of understanding web page structure (e.g. link
buttons make up a navigation bar), Asakawa's system attempted to group
HTML elements by colour, area and border, identify items of emphasis, and
communicate the resulting groups using background music and tactile output.
Individual components of the page, such as text or buttons, were communicated
with auditory icons and earcons. Emphasis was communicated through bell-like
sounds. Results indicated that the indication of emphasis was well received:
it may be that this is because it successfully communicated important semantic
information about the page.
Webwizard from Baum (Webwizard 2003) and WebFormator from Frank Audiodata (WebFormator 2003) use the second tactic, running simultaneously with MSIE and re-presenting the contents in a text field that can be accessed by a screen reader. This text can be navigated with a caret as a normal text field, and like the other two applications users can bring up lists of links, frames and other features that can be of use in understanding the content of the web page. WebFormator and Webwizard also provide different navigation modes for exploring HTML tables, navigating from cell to cell within the table: while tables are typically used for layout, rather than structuring data, if a real data table is encountered this may be of use.
WebbIE, developed at UMIST, uses the same tactic as WebFormator and Webwizard, presenting the web page content as accessible text rather than self-voicing an entirely novel interface. It goes a step further in creating a freestanding independent application providing web access, and is described fully in the next section.
WebbIE was developed to fulfil our design philosophy of allowing users to access standard applications, in this case Windows Internet Explorer, through an interface that simplifies and re-presents the content without losing information or being too complicated for non-expert users. It is not self-voicing, but rather provides support for partially-sighted people and allows screen reader users to continue to use their familiar environment.
Internally WebbIE uses the MSIE control object (WebBrowser), and this handles the acquisition of webpages and parsing the HTML into the W3C standard Document Object Model (DOM) (DOM 2003) (Figure 1). Using MSIE guarantees maximum compatibility with websites, although another control that handles fetching webpages and parsing them into the DOM could be used with a minimum of alteration (the Mozilla control has been tested and works successfully). The DOM provides a rich API for manipulating and querying the webpage.
Figure 1: The WebbIE architecture
WebbIE navigates the DOM, collecting active content
components such as hypertext links and form components, and building up a
plain-text representation of the content. This plain text is presented to the
user. Components are presented on new lines with distinguishing titles, for
example a title that marks a hypertext link. Functionality is accessed by pressing the return
key on a line with a presented component. Figure 2 shows WebbIE in action.
WebbIE supports existing MSIE bookmarks, frames, the great majority of HTML 4,
forms, tables, and display of embedded multimedia.
Figure 2: WebbIE in action
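The linearisation described above can be sketched in Python. This is not WebbIE's code: WebbIE walks the DOM supplied by the MSIE control, whereas this sketch uses the event-based html.parser module from the Python standard library as a stand-in. The "LINK:" title is an assumption made for illustration, while "BOX:" follows the form example given in this paper. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Tags that start or end a line of running text in this sketch.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "li", "br", "div", "tr"}


class Lineariser(HTMLParser):
    """Collects page content as plain-text lines, one component per line."""

    def __init__(self):
        super().__init__()
        self.lines = []    # finished output lines
        self.targets = {}  # line index -> link destination (the "return key" action)
        self.buffer = []   # text gathered for the line being built
        self.href = None   # destination of the link currently being read

    def flush(self, prefix=""):
        """Emit buffered text as one line; return True if a line was added."""
        text = " ".join("".join(self.buffer).split())
        self.buffer = []
        if not text:
            return False
        self.lines.append(prefix + text)
        return True

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.flush()  # close any running text before the link
            self.href = attrs["href"]
        elif tag == "input" and (attrs.get("type") or "text").lower() == "text":
            self.flush()
            self.lines.append(f"BOX: ({attrs.get('value') or ''})")
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            if self.flush("LINK: "):
                self.targets[len(self.lines) - 1] = self.href
            self.href = None
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        self.buffer.append(data)


def linearise(html):
    parser = Lineariser()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text
    return parser.lines, parser.targets
```

A short usage example, with an invented page fragment:

```python
sample = ('<h1>News</h1><p>Hello <a href="home.htm">Home</a> world</p>'
          '<form><input type="text" value="cats"></form>')
lines, targets = linearise(sample)
print("\n".join(lines))
```

The targets dictionary records which lines carry an action, so that activating a line can be mapped back to its link.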
As a dedicated web browser, WebbIE attempts to address the accessibility issues associated with web pages already described:
Complex web pages - WebbIE presents the whole web page in a linear text form, so it can be explored as a standard familiar text document, which is much simpler than puzzling out the potentially complex interface of a web page. The disadvantage of this is that any information inherent in the spatial layout of the web page is unavailable to the user, such as separation of content into body and navigation parts.
Summarising web page content - WebbIE highlights marked-up headlines and enables the user to access them directly. It allows the user to skip links to non-link text (this works especially well when skipping the navigation bars commonly found at the top of pages). It also makes an attempt to identify the section of the page containing the main content text, and the section of the page containing navigation links. It can either work directly on the processed content, checking for successive lines with text or links, or use a more sophisticated approach by scoring the component parts of a webpage - frames, table cells, and HTML division elements - for text content and link content and identifying the two winning sections to the user.
Images - WebbIE presents the ALT text or ignores the
image if this is not available, unless it is also a hypertext link, in which
case it gives the destination as the most meaningful possible information.
This is sometimes very useful (
home.htm) and sometimes not (
Flash/Java/multimedia embedded content - WebbIE can present this content separately in a pop-up window that can be accessed by the user's screen reader, so if the content is accessible the user should be able to access it.
Forms - WebbIE allows forms to be handled in the
page using simple text components. For example, input boxes are presented as
BOX: (content) on a line. If the user presses the return key, WebbIE pops
up an input box to receive the user's input text, and then updates the page
with the text input for review. The same simple approach is taken with select
buttons and other form elements (see Figure 2).
Frames - WebbIE runs frames together to present them as a single, linear text page, so the user does not have to navigate different panes of content. It does allow users to navigate within the page as though the different areas were still operating as frames, so frame navigation is supported, but it is assumed that users find it simpler to have the same consistent interface for frame and non-frame pages.
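The section-scoring approach mentioned under "Summarising web page content" can be sketched as follows. The scoring used here (words of non-link text for content, number of links for navigation) is an assumption made for illustration, since the exact scores WebbIE uses are not given. The sketch covers only division elements and table cells in well-formed HTML, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

SECTION_TAGS = {"div", "td"}  # candidate sections; frames are not handled here


class SectionScorer(HTMLParser):
    """Scores each section for text content and for link content."""

    def __init__(self):
        super().__init__()
        self.scores = []      # one [label, text_words, links] entry per section
        self.stack = []       # indexes of the sections currently open
        self.in_link = False  # True while reading the text of a link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SECTION_TAGS:
            label = attrs.get("id") or f"{tag}#{len(self.scores)}"
            self.scores.append([label, 0, 0])
            self.stack.append(len(self.scores) - 1)
        elif tag == "a" and attrs.get("href") and self.stack:
            self.in_link = True
            self.scores[self.stack[-1]][2] += 1

    def handle_endtag(self, tag):
        if tag in SECTION_TAGS and self.stack:
            self.stack.pop()
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Text is credited to the innermost open section only.
        if self.stack and not self.in_link:
            self.scores[self.stack[-1]][1] += len(data.split())


def find_main_and_navigation(html):
    """Return the labels of the best content section and best link section."""
    scorer = SectionScorer()
    scorer.feed(html)
    scorer.close()
    if not scorer.scores:
        return None, None
    main = max(scorer.scores, key=lambda s: s[1])
    navigation = max(scorer.scores, key=lambda s: s[2])
    return main[0], navigation[0]


page = ('<div id="nav"><a href="a.htm">A</a> <a href="b.htm">B</a> '
        '<a href="c.htm">C</a></div>'
        '<div id="body"><p>Some long article text goes here for reading</p>'
        '<a href="d.htm">more</a></div>')
print(find_main_and_navigation(page))
```

The two winning labels could then be offered to the user as jump targets, one for the main text and one for the navigation links.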
WebbIE was evaluated with nine users by means of a questionnaire. The users ranged in experience and levels of visual impairment, and the sample size was small, so the data acquired is anecdotal but has the benefit of being from actual users. The users were all associated with a company that distributes WebbIE and performs training, so the results reflect some common background of training and preference.
The users were all screen reader users. The six users who had used the web before used MSIE in conjunction with their screen reader, although their level of success varied: after using WebbIE, three intended to use it.
The users cited a variety of favourite sites and most users browsed for new pages of interest. This suggests that able visually-impaired people (VIPs) do successfully overcome browsing problems to an extent that allows them to gain advantage from the exploration of unknown sites, although all expressed some confusion over or ignorance of non-HTML embedded content, confirming that HTML is the most accessible format for web content.
All the users who expressed a preference preferred Google as a search engine, suggesting that a tailored Google interface within WebbIE might be a good next development: WebbIE already allows users to query Google from WebbIE directly, but does not perform any special processing on the result, for example to prioritise the search results over the page navigation content. Other popular sites included the BBC Radio sites to obtain radio program recordings and banking and grocery shopping sites. These commercial sites permit visually-impaired people access to services that usually require either customised information (e.g. bank statements in Braille) or intervention by a sighted person (e.g. to shop in a supermarket). Using a web site puts blind people on a more equal footing and allows providers to make their services more accessible at relatively little expense.
Aside from specific issues with the WebbIE interface, general complaints were made about the many links often encountered at the top of a web page before the content of interest. These links are typically navigation bars, very useful for sighted people but a distraction for visually-impaired people. As a consequence the WebbIE function that skips links and moves the cursor to content (mentioned above) proved popular.
The main benefits of WebbIE were perceived to be the
ability to cut and paste text from the simple text interface, allowing users to
prepare content in other formats, and the handling of forms through a simple
text interface. Users did not report any general problems with accessibility to
websites, but as one user reported, if an inaccessible site is encountered "there
is lots of choice so I leave them alone", so this may reflect why the sites
that were singled out as being inaccessible were service providers where the
user has a strong reason to wish to gain access to that service and not another
generic one, for example financial or supermarket sites.
WebbIE is available for download from www.screenreader.co.uk. It is freely available for use and distribution. For more information, contact email@example.com.
Chieko Asakawa, Hironobu Takagi, Shuichi Ino and Tohru Ifukube (2002)
"Auditory and Tactile Interfaces for Representing the Visual Effects on the Web", ACM ASSETS 2002, Edinburgh, Scotland, UK, 8-10 July 2002.
Silas S Brown and Peter Robinson (2001)
"A World Wide Web Mediator for Users with Low Vision", presented at CHI2001 Workshop 14, Seattle, Washington, US, 31 March - 5 April 2001.
CSS (2003) Cascading Style Sheets, http://www.w3.org/Style/CSS/, accessed
CNET (2002) Internet Explorer 95.3, Mozilla 0.4, http://news.com.com/2100-1023-938784.html, accessed August 2003.
DOM (2003) W3C Document Object Model, http://www.w3.org/DOM/, accessed August 2003.
IBM (2003) IBM Accessibility Center: IBM Home Page Reader, accessed August 2003.
JAWS (2003) JAWS for Windows, http://www.freedomscientific.com/fs_products/software_jaws.asp, accessed August 2003.
Kennel, A., Perrochon, L. and Darvishi, A. (1996)
World-wide-web access for blind and visually-impaired computer users, ACM
accessed August 2003.
Tiresias (2003) Tiresias Fonts Website, http://www.tiresias.org/fonts/index.htm, accessed September 2003.
WebFormator (2003) Official WebFormator Site, http://www.webformator.com/, accessed
Webwizard (2003) http://www.baum.de/webwizard.htm, accessed September 2003.
HTML (2003) World-Wide-Web Consortium HTML Standard, http://www.w3c.org/html/, accessed August
WCAG (2003) Web Content Accessibility Guidelines 1.0, http://www.w3.org/TR/WCAG10/, accessed
Zajicek, Mary, Powell, Chris and Reeves, Chris (1998)
Web Navigation Tool for the Blind, ASSETS 1998, New York.
Alasdair King, 10 August 2004. Last updated 13 August 2004.