GONVI: Non-Visual Access to Documents and GUIs with a Constraint-Based Approach

Gottfried Zimmermann
gzimmermann@acm.org
University of Stuttgart, Germany

In: Edwards, A. D. N., Arató, A., & Zagler, W. L. (Eds.), Computers and Assistive Technology ICCHP '98 (pp. 379-386). Vienna: Austrian Computer Society (OCG), 1998.


ABSTRACT

This paper describes a constraint-based interactive system for the non-visual representation of graphical documents and graphical user interfaces (GUIs). GONVI ("Graphical Object Server for Non-Visual Interaction") establishes spatial and other constraints between textual and graphical objects; these constraints govern the transformation process for non-visual interaction. Of central interest is the integration of text and graphical elements, given the non-scalability of braille. A prototype of GONVI is currently being implemented in Java, using PostScript documents as the graphical information source.

Note: In this paper the term "graphical object" refers to both graphical and textual objects.

1. Introduction

Graphical information plays an important role in communication. Most printed material, such as school books, office documents or instruction manuals, today contains images and sketches that are essential for understanding. Today's human-computer interaction is based on graphical interfaces and visual interaction techniques designed mainly for sighted users. One of the main electronic information sources in our society, the World Wide Web (WWW), also makes extensive use of graphical elements, most of which are not sufficiently described in the textual context and therefore cannot be regarded as redundant.

In general, graphical images and sketches are interwoven with textual elements that form an integral part of the graphical representation. The graphical and textual elements, with their mutual spatial relations, pose a major problem for blind readers and computer users. Graphics can be transformed into tactile images to a certain degree, and text is traditionally converted to braille for non-sighted readers. But the fixed size of braille characters, in contrast to the scalability of ink-print fonts, leads either to huge tactile formats or at least to the loss of spatial relations implied by the visual original.

There are several approaches to solving this problem. [Lötzsch94] mentions four types of tactile graphics: 1. full text graphics, possibly with pointers; 2. labelled graphics (with an additional legend), possibly with pointers; 3. textually guided pure graphics (with an additional guide in braille or on tape); and 4. pure graphics (requiring help from a person or a computer). The production of tactile graphics remains a time-consuming process that requires manual design by experts for each graphical image.

In the non-visual representation of graphical user interfaces, so-called screen reader programs generally concentrate on textual elements, presented by speech output or on a braille terminal, but more or less neglect the reproduction of graphical elements and their relations to the interwoven text parts [Burger/Stöger96]. This leads to an unacceptable loss of information for blind computer users.

For the non-visual representation of text and graphics we need a solution that provides

2. Constraints

2.1. Constraint Programming

A constraint is a declarative description of mutual relations between a set of objects. Constraints affect the properties of objects at runtime, depending on the properties of other objects. In contrast to assignment in imperative programming languages, which only defines a one-way relation at evaluation time, a constraint affects all involved objects from the creation of the constraint until its deletion. Constraints do not specify an algorithm for determining the constrained properties; it is the job of a constraint solver to satisfy as many constraints as possible. The constraint solver enforces a constraint every time an involved object changes.
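The difference from assignment can be illustrated with a minimal Java sketch. The names `Property` and `OffsetConstraint`, and the example of a label kept at a fixed distance to the right of an image, are assumptions made for illustration only and are not taken from the GONVI implementation. The change listeners stand in for a real constraint solver.

```java
import java.util.ArrayList;
import java.util.List;

/** An integer property that notifies its listeners whenever its value changes. */
class Property {
    private int value;
    private final List<Runnable> listeners = new ArrayList<>();

    Property(int value) { this.value = value; }

    int get() { return value; }

    void set(int newValue) {
        if (newValue == value) return;  // unchanged: nothing to re-enforce
        value = newValue;
        for (Runnable listener : new ArrayList<>(listeners)) listener.run();
    }

    void onChange(Runnable listener) { listeners.add(listener); }
}

/** Hypothetical constraint "right = left + gap", enforced in both directions. */
class OffsetConstraint {
    OffsetConstraint(Property left, Property right, int gap) {
        left.onChange(() -> right.set(left.get() + gap));
        right.onChange(() -> left.set(right.get() - gap));
        right.set(left.get() + gap);  // establish the relation on creation
    }
}

class ConstraintDemo {
    public static void main(String[] args) {
        Property imageX = new Property(0);
        Property labelX = new Property(0);
        new OffsetConstraint(imageX, labelX, 5);

        imageX.set(10);  // the constraint moves the label to 15
        labelX.set(40);  // the constraint moves the image to 35
        System.out.println(imageX.get() + " " + labelX.get());  // prints "35 40"
    }
}
```

An assignment such as `labelX = imageX + 5` would hold only at the moment it is executed. Here the relation is re-established whichever of the two properties changes.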

A set of constraints may include conflicting constraints that cannot all be satisfied together. To specify the order in which the constraints should be solved, constraints can be organised as a constraint hierarchy. The most important constraints belong to the highest hierarchy level. The constraint solver guarantees that, when solving a constraint of a certain hierarchy level, no constraint of a higher level is violated.
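The following Java sketch shows one simple way to respect such a hierarchy. It is not GONVI's solver: it assumes that all constraints are bounds on a single integer value, and it handles constraints on the same level greedily in their given order. The names `BoundConstraint` and `HierarchySolver` and the example values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** A bound on one value, e.g. the width of a label in pin columns (assumed example). */
class BoundConstraint {
    final String name;
    final int level;     // 0 = most important hierarchy level
    final int min, max;  // admissible range, inclusive

    BoundConstraint(String name, int level, int min, int max) {
        this.name = name; this.level = level; this.min = min; this.max = max;
    }
}

class HierarchySolver {
    /** Returns the names of the constraints that are satisfied, strongest first. */
    static List<String> solve(List<BoundConstraint> constraints) {
        List<BoundConstraint> ordered = new ArrayList<>(constraints);
        ordered.sort(Comparator.comparingInt((BoundConstraint c) -> c.level));

        int lo = Integer.MIN_VALUE, hi = Integer.MAX_VALUE;
        List<String> satisfied = new ArrayList<>();
        for (BoundConstraint c : ordered) {
            int newLo = Math.max(lo, c.min);
            int newHi = Math.min(hi, c.max);
            if (newLo <= newHi) {  // compatible with everything accepted so far
                lo = newLo;
                hi = newHi;
                satisfied.add(c.name);
            }
            // Otherwise the constraint would violate a more important one and is skipped.
        }
        return satisfied;
    }
}

class HierarchyDemo {
    public static void main(String[] args) {
        List<BoundConstraint> constraints = Arrays.asList(
            new BoundConstraint("keepOriginalWidth", 1, Integer.MIN_VALUE, 8),
            new BoundConstraint("minBrailleWidth", 0, 12, Integer.MAX_VALUE),
            new BoundConstraint("fitsDisplay", 0, Integer.MIN_VALUE, 40));
        System.out.println(HierarchySolver.solve(constraints));
        // prints "[minBrailleWidth, fitsDisplay]"
    }
}
```

In this example the weaker constraint `keepOriginalWidth` is left unsatisfied because enforcing it would violate `minBrailleWidth`, which sits on a more important level.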

2.2. Constraints and Their Use in GONVI

In GONVI, constraints represent mutual relations between graphical objects, or relations between graphical objects and an output or input device. Constraints place restrictions on the representation and interaction process by which the graphical objects are accessed. This means that the system's behaviour is controlled by a set of graphical objects, a set of output and input devices, and their associated constraints. Since constraints in GONVI are created by user-defined rules, the interaction of the system has to be specified by implementing a set of constraint creation rules.

2.2.1. Spatial constraints

To maintain spatial relations in a visual graphical document or interface these relations have to be detected and expressed by spatial constraints between graphical objects. The spatial constraints include:

2.2.2. Medium-specific constraints

Medium-specific constraints specify the non-visual representation in the context of a given user environment. These constraints are generally given by technical restrictions of the output or input medium. Medium-specific constraints define:

2.2.3. Interaction constraints

Interaction constraints directly affect the way the system interacts with the user. They specify "interaction rules" like

3. GONVI System Architecture

GONVI is a constraint-based transformation and interaction process. It takes a set of graphical objects (from a visual context) as input and provides representation and interaction techniques for a non-visual user environment as output. In the GONVI prototype (see section 4) this process is implemented for a typical graphical information source (input) and user environment (output) (see Figure 1).

Figure 1: System Architecture of GONVI.

3.1. Lingo

GONVI defines a special language for graphical object descriptions. Each graphical information source encodes its graphical description in Lingo ("Language for Information Exchange on Graphical Objects"). Since Lingo is used in the context of inter-process communication (IPC) and communication through character streams, it is a character-based language.

A special Lingo dialect is introduced for each graphical information source. In turn, the GONVI kernel provides a dedicated parser component for each Lingo dialect. In GONVI, builder objects [Gamma et al. 95] are responsible for building internal graphical object instances from Lingo dialect descriptions.
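The division of labour between parser and builder can be sketched in Java as follows. The syntax of Lingo is not given in this paper, so the two statement forms `line x1 y1 x2 y2` and `char c x y` below belong to an invented toy dialect, and all class and method names are hypothetical. The builder produces textual summaries as a stand-in for real graphical object instances.

```java
import java.util.ArrayList;
import java.util.List;

/** Builder interface: the parser reports parts, the builder creates the objects. */
interface GraphicalObjectBuilder {
    void addLine(int x1, int y1, int x2, int y2);
    void addCharacter(char c, int x, int y);
}

/** Builds textual summaries in place of real graphical object instances. */
class SummaryBuilder implements GraphicalObjectBuilder {
    private final List<String> objects = new ArrayList<>();

    public void addLine(int x1, int y1, int x2, int y2) {
        objects.add("Line(" + x1 + "," + y1 + " -> " + x2 + "," + y2 + ")");
    }

    public void addCharacter(char c, int x, int y) {
        objects.add("Char('" + c + "' at " + x + "," + y + ")");
    }

    List<String> result() { return objects; }
}

/** Parser for the invented toy dialect; it knows the syntax but not the object classes. */
class ToyDialectParser {
    static void parse(String description, GraphicalObjectBuilder builder) {
        for (String line : description.split("\n")) {
            String[] t = line.trim().split("\\s+");
            if (t[0].equals("line") && t.length == 5) {
                builder.addLine(Integer.parseInt(t[1]), Integer.parseInt(t[2]),
                                Integer.parseInt(t[3]), Integer.parseInt(t[4]));
            } else if (t[0].equals("char") && t.length == 4) {
                builder.addCharacter(t[1].charAt(0),
                                     Integer.parseInt(t[2]), Integer.parseInt(t[3]));
            } else if (!line.trim().isEmpty()) {
                throw new IllegalArgumentException("Unknown statement: " + line);
            }
        }
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        SummaryBuilder builder = new SummaryBuilder();
        ToyDialectParser.parse("line 0 0 10 0\nchar A 2 3\n", builder);
        System.out.println(builder.result());
    }
}
```

Because the parser only talks to the builder interface, a parser for another dialect could reuse the same builder.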

3.2. GONVI Graphical Objects

GONVI distinguishes three types of graphical objects. Graphical primitive objects are the atomic parts of graphical or textual information; examples are straight line segments, curve segments, characters, filled areas or pixel-based bitmaps. Graphical composite objects consist of graphical primitive or other graphical composite objects and represent complex graphical descriptions which can be regarded as a greater unit; examples are rectangles, tables or text paragraphs. For application-specific contexts GONVI provides graphical application objects which consist of graphical primitive or graphical composite objects. Graphical application objects represent semantic units of an application's domain or the real world; examples are interaction objects (e.g. buttons, menus) or real world objects.
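The three object types map naturally onto a composite structure. The following Java sketch is one possible modelling, not the class design of GONVI itself: the class names, the `kind` field and the method `primitiveCount` are assumptions, and treating application objects as a subclass of composite objects is a simplification.

```java
import java.util.Arrays;
import java.util.List;

/** Common base type of all graphical objects (hypothetical modelling). */
abstract class GraphicalObject {
    final String kind;  // e.g. "line", "rectangle", "button"

    GraphicalObject(String kind) { this.kind = kind; }

    /** Number of atomic parts this object is ultimately made of. */
    abstract int primitiveCount();
}

/** Atomic part such as a line segment or a character. */
class GraphicalPrimitive extends GraphicalObject {
    GraphicalPrimitive(String kind) { super(kind); }

    int primitiveCount() { return 1; }
}

/** Larger unit built from primitives or other composites, e.g. a rectangle. */
class GraphicalComposite extends GraphicalObject {
    final List<GraphicalObject> parts;

    GraphicalComposite(String kind, GraphicalObject... parts) {
        super(kind);
        this.parts = Arrays.asList(parts);
    }

    int primitiveCount() {
        int count = 0;
        for (GraphicalObject part : parts) count += part.primitiveCount();
        return count;
    }
}

/** Semantic unit of an application domain, e.g. a button. */
class GraphicalApplicationObject extends GraphicalComposite {
    GraphicalApplicationObject(String kind, GraphicalObject... parts) {
        super(kind, parts);
    }
}

class ObjectModelDemo {
    public static void main(String[] args) {
        GraphicalObject frame = new GraphicalComposite("rectangle",
            new GraphicalPrimitive("line"), new GraphicalPrimitive("line"),
            new GraphicalPrimitive("line"), new GraphicalPrimitive("line"));
        GraphicalObject caption = new GraphicalComposite("text",
            new GraphicalPrimitive("character"), new GraphicalPrimitive("character"));
        GraphicalObject button = new GraphicalApplicationObject("button", frame, caption);
        System.out.println(button.primitiveCount());  // prints "6"
    }
}
```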

Parsing a Lingo description results in a set of (mostly) graphical primitive objects. In GONVI a graphical object detector (see Figure 2) creates higher level graphical objects (graphical composite and graphical application objects) on top of graphical primitive objects.

Figure 2: Internal components of the GONVI kernel.

3.3. Attaching Constraints by Rules

A constraint attacher (see Figure 2) checks the conditions for creating and attaching constraints (see section 2) between the graphical objects. Medium-specific constraints also take into account the complete user environment with its output and input devices. This process is driven by the constraint attaching rule base, which stores user-defined rule sets. These rule sets can be regarded as domain-specific and environment-specific interaction descriptions. For experimental purposes the user can inspect the automatically created constraints and attach additional constraints to individual graphical objects and/or environmental devices. The result of this phase is a set of constraints organised as a constraint hierarchy.
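A rule-driven attacher of this kind can be sketched in Java as follows. The sketch is deliberately reduced: objects are plain bounding boxes, rules look only at pairs of objects, and the single `leftOf` rule with its hierarchy level 1 is an invented example rather than one of GONVI's actual rules. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Bounding box of a graphical object (simplified stand-in). */
class Box {
    final String name;
    final int x, y, width, height;

    Box(String name, int x, int y, int width, int height) {
        this.name = name; this.x = x; this.y = y; this.width = width; this.height = height;
    }
}

/** A constraint between two objects together with its hierarchy level. */
class Constraint {
    final String kind;
    final int level;
    final Box first, second;

    Constraint(String kind, int level, Box first, Box second) {
        this.kind = kind; this.level = level; this.first = first; this.second = second;
    }

    @Override
    public String toString() {
        return kind + "(" + first.name + ", " + second.name + ") @level " + level;
    }
}

/** A rule checks a condition and, if it holds, creates a constraint. */
interface AttachingRule {
    Optional<Constraint> apply(Box a, Box b);
}

class ExampleRules {
    /** Invented rule: a ends before b starts, and both overlap vertically. */
    static final AttachingRule LEFT_OF = (a, b) -> {
        boolean leftOf = a.x + a.width <= b.x
            && a.y < b.y + b.height && b.y < a.y + a.height;
        if (!leftOf) return Optional.empty();
        return Optional.of(new Constraint("leftOf", 1, a, b));
    };
}

class ConstraintAttacher {
    /** Applies every rule of the rule base to every ordered pair of objects. */
    static List<Constraint> attach(List<Box> objects, List<AttachingRule> ruleBase) {
        List<Constraint> result = new ArrayList<>();
        for (Box a : objects) {
            for (Box b : objects) {
                if (a == b) continue;
                for (AttachingRule rule : ruleBase) rule.apply(a, b).ifPresent(result::add);
            }
        }
        return result;
    }
}

class AttacherDemo {
    public static void main(String[] args) {
        List<Box> objects = Arrays.asList(
            new Box("label", 0, 0, 20, 10), new Box("image", 30, 0, 40, 10));
        System.out.println(ConstraintAttacher.attach(objects, Arrays.asList(ExampleRules.LEFT_OF)));
        // prints "[leftOf(label, image) @level 1]"
    }
}
```

Keeping the rules as data, separate from the attacher, is what allows the rule base to be exchanged for a different domain or user environment.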

3.4. Constraint Solving

After the constraints have been created, a constraint solver (see Figure 2) has to satisfy them. Many different constraint solving techniques are described in the literature, e.g. in [Leler88] and [Saraswat/Van Hentenryck95]. In GONVI the constraint solver must be able to handle constraint hierarchies. Because it is to be applied to GUIs, it must also be incremental.

3.5. Non-Visual Representation of Graphical Objects

In the object-oriented design of GONVI, each type of graphical object can implement its own representation and interaction techniques. Thus a table can, for instance, display a cell's content on a dynamic tactile device while at the same time the corresponding row and column headings are given by speech output.
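The table example can be sketched in Java as follows. The interface `OutputDevice`, the recording stand-in device and the method `presentCell` are hypothetical; real devices would drive a pin matrix display and a speech synthesiser instead of filling a log.

```java
import java.util.ArrayList;
import java.util.List;

/** Anything that can present a piece of content to the user (assumed interface). */
interface OutputDevice {
    void present(String content);
}

/** Stand-in device that records what it was asked to present. */
class RecordingDevice implements OutputDevice {
    final List<String> log = new ArrayList<>();

    public void present(String content) { log.add(content); }
}

/** A table object that knows how to present one of its cells non-visually. */
class Table {
    private final String[] rowHeadings;
    private final String[] columnHeadings;
    private final String[][] cells;

    Table(String[] rowHeadings, String[] columnHeadings, String[][] cells) {
        this.rowHeadings = rowHeadings;
        this.columnHeadings = columnHeadings;
        this.cells = cells;
    }

    /** Cell content goes to the tactile device, its headings to speech output. */
    void presentCell(int row, int column, OutputDevice tactile, OutputDevice speech) {
        tactile.present(cells[row][column]);
        speech.present(rowHeadings[row] + ", " + columnHeadings[column]);
    }
}

class TableDemo {
    public static void main(String[] args) {
        Table table = new Table(
            new String[] {"Monday", "Tuesday"},
            new String[] {"Morning", "Afternoon"},
            new String[][] {{"Maths", "Music"}, {"Physics", "Sports"}});
        RecordingDevice tactile = new RecordingDevice();
        RecordingDevice speech = new RecordingDevice();

        table.presentCell(1, 0, tactile, speech);
        System.out.println(tactile.log + " " + speech.log);
        // prints "[Physics] [Tuesday, Morning]"
    }
}
```

Other object types would supply their own presentation method, so the choice of technique stays with the object type rather than with a central renderer.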

4. Prototypical Implementation of GONVI

In our prototypical implementation the description of graphical objects is taken from a description in PostScript format [Adobe88]. For non-visual interaction the following devices can be connected: tactile output devices (pin matrix display and/or tactile printer), speech and sound output.

Figure 3: System architecture of the GONVI prototype. The white components are direct parts of the prototypical implementation.

4.1. Lingo PostScript Filter

For the GONVI prototype, a PostScript program called LingoPSFilter implements a "plug-in" for any PostScript interpreter in order to generate a graphical description in the LingoPS format from a given PostScript description. LingoPSFilter must be run in the PostScript interpreter before the actual PostScript description is processed. It redefines some of the PostScript system operators and thus reproduces the "whole picture" in the LingoPS language, a special Lingo dialect used for PostScript information sources.

[Gunzenhäuser/Weber94] introduce three main approaches for accessing graphics and graphical user interfaces for blind users: the bottom-up, middle-out and top-down approach. The GONVI prototype with its PostScript information source implements a mixture of the bottom-up and the middle-out approach. The main reasons for using PostScript as graphical information source are:

The present prototype of GONVI is a very specific application, whereas the GONVI kernel is basically independent of any concrete graphical information source. It is therefore possible to implement the GONVI system so that it uses graphical information delivered by the Hypertext Markup Language (HTML) or the Java Abstract Window Toolkit (AWT), for instance.

4.2. The User Environment for the Prototype

The present user environment focuses on tactile output, since spatial relations can be expressed most adequately in this medium. Tactile output can be given on a tactile printer or on a dynamic pin matrix with an integrated pointing device; the latter allows real user interaction techniques such as zooming in and out and user-driven exploration of graphical properties.

Speech and sound output can be used as additional devices to support different interaction modes. The use of speech and sound is driven by the constraints attached to the graphical objects and therefore depends on the constraint attaching rule base (see section 3.3).

5. Conclusion

In a world in which information increasingly consists of graphical representations designed for visual perception, there is a great need to make this information accessible to non-sighted persons as well. GONVI describes and shows how graphical information can be transformed into non-visual representations without losing its internal consistency. The powerful concept of constraint programming also allows interaction techniques to be specified for a non-visual user environment.

6. References

[Adobe88] Adobe Systems Inc.: PostScript - Handbuch. Addison Wesley, Bonn, 1988.

[Burger/Stöger96] F. Burger, B. Stöger: Access to the WWW for Visually Impaired People using Browsing Tools based on GUI: State of the Art and Prospects. In: ICCHP'96 Proceedings, Oldenbourg, 1996, pp. 303-310.

[Gamma et al. 95] E. Gamma, R. Helm, R. Johnson, J. Vlissides: Design Patterns - Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

[Gunzenhäuser/Weber94] R. Gunzenhäuser, G. Weber: Graphical User Interfaces for Blind People. In: K. Brunnstein, E. Raubold (Eds.): 13th World Computer Congress 94, Vol. 2. Elsevier Science B. V., Holland, 1994, pp. 450-457.

[Laufenberg/Lötzsch95] W. Laufenberg, J. Lötzsch: Tastbare Abbildungen für Blinde - Thesen zu Bedarf, Entwurf, Fertigung, Zugriff. In: W. Laufenberg, J. Lötzsch (Eds.): Taktile Medien - Kolloquium über tastbare Abbildungen für Blinde, Tagungsband, Deutsche Blindenstudienanstalt e.V., Blinden- und Sehbehinderten-Verband Sachsen e.V., Nov. 1995, pp. 11-22.

[Leler88] W. Leler: Constraint Programming Languages - Their Specification and Generation. Addison-Wesley, 1988.

[Lötzsch94] J. Lötzsch: Computer-aided Access to Tactile Graphics for the Blind. In: ICCHP'94 Proceedings, Springer, 1994, pp. 575-581.

[Saraswat/Van Hentenryck95] V. Saraswat, P. Van Hentenryck (Eds.): Principles and Practice of Constraint Programming. MIT Press, 1995.

[Wisskirchen90] P. Wisskirchen: Object-Oriented Graphics - From GKS and PHIGS to Object-Oriented Systems. Springer, 1990.

7. Acknowledgement

I wish to thank Prof. Dr. Gunzenhäuser and many members of his research group for help and inspiring discussions.