The main goal of research at the institute is to improve human-machine interaction by developing new pattern-recognition-based input modalities and new intelligent dialog strategies, together with the advanced interface concepts they entail. Since there is no experience of how people use and accept these new technologies, usability studies have to be carried out from an early stage of system development. The institute has therefore recently set up a special laboratory for studying and evaluating new interaction methods and techniques.
To carry out such investigations, it is usually necessary to observe and record the behavior of experimental subjects. Of interest are both goal-oriented actions taken to solve a problem and spontaneous actions and reactions to external stimuli that occur while handling technical systems. For later analysis, the actions of the experimental subject can be recorded with video and audio equipment and processed by computer.
The usability lab consists of three separate rooms (see fig. 1, left):
Observation room: It is equipped with several remote-controlled video cameras and microphones for observation and recording, and with loudspeakers to reproduce any desired acoustical environment. In this room the subject works with the experimental setup.
Control room: It is provided with video and audio mixers, computers, and various studio equipment (see fig. 1, right). The subject is additionally monitored through a one-way mirror. From this room a scientist controls the whole experiment.
This usability lab is used to test and optimize new approaches to multimodal man-machine interfaces (MMIs) in the environment of a car. It is called the "navigation lab" since its primary use has been to investigate the MMIs of automotive navigation systems.
Because the environment strongly influences the test person, MMIs have to be tested under realistic conditions. It obviously makes little sense to evaluate the usability of a car phone - which actually has to be operated while driving - in a desktop scenario. Unfortunately, it is difficult to integrate the required technology into the boot of a car, and some test scenarios might even lead to dangerous situations in real traffic. The navigation lab was therefore built to replicate real conditions. However, real conditions affect not only the test person but also the technology used.
Image processing suffers from changing illumination and backgrounds - conditions inside a car are particularly harsh in this respect. Speech recognition is affected by driving noise and by additional passengers in the car. Drivers may change, but the system still has to function correctly. These conditions call for robust algorithms and statistical methods for user and situation modelling. Naturally, operating the MMI is not the driver's primary task, so a number of precautions (e.g., adaptation to user and situation, dialog structure, user guidance) have to be taken to support or even assist the driver.
Of course, not all of these technologies have to be implemented before the usability of an MMI can be tested. We want to know exactly how the whole system should function before wasting time on technologies that do not satisfy the user's needs.
The real car environment is simulated in a car equipped with force-feedback steering, automatic transmission, hand brake, pedals, knobs, car audio, etc. (see fig. 2, right).
An LCD monitor with a haptic input device displays the user interface. Several cameras and microphones are installed inside the car to observe the test person's activities throughout a usability test. The driving simulation is projected (5 m projection diagonal) by an LCD projector onto the wall in front of the car (see fig. 3, right). Four speakers ensure that acoustical events can be positioned in any direction.
A control room (see fig. 3, left), which can be fully separated (acoustically and visually) from the test scenario, is also located in the navigation lab. It is equipped with several monitors, video recorders, a video mixer, a scan converter, audio compressors, an audio mixer, audio amplifiers for car audio, monitoring, and driving simulation, an MMI computer (high-speed PC with two monitors), and the driving-simulation computer (high-speed PC with a fast graphics engine). This equipment allows maximum flexibility for all conceivable setups and scenarios.
Further computers are planned, e.g. for gesture recognition, adaptive components, emotion estimation, multimodal dialog concepts, and speech recognition (cf. sections 4.1-4.5); they can easily be integrated into the setup. The navigation lab is connected to the institute's data processing equipment through a 100 Mb/s LAN, which makes practically any desired computing power available.
The computer vision lab was established at the Institute for Human-Machine Communication at the beginning of 2000. In this lab, both analytical investigations in digital image processing and work on image synthesis are pursued.
Currently, the technical equipment of the lab consists of two Silicon Graphics workstations (Indigo2 Impact and Indy with special grabbing and video compression devices), three high-performance PCs (with standard frame grabber cards), and several high-quality video cameras. The laboratory is also equipped with a video cassette recorder, a studio TV monitor, and an audio mixing device for basic usability studies. To avoid interference with picture sampling, flicker-free high-frequency lighting was installed.
On the image processing side, the following steps are usually carried out to digitize images. The scene is recorded by one or more cameras. The analog signal of the camera(s) is fed to the A/D converter of the computer, a so-called frame grabber card. The sequence of digitized pictures, i.e. the video frames, can be viewed on the monitor, and single frames or sequences of frames can optionally be saved in a computer-readable file format (e.g. common image formats such as JPEG and PNG, or video formats such as AVI and MPEG). For further evaluation the frames can be improved by image preprocessing (e.g. segmentation or texture analysis). By overlaying a rectangular pattern on the image, it is subdivided into single sections; this process is called screening. From these sections, special (densitometric, geometrical, etc.) features can be extracted, and classification algorithms can then be applied to the individual features. Motion can be detected by comparing the frames of a sequence, which yields a set of translation vectors that can be rated.
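The screening and motion-detection steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the institute's implementation: the function names, the block size, and the exhaustive block-matching search are our own choices.

```python
import numpy as np

def screen(frame, block=8):
    """'Screening': subdivide a grayscale frame into non-overlapping
    block x block sections by overlaying a rectangular pattern."""
    h, w = frame.shape
    h, w = h - h % block, w - w % block  # crop to a whole number of blocks
    return frame[:h, :w].reshape(h // block, block, w // block, block).swapaxes(1, 2)

def block_motion(prev, curr, block=8, radius=2):
    """Estimate one translation vector per block by comparing two frames:
    exhaustive block matching with the sum of absolute differences (SAD)
    within +/- radius pixels."""
    vectors = {}
    h, w = prev.shape
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ref = prev[y:y + block, x:x + block].astype(int)
            best_sad, best_v = None, (0, 0)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue  # candidate window would leave the frame
                    cand = curr[yy:yy + block, xx:xx + block].astype(int)
                    sad = np.abs(ref - cand).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best_v = sad, (dy, dx)
            vectors[(y, x)] = best_v  # translation vector for this block
    return vectors
```

For example, a bright patch shifted two pixels to the right between two frames is reported as the translation vector (0, 2) for the block containing it.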
On the image synthesis side, research is done on virtual 3D scenarios modelled with the Virtual Reality Modeling Language (VRML). One current endeavor in this field is to build a prototypical virtual model of several facilities of the institute. In this context, a system is being developed and tested that enables the user to navigate through the virtual room by natural speech utterances. Moreover, with a view to multimodal operation, both conventional haptic interfaces (keyboard and mouse) and a dynamic gesture-recognition module have been integrated.
Many physical-technical, electroacoustical, and psychoacoustical measurements rely on environments with defined room-acoustical parameters. An anechoic chamber and a reverberation chamber, both available at the Institute for Human-Machine Communication, therefore form the standard equipment of its acoustical measurement instrumentation.
Although the two rooms provide opposite acoustic environments, both need to fulfill the following two requirements: noise, even infrasound, must have a very low level inside the chambers, and noise transmitted from the chambers to the control room must not annoy or even endanger the operators. A combination of room-acoustical and structural-acoustical measures, such as the costly room-in-room construction, ensures that these requirements are met.
The anechoic chamber should support the generation of a free, undisturbed sound field, so the sound intensity reflected from the walls should be minimal. For this reason all walls, the ceiling, and the floor are covered with an absorbing material. Side-by-side nested wedges of mineral fibre achieve a high absorption coefficient over a wide frequency range through a continuous transition of the sound wave from the air into the absorber. The lower cutoff frequency of an anechoic chamber is defined as the lowest frequency at which the absorption coefficient under normal incidence is at least 0.99. For the arrangement of absorption wedges used in this chamber, the lower cutoff frequency is about 125 Hz; the corresponding wavelength is four times the length of the absorption wedges. The usable volume of the room is L × W × H = 7.5 m × 4.2 m × 2.8 m = 88.2 m³. Extensive measurements were carried out to optimize loudspeaker and microphone positions with respect to standing waves at frequencies below the cutoff frequency of the chamber.
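The stated relation between cutoff frequency and wedge length is the quarter-wavelength condition. A quick calculation reproduces the numbers; the speed of sound (343 m/s, roughly its value in air at 20 °C) is an assumption, not taken from the text:

```python
# Quarter-wavelength condition for the absorption wedges.
# c = 343 m/s is an assumed speed of sound (air, ~20 degrees C).
c = 343.0            # speed of sound in air [m/s]
f_cutoff = 125.0     # lower cutoff frequency of the chamber [Hz]

wavelength = c / f_cutoff        # wavelength at the cutoff frequency [m]
wedge_length = wavelength / 4.0  # wedge length = one quarter wavelength [m]
```

This gives a wavelength of about 2.74 m at 125 Hz, i.e. wedges roughly 0.69 m long.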
Sound fields in the reverberation chamber should be diffuse ("statistical"), i.e. the temporal mean of the sound intensity should be equal in all directions, at every position in the room, and at all measurement frequencies. The walls are therefore built non-parallel, and all surfaces are covered with sound-reflective material. In addition, Perspex reflectors are installed as sound diffusors. The reverberation time of the room at low frequencies is longer than 10 s and can be reduced for experiments by mountable plate absorbers. Mounting bars allow the fast attachment of different materials whose acoustical properties are to be investigated. The size of the reverberation chamber is L × W × H = 5.5 m × 4.9 m × 3.9 m = 106 m³.
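Sabine's reverberation formula, though not mentioned in the text, is the standard relation here; it shows why a reverberation time above 10 s implies a very reflective room and how mounting plate absorbers shortens it. The 5 m² of added absorption below is purely an illustrative value:

```python
def sabine_rt60(volume_m3, absorption_area_m2):
    """Sabine's formula: RT60 = 0.161 * V / A, with the volume V in m^3
    and A the equivalent absorption area in m^2."""
    return 0.161 * volume_m3 / absorption_area_m2

V = 106.0  # reverberation chamber volume from the text [m^3]

# A reverberation time of 10 s corresponds to a tiny equivalent
# absorption area for the whole room:
A_bare = 0.161 * V / 10.0  # ~1.7 m^2

# Mountable plate absorbers add absorption area (5 m^2 is an assumed,
# illustrative figure) and thereby reduce the reverberation time:
rt_damped = sabine_rt60(V, A_bare + 5.0)
```

With these numbers the bare room has an equivalent absorption area of only about 1.7 m², and the added absorbers bring the reverberation time down to a few seconds.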
In a control room adjacent to both chambers, the experimenter has systems to generate, analyze, control, and document measurements and experiments. In addition to various electroacoustical transducers, systems for generating and analyzing measurement signals and procedures are available.
© Lehrstuhl für Mensch-Maschine-Kommunikation, Feb. 2001