Needs Assessment and Evaluation of a Digital Environmental Library:
the Berkeley Experience

Nancy Van House
nav-lis@cmsa.berkeley.edu
Mark Butler
Lisa Schiff
School of Information Management and Systems
University of California, Berkeley
Submitted to
DL96: the First ACM International Conference
on Digital Libraries
Bethesda, MD, March 20-23, 1996
October 17, 1995

ABSTRACT

The UC Berkeley Electronic Environmental Library Project is a multi-disciplinary project funded under the NSF/NASA/ARPA Digital Libraries Initiative. The goals of the user needs assessment and evaluation component of the project are to improve the understanding of the use of information for complex cognitive work; to develop conceptual bases and methods for evaluating digital libraries; and to support the development of the UC Berkeley Digital Environmental Library by providing the other components of the project with an understanding of the users, their context, and their work; to advise on content, functionality, and interface design; and to provide feedback on designs and prototypes by means of heuristic evaluation, small-scale user feedback, and larger-scale user testing. This paper puts the evaluation and needs assessment in the context of research on sense-making, and presents some key findings approximately one-third of the way through the four-year project.

KEY WORDS: Evaluation; needs assessment; usability; environmental planning; sense-making.

INTRODUCTION

The UC Berkeley Electronic Environmental Library Project is a multi-disciplinary project funded under the NSF/NASA/ARPA Digital Libraries Initiative. The goal is to develop a massive, distributed, electronic, work-centered library of environmental information containing text, images, maps, numeric datasets, and hypertextual multimedia composite documents. The library is intended to support actual environmental planning decisions by means of a coherent, content-based view of a diverse distributed collection that will scale to very large collections and large numbers of clients and servers, and by means of improved data acquisition technology.

By work-centered, we mean designed to support the work of the users, in this case, environmental planning. User-based, iterative system design is central to our process.

Our research is focussing on:

The testbed consists of diverse information types that are used in environmental planning. The content is primarily public-domain information generated by government agencies. The initial focus has been water planning, which is critical in California and which encompasses a wide range of participants, geographical areas, and research areas. The target users are the participants in environmental planning, which includes employees of state, federal, and local agencies, policymakers, environmental and industry groups, and the public.

The project is proceeding by means of an iterative design process with substantial attention to the needs of users in the areas of content, resource discovery and retrieval methods, document analysis, interface design, and browsing. We are currently in the second year of a four-year project [11].

USER NEEDS ASSESSMENT AND EVALUATION

The goals of the user needs assessment and evaluation component of the project are to improve our understanding of the use of information for complex cognitive work; to develop conceptual bases and methods for evaluating digital libraries; and to support the development of the UC Berkeley Digital Environmental Library by providing the other components of the project with an understanding of the users, their context, and their work; to advise on content, functionality, and interface design, at the design stage, based on our knowledge of users and uses; and to provide feedback on designs and prototypes, by means of heuristic evaluation, small-scale user feedback, and larger-scale user testing.

The fundamental premises of the user needs assessment and evaluation efforts are that design should be user-centered, situated, iterative, and adaptive. Ours is a grounded theory approach, rooted in observation, conceptualization, testing, and refinement. We are using a variety of qualitative methods. We have conducted extensive interviews, learning about people's present work processes and tools and their perceived needs for a digital library. We are experimenting with a variation on thinking-aloud protocols that consists of videotaping users interacting with the system and then asking them to review and comment on their tapes, so that the thinking-aloud process is retrospective. This minimizes the disruption of the users' work while giving us the benefit of their sense-making process.

SENSE-MAKING AND DIGITAL LIBRARIES

The emerging model of information needs and uses assumes that information needs arise and information is used in the context of particular, concrete circumstances: the user's own history, the environment, and the task and situation at hand. In other words, information needs and uses are situated [5].

The process by which individuals give form and meaning to their situations is termed sense-making. Three research literatures address the process of sense-making. Cognitive psychology is concerned with individuals' knowledge representation and processing. The communications/information studies literature is concerned with the role of individual sense-making in information needs and uses [4]. The organizational literature is concerned with how organizations make collective sense, which facilitates information processing and decision-making, and with how they instill this sense in their members [12]. The relationship between individual and organizational sense-making is a topic of lively debate within the organizational research community. Groups -- organizations, professions, social groups, whatever kinds of groups -- share their interpretations and cognitive space, that is, how they map their world, their situation, and information resources -- the problem environment and domain and the information universe.

Sense-making is socially constructed [3; 4]: the understandings that people derive or the knowledge structures that they impose are strongly determined by their social interactions. The individual's own history interacts with his or her "communities of practice" [9], such as the organizational and professional groups to which the individual belongs. How professions define problems and their solutions, in other words, how they make sense of them, is one of the defining differences across professions [1]. The conceptualization of the information space differs across professions and disciplines.

INFORMATION TOOLS

People have always used information tools -- paper and pen are among the earliest and are still used. The rapid development of computing and telecommunications, however, has made possible an unprecedented range and variety of information tools.

Information tools are themselves both products and determinants of sense-making. Tools are not neutral. "Every human tool relies upon, and reifies, some underlying conception of the activity that it is designed to support" [10, p. 3]. Information tools rely upon and reify a conception, not only of how people do information work (such as searching for information), but of how they do their work, how they map their cognitive domain, and how they make sense of their situation. Indexing is a form of sense-making in which the developer of the system has determined what it means for two information objects, such as books or journal articles, to be "about" the same topic. Abstracting and indexing services that serve different disciplines, for example, will index the same publication differently, and will have thesauri that differ in how they relate subjects to one another.

In an information system, the design consists of the choice of content; the arrangement of that content; the search, analysis, synthesis, and processing functions performed by or supported by the system; the interface; and the representation of the information. Changing the information tools available can change users' goals (for their work, or for their information tasks), cognition (how they understand their task, their context, and the data), behavior (their work, and their information seeking and use), and their relationships and interactions with their environment. Information systems can alter the relationships between users and tools, and the techniques and systems for data interpretation [9]. In other words, altering the tool can alter the cognitive work.

The understandings that get reified into information systems are those of the designers: their understandings about how users do, or should do, their work, of the information that they use to do their work, and of how they use information. The thrust of user-centered systems design has been to shift the locus of decision-making from the designers to the users of the technology themselves, on the premise that the user is the expert on his or her work [7]. "No one understands human behavior enough to predict how a new computer system will be used."

The hazard of being too responsive to users, however, is that of casting existing practices in concrete, practices that may be based on adaptation to superseded technology. Ideally, the relationship between an information system and its users is mutually and iteratively adaptive: as new tools are developed, users find new ways to do their work, and to use the tools, and identify ways in which the tools can be improved.

Once a system is designed and built, users and designers will still have differing mental models of the system. Norman [8] describes a system as an abstract entity that is a result of mediation between users and designers. The system as perceived and used is not the same as the system the designers designed. Rather, the information system that users "use" is constituted by their actions.

Evaluation is based on how well an information system meets its goals, which are generally to help users to better do their work. Evaluation, then, is based on yet a third conceptual model, that of the evaluators: their understanding of the goals of the system and of users' needs and actions. The choice of evaluation methods and criteria is based on the evaluators' prior assumptions. Evaluation itself is a form of sense-making. Evaluation itself is also situated. One definition of usability is a system's "capability in human functional terms to be used easily and effectively by the specified range of users, given specified training and support, to fulfil a specified range of tasks, within the specified range of environmental scenarios" (Shackel quoted in [6]).

UNIQUE FEATURES OF THE BERKELEY DIGITAL LIBRARY

Several features of this project are particularly notable in the overall context of digital libraries and in considering the role of sense-making in digital library design, use, and evaluation. First, we are addressing several levels of users' work simultaneously. To understand people's use of a digital library, we need to understand first the overall context of their task, in this case, environmental planning in California; then the individual user's work (e.g. water engineering) and tasks (e.g. designing a specific engineering project); his or her information acts (including information searching, analyzing, repackaging); and finally his or her digital library use. That is, the context within which digital library use is situated -- in this case, the work to be supported and its environment -- is critical to understanding the user's needs and actions, designing a useful information system, and evaluating the system.

A second critical feature of this project is that we are working with actual users doing real work. The content is documents, images, and data directly relevant to users' work. The searching and analytical functions can be revised in response to our observation of users. Laboratory studies, experimental search tasks, short-term use, and other such methods of studying digital libraries lack ecological validity.

A third significant feature is that our users are an extremely varied group, ranging from scientists and engineers to public information specialists and the public. We are studying multiple, overlapping, and sometimes antagonistic communities of practice united by their common concern for an issue area but differing on several important dimensions, including scientific expertise, familiarity with environmental planning, the frequency with which they use the Digital Library, and their access to and skills with information technology. A key question addressed by this project is, Can we design an effective digital library to serve multiple communities of practice, some of which are undefined and unstructured? What adaptations may we need to make to different communities of practice?

Another critical feature of this project is that ours is a developing, research-based system. That it is research-based means that, unlike most commercial products, we are at the leading edge of design in several areas, notably document analysis, natural language processing for automatic indexing, image analysis, and database technology. The research team is continually developing new tools and approaches; for example, we have recently implemented what we are calling Advance Structured Documents, which take published tables and turn them into SQL-searchable databases. By this method we can retrospectively convert state Department of Water Resources publications that contain critical data in a format that is currently of limited flexibility. Combining a research focus with real users performing real tasks gives us a unique ability to test innovative approaches against user needs and preferences.
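The idea of turning a published table into an SQL-searchable database can be illustrated with a minimal sketch. The project's actual implementation is not described here, so everything below is an assumption made for illustration: the use of SQLite, the function names load_table and stations_below, the column names, and the row values, which are placeholders rather than real Department of Water Resources data.

```python
import sqlite3

# Hypothetical rows, as they might be transcribed from a printed table.
# Station names and values are illustrative placeholders, not real data.
PUBLISHED_TABLE = [
    ("station", "water_year", "runoff_taf"),
    ("Station A", 1993, 120.5),
    ("Station A", 1994, 64.0),
    ("Station B", 1993, 310.2),
    ("Station B", 1994, 150.8),
]


def load_table(rows, table_name="published_table"):
    """Load a header row plus data rows into an in-memory SQLite table."""
    header, *data = rows
    conn = sqlite3.connect(":memory:")
    # Column names come from the table header; no column types are declared,
    # so SQLite stores each value with the type it was given.
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})', data
    )
    conn.commit()
    return conn


def stations_below(conn, threshold):
    """Return (station, water_year) pairs whose runoff is below threshold."""
    cursor = conn.execute(
        "SELECT station, water_year FROM published_table "
        "WHERE runoff_taf < ? ORDER BY station, water_year",
        (threshold,),
    )
    return cursor.fetchall()


conn = load_table(PUBLISHED_TABLE)
print(stations_below(conn, 200))
```

Once the table is in a database, a question that would require scanning the printed page by eye becomes a single query that users can vary at will.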

That ours is a developing system means that it can be adapted to users' reactions to our experimental tools. Blomberg et al. [2] discuss the importance of what they call case-based prototypes, addressing the work of particular practitioners and reflecting the researchers' understanding of their work. They describe the importance of cycling among studies of work, co-design, and user experience with mock-ups or prototypes as a way of bringing an understanding of specific work practices into design as a basis for innovative design and better integrated technologies. This is the second year of a projected four-year project, which gives us substantial opportunity for iterative design based on user needs and feedback.

CURRENT FINDINGS

At this stage in the project, approximately a third of the way through, some preliminary findings can be reported on the use of information for environmental planning, and on users' needs and preferences for a digital environmental library.

Information for Environmental Planning

Information use in a politicized environment: Environmental planning is highly political. It combines rational/analytical activity, such as hydrological modeling and water supply projections, with political decision-making, such as balancing environmental and agricultural interests. In such an environment, information plays many and varied roles. A key need in such a context is the availability of information for inspection, critique, and re-analysis. An information system can make the chain of raw data, analyses, assumptions, projections, recommendations, and plans readily available to interested parties. This kind of public review is critical to building consensus around environmental decisions.

Also, in a political environment, issues change rapidly and so do information needs. When a species is proposed for endangered listing, for example, that sets into motion a wide range of actions that suddenly bring into new prominence a geographical area, an ecological system, and a set of environmental issues, and changes the information needed.

Another key characteristic of a political environment is the importance of public image and public use of the system. Although the system may be designed primarily for expert planners, public use is of political value and to be encouraged.

Time and the meaning of currency: In environmental planning, the time value of information is variable. Averting environmental damage often requires responsive management based on fast access to current data. But the most relevant data are not always the most current. Planning is highly contextual. Data on situations "like this one" are needed -- with similarity defined differently for different contexts. For example, California goes through cycles of wet and dry years. After many years of drought, 1995 was the wettest year on record. Information was needed on previous flood years, not on the most immediate previous years.
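The difference between retrieving the most current data and retrieving data on situations "like this one" can be sketched in a few lines. This is only an illustration under stated assumptions: the function name most_similar_years, the single numeric index used to define similarity, and the values in HISTORY are all hypothetical placeholders, not measurements; in practice similarity would be defined differently for different planning contexts.

```python
def most_similar_years(history, target_value, k=3):
    """Return the k years whose index value is closest to target_value.

    Years are ranked by similarity to the target, not by recency;
    on ties, the more recent year is listed first.
    """
    ranked = sorted(
        history.items(),
        key=lambda item: (abs(item[1] - target_value), -item[0]),
    )
    return [year for year, _ in ranked[:k]]


# Hypothetical wetness index by year (placeholder values, not real data).
HISTORY = {
    1983: 190,
    1986: 160,
    1990: 60,
    1991: 70,
    1992: 65,
    1993: 130,
    1994: 55,
}

# For a very wet current year (index 185), the closest analogues are
# earlier wet years rather than the immediately preceding dry ones.
print(most_similar_years(HISTORY, target_value=185, k=2))
```

A recency-ordered search over the same placeholder values would return 1994 and 1993 first, which is exactly the mismatch described above.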

Because trends are important for understanding patterns and making projections, long time series of data are extremely valuable. A dataset's value depends in part on the length of time covered. Earlier data are likely to be on poor-quality paper in someone's files, but this legacy data adds considerable value to the current data. Tracking down and converting such data becomes a high priority.

Virtual organizations: Currently in California, and probably elsewhere, environmental projects are less likely to be large, centralized, and based on mandate or coercion than to be small, cooperative, coordinated, and based on incentives such as grants. Federal and state policy is often implemented by local water agencies, individuals, and businesses. The result is that policy is carried out by way of incentives. Getting people to respond requires that they have information: about the incentives (e.g. grant programs) and about the actions encouraged by incentives (e.g. how farmers can preserve wetlands).

Much environmental work is done by cross-agency task forces and/or agencies working with and through consultants: in effect, virtual organizations. Information needs to be widely distributed to be used across different organizations and to play a role in coordinating activities.

Desired Uses of the Digital Library

We have identified a number of potential uses of the Environmental Digital Library from our study of users' work and their reactions to the present system.

Locating information: People need information, not necessarily publications; but much environmental information is contained in diverse, complex, sometimes ephemeral documents. Environmental documents are multifaceted. An Environmental Impact Statement, for example, may cover the biology, geology, transportation and utility infrastructures, and history of an area, just to name a few topics. Similarly, a species of fish will spend different parts of its life-cycle in different places, and may merit a short treatment in each of many documents covering many different geographical areas. The ability to delve into a large trove of documents and retrieve the segments that substantively address a single topic is critical.

Much valuable information exists in local government documents, in limited distribution reports, in closely-held datasets, and in other elusive sources. Changing issues can make old and rare documents suddenly important. To restore a wetland to an earlier state, for example, requires information about that earlier state. Identifying and locating such documents has in the past been extremely difficult. A centralized digital repository that provides searching and retrieval, however, makes those documents suddenly accessible.

A related issue is the identification of the holders of and experts on a wide range of information. Using a statistical dataset appropriately requires a comprehensive understanding of the data. Often this can only be gained by talking with the person responsible for the dataset.

Analyzing data: The Digital Environmental Library contains statistical datasets. Raw data is of limited use. It needs to be analyzed and presented, often visually. Some users, for example, would like to be able to animate complex time-series data. The relationship between the digital library and associated analytical tools raises questions about whether common analyses can be identified, and/or the analytical tools made easy enough to use.

Disseminating, publishing, and re-using information: With digital information, the lines between information use and production become blurred. Many government agencies are mandated to make information publicly available. This task is often burdensome, and agencies look to the digital library to help. This task also sometimes generates revenues, which agencies wish to preserve. And, in a politicized situation, public relations are critical.

Users want information of many kinds in many formats to be easily captured for re-use: for reports, presentations, web pages, newsletters, pamphlets, CD-ROMs, and so on. Generally users express a strong desire to make information collection, analysis, and publication a single, seamless process. This is difficult, however, when dealing with the varieties of agencies, contractors, and technologies.

Governmental and nonprofit organizations are aware of how the distribution of their information via a digital library can affect their image. They are concerned that they project a professional image appropriate to a public agency.

We have also found complex social interactions around the dissemination of complex datasets such as monitoring data, which can be of varying quality and require normalization and adjustment. Those responsible for the data have legitimate concerns about its understanding and appropriate use. The result is differing degrees of gatekeeping on the data. Some data keepers want people to have to ask for the data because it can then be handed over with caveats. Others make the data freely available but also make advisors readily accessible.

Technology Tools and Skills: Among potential users, we have found what we in a technology-intensive research setting would regard as low to moderate levels of technology in many of the units we have studied, often with slow telecommunications that make WWW access problematic. For organizations to adopt technology requires that they have incentive, resources, and technical support; for many the last is the critical missing factor. We have also found that the public is rapidly gaining access to the WWW. The rate at which organizations can be expected to change is debatable.

USER-CENTERED DESIGN

In performing user-centered design in a research environment, we are confronted with a problem common to innovation in work practices in other settings: how do we design a truly innovative system in concert with users who are attached to current practices? How do we bring together the potential directions identified by the research team with the needs of users? In other words, how do we find the intersection between what users need and what the designers can possibly design? Users are best able to evaluate working prototypes, that is, systems with which they can interact in an ecologically valid way. Yet the development of prototypes requires considerable investment by the design team. At this point we have no resolution to this dilemma, but it is one that is particularly salient in this research environment.

CONCLUSIONS

The UC Berkeley Digital Library of the California Environment offers a unique opportunity to engage in adaptive, iterative, research-based development of a digital library. Its use by working environmental planners is a key feature in supporting user-based, work-centered design of a digital library to support complex cognitive work. To understand digital libraries and to learn how to evaluate them requires investigation of working systems in real environments. Digital libraries are evolving into something similar to but different from their predecessor libraries and work support systems; by observing (and shaping) this evolution, research can contribute to better design of work-centered digital libraries.

ACKNOWLEDGMENTS

Partial support for this work has been provided by the NSF/NASA/ARPA Digital Libraries Initiative. We would like to thank our colleagues, in particular Robert Wilensky, Principal Investigator; Robert Twiss and Howard Foster of the UC Berkeley REGIS Project; Gary Darling of the California Resources Agency; and Ray MacDowell of the Department of Water Resources. Our discussions with Lucy Suchman, Jeanette Blomberg, Randy Trigg, and David Levy of Xerox PARC have also been useful.

REFERENCES

1. Abbott, A. (1988). The System of Professions: an Essay on the Division of Expert Labor. Chicago: University of Chicago Press.

2. Blomberg, J., Suchman, L., and Trigg, R. (1994). Reflections on a work-oriented design project. Proceedings of the Participatory Design Conference (PDC 94), Chapel Hill, NC, Oct. 27-28, 1994.

3. Bruner, J. (1990). Acts of Meaning. Cambridge, MA: Harvard University Press.

4. Dervin, B. (1983). An Overview of Sense-Making Research: Concepts, Methods, and Results to Date. Paper presented at the International Communication Association annual meeting, Dallas.

5. Dervin, B., and Nilan, M. (1986). Information needs and uses. Annual Review of Information Science and Technology 21, 3-33.

6. Dillon, A. (1994). Designing Usable Electronic Text: Ergonomic Aspects of Human Information Usage. Bristol, PA: Taylor and Francis, Inc.

7. Greenbaum, J., and Kyng, M. (1991). Design at Work: Cooperative Design of Computer Systems. Hillsdale, NJ: Lawrence Erlbaum Associates.

8. Norman, D. A. (1986). Cognitive engineering. In Norman, D. A., and Draper, S. W., eds. User Centered System Design: New Perspectives on Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates, p. 31-62.

9. Ruhleder, K. (1994). Rich and lean representations of information for knowledge work: the role of computing packages in the work of classical scholars. ACM Transactions on Information Systems 12:2 (April) p. 208-30.

10. Suchman, L. A. (1987). Plans and Situated Actions: the Problem of Human-Machine Communication. Cambridge: Cambridge University Press.

11. Van House, N. (1995). User Needs Assessment and Evaluation for the UC Berkeley Electronic Environmental Library Project. Digital Libraries '95: The Second International Conference on the Theory and Practice of Digital Libraries, June 11-13, 1995, Austin, TX.

12. Walsh, J. P. (1995). Managerial and organizational cognition: notes from a trip down memory lane. Organization Science 6:3 (May) p. 280-321.