Clifford Lynch on information creation, management and organization in the digital environment.
Clifford Lynch has been the Director of the Coalition for Networked Information (CNI) since July 1997. Prior to joining CNI, Lynch spent 18 years at the University of California Office of the President, the last 10 as Director of Library Automation. Lynch, who holds a Ph.D. in Computer Science from the University of California, Berkeley, is an adjunct professor at Berkeley's School of Information Management and Systems. He is a past president of the American Society for Information Science and a fellow of the American Association for the Advancement of Science and the National Information Standards Organization. Lynch currently serves on the Internet 2 Applications Council and the National Digital Preservation Strategy Advisory Board of the Library of Congress; he was a member of the National Research Council committees that published The Digital Dilemma: Intellectual Property in the Information Infrastructure and Broadband: Bringing Home the Bits, and now serves on the NRC's committee on digital archiving and the National Archives and Records Administration.
UBIQUITY: Tell us about CNI -- the Coalition for Networked Information.
LYNCH: The Coalition was established in 1990 as a joint project of the Association of Research Libraries (ARL), CAUSE, and Educom. ARL represents about 120 of the major research libraries in North America. Educom and CAUSE, which have since merged to become a joint organization called EDUCAUSE, were both organizations concerned with the use of information technology in higher education. My good friend, the late Paul Evan Peters, was the founding director.
UBIQUITY: And what is CNI's membership?
LYNCH: Membership is organizational. It's predominantly US colleges and universities. We also include among our membership government agencies, public libraries, publishers, information technology and network service providers, and scholarly and professional societies. We include a number of major universities and government entities from Canada and a handful from Europe. We have about 200 members, and the Coalition is entirely supported by member dues.
UBIQUITY: What did CNI bring to the table that ARL, Educom and CAUSE couldn't have provided by themselves at the time? Why was it established?
LYNCH: Cast your mind back to the early 1990s and remember the Internet at that time. The Internet reached most of the major universities in the US, but if you looked at who was using it, at what was there to advance research and education, and at the extent to which the Net was used to communicate information to the public, it was a pretty dismal picture. Of course, computer scientists were exchanging e-mail and doing the things that they had done since the days of the Arpanet and Bitnet. There was a happy community of computational scientists using the national supercomputer centers and doing visualization. Growing pockets of humanists were discovering e-mail and having a transformative experience as their interactions with colleagues worldwide changed radically. But the Net was still way out on the margins for teaching and scholarship.
UBIQUITY: Specifically, what was missing from the picture?
LYNCH: In particular, you wouldn't find any course material to speak of on the Net. You wouldn't find library collections, scholarly journals or government information -- much less any of these in the transformed states that advanced information technology might make possible. ARL, CAUSE and Educom sensed that a concerted effort needed to be launched to start addressing these issues. One of the reasons those three organizations came together was a recognition that it was going to require both IT expertise and also content expertise, as represented by the library as stewards, acquirers and managers of content.
UBIQUITY: What types of people attend CNI meetings?
LYNCH: Our member institutions typically identify two representatives to the Coalition: a library leader (like the University librarian) and an IT leader (like the CIO), which reflects that collaboration. Since 1990, however, the scope of the collaborations needed has grown. Along with librarians and information technologists, we now also see instructional technologists, faculty, publishers, records managers, policy makers, and others coming into the mix. It's a very rich dialog.
UBIQUITY: Do you think that CNI was a major force for changing the Net into what it is today?
LYNCH: I would dearly love to say that we made all the difference, and that if you look at the Net today versus the Net in 1990, that's all our doing. In honesty I can't say that, but I believe that the Coalition has made valuable and substantial contributions to moving us forward. We have, I believe, been leaders in sorting out a lot of things: the transition of journals and monographs to the digital world, some of the early issues around online advertising, authentication and authorization issues, the digitization of special collections, the convergence of museums, archives, and libraries as cultural memory organizations in the digital world, distributed search, metadata harvesting, the preservation of digital content, records management in the digital environment, and the interconnections between learning management systems and libraries. These are a few of the examples of major initiatives that come to mind.
UBIQUITY: What were the major forces that changed the nature of the Net over the past decade or so?
LYNCH: Obviously, from a technical perspective, there was the emergence of the World Wide Web and the subsequent convergence on it, as opposed to the fragmentary thing we had before with FTP sites and Gopher and such. That helped a great deal in terms of establishing common infrastructure and common expectations. Then there was speed, of course, and massive cost/performance gains in storage and in computing; the storage gains in the past few years are particularly striking.
In the mid- to late-'90s, the Net emerged into the public arena and was deployed very widely. That also made a huge set of differences in terms of expectations, ubiquity and other areas.
UBIQUITY: Did commercialism in general have a positive effect?
LYNCH: I think it had a mixed impact. The uptake of the Net by commercial forces helped to legitimize it as good business practice for industry and government. That was significant from a policy and cultural point of view. And of course there are the ugly sides of commercialism -- spam, for one.
I think sometimes the academic and not-for-profit worlds don't get as much credit as they should for pioneering much of the use of the Net that the commercial sector then followed and embellished. Their contributions were really important early on -- and still are today -- as reasons why people want to use the Net.
UBIQUITY: Which contributions are you thinking of?
LYNCH: For example, some of the government agencies were early and aggressive adopters of the Net as a way to reach the public. The National Institutes of Health's National Library of Medicine is a beautiful example. Universities got on the Net early with free and publicly accessible resources, and were part of setting the expectation in the public mind that enormous quantities of information and other materials would be freely available on the Net.
UBIQUITY: Going back to the concept of collaboration between librarians and information technologists, has that relationship gone well? Have there been any particular problems?
LYNCH: There is no question that there are substantial cultural differences between the library culture and the information technology culture on research university campuses, for example. By and large the two groups have learned to work together in order to make progress. Recently, I've been working to extend this collaboration to instructional technologists, who have yet a third culture.
But back to the IT-library collaboration: it's interesting that some complicating factors have come up.
UBIQUITY: What are the complicating factors?
LYNCH: Let me give you an example. At about the same time that the Web was emerging in earnest, there was another idea that caught on: digital libraries. Digital libraries became a popular idea -- and a popular phrase -- around 1995 or '96, particularly with the launch of the NSF digital libraries research program. "Digital libraries" is a deeply confusing and even self-contradictory phrase, yet also one that really resonates with a lot of people. It raises questions about the relationships between libraries as organizations and services and these new digital services.
Academic computer scientists dominated "digital libraries" at least for its first five years. That is a big culture gap: information technology and libraries are both service organizations, whereas these are researchers. Sorting through the relationships between "digital libraries" and libraries seeking to exploit digital content and advanced technology has been complex, culturally and intellectually, probably more so than the IT-library relationship, particularly because it has had to be done very much in public, while we are trying to figure out what digital libraries really are at the same time.
More deeply difficult, perhaps, are the really radical visions of the future of information creation, management, and organization that are coming out of computer science and some of the scientific research disciplines that are aggressively embracing advanced technology, grid computing, data-intensive science and the like, and the way such visions collide with the view of the more traditional library community, which is among other things burdened with responsibility for enormous physical collections that are particularly crucial to the humanities. It's not just the libraries -- they are really in some sense caught trying to balance traditional and very aggressive new scholarly practices in the communities they serve.
UBIQUITY: Speaking of intellectual backgrounds, you yourself are a Ph.D. in computer science. What brought you into the area that led to the position that you hold?
LYNCH: I was one of those (undoubtedly annoying) slightly precocious math majors at Columbia but got sidetracked my junior year by learning to program computers. I ended up working part-time at New York University at the library and computer center while I was finishing my bachelor's degree and then switching over from math to computer science when I did my master's at Columbia. There was a wonderful professor, Ted Hines, who taught at the library school at Columbia (back when they had a library school); he taught me that computers could deal with information and language, not just numbers. Ed Brownrigg -- one of Ted's students and a brilliant guy in his own right -- was running library automation at NYU and hired me to work on various projects there. After I got my master's in 1976 I went to work for Ed at NYU full time.
UBIQUITY: What did you do after you left New York?
LYNCH: Ed Brownrigg took a job as Director of Library Automation for the University of California's (then) nine-campus system, and hired me to join him in Berkeley in 1979. We worked together with a team of about 50 people building an online information system called MELVYL, which was used all over the world by hundreds of thousands of people from the mid-'80s on. It was an online catalog for the unified holdings of the UC system -- something like 9 million volumes, if I recall -- and also later incorporated journal abstracting and indexing databases, full text resources and other materials. This system, which was locally developed at UC, had an amazing run. Just this year, 2003, they are finally phasing it out in favor of a commercial system, which means it had 20 years as an operational system. It was the first (or for sure one of the first) public access library catalogs on the Internet. After Ed Brownrigg left in '88, I took over as Director, not only continuing the development of the MELVYL system but also running a lot of the emerging intercampus TCP/IP networking for UC.
I left UC in 1997 to take over as director of CNI, after Paul Peters, the founding director, died suddenly the year before. I am still an adjunct professor at Berkeley's School of Information Management and Systems, where I've been doing a seminar with my colleague and friend Michael Buckland for over a decade now; we still do the seminar on Friday afternoons.
UBIQUITY: When did you get your doctorate?
LYNCH: I got my doctorate in computer science at Berkeley in 1987; this was a spare-time activity while I was working full time at the UC Office of the President on MELVYL. My work there was with Professor Michael Stonebraker, looking at issues involved in making relational database management systems support high performance information retrieval applications. It was a very nice coming together of theory and practice. I had a lot of hard data from my work with MELVYL, which I could bring to bear on my research.
You know, I'm reminded as I tell this story of how fortunate I have been to have so many great teachers and mentors and colleagues. It's really remarkable.
UBIQUITY: How much of the intellectual problem of library management today is something that you could consider a legacy problem, in the sense that it's not the kind of problem you would be dealing with 10 years from now?
LYNCH: Our great libraries and archives have spent a long time amassing tremendous, fabulous print collections. Not only books, but also manuscripts, photographs and everything else that they hold. That material is not going away, even if you could digitize it all. There are still scholars who want and need access to the originals. It's a monumental inventory and collection management challenge, even if they never added anything else.
Having said that, I think that the kinds of issues that were on the table for libraries 10 years ago are a bit different than the ones they're facing now, and the ones they'll be facing in 10 years. One way to think about it is this: libraries exist in symbiosis with changes in the practices of scholarly communication and changes in the modes of transmission of entertainment and culture. And those worlds are changing. In some ways, the way that scholarly communication happens today is hugely different than the way it happened 10 years ago. I think those changes will continue to accelerate. The agenda of issues is going to get longer and longer, and some of the new issues are going to be radically different than things libraries have historically concerned themselves with.
UBIQUITY: Going back to your statement on collections, have you heard criticism of the big research libraries for not digitizing their collections and making them available online?
LYNCH: I have run into this notion that the big research libraries are obtuse on this matter. In most cases it isn't about obtuseness by the library. It's not even fundamentally an economic problem of paying for scanning, but one about clearing rights to digitize material. Things remain in copyright for an astoundingly long time under the current legal regime. The libraries would have to clear rights book by book for every book published after, say, 1920-something. It's an incomprehensibly monumental and costly task -- and a stupid one for them to have to undertake.
UBIQUITY: Is there a way that you could imagine cutting through all that and having policies and laws that would make it much simpler?
LYNCH: You can certainly imagine legal regimes that would make it simpler. Probably the single thing that would make all the difference would be a shorter term of copyright. In fact, tragically, from my point of view, a couple of years ago Congress passed the so-called Sonny Bono copyright extension act, which, in order to protect a handful of commercially lucrative properties such as some of the early Walt Disney films, put a 20-year hold on any new content entering the public domain. The trend in Congress right now is to make the problem worse rather than better, it seems. There are some very good proposals coming from people like James Boyle and Larry Lessig which could make a big difference, but it's hard for me to be optimistic that they will gain legislative traction.
UBIQUITY: Will the legacy problem continue forever?
LYNCH: It's going to be quite some time before we realize the visions of paperless libraries that people have tossed around, or before we reach the point where everything in the Library of Congress is accessible online, even if they kept the artifacts where it made sense. And I'd be careful about focusing too much on converting paper to digital surrogates; the changes going forward are much deeper than that.
UBIQUITY: Do you think that the paperless library could happen in 25, 50, 100 years?
LYNCH: I think that the issues are not technical; they are fundamentally economic and legal. They are issues of social will. I'm starting to sense a growing public recognition of the value of access to information on a large scale. For instance, there's a big debate going on now within the scientific publishing community where a number of researchers are arguing that all refereed scientific or scholarly articles ought to be freely available to the public, perhaps some limited time after they're published, like six months or a year. This is sometimes called the "open scholarship movement." The advocates cite many reasons for open access, such as the notion that the public has paid for the research and has a right to see it, that it makes scientific information more accessible to the developing world, and that it makes research more accessible to people at institutions that don't have richly funded libraries. Some people are going even farther and saying that the whole publishing model ought to be changed so that fees are collected from the authors and then those fees are used to subsidize immediate open access to the public.
UBIQUITY: If you had floated these ideas five years ago, people would have considered them on the fringe.
LYNCH: Yes, you would have been painted a communist or something. But no more. It's something that most journals are starting to deal with seriously. I was involved in putting together a symposium a few weeks ago at the National Academies looking at what electronic publishing was doing to the practices of scientific, medical and technical publishing. The question of how journals are going to respond to demands to open up their content was very much on the table. To be clear, this doesn't mean that all journals are signing on to the program; it means they are recognizing that they have to take a position with regard to the open scholarship program.
UBIQUITY: If you think of the mission of CNI as a great campaign, like a war, one thinks about exit strategies, as has become fashionable in recent years. When could you theoretically declare victory and go home?
LYNCH: Our mission statement is simple but at the same time open-ended. It's concerned with the use of information technology and digital content to advance and transform the practices of research, teaching and learning, and information dissemination. That seems to me to be a very open-ended frontier.
UBIQUITY: What is happening along the frontier? What excites you?
LYNCH: I'm struck, for example, that scholars today are at the very beginning of learning to author effectively for the digital medium. To me, this is one of the most exciting things that will unfold in the next 10 or 20 years. If you look at the vast majority of scholarly publishing today on the Net, it still emulates paper. Believe it or not, some journals on the Net use artifacts of print publishing such as dual column formats, which are pretty ugly to work with on screen. Because of the way that things are presented, many people print out articles for intensive reading rather than view them on screen. People are using the network mostly as a way of storing and transmitting and reproducing paper on demand. One of the really significant developments and challenges -- and it is one that has some generational characteristics to it -- is to move away from emulating paper with digital tools to forms of authorship and genres of scholarly (and other) works that organically and fundamentally exploit the capabilities of the evolving digital medium.
And then we need to learn how to do libraries for these new materials.
UBIQUITY: As more science and even commercial development activities are done through simulations and virtual reality, will that change your job and the job of libraries and digital libraries?
LYNCH: Yes, absolutely -- and if you'll permit me to bicker a little, I'd substitute "scholarship" for "science" in your question -- these developments are equally applicable to the humanities and the social sciences.
It's going to change them in a couple of directions. One direction is that scholarship is becoming more simulation-, computation-, data- and visualization-intensive. The recent blue-ribbon committee on cyber-infrastructure that Dan Atkins chaired for NSF has made a great case for this in its report. But we still don't have well-established practices for how those new kinds of material fit with the various scholarly literatures. In many cases today they exist off to the side rather than integrated with scientific exposition and communication as it's historically been practiced in journals and monographs. Some fabulous work has been done using simulations and data sets and visualizations and models. But right now it happens separately from the established system of scholarly publishing, and there's terrible risk there.
UBIQUITY: What is the risk?
LYNCH: The risk is that because so much of this new digital scholarship is not part of that scholarly publishing stream, it's not going to get preserved or be managed. It exists only as long as the person who created it is alive, interested in it, and has funding to underwrite it. When any of those conditions change, these important scholarly works are at risk. Research libraries, acting on behalf of institutions of higher education, are starting to step up to this challenge and responsibility. The general approach is a set of services called an institutional repository. For example, MIT in collaboration with Hewlett Packard pioneered a system called DSPACE that implements an institutional repository service.
UBIQUITY: Explain DSPACE.
LYNCH: It is a place -- a system, technically, but it's important to recognize that it's really a service that will be manifest in a series of evolving systems over time -- where faculty can deposit content and the institution will commit to both continued dissemination and preservation of that content, thereby making it less at risk than if it were running on a machine in some faculty office. It also comes with an institutional commitment to support the service on a continuing basis. I think that's a very important step. DSPACE has been open-sourced, and the Andrew W. Mellon Foundation and other sources have made funds available for a number of additional institutions to replicate it both in the US and abroad. So there are some reasons to believe that we will see the growth of institutional repositories to provide an institutional stewardship safety net for this new genre of scholarly work.
UBIQUITY: Could the word "repository" be improved on?
LYNCH: One of the things that I'm fighting my way through at the moment is that "repository" means different things to different communities, and hence is a huge source of confusion. It's used in a specific technical way by certain groups, for example, by the people who are interested in the management of learning objects. It's used in a more general way by universities that are talking about building institutional repository systems. Some other groups also use it in a more narrowly specified way. One of the questions is whether it's a technical/architectural construct, or one that comes with a whole series of policy contexts and constraints.
UBIQUITY: Could the problem be that the term "repository" has become passé, in the same way that there was a tension between the old word, "library," and the newer term, "learning center," which has an active voice?
LYNCH: I don't know specifically about the term "repository" (other than it's a huge mess right now); but the tension that you identify is central to the discussion about the future of digital libraries.
As libraries digitize large amounts of material, what does this have to do with digital libraries? How much of it is a collection and how much of it is an access mechanism? There is a debate about how active or passive the system should be. Historically, libraries have been relatively passive. For example, they have not generally provided analytical tools. They make material available, but it's up to the patrons to figure out what to do with it. Now there is a view that says that digital libraries are not just places for calling up material, they're spaces for collaboration and annotation and analysis and for authorship. At the extreme end are notions like "collaboratories."
UBIQUITY: Where do you see digital libraries on that spectrum?
LYNCH: My personal view is that we're going to start seeing a two-layered model. We're going to see digital collections that are presented and managed in a passive way. They will function much like a repository, where stewardship is the major theme. Then you're going to find access systems layered on top of these, which may be more volatile. They may have shorter lives than the underlying collections. You may see the same collections presented through multiple access systems. These access systems will be not just retrieval tools, but analysis environments in some cases. We'll see a great diversity in these access systems -- what I call "digital libraries" as opposed to digital collections. Some will be affiliated with traditional libraries; others will come from very different places, and will go very far out towards the "active" end of the services spectrum that we were discussing, towards analysis environments and collaboratories. I think we'll also increasingly harness social and community capabilities in digital libraries. Libraries (as organizations) have avoided this due to concerns and traditions around patron privacy, but commercial organizations like Amazon.com have been very aggressive and creative in deploying recommender system technology and other social filtering and personalization technologies. These will find their way into the next generation of digital libraries on the Net.
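[Editor's illustration] The two-layered model described above can be sketched in code. This is only a minimal sketch under stated assumptions, not a description of any real system: the class names (`Item`, `DigitalCollection`, `SearchAccess`, `AnalysisAccess`), their fields, and the sample records are all hypothetical and do not come from the interview. It shows a passive collection layer that only stores and returns items, with two independent access systems layered on top of the same collection.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Item:
    """A stewarded object. The fields are hypothetical, chosen for illustration."""
    identifier: str
    title: str
    text: str


@dataclass
class DigitalCollection:
    """Passive layer: it stores items and hands them back, nothing more."""
    name: str
    _items: Dict[str, Item] = field(default_factory=dict)

    def deposit(self, item: Item) -> None:
        self._items[item.identifier] = item

    def get(self, identifier: str) -> Item:
        return self._items[identifier]

    def all_items(self) -> List[Item]:
        return list(self._items.values())


class SearchAccess:
    """One access system: plain retrieval over a collection."""

    def __init__(self, collection: DigitalCollection) -> None:
        self.collection = collection

    def search(self, term: str) -> List[str]:
        # Case-insensitive substring match on title and text.
        term = term.lower()
        return sorted(
            item.identifier
            for item in self.collection.all_items()
            if term in item.title.lower() or term in item.text.lower()
        )


class AnalysisAccess:
    """A second, more 'active' access system over the same collection."""

    def __init__(self, collection: DigitalCollection) -> None:
        self.collection = collection

    def word_counts(self) -> Dict[str, int]:
        # A toy analysis: whitespace-separated word count per item.
        return {
            item.identifier: len(item.text.split())
            for item in self.collection.all_items()
        }


# Usage: one collection, presented through two access systems.
collection = DigitalCollection("example-collection")
collection.deposit(Item("a1", "Simulation data", "ocean model run output"))
collection.deposit(Item("a2", "Field notes", "notes on ocean sampling"))

search = SearchAccess(collection)
analysis = AnalysisAccess(collection)
print(search.search("ocean"))
print(analysis.word_counts())
```

In this sketch either access class could be retired or replaced without touching `DigitalCollection`, which is one way to read the remark that access systems may have shorter lives than the collections beneath them.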