Sarah Pritchard Keynote Presentation
Arizona State University
March 2, 2005
Thank you Sherrie, and thank you Rob and the organizers of the committee for inviting me, and let me just say I do know that we're in Tempe not Phoenix but I don't know, I was thinking about the airport I guess when I formatted my slides. It's a little early in the morning for me to have great humorous remarks and no I did not get out and take a walk at 5:30, but I'll try and keep it not too heavy.
I'm very pleased at Cliff's remarks last night because it set up a lot of the same themes that I think you'll see more concretely in my talk this morning. We hear this term everywhere these days, whether it's in scientific research or administrative data management at corporate environments, and of course its original context in information science. So I wanted to put up a couple definitions of what we think we're talking about when we say informatics. Informatics itself is actually not really the subject of my talk today, but it gets thrown around as the term on campus, and that's how I began to grapple with this really big issue, which is really the archival curation of digital research data.
My campus has a lot of these projects, they call 'em all informatics, it sounds fancier than saying you know, systems. They're faculty driven large scale computing initiatives mostly in the sciences, where they design all their own data ingest and fancy interfaces and all of a sudden it's informatics. When I first came to UCSB I couldn't figure out what they were doing. Was it computer science, was it data crunching, was it digital libraries? All these projects were very self sufficient, so I just didn't pay a lot of attention. But when grant proposals from those faculty looking for support for their informatics projects began to come to the attention of the same funders who look at digital library projects, like the Mellon Foundation, we started to have a dialogue, and to realize that these stand-alone systems were growing very fast and were sort of like digital libraries, and sort of like old fashioned faculty data files, and that either way they didn't talk to each other. Nobody took care of the systems or the data long term, by which I mean you know more than five years, and we all started to think well you know so what are we talking about.
It helps to understand a little bit of our campus context, which may not be unique but it certainly gave a lot of shape to how these systems grew and to potential solutions. We're unusually interdisciplinary in the faculty and very technologically innovative, so there was a lot of development of these standalone campus research systems. We have enormous amounts of work going on at the sort of text archiving level with the California Digital Library and courseware. Nobody was really looking at this faculty stuff, which kind of falls in the middle between big courseware repository initiatives on the one hand, and big administrative data records or journals and text documents on the other hand. Faculty research data, when it was in paper, typically stayed in the offices and we didn't worry about it. But now we have this very large campus investment in gathering and processing this raw scientific data and developing these analytical tools, and to what degree is that part of the university's intellectual record? These are the kinds of things I'm talking about with faculty research data, and again people can argue about which and what parts of this are the university's responsibility versus the researcher's responsibility. Faculty might want some of it taken care of and not other parts of it, there's really quite a universe of possibilities, but these are basically the kinds of things that we start to look at.
At the same time as we have this innovative interdisciplinary faculty environment we have a very decentralized computing environment. I think that there are some interesting consequences from being a very early adopter of technology. Everyone has already jumped out and built their systems both in the administrative and the academic environment. We have a big mix of very innovative and homegrown patchwork off the shelf, and at this point the expense of changing, centralizing, coordinating is so frightening that it never happens. So basically there's no centralized academic computing support on campus and so the school of engineering has really fabulous network support and the division of humanities sort of limps along with what it can do.
Since the time this project started, just over the last year, the campus has decided to try to put in place a CIO model. As you can imagine this is extremely controversial, and it's some form of modest, I can't even call it centralization, coordination; centralized is way too strong for us. So some of the things we were looking at as problems with data curation, and the potential models with which we might solve them, may not work if the whole model is shifting under our feet. Actually in California we say the mud is sliding under our feet.
Here's where our library was at the time that this all started. We were one of the first six big NSF digital library projects and built the Alexandria digital library, which is a geospatial map and imagery, digital data and very large amounts of metadata even for non-digital formats, and a fairly elaborate gazetteer. This has become a very robust architectural platform, we've used it to extend out into other disciplines that are interested in georeference data and very productive faculty collaborations.
We built on that to apply last year for one of the Library of Congress digital preservation grants, and as of last December officially finally received that grant. And so we've just really begun a major ramp up of the Alexandria digital library platform into being a long term federated archive, which will be called the National Geospatial Digital Archive; we played around with the words quite a bit. It will involve our principal partner Stanford University, it will also involve a variety of other universities, government agencies and corporate partners. At the same time we have a really robust consortial environment within the University of California, that is the California Digital Library, which is working on a preservation archive at the consortial level. It also got an NDIIPP grant to harvest government documents websites, and it has in it what we call e-scholarship, which is sort of a repository for faculty, first pre-prints and just as of last week now post prints, and the Online Archive of California for special collections finding aids and other documents. So in this environment we see both locally and consortially a variety of the pieces being put into place, and we thought the library was in a very good position to move into this somewhat untended, not unintended but untended to, area of research data.
So the questions we started to ask are basically "Why are faculty doing this?" "Why are they building all their own digital library systems?" "Do they really want to, or do they just have to?" "Are there commonalities?" "Are the tools and systems that the people are developing over in the Center for Ecological Analysis and Synthesis applicable to what the neuroscience imaging people want to do on the other side of campus?" "How many of these systems are open to anybody else except the people in the research group, and do they pay any attention at all to the things that we care about in the information community, such as intellectual property rights, and metadata and archiving?"
So we decided with very active support from the Mellon Foundation to make a formal inquiry into the nature of these projects. And the goal was not to start out by saying let's design a repository. The goal was just to find out what they were doing, and what kinds of support they felt they needed, and then how best organizationally to provide that, whether it would be through the library, or through a combination with IT units. But as you know, over the course of a project it somewhat tends to morph based on what people are actually telling you, which we did of course experience.
What we did, we went out and we did a lot of end research on who was doing what in informatics and faculty research data, then we went around and the core of the project was all these interviews with faculty. We tried to kind of chart out the key systems issues, and we convened a bunch of cross-disciplinary roundtables to try to get more faculty talking to each other about their interests in informatics. We started talking with the other IT units on campus, and we wrote up a white paper that was intended to serve as a basis for campus discussion among the IT units, and then we circulated it out to the campus leadership right when they announced this whole CIO initiative. So we haven't really had the discussions yet because we've been sort of waiting, because a lot of the scenarios we started to put out in this white paper may be completely irrelevant.
We weren't random about this survey. I won't pretend it was a structured sample. We went specifically after the faculty we knew were doing these things, based on either our own previous experience or talking to department chairs in certain departments and saying "Who does these things in your department?" And then, we have so many faculty on our campus who are innovative who are not in science, we couldn't resist pulling them in. Even though initially there was some concern from the Mellon Foundation that we needed to keep it to one cluster of disciplines so we could really look at the commonalities, but that would be leaving out a very key chunk of data intensive faculty on our campus.
We only interviewed about forty people, but it was really a very rich sample. These are some of the questions we asked, and I don't mean for you to try to actually read these off the slide. The slide is in your packet, and the complete report and the complete, endless, much longer than this list of questions is at our website; I'll have the URL at the end. All the project documents, the full report, all the data are on our informatics website. We went over and sat down in faculty offices and talked with them. We hired a wonderful extroverted research analyst who went and listened and interviewed and taped all of these interviews. It's very interesting, it's much better to be in the faculty's own environment when you ask these kinds of questions because these kinds of questions can be threatening. And I was just last week mentioning this project to a faculty member in computer science from another UC campus and he got very agitated and said "But I don't want the library taking over my website, I don't want the library taking over my systems". I said "We don't want to take over your system".
So it was important for us to do this project by going and saying to them, "Tell us about your research", and of course they loved to talk about their research. "Tell us what support needs you have", and trying to leave aside that, you know, they want an upgrade to their hardware or something like that; we were really trying to focus on data support.
So we have both structured and unstructured data from these interviews and these roundtables, and the findings are quite interesting. I'm just going to present some of the highlights now in the different areas. The first group, systems issues, is to say: What determines when faculty upgrade and enhance and invest all this money in these systems? And basically the data push it. Some of the faculty's data gathering is very labor intensive. So it's a combination of how labor intensive it is for them to keep entering their data, and just how many streams of data are coming in. Throughout this project we found a very large difference between faculty for whom data collection is highly automated and it's just flooding in, and those for whom data collection is very laborious and they feel just personally attached to every little data point that they discover. I'll talk more about that as we go along. Despite the prominence of a lot of these informatics projects, the majority of the faculty develop them as by products. They are not in it because they're fascinated by databases, they had to do it because they needed the ingest, they needed the data collocation, they needed the analytical tools, and they had these research teams. Most of these people come from disciplines where they're working in research teams.
This is a graphical representation of some of the perceived opinions. Do they think preservation's a problem? And you can see that you know, a good half of them think there's probably a need, some of them think there's actually a critical need, a lot of 'em aren't sure. One of the problems is the word "long-term" has very different meanings to faculty. There's a lot of semantics issues when talking about these issues with faculty. We discovered the word "collection" means a very different kind of thing to many people. We would say to them you know "Do you have a digital collection?" and they'd say yes and it would turn out not to be, or vice versa. So the word long term and the word archive need to be really clarified up front with scholars. To them, you know, it's five years. We're talking a hundred years and they're thinking OK does it last beyond the life of the grant.
Here's some of the specific findings when we actually talked to them about what they do for their data preservation, and my favorite one is the fourth bullet here; this phrase "the removable media of the day" was the phrase of our research analyst. I mean they're sticking, you know it's on a zip drive or it's on a backup hard drive somewhere, I mean that to them is good archival curation. I've got a backup somewhere I forget where, you know in another office. The people that are well off are the people that are doing for example genomic research where they're mandated to put their stuff in the genome database, and there's a nice repository, there's no arguments about it, everybody knows where it is. There are starting to be, as Cliff mentioned, other fields where these disciplinary repositories are growing. But in most departments, people haven't got the time, it's dependent on their grad students, and when the grad student leaves maybe or maybe not they'll do something with it. They don't have enough staff at the same time as their research is really really growing and the system complexity is growing. So we're seeing an increasingly critical situation where the long-term status of these data is going to suffer.
Here's what they actually do, broken out in percentages. So people are doing a lot of different things. They're doing a little backing up, they're contributing to a portal, they have it on a CD. You know so it's a mishmash. Most people realize they need to do something but it's very unplanned, unstrategic.
Here's how they organize their data. The number of faculty that we found who basically use their Microsoft operating system directory structure as their basic file organizing mechanism, it's like close to 100 percent. That's how everybody thinks: well it's organized, I have a directory structure. Databases are less common than we thought. We thought a lot of these folks were building elaborate databases. Most of them don't have the time, they you know throw it together. There are fields that have made a disciplinary commitment to a portal of some kind, there's a couple in eco-informatics for example, you'll hear about the virtual astronomy observatory later, and then there are certain fields where people have collaboratively put together some very elaborate data stores. But in the absence of that people are sort of throwing their stuff together. In the science departments where storage isn't a problem they just throw more memory at it. And the people who do worry about it are the IT support. We interviewed a number of IT support staff along with faculty, and of course their views are quite a bit more worried, in terms of there isn't going to be enough storage and it's not being backed up, but they're not in a position to do too much about it either, but they do see this as coming to a head.
Metadata is becoming more widely known among faculty as a concept, but there are enormous variations in faculty use. And again the term needs to be clarified, everybody thinks they know what we mean when we say metadata. Many faculty equate metadata simply with taxonomies or subject headings. Usually the practices of a field seem to determine a scholar's interest in metadata and the formats that they want to use. But again it comes down to the time investment, and whether or not the scholar thinks anyone else will ever need that data. One faculty member on our campus is actually on his scholarly society's committee on metadata and terminology and even he doesn't apply metadata tags to his digital objects. You know, you cannot overestimate the impact of time on these researchers. If it takes too much time, they are not going to do it.
The other issue is the faculty culture of collaboration in different disciplines, and the impact that it has on things like metadata formats and data archiving in general. And I'm not going to give as much of a data example this morning on that but I do have more information on that. But some disciplines are much more inculcated with sharing, others are much more possessive. It's clear that most of the fields prefer to use very discipline-specific terminologies, I mean this isn't news, we've been doing disciplinary thesauri since the 1950s. But faculty appear to be reinventing the wheel in the digital environment. They have forgotten about thesaurus work in different fields and they're busy redesigning taxonomies and ontologies, that's the fancy word for subject headings now. And they don't think of this as something where there's available consultation, they're always amazed to hear that librarians can actually help them with this. Very few faculty are aware of the work on metadata crosswalks or standardized formats that are easily adapted.
The conclusion we reluctantly came to at this point was there's not going to be any way that in the short run we're going to be looking at a single metadata format. And so the kinds of repositories that we build on our campuses, whether it's text or digital data, are going to have to be metadata agnostic and have ingest patterns that will take metadata from a lot of formats, because the different disciplines are gonna do what they're gonna do. Here's a breakout of the metadata findings, and you know here's forty percent of them who aren't doing anything. And most of the rest are using it very very selectively, so metadata is a real concern. And again it comes down to, do they think of these data as public, even within their own discipline, because of course if nobody ever has to find it again then it doesn't need any metadata.
We were more encouraged by what we discovered about intellectual property. There was much greater awareness of intellectual property issues, even if the way they went about it was somewhat inconsistent. But certainly almost all faculty we talked to realized these were important issues. There were some very interesting disciplinary differences about things like open source and patenting, the small print bullet there. In disciplines where things move very quickly, they don't want to bother to patent, it takes too long. They'd rather open source their stuff right away, get their product out there, and then later do a startup company or do the next version of it and patent that and things like that. But in disciplines where the university feels there's a lot of money, there's pressure on these departments to do the patents.
We actually found that most of the portals on campus that had heterogeneous data ingest are aware of the need to have rights management and waivers from the participants. And again this comes down to, to what extent does this discipline support a culture of shared data? The interesting insight that our research analyst developed about this culture of sharing was what I was alluding to earlier: The more labor intensive the data collection the more guarded a researcher is likely to be, for example in a field like archeology, where they have a lot of digital data and now they're very heavily using GIS and imagery, and yet these data are hard-won, literally out there in the field.
Whereas the more that the data are gathered by a lot of automated sensors, people doing say meteorology or astrophysics, and the stuff is streaming in. They can't, you know they can't even deal with a fraction of the data the sensors are picking up. They're really not worried about the data, what's much more important to them is the applications software and the analytical tools, because those data are no good unless you can figure out what they're telling you. So there's quite a difference in, even within sort of scientific fields, and we think it comes down you know, to some kind of, something intuitive makes sense, the difficulty of getting the data in the first place.
Here's how some of the digital rights management practices break out. People do feel it's an issue, the thirty percent over there, it does affect my research, and people have issues with collaborators. Only eight percent said they never encountered any IP issues, so I think this is quite telling. We see controversies on our campus, even with the university's copyright consultation office, which is part of the Office of Research. Some faculty see them as very pushy, that all they want is to patent, patent, patent and make money for the university. So is the goal to help income production, is the goal to encourage, in the public good sense, the dissemination of the university's intellectual product at low cost, and as much fair use as possible? Those two positions often come out at odds with each other. Is the need simply to help faculty advocate with their external publishers? So it has done a little bit of all of these things but I think our faculty have very selective perceptions of what is the role of rights management officers on campus, and is that a help to them or a hindrance to them? I think it comes down to the need for tiered approaches: Different disciplines are going to need different kinds of support, to multiple kinds of services from intellectual property offices, and it's certainly not something that we can handle just in the library or just in the office of research. Now also separate from this whole topic we have the issues with student peer to peer networks and Digital Millennium Copyright Act, and so I think the complexities for digital rights management on our campuses are now very pervasive. Many, many areas of campus all share in these issues.
The institutional repositories that we have in place already have these issues with the shifting nature of rights management over the life of the product. For instance it may start out as unpublished data that want a lot of protection, it's syllabi, it's raw data, then it moves to being a published article, then maybe it moves to being a textbook. So the need will be, and already is, for levels of control that can be moderated over time depending on the status of the object, and the faculty themselves want a very strong hand in this. They do not want to lose control of their intellectual property or of the migration of products, or of avoiding the migration. In some fields the data are no good after six months and they're not worried about it, in other fields the data are good for decades and they do not want that stuff migrating anywhere that they don't know about because they keep going back to it. We have an ecology project, a large very heterogeneous data store on campus that uses a lot of current data, but they recently came across nineteenth century field notes of a naturalist who studied a particular area for ten years and had all these detailed handwritten notes. They are dying to get that stuff transformed into data so they can match it up with their field notes from a hundred years later on the same area. So the shifting control of data over time varies again with the discipline.
So we started to talk about "OK what do you need for your data support?" and as I said once we got past the problem that they all want more technical staff to do all this grunt work for them, we did get some very good feedback that cut across all of the departments on campus regardless of current arrangements. They would love a clearinghouse, they had no idea that there were consultants on campus who could help them decide about things like applicable software. They would love advice about how to build language into their grants about infrastructure and persistence because now that's pretty much required by funders, that you address these issues in your grants, and they don't know how to do that in some cases. And as I said on our campus at the time we did this interviewing there was no centralized academic computing, so they weren't aware of a place that they could go, unless it happened to be in their department, for data advice and consultation. And some faculty just need a little steering in the right direction, others need someone to do all their metadata for them, and so tiered support services seem like a very reasonable model, and you might charge for some services and not others depending on how labor intensive it was. It's clear that one size does not fit all, and it would take a very strong digital architecture and there are a lot of campus policy issues that I'm not even alluding to here. But there was a great outpouring of interest in the idea that faculty might get some sort of research data management advisory services and selective actual curation services as we went along.
We saw trends as we talked to faculty about these services, and they talked about the changes they saw. We saw the same trends Cliff talked about last night. I want to assure you we didn't actually plan and share any of this in advance, but you'll see the same trends here. I'm relieved that, you know, you didn't disagree with these kinds of trends. But the growth and complexity of digital objects is continuing to increase very rapidly. The funding agencies themselves are driving, for the better, changes. We have the NSF cyberinfrastructure, the ACLS, the NIH initiatives, and then these disciplinary repositories both for text and in some cases for data, that are becoming more common. And we have some fingerpointing going on across the disciplines I find now. I was last week in the meeting of all the UC library directors with all of our faculty library committee chairs from every campus. And all it takes in one of these kinds of groups is one physicist, and they can't understand why everybody isn't just like high energy physics and has the ArXiv server and everybody goes to one place and they don't see what the problem is. Like well "We've figured this out". "Well your discipline is just behind the times because you don't have your, you know, your big physics archive." Now actually, I have a lot of problems with the physics ArXiv, I'm not sure its long term digital persistence is really being tended to. But the policies are uneven and it's difficult for disciplines who now hear about another discipline where things are really well taken care of.
Faculty are doing this work literally globally, whether it's physical field work or whether it's international collaborations like astrophysics. And what they want out of a repository is they want to be able to get at their data even if they're over in Pisa, Italy where we're working on a collaboration, they're going to be deploying our Alexandria software, or they're in New Zealand or they're up in the Dung Guoung caves in China. So there are the bandwidth issues and the communication issues; they want to be able to access these things you know over their laptop from wherever they are on the globe. We have very technology intensive academic programs across the disciplines, we have media arts, we have intense social science data work, and so the need for data repositories and curation is increasing for teaching on campus as well as for faculty research, and of course from a faculty member's mindset these are connected, what they have for their own research data and what they have to be able to use in their teaching, and of course we're not getting any more staff.
Here's some of the systems characteristics that faculty would like to see out of a data curation system. As I've been hinting all along, they want a lot of control. They want to know when is an object going to be really dark, when is it dim, when is it light? They want to make those decisions because it affects their use of their data and who else can get it. They don't want anybody to get it before they're done with it. They need secure access from anywhere, as I said before metadata agnostic, and then of course it's got to seamlessly integrate with every other application in their entire life. As we know this is non-trivial, especially on my campus where we have so many homegrown systems. They want a clearinghouse, they'd like to know who is working on what. We found in these roundtables that even faculty on our own campus weren't aware that somebody down the hall, or in the next building over in a different department, was working on an application or a device. In one case we found out that some of the bioscience folks were working on a handheld device for data entry that would automatically you know GIS it and metadata and upload it and do a bunch of clever things all at once, and the archeology guy about went ballistic thinking about how this would revolutionize his life in the field. So they want a clearinghouse so they can find out who is doing what. It boggles my mind how we would put together a clearinghouse because how would you narrow it down, and how would you ensure keeping it up to date? But again that's a doable thing. This is something that you know is clearly a library service that we could be doing. And then at the technical platform level there need to be advanced services as we already know, for migration and emulation and long-term archiving.
That's what faculty want, and as if that's not enough, we know from the library side there are enormous bandwidth implications of faculty data curation. We're talking you know petabytes worth of stuff and I've heard it described as "How many shipping containers worth of hardware and servers do you need you know, for X number of petabytes". That's if you start doing 3D imagery, whether it's bio imagery or if you get into film and TV and audio. I've started having some discussions recently about what it takes to archive IMAX film, which is shot in several different perspectives at once. It's just boggling. So there are enormous amounts of just server space, access and interface issues, and 24/7 high load transactional traffic issues in terms of our networks, many of the same details you worry about with digital records management, except that when we get into scientific research data it simply scales up very rapidly and very very enormously.
Here's some of the topics that we see for campus discussion on any campus. Here are ways to look at the questions: What are the gaps in what you're doing now? How do the technology services interact on campus and are there new collaborative models needed? CNI has been working on this for several years, looking at new IT/library collaborations and new services. What are the faculty priorities? We cannot take for granted what their priorities are because they vary by discipline, by tenure of the researchers, that is their length of time in their field. What kinds of research data should be at the highest priority? What's most important to the campus, what's most important to the researcher, those are not the same thing. How much is at risk? Who decides what's important? Who decides what's at risk? This incentives question is a really big one. Nobody's gonna do it if it's not easy and if they see any loss of control, and those are really key things. It's gotta be easy, but at the same time they've got to have control.
This is our last bullet - We sort of know this is true. We never put this into our interview questions or our official goals for the project because I knew I would get flamed immediately as you know "You're a librarian, that's none of your business." However it came up quite often unbidden in the interview responses from faculty, that part of the reason they didn't fuss with a lot of this data curation is they got no reward. They weren't going to take any more time than absolutely minimally necessary to keep the data organized for their research. Maintaining web portals and extensive databases and big collaborative disciplinary things was simply not rewarded in most of their fields, especially in scientific fields. Unless you know in a very selective way, you know, you actually created the genome archive or something like that. But for the most part, research results in publications were what they were after, and the data curation and the web portals were byproducts. That's not quite true in the humanities, we do see where the actual end research product in the humanities is some sort of elaborate digital product and those folks are now getting credit in the tenure reward system if these products are seen by their peers in the peer review sense as major scholarly contributions. But that's not the same thing as the data management problem. The data management problem they get no reward for, and they don't want to waste their time on it any more than necessary.
Here's some of the outcomes we might see on our campus or on any campus. Everything could stay the same, which is to say we worry a whole lot because nobody is taking care of it. We might, at least at the very lowest level, get more peer to peer sharing of expertise. Next I would really hope to see better policy discussions on campus and that we get some consistency across campus, and this would be an enormous contribution if we were to get a CIO for example. It's simply better policy coordination and a commitment to sustainability, that the campus would say it is important to maintain these things and we're gonna try to put in place services. We would look next at some new organizational approaches, and ultimately, and we do want to put in a pilot, a next pilot proposal for a model service that would include some database design and some consultation and metadata services.
Our plan right now at our campus is to kind of get our big LC NDIIPP project going and then use that architecture to leverage these results and expand into faculty data curation once we've got a nice digital archiving infrastructure put together for the geospatial digital archive. They're very very highly complementary and even in our library we don't have enough people to do both at once. So we think that actually we will be in a better position to reassure our faculty about research data once we have sort of unveiled this very targeted digital repository.
We are also poised to do a priority survey on faculty with certain options: What would you like to see? What is your greatest need in terms of data curation? I personally think this, all this area of data curation across the country, across the world, is going to be much more difficult, and controversial, than repositories of either administrative data records or of published work. I think there are incredible policy questions. It comes back to very old issues of faculty autonomy and who owns their work. Many faculty don't think of themselves as employees of the university, it shocks them to learn that the university might have the right to take a key and walk into their office while they're on sabbatical because you know they have a library book out or something like that. So the idea that they might place research data in a university archive is not something that is just a natural extension from the other kind of digital archiving work that we have been doing. There are some common technical issues and some policy issues like rights management, but I think the faculty culture issues are very complex. However if we do not start addressing some of these we are going to lose an enormous amount of not only data but of intellectual capital of the output of our universities and our funding agencies.
So with that, here are the URLs for our project website. This paper, the second bullet, just came out last month in the ECAR series. I tried to get permission to bring copies for everyone today and I couldn't get a response out of ECAR people, they may all be on vacation or snowed in or something. But that paper is sort of a summary of the project, the really detailed boring summaries are all posted publicly at our website.
So I would be happy to take questions, I think we have a few minutes.