
John Curtin records open to the world: How Australia's first Electronic Research Archive was developed

By Vicki Williamson, University Librarian and JCPML Director & Kandy-Jane Henderson, Archivist, JCPML

LASIE: Library Automation Systems Information Exchange, 30(1), March: 16-24.

INTRODUCTION

On February 9, 1999 the John Curtin Prime Ministerial Library (JCPML) unveiled its Electronic Research Archive (ERA) to the public for the first time. The driving force behind developing ERA was the premise that Australia's first prime ministerial library would be not a repository holding substantial original records, but an electronic gateway giving people access to John Curtin-related material held in its collection and in other collections in Australia and around the world.

The JCPML was initially conceived using the US presidential library model, but as our thinking matured and we researched US presidential libraries further (Williamson, 1993) it quickly became evident that we needed to take into account the characteristics of our unique cultural heritage. These include the general public's attitude towards Australian politicians compared with the North American perspective on leadership; our lack of a philanthropic tradition and acceptance of private/specialist archives in Australia; and the relationship which has developed in Australia between the professions of archives, librarianship and museums. As well, the JCPML operates in a legal and legislative environment, in regard to official archival records, that is quite different from that of the USA.

It remains to be seen whether or not the JCPML model will be further replicated in Australia.

STARTING SMALL

Attempting to turn our electronic vision into archival reality took us into uncharted territory, and we decided not to run the marathon before we could walk the course. Armed with some ideas of what we wanted our electronic research archive to be capable of delivering to users, during 1995 we developed three projects to test the ability of the current digital technology to provide such services.

1. John Curtin: A Prime Minister and his People

This joint venture with the National Archives of Australia (NAA) was initiated to test the feasibility of remote-site scanning. A selection of letters to and from the prime minister during 1941-45 was chosen from the NAA files, and none of the material ever physically left the NAA premises in Canberra. The database is composed of nearly 600 images representing approximately 500 individual documents, which provide high-quality digital facsimiles of the original material. The project was launched on the internet in July 1997.

2. John Curtin Memorial Lectures

This project was undertaken to test how well the current technology could digitize older material, since many records in the JCPML's physical collection are old type-written or manuscript documents. As the official repository for the John Curtin Memorial Lectures, the JCPML is responsible for maintaining, preserving and providing access to this material. This project tested current OCR software to convert the lectures into digital format, making it easier for users to access and search the lectures. This is a valuable consideration since the speeches average 20 typed pages, with some ranging to 50 or more pages. This project was launched on the internet in February 1998.

3. John Curtin: Australia's wartime Prime Minister

This interactive CD-ROM was developed to bring together a range of media from the JCPML collection and test how to best enhance access to such items as photographs, oral history recordings, textual documents and video recordings. The focus of the project is an educational one with users being able to access "chapters" of Curtin's life and obtain information in various formats.

While undertaking these digital projects we determined some key principles that became underpinning foundations. In particular, we wanted to:

  • Add value to what other institutions had already contributed rather than re-inventing the wheel.
  • Build partnerships with other organizations that might lead to future digital opportunities.
  • Establish a reputation as a professionally credible and collaborative institution.
  • Focus on imaging and accessing material.
  • Enhance access to material from our own collection and that held by other institutions that would give insight into what kind of person John Curtin was.
  • Identify standards for software, hardware, scanning formats and metadata.

Embarking on our digital journey also gave us the opportunity to undertake environmental scanning and we found that making some sense of the available information was a challenge in itself. We cast a wide net in an attempt to discover what the current practice was and we believe the importance of environmental scanning to our approach should not be underestimated.

What we learnt from these small beginnings was that:

  • Although a huge budget was not necessary, access to a critical mass and range of professional expertise was important.
  • The technology was not always where we wanted it to be, for example in the areas of OCR, sound and film.
  • The internet is a convenient vehicle for providing access to digitized material.
  • Public recognition and interest in what we were trying to achieve was high.

Above all we demonstrated that we could successfully digitize and thereby improve access to archival materials.

MOVING ON

The success of these projects and what we learnt from them enabled us to move on to developing the JCPML Electronic Research Archive (ERA) itself. Digitizing our physical collection means we can give researchers around the world full electronic access to the content of our archives. Digitizing John Curtin-related material held in other institutions will eventually allow us to pull together dispersed John Curtin records, including the contents of individual records and series of records, whilst maintaining the context in which those records were created.

To reiterate, our principal purpose for establishing ERA is to enhance access to records and create an electronic gateway giving people access to John Curtin-related material held in collections around the world.

There have been a number of critical steps involved in developing ERA. Firstly, a range of policy and planning documents was produced in 1997 to help us refine our electronic vision. These included our Strategic and Information Plan, Program Statement, Collection Development Policy and our Electronic Research Archive Management Framework. The last of these documents establishes the principles and best practices to be applied in the Electronic Research Archive and addresses the areas of:

  • cooperation between institutions;
  • criteria for selection of material for digitization;
  • integrity;
  • access;
  • technology and systems;
  • storage and back-up;
  • networking;
  • future migration paths; and
  • budget considerations.

During 1997 we also investigated other digital sites in Australia and the United States to better understand the current digital environment and to re-confirm our own future direction. We documented our concept in a paper entitled: JCPML Strategic Directions for Information Systems and Information Technology, and this assisted us in explaining the concept and direction to vendors and colleagues.

ENVIRONMENTAL SCANNING

What we discovered while visiting Australian sites was that significant work had been done in digitizing photographs, but apart from the work of the State Library of New South Wales little was being done to digitize manuscript or typescript documents.

In the United States we visited the National Archives and Records Administration; the Office of Presidential Libraries; the Library of Congress; the National Digital Project; the Heinz Archives; the World Bank and various presidential libraries. A digital archive with similar aims to the JCPML concept has been developed by the Heinz Archives at the Carnegie Mellon University in Pittsburgh. It is called the HELIOS Project and is accessible via the internet.

While the number of sites visited in the US is limited they do include some of the more prominent American archival institutions, so we feel justified in coming to the following conclusions:

  • Resolutions for scanning documents must be flexible. For example if scanning textual documents it is critical to capture the information content whereas museums and art galleries have different requirements when scanning objects. Where objects may need to be scanned at 2,400 dpi and upwards to achieve the quality required, a typed or manuscript document may only need to be scanned at 300 dpi. This lower resolution obviously has benefits in terms of storage capacity when you are dealing with hundreds or thousands of documents.
  • To meet basic preservation and access requirements the information content needs to be appropriately captured and stored. There must be a migration path to ensure that the digitized material will always be available.
  • Search/retrieval software is necessary to access digitized material.
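The storage benefit of the lower resolution can be illustrated with some simple arithmetic. The sketch below is ours for illustration only: it assumes an 8 x 10 inch page, 8-bit greyscale and no compression, none of which is specified above, and the function name is hypothetical.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a scan of the given physical size."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumptions (not from the article): 8 x 10 inch page, 8-bit greyscale.
PAGE_WIDTH_IN, PAGE_HEIGHT_IN, BITS = 8, 10, 8

low = uncompressed_bytes(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, 300, BITS)
high = uncompressed_bytes(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, 2400, BITS)

print(f"300 dpi:  {low / 1e6:.1f} MB per page")    # 7.2 MB
print(f"2400 dpi: {high / 1e6:.1f} MB per page")   # 460.8 MB
print(f"ratio:    {high // low}x")                 # 64x
```

Because the pixel count grows with the square of the resolution, an eightfold increase in dpi means roughly 64 times the storage per page before any compression is applied.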

COSTS

The costs of developing any major digitization program, in terms of investing in staff time and training and in purchasing appropriate technology and equipment, are without doubt substantial. Without appropriate sponsorship the JCPML Electronic Research Archive would not have progressed as quickly as it has. Digital Corporation is a major sponsor of The John Curtin Centre and has supplied generous discounts for all of our computer hardware, including PCs, laptop computers and a Unix server. Digital also met the costs of the staff training needed to enable us to use the software, which could only be provided from the United States.

The most critical, and also the most expensive, decision we had to make during the process of developing ERA was in selecting software that provided the search/retrieval capabilities we required. As early as May 1996 the JCPML established the need to purchase search/retrieval software to progress beyond the pilot project phase. The JCPML does not have the time, expertise or desire to build its own systems from the ground up, so to achieve this aim we made innovative use of commercial off-the-shelf software systems which are flexible enough to provide the necessary functions of capturing, storing, managing and accessing the JCPML collection (and other collections) to our clients world-wide. The drawback is that this has necessitated some compromises on our part.

The system selected was the Electronic Filing System (EFS) from Excalibur Technologies. It was primarily chosen because of its search capabilities, especially its "fuzzy" logic which eliminates the need to have 100% clean text for searching purposes. Also, EFS technology has been around for a number of years so one of its benefits is that it is a tried and tested system.
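To show why tolerant matching removes the need for perfectly clean text, the following minimal sketch matches a query against words containing typical OCR errors. It is not Excalibur's method, which is not described here; it simply uses the similarity ratio from Python's standard `difflib` module as a stand-in. The function name, the threshold of 0.8 and the sample sentence are all our own assumptions.

```python
from difflib import SequenceMatcher


def fuzzy_find(query, text, threshold=0.8):
    """Return (word, score) pairs whose similarity to query meets the threshold."""
    hits = []
    for word in text.split():
        # Drop surrounding punctuation and ignore case before comparing.
        cleaned = word.strip(".,;:!?\"'()").lower()
        score = SequenceMatcher(None, query.lower(), cleaned).ratio()
        if score >= threshold:
            hits.append((cleaned, round(score, 2)))
    return hits


# Invented sample with two OCR-style errors: "Curtln" and "defcnce".
ocr_text = "The Prime Minister, Mr Curtln, addressed the House on defcnce."

print(fuzzy_find("curtin", ocr_text))   # finds the misread "curtln"
print(fuzzy_find("defence", ocr_text))  # finds the misread "defcnce"
```

An exact-match search for "curtin" would miss this sentence entirely, whereas a similarity threshold still retrieves it. The trade-off is that a lower threshold returns more false matches.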

EFS allows a number of options when digitizing such as producing an image, image and text, or text only, and gives us the flexibility of providing access to our finding aids. The documents digitized in EFS have a range of intellectual, control, administrative and technical metadata attached to them. Rather than store image and text files within EFS itself, we made the decision to store them in separate directories. This reduces our dependence on proprietary software and allows for continued access by other software in the future.
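The idea of keeping files outside the retrieval system while still attaching metadata to each document can be sketched as a plain record that points to the files. This is an illustration only and does not describe how EFS stores its data: the field names, directory names, file extensions and the sample identifier are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import PurePosixPath


@dataclass
class DigitizedItem:
    """One digitized document: file locations plus four groups of metadata."""
    item_id: str
    image_path: str   # image kept in its own directory, outside the search system
    text_path: str    # OCR text kept in a separate directory
    intellectual: dict = field(default_factory=dict)
    control: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)


def make_item(item_id, title, dpi):
    """Build a record using assumed directory names and file extensions."""
    return DigitizedItem(
        item_id=item_id,
        image_path=str(PurePosixPath("images") / f"{item_id}.tif"),
        text_path=str(PurePosixPath("text") / f"{item_id}.txt"),
        intellectual={"title": title},
        technical={"dpi": dpi},
    )


# Placeholder values, not a real JCPML record.
item = make_item("EXAMPLE0001", "Letter (placeholder title)", 300)
print(json.dumps(asdict(item), indent=2))
```

Because the record is plain JSON and the files sit in ordinary directories, any later software can read both without going through the original system, which is the point of the design choice described above.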

Excalibur Technologies are currently developing a new system which promises further search enhancement capabilities as well as visual and sound data management and access so there is an exciting migration path available to us. This was obviously a critical element in deciding to purchase EFS.

ACCESS TO OTHER COLLECTIONS

An important aspect of the work being done at the JCPML is gathering information about collections held in other institutions and private hands. The JCPML Collections Information website has been established to collect data and provide a mechanism for evaluating series and items relating to John Curtin. This means that consistent information can be gathered from anywhere in the world via an electronic form which is then forwarded to the Archivist who evaluates the information against specified criteria and makes a decision about whether or not to digitize or copy the material in some other form. Historians with a particular interest in John Curtin or his times are regularly consulted to help us determine the research value of records that may be relevant to our collection. This website is also building up a resource of information for referral purposes and reference enquiries.

We also feel that building collaborative digitization partnerships is an essential mechanism for the cooperative sharing of information and we have already had some successful negotiations. For example, the National Library has already made available one of its collections for digitization and incorporation into EFS, and the JCPML has previously collaborated with the National Archives of Australia on a large pilot project. We anticipate that further collaboration on future proposals will continue as the majority of the records of interest to the JCPML for digitizing are held by other institutions, not only in Australia but also in Britain and the United States.

One of our early hopes was to have a portable scanning station available to take to remote locations. We had envisaged that a researcher or operator would be able to take a notebook computer and scanner to a remote site to scan records and download them directly to our server. This proved to be a bit ambitious! We believe the technology is close to making this scenario practical in some instances, but for the time being scanners, at least, are not robust enough to be transportable to the extent we need.

Copyright can be a major stumbling block when it comes to providing electronic access to records. What we found when discussing digitization with other Australian institutions was a general agreement to digitize records out of copyright wherever possible. Given that most of the records we are interested in are about 50 years old and unpublished, this is not always possible. Luckily we gained a great deal of experience when we dealt with the copyright issues involved in our digital projects and have now developed guidelines and procedures to assist in this process.

WHAT'S NEXT?

For the last few months of 1998, ERA was in a developmental and testing phase. A critical aspect of the testing phase involved representatives from a variety of user groups and reference archivists to help determine and refine access requirements. Access to ERA is now provided to the public via the internet in the JCPML reading room. Following successful data migration to new software later this year, full web access to ERA should be available world-wide.

We have not addressed the issue of standards in this article. That does not mean that we do not think they are of critical importance. We have spent a substantial amount of time investigating relevant standards, reviewing the literature to determine how they have been applied and looking at the emergence of de facto standards where there has been a lack of formal standards. However, ample literature has already been published on the topic.

In the process of developing ERA we have learnt that, given the will (and by that we mean the determination to digitize) and of course the funds, anyone can do it. A large staff is not required. The JCPML has a full-time equivalent staff of two and a half working on ERA as well as providing all of our other services and programs. What is essential is to have appropriate technical and systems support available when embarking on a program of this magnitude. This on-going support is absolutely crucial when venturing into uncharted waters and cannot be too highly stressed.

CONCLUSION

The JCPML developed the Electronic Research Archive concept for a number of reasons:

  • To build upon the expertise and confidence which was developed during the pilot projects by moving into a phase which is fundamental to our vision. This expertise has been retained in-house and continues to be developed within the JCPML and has been a significant factor in our progress.
  • To create a research archive that is easy to access and which is not dependent on a physical presence in the reading room.
  • To bring together archival materials that are located not only in the JCPML collection, but in other institutions and private hands in Australia and around the world.
  • To provide integrity in the context of digitized archival records through a variety of control mechanisms relating to intellectual control, administrative and technical metadata.

While enhancing access is the primary aim, digitization is also about preservation by limiting handling of the original documents and encouraging use of the digital object for most, if not all, research purposes.

Our guiding principles for all of our programs and activities, not just the development of ERA, are the values which we believe Curtin himself represented. With ERA we have tried to:

  • advance our vision for improving access to archival materials;
  • show leadership in professional practice; and
  • share our experience and lessons learnt with the professional community.

The development of ERA has been a truly collaborative effort and we hope to continue developing our existing alliances and build new partnerships as we progress into the future.

ACKNOWLEDGEMENTS

Our thanks to Lesley Carman-Brown, Public Programs Coordinator; David Wylie, Archives Technician, and the staff of the Library and Information Service for their substantial contributions in assisting to develop ERA.

REFERENCES

Carpenter L., Shaw S. and Prescott A. (1998) Towards the Digital Library: The British Library's Initiatives for Access Program. The British Library, London.

Digitisation Forum Online http://www.digitisation.net.au

Galloway, Edward and Michalek, Gabrielle. (1998) "The Heinz Electronic Library Interactive On-Line System (HELIOS): An Update", The Public-Access Computer Systems Review, Volume 9, Number 1. http://epress.lib.uh.edu/pr/v9/n1/gall9n1.html

Heinz Archives HELIOS Project. http://diva.library.cmu.edu/HELIOS/

JCPML Website http://john.curtin.edu.au

John Curtin Prime Ministerial Library. (1997) Collection Development Policy. Curtin University of Technology, Perth.

John Curtin Prime Ministerial Library. (1997) Electronic Research Archive Management Framework. Curtin University of Technology, Perth.

John Curtin Prime Ministerial Library. (1997) Program Statement. Curtin University of Technology, Perth.

John Curtin Prime Ministerial Library. (1997) Strategic Directions for Information Systems and Information Technology. Unpublished Paper, Curtin University of Technology.

John Curtin Prime Ministerial Library. (1997) Strategic Plan and Information Plan, 1997-2001. Curtin University of Technology, Perth.

MetaWeb Project http://purl.nla.gov.au/metaweb/home

Newsnotes (1997) in Archives and Manuscripts 25(2), Nov, 439-440.

Smith, Clive. (1997) "Implementation of imaging technology for recordkeeping at the Worldbank", Bulletin of the American Society for Information Science, June/July.

Williamson, V.K. (1993) Report on visits to US Presidential Libraries, July-August 1993, Unpublished Report, Curtin University of Technology, Perth.

Williamson, V.K. and Henderson, K-J. (1998) 'The electronic research archive at the John Curtin Prime Ministerial Library', Presented at the Australian Society of Archivists Conference "Place, Interface & Cyberspace : Archives at the Edge", Fremantle, August.