Metadata and Archival Description: Paper by Peter Horsman



Metadata is not a user-friendly term. However, the underlying concept is relatively straightforward; metadata is simply meaningful data describing another data object. [1]

Indeed, outside information science the term is more confusing than helpful, since it almost suggests that metadata is one specific category of data and that metadata standards should apply to all metadata. Neither suggestion is true. If one leaves out the word 'information', metadata becomes just data about any kind of object. What is 'meaningful' depends on the kind of object the data is about, as well as on the purpose for which it is used. For archival (meta)data the object is not always one discrete object, but often a complex of interrelated objects, both physical and abstract, including documents, document aggregations, document creators, business processes, curators, etc., reflected in the three major archival concepts about records: content, structure and context. An archival document is not just one 'discrete' data object, and the data that archivists create, collect, update and use about it are not just descriptive metadata: they serve a variety of archival business functions, including retrieval, preservation, storage, appraisal, disposition, etc. They are indispensable for physical, administrative and intellectual control over archival materials. [2]

Archival Standards

Emerging archival (meta)data standards, such as ISAD(G), ISAAR(CPF), and EAD refer primarily to the area of intellectual control: archival description for purposes of access to and retrieval of records - even if ISAD is somewhat broader and includes some standards for administrative and physical control. Basically the standards reflect the complexity of archival description, paying full attention to the records' context, content and structure.

But there is more to metadata than description; a more inclusive conceptualisation of metadata is needed as information professionals consider the range of their activities that may end up being incorporated into digital information systems. [3]

From the moment archivists became seriously involved in electronic records, metadata became an issue from an archival point of view. One of the most recent results in this field is the Australian metadata standard.

Although a close comparison between traditional descriptive metadata standards and metadata for electronic records has not yet been made, it is obvious, first, that an integration of both categories should take place and, second, that full archival metadata standards will be broader than ISAD and ISAAR.

Establishing such broader standards should - in my opinion - not start with listing all possible kinds of metadata elements, but with an analysis of archival information systems. Once again, I refer to David Bearman's contribution to the international expert meeting on descriptive standards, Ottawa, 1988. [4] The whole of metadata is a metadata system, which belongs to the class of data systems. A data system is a part of an information system. Data serve a goal: to satisfy information needs. Anne Gilliland's article in the cited Getty Information Institute publication distinguishes categories of metadata according to archival business processes, as, more or less, does the Australian metadata standard.

Metadata and the World Wide Web

Records are certainly not the only 'data objects' about which meaningful data is created and used. As a matter of fact, other, even younger, information disciplines were using the term metadata before archivists discovered the word. Particularly in database administration, metadata systems (data dictionaries, data directories) play an important role. A new area of metadata emerged with the recent worldwide use of the Internet. Since anyone can open a web-site (and virtually everyone seems to do so), the biggest challenge is how to make your web-site findable and, from a user's point of view, how to find the web-site which contains the information you are looking for: resource discovery on the www is like 'finding needles in a global haystack'. Web technology offers two kinds of search engines: crawlers and directories. Crawlers (also termed spiders or robots) are smart pieces of software that 'traverse the web, visiting sites continuously, saving copies of the resources and their locations as they go in order to build up a huge catalogue of fully indexed pages. They typically provide powerful searching facilities and extremely large result sets, which are "relevance-ranked" in an effort to make them useable.' Directories are human-created lists of network resources, using metadata in order to classify and catalogue web resources. Crawlers provide high recall but lower precision. Directories are more precise but have lower recall and are dependent on human insight. [5]
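The trade-off between recall and precision can be made concrete with a small calculation. The sketch below is illustrative only: the page identifiers and result sets are invented for this example, and the figures say nothing about any real search engine or directory.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a result set against the truly relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four pages are truly relevant to the query.
relevant = {"p1", "p2", "p3", "p4"}

# A crawler returns many pages: every relevant one, plus a lot of noise.
crawler_results = {"p1", "p2", "p3", "p4", "n1", "n2", "n3", "n4", "n5", "n6"}

# A directory returns few pages: all of them relevant, but it misses some.
directory_results = {"p1", "p2"}

crawler_pr = precision_recall(crawler_results, relevant)
directory_pr = precision_recall(directory_results, relevant)
```

In this invented case the crawler finds everything relevant but only four of its ten results are useful, whereas the directory returns only useful results but finds half of what exists.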

A solution which tries to combine the strengths of both approaches is at a first level a site description (through metadata), and then a full text crawler within the site.
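One way to picture this combined approach is a two-stage lookup: site-level metadata narrows the set of candidate sites, and a full-text search then runs only within those sites. The sketch below assumes an invented site catalogue and a naive substring match; the names `SITES`, `find_sites`, `search_within` and `two_level_search` are hypothetical and do not refer to any existing system.

```python
# Hypothetical site catalogue: each site has descriptive metadata and page texts.
SITES = [
    {
        "title": "City Archives of Exampletown",
        "subject": ["archives", "municipal records"],
        "pages": {
            "/fonds/council": "Minutes of the town council, 1850-1900.",
            "/fonds/harbour": "Harbour board correspondence and ledgers.",
        },
    },
    {
        "title": "Exampletown Cycling Club",
        "subject": ["sport"],
        "pages": {"/news": "Minutes of the annual general meeting."},
    },
]


def find_sites(sites, subject):
    """Stage 1: select sites whose metadata lists the given subject."""
    return [s for s in sites if subject in s["subject"]]


def search_within(site, term):
    """Stage 2: naive full-text search over the pages of one site."""
    term = term.lower()
    return [path for path, text in site["pages"].items() if term in text.lower()]


def two_level_search(sites, subject, term):
    """Combine both stages; return (site title, page path) pairs."""
    return [
        (site["title"], path)
        for site in find_sites(sites, subject)
        for path in search_within(site, term)
    ]


results = two_level_search(SITES, "archives", "minutes")
```

The cycling club page also contains the word 'minutes', but it never reaches the full-text stage because the site-level metadata has already excluded that site.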

So, one of the functions of web metadata is typically the same as in many other sectors of the information profession, including libraries and archives: discovery and retrieval of (data) objects. In the case of the www these objects are in the first place web-sites and web pages. Metadata are - one might say - the elements of the catalogue of the web.

Dublin Core

Like librarians before them, web-designers discovered the advantages of standardisation of descriptive metadata to enable effective, global retrieval. (Until a few years ago, the archival community did not look much further than their own repository, or at most their own country. The motto was: my archives are unique.) In the awareness of the advantages of standardisation for web-metadata lie the origins of the Dublin Core standards - which by nature are quite different from library catalogues and archival description. Furthermore, because of the large variety of data-objects (web-sites) the standards are extremely generic. Compared with ISAD(G) and ISAAR(CPF), the Dublin Core is less meaningful and has less power to express the contents of the site. Being limited to 15 data elements certainly has advantages for acceptance and implementation. For archives, however, essential elements, particularly those referring to the context, are completely missing or have a different meaning. The same is true for the structure of the materials described. Both context and structure are essential for archival description.
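To illustrate how generic the element set is, the sketch below lists the 15 Dublin Core elements and fills in a few of them for an invented archival web-site. The record values are hypothetical, and the short list of archival notions checked at the end is a simplification chosen for this example (names loosely following ISAD(G) element titles), not a mapping defined by either standard.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical site-level record; unused elements are simply omitted.
site_record = {
    "title": "City Archives of Exampletown",
    "creator": "City Archives of Exampletown",
    "subject": "archives; municipal records",
    "language": "en",
    "type": "Collection",
}


def validate(record):
    """Return the keys of a record that are not Dublin Core elements."""
    return sorted(k for k in record if k not in DUBLIN_CORE_ELEMENTS)


# Assumed shorthand for archival notions of context and structure
# (illustrative names only, not an official crosswalk).
archival_notions = [
    "level_of_description",
    "administrative_history",
    "system_of_arrangement",
]
missing = [n for n in archival_notions if n not in DUBLIN_CORE_ELEMENTS]
```

None of the three archival notions has a dedicated Dublin Core element, so in this sketch `missing` contains all of them; a description that needs them has to look beyond the 15 elements.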

Metadata Standards and EUAN

Should, then, EUAN neglect Dublin Core and focus solely on archival standards? I don't think so. No doubt, Dublin Core and similar standards have great value for any project that aims to make information available through the Internet. The question, however, is at what level. Dublin Core metadata refer to web-sites; therefore they should be used for indexing the web-sites that will be the component parts of the EUAN network. The contents of the web-sites, however, should be described according to archival standards, which, after all, are superior to Dublin Core for this purpose.

The fact that the EVA project may adopt Dublin Core for description does not mean that EUAN should do the same. Although EVA is for the greater part a co-operation between archival institutions, the data objects they will make available through the Internet are not treated as archival materials, but rather as isolated digitised objects. EUAN aims to create a Europe-wide facility to find archival resources, but one that is firmly grounded in archival theory.

Peter Horsman
Version 1.1


1. Tony Gill, 'Metadata and the World Wide Web', in: Murtha Baca (ed), Introduction to Metadata: Pathways to Digital Information. S.l.: Getty Information Institute, 1998.
2. Anne Gilliland-Swetland, 'Defining Metadata', o.c.
3. Ibid.
4. The proceedings are published by Saur: Towards International Descriptive Standards. München: Saur, 1993.
5. Tony Gill, o.c.

Last updated 3 July 2000.

This page is maintained for EUAN by the International Institute of Social History, Amsterdam.
We highly appreciate comments and suggestions.