Third Meeting, Part Two, Amsterdam, 1-2 December 1999
Attendance: George MacKenzie, Rob Mildren, NAS; Jaap Kloosterman, IISG; Goran Kristiansson, Lena Wilhelmsson, RA; Peter Horsman, outside expert.
1. System Profiles
The aim of the audit of IT systems in this workpackage is to identify those aspects that will permit effective exchange of data between systems and with outside users, which is the ultimate aim of the project. It was agreed that one way of expressing this would be to compile a set of criteria for future participation in the EUAN project. A list was drafted for discussion and is given in Annex 1. This will be developed into part of the specification for the prototype.
2. Computing Approaches
2.1 The meeting identified three possible approaches, which fall into two basic types: centralised and de-centralised.
2.2 The de-centralised approach would be based on the Z39.50 protocol, and involve sending a query around a variety of databases. For this to work, each database must be set up to receive and process the query.
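As an illustration only, the de-centralised pattern can be sketched as a query fanned out to every partner database, with the answers merged afterwards. This is not a Z39.50 implementation: the in-memory lists, the sample titles and the function names are all invented for the sketch, and a real system would reach each database through a Z39.50 client.

```python
from concurrent.futures import ThreadPoolExecutor

# Invented sample data standing in for the partner databases. A real
# deployment would query each one over Z39.50, not an in-memory list.
PARTNER_DATABASES = {
    "NAS": ["Harbour board minutes", "Parish registers"],
    "IISG": ["Trade union circulars"],
    "RA": ["Harbour customs ledgers", "Royal correspondence"],
}


def search_one(name, term):
    """Each database receives and processes the query itself."""
    titles = PARTNER_DATABASES[name]
    return [(name, title) for title in titles if term.lower() in title.lower()]


def federated_search(term):
    """Send the same query to every database and merge the answers."""
    names = sorted(PARTNER_DATABASES)
    with ThreadPoolExecutor() as pool:
        per_database = list(pool.map(lambda n: search_one(n, term), names))
    return [hit for hits in per_database for hit in hits]


results = federated_search("harbour")
```

The sketch shows why every database must be set up to handle the query: a partner that cannot answer simply contributes nothing to the merged result.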
2.3 The centralised approach would, by contrast, bring together a subset of the data from the different databases into a text index, using software, such as Fulcrum or Optosof, already used in the library world. This index would be used for searching, rather than the databases themselves. The text index would be based on agreed data elements, conforming to the 13 core ISAD elements identified in the archival workpackages. Updating the text index could be achieved in one of two ways.
2.3.1 In model 1, the individual system databases would export a set of data at regular intervals in an agreed format to the text index. This is illustrated in Annex 2.
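A minimal sketch of model 1 follows, assuming each system can tell which records have changed since its last export. JSON lines are used here purely as a stand-in for whatever exchange format is agreed, and the field names (`reference_code`, `title`, `modified`) and sample records are hypothetical.

```python
import json
from datetime import date


def export_batch(records, last_export):
    """Serialise records modified after `last_export`, one JSON object per line."""
    lines = []
    for rec in records:
        if rec["modified"] > last_export:
            out = dict(rec, modified=rec["modified"].isoformat())
            lines.append(json.dumps(out, ensure_ascii=False, sort_keys=True))
    return "\n".join(lines)


def load_batch(index, batch):
    """Merge an exported batch into the central index, keyed by reference code."""
    for line in batch.splitlines():
        rec = json.loads(line)
        index[rec["reference_code"]] = rec
    return index


# Invented sample records held by one partner system.
local_records = [
    {"reference_code": "RA/7", "title": "Customs ledgers", "modified": date(1999, 11, 15)},
    {"reference_code": "RA/8", "title": "Royal correspondence", "modified": date(1999, 10, 1)},
]

batch = export_batch(local_records, last_export=date(1999, 11, 1))
central_index = load_batch({}, batch)
```

Keying the central index on the reference code means a re-exported record overwrites its earlier copy rather than duplicating it.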
2.3.2 In model 2, a web crawler would regularly search the individual systems, extracting new data and copying it to a series of searchable pages. This is illustrated in Annex 3.
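Model 2 can be sketched in the same spirit. The example below assumes the crawler recognises new or changed data by comparing a digest of each page with the one recorded on the previous visit. That detection method is an assumption, not something the meeting specified, and the URLs and page contents are invented.

```python
import hashlib


def crawl(fetch, urls, seen):
    """Visit each URL and return the pages whose content is new or changed.

    `fetch` is any callable mapping a URL to its text. `seen` maps each URL
    to the digest recorded on the previous crawl and is updated in place.
    """
    changed = {}
    for url in urls:
        text = fetch(url)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if seen.get(url) != digest:
            seen[url] = digest
            changed[url] = text
    return changed


# Hypothetical pages standing in for two partner systems.
pages = {
    "http://nas.example/fonds/1": "Harbour board minutes",
    "http://ra.example/fonds/7": "Customs ledgers",
}
seen = {}
first = crawl(pages.get, pages, seen)    # everything is new
second = crawl(pages.get, pages, seen)   # nothing has changed
pages["http://ra.example/fonds/7"] = "Customs ledgers, 1750-1800"
third = crawl(pages.get, pages, seen)    # only the edited page
```

Unlike model 1, nothing is required of the partner systems beyond publishing their pages; the cost is that the crawler must revisit every page to find out what has changed.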
2.4 Both the centralised and de-centralised approaches have advantages and disadvantages.
2.5 Intelligent agent software can be used to enhance either the text index or web crawler models. In the centralised approaches, a suitable exchange format needs to be agreed. It was agreed that, as a data standard, MARC-AMC or EAD would be suitable and could be implemented by IISG and RA. It was also agreed that Unicode could be used for character sets. Dates should be exported in ISO 8601 format, though the display format would be set either by the user's web browser or by the user.
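The separation between export format and display format might look like the sketch below. The choice of UTF-8 as the Unicode encoding, NFC normalisation, and the three display preferences are assumptions made for illustration; the meeting agreed only on Unicode and ISO 8601.

```python
import unicodedata
from datetime import date

# Assumed display preferences; a real system might take these from the
# user's browser settings or from an explicit user choice.
DISPLAY_FORMATS = {"iso": "%Y-%m-%d", "dmy": "%d/%m/%Y", "mdy": "%m/%d/%Y"}


def export_field(text):
    """Normalise text to NFC and encode it as UTF-8 for exchange."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


def export_date(d):
    """Export a date in ISO 8601 calendar form (YYYY-MM-DD)."""
    return d.isoformat()


def display_date(iso_text, preference="iso"):
    """Render an exported ISO 8601 date according to the user's preference."""
    return date.fromisoformat(iso_text).strftime(DISPLAY_FORMATS[preference])


exported = export_date(date(1999, 12, 1))
shown = display_date(exported, "dmy")
```

Because the exported value is always unambiguous, the same record can be shown as 01/12/1999 to one user and 12/01/1999 to another without any change to the exchanged data.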
3.1 It was agreed that name authorities represent a potential problem. In each partner institution, there are de facto authorities for place names and for personal and organisational names, which follow institutional or national rules. Once descriptions are brought together at the EUAN project level, these authorities may differ from one another. The risk grows with the number of partners in EUAN and with the level of detail of the descriptions included. This is really a question for the archival group, but it could have an impact on the technical implementation. It was concluded that, with the current five partners and top level descriptions only, the problem is manageable.
3.2 Two means of dealing with it were identified:
3.3 It was also agreed to consult the archival group further on this question. In the longer term, there may be a role for the European Commission in co-ordinating work on name authorities at a trans-national level.
3.4 The best vehicle for exchanging name authority data is likely to be a new Document Type Definition (DTD) based on ISAAR(CPF), which has been drafted by Daniel Pitti. P-G Ottosson in RA is currently testing the first draft.
Last updated 1 March 2000.