An overview of ADDSIA 


Objective

The objective of this project is to use distributed database techniques and World Wide Web (WWW) technology to give the European research and policy community more effective access to statistical data, helping research and policy analysts to make cross-dataset comparisons and providing European National Statistical Institutes (NSIs) with a productivity tool.

Approach to the work

The starting point of our approach is that the large-scale producers of statistics have already put in place statistical information systems which collect and store data and the associated metadata. However, these systems have been developed to meet the needs of different institutes and are not compatible with one another. Moreover, they are needed for the day-to-day running of these institutes and cannot readily be replaced by more up-to-date technology in the short term. At the same time, there is a demand for better access to data from the wider statistical community.

The project seeks to address these problems while making only three basic requirements of an NSI: that it stores its data in a relational database or other well-known format; that it has access to the Internet; and that it can supply a minimum amount of data definition to make the data available. We aim to demonstrate how the World Wide Web could be used to provide a service that is run under the control of the publishing body. This means that we must address, at some level, the problems posed by such a service, without concentrating on any one problem in depth. Our aim is rather to provide a framework into which the results of deeper studies can be fed.

To investigate these problems, we take a modular approach, dividing the work into six major segments, each of which has a number of smaller modules. Each segment has both a theoretical and practical goal. Three of the segments are central to the system, and three concentrate on identified application areas. We give a brief description of these segments, along with their theoretical and practical objectives.

Communications Segment:

This central segment handles communication between local sites which access the NSI data and a global site which is managed by a Domain manager.

Theoretical objective: to advance the methodology of communications for distributed statistical databases, particularly in a Web-based environment.

Practical objective: to build the demonstrator communications modules and their interfaces to other modules.

Secure Interface Segment:

This central segment handles the interface between existing micro data and the outside environment.

Theoretical objective: to contribute to the advancement of security methodology, covering both protection against unauthorised access and the prevention of disclosure, and to provide a framework in which advanced methodological algorithms can be expressed.

Practical objective: to build a demonstrator secure interface which handles all transactions between existing data and the external world.
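As an illustration of the kind of disclosure rule such an interface might apply, the following is a minimal sketch of a minimum-cell-count check on a frequency table. The function name, the threshold value, and the table layout are assumptions made for this example; the project text does not specify which disclosure rules the demonstrator uses, and a real system would also need complementary suppression so that withheld cells cannot be recovered from totals.

```python
from typing import Dict, Optional

# Hypothetical threshold: cells built from fewer respondents than this are withheld.
MIN_CELL_COUNT = 3


def suppress_small_cells(
    table: Dict[str, int], threshold: int = MIN_CELL_COUNT
) -> Dict[str, Optional[int]]:
    """Return a copy of a frequency table with cells below the threshold set to None.

    This is primary suppression only; it does not protect against
    recovering a withheld cell by subtraction from a published total.
    """
    return {
        category: (count if count >= threshold else None)
        for category, count in table.items()
    }


# Example: counts of respondents per region (invented figures).
released = suppress_small_cells({"north": 120, "south": 2, "west": 45})
```

Here the "south" cell is withheld because it falls below the assumed threshold, while the other cells pass through unchanged.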

Metadata Segment:

This central segment handles all aspects of metadata in the system, both local and global.

Theoretical objectives: (a) To identify the minimum subset of metadata necessary for the running of the system. (b) To develop mechanisms for handling all metadata supplied by the NSIs. (c) To investigate the use of free-text indexing to impose a structure on existing documentation.

Practical objectives: (a) To construct the demonstrator metadata modules for both local and global environments. (b) To develop an internal interface language for interrogating the metadata modules.

Data Access Segment:

This application segment handles the interface with the administrative user, providing both data and user registration functions.

Theoretical objectives: To determine the needs of the administrations, with regard to (i) managing a global site for a domain (ii) registering data and (iii) registering users.

Practical objectives: To construct demonstrator data access modules both at a local (NSI) and global (Domain manager) level, and to provide appropriate user interfaces.

Publication Segment:

This application segment extracts tables in printed and electronic format for publication.

Theoretical objectives: To compare the metadata needed for dataset description and output tables description, and obtain a synthesis.

Practical objectives: To construct the demonstrator publishing module and an interface for the internal (privileged) user.

Analysis Segment:

This application segment enables analysis between datasets without recourse to the original micro datasets.

Theoretical objectives: (a) To develop a language for describing the formal manipulation of statistical macro (summary) objects. (b) To build on existing theoretical work and use this language to define operations and functions for the system.

Practical objectives: To construct the demonstrator analysis module and an interface for the external (Web-based) user.
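To make the idea of manipulating macro (summary) objects concrete, here is a small sketch in which two summaries are combined without touching the underlying micro data. The class name `MacroObject` and its fields are hypothetical and chosen for this example only; they are not the project's formal language, which is not described in this overview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacroObject:
    """A summary of one variable over a group: number of records and their sum."""

    count: int
    total: float

    def combine(self, other: "MacroObject") -> "MacroObject":
        # Summaries over disjoint groups add component-wise.
        return MacroObject(self.count + other.count, self.total + other.total)

    @property
    def mean(self) -> float:
        if self.count == 0:
            raise ValueError("mean is undefined for an empty summary")
        return self.total / self.count


# Two summaries as they might be supplied by two different sources (invented figures).
site_a = MacroObject(count=4, total=100.0)
site_b = MacroObject(count=6, total=300.0)

# The pooled mean is derived from the summaries alone.
pooled = site_a.combine(site_b)
pooled_mean = pooled.mean
```

Keeping counts and sums rather than means is the design choice that makes the combination exact: a mean of means would be wrong whenever the groups differ in size.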


Results

The results of the project will be:

(a) a software tool which comprises the six modules and the interfaces, which will be available as a demonstration tool from a Web server

(b) reports on theoretical advances, proofs of concept and utilisation of these concepts within the national statistical offices of the partnership

(c) formal languages for describing (i) existing metadata systems (ii) the algebra of manipulating macro statistics objects (iii) security algorithms and rules of disclosure. 


Impact and Exploitation

The main deliverable will be a suite of programs which can be downloaded from the Web, and which will facilitate the registration and management of data, the publishing of data, and remote analysis of data without recourse to the individual micro data. In the course of producing this prototype, we will investigate, and report on, issues of security, efficiency and the harmonisation of metadata.

Exploitation will take place in two forms:


Published by CES at the University of Edinburgh
Last updated 4th September 2000
Comment on These Pages to j.m.lamb@ed.ac.uk