There are many ways to conduct a data integration project.
The following list provides an overview of the general stages of a project once all of the agreements have been signed and approved (see project application and approval).
- Extract and transfer data - the data custodians provide data to the integrating authority, as specified in the project agreements.
- Prepare data for linking - prior to linking, the data needs to be cleaned and standardised. This may be conducted by the data custodians and/or the integrating authority.
- Linking and merging of data - the source data is combined to create a new integrated dataset. This stage is managed by the integrating authority; however, some components may be outsourced or conducted in partnership.
- Access to integrated data - the integrating authority is responsible for providing the data users with secure access to the integrated dataset, de-identified and confidentialised according to the requirements of data custodians.
- Analysis of integrated dataset - data users conduct analysis of the integrated dataset and release outputs.
- Evaluation and project completion - the integrating authority conducts the evaluation and completion of the project. This includes managing the storage or destruction of the integrated dataset.
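As a rough illustration of the preparation, linking and access stages listed above, the following Python sketch standardises two small source datasets, links them on a hashed key, and drops the identifying fields before the result is handed to data users. Everything in it is an assumption made for illustration: the field names (`name`, `dob`), the sample records, the `salt` value and the function names are hypothetical, and exact matching on a hashed key is only one simple approach. A real project may use other linking methods and will follow the protocols set out in its agreements.

```python
import hashlib


def standardise(record):
    """Clean one record: collapse whitespace and normalise case."""
    cleaned = dict(record)
    cleaned["name"] = " ".join(record["name"].split()).upper()
    cleaned["dob"] = record["dob"].strip()
    return cleaned


def linkage_key(record, salt):
    """Derive a linkage key by hashing the standardised identifiers."""
    raw = f"{record['name']}|{record['dob']}|{salt}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def link_and_merge(source_a, source_b, salt="hypothetical-project-salt"):
    """Link two sources on the key and return de-identified merged records."""
    # Index the second source by linkage key for fast lookup.
    index_b = {}
    for rec in source_b:
        clean = standardise(rec)
        index_b[linkage_key(clean, salt)] = clean

    integrated = []
    for rec in source_a:
        clean = standardise(rec)
        key = linkage_key(clean, salt)
        match = index_b.get(key)
        if match is None:
            continue  # no counterpart in the second source
        merged = {**clean, **match}
        # De-identify: remove direct identifiers, keep only an opaque ID.
        merged.pop("name")
        merged.pop("dob")
        merged["link_id"] = key[:12]
        integrated.append(merged)
    return integrated


# Hypothetical extracts supplied by two data custodians.
health = [
    {"name": " jane  citizen ", "dob": "1980-01-31", "hospital_visits": 2},
    {"name": "John Smith", "dob": "1975-06-15", "hospital_visits": 0},
]
education = [
    {"name": "Jane Citizen", "dob": "1980-01-31 ", "qualification": "Diploma"},
]

integrated = link_and_merge(health, education)
print(integrated)
```

In this sketch only the record that appears in both sources is carried into the integrated dataset, and the name and date of birth are removed before the output is returned.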
Below is a very broad diagram of the key stages of a data integration project once it has been approved. Different protocols can be used to conduct the project; see applying the separation principle.
For more information on linking methodologies (including the creation of a linkage key), see the Data Linking Information Series.
A broad overview of data integration following approval
Below is an example showing how the roles and responsibilities of a data custodian, integrating authority and data user apply to the key stages of a data integration project once it has been approved. This diagram does not show detailed processes or describe the different protocols that can be used when linking and merging the data. For more information on these approaches, see the separation principle.