Ideally, Commonwealth data custodians should provide metadata (information about the data) for each dataset. Providing this information will help integrating authorities and data users to understand the available data and its limitations. The metadata should be made available to prospective data users and to the integrating authority so that they are able to make informed decisions about whether the data will meet the research need or can be successfully linked with other source data.
General principles for data custodians to consider when creating and distributing the metadata:
- where possible, use agreed authoritative standards (such as the Australian Statistical Geography Standard). This facilitates comparison and linkage with other data that uses the same standards.
- keep historical versions of the metadata. Data users need information about the data at the time it was collected, not just the latest version of the metadata.
- whenever you provide data to someone, provide the metadata too.
- provide information about your metadata, including the date the metadata was last updated and the status of the metadata (draft, final, or superseded).
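The last two principles above (keeping historical versions, and recording the status and last-updated date of the metadata) can be sketched in code. The following is a minimal illustration only, not a prescribed format: the class `MetadataVersion`, the field `effective_from`, the function `version_in_effect` and all dates are hypothetical names and values introduced for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Status values taken from the guidance: draft, final, or superseded.
VALID_STATUSES = {"draft", "final", "superseded"}


@dataclass(frozen=True)
class MetadataVersion:
    version: str
    status: str
    last_updated: date
    effective_from: date  # assumption: first collection date this version describes

    def __post_init__(self):
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown metadata status: {self.status!r}")


def version_in_effect(history, on: date) -> Optional[MetadataVersion]:
    """Return the non-draft version that applied on the given collection date."""
    candidates = [
        v for v in history
        if v.status != "draft" and v.effective_from <= on
    ]
    # The most recent applicable version wins; None if nothing applied yet.
    return max(candidates, key=lambda v: v.effective_from, default=None)


# Illustrative history; all values are made up.
history = [
    MetadataVersion("1.0", "superseded", date(2012, 3, 1), date(2010, 7, 1)),
    MetadataVersion("2.0", "final", date(2015, 8, 15), date(2013, 7, 1)),
    MetadataVersion("3.0", "draft", date(2016, 1, 10), date(2016, 7, 1)),
]

# Data collected in 2012 should be read against version 1.0, not the latest version.
print(version_in_effect(history, date(2012, 1, 1)).version)  # prints 1.0
```

Keeping superseded versions in the history, rather than overwriting them, is what allows a data user to look up the metadata that applied when the data was collected.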
Available metadata should:
- contain a broad overview of the dataset including the data source (name of the agency), the purposes for which the data was collected and information on file structure (e.g. flat file, relational database);
- specify the scope of the dataset (for example, the scope for hospital data might be every episode of care for every admitted patient in selected hospitals);
- outline coverage of the dataset (for example, the data might be collected for certain states or territories, or might relate only to particular types of organisations, such as public hospitals);
- specify the time periods available for the data source (start and finish dates);
- provide a list of the data items available and explanations of any codes and derivations used;
- outline the collection history, including any changes in the collection of data over time that will affect the comparisons that can be made;
- contain any relevant definitions;
- provide, where appropriate, a copy of the collection instrument; and
- provide information on the data quality (this may take the form of a data quality statement, covered in the data quality section).
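As an illustration, the elements listed above could be captured in a simple structured record so that gaps are easy to spot before the metadata is distributed. This is a sketch under stated assumptions rather than an agreed standard: the class names, field names and the `missing_elements` helper are hypothetical, and the example values are fictional.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import Optional


@dataclass
class DataItem:
    name: str
    description: str
    codes: dict[str, str] = field(default_factory=dict)  # code -> meaning


@dataclass
class DatasetMetadata:
    # Field names are illustrative; each maps to one element in the list above.
    source_agency: str = ""
    purpose: str = ""
    file_structure: str = ""           # e.g. flat file, relational database
    scope: str = ""
    coverage: str = ""
    period_start: Optional[date] = None
    period_end: Optional[date] = None
    data_items: list[DataItem] = field(default_factory=list)
    collection_history: str = ""
    definitions: dict[str, str] = field(default_factory=dict)
    collection_instrument: Optional[str] = None  # provided "where appropriate"
    data_quality_statement: str = ""


# Elements that may legitimately be absent.
OPTIONAL_ELEMENTS = {"collection_instrument"}


def missing_elements(record: DatasetMetadata) -> list[str]:
    """Return the names of required elements that are empty or unset."""
    return [
        f.name
        for f in fields(record)
        if f.name not in OPTIONAL_ELEMENTS and not getattr(record, f.name)
    ]


# Fictional example record with some elements still to be completed.
example = DatasetMetadata(
    source_agency="Example Health Agency",
    purpose="Administration of admitted patient care",
    file_structure="flat file",
    scope="Every episode of care for every admitted patient in selected hospitals",
    coverage="Public hospitals in selected states and territories",
    period_start=date(2010, 7, 1),
    period_end=date(2015, 6, 30),
    data_items=[DataItem("sex", "Sex of patient", {"1": "Male", "2": "Female"})],
)

print(missing_elements(example))
# prints ['collection_history', 'definitions', 'data_quality_statement']
```

A check like this only confirms that each element is present; it does not assess whether the content is accurate or sufficient for a given research need.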
The Australian Institute of Health and Welfare publishes guidance on formulating good metadata standards, which outlines what is needed to produce good metadata items. This guidance covers, for example, the types of metadata and the standardisation process.
Examples of metadata
Two examples of metadata are linked below:
- Example 1: Integrated South Australian Activity Collection (ISAAC) Reference Manual
- Example 2: Australian Early Development Census Data Dictionary
The Australian Institute of Health and Welfare has developed a repository for national metadata standards for health, housing and community services statistics and information, called the Metadata Online Registry (METeOR).
Keeping Your Data in Good Shape, available on the National Statistical Service website (www.nss.gov.au), provides some broad guidelines for managing metadata.
For more information about data management see: