Local Needs Data Standard - draft v0.1
17/11/2023
A pragmatic data standard for sharing information about need in local areas. This standard is designed to make it easy for charities, public bodies and others to share anonymised information about local need. For example, a food bank could share data about demand for its services in a particular area over time. Or a national debt advice charity could share data about how indebtedness varies across the country, based on calls to its helpline.
Principles
The data standard is based on the following principles:
- Pragmatic: this standard should act as a bridge between existing data standards and data that charities already hold in some form. It should not require wholesale changes in how charities measure or collect their data. Instead, it should provide ways of sharing this data that encourage others to build on and reuse it, and that make it easier to combine data from different sources.
- Anonymous: the standard envisages the publication of aggregate information relating to local need. This means it should not be used to publish anything that could be used to identify individuals within the data. Publishers should take care to guard against accidental disclosure of personal data - either through publishing aggregate data that only relates to one organisation, or through mistakes such as including hidden Excel sheets.
- Incremental: the standard defines two categories of data: Good and Best. Data published to the Best standard will be much easier to use and combine with other datasets, but could require more effort to publish in that format.
- Builds on existing standards: rather than mandating new data standards, this standard brings together existing standards in a way that makes them easy to implement.
Definitions
- CSV: Comma-separated Values - a common file format for publishing tabular data.
- CSVW: CSV on the Web - a collection of data standards for publishing metadata about CSV files published on the internet.
- Dataset: A table of data published by a data publisher. A dataset contains a number of columns containing values that the publisher thinks would be useful for others to know. For this data standard, a dataset is expected to include aggregate data (data that sums or counts a particular value) rather than individual data items. For example, it could contain the number of visits to a food bank in a month, rather than a record of each individual visit. Datasets must not contain any personal or confidential data.
- Derived dataset: Tools that apply this standard may apply transformations to a dataset so that it matches the data standard. The dataset after these transformations is known as a derived dataset.
- Metadata: Data about the dataset - for example the publisher, the date published, the data format, or information about the columns included.
- Property: A piece of information about a value that puts it in context. The main properties described by the data standard are the date it relates to (observationDate) or the area it relates to (observationAbout).
- Property column: A column in the dataset that describes either the area or the date that each row relates to.
- Publisher: The organisation that has published a dataset. Expected to be a charity, non-profit organisation or government, but may also include private companies and any other organisations.
- User: The person consuming the data. Not expected to have specialist data skills.
- Value: A piece of information contained within a dataset. Also known as an observation. Usually (but not always) a number. To understand a value, we also need to know the column heading, the date it relates to (observationDate) and the area it relates to (observationAbout).
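To illustrate how the definitions fit together, the following is a minimal sketch in Python of turning individual records into an aggregate dataset. It is not part of the standard. The area labels (AREA-A, AREA-B), the value column name (visits), the year-month date format and the function names are all assumptions made for this example; only observationDate and observationAbout come from the definitions above.

```python
import csv
import io
from collections import Counter

# Hypothetical individual-level records held internally by a food bank.
# Each tuple is (visit date, area label). The labels are placeholders,
# not real area codes, and this data would NOT be published as-is.
visits = [
    ("2023-09-04", "AREA-A"),
    ("2023-09-18", "AREA-A"),
    ("2023-09-21", "AREA-B"),
    ("2023-10-02", "AREA-A"),
]

FIELDS = ["observationDate", "observationAbout", "visits"]


def aggregate_visits(records):
    """Count visits per (month, area) so that only aggregate values remain."""
    # date[:7] keeps "YYYY-MM"; using a month as the date is an assumption.
    counts = Counter((date[:7], area) for date, area in records)
    return [
        {"observationDate": month, "observationAbout": area, "visits": n}
        for (month, area), n in sorted(counts.items())
    ]


def to_csv(rows):
    """Serialise the aggregate rows as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = aggregate_visits(visits)
print(to_csv(rows))
```

In this sketch, observationDate and observationAbout are the property columns, and each number in the visits column is a value (an observation) that can only be understood together with its column heading, date and area.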