Quick start
In this guide, you’ll learn how the explore education statistics (EES) API is structured and how to use it to perform a basic data query.
What you’ll need
To run the request examples in this guide, it is a good idea to have an HTTP or API client tool to hand. Postman and Insomnia are good options for beginners.
Some prior knowledge of working with your chosen HTTP client will be necessary to work with the examples.
How the API is organised
The API endpoints are organised in a way that reflects how data is organised in EES.
Data is published in publications. Each publication covers a specific topic such as schools or higher education.
A publication will contain data sets that are relevant to the particular topic. Data sets are composed of data that has been collected over a period of time at varying geographic levels e.g. for local authorities. Updates to data sets are typically published at regular intervals e.g. yearly, monthly, etc.
Given the above, the API exposes endpoints that mirror this:
- publication endpoints: `/publications`
- data set endpoints: `/data-sets`
Workflow for querying data
To query the data available through the API, follow these steps:
- Find the publication you are interested in
- Find the data set you are interested in (from the publication)
- Get the data set’s metadata
- Create and run a query against the data set
In the following sections, this guide will walk you through how to perform the above steps.
Step 1: Find a publication
To find a publication that you may be interested in, you’ll need to make a `GET` request to the List publications endpoint:
GET https://dev.statistics.api.education.gov.uk/api/v1/publications
This endpoint will respond with something like the following (parts have been omitted for brevity):
{
"paging" : {
"page" : 1,
"pageSize" : 20,
"totalPages" : 3,
"totalResults" : 50
},
"results" : [
{
"id" : "cbbd299f-8297-44bc-92ac-558bcf51f8ad",
"slug" : "Pupil-absence-in-schools-in-England",
"title" : "Pupil absence in schools in England",
"summary": "Pupil absence, including overall, authorised and unauthorised absence...",
"lastPublished": "2023-11-10T09:15:00+00:00"
}
]
}
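As a rough illustration, the same request could be made from Python using only the standard library. The helper names `fetch_json` and `publication_titles` are hypothetical, and the sketch assumes the endpoint needs no authentication and returns JSON shaped like the example response above.

```python
import json
import urllib.request

BASE_URL = "https://dev.statistics.api.education.gov.uk/api/v1"


def fetch_json(url):
    """Send a GET request and decode the JSON response body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def publication_titles(page):
    """Map each publication ID to its title for one page of results."""
    return {item["id"]: item["title"] for item in page["results"]}


# Trimmed copy of the example response above, so the helper can be tried offline.
example_page = {
    "paging": {"page": 1, "pageSize": 20, "totalPages": 3, "totalResults": 50},
    "results": [
        {
            "id": "cbbd299f-8297-44bc-92ac-558bcf51f8ad",
            "title": "Pupil absence in schools in England",
        }
    ],
}

if __name__ == "__main__":
    # Against the live API (assumption: no authentication is required):
    #   page = fetch_json(f"{BASE_URL}/publications")
    for publication_id, title in publication_titles(example_page).items():
        print(publication_id, title)
```

Keeping the network call in `fetch_json` separate from the parsing helper means the parsing can be checked against a saved response without calling the API.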
This endpoint does not return all publications in a single request. Instead, it is paginated and returns the publications in pages (or batches), with each page containing a maximum number of publications.
You can request additional pages of publications by appending a `page` query parameter to the endpoint URL. For example:
# Fetch page 2
GET https://dev.statistics.api.education.gov.uk/api/v1/publications?page=2
# Fetch page 3
GET https://dev.statistics.api.education.gov.uk/api/v1/publications?page=3
The possible values of `page` will be dictated by the total number of results (across all pages) and the `pageSize` query parameter. For example, the following request would show 30 results per page instead of the default:
GET https://dev.statistics.api.education.gov.uk/api/v1/publications?page=1&pageSize=30
Each page of results contains a `paging` property which describes the current page and the total numbers of pages and results. This information can be used to set the query parameters for the next page of results.
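One way to use the `paging` property is sketched below. The function name `next_page_params` is hypothetical; it only relies on the `page`, `pageSize` and `totalPages` fields shown in the example response.

```python
def next_page_params(paging):
    """Return query parameters for the next page, or None on the last page."""
    if paging["page"] >= paging["totalPages"]:
        return None
    return {"page": paging["page"] + 1, "pageSize": paging["pageSize"]}


# The paging object from the example response above.
paging = {"page": 1, "pageSize": 20, "totalPages": 3, "totalResults": 50}
print(next_page_params(paging))  # → {'page': 2, 'pageSize': 20}
```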
To make it easier to find a specific publication, you can append a `search` query parameter to the URL as well. The following example would search for publications matching the term ‘pupil absence’:
GET https://dev.statistics.api.education.gov.uk/api/v1/publications?search=pupil+absence
As with any URL, you can combine query parameters using `&`. For example, you’d use the following URL to get page 2 of publications matching the term ‘pupil absence’:
GET https://dev.statistics.api.education.gov.uk/api/v1/publications?search=pupil+absence&page=2
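If you are building these URLs in code rather than by hand, a helper along the following lines could handle the encoding and the `&` separators. The function name `publications_url` is hypothetical; `urlencode` is part of Python's standard library and encodes spaces as `+`, matching the examples above.

```python
from urllib.parse import urlencode

BASE_URL = "https://dev.statistics.api.education.gov.uk/api/v1"


def publications_url(search=None, page=None, page_size=None):
    """Build a List publications URL, leaving out any unset parameters."""
    params = {"search": search, "page": page, "pageSize": page_size}
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{BASE_URL}/publications" + (f"?{query}" if query else "")


print(publications_url(search="pupil absence", page=2))
# → https://dev.statistics.api.education.gov.uk/api/v1/publications?search=pupil+absence&page=2
```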
Once you find a publication you are interested in, proceed to the next step.
Step 2: Find a data set
Now that you have a publication that you are interested in, you can use it to find its related data sets. This can be done using the List a publication’s data sets endpoint:
GET https://dev.statistics.api.education.gov.uk/api/v1/publications/{publicationId}/data-sets
For this endpoint URL, you’d substitute the `{publicationId}` parameter with the `id` of the publication you are interested in.
For example, given the following publication (parts omitted for brevity):
{
"id" : "cbbd299f-8297-44bc-92ac-558bcf51f8ad",
"slug" : "Pupil-absence-in-schools-in-England",
"title" : "Pupil absence in schools in England",
"summary": "Pupil absence, including overall, authorised and unauthorised absence...",
"lastPublished": "2023-11-10T09:15:00+00:00"
}
You’d make the following `GET` request:
GET https://dev.statistics.api.education.gov.uk/api/v1/publications/cbbd299f-8297-44bc-92ac-558bcf51f8ad/data-sets
The endpoint responds with a paginated list of the publication’s data sets which will look like the following:
{
"paging": {
"page": 1,
"pageSize": 10,
"totalResults": 1,
"totalPages": 1
},
"results": [
{
"id": "63cfc86e-c334-4e58-2912-08da0807d53c",
"title": "Absence rates",
"summary": "Absence information for full academic year 2020/21 for pupils aged 5-15.",
"status": "Published",
"latestVersion": {
"version": "1.0",
"published": "2022-12-01T12:00:00Z",
"totalResults": 201625,
"file": {
"id": "84ee3cc9-21bf-44d8-89fd-98c6d7fc74f3"
},
"timePeriods": {
"start": "2020/21",
"end": "2020/21"
},
"geographicLevels": [
"National",
"Local authority"
],
"filters": [
"Phase type",
"Characteristic"
],
"indicators": [
"Number of authorised absence sessions",
"Number of unauthorised absence sessions"
]
}
}
]
}
Each data set result provides high-level information about its contents and metadata. You can use this information to help identify a data set that you’d be interested in looking at further.
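For instance, you might narrow the list down to data sets that cover a particular geographic level. The sketch below assumes the response shape shown above; the function name `data_sets_with_level` is hypothetical.

```python
def data_sets_with_level(page, level):
    """Return (id, title) pairs for data sets whose latest version covers a level."""
    return [
        (data_set["id"], data_set["title"])
        for data_set in page["results"]
        if level in data_set["latestVersion"]["geographicLevels"]
    ]


# Trimmed copy of the example response above.
example_page = {
    "results": [
        {
            "id": "63cfc86e-c334-4e58-2912-08da0807d53c",
            "title": "Absence rates",
            "latestVersion": {
                "version": "1.0",
                "geographicLevels": ["National", "Local authority"],
            },
        }
    ]
}

print(data_sets_with_level(example_page, "Local authority"))
```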
Once you have chosen a data set, proceed to the next step.
Step 3: Get the data set’s metadata
Now that you have chosen a data set, you’ll want to query it for some data. To create a query, you’ll first need to use the Get a data set’s metadata endpoint. This provides information about all the filterable facets and indicators available in a data set.
Facets are specific features / characteristics of the data. These are used in a data set query to filter down the data that is returned.
Some examples of facets include:
- time periods e.g. 2022/23 (academic year), 2023 (calendar year), January (month), Week 1 (week)
- locations e.g. England (country), Yorkshire (region), Sheffield (local authority)
- school type e.g. state-funded primary, state-funded secondary
- pupil characteristics like ethnicity and gender
Indicators are types of data points that were collected, for example:
- numbers of pupils, sessions, etc
- rates of change
- proportions / percentages
Both facets and indicators need to be part of a query for data to be returned.
Facets and indicators are collectively referred to as a data set’s metadata. To fetch this for your chosen data set, make the following `GET` request:
GET https://dev.statistics.api.education.gov.uk/api/v1/data-sets/{dataSetId}/meta
To use this URL, substitute the `{dataSetId}` parameter with the `id` of your chosen data set.
The endpoint will return something like the following:
{
"filters": [
{
"id": "gIyO9",
"column": "school_type",
"label": "School type",
"hint": "Filter by school type",
"options": [
{
"id": "1oX7c",
"label": "Total",
"isAggregate": true
},
{
"id": "uKR2K",
"label": "State-funded primary"
}
]
}
],
"indicators": [
{
"id": "04nTr",
"column": "sess_authorised",
"label": "Number of authorised absence sessions",
"unit": "",
"decimalPlaces": 0
}
],
"geographicLevels": [
{
"code": "NAT",
"label": "National"
},
{
"code": "REG",
"label": "Regional"
}
],
"locations": [
{
"level": {
"code": "NAT",
"label": "National"
},
"options": [
{
"id": "2FmYX",
"code": "E92000001",
"name": "England"
}
]
},
{
"level": {
"code": "REG",
"label": "Regional"
},
"options": [
{
"id": "e0768",
"code": "E12000001",
"name": "North East"
},
{
"id": "GQbUn",
"code": "E12000002",
"name": "North West"
}
]
}
],
"timePeriods": [
{
"code": "AY",
"label": "2021/22",
"period": "2021/2022"
},
{
"code": "AY",
"label": "2022/23",
"period": "2022/2023"
}
]
}
The core facets are found under the `timePeriods`, `geographicLevels` and `locations` properties.
The `locations` property contains the data set’s location options grouped by the geographic level they reside in. Each location option has an `id` and may contain additional code fields (e.g. ONS codes) to identify it.
In the above example, there are location options for:
- ‘England’ (`2FmYX`) at ‘National’ geographic level
- ‘North East’ (`e0768`) and ‘North West’ (`GQbUn`) at ‘Regional’ geographic level
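When working with the metadata in code, it can help to look up location options by name. The following is a minimal sketch assuming the `locations` structure from the example response; the function name `find_locations` is hypothetical.

```python
def find_locations(meta, name):
    """Return (level code, option ID, option code) for locations matching a name."""
    matches = []
    for group in meta["locations"]:
        for option in group["options"]:
            if option["name"].lower() == name.lower():
                # The code field is optional, so fall back to None when absent.
                matches.append((group["level"]["code"], option["id"], option.get("code")))
    return matches


# The locations part of the example metadata above.
example_meta = {
    "locations": [
        {
            "level": {"code": "NAT", "label": "National"},
            "options": [{"id": "2FmYX", "code": "E92000001", "name": "England"}],
        },
        {
            "level": {"code": "REG", "label": "Regional"},
            "options": [
                {"id": "e0768", "code": "E12000001", "name": "North East"},
                {"id": "GQbUn", "code": "E12000002", "name": "North West"},
            ],
        },
    ]
}

print(find_locations(example_meta, "north east"))
# → [('REG', 'e0768', 'E12000001')]
```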
The `geographicLevels` property contains the different geographic levels that the data was collected at. Each geographic level is identified by its `code` and a full list of these can be found in the GeographicLevelCode schema.
In the above example, there are geographic level options for ‘National’ (`NAT`) and ‘Regional’ (`REG`) geographic levels.
The `timePeriods` property contains the time periods the data was collected for. The time period options are represented by a `code` that describes the time period’s type and a `period` that describes the date range. A full list of time period codes can be found in the TimePeriodCode schema.
In the above example, there are time period options for the 2021/22 and 2022/23 academic years (both with code `AY`).
Any additional facets are found under the `filters` property, which groups them by their filter.
Each filter is identified by an `id`. The example above has a ‘School type’ filter with an ID of `gIyO9`.
Each filter will have a set of filter options. The example above has ‘State-funded primary’ and ‘Total’ options for ‘School type’. Filter options also have their own `id` properties that can be used to identify them.
Some filter options are aggregates of the entire filter and are marked by an `isAggregate` property set to `true`. In the above example, the ‘Total’ school type is an aggregate of all school types (e.g. primary, secondary, special schools, etc).
Finally, the `indicators` property contains the data set’s indicators. Each indicator contains an `id` to identify it and may also contain:
- a `unit` property specifying the mathematical unit that was used in the measurement
- a `decimalPlaces` property specifying the recommended number of decimal places to use when displaying the indicator’s value
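Because queries use IDs rather than labels, a small lookup built from the metadata can save some manual searching. The sketch below assumes the `filters` and `indicators` structures from the example response; the function names are hypothetical.

```python
def filter_option_ids(meta):
    """Map (filter label, option label) to the option ID used in queries."""
    return {
        (filter_["label"], option["label"]): option["id"]
        for filter_ in meta["filters"]
        for option in filter_["options"]
    }


def indicator_ids(meta):
    """Map each indicator label to its ID."""
    return {indicator["label"]: indicator["id"] for indicator in meta["indicators"]}


# Trimmed copy of the example metadata above.
example_meta = {
    "filters": [
        {
            "id": "gIyO9",
            "label": "School type",
            "options": [
                {"id": "1oX7c", "label": "Total", "isAggregate": True},
                {"id": "uKR2K", "label": "State-funded primary"},
            ],
        }
    ],
    "indicators": [
        {"id": "04nTr", "label": "Number of authorised absence sessions", "decimalPlaces": 0}
    ],
}

print(filter_option_ids(example_meta)[("School type", "State-funded primary")])  # → uKR2K
print(indicator_ids(example_meta)["Number of authorised absence sessions"])  # → 04nTr
```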
Spend some time familiarising yourself with the metadata response, then proceed to the next step when ready.
Step 4: Create and run your data set query
In this final step, you’ll need to use the metadata from the previous step to create and run your query against the Query a data set endpoint.
To use this endpoint, a `POST` request needs to be sent to the endpoint URL with an appropriate request body. The most basic request would look like the following:
POST https://dev.statistics.api.education.gov.uk/api/v1/data-sets/{dataSetId}/query
{
"indicators": []
}
As seen previously, you need to substitute the `{dataSetId}` parameter with the `id` of your chosen data set.
The request body must contain an `indicators` property with a list of indicator IDs, which can be found in the data set’s metadata (under each indicator option’s `id` property). The data values in the response will correspond to the indicators that you specify.
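In Python, such a request could be prepared with the standard library as sketched below. The function name `build_query_request` is hypothetical, the data set and indicator IDs are taken from the earlier examples purely for illustration, and the `Content-Type` header is an assumption about what the API expects for a JSON body.

```python
import json
import urllib.request

BASE_URL = "https://dev.statistics.api.education.gov.uk/api/v1"


def build_query_request(data_set_id, body):
    """Prepare (but do not send) a POST request for the Query a data set endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/data-sets/{data_set_id}/query",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the API expects the body to be declared as JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(
    "63cfc86e-c334-4e58-2912-08da0807d53c",  # illustrative data set ID
    {"indicators": ["04nTr"]},  # illustrative indicator ID
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access.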
To refine your query to a subset of the data, you will also need to provide some filtering criteria by adding a `criteria` property to your query request:
POST https://dev.statistics.api.education.gov.uk/api/v1/data-sets/{dataSetId}/query
{
"indicators": [],
"criteria": {}
}
At its simplest, the `criteria` property must be an object that describes which facets the query should filter on. This facet object has properties that align with the different facet types seen previously in the metadata step.
The table below describes the facet properties you can use and how each facet option should be represented in the query:
Property | Description | Facet option examples |
---|---|---|
`filters` | Filter by filter option ID | `1oX7c`, `uKR2K` |
`geographicLevels` | Filter by geographic level code | `LA`, `REG`, `NAT` |
`locations` | Filter by location option ID or code | `{ "level": "REG", "code": "E12000001" }`, `{ "level": "NAT", "id": "2FmYX" }` |
`timePeriods` | Filter by time period | `{ "period": "2022/2023", "code": "AY" }` |
Note that all the facet properties are optional so you only need to use the ones relevant to your query.
Each facet property must contain an object that describes how the facet options should be compared to the results when filtering. Some examples of comparators that can be used:
Comparator | Description | Multiple values? | Example |
---|---|---|---|
`eq` | Equal to | No | `"eq": "1oX7c"` |
`notEq` | Not equal to | No | `"notEq": "1oX7c"` |
`in` | In a set | Yes | `"in": ["1oX7c", "uKR2K"]` |
`notIn` | Not in a set | Yes | `"notIn": ["1oX7c", "uKR2K"]` |
`lte` | Less than or equal to | No | `"lte": { "period": "2022/2023", "code": "AY" }` |
`lt` | Less than | No | `"lt": { "period": "2022/2023", "code": "AY" }` |
`gte` | Greater than or equal to | No | `"gte": { "period": "2022/2023", "code": "AY" }` |
`gt` | Greater than | No | `"gt": { "period": "2022/2023", "code": "AY" }` |
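The 'Multiple values?' column can be turned into a simple client-side check before a query is sent. This is only a sketch: the function name `check_clause` is hypothetical, and comparators not listed in the table are passed through unchecked because the table is not exhaustive.

```python
SINGLE_VALUE = {"eq", "notEq", "lte", "lt", "gte", "gt"}
MULTI_VALUE = {"in", "notIn"}


def check_clause(comparator, value):
    """Return a one-clause dict, raising ValueError if the value's shape is wrong."""
    if comparator in MULTI_VALUE and not isinstance(value, list):
        raise ValueError(f"'{comparator}' expects a list of values")
    if comparator in SINGLE_VALUE and isinstance(value, list):
        raise ValueError(f"'{comparator}' expects a single value")
    return {comparator: value}


print(check_clause("in", ["1oX7c", "uKR2K"]))
print(check_clause("lte", {"period": "2022/2023", "code": "AY"}))
```

A call such as `check_clause("eq", ["uKR2K", "1oX7c"])` would raise `ValueError`, since `eq` takes a single value.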
Using the above information and the metadata example from the previous step, the following query could be constructed:
{
"criteria": {
"filters": {
"eq": "uKR2K"
}
},
"indicators": ["04nTr"]
}
This query would filter so that only results in the ‘State-funded primary’ (filter option `uKR2K`) school type would be returned. Each result would then contain the ‘Number of authorised absence sessions’ (indicator `04nTr`) in its data values.
You can add multiple clauses to the facet criteria object to refine your query further. For a fuller example, you could construct a query like the following:
{
"criteria": {
"filters": {
"in": ["uKR2K", "1oX7c"]
},
"timePeriods": {
"gte": { "period": "2021/2022", "code": "AY" },
"lte": { "period": "2022/2023", "code": "AY" }
},
"locations": {
"notIn": [
{ "level": "REG", "id": "e0768" }
]
},
"geographicLevels": {
"eq": "LA"
}
},
"indicators": ["04nTr"]
}
The above example would query for the ‘Number of authorised absence sessions’ (indicator `04nTr`) matching the following criteria:
- is for ‘State-funded primary’ (filter option `uKR2K`) or ‘Total’ (filter option `1oX7c`) school types
- is during or after the 2021/22 academic year (time period `2021/2022` and code `AY`)
- is during or before the 2022/23 academic year (time period `2022/2023` and code `AY`)
- is not in the ‘North East’ (location `e0768`)
- is collected at ‘Local authority’ level (geographic level `LA`)
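The criteria listed above could also be assembled programmatically. In the sketch below, `build_query` and `time_period` are hypothetical helper names; the facet property names and comparators are the ones described in this guide.

```python
import json


def time_period(period, code="AY"):
    """Build a time period facet option (the code defaults to academic year)."""
    return {"period": period, "code": code}


def build_query(indicators, **facets):
    """Build a query body, including only the facet properties that were given."""
    query = {"indicators": list(indicators)}
    criteria = {name: clause for name, clause in facets.items() if clause}
    if criteria:
        query["criteria"] = criteria
    return query


query = build_query(
    ["04nTr"],
    filters={"in": ["uKR2K", "1oX7c"]},
    # On or after 2021/22 and on or before 2022/23.
    timePeriods={"gte": time_period("2021/2022"), "lte": time_period("2022/2023")},
    locations={"notIn": [{"level": "REG", "id": "e0768"}]},
    geographicLevels={"eq": "LA"},
)
print(json.dumps(query, indent=2))
```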
Different facet values and comparators can be provided to modify the query in different ways. It’s advisable to spend a little time getting more familiar with the query API.
The Creating advanced data set queries guide explores this topic in greater depth and is recommended for further reading.
The data set query response
Once you have created your query, make the `POST` request to the endpoint. You should receive a paginated response that looks like:
{
"paging": {
"page": 1,
"pageSize": 100,
"totalResults": 150,
"totalPages": 2
},
"results": [
{
"timePeriod": {
"code": "AY",
"period": "2022/2023"
},
"geographicLevel": "REG",
"locations": {
"NAT": "2FmYX",
"REG": "e0768"
},
"filters": {
"gIyO9": "uKR2K"
},
"values": {
"04nTr": "1708016"
}
}
]
}
The `results` part of this response contains a list of results, each pairing a combination of facets with the data matching it.
The `timePeriod` property describes the time period that the result was collected in. In the above example, this is academic year 2022/2023.
The `geographicLevel` property describes the geographic level that the data was collected at. In the above example, the result’s data was collected at ‘Regional’ (`REG`) level.
The `locations` property describes the set of locations that correspond to the result. This is a dictionary where the keys are geographic level codes and each value is an option ID within the corresponding geographic level.
In the above example, the result’s locations were ‘England’ (`2FmYX` at ‘National’ level) and ‘North East’ (`e0768` at ‘Regional’ level).
The `filters` property describes the additional facets corresponding to the result. This is a dictionary where the keys are filter IDs and each value is an option ID within the corresponding filter.
In the above example, the result’s ‘School type’ (`gIyO9`) filter was ‘State-funded primary’ (`uKR2K`).
The `values` property of each result is a dictionary where the keys are the indicator IDs and the values are the respective indicator data values.
In the above example, the ‘Number of authorised absence sessions’ (`04nTr`) indicator has a value of `1708016`.
Note that reported values may not be numeric. In some instances, it may not be possible to report the data (e.g. due to suppression for anonymity) and a placeholder value may be used instead.
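Client code therefore should not assume every value can be converted to a number. The sketch below shows one defensive approach; the function names are hypothetical, and the placeholder `"c"` in the usage example is invented purely for illustration, since this guide does not list the placeholder symbols the API actually uses.

```python
def parse_value(raw):
    """Convert a reported value to a number, keeping non-numeric values as-is."""
    for convert in (int, float):
        try:
            return convert(raw)
        except (TypeError, ValueError):
            continue
    return raw


def labelled_values(result, indicator_labels):
    """Replace indicator IDs with labels and parse each reported value."""
    return {
        indicator_labels.get(indicator_id, indicator_id): parse_value(raw)
        for indicator_id, raw in result["values"].items()
    }


labels = {"04nTr": "Number of authorised absence sessions"}
print(labelled_values({"values": {"04nTr": "1708016"}}, labels))
# Hypothetical placeholder value, left untouched because it is not numeric.
print(labelled_values({"values": {"04nTr": "c"}}, labels))
```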
Spend some time getting familiar with the structure of the results and try to find some results you are interested in.
Note on paginated data
Like some endpoints seen previously, the data set query’s response is paginated, meaning that the data is returned in multiple pages / batches. The `paging` property is returned as part of each response and describes the current page of data matching the query.
You can set `page` and `pageSize` parameters in the query string to request different pages of results. For example, the following request would fetch page 5, with each page containing a maximum of 200 results:
POST https://dev.statistics.api.education.gov.uk/api/v1/data-sets/{dataSetId}/query?page=5&pageSize=200
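To collect every result for a query, you could keep requesting pages until `totalPages` is reached. The sketch below takes the page-fetching step as a function argument so it can be tried without network access; `iter_results` and `fake_fetch_page` are hypothetical names, and the fake fetcher simply stands in for the real `POST` request.

```python
def iter_results(fetch_page, page_size=200):
    """Yield every result by requesting pages until the last one is reached."""
    page = 1
    while True:
        response = fetch_page(page, page_size)
        yield from response["results"]
        if page >= response["paging"]["totalPages"]:
            break
        page += 1


def fake_fetch_page(page, page_size):
    """Stand-in for the real POST request: serves 5 made-up results in pages."""
    all_results = [{"values": {"04nTr": str(n)}} for n in range(5)]
    start = (page - 1) * page_size
    total_pages = -(-len(all_results) // page_size)  # ceiling division
    return {
        "paging": {
            "page": page,
            "pageSize": page_size,
            "totalResults": len(all_results),
            "totalPages": total_pages,
        },
        "results": all_results[start : start + page_size],
    }


results = list(iter_results(fake_fetch_page, page_size=2))
print(len(results))  # → 5
```

In real use, `fetch_page` would send the query body to the endpoint with the given `page` and `pageSize` in the query string and return the decoded JSON response.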
Conclusions
This quick start guide has now run you through a basic workflow for retrieving some data from the EES API. The core workflow is the same for all data sets. The majority of use cases will simply require you to adjust the parameters used.
You should now have the basic tools to get started with the API, but you are encouraged to explore the documentation further. It is recommended that you read the Overview section to get a better understanding of the core API features.
To learn more about data set queries and how to create more complex ones, it is recommended that you read the guide to Creating advanced data set queries.