Skip to main content

Debugging data set queries

In this guide, you’ll learn various techniques to help debug potential issues with data set queries when using the explore education statistics (EES) API.

What you’ll need

You should already be familiar with the EES API. If not, you should read the Quick start first as this guide will presume some prior knowledge.

You should already be familiar with the basic usages of the following endpoints to proceed:

Diagnosing error responses

In most cases, if the API is unable to process a request successfully, it will return an error response. The success or failure of a request is primarily indicated by its HTTP status code, where:

  • 2xx status codes (e.g. 200, 204) indicate a success
  • 4xx status codes (e.g. 400, 404) indicate an error with the request itself, such as a validation issue
  • 5xx status codes (e.g. 500, 503) indicate an error that occurred within the API whilst processing

When there is an error response, the body will typically look like:

{
  "title": "There was a problem processing the request.",
  "type": "Internal Server Error",
  "status": 500
}

The response body is modelled by the ProblemDetailsViewModel schema, which attempts to detail the reason(s) why the request failed. The following fields are always included:

Property Type Description
title string The title of the error. Typically summarises the error.
type string The error type. Usually corresponds with the HTTP status code.
status number The HTTP status code.

If there are validation issues with the request, the response body will also contain an errors field which looks like:

{
  "title": "There are validation errors with the request.",
  "type": "Bad Request",
  "status": 400,
  "errors": [
    {
      "message": "Error message",
      "code": "error.code",
      "path": "theField"
    }
  ]
}

The errors property contains a list of errors. Each error corresponds to a specific problem and will at least contain a message (describing the issue) and a code (for further debugging and parsing).

If the error relates to a specific part of the request, the path property is used to describe the path to request property that caused the error. If this is omitted or empty, it means the error is ‘global’ and relates to the entire request.

Where possible, errors may also contain a detail field that provides more detailed information about the problem.

Validation errors and error responses in general are covered in much more detail in the section on Error handling.

Validation errors for data set queries

The Query a data set endpoint will usually try to process a query as much as possible before a validation error response is sent (instead of failing early). Consequently, the response typically aggregates as many validation errors as possible.

Common validation errors will be discussed in more detail in the following sections.

Incompatible comparator values

Validation errors are commonly caused by some data set query criteria containing comparators that use the wrong data type for their values. For example, a query with the following criteria:

{
  "criteria": {
    "filters": {
      "eq": ["filter-1"]
    }
  }
}

Will result in an error response like:

{
  "errors": [
    {
      "message": "Must be a valid value. Check that the type and format are correct.",
      "path": "criteria.filters.eq",
      "code": "InvalidValue"
    }
  ]
}

The above query is using a eq comparator with an array. Arrays are typically only used with comparators that can accept multiple values like in and notIn.

Solution

To correct the error, you can simply change eq to in:

{
  "criteria": {
    "filters": {
      "in": ["filter-1"]
    }
  }
}

Alternatively, the array can be replaced with a single filter item ID string:

{
  "criteria": {
    "filters": {
      "eq": "filter-1"
    }
  }
}

It is recommended that you read the guide on Creating advanced data set queries as it contains far more detail about each comparator, and how they are used with different facet types.

Incorrect use of condition clauses

Validation errors can commonly occur when writing more complex queries using condition clauses such as and, or and not. For example:

{
  "criteria": {
    "not": [
      {
        "filters": {
          "eq": "..."
        }
      }
    ]
  }
}

In the above example, an array is used as the not clause value. Unfortunately, arrays are incompatible and will result in an error response like:

{
  "errors": [
    {
      "message": "Must be a valid value. Check that the type and format are correct.",
      "path": "criteria.not",
      "code": "InvalidValue"
    }
  ]
}

Solution

Check that your query correctly follows the DataSetQueryRequest schema. Pay close attention to any usages of condition clauses.

The and / or clauses accept multiple criteria or condition clauses in an array:

{
  "criteria": {
    "and": [
      {
        "filters": { "eq": "..." }
      },
      {
        "locations": { "eq": "..." }
      }
    ]
  }
}

The not clause only accepts a single condition clause:

{
  "criteria": {
    "not": {
      "filters": { "eq": "..." }
    }
  }
}

For a better understanding of condition clauses, the guide on Creating advanced data set queries goes into much greater detail on this topic.

Warnings in successful data set query responses

In certain cases, a successful data set query may include warnings that indicate something is potentially wrong with the request. Whilst these are not critical errors, it is advisable that you double-check that your query to ensure that is functioning as expected.

Warnings in the response typically look like the following:

{
  "paging": {
    "page": 1,
    "pageSize": 100,
    "totalResults": 150,
    "totalPages": 2
  },
  "warnings": [
    {
      "message": "The query did not match any results. You may need to refine your criteria.",
      "code": "QueryNoResults"
    }
  ],
  "results": [...]
}

The format of a warning is the same as an error (see section on Error handling for more details) and will include at least a message and code.

No query results

If a query does not return any results, you’ll receive a warning like:

{
  "warnings": [
    {
      "message": "The query did not match any results. You may need to refine your criteria.",
      "code": "QueryNoResults"
    }
  ]
}

Depending on the use-case, this may be correct behaviour, however, it may also indicate that there is an issue with the query.

Solution

In most cases, the query is likely using criteria that is too specific and the matching data does not exist within the data set. Some tweaking of your query may be required to make the criteria less specific

If your query previously worked but begins to return no results, this may be due to the data set itself changing in a backwards incompatible way e.g. the removal of data, or a major change in the data set’s facets.

The EES API makes every effort to avoid publishing backwards incompatible data that may disrupt existing queries, however, these types of changes may still occur from time to time (deliberately or otherwise).

The versioning policy is outlined in more detail in the Versioning overview.

Missing facets

Before a data set query is executed by the API, the facets in the query are pre-validated to ensure that they exist in the data set. If some facets are missing, this may cause queries to have zero results (as they cannot be matched).

When there are missing facets, a response will look like:

{
  "warnings": [
    {
      "message": "One or more filters could not be found.",
      "path": "criteria.filters.in",
      "code": "FiltersNotFound",
      "details": {
        "items": ["invalid-filter-1", "invalid-filter-2"]
      }
    }
  ]
}

In these types of responses, the notFound warning informs you about the specific items that are missing (e.g. invalid-filter-1) in the details property.

Solution

Ensure that all facets in the data set query exist in the corresponding data set metadata. You should check this by cross-referencing the missing facets with the Get a data set’s metadata endpoint.

Facets are not usually removed from existing data sets, so there may be a typo (or similar) in the missing facets.

Documenting queries with comments

Data set queries can be documented with comments in the JSON body using forward slashes (//). You can add comments to any part of the query as long as it doesn’t result in a malformed JSON structure.

For example, the following is a valid query with comments:

{
  "criteria": {
    "filters": {
      "in": [
        "n0WqP", // State-funded secondary
        "hUfBQ" // Gender male
      ]
    }
  }
}

The following is an invalid query with comments:

{
  "criteria": {
    "filters": {
      "in": [
        "n0WqP" // State-funded secondary,
        "hUfBQ" // Gender male
      ]
    }
  }
}

In the invalid query, the comma needed after "n0WqP" has been accidentally placed within the adjacent comment causing the JSON to be syntactically invalid.

Debug mode

To assist in debugging unexpected results for a data set query, the Query a data set endpoint also accepts a debug query parameter that enables debug mode. This can be set in the request’s query string like so:

https://dev.statistics.api.education.gov.uk/api/v1/data-sets/{dataSetId}/query?debug=true

The response will then be modified to return results that look like the following:

{
  "filters": {
    "Z3PMP :: ethnicity": "bqJZ4 :: Asian - Chinese",
    "eW168 :: language": "LVRpO :: Total",
    "b0yZ4 :: phase_type_grouping": "RXIeh :: State-funded secondary"
  },
  "timePeriod": {
    "code": "AY",
    "period": "2021/2022"
  },
  "geographicLevel": "LA",
  "locations": {
    "NAT": "dv84z :: England :: E92000001",
    "REG": "T4Y1o :: East Midlands :: E12000004",
    "LA": "vNVmD :: Derby :: E06000015"
  },
  "values": {
    "dv84z :: headcount": "79",
    "vNVmD :: percent_of_pupils": "0.429161234"
  }
}

The keys and values of filters, locations and values are changed to display human-readable labels and facet IDs in the format {facet ID} :: {label}.

Enabling debug mode is useful to avoid having to cross-reference the facets of each result with the data set’s metadata (using the Get a data set’s metadata endpoint).

However, it is important to note that debug mode should not be used outside of development / debugging purposes. When your query’s issues have been resolved, you should disable debug mode before pushing your query to production.

Using debug mode in production comes with significant issues such as:

  • much larger (~2-3x) responses that consume more bandwidth
  • slower responses due to extra server-side processing needed
  • being subject to lower rate limits
  • needing additional client-side parsing of the human-readable labels and facet IDs
This page was last reviewed on 16 September 2024.