RAP for statistics

Guidance on how to implement the principles of Reproducible Analytical Pipelines (RAP) in statistics processes


This page sets out the RAP for statistics principles. For other areas of analysis, we have the RAP for general analysis guidance which covers how RAP can be applied to analysis work outside of Official Statistics production. The same 15 core principles apply across both RAP for statistics and RAP for general analysis, with four additional principles specific to statistics production.

If you’re unfamiliar with RAP, or want a refresher on what RAP is and why it matters, take a look at our getting started with RAP page. It covers the background and benefits of RAP, as well as an introduction to the RAP principles.


RAP for statistics scope


We want to focus on the parts of the production process that we have ownership and control over, so for statistics production we are focusing on the process from data sources to publishable data files. This is the part of the process where RAP can currently add the most value: producing and quality assuring our outputs currently takes up a huge amount of analytical resource, which automation could free up for providing insight and other value-adding activity.

In Official Statistics production we are using RAP as a framework for best practice when producing our published data files, as these are the foundations of our publications moving forward. Following this framework will help us to improve and standardise our current production processes and provide a clear ‘pipeline’ for analysts to follow. To get started with RAP, we first need to be able to understand what it actually means in practice, and be able to assess our own work against the principles of RAP. From there, we can work out what training is needed, if any, and where additional support can help teams to meet the baseline.

Implementing RAP may involve combining the use of Databricks, R, and clear, consistent version control to increase efficiency and accuracy in our work. For more information on what these tools are, why we are using them, and resources to help upskill in those areas, see our learning resources page.

The collection and routine checking of data as it comes into the department is also an area that RAP can be applied to. We have kept this out of scope for now, as the level of control in this area varies widely from team to team. If you would like advice and help to automate any particular processes, feel free to contact the Statistics Development Team.


Core principles

RAP has three core principles:

Preparing data: Data sources for a publication are stored in the same database

Writing code: Underlying data files are produced using code, with no manual steps

Version control: Files and scripts should be appropriately version controlled
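As a rough illustration of the "writing code" principle, the sketch below turns source rows into a publishable data file with no manual steps, including an automated quality assurance check. It is a minimal example under stated assumptions, not a departmental standard: the function name build_output, the column names region, school and pupil_count, and the example data are all hypothetical. It is written in Python for brevity, and the same pattern applies equally in R. In a real pipeline the source data would be read from the shared database and the script itself would be kept under version control.

```python
import csv
import io


def build_output(source_csv: str) -> str:
    """Turn raw source rows into a publishable CSV with no manual steps.

    Hypothetical example: column names are assumptions for illustration.
    """
    rows = list(csv.DictReader(io.StringIO(source_csv)))

    # Automated quality assurance: fail loudly rather than fix data by hand.
    for row in rows:
        if int(row["pupil_count"]) < 0:
            raise ValueError(f"Negative pupil_count in row: {row}")

    # Aggregate to one row per region.
    totals = {}
    for row in rows:
        region = row["region"]
        totals[region] = totals.get(region, 0) + int(row["pupil_count"])

    # Write rows in a fixed, sorted order so every rerun gives an identical file.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["region", "pupil_count"])
    for region in sorted(totals):
        writer.writerow([region, totals[region]])
    return out.getvalue()


# Made-up source data standing in for a database extract.
example_source = (
    "region,school,pupil_count\n"
    "North,A,120\n"
    "South,B,80\n"
    "North,C,30\n"
)
result = build_output(example_source)
```

Because the whole process is a single function of its input, rerunning it on the same source data always produces the same file, and any change to the method is visible as a change to the script.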

Within each of these principles are separate elements of RAP, which are shown in the diagram below. You can find further information on each of the elements by clicking on the links in the diagram.


RAP for statistics principles

The diagram below highlights what RAP means for us, and the varying levels at which it can be applied in all types of analysis. You can click on each of the hexagons in the diagram to learn more about each of the RAP principles and how to use them in practice.

The expectation is that all statistics publications will meet the department’s baseline implementation of RAP, using the self-assessment tool to monitor their progress. Some teams are already working at great and best practice levels, and every team’s situation is unique, so our guidance is designed to be applicable across all Official Statistics publications by DfE. Once teams achieve baseline status, their RAP process will be audited.


What is expected

Warning

It is expected that all teams’ processes meet all elements of good and great practice as a baseline.

Teams are expected to review their own processes using the publication self-assessment tool and use the guidance on this site to start making improvements towards meeting the core principles if they aren’t already. If you would like additional help to review your processes, please contact the Statistics Development Team.

Teams will start from different places and implement changes at different rates, and in different ways. We do not expect that every team will follow the same path, or even end at the same point. Don’t worry if this seems overwhelming at first. Use the guidance here to identify areas for improvement and then tackle them with confidence.

While working to reach our baseline expectation of good and great practice, you can track your progress in the publication self-assessment tool and contact the Statistics Development Team for help and support.


How to assess your publication

The checklist provided in the publication self-assessment tool, shown below, is designed to make it easier to review our processes against our RAP levels, giving a straightforward list of questions to check your work against. This will flag potential areas for improvement, and you can then use the links on the right-hand side to go to the specific section of this page to find more detail and guidance on how to develop your current processes in line with best practice.

Some teams will already be looking at best practice, while others will still have work to do to achieve the department’s baseline of good and great practice. We know that all teams are starting this from different points, and are here to support all teams from their respective starting positions.

For guidance on how to implement the principles of Reproducible Analytical Pipelines (RAP) in statistics production processes, and where to start, see the getting started with RAP page.


Apply the principles

Now that you’ve prioritised the area of your workflow to focus on, the next step is to apply the relevant RAP principles to that part of your process.

The next section is split by the three core principles: preparing data, writing code and version control.

Within each of these principles are separate elements of RAP. Each of these is discussed in detail so that you know what is expected of you as an analyst.

