# R For Sport - Calculating Rolling Averages in R

Hi everyone, in this week’s R For Sport post we are looking at how to calculate rolling averages using different packages that have been developed for R. These packages make it super simple to create your own rolling averages in a matter of minutes!

So let’s get straight into it.


First of all, let’s read in the data we want to use and apply a few small steps. The first thing I will do is introduce you to `mutate()`, a function in the Tidyverse that allows you to add new columns to a data frame. All you need to do is supply a new column name and an expression that calculates the column for you. So below let’s add a high-speed distance column to the data I have read in.

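```r
library(tidyverse)   # dplyr verbs and the %>% pipe
library(lubridate)   # dmy() for parsing day-month-year dates
library(kableExtra)  # kbl(), kable_material(), scroll_box()

data_select <- data %>%
  select(c(1, 2, 4, 8, 9, 10, 11)) %>%
  # Column names containing spaces or hyphens must be wrapped in backticks
  mutate(`High-Speed-Distance` =
           `Band 5 Distance` + `Band 6 Distance` + `Band 7 Distance`) %>%
  mutate(Date = dmy(Date)) %>%
  select(-c(5, 6, 7))  # drop the three band columns now that they are summed

data_select %>%
  kbl() %>%
  kable_material(c("striped", "hover"), full_width = FALSE)
```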
As you can see in the table above, we have added the new High-Speed-Distance column and kept only the columns we need. This makes our data frame much smaller and easier to work with.

## Using RcppRoll

The first package we are going to use is called RcppRoll. This package has a few functions that allow you to create rolling averages or sums. The syntax is quite simple and you can see it in the code chunk below.

```r
library(RcppRoll)

data_rcpp <- data_select %>%
  group_by(`Athlete ID`) %>%
  arrange(`Athlete ID`, Date) %>%
  # Right-aligned 7- and 28-day rolling means, padded with 0 until the window is full
  mutate(rolling_avg_rcpp_7 =
           roll_mean(`Total Distance`, n = 7, fill = 0, align = "right"),
         rolling_avg_rcpp_28 =
           roll_mean(`Total Distance`, n = 28, fill = 0, align = "right")) %>%
  mutate(rcpp_acwr = rolling_avg_rcpp_7 / rolling_avg_rcpp_28)

# Show the result for a single athlete in a scrollable table
data_rcpp %>%
  filter(`Athlete ID` == 1) %>%
  kbl() %>%
  kable_material(c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "400px")
```
There are a few things to notice in the function inputs. First, you need to choose how the rolling window is aligned. The default is `"center"`, which places the window evenly on either side of each value. I chose `"right"`, so each value is the average of that day and the days before it, and the 7- or 28-day value only starts to appear once 7 or 28 days have passed. Until then, `fill = 0` pads the column with zeros. Because of this, the ACWR cannot be calculated for the first 27 days: while both averages are still 0 the result is `NaN` (0 divided by 0), and once the 7-day average exists but the 28-day average is still 0 the result is `Inf` (a positive number divided by 0).
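
To make the effect of right alignment and zero padding concrete, here is a minimal base R sketch on a made-up vector. The helper `roll_mean_right()`, the toy data, and the short 2- and 4-day windows are assumptions for illustration only; the helper is not part of RcppRoll and simply mimics the padding behaviour described above.

```r
# Hypothetical helper: right-aligned rolling mean, padded with `fill`
# until a full window of n values is available.
roll_mean_right <- function(x, n, fill = 0) {
  out <- rep(fill, length(x))
  if (length(x) >= n) {
    for (i in n:length(x)) {
      out[i] <- mean(x[(i - n + 1):i])
    }
  }
  out
}

# Made-up daily loads; short windows stand in for 7 and 28 days
daily_load <- c(10, 20, 30, 40, 50, 60)
acute <- roll_mean_right(daily_load, n = 2)
chronic <- roll_mean_right(daily_load, n = 4)
acwr <- acute / chronic

# Days without a full chronic window give NaN (0 / 0) or Inf (x / 0)
print(data.frame(daily_load, acute, chronic, acwr))

# One option: blank out any ratio that is not finite
acwr_clean <- ifelse(is.finite(acwr), acwr, NA)
```

Whether you blank these rows, drop them, or leave them as they are is a judgment call; the point is only that they are an artefact of the padding rather than real ratios.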

But now that we have some values, let’s plot them and see what they look like.

As we can see, our 7-day value appears quite early but it takes a little longer for the 28-day value to start appearing. This isn’t necessarily a bad thing, but it would be nicer to have your values from day 1. So let’s see what we have in another package!

## Using the Zoo Package

The next package we are going to use is called zoo. In much the same way as RcppRoll, zoo provides a couple of functions for calculating rolling averages and sums. So let’s give it a go below.

```r
data_zoo <- data_rcpp %>%
  group_by(`Athlete ID`) %>%
  arrange(`Athlete ID`, Date) %>%
  # Same right-aligned, zero-padded windows as before, this time via zoo
  mutate(rolling_avg_zoo_7 =
           zoo::rollmean(`Total Distance`, k = 7, fill = 0, align = "right"),
         rolling_avg_zoo_28 =
           zoo::rollmean(`Total Distance`, k = 28, fill = 0, align = "right")) %>%
  mutate(zoo_acwr = rolling_avg_zoo_7 / rolling_avg_zoo_28)

# Keep the original columns plus the three new zoo columns
data_zoo %>%
  filter(`Athlete ID` == 1) %>%
  select(c(1:5, 9:11)) %>%
  kbl() %>%
  kable_material(c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "400px")
```
The functions provided by zoo are very similar to those in RcppRoll: they take the same alignment argument and show the same delay before values appear. So we can assume that our plot will look the same, but just to be sure, let’s plot the values below.

As we assumed, the figure looks identical to what we had previously. So let’s try our last package and see what we get.
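
If the zeros and the resulting `Inf` values are a nuisance, one variation worth trying is to pad with `NA` instead, or to let the window grow during the warm-up period. The sketch below uses `zoo::rollmeanr()` (the right-aligned shorthand for `rollmean()`) and `zoo::rollapplyr()` with `partial = TRUE`. The toy data and the short 2- and 4-day windows are assumptions for illustration.

```r
library(zoo)

# Made-up daily loads; short windows stand in for 7 and 28 days
daily_load <- c(10, 20, 30, 40, 50, 60)

# Right-aligned rolling means padded with NA rather than 0
acute_na <- rollmeanr(daily_load, k = 2, fill = NA)
chronic_na <- rollmeanr(daily_load, k = 4, fill = NA)
acwr_na <- acute_na / chronic_na  # NA until both windows are full

# partial = TRUE averages whatever is available during the warm-up period
chronic_partial <- rollapplyr(daily_load, width = 4, FUN = mean, partial = TRUE)

print(data.frame(daily_load, acute_na, chronic_na, acwr_na, chronic_partial))
```

With `NA` padding the ratio is simply missing until day 4 in this toy example, rather than showing up as `NaN` or `Inf`.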

## Using the Pracma Package

The last package on the list is pracma. This package provides the basic rolling average functionality we have seen previously, along with the ability to calculate exponentially weighted values. First, let’s try the simple average calculation.

```r
library(pracma)

data_pracma <- data_zoo %>%
  group_by(`Athlete ID`) %>%
  arrange(`Athlete ID`, Date) %>%
  # type = "s" requests a simple moving average
  mutate(rolling_avg_pracma_7 =
           movavg(`Total Distance`, n = 7, type = "s"),
         rolling_avg_pracma_28 =
           movavg(`Total Distance`, n = 28, type = "s")) %>%
  mutate(pracma_acwr = rolling_avg_pracma_7 / rolling_avg_pracma_28)

# Keep the original columns plus the three new pracma columns
data_pracma %>%
  filter(`Athlete ID` == 1) %>%
  select(c(1:5, 12:14)) %>%
  kbl() %>%
  kable_material(c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "400px")
```
The first difference we can see with the pracma calculation is in the values themselves. For the first 6 days, we get the average of all the days up to and including that day, and then the average of the full 7 days from day 7 onwards. The same applies to our 28-day value, which makes this possibly a better way of calculating our values. Let’s see the plot!
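
In base R terms, the warm-up behaviour described above can be sketched as follows. `simple_movavg()` is a hypothetical helper written for illustration, and the toy data and short windows are assumptions; it mirrors the behaviour described here rather than pracma’s actual source.

```r
# Hypothetical helper: average of all days so far until n days exist,
# then the average of the most recent n days.
simple_movavg <- function(x, n) {
  sapply(seq_along(x), function(i) {
    start <- max(1, i - n + 1)
    mean(x[start:i])
  })
}

# Made-up daily loads; short windows stand in for 7 and 28 days
daily_load <- c(10, 20, 30, 40, 50, 60)
acute <- simple_movavg(daily_load, n = 2)    # 10 15 25 35 45 55
chronic <- simple_movavg(daily_load, n = 4)  # 10 15 20 25 35 45

# The ratio is defined from day 1, with no NaN or Inf to clean up
acwr <- acute / chronic
print(round(acwr, 2))
```

Note that the two averages are identical until the shorter window is full, so the ratio sits at exactly 1 for those first days.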

As we can see, with our data starting straight away, we get a gradual build-up of our values, which are identical at first and then start to differ after day 7. Let’s try and calculate the EWMA version now!

```r
data_pracma <- data_pracma %>%
  group_by(`Athlete ID`) %>%
  arrange(`Athlete ID`, Date) %>%
  # type = "e" switches to an exponentially weighted moving average
  mutate(rolling_avg_pracma_e_7 =
           movavg(`Total Distance`, n = 7, type = "e"),
         rolling_avg_pracma_e_28 =
           movavg(`Total Distance`, n = 28, type = "e")) %>%
  mutate(pracma_e_acwr = rolling_avg_pracma_e_7 / rolling_avg_pracma_e_28)

# Keep the original columns plus the three new EWMA columns
data_pracma %>%
  filter(`Athlete ID` == 1) %>%
  select(c(1:5, 15:17)) %>%
  kbl() %>%
  kable_material(c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "400px")
```
This is the easiest of changes: with `type` set to `"e"`, the function now uses an exponentially weighted moving average instead. We can see that these values already differ from day 2, which is really interesting. Let’s plot these as well!

From day 2, we can see the difference very quickly which is interesting. There is a lot more variation between these values. The more interesting thing will be the output of the ACWR values.

So let’s plot those.

Our EWMA ACWR version is different and far more varied compared to our normal rolling average version. My data isn’t perfect, so the look of this in reality might be much different, so give it a go with your data and see how it appears!
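
To see why the exponentially weighted values diverge from day 2, here is a minimal base R sketch of an EWMA. It assumes the common smoothing factor `alpha = 2 / (n + 1)` and seeds the series with the first observation, which is my understanding of what `type = "e"` does. `ewma()` is a hypothetical helper and the data are made up, so check the results against `movavg()` on your own data.

```r
# Hypothetical helper: exponentially weighted moving average,
# assuming alpha = 2 / (n + 1) and a first value equal to x[1].
ewma <- function(x, n) {
  alpha <- 2 / (n + 1)
  out <- numeric(length(x))
  out[1] <- x[1]
  for (i in seq_along(x)[-1]) {
    out[i] <- alpha * x[i] + (1 - alpha) * out[i - 1]
  }
  out
}

# Made-up daily loads
daily_load <- c(100, 200, 100, 200)
ewma_7 <- ewma(daily_load, n = 7)    # alpha = 0.25
ewma_28 <- ewma(daily_load, n = 28)  # alpha = 2 / 29

# Both series start at the same value, then react at different speeds
print(data.frame(daily_load, ewma_7, ewma_28, acwr = ewma_7 / ewma_28))
```

Because the 7-day version gives each new day a weight of 0.25 and the 28-day version only about 0.07, the two series separate as soon as the second observation arrives.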


## Video

You can find the full video showing the information discussed above, in a little more detail, below.

As always, hit like on the video and subscribe here for more videos to help you Power Performance Through Data.

Until next time,

Josh