Extracting Freeze Frames
In my first post, I provided a tutorial on importing StatsBomb data into R using the StatsBombR package. We took that imported data and created a few summary tables for the FA Women’s Super League. Today we are going to take this a step further and extract the shot freeze frame data that StatsBomb provides.
As you may have noticed last time, we removed the shot.freeze_frame column from our dataset so we could write the CSV summary tables. This was an important step because the shot freeze frame is provided as a nested data frame: each cell of that column holds a data frame of its own, which cannot be written to a flat CSV file.
Here is an example of the data nested within the shot.freeze_frame column for a single shot.
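To make the idea of a nested data frame concrete, here is a minimal sketch using made-up data rather than the real StatsBomb events. The object `toy_shots` and the player names in it are hypothetical; only the column name shot.freeze_frame is taken from the dataset.

```r
library(tibble)

# Hypothetical toy data: two shots, each carrying its own small freeze frame
toy_shots <- tibble(
  minute = c(12, 47),
  shot.freeze_frame = list(
    tibble(player.name = c("Player A", "Player B"), teammate = c(TRUE, FALSE)),
    tibble(player.name = "Player C", teammate = FALSE)
  )
)

# The column is a list, and each element of it is a data frame
is.list(toy_shots$shot.freeze_frame)

# Look inside the freeze frame of the first shot
toy_shots$shot.freeze_frame[[1]]
```

Printing `toy_shots` shows one row per shot, with the freeze frame column displayed as a list of tibbles rather than as plain values.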
| location | teammate | player.id | player.name | position.id | position.name |
|---|---|---|---|---|---|
| 115.9, 42.7 | FALSE | 15709 | Megan Walsh | 1 | Goalkeeper |
| 103.4, 58.2 | TRUE | 15547 | Melissa Lawley | 17 | Right Wing |
| 98.8, 44.0 | TRUE | 15613 | Rinsola Babajide | 23 | Center Forward |
| 91.8, 57.7 | FALSE | 16392 | Felicity Gibbons | 6 | Left Back |
| 97.8, 51.1 | FALSE | 22337 | Maya Le Tissier | 5 | Left Center Back |
| 95.0, 43.1 | FALSE | 16383 | Danique Kerkdijk | 3 | Right Center Back |
| 98.7, 32.2 | FALSE | 19414 | Kirsty Barton | 2 | Right Back |
| 88.3, 47.3 | FALSE | 20034 | Danielle Buet | 13 | Right Center Midfield |
| 88.5, 40.9 | FALSE | 31529 | Léa Le Garrec | 15 | Left Center Midfield |
| 87.2, 42.9 | FALSE | 23289 | Emily Simpkins | 10 | Center Defensive Midfield |
| 89.1, 51.3 | FALSE | 16399 | Kate Natkiel | 16 | Left Midfield |

Table 1. A summary of data extracted from the freeze frame column
As we can see, the freeze frame provides some valuable information on player locations at the time of the shot. From this we could work out how many players are in front of or behind the ball, whether the shooter has a clear shot, and so on. This was a simple extraction using tidyr::unnest. However, we can run into problems if there is a null value within the column, where no freeze frame positional data is provided. We need to filter those rows out before we can unnest the data. We would do that as follows:
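    # Data was read in previously using StatsBombFreeEvents
    library(dplyr)
    library(purrr)
    library(tidyr)

    # Keep only shots, plus the columns we want alongside the freeze frame
    FreezeFrameData <- Data %>%
      filter(type.name == "Shot") %>%
      select(minute, second, shot.outcome.name, shot.freeze_frame)

    # Drop shots with no freeze frame, then expand to one row per player
    FreezeFrame <- FreezeFrameData %>%
      filter(!map_lgl(shot.freeze_frame, is.null)) %>%
      unnest(cols = c(shot.freeze_frame))

Calling unnest() with no arguments gives the warning "`cols` is now required when using unnest(). Please use `cols = c(shot.freeze_frame)`", so the column is named explicitly here.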
Using purrr's map_lgl, I can filter out the null values from the shot freeze frame column. From there I can unnest all the data into separate rows, one per player in each freeze frame. Using this filtered and unnested data you can now plot or calculate player density at the time of the shot.
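As a starting point for that kind of analysis, here is a rough sketch of how the unnested rows might be summarised. It assumes the location column arrives as a list of c(x, y) vectors, as Table 1 suggests. `FreezeFrameToy`, `FreezeFrameXY` and `PlayerCounts` are hypothetical names, and the toy data reuses four rows from Table 1 with a made-up minute and second.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical stand-in for the unnested FreezeFrame data: one shot, four players
# (locations and names borrowed from Table 1; minute and second are made up)
FreezeFrameToy <- tibble(
  minute = 10,
  second = 30,
  location = list(c(115.9, 42.7), c(103.4, 58.2), c(98.8, 44.0), c(91.8, 57.7)),
  teammate = c(FALSE, TRUE, TRUE, FALSE),
  player.name = c("Megan Walsh", "Melissa Lawley", "Rinsola Babajide", "Felicity Gibbons")
)

# Split each location vector into numeric x and y columns for plotting
FreezeFrameXY <- FreezeFrameToy %>%
  mutate(x = map_dbl(location, 1),
         y = map_dbl(location, 2))

# Count the teammates and opponents captured in each shot's freeze frame
PlayerCounts <- FreezeFrameXY %>%
  group_by(minute, second) %>%
  summarise(teammates = sum(teammate),
            opponents = sum(!teammate),
            .groups = "drop")

PlayerCounts
```

Grouping by minute and second only identifies a shot within a single match. When working across many matches, it would be safer to carry a match or event identifier through the earlier select() step and group by that instead.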
I hope this helps you examine the free StatsBomb data in more detail.