NFL Quarterback Performance Analysis
Statistical analysis of nullified interceptions in the NFL

Distribution of nullified interceptions across NFL quarterbacks (2018-2023)
The Hidden Side of NFL Statistics
As football fans, we're all familiar with the standard NFL statistics that dominate discussions every Monday morning - passing yards, touchdowns, and interceptions. But what about the plays that don't make it into the official record? What about those moments when a quarterback throws an interception, only to have it erased by a penalty flag?
These "nullified interceptions" exist in a statistical shadow realm - they happened on the field but disappear from the record books. In this data analysis project, I decided to shine a light on this overlooked aspect of quarterback performance.
The Mahomes Mystery
The inspiration for this project came from watching Kansas City Chiefs games over the past few seasons. As a data scientist and football fan, I couldn't help but notice what seemed like an unusual pattern: Patrick Mahomes appeared to benefit from an uncommonly high number of nullified interceptions.
Was this just confirmation bias on my part, or was there something statistically significant happening? To answer this question, I needed data - and lots of it.
Technologies & Tools
This project uses the following tools:
- R with nflfastR and tidyverse: The core of the data collection process, allowing me to access comprehensive play-by-play NFL data without manual review
- Python: Used for the statistical analysis and visualization components
- pandas: For efficient data manipulation and analysis
- scipy: For conducting statistical tests like the Mann-Whitney U Test
- matplotlib: For creating visualizations of the findings
- numpy: For numerical calculations and bootstrap resampling
Data Collection Process
The backbone of this project is the nflfastR package in R, which provides comprehensive play-by-play data for all NFL games. My data collection workflow included:
- Extracting play-by-play data for NFL seasons 2018-2023
- Programmatically identifying nullified interceptions by searching for play descriptions containing "INTERCEPTED" alongside penalty flags
- Algorithmically matching interceptions with starting quarterbacks when the passer information was incomplete
- Merging pass attempt data to normalize interception rates
- Outputting a cleaned dataset as nullified_interceptions_with_attempts_2018_2023.csv
Here's a glimpse at the R code I used to collect the data:
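To illustrate the filtering idea independently of the R script, here is a minimal Python sketch of the "INTERCEPTED plus penalty flag" search. It is not the project's collection code: the toy rows, the column names `passer`, `desc`, and `penalty`, and the helper `find_nullified_interceptions` are all assumptions made for this example.

```python
import pandas as pd

# Toy play-by-play rows. Column names and descriptions are illustrative
# assumptions, not actual nflfastR output.
plays = pd.DataFrame({
    "passer": ["QB A", "QB A", "QB B", "QB B", "QB C"],
    "desc": [
        "pass deep left INTERCEPTED by S.One. PENALTY on DEF, Defensive Holding - No Play.",
        "pass short right complete to R.Two for 7 yards.",
        "pass short middle INTERCEPTED by L.Three at the 40.",
        "pass incomplete deep right. PENALTY on DEF, Defensive Pass Interference.",
        "pass deep middle INTERCEPTED by C.Four. PENALTY on DEF, Roughing the Passer - No Play.",
    ],
    "penalty": [1, 0, 0, 1, 1],
})


def find_nullified_interceptions(df):
    """Keep plays whose description mentions an interception AND that carry a penalty flag."""
    has_interception = df["desc"].str.contains("INTERCEPTED", regex=False)
    has_flag = df["penalty"] == 1
    return df[has_interception & has_flag]


nullified = find_nullified_interceptions(plays)

# Count candidate nullified interceptions per passer.
counts = nullified.groupby("passer").size()
print(counts)
```

A flag on the play does not by itself guarantee that the interception was wiped out (penalties can be declined or can offset), so a fuller pipeline would likely also need to check how the penalty was enforced.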
Analysis Techniques
After collecting the data, I analyzed it using Python to identify statistical outliers and differences in nullified interception rates. The analysis included:
- Mann-Whitney U Test: A non-parametric test to check whether Mahomes' nullified interception rate differed significantly from that of other quarterbacks
- Bootstrapping: A resampling technique to estimate confidence intervals for nullified interception rates
- Z-Score Analysis: To identify statistical outliers in the dataset
- Data Visualization: To compare Mahomes' nullified interception rate against the distribution for other quarterbacks
The Python analysis that led to these conclusions included:
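As a rough illustration of the Mann-Whitney step, the sketch below compares one quarterback's rates against everyone else's using `scipy.stats.mannwhitneyu`. The numbers are made up, and treating per-season rates as the unit of observation is an assumption of this example; it is not the project's actual data or necessarily its exact setup.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Made-up per-season nullified-interception rates (per pass attempt).
# These values are placeholders, not results from the dataset.
focus_qb = np.array([0.006, 0.008, 0.007, 0.009, 0.005, 0.008])
other_qbs = np.array([0.001, 0.002, 0.000, 0.003, 0.001,
                      0.002, 0.004, 0.001, 0.000, 0.002])

# Two-sided test: are the two samples drawn from the same distribution?
stat, p_value = mannwhitneyu(focus_qb, other_qbs, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")

ALPHA = 0.05  # conventional significance level, chosen here for illustration
is_significant = p_value < ALPHA
print("significant" if is_significant else "not significant")
```

The test is rank-based, so it makes no normality assumption about the rates, which is useful when counts are small and skewed.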
Key Findings
The analysis revealed some fascinating insights. Most importantly, Patrick Mahomes stood out significantly from his peers. The statistical tests confirmed what my eyes had suspected:
- Mahomes had a Z-score of 4.80 for nullified interceptions
- This is well beyond the typical threshold of 3.0 for identifying statistical outliers
- The Mann-Whitney U Test confirmed the statistical significance (p-value < 0.001)
- Bootstrap resampling demonstrated the robustness of these findings
What This Means for How We Evaluate Quarterbacks
These findings have significant implications for how we evaluate quarterback performance in the NFL. Standard interception statistics only tell part of the story. When a quarterback consistently benefits from nullified interceptions:
- Their interception statistics appear better than their actual on-field decision-making
- The team's offensive performance might be artificially boosted
- Risk-taking behavior is effectively rewarded without the statistical penalty
This doesn't necessarily mean there's anything nefarious happening - it could be a result of offensive line play, penalty tendencies, or even coaching strategies. What it does mean is that the traditional box score doesn't capture the full picture.
Implementation Details
The project structure consists of two main components:
- get_nullified_interceptions.R: An R script that collects and processes the NFL play-by-play data using nflfastR
- analyze_nullified_interceptions.py: A Python script that performs the statistical analysis
Here's an example of how I calculated Z-scores to identify outliers:
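A minimal sketch of a Z-score outlier check is shown below. The quarterback labels and totals are hypothetical, and both the per-attempt normalization and the use of the sample standard deviation (`ddof=1`) are assumptions of this example rather than confirmed details of the original script.

```python
import numpy as np

# Hypothetical totals per quarterback: (nullified interceptions, pass attempts).
totals = {
    "QB01": (1, 1000), "QB02": (4, 2000), "QB03": (3, 1000), "QB04": (2, 1000),
    "QB05": (2, 2000), "QB06": (2, 1000), "QB07": (6, 2000), "QB08": (2, 1000),
    "QB09": (1, 1000), "QB10": (4, 2000), "QB11": (3, 1000), "QB12": (2, 1000),
    "QB13": (1, 1000), "QB14": (2, 1000), "QB15": (3, 1000), "QB16": (10, 1000),
}

names = list(totals)
counts = np.array([totals[n][0] for n in names], dtype=float)
attempts = np.array([totals[n][1] for n in names], dtype=float)

# Normalize by pass attempts so high-volume passers are not penalized.
rates = counts / attempts

# Z-score: distance from the mean rate, in units of standard deviation.
z_scores = (rates - rates.mean()) / rates.std(ddof=1)

THRESHOLD = 3.0  # the outlier cutoff mentioned in the findings
outliers = [name for name, z in zip(names, z_scores) if abs(z) > THRESHOLD]
print(outliers)
```

One caveat worth keeping in mind: with n observations the largest possible sample Z-score is (n - 1) / sqrt(n), so a cutoff of 3.0 can only ever be exceeded when at least 11 quarterbacks are included.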
Challenges and Solutions
During this project, I encountered several challenges:
- Data Integration: Combining the play-by-play data with quarterback information required careful matching algorithms
- Sample Size Concerns: To address potential sample size issues, I implemented bootstrap resampling techniques
- Controlling for Variables: I needed to account for factors like total passing attempts and offensive style when analyzing the data
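The bootstrap resampling mentioned above can be sketched as a percentile confidence interval for a nullified interception rate. This is an illustration under stated assumptions: the 0/1 per-attempt outcomes are made up, and `bootstrap_ci` is a hypothetical helper, not a function from the project.

```python
import numpy as np


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    if rng is None:
        rng = np.random.default_rng()
    values = np.asarray(values, dtype=float)
    # Draw resamples with replacement, each the same size as the original sample.
    idx = rng.integers(0, len(values), size=(n_resamples, len(values)))
    means = values[idx].mean(axis=1)
    lower, upper = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return lower, upper


# Made-up data: 1 marks a pass attempt that ended in a nullified interception.
outcomes = np.zeros(1000)
outcomes[:8] = 1.0

rng = np.random.default_rng(seed=42)  # fixed seed so the run is repeatable
point_estimate = outcomes.mean()
lower, upper = bootstrap_ci(outcomes, rng=rng)
print(f"rate = {point_estimate:.4f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

A wide interval here is itself informative: nullified interceptions are rare events, so even several seasons of attempts leave considerable uncertainty around any single quarterback's rate.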
Future Work
This project could be extended in several ways:
- Expanding analysis to include the impact of nullified interceptions on game outcomes
- Comparing nullified interception rates across different eras of the NFL
- Refining methods to distinguish intentional penalties from incidental fouls that lead to nullified interceptions
- Investigating the types of penalties that lead to nullified interceptions
- Analyzing game situations (score, down, distance) when nullified interceptions occur
- Developing a predictive model for nullified interceptions based on quarterback style and team factors
How to Use This Project
If you're interested in reproducing or building on this analysis, the process is straightforward:
- Run get_nullified_interceptions.R in R to generate the dataset (requires nflfastR and tidyverse packages)
- Run analyze_nullified_interceptions.py in Python to analyze the data and visualize results (requires pandas, numpy, scipy, and matplotlib)
Conclusion
This analysis provides statistical evidence that Patrick Mahomes benefits from an unusually high number of nullified interceptions. The findings suggest that standard interception statistics may not fully capture quarterback performance and risk-taking behavior.
The next time you watch an NFL game and see an interception wiped away by a penalty flag, remember - that play might not count in the official statistics, but it still tells us something important about quarterback performance. And in Patrick Mahomes' case, it tells us quite a lot.