
Statistically Speaking: July 2007

Saturday, July 28, 2007

First Annual Predictomatic Contest


Amazingly, this blog has just recently celebrated its two-year anniversary. In the two years I've been blogging, this space has evolved from a general sports blog to a college football blog. As I strive to make this blog more informative and analytical, I decided I also need to make it a little more fun. Thus, get excited for the introduction of what I hope to be an annual contest...The Predictomatic.

The rules are simple. For each of the 6 BCS conferences (ACC, Big East, Big 10, Big 12, Pac 10, and SEC), select the team you think will win that league. But the fun doesn't stop there. Next, select the teams you think will finish in the basement of each league. For leagues with two divisions (ACC, Big 12, and SEC), select the teams that will win each division, the teams that will finish in the cellar of each division, and the team that will win the corporately sponsored championship game. Next, select a team from outside the BCS that you think will play in a BCS bowl game this season. If you don't think one will qualify this season, make 'no team' your selection. Finally, choose your national champion and the total points scored in the BCS Championship Game.

Each correct selection of a conference champion, last place conference team, non-BCS BCS bowl game participant, and national champion will be worth 10 points. Each division champion, last place division team, and a correct 'no team' selection in the non-BCS category will be worth 5 points. To summarize, your ballot should have 23 selections broken down in this manner:

ACC (5)
2 Division winners (5 points each)
2 Last place teams (5 points each)
1 Champion (10 points)

Big East (2)
1 Champion (10 points)
1 Last place team (10 points)

Big 12 (5)
2 Division winners (5 points each)
2 Last place teams (5 points each)
1 Champion (10 points)

Big 10 (2)
1 Champion (10 points)
1 Last place team (10 points)

Pac 10 (2)
1 Champion (10 points)
1 Last place team (10 points)

SEC (5)
2 Division winners (5 points each)
2 Last place teams (5 points each)
1 Champion (10 points)

Non-BCS BCS Bowl Game Participant
1 Team (10 points) or No Team (5 points)

National Champion (10 points)

Tiebreaker
Total points in BCS Championship Game

The ballot with the most points (170 total possible) will win. If the national title is split (as in 2003), then both selections will be valid, but only the total points in the BCS Championship Game will count in the tiebreaker. Please send your completed ballot to predictomatic@yahoo.com by August 24, 2007 with your name and/or alias in the body of the email.

What will the victor receive? This is where you come in. You may have noticed the advertisements at the top of this blog. I am paid from these ads on a per-click basis. Starting 9/1/07 and continuing through 12/31/07, every cent earned from clicks will be given to the winner. Please click the ads occasionally, but do not click them an inordinate number of times, as the ads could be revoked, leaving the winner with no cash prize. A conservative estimate for the winning haul would be somewhere between $25 and $35. The winner will be paid via check (if I know you personally) or money order around mid-February 2008. Consider it a Valentine's Day present from me. In October, November, and December I'll be posting updates, not only of the conference races, but also of the 'jackpot' being accumulated for the eventual winner. Please enjoy what I hope will become an annual tradition.

Saturday, July 21, 2007

The Best Quarterback in College Football


Who was the best college quarterback in 2006? Troy Smith won the Heisman, Walter Camp, and Davey O'Brien awards. Brady Quinn won the Maxwell Award. Colt Brennan was the highest rated passer. Chris Leak's team won the national title. Jared Zabransky's team finished undefeated. Who was the best?

College football has many different types of quarterbacks. Some 'manage' the game well, others are fortunate to play in run n' shoot offenses that boost their passing numbers, and others are athletic and run the option. Determining which quarterback is the 'best' in any given year is nigh impossible. But that doesn't mean it's not worth a shot. The method I am going to employ to rank quarterbacks is quarterback rating adjusted for opponent. Quarterback rating is far from a perfect measure of how well a quarterback performed (it does not include rushing yards and touchdowns or fumbles, and in my opinion it puts too high a premium on completion percentage), but it is a pretty good indicator. The methodology for adjusting for opponents is explained at the end of this post. Enjoy.
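For reference, the quarterback rating used throughout is the NCAA passing efficiency formula (which differs from the NFL's). A quick Python version, checked against Colt Brennan's 2006 stat line (406-of-559 for 5,549 yards, 58 touchdowns, and 12 interceptions):

```python
def ncaa_passer_rating(att, comp, yards, td, ints):
    """NCAA passing efficiency formula (not the NFL version)."""
    return (8.4 * yards + 330 * td + 100 * comp - 200 * ints) / att

# Colt Brennan, 2006: 406-of-559, 5,549 yards, 58 TD, 12 INT
print(round(ncaa_passer_rating(559, 406, 5549, 58, 12), 1))  # 186.0
```

The result matches the 186.0 figure atop the unadjusted list below.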

Let's start by ranking the top 25 quarterbacks from 2006 based on their unadjusted quarterback rating.

Colt Brennan 186.0
John Beck 169.1
JaMarcus Russell 167.0
Tyler Palko 163.2
Kevin Kolb 162.7
Jared Zabransky 162.6
Troy Smith 161.9
Colt McCoy 161.8
Brian Brohm 159.1
Justin Willis 158.4
Adam Tafralis 155.1
Chase Holbrook 155.1
Andre Woodson 154.5
Erik Ainge 151.9
Jordan Palmer 149.6
Bobby Reid 148.1
Nate Davis 147.3
Brady Quinn 146.7
Dan LeFevour 146.2
Zac Taylor 146.1
Graham Harrell 145.8
Chase Daniel 145.1
Chris Leak 144.9
John David Booty 144.0
John Stocco 143.9

Now here are the top 25 quarterbacks based on the adjusted quarterback rating. The second number indicates how much the actual rating improved or declined while the third number indicates how much the overall ranking changed. For quarterbacks who climb or fall a significant amount, an explanation is included.

John Beck 175.3 +6.2 +1
TCU (#7 in pass efficiency defense) and Wyoming (#12) in conference play, as well as Boston College (#19) and Oregon (#28) outside Mountain West play, helped Beck take the top spot.

Colt Brennan 175.0 -11.0 -1
Five WAC teams finished with pass efficiency defenses rated 101st or worse (Idaho, Fresno State, New Mexico State, Louisiana Tech-2nd to last, and Utah State-last). The Warriors also played UNLV (#113) in the non-conference schedule.

JaMarcus Russell 167.7 +0.7 E

Brian Brohm 166.3 +7.2 +5
Several good pass defenses on the schedule: Rutgers (#8), South Florida (#11), Cincinnati (#20), Miami (#24), and Wake Forest (#26).

Andre Woodson 165.3 +10.8 +8
Faced three pass defenses in the top 10 (LSU, Florida, and Georgia).

Troy Smith 162.1 +0.2 -1

Tyler Palko 160.5 -2.7 -3

Colt McCoy 160.0 -1.8 E

Kevin Kolb 157.9 -4.8 -4
Three Conference USA opponents finished 101st or worse nationally in pass efficiency defense (Tulane, Rice, and Memphis) and one was knocking at the door--UCF (#98).

Chris Leak 156.2 +11.3 +13
See the Andre Woodson comment and add Ohio State (#10), Arkansas (#18), Florida State (#29), and Auburn (#33).

Jared Zabransky 155.3 -7.3 -5
See Colt Brennan comment.

Erik Ainge 153.3 +1.4 +2

Riley Skinner 151.7 +12.1 +20
The first quarterback from outside the top 25 to jump onto the list, Skinner was a big part of Wake's dream season. Faced Virginia Tech (#2), Georgia Tech (#9), Clemson (#17), Boston College (#19), Louisville (#23), and Florida State (#29).

Brady Quinn 151.7 +5.1 +4

Bobby Reid 151.6 +3.5 +1

John David Booty 151.3 +7.3 +8
No truly great pass defenses on the schedule (Arkansas-#18 and Michigan-#25 the best), but besides Notre Dame (#90) no awful pass defenses either.

Chad Henne 151.3 +7.9 +9
Henne faced the top-ranked pass efficiency defense (Wisconsin) and the number ten rated defense (Ohio State) to go along with Penn State (#14) and Southern Cal (#22).

Adam Tafralis 150.4 -4.7 -7
See the Colt Brennan and Jared Zabransky comments on the weakness of the WAC.

Bryan Cupito 148.8 +8.0 +11
Faced Wisconsin (#1), Kent State (#5 and a bit overrated), Ohio State (#10), Penn State (#14), and Michigan (#25). A poor game against North Dakota State (67.5 QB rating) also gets thrown out because they are not Division IA. This is the only rating I really don't agree with.

John Stocco 146.0 +2.1 +5

Dan LeFevour 145.6 -0.6 -2

Justin Willis 145.4 -13.0 -12
See the Kevin Kolb comment, but replace Memphis (#116) and UCF (#98) with UAB (#103) and Marshall (#93). Willis also had an outstanding game against Sam Houston State (291.2 QB rating) that is thrown out.

Brandon Cox 145.3 +6.6 +12
See assorted SEC comments.

Nate Davis 145.3 -2.0 -7

Matt Moore 144.3 +4.6 +7

There you have it. It's a scientific fact that John Beck was the best quarterback in 2006. Maybe the proclamation isn't quite that ex cathedra, but he certainly is in the discussion. Five quarterbacks fall out of the top 25: Chase Holbrook (once again, weak pass defenses in the WAC), Jordan Palmer (once again, weak pass defenses in Conference USA), and three Big 12 quarterbacks (Zac Taylor, Graham Harrell, and Chase Daniel).

Methodology:

1) Exclude all games against non-Division IA competition
2) For each game take the player's QB rating and divide by the QB rating defense (pass efficiency defense) of the opponent faced
3) Multiply this ratio by the number of pass attempts for the game
4) Add these numbers up across all games
5) Divide by total pass attempts on the season
6) Multiply this number by 127.53 (the cumulative arithmetic mean quarterback rating for all passes thrown in 2006)
7) Voila
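The steps above translate directly into code. Here is a minimal Python sketch of the adjustment; the two-game sample at the bottom is purely hypothetical, just to show the mechanics:

```python
def adjusted_rating(games, league_mean=127.53):
    """Opponent-adjusted QB rating per the methodology above.

    `games` is a list of (qb_rating, opp_pass_eff_defense, attempts)
    tuples, Division IA games only (step 1)."""
    weighted = sum((rating / opp_def) * att            # steps 2-3
                   for rating, opp_def, att in games)
    total_att = sum(att for _, _, att in games)
    return weighted / total_att * league_mean          # steps 4-6

# hypothetical two-game sample: rated 150 vs. a 120 defense on 30
# attempts, then 130 vs. a 110 defense on 25 attempts
print(round(adjusted_rating([(150.0, 120.0, 30), (130.0, 110.0, 25)]), 1))
```

Note that the attempt-weighting in steps 3-5 means a big game against a great defense moves the rating more than a mop-up quarter against a cupcake.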

Saturday, July 14, 2007

The Teams You Beat



The impetus for this post comes from some astute posting over at Kermit the Blog regarding the relative paucity of quality opposition in Big East scheduling. The question I want to address here is whether a weak schedule can help us predict what will happen the following season. Is a team that beat up on patsies more likely to decline the following season than a team that played and won against a tough schedule? To answer this question, I performed two sets of regression analyses.

First, I looked at the wins of all 65 BCS teams (plus Notre Dame) in 2005 and took the record against BCS teams (plus Notre Dame) of the teams they had beaten (wins against non-Division IA teams excluded). Confused? Here's an example. In 2005, the Pittsburgh Panthers finished 5-6 under first year coach Dave Wannstedt. Their five wins were against Youngstown State, Cincinnati, South Florida, Syracuse, and Connecticut. The Youngstown State game gets thrown out since we are only concerned with games against Division IA teams. The first team they beat, Cincinnati, was 2-6 against other BCS teams (beating only Connecticut and Syracuse), South Florida was 4-5, Syracuse was 0-10, and Connecticut was 2-6. Combined, the teams Pitt beat finished 8-27 against BCS teams for a rather low winning percentage of .229. This number is our independent variable, and we will see how well it predicts the next season's (2006) winning percentage and conference winning percentage for each BCS team.
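The Pitt arithmetic can be reproduced in a few lines of Python (the records are taken straight from the example above):

```python
# Pitt's 2005 Division IA victims and their records against BCS teams
beaten = {"Cincinnati": (2, 6), "South Florida": (4, 5),
          "Syracuse": (0, 10), "Connecticut": (2, 6)}

wins = sum(w for w, l in beaten.values())
losses = sum(l for w, l in beaten.values())
print(wins, losses, round(wins / (wins + losses), 3))  # 8 27 0.229
```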

r squared value for predicting next season's winning percentage: .0946

r squared value for predicting next season's conference winning percentage: .0901

Both relationships are positive, indicating that as defeated opponents' record against BCS teams goes up, so do both conference and overall winning percentage. Defeated opponents' combined record against BCS teams appears to be a consistent, albeit poor, predictor of a team's finish the following season. The r squared value is practically identical for both record and conference record the next season, but only a little more than 9% of the variation is explained.

While compiling each team's defeated opponents' record against BCS teams, I noticed that winning percentage against BCS teams can have sample size issues. Therefore, I also decided to use total wins against BCS teams by defeated opponents as an independent variable. An example from 2005 between two Pac 10 schools can illustrate the dramatic effect sample size can have. In 2005, the Oregon Ducks had a fantastic regular season, finishing 10-1. Their 10 wins were against Houston (1-1 versus BCS schools), Montana (non-Division IA), Fresno State (0-2), Stanford (4-5), Arizona State (5-5), Washington (1-8), Arizona (2-7), Cal (5-4), Washington State (1-7), and Oregon State (3-6). Combined, their defeated opponents finished 22-45 against BCS schools for a winning percentage of .328. That same season the Arizona Wildcats finished a disappointing 3-8. Their wins were against Northern Arizona (non-Division IA), Oregon State (3-6), and UCLA (7-2). Combined, their defeated opponents finished 10-8 for a winning percentage of .556. That's nearly 23 percentage points higher than that of Oregon's defeated opponents. However, Arizona's defeated opponents had fewer than half as many wins as Oregon's. Here are the r squared values for total wins.

r squared value for predicting next season's winning percentage: .2456

r squared value for predicting next season's conference winning percentage: .2183

Both relationships are positive, indicating that as defeated opponents' total wins against BCS teams go up, so do both conference and overall winning percentage. Again, both relationships appear to be relatively consistent with regard to both conference and overall winning percentage. However, the predictive ability of total wins is over twice as high as that of winning percentage.
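For anyone who wants to replicate these regressions, the r squared of a simple linear fit needs no statistics package at all. The sketch below uses made-up illustrative pairs, not the actual 2005-2006 data set:

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# hypothetical pairs: defeated opponents' total BCS wins (x) versus
# next season's winning percentage (y)
x = [22, 10, 15, 30, 8, 25]
y = [0.83, 0.36, 0.50, 0.92, 0.25, 0.75]
print(round(r_squared(x, y), 3))
```

With real data, each (x, y) pair would be one of the 66 BCS teams (plus Notre Dame) across the 2005 and 2006 seasons.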

What does this all mean? While it does matter to some extent how 'good' a team's wins were when predicting how it will fare the following season, it is certainly not the best indicator of future success. Instead of using it to predict how good a team will be, it's best used to quantify how good a team was.

Thursday, July 05, 2007

Building a Better Mousetrap: Adjusted Pythagorean Winning Percentage



Time and again I've espoused on this blog the virtues of using a team's Pythagorean winning percentage to more accurately project how it will perform in the upcoming season. For the uninitiated, the formula for the Pythagorean winning percentage is as follows:

(Points Scored^2.37)/(Points Scored^2.37 + Points Allowed^2.37)

The resulting number is a team's Pythagorean winning percentage. Multiplying that number by the number of games played gives a reasonable estimate of how many games a team should have won. However, the formula is not without its flaws. For starters, blowouts, especially extreme blowouts, can artificially inflate or deflate a team's Pythagorean record depending on whether the team received or doled out the beating. The solution? Compute the Pythagorean winning percentage on a game-by-game basis, add up the totals, and divide by games played. This way each game counts the same and the effect of blowouts is lessened. Here is a hypothetical example of the adjusted theorem in action for Eponymous State University.

ESU Results

17-10
31-28
21-35
70-3
34-20
21-24
28-14
21-17
34-14
21-27
14-6

ESU went 8-3 while scoring 312 points and allowing 198. Their Pythagorean winning percentage is .746, so their expected record is 8.21-2.79. Their actual record aligns pretty well with their Pythagorean record. However, one game sticks out like a sore thumb and unjustly influences the rating. In the fourth game, ESU dropped 70 on their opponent. Perhaps the opponent was a Division III school, maybe they were a Division IA school with a slew of injuries, maybe they turned the ball over nine times, maybe ESU ran up the score. Whatever the reason, we need a way to lessen that game's impact. Say, for example, ESU had stopped scoring at 30 points. They still win the game rather easily, but their seasonal Pythagorean winning percentage drops to .680 (7.48-3.52). That's a decrease of almost three-quarters of an expected win from a single game's margin. If we instead determine the Pythagorean winning percentage of each game, add them up, and divide by 11, we get an adjusted Pythagorean winning percentage of .669 (7.36-3.64).

When we compute the Pythagorean winning percentage on a per game basis the difference between beating a team 30-3 and 70-3 is only about 3/100 of a point in winning percentage.

(30^2.37)/(30^2.37+3^2.37)=.996

(70^2.37)/(70^2.37+3^2.37)=.999

70 points is just piling on. Each extra score above a certain point negligibly increases the odds of winning the ball game. This in effect puts the proverbial 'cap' on margin of victory.
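Both versions of the calculation can be sketched in a few lines of Python using the hypothetical ESU schedule above:

```python
EXP = 2.37

def pythag(pf, pa):
    """Pythagorean winning percentage for a game or a season."""
    return pf ** EXP / (pf ** EXP + pa ** EXP)

# ESU's eleven results as (points for, points against)
games = [(17, 10), (31, 28), (21, 35), (70, 3), (34, 20), (21, 24),
         (28, 14), (21, 17), (34, 14), (21, 27), (14, 6)]

# season-total version, then the game-by-game adjusted version
season = pythag(sum(pf for pf, _ in games), sum(pa for _, pa in games))
adjusted = sum(pythag(pf, pa) for pf, pa in games) / len(games)
print(round(season, 3), round(adjusted, 3))  # 0.746 0.669
```

Swapping the 70-3 result for 30-3 in the `games` list reproduces the .680 figure from the example, while the adjusted number barely moves.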

Now the important part: is the adjusted Pythagorean winning percentage a decent predictor of a team's fortunes? Here are the r squared values for how well three 2005 statistics predicted a team's 2006 winning percentage (BCS schools and Notre Dame only--sample size 66).

2005 winning percentage: .352
2005 Pythagorean winning percentage: .3653
2005 adjusted Pythagorean winning percentage: .39

All three measures were reasonable predictors, with adjusted Pythagorean winning percentage being the best. It should be noted that in a post last offseason, we found a team's (again, BCS schools and Notre Dame only) 2004 Pythagorean winning percentage to be a much better predictor of its 2005 winning percentage than its actual 2004 winning percentage was. Part of the reason for the lower predictive power of both measures could be the added 12th game in 2006. The majority of the time, the 12th game features a BCS school taking on a low-level Division IA or non-Division IA school for a guaranteed victory, thus boosting the team's winning percentage.