Statistical Analysis of Recruiting in C-USA

AustinFromUNT · January 30, 2017

*Prepares for copious amounts of dislikes* I have an ungodly amount of free time this late at night, so I decided to look at the recruiting classes for each C-USA school from 2013-2016 and compare them with each school's record to see whether average class rank correlates with average record in our conference. Here is what I found (ACR = Average Class Ranking, AR = Average Record):

Marshall: ACR: 1, AR: 9-4

MTSU: ACR: 4, AR: 7-6

Southern Miss: ACR: 4, AR: 5-8

LA Tech: ACR: 4, AR: 8-5

FAU: ACR: 5, AR: 4-8

WKU: ACR: 8, AR: 10-3

UAB: ACR: 8, AR: 4-8 (minus 2 seasons)

UNT: ACR: 9, AR: 5-8

ODU: ACR: 9, AR: 7-5

FIU: ACR: 9, AR: 4-8

Rice: ACR: 10, AR: 7-6

UTKFC: ACR: 11, AR: 5-7

Charlotte: ACR: 12, AR: 3-9 (minus one season)

UTEP: ACR: 13, AR: 5-8

So what is the correlation here? First thing to note is that the standard deviation is 3.5, which is pretty large given a sample size of 14. Next thing to note is that the coefficient of determination for this data is .2048. In a research setting this would be a very slight correlation, and it would need further testing for statistical significance. So what does all of this mean? It means the sky isn't falling yet. It means we should wait and see how this class plays before we start calling for SL's job like some are already doing here.
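
For anyone who wants to check the numbers, here is a minimal Python sketch that recomputes the correlation from the list above. It assumes AR is reduced to its rounded win total and paired with ACR school by school. The variable names are illustrative, and the 2.179 cutoff is the standard two-tailed 5% critical value of Student's t with 12 degrees of freedom, which is not something stated in the post.

```python
import math

# (ACR, average wins) per school, copied from the list above
data = {
    "Marshall": (1, 9), "MTSU": (4, 7), "Southern Miss": (4, 5),
    "LA Tech": (4, 8), "FAU": (5, 4), "WKU": (8, 10), "UAB": (8, 4),
    "UNT": (9, 5), "ODU": (9, 7), "FIU": (9, 4), "Rice": (10, 7),
    "UTKFC": (11, 5), "Charlotte": (12, 3), "UTEP": (13, 5),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


acr = [rank for rank, _ in data.values()]
wins = [w for _, w in data.values()]
n = len(acr)

r = pearson_r(acr, wins)
r_squared = r * r

# t statistic for testing "no linear correlation" with n - 2 degrees of freedom
t_stat = r * math.sqrt((n - 2) / (1 - r_squared))
T_CRIT = 2.179  # assumed cutoff: two-tailed 5% level, 12 degrees of freedom

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"t = {t_stat:.2f}, significant at 5%: {abs(t_stat) > T_CRIT}")
```

With the listed numbers, r comes out negative (a lower, i.e. better, class rank goes with more wins) and r squared rounds to .2048, matching the figure above. The t statistic falls short of the 5% cutoff, so on this sample the relationship would not pass a conventional significance test.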

Edited January 30, 2017 by AustinFromUNT

Harry · January 30, 2017

Interesting stuff. Thanks for sharing this.

Aldo · January 30, 2017

Hm, going off a @BillySee58 observation, what about average number of All-Conference players?

Cerebus · January 30, 2017

First, you're really looking at 2 years of recruits; most of the guys from those '15 and '16 classes will not have played major roles yet. Not sure that is enough data to be statistically significant.

Second, @BillySee58's entire thesis is that at the G5 level offer sheets are better indicators of talent than star rankings. So if you do want to get something statistically significant, a better approach would be to compare the number of FBS offers for recruits over 10 years or so.

ChristopherRyanWilkes · January 30, 2017

It's been a while since I took a stats class, but something seems off about your numbers and findings. First off, how did you get an average record of 5-8 for UNT? Second, low correlation or not, it looks like #1-4 in recruiting are consistently the top teams in our conference while the bottom 4 are consistently the worst. So you may have inadvertently proved our point about recruiting rankings even more.

When you say standard deviation of 3.5, what are you applying that to? Like 3.5 wins?

AustinFromUNT · January 30, 2017

You get a 5-8 record by looking at the listed years' records. Over those 4 years we had 19 wins / 4 years = 4.75 wins, rounded to 5. Over those 4 years we had 31 losses / 4 years = 7.75 losses, rounded to 8. Hence the 5-8 average record.

The "top 4 in recruiting" are Marshall, MTSU, Southern Miss, & LA Tech because they had the 4 highest average class rankings. So. Miss averaged 5-8 over that 4 year span, which I wouldn't label great, and MTSU only averaged 7-6, which is a winning record but not something necessarily to brag about. LA Tech has done well with 8-5, and obviously since Marshall placed 1st in recruiting all 4 of those years they did fairly well at 9-4; however, this takes into account some drastic extremes, such as going from a 1 loss season to last season going 3-9. WKU has been the most successful program, though, averaging 10-3 with an average 8th ranked recruiting class. Plus let's not forget FAU has averaged 5th in recruiting but they only average a 4-8 record.

The SD for this data compares the ACR to AR (wins). So basically if you look at expected wins for a school that finishes 5th in recruiting, then you will see that 95% of the time they'll fall within an SD of 3.5, which is pretty large given that there are a maximum of 13 games to be played.
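
A quick Python sketch of the arithmetic above, using only the standard library. The win and loss totals are the ones quoted in this post. Applying the sample standard deviation to each column separately is an assumption, made only to see which column a figure near 3.5 could describe.

```python
import statistics

# UNT totals over the four seasons, as quoted in the post
total_wins, total_losses, seasons = 19, 31, 4
avg_wins = total_wins / seasons        # 4.75
avg_losses = total_losses / seasons    # 7.75
avg_record = f"{round(avg_wins)}-{round(avg_losses)}"

# ACR and rounded average wins, in the order listed in the first post
acr = [1, 4, 4, 4, 5, 8, 8, 9, 9, 9, 10, 11, 12, 13]
wins = [9, 7, 5, 8, 4, 10, 4, 5, 7, 4, 7, 5, 3, 5]

# Sample standard deviation (n - 1 in the denominator) of each column
sd_acr = statistics.stdev(acr)
sd_wins = statistics.stdev(wins)

print(avg_record)
print(f"SD of ACR:  {sd_acr:.2f}")
print(f"SD of wins: {sd_wins:.2f}")
```

On these inputs the sample SD of the ACR column is about 3.5, while the SD of average wins is about 2.1, so the 3.5 figure appears to describe the spread of class rankings rather than of wins.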

Edited January 30, 2017 by AustinFromUNT
Put MTSU twice instead of LA Tech

ChristopherRyanWilkes · January 30, 2017

OK I gotcha now. Only thing I would add is that recruiting classes generally take 3-4 years to show up in the record. So maybe going back even further can explain why those teams have done so well. Also, I would like to see how taking out the middle of the pack affects the coefficient. Seems like in the middle it isn't as significant as it is on the bottom and top end of the rankings. Sorry, not trying to poke holes in your research and I appreciate what you've done, just trying to get further elaboration.

Graddean · January 30, 2017

Taking out the middle will increase the correlation, but the result is meaningless. As far as the SD is concerned, you have to go two deviations to get 95%. This also assumes a normal distribution which would not be appropriate with ranked data such as recruiting rankings.
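
To make those points concrete, here is a rough Python sketch on the numbers from the first post. It computes a Spearman rank correlation (Pearson's r on tie-averaged ranks), which does not assume a normal distribution, and then shows how dropping the middle six schools changes Pearson's r. The helper names and the choice of "top 4 / bottom 4" as the extremes are assumptions for illustration.

```python
import math

# ACR and rounded average wins, in the order listed in the first post
acr = [1, 4, 4, 4, 5, 8, 8, 9, 9, 9, 10, 11, 12, 13]
wins = [9, 7, 5, 8, 4, 10, 4, 5, 7, 4, 7, 5, 3, 5]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j across the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Spearman's rho: Pearson's r applied to the ranks
spearman_rho = pearson_r(average_ranks(acr), average_ranks(wins))

# Full sample versus extremes only; the lists are already sorted by ACR,
# so the first four and last four entries are the top and bottom schools
r_all = pearson_r(acr, wins)
r_extremes = pearson_r(acr[:4] + acr[-4:], wins[:4] + wins[-4:])

print(f"Spearman rho:            {spearman_rho:.3f}")
print(f"Pearson r, all 14:       {r_all:.3f}")
print(f"Pearson r, extremes (8): {r_extremes:.3f}")
```

On this data the extremes-only r is noticeably larger in magnitude than the full-sample r. That jump comes from throwing away the schools that do not fit the pattern, not from a stronger underlying relationship.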

ChristopherRyanWilkes · January 30, 2017

OK this is what seemed off to me, but I couldn't put it in statistics terms. Like I said, been several years since my last stats class, or having to use stats at all.

Aldo · January 30, 2017

I think Austin's on the right track, but needs a little more fine tuning and more specific tests.

Found this paper. It's pretty simple and straightforward, but it's kind of what we're trying to do here.

Also interesting is that recruiting classes this year will not have an effect on wins later on because recruit quality is practically the same for each team year after year (ugh).

http://www.econ.ohio-state.edu/trevon/pdf/Bergmen_Logan.pdf

Quote

The difference between the OLS and fixed effect estimates is not uniform, however. The coefficient on three star recruits in the fixed effects regression (.0555) is larger than the OLS coefficient for three star recruits (.046). This implies that within schools three star recruits have a larger impact on the number of wins than between schools. This result is most likely due to the fact that three star recruits are recruited by a variety of football programs and their marginal impact on an individual school’s success could be quite large for schools without strong tradition.

Quote

Also lower rated recruits have more heterogeneity since they aren’t evaluated and scrutinized at the same level of higher rated recruits. Three star and two star recruits aren’t subject to rechecking of their evaluation as much as higher rated recruits. Even with that as a potential explanation the actual difference is small and is not statistically significant (t=0.349).

And the effect of recruiting classes over time seems negligible:

Quote

The addition of lags in recruit quality do not have a substantial impact on model fit once team fixed effects are included. In fact, the inclusion of lags yields misspecified results due to the high degree of colinearity between years.

Quote

While teams can (and do) win with talent of lower ratings, success at the highest levels of college football is much more likely to happen when a team possesses highly rated recruits.

Edited January 30, 2017 by Aldo
It's in the Journal of Sports Economics

AustinFromUNT · January 30, 2017

I thought about going 2 more recruiting classes back, but at that time different schools were in different conferences and all other kinds of confounding variables.