It should also be obvious that dividing two numbers that are on different scales from one another has very little chance of being meaningful. Having two statistics or metrics on the same scale is far from unusual or impossible. Assist Rate and Turnover Rate are on the same scale: percentage of a team's possessions. Thus, there's a much more compelling case for multiplying or dividing those two numbers. Generous of you, but it wasn't anything so large as an assumption, just a wondering. The fact that three wildly dissimilar players all happened to come out to within 1.0-1.2 was interesting, but I didn't research it much further. A 1:1 ratio was unlikely (since they're on different scales) but there's still a pretty good chance that there's some kind of mean that players don't vary too widely from. What constitutes "too widely" would be based on the standard deviations from the league (or starter) mean. Sure, I do too. That's what we do all the time when we reference Usage, Rebound Rate, Assist Rate, pace, etc. These are all components of PER. I'm not sure that there's an inherent value in simply dividing the result of a formula by one of its constituent parts. I'm open to it, but I'd like to hear a logical case for why it's meaningful. Not just "Here are a few players' Usage/PER...maybe this means something." And I said/agreed with that in my first response to you. PER and Usage are definitely correlated to some extent: higher Usage will usually yield a higher PER, because PER does reward gross production, not just efficiency. So using more possessions will result in more shot attempts, points and assists. Of course, if the player is not very good, this will only have limited effect because more shots put up and more possessions used (by an inferior player) will result in worse scoring efficiency and more turnovers, both of which hurt PER.
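For anyone who wants to put a number on "too widely," here is a minimal Python sketch of the standard-deviation idea from the post above. It is not anything a poster actually ran. The function name `ratio_z_scores`, the input layout (name mapped to a Usage and PER pair), and the sample numbers are all assumptions made for illustration, and the sample values are placeholders, not real player stats.

```python
from statistics import mean, pstdev

def ratio_z_scores(players):
    """Map each player name to the z-score of their Usage/PER ratio.

    `players` is a dict of name -> (usage, per). The layout is a
    hypothetical choice for this sketch, not a real data feed.
    """
    ratios = {name: usg / per for name, (usg, per) in players.items()}
    values = list(ratios.values())
    mu = mean(values)        # mean ratio across the supplied pool
    sigma = pstdev(values)   # population standard deviation of the pool
    if sigma == 0:
        raise ValueError("every ratio is identical; z-scores are undefined")
    return {name: (r - mu) / sigma for name, r in ratios.items()}

# Placeholder values only, NOT real player stats.
pool = {
    "Player A": (20.0, 20.0),  # ratio 1.0
    "Player B": (30.0, 20.0),  # ratio 1.5
    "Player C": (10.0, 20.0),  # ratio 0.5
}
z = ratio_z_scores(pool)
for name, score in z.items():
    print(f"{name}: z = {score:+.2f}")
```

Whether the pool should be the whole league or only starters is the open question from the post; the sketch just scores whatever pool it is handed.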
Sabermetrics didn't begin with someone randomly combining two statistics and "asking" if it was meaningful and then raging when people were dubious. Sabermetrics, whether you go with Bill James or Pete Palmer, began with people making logical guesses/contentions about baseball and then crunching vast amounts of data to see whether there was statistical evidence for them. Neither of them said "What if we divided RBI by batting average? Does that give us something? Help me out here, guys."
This is not true. Usage is an estimate that takes a player's shots, turnovers, etc., and divides them by the team's shots, turnovers, etc. Here's the official definition: Using Usage as a piece to a formula that attempts to measure Net Positive Impact isn't wrong. I'm making this argument separate from PapaG, because I want to make it clear that while I am not a professional statistical analyst (that's my wife) by job description, I am a BI engineer, so I wasn't just putting Legos together out of stats and wondering what I was coming up with. Impact tells a story (though I admit the stat isn't yet normalized because I'm not going to go through the effort of tracking every player with it). Anyway, I'm washing my hands of this thread.
Thanks. I assumed it was based on real data that is logged, but this makes it even more interesting - since this is much more of a team-based statistic - and not a real touches statistic - which makes its use even more ambiguous for determining a player's specific contribution. Still, even if the statistic specifically matches the real world - the meaning behind it is the same - and the inconclusive nature of it (was it a beneficial or a harmful touch?) makes using it with PER rather inconclusive (other than as an exercise for one's Excel skills). No, it is not. It just makes little sense to use it with PER in a rigid formula (multiplication) that has had lots of fine-tuning by a proper analyst using large data sets. I believe that my initial post on this was the one that touched on the way I would look at the combination - by looking at outliers for players that have a low Usage% but rather acceptable PER - as players that might not be used properly by their coaches. As a direct multiplication - no real value, imho.
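The outlier idea in the post above (low Usage% but acceptable PER) can be sketched as a simple filter rather than a formula. This is only an illustration: the function name `underused_candidates`, the data layout, and the sample numbers are hypothetical, and the default cutoffs are assumptions. PER is scaled so that the league average is 15, which is why 15.0 is used as "acceptable"; the 18.0 Usage cutoff is arbitrary.

```python
def underused_candidates(players, max_usage=18.0, min_per=15.0):
    """Return names with Usage% <= max_usage and PER >= min_per.

    `players` is a dict of name -> (usage, per). Both cutoffs are
    assumptions for this sketch; results are sorted by PER, highest first.
    """
    hits = [
        (name, usg, per)
        for name, (usg, per) in players.items()
        if usg <= max_usage and per >= min_per
    ]
    hits.sort(key=lambda row: row[2], reverse=True)
    return [name for name, _, _ in hits]

# Placeholder values only, NOT real player stats.
sample = {
    "Player A": (14.0, 17.0),  # low usage, good PER -> flagged
    "Player B": (28.0, 22.0),  # high usage -> not flagged
    "Player C": (16.0, 11.0),  # low usage but low PER -> not flagged
    "Player D": (17.5, 15.5),  # low usage, acceptable PER -> flagged
}
print(underused_candidates(sample))  # ['Player A', 'Player D']
```

A flagged name is only a prompt to look closer at how the player is being used; it says nothing on its own about whether more touches would help.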
You're essentially trying to do a guess-and-check multi-variable regression with these two stats. The problem is that doing a multi-variable regression with variables that are strongly correlated is usually not all that useful.
I would like to publicly point out that what I said about this subject is not trying to make fun of anyone - I just do not think the two statistics go together in any way that makes sense with a direct formula. I apologize if anyone thought otherwise.
I've been told that the variables are not correlated. It's actually more of a binary-variable equation, though, since I'm using the end result of two different equations, and not the variables that comprise each number. At this point, you're assuming that the two are not strongly correlated. It's my belief that more work could be done to see how strong a correlating relationship between the two results may be. We've already had one glaring incorrect assumption in this thread, and the two people who falsely used that data are still trying to spin that error.
Fine. Then why did you waste so much time arguing that, instead of actually addressing some of the results? BC's post yielded some interesting results, IMO?
You didn't even know what usage rate measured? Good God. I hope you learned a few things in this thread.
This is exactly what I said all along. I guess I just got carried away in the "Fuck this thread, no one answers me" and started to be more confrontational than needed. I still maintain that if you read my original posts in this thread, this is exactly what they say.
AS I UNDERSTAND IT (and I'm sure I'll be corrected if I'm the least bit wrong) Andalusian is right; because the constituent pieces of Usage and PER are pretty much the same, the two will correlate strongly. And, because they correlate strongly, they either amplify or cancel each other out, depending on your actions (multiply, divide, etc.). If two stats like that do correlate strongly, they basically should reduce out. It's not PER and Usage we should be looking at. There is this quest to find a super-advanced stat like QB Rating that gives you One Stat To Rule Them All, a single stat that encompasses both efficiency and real-world usefulness. PER alone isn't that stat. If such a stat existed, it would be made available from people who know what they're doing.
They don't, though. We've had a Usg/PER range of 0.94 (Batum) to 1.57 (Brandon Jennings) in the 6 or so players mentioned. I use those two as an example because they were within 0.4 of each other in PER (17.3 to 16.9). That's a big difference, and suggests that there isn't a consistent correlation. The question then becomes, what does this large difference mean?
Anyhow, thanks for the input from those trying to be sincere. I think there is something here. If not, no biggie.
It doesn't mean anything. It means that some players produce because they score or create scoring opportunities for others, while other players get rebounds and block shots. Why is this news? Ed O.
Why don't you just pull the PER and USG stats for every player and provide the correlation results? Wouldn't that be the easiest way to answer this? Or are you looking for somebody to write you a script to scrape the data and do it for you?
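The check suggested above is short to write once the numbers are in hand. Below is a minimal sketch of a Pearson correlation in plain Python. The helper name `pearson_r` and the two sample lists are assumptions for illustration, and the sample values are placeholders, not real player stats. Actual Usage and PER columns for every player would have to be pulled from a stats site and dropped in.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and each column's sum of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one column is constant")
    return cov / sqrt(var_x * var_y)

# Placeholder numbers only, NOT real player stats.
usage = [15.0, 18.0, 21.0, 24.0, 27.0]
per = [12.0, 15.5, 14.0, 19.0, 20.5]
r = pearson_r(usage, per)
print(f"r = {r:.2f}")
```

A value of r near 1 across the whole league would support the "strongly correlated" side, and a value near 0 would not. Note that even a high r does not force the Usage/PER ratio to be the same for every player, so the correlation and the ratio spread are two separate questions.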
A large statistical variation doesn't "mean anything"? Maybe not to you; I have a curious mind and wonder if Jennings's PER is inflated by a high usage rate, or if Nic is being used right by the coach, if Jennings is extremely overrated, and only gets his stats in an inefficient manner because of volume, if PER isn't the great comparative tool I thought it was prior to this exercise... all kinds of questions pop into my head. If you aren't interested, that's cool.
I provided links to somebody who did that earlier in this thread for last year. If you didn't click on the data, that's not my fault. It shows how ridiculous the entire 'correlation' argument is. It's rare to find a player with a 1/1 Usg/PER ratio.