Evaluating the Player Evaluators, Part IV (Showdown: Price Guide vs. Point Shares)

January 15th, 2009 by Mays
Categories: Price Guide

I hadn’t planned on running another scenario, but Rudy left a comment on the Price Guide vs. RotoTimes vs. Razzball post:

Interesting test. Simulated draft tests are difficult to do w/o introducing bias. [...] I would argue that creating four 800 IP teams is an artificial construct. You’re never going to see a league where four teams punt pitcher counting stats. I’d be interested to see how the test would go if you credited each team a realistic 5 starters and aimed for a 1200 IP avg per team.

Rudy’s point is a good one, and one that I acknowledged in both Part II and Part III. Since both of us are interested in what would happen in a more typical scenario, let’s do one more trial.

This test will be a showdown between 5 Price Guide teams and 5 Point Shares teams. It will be a Retro-Draft like last time, but I’m going to change it up to give Point Shares every possible advantage.

Start One Catcher Per Team
The first time around, I drafted for a league that starts two catchers. Since Rudy mentioned that Point Shares assume one catcher per team, let’s go with that this time. In fact, let’s use the complete league defaults from an ESPN league: 10 teams, 1 catcher, 5 OF, no CI, no MI. This is Point Shares’ home turf.

Update: Rudy mentions that, despite what was shown on the Razzball site, the Point Shares are built based on a league with 1 CI and 1 MI. You can check the comments to see the results when I add CI/MI (and a few other modifications).

Draft More Starters
To get the Price Guide teams to draft more starters this time, I requested dollar values pretending that the league required 6 SP and 3 RP (instead of 9 P like last time). Basically, I’m forcing the Price Guide to ignore what it has already demonstrated to be the optimal strategy and instead take the typical, more starter-focused approach. With this constraint in place, the Price Guide builds rankings so that each team should average 1,309 IP.

So how did things turn out?

LPP C 63.5
LPP E 62
LPP B 58.5
RB C 58
LPP A 57.5
LPP D 55.5
RB B 53
RB A 52.5
RB D 48
RB E 41.5

Razzball manages to sneak one team into the upper half of the standings, but otherwise the Price Guide ends up with the clear edge.
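
To put a number on that edge, here is a minimal sketch that adds up the standings above by system. The team labels and points are copied straight from the list; `totals_by_system` is just an illustrative helper name, and it assumes the prefix of each label ("LPP" for the Price Guide teams, "RB" for the Razzball teams) identifies the system.

```python
# Point totals copied from the standings above.
standings = {
    "LPP C": 63.5, "LPP E": 62.0, "LPP B": 58.5, "RB C": 58.0, "LPP A": 57.5,
    "LPP D": 55.5, "RB B": 53.0, "RB A": 52.5, "RB D": 48.0, "RB E": 41.5,
}

def totals_by_system(standings):
    """Sum roto points per rating system, keyed by the team-label prefix."""
    totals = {}
    for team, points in standings.items():
        system = team.split()[0]  # assumed: "LPP" = Price Guide, "RB" = Razzball
        totals[system] = totals.get(system, 0.0) + points
    return totals

totals = totals_by_system(standings)
for system, total in sorted(totals.items()):
    print(f"{system}: {total} points total, {total / 5} per team")
```

By this count the Price Guide teams total 297 points to Razzball’s 253. The combined 550 is what ten teams would split across ten categories at 55 points per category, which is consistent with standard roto scoring (an assumption on my part about how the standings were tallied).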

This time, the LPP teams drafted SP like they were supposed to, averaging 1,505 IP per team.

On the other hand, it was the Razzball teams who were loading up on RP! They averaged only 3 SP per team and 1,065 IP. (Unfortunately, the draft-lots-of-relievers strategy didn’t work as well for them as it had for the Price Guide earlier.)

If anyone is interested in the details, I’ve got the full draft results. I’d also be interested if anyone has any critiques of the method I’m using to compare the systems.

10 Responses to “Evaluating the Player Evaluators, Part IV (Showdown: Price Guide vs. Point Shares)”

  1. Nick says:

    @Rudy: I seem to remember that earlier this year when you released the Marcel Point Shares, while you included MI & CI in the calculations, you accidentally left them off of the header. Is it possible that you did the same thing with the 2008 Point Shares?

    @Mays: I’m wondering if including multiple teams that use the same rating system is having an impact on the results, possibly making each system less “unique” and masking various strengths and weaknesses. Maybe a league made up of only one each of the different raters, run several times and then presented with average order of finish, would better highlight the differences in methodology. Creating a 10 or 12 team league might be a challenge, but we could use final ranks from some other sites. I know Yahoo, for instance, doesn’t have a player rater, but does have a ranking system. I’d be happy to create a ranked list according to Yahoo to save you a bit of time. And if we need more teams, we could use some other list like preseason ADP or preseason rank by various sites. Obviously, the preseason lists don’t enjoy the hindsight that the other raters do, and would be expected to finish poorly, but if we’re just looking for other teams to fill out the league, I think it would be okay.

  2. Eric says:

    I’d be happy to volunteer the output from my system as one of the entrants as well, if you’re trying to fill up the league with unique systems.

  3. Mays says:

    @Nick: You are right that having multiple teams using the same rankings affects the results. In a real league, if a team punts a category they will almost always finish with 1 pt in that category. In these tests, if multiple teams (using the same source) “punt” a category, they will get an average of 2.5 pts, and one of them will get 4 pts.

    So that’s not exactly realistic; however, I think it should benefit all of the systems equally.

    I agree that one team per system would be the best way to do things, but you would need to repeat the draft multiple times to prevent any fluke results. Even a good system can end up too heavy in a certain category on occasion, due to nothing more than poor timing.

  4. Rudy Gamble says:

    Nick -
    You’re right. I forgot to put MI/CI on the header. I caught the same thing after downloading the draft results. I updated the headers on the 2008 Player Raters (no change to rankings though)

    Mays -
    Apologies for that mixup but I suppose it does satisfy any curiosity you may have for why my teams drafted MIs for 4 of 5 UTIL slots :)

    I hate to criticize since you’ve done all this work and it could seem like sour grapes but…

    1) My rankings are based on their most valuable position – which is listed in my player rater. The following players are ‘drafted’ in positions where they would be valued lower:
    a) Jorge Cantu (valued at 3B, placed at 1B)
    b) Miguel Cabrera (valued at 3B, placed at 1B)
    c) Alexei Ramirez (valued at 2B, placed at OF)
    d) Mark DeRosa (valued at 2B, placed at OF)
    e) Garrett Atkins (valued at 3B, placed at 1B)

    (The only LPP player out of most valuable position was Dunn at 1B vs. OF)

    2) When you look at the draft results, they are so far from realistic that the test loses credibility. I value each player as if he’s being added to an average team (with the average player at each position). I’ve got a team that drafted Rivera, Papelbon, Soria, and Lidge in draft picks 2 through 5 (oddly enough, it finished last!). Rivera is legitimate value (though Santana at #1 reduced the value of Rivera’s ERA/WHIP). Each reliever after that would have reduced value. I propose that certain throttles be set to avoid extreme results:

    - No more than 2 Pitchers in the first 3 rounds.

    - No more than 3 Pitchers in the 1st 5 rounds (and it can’t be 3 of the same so 2 SP/1 RP or 1 SP/2 RP)

    - UTIL can only be 1B/3B/OF/DH (ridiculous having a catcher or MI)

    3) Given the 5.5 SP/3.5 RP split on the Razzball Point Shares, I suggest you value players at 5 SP, 3 RP, 1 P.

    - No MI/CI/UTIL assignment except for DH until 4th round (avoids unrealistic double-ups like Pujols/Howard and Hanley/Reyes)

    - Players only placed at most valuable position (at least for Razzball teams).

    Regards,
    Rudy

  5. Rudy Gamble says:

    I agree with Mays that there isn’t any inherent disadvantage in there being 1 or 5 teams from a specific player rater as long as it is impartial. If you’re talking about a draft board that might be skewed toward a specific strategy, that would be different.

  6. Nick says:

    I think we need to revisit the exact question we’re trying to answer here. Is it in fact “which system most accurately values players?” or “which system is best used when drafting?”

    I think they’re two different questions.

  7. Mays says:

    @Rudy – I really appreciate you taking the time to respond here, and I apologize for leaving out CI/MI the first time around.

    I ran things again with the parameters that you suggested. Here are the standings that I get:

    RB C 58.5
    RB A 58.5
    LPP B 58
    LPP D 57
    LPP A 56
    RB E 55.5
    LPP E 54
    RB D 53.5
    LPP C 50
    RB B 49

    So maybe a slight edge for Razzball and a lot closer overall. (I think adding CI/MI made a bigger difference than the extra parameters.) The points actually split right down the middle with 275 for each system.

    But while things end up closer, the league still isn’t any more realistic than last time: The LPP teams finished 6-10 in almost every pitching category, and the Razzball teams finished last in nearly every hitting category.

  8. Mays says:

    Also, I’ve uploaded the full draft results.

  9. Rudy Gamble says:

    Hey Mays -
    Yeah, that sounds more realistic. Our rankings look pretty similar aside from Point Shares valuing starters more. Obviously a test like this takes it to an extreme even with parameters (like 5 SPs in the first 8 rounds).

    I think you’d need a more scientific test to test whether one of our rankings is slightly better.

    Thanks for taking the time to redo the test. Any way you can edit the post to account for the revised results?

    Regards,
    Rudy

  10. Confused says:

    This is the conclusion I’d come to: for leagues that are exactly the same as Rudy’s (like your site) and use real people to draft (no stupid decisions), both systems are equal. However, I give a major advantage to Last Player Picked for the fact that theirs is completely customizable.
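
A footnote on the punting arithmetic from Mays’s reply to Nick (comment 3): the figures there can be reproduced with a small sketch. It assumes standard roto scoring (last place in a category earns 1 point, next-to-last earns 2, and so on) and that the punting teams occupy the bottom slots with no ties among them. `punt_points` is a hypothetical helper written for this illustration, not part of either rating system.

```python
from fractions import Fraction

def punt_points(num_punting_teams):
    """Average and best points for teams filling the bottom slots of one category.

    Assumes standard roto scoring (last place = 1 point, next = 2, ...) and
    no ties among the punting teams.
    """
    points = list(range(1, num_punting_teams + 1))
    average = Fraction(sum(points), num_punting_teams)
    return average, max(points)

for k in (1, 4, 5):
    average, best = punt_points(k)
    print(f"{k} punting team(s): average {float(average)} pts, best {best} pts")
```

One punting team gets the single point you would expect in a real league. Four teams sharing the bottom of a category average 2.5 points, with the best of them getting 4, which matches the numbers in the comment. Under the same assumptions, five teams per system (as in this showdown) would average 3 points, with the best getting 5.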
