Measuring luck in Scrabble


Post by arenasnow » Tue Oct 29, 2013 5:56 pm

I have developed quite a knack for undertaking ridiculously overwrought analyses that do basically nothing with regard to actually improving my ability as a player, and well, here's another one.

For quite some time, I have been interested in trying to quantitatively separate the luck and skill factors in Scrabble, to determine which games I may have played less poorly than my opponent but lost, and which games I may have played better than my opponent and won. I addressed this briefly a couple of years ago in the post where I suggested equity percentage as a replacement for equity loss. Nowadays I don't see many people really talking about equity loss either. In the long run, I think that's correct: the object of the game is to win, so measuring differences in winning percentage between the top play and your play throughout the game would probably be a much more accurate way to determine how the game went.

In that post, I suggested possible equity as a proxy for luck: simply adding up how many points of equity you could have scored by playing your top equity play on each turn. I realized that was likely misguided as well. For instance, if you missed a bingo and kept bingo-prone tiles, you'd be very likely to have high-equity opportunities on your future turns. Possible equity can be as much a measure of poor skill (missing your big plays) as it can be a measure of luck. And yeah, I know "luck is where preparation meets opportunity" and all that crap. A lot of so-called "luck" can be exploiting opponents' weaknesses to give yourself good opportunities later in the game.

However, I've also been very annoyed for quite a while at how most people seem to measure luck simply by counting power tiles, when I knew it wasn't that simple. If you draw three S's on one turn, you weren't particularly lucky. If you draw a blank in the endgame on an extremely closed board, there's not much you can do with it, and again, you weren't particularly lucky. Or at least not as much as you would be on a wide-open board early in the game. I am at least advanced enough to know that tiles aren't worth the same amount at every part of the game. A blank in the beginning might be worth 30 or 35 points instead of 25; an S in the beginning might be worth 12 instead of 8. On a closed board, those values would probably be less than 25 and 8.

What REALLY matters of course is the synergy of your rack. ADEINOR is obviously better than any rack with three S's. Likewise, even something as lousy as DIMRUUV can probably be better on some boards than AEINORT if you actually have two spots for DUUMVIR but don't have any spots for the 8s with AEINORT (okay, I'm not sure I could come up with a situation that would support this, so maybe the example is too extreme).

Essentially, what I wanted to do was to measure luck by taking the difference between the value of a player's leave and the valuation of the player's subsequent rack. To determine how many points each entire rack was worth, I ran 500 2-ply simulations in Quackle of a passed turn with every rack I drew, then took the difference between that pass equity and the value of the previous leave.

For example, in my first game against John Fox in the September Syracuse tournament, I kept NRST from a rack of GNRRRST with a vowel-heavy pool remaining. The leave NRST is worth 13.1 points according to Superleaves. I drew AEQ on the next turn. The equity value of a pass after 500 simulations with that rack was 25.9, a gain of 12.8. I dropped the Q playing QI, leaving AENRST with a leave value of 38, then drew a U and bingoed. Despite my having a bingo on the turn, the equity value of the pass was 41, barely more than the 38 AENRST was worth. That indicates that most of my luck was drawing the A and E on the previous turn to accompany my NRST, not drawing the U on the subsequent turn to actually allow me to bingo, which I believe is correct. It also shows that with proper rack management, luck can build over several turns, which I also think is correct. Although rudimentary, I think this is a better measure of luck than just HAVING an S.

I followed this logic through the entire game and did so for each of my last 30 games, summing my luck over each game. Here are the results. Where two values are given as "X to Y", X is my luck and Y is my opponent's; I could only include a FEW of my opponents' luck values, since most people around here don't record their racks.
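The per-draw calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet or tooling actually used: the `Draw` class and the function names are made up for the example, and the leave values and pass equities are the figures quoted in the post (they would normally come from a leave table and from Quackle sims).

```python
from dataclasses import dataclass


@dataclass
class Draw:
    """One trip to the bag (hypothetical structure, for illustration only)."""
    leave: str          # tiles kept after the previous play
    leave_value: float  # static value of that leave, e.g. from a leave table
    rack: str           # full rack after drawing
    pass_equity: float  # simulated equity of passing with the full rack


def draw_luck(d: Draw) -> float:
    """Luck of one draw: what the full rack is worth minus what the leave was worth."""
    return d.pass_equity - d.leave_value


def game_luck(draws) -> float:
    """Cumulative luck: the sum of the per-draw differences over a game."""
    return sum(draw_luck(d) for d in draws)


# The two draws from the John Fox example, using the numbers quoted in the post.
draws = [
    Draw(leave="NRST", leave_value=13.1, rack="AENQRST", pass_equity=25.9),
    Draw(leave="AENRST", leave_value=38.0, rack="AENRSTU", pass_equity=41.0),
]

for d in draws:
    print(f"{d.leave:>6} -> {d.rack}: {draw_luck(d):+.1f}")
print(f"total over these two draws: {game_luck(draws):+.1f}")
```

Most of the total comes from the first draw, which matches the point that the A and E were the lucky part, not the U.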

Kevin Gauthier (L 351-373) -118.0
Heather Drumm (W 477-348) +89.6
Hani Khouri (W 522-248) -52.2
Jason Broersma (L 274-422) -106.3 to -9.9
Roberta Borenstein (W 462-286) +119.3
Greg Fox (W 536-377) +151.5
William Pizer (W 381-354) -104.2
Jason Broersma (L 389-421) +47.0 to +4.2

Daniel Citron (L 355-407) -11.2
Joan Tondra (L 360-369) +45.5
Mark Goodman (L 380-397) +93.8
Barbara Epstein (W 478-293) +26.1
Matthew O'Connor (W 431-347) +67.6 to +29.2
Daniel Citron (L 329-380) -71.7
Daniel Blake (W 373-352) +3.0

Karl Higby (L 399-447) +66.9
Morris Greenberg (L 410-436) +57.5
Sue Tremblay (L 361-383) -86.9
Matthew O'Connor (L 362-425) +8.3
Joseph Bowman (W 448-391) +21.3
Denise Dixon (W 464-275) +15.4
Kevin Gauthier (W 505-299) +52.1
Kevin Gauthier (W 413-343) +40.4

Karl Higby (W 375-319) +26.9
Matthew O'Connor (L 391-471) -8.7 to +70.7
Ted Rosen (L 377-486) -91.2
Shubha Kamath (W 531-349) +40.5
John A Fox (W 396-306) -38.3
Ted Rosen (W 448-394) +20.0
John A Fox (W 357-337) -113.7

Based on these results, I quickly came to the conclusion that what was being measured here probably is correlated with luck but hardly seems to be any sort of perfect measure of it. In my 30 games, I went 13-6 when I had positive luck and 4-7 when I had negative luck. That alone might indicate that on the surface there is something here, but several of the results seem bizarrely anomalous, especially my luck value of -52.2 in my 522-248 win over Hani Khouri. While I would be expected to win against three-digit-rated players even with atrocious luck, it seems unfathomable that anyone could possibly consider any 500 game as 'unlucky'. In 3 of the 4 results where I had both players' racks, the luckier player did win, and in the 4th game, Jason Broersma won a close game with slightly less luck which presumably indicates he just played better, which is understandable since he's a better player.
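For anyone who wants to check the win/loss split, here is a small Python tally over the 30 results listed above. The values are copied from the list (my own luck value only, ignoring the opponent's where one was recorded); the function name is just for this sketch.

```python
from collections import Counter

# (result, my luck value) for the 30 games, in the order listed above.
games = [
    ("L", -118.0), ("W", 89.6), ("W", -52.2), ("L", -106.3),
    ("W", 119.3), ("W", 151.5), ("W", -104.2), ("L", 47.0),
    ("L", -11.2), ("L", 45.5), ("L", 93.8), ("W", 26.1),
    ("W", 67.6), ("L", -71.7), ("W", 3.0),
    ("L", 66.9), ("L", 57.5), ("L", -86.9), ("L", 8.3),
    ("W", 21.3), ("W", 15.4), ("W", 52.1), ("W", 40.4),
    ("W", 26.9), ("L", -8.7), ("L", -91.2), ("W", 40.5),
    ("W", -38.3), ("W", 20.0), ("W", -113.7),
]


def record_by_luck_sign(results):
    """Count wins and losses separately for positive and negative luck values."""
    tally = Counter()
    for result, luck in results:
        sign = "positive" if luck > 0 else "negative"
        tally[(sign, result)] += 1
    return tally


tally = record_by_luck_sign(games)
for sign in ("positive", "negative"):
    print(f"{sign} luck: {tally[(sign, 'W')]}-{tally[(sign, 'L')]}")
```

This reproduces the 13-6 and 4-7 records. Thirty games is a small sample, so the split is suggestive at best.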

What REALLY makes these results more or less worthless is something I should have realized before I started doing this, and I WAS certainly aware of this and stated it above. Leave values CHANGE. Several of the games I won with very low luck values (especially the William Pizer and John Fox games) were ugly, defensive slugfests. There were few opportunities for good plays on either side for sizable portions of both of those games, hence even racks that would normally be above average weren't worth NEARLY as much as they would be on an open board. There are few sets of letters so synergistic that they would have successfully done much to open either of those boards, so you could probably pass with AEINRST every time and almost get a negative evaluation, and CERTAINLY a below average evaluation compared to the expected value of the previous leave. The condition of the board clearly has to be figured in when evaluating luck, so these values don't necessarily mean anything out of context. Maybe an open board is ALWAYS going to tend towards positive luck values since there are more possibilities for plays than on an average board, and a closed board is ALWAYS going to tend towards negative luck values. Hence the only way this might mean anything is if you compare both players' luck values, but unfortunately most players don't record their racks, so... Another thing I did that may decrease the accuracy is that I did not count endgame leave values. I know Quackle calculates endgame evaluations differently from evaluations in the early game and pre-endgame so I didn't think this crude analysis would be meaningful in an endgame, especially since I'm evaluating leaves based on passes and a pass is going to be worth MUCH, MUCH LESS in the endgame than before it.

Some of the anomalies are more easily explained. In my Mark Goodman game, I had subzero luck most of the way then drew the bingo STOGEYS which I didn't know (lol, yes, I now know it's common...this just shows how out-of-it culturally I am...) with very few tiles remaining, hence that one play by itself had an evaluation of 80 points or something. One interesting thing I noticed is how often passing with a bingo rack had a higher evaluation than playing almost anything besides the bingo. Perhaps passing might actually be a decent strategy when you know you have a bingo rack but need time to think of what it is and have very little time left?

I think my fundamentals on this are largely correct, but it's pretty worthless to even attempt doing this unless you have both players' entire racks for the whole game and it can take so long to do this that there definitely gets to be a diminishing marginal utility (which is why I didn't do this for my first three tournaments). I don't think it's worth even attempting to do this in Quackle unless you're good enough to evaluate how many points a particular leave gains/loses on an open/closed board, which I am definitely not good enough to do yet. Regardless, this may be something that is worth more research in the future if people REALLY want to try to break down luck and skill. Josh Sokol was a bit of a skeptic about ME doing this considering how much stuff I miss, saying that measuring luck when you miss lots of things is fruitless, but I don't agree with him. If you miss a bingo, chances are good you still have a good leave. You can draw either good or bad tiles that will either be synergistic with your leave or not. I think doing it this way truly does measure luck, not poor play, as something like possible equity would reflect, but the main problem is that many players even at the STEE level don't record racks and that leave values are dynamic, not static. Elise probably does a better job of reflecting that, but I wasted way too many hours on this analysis even in Quackle, and for this project, I chose finishing it more quickly over accuracy, since trying to EXACTLY quantify luck is a ridiculous pursuit, but I thought in my case it was probably necessary to attempt to do this since I credit all my wins to luck and all my losses to inferior skill, which I KNOW is an unhealthy attitude towards Scrabble. If I could just evaluate who ACTUALLY played better canceling out luck I might have a healthier attitude toward games. That was the only reason I attempted to do this, but I'm now convinced there really ISN'T a perfect way to do it. 
I'm even more convinced that just listing S's and blanks is too crude though.

My other main question about this analysis is whether it actually is measuring luck or skill. In most cases, I had positive luck against lower-rated opponents (of course, since none of them recorded their racks, I can't say that I had MORE luck), but a lot of that could be that I save superior leaves that will be synergistic with letters remaining in the pool. Like when I exchanged against John Fox (who was higher-rated than me before, but not after, that tournament) I would not have saved so many consonants (NRST) if the pool wasn't so vowel-heavy. Is it then really LUCKY to draw a couple of vowels, or is it simply a skillful decision? In a lot of cases, I'm sure this did measure dumb luck but some of luck in Scrabble is certainly setting up your racks so that they will be synergistic on future turns. This leads me to believe it's actually impossible to separate luck and skill and maybe it's not even worth trying.

Did I gain anything from this strategically? Yeah, probably. It probably gave me more insight on how leave values change during a game, and I tried to predict how much each leave was worth before I looked it up. In a lot of cases I was dead on; in some, I was very much not. Some actually shocked me (AETT has 0 leave value? I'm SHOCKED that's not like +2 or something...duplication must be much worse than I think it is). I still think in the grand scheme of things it was a colossal waste of time but once I spent as many hours on this as I did, I figured I HAD to share the results.
Posts: 71
Joined: Wed Dec 31, 2008 12:42 pm
Location: Syracuse, NY

Re: Measuring luck in Scrabble

Post by kev10293 » Wed Oct 30, 2013 6:09 pm

Interesting read. Let me just say, I don't really know...

I have a question though... what is meant by a "STEE" player?

I experimented with measuring luck a while ago. Basically, I just took the equity difference between my plays and the sim's top plays and compared it with the difference in total scores. So if the sim estimated -20 and I lost by 30, I would assign that loss to bad luck. On the other hand, if the sim estimated -20 and I won by 30, I would assign that win to skill... but really, how skillful is it to make 20 points' worth of mistakes? Not very. My reasoning was "I would have won anyway" or "I would have lost anyway." Not very meaningful, but it at least gave me a bar to measure myself against.
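As a rough sketch of that rule in Python: the function name and labels are invented for the example, and the third branch (a loss smaller than the points given up) is not covered in the post, so how it is labelled here is purely an assumption.

```python
def classify_result(margin: float, equity_loss: float) -> str:
    """Crude luck/skill label for one game.

    margin:      final score difference (positive = win, negative = loss)
    equity_loss: total points given up versus the sim, as a positive number
    """
    if margin > 0:
        # Won despite the mistakes: "I would have won anyway."
        return "skill"
    if margin + equity_loss < 0:
        # Even giving back every lost point, the game is still a loss:
        # "I would have lost anyway."
        return "bad luck"
    # ASSUMPTION (not in the post): the mistakes were larger than the margin.
    return "own mistakes"


print(classify_result(margin=-30, equity_loss=20))  # lost by 30 after 20 points of errors
print(classify_result(margin=30, equity_loss=20))   # won by 30 after 20 points of errors
print(classify_result(margin=-10, equity_loss=20))  # the case the post does not cover
```

Note that this only looks at one player's errors and simply adds them to the margin, which is exactly the simplification questioned in the reply below.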

In reality, measuring luck against Quackle makes very little sense. It is the better player, therefore every time I win it is due to luck (and actually I think this is a very healthy attitude to have, even if it is a bit perfectionistic). Even if I make 0 mistakes, it is only because I drew tiles that I knew how to use.

I don't presume to be better than Quackle or anyone else. I simply try to make the best move. If I make 0 mistakes then I guess I deserved the win - otherwise, I guess I didn't deserve it. No one really deserves anything, particularly not in a game that is so heavily influenced by luck.

Just my opinion. I don't think measuring luck has any meaningful value. You are certainly right though that tracking power tiles is not very meaningful. I get annoyed when people do this as well. On average, it's a good measure, but not from game to game.
Posts: 133
Joined: Tue Jul 31, 2012 7:25 pm

Re: Measuring luck in Scrabble

Post by arenasnow » Thu Oct 31, 2013 2:43 pm

STEE - Sixteen-hundred To Eighteen-hundred Expert. Kind of an inside joke, coined by Cesar del Solar.

I would say having an equity loss of -20 is very skillful compared to most others. Perhaps not compared to top 25 players or to theoretical perfection... I think it's very conceivable you could have an equity loss that large and still be making the right plays if you're good enough to know when the sim is wrong. I'm not sure you can simply add equity loss to your score differential to determine whether you SHOULD have won or lost, especially without considering the tiles your opponent had. Essentially, what you were trying to do was very similar to my initial idea: measuring possible equity. Let's say you lose a game by 15 points but you had an equity loss of -20 but your opponent had an equity loss of -100. According to your model, you would count that as losing due to skill, but your opponent was playing considerably worse than you! I'd call that luck! (And I still think equity percentage should completely replace equity loss in the Scrabble literature). You deserve the win if you make fewer mistakes than your opponent. I don't think you need to play a theoretically perfect game to deserve a win.
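To make the "fewer mistakes than your opponent" rule concrete, here is a minimal Python sketch. It assumes you somehow have a total equity loss for both players (which requires both sets of racks), and the function name and labels are hypothetical. It deliberately does not try to add equity loss to the score differential.

```python
def verdict(margin: float, my_equity_loss: float, opp_equity_loss: float) -> str:
    """Label a result by comparing both players' total equity loss.

    margin: final score difference from my side (positive = I won).
    Equity losses are given as positive numbers of points.
    """
    outplayed_opponent = my_equity_loss < opp_equity_loss
    won = margin > 0
    if won and outplayed_opponent:
        return "deserved win"
    if won:
        return "lucky win"
    if outplayed_opponent:
        return "unlucky loss"
    return "deserved loss"


# The example above: lose by 15 with 20 points of equity loss
# while the opponent gave up 100.
print(verdict(margin=-15, my_equity_loss=20, opp_equity_loss=100))
```

A tie in score or in equity loss falls on the "not won" / "not outplayed" side here, which is an arbitrary choice for the sketch.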

Okay, if I'm reading you correctly what you did is measure luck in games against Quackle, not against a human opponent, assuming 0 equity loss for Quackle. That doesn't particularly interest me much because I simply do not believe luck doesn't play a role between imperfect opponents, and my attempt here was to measure luck between imperfect opponents (the two best players I have played - Ben Schoenbrun and Matthew O'Connor - will both admit they're nowhere near perfect). Further, Quackle is a better player than you and much better than me, but I do believe there are a handful (maybe a handful of handfuls) of players who are better than it is. Not sure how many players are better than Elise.

In my opinion, because I am a pretty emotionally weak player, going into a game assuming I deserve to lose because I play imperfectly even though there is the potential that I will play less imperfectly than most of my opponents is the wrong attitude to bring, and it definitely detracts from winning. It may be accurate, and it may help a few players to motivate themselves, but as self-loathing as I am, bringing my natural attitude that I deserve to lose at everything in life into Scrabble cannot possibly help me, especially 'cause I'm still an intermediate and I need to focus on the game itself and cut any emotional crap out of it (something I am very skillful at doing in competitive typing and need to find a way to bring to this). The first step is to make fewer mistakes than my opponents. Aiming for theoretical perfection comes much, much later if I ever choose to get that obsessed at all. I'd probably be content to be a STEE anyway.

I've also grown to believe that measuring luck has little value since I now don't think it's very quantifiable, but I STILL want to figure this out so I can quantify who played less poorly. Sure, if it's two of Nigel, Adam, and Dave playing, the luckier player will win. But if it's two imperfect players - the 1600 should be able to beat the 1300 player most of the time, even if the 1300 player has better luck. But it's also possible that the 1300 player may have easier decisions in a game or may make fewer mistakes (just unlikely). Measuring yourself against theoretical perfection is one thing - I'm nowhere near there yet but you might be. I just want to measure myself against my opponents. I definitely think it's possible if both players record their racks, but the vast majority of imperfect players don't do that, so meh, screw it.
Posts: 71
Joined: Wed Dec 31, 2008 12:42 pm
Location: Syracuse, NY

Re: Measuring luck in Scrabble

Post by kev10293 » Fri Nov 01, 2013 3:28 am

Yeah, your point is valid. If I make fewer mistakes than my opponent yet still lose, then I was technically unlucky to lose. However, I feel this is somewhat of a defeatist attitude. I try to go in aiming to play perfectly. This way, even falling short of perfect is still pretty high. If a game was winnable, even if I did make fewer mistakes than my opponent, I am going to consider it a "badly played game." Probably pretty terrible for my self-esteem, but I think most of the top players in any sport or endeavor have to think this way. Top performers in any sport/hobby are invariably hard on themselves. Trust me, I'm pretty easy on myself actually. But I'm also a realist. I recognize that I am lucky to beat anyone rated higher than I am.

All your points were good. I'm just not sure of the analysis. I feel that the ranking is accurate enough that this quantitative measure of luck doesn't even need to be performed. If I win against a higher-rated player, I can assume I was lucky. If I lose to a lower-rated player, I can assume I was unlucky (but I don't like to entertain this thought much since it can lead to anger and bitterness).

If you know nothing about the opponent's rating, then these sorts of analyses might be beneficial. The math is too difficult for me though. Not sure how to compute it. It's more detailed than anything I have ever done.
Posts: 133
Joined: Tue Jul 31, 2012 7:25 pm

Re: Measuring luck in Scrabble

Post by kev10293 » Fri Nov 01, 2013 3:35 am

Also, I was a STEE for a long time. I probably still am at heart since the rating system has changed.

I play far from perfect Scrabble. I don't know how many mistakes I make a game, but I can tell you it is more than 20 points' worth on average. Some games I make upwards of 100 points of mistakes. It's really pretty terrible actually. And yet, I guess I can be happy with the fact that I am better than most players out there. Scrabble is not an easy game.

I think there are a lot of lower players who are underrated. The new system seems to favor those who already have a high rating. I think it is more difficult to get your rating up than to maintain it.
Posts: 133
Joined: Tue Jul 31, 2012 7:25 pm

Re: Measuring luck in Scrabble

Post by snowplow48 » Fri Nov 01, 2013 12:35 pm

You could just compare the best play using your actual draws against a large sample of random draws. That would account for position, leaves, and even fishing (i.e., if you already have a very strong leave, most draws may result in bingo possibilities and thus be close). Would be tedious by hand but easily coded for open-source programs.
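A minimal Python sketch of that idea, under loud assumptions: `toy_evaluate` is a stand-in for "equity of the best play with this rack on this board" and is NOT a real engine call (a real version would query an open-source engine for the actual position), and the tile values in it are just the rough blank and S figures mentioned earlier in the thread. The unseen pool is whatever the player could have drawn from, including the tiles actually drawn.

```python
import random
from statistics import mean

# Placeholder values only: blank ~25 and S ~8 as quoted earlier in the thread.
TOY_VALUES = {"?": 25.0, "S": 8.0}


def toy_evaluate(rack: str) -> float:
    """Stand-in for the best-play equity of a rack; ignores the board entirely."""
    return sum(TOY_VALUES.get(tile, 0.0) for tile in rack)


def sampled_draw_luck(leave, drawn, unseen, evaluate, samples=2000, rng=None):
    """Actual draw's evaluation minus the average over random draws from the pool."""
    rng = rng or random.Random()
    pool = list(unseen)
    actual = evaluate(leave + drawn)
    baseline = mean(
        evaluate(leave + "".join(rng.sample(pool, len(drawn))))
        for _ in range(samples)
    )
    return actual - baseline


unseen = "AAEEIIOOUNRT?S"  # hypothetical unseen pool, including the drawn tiles
lucky = sampled_draw_luck("NRST", "?AE", unseen, toy_evaluate, rng=random.Random(0))
unlucky = sampled_draw_luck("NRST", "AEI", unseen, toy_evaluate, rng=random.Random(0))
print(f"drew ?AE: {lucky:+.1f}   drew AEI: {unlucky:+.1f}")
```

Because the baseline is computed from the same leave on the same board, a strong leave raises the baseline too, which is the point about fishing: a good draw onto a good leave does not count as much luck.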
Posts: 32
Joined: Wed Jul 27, 2011 7:15 pm

Re: Measuring luck in Scrabble

Post by codehappy » Fri Nov 01, 2013 3:02 pm

I've tried quantifying luck in Scrabble several different ways -- I think this is an interesting question.

Consider this game, for example. Who was luckier during the game, Elise or Quackle?

One way you could try and determine that is simply to make some estimate of the value of the tiles (or full racks) that each player held, and compare them. If you approach it that way, you'll decide that Quackle was considerably luckier than Elise -- although Elise had a bingo sitting on its starting rack, Quackle got both blanks, and Elise had two racks late in the game (BBCDFIN and BCFLMMN) that are so bad that "exchange all" was either its best or second-best move.

Another way is to consider each player's rack leave from the previous move, and determine how many possible rack draws scored better than the actual rack draw. This effectively counts, not the average value of the tiles, but actual scoring opportunities on the board. If you do that, you would conclude that Quackle and Elise were nearly even on "draw luck" -- they each made seven draws from the bag better than the median draw, and neither player had substantially luckier draws. Even though Quackle drew both blanks, it was not actually very lucky, because for several turns it was unable to bingo with the blanks it drew.
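A small Python sketch of this second approach, with the same caveat as any toy example: `toy_evaluate` below is a placeholder that only scores blanks and S's and ignores the board, whereas the real calculation would use the best available play on the actual board for each possible rack. The function and variable names are invented for the illustration.

```python
from itertools import combinations
from statistics import median

# Placeholder values only; a real evaluator would score actual plays on the board.
TOY_VALUES = {"?": 25.0, "S": 8.0}


def toy_evaluate(rack: str) -> float:
    """Stand-in for the best score available with this rack."""
    return sum(TOY_VALUES.get(tile, 0.0) for tile in rack)


def draw_rank(leave, drawn, unseen, evaluate):
    """Return (fraction of possible draws that beat the actual draw,
    whether the actual draw beat the median draw)."""
    actual = evaluate(leave + drawn)
    # combinations() treats each physical tile as distinct, so duplicated
    # letters in the pool are weighted by how many copies remain.
    scores = [
        evaluate(leave + "".join(combo))
        for combo in combinations(unseen, len(drawn))
    ]
    better = sum(score > actual for score in scores)
    return better / len(scores), actual > median(scores)


unseen = "AAEEIOU?SQ"  # hypothetical 10-tile unseen pool
fraction_better, above_median = draw_rank("NRST", "?A", unseen, toy_evaluate)
print(f"{fraction_better:.3f} of draws were better; above median: {above_median}")
```

Counting how many of a player's draws come out above the median over a game gives the "seven draws better than the median" style of comparison. Exhaustive enumeration is fine for small draws; for a full seven-tile draw from a large pool, sampling would be the practical substitute.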

A third way to approach it is simply to consider that Quackle had both blanks and outbingoed Elise, substantial advantages, but Elise won the game anyway. Even accounting for a skill difference between the players, anytime you beat a very strong player without blanks, there must have been a fair bit of luck involved, and Elise must have gotten more important "breaks" during the game than Quackle did.

These three approaches are all sensible in different ways, but they also come to three completely different conclusions: Quackle was luckier, they were about equally lucky, Elise was luckier. Not one of them really captures everything that we might mean by 'lucky' in Scrabble play. A better definition of what we're looking for is probably in order...

The first approach is logical but doesn't take into account the board at all -- certainly a skillful player will try to make play on the board better for his own tiles, or worse for his opponent's tiles, than it would be on average.

The second approach is also logical, and considers the actual board position, but it still may fold some skill into its luck calculation: if a player has tiles that should score well but don't on the current board, is that all bad luck, or because he didn't open the board for his nice rack when he should have, or because his opponent made good defensive plays? This approach does make some attempt to separate those things from 'luck', because it looks at scoring opportunities available on the current board, so it is taking into account the current open- or closed-ness of the board already, but if the opponent closed the board on his last turn -- who's to say that wasn't due to skillful anticipation of the player's next play?

The third approach is also logical, but it doesn't actually attempt to measure luck based on play. It's basically skill-agnostic, and assumes that any large divergences in results from expectations are significantly due to luck. There is probably some truth in that, but it makes no attempt to quantify it in any way.
Posts: 46
Joined: Tue Jun 11, 2013 1:56 am
Location: Pacific Northwest, USA

Re: Measuring luck in Scrabble

Post by cesar » Sun Nov 03, 2013 7:19 pm

top 20 players make about 30-40 points of equity loss per game (i think it starts decreasing sharply around top 5)
Posts: 101
Joined: Wed Dec 31, 2008 5:04 pm
Location: Los Angeles

Re: Measuring luck in Scrabble

Post by raima55 » Fri Oct 24, 2014 1:35 am

Also, I would stray away from squamae brown's advice to just trust Quackle's valuation on opening racks. First of all, squamae brown, what do you mean by valuation? Do you mean a sim or speedy player? If you mean speedy player, that definitely doesn't make sense, because a lot of the time you can make a good setup on the opening rack. Another good example is the opening rack of FOOOTTT. Here, exchanging seven is right because playing keeps you with a poor leave and gives your opponent a board with many easy resources. Yet speedy player and sim both prefer TOFT or FOOT.
Posts: 1
Joined: Fri Oct 24, 2014 1:34 am
