Playing a toy poker game with Reinforcement Learning
Reinforcement learning (RL) has had some high-profile successes lately, e.g. AlphaGo, but the basic ideas are fairly straightforward. Let’s try RL on our favorite toy problem: the heads-up no limit shove/fold game. This is a pedagogical post rather than a research write-up, so we’ll develop all of the ideas (and code!) more or less from scratch. Follow along in a Python3 Jupyter notebook!

Games, Strategies, and GTO Strategies
This is Part 1 of 6 of an adaptation of my chapter “Game Theory Optimal Strategies: What Are They Good For?” from Excelling at No-Limit Hold’em edited by Jonathan Little.
Much of the reason I wrote Expert Heads Up NLHE was to explain the ideas of game theory, poorly understood in the community at the time, to the average poker player. Heads up no limit (HUNL) is my game of choice personally, so it made sense to use it as the primary example. However, HUNL is something of a simple case, and there’s a bit more to be said about how game theory applies to other games. In this chapter, I’ll give a quick introduction to game theory as it applies to a variety of common poker formats. We’ll see when it’s useful, and more importantly, when it’s not – when it’s appropriate to use game theory-inspired strategies, and when it just can’t really guide our play. I promise to cover a practical skill or two as well.

Solving the Shove/fold Game with TensorFlow
Google recently open-sourced TensorFlow (website, whitepaper), a software package primarily meant for training neural networks. However, neural nets come in all shapes and sizes, so TF is fairly general. Essentially, you can write down some expression in terms of vectors, matrices, and other tensors, and then tell TF to minimize it.
I ran through a couple of their very well written tutorials and then decided to try it out on one of my standard toy problems: the HUNL shove/fold game.

Running it up, Part 3
We doubled up twice – time for round 3!

Running it up, Part 2
We doubled the roll once, can we do it again?

Running it up, Part 1
Tonight, Carbon was down, but Black Chip Poker gave me a few dollars to play with, so I’m going to try to run it up.

Value categories in C++11
One of the most important additions to C++ in the C++11 standard was the introduction of movable types. This feature has consequences for many common programming tasks such as assigning variables and passing arguments to or returning objects from a function. Move semantics are a bit subtle, and when reading documentation, it helps to understand some vocabulary: value categories.

EDVis v1.1
Changes from v1.0 to v1.1:
- Control fractions of individual hand combos
- View and set fractions of hands of a particular suit
- Account for card removal effects when drawing the distributions

Debugging
Pretty much any nontrivial piece of software will have bugs during development. Fixing bugs is thus an unavoidable part of programming, and it’s important that all programmers have some skill at the task. I recently made a video series about developing some poker-related software. The focus was on the problem domain, but much of the audience was new to programming, and I didn’t talk too much about what to do when things don’t go perfectly, i.e. when there are bugs. So, this post is a quick intro to debugging methodology in general, but I have my poker audience in mind.

Arbitrarily-deep nested loops
I finished a first pass at my lattice regression library over the weekend. The idea with that is pretty straightforward. Essentially, there’s some function we want to model, and it’s unknown, but we have a bunch of observations of inputs and corresponding outputs. So, we throw down a lattice (i.e. a regularly-spaced grid) of points over the space of inputs, and we use the data to “learn” some values of the function at the lattice points. Then, we discard the training data but can predict new values of the function by interpolating between the values at the lattice points. For more details, see, e.g. this paper.
Code-wise, one challenge of the project was in representing and dealing with the lattice. For example, suppose the function we want to model has 4 inputs. Then, our learned values on a grid over the space of inputs might naturally be stored in something like a 4-D array,

Eclipse: Computing Git status for repository LatticeRegression
I’ve gotten a lot of value out of Eclipse CDT over the years, but I wish it was less buggy. And the UI could be better. Anyway, today I open my laptop (on an airplane), start a new C++ project, and soon notice (thanks to a battery indicator reading under 2 hours time left) that Eclipse is using 350% of my CPU. I check the Progress tab and see that Eclipse is “Computing Git status for repository LatticeRegression”.

Job fairs and SWE interviews
I have a different perspective on the job search thing now that I’ve successfully done it once and seen things from the other side. I manned a booth at my alma mater’s job fair recently and didn’t think most students asked the right questions. Ideally, almost all of a job fair conversation should consist of the student telling me what he’s good at and passionate about in as straightforward a way as possible (there’s no need to be modest or subtle). Reading resumes is mind-numbing work; if I have to use a lot of imagination to see you as a successful candidate, you’re likely to be disappointed. If you do ask questions, and I do the talking, you might as well take the opportunity to try to get as much valuable information as possible.
Apr 27, 2014 Though Will Tipton may not be the most well-known name in online poker, he’s certainly achieved a great deal of success at the virtual felt.
I wrote Expert Heads Up No Limit Hold’em, a two-volume work motivated by the desire to make the fundamentals of game theory accessible and useful to poker players. The focus is on heads up (i.e. two-player) play, because that’s my specialty, and it’s where a lot of the theory is most useful, but a lot of the ideas covered apply to ...
Will Tipton: Sure, so I started out playing small home games with friends in college. At first I had no idea what was going on, but I borrowed a copy of Harrington on Hold’em and must have finished it in a day or two.
Will Tipton Poker Rules
There are 22100 possible flops in Hold'em, of which 1755 are strategically different. This is quite a big number, and it makes approximating preflop EV for chosen spots, as well as the upcoming preflop solving, quite a difficult task. One needs a few terabytes of RAM to fit even very simple games (assuming full postflop play), and using disk storage slows things down very significantly.
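Both counts can be checked by brute force. The sketch below assumes "strategically different" means distinct up to a relabeling of the four suits; the card encoding and the helper name `canonical` are illustrative choices, not anything taken from the solver.

```python
from itertools import combinations, permutations
from math import comb

# Cards as (rank, suit) pairs: 13 ranks x 4 suits. Encoding is illustrative.
DECK = [(rank, suit) for rank in range(13) for suit in range(4)]
SUIT_PERMS = list(permutations(range(4)))  # all 24 suit relabelings


def canonical(flop):
    """Smallest representation of a flop over all suit relabelings."""
    return min(
        tuple(sorted((rank, perm[suit]) for rank, suit in flop))
        for perm in SUIT_PERMS
    )


all_flops = list(combinations(DECK, 3))
distinct_flops = {canonical(flop) for flop in all_flops}

print(len(all_flops), comb(52, 3))  # raw flops: C(52, 3)
print(len(distinct_flops))          # flops distinct up to suit relabeling
```

The same total follows by hand: 286 unpaired rank combinations times 5 suit patterns, plus 156 paired combinations times 2 patterns, plus 13 trips boards.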
It's no surprise, then, that players and programmers are attracted to the idea of simplifying the game a bit. One natural idea is to reduce the number of flops from 1755 to something more manageable, hoping that the preflop results stay approximately the same. The first publicly known attempt to do that was described by Will Tipton, here:
CLICK for 2p2 discussion about subset of flops
Tipton's method was based on first defining conditions a good subset must satisfy and then finding the minimum-size subset that satisfies all of them. Example conditions are the frequency of every card appearing, the frequency of any given pair being top pair, etc.
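To make one such condition concrete, here is a minimal sketch of a rank-frequency check. It is not Tipton's actual procedure: the function names are hypothetical, the candidate subset is an arbitrary placeholder, and flops are left unweighted for simplicity.

```python
from itertools import combinations

# Cards as (rank, suit) pairs; this encoding is an illustrative assumption.
DECK = [(rank, suit) for rank in range(13) for suit in range(4)]


def rank_frequencies(flops):
    """Average number of cards of each rank per flop."""
    counts = [0] * 13
    for flop in flops:
        for rank, _suit in flop:
            counts[rank] += 1
    return [count / len(flops) for count in counts]


def max_rank_deviation(subset, reference):
    """Largest per-rank gap between a subset and the reference set."""
    target = rank_frequencies(reference)
    actual = rank_frequencies(subset)
    return max(abs(a - t) for a, t in zip(actual, target))


all_flops = list(combinations(DECK, 3))
full = rank_frequencies(all_flops)  # every rank averages 3/13 cards per flop

# Arbitrary placeholder subset (every 200th flop), NOT a tuned subset.
candidate = all_flops[::200]
print(max_rank_deviation(candidate, all_flops))
```

A condition of this kind would then read "the deviation must stay below some tolerance", and the search looks for the smallest subset that meets every such condition at once.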
While this method makes sense and has been improved upon by others since the original publication, we have chosen a somewhat different road. Our method is based on defining a metric that scores how good a subset is and then running a solver of sorts to find the subset of N elements that scores best on that metric. The metrics used are equity (against the full range, against 50% of the range, against AA, etc.) as well as EVs computed on the full set of 1755 flops, which we got access to thanks to several of our users who had run high-volume analyses before (see the credits at the bottom of this post).
The algorithm starts from a random subset and 'evolves' it at every iteration. We used a random walk approach: at every step the subset is mutated in some way, and if an improvement is found, the new subset becomes the current one. Rinse and repeat. The real EV results we had were divided into a training set and a testing set, so that the same data is never used for both training and testing.
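The accept-if-better loop can be sketched as follows. This is only an illustration under stated assumptions: the mutation swaps a single element, the score is a toy stand-in (how closely the subset mean matches the full mean) rather than the equity/EV metric, and the training/testing split is left out.

```python
import random


def random_walk_subset(items, n, score, steps=2000, seed=0):
    """Accept-if-better random walk over n-element subsets (lower score wins)."""
    rng = random.Random(seed)
    current = rng.sample(items, n)  # random starting subset
    best = score(current)
    for _ in range(steps):
        candidate = list(current)
        # Mutation (assumed form): swap one member for an item outside the subset.
        outside = [item for item in items if item not in candidate]
        candidate[rng.randrange(n)] = rng.choice(outside)
        candidate_score = score(candidate)
        if candidate_score < best:  # keep the mutation only if it improves
            current, best = candidate, candidate_score
    return current, best


# Toy data: 200 made-up values indexed 0..199; NOT real flop EVs.
data_rng = random.Random(1)
values = [data_rng.uniform(-1.0, 1.0) for _ in range(200)]
full_mean = sum(values) / len(values)


def toy_score(subset):
    """Hypothetical metric: gap between subset mean and full mean."""
    return abs(sum(values[i] for i in subset) / len(subset) - full_mean)


indices = list(range(len(values)))
start_subset, start_score = random_walk_subset(indices, 25, toy_score, steps=0)
final_subset, final_score = random_walk_subset(indices, 25, toy_score, steps=500)
print(start_score, final_score)
```

Because a mutation is kept only when it lowers the score, the walk never gets worse, but it can stall in a local optimum; restarting from several seeds is one common remedy.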
We tried many metrics to determine the best one. Interestingly, a mix of EV and EQ seems to perform better than the others, even if we grade the subset using EV only.
The results we got are quite promising. To measure how good a subset of flops is, we used a least-squares measure, that is, the sum of squared EV differences over every possible hand. That measure punishes big deviations, which is what we want. We got big improvements over Tipton's subset (which contains 103 flops): it performs on par with our 25-element subsets and significantly worse than our 50+ element subsets.
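As a small illustration of that measure, the sketch below sums squared EV differences over hands. The hand labels and EV numbers are made up for the example, and the function name is hypothetical.

```python
def sum_squared_ev_error(ev_full, ev_subset):
    """Sum over hands of (EV on the subset - EV on all flops) squared."""
    return sum((ev_subset[hand] - ev_full[hand]) ** 2 for hand in ev_full)


# Made-up EVs for three hands; NOT real solver output.
ev_full = {"AA": 2.0, "KK": 1.5, "72o": -0.5}
one_big_miss = {"AA": 2.0, "KK": 1.5, "72o": -0.8}      # one hand off by 0.30
two_small_misses = {"AA": 2.15, "KK": 1.5, "72o": -0.65}  # two hands off by 0.15

print(sum_squared_ev_error(ev_full, one_big_miss))
print(sum_squared_ev_error(ev_full, two_small_misses))
```

Both candidates have the same total absolute error (0.30), but the one that concentrates it in a single hand scores twice as badly, which is the sense in which the measure punishes big deviations.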
Without further ado, let's go to the benchmarks. You can find a comparison of our subsets to the real results (the ones calculated on all 1755 flops) below. We are presenting five subsets we've developed, with 25, 49, 75, 95 and 184 elements. Tipton's original subset is also included in the comparison.
It seems that the 184-element subset performs really well, but the smaller ones should offer very good accuracy when the goal is to get a preflop EV, adjust, and repeat.
We've uploaded more benchmarks, HERE.
As to the subsets themselves: (you can just copy-paste them into the script generation window)
We hope that making these subsets available will make estimating preflop EVs a faster and more productive process. We hope they can also be used to obtain preflop solutions once the preflop solver is available. Preliminary tests look very promising.
Have fun!
*The testing data was provided by our very helpful users, among others:
-Selcouth
-Ilya 'SM0LK0' Smolko