Annoying Questions I'd Like Answered...

Mundane & Pointless Stuff I Must Share: The Off Topic Forum

DSMatticus
King
Posts: 5271
Joined: Thu Apr 14, 2011 5:32 am

Post by DSMatticus »

The goal of the iterated prisoner's dilemma is to have the highest score you possibly can. The goal of RPS is to have a higher score than your opponent. Which makes this hypothetical RPS game simultaneously more complicated and less interesting than the iterated prisoner's dilemma.

In the iterated prisoner's dilemma, the best strategies are all variations of tit-for-tat; cooperate until your opponent defects, then punish them with a single defection of your own. There are fiddly bits you can add to break vengeance loops or whatever, but the basic principle is always the same. That strategy frequently results in a tie with your opponent, but at the same time it will consistently outscore other strategies.
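For anyone who wants to poke at tit-for-tat directly, here is a minimal sketch. The payoff numbers are an assumption (the usual 5/3/1/0 textbook values; nothing in this thread specifies them), and `always_defect` is only a hypothetical opponent for illustration.

```python
# Assumed payoffs (my_points, their_points); not taken from the thread.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy whatever the opponent did last round."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    """Hypothetical opponent that never cooperates."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Run an iterated game and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30): cooperation every round
print(play(tit_for_tat, always_defect, 10))  # (9, 14): exploited once, then mutual defection
```

Under these assumed payoffs, tit-for-tat never beats the opponent in front of it, but two cooperators both finish far ahead of anyone stuck in mutual defection.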

If the goal of this hypothetical RPS game was simply to score very high (as it is in the iterated prisoner's dilemma), I believe the optimal strategy would be "play scissors until your opponent plays rock; play rock until your opponent plays scissors; repeat." Assuming your opponent reciprocates, that generates three points for each player every two turns. Assuming your opponent takes advantage of you by playing paper on your rock ad infinitum, that generates two points for him every two turns. Three is more than two, so that would be stupid of your opponent to do. You'd need to add fiddly random bits to break out of bad loops; for example, the "you go first; no you go first" loop this strategy runs into if it plays itself. You'd probably want to add some logic to detect and score points off the idiots who think throwing paper on your rock is winning them big bucks instead of ruining the game for both of you.
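A rough simulation of that "take turns winning with rock" idea, as a sketch only. It assumes a win is worth 3 with rock, 2 with scissors and 1 with paper, with the loser scoring nothing; the class names `GiveAndTake` and `PaperExploiter` are made up for illustration.

```python
WIN_POINTS = {"R": 3, "P": 1, "S": 2}   # assumed points for winning with each throw
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value

class GiveAndTake:
    """Scissors until the opponent throws rock, then rock until they throw scissors."""
    def __init__(self, first="S"):
        self.current = first
    def throw(self):
        return self.current
    def observe(self, their_throw):
        if self.current == "S" and their_throw == "R":
            self.current = "R"
        elif self.current == "R" and their_throw == "S":
            self.current = "S"

class PaperExploiter:
    """Throws rock once to bait the switch, then paper forever."""
    def __init__(self):
        self.baited = False
    def throw(self):
        return "P" if self.baited else "R"
    def observe(self, their_throw):
        self.baited = True

def play(a, b, rounds):
    """Return total points for each player; the loser of a throw scores nothing."""
    score_a = score_b = 0
    for _ in range(rounds):
        ta, tb = a.throw(), b.throw()
        if BEATS[ta] == tb:
            score_a += WIN_POINTS[ta]
        elif BEATS[tb] == ta:
            score_b += WIN_POINTS[tb]
        a.observe(tb)
        b.observe(ta)
    return score_a, score_b

print(play(GiveAndTake("S"), GiveAndTake("R"), 20))  # (30, 30): reciprocating partners
print(play(GiveAndTake("S"), PaperExploiter(), 20))  # (0, 22): the paper-on-rock exploiter
print(play(GiveAndTake("S"), GiveAndTake("S"), 20))  # (0, 0): the "you go first" loop
```

The exploiter ends up with fewer points than a reciprocating partner would have, and two copies that start in the same phase never score at all, which is why the random fiddly bits are needed.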

But if the goal of this hypothetical RPS game is to beat your opponent's score, then there's no easy pattern. You need to predict your opponent's underlying strategy, then adjust your strategy to exploit his while partially randomizing your moves. You could probably do very well with a bag of marbles strategy: start with a bag containing 1 rock, 1 scissors, and 1 paper. Decide your next play by drawing a marble from the bag. Every time your opponent plays, put another marble in the bag corresponding to the play that would have beaten his. Then you can do weird stuff on top of that like adding weights to particular moves, flushing old marbles so you don't get bogged down thinking about your opponent's old strategies, or even keeping track of separate bags given what move your opponent played last.
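The basic bag of marbles could look something like the sketch below. It assumes the drawn marble goes back in the bag (so the bag only ever grows), and the class name `MarbleBag` is just a label for illustration; none of the weighting or flushing extras are included.

```python
import random

BEATEN_BY = {"R": "P", "P": "S", "S": "R"}  # value is the throw that beats the key

class MarbleBag:
    """Draw a throw at random from a bag that fills up with counters to the opponent."""
    def __init__(self, rng=None):
        self.bag = ["R", "P", "S"]           # one marble of each kind to start
        self.rng = rng or random.Random()
    def throw(self):
        # Assumption: draw with replacement, so the bag never empties.
        return self.rng.choice(self.bag)
    def observe(self, their_throw):
        # Add a marble for the play that would have beaten the opponent's throw.
        self.bag.append(BEATEN_BY[their_throw])

bag = MarbleBag(random.Random(0))
for _ in range(97):
    bag.observe("P")                         # opponent keeps throwing paper
print(bag.bag.count("S"), "of", len(bag.bag), "marbles are scissors")
```

Against someone who throws paper 97 times in a row, 98 of the 100 marbles end up being scissors, so the bag drifts toward whatever beats the opponent's observed habits.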
Prak
Serious Badass
Posts: 17359
Joined: Fri Mar 07, 2008 7:54 pm

Post by Prak »

DSM, is your username supposed to be read DSM atticus or DS Matticus?
Cuz apparently I gotta break this down for you dense motherfuckers- I'm trans feminine nonbinary. My pronouns are they/them.
Winnah wrote:No, No. 'Prak' is actually a Thri Kreen impersonating a human and roleplaying himself as a D&D character. All hail our hidden insect overlords.
FrankTrollman wrote:In Soviet Russia, cosmic horror is the default state.

You should gain sanity for finding out that the problems of a region are because there are fucking monsters there.
DSMatticus
King
Posts: 5271
Joined: Thu Apr 14, 2011 5:32 am

Post by DSMatticus »

DS Matticus. DS is the abbreviation of the internet handle I started using when I was 8-10, playing Diablo I on the original incarnation of battle.net. No, I won't tell you what it stands for. That shit is embarrassing. Matticus is just a stupider version of my actual name. I use DSMatticus instead of Matticus because the latter is always taken and the former never is.
Prak
Serious Badass
Posts: 17359
Joined: Fri Mar 07, 2008 7:54 pm

Post by Prak »

Huh, ok. Just curious.
John Magnum
Knight-Baron
Posts: 826
Joined: Tue Feb 14, 2012 12:49 am

Post by John Magnum »

If I'm not getting this wrong, the only fully mixed Nash equilibrium is paper half the time, then rock and scissors a quarter each.

Say player 1 plays rock with frequency x, paper with frequency y, and scissors with frequency z. Then the payoffs for player 2's pure strategies are:

rock: -y + 3z
paper: 2x - z
scissors: -x + y

In a fully-mixed NE, all three of the pure strategies will have the same payoffs. So we can say stuff like (2x - z) - (-x + y) = 3x - y - z = 0, and Wolfram tells me that the solution to those subject to the constraint x + y + z = 1 is x = z = 1/4, y = 1/2.

But this is just the Nash equilibrium; what you do if you aren't fully randomizing your strategies, or if you're cooperating in some fashion with your opponent to maximize everyone's score, will look substantially different. Hence DSM's ramblings.
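A small check of the indifference condition, as a sketch. It reads x, y, z as player 1's rock, paper and scissors frequencies, which is what the three payoff lines imply, and the function name `pure_payoffs` is only illustrative.

```python
from fractions import Fraction as F

def pure_payoffs(x, y, z):
    """Player 2's expected payoff for each pure throw, using the three formulas above.

    x, y, z: player 1's frequencies of rock, paper, scissors (assumed reading).
    """
    return {
        "rock": -y + 3 * z,
        "paper": 2 * x - z,
        "scissors": -x + y,
    }

print(pure_payoffs(F(1, 4), F(1, 2), F(1, 4)))
```

With rock 1/4, paper 1/2, scissors 1/4, all three payoffs come out to 1/4, so player 2 has no pure throw that does better than the others under this payoff table.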
-JM
Username17
Serious Badass
Posts: 29894
Joined: Fri Mar 07, 2008 7:54 pm

Post by Username17 »

John Magnum wrote:If I'm not getting this wrong, the only fully mixed Nash equilibrium is paper half the time, then rock and scissors a quarter each.

But this is just the Nash equilibrium; what you do if you aren't fully randomizing your strategies, or if you're cooperating in some fashion with your opponent to maximize everyone's score, will look substantially different. Hence DSM's ramblings.
Uh, I'm pretty sure not. If someone responded to that with DSM's marble strategy, then they'd end up throwing Scissors half the time and rock and paper a quarter each. In sixteen throws, you'd win with paper twice (2 points) and lose with paper 4 times (-8 points), win with rock twice (6 points) and lose once (-1 point), win with scissors once (2 points) and lose with scissors once (-3 points). Every 16 rounds against a marble pusher, you're down 2 points against your opponent.

But it gets worse. Imagine you're playing someone who has no strategy and just acts randomly with no weighting. In 12 throws you'd win with paper twice (2 points) and lose twice (-4 points), you'd win with rock once (3 points) and lose once (-1 point), and you'd win with scissors once (2 points), and lose once (-3 points). So in 12 throws you're down 1 point on Mr. crazy face who has no strategy at all.

Anything that loses to a marble pusher is not the Nash equilibrium. Anything that loses to "no strategy" is obviously not the Nash equilibrium. The optimum strategy must start Rock heavy and adapt from there, because a rock heavy strategy is the only possible way to gain points against an opponent who simply rolls an unweighted die every turn.
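The per-throw bookkeeping can be automated with a sketch like this. It assumes the zero-sum scoring implied by the numbers above (a win with rock is +3, with scissors +2, with paper +1, and the loser takes the same amount as a negative); the function name `differential` is made up for illustration.

```python
from fractions import Fraction as F

WIN_POINTS = {"R": 3, "P": 1, "S": 2}   # assumed points for winning with each throw
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value

def differential(mine, theirs):
    """Expected points per throw for `mine` minus `theirs` (dicts of throw -> probability)."""
    total = F(0)
    for a, pa in mine.items():
        for b, pb in theirs.items():
            if BEATS[a] == b:
                total += pa * pb * WIN_POINTS[a]
            elif BEATS[b] == a:
                total -= pa * pb * WIN_POINTS[b]
    return total

paper_heavy = {"R": F(1, 4), "P": F(1, 2), "S": F(1, 4)}
rock_heavy = {"R": F(1, 2), "P": F(1, 4), "S": F(1, 4)}
uniform = {"R": F(1, 3), "P": F(1, 3), "S": F(1, 3)}

print(12 * differential(paper_heavy, uniform))  # -1 over 12 throws
print(12 * differential(rock_heavy, uniform))   # 2 over 12 throws
```

The paper-heavy mix loses a point per 12 throws to unweighted play, while a rock-heavy mix gains two.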

-Username17
DSMatticus
King
Posts: 5271
Joined: Thu Apr 14, 2011 5:32 am

Post by DSMatticus »

Consider this: if you are running a fixed frequency strategy, I can just transpose your frequencies onto the plays that beat your plays and be guaranteed more total wins (unless your fixed frequencies are equal, in which case nothing changes). Simply by giving yourself unequal fixed frequencies, you have conceded that you will lose more games than you win. And that means if you still want to win overall, you need more points per win, which is accomplished in this case by favoring rock, not scissors.

As a simple illustration, let's try the strategy R = .5; P = .25; S = .25, which is matched by R = .25; P = .5; S = .25 (16 rounds are needed for a full sample):
R vs R x2; ---
R vs P x4; -4
R vs S x2; +6
P vs R x1; +1
P vs P x2; ---
P vs S x1; -2
S vs R x1; -3
S vs P x2; +4
S vs S x1; ---

Now even though you still lost 6 out of the 11 scoring rounds, you're actually up 2 points; because favoring rock is better than favoring anything else.
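The table above can be regenerated for any pair of fixed-frequency strategies with a sketch like the following. It assumes the same scoring used in the table (rock wins 3, scissors wins 2, paper wins 1, losses are the negatives), and `tabulate` is a hypothetical helper name.

```python
from fractions import Fraction as F

WIN_POINTS = {"R": 3, "P": 1, "S": 2}   # points for winning with each throw
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value

def tabulate(mine, theirs, rounds):
    """Return (rows, net); each row is (my_throw, their_throw, count, points)."""
    rows, net = [], 0
    for a in "RPS":
        for b in "RPS":
            count = mine[a] * theirs[b] * rounds   # expected number of such matchups
            if BEATS[a] == b:
                points = count * WIN_POINTS[a]
            elif BEATS[b] == a:
                points = -count * WIN_POINTS[b]
            else:
                points = 0
            rows.append((a, b, count, points))
            net += points
    return rows, net

rock_heavy = {"R": F(1, 2), "P": F(1, 4), "S": F(1, 4)}
paper_heavy = {"R": F(1, 4), "P": F(1, 2), "S": F(1, 4)}
rows, net = tabulate(rock_heavy, paper_heavy, 16)
for a, b, count, points in rows:
    print(f"{a} vs {b} x{int(count)}; {int(points):+d}")
print("net:", net)
```

Pick `rounds` so every count is a whole number (16 here, because both mixes are in quarters); the net for this pairing is +2 for the rock-heavy side.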
fectin
Prince
Posts: 3760
Joined: Mon Feb 01, 2010 1:54 am

Post by fectin »

FrankTrollman wrote:
John Magnum wrote:If I'm not getting this wrong, the only fully mixed Nash equilibrium is paper half the time, then rock and scissors a quarter each.

But this is just the Nash equilibrium; what you do if you aren't fully randomizing your strategies, or if you're cooperating in some fashion with your opponent to maximize everyone's score, will look substantially different. Hence DSM's ramblings.
Uh, I'm pretty sure not. If someone responded to that with DSM's marble strategy, then they'd end up throwing Scissors half the time and rock and paper a quarter each. In sixteen throws, you'd win with paper twice (2 points) and lose with paper 4 times (-8 points), win with rock twice (6 points) and lose once (-1 point), win with scissors once (2 points) and lose with scissors once (-3 points). Every 16 rounds against a marble pusher, you're down 2 points against your opponent.

But it gets worse. Imagine you're playing someone who has no strategy and just acts randomly with no weighting. In 12 throws you'd win with paper twice (2 points) and lose twice (-4 points), you'd win with rock once (3 points) and lose once (-1 point), and you'd win with scissors once (2 points), and lose once (-3 points). So in 12 throws you're down 1 point on Mr. crazy face who has no strategy at all.

Anything that loses to a marble pusher is not the Nash equilibrium. Anything that loses to "no strategy" is obviously not the Nash equilibrium. The optimum strategy must start Rock heavy and adapt from there, because a rock heavy strategy is the only possible way to gain points against an opponent who simply rolls an unweighted die every turn.

-Username17
Your premise is off. If your starting point is that no-one you face will go random, there's no need for your strategy to address random. For caricatured example: if you're playing against a Batman villain who ALWAYS chooses paper, your winning strategy is to always choose scissors.
Vebyast wrote:Here's a fun target for Major Creation: hydrazine. One casting every six seconds at CL9 gives you a bit more than 40 liters per second, which is comparable to the flow rates of some small, but serious, rocket engines. Six items running at full blast through a well-engineered engine will put you, and something like 50 tons of cargo, into space. Alternatively, if you thrust sideways, you will briefly be a fireball screaming across the sky at mach 14 before you melt from atmospheric friction.
John Magnum
Knight-Baron
Posts: 826
Joined: Tue Feb 14, 2012 12:49 am

Post by John Magnum »

Well, hang on. If your opponent is playing R = 1/4, P = 1/2, S = 1/4, let's look at the expected payoffs of your pure strategies. The payoff for playing rock 100% of the time is (-1/2) + 3 * (1/4) = 1/4, for playing paper 100% of the time is 2(1/4) + -1(1/4) = 1/4, and for playing scissors 100% of the time is -1(1/4) + 1(1/2) = 1/4. So the point is that you can't improve your payoffs by switching to another strategy, no matter what you do your expected value is 1/4. And if you play that same strategy, your opponent gets payoff 1/4 no matter what they do.

DSM managed to score 2 points in 16 rounds, but this strategy would have scored 4 points in 16 rounds, and I claim you can't do better than that if your opponent is playing (1/4, 1/2, 1/4) and they can't do better than it if you're playing (1/4, 1/2, 1/4).
-JM
John Magnum
Knight-Baron
Posts: 826
Joined: Tue Feb 14, 2012 12:49 am

Post by John Magnum »

Oh, I see what I've been doing differently. I didn't realize this was still zero-sum, and thought you only lost one point for losing no matter what you lost to. If it's zero-sum...

Say your strategy is (R, P, S) = (1/6, 1/2, 1/3). Then an opponent's payoff for pure rock is (-2)(1/2) + (3)(1/3) = 0, pure paper is (2)(1/6) + (-1)(1/3) = 0, pure scissors is (-3)(1/6) + (1)(1/2) = 0. So no matter what mixed strategy they choose, they can't get a better payoff than zero. And similarly, if your opponent runs (1/6, 1/2, 1/3), you can't get a better payoff than zero. So (1/6, 1/2, 1/3)² is the fully-mixed NE.
Last edited by John Magnum on Wed Nov 19, 2014 1:29 pm, edited 1 time in total.
-JM
DSMatticus
King
Posts: 5271
Joined: Thu Apr 14, 2011 5:32 am

Post by DSMatticus »

@Fectin: I am confused. All of the examples Frank is using are random. Sixteen is just the magic number of rounds you need to measure wins and losses in integers instead of fractions, because in that specific instance the probabilities were measured in 1/4ths and 4*4=16. It's exactly like enumerating the sixteen possibilities of four coinflips and asking stats questions about that set instead of attaching probabilities to events and going from there. There's just a constant in front of everything and it doesn't really make any difference.

@John: I am also very confused by you. Look, we're talking about point advantage. (1/2, 1/4, 1/4) vs (1/4, 1/2, 1/4) gives a 2 point advantage to the former (the player who favors rock). Individual point totals don't really matter, because we are talking about competitive rock-paper-scissors and the goal is to beat your opponent by as much as possible, exactly like in basketball or football or virtually all other games with a scoreboard.

If you play rock and win, you get 3 points and the opponent gets 0 points (net diff is 3). If you play paper and win, you get 1 point and the opponent gets 0 points (net diff is 1). If you play scissors and win, you get 2 points and the opponent gets 0 points (net diff is 2). When you lose, from your perspective those numbers get negatives in front of them. When you tie, it's a great big zero.

So now, let's consider your newest strategy: (1/6, 1/2, 1/3). We're going to put it up against that exact same frequency but rotated: (1/3, 1/6, 1/2). It will take 36 total rounds to get everything in integers, but whatever that's fine:
R vs R x2; ---
R vs P x1; -1
R vs S x3; +9

P vs R x6; +6
P vs P x3; ---
P vs S x9; -18

S vs R x4; -12
S vs P x2; +4
S vs S x6; ---

Your strategy is behind 12 points over 36 rounds. I don't care what the actual point totals are, but yours is down 12 points. Now, is this a Nash equilibrium? If it were, neither of us would have anything to gain by changing our strategies. Which means it's not a Nash equilibrium, because it's trivially obvious that you can close the gap to zero just by copying my frequencies. Something about the way you're trying to model this is off.
Last edited by DSMatticus on Wed Nov 19, 2014 2:25 pm, edited 2 times in total.
John Magnum
Knight-Baron
Posts: 826
Joined: Tue Feb 14, 2012 12:49 am

Post by John Magnum »

Oof. That's on me, I had the setup wrong. I gave rock 3 points, paper 2 points, and scissors 1 point. Sorry to keep flubbing it.

With 3, 1, 2 the NE is (R, P, S) = (1/3, 1/2, 1/6). Let's check the pure payoffs: R = (-1)(1/2) + (3)(1/6) = 0; P = (1)(1/3) + (-2)(1/6) = 0; S = (-3)(1/3) + (2)(1/2) = 0.

Of slight interest is that rock still isn't the most-used strategy, or even tied for it.
-JM
TiaC
Knight-Baron
Posts: 968
Joined: Thu Jun 20, 2013 7:09 am

Post by TiaC »

This is really impressive nerd-sniping.
virgil wrote:Lovecraft didn't later add a love triangle between Dagon, Chtulhu, & the Colour-Out-of-Space; only to have it broken up through cyber-bullying by the King in Yellow.
FrankTrollman wrote:If your enemy is fucking Gravity, are you helping or hindering it by putting things on high shelves? I don't fucking know! That's not even a thing. Your enemy can't be Gravity, because that's stupid.
Username17
Serious Badass
Posts: 29894
Joined: Fri Mar 07, 2008 7:54 pm

Post by Username17 »

John Magnum wrote:Oof. That's on me, I had the setup wrong. I gave rock 3 points, paper 2 points, and scissors 1 point. Sorry to keep flubbing it.

With 3, 1, 2 the NE is (R, P, S) = (1/3, 1/2, 1/6). Let's check the pure payoffs: R = (-1)(1/2) + (3)(1/6) = 0; P = (1)(1/3) + (-2)(1/6) = 0; S = (-3)(1/3) + (2)(1/2) = 0.

Of slight interest is that rock still isn't the most-used strategy, or even tied for it.
Stop being wrong. You keep putting up strategies that don't beat random man. If the guy with unweighted throws doesn't lose to you, you don't have a Nash. Because it is trivially easy to change your strategy to something that will beat unweighted throws. In eighteen throws, Random man wins and loses twice against rock (+4 points for you), wins and loses once against Scissors (-1 point for you), and wins and loses three times against your paper (-3 points for you).

Seriously, all you have to do is throw a fucking rock more than one time in 3, and you beat random man. I don't know what you're doing, but it's wrong.

-Username17
John Magnum
Knight-Baron
Posts: 826
Joined: Tue Feb 14, 2012 12:49 am

Post by John Magnum »

So you get zero points and he gets zero points. How is that supposed to prove it's not an NE? It's a zero-sum symmetric game, you'd expect the equilibrium to give both players EVs of zero.

The fact that there are mixed strategies that have positive expected value when played against (1/3, 1/3, 1/3) doesn't tell you anything about what the Nash equilibrium actually looks like, because (1/3, 1/3, 1/3) won't be a maximal response against those strategies and so they won't form an equilibrium.
-JM
Username17
Serious Badass
Posts: 29894
Joined: Fri Mar 07, 2008 7:54 pm

Post by Username17 »

You found a low point that gives up and gains so few points that on average it never wins. Since the goal is to win, why would anyone switch to your strategy?

-Username17
John Magnum
Knight-Baron
Posts: 826
Joined: Tue Feb 14, 2012 12:49 am

Post by John Magnum »

Because other strategies will have best-responses that are actually positive, which means that you will lose points when your opponent switches to them.

I'll run through my process.

For this two-player three-strategy game, a mixed strategy is a tuple (x, y, z) such that x, y, and z are real numbers between 0 and 1 whose sum is 1. A mixed strategy profile is a pair of mixed strategies, one for each player. A Nash equilibrium is a mixed strategy profile such that neither player can get a strictly better expected value by unilaterally switching to another mixed strategy.

This is equivalent to saying that, fixing player 1's mixed strategy as (x1, y1, z1), none of player 2's pure strategies have strictly higher expected value than any of player 2's other strategies, and vice versa for player 2's mixed strategy (x2, y2, z2). Otherwise, they'd drop the underperforming strategies from their mix.

So, we write down the expected values of player 2's strategies as functions of player 1's probabilities. For instance, if player 2 plays rock 100% of the time, their expected value is (0)x1 + (-1)y1 + (3)z1. And similarly for paper and scissors. If (x1, y1, z1) is part of a Nash equilibrium, these are all equal. That, plus x1 + y1 + z1 = 1, lets us write down and solve a dead simple set of linear equations.

(1/3, 1/2, 1/6) is a strategy that, no matter what your opponent does, gives both players an expected value of zero. Therefore if you play (1/3, 1/2, 1/6) your opponent cannot do strictly better than (1/3, 1/2, 1/6), and reciprocally you can't do strictly better if your opponent plays it. That makes it a Nash equilibrium.

There isn't going to be a Nash equilibrium where one player actually wins on average, because then the other player would be losing on average and would be able to strictly improve their outcome just by mirroring the winner.
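The linear system described above (equal expected values for all three pure throws, plus probabilities summing to 1) can be solved mechanically. This is only a sketch: it assumes the zero-sum 3/1/2 scoring used elsewhere in the thread, and `solve` is a small hand-rolled exact solver, not any library's API.

```python
from fractions import Fraction as F

def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination.

    Raises StopIteration if the system is singular (no pivot found).
    """
    n = len(rhs)
    rows = [[F(v) for v in row] + [F(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# EV[a]: player 2's points for throwing a against player 1's rock, paper, scissors.
EV = {"R": [0, -1, 3], "P": [1, 0, -2], "S": [-3, 2, 0]}

equations = [
    [r - p for r, p in zip(EV["R"], EV["P"])],  # EV(rock) - EV(paper) = 0
    [p - s for p, s in zip(EV["P"], EV["S"])],  # EV(paper) - EV(scissors) = 0
    [1, 1, 1],                                  # x1 + y1 + z1 = 1
]
x1, y1, z1 = solve(equations, [0, 0, 1])
print(x1, y1, z1)  # 1/3 1/2 1/6
```

Swapping in a different payoff table only means changing the three rows of `EV`.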
Last edited by John Magnum on Wed Nov 19, 2014 7:08 pm, edited 2 times in total.
-JM
Username17
Serious Badass
Posts: 29894
Joined: Fri Mar 07, 2008 7:54 pm

Post by Username17 »

(1/3, 1/2, 1/6) is a strategy that, no matter what your opponent does, gives both players an expected value of zero. Therefore if you play (1/3, 1/2, 1/6) your opponent cannot do strictly better than (1/3, 1/2, 1/6), and reciprocally you can't do strictly better if your opponent plays it. That makes it a Nash equilibrium.
Nope. You can in fact easily beat it because it has an expected value of zero against all opponents. But it won't be all of the participants, because no one has an incentive to change to it.

Let's say there are three players: Bart's Always Rock, Shaggy's Unweighted, and John's Mix. John ties with Shaggy and Bart, but Bart beats Shaggy. Therefore John comes in second, and everyone's incentive is to either play Bart's strategy or devise a strategy that beats Bart's strategy (perhaps Lisa's "Always Paper" or something more complicated).

To be a Nash Equilibrium, it has to be more than just a stable point, it has to be something that people have incentive to change to and no incentive to change out of. John's heavy paper light scissors mix is something that there is never any incentive to change into because it never wins any games and never comes any better than 2nd place.

-Username17
Pixels
Knight
Posts: 430
Joined: Mon Jun 14, 2010 9:06 pm

Post by Pixels »

A Nash Equilibrium only considers advantage if a single player unilaterally deviates their strategy. If all players use John Magnum's strategy, then one player changing gives that player no benefit. Therefore it is a Nash Equilibrium.
Last edited by Pixels on Wed Nov 19, 2014 10:10 pm, edited 1 time in total.
Username17
Serious Badass
Posts: 29894
Joined: Fri Mar 07, 2008 7:54 pm

Post by Username17 »

Pixels wrote:A Nash Equilibrium only considers advantage if a single player unilaterally deviates their strategy. If all players use John Magnum's strategy, then one player changing gives that player no benefit. Therefore it is a Nash Equilibrium.
But since all players won't use John's strategy because there is no advantage to shifting to it in the first place, it is not a Nash Equilibrium. It does not matter if it is a Nash Equilibrium if X so long as X is counterfactual. Which in this case it is.

If even one other player is not using John's strategy, then advantage is to be had in switching to whatever beats that. And then none of the players are using John's strategy. There is simply no reason or method for it to converge at John's point, because it's only stable if there are no other strategies in play. John's equilibrium cannot evolve from any other play environment.

-Username17
John Magnum
Knight-Baron
Posts: 826
Joined: Tue Feb 14, 2012 12:49 am

Post by John Magnum »

Where are you getting this requirement that there has to be a positive incentive to switch to the Nash equilibrium? Or that NE have to be the convergence points of iterated best-response play?
-JM
Pixels
Knight
Posts: 430
Joined: Mon Jun 14, 2010 9:06 pm

Post by Pixels »

The definition of a Nash Equilibrium does not care whether the players would arrive there by iterating strategies. It only cares that they remain there.

If you have done differential equations, you can think of a Nash Equilibrium as a fixed point. In this case (and in all zero-sum games, actually), the Nash Equilibrium will be a saddle point, with strategies at that point staying put. Strategies even infinitesimally off of the point will move away, but regardless it is still a fixed point.
Last edited by Pixels on Wed Nov 19, 2014 10:40 pm, edited 2 times in total.
DSMatticus
King
Posts: 5271
Joined: Thu Apr 14, 2011 5:32 am

Post by DSMatticus »

Okay, look: a Nash equilibrium is the state in which no player can benefit by changing their strategy and only their own strategy. Every player has to stop and think, "if my opponents keep using their strategies, there is nothing I can do to change mine and improve my position."

So let's begin from the simple observation that any strategy can tie itself. Anyone who is down any number of points at all can copy their opponent's strategy and no longer be down any points at all. Any Nash equilibrium for this game must be symmetric, so let's consider only symmetric situations going forward.

Now, if you name a strategy for both players to use and I name a strategy that beats your strategy, then by definition we are not in a Nash equilibrium. Because if player A keeps using your strategy and player B switches to my strategy then they win. If you lose to random chance, you are not at a Nash equilibrium. If you lose to marblepusher, you are not at a Nash equilibrium.

You must name a single strategy that both players look at and immediately think "if my opponent keeps using that strategy, the best I can do is to copy it."
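That test ("can anyone name a strategy that beats it?") can be run mechanically, since a mixed response is just a weighted average of pure throws and so can never beat the best pure throw. A sketch, assuming the zero-sum 3/1/2 scoring from earlier in the thread; `best_response_value` is a made-up helper name.

```python
from fractions import Fraction as F

# PAYOFF[a][b]: points for throwing a against b, from the thrower's side (zero-sum).
PAYOFF = {
    "R": {"R": 0, "P": -1, "S": 3},
    "P": {"R": 1, "P": 0, "S": -2},
    "S": {"R": -3, "P": 2, "S": 0},
}

def best_response_value(mix):
    """Highest expected points per throw that any pure throw earns against `mix`."""
    return max(sum(PAYOFF[a][b] * p for b, p in mix.items()) for a in PAYOFF)

candidates = {
    "unweighted": {"R": F(1, 3), "P": F(1, 3), "S": F(1, 3)},
    "rock heavy": {"R": F(1, 2), "P": F(1, 4), "S": F(1, 4)},
    "1/3, 1/2, 1/6": {"R": F(1, 3), "P": F(1, 2), "S": F(1, 6)},
}
for name, mix in candidates.items():
    value = best_response_value(mix)
    verdict = "beatable" if value > 0 else "not beatable"
    print(f"{name}: best reply earns {value} per throw, {verdict}")
```

A candidate only survives as a symmetric equilibrium if the best reply against it earns nothing.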
Last edited by DSMatticus on Wed Nov 19, 2014 11:17 pm, edited 1 time in total.
hyzmarca
Prince
Posts: 3909
Joined: Mon Mar 14, 2011 10:07 pm

Post by hyzmarca »

So I'm shopping for cookware on Amazon and thinking that I'm looking at some good deals and...

Why the fuck do people list lids as separate pieces? No, if you have four pots with four lids that's a fucking four piece set. I mean, fuck, what kind of gullible morons do they take us for?
Koumei
Serious Badass
Posts: 13970
Joined: Fri Mar 07, 2008 7:54 pm
Location: South Ausfailia

Post by Koumei »

I think the Nash Equilibrium involves a jackknife powerbomb.
hyzmarca wrote:I mean, fuck, what kind of gullible morons do they take us for?
They do it because often enough it actually works. Or rather, some people do it because it works, other people do it because if you list your four pots + lids as "four pieces", some people are jaded enough to assume you mean two pots + lids. If enough people lie in advertising, then anyone who tells the truth in advertising is assumed to also be lying, and just not offering a good deal.
Count Arioch the 28th wrote:There is NOTHING better than lesbians. Lesbians make everything better.