Game Theory Fundamentals

By Rohith-Raj Dhinakaran

Game Theory, as described by Steve Levitt, delves into “strategic interactions between a small number of adversaries (2 to 3 competitors)”. It is a fascinating concept that covers everything from everyday situations like “holding the door open for someone” to significant global problems such as “nuclear weapon conflicts between the USA and the Soviet Union in the late 1940’s”. I will discuss the rudiments of this sometimes bewildering idea and show how pertinent it is to the real world.

Case 1: Static Game

Static Game: a game that is played only once, with all decisions made simultaneously.

The simplest and best-known game in game theory is the “Prisoner’s Dilemma”.

Explanation of the Game:

The “Prisoner's Dilemma” is a scenario which involves two prisoners who each have two options: stay silent or betray the other person. If both prisoners decide to stay silent, they will each serve one month in prison. If they both decide to betray, they will each spend three months in prison. However, if one prisoner betrays while the other stays silent, the betrayer gets out without any punishment but the silent prisoner serves 12 months.

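We can illustrate this in a payoff matrix. There are 2 players: the row player (prisoner 1) and the column player (prisoner 2). In each box, the row player’s value comes first, followed by the column player’s value. Payoffs are months in prison, written as negative numbers:

Row silent, Column silent: (-1, -1)
Row silent, Column betray: (-12, 0)
Row betray, Column silent: (0, -12)
Row betray, Column betray: (-3, -3)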

Strategy 1: Dominant Strategy

A dominant strategy is one that gives a player a higher payoff than any of their other strategies, whatever the other player chooses. It is the best response in every case.

In this scenario, “Betray” is the dominant strategy for both players.

If the row player stays silent, the column player could have either -1 (silent) or 0 (betray). 0 is better than -1, so Betray is the better option for the column player.

If the row player chooses to betray, the column player could have -12 (silent) or -3 (betray). -3 is better than -12, so Betray is again the better option for the column player.

The same reasoning applies in reverse when we look at the options for the row player. Betrayal is therefore the rational choice for both prisoners, and the dominant strategy equilibrium for this scenario is (-3,-3), where both prisoners end up serving 3 months in prison.
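The case-by-case check above can be sketched in a few lines of Python. This is only an illustration: the names `PAYOFFS`, `payoff` and `dominant_strategy` are invented for this example, and the payoffs are the prison sentences from the scenario written as negative numbers.

```python
# Payoffs as (row player, column player): months in prison as negative numbers.
PAYOFFS = {
    ("silent", "silent"): (-1, -1),
    ("silent", "betray"): (-12, 0),
    ("betray", "silent"): (0, -12),
    ("betray", "betray"): (-3, -3),
}
MOVES = ("silent", "betray")


def payoff(player, own, other):
    """Payoff to `player` (0 = row, 1 = column) for its own move and the other's."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def dominant_strategy(player):
    """Return the move that is strictly best against every opposing move, or None."""
    for candidate in MOVES:
        alternatives = [m for m in MOVES if m != candidate]
        if all(
            payoff(player, candidate, other) > payoff(player, alt, other)
            for other in MOVES
            for alt in alternatives
        ):
            return candidate
    return None


row_choice = dominant_strategy(0)
column_choice = dominant_strategy(1)
print(row_choice, column_choice, PAYOFFS[(row_choice, column_choice)])
# prints: betray betray (-3, -3)
```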

Strategy 2: Dominated Strategy

A dominated strategy is one that a rational player will never play, because another of their strategies gives a higher payoff whatever the other player does.

Iterated deletion of dominated strategies reduces the game to a smaller one, which helps us find an equilibrium.

Consider the same example:

If we know the column player is going to be silent, the best option for the row player is to betray. If the column player is going to betray, the best option for the row player is also to betray. Staying silent is therefore a dominated strategy for the row player, and we can eliminate it.

Now, knowing that the row player is going to betray, the column player gets the greatest payoff by betraying too. The equilibrium is therefore again (-3,-3), where both prisoners decide to betray.
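The elimination steps above can be written as a small loop. The sketch below assumes strict dominance and uses made-up helper names (`strictly_dominated`, `iterated_deletion`); it is one possible way to express the procedure, not a standard library routine.

```python
# Payoffs as (row player, column player): months in prison as negative numbers.
PAYOFFS = {
    ("silent", "silent"): (-1, -1),
    ("silent", "betray"): (-12, 0),
    ("betray", "silent"): (0, -12),
    ("betray", "betray"): (-3, -3),
}


def strictly_dominated(player, move, own_moves, other_moves):
    """True if another remaining move beats `move` against every remaining opposing move."""
    def u(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return PAYOFFS[profile][player]

    return any(
        all(u(alt, other) > u(move, other) for other in other_moves)
        for alt in own_moves
        if alt != move
    )


def iterated_deletion():
    """Repeatedly remove strictly dominated moves until nothing changes."""
    remaining = [["silent", "betray"], ["silent", "betray"]]  # row moves, column moves
    changed = True
    while changed:
        changed = False
        for player in (0, 1):
            own, other = remaining[player], remaining[1 - player]
            survivors = [m for m in own if not strictly_dominated(player, m, own, other)]
            if len(survivors) < len(own):
                remaining[player] = survivors
                changed = True
    return remaining


surviving = iterated_deletion()
print(surviving)  # [['betray'], ['betray']]
```

Only “betray” survives for each player, which matches the (-3,-3) outcome reached by hand.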

Nash equilibrium: A Nash equilibrium is an outcome in which each player is playing the best response to the other player's strategy. It amounts to “no regrets”: given what the other player did, neither player could have done better by choosing differently, although the outcome may not be the best overall (as shown below). Here, a player who switches from betraying to staying silent only makes themselves worse off (12 months instead of 3) while letting the other player go free.

Pareto equilibrium (more commonly called a Pareto optimal outcome): an outcome where no one can be made better off without making someone else worse off.

Using this example, the Nash equilibrium is indeed (-3,-3): if either player alone switches to staying silent, that player's sentence rises from 3 months to 12 while the opponent goes free, so neither has a reason to deviate.

However, the pareto equilibrium the prisoners would want to reach is (-1,-1). Moving away from it to any other outcome leaves at least one prisoner worse off, whereas moving from (-3,-3) to (-1,-1) makes both better off, so the Nash equilibrium is not Pareto optimal.

You might ask whether it would have been better for both prisoners to cooperate and stay silent, since each would then serve only 1 month. It would, but because each prisoner acted to maximise their own self-interest (rational behaviour), both ended up worse off. The pareto equilibrium, not the Nash equilibrium, is therefore the outcome that gives the best joint result.
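Both definitions can be checked mechanically against the four outcomes. The following is a minimal sketch with illustrative function names (`is_nash`, `pareto_dominated`). Note that under the strict definition the one-sided outcomes (0,-12) and (-12,0) also pass the Pareto test; (-1,-1) is the one that improves on the Nash equilibrium for both players at once.

```python
from itertools import product

# Payoffs as (row player, column player): months in prison as negative numbers.
PAYOFFS = {
    ("silent", "silent"): (-1, -1),
    ("silent", "betray"): (-12, 0),
    ("betray", "silent"): (0, -12),
    ("betray", "betray"): (-3, -3),
}
MOVES = ("silent", "betray")


def is_nash(profile):
    """No player can raise their own payoff by changing only their own move."""
    row, col = profile
    row_ok = all(PAYOFFS[(alt, col)][0] <= PAYOFFS[profile][0] for alt in MOVES)
    col_ok = all(PAYOFFS[(row, alt)][1] <= PAYOFFS[profile][1] for alt in MOVES)
    return row_ok and col_ok


def pareto_dominated(profile):
    """Some other outcome is at least as good for both players and better for one."""
    mine = PAYOFFS[profile]
    return any(
        all(o >= m for o, m in zip(other, mine))
        and any(o > m for o, m in zip(other, mine))
        for other in PAYOFFS.values()
    )


nash = [p for p in product(MOVES, MOVES) if is_nash(p)]
dominated = [p for p in product(MOVES, MOVES) if pareto_dominated(p)]
print("Nash equilibria:", nash)              # [('betray', 'betray')]
print("Pareto-dominated outcomes:", dominated)  # [('betray', 'betray')]
```

The only Nash equilibrium is also the only outcome that both prisoners could jointly improve on, which is the heart of the dilemma.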

Case 2: Dynamic Game

Dynamic Game: a game that is played repeatedly, so that strategic interactions change over time.

Most real-world situations are better explained as dynamic games. This is because a prisoner's dilemma is rarely played just once; the same situation comes up again and again between the same players.

Example:

Consider an example where 2 people need to climb a wall every day and need each other’s cooperation. One person boosts the other up the wall, and the person who has been boosted then pulls the first person up, so both get over the wall.

Each person has 2 options: cooperate and help the other, or refuse to help and take a different, longer route that does not involve climbing the wall.

If this were a single interaction, then as discussed earlier the optimal static strategy would be for both players not to help, as each saves time and energy (resources). However, since these 2 people need to climb the wall every day, this is not a single interaction but a repeated one. One person’s decision not to help may therefore affect the decisions the other person makes in future.

General example with Axelrod’s Ideas (Based on Veritasium’s Video):

Robert Axelrod was intrigued by this and ran a tournament in which people submitted programs, in order to identify the best strategy.

Both Cooperate - 3 points each

Both Defect - 1 point each

One Cooperates, One Defects - Defect (5 points), Cooperate (0 points)

Each matchup lasts 200 rounds. In each round, both programs decide whether to cooperate or defect.

Every submission plays against every other. The scores from all matchups are added together, and the whole tournament is repeated 5 times to make the results more reliable.

The simplest program, called “Tit For Tat”, won against the 14 other programs. It starts off by cooperating and then copies the opponent's previous move: if the opponent defects, it defects in the next round, and if the opponent switches back to cooperation, Tit for Tat forgives and cooperates again.

In this tournament, Tit for Tat lost or drew every individual matchup, yet it still won overall on total points. This says something about reality: we may lose out in individual interactions, but if we share the qualities of “Tit for Tat”, the tally across all our interactions can still put us ahead of everyone else.

Axelrod therefore identified 3 principal qualities that make Tit for Tat successful:
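To see how a strategy can top the table without winning a single matchup, here is a small round-robin sketch using the scoring rules above (3, 1, 5 and 0 points, 200 rounds per matchup). The field of five strategies is an assumption chosen for illustration and is not the set of programs from Axelrod's tournament; all function names are hypothetical. Because every strategy here is deterministic, the tournament is run once rather than 5 times.

```python
from itertools import combinations

ROUNDS = 200
# (my move, their move) -> my points, using the scoring described above.
POINTS = {("C", "C"): 3, ("D", "D"): 1, ("D", "C"): 5, ("C", "D"): 0}


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"   # cooperate first, then copy


def suspicious_tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "D"   # defect first, then copy


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def grudger(mine, theirs):
    return "D" if "D" in theirs else "C"   # never forgives a defection


def play_match(a, b, rounds=ROUNDS):
    """Play one matchup and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        score_a += POINTS[(move_a, move_b)]
        score_b += POINTS[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


# Hypothetical field, not Axelrod's actual entrants.
field = [tit_for_tat, suspicious_tit_for_tat, always_defect, always_cooperate, grudger]
totals = {s.__name__: 0 for s in field}
for a, b in combinations(field, 2):
    score_a, score_b = play_match(a, b)
    totals[a.__name__] += score_a
    totals[b.__name__] += score_b

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:24s}{total}")
```

In this particular field Tit for Tat never outscores its opponent in any matchup (it loses 199 to 204 against `always_defect` and draws the rest), yet it finishes with the highest total, 1899 points. A different field could produce a different ranking.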

Nice (Not defecting first)

Forgiving (Retaliate but never hold grudges - so if the opponent cooperates, you can go back to cooperating).

Retaliatory (Don’t be a pushover, always retaliate immediately if the opponent defects)

Environmental Dependency

The optimal strategy actually depends on the environment. If you were “Tit for Tat” in an environment full of nasty people trying to take advantage of you, you would not get good results from those interactions.

However, the beautiful thing is that if a cluster of “Tit for Tats” exists inside a nasty population and is geographically sequestered, its members can rack up plenty of points by cooperating with each other and hence produce offspring. As cooperation emerges, Tit for Tats could take over the entire population. This means that a population can change from selfish organisms into a cooperative society.

Real world

Sometimes you may try to cooperate but it comes across as a defection. For example, you say something that is meant as constructive criticism but is misinterpreted as an insult.

Consider a noisy environment with 2 players (both Tit for Tat).

Initially, player 1 and player 2 both cooperate, but then a cooperation from player 1 is mistaken for a defection. Because Tit for Tat copies the last move it saw, this creates an alternating pattern of retaliation. Eventually, a cooperation from player 2 is also mistaken for a defection, which locks both players into constant defection. Noise could therefore create serious issues in the world.

Solution: We could use a generous Tit for Tat (GTFT), which is more forgiving: it retaliates after only about 90% of defections rather than every one. This allows it to break out of the constant defection.
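The effect of noise can be sketched with a short simulation. To keep it reproducible, the two misreadings happen at fixed rounds (10 and 20, an arbitrary choice) rather than at random, and the 10% forgiveness uses a seeded random generator. The function `noisy_match` and its parameters are assumptions made for this illustration.

```python
import random

ROUNDS = 200


def noisy_match(forgiveness, misread_rounds, rounds=ROUNDS, seed=0):
    """Two (generous) Tit for Tat players; returns the list of actual move pairs.

    forgiveness: probability of cooperating anyway after a perceived defection
                 (0.0 is plain Tit for Tat, 0.1 retaliates 90% of the time).
    misread_rounds: {round index: index of the player whose move is misread as "D"}.
    """
    rng = random.Random(seed)
    seen_by = ["C", "C"]  # seen_by[i]: what player i believes the opponent last played
    history = []
    for r in range(rounds):
        moves = []
        for i in (0, 1):
            retaliate = seen_by[i] == "D" and rng.random() >= forgiveness
            moves.append("D" if retaliate else "C")
        history.append(tuple(moves))
        # Each player observes the other's move, possibly through noise.
        seen_by = [moves[1], moves[0]]
        if r in misread_rounds:
            victim = misread_rounds[r]      # this player's move is misread
            seen_by[1 - victim] = "D"
    return history


# Player 1 (index 0) is misread at round 10, player 2 (index 1) at round 20.
errors = {10: 0, 20: 1}
tft = noisy_match(0.0, errors)
gtft = noisy_match(0.1, errors)

print("TFT  rounds of mutual cooperation:", tft.count(("C", "C")))
print("TFT  final round:", tft[-1])
print("GTFT rounds of mutual cooperation:", gtft.count(("C", "C")))
print("GTFT final round:", gtft[-1])
```

With plain Tit for Tat, the first misreading starts the alternating pattern and the second one locks both players into defection for the rest of the match. With forgiveness set to 0.1, each perceived defection has a 1 in 10 chance of being let go, which gives the pair a route back to cooperation; the exact numbers depend on the seed.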

In conclusion, for long-term progress in this world, it is wise to adopt the attributes that “Tit for Tat” possesses. These shape the environment, and an environment full of “Tit for Tats” can help the world grow towards where it should ideally be. When making choices that involve other people, remember that your choices may influence theirs. Human behaviour is hard to match to programmable patterns of cooperation and defection, but the tournament results suggest that we should adopt the traits of “Tit for Tat”.

Resources:

Podcast:

Dubner, S. J. (2014) 'Jane Austen, Game Theorist', Freakonomics. Available at: https://freakonomics.com/podcast/jane-austen-game-theorist/ (Accessed: 4 July 2024).

Lecture PDF:

University of Edinburgh (n.d.) 'Lecture 8: Game Theory'. Available at: https://www.ed.ac.uk/files/atoms/files/lecture_8_game_theory.pdf (Accessed: 4 July 2024).

Website Article:

Tutor2u (n.d.) 'Nash Equilibrium'. Available at: https://www.tutor2u.net/economics/topics/nash-equilibrium (Accessed: 4 July 2024).

YouTube Video:

TED (2016) 'The Prisoner's Dilemma', YouTube video, added by TED-Ed. Available at: https://www.youtube.com/watch?v=mScpHTIi-kM (Accessed: 4 July 2024).

Research Article:

ResearchGate (n.d.) 'Prisoner's Dilemma in economics: illustration of Nash and cooperative Pareto Equilibrium'. Available at: https://www.researchgate.net/figure/Prisoners-Dilemma-in-economics-illustration-of-Nash-and-cooperative-Pareto-Equilibrium_fig4_241476821 (Accessed: 4 July 2024).


The Looking Glass
