Matrix games: examples of problem solving. Payment matrix

PRACTICAL WORK No. 3

Game theory models

The concept of gaming models

Game theory deals with the development of various kinds of recommendations for making decisions in a conflict situation. Forming conflict situations mathematically, they can be represented as a game of two, three or more players, each of whom pursues the goal of maximizing their winnings at the expense of the other player. The mathematical model of a conflict situation is called game, the parties involved in the conflict – players, and the outcome of the conflict is win. For each formalized game, rules, i.e. system of conditions defining:

1. options for players’ actions;

2. the amount of information each player has about the behavior of their partners;

3. the gain that each set of actions leads to.

As a rule, the winnings can be specified quantitatively (for example, a loss is 0, a win is 1, a draw is ½). The game is called steam room, if it involves two players, and multiple, if the number of players is more than two. The game is called zero sum game, if the gain of one of the players is equal to the loss of the other. The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random. Personal move– a conscious choice by the player of one of the possible actions (a move in chess game), random move– a randomly selected action (choosing a card from a shuffled deck).

Player strategy is a set of rules that determine the choice of his action for each personal move, depending on the current situation. The game is called ultimate, if the player has a finite number of strategies, and endless- otherwise.

To solve a game, or find game solution, you should choose a strategy for each player that satisfies the optimality condition, i.e. one of the players must receive maximum win when the second one sticks to his strategy. At the same time, the second player must have minimum loss, if the first one sticks to his strategy. Such strategies are called optimal. Purpose game theory is the definition optimal strategy for each player. When choosing an optimal strategy, it is natural to assume that both players behave reasonably in terms of their interests.

Payment matrix. Lower and top price games

Let's consider the steam room the end game. Let the player A has m personal strategies, which we denote A 1, A 2,…, A m. Let the player B available n personal strategies, let's designate them B 1, B 2,…, B n. They say the game has dimensions m´n. As a result of the players choosing any pair of strategies A i And Bj The outcome of the game is clearly determined, i.e. winnings a ij player A(positive or negative) and loss (- a ij) player IN. Matrix Р=(a ij), the elements of which are the winnings corresponding to the strategies A i And Bj, called payment matrix or matrix of the game.

Bj A i B 1 B 2 Bn
A 1 a 11 a 12 a 1n
A 2 a 21 a 22 a 2n
Am a m1 am 2 a mn

Example - the game "Search"

Player A can hide in shelter 1 - let's denote this strategy as A 1 or in shelter 2 - strategy A 2. Player IN can look for the first player in hideout 1 – strategy IN 1, or in shelter 2 - strategy AT 2. If the player A is located in shelter 1 and is discovered there by the player IN, i.e. a couple of strategies are being implemented (A 1,B 1), then the player A pays a fine, i.e. a 11=–1. Similarly we get a 22=–1. It is obvious that the strategies (A 1,B 2) And (A 2,B 1) give to the player A payoff is 1, so a 12=a 21=1. Thus, we obtain the payment matrix

Consider the game m´n with matrix Р=(a ij) and determine the best among the player’s strategies A. Choosing a strategy A i, player A must expect that the player IN will answer it using one of the strategies In j, for which the payoff for the player A minimal (player IN seeks to “harm” the player A).

Let us denote by a i player's smallest winnings A when choosing a strategy A i for all possible player strategies IN(smallest number in i th row of the payment matrix), i.e. .

Among all the numbers a i Let's choose the largest: . Let's call a the lowest price of the game , or maximum winnings (maximin ). This guaranteed win player A for any strategy of player B. Hence, .

The strategy corresponding to maximin is called maximin strategy. Player IN interested in reducing the player's winnings A; choosing a strategy Bj, it takes into account the maximum possible gain for A. Let us denote .

Among all the numbers, we choose the smallest one and call it b top price of the game , or minimax win (minimax ). This guaranteed loss of player B for any strategy of player A. Hence, .

The strategy corresponding to minimax is called minimax strategy. The principle that dictates that players choose the most cautious minimax and maximin strategies is called minimax principle.

Statistical games

In many tasks that lead to games, uncertainty is caused by the lack of information about the conditions under which the action is carried out. These conditions depend not on the conscious actions of the other player, but on objective reality, which is commonly called “nature.” Such games are called games with nature (statistical games).

Task

After several years of operation, industrial equipment finds itself in one of the following states: B 1 – the equipment can be used in next year after preventive maintenance; B 2 – for trouble-free operation of the equipment in the future, individual parts and assemblies should be replaced; 3 – equipment requires major repairs or replacement.

Depending on the current situation B 1, B 2, B 3, the management of the enterprise can make the following decisions: A 1 - repair the equipment by factory specialists, which requires corresponding costs a 1 = 6, and 2 = 10, and 3 = 15 monetary units ; A 2 - call a special team of repairmen, the costs in this case will be b 1 = 15, b 2 = 9, b 3 = 18 monetary units; A 3 – replace the equipment with new one, selling the obsolete equipment at its residual value. The total costs associated with the results of this event will be equal to, respectively, c 1 =13, c 2 =24, c 3 =12 monetary units.

Exercise

1. Having given the described situation game scheme, identify its participants, indicate possible pure strategies of the parties.

2. Create a payment matrix, explaining the meaning of the elements a ij of the matrix (why are they negative?).

3. Find out what decision on the operation of equipment in the coming year is advisable to recommend to the management of the enterprise in order to minimize losses under the following assumptions: a) the experience accumulated at the enterprise in operating similar equipment shows that the probabilities of the indicated equipment states are equal, respectively, q 1 = 0.15; q 2 =0.55; q 3 =0.3 (apply Bayes criterion); b) existing experience indicates that all three possible states of the equipment are equally probable (apply the Laplace criterion); c) nothing definite can be said about the probability of equipment (apply the Wald, Savage, Hurwitz criteria). The value of the parameter g=0.8 in the Hurwitz criterion is specified.

Solution

1) The described situation is a statistical game.

The statistician is the management of the enterprise, which can make one of the following decisions: repair the equipment on its own (strategy A 1), call repairmen (strategy A 2); replace the equipment with new ones (strategy A 3).

The second playing side, nature, will be considered a set of factors influencing the condition of the equipment: the equipment can be used after preventative repairs (condition B 1); individual components and parts of equipment need to be replaced (state B 2): major repairs or replacement of equipment will be required (state B 3).

2) Let’s create the payment matrix of the game:

Element of the payment matrix a ij shows the costs of the enterprise management if, with the chosen strategy A i, the equipment ends up in state B j. The elements of the payment matrix are negative, since with any chosen strategy, the management of the enterprise will have to bear costs.

a) the experience accumulated at the enterprise in operating similar equipment shows that the probabilities of equipment states are equal to q 1 = 0.15; q 2 =0.55; q 3 =0.3.

Let's present the payment matrix as:

Strategies statistics, A i States of nature B j
B 1 B 2 B 3
A 1 -6 -10 -15 -10,9
A 2 -15 -9 -18 -12,6
A 3 -13 -24 -12 -18,75
q j 0,15 0,55 0,3

where , (i=1.3)

According to the Bayes criterion, the optimal pure strategy A i is taken to be the one that maximizes the average gain of the statistician, i.e. provided =max .

The optimal strategy according to Bayes is strategy A 1 .

b) existing experience indicates that all three possible states of the equipment are equally probable, i.e. = 1/3.

Average winnings are:

1/3*(-6-10-15) = -31/3 "-10.33;

1/3*(-15-9-18) = -42/3 = -14;

1/3*(-13-24-12) = -49/3 » -16.33.

The optimal Laplace strategy is strategy A 1 .

c) nothing definite can be said about the probabilities of equipment.

According to the Wald criterion, a pure strategy is taken as optimal, which under the worst conditions guarantees the maximum gain, i.e.

.

= max (-15, -18, -24) = -15.

Thus, strategy A 1 is optimal.

Let's build a risk matrix, where .

The game is called zero sum game, or antagonistic, if the gain of one of the players is equal to the loss of the other, i.e. For complete task game, it is enough to indicate the size of one of them. If we designate a- winnings of one of the players, b- the other's winnings, then for a zero-sum game b = - a, therefore it is enough to consider, for example, a.

The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random.

Personal move- this is a conscious choice by the player of one of the possible actions (for example, a move in a chess game).

Random move is a randomly selected action (for example, choosing a card from a shuffled deck). In my work I will consider only the personal moves of the players.

Strategy A player is a set of rules that determine the choice of his action at each personal move, depending on the current situation. Usually during the game, with each personal move, the player makes a choice depending on the specific situation. However, in principle, it is possible that all decisions are made by the player in advance (in response to any given situation). This means that the player has chosen a specific strategy, which can be specified as a list of rules or a program. (This way you can play the game using a computer.) The game is called ultimate, if each player has a finite number of strategies, and endless- otherwise.

In order to solve the game, or find a solution to the game, you should choose a strategy for each player that satisfies the condition optimality, i.e. one of the players must receive maximum win when the second one sticks to his strategy. At the same time, the second player must have minimum loss, if the first one sticks to his strategy. Such strategies are called optimal. Optimal strategies must also satisfy stability condition, i.e. It must be disadvantageous for either player to abandon their strategy in this game.

Purpose of Game Theory: Determining the optimal strategy for each player. When choosing an optimal strategy, it is natural to assume that both players behave reasonably in terms of their interests.

Antagonistic games in which each player has a finite set of strategies are called matrix games . This name is explained by the following possibility of describing games of this kind. We draw up a rectangular table in which the rows correspond to the strategies of the first player, the columns to the strategies of the second, and the table cells located at the intersection of the rows and columns correspond to the situations of the game. If we put the winnings of the first player in the corresponding situation in each cell, we obtain a description of the game in the form of a certain matrix. This matrix is ​​called matrix of the game or payoff matrix.

The same finite zero-sum game can be described by different matrices, differing from each other only in the order of rows and columns.

Consider the game m x n with matrix P = (a ij), i = 1,2, ... , m;j = 1,2, ... , n and determine the best among strategies A 1, A 2, …, A m. Choosing a strategy A i player A must expect that the player IN will answer it using one of the strategies Bj, for which the payoff for the player A minimal (player IN seeks to "harm" the player A). Let us denote by a i, the player's smallest winnings A when choosing a strategy A i for all possible player strategies IN(smallest number in i-th line of the payment matrix), i.e.

a i = a ij , j = 1,..., n.

Among all the numbers a i (i = 1,2, ... , m ) choose the largest. Let's call a the lowest price of the game or maximum winnings (maximin). This is a guaranteed win for the player A for any player strategy IN. Hence, , i = 1,... , m; j = 1,..., n

The strategy corresponding to maximin is called maximum strategy. Player IN interested in reducing the player's winnings A; choosing a strategy Bj, it takes into account the maximum possible gain for A.

Let's denote: β i = a ij , i = 1,... , m

Among all the numbers Bj choose the smallest one and call it β top price of the game or minimax win (minimax). This is a guaranteed loss for the player IN.

Hence, i = 1,... , m; j = 1,..., n.

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called the minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

http://emm. *****/lect/lect5.html#vopros2

(looks better on the Internet than this copy)

Elements of Game Theory

1. Basic concepts and definitions. Subject of game theory.
2. Zero-sum games. The solution is in pure strategies.
3. Solving games in mixed strategies.
4. Geometric interpretation of games.
5. Reducing the pair game to a linear programming problem.
6. General scheme solutions to zero-sum paired games.
7. Use of alternative criteria for determining optimal strategies.

1. Basic concepts and definitions. Subject of game theory

Quite often in my practical activities a person has to face problems in which it is necessary to make a decision in conditions where two or more parties pursue different goals, and the results of any action of each party depend on the activities of the partner. Such situations, which arise, for example, when playing chess, checkers or dominoes, are classified as conflictual: The result of each player's move depends on the opponent's counter move. What this response will be is not known in advance, so they say that the decision has to be made under conditions of uncertainty. The goal of the game is for one of the participants to win.

In economics, conflict situations occur frequently and are diverse in nature. These include, for example, relationships between supplier and consumer, buyer and seller, bank and client. In these examples, the conflict situation is generated by the difference in the interests of the partners and the desire of each of them to make decisions that realize their goals to the greatest extent. At the same time, everyone has to take into account not only their own goals, but also the goals of their partner, and take into account the decisions unknown in advance that these partners will make.

The situation is called conflict, if it involves parties whose interests are completely or partially opposite.

There are scientifically based methods for rationally solving problems with conflict situations. Such methods have been developed mathematical theory conflict situations, which is called game theory.

A game- this is an actual or formal conflict in which there are at least two participants (players), each of whom strives to achieve their own goals.

The permissible actions of each player aimed at achieving a certain goal are called rules of the game.

The game is called steam room, if it involves two players, and multiple, if the number of players is more than two. In what follows, we will consider only doubles games. This game involves two players - A and B, whose interests are opposite. The game (game process) will be understood as a series of actions on the part of A and B.

Quantifying the results of a game is called payment.

The doubles game is called zero sum game, or antagonistic, if the amount of payments is zero, that is, the gain of one player is equal to the loss of the other. In this case, to complete the task of the game, it is enough to indicate one of the quantities. If, for example, a– winnings of one of the players, b- the other's winnings, then for a zero-sum game b = -a, therefore it is enough to consider, for example, a.

Within this course We will consider zero-sum paired games.

The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random.

Personal move is a conscious choice by a player of one of the possible actions (for example, a move in a chess game).

Random move is a randomly selected action (for example, choosing a card from a shuffled deck).

In the future, we will consider only the personal moves of the players.

Player strategy is a set of rules that determine the choice of his action at each personal move, depending on the current situation.

Usually during the game, with each personal move, the player makes a choice depending on the specific situation. However, in principle, it is possible that decisions are made by the player in advance (in response to any given situation). This means that the player has chosen a specific strategy, which can be specified as a list of rules or a program.

The game is called ultimate, if each player has a finite number of strategies, and endless- otherwise.

The player's strategy is called optimal, if it provides the player with the maximum win (or, what is the same, the minimum loss), provided that the second player adheres to his strategy.

If the game is repeated many times, then players may be interested not in winning and losing in each specific game, but in average win (loss) in all batches.

In order to solve the game, or find a solution to the game, it is necessary for each player to choose the optimal strategy.

Thus, subject of game theory constitute methods for finding optimal player strategies.

When choosing an optimal strategy, it is natural to assume that both players behave reasonably in terms of their interests. The most important limitation of game theory is the uniqueness of payoff as an indicator of efficiency, while in most real economic problems there is more than one indicator of efficiency. In addition, in economics, as a rule, there are tasks in which the interests of partners are not necessarily antagonistic. However, solving games in the presence of many participants with consistent interests is a much more difficult task.

We will limit ourselves to considering zero-sum paired games.

2. Zero-sum games. Solution in pure strategies

Consider a paired finite game.

Let player A have m personal strategies: A1, A2, …, Am. Let player B have n personal strategies. Let us denote them B1, B2, …, Bn. In this case, the game has a dimension mxn..gif" width="39" height="17 src=">) the outcome of the game is uniquely determined, i.e. the winning aij of player A (positive or negative) and the loss (- aij) of player IN.

Let us assume that the values ​​of aij are known for any pair of strategies (Ai, Bj).

Matrix A = (aij), 230" style="width:172.5pt">

a11 a12 ... a1n
a21 a22 ... a2n

The payment matrix is ​​also often presented in table form (see Table 5.1).

Table 5.1 - General form payment matrix

The rows of matrix A correspond to the strategies of the first player, and the columns correspond to the strategies of the second.

These strategies are called clean.

Example 5.1. Create a payoff matrix for the next game (Search game).

Player A can hide in one of two shelters (I or II); player B looks for player A, and if he finds it, he receives a fine of 1 monetary unit from A, otherwise he pays player A 1 monetary unit.

Solution.

In order to create a payment matrix, you should analyze the behavior of each player. Player A can hide in shelter I - let's denote this strategy by A1, or in shelter II - strategy A2.

Player B can look for the first player in shelter I - strategy B1, or in shelter II - strategy B2. If player A is in shelter I and player B discovers him there, i.e., a pair of strategies (A1, B1) is implemented, then player A pays a penalty, i.e. a11 = -1. Similarly a22 = -1.

Obviously, combinations of strategies (A1, B2) and (A2, B1) give player A a payoff equal to one, so a12 = a21 = 1.

Thus, for the game "Search" size 2x2 we get the following payment matrix:

A (hiding) =

Let's consider a game of size mxn with matrix A = (aij), https://pandia.ru/text/78/456/images/image002_132.gif" width="39" height="17 src=">and determine the best among strategies A1, A2, …, Am.

When choosing a strategy Ai, player A must expect that player B will respond to it with one of the strategies Bj for which the payoff of player A is minimal (player B seeks to “harm” player A).

Let's denote MsoNormalTable">

Among the numbers https://pandia.ru/text/78/456/images/image010_40.gif" width="44" height="16 src=">) we will choose the largest. Let's call the lowest price of the game or maximum winnings (maximin). This guaranteed payoff of player A for any strategy of player B.

The final formula can be written as follows:

The strategy corresponding to maximin is called maximin strategy.

Similar reasoning can be carried out in relation to player B.

Player B is interested in reducing Player A's payoff.

When choosing strategy Bj, he takes into account that player A will strive for the maximum win.

Let's denote https://pandia.ru/text/78/456/images/image015_30.gif" width="10" height="17 src=">.gif" width="58 height=23" height="23" >and let's call top price of the game or minimax. This minimum guaranteed loss for player B.

Thus:

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” maximin and minimax strategies is called minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

The player chooses his actions, assuming that the enemy will act in an unfavorable manner, that is, he will try to “harm.”

Let's go back to example 5.1 and determine the lower and upper price of the game in the “Search” problem.

Consider the payment matrix:

When choosing strategy A1 (the first row of the matrix), the minimum payoff is https://pandia.ru/text/78/456/images/image012_33.gif" width="10" height="8 src=">2 = min (- 1; 1) = -1, it is achieved when player B uses strategy B2.

Guaranteeing yourself the maximum win for any strategy of player B, i.e..gif" width="10" height="8 src=">.gif" width="7" height="14">1 = max (-1 ; 1) = 1.

Similarly, the maximum loss of player B when he chooses strategy B2 (second column) is 2 = max (1; -1) = 1.

Thus, for any strategy of player A, the guaranteed minimum loss of player B is = min (1, 2) = min (1, 1) = 1 - the upper price of the game.

Any strategy of player B is minimax.

We summarize the results of our reasoning in table 5.2, which is a payment matrix with an additional row j and column i. At their intersection we will write down the upper and lower prices of the game.

Table 5.2 - Payment matrix of the game "Search" with additional row and column

Thus, in the problem under consideration, the lower and upper prices of the game are different: https://pandia.ru/text/78/456/images/image017_28.gif" height="14 src=">.

If the upper and lower prices of the game coincide, then general meaning upper and lower prices v = https://pandia.ru/text/78/456/images/image017_28.gif" height="14 src=">is called net price games, or simply at the cost of the game. The maximin and minimax strategies corresponding to the price of the game are optimal strategies, and their totality – optimal solution, or simply game solution.

In this case, player A receives the maximum guaranteed (independent of the behavior of player B) winnings v, and player B achieves the minimum guaranteed (independent of the behavior of player A) loss v. They say that the solution to the game has stability, i.e., if one of the players adheres to its optimal strategy, then it cannot be profitable for the other to deviate from its optimal strategy.

A pair of pure strategies Ai and Bj gives an optimal solution to the game if and only if its corresponding element aij is both the largest in its column and the smallest in its row.

This situation, if it exists, is called saddle point(similar to the surface of a saddle, which curves up in one direction and down in the other).

Thus, for a game with a saddle point, finding a solution involves choosing a maximin and minimax strategy, which are optimal.

Example 5.2. Determine the lower and upper price of the game, which is given by the following payoff matrix:

0,5 0,6 0,8
0,9 0,7 0,8
0,7 0,6 0,6

Solution.

Let's find out whether the game has a saddle point. It is convenient to carry out the solution in a table. Table 5.3 includes the game's payoff matrix as well as an additional row and column that illustrate the process of finding optimal strategies.

Table 5.3 - Payment matrix of example 5.2 with additional row and column

Let's give some explanations.

Column https://pandia.ru/text/78/456/images/image012_33.gif" width="10" height="8 src=">1 = 0.5; 2 = 0.7; 3 = 0, 6 - minimum numbers in lines.

Similarly, https://pandia.ru/text/78/456/images/image017_28.gif" width="7 height=14" height="14">2 = 0.7; 3 = 0.8 - maximum numbers in columns.

3. Solving games in mixed strategies

So, for a game with a saddle point, finding a solution consists of choosing the maximin and minimax strategies, which are optimal.

If the game does not have a saddle point, then the use of pure strategies does not provide an optimal solution to the game. For example, in the game "Search" ( example 5.1) there is no saddle point.

In this case, it is possible to obtain an optimal solution by alternating pure strategies.

A mixed strategy for player A is the use of pure strategies A1, A2, …, Am with probabilities u1, u2, …, um.

Usually the mixed strategy of the first player is denoted as a vector: U = (u1, u2, ..., um), and the strategy of the second player as a vector: Z = (z1, z2, ..., zm).

It's obvious that:

ui ≥ 0, ,
zj ≥ 0, ,
ui = 1, zj = 1.

Optimal game solution(or simply - game solution) is a pair of optimal strategies U*, Z*, generally mixed, having the following property: if one of the players adheres to his optimal strategy, then it cannot be profitable for the other to deviate from his. The payoff corresponding to the optimal solution is called at the cost of the game v. The price of the game satisfies the inequality:

The following fundamental theorem of game theory is true.

Neumann's theorem. Every finite zero-sum game has a mixed strategy solution..

Let U* = (, https://pandia.ru/text/78/456/images/image030_23.gif" width="15" height="17 src=">) and Z* = (, https:// pandia.ru/text/78/456/images/image033_22.gif" width="13" height="17 src=">) - a pair of optimal strategies. If a pure strategy is included in an optimal mixed strategy with probability different from zero, then it is called active.

Active Strategies Theorem. If one of the players sticks to his optimal mixed strategy, then the payoff remains unchanged and equal to the game price v, if the second player does not go beyond the limits of his active strategies..

This theorem is of great practical importance - it provides specific models for finding optimal strategies in the absence of a saddle point.

Consider the game of size 2 x2 .

Such a game is the simplest case of a finite game. If such a game has a saddle point, then the optimal solution is a pair of pure strategies corresponding to this point.

For a game in which there is no saddle point in accordance with Neumann’s theorem, an optimal solution exists and is determined by the pair mixed strategies U* = (https://pandia.ru/text/78/456/images/image029_24.gif" width="13" height="17 src=">) and Z* = (, https://pandia. ru/text/78/456/images/image029_24.gif" width="13" height="17"> = v. Considering that + = 1, we obtain a system of equations:

We'll find the best strategy player A, for which we analyze all his strategies sequentially. Choosing a strategy A i, we must count on the player B will answer it with such a strategy Bj, for which the gain A will be minimal. Therefore, among the numbers in the first line, we select the minimum one, denote it, and write it in the additional column. Similarly for each strategy A i choose, i.e. αi– minimum winnings when applying the strategy A i.
In example 1:
α 1= min (0, –1, –2) = –2;
α 2= min (1, 0, –1) = –1;
α 3= min (0, –1, –2) = 0.
We will write these numbers in the additional column. What strategy should the player choose? A? Of course, the strategy for which αi maximum. Let's denote . This is a guaranteed win that a player can secure for himself A, i.e. ; this winning is called the lowest price of the game or maximin . Strategy A i, ensuring the receipt of the lower price of the game is called maximin(reinsurance). If the player A will adhere to this strategy, then he is guaranteed to win ≥ α for any behavior of the player B.
In example 1. This means that if A will write “3”, then at least he won’t lose. Player B interested in reducing winnings A. Choosing a strategy B 1, for reasons of caution, he takes into account the maximum possible gain A. Let's denote . Similarly when choosing a strategy Bj maximum possible gain A–; Let's write these numbers in an additional line. To reduce your winnings A, it is necessary to choose the smallest from the numbers β j. The number is called top price of the game or minimax . This is a guaranteed loss for the player B(i.e. he will lose no more than β). Player strategy B, providing a payoff ≥ - β is called its minimax strategy.
In example 1:
;
;
;
.
This means that the optimal strategy B– write “3”, then at least he won’t lose.

BA B 1 B 2 B 3
A 1 0 – 1 –2 –2
A 2 1 0 –1 –1
A 3 2 1 0 0
2 1 0 0 0
The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called minimax principle . This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.
It can be proven that , i.e. .
In example 1 α = β. If those. minimax coincides with maximin, then this game is called saddle point game. Saddle point is a pair of optimal strategies ( Ai,Bj). In example 1, the game has a saddle point ( A 3, B 3). In this case, the number α = β is called (pure) at the cost of the game(the lower and upper price of the game are the same). This means that the matrix contains an element that is the minimum in its row and at the same time the maximum in its column. In example 1, this is element 0. The price of the game is 0.
Optimal strategies in any game have an important property, namely: stability . This means that each player is not interested in deviating from his optimal strategy, since this is unprofitable for him. Deviation from the player's optimal strategy A leads to a decrease in his winnings, and the player’s one-sided deviation IN- to increase losses. They say that the saddle point gives equilibrium position .
Example 2. First side (player A) selects one of three types of weapons – A 1, A 2, A 3, and the opponent (player IN) - one of three types of aircraft: IN 1, AT 2, AT 3. Target IN– breakthrough of the defense front, goal A- aircraft destruction. Probability of aircraft being hit IN 1 weapons A 1 equal to 0.5, aircraft AT 2 weapons A 1 equal to 0.6, aircraft AT 3 weapons A 1 equals 0.8, etc., i.e. element a ij payment matrix – probability of aircraft being hit INj weapons Ai. The payoff matrix has the form: Solve the game, i.e. find the lower and upper price of the game and optimal strategies.
Solution. In each line we find the minimum element and write it in the additional column. In each column we find the maximum element and write it in the additional line.
IN A IN 1 AT 2 AT 3 αi
A 1 0,5 0,6 0,8 0,5
A 2 0,9 0,7 0,8 0,7
A 3 0,7 0,5 0,6 0,5
β j 0,9 0,7 0,8 0,7 0,7

In the additional column we find the maximum element = 0.7, in the additional line we find the minimum element = 0.7.
Answer: = 0.7. Optimal strategies – A 2 And AT 2.

Example 3. A game of toss. Each player can choose one of two strategies during his move: heads or tails. If the selected strategies coincide A receives a win +1, if there is a mismatch B receives payoff 1 (i.e. A receives a payoff –1). Payment matrix:
Find the lower and upper price of the game. Does the game have a saddle point?

Solution.

IN A IN 1 AT 2
A 1 1 -1 -1
A 2 -1 1 1
1 1 -1 1

α = -1, β = 1, i.e. A will lose no more than 1, and B will lose no more than 1. Since α ≠ β, the game has no saddle point. There is no equilibrium position in this game, and an optimal solution cannot be found in pure strategies.

Example. Find the lower price of the game, the upper price of the game, determine the saddle points, optimal pure strategies and the price of the game (if they exist).

Consider a paired finite game. Let the player A has m personal strategies, which we denote A 1 , A 2 , …,A m . Let the player IN available n personal strategies, let's designate them IN 1 ,IN 2 , …, IN n. They say the game of names is dimension mn. As a result of the players choosing any pair of strategies

A i And B i (I = 1, 2, …, m; j = 1, 2, …, n)

the outcome of the game is clearly determined, i.e., winning a ij player A(positive or negative) and loss (- a ij) player IN. Let's assume that the values a ij known for any pair of strategies ( Ai,Bj). Matrix R = (A ij), i = 1, 2, …, m; j = 1, 2, …, n, the elements of which are the winnings corresponding to the strategies A i And B j, called payment matrix or matrix of the game. The general appearance of such a matrix is ​​presented in Table. 1. The rows of this table correspond to the player’s strategies A, and the columns are the player’s strategies IN.

Let's create a payout matrix for the next game.

Table 1

A j B i

a 1n

a 2n

a m 1

a mn

1. Search game.

Player A can hide in one of the shelters (I and II); player IN looking for a player A, and if found, he receives a fine of 1 den. units from A, otherwise pays the player A 1 day units It is necessary to build a payment matrix for the game.

Solution. To compile a payment matrix, you should analyze the behavior of each player. Player A can hide in shelter I - we denote this strategy by A 1 or to the shelter II - strategy A 2 .

Player IN can look for the first player in the shelter I - strategy IN 1, or in the shelter II - strategy IN 2. If the player A is located in Vault I and is discovered by the player there IN, i.e. a couple of strategies are implemented ( A 1 , IN 1), then the player A pays a fine, i.e. A 11 = -1. similarly we get A 22 = -1 (A 2 , IN 2). It is obvious that the strategies ( A 1 , IN 1) and ( A 2 , IN 2) give to the player A payoff is 1, so A 12 = A 21 = 1.

Start of the problem condition; - completion of the problem solution.

Thus, for a “search” game of size 2 2 we obtain the payoff matrix

Consider the game m n with matrix P = ( A ij), i = 1, 2, …, m; j = 1, 2, …, n and determine the best among strategies A 1 , A 2 , …, A m. Choosing a strategy A i, player A must expect that the player IN will answer it using one of the strategies B j , for which the payoff for the player A minimal (player IN seeks to “harm” the player A).

Let us denote by A i player's smallest winnings A when choosing a strategy A i for all possible player strategies IN(smallest number in i-th line of the payment matrix), i.e.

A ij = i . (1.1)

among all the numbers? i (i = 1, 2, …, m) choose the largest: ? = ? i. Shall we call it? the lowest price of the game, or maximum winnings (maximin). This is the guaranteed win of player A for any player strategy IN. Hence,

? = a ij . (1.2)

The strategy corresponding to maximin is called maximin strategy. Player IN interested in reducing the player's winnings A; choosing a strategy B j, it takes into account the maximum possible gain for A. Let's denote

? j = a ij (1.3)

Among all the numbers? j Shall we choose the smallest? = ? j and let's call it? top price of the game or minimax win (minimax). This is a guaranteed loss for the player IN. Hence,

? = a ij . (1.4)

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called the minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent. Let us determine the lower and upper prices of the game and the corresponding strategies in the problem 1. Consider the payment matrix

from the problem 1. When choosing a strategy A 1 (first row of the matrix) the minimum winning is? 1 = min(-1;1) = -1 and corresponds to the strategy? 1 player IN. When choosing a strategy A 2 (second row of the matrix) the minimum winning is? 2 = min(1;-1) = -1, it is achieved with the strategy IN 2 .

Guaranteeing yourself the maximum win for any player strategy IN, i.e. lower price for the game? = max(? 1 , ? 2) = max(-1,-1) = -1, player A can choose any strategy: A 1 or A 2, i.e. any of his strategies is maximin.

Choosing a strategy IN 1 (column 1), player IN understands that the player A will respond with a strategy A 2 to maximize your winnings (loss IN). Therefore, the player's maximum loss is IN when choosing a strategy IN Is 1 equal? 1 = max(-1;1) = 1.

Similarly, the maximum loss of a player IN(winning A) when choosing a strategy IN 2 (column 2) equals? 2 = max(1;-1) = 1.

Thus, for any player strategy A guaranteed minimum player loss IN equal? = min (? 1; ? 2) = min(1;1) = 1 - the top price of the game.

Any player strategy IN is minimax. Having added the table 1 line? j and column? i, we get table. 2. At the intersection of the additional rows and columns, we will write down the upper and lower prices of the games.

Table 2.

A j B i

? i

? j

In the problem 1 , discussed above, are the upper and lower prices of the game different? ? ?.

If the upper and lower prices of the game are the same, then the total value of the upper and lower prices of the game? = ? = v is called clean at the cost of the game, or at the cost of the game. Minimax strategies corresponding to the price of the game are optimal strategies, and their totality is optimal decision, or decision games. In this case the player A receives the maximum guaranteed (independent of the player’s behavior) IN) payoff v, and the player IN achieves the minimum guaranteed (regardless of the player’s behavior A) loss v. They say that the solution of the game is stable, i.e. If one player adheres to his optimal strategy, then it cannot be profitable for the other to deviate from his optimal strategy.

A couple of pure strategies A j And B i gives an optimal solution to the game if and only if the corresponding element a ij is both the largest in its column and the smallest in its row. This situation, if oan exists, is called saddle point(similar to the surface of a saddle, which curves up in one direction and down in the other).

Let's denote A * And B * - a pair of pure strategies that achieve a solution to the game in the problem with a saddle point. Let us introduce the payoff function of the first player for each pair of strategies: R(A i , B j) = a ij. Then, from the optimality condition at the saddle point, the double inequality holds: R(A i , B *) ? R(A * , B * ) ? R(A * , B j), which is true for everyone i = 1, …, m; j = 1, …, n. indeed, the choice of strategy A * the first player with the optimal strategy B * the second player maximizes the minimum possible payoff: R(A * , B * ) ? R(A * , B).

2. determine the lower and upper price of the game given by the payment matrix

P = 0.9 0.7 0.8

Table 3.

A i B j

Does the game have a saddle point?

Solution. It is convenient to carry out all calculations in a table, to which, in addition to the matrix P, a column is entered? i and line? j(Table 3). Analyzing the rows of the matrix (player strategies A), fill out the column? i: ? 1 = 0.5, ? 2 = 0.7, ? 3 = 0.6 - minimum numbers in lines 1, 2, 3. Similarly? 1 = 0.9, ? 2 = 0.7, ? 3 = 0.8 - maximum numbers in columns 1, 2, 3 respectively.

Lowest game price? = ? i= max (0.5; 0.7; 0.6) = 0.7 ( greatest number in the column ? i) and the top price of the game? = ? j= min(0.9, 0.7, 0.8) = 0.7 (smallest number in the line? j). These values ​​are equal, i.e. ? = ?, and are achieved using the same pair of strategies ( A 2 , IN 2). Therefore, the game has a saddle point ( A 2 , IN 2) and game price = 0.7.