The maximum guaranteed win of player a is called. Game theory

http://emm. *****/lect/lect5.html#vopros2

(looks better on the Internet than this copy)

Elements of Game Theory

1. Basic concepts and definitions. Subject of game theory.
2. Zero-sum games. The solution is in pure strategies.
3. Solving games in mixed strategies.
4. Geometric interpretation of games.
5. Reducing the pair game to a linear programming problem.
6. General scheme solutions to zero-sum paired games.
7. Use of alternative criteria for determining optimal strategies.

1. Basic concepts and definitions. Subject of game theory

Quite often in my practical activities a person has to face problems in which it is necessary to make a decision in conditions where two or more parties pursue different goals, and the results of any action of each party depend on the activities of the partner. Such situations, which arise, for example, when playing chess, checkers or dominoes, are classified as conflictual: The result of each player's move depends on the opponent's counter move. What this response will be is not known in advance, so they say that the decision has to be made under conditions of uncertainty. The goal of the game is for one of the participants to win.

In economics, conflict situations occur frequently and are diverse in nature. These include, for example, relationships between supplier and consumer, buyer and seller, bank and client. In these examples, the conflict situation is generated by the difference in the interests of the partners and the desire of each of them to make decisions that realize their goals to the greatest extent. At the same time, everyone has to take into account not only their own goals, but also the goals of their partner, and take into account the decisions unknown in advance that these partners will make.

The situation is called conflict, if it involves parties whose interests are completely or partially opposite.

There are scientifically based methods for rationally solving problems with conflict situations. Such methods have been developed mathematical theory conflict situations, which is called game theory.

A game- this is an actual or formal conflict in which there are at least two participants (players), each of whom strives to achieve their own goals.

The permissible actions of each player aimed at achieving a certain goal are called rules of the game.

The game is called steam room, if it involves two players, and multiple, if the number of players is more than two. In what follows, we will consider only doubles games. This game involves two players - A and B, whose interests are opposite. The game (game process) will be understood as a series of actions on the part of A and B.

Quantifying the results of a game is called payment.

The doubles game is called zero sum game, or antagonistic, if the amount of payments is zero, that is, the gain of one player is equal to the loss of the other. In this case for complete task game, it is enough to indicate one of the quantities. If, for example, a– winnings of one of the players, b- the other's winnings, then for a zero-sum game b = -a, therefore it is enough to consider, for example, a.

Within this course We will consider zero-sum paired games.

The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random.

Personal move- this is a conscious choice by the player of one of the possible actions (for example, a move to chess game).

Random move is a randomly chosen action (for example, choosing a card from a shuffled deck).

In the future, we will consider only the personal moves of the players.

Player strategy is a set of rules that determine the choice of his action at each personal move, depending on the current situation.

Usually during the game, with each personal move, the player makes a choice depending on the specific situation. However, in principle, it is possible that decisions are made by the player in advance (in response to any given situation). This means that the player has chosen a specific strategy, which can be specified as a list of rules or a program.

The game is called ultimate, if each player has a finite number of strategies, and endless- otherwise.

The player's strategy is called optimal, if it provides the player maximum win(or, what is the same thing, a minimum loss), provided that the second player sticks to his strategy.

If the game is repeated many times, then players may be interested not in winning and losing in each specific game, but in average win (loss) in all batches.

In order to solve the game, or find a solution to the game, it is necessary for each player to choose the optimal strategy.

Thus, subject of game theory constitute methods for finding optimal player strategies.

When choosing optimal strategy it is natural to assume that both players behave rationally from the point of view of their interests. The most important limitation of game theory is the uniqueness of payoff as an indicator of efficiency, while in most real economic problems there is more than one indicator of efficiency. In addition, in economics, as a rule, there are tasks in which the interests of partners are not necessarily antagonistic. However, solving games in the presence of many participants with consistent interests is a much more difficult task.

We will limit ourselves to considering zero-sum paired games.

2. Zero-sum games. Solution in pure strategies

Let's consider the steam room the end game.

Let player A have m personal strategies: A1, A2, …, Am. Let player B have n personal strategies. Let us denote them B1, B2, …, Bn. In this case, the game has a dimension mxn..gif" width="39" height="17 src=">) the outcome of the game is uniquely determined, i.e. the winning aij of player A (positive or negative) and the loss (- aij) of player IN.

Let us assume that the values ​​of aij are known for any pair of strategies (Ai, Bj).

Matrix A = (aij), 230" style="width:172.5pt">

a11 a12 ... a1n
a21 a22 ... a2n

The payment matrix is ​​also often presented in table form (see Table 5.1).

Table 5.1 - General form payment matrix

The rows of matrix A correspond to the strategies of the first player, and the columns correspond to the strategies of the second.

These strategies are called clean.

Example 5.1. Create a payoff matrix for the next game (Search game).

Player A can hide in one of two shelters (I or II); player B looks for player A, and if he finds it, he receives a fine of 1 monetary unit from A, otherwise he pays player A 1 monetary unit.

Solution.

In order to create a payment matrix, you should analyze the behavior of each player. Player A can hide in shelter I - let's denote this strategy by A1, or in shelter II - strategy A2.

Player B can look for the first player in shelter I - strategy B1, or in shelter II - strategy B2. If player A is in shelter I and player B discovers him there, i.e., a pair of strategies (A1, B1) is implemented, then player A pays a penalty, i.e. a11 = -1. Similarly a22 = -1.

Obviously, combinations of strategies (A1, B2) and (A2, B1) give player A a payoff equal to one, so a12 = a21 = 1.

Thus, for the game "Search" size 2x2 we get the following payment matrix:

A (hiding) =

Let's consider a game of size mxn with matrix A = (aij), https://pandia.ru/text/78/456/images/image002_132.gif" width="39" height="17 src=">and determine the best among strategies A1, A2, …, Am.

When choosing a strategy Ai, player A must expect that player B will respond to it with one of the strategies Bj for which the payoff of player A is minimal (player B seeks to “harm” player A).

Let's denote MsoNormalTable">

Among the numbers https://pandia.ru/text/78/456/images/image010_40.gif" width="44" height="16 src=">) we will choose the largest. Let's call the lowest price of the game or maximum winnings (maximin). This guaranteed payoff of player A for any strategy of player B.

The final formula can be written as follows:

The strategy corresponding to maximin is called maximin strategy.

Similar reasoning can be carried out in relation to player B.

Player B is interested in reducing Player A's payoff.

When choosing strategy Bj, he takes into account that player A will strive for the maximum win.

Let's denote https://pandia.ru/text/78/456/images/image015_30.gif" width="10" height="17 src=">.gif" width="58 height=23" height="23" >and let's call top price of the game or minimax. This minimum guaranteed loss for player B.

Thus:

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” maximin and minimax strategies is called minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

The player chooses his actions, assuming that the enemy will act in an unfavorable manner, that is, he will try to “harm.”

Let's go back to example 5.1 and determine the lower and upper price of the game in the “Search” problem.

Consider the payment matrix:

When choosing strategy A1 (the first row of the matrix), the minimum payoff is https://pandia.ru/text/78/456/images/image012_33.gif" width="10" height="8 src=">2 = min (- 1; 1) = -1, it is achieved when player B uses strategy B2.

Guaranteeing yourself the maximum win for any strategy of player B, i.e..gif" width="10" height="8 src=">.gif" width="7" height="14">1 = max (-1 ; 1) = 1.

Similarly, the maximum loss of player B when he chooses strategy B2 (second column) is 2 = max (1; -1) = 1.

Thus, for any strategy of player A, the guaranteed minimum loss of player B is = min (1, 2) = min (1, 1) = 1 - the upper price of the game.

Any strategy of player B is minimax.

We summarize the results of our reasoning in table 5.2, which is a payment matrix with an additional row j and column i. At their intersection we will write down the upper and lower prices of the game.

Table 5.2 - Payment matrix Search game with extra row and column

Thus, in the problem under consideration, the lower and upper prices of the game are different: https://pandia.ru/text/78/456/images/image017_28.gif" height="14 src=">.

If the upper and lower prices of the game coincide, then general meaning upper and lower prices v = https://pandia.ru/text/78/456/images/image017_28.gif" height="14 src=">is called at the pure price of the game, or simply at the cost of the game. The maximin and minimax strategies corresponding to the price of the game are optimal strategies, and their totality – optimal solution, or simply game solution.

In this case, player A receives the maximum guaranteed (independent of the behavior of player B) winnings v, and player B achieves the minimum guaranteed (independent of the behavior of player A) loss v. They say that the solution to the game has stability, i.e., if one of the players adheres to its optimal strategy, then it cannot be profitable for the other to deviate from its optimal strategy.

Pair pure strategies Ai and Bj gives an optimal solution to the game if and only if its corresponding element aij is both the largest in its column and the smallest in its row.

This situation, if it exists, is called saddle point(similar to the surface of a saddle, which curves up in one direction and down in the other).

Thus, for a game with a saddle point, finding a solution involves choosing a maximin and minimax strategy, which are optimal.

Example 5.2. Determine the lower and upper price of the game, which is given by the following payoff matrix:

0,5 0,6 0,8
0,9 0,7 0,8
0,7 0,6 0,6

Solution.

Let's find out whether the game has a saddle point. It is convenient to carry out the solution in a table. Table 5.3 includes the game's payoff matrix as well as an additional row and column that illustrate the process of finding optimal strategies.

Table 5.3 - Payment matrix of example 5.2 with additional row and column

Let's give some explanations.

Column https://pandia.ru/text/78/456/images/image012_33.gif" width="10" height="8 src=">1 = 0.5; 2 = 0.7; 3 = 0, 6 - minimum numbers in lines.

Similarly, https://pandia.ru/text/78/456/images/image017_28.gif" width="7 height=14" height="14">2 = 0.7; 3 = 0.8 - maximum numbers in columns.

3. Solving games in mixed strategies

So, for a game with a saddle point, finding a solution consists of choosing the maximin and minimax strategies, which are optimal.

If the game does not have a saddle point, then the use of pure strategies does not provide an optimal solution to the game. For example, in the game "Search" ( example 5.1) there is no saddle point.

In this case, it is possible to obtain an optimal solution by alternating pure strategies.

A mixed strategy for player A is the use of pure strategies A1, A2, …, Am with probabilities u1, u2, …, um.

Usually the mixed strategy of the first player is denoted as a vector: U = (u1, u2, ..., um), and the strategy of the second player as a vector: Z = (z1, z2, ..., zm).

It's obvious that:

ui ≥ 0, ,
zj ≥ 0, ,
ui = 1, zj = 1.

Optimal game solution(or simply - game solution) is a pair of optimal strategies U*, Z*, generally mixed, having the following property: if one of the players adheres to his optimal strategy, then it cannot be profitable for the other to deviate from his. The payoff corresponding to the optimal solution is called at the cost of the game v. The price of the game satisfies the inequality:

The following fundamental theorem of game theory is true.

Neumann's theorem. Every finite zero-sum game has a mixed strategy solution..

Let U* = (, https://pandia.ru/text/78/456/images/image030_23.gif" width="15" height="17 src=">) and Z* = (, https:// pandia.ru/text/78/456/images/image033_22.gif" width="13" height="17 src=">) - a pair of optimal strategies. If a pure strategy is included in an optimal mixed strategy with probability different from zero, then it is called active.

Active Strategies Theorem. If one of the players sticks to his optimal mixed strategy, then the payoff remains unchanged and equal to the game price v, if the second player does not go beyond the limits of his active strategies..

This theorem is of great practical importance - it provides specific models for finding optimal strategies in the absence of a saddle point.

Consider the game of size 2 x2 .

Such a game is the simplest case of a finite game. If such a game has a saddle point, then the optimal solution is a pair of pure strategies corresponding to this point.

For a game in which there is no saddle point in accordance with Neumann’s theorem, an optimal solution exists and is determined by a pair of mixed strategies U* = (https://pandia.ru/text/78/456/images/image029_24.gif" width="13 " height="17 src=">) and Z* = (, https://pandia.ru/text/78/456/images/image029_24.gif" width="13" height="17"> = v. Considering that + = 1, we obtain a system of equations:

Lecture 9. The concept of game models. Payment matrix.

§ 6 ELEMENTS OF GAME THEORY

6.1 The concept of game models.

The mathematical model of a conflict situation is called game , parties involved in the conflict - players, and the outcome of the conflict is win .

For each formalized game, rules , those. a system of conditions that determines: 1) options for players’ actions; 2) the amount of information each player has about the behavior of their partners; 3) the gain that each set of actions leads to. Typically, winning (or losing) can be quantified; for example, you can value a loss as zero, a win as one, and a draw as 1/2. Quantifying the results of a game is called payment .

The game is called steam room , if it involves two players, and multiple , if the number of players is more than two. We will only consider doubles games. They involve two players A And IN, whose interests are opposite, and by game we mean a series of actions on the part of A And IN.

The game is called zero sum game or antagonistic sky , if the gain of one of the players is equal to the loss of the other, i.e. the sum of the winnings of both sides is zero. To complete the game task, it is enough to indicate the value of one of them . If we designate A– winnings of one of the players, b the other's winnings, then for a zero-sum game b =A, therefore it is enough to consider, for example A.

The choice and implementation of one of the actions provided for by the rules is called progress player. The moves may be personal And random . Personal move this is a conscious choice by the player of one of the possible actions (for example, a move in a chess game). The set of possible options for each personal move is regulated by the rules of the game and depends on the totality of previous moves on both sides.

Random move it is a randomly chosen action (for example, choosing a card from a shuffled deck). For a game to be mathematically defined, the rules of the game must indicate for each random move probability distribution possible outcomes.

Some games may consist only of random moves (so-called pure gambling) or only of personal moves (chess, checkers). Most card games belong to mixed type games, that is, they contain both random and personal moves. In the future, we will consider only the personal moves of the players.

Games are classified not only by the nature of moves (personal, random), but also by the nature and amount of information available to each player regarding the actions of the other. A special class of games consists of the so-called “games with complete information». A game with complete information is a game in which each player, with each personal move, knows the results of all previous moves, both personal and random. Examples of games with complete information include chess, checkers, and famous game"crosses and toes". Most games of practical importance do not belong to the class of games with complete information, since uncertainty about the actions of the enemy is usually an essential element of conflict situations.

One of the main concepts of game theory is the concept strategies .

Strategy A player is a set of rules that determine the choice of his action at each personal move, depending on the current situation. Usually during the game, with each personal move, the player makes a choice depending on the specific situation. However, it is in principle possible that all decisions are made by the player in advance (in response to any given situation). This means that the player has chosen a specific strategy, which can be specified as a list of rules or a program. (This way you can play the game using a computer.) The game is called ultimate , if each player has a finite number of strategies, and endless .– otherwise.

In order to decide game , or find game solution , for each player we should choose a strategy that satisfies the condition optimality , those. one of the players must receive maximum win, when the second one sticks to his strategy, At the same time the second player must have minimum loss , if the first one sticks to his strategy. Such strategies are called optimal . Optimal strategies must also satisfy the condition sustainability , those. It must be disadvantageous for either player to abandon their strategy in this game.

If the game is repeated quite a few times, then players may not be interested in winning and losing in each specific game, but Aaverage win (loss) in all batches.

The goal of game theory is to determine the optimal strategy for each player.

6.2. Payment matrix. Lower and top price games

The ultimate game in which the player A It has T strategies, and the player V – p strategies is called a game.

Consider the game
two players A And IN(“we” and “enemy”).

Let the player A has T personal strategies, which we denote
. Let the player IN available n personal strategies, let's designate them
.

Let each side choose a specific strategy; for us it will be , for the enemy . As a result of the players choosing any pair of strategies And (
) the outcome of the game is uniquely determined, i.e. winnings player A(positive or negative) and loss
player IN.

Let's assume that the values known for any pair of strategies ( ,). Matrix
,
, the elements of which are winnings corresponding to the strategies And , called payment matrix or matrix of the game. The rows of this matrix correspond to the player's strategies A, and the columns – the player’s strategies B. These strategies are called pure.

Game Matrix
has the form:

Consider the game
with matrix

and determine the best among strategies
. Choosing a strategy , player A must expect that the player IN will answer it using one of the strategies , for which the payoff for the player A minimal (player IN seeks to "harm" the player A).

Let us denote by player's smallest winnings A when choosing a strategy for all possible player strategies IN(smallest number in i th row of the payment matrix), i.e.

(1)

Among all the numbers (
) choose the largest:
.

Let's call
lowest price ngra, or maximum winnings (maxmin). This is a guaranteed win for player A for any strategy of player B. Hence,

. (2)

The strategy corresponding to maximin is called maximin strategy . Player IN interested in reducing the player's winnings A, choosing a strategy , it takes into account the maximum possible gain for A. Let's denote

. (3)

Among all the numbers let's choose the smallest

and let's call top price of the game or minimax win (minimax). Ego guaranteed loss of player B . Therefore,

. (4)

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called minimax principle . This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

Theorem.The lower price of the game always does not exceed the upper price of the game
.

If the upper and lower prices of the game are the same, then the total value of the upper and lower prices of the game
called the pure price of the game, or at the cost of the game. Minimax strategies corresponding to the price of the game are optimal strategies , and their totality - optimal solution or solution of the game. In this case the player A receives the maximum guaranteed (independent of the player’s behavior) IN) winnings v, and the player IN achieves the minimum guaranteed (regardless of the player’s behavior A) losing v. They say that the solution to the game has stability , those. If one player sticks to his optimal strategy, then it cannot be profitable for the other to deviate from his optimal strategy.

If one of the players (for example A) sticks to his optimal strategy, and the other player (IN) will deviate from its optimal strategy in any way, then for the player who made the deviation, it can never be profitable; such player deviation IN can at best leave the winnings unchanged. and in worst case– increase it.

On the contrary, if IN adheres to its optimal strategy, and A deviates from its own, then this can in no way be beneficial for A.

A couple of pure strategies And gives an optimal solution to the game if and only if the corresponding element is both the largest in its column and the smallest in its row. This situation, if it exists, is called saddle point. In geometry, a point on a surface that has the property of having a simultaneous minimum in one coordinate and a maximum in another is called saddle point, by analogy this term is used in game theory.

The game for which
,
called game with a saddle point. Element , which has this property, is a saddle point of the matrix.

So, for every game with a saddle point, there is a solution that determines a pair of optimal strategies for both sides, differing in the following properties.

1) If both sides stick to their optimal strategies, then the average payoff is equal to the net cost of the game v, which is simultaneously its lower and upper price.

2) If one of the parties adheres to its optimal strategy, and the other deviates from its own, then the deviating party can only lose and in no case can increase its winnings.

The class of games that have a saddle point is of great interest from both theoretical and practical points of view.

In game theory, it is proven that, in particular, every game with complete information has a saddle point, and, therefore, every such game has a solution, i.e., there is a pair of optimal strategies of both sides, giving an average payoff equal to the cost of the game. If a game with complete information consists only of personal moves, then when each side applies its optimal strategy, it should always end in a well-defined outcome, namely, a win exactly equal to the cost of the game.

Consider a game with a matrix

The letter i will denote the number of our strategy, and the letter the number of the enemy’s strategy.

Let's discard the question of mixed strategies and consider only pure ones for now. Let's set the task: to determine the best among our strategies. Let's analyze each of them sequentially, starting with and ending with. When choosing, we must expect that the enemy will respond to it with the strategy for which our winnings are minimal. Let's find the smallest number in the line and denote it

(the sign indicates the minimum value of this parameter for all possible

Let's write down the numbers (row minimums) next to the matrix on the right in the form of an additional column:

When choosing a strategy, we must count on the fact that as a result of the enemy’s reasonable actions we will only win. Naturally, acting most carefully (i.e., avoiding any risk), we must prefer to others the strategy for which the number is maximum. Let's denote this maximum value

or. taking into account formula (4.1),

The value a is called the lower price of the game, otherwise the maximin payoff or maximin. The strategy of player A that corresponds to maximin a is called maximin strategy.

Obviously, if we adhere to the maximin strategy, then regardless of the enemy’s behavior we are guaranteed a win, at least not less than a. Therefore, the value a is called the “lower price of the game.” This is the guaranteed minimum that we can provide for ourselves by adhering to our most cautious (“reinsurance”) strategy.

Obviously, a similar reasoning can be carried out for opponent B. He is interested in reducing our winnings to a minimum; this means that he must look through all his strategies, highlighting the maximum winning value for each of them. Let us write down the matrix (4.2) maximum values ​​by columns:

and find their minimum:

(4.4)

The value is called the upper price of the game, otherwise the minimax win or minimax. The opponent's winning strategy is called his minimax strategy. By sticking to his most cautious minimax strategy, the opponent is guaranteed that in any case he will lose no more than p.

The principle of caution, which dictates that players choose appropriate strategies (maximin and minimax), is fundamental in game theory and is called the minimax principle. It follows from the assumption that each player is reasonable, striving to achieve a goal opposite to the opponent’s goal. The most “cautious” maximin and minimax strategies are often denoted general term"minimax strategies".

Let us determine the lower and upper prices of the game, as well as minimax strategies, for the three examples discussed in the previous paragraph.

Example 1. (Search game). Determining the row minima and column maxima we get

Since the values ​​, are constant and equal to -1, respectively, and the lower and upper prices of the game are also equal to -1 and

Any strategy of player A is his maximin strategy, and any strategy of player B is his minimax strategy. The conclusion is trivial: by adhering to any of his strategies, player A can guarantee that he will lose no more than 1 ruble; Player B can guarantee the same for any of his strategies.

Example 2. (Three fingers game). By writing out the minimums of the rows and maximums of the columns, we will find the lower price of the game and the upper one (highlighted in bold in the table). Our maximin strategy (by applying it systematically, we guarantee that we will win no less than -3, i.e. we will lose no more than 3).

The enemy’s minimax strategy is any of the strategies, using them systematically, he can guarantee that he will not give up more than 4. If we deviate from our maximin strategy (for example, choose A 2), then the enemy can “punish” us for this by applying and reducing our winning and the opponent's retreat from his minimax strategy can be “punished” by increasing his loss to 6.

Let us pay attention to the fact that minimax strategies in in this case not stable. Indeed, let, for example, the opponent choose one of his minimax strategies and stick to it. Having learned this, we will move on to strategy and win 4. The enemy will respond with strategy and win 5; to this we, in turn, will respond with a strategy and win 4, etc. Thus, the situation in which both players use their minimax strategies is unstable and can be violated by the received information about the strategy used by the other side. However, such instability is not always observed; We will see this in the following example.

Example 3. (Game “weapons and aircraft”). Determine row minimums and column maximums:

In this case, the lower price of the game is equal to the upper:

Minimax strategies are stable: if one of the players adheres to his minimax (maximin) strategy, then the other player cannot improve his position by deviating from his.

Thus, we see that there are games for which the lower price is equal to the upper:

These games occupy a special place in game theory and are called saddle point games. In the matrix of such a game there is an element that is both minimal in its row and maximal in its column; such an element is called a saddle point” (by analogy with a saddle point on a surface, where a minimum is achieved along one coordinate and a maximum along another).

The total value of the lower and upper price of the game

called the net price of the game.

The saddle point corresponds to a pair of minimax strategies; these strategies are called optimal, and their combination is called a solution to the game. The solution of the game has the following property: if one of the players adheres to his optimal strategy, then it cannot be profitable for the other to deviate from his optimal one (such a deviation will either leave the situation unchanged or worsen it).

Indeed, in a game with a saddle point, let player A stick to his optimal strategy, and player B stick to his. As long as this is so, the payoff remains constant and equal to the game price v. Now suppose that B deviated from his optimal strategy. Since v is the minimum element in its row, such a deviation cannot be beneficial to B; Likewise, for A, if B adheres to his optimal strategy, deviation from his cannot be beneficial.

We see that for a game with a saddle point, minimax strategies are stable. A pair of optimal strategies in a game with a saddle point is like an equilibrium position: a deviation from the optimal strategy causes a change in the payoff that is disadvantageous for the deviating player and forces him to return to his optimal strategy.

The net price of play v in a saddle point game is the value of the payoff that, in a game against a reasonable opponent, player A cannot increase and player B cannot decrease.

Note that there may be more than one saddle point in the payment matrix, but several.

For example, there are six saddle points in the matrix, with a common payoff value and corresponding pairs of optimal strategies: It is not difficult to prove (we will not do this) that if there are several saddle points in the game matrix, then they all give the same payoff value.

Example. Side A - air defense systems - defends a section of territory from an air raid, having two guns No. 1 and No. 2, the coverage areas of which do not overlap (Fig. 9.1). Each weapon can only fire at an aircraft passing through its coverage area, but to do this it must track it in advance (before the target enters the area) and generate targeting data. If the target is fired at, it is hit with a probability. Side B has two aircraft, each of which can be directed to any zone At the moment when side A carries out target distribution (assigns which weapon to shoot at which target), the movement of target aircraft No. 1 is directed into the range of action of gun No. 1, and target No. 2 is directed into the range of action of gun No. 2 However, after making a decision on target distribution, each target can maneuver using a “deception maneuver” (see dotted arrows in Figure 9.1).

The task of side A is to maximize, and side B’s task is to minimize the number of targets hit. Find a solution to the game (optimal strategies of the parties)

Solution. Side A (air defense weapons) has four possible strategies - each gun monitors the target heading into its zone,

The guns track targets “crosswise” (each one tracking a target heading towards its neighbor),

Both guns are tracking target No. 1,

Both guns are tracking target No. 2. Side B (target aircraft) also has four strategies:

Both intact do not change direction,

Both targets use deception.

The first target uses a deception maneuver, but the second does not,

The second target uses a deception maneuver, but the first does not.

The result is a 4X4 game, the matrix of which is given in the table:

By finding the minima of the rows and the maxima of the columns, we are convinced that the lower price of the game is equal to the upper price of the game: this means that the game has a saddle point and a solution in pure strategies, leading to the net price of the game. In this case, there is not one, but four saddle points. Each of them corresponds to a pair of optimal strategies that give a solution to the game. The price of the game means that with optimal behavior of the parties, the planes will inevitably lose one plane, and no tricks will help them lose less, and the means Air defense - shoot down more than one aircraft This state of equilibrium is achieved when both sides use their optimal strategies: both guns track the same aircraft (any), and the aircraft are sent after target distribution to the same zone (any)

The class of games that have a saddle point is very interesting from both theoretical and practical points of view. In particular, all so-called “games with complete information” belong to it.

A game with complete information is a game in which each player, with each personal move, knows the results of all previous moves - both personal and random. Examples of games with complete information include: checkers, chess, the well-known game of tic-tac-toe, etc.

In game theory it is proven that every game with complete information has a saddle point and, therefore, a solution in pure strategies. In other words, in every game with complete information, there is a pair of optimal strategies on both sides that give a stable payoff equal to the net cost of the game. If a game with complete information consists only of personal moves, then when each side applies its optimal strategy, the game should always end with a well-defined outcome equal to the cost of the game

As an example, consider the following game with complete information. Two players alternately place identical coins on round table, choosing the position of the coin arbitrarily (mutual overlap of coins is not allowed). The winner is the one who puts in the last coin (when there is no room left for others). It is not difficult to see that the outcome of this game is predetermined, and there is a certain strategy that ensures a certain win for the player who puts in the coin first. Namely, he must place the coin in the center of the table for the first time, and then respond to each opponent’s move with a symmetrical move. Obviously, no matter how the enemy behaves, he cannot avoid losing. Therefore, the game makes sense only for people who do not know its solution. The situation is exactly the same with chess and other games with complete information; any of these games has a saddle point and, therefore, a solution that indicates to each player his optimal strategy, so the game makes sense only as long as the solution is unknown. A solution to the chess game has not been found (and is unlikely to be found in the foreseeable future) only because the number of strategies (combinations of moves) in chess is too large to be able to construct a payoff matrix and find a saddle point in it.

Lecture 4

Topic: “ELEMENTS OF GAME THEORY”

The concept of game models

In practice, we often encounter problems in which it is necessary to make decisions under conditions of uncertainty, i.e. Situations arise in which two (or more) parties pursue different goals, and the results of any action of each party depend on the activities of the partner. Such situations that arise when playing chess, checkers, dominoes, etc., are considered conflict situations: the result of each player’s move depends on the opponent’s response move, the goal of the game is to win one of the partners. In economics, conflict situations occur very often and are of a diverse nature. These include, for example, relationships between supplier and consumer, buyer and seller, bank and client. In all these examples, the conflict situation is generated by the difference in the interests of the partners and the desire of each of them to make optimal decisions that realize their goals to the greatest extent. At the same time, everyone has to take into account not only their own goals, but also the goals of their partner, and take into account the decisions unknown in advance that these partners will make.

To competently solve problems with conflict situations, scientifically based methods are needed. Such methods are developed by the mathematical theory of conflict situations, which is called game theory .

Let's get acquainted with the basic concepts of game theory. The mathematical model of a conflict situation is called game , parties involved in the conflict - players And, and the outcome of the conflict is win . For each formalized game, rules A, those. system of conditions defining:

1) options for players’ actions;

2) the amount of information each player has about the behavior of their partners;

3) the gain that each set of actions leads to. Typically, winning (or losing) can be quantified; for example, you can value a loss as zero, a win as one, and a draw as 1/2.



The game is called steam room , if it involves two players, and multiple , if the number of players is more than two. We will only consider doubles games. They involve two players A And IN, whose interests are opposite, and by game we mean a series of actions on the part of A And IN.

The game is called zero sum game , or antagonistic th, if the gain of one of the players is equal to the loss of the other, i.e. To complete the game task, it is enough to indicate the value of one of them. If we designate A - winnings of one of the players, b - the other's winnings, then for a zero-sum game b = -A, therefore it is enough to consider, for example A .

The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random. Personal move - this is a conscious choice by the player of one of the possible actions (for example, a move in a chess game). Random move - it is a randomly chosen action (for example, choosing a card from a shuffled deck). In the future, we will consider only the personal moves of the players.

Strategy A player is a set of rules that determine the choice of his action at each personal move, depending on the current situation. Usually during the game, with each personal move, the player makes a choice depending on the specific situation. However, in principle, it is possible that all decisions are made by the player in advance (in response to any given situation). This means that the player has chosen a specific strategy, which can be specified as a list of rules or a program. (This way you can play the game using a computer.) The game is called Certainly th, if each player has a finite number of strategies, and endless - otherwise.

In order to decide game, or find a solution to the game, you should choose a strategy for each player that satisfies the condition optimality And, those. one of the players must receive maximum win , when the second one sticks to his strategy. At the same time, the second player must have minimum loss , if the first one sticks to his strategy. Such strategists And are called optimal. Optimal strategies must also satisfy the condition sustainability , those. It must be disadvantageous for either player to abandon their strategy in this game.

If the game is repeated quite a few times, then players may be interested not in winning and losing in each specific game, but in average win (loss) in all batches.

The goal of game theory is to determine the optimal strategy for each player. When choosing an optimal strategy, it is natural to assume that both players behave reasonably in terms of their interests. The most important limitation of game theory is the uniqueness of payoff as an indicator of efficiency, while in most real economic problems there is more than one indicator of efficiency. In addition, in economics, as a rule, problems arise in which the interests of partners are not necessarily antagonistic. The development of game theory apparatus for solving problems with many participants having consistent interests is beyond the scope of the lecture.

Payment matrix.

Lower and upper price of the game

Consider a paired finite game. Let the player A has T personal strategies, which we denote by . Let the player IN available P personal strategies, let's denote them . They say the game has dimensions t x p . As a result of the players’ choice of any pair of strategies, the outcome of the game is uniquely determined, i.e. player's winnings A (positive or negative) and player loss IN . Assume that the values ​​are known for any pair of strategies . Matrix Р=(), i = 1,2, ..., t; j = 1, 2, ..., n , elements of which are winnings corresponding to strategies and , called payment matrix or matrix of the game . The general appearance of such a matrix is ​​presented in the table:

The rows of this table correspond to the player's strategies A , and the columns - the player's strategies IN .

Example.Create a payoff matrix for the next game. Search game.

Player A can hide in one of two shelters (I and II); player IN looking for a player A , and if he finds it, he receives a fine of 1 den. units from A , otherwise pays the player A 1 day units It is necessary to build the payment matrix of the game.

Solution. To compile a payment matrix, you should analyze the behavior of each player. Player A can hide in shelter I - let's denote this strategy by or in shelter II - strategy.

Player IN can look for the first player in the shelter I - strategy , or to the shelter II - strategy. If the player A is located in Vault I and is discovered by the player there IN , those. a pair of strategies is implemented then the player A pays a fine, i.e. . Similarly we get . Obviously, strategies give the player A win 1, therefore . So for the game" search"of size 2x2 we get a payment matrix

Consider the game t p with matrix, i =1, 2, ..., T ; j =1, 2, ..., P and determine the best among strategies . Player choosing strategy A must expect that the player IN will answer it using one of the strategies , for which the payoff for the player A minimal (player IN seeks to "harm" the player A ).

Let us denote by , the player’s smallest payoff A when choosing a strategy , for all possible player strategies IN (smallest number in i -and the line of the payment matrix), i.e.

, j =1,…n . (1)

Among all the numbers Let's choose the largest: . Let's call the lowest price of the game, or maximum winnings (maximin). This guaranteed winnings for the player A for any player strategy IN. Hence,

. (2)

The strategy corresponding to maximin is called maximin strategy . Player IN interested in reducing the player's winnings A ; choosing a strategy , it takes into account the maximum possible gain for A . Let's denote

Among all the numbers, choose the smallest one and call it top price of the game or minimax win (minimax). This guaranteed player loss IN. Therefore,

(4)

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called the minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

Example. Determine the lower and upper prices of the game and the corresponding strategies in the game " Search" Consider the payment matrix:

When choosing a strategy (first row of the matrix), the minimum payoff is and corresponds to the player's strategy IN . When choosing a strategy (second row of the matrix), the minimum payoff is equal to , it is achieved with the strategy.

Guaranteeing yourself the maximum win for any player strategy IN, those. lower price of the game, player A can choose any strategy: or , those. any of his strategies is maximin.

When choosing a strategy (column 1), the player IN understands that the player A will respond with a strategy to maximize his gain (loss IN ). Therefore, the player's maximum loss is IN when choosing a strategy it is equal to max (-l; 1) = 1.

Similarly, the maximum loss of a player IN (winning A ) when he chooses a strategy (column 2) is equal to .

Thus, for any player strategy A guaranteed minimum player loss IN equal to the top price of the game.

Any player strategy IN is minimax. By adding a row and a column to the table, we get a new table:

-1 -1
-1 -1
1

At the intersection of the additional rows and columns we will write down the upper and lower prices of the games.

Consider a paired finite game. Let the player A has m personal strategies, which we denote A 1 , A 2 , …,A m . Let the player IN available n personal strategies, let's designate them IN 1 ,IN 2 , …, IN n. They say the game of names is dimension mn. As a result of the players choosing any pair of strategies

A i And B i (I = 1, 2, …, m; j = 1, 2, …, n)

the outcome of the game is clearly determined, i.e., winning a ij player A(positive or negative) and loss (- a ij) player IN. Let's assume that the values a ij known for any pair of strategies ( Ai,Bj). Matrix R = (A ij), i = 1, 2, …, m; j = 1, 2, …, n, the elements of which are the winnings corresponding to the strategies A i And B j, called payment matrix or matrix of the game. The general appearance of such a matrix is ​​presented in Table. 1. The rows of this table correspond to the player’s strategies A, and the columns are the player’s strategies IN.

Let's create a payout matrix for the next game.

Table 1

A j B i

a 1n

a 2n

a m 1

a mn

1. Search game.

Player A can hide in one of the shelters (I and II); player IN looking for a player A, and if found, he receives a fine of 1 den. units from A, otherwise pays the player A 1 day units It is necessary to build a payment matrix for the game.

Solution. To compile a payment matrix, you should analyze the behavior of each player. Player A can hide in shelter I - we denote this strategy by A 1 or to the shelter II - strategy A 2 .

Player IN can look for the first player in the shelter I - strategy IN 1, or in the shelter II - strategy IN 2. If the player A is located in Vault I and is discovered by the player there IN, i.e. a couple of strategies are implemented ( A 1 , IN 1), then the player A pays a fine, i.e. A 11 = -1. similarly we get A 22 = -1 (A 2 , IN 2). It is obvious that the strategies ( A 1 , IN 1) and ( A 2 , IN 2) give to the player A payoff is 1, so A 12 = A 21 = 1.

Start of the problem condition; - completion of the problem solution.

Thus, for a “search” game of size 2 2 we obtain the payoff matrix

Consider the game m n with matrix P = ( A ij), i = 1, 2, …, m; j = 1, 2, …, n and determine the best among strategies A 1 , A 2 , …, A m. Choosing a strategy A i, player A must expect that the player IN will answer it using one of the strategies B j , for which the payoff for the player A minimal (player IN seeks to “harm” the player A).

Let us denote by A i player's smallest winnings A when choosing a strategy A i for all possible player strategies IN(smallest number in i-th line of the payment matrix), i.e.

A ij = i . (1.1)

among all the numbers? i (i = 1, 2, …, m) choose the largest: ? = ? i. Shall we call it? the lowest price of the game, or maximum winnings (maximin). This is the guaranteed win of player A for any player strategy IN. Hence,

? = a ij . (1.2)

The strategy corresponding to maximin is called maximin strategy. Player IN interested in reducing the player's winnings A; choosing a strategy B j, it takes into account the maximum possible gain for A. Let's denote

? j = a ij (1.3)

Among all the numbers? j Shall we choose the smallest? = ? j and let's call it? top price of the game or minimax win (minimax). This is a guaranteed loss for the player IN. Hence,

? = a ij . (1.4)

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called the minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent. Let us determine the lower and upper prices of the game and the corresponding strategies in the problem 1. Consider the payment matrix

from the problem 1. When choosing a strategy A 1 (first row of the matrix) the minimum winning is? 1 = min(-1;1) = -1 and corresponds to the strategy? 1 player IN. When choosing a strategy A 2 (second row of the matrix) the minimum winning is? 2 = min(1;-1) = -1, it is achieved with the strategy IN 2 .

Guaranteeing yourself the maximum win for any player strategy IN, i.e. lower price for the game? = max(? 1 , ? 2) = max(-1,-1) = -1, player A can choose any strategy: A 1 or A 2, i.e. any of his strategies is maximin.

Choosing a strategy IN 1 (column 1), player IN understands that the player A will respond with a strategy A 2 to maximize your winnings (loss IN). Therefore, the player's maximum loss is IN when choosing a strategy IN Is 1 equal? 1 = max(-1;1) = 1.

Similarly, the maximum loss of a player IN(winning A) when choosing a strategy IN 2 (column 2) equals? 2 = max(1;-1) = 1.

Thus, for any player strategy A guaranteed minimum player loss IN equal? = min (? 1; ? 2) = min(1;1) = 1 - the top price of the game.

Any player strategy IN is minimax. Having added the table 1 line? j and column? i, we get table. 2. At the intersection of the additional rows and columns, we will write down the upper and lower prices of the games.

Table 2.

A j B i

? i

? j

In the problem 1 , discussed above, are the upper and lower prices of the game different? ? ?.

If the upper and lower prices of the game are the same, then the total value of the upper and lower prices of the game? = ? = v is called clean at the cost of the game, or at the cost of the game. Minimax strategies corresponding to the price of the game are optimal strategies, and their totality is optimal decision, or decision games. In this case the player A receives the maximum guaranteed (independent of the player’s behavior) IN) payoff v, and the player IN achieves the minimum guaranteed (regardless of the player’s behavior A) loss v. They say that the solution of the game is stable, i.e. If one player adheres to his optimal strategy, then it cannot be profitable for the other to deviate from his optimal strategy.

A couple of pure strategies A j And B i gives an optimal solution to the game if and only if the corresponding element a ij is both the largest in its column and the smallest in its row. This situation, if oan exists, is called saddle point(similar to the surface of a saddle, which curves up in one direction and down in the other).

Let's denote A * And B * - a pair of pure strategies that achieve a solution to the game in the problem with a saddle point. Let us introduce the payoff function of the first player for each pair of strategies: R(A i , B j) = a ij. Then, from the optimality condition at the saddle point, the double inequality holds: R(A i , B *) ? R(A * , B * ) ? R(A * , B j), which is true for everyone i = 1, …, m; j = 1, …, n. indeed, the choice of strategy A * the first player with the optimal strategy B * the second player maximizes the minimum possible gain: R(A * , B * ) ? R(A * , B).

2. determine the lower and upper price of the game given by the payment matrix

P = 0.9 0.7 0.8

Table 3.

A i B j

Does the game have a saddle point?

Solution. It is convenient to carry out all calculations in a table, to which, in addition to the matrix P, a column is entered? i and line? j(Table 3). Analyzing the rows of the matrix (player strategies A), fill out the column? i: ? 1 = 0.5, ? 2 = 0.7, ? 3 = 0.6 - minimum numbers in lines 1, 2, 3. Similarly? 1 = 0.9, ? 2 = 0.7, ? 3 = 0.8 - maximum numbers in columns 1, 2, 3 respectively.

Lowest game price? = ? i= max (0.5; 0.7; 0.6) = 0.7 ( greatest number in column ? i) and the top price of the game? = ? j= min(0.9, 0.7, 0.8) = 0.7 (smallest number in the line? j). These values ​​are equal, i.e. ? = ?, and are achieved using the same pair of strategies ( A 2 , IN 2). Therefore, the game has a saddle point ( A 2 , IN 2) and game price = 0.7.