Saddle point in game theory. Well-defined games (games with a saddle point)

Matrix games and the concept of a saddle point. Let's take a closer look at antagonistic games and their main properties. In a convenient way The task of a two-player zero-sum game is payment matrix. This, by the way, is where another name comes from - matrix games. Each element payment matrix and ij contains a numeric value player's winnings I (player II loses) if the first one applies the strategy i, and the second - strategy j. Terms winnings And loss should be understood in in a broad sense, because they can accept negative values and from an everyday point of view mean the opposite. The non-trivial nature of the problem lies primarily in the fact that each player makes his own choice without knowing about the choice of the other, which significantly complicates the process of optimizing the chosen strategy.

Classic example An antagonistic game is a game with two participants guessing numbers independently of each other. It is assumed that if their sum turns out to be even, then the winnings equal to 1 go to the first player, and if odd, then to the second. Assuming that for both players guessing an odd number is the first strategy, and an even number is the second, we can write the payoff matrix of this game:

The rows of matrix (6.1) correspond to the strategies of player I, the columns - to the strategies of player II, and its elements - to the results of the first player. It also follows from the definition of the game that the elements of this matrix, taken with the opposite sign, correspond to the winnings of the second player.

A more complex and meaningful payment matrix can be obtained if the proposed game is slightly modified. Let's assume that both participants have the right to guess the numbers from 1 to 4, which constitute their respective strategies. If the result of adding the intended numbers is even, then the second player pays the resulting amount to the first, and if it is odd, then the first player pays the second. Let's write down the payment matrix for such a game:

Some conventionality and artificiality in the formulation of the problem should not be in this case confuse us, since a model describing, for example, competition between two firms for a newly opened market for products, etc., can be reduced to a similar form.

As already noted, the most important question in game theory is the optimality of the solution (choice of strategy) for each player. From this point of view, let us analyze a certain matrix game for which the payment matrix is ​​given A=║a ijm x n. When player I chooses strategy i his guaranteed income regardless of the actions of player II will be min a i,j. Because he can choose i independently, then it is advisable to make this choice in such a way that, for any enemy strategy, it maximizes the amount of guaranteed income, i.e., ensures the receipt of max (min a i,j). This principle of choosing a strategy is called “ maximin principle" On the other hand, similar reasoning can be carried out regarding the actions of the second player. His biggest loser when choosing a strategy j will be max a i,j, and, therefore, he should choose a strategy in such a way as to minimize the amount of loss for any actions of the opponent, i.e., ensure min (max a i,j). that's the point minimax principle.

The validity of the following relation can be proven:

However, of obvious interest is the situation in which the value of the payoff (payment) received by player I when he chooses the maximin strategy is equal to the payment (loss) of player II with the minimax strategy

In this case the game is said to have saddle point. The coincidence of the values ​​of the guaranteed payoffs of the players under the maximin and minimax strategies means the possibility of achieving some optimal (stable, equilibrium) state in the game, from which it is unprofitable for any of the participants to deviate. The concept of “optimality” here means that no reasonable (cautious) player seeks to change his strategy, since his opponent, in principle, will be able to choose a strategy that will give the worst result for the first one. Strategies i* And j* forming a saddle point are called optimal, and the value v = a i*j* called at the cost of the game. Troika ( i*, j*, v) counts decision matrix game with a saddle point.

It is easy to see that not every game has a saddle point. In particular, both game (6.1) and game (6.2) do not have a saddle point. An example of a game that has a saddle point is the game with the payoff matrix (6.5).

In this matrix, the minimum (guaranteed) winnings of the first player in rows are 1, 5 and (-3). Consequently, his maximin choice will correspond to strategy 2, which guarantees a win of 5. For the second player, the maximum losses in the columns of the matrix will be 8, 10, 5, 17, so it makes sense to choose strategy 3, under which he will lose only 5. Thus, the second the strategy of the first player and the third strategy of the second form a saddle point with value 5, i.e., for the game with matrix (6.5) it has a solution (2; 3; 5).

Games can be classified according to the number of players, the number of strategies, the nature of interaction between players, the nature of winning, the number of moves, the state of information, etc.

Depending on the number of players, games are divided into two and n players. The first of them are the most studied. Games of three or more players have been less studied due to the fundamental difficulties encountered and the technical possibilities of obtaining a solution. The more players there are, the more problems there are.

Based on the number of strategies, games are divided into finite and infinite. If all players in a game have a finite number of possible strategies, then it is called ultimate. If at least one of the players has an infinite number of possible strategies, the game is called endless.

Based on the nature of interaction, games are divided into:

    non-coalition: players do not have the right to enter into agreements or form coalitions;

    coalition(cooperative) – can join coalitions.

In cooperative games, coalitions are predetermined.

According to the nature of winnings, games are divided into: games with zero sum(the total capital of all players does not change, but is redistributed between players; the sum of winnings of all players is zero) and games with non-zero sum.

Based on the type of winning functions, games are divided into: matrix, bimatrix, continuous, convex, separable, duel type, etc.

Definition . If in a game with a matrix A = (the lower net price is equal to the upper net price), then this game is said to have saddle point in pure strategies and net price games==.

Saddle point is a pair of pure strategies (i O , j O ) respectively, players 1 and 2, at which equality is achieved =. If one of the players adheres to the strategy corresponding to the saddle point, then the other player cannot do better than to adhere to the strategy corresponding to the saddle point. Mathematically, this can be written another way: , Where i, j– any pure strategies of players 1 and 2, respectively; (i O , j O ) –strategies that form a saddle point.

Thus, the saddle element is the minimum in the i o -th row and the maximum in the j o -th column in matrix A. Finding the saddle point of matrix A occurs as follows: in matrix A sequentially in each line find the minimum element and check whether this element is the maximum in its column. If yes, then it is a saddle element, and the pair of strategies corresponding to it forms a saddle point. A couple of pure strategies (i O , j O ) players 1 and 2, forming a saddle point and a saddle element is called game solution . Wherein i O And j O are called optimal clean strategies respectively players 1 and 2.

Properties of saddle points:

1. Equivalence. If there are several saddle points in the game, then the values ​​of the payoff function in them are the same.

2. Interchangeability optimal strategies. Players can replace their optimal strategies with other optimal strategies, while the equilibrium will not be disrupted, and the gain (loss) will remain unchanged.

13.Definition of a mixed strategy. Solution of the 2*2 game in mixed strategies.

A mixed strategy is when, instead of applying any one specific strategy, the participants in the game can randomly alternate (mix) their strategies in accordance with a specially designed scheme that provides the desired frequency, or probability, of the implementation of each of the strategies.

For each player you can set the following components:

Pia – probability of using the i-th strategy by A.

If you choose such a set Pia , which provides the greatest gain regardless of the actions of the second party, then this set of probabilities (p 1 a, p 2 a, ..., p ma) = S A and will be called a mixed strategy.

S * A = (p * 1 a, p * 2 a, …, p * ma) – optimal mixed strategy.

( S A ) – set mixed strategies from side A, from which the optimal one must be selected.

Game 2*2 in mixed strategies.

If at least one party has only 2 actions, we apply the graph-analytical solution method. Let's define the game in the form of the following matrix:

For this game, the following can be considered: the player plays against some pure strategy of the other side each time. In this case, he can choose a probability ratio that will give him guaranteed win, the size of the price of the game.

Let us consider from these positions a game with the following payoff matrix:

Let's consider the reasoning that guides the first player. If he makes move i=1, then the worst situation for him will be when the second player makes move j=3, since in this case he will receive 0. If the first player makes move i=2, then worst case(when the second player moves j=1) he will also receive 0. Similarly, with i=3 he will receive 4 in the worst case (with j=2), with i=4 - 2 (with j=3) and, finally, with i=5 he will get 0 in the worst case (for j=3).

In an effort to make his guaranteed payoff as large as possible, the first player must choose move i=3, since in this case he guarantees himself a payoff equal to 4 (though his maximum win small - only 5).

Now let's try to look at the same matrix from the point of view of the second player. For him, this is the matrix of his loss.

If he chooses move j=1, then his maximum loss will be 18 (if the first player makes move i=1). Similarly, with j=2 his maximum loss will be 4, with j=3- 8, and finally, with j=4 his maximum loss will be 25. In an effort to make his maximum loss as small as possible, the second player must choose a move j=2, since in this case his maximum loss, equal to 4, is the smallest.

So, we came to the conclusion that the first player should move i=3, and the second j=2. Suppose now that the second player, as they say, “reveals the cards” and declares to the first player: “I will make the move j=2.” Does the first player need to change his move? No, because in this case his best move is still i=3.

Similarly, if the first player tells the second that he will move i=3, then the second player also does not make sense to change his move, since the best answer will still be j=2. The pair i=3, j=2 is said to be a balanced pair, since the “opening of cards” by the players does not give the opponent any reason to change his strategy. As they say, there is a pair i=3, j=2 game solution, and the amount of winnings of the first player (and at the same time the amount of loss of the second) is 4 - this is game price.

Let's formulate all this mathematically. So let the first player choose the move i. In the worst situation for him, he will win:

Trying to make his minimum winnings maximum, he chooses his move from the condition:

.

This strategy is called maximin , and the value α called the lowest price of the game , or maximin .

Likewise, the second player, choosing a move j, in the worst situation for himself loses:

In an effort to make his maximum loss minimal, he must choose his move from the condition:

This strategy is called minimax , and the value β called top price games , or minimax .



Game price v always lies between the lower price of the game α and the top price of the game β .

There are games for which lowest price game is equal to the top:

These games occupy a special position in game theory and are called saddle point games. General designation of the lower and upper price of the game:

called at the pure price of the game.

Matrix saddle point– an element that is minimal in its row, but maximal in its column. This makes it easy to find the saddle points of the matrix.

Dot i=3, j=2, is a saddle point. Payment matrix element a 32=4 was characterized precisely by the property that it was maximum in its column and minimum in its row.

Some questions regarding saddle points.

1. A matrix has several saddle points, for example the matrix:

two saddle points ( i=1, j=1) and ( i=1, j=3).

2. Not all matrices have a saddle point; for example, a matrix does not have saddle points.

Each player should not generally strive for a situation in which the value payoff functions maximum or minimum, and above all to the situation that may arise during the game. For a situation to be feasible, it must simultaneously achieve acceptable results for both Player I and Player II. Situations with this property are called equilibrium situations. They are the ones that can develop as a result of the players’ wise choice of their strategies. When identifying equilibrium situations it is necessary first of all to analyze sequentially each strategy of player I from the point of view of the most unfavorable outcome for him when player II chooses one of his strategies. To do this, the minimum winning value is found in the - row of the matrix. Let's denote it , where the sign (minimum by) denotes the operation of finding the minimum value payoff functions at all possible .

The numbers that are written next to the matrix in the form of an additional column characterize the minimum winnings of player I, taking into account the reasonable actions of player II. Therefore, player I must choose his strategy in such a way as to maximize his minimum payoff, that is, he must choose the strategy for which the number is maximum. Let us denote the maximum value by , that is


The quantity is called the bottom value of the game or maximin, and the corresponding strategy of player I is maximin strategy .

maximin strategy player I ensures himself (regardless of the opponent’s behavior) a guaranteed win of at least .

Next, we will analyze each strategy of player II from the point of view of the most unfavorable outcome for him when player I chooses one of his strategies. As a result of this, we will find the maximum loss values, which we denote

,

where the sign (maximum by) indicates the maximum value payoff functions at all possible .

The numbers , which are written under the matrix in the form of an additional line, characterize the maximum losses of player II, taking into account the reasonable actions of player I. Therefore, player II must choose his strategy in such a way as to minimize his maximum loss. To do this, he must choose the strategy in which the number will be minimal. Let us denote the minimum value by , that is

The quantity is called upper value of the game or minimax , and the corresponding strategy of player II is minimax strategy .

Obviously, when choosing the most careful minimax strategy Player II does not, under any circumstances, allow Player I to win more than .

Therefore, if both players behave reasonably, then player I's payoff must be no less than maximin and no more than minimax, that is:

A necessary and sufficient condition for the fulfillment of equality 7.2. is the existence of a saddle point. matrices. The term "saddle point" is borrowed from geometry. However, the presence of a saddle point in geometry is considered locally, and in game theory – globally. That is, the existence of a pair of integers is declared, for which it turns out to be both the minimum of its row and the maximum of its column. Therefore, player I, using the maximin strategy, guarantees himself a win, and player II, using the minimax strategy, does not allow him to win more than .

Therefore, for player I it is best to choose the strategy, and for player II it is best to choose . According to this, the strategies are called optimal, and the guaranteed payoff of player I is the value of the game, which is denoted by .

The set of optimal strategies is called game solution .

The principle of optimality underlying the players' choice of strategies is called minimax principle. According to this principle (or minimax criterion for reasonable behavior) each method of action is evaluated according to the worst outcome for it, and the optimal method is the one that leads to the best of the worst results.

Let's consider the game matrix.


So, the matrix has a saddle point , since the number 7 is the minimum of the second row and the maximum of the first column. Therefore, the optimal strategy of player I is maximin, and that of player II is minimax. The meaning of the game.

Having chosen his optimal strategy, player I can be sure that he will get at least 7, and player II, having chosen his optimal strategy, will not allow player I to get more than 7. These strategies constitute the solution to the saddle point game.

The solution to a saddle point game has the following property: if the players stick to their optimal strategies, then the payoff is equal to the value of the game. If one of the players adheres to his optimal strategy, and the other deviates from it, then he only loses in the game and in no case can increase his winnings. Moreover, the presence of information from any player that the other has chosen his optimal strategy does not serve as a basis for choosing any strategy other than the optimal (minimax or maximin) strategy. A pair of optimal strategies in a saddle point game creates an equilibrium situation, and any deviation from the optimal strategy results in disadvantageous consequences for the player using the suboptimal strategy. Thus, for the game under consideration, the presence of information from the player that player I has chosen the optimal strategy does not affect his choice of his optimal strategy. Otherwise, player II will give player I the opportunity to win 9 instead of 7.

Let's consider general principles solutions to two-person zero-sum games. Based on the principle of reasonableness, it is recommended to choose as best strategy the one that provides the largest guaranteed payoff (that is, a payoff that does not depend on the opponent’s actions, a payoff that the opponent can in no way reduce). Let the game be defined by the matrix: . Player A has m pure strategies A i, and the player Bn pure strategies Bj, . A couple of strategies

(A i, B j) corresponds to payment C ij, paid by the player B to the player A at the end of the game, that is, the player's winnings A.

If the player A uses strategy A i, then he will receive a payoff at least equal to , where the minimum is taken over all the player’s strategies B. And since the player A is free to choose his strategy, then it is natural for him to strive to do as much as possible; that is, it tends to choose such a strategy A i 0 , to get a win no less than , where the maximum is taken over all player strategies A.

Strategy A i 0 is called the player's maximin strategy A. This is his most cautious strategy, the application of which, for any behavior of player B, guarantees the player A winnings C ij no less α . Magnitude α called the lowest price of the game or maximin.

Player B, reasoning in the same way, chooses a strategy Bj 0, at which the player A will receive no more than . Strategy Bj 0 is called the player's minimax strategy B. This is his most cautious strategy, the use of which guarantees the player B is that the player A for any of his behavior, he receives a gain of no more than . The quantity is called top price of the game or minimax.

Thus, with the most careful play, the player A must apply maximin, and the player B minimax strategy.

The principle of caution that dictates the choice of such strategies for players is called the minimax principle, and both strategies are generally minimax.

What is the relationship between the upper and lower prices of games? It can be shown that for these quantities the inequality is always true

That is, the lower price of the game is always no more than the upper price α ≤ β.

If the lower price of the game is equal to the upper one, that is, if α = β, then those values i 0, j 0 at which this equality is achieved indicate the optimal strategies of the players A i 0 and Bj 0 . In this case the player A by adhering to his maximin strategy he receives no less than v, and the player B sticking to his minimax strategy will hinder the player A get more than v.


Any deviation from optimal strategies is disadvantageous for both players, since for any strategies A i And Bj the following inequalities are valid:

C i , j 0 ≤ C i 0, j 0 ≤ C i 0, j.

Element C i 0, j 0 is called the saddle point of the matrix C. This name corresponds to the fact that the element C i 0, j 0 matrices C is both the minimum in its row and the maximum in its column.

This item C i 0, j 0 = v is called the price of the game, and the game itself is called a saddle point game.

Example. Consider a game with a payoff matrix.