Pure strategy games. Pure and mixed strategies

Although I graduated from the Faculty of Physics and Technology, I was not taught game theory at the university. But, since in my student years I played a lot, first in preference, and then in bridge, game theory interested me, and I mastered a small textbook. And recently, site reader Mikhail solved a game theory problem. Realizing that the task was not easy for me, I decided to brush up on my knowledge of game theory. I offer you a small book - a popular presentation of the elements of game theory and some methods for solving matrix games. It contains almost no evidence and illustrates the main provisions of the theory with examples. The book was written by mathematician and popularizer of science Elena Sergeevna Ventzel. Several generations of Soviet engineers studied from her textbook “Theory of Probability”. Elena Sergeevna also wrote several literary works under the pseudonym I. Grekova.

Elena Ventzel. Elements of game theory. – M.: Fizmatgiz, 1961. – 68 p.

Download short summary in format or

§ 1. Subject of game theory. Basic Concepts

When solving a number of practical problems (in the field of economics, military affairs, etc.), it is necessary to analyze situations where there are two (or more) warring parties pursuing opposing goals, and the result of each action of one of the parties depends on what course of action the enemy will choose. We will call such situations “conflict situations.”

Numerous examples of conflict situations from various areas of practice can be given. Any situation that arises during military operations belongs to conflict situations: each of the fighting parties takes all measures available to it in order to prevent the enemy from achieving success. Conflict situations also include situations that arise when choosing a weapon system, methods of its combat use, and in general when planning military operations: each of the decisions in this area must be made taking into account the actions of the enemy that are least beneficial for us. A number of situations in the field of economics (especially in the presence of free competition) belong to conflict situations; trading firms, industrial enterprises, etc. act as contending parties.

The need to analyze such situations gave rise to a special mathematical apparatus. Game theory is essentially nothing more than a mathematical theory of conflict situations. The purpose of the theory is to develop recommendations on the rational course of action for each of the opponents during conflict situation. Each conflict situation directly taken from practice is very complex, and its analysis is complicated by the presence of numerous contributing factors. To make mathematical analysis of the situation possible, it is necessary to abstract from secondary, incidental factors and build a simplified, formalized model of the situation. We will call this model a “game.”

The game differs from a real conflict situation in that it is played according to very specific rules. Humanity has long been using such formalized models of conflict situations that are games in the literal sense of the word. Examples include chess, checkers, card games, etc. All these games are in the nature of a competition, proceeding according to known rules and ending with the “victory” (winning) of one or another player.

Such formally regulated, artificially organized games represent the most suitable material to illustrate and master the basic concepts of game theory. Terminology borrowed from the practice of such games is also used in the analysis of other conflict situations: the parties involved in them are conventionally called “players”, and the result of the collision is the “win” of one of the parties.

The interests of two or more opponents may collide in the game; in the first case the game is called “paired”, in the second - “multiple”. Participants in a multiple game can form coalitions during its course - permanent or temporary. In the presence of two permanent coalitions, the multiple game turns into a pair game. Pairs games are of greatest practical importance; Here we will limit ourselves to considering only such games.

Let us begin our presentation of elementary game theory by formulating some basic concepts. We will consider a pairs game in which two players A and B with opposing interests participate. By “game” we mean an event consisting of a series of actions by parties A and B. In order for a game to be subjected to mathematical analysis, the rules of the game must be precisely formulated. By “rules of the game” we mean a system of conditions that regulates the possible options for actions of both sides, the amount of information each side has about the behavior of the other, the sequence of alternating “moves” ( individual decisions adopted during the game), as well as the result or outcome of the game to which this set of moves leads. This result (win or loss) does not always have a quantitative expression, but it is usually possible, by establishing some scale of measurement, to express it with a certain number. For example, in a chess game, a win can be conditionally assigned a value of +1, a loss –1, a draw 0.

A game is called a zero-sum game if one player wins what the other loses, i.e. the sum of the winnings of both sides is zero. In a zero-sum game, the interests of the players are directly opposed. Here we will consider only such games.

Since in a zero-sum game, the payoff of one player is equal to the payoff of the other with opposite sign, then, obviously, when analyzing such a game, we can consider the winnings of only one of the players. Let it be, for example, player A. In the future, for convenience, we will conventionally call side A “we”, and side B - “the enemy”.

In this case, side A (“we”) will always be considered as the “winner”, and side B (“the enemy”) as the “loser”. This formal condition obviously does not mean any real advantage for the first player; it is easy to see that it is replaced by its opposite if the sign of the winning is reversed.

We will imagine the development of the game over time as consisting of a number of successive stages or “moves”. In game theory, a move is the choice of one of the options provided for by the rules of the game. Moves are divided into personal and random. A personal move is a conscious choice by one of the players of one of the possible moves in a given situation and its implementation. An example of a personal move is any of the moves in a chess game. When performing the next move, the player makes a conscious choice of one of the options possible with a given arrangement of pieces on the board. Kit possible options with each personal move, it is regulated by the rules of the game and depends on the totality of previous moves of both sides.

A random move is a choice from a number of possibilities, carried out not by the player’s decision, but by some random selection mechanism (coin toss, dice, shuffling and dealing cards, etc.). For example, dealing the first card to one of the preference players is a random move with 32 equally possible options. For a game to be mathematically defined, the rules of the game must specify the probability distribution of possible outcomes for each random move.

Some games may consist only of random moves (so-called pure gambling) or only of personal moves (chess, checkers). Majority card games belongs to games mixed type, i.e. contains both random and personal moves.

Games are classified not only by the nature of moves (personal, random), but also by the nature and amount of information available to each player regarding the actions of the other. A special class of games consists of the so-called “games with complete information" A game with complete information is a game in which each player, with each personal move, knows the results of all previous moves, both personal and random. Examples of games with complete information include chess, checkers, and the well-known game “tic-tac-toe.”

Most games of practical importance do not belong to the class of games with complete information, since uncertainty about the enemy's actions is usually an essential element of conflict situations.

One of the basic concepts of game theory is the concept of “strategy”. A player’s strategy is a set of rules that uniquely determine the choice for each personal move of a given player, depending on the situation that arises during the game. Typically, the decision (choice) for each personal move is made by the player during the game itself, depending on the specific situation. However, theoretically, the matter will not change if we imagine that all these decisions are made by the player in advance. To do this, the player would have to compile in advance a list of all possible situations during the game and provide his own solution for each of them. In principle (if not practically) this is possible for any game. If such a decision system is accepted, it will mean that the player has chosen a certain strategy.

The player who has chosen the strategy can now not participate in the game personally, but replace his participation with a list of rules that some disinterested person (the judge) will apply for him. The strategy can also be specified to the automatic machine in the form of a specific program. This is how computer chess is currently played. For the concept of “strategy” to make sense, there must be personal moves in the game; In games consisting of only random moves, there are no strategies.

Depending on the number of possible strategies, games are divided into “finite” and “infinite”. A game in which each player has only a finite number of strategies is called a finite game. A finite game in which player A has m strategies, and player B - n strategies is called the mxn game.

Consider the game mxn of two players A and B (“us” and “opponent”). We will denote our strategies A 1 , A 2 , …, A m and the opponent’s strategies B 1 , B 2 , …, B n . Let each side choose a specific strategy; for us it will be A i, for the enemy B j. If the game consists only of personal moves, then the choice of strategies A i, B j uniquely determines the outcome of the game - our winnings. Let's denote it a ij. If the game contains, in addition to personal ones, random moves, then the payoff for a pair of strategies A i, B j is a random value, depending on the outcomes of all random moves. In this case, a natural estimate of the expected payoff is its average value ( expected value). We will use the same sign to denote both the winning itself (in a game without random moves) and its average value (in a game with random moves).

Let us know the values ​​a ij of the payoff (or average payoff) for each pair of strategies. The values ​​can be written in the form of a rectangular table (matrix), the rows of which correspond to our strategies (A i), and the columns correspond to the opponent’s strategies (B j). This table is called the payoff matrix or simply the game matrix. The matrix of the game mxn is shown in Fig. 1.

Rice. 1. Matrix mxn

In short, we will denote the matrix of the game ‖a ij‖. Let's look at a few basic examples of games.

Example 1. Two players A and B, without looking at each other, place a coin on the table, heads or tails up, at their discretion. If the players chose the same sides (both have a coat of arms or both have heads), then player A takes both coins; otherwise they are taken by player B. It is required to analyze the game and create its matrix. Solution. The game consists of only two moves: our move and the opponent's move, both personal. The game does not belong to games with complete information, since at the moment of the move the player making it does not know what the other has done. Since each player has only one personal move, the player's strategy is a choice with this single personal move.

We have two strategies: A 1 - choose a coat of arms and A 2 - choose tails; The enemy has the same two strategies: B 1 - coat of arms and B 2 - tails. Thus, this game is a 2x2 game. We will count winning a coin as +1. Game Matrix:

Using the example of this game, no matter how elementary it is, one can understand some essential ideas of game theory. Let's first assume that this game is played only once. Then, obviously, there is no point in talking about any “strategies” of players that are more intelligent than others. Each of the players with the same reason can make any decision. However, when the game is repeated, the situation changes.

Indeed, let’s assume that we (player A) have chosen some strategy for ourselves (say, A 1) and stick to it. Then, based on the results of the first few moves, the enemy will guess about our strategy and will respond to it in the least advantageous way for us, i.e. choose tails. It is clearly not beneficial for us to always use one strategy; In order not to be a loser, we must sometimes choose a coat of arms, sometimes a tails. However, if we alternate coats of arms and tails in a certain sequence (for example, through one), the enemy can also guess about this and respond to this strategy in the worst way for us. Obviously, a reliable way to guarantee that the enemy will not know our strategy is to organize the choice at each move when we ourselves do not know it in advance (this can be ensured, for example, by tossing a coin). Thus, through intuitive reasoning, we approach one of the essential concepts of game theory - the concept of “mixed strategy”, i.e. such when “pure” strategies - in in this case A 1 and A 2 – alternate randomly with certain frequencies. In this example, for reasons of symmetry, it is clear in advance that strategies A 1 and A 2 should alternate with the same frequency; in more complex games the solution may be far from trivial.

Example 2. Players A and B simultaneously and independently of each other write down one of three numbers: 1, 2 or 3. If the sum of the written numbers is even, then B pays A this amount in rubles; if it is odd, then, on the contrary, A pays B this amount. It is required to analyze the game and create its matrix.

Solution. The game consists of two turns; both are personal. We (A) have three strategies: A 1 - write 1; A 2 - write 2; A 3 - write 3. The enemy (B) has the same three strategies. The game is a 3x3 game:

Obviously, as in the previous case, the enemy can respond to any strategy we choose in a way that is worst for us. Indeed, if we choose, for example, strategy A 1, the enemy will always respond to it with strategy B 2; to strategy A 2 - strategy B 3; to strategy A 3 - strategy B 2; thus, any choice of a certain strategy will inevitably lead us to loss (however, we must not forget that the enemy is in the same distress). The solution to this game (i.e., the set of the most profitable strategies of both players) will be given in § 5.

Example 3. We have three types of weapons at our disposal: A 1, A 2, A 3; The enemy has three types of aircraft: B 1, B 2, B 3. Our task is to hit the plane; The enemy's task is to keep him undefeated. When using weapons A 1, aircraft B 1, B 2, B 3 are hit respectively with probabilities of 0.9, 0.4 and 0.2; with armament A 2 - with probabilities 0.3, 0.6 and 0.8; with armament A 3 - with probabilities 0.5, 0.7 and 0.2. It is required to formulate the situation in terms of game theory.

Solution. The situation can be considered as a 3x3 game with two personal moves and one random one. Our personal move is the choice of weapon type; The enemy's personal move is the choice of aircraft to participate in the battle. Random move - use of weapons; this move may or may not result in the aircraft being defeated. Our gain is equal to one if the plane is hit, and equal to zero otherwise. Our strategies are three weapon options; enemy strategies - three aircraft options. The average payoff for each given pair of strategies is nothing more than the probability of hitting a given aircraft with a given weapon. Game Matrix:

The purpose of game theory is to develop recommendations for the reasonable behavior of players in conflict situations, i.e. determining the “optimal strategy” for each of them. The optimal strategy of a player in game theory is a strategy that, when the game is repeated many times, provides the given player with the maximum possible average win (or the minimum possible average loss). When choosing this strategy, the basis of reasoning is the assumption that the enemy is at least as intelligent as ourselves and is doing everything to prevent us from achieving our goal.

In game theory, all recommendations are developed based on these principles; therefore, it does not take into account the elements of risk that are inevitably present in every real strategy, as well as the possible miscalculations and mistakes of each player. Game theory, like any mathematical model of a complex phenomenon, has its limitations. The most important of them is that the winnings are artificially reduced to one singular. In most practical conflict situations, when developing a reasonable strategy, it is necessary to take into account not one, but several numerical parameters - criteria for the success of the event. A strategy that is optimal according to one criterion will not necessarily be optimal according to others. However, being aware of these limitations and therefore not blindly adhering to the recommendations received game methods, you can still intelligently use the mathematical apparatus of game theory to develop, if not exactly an “optimal”, then at least an “acceptable” strategy.

§ 2. Lower and upper price of the game. Minimax principle

Consider the game mxn with a matrix as in Fig. 1. We will denote the number of our strategy with the letter i; the letter j is the number of the enemy's strategy. Let’s set ourselves a task: to determine our optimal strategy. Let's analyze each of our strategies sequentially, starting with A 1 .

When choosing a strategy A i, we must always count on the fact that the enemy will respond to it with one of the strategies B j for which our payoff a ij is minimal. Let us determine this winning value, i.e. the minimum of the numbers a ij in i th line. Let's denote it α i:

Here the sign min (minimum in j) denotes the minimum of the values ​​of this parameter for all possible j. Let's write down the numbers α i ; next to the matrix on the right as an additional column:

When choosing any strategy A i , we must count on the fact that as a result of the enemy’s reasonable actions we will not win more than α i . Naturally, acting most cautiously and counting on the most reasonable opponent (i.e., avoiding any risk), we must choose the strategy for which the number α i is the maximum. Let's denote this maximum value α:

or, taking into account formula (2.1),

The quantity α is called lower price games, otherwise - maximin winnings or simply maximin. The number α lies in a certain row of the matrix; the strategy of player A that corresponds to this line is called a maximin strategy. Obviously, if we adhere to the maximin strategy, then we are guaranteed a win, at least not less than α, regardless of the enemy’s behavior. Therefore, the value α is called the “lower price of the game.” This is the guaranteed minimum that we can provide ourselves by adhering to the most cautious (“reinsurance”) strategy.

Obviously, a similar reasoning can be carried out for adversary B. Since the adversary is interested in minimizing our gain, he must review each of his strategies from the point of view of the maximum gain for this strategy. Therefore, at the bottom of the matrix we will write down the maximum values ​​​​for each column:

and find the minimum of β j:

The value β is called the upper price of the game, otherwise known as the “minimax”. The opponent's strategy corresponding to the minimax payoff is called his “minimax strategy.” By adhering to his most cautious minimax strategy, the enemy guarantees himself the following: no matter what we do against him, he will in any case lose an amount no greater than β. The principle of caution, which dictates that players choose appropriate strategies (maximin and minimax), is often called the “minimax principle” in game theory and its applications. The most cautious maximin and minimax strategies of players are sometimes denoted general term"minimax strategies".

As examples, we define the lower and upper price of the game and minimax strategies for examples 1, 2 and 3 of § 1.

Example 1. In Example 1 § 1, a game is given with the following matrix:

Since the values ​​α i and β j are constant and equal to –1 and +1, respectively, the lower and upper prices of the game are also equal to –1 and +1: α = –1, β = +1. Any strategy of player A is his maximin strategy, and any strategy of player B is his minimax strategy. The conclusion is trivial: by sticking to any of his strategies, player A can guarantee that he will lose no more than 1; Player B can guarantee the same.

Example 2. In Example 2 § 1 a game with a matrix is ​​given:

Low game price α = –3; the upper price of the game is β = 4. Our maximin strategy is A 1 ; By applying it systematically, we can firmly expect to win at least -3 (lose no more than 3). The enemy's minimax strategy is any of the strategies B 1 and B 2; by applying them systematically, he, in any case, can guarantee that he will lose no more than 4. If we deviate from our maximin strategy (for example, choose strategy A 2), the enemy can “punish” us for this by applying strategy B 3 and reducing our winnings are -5; Likewise, the opponent’s retreat from his minimax strategy can increase his loss to 6.

Example 3. In Example 3 of § 1 a game with a matrix is ​​given:

Low game price α = 0.3; the upper price of the game is β = 0.7. Our most cautious (maximin) strategy is A 2 ; By using A2 weapons, we guarantee that we will hit the aircraft on average in no less than 0.3 of all cases. The enemy's most cautious (minimax) strategy is B 2 ; By using this aircraft, the enemy can be sure that it will be hit in no more than 0.7 of all cases.

Using the last example, it is convenient to demonstrate one important property of minimax strategies – their instability. Let us use our most cautious (maximin) strategy A 2 , and the enemy use his most cautious (minimax) strategy B 2 . As long as both opponents adhere to these strategies, the average payoff is 0.6; it is more than the lower price, but less than the upper price of the game. Now let's assume that the enemy knows that we are using strategy A 2 ; he will immediately respond with strategy B 1 and reduce the winnings to 0.3. In turn, we have a good answer to strategy B 1: strategy A 1, which gives us a payoff of 0.9, etc.

Thus, the situation in which both players use their minimax strategies is unstable and can be violated by received information about the strategy of the other side. However, there are some games for which minimax strategies are stable. These are the games for which the lower price is equal to the upper: α = β. If the lower price of the game is equal to the upper price, then their total value is called the net price of the game (sometimes just the price of the game), we will denote it by the letter ν.

Let's look at an example. Let the 4x4 game be given by the matrix:

Let's find the lower price of the game: α = 0.6. Let's find the upper price of the game: β = 0.6. They turned out to be the same, therefore, the game has a net price equal to α = β = ν = 0.6. Element 0.6, highlighted in the payment matrix, is both the minimum in its row and the maximum in its column. In geometry, a point on a surface that has a similar property (simultaneous minimum in one coordinate and maximum in another) is called a saddle point; by analogy, this term is used in game theory. An element of a matrix that has this property is called a saddle point of the matrix, and a game is said to have a saddle point.

The saddle point corresponds to a pair of minimax strategies (in this example, A 3 and B 2). These strategies are called optimal, and their combination is called a solution to the game. The solution to the game has the following remarkable property. If one of the players (for example, A) adheres to his optimal strategy, and the other player (B) deviates from his optimal strategy in any way, then for the player who made the deviation, this can never be profitable; such a deviation of player B may best case scenario leave the winnings unchanged, and worst case- increase it. On the contrary, if B adheres to his optimal strategy, and A deviates from his, then this can in no way be beneficial for A.

This statement can be easily verified using the example of the game under consideration with a saddle point. We see that in the case of a game with a saddle point, minimax strategies have a kind of “stability”: if one side adheres to its minimax strategy, then it can only be disadvantageous for the other to deviate from its own. Note that in this case, any player's knowledge that the opponent has chosen his optimal strategy cannot change the player's own behavior: if he does not want to act against his own interests, he must adhere to his optimal strategy. A pair of optimal strategies in a saddle point game is like an “equilibrium position”: any deviation from the optimal strategy leads to unfavorable consequences for the deviating player, forcing him to return to his original position.

So, for every game with a saddle point, there is a solution that determines a pair of optimal strategies for both sides, differing in the following properties.

1) If both parties adhere to their optimal strategies, then the average payoff is net price game ν, which is simultaneously its lower and upper price.

2) If one of the parties adheres to its optimal strategy, and the other deviates from its own, then the deviating party can only lose and in no case can increase its winnings.

The class of games that have a saddle point represents big interest both from a theoretical and practical point of view. In game theory it is proven that, in particular, every game with complete information has a saddle point, and, therefore, every such game has a solution, i.e. there is a pair of optimal strategies for both sides that give an average payoff equal to the cost of the game. If a game with complete information consists only of personal moves, then when each side applies its optimal strategy, it should always end in a well-defined outcome, namely, a win exactly equal to the cost of the game.

As an example of a game with complete information, we give famous game with coins placed on round table. Two players alternately place identical coins on the round table, each time choosing an arbitrary position for the center of the coin; mutual covering of coins is not allowed. The player who puts in the last coin wins (when there is no room left for others). Obviously, the outcome of this game is always predetermined, and there is a well-defined strategy that ensures a certain win for the player who puts in the coin first. Namely, he must place the coin in the center of the table for the first time, and then respond to each opponent’s move with a symmetrical move. In this case, the second player can behave as he pleases without changing the predetermined outcome of the game. Therefore, this game makes sense only for players who do not know the optimal strategy. The situation is similar with chess and other games with complete information; any of these games has a saddle point and a solution that indicates to each player his optimal strategy; solution chess game was not found only because the number of combinations of possible moves in chess is too large for it to be possible to construct a payoff matrix and find a saddle point in it.

§ 3. Pure and mixed strategies. Solving a mixed strategy game

Among finite games of practical importance, games with a saddle point are relatively rare; a more typical case is when the lower and upper price of the game are different. Analyzing the matrices of such games, we came to the conclusion that if each player is given the choice of a single strategy, then, counting on a reasonably acting opponent, this choice should be determined by the minimax principle. By adhering to our maximin strategy, we certainly guarantee ourselves a win equal to the lower price of the game α, regardless of the enemy’s behavior. A natural question arises: is it possible to guarantee an average payoff greater than α if you use not just one “pure” strategy, but randomly alternate several strategies? Such combined strategies, consisting of the use of several pure strategies, alternating according to a random law with a certain frequency ratio, are called mixed strategies in game theory.

Obviously, each pure strategy is a special case of a mixed one, in which all strategies except one are applied with zero frequencies, and this one with frequency 1. It turns out that, using not only pure, but also mixed strategies, it is possible to obtain for each finite game decision, i.e. a pair of (generally mixed) strategies such that when both players use them, the payoff will be equal to the cost of the game, and with any one-sided deviation from the optimal strategy, the payoff can only change in a direction unfavorable for the deviant.

The stated statement constitutes the content of the so-called fundamental theorem of game theory. This theorem was first proven by von Neumann in 1928. Known proofs of the theorem are relatively complex; Therefore, we will give only its formulation.

Each end game has at least one solution (possibly in the area of ​​mixed strategies).

The payoff resulting from the decision is called the cost of the game. From the main theorem it follows that every finite game has a price. Obviously, the price of the game ν always lies between the lower price of the game α and the upper price of the game β:

(3.1) α ≤ ν ≤ β

Indeed, α is the maximum guaranteed gain that we can ensure for ourselves using only our pure strategies. Since mixed strategies include, as a special case, all pure strategies, then by allowing, in addition to pure ones, also mixed strategies, we, in any case, do not worsen our capabilities; therefore, ν ≥ α. Similarly, considering the enemy’s capabilities, we will show that ν ≤ β, which implies the inequality (3.1) being proved.

Let us introduce a special notation for mixed strategies. If, for example, our mixed strategy consists of using strategies A 1, A 2, A 3 with frequencies p 1, p 2, p 3, and p 1 + p 2 + p 3 = 1, we will denote this strategy

Similarly, we will denote the enemy’s mixed strategy:

where q 1, q 2, q 3 are the frequencies at which strategies B 1, B 2, B 3 are mixed; q 1 + q 2 + q 3 = 1.

Let us assume that we have found a solution to the game consisting of two optimal mixed strategies S A *, S B *. In general, not all pure strategies available to a given player are included in his optimal mixed strategy, but only some. We will call the strategies included in a player’s optimal mixed strategy his “useful” strategies. It turns out that the solution to the game has another remarkable property: if one of the players sticks to his optimal mixed strategy S A * (S B *), then the payoff remains unchanged and equal to the cost of the game ν, regardless of what the other player does, unless he goes beyond its “useful” strategies. For example, he can use any of his “useful” strategies in pure form, and can also mix them in any proportions.

§ 4. Elementary methods for solving games. Games 2x2 and 2xn

If the game mxn does not have a saddle point, then finding a solution is generally a rather difficult task, especially for large m and n. Sometimes this task can be simplified if you first reduce the number of strategies by deleting some unnecessary ones. Excessive strategies are a) duplicative and b) obviously unprofitable. Consider, for example, a matrix game:

It is easy to verify that strategy A 3 exactly repeats (“duplicates”) strategy A 1, so any of these two strategies can be eliminated. Next, comparing strings A 1 and A 2, we see that each element of string A 2 is less (or equal) to the corresponding element of string A 1. Obviously, we should never use strategy A2; it is obviously unprofitable. Crossing out A 3 and A 2, we reduce the matrix to more simple view. Next, we note that strategy B 3 is obviously unprofitable for the enemy; By crossing it out, we bring the matrix to its final form:

Thus, the 4x4 game is reduced to a 2x3 game by eliminating duplicate and obviously unprofitable strategies.

The procedure for eliminating duplicate and obviously unprofitable strategies should always precede the solution of the game. The simplest cases of finite games, which can always be solved using elementary methods, are 2x2 and 2xn games.

Consider a 2x2 game with a matrix:

Two cases may occur here: 1) the game has a saddle point; 2) the game does not have a saddle point. In the first case, the solution is obvious: it is a pair of strategies intersecting at a saddle point. Let us note by the way that in a 2x2 game the presence of a saddle point always corresponds to the existence of obviously unprofitable strategies that must be eliminated during preliminary analysis.

Let there be no saddle point and, therefore, the lower price of the game is not equal to the upper: α ≠ β. We need to find the optimal mixed strategy for player A:

It is distinguished by the property that, whatever the opponent’s actions (unless he goes beyond the limits of his “useful” strategies), the payoff will be equal to the cost of the game ν. In a 2x2 game, both of the opponent's strategies are "useful" - otherwise the game would have a pure strategy solution (saddle point). This means that if we adhere to our optimal strategy (4.1), then the opponent can use any of his pure strategies B 1, B 2 without changing the average payoff ν. From here we have two equations:

from which, taking into account that p 1 + p 2 = 1, we obtain:

We find the price of the game ν by substituting the values ​​p 1, p 2 into any of the equations (4.2).

If the price of the game is known, then to determine the opponent’s optimal strategy

One equation is enough, for example:

whence, taking into account that q 1 + q 2 = 1, we have:

Example 1. Let us find the solution to the 2×2 game considered in Example 1 of § 1 with the matrix:

The game does not have a saddle point (α = –1; β = +1), and, therefore, the solution must lie in the area of ​​mixed strategies:

We need to find p 1, p 2, q 1 and q 2. For p 1 we have the equation

1*p 1 + (–1)(1 – p 1) = (–1)p 1 + 1(1 – p 1)

whence p 1 = 1/2, p 2 = 1/2.

We find similarly: q 1 = 1/2, q 2 = 1/2, ν = 0.

Therefore, the optimal strategy for each player is to randomly alternate between his two pure strategies, using each equally often; in this case the average gain will be zero.

The conclusion was clear enough in advance. In the next example we will look at more challenging game, the solution of which is not so obvious. The example is a rudimentary example of games known as "deception" or "deception" games. In practice, in conflict situations, they often use different ways misleading the enemy (disinformation, placement of false targets, etc.). The example, despite its simplicity, is quite instructive.

Example 2. The game is as follows. There are two cards: an ace and a two. Player A draws one at random; B doesn't see which card he took out. If A has taken out an ace, he declares: “I have an ace,” and demands 1 ruble from his opponent. If A took out a deuce, then he can either A 1) say “I have an ace” and demand 1 ruble from the opponent, or A 2) admit that he has a deuce and pay the opponent 1 ruble.

The enemy, if he is voluntarily paid 1 ruble, can only accept it. If he is asked for 1 ruble, then he can either B 1) believe player A that he has an ace and give him 1 ruble, or B 2) demand a check in order to make sure whether A’s statement is true. If the result is After checking, it turns out that A really has an ace, B must pay A 2 rubles. If it turns out that A is cheating and has a deuce, player A pays player B 2 rubles. It is required to analyze the game and find the optimal strategy for each player.

Solution. The game has a relatively complex structure; it consists of one mandatory random move - player A's choice of one of two cards - and two personal moves, which, however, are not necessarily carried out. Indeed, if A took out an ace, then he does not make any personal move: he is given only one opportunity - to demand 1 ruble, which he does. In this case, the personal move - believe or not believe (i.e., pay or not pay 1 ruble) - is given to player B. If A received a two as a result of the first random move, then he is given a personal move: pay 1 ruble or try to cheat enemy and demand 1 ruble (in short: “not to deceive” or “to deceive”). If A chooses the first, then B can only accept 1 ruble; if A chose the second, then player B is given a personal choice: to believe or not to believe A (i.e. pay A 1 ruble or demand verification).

Each player's strategies are rules that indicate what a player should do when given a personal turn. Obviously, A has only two strategies: A 1 - to deceive, A 2 - not to deceive. B also has two strategies: B 1 - believe, B 2 - do not believe. Let's build a game matrix. To do this, we calculate the average winnings for each combination of strategies.

1. A 1 B 1 (A deceives, B believes). If A received an ace (the probability of this is ½), then he is not given a personal move; he demands 1 ruble, and player B believes him; A’s payoff in rubles is 1. If A received a deuce (the probability of this is also ½), he cheats according to his strategy and demands 1 ruble; B believes and pays; winning A is also equal to 1. Average winning: a 11 = ½*1 + ½*1 = 1.

2. A 1 B 2 (A deceives, B does not believe). If A gets an ace, he has no personal move; he demands 1 ruble; B, according to his strategy, does not believe it and, as a result of the check, pays 2 rubles (A’s payoff is +2). If A received a bad mark, he, according to his strategy, demands 1 ruble; In, according to his own, he does not believe; as a result, A pays 2 rubles (A’s payoff is –2). The average winning is: a 12 = ½*(+2) + ½*(–2) = 0.

3. A 2 B 1 (A does not deceive, B believes). If A takes out an ace, he demands 1 ruble; B pays according to his strategy; A's payoff is +1. If A takes out a deuce, according to his strategy he pays 1 ruble; B can only accept (A's payoff is –1). The average winning is: a 21 = ½*(+1) + ½*(–1) = 0.

4. A 2 B 2 (A does not deceive, B does not believe). If A takes out an ace, he demands 1 ruble; B checks and as a result of the check pays 2 rubles (the winnings are +2). If A takes out a deuce, he pays 1 ruble; All that remains is to accept (the payoff is 1). The average winning is: a 22 = ½*(+2) + ½*(–1) = ½.

We build the game matrix:

The matrix does not have a saddle point. The lower price of the game is α = 0, the upper price of the game is β = ½. Let's find a solution to the game in the field of mixed strategies. Applying formula (4.3), we obtain:

those. Player A must use his first strategy (cheat) in one third of all cases, and his second strategy (not cheat) in two thirds of all cases. In this case, he will win on average the game price ν = 1/3.

The value ν = 1/3 indicates that under these conditions the game is profitable for A and disadvantageous for B. Using his optimal strategy, A can always ensure a positive average payoff. Note that if A were to use his most cautious (maximin) strategy (in this case, both strategies A 1 and A 2 are maximin), he would have an average payoff of zero. Thus, the use of a mixed strategy gives A the opportunity to realize its advantage over B, which arises under the given rules of the game.

Let's determine the optimal strategy B. We have: q 1 *1 + q 2 *0 = 1/3, q 1 = 1/3, q 2 = 2/3. Where

i.e. player B must believe A in one third of all cases and pay him 1 ruble without checking, and in two thirds of cases - check. Then he will lose on average 1/3 of each game. If he used his minimax pure strategy B 2 (don't believe), he would lose on average 1/2 for each game.

The solution to the 2x2 game can be given a simple geometric interpretation. Let there be a 2x2 game with the matrix

Let's take a section of the abscissa axis of length 1 (Fig. 4.1). The left end of the section (the point with the abscissa x = 0) will depict strategy A 1; the right end of the section (x = 1) - strategy A 2. Let us draw two perpendiculars to the abscissa axis through points A 1 and A 2: axis I–I and axis II–II. On axis I–I we will postpone winnings for strategy A 1; on the axis II–II-payoffs for strategy A 2. Consider the strategy of the enemy B 1; it gives two points on the axes I–I And II–II with ordinates a 11 and a 21, respectively. Let's draw a straight line B 1 B 1 through these points. Obviously, if we use a mixed strategy with enemy strategy B 1

then our average gain, equal in this case to a 11 p 1 + a 21 p 2, will be represented by point M on the straight line B 1 B 1; The abscissa of this point is equal to p 2. The straight line B 1 B 1, depicting the payoff for strategy B 1, will be conventionally called “strategy B 1”.

Obviously, strategy B 2 can be constructed in exactly the same way (Fig. 4.2).

We need to find the optimal strategy S A *, i.e., one for which the minimum gain (for any behavior B) would turn into a maximum. To do this, we will construct a lower bound for the winnings for strategies B 1, B 2, i.e. broken line B 1 NB 2 marked in Fig. 4.2 with a thick line. This lower bound will express the minimum payoff of player A for any of his mixed strategies; the point N at which this minimum payoff reaches a maximum determines the decision and price of the game. It is easy to verify that the ordinate of point N is the cost of the game ν, and its abscissa is equal to p 2 - the frequency of application of strategy A 2 in the optimal mixed strategy S A *.

In our case, the solution to the game was determined by the intersection point of the strategies. However, this will not always be the case; in Fig. 4.3 shows the case when, despite the presence of an intersection of strategies, the solution gives pure strategies for both players (A 2 and B 2), and the cost of the game is ν = a 22. In this case, the matrix has a saddle point, and strategy A 1 is obviously unprofitable, because for any pure strategy of the opponent, it gives a smaller payoff than A 2.

In the case when the enemy has a obviously unprofitable strategy, the geometric interpretation has the form shown in Fig. 4.4.

In this case, the lower limit of winning coincides with strategy B 1, strategy B 2 is obviously unprofitable for the opponent.

The geometric interpretation also makes it possible to visualize the lower and upper prices of the game (Fig. 4.5).

To illustrate, let us construct geometric interpretations of the 2×2 games discussed in examples 1 and 2 (Fig. 4.6 and 4.7).

We made sure that any 2x2 game can be solved with basic techniques. Any 2xn game can be solved in exactly the same way. where we have only two strategies, and the enemy has an arbitrary number.

Let us have two strategies: A 1, A 2, and the enemy has n strategies: B 1, B 2, ..., B n. The matrix ‖a ij ‖ is given; it consists of two rows and n columns. Similar to the case of two strategies, let us give the problem a geometric interpretation; n enemy strategies will be depicted as n straight lines (Fig. 4.8). We construct the lower boundary of the winnings (the broken line B 1 MNB 2) and find on it the point N with the maximum ordinate. This point gives the solution to the game (strategy ) the ordinate of point N is equal to the cost of the game ν, and the abscissa is equal to the frequency p 2 of strategy A 2 .

In this case, the enemy’s optimal strategy is obtained by using a mixture of two “useful” strategies: B 2 and B 4, intersecting at point N. Strategy B 3 is obviously unprofitable, and strategy B 1 is unprofitable with the optimal strategy S A *. If A sticks to his optimal strategy, then the payoff will not change, no matter which of his “useful” strategies B uses, however, it will change if B switches to strategies B 1 or B 3. In game theory, it is proven that any finite game mxn has a solution in which the number of “useful” strategies on either side does not exceed the smaller of the two numbers m and n. In particular, it follows from this that the game 2xm always has a solution in which no more than two “useful” strategies are involved on either side.

Using geometric interpretation, we can give a simple way to solve any 2xm game. Directly from the drawing we find a pair of “useful” enemy strategies B j and B k intersecting at point N (if more than two strategies intersect at point N, take any two of them). We know that if player A adheres to his optimal strategy, then the payoff does not depend on the proportion in which B uses his “useful” strategies, therefore,

From these equations and the condition p 2 = 1 – p 1, we find p1, p2 and the game price ν. Knowing the price of the game, you can immediately determine the optimal strategy player B. To do this, solve, for example, the equation: q j a 1 j + q k a 1 k = ν, where q j + q k = 1. In the case when we have m strategies, and the enemy has only two, obviously, the problem is solved in a completely similar way ; It is enough to note that by reversing the sign of the winnings, you can turn player A from a “winner” into a “loser.” You can solve the game without changing the winning sign; then the problem is solved directly for B, but not the lower, but the upper bound of the gain is constructed (Fig. 4.9). On the boundary, a point N with a minimum ordinate is sought, which is the cost of the game ν.

Let's consider and solve several examples of 2x2 and 2xm games, which are simplified examples of games that have practical significance.

Example 3. Side A sends two bombers to enemy B's location I And II; I flies in front II- behind. One of the bombers - it is not known in advance which one - must carry a bomb, the other serves as an escort. In the enemy area, the bombers are attacked by a fighter from side B. The bombers are armed with cannons of varying rates of fire. If a fighter attacks a rear bomber II, then only the guns of this bomber fire at it; if he attacks the front bomber, then the guns of both bombers fire at him. The probability of hitting a fighter in the first case is 0.3, in the second 0.7.

If a fighter is not shot down by defensive fire from bombers, it hits its chosen target with a probability of 0.6. The bombers' task is to deliver the bomb to the target; The fighter's task is to prevent this, i.e. shoot down a carrier bomber. Required to select optimal strategies sides:

a) for side A: which bomber should be used as a carrier?

b) for side B: which bomber to attack?

Solution. We have a simple case of a 2x2 game; winning - probability non-infection of the host. Our strategies: A 1 - carrier - bomber I; A 2 - carrier - bomber II. Enemy strategies: B 1 - attack the bomber I; B 2 - bomber attacks II. Let's create a game matrix, i.e. Let's find the average payoff for each combination of strategies.

1. A 1 B 1 (carrier I, attacked I). The carrier will not be hit if the bombers shoot down the fighter, or not, but it will not hit its target: a 11 = 0.7 + 0.3 * 0.4 = 0.82.

2. A 2 B 1 (carrier II, attacked I). a 21 = 1

3. A 1 B 2 (carrier I, attacked II). A 12 = 1

4. A 2 B 2 (carrier II, attacked II). A 22 = 0.3 + 0.7*0.4 = 0.58

The game matrix looks like:

Low game price 0.82; upper price 1. The matrix does not have a saddle point; We are looking for a solution in the field of mixed strategies. We have:

p 1 *0.82 + p 2 *1 = ν

p 1 *1 + p 2 *0.58 = ν

p 1 = 0.7; p 2 = 0.3

Our optimal strategy yes, i.e. you need to choose more often as a carrier I, how II. The price of the game is ν = 0.874. Knowing ν, we determine q 1 and q 2 - the frequencies of strategies B 1 and B 2 in the opponent’s optimal strategy S B *. We have: q 1 *0.82 + q 2 *1 = 0.874 and q 2 = 1 – q 1, whence q 1 = 0.7; q 2 = 0.3, i.e. the enemy’s optimal strategy is .

Example 4. Side A attacks the object, side B defends it. Side A has two planes; side B has three anti-aircraft guns. Each aircraft is a carrier of a powerful lethal weapon; In order for an object to be hit, it is enough for at least one aircraft to break through to it. Side A aircraft can choose any of three directions to approach the object: I, II, III(Fig. 4.10). The enemy (side B) can place any of his guns in any direction; in this case, each gun only shoots through an area of ​​space related to a given direction, and does not shoot through neighboring directions. Each gun can only fire one aircraft; a fired aircraft is hit with probability 1. Side A does not know where the guns are located; Side B doesn't know where the planes will come from. Side A's job is to hit the target; The task of side B is to prevent his defeat. Find the solution to the game.

Solution. The game is a 2x3 game. Winning is the probability of hitting the object. Our possible strategies: A 1 - send one plane to two different directions. A 2 - send both planes in the same direction. Enemy strategies: B 1 - place one gun in each direction; B 2 - place two guns in one direction and one in the other; At 3 - place all three guns in the same direction. Let's create a game matrix.

1. A 1 B 1 (planes fly along different directions; guns are placed one at a time). Obviously, in this case, not a single plane will break through to the object: a 11 = 0.

2. A 2 B 1 (airplanes fly together in one direction; guns are placed one at a time). Obviously, in this case one plane will pass to the object without being fired upon: a 21 = 1.

3. A 1 B 2 (planes fly one at a time; the enemy protects two directions and leaves the third unprotected). The probability that at least one plane will break through to the object is equal to the probability that one of them will choose an unprotected direction: a 12 = 2/3.

4. A 2 B 2 (airplanes fly together in one direction; the enemy protects one direction with two guns and one with one, i.e., actually protects one direction and leaves two unprotected). The probability that at least one plane will break through to the object is equal to the probability that a pair of planes will choose an actually unprotected direction: a 22 = 2/3.

5. A 1 B 3 (planes fly one at a time; the enemy defends only one direction with three guns): a 13 = 1.

6. A 2 B 3 (both planes fly together; the enemy defends only one direction with three guns). For an object to be hit, aircraft must choose an unprotected direction: a 23 = 2/3.

Game Matrix:

It is clear from the matrix that strategy B 3 is obviously unprofitable compared to B 2 (this could have been decided in advance). By eliminating strategy B 3, the game is reduced to a 2x2 game:

The matrix has a saddle point: the lower price of the game 2/3 coincides with the upper one. At the same time, we note that for us (A) strategy A 1 is obviously unprofitable. Conclusion: both sides A and B should always use their pure strategies A 2 and B 2, i.e. we must send planes in 2s, randomly choosing the direction in which the pair is sent; the enemy must place guns like this: two - in one direction, one in another, and the choice of these directions must also be made randomly (here, as we see, “pure strategies” already include an element of randomness). By applying these optimal strategies, we will always obtain a constant average payoff of 2/3 (i.e., the object will be hit with a probability of 2/3). Note that the found solution to the game is not unique; in addition to the decision in pure strategies, there is a whole section of mixed strategies of player A that are optimal, from p 1 = 0 to p 1 = 1/3 (Fig. 4.11).

It is easy, for example, to verify directly that the same average gain of 2/3 will be obtained if we apply our strategies A 1 and A 2 in the proportion of 1/3 and 2/3.

Example 5. The same conditions as in the previous example, but four directions of attack are possible for us, and the enemy has four guns.

Solution. We still have two possible strategies: A 1 - send planes one at a time, A 2 - send two planes together. The enemy has five possible strategies: B 1 - place one gun in each direction; In 2 - place two guns in two different directions; In 3 - place two guns in one direction and one each in the other two; B 4 - place three guns in one direction and one in another; At 5 - place all four guns in one direction. We will discard strategies B 4 and B 5 in advance as obviously unprofitable. Reasoning similarly to the previous example, we build the game matrix:

The lower price of the game is 1/2, the upper is 3/4. The matrix does not have a saddle point; the solution lies in the area of ​​mixed strategies. Using the geometric interpretation (Fig. 4.12), we highlight the enemy’s “useful” strategies: B 1 and B 2.

We determine the frequencies p 1 and p 2 from the equations: p 1 *0 + (1 – p 1)*1 = ν and p 1 *5/6 + (1 – p 1)*1/2 = ν; whence p 1 = 3/8; p 2 = 5/8; ν = 5/8, i.e. our optimal strategy is . By using it, we guarantee ourselves an average win of 5/8. Knowing the cost of the game ν = 5/8, we find the frequencies q 1 and q 2 of the opponent’s “useful” strategies: q 1 *0 + (1 – q 1)*5/6 = 5/8, q 1 = ¼, q 2 = ¾. The enemy's optimal strategy will be: .

Example 6. Side A has two strategies A 1 and A 2, side B has four strategies B 1, B 2, B 3 and B 4. The game matrix looks like:

Find the solution to the game.

Solution. Lowest price game 3; top 4. Geometric interpretation (Fig. 4.13) shows that useful strategies for player B are B 1 and B 2 or B 2 and B 4:

Player A has infinitely many optimal mixed strategies: in an optimal strategy, p 1 can vary from 1/5 to 4/5. The cost of the game is ν = 4. Player B has a pure optimal strategy B 2 .

§ 5. General methods for solving finite games

So far we have considered only the most elementary games of the 2xn type, which can be solved quite simply and allow for a convenient and visual geometric interpretation. In the general case, solving the game mxn is a rather difficult problem, and the complexity of the problem and the amount of calculations required to solve it increases sharply with increasing m and n. However, these difficulties are not of a fundamental nature and are associated only with a very large volume of calculations, which in some cases may turn out to be practically impossible. The fundamental aspect of the method for finding a solution remains the same for any m.

Let's illustrate this using the example of the game 3xn. Let's give it a geometric interpretation - already a spatial one. Our three strategies A 1 , A 2 and A 3 will be represented by three points on the plane xOy; the first lies at the origin of coordinates (Fig. 5.1), the second and third - on the axes Oh And OU at distances 1 from the beginning.

Axes are drawn through points A 1, A 2 and A 3 II, IIII And IIIIII, perpendicular to the plane xOy. On axis II winnings for strategy A 1 are postponed on the axes IIII And IIIIII- winnings for strategies A 2, A 3. Each enemy strategy B j will be represented by a plane cutting off the axes II, IIII And IIIIII segments equal to the winnings for the corresponding strategies A 1, A 2 and A 3 and strategy B j. Having constructed all the enemy’s strategies in this way, we get a family of planes over the triangle A 1, A 2 and A 3 (Fig. 5.2). For this family, you can also construct a lower bound for the payoff, as we did in the case of 2xn, and find on this boundary the point N with the maximum height above the plane xOy. This height will be the cost of the game ν.

The frequencies p 1 , p 2 , p 3 of strategies A 1 , A 2 and A 3 in the optimal strategy S A * will be determined by the coordinates (x, y) of point N, namely: p 2 = x, p 3 = y, p 1 = 1 – p 2 – p 3 . However, this geometric construction even for the 3xn case it is not easy to implement and requires a lot of time and effort of imagination. In the general case of the game, it is transferred to m-dimensional space and loses all clarity, although the use of geometric terminology in a number of cases may be useful. When solving mxn games in practice, it is more convenient to use not geometric analogies, but calculated analytical methods, especially since these methods are the only ones suitable for solving the problem on computers.

All of these methods essentially come down to solving a problem through successive trials, but ordering the sequence of trials allows you to build an algorithm that leads to a solution in the most economical way. Here we will briefly focus on one calculation method solving mxn games - using the so-called “method” linear programming" To do this, we first give a general formulation of the problem of finding a solution to the game mxn. Let a game mxn be given with m strategies A 1 , A 2 , …, A m of player A and n strategies B 1 , B 2 , …, B n of player B and the payment matrix ‖a i j ‖ is given. It is required to find a solution to the game, i.e. two optimal mixed strategies of players A and B

where p 1 + p 2 + … + p m = 1; q 1 + q 2 + … + q n = 1 (some of the numbers p i and q j may be zero).

Our optimal strategy S A * must provide us with a gain not less than ν for any behavior of the enemy, and a gain equal to ν for his optimal behavior (strategy S B *). Similarly, strategy S B * should provide the enemy with a loss no greater than ν for any of our behavior and equal to ν for our optimal behavior (strategy S A *).

The value of the game ν in this case is unknown to us; we will assume that it is equal to some positive number. Believing this way, we do not violate the generality of reasoning; In order for ν > 0, it is obviously sufficient that all elements of the matrix ‖a i j‖ are non-negative. This can always be achieved by adding a sufficiently large positive value to the elements ‖a i j ‖ L; the price of the game will increase by L, but the decision will not change.

Let us choose our optimal strategy S A *. Then our average payoff for the opponent’s strategy B j will be equal to: a j = p 1 a 1j + p 2 a 2j + … + p m a mj . Our optimal strategy S A * has the property that, for any enemy behavior, it provides a payoff no less than ν; therefore, any of the numbers a j cannot be less than ν. We get a number of conditions:

Let us divide inequalities (5.1) by the positive value ν and denote

Then conditions (5.1) will be written in the form

where ξ 1, ξ 2, …, ξ m are non-negative numbers. Since р 1 + p 2 + … + p m = 1, then the quantities ξ 1, ξ 2, …, ξ m satisfy the condition

(5.3) ξ 1 + ξ 2 + … + ξ m = 1/ν.

We want to make our guaranteed winnings as high as possible; Obviously, in this case the right-hand side of equality (5.3) takes on a minimum value. Thus, the problem of finding a solution to the game is reduced to the following mathematical problem: determine non-negative quantities ξ 1, ξ 2, ..., ξ m, satisfying conditions (5.2), so that their sum Φ = ξ 1 + ξ 2 + ... + ξ m was minimal.

Usually, when solving problems related to finding extreme values ​​(maxima and minima), the function is differentiated and the derivatives are set equal to zero. But such a technique is useless in this case, since the function Φ, which needs to be minimized, is linear, and its derivatives with respect to all arguments are equal to one, i.e. do not vanish anywhere. Consequently, the maximum of the function is achieved somewhere on the boundary of the range of changes in the arguments, which is determined by the requirement of non-negativity of the arguments and conditions (5.2). The technique of finding extreme values ​​using differentiation is also unsuitable in cases where the maximum of the lower (or minimum of the upper) limit of the winnings is determined to solve the game, as we, for example, did when solving 2xn games. Indeed, the lower boundary is made up of sections of straight lines, and the maximum is achieved not at the point where the derivative is equal to zero (there is no such point at all), but at the boundary of the interval or at the point of intersection of straight sections.

To solve such problems, which are quite often encountered in practice, a special linear programming apparatus has been developed in mathematics. The linear programming problem is formulated as follows. Given a system of linear equations:

It is required to find non-negative values ​​of the quantities ξ 1, ξ 2, ..., ξ m, satisfying conditions (5.4) and at the same time minimizing the given homogeneous linear function quantities ξ 1, ξ 2, …, ξ m (linear form): Φ = c 1 ξ 1 + c 2 ξ 2 + … + c m ξ m

It is easy to see that the game theory problem posed above is a special case of a linear programming problem with c 1 = c 2 = ... = c m = 1. At first glance it may seem that conditions (5.2) are not equivalent to conditions (5.4), since instead equal signs they contain inequality signs. However, it is easy to get rid of the inequality signs by introducing new fictitious non-negative variables z 1, z 2, ..., z n and writing conditions (5.2) in the form:

The form Φ that needs to be minimized is Φ = ξ 1 + ξ 2 + … + ξ m. The linear programming apparatus allows, by means of a relatively small number of successive samples, to select the values ​​ξ 1, ξ 2, ..., ξ m that satisfy the set requirements. For greater clarity, we will demonstrate here the use of this apparatus directly on the material of solving specific games.

Example 1. It is required to find a solution to the 3×3 game given in Example 2 of § 1, with the matrix:

To make all a ij non-negative, we add L = 5 to all elements of the matrix. We obtain the matrix:

In this case, the price of the game will increase by 5, but the solution will not change.

Let us determine the optimal strategy S A *. Conditions (5.2) have the form:

where ξ 1 = p 1 /ν, ξ 2 = p 2 /ν, ξ 3 = p 3 /ν. To get rid of inequality signs, we introduce dummy variables z 1, z 2, z 3; conditions (5.6) will be written as:

The linear form of Φ is: Φ = ξ 1 + ξ 2 + ξ 3 and should be made as small as possible. If all three strategies B are “useful”, then all three dummy variables z 1 , z 2 , z 3 will vanish (i.e., a payoff equal to the cost of the game ν will be achieved for each strategy B j). But we have no reason yet to say that all three strategies are “useful.” To check this, let's try to express the form Φ in terms of dummy variables z 1, z 2, z 3 and see if we achieve a minimum of the form by setting them equal to zero. To do this, let’s solve equations (5.7) with respect to the variables ξ 1, ξ 2, ξ 3 (i.e., express ξ 1, ξ 2, ξ 3 through fictitious variables z 1, z 2, z 3):

Adding ξ 1, ξ 2, ξ 3, we get: Φ = 1/5 + z 1/20 + z 2/10 + z 3/20. Here the coefficients for all z are positive; This means that any increase in z 1, z 2, z 3 above zero can only lead to an increase in the shape of Φ, and we want it to be minimal. Consequently, the values ​​of z 1, z 2, z 3, turning the form Φ to a minimum, are z 1 = z 2 = z 3 = 0. Therefore, the minimum value of the form Φ: 1/ν = 1/5, whence the price of the game ν = 5. Substituting zero values ​​z 1, z 2, z 3 into formulas (5.8), we find: ξ 1 = 1/20, ξ 2 = 1/10, ξ 3 = 1/20, or, multiplying them by ν, p 1 = 1/4, p 2 = 1/2, p 3 = 1/4. Thus, the optimal strategy A is found: , i.e. we should write the number 1 in one quarter of all cases, 2 in half of the cases and 3 in the remaining quarter of cases.

Knowing the cost of the game ν = 5, we can use already known methods to find the opponent’s optimal strategy . To do this, we will use our any two “useful” strategies (for example, A 2 and A 3) and write the equations:

9q 1 + 11 (1-q 2 -q 1) = 5,

whence q 1 = q3 = 1/4; q 2 = 1/2. The enemy's optimal strategy will be the same as ours: . Now let's go back to the original (unconverted) game. To do this, you only need to subtract the value L = 5 added to the elements of the matrix from the game price ν = 5. Let us obtain the price of the original game v 0 = 0. Consequently, the optimal strategies of both sides provide an average payoff equal to zero; the game is equally beneficial or disadvantageous for both parties.

Example 2. Sport Club A has three options for team composition: A 1, A 2 and A 3. Club B - also with three options B 1, B 2 and B 3. When applying to participate in the competition, neither club knows which squad the opponent will choose. The probabilities of club A winning for various team compositions, approximately known from the experience of past meetings, are given by the matrix:

Find the frequency with which clubs should field each of their squads in matches against each other in order to achieve the largest average number of wins.

Solution. Low game price 0.4; top 0.6; We are looking for a solution in the field of mixed strategies. To avoid dealing with fractions, let's multiply all elements of the matrix by 10; in this case, the price of the game will increase 10 times, but the decision will not change. We get the matrix:

Conditions (5.5) have the form:

and the minimum condition is Φ = ξ 1 + ξ 2 + ξ 3 = min.

We check whether all three enemy strategies are “useful”. As a hypothesis, we first assume that the dummy variables z 1, z 2, z 3 are equal to zero, and to check we solve equations (5.10) for ξ 1, ξ 2, ξ 3:

(5.12) 136Φ = 30 +13z 1 +18z 2 – 51z 3

Formula (5.12) shows that increasing the variables z 1 and z 2 from their assumed value of zero can only increase Φ, while increasing z 3 can decrease Φ. However, the increase in z 3 must be done carefully so that the values ​​ξ 1, ξ 2, ξ 3, depending on z 3, do not become negative. Therefore, let us set the values ​​z 1 and z 2 equal to zero on the right-hand sides of equalities (5.11), and increase the value z 3 to acceptable limits (until any of the values ​​ξ 1 , ξ 2 , ξ 3 goes to zero). From the second equality (5.11) it is clear that an increase in z 3 is “safe” for the value ξ 2 - it only increases from this. As for the values ​​ξ 1 and ξ 3, here an increase in z 3 is possible only to a certain limit. The value ξ 1 becomes zero at z 3 = 10/23; the value ξ 3 vanishes earlier, already at z 3 = 1/4. Therefore, giving z 3 its maximum permissible value z 3 = 1/4, we will turn the value ξ 3 to zero.

To check whether the form Φ turns to a minimum at z 1 = 0, z 2 = 0, ξ 3 = 0, we express the remaining (non-zero) variables in terms of supposedly zero z 1, z 2, ξ 3. Resolving equations (5.10) for ξ 1, ξ 2 and z 3, we obtain:

(5.13) 32Φ = 7 + Зz 1 + 4z 2 + ξ 3

From formula (5.13) it is clear that any increase in z 1, z 2, ξ 3 beyond their assumed zero values ​​can only increase the shape of Φ. Consequently, the solution to the game has been found; it is determined by the values ​​z 1 = z 2 = ξ 3 = 0, whence ξ 1 = 1/32, ξ 2 = 3/16, z 3 = 1/4. Substituting into formula (5.13), we find the price of the game ν: 32Φ = 7 = 32/ν; ν = 32/7. Our optimal strategy: . "Useful" strategies (compositions A 1 and A 2) should be applied at frequencies of 1/7 and 6/7; composition A 3 - never use.

To find the enemy’s optimal strategy, in the general case, you can do this: change the sign of the payoff to the opposite, add a constant value L to the matrix elements to make them non-negative, and solve the problem for the enemy in the same way as we solved it for ourselves. However, the fact that we already know the price of the game ν simplifies the problem somewhat. In addition, in this particular case, the problem is further simplified by the fact that only two “useful” enemy strategies B 1 and B 2 are involved in the solution, since the value of z 3 is not equal to zero, and, therefore, with strategy B 3 the cost of the game is not achieved . By choosing any “useful” strategy of player A, for example A 1, you can find the frequencies q 1 and q 2. To do this, we write the equation 8q 1 + 2(1 – q 1) = 32/7, from where q 1 = 3/7, q 2 = 4/7; The enemy's optimal strategy will be: , i.e. the enemy should not use composition B 3, and compositions B 1 and B 2 should be used with frequencies of 3/7 and 4/7.

Returning to the original matrix, we determine the true cost of the game ν 0 = 32/7:10 = 0.457. This means that when large number matches, the number of victories of club A will be 0.457 of all meetings.

§ 6. Approximate methods for solving games

Often in practical problems there is no need to find an exact solution to the game; It is enough to find an approximate solution that gives an average payoff close to the cost of the game. Approximate knowledge of the price of the game ν can be obtained by a simple analysis of the matrix and determination of the lower (α) and upper (β) prices of the game. If α and β are close, there is practically no need to search for an exact solution, and it will be enough to choose pure minimax strategies. In cases where α and β are not close, it is possible to obtain a practical solution using numerical methods for solving games, of which we will briefly highlight the iteration method.

The idea of ​​the iteration method comes down to the following. " thought experiment", in which opponents A and B use their strategies against each other. The experiment consists of a sequence of elementary games, each of which has the matrix of a given game. It starts with the fact that we (player A) randomly choose one of our strategies, for example A i. The enemy responds to this with his strategy B j , which is least beneficial for us, i.e. turns the payoff for strategy A i to a minimum. We respond to this move with our strategy A k, which gives the maximum average gain when the opponent uses strategy B j. Next, it’s the enemy’s turn again. He responds to our pair of moves A i and A k with his strategy B j, which gives us the smallest average payoff for these two strategies (A i, A k), and so on. At each step of the iterative process, each player responds to any move of another player with his own strategy, which is optimal relative to all his previous moves, considered as some mixed strategy in which pure strategies are presented in proportions corresponding to the frequency of their use.

This method is like a model of real practical “training” of players, when each of them experiences the enemy’s behavior and tries to respond to it in a way that is beneficial for themselves. If such an imitation of the learning process is continued long enough, then the average gain per one pair of moves (elementary game) will tend to the price of the game, and the frequencies p 1 ... p m ; q 1 ... q n , which the players' strategies encounter in this game, will approach the frequencies that determine the optimal strategies. Calculations show that the convergence of the method is very slow, but this is not an obstacle for high-speed calculating machines.

Let us illustrate the application of the iterative method using the example of the 3x3 game solved in example 2 of the previous paragraph. The game is given by the matrix:

Table 6.1 shows the first 18 steps of the iterative process. The first column gives the number of the elementary game (pair of moves) n; in the second - number i the chosen strategy of player A; in the next three - “accumulated winnings” for the first n games with the opponent's strategies B 1, B 2, B 3. The minimum of these values ​​is underlined. Next comes the number j strategy chosen by the opponent and, accordingly, the accumulated winnings for n games for strategies A 1 , A 2 , A 3 of these values ​​the maximum is underlined above. The underlined values ​​determine the other player's choice of response strategy. The following columns sequentially show: the minimum average winnings ν, equal to the minimum accumulated winnings divided by the number of games n; maximum average winnings, equal to the maximum accumulated winnings divided by n, and their arithmetic mean ν* = (ν + )/2. When increasing n all three values ​​ν and ν* will approach the game price ν, but the value ν* will naturally approach it relatively faster.

Table 6.1.

As can be seen from the example, the convergence of iterations is very slow, but still even such a small calculation makes it possible to find the approximate value of the game’s price and identify the predominance of “useful” strategies. When using calculating machines, the value of the method increases significantly. The advantage of the iterative method of solving games is that the volume and complexity of calculations increase relatively little as the number of strategies increases m And n.

§ 7. Methods for solving some endless games

An infinite game is a game in which at least one of the parties has an infinite number of strategies. General methods for solving such games are still poorly developed. However, some special cases that allow a relatively simple solution may be of interest for practice. Consider a game of two opponents A and B, each of whom has an infinite (uncountable) set of strategies; these strategies for player A correspond to different meanings continuously changing parameter X, and for B - parameter at. In this case, instead of the matrix ‖a ij ‖ the game is determined by a certain function of two continuously changing arguments a(x, y), which we will call the payoff function (note that the function itself a(x, y) does not have to be continuous). Win function a(x, y) can be represented geometrically by some surface a(x, y) above the argument change area (x, y)(Fig. 7.1)

Payoff Function Analysis a(x, y) is carried out similarly to the analysis of the payment matrix. First, the lower price of the game α is found; for this purpose is determined for each X minimum function a(x, y) on all at: , then the maximum of these values ​​is searched for over all X(maximin):

Top price games (minimax) is defined similarly:

Let's consider the case when α = β. Since the price of the game ν is always between α and β, their total value is ν. The equality α = β means that the surface a(x, y) has a saddle point, i.e., a point with coordinates x 0, y 0, at which a(x, y) is at the same time minimal in at and maximum X(Fig. 7.2).

Meaning a(x, y) at this point is the price of the game ν: ν = a(x 0, y 0). The presence of a saddle point means that a given infinite game has a pure strategy solution; x 0, y 0 represent optimal pure strategies A and B. In the general case, when α ≠ β, the game can only have a solution in the domain of mixed strategies (possibly not the only one). Mixed strategy for infinite games there is some probability distribution for the strategies X And at, considered as random variables. This distribution can be continuous and determined by densities f 1 (X) And f 2 (y); may be discrete, and then optimal strategies consist of a set of individual pure strategies chosen with some nonzero probabilities.

In the case where an infinite game does not have a saddle point, we can give a clear geometric interpretation of the lower and upper prices of the game. Consider an infinite game with a payoff function a(x, y) and strategies x, y, filling continuously the segments of the axes (x 1, x 2) And (y 1, y 2). To determine the lower price of the game α, you need to “look” at the surface a(x, y) from the axle side at, i.e. project it onto a plane xOa(Fig. 7.3). We obtain a certain figure bounded on the sides by the straight lines x = x 1 and x = x 2, and on the top and bottom by the curves K B and K H. The lower price of the game α, obviously, is nothing more than the maximum ordinate of the curve K H.

Similarly, to find the upper price of the game β, you need to “look” at the surface a(x, y) from the axle side X(project the surface onto a plane uOa) and find the minimum ordinate of the upper boundary of the K B projection (Fig. 7.4).

Let's look at two elementary examples of infinite games.

Example 1. Players A and B each have an uncountable set of possible strategies X And at, and 0 ≤ x ≤ 1; 0 ≤ y ≤ 1. The payoff function for a is given by the expression a (x, y) – (x – y) 2 . Find the solution to the game.

Solution: The surface a(x, y) is a parabolic cylinder (Fig. 7.5) and has no saddle point. Let's determine the lower price of the game; obviously for everyone X; hence = 0. Let us determine the upper price of the game. To do this, we find for a fixed at

In this case, the maximum is always achieved at the boundary of the interval (at x = 0 or x = 1), i.e. it is equal to that of the quantities y 2; (1 – y) 2, which is greater. Let us depict the graphs of these functions (Fig. 7.6), i.e. surface projection a(x, y) to the plane uOa. Bold line in Fig. 7.6 shows the function. Obviously, its minimum value is achieved at y = 1/2 and is equal to 1/4. Therefore, the upper price of the game is β = 1/4. In this case, the upper price of the game coincides with the price of the game ν. Indeed, player A can apply a mixed strategy S A = , in which the extreme values ​​x = 0 and x = 1 occur with the same frequencies; then, for any strategy of player B, the average payoff of player A will be equal to: ½у 2 + ½(1 – y) 2. It is easy to verify that this quantity for any values at between 0 and 1 has a value of at least ¼: ½у 2 + ½(1 – y) 2 ≥ ¼.

Thus, player A, using this mixed strategy, can guarantee himself a win equal to the upper price of the game; since the price of the game cannot be greater than the upper price, then this strategy S A is optimal: S A = S A *.

It remains to find the optimal strategy of player B. Obviously, if the price of the game ν is equal to the upper price of the game β, then the optimal strategy of player B will always be his pure minimax strategy, which guarantees him the upper price of the game. In this case, such a strategy is 0 = ½. Indeed, with this strategy, no matter what player A does, his payoff will not be more than ¼. This follows from the obvious inequality (x – ½) 2 = x(x –1) + ¼ ≤ ¼

Example 2. Side A (“we”) is firing at enemy aircraft B. In order to evade fire, the enemy can maneuver with some overload at, to which he, at his discretion, can assign values ​​from at= 0 (linear motion) to at = atmax(flight along a circle of maximum curvature). We assume atmax unit of measurement, i.e. let's put atmax= 1. In the fight against the enemy, we can use sighting devices based on one or another hypothesis about the movement of the target during the flight of the projectile. Overload X in this hypothetical maneuver can be assumed to be equal to any value from 0 to 1. Our task is to hit the enemy; The enemy's task is to remain undefeated. Probability of defeat for data X And at is approximately expressed by the formula: a(x, y) = , Where at- overload applied by the enemy; x - overload taken into account in the sight. It is necessary to determine the optimal strategies of both parties.

Solution. Obviously, the solution of the game will not change if we set p = 1. Payoff function a(x, y) represented by the surface shown in Fig. 7.7.

This is a cylindrical surface, the generators of which are parallel to the bisector of the coordinate angle xOy, and the section by a plane perpendicular to the generatrix is ​​a curve like a normal distribution curve. Using the geometric interpretation of the lower and upper prices of the game proposed above, we find β = 1 (Fig. 7.8) and (Fig. 7.9). The game has no saddle point; the solution must be sought in the field of mixed strategies. The task is somewhat similar to the problem in the previous example. Indeed, at small values k the function behaves roughly like a function –(x – y) 2, and the solution to the game will be obtained if in the solution of the previous example we swap the roles of players A and B; those. our optimal strategy will be the pure strategy x = 1/2, and the enemy’s optimal strategy S B = will be to use the extreme strategies y = 0 and y = 1 with equal frequencies. This means that we must use the aim in all cases designed for an overload of x = 1/2, and the enemy must in half of all cases not use any maneuver at all, and in half - the maximum possible maneuver.

Rice. 7.8 Fig. 7.9.

It is easy to prove that this solution will be valid for values ​​k ≤ 2. Indeed, the average payoff for the opponent’s strategy S B = and for our strategy X expressed by the function , which for values ​​k ≤ 2 has one maximum at x = 1/2, equal to the lower price of the game α. Consequently, the use of strategy S B guarantees the opponent a loss no greater than α, from which it is clear that α - the lower price of the game - is the price of the game ν.

For k > 2, the function a(x) has two maxima (Fig. 7.10), located symmetrically with respect to x = 1/2 at points x 0 and 1 – x 0, and the value of x 0 depends on k.

Obviously, when k= 2 x 0 = 1 – x 0 = ½; with increasing k points x 0 and 1 – x 0 move apart, coming closer to extreme points(0 and 1). Therefore, the solution of the game will depend on k. Let's set a specific value of k, for example k = 3, and find the solution to the game; To do this, we determine the abscissa x 0 of the maximum of the curve a(x). Equating the derivative of the function a(x) to zero, we write an equation to determine x 0:

This equation has three roots: x = 1/2 (where the minimum is reached) and x 0, 1 – x 0, where the maximums are reached. Solving the equation numerically, we find approximately x 0 ≈ 0.07; 1 – x 0 ≈ 0.93.

Let us prove that the solution to the game in this case will be the following pair of strategies:

With our strategy and the enemy's strategy at the average winning is

Let's find the minimum a 1 (y) at 0< у < 1. Функция a 1 (y) симметрична относительно y = 1/2 и может иметь только один или два максимума; ее минимум, во всяком случае, достигается либо в середине отрезка (0, 1), либо на его концах. Полагая у = 0 (или у = 1), найдем

Setting y = 1/2, we get

which is greater than a 1 (0); therefore, the price of the game is not less than a 1 (0):

Now let's say that the enemy uses strategy S B *, and we use strategy x. Then the average gain will be

But we chose x 0 precisely so that at x = x 0 the maximum of expression (7.2) is achieved; hence,

those. the opponent using the strategy S B * can prevent a loss greater than 0.530; therefore, ν = 0.530 is the price of the game, and the strategies S A * and S B * provide the solution. This means that we must use sights with x = 0.07 and x = 0.93 with the same frequency, and the enemy must not maneuver with the same frequency and maneuver with maximum overload.

Note that the payoff ν = 0.530 is noticeably greater than the lower price of the game , which we could provide ourselves by applying our maximin strategy x 0 = 1/2.

One of the practical ways to solve infinite games is to reduce them approximate to finite ones. In this case, the whole range of possible strategies for each player is conditionally combined into one strategy. In this way, of course, you can only obtain an approximate solution to the game, but in most cases an exact solution is not required.

However, it must be borne in mind that when applying this technique, solutions in the field of mixed strategies may appear even in cases where the solution of the original infinite game is possible in pure strategies, i.e. when an infinite game has a saddle point. If by reducing an infinite game to a finite one one obtains mixed solution, which includes only two adjacent “useful” strategies, then it makes sense to try to apply the intermediate pure strategy of the original infinite game.

In conclusion, we note that infinite games, unlike finite ones, may not have a solution. Let's give an example of an infinite game that has no solution. Two players each name any integer. Named larger number receives 1 ruble from another. If both call the same number, the game ends in a draw. The game obviously cannot have a solution. However, there are classes of infinite games for which a solution certainly exists.

If in a game each of the opponents uses the same strategy, then this game is said to be played in pure strategies, and the strategies of players A and B will be called pure strategies.In a zero-sum game, a pair of strategies is called equilibrium(stable) if it is unprofitable for any of the players to retreat from their strategies. It makes sense to use pure strategies if the players are aware of the opponent’s actions. If this is not the case, then the idea of ​​equilibrium is violated and the game can be played as it turns out. Strategies A1 B1 are stable with respect to information about the opponent’s behavior. A sign of the stability of a pair of strategies is the equality of the upper and lower prices of the game. And case A1 B1 will be

ν = α = β. ν > 0, then player A will win if ν< 0, то в выигрыше игрок В. Если ν = 0, в этом случае игра справедлива для обоих игроков. Не все матричные игры имеют седловые точки.

Theorem: every game with complete information has a saddle point and therefore solves in pure strategies, i.e. there is a pair of stable strategies that give a stable payoff equal to ν. If the matrix does not have a saddle point, then the cost of the game lies α<ν<β. Это означает, что первый игрок, используя максиминный принцип, обеспечит себе выигрыш не менее, чем α. А второй игрок придерживаясь минимаксного подхода обеспечит себе проигрыш не больше верхней цены игры. Игра будет оптимальна, если оба игрока будут применять смешанные стратегии.Случайная величина, значениями которой являются чистые стратегии, называется смешанной стратегией для этого игрока.

To specify a mixed strategy means to specify the probabilities with which pure strategies are used.

S A = || p 1 , p 2 …. p m || ,S B = || q1, q2…. q m || , A: ∑ pi = 1 , B: ∑ qi = 1

The game can be repeated several times, but in each game the player follows a mixed strategy, where pure strategies adhere to the probabilities p i and q j .

The mixed strategy model differs from the pure strategy model. In the case of mixed strategies, the players' tactics will be more flexible, because players know in advance what pure strategy they will use.

Let's assume that both player A and player B have a mixed strategy. It is necessary to determine A: ∑∑ a ij p i q j

For player B, the expected loss is equal to the expected gain of player A. The winnings of the first player and the average loss of the second player are equal to each other.

18.Methods for solving a finite two-person game of order m*n.

Let us assume that all elements of the payment matrix are 0≤aij. Then α≤ν≤β. According to the fundamental theorem of matrix games, any matrix game has 2 optimal mixed strategies.

S A = (p 1 , p 2 , … , p n)

S B = (p 1 , p 2 , … , p n)

We solve the game for player A, while assuming that player B uses only pure strategies. Then

a 11 p 1 + a 21 p 2 + … + a m1 p m ≥ ν: B 1

a 12 p 1 + a 22 p 2 + … + a m2 p m ≥ ν: B 2 (1)

a 1n p 1 + a 2n p 2 + … + a mn p m ≥ ν: B n

X 1 = P 1 /ν, X 2 = P 2 /ν … X m = P m /ν

a 11 X 1 … + a m1 p m ≥ 1

a 1n X 1 … + a m1 p m ≥ 1 (2)

p 1 +p 2 +…+p m =1

X 1 +X 2 +…+X m = 1/ν (3)

L(x) = X 1 +X 2 +…+X m -> min (4)

Let's define a linear programming problem.

ν = 1/(X 1 0 +X 2 0 …X m 0) (5)

P1 = X 1 0 *ν opt

p2 = X 2 0 *ν opt (6)

min L(x) = ∑x i

∑a ij: 1≤x i (7) (direct problem)

0≤x i (i=1,2..)

a 11 q 1 + a 21 q 2 + … + a m1 q m< ν: A 1

a 21 q 1 + a 22 q 2 + … + a m2 q m< ν: A 2 (8)

a m1 q 1 + a m2 q 2 + … + a mn q m< ν: A m

Y 1 = q 1 /ν, Y 2 = q 2 /ν ... Y m = q m /ν

q 1 +q 2 +…+q n =1

y 1 +y 2 +…+y n =1/ν

L(y)=∑y j -> max

∑a ij , y i ≤1 (i=1,2…) (9) (dual problem)

y 1 0 +y 2 0 …y m 0 = 1/ν opt

ν opt = 1/∑y m 0

Q1 = y 1 0 *ν opt

q2 = y 2 0 *ν opt

ν=1/∑x i = 1/∑y i = 1/min L(x) = 1/ max L(y) (11)

B 1 B 2 B 3 αi
A 1
A 2
A 3
β j

1) α = 1, β = 3

2) There are no simplifications.

L(x)=x 1 +x 2 +x 3 => min

x 1 +3x 2 +x 3 >= 1

2x 1 +x 2 +x 3 >=1

3x 1 +x 2 +x 3 >=1

x 1 =2/9, x 2 =2/9, x 3 =1/9

ν=1/(2/9+2/9+1/9)=9/5

p 1 =x 1 *ν=2/5

S A =(2/5, 2/5, 1/5)

dual problem

L(y) = y 1 +y 2 +y 3 => max

y 1 +2y 2 +3y 3 ≤ 1 y 1 =2/9

3y 1 +y 2 +y 3 ≤1 => y 2 =2/9 max L(y) = 5/9

y 1 +3y 2 +y 3 ≤1 y 3 =1/9

ν=1/(2/9+2/9+1/9)=9/5

q 1 =y 2 *ν=(2/9)*(9/5)=2/5

q 2 =(2/9)*(9/5)=2/5

q 3 =(1/9)*(9/5)=1/5

S B =(2/5, 2/5, 1/5)

The mxn problem reduces to a linear programming problem.

An approximate method for solving mxn matrix games (Brown-Robinson).

Player A and Player B take turns using pure strategies. Each player tries to increase his winnings using maximin or minimax approaches. It is not the average gain that is minimized (maximized), but the accumulated one. The theory shows that such a method will inevitably give us optimal winnings and optimal mixed strategies.



IN 1 AT 2 AT 3
A 1
A 2
A 3
3 * 8 * 9 * 36 *
3 * 4 * 12 * 13 *
7 *
1 *
3 *
4 *
6 *
9 *
10 *
12 *
34 *
Pure strategy player I is to choose one of the n rows of the payoff matrix A, and the pure strategy of player II is to choose one of the columns of the same matrix.

Optimal pure player strategies differ from mixed ones by the presence of a mandatory unit p i = 1, q i = 1. For example: P(1,0), Q(1,0). Here p 1 = 1, q 1 = 1.

Problem 1
Using the payment matrix, find optimal pure strategies using the principle of strict dominance. As an answer, write down the vectors P*, Q*.



R1

R2

R3

R4

S1

3

1

2

5

S2

2

0

0

3

S3

-3

-5

-5

-2

S4

0

-2

-2

1

Solution:

We solve all problems using a calculator Matrix game.

We assume that player I chooses his strategy in such a way as to maximize his payoff, and player II chooses his strategy in such a way as to minimize player I’s payoff.

PlayersB 1B 2B 3B 4a = min(A i)
A 13 1 2 5 1
A 22 0 0 3 0
A 3-3 -5 -5 -2 -5
A 40 -2 -2 1 -2
b = max(B i)3 1 2 5
We find the guaranteed payoff determined by the lower price of the game a = max(a i) = 1, which indicates the maximum pure strategy A 1 .
The upper price of the game b = min(b j) = 1.
The saddle point (1, 2) indicates a solution to a pair of alternatives (A1,B2). The price of the game is 1.
2. Check the payment matrix for dominant rows and dominant columns.
Sometimes, based on a simple examination of the game matrix, one can say that some pure strategies can only be included in the optimal mixed strategy with zero probability.
They say that i-th Player 1's strategy dominates him kth strategy if a ij ≥ a kj for all j E N and for at least one j a ij > a kj . In this case it is also said that i-th strategy (or line) – dominant, k-th– dominated.
They say that j-i 2nd player's strategy dominates him lth strategy if for everyone j E M a ij ≤ a il and for at least one i a ij< a il . В этом случае j-th the strategy (column) is called dominant, lth– dominated.
Strategy A 1 dominates strategy A 2 (all elements of row 1 are greater than or equal to the values ​​of the 2nd row), therefore we exclude the 2nd row of the matrix. Probability p 2 = 0.
Strategy A 1 dominates strategy A 3 (all elements of row 1 are greater than or equal to the values ​​of the 3rd row), therefore we exclude the 3rd row of the matrix. Probability p 3 = 0.
3 1 2 5
0 -2 -2 1

From the perspective of player B's losses, strategy B 1 dominates strategy B 2 (all elements of column 1 are greater than elements of column 2), therefore we exclude the 1st column of the matrix. Probability q 1 = 0.
From the perspective of player B's losses, strategy B 4 dominates strategy B 1 (all elements of column 4 are greater than elements of column 1), therefore we exclude the 4th column of the matrix. Probability q 4 = 0.
1 2
-2 -2

We've reduced the 4 x 4 game to a 2 x 2 game.



Game solution ( 2 x n


p 1 = 1
p 2 = 0
Game price, y = 1
Now we can find the minimax strategy of player B by writing the corresponding system of equations
q 1 = 1
q 1 + q 2 = 1
Solving this system, we find:
q 1 = 1.
Answer:
Game price: y = 1, player strategy vectors:
Q(1, 0), P(1, 0)

∑a ij q j ≤ v
∑a ij p i ≥ v
M(P 1 ;Q) = (1 1) + (2 0) = 1 = v
M(P 2 ;Q) = (-2 1) + (-2 0) = -2 ≤ v
M(P;Q 1) = (1 1) + (-2 0) = 1 = v
M(P;Q 2) = (2 1) + (-2 0) = 2 ≥ v

Since rows and columns were removed from the original matrix, the found probability vectors can be written as:
P(1,0,0,0)
Q(0,1,0,0)

Problem 2
Using the payment matrix, find the lower and upper price of the game. If there is a saddle point, write down the vectors of optimal pure strategies P*, Q*.



R1

R2

R3

S1

-6

-5

0

S2

-8

-3

-2

S3

-3

-2

3

Solution:
1. Check whether the payment matrix has a saddle point. If yes, then we write out the solution to the game in pure strategies.
PlayersB 1B 2B 3a = min(A i)
A 1-6 -5 0 -6
A 2-8 -3 -2 -8
A 3-3 -2 3 -3
b = max(B i)-3 -2 3

We find the guaranteed payoff determined by the lower price of the game a = max(a i) = -3, which indicates the maximum pure strategy A 3 .
Upper price of the game b = min(b j) = -3.
The saddle point (3, 1) indicates a solution to a pair of alternatives (A3,B1). The cost of the game is -3.
Answer: P(0,0,1), Q(1,0,0)

Problem 3
Using the payment matrix, find the vectors of optimal strategies P*, Q* and the game price. Which player wins?



R1

R2

R3

R4

S1

-6

-6

2

4

S2

2

-2

7

-1

Solution:
1. Check whether the payment matrix has a saddle point. If yes, then we write out the solution to the game in pure strategies.
We assume that player I chooses his strategy in such a way as to maximize his payoff, and player II chooses his strategy in such a way as to minimize player I’s payoff.
PlayersB 1B 2B 3B 4a = min(A i)
A 1-6 -6 2 4 -6
A 22 -2 7 -1 -2
b = max(B i)2 -2 7 4

We find the guaranteed payoff determined by the lower price of the game a = max(a i) = -2, which indicates the maximum pure strategy A 2 .
Upper price of the game b = min(b j) = -2.
The saddle point (2, 2) indicates a solution to a pair of alternatives (A2,B2). The cost of the game is -2.
3. Find a solution to the game in mixed strategies.
Let's solve the problem using the geometric method, which includes the following steps:
1. In the Cartesian coordinate system, a segment is plotted along the abscissa axis, the length of which is equal to 1. The left end of the segment (point x = 0) corresponds to strategy A 1, the right end to strategy A 2 (x = 1). Intermediate points x correspond to the probabilities of some mixed strategies S 1 = (p 1 ,p 2).
2. The payoffs of strategy A 1 are plotted on the left ordinate. On a line parallel to the ordinate, from point 1, the winnings of strategy A 2 are plotted.
Game solution ( 2 x n) we carry out from the position of player A, adhering to the maximin strategy. None of the players have dominant or duplicate strategies.

The maximal optimal strategy of player A corresponds to point N, for which we can write the following system of equations:
p 1 = 0
p 2 = 1
Game price, y = -2
Now we can find the minimax strategy of player B by writing the corresponding system of equations, excluding the strategy B 1,B 3,B 4, which clearly gives a greater loss to player B, and, therefore, q 1 = 0,q 3 = 0,q 4 = 0 .
-2q 2 = -2
q 2 = 1
Solving this system, we find:
q 2 = 1.
Answer:
Game price: y = -2, player strategy vectors:
Q(0, 1, 0, 0), P(0, 1)
4. Let's check the correctness of the game solution using the strategy optimality criterion.
∑a ij q j ≤ v
∑a ij p i ≥ v
M(P 1 ;Q) = (-6 0) + (-6 1) + (2 0) + (4 0) = -6 ≤ v
M(P 2 ;Q) = (2 0) + (-2 1) + (7 0) + (-1 0) = -2 = v
M(P;Q 1) = (-6 0) + (2 1) = 2 ≥ v
M(P;Q 2) = (-6 0) + (-2 1) = -2 = v
M(P;Q 3) = (2 0) + (7 1) = 7 ≥ v
M(P;Q 4) = (4 0) + (-1 1) = -1 ≥ v
All inequalities are satisfied as equalities or strict inequalities, therefore, the solution to the game is found correctly.

Problem 4
Give a detailed answer to the question

Description of the bimatrix game. All games that were reviewed belonged to the class zero sum games. However, a number of conflict situations that arise in the course of actions are characterized by the fact that the gain of one side is not exactly equal to the loss of the other. Game-theoretic models Such situations are non-cooperative non-zero-sum games. Such games are called bimatrix because the task of each such game is reduced to the task of two matrices of the same form: .

Process bimatrix game consists in the independent choice by player I of a number and by player II of a number, after which player I receives a win, and player II receives a win.

The row numbers of the matrices are called pure player strategies I, and the column numbers of these matrices are pure player strategies II. Then pairs of the form will be situations in pure strategies bimatrix game, and the numbers and are the payoffs of players I and II in the situation. Accordingly, the probability distribution of using pure strategies of player I is and player II - we'll call mixed strategies. Then pairs of the form represent situations bimatrix game V mixed strategies, and the numbers And are the mathematical expectations of winning for players I and II.

The equilibrium situation of a bimatrix game in mixed strategies we will call such a pair for which:

(8.2)
,

where is the mathematical expectation of winning player I;

Mathematical expectation of winning for player II;

Optimal mixed player strategy I;

Optimal mixed player strategy II.

Task

Construction and solution of a bimatrix game. Suppose that a country's anti-submarine submarine is searching for a country's missile submarine, which is maneuvering in a strictly defined part of the combat patrol area. The rest of the area is operated by an anti-submarine submarine, which conducts anti-submarine search operations. Let each anti-submarine boat use its own hydroacoustic station to detect the enemy either in an active mode, turning it on periodically, or only in a passive mode, performing a continuous search.

Both an anti-submarine submarine and a missile submarine with sonar detection can evade an enemy. However, the frequency of sonar activation makes detection possible, but unreliable.

In such a conflict situation, one of the players is an anti-submarine submarine, and the other is an anti-submarine submarine. Obviously, a missile submarine cannot be a player, since it only has one mode of action, which is to maneuver and perform evasive maneuvers while detecting sonar signals.

The characteristic feature here is that each of the players pursues different, but not opposing goals. Indeed, the purpose of an anti-submarine submarine is to detect a missile submarine, and the purpose of an anti-submarine submarine is to detect an anti-submarine submarine. Therefore, to assess the achievement of the goal by each player, depending on the chosen methods of action (strategies), it is necessary to have two efficiency criteria and, accordingly, two payoff functions. Then the model of such a conflict situation will be a finite game with a non-zero sum, described by two matrices of the same shape And , called bimatrix.

Let's take it as performance criterion anti-submarine submarine (player I) the probability of detecting a missile submarine, and for performance criterion anti-submarine submarine (player II) – probability of detecting an anti-submarine submarine. Then the bimatrix game will be given by a matrix (Figure 9.a) and a matrix (Figure 9.b).


Rice. 9.a.


Rice. 9.b.

Where - use of active mode;

Using passive mode.

The player's choice of one action or another is called progress. There are moves personal(the player consciously makes one decision or another) and random(the outcome of the game does not depend on the will of the player). The set of rules that determine which move a player must make is called strategy. There are strategies clean(non-random decisions of players) and mixed(strategy can be considered as a random variable).

Saddle point

IN game theory S. t. ( saddle element) is the largest element of the column matrix game, which is also the smallest element of the corresponding row (in two-person zero-sum game). At this point, therefore, the maximin of one player is equal to the minimax of the other; S. t. is a point equilibrium.

Minimax theorem

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” maximin and minimax strategies is called minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

The player chooses his actions, assuming that the opponent will act in an unfavorable manner, i.e. will try to "harm".

Loss function

Loss function– a function that in theory statistical solutions characterizes losses due to incorrect decision-making based on observed data. If the problem of estimating a signal parameter against a background of noise is being solved, then the loss function is a measure of the discrepancy between the true value of the estimated parameter and the parameter estimate

Optimal Mixed Player Strategy- this is a complete set of application of his pure strategies when the game is repeated many times under the same conditions with given probabilities.

A player's mixed strategy is a complete set of application of his pure strategies when the game is repeated many times under the same conditions with given probabilities.

1. If all elements of a row are not greater than the corresponding elements of another row, then the original row can be deleted from the payment matrix. Same for columns.

2. The price of the game is unique.

Document: Let's say there are 2 game prices v and , which are achieved on the pair and respectively, then

3. If the same number is added to all elements of the payoff matrix, then the optimal mixed strategies will not change, but the price of the game will increase by this number.

Document:
, Where

4. If all elements of the payment matrix are multiplied by the same number not equal to zero, the price of the game will be multiplied by this number, but the optimal strategies will not change.