Paired finite game with a payoff matrix. Matrix games: examples of problem solving

A player's strategy is a plan according to which he makes a choice in any possible situation and with any possible factual information. Naturally, the player makes decisions as the game progresses. However, theoretically, it can be assumed that all these decisions are made by the player in advance. Then the totality of these decisions constitutes his strategy. Depending on the number of possible strategies, games are divided into finite and infinite. The task of game theory is to develop recommendations for players, i.e., to determine for them optimal strategy. An optimal strategy is one that, when the game is repeated many times, provides a given player with the maximum possible average winnings.

The simplest type of strategic game is a two-person zero-sum game (the sum of the parties' payoffs is zero). The game consists of two moves: player A chooses one of his possible strategies Ai (i = 1, 2, m), and player B chooses strategy Bj (j = 1, 2, ., n), and each choice is made with complete ignorance another player's choice.

The goal of player A is to maximize the function φ (Ai, Bj), in turn, the goal of player B is to minimize the same function. Each player can choose one of the variables on which the value of the function depends. If player A chooses some of the strategies Ai, then this in itself cannot influence the value of the function φ (Ai, Bj).

The influence of Ai on the value of φ (Ai, Bj) is uncertain; certainty occurs only after the choice, based on the principle of minimizing φ (Ai, Bj), by another player of the variable Bj. In this case, Bj is determined by another player. Let φ (Ai, Bj)= aij. Let's create matrix A:

The rows of the matrix correspond to strategies Ai, the columns to strategies Bj. Matrix A is called the payoff or game matrix. Element aij of the matrix is ​​the payoff of player A if he chose strategy Ai, and player B chose strategy Bj.

Let player A choose some strategy Ai; then in worst case(for example, if the choice becomes known to player B), he will receive a payoff equal to min aij. Anticipating this possibility, player A must choose a strategy to maximize his minimum payoff a:

a = max min aij

The value a - the guaranteed win of player A - is called the lower price of the game. The strategy Ai0 that ensures obtaining a is called maximin.

Player B, when choosing a strategy, proceeds from the following principle: when choosing a certain strategy Bj, his loss will not exceed the maximum of the values ​​of the elements of the j-th column of the matrix, i.e. less than or equal to max aij

Considering the set max aij for different meanings j, player B, will naturally choose a value of j that minimizes his maximum loss β:

β = min miax aij

The value β is called the upper price of the game, and the strategy Bj0 corresponding to the payoff β is called minimax.

The actual winnings of player A, with reasonable actions of partners, are limited by the lower and upper prices of the game. If these expressions are equal, i.e.

Lecture 4

Topic: “ELEMENTS OF GAME THEORY”

The concept of gaming models

In practice, we often encounter problems in which it is necessary to make decisions under conditions of uncertainty, i.e. Situations arise in which two (or more) parties pursue different goals, and the results of any action of each party depend on the activities of the partner. Such situations that arise when playing chess, checkers, dominoes, etc., are considered conflict situations: the result of each player’s move depends on the opponent’s response move, the goal of the game is to win one of the partners. In economics, conflict situations occur very often and are of a diverse nature. These include, for example, relationships between supplier and consumer, buyer and seller, bank and client. In all these examples, the conflict situation is generated by the difference in the interests of the partners and the desire of each of them to make optimal decisions that realize their goals to the greatest extent. At the same time, everyone has to take into account not only their own goals, but also the goals of their partner, and take into account the decisions unknown in advance that these partners will make.

To competently solve problems with conflict situations, scientifically based methods are needed. Such methods have been developed mathematical theory conflict situations, which is called game theory .

Let's get acquainted with the basic concepts of game theory. The mathematical model of a conflict situation is called game , parties involved in the conflict - players And, and the outcome of the conflict is win . For each formalized game, rules A, those. system of conditions defining:

1) options for players’ actions;

2) the amount of information each player has about the behavior of their partners;

3) the gain that each set of actions leads to. Typically, winning (or losing) can be quantified; for example, you can value a loss as zero, a win as one, and a draw as 1/2.



The game is called steam room , if it involves two players, and multiple , if the number of players is more than two. We will only consider doubles games. They involve two players A And IN, whose interests are opposite, and by game we mean a series of actions on the part of A And IN.

The game is called zero sum game , or antagonistic th, if the gain of one of the players is equal to the loss of the other, i.e. For complete task game, it is enough to indicate the size of one of them. If we designate A - winnings of one of the players, b - the other's winnings, then for a zero-sum game b = -A, therefore it is enough to consider, for example A .

The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random. Personal move - this is a conscious choice by the player of one of the possible actions (for example, a move in a chess game). Random move - it is a randomly chosen action (for example, choosing a card from a shuffled deck). In the future, we will consider only the personal moves of the players.

Strategy A player is a set of rules that determine the choice of his action at each personal move, depending on the current situation. Usually during the game, with each personal move, the player makes a choice depending on the specific situation. However, in principle, it is possible that all decisions are made by the player in advance (in response to any given situation). This means that the player has chosen a specific strategy, which can be specified as a list of rules or a program. (This way you can play the game using a computer.) The game is called Certainly th, if each player has a finite number of strategies, and endless - otherwise.

In order to decide game, or find a solution to the game, you should choose a strategy for each player that satisfies the condition optimality And, those. one of the players must receive maximum win , when the second one sticks to his strategy. At the same time, the second player must have minimum loss , if the first one sticks to his strategy. Such strategists And are called optimal. Optimal strategies must also satisfy the condition sustainability , those. It must be disadvantageous for either player to abandon their strategy in this game.

If the game is repeated quite a few times, then players may be interested not in winning and losing in each specific game, but in average win (loss) in all batches.

The goal of game theory is to determine the optimal strategy for each player. When choosing an optimal strategy, it is natural to assume that both players behave reasonably in terms of their interests. The most important limitation of game theory is the uniqueness of payoff as an indicator of efficiency, while in most real economic problems there is more than one indicator of efficiency. In addition, in economics, as a rule, problems arise in which the interests of partners are not necessarily antagonistic. The development of game theory apparatus for solving problems with many participants having consistent interests is beyond the scope of the lecture.

Payment matrix.

Lower and upper price of the game

Consider a paired finite game. Let the player A has T personal strategies, which we denote by . Let the player IN available P personal strategies, let's denote them . They say the game has dimensions t x p . As a result of the players’ choice of any pair of strategies, the outcome of the game is uniquely determined, i.e. player's winnings A (positive or negative) and player loss IN . Assume that the values ​​are known for any pair of strategies . Matrix Р=(), i = 1,2, ..., t; j = 1, 2, ..., n , elements of which are winnings corresponding to strategies and , called payment matrix or matrix of the game . General form such a matrix is ​​presented in the table:

The rows of this table correspond to the player's strategies A , and the columns - the player's strategies IN .

Example.Create a payoff matrix for the next game. Search game.

Player A can hide in one of two shelters (I and II); player IN looking for a player A , and if he finds it, he receives a fine of 1 den. units from A , otherwise pays the player A 1 day units It is necessary to build the payment matrix of the game.

Solution. To compile a payment matrix, you should analyze the behavior of each player. Player A can hide in shelter I - let's denote this strategy by or in shelter II - strategy.

Player IN can look for the first player in the shelter I - strategy , or to the shelter II - strategy. If the player A is located in Vault I and is discovered by the player there IN , those. a pair of strategies is implemented then the player A pays a fine, i.e. . Similarly we get . Obviously, strategies give the player A win 1, therefore . So for the game" search"of size 2x2 we get a payment matrix

Consider the game t p with matrix, i =1, 2, ..., T ; j =1, 2, ..., P and determine the best among strategies . Player choosing strategy A must expect that the player IN will answer it using one of the strategies , for which the payoff for the player A minimal (player IN seeks to "harm" the player A ).

Let us denote by , the player’s smallest payoff A when choosing a strategy , for all possible player strategies IN (smallest number in i -and the line of the payment matrix), i.e.

, j =1,…n . (1)

Among all the numbers Let's choose the largest: . Let's call the lowest price of the game, or maximum winnings(maximin). This guaranteed winnings for the player A for any player strategy IN. Hence,

. (2)

The strategy corresponding to maximin is called maximin strategy . Player IN interested in reducing the player's winnings A ; choosing a strategy , it takes into account the maximum possible gain for A . Let's denote

Among all the numbers, choose the smallest one and call it top price of the game or minimax win (minimax). This guaranteed player loss IN. Therefore,

(4)

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called the minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

Example. Determine the lower and upper prices of the game and the corresponding strategies in the game " Search" Consider the payment matrix:

When choosing a strategy (first row of the matrix), the minimum payoff is and corresponds to the player's strategy IN . When choosing a strategy (second row of the matrix), the minimum payoff is equal to , it is achieved with the strategy.

Guaranteeing yourself the maximum win for any player strategy IN, those. lower price of the game, player A can choose any strategy: or , those. any of his strategies is maximin.

When choosing a strategy (column 1), the player IN understands that the player A will respond with a strategy to maximize his gain (loss IN ). Therefore, the player's maximum loss is IN when choosing a strategy it is equal to max (-l; 1) = 1.

Similarly, the maximum loss of a player IN (winning A ) when he chooses a strategy (column 2) is equal to .

Thus, for any player strategy A guaranteed minimum player loss IN equal to the top price of the game.

Any player strategy IN is minimax. By adding a row and a column to the table, we get a new table:

-1 -1
-1 -1
1

At the intersection of the additional rows and columns we will write down the upper and lower prices of the games.

The game is called zero sum game, or antagonistic, if the gain of one of the players is equal to the loss of the other, i.e. To complete the game task, it is enough to indicate the value of one of them. If we designate a- winnings of one of the players, b- the other's winnings, then for a zero-sum game b = - a, therefore it is enough to consider, for example, a.

The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random.

Personal move- this is a conscious choice by the player of one of the possible actions (for example, a move to chess game).

Random move is a randomly selected action (for example, choosing a card from a shuffled deck). In my work I will consider only the personal moves of the players.

Strategy A player is a set of rules that determine the choice of his action at each personal move, depending on the current situation. Usually during the game, with each personal move, the player makes a choice depending on the specific situation. However, in principle, it is possible that all decisions are made by the player in advance (in response to any given situation). This means that the player has chosen a specific strategy, which can be specified as a list of rules or a program. (This way you can play the game using a computer.) The game is called ultimate, if each player has a finite number of strategies, and endless- otherwise.

In order to solve the game, or find a solution to the game, you should choose a strategy for each player that satisfies the condition optimality, i.e. one of the players must receive maximum win when the second one sticks to his strategy. At the same time, the second player must have minimum loss, if the first one sticks to his strategy. Such strategies are called optimal. Optimal strategies must also satisfy stability condition, i.e. It must be disadvantageous for either player to abandon their strategy in this game.

Purpose of Game Theory: Determining the optimal strategy for each player. When choosing an optimal strategy, it is natural to assume that both players behave reasonably in terms of their interests.

Antagonistic games in which each player has a finite set of strategies are called matrix games. This name is explained by the following possibility of describing games of this kind. We draw up a rectangular table in which the rows correspond to the strategies of the first player, the columns to the strategies of the second, and the table cells located at the intersection of the rows and columns correspond to the situations of the game. If we put the winnings of the first player in the corresponding situation in each cell, we obtain a description of the game in the form of a certain matrix. This matrix is ​​called matrix of the game or payoff matrix.

The same finite zero-sum game can be described by different matrices, differing from each other only in the order of rows and columns.

Consider the game m x n with matrix P = (a ij), i = 1,2, ... , m;j = 1,2, ... , n and determine the best among strategies A 1, A 2, …, A m. Choosing a strategy A i player A must expect that the player IN will answer it using one of the strategies Bj, for which the payoff for the player A minimal (player IN seeks to "harm" the player A). Let us denote by a i, the player's smallest winnings A when choosing a strategy A i for all possible player strategies IN(smallest number in i-th line of the payment matrix), i.e.

a i = a ij , j = 1,..., n.

Among all the numbers a i (i = 1,2, ... , m ) choose the largest. Let's call a lower price of the game or maximum winnings (maximin). This is a guaranteed win for the player A for any player strategy IN. Hence, , i = 1,... , m; j = 1,..., n

The strategy corresponding to maximin is called maximum strategy. Player IN interested in reducing the player's winnings A; choosing a strategy Bj, it takes into account the maximum possible gain for A.

Let's denote: β i = a ij , i = 1,... , m

Among all the numbers Bj choose the smallest one and call it β top price of the game or minimax win (minimax). This is a guaranteed loss for the player IN.

Hence, i = 1,... , m; j = 1,..., n.

The strategy corresponding to minimax is called minimax strategy.

The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called the minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.

We'll find the best strategy player A, for which we analyze all his strategies sequentially. Choosing a strategy A i, we must count on the player B will answer it with such a strategy Bj, for which the gain A will be minimal. Therefore, among the numbers in the first line, we select the minimum one, denote it, and write it in the additional column. Similarly for each strategy A i choose, i.e. α i– minimum winnings when applying the strategy A i.
In example 1:
α 1= min (0, –1, –2) = –2;
α 2= min (1, 0, –1) = –1;
α 3= min (0, –1, –2) = 0.
We will write these numbers in the additional column. What strategy should the player choose? A? Of course, the strategy for which α i maximum. Let's denote . This is a guaranteed win that a player can secure for himself A, i.e. ; this winning is called lower price of the game or maximin . Strategy A i, providing lower price games called maximin(reinsurance). If the player A will adhere to this strategy, then he is guaranteed to win ≥ α for any behavior of the player B.
In example 1. This means that if A will write “3”, then at least he won’t lose. Player B interested in reducing winnings A. Choosing a strategy B 1, for reasons of caution, he takes into account the maximum possible gain A. Let's denote . Similarly when choosing a strategy Bj maximum possible gain A–; Let's write these numbers in an additional line. To reduce your winnings A, it is necessary to choose the smallest from the numbers β j. The number is called top price of the game or minimax . This is a guaranteed loss for the player B(i.e. he will lose no more than β). Player strategy B, providing a payoff ≥ - β is called its minimax strategy.
In example 1:
;
;
;
.
This means that the optimal strategy B– write “3”, then at least he won’t lose.

BA B 1 B 2 B 3
A 1 0 – 1 –2 –2
A 2 1 0 –1 –1
A 3 2 1 0 0
2 1 0 0 0
The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called minimax principle . This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent.
It can be proven that , i.e. .
In example 1 α = β. If those. minimax coincides with maximin, then this game is called saddle point game. Saddle point is a pair of optimal strategies ( Ai,Bj). In example 1, the game has a saddle point ( A 3, B 3). In this case, the number α = β is called (pure) at the cost of the game(the lower and upper price of the game are the same). This means that the matrix contains an element that is the minimum in its row and at the same time the maximum in its column. In example 1, this is element 0. The price of the game is 0.
Optimal strategies in any game have an important property, namely: stability . This means that each player is not interested in deviating from his optimal strategy, since this is unprofitable for him. Deviation from the player's optimal strategy A leads to a decrease in his winnings, and the player’s one-sided deviation IN- to increase losses. They say that the saddle point gives equilibrium position .
Example 2. First side (player A) selects one of three types of weapons – A 1, A 2, A 3, and the opponent (player IN) - one of three types of aircraft: IN 1, AT 2, AT 3. Target IN– breakthrough of the defense front, goal A- aircraft destruction. Probability of aircraft being hit IN 1 weapons A 1 equal to 0.5, aircraft AT 2 weapons A 1 equal to 0.6, aircraft AT 3 weapons A 1 equals 0.8, etc., i.e. element a ij payment matrix – probability of aircraft being hit INj weapons Ai. The payoff matrix has the form: Solve the game, i.e. find the lower and upper price of the game and optimal strategies.
Solution. In each line we find the minimum element and write it in the additional column. In each column we find the maximum element and write it in the additional line.
IN A IN 1 AT 2 AT 3 α i
A 1 0,5 0,6 0,8 0,5
A 2 0,9 0,7 0,8 0,7
A 3 0,7 0,5 0,6 0,5
β j 0,9 0,7 0,8 0,7 0,7

In the additional column we find the maximum element = 0.7, in the additional line we find the minimum element = 0.7.
Answer: = 0.7. Optimal strategies – A 2 And AT 2.

Example 3. A game of toss. Each player can choose one of two strategies during his move: heads or tails. If the selected strategies coincide A receives a win +1, if there is a mismatch B receives payoff 1 (i.e. A receives a payoff –1). Payment matrix:
Find the lower and upper price of the game. Does the game have a saddle point?

Solution.

IN A IN 1 AT 2
A 1 1 -1 -1
A 2 -1 1 1
1 1 -1 1

α = -1, β = 1, i.e. A will lose no more than 1, and B will lose no more than 1. Since α ≠ β, the game has no saddle point. There is no equilibrium position in this game, and an optimal solution cannot be found in pure strategies.

Example. Find the lower price of the game, the upper price of the game, determine the saddle points, optimal pure strategies and the price of the game (if they exist).

Game theory is a mathematical discipline whose subject of study is decision-making methods in conflict situations.

The situation is called conflict, if the interests of several (usually two) persons pursuing opposing goals collide. Each party may undertake a series of activities to achieve its goals, with the success of one party meaning the failure of the other.

In economics, conflict situations occur very often (relationships between supplier and consumer, buyer and seller, banker and client). Conflict situations occur in many other areas.

A conflict situation is generated by the difference in interests of partners and the desire of each of them to make optimal decisions that realize their goals to the greatest extent. At the same time, everyone has to take into account not only their own goals, but also the goals of their partner, and take into account the decisions unknown in advance that the partners will make.

Typically, conflict situations are difficult to analyze directly due to the many secondary factors involved. In order to make mathematical analysis of a conflict situation possible, it is necessary to simplify it, taking into account only the main factors. A simplified formalized model of a conflict situation is called game, the parties involved in the conflict - players, and the outcome of the conflict is win. Typically, winning (or losing) can be quantified; for example, you can value a loss as zero, a win as one, and a draw as 1/2.

The game is a collection rules, describing the behavior of players. Each instance of playing a game in some specific way from beginning to end represents game game. The choice and implementation of one of the actions provided for by the rules is called progress player. Moves can be personal and random. Personal move- this is a conscious choice by the player of one of the possible actions (for example, a move in a chess game). Random move- this is also a choice of one of many options, but here the option is selected not by the player, but by some mechanism random selection(tossing coins, choosing a card from a shuffled deck).

Strategy A player is called a set of rules that determine the choice of his actions at each personal move, depending on the current situation.



If the game consists only of personal moves, then the outcome of the game is determined if each player chooses his own strategy. However, if the game has random moves, then the game will be probabilistic in nature and the choice of players’ strategies will not yet ultimately determine the outcome of the game.

In order to decide game, or find a solution to the game, you should choose a strategy for each player that satisfies the condition optimality, those. one of the players must receive maximum win, when the second one sticks to his strategy. At the same time, the second player must have minimum loss, if the first one sticks to his strategy. Such strategies are called optimal. Optimal strategies must satisfy the stability condition, i.e. It must be disadvantageous for either player to abandon their strategy in this game.

The goal of game theory is to determine the optimal strategy for each player.

Consider a paired finite game. Let the player A has m personal strategies, which we denote A 1 , A 2 , ..., Am . Let the player IN available n personal strategies, let's designate them B 1 , B 2 , ..., B m . They say the game has dimensions m×n . As a result of the players choosing any pair of strategies



A i and B j (i = 1, 2, ..., m; j = 1, 2, ..., n)

The outcome of the game is clearly determined, i.e. winnings a ij player A (positive or negative) and loss ( - a ij ) player IN . Let's assume that the values OU known for any pair of strategies (A i ,B j ). Matrix , the elements of which are the winnings corresponding to the strategies A i And Bj , called payment matrix or matrix of the game. The general appearance of such a matrix is ​​presented in Table 3.1.

Table 3.1

The rows of this table correspond to the player's strategies A , and the columns are the player’s strategies IN . Let's create a payment matrix for the next game.

Consider the game m×n with matrix P = (a ij), i = 1, 2, ..., m; j = 1, 2, ..., n and determine the best among strategies A 1 , A 2 , ..., Am . Choosing a strategy A i player A must expect that the player IN will answer it using one of the strategies Bj , for which the payoff for the player A minimal (player IN seeks to "harm" the player A ). Let us denote by α i , the player's smallest winnings A when choosing a strategy A i for all possible player strategies IN (smallest number in i th row of the payment matrix), i.e.

The strategy corresponding to maximin is called maximin strategy. Player IN interested in reducing the player's winnings A ; choosing a strategy Bj , it takes into account the maximum possible gain for A . Let's denote

A strategy corresponding to minimax is called a minimax strategy. The principle that dictates players to choose the most “cautious” minimax and maximin strategies is called minimax principle. This principle follows from the reasonable assumption that each player strives to achieve a goal opposite to that of his opponent. Let us determine the lower and upper prices of the game and the corresponding strategies in the problem.

If the upper and lower prices of the game coincide, then general meaning upper and lower game prices α = β = v called net price games , or at the cost of the game . Minimax strategies corresponding to the price of the game are optimal strategies, and their totality is optimal solution , or game solution. In this case the player A receives the maximum guaranteed (independent of the player’s behavior) IN ) winnings v , and the player IN achieves the minimum guaranteed (regardless of the player’s behavior A ) loss v . They say that the solution to the game has stability , i.e. If one player sticks to his optimal strategy, then it cannot be profitable for the other to deviate from his optimal strategy.

A couple of pure strategies A i And Bj gives an optimal solution to the game if and only if the corresponding element a ij , is both the largest in its column and the smallest in its row. This situation, if it exists, is called saddle point (similar to the surface of a saddle, which curves up in one direction and down in the other).

Basic concepts of the inventory management model.

In both business and manufacturing, it is common practice to maintain a reasonable inventory of material resources or components to ensure continuity of the production process. Traditionally, inventory is viewed as an unavoidable cost, with inventory levels that are too low leading to costly production shutdowns, and inventory levels that are too high leading to the “death” of capital. The goal of inventory management is to determine the level of inventory that balances the two extreme cases mentioned.

Let's consider the main characteristics of inventory management models.

Demand. The demand for the stocked product can be deterministic(in the simplest case - constant in time) or random. Demand randomness is described by either a random moment of demand or a random amount of demand at deterministic or random times.

Warehouse replenishment. Replenishment of the warehouse can be carried out either periodically at certain intervals, or as stocks are exhausted, i.e. reducing them to a certain level.

Order quantity. With periodic replenishment and occasional depletion of stocks, the order quantity may depend on the condition observed at the time the order is placed. An order is usually placed for the same amount when the stock reaches a given level - the so-called order points.

Delivery time. Idealized inventory management models assume that ordered replenishment is delivered to the warehouse instantly. Other models consider delays in deliveries for a fixed or random period of time.

Delivery cost. As a rule, it is assumed that the cost of each delivery consists of two components - one-time costs that do not depend on the volume of the ordered batch, and costs that depend (most often linearly) on the volume of the batch.

Storage costs. In most inventory management models, the warehouse volume is considered to be practically unlimited, and the volume of stored inventory serves as the controlling variable. It is assumed that a certain fee is charged for storing each unit of stock per unit of time.

Deficiency penalty. Any warehouse is created in order to prevent shortages of a certain type of product in the serviced system. Lack of stock at the right time leads to losses associated with equipment downtime, irregular production, etc. These losses are called penalty for deficit.

Stock nomenclature. In the simplest cases, it is assumed that the warehouse stores a stock of similar products or a homogeneous product. In more complex cases it is considered multi-item stock.

Structure of the warehouse system. The most fully developed mathematical models of a single warehouse. However, in practice there are also more complex structures: hierarchical warehouse systems with different replenishment periods and order delivery times, with the possibility of exchanging stocks between warehouses of the same hierarchy level, etc.

The criterion for the effectiveness of the adopted inventory management strategy is cost function (costs), representing the total costs of supplying the stocked product, its storage and the cost of fines.

Inventory management consists of finding a strategy for replenishment and consumption of inventories in which the cost function takes on a minimum value.

Let the functions , and express respectively:

Restocking,

Inventory consumption

Demand for the product being stocked

for a period of time.

Inventory management models usually use derivatives of these functions with respect to time, , , called, respectively,