Bayes formula in simple terms. Total probability formula, Bayes formula. "Physical meaning" and terminology

Let's start with an example. An urn in front of you contains, with equal probability, (1) two white balls, (2) one white and one black ball, or (3) two black balls. You draw a ball and it turns out to be white. How should you now rate the probabilities of these three options (hypotheses)? Obviously, the probability of hypothesis (3), two black balls, is 0. But how do you calculate the probabilities of the two remaining hypotheses? Bayes' formula lets you do this. In our case it takes the following form (the number of each formula corresponds to the number of the hypothesis being tested):

(1) P(x1|y1) = P(x1) * P(y1|x1) / P(y1)
(2) P(x2|y1) = P(x2) * P(y1|x2) / P(y1)
(3) P(x3|y1) = P(x3) * P(y1|x3) / P(y1)


Here:
X is a random variable (the hypothesis) taking the values x1 (two white balls), x2 (one white, one black) and x3 (two black balls);
Y is a random variable (the event) taking the values y1 (a white ball is drawn) and y2 (a black ball is drawn);
P(x1) is the probability of the first hypothesis before the ball is drawn (the a priori probability, or probability before the experiment) = 1/3;
P(x2) is the probability of the second hypothesis before the ball is drawn = 1/3;
P(x3) is the probability of the third hypothesis before the ball is drawn = 1/3;
P(y1|x1) is the conditional probability of drawing a white ball if the first hypothesis is true (both balls are white) = 1;
P(y1|x2) is the probability of drawing a white ball if the second hypothesis is true (one ball is white, the other black) = 1/2;
P(y1|x3) is the probability of drawing a white ball if the third hypothesis is true (both black) = 0;
P(y1) is the probability of drawing a white ball = 1/2;
P(y2) is the probability of drawing a black ball = 1/2;
and finally, what we are looking for: P(x1|y1), the probability that the first hypothesis is true (both balls are white) given that we drew a white ball (the a posteriori probability, or probability after the experiment), and P(x2|y1), the probability that the second hypothesis is true (one ball is white, the other black) given that we drew a white ball.

The probability that the first hypothesis (two white balls) is true, given that we drew a white ball:

(4) P(x1|y1) = (1/3 * 1) / (1/2) = 2/3

The probability that the second hypothesis (one white, one black) is true, given that we drew a white ball:

(5) P(x2|y1) = (1/3 * 1/2) / (1/2) = 1/3

The probability that the third hypothesis (two black balls) is true, given that we drew a white ball:

(6) P(x3|y1) = (1/3 * 0) / (1/2) = 0

What does Bayes' formula do? Given the a priori probabilities of the hypotheses, P(x1), P(x2), P(x3), the conditional probabilities of the observed event under each hypothesis, and the probabilities of the events, P(y1), P(y2), it lets us calculate the posterior probabilities of the hypotheses, for example the probability of the first hypothesis given that a white ball is drawn, P(x1|y1).

Let's go back to formula (1). The initial probability of the first hypothesis was P(x1) = 1/3. With probability P(y1) = 1/2 we could draw a white ball, and with probability P(y2) = 1/2 a black one. We drew a white one. The probability of drawing white, provided that the first hypothesis is true, is P(y1|x1) = 1. Bayes' formula says that since white was drawn, the probability of the first hypothesis has increased to 2/3, the probability of the second hypothesis is still 1/3, and the probability of the third hypothesis has become zero.

It is easy to check that if we had drawn a black ball, the posterior probabilities would have changed symmetrically: P(x1|y2) = 0, P(x2|y2) = 1/3, P(x3|y2) = 2/3.
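The check for both outcomes can be done in a few lines of Python. This is only a minimal sketch: the dictionary keys and the helper name `posteriors` are my own illustrative choices, and exact fractions are used so that the results can be compared with the values above.

```python
from fractions import Fraction

# Prior probabilities of the three hypotheses about the urn's contents.
priors = {
    "two white": Fraction(1, 3),
    "white and black": Fraction(1, 3),
    "two black": Fraction(1, 3),
}

# P(white ball | hypothesis); P(black ball | hypothesis) is the complement.
p_white = {
    "two white": Fraction(1),
    "white and black": Fraction(1, 2),
    "two black": Fraction(0),
}
p_black = {h: 1 - p for h, p in p_white.items()}


def posteriors(priors, likelihoods):
    """Apply Bayes' formula to every hypothesis at once."""
    total = sum(priors[h] * likelihoods[h] for h in priors)  # P(y)
    return {h: priors[h] * likelihoods[h] / total for h in priors}


after_white = posteriors(priors, p_white)
after_black = posteriors(priors, p_black)

for h in priors:
    print(f"{h}: after white = {after_white[h]}, after black = {after_black[h]}")
```

The same helper works for any finite set of mutually exclusive hypotheses, as long as the priors sum to one.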

Here is what Pierre Simon Laplace wrote about the Bayes formula in a paper published in 1814:

This is the basic principle of the branch of chance analysis that deals with transitions from events to causes.

Why is Bayes' formula so hard to understand? In my opinion, because our usual approach is to reason from causes to effects. For example: there are 36 balls in an urn, 6 of which are black and the rest white; what is the probability of drawing a white ball? Bayes' formula lets you go the other way, from events to causes (hypotheses). If we had three hypotheses and an event occurred, how exactly did this event (and not the alternative) affect the initial probabilities of the hypotheses? How have these probabilities changed?

I believe that Bayes' formula is not just about probabilities. It changes the paradigm of perception. What is the train of thought in the deterministic paradigm? If an event occurred, what was its cause? If there was an accident, an emergency, a military conflict, who or what was at fault? How does a Bayesian observer think? What is the structure of reality that led, in this particular case, to this particular manifestation? A Bayesian understands that under other circumstances the result could have been different.

Let's arrange the symbols in formulas (1) and (2) a little differently:

(7) P(x1|y1) = P(x1) * [P(y1|x1) / P(y1)]
(8) P(x2|y1) = P(x2) * [P(y1|x2) / P(y1)]

Let's talk again about what we see. With equal initial (a priori) probability, any one of the three hypotheses could be true. With equal probability, we could draw a white or a black ball. We drew a white one. In light of this new information, our evaluation of the hypotheses should be revised. Bayes' formula lets us do this numerically. The a priori probability of the first hypothesis (formula 7) was P(x1); a white ball is drawn, and the posterior probability of the first hypothesis becomes P(x1|y1). These probabilities differ by the factor P(y1|x1) / P(y1).

The event y1 is called evidence that more or less confirms or refutes the hypothesis x1. The factor is sometimes referred to as the power of evidence. The more powerful the evidence (the more the factor differs from unity), the more the observation of y1 changes the prior probability, and the more the posterior probability differs from the prior. If the evidence is weak (factor ~ 1), the posterior is nearly equal to the prior.

The evidence y1 changed the prior probability of hypothesis x1 by a factor of 2 (formula 4). At the same time, the evidence y1 did not change the probability of hypothesis x2, since its power = 1 (formula 5).

In general, Bayes' formula has the following form:

P(xi|yj) = P(xi) * P(yj|xi) / [P(x1) * P(yj|x1) + P(x2) * P(yj|x2) + … + P(xn) * P(yj|xn)]

X is a random variable (a set of mutually exclusive hypotheses) that takes the values x1, x2, …, xn. Y is a random variable (a set of mutually exclusive events) that takes the values y1, y2, …, yn. Bayes' formula lets you find the posterior probability of hypothesis xi when event yj occurs. The numerator is the product of the a priori probability of hypothesis xi, P(xi), and the probability of event yj occurring if hypothesis xi is true, P(yj|xi). The denominator is the sum of the same products as in the numerator, taken over all hypotheses. If we calculate the denominator, we get the total probability of event yj occurring (whichever of the hypotheses is true), P(yj) (as in formulas 1–3).

Once again about the evidence. Event yj provides additional information that lets you revise the prior probability of hypothesis xi. The power of evidence, P(yj|xi) / P(yj), contains in the numerator the probability of event yj occurring if hypothesis xi is true. The denominator is the total probability of event yj occurring (that is, the probability of yj averaged over all hypotheses). If the probability of yj is higher under hypothesis xi than the average over all hypotheses, then the evidence plays into the hands of hypothesis xi, increasing its posterior probability P(xi|yj). If the probability of yj is lower under hypothesis xi than the average over all hypotheses, then the evidence lowers the posterior probability P(xi|yj). If the probability of yj under hypothesis xi is the same as the average over all hypotheses, then the evidence leaves the probability of xi unchanged: the posterior P(xi|yj) equals the prior P(xi).
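To make the three cases concrete, here is a small sketch that computes the power of evidence for each hypothesis of the urn example after a white ball is drawn. The function name `evidence_power` is a hypothetical helper of mine, not established terminology.

```python
from fractions import Fraction


def evidence_power(priors, likelihoods):
    """Return P(y|x_i) / P(y) for every hypothesis x_i."""
    # Total probability of the event: average of the likelihoods over the priors.
    p_y = sum(p * lk for p, lk in zip(priors, likelihoods))
    return [lk / p_y for lk in likelihoods]


priors = [Fraction(1, 3)] * 3                              # P(x1), P(x2), P(x3)
likelihoods = [Fraction(1), Fraction(1, 2), Fraction(0)]   # P(white | x_i)

powers = evidence_power(priors, likelihoods)
posteriors = [p * k for p, k in zip(priors, powers)]       # prior times power

print([str(k) for k in powers])      # ['2', '1', '0']
print([str(q) for q in posteriors])  # ['2/3', '1/3', '0']
```

A power above 1 raises the hypothesis, a power of exactly 1 leaves it unchanged, and a power below 1 lowers it, which matches the verbal description above.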

Here are a few examples that I hope will solidify your understanding of the Bayes formula.

Problem 2. Two shooters independently shoot at the same target, each firing one shot. The probability of hitting the target is 0.8 for the first shooter and 0.4 for the second. After the shooting, one hole was found in the target. Find the probability that this hole belongs to the first shooter.

Problem 3. The object being monitored can be in one of two states: H1 = (functioning) and H2 = (not functioning). The a priori probabilities of these states are P(H1) = 0.7, P(H2) = 0.3. There are two sources of information that give conflicting reports about the state of the object: the first source reports that the object is not functioning, the second that it is functioning. It is known that the first source gives correct information with probability 0.9 and erroneous information with probability 0.1. The second source is less reliable: it gives correct information with probability 0.7 and erroneous information with probability 0.3. Find the posterior probabilities of the hypotheses.
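One possible way to work through the problem with the two conflicting sources is sketched below. It rests on an assumption that the problem statement does not spell out: that the two sources make their errors independently of each other, so the probability of the combined report is the product of the individual probabilities. The variable names are illustrative.

```python
# Prior probabilities of the two states of the object.
prior = {"functioning": 0.7, "not functioning": 0.3}

# Observed: source 1 says "not functioning", source 2 says "functioning".
# Assumption: the sources err independently, so the probabilities multiply.
likelihood = {
    "functioning": 0.1 * 0.7,      # source 1 is wrong, source 2 is right
    "not functioning": 0.9 * 0.3,  # source 1 is right, source 2 is wrong
}

total = sum(prior[h] * likelihood[h] for h in prior)  # total probability of the reports
posterior = {h: prior[h] * likelihood[h] / total for h in prior}

for h, p in posterior.items():
    print(f"P({h} | reports) = {p:.3f}")
```

Under this assumption the more reliable first source outweighs the prior, and "not functioning" becomes the more probable state.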

Problems 1–3 are taken from the textbook by E.S. Ventzel and L.A. Ovcharov, Probability Theory and Its Engineering Applications, section 2.6, Theorem of hypotheses (Bayes' formula).

Problem 4 is taken from the book, section 4.3 Bayes' Theorem.

Who is Bayes? And what does he have to do with management? That would be a fair question. For now, take my word for it: this is very important, and interesting (at least for me).

What paradigm do most managers operate in? "If I observe something, what conclusions can I draw from it?" What does Bayes teach? "What must actually be the case for me to observe this something?" This is how all sciences develop, and he writes about this (I quote from memory): a person who does not have a theory in his head will lurch from one idea to another under the influence of various events (observations). It is not for nothing that they say there is nothing more practical than a good theory.

An example from practice. My subordinate makes a mistake, and my colleague (the head of another department) says that managerial action should be taken against the negligent employee (in other words, he should be punished or scolded). But I know that this employee performs 4-5 thousand operations of the same type per month and makes no more than 10 mistakes in that time. Feel the difference in paradigm? My colleague reacts to the observation, while I have a priori knowledge that the employee makes a certain number of mistakes, so one more did not change this knowledge. Now, if at the end of the month it turns out that there were, say, 15 such errors, that would be a reason to investigate the causes of the deviation from the standard.

Convinced of the importance of the Bayesian approach? Intrigued? I hope so. And now a fly in the ointment. Unfortunately, Bayesian ideas are rarely grasped on the first try. I was frankly unlucky, as I first met these ideas in popular literature, which left me with many questions. When planning this note, I collected everything I had previously written down about Bayes and also studied what is written on the Internet. I present my best attempt on the subject: an introduction to Bayesian probability.

Derivation of Bayes' theorem

Consider the following experiment: we name any number lying on the segment [0, 1] and record when this number falls, for example, between 0.1 and 0.4 (Fig. 1a). The probability of this event is equal to the ratio of the length of the segment [0.1, 0.4] to the total length of the segment [0, 1], provided that all numbers on the segment are equally likely. Mathematically, this can be written as p(0.1 <= x <= 0.4) = 0.3, or briefly p(X) = 0.3, where p is the probability, x is a random variable in the range [0, 1], and X is the event that x falls in the range [0.1, 0.4]. That is, the probability of hitting the segment is 30%.

Fig. 1. Graphical interpretation of probabilities

Now consider the square [0, 1] x [0, 1] (Fig. 1b). Let's say we have to name pairs of numbers (x, y), each of which is greater than zero and less than one. The probability that x (the first number) lies within the segment [0.1, 0.4] (blue area 1) is equal to the ratio of the area of the blue area to the area of the entire square, that is, (0.4 - 0.1) * (1 - 0) / (1 * 1) = 0.3, the same 30%. The probability that y lies within the segment [0.5, 0.7] (green area 2) is equal to the ratio of the area of the green area to the area of the entire square: p(0.5 <= y <= 0.7) = 0.2, or briefly p(Y) = 0.2.

What can be learned about the values of x and y at the same time? For example, what is the probability that both x and y are in their respective segments? To find it, calculate the ratio of the area of area 3 (the intersection of the green and blue stripes) to the area of the entire square: p(X, Y) = (0.4 - 0.1) * (0.7 - 0.5) / (1 * 1) = 0.06.

Now suppose we want to know the probability that y is in the interval [0.5, 0.7] if x is already in the range [0.1, 0.4]. In effect we have a filter: when we name pairs (x, y), we immediately discard those pairs in which x is outside its interval, and then among the filtered pairs we count those for which y satisfies our condition. The probability is the ratio of the number of pairs for which y lies in its segment to the total number of filtered pairs (that is, those for which x lies in [0.1, 0.4]). We can write this probability as p(Y|X), which reads "the probability of Y given that x fell in its range". Obviously, this probability is equal to the ratio of the area of area 3 to the area of blue area 1. The area of area 3 is (0.4 - 0.1) * (0.7 - 0.5) = 0.06, and the area of blue area 1 is (0.4 - 0.1) * (1 - 0) = 0.3, so their ratio is 0.06 / 0.3 = 0.2. In other words, the probability of finding y in the segment [0.5, 0.7], given that x belongs to the segment [0.1, 0.4], is p(Y|X) = 0.2.

In the previous paragraph, we actually formulated the identity p(Y|X) = p(X, Y) / p(X). It reads: "the probability that y falls in its range, given that x fell in its range, equals the ratio of the probability that x and y both fall in their ranges to the probability that x falls in its range."

By analogy, consider the probability p(X|Y). We name pairs (x, y) and keep those for which y lies between 0.5 and 0.7. Then the probability that x is in the segment [0.1, 0.4], given that y belongs to the segment [0.5, 0.7], is equal to the ratio of the area of area 3 to the area of green area 2: p(X|Y) = p(X, Y) / p(Y).

Note that the probabilities p(X, Y) and p(Y, X) are equal, and both are equal to the ratio of the area of area 3 to the area of the entire square, but the probabilities p(Y|X) and p(X|Y) are not equal: p(Y|X) is the ratio of area 3 to area 1, while p(X|Y) is the ratio of area 3 to area 2. Note also that p(X, Y) is often written as p(X&Y).
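The "filter" description of conditional probability can be tried out directly with a simulation. The following is a rough Monte Carlo sketch, not an exact calculation: the sample size and the fixed seed are arbitrary choices of mine, and the estimates will only be close to the area ratios 0.06 / 0.3 = 0.2 and 0.06 / 0.2 = 0.3.

```python
import random

random.seed(0)   # fixed seed so that the run is repeatable
N = 200_000      # number of random pairs (x, y); arbitrary choice

in_x = in_y = in_both = 0
for _ in range(N):
    x, y = random.random(), random.random()  # uniform point in the unit square
    hit_x = 0.1 <= x <= 0.4                  # blue stripe (area 1)
    hit_y = 0.5 <= y <= 0.7                  # green stripe (area 2)
    in_x += hit_x
    in_y += hit_y
    in_both += hit_x and hit_y               # intersection (area 3)

p_joint = in_both / N           # estimate of p(X, Y)
p_y_given_x = in_both / in_x    # keep only pairs with x in range, then count y
p_x_given_y = in_both / in_y    # keep only pairs with y in range, then count x

print(f"p(X, Y) ~ {p_joint:.3f}")
print(f"p(Y|X)  ~ {p_y_given_x:.3f}")
print(f"p(X|Y)  ~ {p_x_given_y:.3f}")
```

The two conditional estimates come out different even though they share the same numerator, which is exactly the asymmetry noted above.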

So we have two definitions: p(Y|X) = p(X, Y) / p(X) and p(X|Y) = p(X, Y) / p(Y)

Let's rewrite these equalities as: p(X, Y) = p(Y|X) * p(X) and p(X, Y) = p(X|Y) * p(Y)

Since the left sides are equal, so are the right ones: p(Y|X) * p(X) = p(X|Y) * p(Y)

Or we can rewrite the last equality as:

p(X|Y) = p(Y|X) * p(X) / p(Y)

This is Bayes' theorem!

Is it possible that such simple (almost tautological) transformations give rise to a great theorem? Do not rush to conclusions. Let's talk again about what we got. There was some initial (a priori) probability p(X) that the random variable x, uniformly distributed on the segment [0, 1], falls within the range X. Some event Y occurred, as a result of which we obtained the a posteriori probability of the same event X: p(X|Y), and this probability differs from p(X) by the coefficient p(Y|X) / p(Y). The event Y is called evidence that more or less confirms or refutes X. This coefficient is sometimes called the power of evidence. The stronger the evidence, the more the observation of Y changes the prior probability, and the more the posterior probability differs from the prior. If the evidence is weak, the posterior is nearly equal to the prior.

Bayes formula for discrete random variables

In the previous section, we derived Bayes' formula for continuous random variables x and y defined on the interval [0, 1]. Now consider an example with discrete random variables, each taking two possible values. In the course of routine medical examinations, it was found that at the age of forty, 1% of women suffer from breast cancer. 80% of women with cancer get positive mammography results. 9.6% of healthy women also get positive mammography results. During an examination, a woman of this age group received a positive mammogram result. What is the probability that she actually has breast cancer?

The course of reasoning/calculations is as follows. Of the 1% of cancer patients, mammography will give 80% positive results = 1% * 80% = 0.8%. Of 99% of healthy women, mammography will give 9.6% positive results = 99% * 9.6% = 9.504%. In total, out of 10.304% (9.504% + 0.8%) with positive mammogram results, only 0.8% are sick, and the remaining 9.504% are healthy. Thus, the probability that a woman with a positive mammogram has cancer is 0.8% / 10.304% = 7.764%. Did you think 80% or so?
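The arithmetic of this paragraph is easy to reproduce. The sketch below simply follows the same steps; the variable names are mine.

```python
p_cancer = 0.01              # prior: share of women of this age with breast cancer
p_pos_given_cancer = 0.80    # positive mammogram if the woman is ill
p_pos_given_healthy = 0.096  # positive mammogram if the woman is healthy

true_positives = p_cancer * p_pos_given_cancer           # ill and positive
false_positives = (1 - p_cancer) * p_pos_given_healthy   # healthy and positive
p_positive = true_positives + false_positives            # all positive results

p_cancer_given_pos = true_positives / p_positive
print(f"{p_cancer_given_pos:.2%}")  # 7.76%
```

Changing `p_cancer` shows how strongly the answer depends on the prior: with a rare disease, even a fairly accurate test produces mostly false positives.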

In our example, Bayes' formula takes the following form:

p(X1|Y1) = p(Y1|X1) * p(X1) / [p(Y1|X1) * p(X1) + p(Y1|X2) * p(X2)] = 0.8 * 0.01 / (0.8 * 0.01 + 0.096 * 0.99) ≈ 0.0776

Let's talk about the "physical" meaning of this formula once again.
X is a random variable (the diagnosis), which takes the values X1 (ill) and X2 (healthy);
Y is a random variable (the measurement result, mammography), which takes the values Y1 (positive result) and Y2 (negative result);
p(X1) is the probability of illness before mammography (a priori probability), equal to 1%;
p(Y1|X1) is the probability of a positive result if the patient is ill (a conditional probability, which must be given in the statement of the problem), equal to 80%;
p(Y1|X2) is the probability of a positive result if the patient is healthy (also a conditional probability), equal to 9.6%;
p(X2) is the probability that the patient is healthy before mammography (a priori probability), equal to 99%;
p(X1|Y1) is the probability that the patient is ill, given a positive mammogram result (posterior probability).

It can be seen that the posterior probability (what we are looking for) is proportional to the prior (initial) probability, with a slightly more complex coefficient: p(Y1|X1) / [p(Y1|X1) * p(X1) + p(Y1|X2) * p(X2)]. I will emphasize again: in my opinion, this is a fundamental aspect of the Bayesian approach. The measurement (Y) added a certain amount of information to what was initially available (a priori), which refined our knowledge about the object.

Examples

To consolidate the material covered, try to solve several problems.

Example 1. There are 3 urns: the first contains 3 white balls and 1 black, the second 2 white balls and 3 black ones, the third 3 white balls. Someone approaches one of the urns at random and draws 1 ball from it. This ball is white. Find the posterior probabilities that the ball was drawn from the 1st, 2nd, 3rd urn.

Solution. We have three hypotheses: H1 = (first urn selected), H2 = (second urn selected), H3 = (third urn selected). Since the urn is chosen at random, the a priori probabilities of the hypotheses are: P(H1) = P(H2) = P(H3) = 1/3.

As a result of the experiment, the event A = (a white ball was drawn from the selected urn) occurred. The conditional probabilities of event A under hypotheses H1, H2, H3 are: P(A|H1) = 3/4, P(A|H2) = 2/5, P(A|H3) = 1. For example, the first equality reads: "the probability of drawing a white ball if the first urn is chosen is 3/4 (since there are 4 balls in the first urn, and 3 of them are white)".

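Applying Bayes' formula, we find the posterior probabilities of the hypotheses:

P(A) = 1/3 * 3/4 + 1/3 * 2/5 + 1/3 * 1 = 43/60
P(H1|A) = (1/3 * 3/4) / (43/60) = 15/43
P(H2|A) = (1/3 * 2/5) / (43/60) = 8/43
P(H3|A) = (1/3 * 1) / (43/60) = 20/43

Thus, in the light of the information that event A occurred, the probabilities of the hypotheses changed: hypothesis H3 became the most probable and hypothesis H2 the least probable.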
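For those who prefer to check such answers by machine, here is a short sketch of the three-urn calculation with exact fractions. The list layout (one entry per urn) is just a convenient convention of mine.

```python
from fractions import Fraction

priors = [Fraction(1, 3)] * 3                             # urn chosen at random
p_white = [Fraction(3, 4), Fraction(2, 5), Fraction(1)]   # P(A|H1), P(A|H2), P(A|H3)

# Total probability of drawing a white ball.
p_a = sum(p * w for p, w in zip(priors, p_white))

# Bayes' formula for each urn.
posteriors = [p * w / p_a for p, w in zip(priors, p_white)]

print(f"P(A) = {p_a}")
for i, post in enumerate(posteriors, start=1):
    print(f"P(H{i}|A) = {post}")
```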

Example 2. Two shooters independently shoot at the same target, each firing one shot. The probability of hitting the target is 0.8 for the first shooter and 0.4 for the second. After the shooting, one hole was found in the target. Find the probability that this hole belongs to the first shooter (we discard the outcome in which both holes coincide as negligibly unlikely).

Solution. Before the experiment, the following hypotheses are possible: H1 = (neither the first nor the second shooter hits), H2 = (both shooters hit), H3 = (the first shooter hits and the second does not), H4 = (the first shooter does not hit and the second does). The prior probabilities of the hypotheses are:

P(H1) = 0.2 * 0.6 = 0.12; P(H2) = 0.8 * 0.4 = 0.32; P(H3) = 0.8 * 0.6 = 0.48; P(H4) = 0.2 * 0.4 = 0.08.

The conditional probabilities of the observed event A = (there is one hole in the target) under these hypotheses are: P(A|H1) = P(A|H2) = 0; P(A|H3) = P(A|H4) = 1.

After the experiment, hypotheses H1 and H2 become impossible, and the posterior probabilities of hypotheses H3 and H4 according to Bayes' formula are:

P(H3|A) = 0.48 / (0.48 + 0.08) = 6/7 ≈ 0.857; P(H4|A) = 0.08 / (0.48 + 0.08) = 1/7 ≈ 0.143.
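The same solution can be sketched in code. The hypothesis labels are mine; the probabilities are those of the problem, and the likelihoods are 1 for exactly one hit and 0 otherwise.

```python
p1, p2 = 0.8, 0.4  # hit probabilities of the first and second shooter

# Prior probabilities of the four hypotheses (the shots are independent).
prior = {
    "both miss": (1 - p1) * (1 - p2),
    "both hit": p1 * p2,
    "only first hits": p1 * (1 - p2),
    "only second hits": (1 - p1) * p2,
}

# P(one hole | hypothesis): certain if exactly one shooter hit, impossible otherwise.
likelihood = {
    "both miss": 0.0,
    "both hit": 0.0,
    "only first hits": 1.0,
    "only second hits": 1.0,
}

p_one_hole = sum(prior[h] * likelihood[h] for h in prior)
posterior = {h: prior[h] * likelihood[h] / p_one_hole for h in prior}

print(f"P(hole belongs to first shooter)  = {posterior['only first hits']:.3f}")
print(f"P(hole belongs to second shooter) = {posterior['only second hits']:.3f}")
```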

Bayes against spam

Bayes' formula has found wide application in the development of spam filters. Let's say you want to train a computer to determine which emails are spam. We will work from a dictionary of words and phrases, using Bayesian estimates. Let us first create a space of hypotheses. Suppose we have 2 hypotheses about any letter: HA, it is spam, and HB, it is not spam but a normal, needed letter.

First, let's "train" our future anti-spam system. Let's take all the letters we have and divide them into two "heaps" of 10 letters each. We put spam letters in one and call it the HA heap; in the other we put the needed correspondence and call it the HB heap. Now let's see: which words and phrases are found in spam and in needed emails, and with what frequency? These words and phrases will be called evidence and denoted by E1, E2, ... It turns out that commonly used words (for example, the words "like", "your") occur in the heaps HA and HB with approximately the same frequency. Thus, the presence of these words in a letter tells us nothing about which heap it belongs to (weak evidence). Let's assign to these words a neutral "spam" probability score, say, 0.5.

Let the phrase "conversational English" appear in only 10 letters, and more often in spam emails (for example, in 7 of the 10) than in the needed ones (in 3 of the 10). Let's give this phrase a higher score of 7/10 for spam and a lower score of 3/10 for normal emails. Conversely, it turned out that the word "buddy" was more common in normal letters (6 out of 10). And now we receive a short letter: "Buddy! How is your conversational English?". Let's try to evaluate its "spamness". We will compute the overall estimates P(HA), P(HB) of belonging to each heap using a somewhat simplified Bayes formula and our approximate scores:

P(HA) = A / (A + B), where A = p_a1 * p_a2 * ... * p_an, B = p_b1 * p_b2 * ... * p_bn = (1 - p_a1) * (1 - p_a2) * ... * (1 - p_an).
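A minimal sketch of this simplified formula follows. The score for "conversational English" (0.7) and the neutral 0.5 for "your" come from the text; the spam score of 0.4 for "buddy" is my assumption (taken as the complement of its 6-out-of-10 frequency in normal letters), so the resulting number is illustrative only.

```python
from math import prod

# Spam scores p_ai for the evidence found in the letter.
# 0.7 and 0.5 follow the text; 0.4 for "buddy" is an assumed value.
spam_scores = {
    "conversational English": 0.7,
    "buddy": 0.4,
    "your": 0.5,
}


def simplified_spam_probability(scores):
    """P(HA) = A / (A + B) with A = prod(p_ai) and B = prod(1 - p_ai)."""
    a = prod(scores)
    b = prod(1 - p for p in scores)
    return a / (a + b)


p_spam = simplified_spam_probability(spam_scores.values())
p_needed = 1 - p_spam
print(f"P(HA) = {p_spam:.3f}, P(HB) = {p_needed:.3f}")
```

Note that a neutral score of 0.5 multiplies A and B by the same amount, so such words drop out of the ratio, in line with calling them weak evidence.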

Table 1. Simplified (and incomplete) Bayesian evaluation of the letter

Thus, our hypothetical letter received a probability estimate that leans towards "spam". Can we decide which heap to throw the letter into? Let's set the decision thresholds:

We will assume that the letter belongs to the heap Hi if P(Hi) ≥ T.
The letter does not belong to the heap if P(Hi) ≤ L.
If L < P(Hi) < T, then no decision can be made.
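The three rules translate into a small function. This is a sketch under assumptions: the function name `decide` and the returned labels are mine, and the default thresholds T = 0.95 and L = 0.05 are example values.

```python
def decide(p_heap, t=0.95, l=0.05):
    """Classify a letter against one heap using an upper and a lower threshold."""
    if p_heap >= t:
        return "belongs"
    if p_heap <= l:
        return "does not belong"
    return "undecided"  # strictly between the thresholds


for p in (0.997, 0.61, 0.003):
    print(p, "->", decide(p))
```

Keeping a wide "undecided" band between the thresholds trades coverage for safety: fewer letters are classified automatically, but fewer are misfiled.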

You can take T = 0.95 and L = 0.05. Since for the letter in question 0.05 < P(HA) < 0.95 and 0.05 < P(HB) < 0.95, we cannot decide where to assign this letter: to spam (HA) or to the needed letters (HB). Can the estimate be improved by using more information?

Yes. Let's calculate the score for each piece of evidence in a different way, just as Bayes suggested. Let:

F_a be the total number of spam emails;

F_ai be the number of letters containing evidence i in the spam heap;

F_b be the total number of needed letters;

F_bi be the number of letters containing evidence i in the heap of needed (relevant) letters.

Then: p_ai = F_ai / F_a, p_bi = F_bi / F_b. P(HA) = A / (A + B), P(HB) = B / (A + B), where A = p_a1 * p_a2 * … * p_an, B = p_b1 * p_b2 * … * p_bn
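Here is a sketch of the frequency-based scores. The counts below are hypothetical, chosen only to illustrate the mechanics: 7 and 3 for "conversational English" and 6 for "buddy" in needed letters follow the earlier description, while the count of 1 for "buddy" in spam and the heap sizes of 10 are my assumptions. The result therefore does not reproduce the figures of the original table.

```python
from fractions import Fraction
from math import prod

F_a = 10  # total number of spam letters (assumed heap size)
F_b = 10  # total number of needed letters (assumed heap size)

# Evidence -> (F_ai, F_bi): letters containing it in the spam / needed heap.
# The spam count for "buddy" is a made-up illustrative value.
counts = {
    "conversational English": (7, 3),
    "buddy": (1, 6),
}

p_a = {e: Fraction(fa, F_a) for e, (fa, fb) in counts.items()}  # p_ai = F_ai / F_a
p_b = {e: Fraction(fb, F_b) for e, (fa, fb) in counts.items()}  # p_bi = F_bi / F_b

A = prod(p_a.values())
B = prod(p_b.values())
P_HA = A / (A + B)
P_HB = B / (A + B)
print(f"P(HA) = {P_HA}, P(HB) = {P_HB}")
```

Unlike the simplified version, p_bi is no longer forced to equal 1 - p_ai: each heap contributes its own measured frequency.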

Please note that the scores of the word-evidence p_ai and p_bi have become objective and can be calculated without human participation.

Table 2. A more accurate (but incomplete) Bayesian estimate for available features from a letter

We got a quite definite result: with a large margin, the letter can be assigned to the needed letters, since P(HB) = 0.997 > T = 0.95. Why did the result change? Because we used more information: we took into account the number of letters in each of the heaps and, by the way, determined the scores p_ai and p_bi much more correctly. They were determined the way Bayes himself did it, by calculating conditional probabilities. In other words, p_a3 is the probability that the word "buddy" appears in the email, given that the email belongs to the spam heap HA. The result was not long in coming: it seems that we can now make a decision with greater certainty.

Bayes vs Corporate Fraud

An interesting application of the Bayesian approach was described by MAGNUS8.

My current project (an information system for detecting fraud in a manufacturing enterprise) uses Bayes' formula to determine the probability of fraud given the presence or absence of several facts that indirectly support the hypothesis that fraud is possible. The algorithm is self-learning (with feedback), i.e. it recalculates its coefficients (conditional probabilities) whenever the fraud is actually confirmed or not confirmed during verification by the economic security service.

It is probably worth saying that such methods require a fairly high mathematical culture from the developer who designs the algorithms, because the slightest error in the derivation and/or implementation of the computational formulas will nullify and discredit the entire method. Probabilistic methods are especially prone to this, since human thinking is not adapted to working with probabilistic categories and, accordingly, there is no "visibility" or understanding of the "physical meaning" of intermediate and final probabilistic parameters. Such understanding exists only for the basic concepts of probability theory; beyond that you just have to combine and derive complex things very carefully according to the laws of probability theory, because common sense no longer helps with composite objects. This, in particular, is the reason for the quite serious methodological battles on the pages of modern books on the philosophy of probability, as well as for the large number of sophisms, paradoxes and puzzle problems on this topic.

One more nuance that I had to face: unfortunately, almost everything more or less USEFUL IN PRACTICE on this topic is written in English. In Russian-language sources there is mostly just the well-known theory with demonstration examples for only the most primitive cases.

I fully agree with the last comment. For example, when I searched Google for something like "Bayesian probability book", it returned nothing intelligible. True, it did report that a book involving Bayesian statistics was banned in China. (Statistics professor Andrew Gelman reported on a Columbia University blog that his book, Data Analysis Using Regression and Multilevel/Hierarchical Models, was banned from publication in China.) I wonder if a similar reason led to the absence of books on Bayesian probability in Russia?

Conservatism in the process of human information processing

Probabilities quantify the degree of uncertainty. Probability, both according to Bayes and according to our intuition, is simply a number between zero and one that represents the degree to which a somewhat idealized person believes a statement to be true. The reason the person is somewhat idealized is that the sum of his probabilities for two mutually exclusive events must equal his probability that either of those events occurs. The additivity property has such demanding implications that few real people can satisfy them all.

Bayes' theorem is a trivial consequence of the additivity property, undisputed and accepted by all probabilists, Bayesian and otherwise. One way to write it is the following. If P(HA|D) is the posterior probability of hypothesis A after the datum D was observed, P(HA) is its prior probability before D was observed, P(D|HA) is the probability that D will be observed if HA is true, and P(D) is the unconditional probability of D, then

(1) P(HA|D) = P(D|HA) * P(HA) / P(D)

P(D) is best thought of as a normalizing constant that makes the posterior probabilities add up to one over the exhaustive set of mutually exclusive hypotheses under consideration. If it needs to be calculated, it can be done like this:

P(D) = Σ_i P(D|Hi) * P(Hi)

But more often P(D) is eliminated rather than calculated. A convenient way to eliminate it is to transform Bayes' theorem into its odds-likelihood ratio form.

Consider another hypothesis, HB, mutually exclusive with HA, and revise your opinion about it on the basis of the same datum D that changed your opinion about HA. Bayes' theorem says that

(2) P(HB|D) = P(D|HB) * P(HB) / P(D)

Now we divide Equation 1 by Equation 2; the result is:

(3) P(HA|D) / P(HB|D) = [P(D|HA) / P(D|HB)] * [P(HA) / P(HB)], or Ω1 = L * Ω0

where Ω1 is the posterior odds in favor of HA over HB, Ω0 is the prior odds, and L is a quantity familiar to statisticians as the likelihood ratio. Equation 3 is just as valid a version of Bayes' theorem as Equation 1, and is often much more useful, especially for experiments involving hypotheses. Bayesians argue that Bayes' theorem is a formally optimal rule for revising opinions in the light of new data.

We are interested in comparing the ideal behavior defined by Bayes' theorem with the actual behavior of people. To give you some idea of what this means, let's try an experiment with you as the subject. This bag contains 1000 poker chips. I have two such bags, one with 700 red and 300 blue chips, and the other with 300 red and 700 blue. I flipped a coin to determine which one to use. Thus, if our opinions agree, your current probability that this is the bag with more red chips is 0.5. Now you sample at random, with replacement after each chip. In 12 draws you get 8 red and 4 blue chips. Now, based on everything you know, what is the probability that this is the bag with more reds? It is clear that it is higher than 0.5. Please do not continue reading until you have recorded your estimate.

If you are like a typical subject, your estimate falls between 0.7 and 0.8. If we did the corresponding calculation, however, the answer would be 0.97. Indeed, it is very rare for a person who has not previously been shown the effect of conservatism to come up with such a high estimate, even if he is familiar with Bayes' theorem.

If the proportion of red chips in the bag is p, then the probability of getting r red chips and (n - r) blue ones, in a particular order, in n draws with replacement is p^r * (1 - p)^(n - r). Thus, in a typical bag-and-poker-chip experiment, if HA means that the proportion of red chips is pA and HB means that the proportion is pB, then the likelihood ratio is:

L = [pA^r * (1 - pA)^(n - r)] / [pB^r * (1 - pB)^(n - r)]

When applying Bayes' formula, one must take into account only the probability of the observation actually made, not the probabilities of other observations that one might have made but did not. This principle has broad implications for all statistical and non-statistical applications of Bayes' theorem; it is the most important technical tool of Bayesian thinking.
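The figure of 0.97 for the poker chip experiment can be reproduced with the odds form of Bayes' theorem. The sketch below is one way to do it; the function name and parameter names are mine. Only the likelihood of the sample actually drawn is used, and the binomial coefficient is omitted because it is the same under both hypotheses and cancels in the ratio.

```python
def posterior_red_bag(reds, blues, p_a=0.7, p_b=0.3, prior_a=0.5):
    """Posterior probability of the mostly-red bag after the observed draws."""
    # Likelihood of the observed sample under each hypothesis.
    like_a = p_a**reds * (1 - p_a)**blues
    like_b = p_b**reds * (1 - p_b)**blues

    likelihood_ratio = like_a / like_b            # L
    prior_odds = prior_a / (1 - prior_a)          # Omega_0
    posterior_odds = likelihood_ratio * prior_odds  # Omega_1 = L * Omega_0

    return posterior_odds / (1 + posterior_odds)  # convert odds to probability


p = posterior_red_bag(reds=8, blues=4)
print(f"{p:.2f}")  # 0.97
```

With these two bags the likelihood ratio reduces to (7/3) raised to the power of (reds - blues), so only the difference between the counts matters, not the total number of draws.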

Bayesian revolution

Your friends and colleagues are talking about something called "Bayes' theorem" or "Bayes' rule" or something called Bayesian thinking. They're really into it, so you go online and find a page about Bayes' theorem and... it's an equation. And that's all... Why does a mathematical concept inspire such enthusiasm? What kind of "Bayesian revolution" is taking place among scientists, with claims that even the experimental method itself can be described as a special case of it? What is the secret that the followers of Bayes know? What kind of light do they see?

The Bayesian revolution in science did not happen because more and more cognitive scientists suddenly began to notice that mental phenomena have a Bayesian structure; not because scientists in every field have started using the Bayesian method; but because science itself is a special case of Bayes' theorem; experimental evidence is Bayesian evidence. Bayesian revolutionaries argue that when you do an experiment and you get evidence that "supports" or "refutes" your theory, that confirmation or refutation occurs according to Bayesian rules. For example, you must take into account not only that your theory can explain the phenomenon, but also that there are other possible explanations that can also predict this phenomenon.

Previously, the most popular philosophy of science was an older one, which the Bayesian revolution has displaced. Karl Popper's idea that theories can be definitively falsified, but never definitively confirmed, is another special case of the Bayesian rules: if p(X|A) ≈ 1, that is, if the theory makes a definite prediction, then observing ~X falsifies A very strongly. On the other hand, if p(X|A) ≈ 1 and we observe X, this does not confirm the theory very strongly; there may be some other condition B such that p(X|B) ≈ 1, in which case observing X is not evidence for A over B. For an observation of X to definitively confirm A, we would need to know not that p(X|A) ≈ 1 but that p(X|~A) ≈ 0, which we cannot know because we cannot consider all possible alternative explanations. For example, when Einstein's theory of general relativity superseded Newton's extremely well-confirmed theory of gravity, all the predictions of Newton's theory turned out to be a special case of Einstein's.

Similarly, Popper's claim that an idea must be falsifiable can be interpreted as a manifestation of the Bayesian rule of conservation of probability: if the result X is positive evidence for the theory, then the result ~X must count against the theory to some extent. If you try to interpret both X and ~X as "supporting" the theory, the Bayesian rules say that this is impossible! To increase the probability of a theory, you must subject it to tests that could potentially reduce its probability; this is not just a rule for detecting charlatans in science, but a consequence of Bayesian probability theory. On the other hand, Popper's idea that only falsification is needed and no confirmation is wrong. Bayes' theorem shows that falsification is very strong evidence compared to confirmation, but falsification is still probabilistic in nature; it is not governed by fundamentally different rules from confirmation, as Popper argued.

Thus we find that many phenomena in the cognitive sciences, plus the statistical methods used by scientists, plus the scientific method itself, are all special cases of Bayes' theorem. This is what the Bayesian revolution is all about.

Welcome to the Bayesian Conspiracy!

Literature on Bayesian Probability

2. Nobel laureate in economics Kahneman (et al.) describes many different applications of Bayes in a wonderful book. In my summary of this very large book alone, I counted 27 references to the name of the Presbyterian minister. A minimum of formulas. (.. I really liked it. True, it is complicated, with a lot of mathematics (and where would we be without it), but individual chapters (for example, Chapter 4. Information) are right on topic. I recommend it to everyone. Even if mathematics is difficult for you, read every other line, skipping the math and fishing for useful grains ...

14. (supplement dated January 15, 2017), a chapter from Tony Crilly's book "50 Mathematical Ideas You Really Need to Know".

The physicist Richard Feynman, Nobel laureate, speaking about one philosopher with a particularly great conceit, once said: “It’s not philosophy as a science that irritates me at all, but the pomp that has been created around it. If only philosophers could laugh at themselves! If only they could say: "I say it's like this, but Von Leipzig thought it was different, and he also knows something about it." If only they remembered to clarify that it was only their .

The understanding (study) of probabilities begins where the classical course in probability theory ends. For some reason, schools and universities teach frequency (combinatorial) probability, that is, the probability of what is well defined. The human brain is wired differently. We have theories (opinions) about everything in the world. We subjectively assess the likelihood of certain events. We can also change our mind if something unexpected happens. This is what we do every day. For example, if you are meeting a friend at the Pushkin monument, you have a sense of whether she will be on time, or 15 minutes or half an hour late. But when you come out of the metro onto the square and see 20 cm of fresh snow, you update your probabilities to account for the new data.

This approach was first described by Bayes and Laplace, although Laplace, I think, was not familiar with Bayes' work. For some reason I do not understand, the Bayesian approach is rather poorly represented in the Russian-language literature. For comparison, I note that for the query "Bayes", Ozon returns 4 results, and Amazon about 1000.

This note is a translation of a small English book, and will give you an intuitive understanding of how to use Bayes' theorem. It starts with a definition and then uses examples in Excel to help you follow the entire line of reasoning.

Scott Hartshorn. Bayes' Theorem Examples: A Visual Guide For Beginners. – 2016, 82 p.


Definition of Bayes' theorem and intuitive explanation

Bayes' theorem:

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \qquad (1)$$

where A and B are events, P(A) and P(B) are the probabilities of A and B without regard to each other, P(A|B) is the conditional probability of event A given that B is true, and P(B|A) is the conditional probability of B given that A is true.

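Actually, the equation is somewhat more complicated, but for most applications this is sufficient. The result of the calculation is simply a normalized weighted value based on the initial guess. So, take an initial guess, weigh it against other initial possibilities, normalize based on observation:

$$\text{normalized probability} = \frac{\text{initial probability} \times \text{probability of the observation under this option}}{\text{sum of these products over all options}} \qquad (2)$$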

In the course of solving problems, we will perform the following steps (they will become clearer later):

1. Decide which of the probabilities we want to calculate and which one we observe.
2. Estimate the initial probabilities for all possible options.
3. Assuming that some initial option is true, calculate the probability of our observation; do the same for all initial options.
4. Find the weighted value as the product of the initial probability (step 2) and the conditional probability (step 3), again for each of the initial options.
5. Normalize the results: divide each weighted probability (step 4) by the sum of all weighted probabilities; the normalized probabilities sum to 1.
6. Repeat steps 2-5 for each new observation.
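The steps above can be sketched as a single small function. This is a minimal illustration, not the Excel model used in the note; the name `bayes_update` and the list-based representation of the options are assumptions made here.

```python
def bayes_update(priors, likelihoods):
    """One pass of steps 2-5 for a single observation.

    priors      -- initial probability of each option (step 2)
    likelihoods -- probability of the observation under each option (step 3)
    """
    # Step 4: weight each prior by the probability of the observation
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("the observation is impossible under every option")
    # Step 5: normalize so that the results sum to 1
    return [w / total for w in weighted]


# Step 6: feed the result back in as the new priors for the next observation
probs = bayes_update([0.5, 0.5], [1.0, 0.5])
probs = bayes_update(probs, [1.0, 0.5])
print(probs)
```

Returning a fresh list each time keeps the "posterior becomes the next prior" loop explicit.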

Example 1. A simple example with dice

Suppose your friend has 3 dice: with 4, 6 and 8 faces. He randomly selects one of them, does not show it to you, rolls it and reports the result: 2. Calculate the probability that the 4-sided, 6-sided or 8-sided die was chosen.

Step 1. We want to calculate the probability that the 4-sided, 6-sided or 8-sided die was chosen. We observe the number rolled: 2.

Step 2. Since there were 3 dice, the initial probability of choosing each of them is 1/3.

Step 3. Observation: the die landed on 2. If the 4-sided die was taken, the chance of this is 1/4. For the 6-sided die, the chance of rolling a 2 is 1/6. For the 8-sided die, 1/8.

Step 4. Weighted probability of rolling a 2 with the 4-sided die = 1/3 * 1/4 = 1/12, with the 6-sided = 1/3 * 1/6 = 1/18, with the 8-sided = 1/3 * 1/8 = 1/24.

Step 5. Total probability of rolling a 2 = 1/12 + 1/18 + 1/24 = 13/72. This number is less than 1 because the chance of rolling a 2 is less than 1. But we know that a 2 has already been rolled. So we need to divide the weighted probability of each option from step 4 by 13/72 so that the probabilities of all the dice, given that a 2 was rolled, sum to 1. This process is called normalization.

Normalizing each weighted probability, we find the probability that this particular die was chosen:

4-sided = (1/12) / (13/72) = 6/13
6-sided = (1/18) / (13/72) = 4/13
8-sided = (1/24) / (13/72) = 3/13

And this is the answer.

When we started solving the problem, we assumed that the probability of choosing each die was 33.3%. After a 2 was rolled, we calculated that the probability that the 4-sided die was chosen rose to 46.2%, the probability of the 6-sided die decreased to 30.8%, and the probability of the 8-sided die dropped all the way to 23.1%.
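The fractions in Example 1 can be reproduced exactly with Python's `fractions` module. This is a sketch with variable names chosen here for illustration; the dice, the prior of 1/3 and the observed roll of 2 are taken from the example.

```python
from fractions import Fraction

sides = [4, 6, 8]
roll = 2

# Step 2: equal initial probability for each die
prior = {s: Fraction(1, 3) for s in sides}

# Steps 3-4: weight each prior by the chance of the observed roll
weighted = {
    s: prior[s] * (Fraction(1, s) if roll <= s else Fraction(0))
    for s in sides
}

# Step 5: normalize by the total probability of the observation
total = sum(weighted.values())
posterior = {s: w / total for s, w in weighted.items()}

print(total)      # 13/72
print(posterior)  # {4: Fraction(6, 13), 6: Fraction(4, 13), 8: Fraction(3, 13)}
```

Changing `roll` to 7 leaves only the 8-sided die with nonzero probability.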

If we take another roll, we could use the newly calculated percentages as our initial guesses and refine the probabilities based on the second observation.

If you have a single observation, it is convenient to present all the steps in the form of a table:

Table 1. Step-by-step solution in the form of a table (for formulas, see the Excel file on the sheet Example 1)

Note:

If, for example, a 7 had been rolled instead of a 2, then the chances at step 3 for the 4- and 6-sided dice would be zero, and after normalization the probability of the 8-sided die would be 100%.
Since the example only includes three dice and one roll, we have used simple fractions. For most problems with lots of options and events, it's easier to work with decimals.

Example 2. More dice, more rolls

This time we have 6 dice with 4, 6, 8, 10, 12 and 20 faces. We choose one of them at random and roll it 15 times. What is the probability that a particular die was chosen?

I am using a model in Excel (Fig. 1; see sheet Example 2). Random numbers are generated in column B using the function =RANDBETWEEN(1,$B$9). In this case an 8-sided die is selected in cell B9, so the random numbers can take values from 1 to 8. Since Excel updates the random numbers after each change in the worksheet, I copied column B to the clipboard and pasted only the values into column C. Now the values do not change and will be used for the subsequent calculations. (I have added the ability to "play" with the number of faces and the random rolls on the sheet Example 2 game. Particularly curious results are obtained if the number 13 is set in cell B9 🙂 - Baguzin's note.)

Fig. 1. Random number generator

Step 2. Since there are only six dice, the probability of choosing one at random is 1/6 or 0.167.

Steps 3 and 4. Let's write an equation for the probability of the initial choice of a certain die after the corresponding roll. As we saw at the end of Example 1, some rolls may not match certain dice. For example, rolling a 9 makes the probability of a 4-, 6- and 8-sided die equal to zero. If a “legitimate” number is rolled, then its probability for a given die is equal to one divided by the number of faces. For convenience, we have combined steps 3 and 4, so we will immediately write down the formula for the probability of a toss multiplied by the normalized probability after the previous toss (Fig. 2):

IF(roll > number of faces; 0; 1/number of faces * previous normalized probability)

If you use cell references carefully, you can drag this formula across all the rows.

Fig. 2. Probability equation

Step 5. The last step is to normalize the results after each roll (region L11:R28 in Fig. 3).

Fig. 3. Normalization of results

So, after 15 rolls, we can assume with 96.4% probability that we initially chose the 8-sided die. There remains a chance that a die with a larger number of faces was chosen: 3.4% for the 10-sided die, 0.2% for the 12-sided, 0.0001% for the 20-sided. But the probability of the 4- and 6-sided dice is zero, since the numbers rolled included 7 and 8. This, of course, agrees with the fact that we entered the number 8 in cell B9, limiting the values for the random number generator.
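The roll-by-roll update of Example 2 can be sketched in Python as follows. The actual rolls from the Excel sheet are not listed in the text, so the `rolls` list below is a hypothetical sequence of 15 values from 1 to 8 (starting with a 6 and with an 8 on the fifth roll, as described); the function name `update` is also an assumption.

```python
def update(probs, roll, sides):
    """Steps 3-5 for one roll: weight by 1/faces (or 0) and normalize."""
    weighted = [p * (1.0 / s if roll <= s else 0.0) for p, s in zip(probs, sides)]
    total = sum(weighted)
    return [w / total for w in weighted]


sides = [4, 6, 8, 10, 12, 20]
# Hypothetical rolls of an 8-sided die (not the values from the Excel file)
rolls = [6, 3, 1, 5, 8, 2, 7, 4, 4, 1, 6, 2, 8, 3, 5]

probs = [1 / len(sides)] * len(sides)  # step 2: equal initial probabilities
history = [probs]
for r in rolls:
    probs = update(probs, r, sides)
    history.append(probs)

posterior = dict(zip(sides, probs))
for s, p in posterior.items():
    print(f"{s}-sided: {p:.4%}")
```

Because every roll is at most 8 and at least one roll exceeds 6, any such sequence of 15 rolls leads to the same final percentages as in the text; `history` holds the values one would plot roll by roll.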

If we plot the probability of each initial choice of the die, roll by roll, we see (Figure 4):

After the first roll, the probability of the 4-sided die drops to zero, since a 6 was rolled right away. The 6-sided die therefore takes the lead.
For the first few rolls, the 6-sided die has the highest probability, since it has the fewest faces among the dice that are consistent with the rolled values.
On the fifth roll, an 8 is rolled, the probability of the 6-sided die drops to zero, and the 8-sided die becomes the leader.
The probabilities of the 10-, 12- and 20-sided dice gradually decreased over the first rolls and then spiked when the 6-sided die dropped out of the race. This is because the results were then normalized over a much smaller total.

Fig. 4. Change in probabilities roll by roll

Note:

Bayes' theorem for multiple events is simply repeated multiplication on sequentially updated data. The final answer does not depend on the order in which the events occurred.
It is not necessary to normalize the probabilities after each event. You can do this once at the very end. The problem is that if you don't do normalization all the time, the probabilities become so small that Excel may not work correctly due to rounding errors. Thus, it's more practical to normalize at each step than to check if you're close to the edge of Excel's precision.
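Both remarks in the note can be checked numerically. The sketch below compares normalizing after every roll with normalizing once at the end, and also reverses the order of the rolls. The roll values and function names are hypothetical, chosen here for illustration.

```python
import math

sides = [4, 6, 8, 10, 12, 20]
rolls = [6, 3, 1, 5, 8, 2, 7, 4]  # hypothetical rolls


def likelihood(roll, faces):
    return 1.0 / faces if roll <= faces else 0.0


def posterior_stepwise(rolls):
    """Normalize after each roll, as in the Excel model."""
    probs = [1 / len(sides)] * len(sides)
    for r in rolls:
        weighted = [p * likelihood(r, s) for p, s in zip(probs, sides)]
        total = sum(weighted)
        probs = [w / total for w in weighted]
    return probs


def posterior_once(rolls):
    """Multiply all likelihoods first and normalize a single time."""
    raw = [math.prod(likelihood(r, s) for r in rolls) / len(sides) for s in sides]
    total = sum(raw)
    return [x / total for x in raw]


a = posterior_stepwise(rolls)
b = posterior_once(rolls)
c = posterior_stepwise(list(reversed(rolls)))
print(all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, b)))  # True
print(all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, c)))  # True
```

With hundreds of observations the unnormalized products in `posterior_once` can underflow to zero, which is the practical reason given above for normalizing at every step.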

Bayes' theorem. Terminology

The initial probability, that is, the probability of each possibility before the observation is made, is called the prior.
The normalized answer after calculating the probability for each data point (for each observation) is called the posterior.
The total probability used to normalize the answer is called the normalization constant.
The conditional probability, i.e. the probability of the observation under each possibility, is called the likelihood.

Here is how these terms look for the first example (compare with Fig. 1).

Fig. 5. Terms of Bayes' theorem

Bayes' theorem itself, written in the new terms, looks like this (compare with formula 2):

$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{normalization constant}}$$

Example 3. An unfair coin

You have a coin that you suspect is not fair. You toss it 100 times. Calculate the probability that this coin's chance of landing heads is 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%.

Let's turn to the Excel file, sheet Example 3. In cells B13:B112, I generated random numbers between 0 and 1 and, with Paste Special, put the values in column C. In cell B8, I entered the expected percentage of heads for this unfair coin. In column D, using the IF function, I turned the random numbers into ones (heads, for a random number from 0.35 to 1) or zeros (tails, for a random number from 0 to 0.35).

Fig. 6. Initial data for tossing an unfair coin

I got 63 heads and 37 tails, which agrees well with the random number generator, given that we set the probability of heads to 65% at the input.

Step 1. We want to calculate the probabilities that the chance of heads falls in the 0%, 10%, ... 100% bins, having observed 63 heads and 37 tails in 100 tosses.

Step 2. There are 11 initial possibilities: probabilities 0%, 10%, ... 100%. We will naively assume that all initial possibilities are equally probable, that is, 1 chance in 11 (Fig. 7). (More realistically, we could give the initial probabilities around 50% more weight than those at the 0% and 100% edges. But the nice thing is that, since we have as many as 100 tosses, the initial probabilities are not all that important!)

Steps 3 and 4. Likelihood calculation. To calculate the probability after each toss, the Excel model uses the IF function. In the case of heads, the weighted value is the product of the hypothesized chance of heads and the previous normalized probability. In the case of tails, it is (1 minus the hypothesized chance of heads) * previous normalized probability (Fig. 8).

Fig. 8. Likelihood

Step 5. Normalization is performed as in the previous example.

The results are best presented as a series of histograms. The initial plot is the prior probability. Each subsequent plot shows the situation after the next 25 tosses (Fig. 9). Since we set the probability of heads to 65% at the input, the plots are not surprising.

Fig. 9. Probabilities of options after a series of tosses

What does the 70% probability for the 0.6 option really mean? It is not a 70% chance that the coin's probability of heads is exactly 60%. Since we had a 10% step between options, we estimate that there is a 70% chance that this coin's probability of heads lies between 55% and 65%. The decision to use 11 initial options, in increments of 10%, was completely arbitrary. We could use 101 initial possibilities in 1% increments. In this case, we would have a result with a maximum at 63% (since we had 63 heads) and a smoother drop on the chart.

Note that in this example we observed slower convergence than in Example 2. This is because the difference between a coin that lands heads 60% of the time and one that lands heads 70% of the time is smaller than the difference between 8- and 10-sided dice.
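The final histogram of Example 3 can be reproduced from the totals alone (63 heads, 37 tails), since the order of the tosses does not matter. This is a minimal sketch with names chosen here; it uses the 11 equally weighted options from step 2.

```python
heads, tails = 63, 37

# 11 options for the coin's chance of heads: 0.0, 0.1, ..., 1.0
grid = [i / 10 for i in range(11)]
prior = [1 / len(grid)] * len(grid)

# Prior times likelihood of the observed tosses under each option
weighted = [p * h ** heads * (1 - h) ** tails for p, h in zip(prior, grid)]
total = sum(weighted)
posterior = [w / total for w in weighted]

for h, p in zip(grid, posterior):
    print(f"chance of heads {h:.1f}: {p:.1%}")
```

The 0.6 option comes out at roughly 70%, with most of the remainder on 0.7; replacing `grid` with 101 options in 1% steps gives the smoother curve peaking at 63% mentioned above.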

Example 4. More dice, but with errors in the data stream

Let's go back to example 2. A friend has 4-, 6-, 8-, 10-, 12- and 20-sided dice in a bag. He draws one die at random and rolls it 80 times. He writes down the numbers rolled, but in 5% of cases he makes a mistake. In that case, a random number between 1 and 20 appears instead of the actual result of the roll. After 80 rolls, which die do you think was chosen?

As input to Excel (sheet Example 4) I entered the number of sides (8) as well as the probability that the data contains an error (0.05). The formula for the throw value (fig. 10):

IF(RAND() > Error Probability; RANDBETWEEN(1, number of faces); RANDBETWEEN(1, 20))

If the random number is greater than the error probability (0.05), then there was no error on this throw, so the random number generator chooses a value between 1 and the “guessed” number of sides of the die, otherwise a random integer between 1 and 20 should be generated.
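A Python counterpart of this Excel formula might look as follows. It is a sketch: the function name `recorded_roll` and the optional `rng` argument (which allows a seeded generator for reproducible runs) are assumptions made here.

```python
import random


def recorded_roll(faces, error_prob, rng=random):
    """Return the value written down for one roll of a die with `faces` sides."""
    if rng.random() > error_prob:
        # No recording error: the true roll of the chosen die
        return rng.randint(1, faces)
    # Recording error: a random integer from 1 to 20 instead of the true roll
    return rng.randint(1, 20)


rng = random.Random(2016)  # fixed seed so that the run is reproducible
rolls = [recorded_roll(8, 0.05, rng) for _ in range(80)]
print(rolls)
```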

Fig. 10. Roll value calculation

At first glance, we could solve this problem in the same way as in example 2. But if we do not take the probability of error into account, we get a probability graph like the one in Fig. 11. (The easiest way to get it in Excel is to first generate the rolls in column B with an error value of 0.05; then transfer the roll values to column C; and finally change the value in cell B11 to 0. Since the likelihood formulas in the range D14:J94 refer to cell B11, the effect of ignoring errors will be achieved.)

Fig. 11. Processing the roll values without taking into account the probability of errors

Since the probability of error is small, and the random number generator is set to an 8-sided die, the probability of the latter becomes more dominant with each roll. Moreover, since an error has a 40% (eight out of twenty) chance of giving a value within 8, an erroneous value that affected the result did not appear until the 63rd roll. However, if errors are not taken into account, the probability of the 8-sided die then drops to zero, and the 20-sided die gets 100%. Note that by the 63rd roll, the probability of the 20-sided die was only 2 * 10^-25.

There is a 5% chance of an error, and a 60% chance that an error gives a value greater than 8. That is, 3% of the rolls will yield an erroneous value greater than 8, which is what happened on roll 63, when 17 was recorded. The probability of the 20-sided die then jumped to 1, as in Fig. 11.

If a person carefully observes the data, he can detect this error and not take into account the erroneous values. To automate the process, complete the likelihood equation with an error check. Never set error probabilities to zero if you accept that they cannot be completely eliminated. If you take into account the probabilities of error, then hundreds of "correct" data will not allow individual erroneous values to spoil the picture.

We supplement the equation of the likelihood function with an error check (Fig. 12):

IF($C15>F$13;$B$11*1/20*N14;($B$11*1/20+(1-$B$11)/F$13)*N14)

Fig. 12. Likelihood function with allowance for errors

If the recorded roll value is greater than the number of faces ($C15>F$13), we do not set the conditional probability to zero, but reduce it according to the error probability ($B$11*1/20*N14). If the recorded number does not exceed the number of faces, we raise the conditional probability, but not fully, again allowing for a possible error (($B$11*1/20+(1-$B$11)/F$13)*N14). In the latter case, we assume that the recorded number could be either the result of an error ($B$11*1/20) or a correctly recorded roll ((1-$B$11)/F$13).

The change in the normalized probability becomes more resistant to possible errors (Fig. 13).
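The error-aware likelihood can be sketched in Python as below. The function names and the short list of recorded rolls are hypothetical; the two branches mirror the Excel formula with an error probability of 0.05 and erroneous values drawn from 1 to 20.

```python
def noisy_likelihood(roll, faces, error_prob, max_value=20):
    """Probability of a recorded value given a die with `faces` sides."""
    error_part = error_prob / max_value  # the value was written down in error
    if roll > faces:
        return error_part                # only an error can explain it
    return error_part + (1 - error_prob) / faces


def update(probs, roll, sides, error_prob):
    weighted = [p * noisy_likelihood(roll, s, error_prob) for p, s in zip(probs, sides)]
    total = sum(weighted)
    return [w / total for w in weighted]


sides = [4, 6, 8, 10, 12, 20]
recorded = [5, 6, 1, 7, 2, 8, 3, 17, 4, 8]  # hypothetical; 17 is an erroneous entry

probs = [1 / len(sides)] * len(sides)
for r in recorded:
    probs = update(probs, r, sides, error_prob=0.05)

posterior = dict(zip(sides, probs))
print(posterior)
```

In this run the single out-of-range value lowers the probability of the 8-sided die but does not zero it out, so it remains the leading option.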

Fig. 13. Change in normalized probability from roll to roll

In this example, the 6-sided die is initially the favorite because the first 3 rolls are 5, 6, 1. Then a 7 comes up and the probability of the 8-sided die goes up. However, the appearance of a 7 does not zero out the probability of the 6-sided die, because the 7 may be a mistake. The next nine rolls seem to confirm this, since nothing greater than 6 is rolled: the probability of the 6-sided die starts growing again. However, on the 14th and 15th rolls, 7s come up again, and the probability of the 6-sided die approaches zero. Later, the values 17 and 19 appear, which the "system" identifies as clearly erroneous.

Example 4A. What if you have a really high error rate?

This example is similar to the previous one, but the error rate has been increased from 5% to 75%. As the data became less relevant, we increased the number of rolls to 250. Applying the same equations as in example 4, we get the following graph:

Fig. 14. Normalized probability at 75% of erroneous entries

With such a high error rate, many more rolls were needed. In addition, the outcome is less certain, and the 6-sided die periodically becomes more likely. Even with a higher error rate, such as 99%, you can still get the correct answer. Obviously, the higher the error rate, the more rolls are needed. With 75% errors, we get one correct value out of four. With a 99% probability of error, we would get only one correct value out of a hundred. We would probably need 25 times more data to identify the dominant option.

What if you don't know the probability of error? I recommend playing with examples 4 and 4A by setting cell B11 to different values, from very small (for example, 2 * 10^-25 for example 4) to very large (for example, 90% for example 4A). Here are the main findings:

If the error rate estimate is higher than the actual error rate, the results will converge more slowly, but still converge to the correct answer.
If you estimate the error rate too low, there is a risk that the results will not be correct.
The lower the actual error rate, the more wiggle room you have in guessing the error rate.
The higher the actual error rate, the more data you need.

Example 5. German tank problem

In this problem, you are trying to estimate how many tanks were produced based on the serial numbers of captured tanks. Bayes' theorem was used by the Allies during World War II, and it ultimately yielded lower estimates than those reported by intelligence. After the war, records showed that the statistical estimates using Bayes' theorem were more accurate. (Curiously, I wrote a note on this topic before I knew what Bayesian probabilities were; see . - Baguzin's note.)

So, you are analyzing serial numbers taken from wrecked or captured tanks. The goal is to estimate how many tanks were produced. Here's what you know about tank serial numbers:

They start at 1.
These are whole numbers without gaps.
You have found the following serial numbers: 30, 70, 140, 125.

We are interested in the answer to the question: what is the maximum number of tanks? I will start from an upper limit of 1000 tanks. Someone else could start from 500 or 2000 tanks, and we might get different results. I am going to analyze in steps of 20 tanks, which means I have 50 initial possibilities for the number of tanks. You could refine the model and analyze every individual number in Excel, but the answer would not change much, and the analysis would become much more cumbersome.

I am assuming that all possibilities for the number of tanks are equally probable (i.e. the probability of having 50 tanks is the same as having 500). Note that there are more columns in the Excel file than shown in the figure. The conditional probability for the likelihood function is very similar to the conditional probability from Example 2:

If the observed serial number is greater than the maximum serial number for this group, then the probability of having that many tanks is 0.
If the observed serial number does not exceed the maximum serial number for that group, the probability is one divided by the number of tanks, multiplied by the normalized probability from the previous step (Fig. 15).

Fig. 15. Conditional probabilities for the distribution of tanks into groups

The normalized probabilities look like this (Fig. 16).

Fig. 16. Normalized probabilities of the number of tanks

There is a large spike in probability for the maximum observed serial number. After that, an asymptotic decrease to zero occurs. For 4 detected serial numbers, the maximum corresponds to 140 tanks. But while this number is the most likely answer, it's not the best estimate, as it almost certainly underestimates the number of tanks.

If we take the weighted average number of tanks, i.e. sum the pairwise multiplied groups and their probabilities for the four tanks by applying the formula:

ROUND(SUMPRODUCT(BD9:DA9,BD14:DA14),0)

we get the best estimate of 193.

If we had initially started from 2000 tanks, the weighted average would be 195 tanks, which changes essentially nothing.
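The tank calculation can be sketched in Python in the same way as the dice examples. The function name `tank_posterior` and its parameters are assumptions made here; the step of 20, the upper limits of 1000 and 2000, and the four serial numbers come from the text.

```python
def tank_posterior(serials, max_tanks=1000, step=20):
    """Posterior over the total number of tanks, in groups of `step`."""
    groups = list(range(step, max_tanks + 1, step))
    probs = [1 / len(groups)] * len(groups)  # all group sizes equally probable
    for serial in serials:
        # A serial number is impossible if it exceeds the group size,
        # otherwise each of the n numbers is equally likely
        weighted = [p * (1.0 / n if serial <= n else 0.0) for p, n in zip(probs, groups)]
        total = sum(weighted)
        probs = [w / total for w in weighted]
    return groups, probs


serials = [30, 70, 140, 125]
groups, probs = tank_posterior(serials)

most_likely = groups[probs.index(max(probs))]
weighted_mean = sum(n * p for n, p in zip(groups, probs))
print(most_likely, round(weighted_mean))

groups2, probs2 = tank_posterior(serials, max_tanks=2000)
weighted_mean_2000 = sum(n * p for n, p in zip(groups2, probs2))
print(round(weighted_mean_2000))
```

The weighted mean plays the role of the SUMPRODUCT formula above, and raising the upper limit to 2000 barely moves it because the posterior falls off as the fourth power of the group size.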

Example 6. Drug testing

You know that 0.5% of the population uses drugs. You have a test that is 99% true positive for drug users and 98% true negative for nonusers. You randomly choose a person, test and get a positive result. What is the probability that the person actually uses drugs?

For our random individual, the initial probability that he is a drug user is 0.5%, and the probability that he is not a drug user is 99.5%.

The next step is to calculate the conditional probability:

If the subject uses drugs, then the test will be positive in 99% of cases and negative in 1% of cases.
If the subject does not use drugs, then the test will be positive in 2% of cases and negative in 98% of cases.

The likelihood functions for drug users and non-users are shown in Fig. 17.

Fig. 17. Likelihood functions: (a) for drug users; (b) for non-drug users

After normalizing, we see that despite the positive test result, the probability that this random person uses drugs is only 0.1992, or 19.9%. This result surprises many people because, after all, the accuracy of the test is quite high, as much as 99%. But since the initial probability was only 0.5%, even a large increase in this probability was not enough to make the result really large.

Most people's intuition doesn't take initial probability into account. Even if the conditional probability is indeed high, a very low initial probability can result in a low final probability. Most people's intuition is set around a 50/50 initial probability. If so, and the test result is positive, then the normalized probability is the expected 98%, confirming that the person is using drugs (Figure 18).
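Both results of Example 6 follow from one short function. This is a sketch; the function name `posterior_user` and the parameter names `sensitivity` and `specificity` (standard terms for the true positive and true negative rates) are introduced here and do not appear in the original note.

```python
def posterior_user(prior, sensitivity=0.99, specificity=0.98):
    """Probability of being a drug user given a positive test result."""
    true_positive = prior * sensitivity               # user and positive test
    false_positive = (1 - prior) * (1 - specificity)  # non-user and positive test
    return true_positive / (true_positive + false_positive)


print(round(posterior_user(0.005), 4))  # 0.1992
print(round(posterior_user(0.5), 2))    # 0.98
```

The only thing that changes between the two calls is the initial probability, which is exactly what intuition tends to ignore.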

Fig. 18. Test result with initial probability 50/50

For an alternative approach to explaining such situations, see .

For a bibliography on Bayes' theorem, see the end of this note.

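If event $A$ can occur only together with one of the events $H_1, H_2, \ldots, H_n$, which form a complete group of mutually exclusive events, then the probability of event $A$ is calculated by the formula

$$P(A) = P(H_1)P(A|H_1) + P(H_2)P(A|H_2) + \ldots + P(H_n)P(A|H_n) = \sum_{i=1}^{n} P(H_i)P(A|H_i).$$

This formula is called the total probability formula.

Consider again the complete group of mutually exclusive events $H_1, H_2, \ldots, H_n$, whose probabilities of occurrence are $P(H_1), P(H_2), \ldots, P(H_n)$. Event $A$ can occur only together with one of the events $H_i$, which we will call hypotheses. Then, according to the total probability formula,

$$P(A) = \sum_{i=1}^{n} P(H_i)P(A|H_i).$$

If event $A$ has occurred, this can change the probabilities of the hypotheses $H_i$.

According to the probability multiplication theorem,

$$P(A H_1) = P(A)P(H_1|A) = P(H_1)P(A|H_1), \quad \text{whence} \quad P(H_1|A) = \frac{P(H_1)P(A|H_1)}{P(A)}.$$

Similarly, for the other hypotheses,

$$P(H_i|A) = \frac{P(H_i)P(A|H_i)}{P(A)}, \quad i = 1, 2, \ldots, n.$$

The resulting formula is called the Bayes formula. The probabilities of the hypotheses $P(H_i|A)$ are called posterior probabilities, whereas $P(H_i)$ are called prior probabilities.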

Example. A store received new products from three enterprises. The composition of these products is as follows: 20% are products of the first enterprise, 30% of the second, and 50% of the third. Further, 10% of the products of the first enterprise are of the highest grade, 5% of those of the second, and 20% of those of the third. Find the probability that a randomly purchased new product is of the highest grade.

Solution. Denote by $B$ the event that a highest-grade product is purchased, and by $H_1$, $H_2$, $H_3$ the events that the purchased product belongs to the first, second and third enterprise, respectively.

We can apply the total probability formula, and in our notation:

$$P(H_1) = 0.2, \quad P(H_2) = 0.3, \quad P(H_3) = 0.5,$$
$$P(B|H_1) = 0.1, \quad P(B|H_2) = 0.05, \quad P(B|H_3) = 0.2.$$

Substituting these values into the total probability formula, we obtain the required probability:

$$P(B) = 0.2 \cdot 0.1 + 0.3 \cdot 0.05 + 0.5 \cdot 0.2 = 0.135.$$
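As a quick check of the store example, the total probability formula can be written as a one-line sum. This is a sketch; the helper name `total_probability` is an assumption, and the shares and highest-grade fractions are those given in the problem statement.

```python
def total_probability(priors, conditionals):
    """Sum of P(hypothesis) * P(event | hypothesis) over all hypotheses."""
    return sum(p * c for p, c in zip(priors, conditionals))


shares = [0.2, 0.3, 0.5]          # share of each enterprise in the delivery
top_grade = [0.10, 0.05, 0.20]    # fraction of highest-grade products per enterprise

p_top = total_probability(shares, top_grade)
print(round(p_top, 3))  # 0.135
```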

Example. One of the three shooters is called to the line of fire and fires two shots. The probability of hitting the target with one shot for the first shooter is 0.3, for the second - 0.5; for the third - 0.8. The target is not hit. Find the probability that the shots were fired by the first shooter.

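Solution. Three hypotheses are possible:

$H_1$: the first shooter is called to the line of fire,

$H_2$: the second shooter is called to the line of fire,

$H_3$: the third shooter is called to the line of fire.

Since calling any shooter to the line of fire is equally possible,

$$P(H_1) = P(H_2) = P(H_3) = \frac{1}{3}.$$

As a result of the experiment, event $B$ was observed: after the two shots, the target was not hit. Assuming the shots are independent, the conditional probabilities of this event under the hypotheses are:

$$P(B|H_1) = 0.7 \cdot 0.7 = 0.49, \quad P(B|H_2) = 0.5 \cdot 0.5 = 0.25, \quad P(B|H_3) = 0.2 \cdot 0.2 = 0.04.$$

Using the Bayes formula, we find the probability of the hypothesis $H_1$ after the experiment:

$$P(H_1|B) = \frac{\frac{1}{3} \cdot 0.49}{\frac{1}{3} \cdot 0.49 + \frac{1}{3} \cdot 0.25 + \frac{1}{3} \cdot 0.04} = \frac{0.49}{0.78} \approx 0.628.$$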
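The shooter example can be verified with a short Python sketch. It assumes, as the problem implies, that the two shots are independent, so the probability of two misses is the square of the single-shot miss probability; the variable names are chosen here for illustration.

```python
hit_prob = [0.3, 0.5, 0.8]   # single-shot hit probability of each shooter
prior = [1 / 3] * 3          # each shooter is equally likely to be called

# Probability that both shots miss, assuming independent shots
two_misses = [(1 - p) ** 2 for p in hit_prob]

weighted = [pr * m for pr, m in zip(prior, two_misses)]
posterior = [w / sum(weighted) for w in weighted]
print([round(p, 3) for p in posterior])  # [0.628, 0.321, 0.051]
```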

Example. On three automatic machines, parts of the same type are processed, which arrive after processing on a common conveyor. The first machine gives 2% rejects, the second - 7%, the third - 10%. The productivity of the first machine is 3 times greater than the productivity of the second, and the third is 2 times less than the second.

a) What is the defect rate on the assembly line?

b) What are the proportions of the parts of each machine among the defective parts on the conveyor?

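Solution. Let's take one part at random from the assembly line and consider event $A$: the part is defective. It is associated with hypotheses as to where this part was machined: $H_i$ means that a randomly selected part was machined on the $i$-th machine, $i = 1, 2, 3$.

Conditional probabilities (in the condition of the problem they are given in the form of percentages):

$$P(A|H_1) = 0.02, \quad P(A|H_2) = 0.07, \quad P(A|H_3) = 0.10.$$

The dependencies between machine productivities mean the following:

$$P(H_1) = 3P(H_2), \quad P(H_3) = \frac{1}{2}P(H_2).$$

And since the hypotheses form a complete group, $P(H_1) + P(H_2) + P(H_3) = 1$.

Having solved the resulting system of equations, we find: $P(H_1) = \frac{2}{3}$, $P(H_2) = \frac{2}{9}$, $P(H_3) = \frac{1}{9}$.

a) The total probability that a part taken at random from the assembly line is defective:

$$P(A) = \frac{2}{3} \cdot 0.02 + \frac{2}{9} \cdot 0.07 + \frac{1}{9} \cdot 0.10 = 0.04.$$

In other words, among the parts coming off the assembly line, the defect rate is 4%.

b) Let it be known that a part taken at random is defective. Using the Bayes formula, we find the conditional probabilities of the hypotheses:

$$P(H_1|A) = \frac{\frac{2}{3} \cdot 0.02}{0.04} = \frac{1}{3} \approx 0.33, \quad P(H_2|A) = \frac{\frac{2}{9} \cdot 0.07}{0.04} = \frac{7}{18} \approx 0.39, \quad P(H_3|A) = \frac{\frac{1}{9} \cdot 0.10}{0.04} = \frac{5}{18} \approx 0.28.$$

Thus, in the total mass of defective parts on the conveyor, the share of the first machine is 33%, the second 39%, and the third 28%.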
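The whole machine example, including the step of turning relative productivities into prior probabilities, can be checked with exact fractions. This is a sketch; the variable names are chosen here, and the productivity of the second machine is taken as the unit.

```python
from fractions import Fraction

# Relative productivities: first = 3 x second, third = second / 2
productivity = [Fraction(3), Fraction(1), Fraction(1, 2)]
prior = [p / sum(productivity) for p in productivity]

# Reject rates of the three machines
reject = [Fraction(2, 100), Fraction(7, 100), Fraction(10, 100)]

# a) total probability of a defective part
weighted = [pr * r for pr, r in zip(prior, reject)]
defect_rate = sum(weighted)

# b) share of each machine among the defective parts (Bayes formula)
shares = [w / defect_rate for w in weighted]

print(prior)        # [Fraction(2, 3), Fraction(2, 9), Fraction(1, 9)]
print(defect_rate)  # 1/25
print(shares)       # [Fraction(1, 3), Fraction(7, 18), Fraction(5, 18)]
```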

Practical tasks

Exercise 1

Solving problems in the main sections of probability theory

The goal is to gain practical skills in solving problems in the main sections of probability theory.

Preparation for the practical task

Get acquainted with the theoretical material on this topic and study the relevant sections in the literature.

Task execution order

Solve 5 problems according to the number of the task option given in Table 1.

Initial data options

Table 1

	task number

The composition of the report for task 1

5 solved problems according to the variant number.

Tasks for independent solution

1. Do the following groups of events form groups of "cases" (exhaustive, mutually exclusive, equally likely events): a) experiment - tossing a coin; events: A1 - heads appears; A2 - tails appears; b) experiment - tossing two coins; events: B1 - two heads appear; B2 - two tails appear; B3 - one head and one tail appear; c) experiment - throwing a die; events: C1 - no more than two points appear; C2 - three or four points appear; C3 - at least five points appear; d) experiment - a shot at a target; events: D1 - hit; D2 - miss; e) experiment - two shots at a target; events: E0 - no hits; E1 - one hit; E2 - two hits; f) experiment - drawing two cards from a deck; events: F1 - two red cards appear; F2 - two black cards appear?

2. An urn contains a white and b black balls. One ball is drawn at random from the urn. Find the probability that this ball is white.

3. An urn contains a white and b black balls. One ball is taken out of the urn and set aside; it turns out to be white. After that, another ball is taken from the urn. Find the probability that this ball is also white.

4. An urn contains a white and b black balls. One ball is taken out of the urn and set aside without looking at it. After that, another ball is taken from the urn; it turns out to be white. Find the probability that the first ball, the one set aside, is also white.

5. From an urn containing a white and b black balls, all the balls except one are taken out one by one. Find the probability that the last ball left in the urn is white.

6. From an urn containing a white and b black balls, all the balls are taken out one after another. Find the probability that the second ball drawn is white.

7. An urn contains a white and b black balls (a > 2). Two balls are taken out of the urn at once. Find the probability that both balls are white.

8. An urn contains a white and b black balls (a > 2, b > 3). Five balls are taken out of the urn at once. Find the probability P that two of them are white and three are black.

9. In a batch of X products, I are defective. From the batch, I products are selected for inspection. Find the probability P that exactly J of them are defective.

10. A die is thrown once. Find the probabilities of the following events: A - an even number of points appears; B - at least 5 points appear; C - no more than 5 points appear.

11. A die is thrown twice. Find the probability P that the same number of points appears both times.

12. Two dice are thrown at the same time. Find the probabilities of the following events: A - the sum of the points is 8; B - the product of the points is 8; C - the sum of the points is greater than their product.

13. Two coins are tossed. Which of the following events is more likely: A - the coins land on the same side; B - the coins land on different sides?

14. An urn contains a white and b black balls (a > 2; b > 2). Two balls are taken out of the urn at the same time. Which event is more likely: A - the balls are of the same color; B - the balls are of different colors?

15. Three players are playing cards. Each of them is dealt 10 cards, and two cards are left in the stock. One of the players sees that he has 6 diamonds and 4 non-diamonds. He discards two of those four cards and takes the stock. Find the probability that he picks up two diamonds.

16. From an urn containing n numbered balls, all the balls are taken out at random one by one. Find the probability that the numbers of the drawn balls come in order: 1, 2, ..., n.

17. The same urn as in the previous problem, but each ball, after being drawn, is put back and mixed with the others, and its number is written down. Find the probability that the natural sequence of numbers 1, 2, ..., n is written down.

18. A full deck of cards (52 cards) is divided at random into two equal packs of 26 cards. Find the probabilities of the following events: A - each pack contains two aces; B - one pack contains no aces and the other contains all four; C - one pack contains one ace and the other contains three.

19. 18 teams participate in a basketball championship, from which two groups of 9 teams each are formed at random. Among the participants there are 5 top-class teams. Find the probabilities of the following events: A - all the top-class teams fall into the same group; B - two top-class teams get into one of the groups, and three into the other.

20. The digits 0, 1, 2, 3, 4, 5, 6, 7, 8 are written on nine cards. Two of them are taken out at random and placed on the table in the order of appearance, then the resulting number is read, for example 07 (seven), 14 (fourteen), etc. Find the probability that the number is even.

21. Numbers are written on five cards: 1, 2, 3, 4, 5. Two of them, one after the other, are taken out. Find the probability that the number on the second card is greater than the number on the first.

22. The same question as in problem 21, but the first card after being drawn is put back and mixed with the rest, and the number on it is written down.

23. An urn contains a white, b black and c red balls. All the balls are taken out of the urn one by one and their colors are written down. Find the probability that white appears before black in this list.

24. There are two urns: the first contains a white and b black balls; the second contains c white and d black balls. A ball is drawn from each urn. Find the probability that both balls are white.

25. Under the conditions of Problem 24, find the probability that the drawn balls will be of different colors.

26. There are seven chambers in the cylinder of a revolver; five of them are loaded with cartridges and two are left empty. The cylinder is spun, so that one of the chambers ends up at random in front of the barrel. Then the trigger is pulled; if the chamber is empty, no shot occurs. Find the probability P that, when this experiment is repeated twice in a row, no shot occurs either time.

27. Under the same conditions (see Problem 26), find the probability that both times the shot will occur.

28. An urn contains k balls labeled 1, 2, ..., k. A ball is drawn from the urn l times (l < k); each time the number of the ball is written down and the ball is put back into the urn. Find the probability P that all the recorded numbers are different.

29. The word "книга" ("book") is composed of five letters of a cut-out alphabet. A child who cannot read scattered these letters and then put them together in random order. Find the probability P that he again got the word "книга".

30. The word "pineapple" is made up of letters of a cut-out alphabet. A child who cannot read scattered these letters and then put them together in random order. Find the probability P that he again got the word "pineapple".

31. From a full deck of cards (52 cards, 4 suits), several cards are taken out at once. How many cards must be taken out in order to say with a probability greater than 0.50 that among them there will be cards of the same suit?

32. N people are randomly seated at a round table (N > 2). Find the probability P that two specified persons A and B sit next to each other.

33. The same problem (see 32), but the table is rectangular, and the N people are seated at random along one of its sides.

34. Numbers from 1 to N are written on barrels. Two of these N barrels are selected at random. Find the probability that numbers less than k are written on both barrels (2

35. Numbers from 1 to N are written on barrels. Two of these N barrels are selected at random. Find the probability that one of the barrels has a number greater than k, and the other a number less than k. (2

36. A battery of M guns fires at a group of N targets (M < N). The guns choose their targets sequentially, at random, provided that no two guns fire at the same target. Find the probability P that targets number 1, 2, ..., M will be fired upon.

37. A battery of k guns fires at a group of l aircraft (k < 2). Each gun selects its target at random and independently of the others. Find the probability that all k guns fire at the same target.

38. Under the conditions of the previous problem, find the probability that all guns will fire at different targets.

39. Four balls are randomly scattered over four holes; each ball hits one or another hole with the same probability and independently of the others (there are no obstacles to getting several balls into the same hole). Find the probability that there will be three balls in one of the holes, one - in the other, and no balls in the other two holes.

40. Masha quarreled with Petya and does not want to ride with him in the same bus. There are 5 buses from the hostel to the institute from 7 to 8. Those who do not have time for these buses are late for the lecture. In how many ways can Masha and Petya get to the institute on different buses and not be late for the lecture?

41. There are 3 analysts, 10 programmers and 20 engineers in the information technology department of the bank. For overtime on a holiday, the head of the department must allocate one employee. In how many ways can this be done?

42. The head of the security service of the bank must daily place 10 guards in 10 posts. In how many ways can this be done?

43. The new president of the bank must appoint 2 new vice presidents from among the 10 directors. In how many ways can this be done?

44. One of the warring parties captured 12, and the other - 15 prisoners. In how many ways can 7 prisoners of war be exchanged?

45. Petya and Masha collect video discs. Petya has 30 comedies, 80 action films and 7 melodramas, Masha has 20 comedies, 5 action films and 90 melodramas. In how many ways can Petya and Masha exchange 3 comedies, 2 action films and 1 melodrama?

46. Under the conditions of Problem 45, in how many ways can Petya and Masha exchange 3 melodramas and 5 comedies?

47. Under the conditions of problem 45, in how many ways can Petya and Masha exchange 2 action films and 7 comedies.

48. One of the warring parties captured 15, and the other - 16 prisoners. In how many ways can 5 prisoners of war be exchanged?

49. How many cars can be registered in one city if a license plate number has 3 digits and 3 letters?

50. One of the warring parties captured 14, and the other - 17 prisoners. In how many ways can 6 prisoners of war be exchanged?

51. How many different words can be formed by rearranging the letters in the word "mother"?

52. There are 3 red and 7 green apples in a basket. One apple is taken out of it. Find the probability that it will be red.

53. There are 3 red and 7 green apples in a basket. One green apple was taken out of it and set aside. Then 1 more apple is taken out of the basket. What is the probability that this apple is green?

54. In a batch of 1,000 items, 4 are defective. For inspection, a sample of 100 items is selected. What is the probability that the sample contains no defective items?

56. In the 80s, the sportloto 5 out of 36 game was popular in the USSR. The player noted on the card 5 numbers from 1 to 36 and received prizes of various denominations if he guessed a different number of numbers announced by the draw commission. Find the probability that the player did not guess any number.

57. In the 80s, the game “sportloto 5 out of 36” was popular in the USSR. The player noted on the card 5 numbers from 1 to 36 and received prizes of various denominations if he guessed a different number of numbers announced by the draw commission. Find the probability that the player guessed one number.

58. In the 80s, the sportloto 5 out of 36 game was popular in the USSR. The player noted on the card 5 numbers from 1 to 36 and received prizes of various denominations if he guessed a different number of numbers announced by the draw commission. Find the probability that the player guessed 3 numbers.

59. In the 80s, the sportloto 5 out of 36 game was popular in the USSR. The player noted on the card 5 numbers from 1 to 36 and received prizes of various denominations if he guessed a different number of numbers announced by the draw commission. Find the probability that the player did not guess all 5 numbers.

60. In the 80s, the sportloto 6 out of 49 game was popular in the USSR. The player noted on the card 6 numbers from 1 to 49 and received prizes of various denominations if he guessed a different number of numbers announced by the draw commission. Find the probability that the player guessed 2 numbers.

61. In the 80s, the game "sportloto 6 out of 49" was popular in the USSR. The player noted on the card 6 numbers from 1 to 49 and received prizes of various denominations if he guessed a different number of numbers announced by the draw commission. Find the probability that the player did not guess any number.

62. In the 80s, the game "sportloto 6 out of 49" was popular in the USSR. The player noted on the card 6 numbers from 1 to 49 and received prizes of various denominations if he guessed a different number of numbers announced by the draw commission. Find the probability that the player guessed all 6 numbers.

63. In a batch of 1,000 items, 4 are defective. For inspection, a sample of 100 items is selected. What is the probability that the sample contains exactly 1 defective item?

64. How many different words can be formed by rearranging the letters in the word "book"?

65. How many different words can be formed by rearranging the letters in the word "pineapple"?

66. 6 people entered the elevator, and the hostel has 7 floors. What is the probability that all 6 people exit on the same floor?

67. 6 people entered the elevator, the building has 7 floors. What is the probability that all 6 people exit on different floors?

68. During a thunderstorm, a wire break occurred on the section between 40 and 79 km of the power line. Assuming that the break is equally possible at any point, find the probability that the break occurred between the 40th and 45th kilometers.

69. On the 200-kilometer section of the gas pipeline, a gas leak occurs between compressor stations A and B, which is equally possible at any point in the pipeline. What is the probability that the leak occurs within 20 km of A

70. On the 200-kilometer section of the gas pipeline, a gas leak occurs between compressor stations A and B, which is equally possible at any point in the pipeline. What is the probability that the leak is closer to A than to B?

71. The radar of the traffic police inspector has an accuracy of 10 km / h and rounds to the nearest side. What happens more often - rounding in favor of the driver or the inspector?

72. Masha spends 40 to 50 minutes on her way to the institute, and any time in this interval is equally probable. What is the probability that she will spend on the road from 45 to 50 minutes.

73. Petya and Masha agreed to meet at the monument to Pushkin from 12 to 13 hours, but no one could indicate the exact time of arrival. They agreed to wait for each other for 15 minutes. What is the probability of their meeting?

74. Fishermen caught 120 fish in the pond, 10 of them were ringed. What is the probability of catching a ringed fish?

75. From a basket containing 3 red and 7 green apples, take out all the apples in turn. What is the probability that the 2nd apple is red?

76. From a basket containing 3 red and 7 green apples, take out all the apples in turn. What is the probability that the last apple is green?

77. Students consider that out of 50 tickets 10 are “good”. Petya and Masha take turns pulling one ticket each. What is the probability that Masha got a "good" ticket?

78. Students consider that out of 50 tickets 10 are “good”. Petya and Masha take turns pulling one ticket each. What is the probability that they both got a "good" ticket?

79. Masha came to the exam knowing the answers to 20 questions of the program out of 25. The professor asks 3 questions. What is the probability that Masha will answer 3 questions?

80. Masha came to the exam knowing the answers to 20 questions of the program out of 25. The professor asks 3 questions. What is the probability that Masha will not answer any of the questions?

81. Masha came to the exam knowing the answers to 20 questions of the program out of 25. The professor asks 3 questions. What is the probability that Masha will answer 1 question?

82. The statistics of bank loan applications are as follows: 10% come from government agencies, 20% from other banks, and the rest from individuals. The probabilities of loan default are 0.01, 0.05 and 0.2, respectively. What proportion of loans is not repaid?

83. The probability that the weekly turnover of an ice cream merchant exceeds 2000 rubles is 80% in clear weather, 50% in partly cloudy weather and 10% in rainy weather. What is the probability that the turnover will exceed 2000 rubles if the probability of clear weather is 20%, and of partly cloudy and rainy weather 40% each?

84. An urn contains a white and c black balls. Two balls are taken out of the urn (simultaneously or sequentially). Find the probability that both balls are white.

85. In urn A whites and B

86. In urn A whites and B

87. An urn contains a white and b black balls. One ball is taken out of the urn, its color is noted and the ball is returned to the urn. After that, another ball is taken from the urn. Find the probability that these balls are of different colors.

88. There is a box with nine new tennis balls. Three balls are taken for the game; after the game they are put back. When choosing balls, they do not distinguish between played and unplayed balls. What is the probability that after three games there will be no unplayed balls in the box?

89. Leaving the apartment, N each guest will put on their own galoshes;

90. Leaving the apartment, N guests with the same shoe size put on galoshes in the dark. Each of them can distinguish the right galosh from the left, but cannot distinguish his own from someone else's. Find the probability that each guest will put on galoshes belonging to one pair (maybe not their own).

91. Under the conditions of problem 90, find the probability that everyone will leave in their galoshes if the guests cannot distinguish the right galoshes from the left and simply take the first two galoshes that come across.

92. An aircraft is being fired upon; its vulnerable parts are two engines and the cockpit. To hit (disable) the aircraft, it is enough to hit both engines together or the cockpit. Under the given firing conditions, the probability of hitting the first engine is p1, the second engine p2, and the cockpit p3. The parts of the aircraft are hit independently of each other. Find the probability that the aircraft will be hit.

93. Two shooters, independently of one another, fire two shots each (each at his own target). The probability of hitting the target with one shot is p1 for the first shooter and p2 for the second. The winner of the competition is the shooter whose target has more holes. Find the probability P that the first shooter wins.

94. behind a space object, the object is detected with a probability R. Object detection in each cycle occurs independently of the others. Find the probability that when P cycles the object will be detected.

95. 32 letters of the Russian alphabet are written on cut-out alphabet cards. Five cards are drawn at random, one after the other, and placed on the table in the order in which they appear. Find the probability that the word "конец" ("end") is obtained.

96. Two balls are scattered randomly and independently of each other over four cells located one after the other in a straight line. Each ball with the same probability 1/4 hits each cell. Find the probability that the balls will fall into neighboring cells.

97. Incendiary projectiles are being fired at the aircraft. The fuel on the aircraft is concentrated in four tanks located in the fuselage one after the other. Tank sizes are the same. In order to ignite the aircraft, it is enough to hit two shells either in the same tank or in neighboring tanks. It is known that two shells hit the tank area. Find the probability that the plane will catch fire.

98. From a full deck of cards (52 cards), four cards are taken out at once. Find the probability that all four of these cards are of the same suit.

99. From a full deck of cards (52 cards), four cards are taken out, but each card is returned to the deck after being taken out. Find the probability that all four cards are of the same suit.

100. When the ignition is turned on, the engine starts with a probability R.

101. A device can operate in two modes: 1) normal and 2) abnormal. The normal mode is observed in 80% of all cases of device operation, the abnormal mode in 20%. The probability of device failure during time t is 0.1 in the normal mode and 0.7 in the abnormal mode. Find the total probability P of device failure.

102. A store receives goods from 3 suppliers: 55% from the 1st, 20% from the 2nd and 25% from the 3rd. The defect rates are 5, 6 and 8 percent, respectively. What is the probability that a purchased defective product came from the 2nd supplier?

103. The flow of vehicles past a gas station consists of 60% trucks and 40% passenger cars. What is the probability of finding a truck at the gas station if the probability that a truck refuels is 0.1 and the probability that a passenger car refuels is 0.3?

104. The flow of vehicles past a gas station consists of 60% trucks and 40% passenger cars. What is the probability of finding a truck at the gas station if the probability that a truck refuels is 0.1 and the probability that a passenger car refuels is 0.3?

105. A store receives goods from 3 suppliers: 55% from the 1st, 20% from the 2nd and 25% from the 3rd. The defect rates are 5, 6 and 8 percent, respectively. What is the probability that a purchased defective product came from the 1st supplier?

106. 32 letters of the Russian alphabet are written on cut-out alphabet cards. Five cards are drawn at random, one after the other, and placed on the table in the order in which they appear. Find the probability of getting the word "книга" ("book").

107. A store receives goods from 3 suppliers: 55% from the 1st, 20% from the 2nd and 25% from the 3rd. The defect rates are 5, 6 and 8 percent, respectively. What is the probability that a purchased defective product came from the 1st supplier?

108. Two balls are scattered randomly and independently of each other over four cells located one after the other in a straight line. Each ball with the same probability 1/4 hits each cell. Find the probability that 2 balls fall into the same cell

109. When the ignition is turned on, the engine starts with probability p. Find the probability that the engine starts the second time the ignition is turned on.

110. Incendiary projectiles are fired at the aircraft. The fuel on the aircraft is concentrated in four tanks located in the fuselage one after the other. Tank sizes are the same. In order to ignite the aircraft, it is enough to hit two shells in the same tank. It is known that two shells hit the tank area. Find the probability that the plane will catch fire

111. Incendiary projectiles are fired at the aircraft. The fuel on the aircraft is concentrated in four tanks located in the fuselage one after the other. Tank sizes are the same. In order to ignite the aircraft, it is enough to hit two shells in neighboring tanks. It is known that two shells hit the tank area. Find the probability that the plane will catch fire

112. An urn contains a white and b black balls. One ball is taken out of the urn, its color is noted and the ball is returned to the urn. After that, another ball is taken from the urn. Find the probability that both balls drawn are white.

113. An urn contains a white and b black balls. Two balls are taken out of the urn at once. Find the probability that these balls are of different colors.

114. Two balls are scattered randomly and independently of each other over four cells located one after the other in a straight line. Each ball with the same probability 1/4 hits each cell. Find the probability that the balls will fall into neighboring cells.

115. Masha came to the exam knowing the answers to 20 questions of the program out of 25. The professor asks 3 questions. What is the probability that Masha will answer 2 questions?

116. Students consider that out of 50 tickets 10 are “good”. Petya and Masha take turns pulling one ticket each. What is the probability that they both got a "good" ticket?

117. The statistics of bank loan applications are as follows: 10% come from government agencies, 20% from other banks, and the rest from individuals. The probabilities of loan default are 0.01, 0.05 and 0.2, respectively. What proportion of loans is not repaid?

118. 32 letters of the Russian alphabet are written on cut-out alphabet cards. Five cards are drawn at random, one after the other, and placed on the table in the order in which they appear. Find the probability that the word "конец" ("end") is obtained.

119. The statistics of bank loan applications are as follows: 10% come from government agencies, 20% from other banks, and the rest from individuals. The probabilities of loan default are 0.01, 0.05 and 0.2, respectively. What proportion of loans is not repaid?

120. The probability that the weekly turnover of an ice cream merchant exceeds 2000 rubles is 80% in clear weather, 50% in partly cloudy weather and 10% in rainy weather. What is the probability that the turnover will exceed 2000 rubles if the probability of clear weather is 20%, and of partly cloudy and rainy weather 40% each?

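Suppose an event $A$ can occur only together with one of the mutually exclusive hypotheses $H_1, H_2, \dots, H_n$, which form a complete group. Let their probabilities $P(H_1), P(H_2), \dots, P(H_n)$ and the corresponding conditional probabilities $P(A \mid H_1), P(A \mid H_2), \dots, P(A \mid H_n)$ be known. Then the probability of the event $A$ occurring is:

$P(A) = P(H_1)P(A \mid H_1) + P(H_2)P(A \mid H_2) + \dots + P(H_n)P(A \mid H_n)$

This formula is called the total probability formula. In textbooks it is stated as a theorem whose proof is elementary: by the algebra of events, $A = H_1 A + H_2 A + \dots + H_n A$ (hypothesis $H_1$ occurred and then event $A$ occurred, or hypothesis $H_2$ occurred and then event $A$ occurred, or ..., or hypothesis $H_n$ occurred and event $A$ followed). Since the hypotheses are mutually exclusive and the event $A$ is dependent on them, by the addition theorem for the probabilities of mutually exclusive events (first step) and the multiplication theorem for the probabilities of dependent events (second step):

$P(A) = P(H_1 A) + P(H_2 A) + \dots + P(H_n A) = P(H_1)P(A \mid H_1) + P(H_2)P(A \mid H_2) + \dots + P(H_n)P(A \mid H_n)$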
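The total probability formula translates directly into a few lines of code. The sketch below is illustrative only: the function name `total_probability` and the sample numbers are assumptions made here, not taken from the lesson.

```python
import math


def total_probability(priors, conditionals):
    """Return the sum over hypotheses of prior * conditional probability."""
    if len(priors) != len(conditionals):
        raise ValueError("need one conditional probability per hypothesis")
    # The hypotheses must form a complete group: their priors sum to 1.
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("prior probabilities must sum to 1")
    return sum(p * c for p, c in zip(priors, conditionals))


# Hypothetical numbers, chosen only to illustrate the call.
example = total_probability([0.5, 0.3, 0.2], [0.1, 0.2, 0.4])
print(example)
```

The check on the sum of the priors mirrors the "complete group" requirement: if it fails, the formula does not apply.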

Many of you have probably guessed what the first example will be about =)

Wherever you look, there is an urn:

Task 1

There are three identical urns. The first urn contains 4 white and 7 black balls, the second urn contains only white balls, and the third urn contains only black balls. One urn is chosen at random and a ball is drawn from it at random. What is the probability that this ball is black?

Solution: consider the event $A$ – a black ball is drawn from a randomly selected urn. This event can occur only together with one of the following hypotheses:
$H_1$ – the 1st urn is chosen;
$H_2$ – the 2nd urn is chosen;
$H_3$ – the 3rd urn is chosen.

Since the urn is chosen at random, each of the three urns is equally likely to be chosen, and consequently:
$P(H_1) = P(H_2) = P(H_3) = \tfrac{1}{3}$

Note that these hypotheses form a complete group of events; that is, by the condition, a black ball can come only from these urns and cannot, for example, fly in from a billiard table. Let's do a simple intermediate check:
$P(H_1) + P(H_2) + P(H_3) = \tfrac{1}{3} + \tfrac{1}{3} + \tfrac{1}{3} = 1$
OK, let's move on:

The first urn contains 4 white + 7 black = 11 balls, so by the classical definition:
$P(A \mid H_1) = \tfrac{7}{11}$ is the probability of drawing a black ball on condition that the 1st urn is selected.

The second urn contains only white balls, so if it is chosen, the appearance of a black ball becomes impossible: $P(A \mid H_2) = 0$.

And, finally, the third urn contains only black balls, which means that the corresponding conditional probability of drawing a black ball is $P(A \mid H_3) = 1$ (the event is certain).

By the total probability formula:
$P(A) = \tfrac{1}{3} \cdot \tfrac{7}{11} + \tfrac{1}{3} \cdot 0 + \tfrac{1}{3} \cdot 1 = \tfrac{7}{33} + \tfrac{11}{33} = \tfrac{18}{33} = \tfrac{6}{11}$ is the probability that a black ball will be drawn from a randomly selected urn.

Answer: $\tfrac{6}{11}$
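As a quick check of Task 1, the same calculation can be done with exact fractions. This is a minimal sketch; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Each of the three urns is chosen with the same probability.
priors = [Fraction(1, 3)] * 3

# Probability of a black ball from each urn:
# urn 1 has 7 black out of 11, urn 2 has no black balls, urn 3 has only black.
black_given_urn = [Fraction(7, 11), Fraction(0), Fraction(1)]

# Intermediate check: the hypotheses form a complete group.
assert sum(priors) == 1

# Total probability of drawing a black ball.
p_black = sum(p * c for p, c in zip(priors, black_given_urn))
print(p_black)  # 6/11
```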

This example again shows how important it is to UNDERSTAND THE CONDITION. Take the same problems with urns and balls: despite their outward similarity, the solution methods can be completely different. In some you only need the classical definition of probability, in some the events are independent, in some they are dependent, and in some we are dealing with hypotheses. At the same time, there is no clear formal criterion for choosing the solution path; you almost always have to think about it. How to improve your skills? Solve, solve and solve again!

Task 2

There are 5 different rifles in the shooting range. The probabilities of hitting the target for a given shooter are respectively equal to 0.5; 0.55; 0.7; 0.75 and 0.4. What is the probability of hitting the target if the shooter fires one shot from a randomly selected rifle?

Short solution and answer at the end of the lesson.

In most thematic problems, the hypotheses are, of course, not equally probable:

Task 3

There are 5 rifles in the rifle rack, three of which are equipped with an optical sight. The probability that the shooter hits the target when firing a rifle with an optical sight is 0.95; for a rifle without an optical sight, this probability is 0.7. Find the probability that the target will be hit if the shooter fires one shot from a rifle taken at random.

Solution: in this problem the number of rifles is exactly the same as in the previous one, but there are only two hypotheses:
$H_1$ – the shooter picks a rifle with an optical sight;
$H_2$ – the shooter picks a rifle without an optical sight.
By the classical definition of probability: $P(H_1) = \tfrac{3}{5} = 0.6$, $P(H_2) = \tfrac{2}{5} = 0.4$.
Check: $P(H_1) + P(H_2) = 0.6 + 0.4 = 1$

Consider the event $A$ – the shooter hits the target with a randomly selected rifle.
By the condition: $P(A \mid H_1) = 0.95$, $P(A \mid H_2) = 0.7$.

By the total probability formula:
$P(A) = P(H_1)P(A \mid H_1) + P(H_2)P(A \mid H_2) = 0.6 \cdot 0.95 + 0.4 \cdot 0.7 = 0.57 + 0.28 = 0.85$

Answer: 0.85

In practice, a shortened layout of the solution, which you are also familiar with, is quite acceptable:

Solution: by the classical definition, $P(H_1) = \tfrac{3}{5} = 0.6$ and $P(H_2) = \tfrac{2}{5} = 0.4$ are the probabilities of choosing a rifle with and without an optical sight, respectively.

By the condition, $P(A \mid H_1) = 0.95$ and $P(A \mid H_2) = 0.7$ are the probabilities of hitting the target with the respective types of rifles.

By the total probability formula:
$P(A) = 0.6 \cdot 0.95 + 0.4 \cdot 0.7 = 0.57 + 0.28 = 0.85$ is the probability that the shooter will hit the target with a randomly selected rifle.

Answer: 0.85
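The answer to Task 3 can also be sanity-checked by simulation. The following is a rough Monte Carlo sketch, not part of the original solution; the function name, the number of trials and the seed are arbitrary choices made here.

```python
import random


def simulate_hit_rate(trials, seed=0):
    """Estimate the hit probability by repeating the experiment many times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Pick one of the 5 rifles at random; 3 of them have an optical sight.
        has_sight = rng.randrange(5) < 3
        # Hit probability depends on the type of rifle picked.
        p_hit = 0.95 if has_sight else 0.7
        if rng.random() < p_hit:
            hits += 1
    return hits / trials


estimate = simulate_hit_rate(200_000)
print(estimate)  # should be close to the exact answer 0.85
```

The estimate fluctuates around the exact value; with this many trials the deviation is typically in the third decimal place.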

The next task is for independent solution:

Task 4

The engine operates in three modes: normal, forced and idling. In idle mode, the probability of its failure is 0.05, in normal mode - 0.1, and in forced mode - 0.7. 70% of the time the engine runs in normal mode, and 20% in forced mode. What is the probability of engine failure during operation?

Just in case, let me remind you: to get the probabilities, the percentages must be divided by 100. Be very careful! In my experience, the statements of problems on the total probability formula are often deliberately made confusing, and I specifically chose such an example. I'll tell you a secret: I almost got confused myself =)

Solution at the end of the lesson (formulated in a short way)

Problems on Bayes' formulas

The material is closely related to the content of the previous section. Suppose the event $A$ has occurred as a result of one of the hypotheses $H_1, H_2, \dots, H_n$. How do we determine the probability that a particular hypothesis took place?

On condition that the event $A$ has already happened, the probabilities of the hypotheses are re-evaluated by the formulas named after the English clergyman Thomas Bayes:

$P(H_1 \mid A) = \dfrac{P(H_1)P(A \mid H_1)}{P(A)}$ - the probability that hypothesis $H_1$ took place;
$P(H_2 \mid A) = \dfrac{P(H_2)P(A \mid H_2)}{P(A)}$ - the probability that hypothesis $H_2$ took place;
…
$P(H_n \mid A) = \dfrac{P(H_n)P(A \mid H_n)}{P(A)}$ - the probability that hypothesis $H_n$ took place,

where $P(A) = P(H_1)P(A \mid H_1) + P(H_2)P(A \mid H_2) + \dots + P(H_n)P(A \mid H_n)$ is found by the total probability formula.

At first glance, this seems completely absurd: why recalculate the probabilities of the hypotheses if they are already known? But in fact there is a difference:

$P(H_1), P(H_2), \dots, P(H_n)$ are a priori probabilities (estimated before the trial).

$P(H_1 \mid A), P(H_2 \mid A), \dots, P(H_n \mid A)$ are a posteriori probabilities (estimated after the trial) of the same hypotheses, recalculated in light of "newly discovered circumstances", that is, taking into account the fact that the event $A$ has happened.
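The re-evaluation of hypotheses can be sketched as a small function. The name `posterior` is an illustrative choice, and the sample data are the two-ball urn example from the beginning of the note (three equally likely urn compositions, a white ball drawn).

```python
from fractions import Fraction


def posterior(priors, conditionals):
    """Re-evaluate hypothesis probabilities after the event has occurred."""
    # Joint probability of each hypothesis together with the event.
    joint = [p * c for p, c in zip(priors, conditionals)]
    # Total probability of the event (the denominator of Bayes' formula).
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("the event is impossible under every hypothesis")
    return [j / evidence for j in joint]


# A priori: two white, one white and one black, two black (1/3 each).
priors = [Fraction(1, 3)] * 3
# Probability of drawing a white ball under each hypothesis.
white_given_urn = [Fraction(1), Fraction(1, 2), Fraction(0)]

result = posterior(priors, white_given_urn)
print(result)  # a posteriori probabilities: 2/3, 1/3, 0
```

Note that the a posteriori probabilities again sum to 1, so the hypotheses still form a complete group after the re-evaluation.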

Let's look at this difference with a specific example:

Task 5

The warehouse received 2 batches of products: the first - 4000 pieces, the second - 6000 pieces. The average percentage of non-standard products in the first batch is 20%, and in the second - 10%. A product taken at random from the warehouse turned out to be standard. Find the probability that it is: a) from the first batch, b) from the second batch.

The first part of the solution uses the total probability formula. In other words, the calculations are carried out under the assumption that the trial has not yet been performed and the event "the product turned out to be standard" has not yet occurred.

Let's consider two hypotheses:
$B_1$ - a product taken at random will be from the 1st batch;
$B_2$ - a product taken at random will be from the 2nd batch.

Total: 4000 + 6000 = 10000 items in stock. According to the classical definition:
$P(B_1) = \frac{4000}{10000} = 0.4$, $P(B_2) = \frac{6000}{10000} = 0.6$.

Check: $P(B_1) + P(B_2) = 0.4 + 0.6 = 1$

Consider the dependent event $A$ - an item taken at random from the warehouse will be standard.

In the first batch 100% - 20% = 80% of the products are standard, therefore $P(A|B_1) = 0.8$ is the probability that an item taken at random will be standard, given that it belongs to the 1st batch.

Similarly, in the second batch 100% - 10% = 90% of the products are standard, and $P(A|B_2) = 0.9$ is the probability that an item taken at random will be standard, given that it belongs to the 2nd batch.

According to the total probability formula:
$P(A) = P(B_1) P(A|B_1) + P(B_2) P(A|B_2) = 0.4 \cdot 0.8 + 0.6 \cdot 0.9 = 0.32 + 0.54 = 0.86$ is the probability that a product chosen at random from the warehouse will be standard.

Part two. Suppose that a product taken at random from the warehouse turned out to be standard. This phrase is stated directly in the condition, and it records the fact that the event $A$ has happened.

According to Bayes formulas:

a) $P(B_1|A) = \dfrac{P(B_1) P(A|B_1)}{P(A)} = \dfrac{0.32}{0.86} = \dfrac{16}{43} \approx 0.372$ - the probability that the selected standard product belongs to the 1st batch;

b) $P(B_2|A) = \dfrac{P(B_2) P(A|B_2)}{P(A)} = \dfrac{0.54}{0.86} = \dfrac{27}{43} \approx 0.628$ - the probability that the selected standard product belongs to the 2nd batch.

After the re-evaluation the hypotheses, of course, still form a complete group:
$P(B_1|A) + P(B_2|A) = \frac{16}{43} + \frac{27}{43} = 1$ (check ;-))

Answer: a) $\frac{16}{43} \approx 0.372$, b) $\frac{27}{43} \approx 0.628$
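The two-step scheme of Task 5 (first the total probability, then the re-evaluation) can be replayed as a short script. This is only a sketch: the dictionary keys and variable names are my own, while the numbers are taken from the condition.

```python
from fractions import Fraction

# Data from the condition of Task 5.
batch_sizes = {"batch 1": 4000, "batch 2": 6000}
nonstandard_rate = {"batch 1": Fraction(20, 100), "batch 2": Fraction(10, 100)}

# A priori probabilities by the classical definition: batch size / total stock.
total = sum(batch_sizes.values())
prior = {b: Fraction(n, total) for b, n in batch_sizes.items()}

# Conditional probability of a standard item within each batch.
p_std_given = {b: 1 - r for b, r in nonstandard_rate.items()}

# Step 1: total probability that a randomly taken item is standard.
p_standard = sum(prior[b] * p_std_given[b] for b in prior)

# Step 2: Bayes formulas, given that the item turned out to be standard.
posterior = {b: prior[b] * p_std_given[b] / p_standard for b in prior}

for b in prior:
    print(b, prior[b], "->", posterior[b], f"(~{float(posterior[b]):.3f})")
```

The printed pairs show each a priori probability next to its a posteriori value, which makes the direction of the re-evaluation easy to see.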

Ivan Vasilyevich, who has changed his profession again and become the director of the plant, will help us understand the meaning of re-evaluating hypotheses. He knows that today the 1st workshop shipped 4000 items to the warehouse and the 2nd workshop 6000 items, and he comes to make sure of this. Suppose all the products are of the same type and lie in the same container. Naturally, Ivan Vasilyevich has calculated beforehand that the item he is about to take out for inspection was produced by the 1st workshop with probability 0.4 and by the 2nd with probability 0.6. But after the selected item turns out to be standard, he exclaims: "What a fine bolt! It was more likely made by the 2nd workshop." Thus, the probability of the second hypothesis is revised upward, from 0.6 to about 0.628, and the probability of the first hypothesis is revised downward, from 0.4 to about 0.372. And this revision is not unfounded: after all, the 2nd workshop not only produced more items, but also works twice as well!

Pure subjectivism, you say? Partly yes; moreover, Bayes himself interpreted a posteriori probabilities as a degree of belief. However, it is not that simple: there is an objective core in the Bayesian approach. After all, the probabilities that a product will be standard (0.8 and 0.9 for the 1st and 2nd workshops, respectively) are preliminary (a priori) and average estimates. But, speaking philosophically, everything flows and everything changes, including probabilities. It is quite possible that at the time of the study the more successful 2nd workshop had raised its percentage of standard products (and/or the 1st workshop had lowered its own), and if you check more items, or all 10 thousand items in stock, the re-evaluated values will turn out to be much closer to the truth.

By the way, if Ivan Vasilyevich takes out a non-standard part, then the opposite happens: he will "suspect" the 1st workshop more and the 2nd less. I suggest you check this for yourself:

Task 6

The warehouse received 2 batches of products: the first - 4000 pieces, the second - 6000 pieces. The average percentage of non-standard products in the first batch is 20%, in the second - 10%. A product taken at random from the warehouse turned out to be not standard. Find the probability that it is: a) from the first batch, b) from the second batch.

The condition differs from that of Task 5 only in the word "not" (two letters in the original). The problem can be solved from scratch, or you can use the results of the previous calculations. In the sample I carried out a complete solution, but to avoid a formal overlap with Task 5, the event "a product taken at random from the warehouse will be non-standard" is denoted there by a different letter.
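Once you have solved Task 6 by hand, a sketch like the following can serve as a self-check. It repeats the same two steps with the conditional probabilities of a non-standard item; the variable names are my own.

```python
from fractions import Fraction

# A priori probabilities of the two batches (4000 and 6000 items out of 10000).
priors = [Fraction(4000, 10000), Fraction(6000, 10000)]
# Conditional probabilities of a NON-standard item in each batch (20% and 10%).
p_nonstandard_given = [Fraction(20, 100), Fraction(10, 100)]

# Total probability that a randomly taken item is non-standard.
joint = [p * q for p, q in zip(priors, p_nonstandard_given)]
p_event = sum(joint)

# Bayes formulas: re-evaluated probabilities of the 1st and 2nd batch.
posteriors = [j / p_event for j in joint]

assert sum(posteriors) == 1  # the hypotheses still form a complete group
print(p_event, posteriors)
```

Compare the output with the a priori values 0.4 and 0.6 to see in which direction each hypothesis has moved.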

The Bayesian scheme of re-evaluating probabilities is found everywhere, and it is also actively exploited by all kinds of scammers. Consider a joint-stock company with a three-letter name that has become a household word: it attracts deposits from the population, allegedly invests them somewhere, regularly pays dividends, and so on. What happens? Day after day, month after month passes, and more and more facts, conveyed through advertising and word of mouth, only increase the level of confidence in the financial pyramid (a posteriori Bayesian re-evaluation due to past events!). That is, in the eyes of depositors the probability that "this is a serious firm" keeps growing, while the probability of the opposite hypothesis ("these are just more scammers"), of course, keeps decreasing. The rest, I think, is clear. It is noteworthy that the earned reputation gives the organizers time to hide successfully from Ivan Vasilyevich, who is left not only without a batch of bolts, but also without pants.

We will return to no less interesting examples a little later, but for now, perhaps the most common case with three hypotheses is next in line:

Task 7

Electric lamps are manufactured at three factories. The 1st plant produces 30% of the total number of lamps, the 2nd - 55%, and the 3rd - the rest. The products of the 1st plant contain 1% of defective lamps, the 2nd - 1.5%, the 3rd - 2%. The store receives products from all three factories. The lamp I bought was defective. What is the probability that it was produced by plant 2?

Note that in problems on Bayes formulas the condition always contains some event that has already happened, in this case the purchase of a lamp.

There are more events now, and it is more convenient to write the solution in a "fast" style.

The algorithm is exactly the same: at the first step, we find the probability that the purchased lamp will be defective.

Using the initial data, we convert the percentages into probabilities:
$P(B_1) = 0.3$, $P(B_2) = 0.55$, $P(B_3) = 0.15$ are the probabilities that the lamp was produced by the 1st, 2nd and 3rd plants, respectively.
Check: $0.3 + 0.55 + 0.15 = 1$

Similarly: $P(A|B_1) = 0.01$, $P(A|B_2) = 0.015$, $P(A|B_3) = 0.02$ are the probabilities of manufacturing a defective lamp at the respective plants.

According to the total probability formula:

$P(A) = 0.3 \cdot 0.01 + 0.55 \cdot 0.015 + 0.15 \cdot 0.02 = 0.003 + 0.00825 + 0.003 = 0.01425$ - the probability that the purchased lamp will be defective.

Step two. Let the purchased lamp be defective (the event $A$ has happened).

According to the Bayes formula:
$P(B_2|A) = \dfrac{P(B_2) P(A|B_2)}{P(A)} = \dfrac{0.00825}{0.01425} = \dfrac{11}{19} \approx 0.5789$ - the probability that the purchased defective lamp was manufactured by the second plant.

Answer: $\frac{11}{19} \approx 0.5789$

Why did the initial probability of the 2nd hypothesis increase after the re-evaluation (from 0.55 to about 0.5789)? After all, the second plant produces lamps of average quality (the first one is better, the third one is worse). So why did the a posteriori probability that the defective lamp comes from the 2nd plant go up? This is no longer due to "reputation", but to size. Since plant No. 2 produced the largest number of lamps, it gets the blame (at least subjectively): "most likely, this defective lamp is from there".

It is interesting to note that the probabilities of the 1st and 3rd hypotheses were re-evaluated in the expected directions and became equal:
$P(B_1|A) = \dfrac{0.003}{0.01425} = \dfrac{4}{19} \approx 0.2105$ (down from 0.3), $P(B_3|A) = \dfrac{0.003}{0.01425} = \dfrac{4}{19} \approx 0.2105$ (up from 0.15).

Check: $\frac{4}{19} + \frac{11}{19} + \frac{4}{19} = 1$, which was to be verified.
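The three-hypothesis calculation of Task 7 can be replayed as follows. This is a sketch with names of my own choosing; the share of the 3rd plant is not given directly in the condition and is computed here as the remainder.

```python
# Task 7 data: plant number -> (share of total output, share of defective lamps).
# Assumption: "the rest" for the 3rd plant means 100% - 30% - 55%.
plants = {
    1: (0.30, 0.010),
    2: (0.55, 0.015),
    3: (1.0 - 0.30 - 0.55, 0.020),
}

# Step 1: total probability that a purchased lamp is defective.
p_defective = sum(share * rate for share, rate in plants.values())

# Step 2: Bayes formula for every plant, given that the lamp is defective.
posterior = {k: share * rate / p_defective for k, (share, rate) in plants.items()}

for k, (share, _) in plants.items():
    print(f"plant {k}: a priori {share:.2f} -> a posteriori {posterior[k]:.4f}")
```

Printing all three a posteriori values at once makes the effect discussed in the text visible: the largest producer gains, while the best and the worst plants end up level.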

Speaking of downward and upward revisions:

Task 8

In the student group, 3 people have a high level of training, 19 people have an average level and 3 people have a low level. The probabilities of passing the exam successfully for these students are respectively: 0.95; 0.7 and 0.4. It is known that some student passed the exam. What is the probability that:

a) he was very well prepared;
b) was moderately prepared;
c) was poorly prepared.

Perform calculations and analyze the results of reevaluation of hypotheses.

The task is close to reality and is especially plausible for a group of part-time students, where the teacher knows practically nothing about the abilities of any particular student. In this case the result can have rather unexpected consequences (especially for exams in the 1st semester). If a poorly prepared student gets lucky with the exam ticket, the teacher is likely to consider him a good or even a strong student, which will pay good dividends in the future (of course, he will need to "raise the bar" and maintain that image). If a student studied, crammed and revised for 7 days and 7 nights but was simply unlucky, then further events can develop in the worst possible way, with numerous retakes and balancing on the verge of expulsion.

Needless to say, reputation is the most important capital. It is no coincidence that many corporations bear the names of their founding fathers, who ran the business 100-200 years ago and became famous for their impeccable reputation.

Yes, the Bayesian approach is subjective to a certain extent, but ... that's how life works!

Let's consolidate the material with a final industrial example, in which I will talk about the technical subtleties of the solution that have not yet been encountered:

Task 9

Three workshops of the plant produce parts of the same type, which are collected in a common container for assembly. It is known that the first workshop produces 2 times more parts than the second workshop, and 4 times more than the third workshop. In the first workshop the defect rate is 12%, in the second - 8%, in the third - 4%. For control, one part is taken from the container. What is the probability that it will be defective? What is the probability that the extracted defective part was produced by the 3rd workshop?
So Ivan Vasilyevich is back in the saddle after all =) The film must have a happy ending =)

Solution: in contrast to Tasks No. 5-8, here a question is asked explicitly that is answered using the total probability formula. On the other hand, the condition is a little "encrypted", and the school skill of setting up the simplest equations will help us solve this puzzle. It is convenient to take the smallest value as "x":

Let $x$ be the share of parts produced by the third workshop.

According to the condition, the first workshop produces 4 times more than the third workshop, so the share of the 1st workshop is $4x$.

In addition, the first workshop produces 2 times more products than the second workshop, which means that the share of the latter is $\frac{4x}{2} = 2x$.

Let's set up and solve the equation:
$4x + 2x + x = 1 \;\Rightarrow\; 7x = 1 \;\Rightarrow\; x = \frac{1}{7}$

Thus: $P(B_1) = \frac{4}{7}$, $P(B_2) = \frac{2}{7}$, $P(B_3) = \frac{1}{7}$ are the probabilities that the part removed from the container was produced by the 1st, 2nd and 3rd workshops, respectively.

Check: $\frac{4}{7} + \frac{2}{7} + \frac{1}{7} = 1$. In addition, it will not hurt to look again at the phrase "It is known that the first workshop produces 2 times more parts than the second workshop, and 4 times more than the third workshop" and make sure that the obtained probabilities really correspond to this condition.

It was also possible to take the share of the 1st or of the 2nd workshop as "x": the probabilities will come out the same. But, one way or another, the most difficult part is behind us, and the solution is back on track:

From the condition we find:
$P(A|B_1) = 0.12$, $P(A|B_2) = 0.08$, $P(A|B_3) = 0.04$ - the probabilities of manufacturing a defective part for the corresponding workshops (here $A$ is the event that the extracted part is defective).

According to the total probability formula:
$P(A) = \frac{4}{7} \cdot 0.12 + \frac{2}{7} \cdot 0.08 + \frac{1}{7} \cdot 0.04 = \frac{0.48 + 0.16 + 0.04}{7} = \frac{0.68}{7} = \frac{17}{175} \approx 0.0971$ is the probability that a part randomly extracted from the container will be defective.

Question two: what is the probability that the extracted defective part was produced by the 3rd workshop? This question assumes that the part has already been extracted and found to be defective. We re-evaluate the hypothesis using the Bayes formula:
$P(B_3|A) = \dfrac{P(B_3) P(A|B_3)}{P(A)} = \dfrac{\frac{1}{7} \cdot \frac{4}{100}}{\frac{17}{175}} = \dfrac{\frac{1}{175}}{\frac{17}{175}} = \dfrac{1}{17} \approx 0.0588$ is the desired probability. This is quite expected: after all, the third workshop not only produces the smallest share of parts, but also leads in quality!

In this case I had to simplify a four-story fraction, which has to be done quite often in problems on Bayes formulas. But for this lesson I somehow happened to pick examples in which many calculations can be done without ordinary fractions.

Since the condition has no items "a" and "b", it is better to supply the answer with text comments:

Answer: $\frac{17}{175} \approx 0.0971$ - the probability that the part removed from the container will be defective; $\frac{1}{17} \approx 0.0588$ - the probability that the extracted defective part was produced by the 3rd workshop.
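The whole of Task 9, including the "decryption" of the shares, can be checked with exact fractions. The sketch below is my own arrangement of the solution: the list `weights` expresses each workshop's output in units of the third workshop's share x.

```python
from fractions import Fraction

# Output of workshops 1, 2, 3 in units of x (the share of the 3rd workshop):
# workshop 1 makes 4x, workshop 2 makes half of that (2x), workshop 3 makes x.
weights = [Fraction(4), Fraction(4, 2), Fraction(1)]

# The shares must add up to 1: 4x + 2x + x = 1, hence x = 1 / 7.
x = 1 / sum(weights)
priors = [w * x for w in weights]

# Defect rates from the condition: 12%, 8%, 4%.
defect_rates = [Fraction(12, 100), Fraction(8, 100), Fraction(4, 100)]

# Question 1: total probability that the extracted part is defective.
p_defective = sum(p * d for p, d in zip(priors, defect_rates))

# Question 2: Bayes formula for the 3rd workshop, given a defective part.
p_third_given_defective = priors[2] * defect_rates[2] / p_defective

print(priors)
print(p_defective, float(p_defective))
print(p_third_given_defective, float(p_third_given_defective))
```

With `Fraction` the "four-story" fraction is simplified automatically, so there is no rounding to worry about.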

As you can see, the problems on the total probability formula and Bayes formulas are quite simple, and probably for this reason their conditions are so often made deliberately complicated, as I already mentioned at the beginning of the article.

Additional examples are in the file with ready-made solutions on the total probability formula and Bayes formulas; besides, there are probably those who wish to get more deeply acquainted with this topic in other sources. And the topic is really very interesting: take Bayes' paradox alone, which justifies the everyday advice that if a person is diagnosed with a rare disease, it makes sense to undergo a second and even a third independent examination. It would seem that people do this solely out of desperation ... but no! But let's not talk about sad things.
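The lesson does not give numbers for this paradox, so the following is only an illustrative sketch of the effect it seems to refer to. The prevalence, the probability of a positive result for a sick person and the false positive rate are all hypothetical values that I made up, and the repeated examinations are assumed to be independent given the patient's true state.

```python
def update(prior, p_pos_if_sick, p_pos_if_healthy):
    """One Bayes re-evaluation of P(sick) after a positive test result."""
    numerator = prior * p_pos_if_sick
    # The denominator is the total probability of a positive result.
    return numerator / (numerator + (1 - prior) * p_pos_if_healthy)


# All three numbers are hypothetical and chosen only for illustration.
prevalence = 0.001            # a priori probability of the rare disease
sensitivity = 0.99            # P(positive | sick)
false_positive_rate = 0.05    # P(positive | healthy)

history = []
p = prevalence
for test_number in range(1, 4):
    # The a posteriori probability after one test becomes the prior for the next.
    p = update(p, sensitivity, false_positive_rate)
    history.append(p)
    print(f"after positive test {test_number}: P(sick) = {p:.3f}")
```

Under these assumed numbers a single positive result still leaves the disease unlikely, because the a priori probability is so small, while each further independent positive result raises the a posteriori probability considerably.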

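$P(A) = P(B_1) P(A|B_1) + P(B_2) P(A|B_2) + P(B_3) P(A|B_3) = 0.12 \cdot 0.95 + 0.76 \cdot 0.7 + 0.12 \cdot 0.4 = 0.114 + 0.532 + 0.048 = 0.694$ is the probability that a randomly selected student will pass the exam. Here $B_1$, $B_2$, $B_3$ are the hypotheses that the student has a high, average or low level of training, with $P(B_1) = \frac{3}{25} = 0.12$, $P(B_2) = \frac{19}{25} = 0.76$, $P(B_3) = \frac{3}{25} = 0.12$, and $A$ is the event that the student passes the exam.
Let the student pass the exam. According to Bayes formulas:
a) $P(B_1|A) = \dfrac{0.114}{0.694} = \dfrac{57}{347} \approx 0.1643$ - the probability that the student who passed the exam was prepared very well. The initial probability of 0.12 is revised upward, since almost always some "average" students are lucky with the questions and answer very strongly, which gives the erroneous impression of impeccable preparation.
b) $P(B_2|A) = \dfrac{0.532}{0.694} = \dfrac{266}{347} \approx 0.7666$ - the probability that the student who passed the exam was moderately prepared. The initial probability of 0.76 is revised slightly upward, because students with an average level of preparation are usually the majority; in addition, the teacher will count here the "excellent students" who answered unsuccessfully, and occasionally a poorly prepared student who was very lucky with the ticket.
c) $P(B_3|A) = \dfrac{0.048}{0.694} = \dfrac{24}{347} \approx 0.0692$ - the probability that the student who passed the exam was poorly prepared. The initial probability of 0.12 is revised downward. Not surprising.
Check: $\frac{57}{347} + \frac{266}{347} + \frac{24}{347} = 1$
Answer: a) $\approx 0.1643$, b) $\approx 0.7666$, c) $\approx 0.0692$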
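The numbers in this solution can be verified with a short script. It is a sketch with my own variable names; the group sizes and pass probabilities are those from the condition of Task 8.

```python
from fractions import Fraction

# Task 8 data: level -> (number of students, probability of passing the exam).
group = {
    "high": (3, Fraction(95, 100)),
    "average": (19, Fraction(70, 100)),
    "low": (3, Fraction(40, 100)),
}

# A priori probabilities: the share of each level in the group of 25 students.
n_students = sum(count for count, _ in group.values())
prior = {level: Fraction(count, n_students) for level, (count, _) in group.items()}

# Total probability that a randomly selected student passes the exam.
p_pass = sum(prior[level] * p for level, (_, p) in group.items())

# Bayes formulas: re-evaluated probabilities given that the student passed.
posterior = {level: prior[level] * p / p_pass for level, (_, p) in group.items()}

for level in group:
    print(f"{level}: {float(prior[level]):.4f} -> {float(posterior[level]):.4f}")
```

The output lists each initial probability next to its re-evaluated value, so the upward and downward revisions discussed in items a) to c) can be read off directly.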