# Big Data and the Law of Large Numbers

As an exercise, I suggest you write the code to generate the LLN and CLT plots I shared. Feel free to play with small and large samples to see what happens. An AI specially trained to handle even the smallest variations in medical data would be the ideal tool for finding patterns and consistencies that would otherwise elude a human specialist. Watson, IBM's Jeopardy!-winning powerhouse AI, may well have made mistakes due to poor initial data, but it at least demonstrated the core concept of using machine learning to diagnose medical problems and recommend treatment. At this point, the importance of the law of large numbers for artificial intelligence should be becoming clearer: if an AI is designed to work with data to perform tasks where a human should or could, the system must arrive at its most logical interpretation using all the relevant information available. The strong law shows that the sample average will almost surely converge in this way. Note that this does not imply that, with probability 1, for each ε > 0 the inequality $|\overline{X}_n - \mu| < \varepsilon$ holds for all sufficiently large n, because the convergence is not necessarily uniform on the set where it holds.
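The suggested exercise can be sketched as follows. This is a minimal simulation, not the author's original code: it tracks the running mean of fair die rolls, which the LLN predicts will converge to 3.5; the seed and roll counts are arbitrary choices.

```python
import numpy as np

# Simulate rolling a fair six-sided die and track the running mean.
# The LLN predicts the running mean converges to E[X] = 3.5.
rng = np.random.default_rng(seed=42)

n_rolls = 100_000
rolls = rng.integers(1, 7, size=n_rolls)  # uniform on {1, ..., 6}
running_mean = np.cumsum(rolls) / np.arange(1, n_rolls + 1)

print(f"mean after 10 rolls:      {running_mean[9]:.4f}")
print(f"mean after 1,000 rolls:   {running_mean[999]:.4f}")
print(f"mean after 100,000 rolls: {running_mean[-1]:.4f}")
# Plotting running_mean against the roll count (e.g. with matplotlib)
# reproduces the familiar LLN convergence curve.
```

Rerunning with small `n_rolls` shows how erratic the average is early on, which is the point of the exercise.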

The following sections show how big data plays an important role in the accuracy of results and in improving the performance of the algorithms used.

The simplest example of the law of large numbers is a die. A die has six different outcomes, each with equal probability. The expected value of a roll is E(X) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5. Similar to the other examples above, the law of large numbers in psychology translates into how a greater number of trials often leads to a more accurate expected value: the more studies conducted, the closer the projection is to a correct medical evaluation.

After Bernoulli and Poisson published their efforts, other mathematicians also contributed to refining the law, including Chebyshev, Markov, Borel, Cantelli, Kolmogorov, and Khinchin. Markov showed that the law can apply to a random variable without finite variance under a different, weaker hypothesis, and Khinchin showed in 1929 that if the series consists of independent, identically distributed random variables, the existence of the expected value is enough for the weak law of large numbers to hold. These further studies led to two important forms of the LLN: one is called the "weak" law and the other the "strong" law, referring to two different modes of convergence of the cumulative sample averages to the expected value. In particular, as explained below, the strong form implies the weak form.

The law of small numbers is the theory that humans underestimate the variability of small sample sizes. This means that people who study a sample that is too small tend to draw conclusions about the population that the undersized sample cannot support.
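For reference, the two forms mentioned above are usually stated like this (standard formulations, not reproduced from this article):

```latex
% Weak law: convergence in probability
\overline{X}_n \xrightarrow{P} \mu, \quad\text{i.e.}\quad
\lim_{n\to\infty} \Pr\!\left(\left|\overline{X}_n - \mu\right| > \varepsilon\right) = 0
\quad \text{for every } \varepsilon > 0.

% Strong law: almost-sure convergence
\Pr\!\left(\lim_{n\to\infty} \overline{X}_n = \mu\right) = 1.
```

Almost-sure convergence implies convergence in probability, which is why the strong form implies the weak one.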

The law of large numbers suggests that as Tesla continues to grow, it will become more difficult for the company to maintain this level of productivity. For example, assuming a stable growth rate over the next few years, it quickly becomes clear that Tesla simply cannot maintain its current growth trajectory, because the underlying dollar values become unmanageably large. In this business sense, the law of large numbers states that as a company grows, it becomes harder to sustain its earlier growth rates, so the company's growth rate declines as it continues to expand. The law can be applied to various financial measures such as market capitalization, revenue, and net income.

Likewise, if a coin is tossed often enough, then, because the two outcomes are equally likely, the law of large numbers comes into play and the counts of heads and tails will be close to equal.

In this example, mileage data is averaged to represent and optimize driving paths and guidelines. Recorded videos and images are analyzed repeatedly by the AI, so it ends up predicting the visuals with a reliable probability. Even data on other cars' driving decisions on the road is averaged to help the AI make better predictions about what other drivers are most likely to do in the near future. This process can be repeated as more data is collected.
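The coin-toss claim above is easy to check numerically. A minimal sketch (seed and flip counts are arbitrary): the proportion of heads approaches 1/2 as the number of flips grows, even though short runs can be lopsided.

```python
import numpy as np

# Fair coin: heads is encoded as 1, tails as 0.
# The proportion of heads should drift toward 0.5 as flips accumulate.
rng = np.random.default_rng(seed=0)

for n_flips in (10, 1_000, 100_000):
    heads = rng.integers(0, 2, size=n_flips).sum()
    print(f"{n_flips:>7} flips: {heads} heads ({heads / n_flips:.3f})")
```

Note that the *counts* of heads and tails need not become equal; it is their *proportions* that converge.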

The basic structure of a machine learning AI is exactly what the law of large numbers represents as a mathematical theorem, only translated into a more operational format. There are, of course, variations in the exact information as well as its practical applications, but the basic concept remains consistent.

Four different curves are displayed, representing data trained with (i) a traditional machine learning algorithm (black), (ii) a small neural network (NN) (blue), (iii) a medium NN (green), and (iv) a large NN (red). All the curves clearly show algorithm performance increasing with the amount of data until it reaches a plateau.

In this case, the law of large numbers was used to perform the tasks of extraction, classification, and prediction: extraction, by organizing unstructured information from seemingly unrelated medical records; classification, by analyzing repetitive patterns and events, no matter how tiny, insignificant, or even discreet they may be; and finally, prediction, by proposing a treatment based on what the AI considers the most likely information to turn out accurate and true.

If a person wanted to determine the average of a dataset of 100 possible values, they are more likely to get an accurate average by choosing 20 data points instead of relying on just two. This is because two data points have a greater probability of being outliers or unrepresentative of the mean, while 20 data points are far less likely to all be unrepresentative. The law of large numbers in probability and statistics states that as the sample size increases, the sample average approaches the average of the entire population. This is because the sample becomes more representative of the population as it grows.
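The 2-point versus 20-point comparison can be sketched directly. This is an illustrative simulation with made-up numbers (a hypothetical population of 100 normally distributed values): it measures how much sample means of each size scatter around the true mean.

```python
import numpy as np

# Hypothetical dataset of 100 values (mean 50, spread 10).
rng = np.random.default_rng(seed=1)
population = rng.normal(loc=50, scale=10, size=100)

def mean_spread(sample_size, trials=5_000):
    """Std. deviation of sample means across many random samples."""
    means = [rng.choice(population, size=sample_size, replace=False).mean()
             for _ in range(trials)]
    return np.std(means)

spread_2 = mean_spread(2)
spread_20 = mean_spread(20)
print(f"std of 2-point means:  {spread_2:.2f}")
print(f"std of 20-point means: {spread_20:.2f}")
```

The 20-point means cluster much more tightly around the population average, which is exactly the claim in the paragraph above.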

The strong law of large numbers can even be viewed as a special case of the pointwise ergodic theorem. This view justifies the intuitive interpretation of the expected value (for Lebesgue integration only) of a random variable, when it is repeatedly sampled, as a "long-term average". The law of large numbers also allows insurance companies to refine the criteria used to set premiums by analyzing which characteristics lead to higher risk. The result is likewise useful for deriving the consistency of a large class of estimators (see extremum estimators).

In addition to accuracy, the performance of algorithms depends on the size of the data. As the amount of data increases, the algorithms are pushed to fit the data correctly, minimizing error. (The more data, the less likely statistical errors become.) Figure 3 shows a schematic diagram illustrating the performance of algorithms as a function of the amount of data.

In sexual reproduction, the chance of a single microscopic sperm reaching the egg to fertilize it is very slim. Thus, at each encounter, sperm are released several million at a time (in mammals), which raises the chance of fertilization to a near-certain event.
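The insurance point can be made concrete with a toy pool. All figures here are invented for illustration (a 5% chance of a $10,000 claim per policy, so the expected cost is $500 per policy): the average cost per policyholder stabilizes as the pool grows, which is what lets an insurer price premiums reliably.

```python
import numpy as np

# Toy insurance pool: each policy independently produces a $10,000
# claim with probability 0.05, i.e. an expected cost of $500.
rng = np.random.default_rng(seed=7)

for n_policies in (10, 1_000, 100_000):
    claims = rng.random(n_policies) < 0.05
    avg_cost = claims.sum() * 10_000 / n_policies
    print(f"{n_policies:>7} policies: average cost per policy ${avg_cost:,.0f}")
```

With only 10 policies the average cost per policy is wildly unstable; with 100,000 it sits close to the $500 expectation, so a premium slightly above that covers the risk.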

Neural networks (deep learning) are used in AI applications. First, these neural networks are trained by working through a set of training data. As shown in Figure 1, the training set must represent the entire sample space to get an accurate result, and for this reason the algorithms are trained on large datasets.

The LLN is important because it guarantees stable long-term results for the averages of certain random events. For example, while a casino can lose money on a single spin of the roulette wheel, its earnings tend toward a predictable percentage over a large number of spins. Any winning streak by a player is eventually overcome by the parameters of the game. It is important to note that the law (as the name suggests) only applies when a large number of observations are considered. There is no principle that a small number of observations will coincide with the expected value, or that a streak of one value will immediately be "balanced" by the others (see the gambler's fallacy).
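The casino example can be simulated. A minimal sketch of a European roulette even-money bet (assumed here purely for illustration): a $1 bet on red wins with probability 18/37, so the player's expected return per spin is −1/37 ≈ −$0.027, and that house edge emerges over many spins.

```python
import numpy as np

# European roulette: 18 red pockets out of 37. An even-money $1 bet
# on red returns +$1 on a win and -$1 on a loss.
rng = np.random.default_rng(seed=3)

n_spins = 1_000_000
wins = rng.random(n_spins) < 18 / 37
player_profit = np.where(wins, 1, -1).sum()
print(f"player profit per spin: ${player_profit / n_spins:.4f}")
```

Over a million spins the per-spin result lands near −$0.027; over ten spins it can easily be positive, which is why a single night at the table proves nothing.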

And if the studies incorporate a selection bias typical of human economic/rational behavior, the law of large numbers does not help remove the bias: even if the number of studies is increased, the selection bias persists. With the increased use of smartphones and other devices, we now have access to large datasets that were not available in the past. These records represent the true distribution of the problem we are trying to solve.

The law can be rephrased as "large numbers also deceive", which is counterintuitive to a descriptive statistician. Specifically, skeptic Penn Jillette said, "Million-to-one odds happen eight times a day in New York" (population about 8,000,000).

There are two different versions of the law of large numbers, described below. They are called the strong law of large numbers and the weak law of large numbers. Stated for the case where X1, X2, … is an infinite sequence of independent and identically distributed (i.i.d.) Lebesgue-integrable random variables with expected value E(X1) = E(X2) = … = μ, both versions of the law state that the sample average

$$\overline{X}_n = \frac{1}{n}(X_1 + \cdots + X_n)$$

converges to the expected value μ. The strong law of large numbers can be defined more precisely with a few terms from calculus.

In statistical analysis, the law of large numbers is related to the central limit theorem. The central limit theorem states that as the sample size increases, the distribution of the sample mean approaches a normal distribution. This is often represented as a bell-shaped curve, where the peak of the curve marks the mean and the rest of the sample data fall off to the left and right of it.
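The CLT claim above can be checked numerically. A minimal sketch (the exponential distribution and sizes are arbitrary choices): means of samples drawn from a skewed distribution cluster symmetrically around the population mean, even though the individual draws are not bell-shaped at all.

```python
import numpy as np

# Draw 10,000 samples of size 50 from a skewed distribution
# (exponential, population mean 1.0) and look at the sample means.
rng = np.random.default_rng(seed=5)

sample_size = 50
n_samples = 10_000
samples = rng.exponential(scale=1.0, size=(n_samples, sample_size))
sample_means = samples.mean(axis=1)

print(f"mean of sample means: {sample_means.mean():.3f}  (population mean: 1.0)")
print(f"std of sample means:  {sample_means.std():.3f}  "
      f"(theory: 1/sqrt(50) ≈ {1 / np.sqrt(50):.3f})")
```

A histogram of `sample_means` is the bell curve the paragraph describes, which is the CLT plot suggested as an exercise at the top of this section.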