Normal distribution

The normal distribution, also called the Gaussian distribution, is an extremely important probability distribution in many fields, especially in physics and engineering. It is a family of distributions of the same general form, differing in their location and scale parameters: the mean ("average") and standard deviation ("variability"), respectively. The standard normal distribution is the normal distribution with a mean of zero and a standard deviation of one (the green curves in the plots to the right). It is often called the bell curve because the graph of its probability density resembles a bell.

Overview

The normal distribution is a convenient model of quantitative phenomena in the natural and behavioral sciences. A variety of psychological test scores and physical phenomena like photon counts have been found to approximately follow a normal distribution. While the underlying causes of these phenomena are often unknown, the use of the normal distribution can be theoretically justified in situations where many small effects are added together into a score or variable that can be observed.

The normal distribution also arises in many areas of statistics: for example, the sampling distribution of the mean is approximately normal, even if the distribution of the population the sample is taken from is not normal. In addition, the normal distribution maximizes information entropy among all distributions with known mean and variance, which makes it the natural choice of underlying distribution for data summarized in terms of sample mean and variance. The normal distribution is the most widely used family of distributions in statistics, and many statistical tests are based on the assumption of normality. In probability theory, normal distributions arise as the limiting distributions of several continuous and discrete families of distributions.
History

The normal distribution was first introduced by de Moivre in an article in 1733 (reprinted in the second edition of his The Doctrine of Chances, 1738) in the context of approximating certain binomial distributions for large n. His result was extended by Laplace in his book Analytical Theory of Probabilities (1812), and is now called the Theorem of de Moivre-Laplace.

Laplace used the normal distribution in the analysis of errors of experiments. The important method of least squares was introduced by Legendre in 1805. Gauss, who claimed to have used the method since 1794, justified it rigorously in 1809 by assuming a normal distribution of the errors.

The name "bell curve" goes back to Jouffret, who used the term "bell surface" in 1872 for a bivariate normal with independent components. The name "normal distribution" was coined independently by Charles S. Peirce, Francis Galton and Wilhelm Lexis around 1875. This terminology is unfortunate, since it reflects and encourages the fallacy that many or all probability distributions are "normal". (See the discussion of "occurrence" below.)

That the distribution is called the normal or Gaussian distribution is an instance of Stigler's law of eponymy.

Specification of the normal distribution

There are various ways to specify a random variable. The most visual is the probability density function (plot at the top), which represents how likely each value of the random variable is. The cumulative distribution function is a conceptually cleaner way to specify the same information, but to the untrained eye its plot is much less informative (see below). Equivalent ways to specify the normal distribution are: the moments, the cumulants, the characteristic function, the moment-generating function, and the cumulant-generating function. Some of these are very useful for theoretical work, but not intuitive. See probability distribution for a discussion.

All of the cumulants of the normal distribution are zero, except the first two.
Probability density function
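The probability density function of the normal distribution with mean μ and variance σ² (equivalently, standard deviation σ) is an example of a Gaussian function,

$$f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$

If a random variable X has this distribution, we write X ~ N(μ, σ²). If μ = 0 and σ = 1, the distribution is called the standard normal distribution and the probability density function reduces to

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2}\right).$$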
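As a minimal sketch, the density of a normal distribution can be evaluated directly from its mean and standard deviation. The function name `normal_pdf` and its argument names are illustrative choices, not part of the article.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma at x.

    Uses f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi)).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    # Peak of the standard normal density, equal to 1 / sqrt(2 pi).
    print(normal_pdf(0.0))
    # A general normal density is the standard one shifted and rescaled.
    print(normal_pdf(12.0, mu=10.0, sigma=2.0), normal_pdf(1.0) / 2.0)
```

With the defaults `mu=0.0` and `sigma=1.0` the function evaluates the standard normal density.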
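Some notable qualities of the normal distribution: the density function is symmetric about its mean μ; the mean is also its mode and median; and the inflection points of the curve lie one standard deviation away from the mean, at μ − σ and μ + σ.

Cumulative distribution function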
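The cumulative distribution function (cdf) is defined as the probability that a variable X has a value less than or equal to x, and it is expressed in terms of the density function as

$$F(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\!\left(-\frac{(u-\mu)^2}{2\sigma^2}\right) du.$$

The moment generating function is defined as the expected value of exp(tX). For a normal distribution, it can be shown that the moment generating function is

$$M_X(t) = \mathrm{E}\!\left[e^{tX}\right] = \exp\!\left(\mu t + \frac{\sigma^2 t^2}{2}\right),$$

as can be seen by completing the square in the exponent.

Characteristic function

The characteristic function is defined as the expected value of exp(itX), where i is the imaginary unit. For a normal distribution, the characteristic function is

$$\chi_X(t) = \mathrm{E}\!\left[e^{itX}\right] = \exp\!\left(i\mu t - \frac{\sigma^2 t^2}{2}\right).$$

Properties

Some of the properties of the normal distribution: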
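If X ~ N(μ, σ²), then

$$Z = \frac{X-\mu}{\sigma}$$

is a standard normal random variable: Z ~ N(0, 1). An important consequence is that the cdf of a general normal distribution is therefore

$$\Pr(X \le x) = \Phi\!\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right],$$

where Φ denotes the cdf of the standard normal distribution.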
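The reduction of a general normal cdf to the standard one can be sketched with the error function from Python's standard library. The helper names `standardize`, `normal_cdf` and `prob_within` are assumptions made for this illustration.

```python
import math


def standardize(x, mu, sigma):
    """Map a value of an N(mu, sigma^2) variable to the standard normal scale."""
    return (x - mu) / sigma


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via the standard normal cdf and erf."""
    z = standardize(x, mu, sigma)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return normal_cdf(k) - normal_cdf(-k)


if __name__ == "__main__":
    # The same probability computed on the original and on the standard scale.
    print(normal_cdf(12.0, mu=10.0, sigma=2.0), normal_cdf(1.0))
    for k in (1, 2, 3):
        print(k, round(prob_within(k), 4))
```

Because only the standardized value enters the computation, a single table (or a single `erf` routine) serves every choice of mean and standard deviation.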
The standard normal distribution has been tabulated, and the other normal distributions are simple transformations of the standard one.

Generating normal random variables

For computer simulations, it is often useful to generate values that have a normal distribution. The Box-Muller transform takes two uniformly distributed values as input and maps them to two normally distributed values. It is a consequence of the fact that the sum of the squares of two independent standard normal variables has a chi-square distribution with two degrees of freedom, which is an easily generated exponential random variable.

The central limit theorem
The practical importance of the central limit theorem is that the normal distribution can be used as an approximation to some other distributions.
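One simple instance of such an approximation is the sum of twelve independent uniform values, which has mean 6 and variance 1 and is therefore roughly standard normal after subtracting 6. The sketch below compares that approximation with the Box-Muller transform, which produces normal values from two uniform inputs. All function names and the sample size are assumptions chosen for illustration.

```python
import math
import random


def box_muller(u1, u2):
    """Map uniform values u1 in (0, 1] and u2 in [0, 1) to two independent standard normal values."""
    r = math.sqrt(-2.0 * math.log(u1))  # r^2 is exponential with mean 2
    theta = 2.0 * math.pi * u2          # uniformly distributed angle
    return r * math.cos(theta), r * math.sin(theta)


def sample_box_muller(n, rng):
    """Draw n standard normal values using the Box-Muller transform."""
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so that log(u1) is defined
        u2 = rng.random()
        out.extend(box_muller(u1, u2))
    return out[:n]


def clt_normal(rng):
    """Approximately standard normal value: sum of 12 uniforms minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0


def mean_and_sd(values):
    """Sample mean and (population-style) standard deviation."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var)


if __name__ == "__main__":
    rng = random.Random(0)
    exact = sample_box_muller(20000, rng)
    approx = [clt_normal(rng) for _ in range(20000)]
    print("Box-Muller mean, sd:", mean_and_sd(exact))
    print("12-uniform mean, sd:", mean_and_sd(approx))
    # The uniform-sum approximation can never leave [-6, 6].
    print("largest |value| from uniform sum:", max(abs(v) for v in approx))
```

The uniform-sum values are bounded, so they cannot reproduce the far tails of the normal distribution, whereas the Box-Muller output is not bounded in this way.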
It is typically the case that such approximations are less accurate in the tails of the distribution.

Infinite divisibility

The normal distributions are infinitely divisible probability distributions.

Standard deviation

[Figure: areas under the normal curve. Values within one standard deviation of the mean account for about 68% of the set, while two standard deviations from the mean (blue and brown) account for 95% and three standard deviations (blue, brown and green) account for 99.7%.]

In practice, one often assumes that data are from an approximately normally distributed population. If that assumption is justified, then about 68% of the values are within one standard deviation of the mean, about 95% are within two standard deviations, and about 99.7% lie within three standard deviations. This is known as the "68-95-99.7 rule".

Related distributions

Occurrence

Approximately normal distributions occur in many situations, as a result of the central limit theorem.

Effects can also act as multiplicative (rather than additive) modifications. In that case, the assumption of normality is not justified, and it is the logarithm of the variable of interest that is normally distributed. The distribution of the directly observed variable is then called log-normal.

Finally, if there is a single external influence which has a large effect on the variable under consideration, the assumption of normality is not justified either. This is true even if, when the external variable is held constant, the resulting conditional distributions are indeed normal. The full distribution will be a superposition of normal variables, which is not in general normal. This is related to the theory of errors (see below).

To summarize, approximate normality is sometimes assumed in the situations discussed below.

Of relevance to biology and economics is the fact that complex systems tend to display power laws rather than normality.
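The contrast between additive and multiplicative effects can be illustrated with a small simulation. In this sketch each outcome is the product of many independent positive factors, so its logarithm is a sum of many small terms. The factor range, sample sizes and function names are assumptions chosen only for illustration.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment divided by the 1.5 power of the variance."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def multiplicative_outcomes(n_samples, n_factors, rng, low=0.8, high=1.25):
    """Each outcome is a product of n_factors independent factors drawn uniformly from [low, high]."""
    outcomes = []
    for _ in range(n_samples):
        value = 1.0
        for _ in range(n_factors):
            value *= rng.uniform(low, high)
        outcomes.append(value)
    return outcomes


if __name__ == "__main__":
    rng = random.Random(1)
    xs = multiplicative_outcomes(5000, 50, rng)
    # The raw outcomes are strongly right-skewed, unlike a normal variable.
    print("skewness of outcomes:     ", skewness(xs))
    # Their logarithms are sums of many small terms and look roughly symmetric.
    print("skewness of log(outcomes):", skewness([math.log(x) for x in xs]))
```

A skewness near zero for the logarithms, together with a clearly positive skewness for the raw values, is what a log-normal model would lead one to expect.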
Photon counting

Light intensity from a single source varies with time, as thermal fluctuations can be observed if the light is analyzed at sufficiently high time resolution. The intensity is usually assumed to be normally distributed. In the classical theory of optical coherence, light is modelled as an electromagnetic wave, and correlations are observed and analyzed up to the second order, consistently with the assumption of normality. (See Gaussian stochastic process.) However, non-classical correlations are sometimes observed.

Quantum mechanics interprets measurements of light intensity as photon counting. The natural assumption in this setting is the Poisson distribution. When light intensity is integrated over times longer than the coherence time and is large, the Poisson-to-normal limit is appropriate. Correlations are interpreted in terms of "bunching" and "anti-bunching" of photons with respect to the expected Poisson behaviour. Anti-bunching requires a quantum model of light emission.

Ordinary light sources producing light by thermal emission display a so-called blackbody spectrum (of intensity as a function of frequency), and the number of photons at each frequency follows a Bose-Einstein distribution (a geometric distribution). The coherence time of thermal light is exceedingly short, and so a Poisson distribution is appropriate in most cases, even when the intensity is so low as to preclude the approximation by a normal.

Laser light has an exactly normal intensity distribution and long coherence times. The photon distribution is exactly Poisson, and the large intensities make it appropriate to use the normal distribution.

It is interesting that the classical model of light correlations applies only to laser light, which is a macroscopic quantum phenomenon. On the other hand, "ordinary" light sources do not follow the "classical" model or the normal distribution.
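The Poisson-to-normal limit mentioned for large photon counts can be checked numerically. The sketch below compares the Poisson probability at its mean with the density of a normal distribution that has the same mean and variance. The function names and the chosen mean counts are assumptions for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(N = k) for a Poisson count with mean lam, computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def relative_gap(lam):
    """Relative difference at k = lam between the Poisson pmf and the matching normal density."""
    k = int(lam)
    p = poisson_pmf(k, lam)
    q = normal_pdf(k, lam, math.sqrt(lam))  # Poisson variance equals its mean
    return abs(p - q) / p


if __name__ == "__main__":
    for lam in (2, 20, 200, 2000):
        print(lam, relative_gap(lam))
```

The gap shrinks as the mean count grows, which is consistent with using the normal approximation only when the integrated intensity is large.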
Measurement errors

Normality is the central assumption of the mathematical theory of errors. Similarly, in statistical model-fitting, an indicator of goodness of fit is that the residuals (as the errors are called in that setting) be independent and normally distributed. Any deviation from normality needs to be explained. In that sense, both in model-fitting and in the theory of errors, normality is the only observation that need not be explained, being expected.

Repeated measurements of the same quantity are expected to yield results which are clustered around a particular value. If all major sources of errors have been taken into account, it is assumed that the remaining error must be the result of a large number of very small additive effects, and hence normal. Deviations from normality are interpreted as indications of systematic errors which have not been taken into account.

Physical characteristics of biological specimens

The overwhelming biological evidence is that bulk growth processes of living tissue proceed by multiplicative, not additive, increments, and that therefore measures of body size should at most follow a lognormal rather than normal distribution. Despite common claims of normality, the sizes of plants and animals are approximately lognormal. The evidence and an explanation based on models of growth were first published in the classic book
The assumption that the linear size of biological specimens is normal leads to a non-normal distribution of weight (since weight/volume is roughly the 3rd power of length, and Gaussian distributions are only preserved by linear transformations), and conversely assuming that weight is normal leads to non-normal lengths. This is a problem, because there is no a priori reason why one of length, or body mass, and not the other, should be normally distributed. Lognormal distributions, on the other hand, are preserved by powers, so the "problem" goes away if lognormality is assumed.

On the other hand, there are some biological measures where normality is assumed or expected.

Financial variables

Because of the exponential nature of interest and inflation, financial indicators such as interest rates, stock values, or commodity prices make good examples of multiplicative behavior. As such, they should not be expected to be normal, but lognormal.

Benoît Mandelbrot, the popularizer of fractals, has claimed that even the assumption of lognormality is flawed, and advocates the use of log-Levy distributions.

It is accepted that financial indicators deviate from lognormality. The distribution of price changes on short time scales is observed to have "heavy tails", so that very small or very large price changes are more likely to occur than a lognormal model would predict. Deviation from lognormality indicates that the assumption of independence of the multiplicative influences is flawed.

Lifetime

Other examples of variables that are not normally distributed include the lifetimes of humans or mechanical devices. Examples of distributions used in this connection are the exponential distribution (memoryless) and the Weibull distribution. In general, there is no reason that waiting times should be normal, since they are not directly related to any kind of additive influence.
Test scores

A great deal of confusion exists over whether or not IQ test scores and intelligence are normally distributed. While for most practical purposes the distributions of IQ and intelligence (or at least psychometric g) can be seen as the same thing, it is important to distinguish between the two terms when discussing whether they are normally distributed.

As a deliberate result of test construction, IQ scores are always and obviously normally distributed for the majority of the population. Whether intelligence is normally distributed is less clear. The difficulty and number of questions on an IQ test are decided based on which combinations will yield a normal distribution. This does not mean, however, that the information is in any way being misrepresented, or that there is any kind of "true" distribution that is being artificially forced into the shape of a normal curve. Intelligence tests can be constructed to yield any kind of score distribution desired. All true IQ tests have a normal distribution of scores as a result of test design; otherwise IQ scores would be meaningless without knowing what test produced them. Intelligence tests in general, however, can produce any kind of distribution.

For an example of how arbitrary the distribution of intelligence test scores really is, imagine a 20-item multiple-choice test entirely composed of problems that consist mostly of finding the areas of circles. Such a test, if given to a population of high-school students, would likely yield a U-shaped distribution, with the bulk of the scores being very high or very low, instead of a normal curve. If a student understands how to find the area of a circle, he can likely do so repeatedly and with few errors, and thus would get a perfect or high score on the test, whereas a student who has never had geometry lessons would likely get every question wrong, possibly with a few right due to guessing luck.
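The thought experiment of the 20-item circle-area test can be sketched as a simulation. Every number here is an assumption made for illustration: half of the students are taken to be prepared, prepared students answer each item correctly with probability 0.95, and the rest guess among four choices with probability 0.25.

```python
import random


def simulate_scores(n_students, n_items, rng,
                    share_prepared=0.5, p_prepared=0.95, p_guess=0.25):
    """Simulate total scores for a mixed population of prepared and guessing students."""
    scores = []
    for _ in range(n_students):
        # Each student is either prepared or guessing; the per-item success rate differs.
        p = p_prepared if rng.random() < share_prepared else p_guess
        scores.append(sum(1 for _ in range(n_items) if rng.random() < p))
    return scores


def histogram(scores, n_items):
    """Count how many students obtained each possible score from 0 to n_items."""
    counts = [0] * (n_items + 1)
    for s in scores:
        counts[s] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    counts = histogram(simulate_scores(2000, 20, rng), 20)
    for score, count in enumerate(counts):
        print(f"{score:2d} {'#' * (count // 10)}")
```

Under these assumptions the scores pile up in a low cluster and a high cluster with very few in between, rather than forming a single bell-shaped hump.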
If a test is composed mostly of easy questions, then most of the test-takers will have high scores and very few will have low scores. If a test is composed entirely of questions so easy or so hard that every person gets either a perfect score or a zero, it fails to make any kind of statistical discrimination at all and yields a rectangular distribution. These are just a few examples of the many varieties of distributions that could theoretically be produced by carefully designing intelligence tests.

Whether intelligence itself is normally distributed has been at times a matter of some debate. Some critics maintain that the choice of a normal distribution is entirely arbitrary. Brian Simon once claimed that the normal distribution was specifically chosen by psychometricians to falsely support the idea that superior intelligence is only held by a small minority, thus legitimizing the rule of a privileged elite over the masses of society. Historically, though, intelligence tests were designed without any concern for producing a normal distribution, and scores came out approximately normally distributed anyway. American educational psychologist Arthur Jensen claims that any test that contains "a large number of items," "a wide range of item difficulties," "a variety of content or forms," and "items that have a significant correlation with the sum of all other scores" will inevitably produce a normal distribution. Furthermore, there exist a number of correlations between IQ scores and other human characteristics that are more provably normally distributed, such as nerve conduction velocity and the glucose metabolism rate of a person's brain, supporting the idea that intelligence is normally distributed.

Some critics, such as Stephen Jay Gould in his book The Mismeasure of Man, question the validity of intelligence tests in general, not just the claim that intelligence is normally distributed. For further discussion see the article IQ.
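The Bell Curve is a controversial book on the topic of the heritability of intelligence. However, despite its title, the book does not primarily address whether IQ is normally distributed.

Maximum likelihood estimation of parameters

Suppose X₁, ..., Xₙ are independent and each is normally distributed with expectation μ and variance σ² > 0. For observed values x₁, ..., xₙ, the joint probability density function is

$$f(x_1,\dots,x_n;\mu,\sigma) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right).$$

As a function of μ and σ this is the likelihood function

$$L(\mu,\sigma) = (2\pi)^{-n/2}\,\sigma^{-n} \exp\!\left(-\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}\right).$$

Usually in maximizing a function of two variables one might consider partial derivatives. But here we will exploit the fact that the value of μ that maximizes the likelihood function with σ fixed does not depend on σ. Therefore, we can find that value of μ, then substitute it for μ in the likelihood function, and finally find the value of σ that maximizes the resulting expression.

It is evident that the likelihood function is a decreasing function of the sum

$$\sum_{i=1}^{n}(x_i-\mu)^2,$$

so we want the value of μ that minimizes this sum. Let x̄ = (x₁ + ... + xₙ)/n be the sample mean. Since

$$\sum_{i=1}^{n}(x_i-\mu)^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2,$$

the sum is minimized at μ = x̄, which is therefore the maximum-likelihood estimate of μ. Substituting this value into the likelihood function and taking logarithms gives

$$\ell(\bar{x},\sigma) = -\frac{n}{2}\log(2\pi) - n\log\sigma - \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{2\sigma^2}.$$

Setting the derivative with respect to σ equal to zero,

$$-\frac{n}{\sigma} + \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{\sigma^3} = 0,$$

yields

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2.$$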
Consequently this average of squares of residuals is the maximum-likelihood estimate of σ², and its square root is the maximum-likelihood estimate of σ.

Surprising generalization

The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is perhaps surprisingly subtle and elegant. It involves the spectral theorem and the reason why it can be better to view a scalar as the trace of a 1×1 matrix than as a mere scalar. See estimation of covariance matrices.
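The maximum-likelihood estimates of μ and σ for the univariate case discussed in this section (the sample mean, and the square root of the average squared residual) can be sketched and sanity-checked against the log-likelihood. The function names and the small data set are assumptions for illustration.

```python
import math


def normal_mle(xs):
    """Maximum-likelihood estimates (mu_hat, sigma_hat) for independent normal observations xs."""
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    mu_hat = sum(xs) / n
    # Average of squared residuals: divides by n, not n - 1.
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, math.sqrt(var_hat)


def log_likelihood(xs, mu, sigma):
    """Log of the normal likelihood of xs at the parameter values (mu, sigma)."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2.0 * math.pi) - n * math.log(sigma) - ss / (2.0 * sigma ** 2)


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    mu_hat, sigma_hat = normal_mle(data)
    print("estimates:", mu_hat, sigma_hat)
    best = log_likelihood(data, mu_hat, sigma_hat)
    # Nearby parameter values should not give a higher log-likelihood.
    for d_mu, d_sigma in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
        print(d_mu, d_sigma, log_likelihood(data, mu_hat + d_mu, sigma_hat + d_sigma) <= best)
```

Note that this estimate of the variance divides by n; the commonly used unbiased sample variance divides by n − 1 instead and is not the maximum-likelihood estimate.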
This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License.

© 2008 Chamas Enterprises Inc.