Columbia University PHYSICS LABORATORY TUTORIAL

2.4. Histograms

In the previous section we introduced the idea of obtaining a reliable error estimate by repeating a measurement many times. This idea is so important in science and technology that an entire branch of mathematics, statistical analysis, is devoted to it. It is not our goal here to make you an expert in statistical analysis, but it is worthwhile to learn a little about it.

Suppose we continue to take measurements of the oscillation period of the pendulum, as discussed in the previous section, until we have 10 measurements (in seconds):

3.9, 3.5, 3.7, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8, 3.6

Let's count how many times each number occurs.

Values, T (s)     3.4   3.5   3.6   3.7   3.8   3.9
Occurrences, n    1     2     3     2     1     1
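The tally can also be done by a few lines of code. The following is a minimal sketch in Python using only the standard library; the variable names `periods` and `counts` are our own choices for illustration.

```python
from collections import Counter

# The ten period measurements from the text, in seconds.
periods = [3.9, 3.5, 3.7, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8, 3.6]

# Count how many times each value occurs.
counts = Counter(periods)

# Print the frequency table, smallest value first.
for value in sorted(counts):
    print(f"T = {value:.1f} s   n = {counts[value]}")
```

Counting exact matches works here only because every measurement was recorded to one decimal place. For finer data the values would first have to be grouped into bins.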

We can plot this frequency distribution of measurements in a chart called a "histogram".

Histogram

We can also plot our results as a bar histogram.

Bar Histogram

Note that the shape of the bar histogram resembles a bell curve.

Gaussian distribution

If we had taken not ten but many more measurements, the bar histogram would resemble the bell curve even more closely¹. In fact, for a very large number of measurements the bar histogram would be indistinguishable from a continuous curve. This curve is called the limiting distribution. For most of the experiments in the lab the limiting distribution will be a bell curve, which is also known as a normal or Gaussian distribution.
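To get a feeling for the limiting distribution, one can simulate the experiment. The sketch below is only an illustration: it assumes the measurements are drawn from a normal distribution, and the "true" mean and spread (`TRUE_MEAN`, `TRUE_SIGMA`) are made-up values chosen to resemble our example, not results from the lab. The bars show the fraction of measurements in each bin rather than the raw number of occurrences, so that runs of different size can be compared.

```python
import random

random.seed(0)  # fixed seed so that the run is repeatable

TRUE_MEAN = 3.63   # assumed value, for illustration only (seconds)
TRUE_SIGMA = 0.14  # assumed value, for illustration only (seconds)

def fraction_per_bin(n_measurements):
    """Simulate n measurements, bin them to 0.1 s, return {bin: fraction}."""
    counts = {}
    for _ in range(n_measurements):
        t = random.gauss(TRUE_MEAN, TRUE_SIGMA)
        center = round(t, 1)  # rounding to 0.1 s plays the role of binning
        counts[center] = counts.get(center, 0) + 1
    return {c: n / n_measurements for c, n in sorted(counts.items())}

for n in (10, 1000, 100000):
    print(f"{n} simulated measurements")
    for center, fraction in fraction_per_bin(n).items():
        # One '#' per 1% of the measurements.
        print(f"  {center:.1f} s  {'#' * round(100 * fraction)}")
```

With only 10 simulated measurements the bars are ragged. With many more they settle into a smooth bell shape.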

The nice thing about histograms is that we can read off the approximate mean value of our measurements at a glance. In our example it is clearly very near 3.6 s (the calculated mean is 3.63 s). Can we also find the standard deviation from a histogram?

First, let's calculate the standard deviation using the formula in the previous section. We obtain 0.14 s. Let's indicate this in our histogram.
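As a check on the arithmetic, here is a short sketch using Python's `statistics` module. The formula from the previous section is not repeated on this page, so the code shows both common conventions. The quoted value of 0.14 s corresponds to dividing by N, while dividing by N - 1 gives about 0.15 s.

```python
import statistics

# The ten period measurements from the text, in seconds.
periods = [3.9, 3.5, 3.7, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8, 3.6]

mean = statistics.mean(periods)

# Standard deviation dividing by N (population form).
sigma_n = statistics.pstdev(periods)

# Standard deviation dividing by N - 1 (sample form).
sigma_n_minus_1 = statistics.stdev(periods)

print(f"mean             = {mean:.2f} s")
print(f"sigma (N)        = {sigma_n:.2f} s")
print(f"sigma (N - 1)    = {sigma_n_minus_1:.2f} s")
```

For ten measurements the two conventions differ by only about 0.01 s, which is negligible for the purposes of the lab.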

Histogram with deviation range

We see that the points at 3.5, 3.6 and 3.7 lie within one standard deviation of the mean, that is, within 3.63 ± 0.14 s, or between 3.49 s and 3.77 s. Note that in 7 out of 10 measurements we obtained one of those three values. In other words, they account for 70% of our measurements. This is not a coincidence. It is a property of the normal distribution that about 68% of the measurements lie within one standard deviation on either side of the mean. Since our distribution comes close to a normal distribution, our result of 70% is close to the theoretical value of 68%.
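Both numbers in this comparison can be checked directly. The sketch below counts the measurements within one standard deviation of the mean and, for comparison, evaluates the theoretical fraction for an ideal normal distribution, which is erf(1/√2). It assumes the divide-by-N form of the standard deviation, which reproduces the 0.14 s quoted above.

```python
import math
import statistics

# The ten period measurements from the text, in seconds.
periods = [3.9, 3.5, 3.7, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8, 3.6]

mean = statistics.mean(periods)
sigma = statistics.pstdev(periods)  # divide-by-N form (assumed convention)

# Measurements lying within one standard deviation of the mean.
inside = [t for t in periods if abs(t - mean) <= sigma]
fraction = len(inside) / len(periods)

# Fraction within one sigma for an ideal normal distribution.
theory = math.erf(1 / math.sqrt(2))

print(f"measured: {fraction:.0%} of the data within one sigma")
print(f"theory:   {theory:.1%} for a normal distribution")
```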

By reversing the argument, we can find the standard deviation from the histogram. Given a histogram, we can find a region centered at the mean value that includes 68% (about 2/3) of the measurements. Then, because of the property stated above, the width of this region will be 2 standard deviations (one standard deviation on each side of the mean). So in our example, suppose we don't know the formula for the standard deviation. By looking at the histogram, we see that 7 out of 10 measurements, which is a good approximation to 2/3, are in the region between 3.45 s and 3.75 s, the outer edges of the bars at 3.5 s and 3.7 s. The width of the region is 0.3 s. Therefore the standard deviation is 0.15 s. Compare this to the 0.14 s obtained via the formula. They are very close.
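The 2/3 rule can be written out as a small procedure. The following sketch is one possible way to do it, not a prescribed method: it starts from the bar nearest the mean and widens the region one bar at a time on each side until at least 2/3 of the measurements are included. The bar width of 0.1 s and all variable names are assumptions made for this example.

```python
from collections import Counter

# The ten period measurements from the text, in seconds.
periods = [3.9, 3.5, 3.7, 3.4, 3.5, 3.6, 3.7, 3.6, 3.8, 3.6]
BIN_WIDTH = 0.1  # width of each histogram bar, in seconds (assumed)

# Work in integer tenths of a second to avoid floating-point comparisons.
tenths = [round(t * 10) for t in periods]
counts = Counter(tenths)

# The bar nearest the mean is the center of the region.
center = round(sum(tenths) / len(tenths))

# Widen the region symmetrically until it holds at least 2/3 of the data.
k = 0
while sum(counts[center + i] for i in range(-k, k + 1)) < 2 * len(tenths) / 3:
    k += 1

low = (center - k - 0.5) / 10    # left edge of the region, in seconds
high = (center + k + 0.5) / 10   # right edge of the region, in seconds
width = (2 * k + 1) * BIN_WIDTH
sigma_estimate = width / 2

print(f"region: {low:.2f} s to {high:.2f} s, width {width:.1f} s")
print(f"estimated standard deviation: {sigma_estimate:.2f} s")
```

Because the region can only grow by whole bars, the estimate is coarse, but for this data it reproduces the 0.15 s found by eye.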

In the lab, in order to find the standard deviation, you can use either the algebraic formula discussed in the previous section or the 2/3 rule discussed here. The difference between the two methods can be considered negligible for the purposes of the lab.


¹ However, the number of occurrences on the vertical axis would change.

© Columbia University