Error Experiments – Stiftung FNJE

When conducting an experiment, you will rarely measure exactly the value that you calculated from a formula or looked up in a table. This is because every human or machine experimenter makes ‘mistakes’, and because some sources of error simply cannot be ruled out: there is a certain amount of ‘noise’ in every measured value.

To illustrate this and exaggerate it, we present two ‘freehand experiments’ below.

Even in school (secondary level I), we learn that an experiment should be carried out several times under the same ‘laboratory’ conditions so that a mean value can be calculated from the measured values.

Mean value of the results

The mean value here refers to the arithmetic mean m: the sum of all measured values v_i divided by the number n of measured values.

$$ m(v_i) = \frac{v_1 + v_2 + \dots + v_n}{n} $$
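As a minimal sketch of this definition in Python: the function name `arithmetic_mean` and the sample readings are made up for illustration and do not come from a real experiment.

```python
def arithmetic_mean(values):
    """Return the arithmetic mean: sum of all values divided by their count."""
    if not values:
        raise ValueError("need at least one measured value")
    return sum(values) / len(values)


# Hypothetical readings of the same length, in millimetres
readings_mm = [1699.0, 1701.0, 1700.0, 1702.0, 1698.0]
mean_mm = arithmetic_mean(readings_mm)
print(mean_mm)  # → 1700.0
```

The empty-list check is a design choice of this sketch: a mean of zero measurements is undefined, so the function refuses to return a number.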

The simplest example of this is the class average after a class test, which (hopefully) some of your primary school teachers gave you so that you could assess where you stand in comparison to your classmates.

Question: Why is it not allowed, strictly mathematically speaking, to calculate the average of a school class’s grades?

Tip: If you don’t know the answer, read the subpage ‘FYI: Data types’.

Absolute error

How much does each individual value deviate from this average? This is given as the absolute error, e.g.:

$$ x_{abs} = |v_i - m(v_i)| $$
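The absolute error of each individual value can be sketched in Python as follows. The function name `absolute_errors` and the readings are hypothetical and only serve as an illustration.

```python
def absolute_errors(values):
    """Return |v_i - m| for every measured value v_i, where m is the arithmetic mean."""
    m = sum(values) / len(values)
    return [abs(v - m) for v in values]


# Hypothetical readings of the same length, in millimetres
readings_mm = [1699.0, 1701.0, 1700.0, 1702.0, 1698.0]
errors_mm = absolute_errors(readings_mm)
print(errors_mm)  # → [1.0, 1.0, 0.0, 2.0, 2.0]
```

The absolute errors carry the same unit as the measured values, here millimetres.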

Relative error

To put a figure into perspective, you need a reference value, which for us is usually the number 100 or 1000. The relative error is therefore typically expressed as a percentage or per mille. This makes it easy to compare the results of different sub-experiments, even if these sub-experiments provide measurements in different units.

When scientists specify the relative error of their measurement, this is the combined absolute error of the apparatus (calculated according to the law of error propagation), expressed as a percentage of the measured value. A scientist who takes a measurement for the first time is measuring something for which there is no table value yet. Therefore, they specify the relative error from the measurement uncertainty (tolerance) of their measuring device:

Example: Measurement with a ruler. Scale 1 mm => measurement uncertainty 0.5 mm.

If we measure the length of a table (1700 mm), then 0.5 mm tolerance is less than 0.3 per mille of the measured value. However, if we measure the length of a fingernail (1 cm) with the same ruler, 0.5 mm tolerance is 5% of the measured value. This is how scientific measurements must be specified.
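The arithmetic of the ruler example can be checked with a short Python sketch. The function name `relative_uncertainty` and its `scale` parameter are assumptions made for this illustration.

```python
def relative_uncertainty(uncertainty, measured_value, scale=100):
    """Return the uncertainty as a share of the measured value.

    scale=100 gives percent, scale=1000 gives per mille.
    Both arguments must be given in the same unit.
    """
    return uncertainty / measured_value * scale


ruler_uncertainty_mm = 0.5  # half of the 1 mm scale division

table_per_mille = relative_uncertainty(ruler_uncertainty_mm, 1700, scale=1000)
nail_percent = relative_uncertainty(ruler_uncertainty_mm, 10, scale=100)

print(round(table_per_mille, 3))  # table: about 0.294 per mille
print(round(nail_percent, 1))     # fingernail: 5.0 percent
```

The fingernail length is entered as 10 mm rather than 1 cm because the uncertainty and the measured value have to share one unit before they are divided.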

If you have measured a natural constant or material parameter at school for which there are already reference tables and charts, you can determine the accuracy of your experiment (which is certainly much lower than that of the precision measurements used in science) by comparing your measured value with the table value.

The relative error of a measurement therefore indicates how much it deviates from an ideal value. For example, you could say: our measurement of the speed of sound in air (20 °C) deviates by 5% from the table value (and it does not matter whether we specify it in km/h or km/s).

Now you can see how the relative error can be calculated – just like any percentage calculation using the rule of three:

$$ x_{rel}\,[\%] = \frac{(v_{Ref} - m(v_i)) \cdot 100}{v_{Ref}} $$

Here v_Ref is the reference value (table value).

or for scientific measurements:

$$ x_{rel}\,[\%] = \frac{\text{absolute measurement uncertainty} \cdot 100}{v_{measured}} $$
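The comparison with a table value can be sketched in Python using the speed-of-sound example. The measured values are invented for illustration, the function name is an assumption, and the table value of roughly 343 m/s for air at 20 °C is a rounded literature value.

```python
def relative_error_percent(reference, measured_values):
    """Return (v_ref - m) * 100 / v_ref, with m the mean of the measured values."""
    m = sum(measured_values) / len(measured_values)
    return (reference - m) * 100 / reference


# Table value for the speed of sound in air at 20 °C, roughly 343 m/s
V_REF_M_PER_S = 343.0

# Hypothetical schoolyard measurements in m/s
measured_m_per_s = [325.0, 330.0, 322.0, 327.0]

error_si = relative_error_percent(V_REF_M_PER_S, measured_m_per_s)

# Same data converted to km/h: the relative error stays the same
to_km_per_h = 3.6
error_kmh = relative_error_percent(
    V_REF_M_PER_S * to_km_per_h,
    [v * to_km_per_h for v in measured_m_per_s],
)

print(round(error_si, 2), round(error_kmh, 2))  # both roughly 4.96
```

The conversion factor appears in both the numerator and the denominator and cancels, which is why the unit does not matter for the relative error.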

Data Types

In everyday life, we are confronted with different types of data. We roughly distinguish between three groups: metric, nominal and ordinal data.

Nominal data

Some data sets consist of words. For example, all yes/no decisions and some academic degrees (e.g. all doctoral theses) are evaluated exclusively with verbal judgements. They are marked ‘good’ or ‘very good’ or ‘satisfactory’ – or the Latin equivalents ‘magna cum laude’, ‘summa cum laude’ or ‘cum laude’. Of course, we all know that ‘summa cum laude’ is better than ‘cum laude’ and that ‘good’ is better than ‘satisfactory’. But is this scale linear or somehow measurable? Is a ‘2’ twice as good as a ‘4’ – or, in the (more logical) secondary school grading system, are 12 points perhaps three times as good as 4 points? And how can you even compare the judgements of different teachers? This might be possible in arithmetic, where the answer is either right or wrong, but even when it comes to explanations in maths, teachers have a lot of leeway in their assessment.

One candidate may have calculated correctly but cited incorrectly, while another may have written correctly but programmed incorrectly or drawn nonsensical pictures… In reality, the assessment must therefore be vague, and an average value from this data is completely meaningless: how would one calculate the average value from words? Apart from this purely practical and mathematical problem, there are two further difficulties: firstly, the data from such final theses was not produced under the same conditions (every teacher is different, all students are different, all thesis topics are different… not to mention personal backgrounds), and secondly, the values do not necessarily all have the same unit, which would be a prerequisite for summing them up.

Ordinal data

However, if we compare not the ratings themselves but the numerical values associated with them, we have ordinal data. As a further example, take the runners finishing the Berlin Marathon: we do not compare gold, silver, bronze, 4th place and so on, but, for example, the amount of prize money the winners receive. This allows us to compile statistics.

However, the prize money is not necessarily evenly spaced from one place to the next. It is like comparing the gross wages of different occupational groups, or the standard lengths of apples with those of pears. Calculating the mean of such values should therefore also be thoroughly questioned.

However, they can definitely be compared using the relation symbols ‘<’ and ‘>’.

Metric data

The only completely unproblematic use of averages is with metric data. This is data that is given in the realm of natural or real numbers and allows arithmetic operations.

One example is data from an experiment that was carried out several times under the same laboratory conditions: measuring the speed of sound in the schoolyard with a starting clapper from sports class, several times on the same day, with the same wind strength, humidity and temperature… or measuring the temperature of a light bulb filament at different currents, always in the same isolated laboratory… or something like that.

In primary school, in order to give pupils a vague impression of where they stand, the vague judgements ‘good’ and ‘not so good’ or ‘better’ are converted into numbers. This trick (a mathematical representation) gives the illusion that it is acceptable to calculate an average, and – let’s be honest – at least you know how the teacher sees you (whether this really reflects reality is a discussion we’ll leave to other people).
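The distinction between the three data types can be sketched in Python by showing which operation each type supports according to the text above: counting for nominal data, comparing and sorting for ordinal data, and arithmetic for metric data. All values are hypothetical; in particular, the prize money amounts are not actual Berlin Marathon figures.

```python
from collections import Counter

# Nominal data: yes/no answers can only be counted, not averaged
answers = ["yes", "no", "yes", "yes", "no"]
counts = Counter(answers)

# Ordinal data: prize money per place can be compared with < and > and sorted,
# but the gaps between places are not necessarily equal (hypothetical amounts)
prize_money_eur = {"runner_a": 5000, "runner_b": 30000, "runner_c": 15000}
ranking = sorted(prize_money_eur, key=prize_money_eur.get, reverse=True)
b_beats_a = prize_money_eur["runner_b"] > prize_money_eur["runner_a"]

# Metric data: repeated measurements under the same conditions allow arithmetic
speeds_m_per_s = [325.0, 330.0, 322.0, 327.0]
mean_speed = sum(speeds_m_per_s) / len(speeds_m_per_s)

print(counts["yes"], ranking, b_beats_a, mean_speed)
```

Python would happily compute a mean of the prize money as well; whether that number means anything is exactly the question the text asks you to consider.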

Freehand Experiments

Experiment 1 – Darts

The ideal dart thrower would always hit the centre. However, very few of us are ‘ideal throwers’. Every now and then, the darts miss the target.

Why not play darts on your next day out? There are bound to be some good throwers and some not so good ones in your group. The distribution of the darts on the board should therefore be reasonably statistical.

Caution: Never hang a dartboard on a door when playing! If someone opens the door from the other side, serious injuries can occur.

Then count how many darts landed in each of the three rings. The better you throw, the closer your average will be to the centre. However, if you were to measure the board precisely, i.e. place a coordinate system in the centre and record the exact coordinates of all the holes, you would probably get a two-dimensional Gaussian distribution: plotted over the board, the density of hits forms a three-dimensional bell shape.
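If no dartboard is at hand, the experiment can be imitated with a small simulation. This is only a sketch under stated assumptions: the x and y offsets of each dart are drawn as independent Gaussian random numbers around the centre, and the ring radii, the spread values and the function name `simulate_darts` are all made up.

```python
import math
import random


def simulate_darts(n_throws, spread_cm, ring_radii_cm, seed=0):
    """Throw n darts whose x and y offsets from the centre are independent
    Gaussian random numbers, and count the hits per ring.

    ring_radii_cm lists the outer radius of each ring, innermost first.
    The last entry of the returned list counts darts outside all rings.
    """
    rng = random.Random(seed)
    counts = [0] * (len(ring_radii_cm) + 1)
    for _ in range(n_throws):
        x = rng.gauss(0.0, spread_cm)
        y = rng.gauss(0.0, spread_cm)
        distance = math.hypot(x, y)
        for ring, radius in enumerate(ring_radii_cm):
            if distance <= radius:
                counts[ring] += 1
                break
        else:
            counts[-1] += 1  # missed every ring
    return counts


RING_RADII_CM = [5.0, 10.0, 15.0]  # hypothetical board with three rings

good_thrower = simulate_darts(1000, spread_cm=4.0, ring_radii_cm=RING_RADII_CM)
weak_thrower = simulate_darts(1000, spread_cm=9.0, ring_radii_cm=RING_RADII_CM)
print(good_thrower, weak_thrower)
```

A smaller `spread_cm` stands for a better thrower, so more of the simulated darts end up in the innermost ring.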

BTW: Let’s hope that Cupid has a better hit rate than we do. 🙂

Experiment 2 – Yard or street

If you have trouble finding a dartboard or a safe place to throw darts, you can modify the experiment: Draw a few chalk lines in the schoolyard and then throw a sandbag as if you were shot-putting or throwing a ball (always in the same way, please!), and then measure the throwing distances.

Note: Balls are not suitable because they roll away and don’t stay in place. That’s why we use sandbags. 🙂

Notes

Please consider whether it is really permissible to calculate an arithmetic mean for the data collected in this way.
Also consider qualitatively which sources of error could interfere with the experiment and then try to estimate the magnitude of these errors quantitatively.
