Overview
PyPi module | N/A | |||||||||||||||||||||||||
git repository | https://bitbucket.org/arrizza-public/running-average | |||||||||||||||||||||||||
git command | git clone git@bitbucket.org:arrizza-public/running-average.git | |||||||||||||||||||||||||
Verification Report | https://arrizza.com/web-ver/running-average-report.html | |||||||||||||||||||||||||
Version Info |
|
- installation: see https://arrizza.com/setup-common
Summary
It is sometimes necessary to keep track of an average for a list of numbers that are input only one at a time.
Scripts
- See Quick Start for information on using scripts.
- See xplat-utils submodule for information on the submodule.
Using the full list
The average of a set of numbers is the sum of the numbers divided by the number of items.
In code this looks like:
#include <iostream>
#define N 5
int list[N];
void use_full_list(int* list)
{
int sum = 0;
for (int i = 0; i < N; ++i)
{
sum += list[i];
}
double average = (double) sum / N;
printf("%-15s: sum: %3d N: %3d average: %.2f\n", "full list", sum, N, average);
}
int main(void)
{
list[0] = 1;
list[1] = 2;
list[2] = 4;
list[3] = 3;
list[4] = 2;
use_full_list(list);
return 0;
}
which generates this output:
full list : sum: 12 N: 5 average: 2.40
Running Average
But this requires the full set of numbers to calculate. What if we didn't want to store the full set of numbers?
void one_at_a_time(int val, int& num_elems, int& sum, double& average)
{
num_elems += 1;
sum += val;
average = (double) sum / num_elems;
printf("%-15s: sum: %3d N: %3d average: %.2f\n", "one at a time", sum, num_elems, average);
}
// and in main()
int num_elems = 0;
int sum = 0;
double average = 0.0;
for (int i = 0; i < N; ++i)
{
one_at_a_time(list[i], num_elems, sum, average);
}
gives output:
one at a time : sum: 1 N: 1 average: 1.00
one at a time : sum: 3 N: 2 average: 1.50
one at a time : sum: 7 N: 3 average: 2.33
one at a time : sum: 10 N: 4 average: 2.50
one at a time : sum: 12 N: 5 average: 2.40
In other words, we get the same final result, but we can calculate the average on the fly.
Storing the Average
But notice the sum in the output above, it keeps going up 1, 3, 7 etc. It may be possible that we overflow the variable sum.
void only_average(int val, int& num_elems, double& average)
{
double sum = average * num_elems;
sum += val;
num_elems += 1;
average = sum / num_elems;
printf("%-15s: sum: %5.2f N: %3d average: %5.2f\n", "only average", sum, num_elems, average);
}
// and in main()
int num_elems2 = 0;
double average2 = 0.0;
for (int i = 0; i < N; ++i)
{
only_average(list[i], num_elems2, average2);
}
The output is:
only average : sum: 1.00 N: 1 average: 1.00
only average : sum: 3.00 N: 2 average: 1.50
only average : sum: 7.00 N: 3 average: 2.33
only average : sum: 10.00 N: 4 average: 2.50
only average : sum: 12.00 N: 5 average: 2.40
Again, we get the same final average but this time we only store a couple of variables.
Caveats
Note that the line
double sum = average * num_elems;
can still overflow.
And also note that this may not be exactly equal to $\(average * num\)$ since it is finite arithmetic.