
Temporal Averaging

Temporal averaging is a way of combining a ton of photos of the same area to pull out only the static content. It’s one of the least time-consuming ways to get otherwise impossible photos of empty streets.

Basically you take 40 or so photos of a busy section of road without moving the camera even slightly (aligning the images takes somewhat more code, and levels of computer vision that I don’t want to go into right now). It is unlikely that any of these photos will show the entire road with no traffic in sight, but if you look at a single particular pixel in every one of those photos, it is very likely that the most common value for it will be ‘road’.

Ideally, you just select that most common value for each pixel and ignore everything else. In practice, cameras don’t record the exact colour of an object, and each pixel’s value varies slightly from photo to photo, so a pure mode is out: there will be no reliable most common colour to select. To get around this, the mode would have to be taken over ranges of colour rather than exact values.
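As a rough illustration of a mode over ranges, here is a minimal sketch for one colour channel of one pixel. It is not part of the code further down: the function name modeOfRanges and the range width of 16 are assumptions made for the example, and treating each channel on its own is a simplification of finding the most common colour.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: find the most populated range of values for one
// channel of one pixel, then return the mean of the samples in that range.
// samples  = that channel's value in each photo
// binWidth = width of each range (16 is an assumed, untuned value)
std::uint8_t modeOfRanges(const std::vector<std::uint8_t>& samples, int binWidth = 16) {
    assert(!samples.empty() && binWidth > 0 && binWidth <= 256);

    const int bins = (255 / binWidth) + 1;
    std::vector<int> count(bins, 0); // how many samples fell in each range
    std::vector<int> sum(bins, 0);   // total of the samples in each range

    for (std::uint8_t v : samples) {
        const int b = v / binWidth;
        count[b]++;
        sum[b] += v;
    }

    // pick the range with the most samples (the first one wins a tie)
    int best = 0;
    for (int b = 1; b < bins; b++) {
        if (count[b] > count[best]) best = b;
    }

    return static_cast<std::uint8_t>(sum[best] / count[best]);
}
```

For example, modeOfRanges({40, 42, 41, 43, 200, 39, 210}) ignores the two bright samples and returns 41. One weakness of fixed ranges is that values sitting either side of a range boundary get split between two ranges.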

Either way, to use the mode every single image has to be considered for each pixel. Depending on the number of images this can lead to awful memory use as every image has to be held available, or huge amounts of disk access as groups of pixels are read in every so often. Breaking the images into groups, taking the mode of each group, and then repeating the process on those new temporary images would lose some accuracy but would ease both problems.

The arithmetic mean is wonderful for computation and is what I’m currently using. Two images can be read in, averaged, and stored as a temporary image, a third image can be read in (replacing the second), averaged with the temporary image, then stored so as to replace it, and so on. Only two images are in memory at any one time. Unfortunately there may be an increased number of rounding errors depending on how the values are stored. Another problem is that unlike the mode, everything gets a place in the final image – one bright red car on a black road will require a large number of images without a car in that place in order to average back down to an invisible level, whereas using mode, it would have been left out completely.
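To put a rough number on the red car problem, assume purely for illustration that one channel reads 255 where the car is and 0 on bare road, and that the car covers a given pixel in exactly one of n images. Ignoring rounding, the running update and the final mean for that pixel are:

```latex
\bar{x}_{k+1} = \frac{x_{k+1} + k\,\bar{x}_k}{k+1},
\qquad
\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{255}{n}
```

Under those assumptions 17 images leave the car at a level of 15 out of 255, 70 images leave it at about 3.6, and it takes more than 255 images to push it below a single level.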

The Code

This is a piece of C++ code that uses the arithmetic mean and Qt (if you want to compile it you’ll need to build a basic Qt framework around it). For images I recommend non-streaming webcams pointing at roads or high streets.

// temporal averaging
// zoril.co.uk
// 07/Nov/2011

// we only need two images
QImage im_base;
QImage im_loaded;

// files need to be sequentially numbered
int num = 1;

// load images matching the name in the text field
// e.g. current?.jpg
// => current1.jpg
// => current2.jpg
// => current3.jpg
// => ...

// first image used for dimensions
im_base.load(ui->filepattern_fld->text().replace("?", QString::number(num)));

// set dimensions
int w = im_base.width();
int h = im_base.height();

// all the time an image loads
while(im_loaded.load(ui->filepattern_fld->text().replace("?", QString::number(num+1)))) {
   // loop through every pixel
   for(int x = 0; x < w; x++) {
      for(int y = 0; y < h; y++) {
         // get the current average colour
         uint px_base = im_base.pixel(x, y);
         // and the new colour
         uint px_loaded = im_loaded.pixel(x, y);

         // break into components and take (cumulative moving) average
         uint red = (((px_loaded >> 16) & 0xFF) + num*((px_base >> 16) & 0xFF)) / (num+1);
         uint gre = (((px_loaded >>  8) & 0xFF) + num*((px_base >>  8) & 0xFF)) / (num+1);
         uint blu = (((px_loaded      ) & 0xFF) + num*((px_base      ) & 0xFF)) / (num+1);

         // make into colour
         QRgb newpx = qRgb(red, gre, blu);

         // store to averaged image
         im_base.setPixel(x, y, newpx);
      }
   }
   // move onto next image
   num++;
}
// save image
im_base.save("output.jpg");

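red, gre and blu would probably be better off kept as floating-point values in a separate virtual image, rather than being forced into ints and written back on every loop. Each integer division truncates, and those small errors build up as more images are added.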
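A minimal sketch of that idea, without Qt so it stands on its own: keep a running sum per channel value in doubles and only divide and round once at the end. The Accumulator struct and its flat one-byte-per-channel layout are assumptions for the example, not something the code above uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the "virtual image": one double per channel value.
struct Accumulator {
    std::vector<double> sum; // running total for each channel value
    int count = 0;           // number of images added so far

    explicit Accumulator(std::size_t n) : sum(n, 0.0) {}

    // add one image, given as a flat array of 8-bit channel values
    void add(const std::vector<std::uint8_t>& image) {
        assert(image.size() == sum.size());
        for (std::size_t i = 0; i < sum.size(); i++) {
            sum[i] += image[i];
        }
        count++;
    }

    // divide and round once, at the very end
    std::vector<std::uint8_t> result() const {
        assert(count > 0);
        std::vector<std::uint8_t> out(sum.size());
        for (std::size_t i = 0; i < sum.size(); i++) {
            out[i] = static_cast<std::uint8_t>(std::lround(sum[i] / count));
        }
        return out;
    }
};
```

With samples of 10, 11 and 11 for one value, this gives 11 (32 / 3, rounded), whereas the integer version above ends on 10 because it truncates at each step. The cost is memory: one double per channel value instead of one byte.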

The Original Images

The Resulting Image

This image was generated using 17 source images (from a webcam) and the arithmetic mean.

Temporally Averaged from 17 Images

Using 70 Images

Temporally Averaged from 70 Images

I shall update this if I find a way to have some form of weighted average. I’m thinking: read in x images, find some form of average, and then base further inclusion on deviation from that.
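One way that idea might look, as an untested-on-real-photos sketch for a single channel of a single pixel. Everything here is an assumption for illustration: the function name filteredMean, the use of the median of the first few samples as the "some form of average", and the seedCount and maxDeviation parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: average only the samples that sit close to a
// reference value taken from the first seedCount samples.
// seedCount and maxDeviation are illustrative tuning values.
double filteredMean(const std::vector<double>& samples,
                    std::size_t seedCount, double maxDeviation) {
    assert(seedCount > 0 && seedCount <= samples.size());

    // reference value: the median of the first seedCount samples
    std::vector<double> seed(samples.begin(),
                             samples.begin() + static_cast<std::ptrdiff_t>(seedCount));
    const std::size_t mid = seed.size() / 2;
    std::nth_element(seed.begin(),
                     seed.begin() + static_cast<std::ptrdiff_t>(mid), seed.end());
    const double reference = seed[mid];

    // include a sample only if it deviates little from the reference
    double sum = 0.0;
    int kept = 0;
    for (double v : samples) {
        if (std::fabs(v - reference) <= maxDeviation) {
            sum += v;
            kept++;
        }
    }
    return kept > 0 ? sum / kept : reference;
}
```

For example, filteredMean({40, 42, 255, 41, 39, 250, 38}, 3, 20) takes 42 as its reference, drops the two bright samples, and returns 40. Unlike the running mean, this needs every sample for a pixel to be available at once, or at least two passes over the images.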