I want to approximate images using constant-color shapes. Given a shape, how do we choose where to put it?

We have the original image, p, and the approximate image, \tilde{p}. I want to place a shape, S, with area A on \tilde{p} such that it improves the approximation.

First, define the average color under the shape (which it will be filled with) as:

\overline{c} = \frac{1}{A}\sum_{S} p

We’ll say that the error in approximation is the sum of the squared difference between p and \tilde{p}. Then we want to find the change in the error due to filling in \tilde{p(S)} with \overline{c}. That change, new error minus old error, for drawing the shape around one pixel is:

\begin{aligned} \Delta_{err} &= \sum_{S}\left[ \left( p-\overline{c} \right)^2 - \left( p-\tilde{p} \right)^2 \right] \\ &= \sum_{S}\left[ \left( p^2- 2 p \overline{c} + \overline{c}^2 \right) - \left( p-\tilde{p} \right)^2 \right]\\ &= F \left( p^2 \right) - 2 \overline{c} F \left( p \right)+ A \overline{c}^2 - F\left( \left(p-\tilde{p}\right)^2 \right) \\ &= F \left( p^2 \right) - \frac{1}{A} F(p)^2 - F\left( \left(p-\tilde{p}\right)^2 \right) \end{aligned}

I’ve introduced F(p) to represent the result of filtering p with the binary shape S. This means \overline{c} = F(p)/A . The definition allows the error function to be defined in terms of convolutions, which can massively speed up the computation (depending on your programming language, particularly).

Ok! Now we have \Delta_{err}, so we just need to find the position where it is the most negative, corresponding to the most improvement, and place the shape there! Cool. Let’s try it with the simplest shapes: squares and circles that shrink over time:

It works! Not the most thrilling method, but the convergence is fun to watch.

Since the formulation of this problem doesn’t care about the properties of the shape, how about using a bad one? A bulls-eye seems like a bad idea, so let’s try it!

Yep, not the best idea! It does converge, however, so that’s nice.

I was inspired to work on this by a nice interactive post which used ink splats to approximate an image. (Which was, in turn, inspired by my ink splats post). That led me to realize that I hadn’t made a useful technique for this sort of optimization. Now that I have one, let’s throw some ink!

Splats! Unfortunately, the ink isn’t great for portraiture, since humans like to see lots of detail. Here are some non-human examples, to see the stylization.

## 6 replies on “Image Approximation with Constant-Color Shapes”

Hello,

Thank you for the awesome blog!

I want to make sure if I understand this one correctly. Each iteration you draw several randomly sized and positioned shapes on top of the image from previous step and then pick one of them, which gives the smallest error?

Also which environment do you use for your image processing projects? Is it python + opencv?

Thanks, I’m glad you like it!

At each step, I test all possible locations (each pixel to center the shape on) for the shape via convolution/filtering. Then I pick the pixel where the error is minimal.

I’ve also tried the random-position method, and that definitely works too. The random method is faster to run, but takes more iterations to converge.

I tend to use Matlab (with the image processing toolbox), but I’ve done some in python.

Thank you for the response!

Could you please explain in a bit more detail the step where you’re introducing the F filtering function? I can’t understand how the last formula with F function makes computations easier than the sum notation. What exactly the F function computes? Doesn’t it still need a kernel size the same as the size of the whole shape?

It does need a kernel of the same size as the shape, right. Using filtering is partly a stylistic preference, and partly that MATLAB is much faster with convolutions/filtering than with nested sums.

I was a little sloppy with notation. For simplicity, I’ll pretend that the image p is in black and white. F(p) is the filtered version of the whole image. For each point (x,y) in F(p), we have the matrix:

F(p)[y,x] = sum( sum(p[y+v,x+u]*S[v,u], over u for the width of S), over v for the height of S)

That needs to be computed for each (x,y) point, so the simplest way would be to use four nested sums along x,y,u,v.

I found something that I had forgotten to put in the post (and just now added): the filter-based definition of the average color. That simplifies the notation, and the resulting equation.

Do you calculate the distance between two colors as a plain RGB distance, like so: squared_error = (r2-r1)^2+(g2-g1)^2+(b2-b1)^2?

Yep. It’s not the most perfect solution for human perception, but it works well.