Sergei Prokudin-Gorskii (1863-1944) produced a pioneering collection of color photographs by capturing each scene as three separate exposures through red, green, and blue filters. Unfortunately, the three exposures are misaligned in translation (and possibly in rotation and scale), which gives us a chance to get our hands dirty with image manipulation. The challenge is also an opportunity to learn about computational efficiency when manipulating images digitally.
We first split the image along its height into three individual images, one per color channel. We then generate every candidate translation of an image within a search window and score each candidate against the reference using the structural similarity metric from skimage. To keep this tractable for very large images (TIFs), we align hierarchically with a pyramid resolution scheme: estimate the shift on a downsampled copy first, then refine it at successively higher resolutions.
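The split and the exhaustive window search can be sketched as below. This is a minimal illustration, not the code used for the project: the function names `split_channels`, `ncc`, and `best_shift` are hypothetical, the default window of 15 pixels is an assumption, and the score is a plain normalized cross-correlation standing in for skimage's structural similarity so that the example needs only NumPy. Which third of the plate holds which color depends on the scan, so the sketch simply returns top, middle, and bottom.

```python
import numpy as np


def split_channels(plate):
    """Split a stacked plate into three equal-height images (top, middle, bottom)."""
    h = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]


def ncc(a, b):
    """Normalized cross-correlation (higher is better); a stand-in for SSIM."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def best_shift(moving, reference, window=15, score=ncc):
    """Try every (dy, dx) in [-window, window]^2 and return the best-scoring shift."""
    best, best_score = (0, 0), -np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll wraps pixels around the border; cropping the edges
            # before scoring would reduce that effect on real scans.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            s = score(shifted, reference)
            if s > best_score:
                best, best_score = (dy, dx), s
    return best
```

To use the metric named in the write-up instead, a small wrapper around `skimage.metrics.structural_similarity` (with `data_range` set for float images) could be passed as `score`. The cost grows with the square of the window size, which is what motivates the pyramid scheme for large TIFs.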
I explored reducing the number of for loops and roll operations with a tiling approach, hoping to cut the computational load. Unfortunately, I did not have enough time to confirm that this was faster than transforming the images one at a time. I also explored using “fancy indexing” to apply the transformations in bulk across the set of tiled images, but could not finish in time.
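One possible reading of the fancy indexing idea is sketched below. It is not the unfinished implementation from this project: `all_shifts` is a hypothetical helper that builds row and column index arrays for every shift in the window and gathers all shifted copies in a single indexing operation, matching what a loop of `np.roll` calls would produce.

```python
import numpy as np


def all_shifts(img, window):
    """Return (shifts, stack) with stack[k] == np.roll(img, shifts[k], axis=(0, 1))."""
    h, w = img.shape
    offsets = np.arange(-window, window + 1)
    dy, dx = np.meshgrid(offsets, offsets, indexing="ij")
    dy, dx = dy.ravel(), dx.ravel()                    # K = (2*window + 1)**2 shifts
    # Output pixel (i, j) of shift k reads input pixel ((i - dy) % h, (j - dx) % w).
    rows = (np.arange(h)[None, :] - dy[:, None]) % h   # shape (K, H)
    cols = (np.arange(w)[None, :] - dx[:, None]) % w   # shape (K, W)
    stack = img[rows[:, :, None], cols[:, None, :]]    # shape (K, H, W)
    return np.stack([dy, dx], axis=1), stack
```

The stack holds K full copies of the image, so this trades memory for fewer Python-level loops. Whether it is actually faster than rolling one image at a time is not established here and would need to be timed.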
While the algorithm achieved alignment, I realize that the displacement vector I reported might be off, because each shift was measured relative to the previous level in the pyramid rather than in full-resolution pixels.
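One way to report the displacement in full-resolution pixels is to carry the estimate up the pyramid: a shift of (dy, dx) found at half resolution corresponds to (2*dy, 2*dx) at the next finer level, and the search at that level only refines around that point. The sketch below illustrates this under several assumptions: the names `downsample`, `local_search`, and `pyramid_align` are hypothetical, each level halves the resolution by averaging 2x2 blocks, and a sum of squared differences is used in place of structural similarity to keep the example NumPy-only.

```python
import numpy as np


def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = np.asarray(img[:h, :w], dtype=float)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0


def local_search(moving, reference, center, radius):
    """Best (dy, dx) within `radius` of `center`, by sum of squared differences."""
    moving = np.asarray(moving, dtype=float)
    reference = np.asarray(reference, dtype=float)
    best, best_err = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            diff = np.roll(moving, (dy, dx), axis=(0, 1)) - reference
            err = float(np.sum(diff * diff))
            if err < best_err:
                best, best_err = (dy, dx), err
    return best


def pyramid_align(moving, reference, levels=3, radius=4, refine=2):
    """Return the total (dy, dx), expressed in pixels of the input resolution."""
    if levels == 0:
        # Coarsest level: search the full window around zero shift.
        return local_search(moving, reference, (0, 0), radius)
    coarse = pyramid_align(downsample(moving), downsample(reference),
                           levels - 1, radius, refine)
    # Scale the coarse estimate to this level, then refine in a small window.
    center = (2 * coarse[0], 2 * coarse[1])
    return local_search(moving, reference, center, refine)
```

Because the doubled coarse estimate is passed in as the search center, the value returned at the top level is already the accumulated displacement, not an increment relative to the level below.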
The biggest challenge was that I spent too much time on the fancy indexing approach. I also skipped alignment at the highest-resolution level of the pyramid. In hindsight, exploring gradient descent on the alignment error could have yielded more efficient results.