Saturday, November 19, 2011

Image processing

Motivation

The last couple of days I have been working in a set of utilities for manipulating images. I have done functions for read, write, modify and view them. I have worked with PPM images [1]. I know they are too expensive, but also have a very easy syntax.

I started working with this because a partner told me about how to encrypt an image and give it to two folks, while no one can know the original image, but together can solve it.

I don't know if having control over PPM images is useful for someone else. I'm thinking about if it has worth to upload this work to Hackage [2]. I saw a couple of packages about the same issue, but I actually don't like none of them.

PPM as data

First at all, we need a data structure where store our PPM images. In a first try, I decided to use an Array [3] of RGB values. I changed my idea when I realize that modifying an Array is expensive (maybe, I am wrong here, can somebody confirm it?). A mutable array [4] was not an option, because I wanted to keep pure my code. Then, I shifted to Map [5]. Actually, this was a bit arbitrary, but after testing it, I moved again to IntMap [6]. IntMap is far more efficient than Map, and make unions very fast! Then, I decided to use unions for operations between PPM images, and the current implementation is built on top of an IntMap.

I wanted to make a simple interface, so my basic operations were:

getPixelUnsafe :: PPM -> Int -> Int -> RGB
putPixelUnsafe :: PPM -> Int -> Int -> RGB -> PPM

With this couple of functions it might be easy to implement almost every functionality we want to have, keeping the PPM type absctract. Anyway, I will sometimes re-use methods of the IntMap to make the library faster. I tagged these functions as "unsafe", because if the coordinates of the requested pixel are out of range of the PPM image, they will cause problems. We will use them only in contexts where we are sure that the coordinates are valid.

So, the first and base library provides a pure interface for create and manipulate PPM images, and a pair parser/render of PPM images. As an example:

createPPM :: Int -> Int -> RGB -> PPM

?> renderFilePPM "black.ppm" $createPPM 100 100 0 This call creates a black 100x100 PPM image. As you can see, RGB values are an instance of Num. In the example, fromInteger 0 = RGB 0 0 0. Moreover, abs acts like id, negate calculates the inverse color and signum turns a color into the grayscale. PPM image viewer The next tool I created was a PPM image viewer. I used GTK [7] to create the GUI. This step was very easy. After parsing a file, we just get the RGB value in each pixel and, using Cairo [8], print it in a drawing area. The code of the renderer is short so I will post it here: module PPMViewer.Render ( ppmRender ) where import Graphics.Rendering.Cairo import Codec.Image.PPM ppmRender :: PPM -> Render () ppmRender ppm = do let w = getWidth ppm h = getHeight ppm sequence_ [ do let rgb = getPixelUnsafe ppm x y r = fromIntegral (rgbRed rgb) / 255 g = fromIntegral (rgbGreen rgb) / 255 b = fromIntegral (rgbBlue rgb) / 255 setSourceRGB r g b rectangle (fromIntegral x) (fromIntegral y) 1 1 stroke | x <- [ 1 .. w ] , y <- [ 1 .. h ] ] I felt a bit odd using rectangles to print pixels. Is there another option? I don't know too much about Cairo, so if anyone knows a better implementation, I will be very glad to know it. PPM image converter Yes, we now have an interface that in theory can create any image, but creating a whole image pixel by pixel is non-viable! This is why I created the next tool: a PPM image converter from other formats (PNG, JPG, whatever). I used here again GTK, and since it has functions that read image files and turn them to Pixbufs, my solution was to create a function that transform a Pixbuf into a PPM data. Here is the code: module PPMConverter.Pixbuf ( pixbufPPM ) where import Graphics.UI.Gtk import Data.Array.MArray (readArray) import Codec.Image.PPM import Codec.Image.PPM.Internal import Control.Applicative import Data.Word (Word8) import qualified Data.IntMap as Map pixbufPPM :: Pixbuf -> IO PPM pixbufPPM pb = do n <- pixbufGetNChannels pb w <- pixbufGetWidth pb h <- pixbufGetHeight pb r <- pixbufGetRowstride pb arr <- (pixbufGetPixels pb :: IO (PixbufData Int Word8)) let xs :: [IO (Int,RGB)] xs = [ do let getp :: Int -> IO Int getp = fmap fromIntegral . readArray arr r <- getp p g <- getp$ p + 1
b <- getp $p + 2 return (pixelToInt w x y , RGB r g b) | x <- [ 1 .. w ] , y <- [ 1 .. h ] , let p = (y-1) * r + (x-1) * n ] a <- foldl (\r x -> liftA2 (uncurry Map.insert) x r) (pure Map.empty) xs return$ PPM w h a

I guess this admits a lot of improvements. Indeed, I getting space problems with this function for big images.

The encryption algorithm

Now, let's see a working example of these tools: the encryption algorithm motivated by the problem introduced at the beginning of this text. As a first step, we convert a PNG file with the PPM converter:

Left: the original PNG image. Right: the PPM image output.

Albatross are so cute! Well... now we save the file wherever you want. We can see it with the PPM viewer:

The PPM Viewer

It's time to encrypt this image. The first step is to create a random image of the same size of the original. Then, sum pixel by pixel and coordinate by coordinate (module 255) both images. The random and sum images form the two pieces of the puzzle. If we subtract one from the another we will get the original image!

But, what happens if we do the operation backwards?

Scary! But we have obtained the negative of the original image.

Operations between pixels are of trivial implementation using this library. And are very fast because they use union of IntMap, which are indeed very fast!

Conclusion

This is a sample of images (PPM here) implemented in Haskell. It was an interesting (and funny) exercise for me. If you are interested in have this functionality available publicy, make me know it.

Daniel Díaz.

References

[1] The PPM image format: http://netpbm.sourceforge.net/doc/ppm.html