Tuesday, December 05, 2006

Images - a beginner's guide

It appears to be a regular feature of my life that I have to explain, and occasionally re-explain, certain fundamental features of images stored on computers, especially when it comes to printing. Well, now that I've got this blog I can post it all here and just tell people to read it. Firstly, this is written with a certain practicality in mind: most people don't care how something works so much as why they should be doing things a certain way. With this in mind I'll skip large chunks of technical information; no complaints please.

How we see things

Sorry, but I have to start here as this is the basis of pretty much everything. We have colour sensors in our eyes called cones, which react to the red, green, and blue light entering our eyes. Now if you take a really close look at your monitor you'll see that it's made up of tiny red, green, and blue dots. By altering how much light each of these dots emits we can make up pretty much every other colour: emit equal parts red, green, and blue and we see things on a scale of black to white; lower the red and things start to turn cyan. This is known as the RGB colour scheme, or on occasion as emissive (additive) colours.

However, most objects don't emit light; they reflect it. You can read a monitor in the dark, but you can't read a book. So what do you do when you want to print something, like, for instance, a photo? Well, you could use the RGB model described above: mix up some red ink, some green ink, and some blue ink and create the colours that way. Trouble is, it doesn't work, at least not well. It turns out that what works great when you're emitting light doesn't work that well when you're trying to reflect it.

What to do? If you've got a colour printer you already know the answer to this: you don't buy red, green, and blue cartridges; you buy cyan, magenta, and yellow ones. These work great for reflective colours. Oops, we're missing one: you need a black cartridge too. Why's that? Simple: black is not a colour. I need to make that clear; black is the absence of colour. For the emissive colours this is easy: don't send any light out. How do you do it for reflective colours, when you can't print nothing? In theory, laying down full cyan, magenta, and yellow together should soak up all the light, but with real inks you get a muddy dark brown rather than a true black. So you need a separate black tank for printing. Even then, most of the black cartridges you'll buy aren't really black; they're very dark green, or blue, or occasionally red. If you've ever smudged some black ink and wondered why it changes colour, now you know.

So, with the inclusion of black for printing purposes only, we now have a new scheme - CMYK, or reflective colours (the K stands for the black, or 'key', ink).
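If you'd like to see the jump from RGB to CMYK as arithmetic, here's a rough Python sketch of the usual textbook conversion. Treat it as an approximation only: real printers and print shops rely on colour profiles rather than a formula this simple, and the function name is made up for illustration.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion from 0-255 RGB values to 0.0-1.0 CMYK fractions.

    Textbook approximation only; real printing uses colour profiles.
    """
    # Scale to 0.0-1.0 so the arithmetic is easier to follow.
    r, g, b = r / 255, g / 255, b / 255
    # Black (K) is however far the brightest channel falls short of full.
    k = 1 - max(r, g, b)
    if k == 1:
        # Pure black: use the black ink alone and no coloured ink.
        return (0.0, 0.0, 0.0, 1.0)
    # Each ink soaks up its opposite colour: cyan absorbs red, and so on.
    c = (1 - r - k) / (1 - k)
    m = (1 - g - k) / (1 - k)
    y = (1 - b - k) / (1 - k)
    return (c, m, y, k)


print(rgb_to_cmyk(255, 0, 0))  # pure red: no cyan, full magenta and yellow
print(rgb_to_cmyk(0, 0, 0))    # pure black: black ink only
```

Notice that red on screen turns into magenta plus yellow on paper, with no cyan at all.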

As an aside, here's the colour wheel in words: red sits opposite cyan, green opposite magenta, and blue opposite yellow.

Miss out a colour and the one opposite will predominate. That's very handy when your printer decides to print, let's say, a green cast on things: you know it's your magenta causing the problem.
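The 'opposite' of a colour is easy to work out if you treat colours as RGB numbers from 0 to 255. The sketch below assumes that convention and uses a made-up function name; it's only meant to show the idea.

```python
def opposite(colour):
    """Return the colour directly across the wheel from an (r, g, b) triple."""
    r, g, b = colour
    # Whatever a colour has of each channel, its opposite has the remainder.
    return (255 - r, 255 - g, 255 - b)


MAGENTA = (255, 0, 255)
CYAN = (0, 255, 255)
YELLOW = (255, 255, 0)

print(opposite(MAGENTA))  # (0, 255, 0), which is pure green
print(opposite(CYAN))     # (255, 0, 0), which is pure red
print(opposite(YELLOW))   # (0, 0, 255), which is pure blue
```

So a green cast points at the magenta cartridge, a red cast at the cyan, and a blue cast at the yellow.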

Image formats

Pretty much everyone today is familiar with JPG, some with GIF, and a few with PNG. These are all examples of raster (or bitmap) images. What does that mean? Well, a bit like your monitor screen, these images are made up of tiny dots called pixels, and each pixel stores information about what colour it is in either RGB or CMYK format. The most common method is RGB, storing 256 levels each for red, green, and blue. To cut a long story short, in most methods each pixel requires 3 bytes of information. So one of my photos at 3072 x 2304 pixels would take up 20.25 megabytes! That's a hefty size, so how come they're smaller?
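For anyone who wants to check that sum, here's the arithmetic as a short Python sketch. It assumes 3 bytes per pixel and counts a megabyte as 1024 x 1024 bytes; the function name is invented for illustration.

```python
def uncompressed_bytes(width, height, bytes_per_pixel=3):
    """Size of a raster image with no compression at all."""
    return width * height * bytes_per_pixel


size = uncompressed_bytes(3072, 2304)
print(size)                  # 21233664 bytes
print(size / (1024 * 1024))  # 20.25 megabytes
```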

In order to make the images a more manageable size, different formats can compress them. There are two ways of compressing information - lossless and lossy. PNGs are lossless. What does that mean? Well, let's say you were ordering some ink cartridges over the phone. Without compression you'd phone up and say "I'd like one black cartridge please", put the phone down, redial and say "I'd like one black cartridge please", put the phone down again, redial and say "I'd like one black cartridge please". Daft! You'd just phone up once and say "I'd like three black cartridges please". This is the essence of lossless: the information at the end of the compression is still the same.
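The phone order is a fair description of the simplest lossless trick going, usually called run-length encoding. The Python sketch below shows that trick and nothing more; PNG itself uses a cleverer method (DEFLATE), so this is an illustration of the principle rather than of how PNG works. The function names are made up.

```python
def rle_encode(items):
    """Collapse runs of repeated items into (count, item) pairs."""
    encoded = []
    for item in items:
        if encoded and encoded[-1][1] == item:
            # Same as the previous item: bump the count of the current run.
            encoded[-1][0] += 1
        else:
            # A different item: start a new run of length one.
            encoded.append([1, item])
    return [(count, item) for count, item in encoded]


def rle_decode(pairs):
    """Expand (count, item) pairs back into the original list."""
    decoded = []
    for count, item in pairs:
        decoded.extend([item] * count)
    return decoded


order = ["black", "black", "black"]
packed = rle_encode(order)
print(packed)              # [(3, 'black')]
print(rle_decode(packed))  # ['black', 'black', 'black']
```

Decoding gives back exactly what went in, which is the whole point of lossless.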

So what about lossy formats like JPG? Well, again let's say you wanted three cartridges, this time two black and one off-black. For lossless you'd order two blacks and an off-black, but for lossy you'd order three black cartridges. Lossy loses information. What you start with is not what you'll end with. Roughly speaking, it works by comparing each colour with its neighbours; if one appears close enough to be mistaken for another, more predominant colour, it'll be set to that. The level at which it decides is adjusted using the compression settings. Less compression, fewer changes; more compression, more changes. This is why you can get 'artifacts': the compressor decides that an entire block can be safely set to the same colour, so you get some odd effects.
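Here's a toy Python version of the 'close enough' idea. To be clear, this is not how JPG does it internally (JPG works on 8 x 8 blocks of pixels and throws away fine detail within each block); the sketch only shows what it means to discard small differences. The function name and the `step` setting are made up for illustration.

```python
def snap(values, step):
    """Snap each 0-255 brightness value down to a multiple of `step`.

    A bigger step means more compression and more lost detail.
    """
    return [(v // step) * step for v in values]


row = [0, 0, 3]      # two black pixels and one off-black
print(snap(row, 1))  # [0, 0, 3]  a step of 1 changes nothing
print(snap(row, 8))  # [0, 0, 0]  the off-black pixel is now plain black
```

Once the off-black has been snapped to black there's no way of getting it back, which is the difference between lossy and lossless.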

So that's raster; the other format is vector images. Unlike rasters, vector images don't store pixel-by-pixel information, they store equations. Draw a line between two points and a raster image would have to store every point along that line separately; a vector image would store the start point, the end point, and how the line is drawn. This results in two things: firstly, vector images are useless for photographs, as there's too much change from one point to the next; secondly, you can enlarge a vector image without any loss of detail, so no blurry bits.
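To see the difference in what gets stored, here's a small Python sketch. The 'vector' line is just four numbers (two endpoints), and the function turns it into pixels at whatever enlargement you ask for. It's a deliberately simple line-drawing routine, not what any particular graphics program uses, and the names are invented.

```python
def rasterise_line(x0, y0, x1, y1, scale=1):
    """Turn a vector line (just two endpoints) into a list of pixels.

    `scale` is a whole-number enlargement factor.
    """
    x0, y0, x1, y1 = x0 * scale, y0 * scale, x1 * scale, y1 * scale
    # Take one step per pixel along the longer direction of the line.
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [(x0, y0)]
    pixels = []
    for i in range(steps + 1):
        x = x0 + (x1 - x0) * i // steps
        y = y0 + (y1 - y0) * i // steps
        pixels.append((x, y))
    return pixels


line = (0, 0, 3, 3)  # the vector version: four numbers, whatever the size
print(rasterise_line(*line))                 # [(0, 0), (1, 1), (2, 2), (3, 3)]
print(len(rasterise_line(*line, scale=10)))  # 31
```

The vector description stays at four numbers however big you draw it, while the raster version needs more and more pixels.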

So rasters are best used for photographs and other highly changeable images, vectors for line art such as comics or logos.

As most of the file formats are designed for viewing on a screen, they tend to store information in RGB format; this of course needs to be translated into CMYK for the printer. If you start dealing with images at the raw end of things you'll get to the point where you create and store them in CMYK format; for high-end printing this gives you far more control over the output colours.

Size matters

This is the biggie (no pun intended). Why does that photo filling up your screen want to print out at the size of a postage stamp? There are three 'sizes' associated with photos - screen size, print size, and storage size. Screen size is fairly obvious: my photos are 3072 x 2304 pixels, so on a screen with a resolution of 1024 x 768 my photo will appear three times wider and three times taller than I can see in one go.

Storage size we've already dealt with to a large extent - it's the amount of space your photo is actually taking up on your hard-drive. Wondering why your apparently tiny photo is taking so long to email? This is the property that you're concerned with.

Print size is an awkward one; you need one extra bit of information - dpi. Despite metrication we still use inches for this. What does it mean? Well, instead of 'dots per inch' think of it as 'pixels per inch'. So for my photos at 72dpi the native print size is (3072/72) x (2304/72) inches, or about 42.7" x 32".
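The same sum as a few lines of Python, in case you want to plug in your own numbers. The function name is made up for illustration.

```python
def print_size_inches(width_px, height_px, dpi):
    """Native print size in inches: pixels divided by pixels per inch."""
    return (width_px / dpi, height_px / dpi)


width, height = print_size_inches(3072, 2304, 72)
print(round(width, 1), round(height, 1))  # 42.7 32.0
```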

Occasionally when you go to a professional printer they'll ask for something at least 300dpi. Most people will open their 72dpi photos, fiddle with the settings, and re-save them at 300dpi. Let's deal with that. Firstly, you're not adding any extra information to the image; where do you think it's going to come from? All that happens is you're shrinking the native print size from (in my case) about 42.7" x 32" to about 10.2" x 7.7". If you print both images out on an A4 sheet of paper they'll appear exactly the same. Want to know the equivalent dpi you'll be printing at for a given size? Simply divide the size in pixels by the print size in inches. A result below 300dpi will start to look a bit blocky close-up.
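That last division looks like this in Python. The function name and the example print widths are only there for illustration; notice that the dpi setting saved in the file doesn't come into it at all, only the pixels and the paper.

```python
def effective_dpi(pixels, inches):
    """Pixels per inch you actually get when printing at a given size."""
    return pixels / inches


# The same 3072-pixel-wide photo printed at two different widths.
print(round(effective_dpi(3072, 10), 1))  # 307.2, fine for close-up viewing
print(round(effective_dpi(3072, 20), 1))  # 153.6, likely to look blocky close-up
```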

Okay, that's it. I might come back and amend some bits and bobs, but if you read this you'll end up knowing more about images than the majority of people out there.