Scanning old papers in TIFF format. Is scanning at 48-bit color worth it?

color-depth scanning tiff

I am currently scanning old papers with some notes on them using an Epson V370 scanner. I want the output files to be TIFFs, but I am not sure which bit depth to choose.

One of the papers has only a few notes on it in black ink and no other colors. I want to get the highest quality out of that scanner, but is there any point in scanning white paper with black ink at a color depth as high as 48-bit, the maximum on my scanner?

Also, if I have papers with blue ink, will a higher color depth make a difference in quality?

Best Answer

To answer how many bits of color depth you need and how it affects your results, let me start with a quick explanation of what color depth actually is.

What is color depth?

Color depth describes how many shades of color will be stored. If an image has extremely fine gradations of a color, scanning and storing an extremely high number of colors means that those fine distinctions will be coded differently in the stored image and can be differentiated when you do image manipulation. Storing a lower number of bits means that some of those gradations will be stored as the same color, so they won't be differentiated.

You may have seen this effect trying to store a photo containing fine gradients into a lower bit format, like an 8 bit GIF, which stores only 256 unique colors. Instead of a continuous gradient, you see bands because multiple shades must be condensed into fewer available colors, producing color "steps", like the comparison below.

[Images: the same gradient shown smooth and with visible banding]
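The banding effect can be reproduced with a few lines of arithmetic. This is only a sketch: the `quantize` helper is a hypothetical name, and it reduces a single channel to a fixed number of levels, which is a simplification of what a palette format like GIF really does.

```python
def quantize(value, bits, source_bits=8):
    """Reduce one channel value to `bits` bits, then scale it back to the source range."""
    source_max = (1 << source_bits) - 1
    levels = (1 << bits) - 1
    step = round(value / source_max * levels)   # nearest available level
    return round(step / levels * source_max)    # back to the 0..source_max scale

# A smooth 8-bit gradient: every value from 0 to 255 appears once.
gradient = list(range(256))

# The same gradient stored with only 3 bits per channel.
banded = [quantize(v, 3) for v in gradient]

print(len(set(gradient)), "shades in the original")
print(len(set(banded)), "shades after reduction to 3 bits")
```

Runs of neighboring input values end up on the same output value, which is what shows up visually as bands.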

How many bits are required?

The human eye can distinguish more than 256 shades of each primary color, but 256 shades per primary are enough to render images in what appears to be photographic quality. That requires 8 bits for each primary color, or 24 bits in total, which gives over 16 million colors in combination. At 48 bits, or 16 bits per primary color, 65,536 shades of each primary color can be differentiated. That is far beyond what the eye can distinguish.
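The numbers above follow directly from powers of two. A quick check, with function names chosen just for this illustration:

```python
def shades_per_channel(bits_per_channel):
    """Number of distinct values one color channel can hold."""
    return 2 ** bits_per_channel

def total_colors(bits_per_channel, channels=3):
    """Number of distinct colors across all channels (red, green, blue)."""
    return shades_per_channel(bits_per_channel) ** channels

for bits in (8, 16):
    print(f"{bits * 3}-bit color: "
          f"{shades_per_channel(bits):,} shades per channel, "
          f"{total_colors(bits):,} colors in total")
```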

48 bit color

So why bother with 48 bit color at all? Because it's useful for photographic work. Detail may be washed out in the brightest areas or hard for the eye to distinguish in the darkest areas. With image manipulation, these ranges can be stretched to put more distance between similar colors so this detail is better differentiated. However, that leaves holes in the color spectrum. Starting with 48 bits provides those in-between colors that would otherwise be missing.

When you stretch one range of colors, other colors get compressed, consolidating some colors. Other types of image manipulation cause similar loss of some of the color values. When you start out with 24 bits, the cumulative loss through successive processing steps can be a noticeable degradation. Starting with 48 bits, even substantial loss of colors still leaves far more than is required.
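To see why the extra bits help, here is a rough sketch of stretching the brightest part of the range to fill the whole 8-bit output. The `stretch` function and the choice of the 200 to 255 range are assumptions made for the example, not what any particular editor does internally.

```python
def stretch(value, low, high, out_max=255):
    """Linearly map the range low..high onto 0..out_max, clipping values outside it."""
    value = min(max(value, low), high)
    return round((value - low) / (high - low) * out_max)

# Highlights from an 8-bit scan: only 56 distinct input values exist in 200..255.
out_from_8bit = {stretch(v, 200, 255) for v in range(200, 256)}

# The same tonal range from a 16-bit scan (8-bit values scale by 257 to 16-bit).
low16, high16 = 200 * 257, 255 * 257
out_from_16bit = {stretch(v, low16, high16) for v in range(low16, high16 + 1)}

print(len(out_from_8bit), "of 256 output levels used when starting from 8 bits")
print(len(out_from_16bit), "of 256 output levels used when starting from 16 bits")
```

Starting from 8 bits, the stretched result can use only 56 of the 256 output levels, so the rest are holes. Starting from 16 bits, every output level is filled.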

The result typically has to be reduced to 24 bits to display or print it normally. So even for photographic work, 48 bits is special-purpose.

Color depth vs. ability to scan colors

The scanner has specific optical properties and every scan is captured at the color depth that the hardware produces. That information is processed by software to produce an image at a specified color depth. So if your scanner is capable of 48 bit color, that's what's captured. If you want only 24 bits, some of the colors are consolidated.
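As an illustration of what "consolidated" means, the simplest way to turn a 16-bit channel value into an 8-bit one is to keep only the top 8 bits. This is an assumption for the sake of the example; actual scanner software may round or dither instead.

```python
def to_8bit(value16):
    """Keep the 8 most significant bits of a 16-bit channel value."""
    return value16 >> 8

# 256 neighboring 16-bit values that differ only in their low 8 bits.
neighbors = range(0x8000, 0x8100)

collapsed = {to_8bit(v) for v in neighbors}
print(len(neighbors), "distinct 16-bit values become", len(collapsed), "8-bit value")
```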

However, at any color depth, every color on the page will be stored as something. The difference is that at higher color depth, you will be able to tell more of them apart. So, for example, a higher color depth doesn't let you capture blue better.

Scanning text

If you are talking about text, there is absolutely no benefit from using 48 bits. It will just give you huge files that are slow to work with. But some amount of color depth can be helpful in cleaning up the scan.
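To put "huge files" in perspective, here is a rough size estimate for an uncompressed scan. The page size (US Letter, 8.5 x 11 inches) and the 300 dpi resolution are assumptions for the example, and the figure ignores TIFF headers and any compression.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate raw image size in bytes for a page scanned at the given dpi."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

for bits in (48, 24, 8, 1):
    size = uncompressed_bytes(8.5, 11, 300, bits)
    print(f"{bits:2d} bits per pixel: {size / 1_000_000:.1f} MB")
```

Under these assumptions a 48 bit page is about 50 MB before compression, twice the size of the 24 bit version, and the gap grows with the square of the resolution.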

Use of color information for cleanup

Consider a fax. It works with 1 bit, which gives you black or white. So every color on the page must be represented by one or the other. That's accomplished by selecting a threshold darkness. Anything lighter becomes white; anything darker becomes black (reducing 48 bit color to 24 bit color works on a similar principle, just with many more levels than two). With a fax, the result is often a mess: blocky letters, a smudge becomes a grainy black blob, a fold in the paper becomes a black line.

That's because of what the scanner sees. The paper isn't pure white (and it might be yellowed in an uneven way). If there are any folds or wrinkles, you can see them because they introduce shading. The letters on the page aren't pure black, and often contain lighter areas. Dirt or smudges have darkness and color. Often, the darkest portions of artifacts are darker than the lightest portions of the content. This complicates trying to produce a clean scanned page.
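The problem with a single threshold can be shown with a handful of made-up brightness values (0 is black, 255 is white). The sample values and helper names below are hypothetical; they are chosen so that the faintest ink is lighter than the darkest artifact, as described above.

```python
# Hypothetical grayscale readings from different parts of a scanned page.
samples = {
    "clean paper": 235,
    "yellowed paper": 200,
    "fold shadow": 120,
    "smudge": 90,
    "faint ink": 140,
    "dark ink": 30,
}
content = {"faint ink", "dark ink"}  # the parts that should end up black

def threshold(gray, cutoff):
    """Return 0 (black) for values darker than the cutoff, otherwise 255 (white)."""
    return 0 if gray < cutoff else 255

def misclassified(cutoff):
    """List the samples that come out the wrong color at this cutoff."""
    wrong = []
    for name, gray in samples.items():
        is_black = threshold(gray, cutoff) == 0
        if is_black != (name in content):
            wrong.append(name)
    return wrong

print("Wrong at cutoff 128:", misclassified(128))

# Try every possible cutoff and keep those that classify everything correctly.
clean_cutoffs = [c for c in range(256) if not misclassified(c)]
print("Cutoffs that get everything right:", clean_cutoffs)
```

With these values no cutoff works: any threshold high enough to keep the faint ink also turns the fold and the smudge black.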

Having some color information to work with allows you to use image manipulation tools to clean up the scan; to distinguish artifacts from content. After the artifacts have been removed, the scan can be made more readable by reducing the color depth. Forcing text to be dark and the background to be white more closely mimics what the original document looked like when it was freshly printed on white paper.

Bottom Line

Color depth won't improve your ability to capture colors, like blue, which don't scan as well as some other colors. However, it gives you the ability to improve the result. Scanning in 24 bit color is a good starting point if the originals are not pristine. Even if it was originally black ink on white paper, the color information will make it much easier to get rid of artifacts, which usually do have color.

Once you have removed the artifacts, the color information can be used to improve the appearance of the content. Blue ink that didn't scan well can be darkened without affecting colors that did scan well. Things like an embossed notary seal that might be barely visible can be darkened. Off-white paper can be whitened. Contrast between the content and the background can be improved.
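As a rough sketch of how color information makes a selective fix possible, the function below darkens only pixels where blue clearly dominates. The function name, the `margin` and `factor` values, and the sample pixels are all made up for illustration; an image editor's selective color tools do something more refined.

```python
def darken_blue_ink(pixel, margin=30, factor=0.5):
    """Darken an (R, G, B) pixel only if blue exceeds red and green by `margin`."""
    r, g, b = pixel
    if b - max(r, g) > margin:
        return tuple(round(c * factor) for c in pixel)
    return pixel

# Hypothetical pixel values from a scanned page.
paper = (235, 230, 215)      # slightly yellowed background
black_ink = (40, 40, 45)     # scanned well, should stay as it is
blue_ink = (90, 100, 180)    # scanned too light

for name, px in [("paper", paper), ("black ink", black_ink), ("blue ink", blue_ink)]:
    print(name, px, "->", darken_blue_ink(px))
```

In a grayscale or 1 bit scan the blue ink and a light gray smudge could have the same value, so this kind of selection would not be possible.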

Once all of this is done, a much smaller range of colors can be used to represent the page. So 24 bit color can be reduced to 8 bit color (or less), or grayscale. This allows the finished result to be stored in a much smaller file while looking better than the original.

Low color depth trick

If you are working with text and want the end result to look like clean black text on white paper, there is a trick you can do using low color depth. You start with a substantially higher resolution than what is needed for the result, say 800 to 1200 dpi, and 24 bit color. Use the color information to remove artifacts, improve contrast, etc. until it is as good as you can get it. Then convert the image to 1 bit color (black and white).

This will force the cleaned image to black on white while the high resolution will capture fine detail in the content. Then down-sample to the desired resolution (typically 200 to 300 dpi). Down-sampling averages neighboring black and white pixels into intermediate shades, so it converts the file to grayscale or 24 bit color. If this is not automatic, select grayscale as the output.

This will have an effect similar to the font smoothing used by ClearType (sub-pixel rendering). Detail that would have been totally lost scanning at high contrast and normal resolution will be preserved in a few bits of grayscale. The file can be saved in something like a 4 bit grayscale, which will be a very small file with high quality results.
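The steps of the trick can be sketched on a tiny image. This assumes the cleanup has already been done and the image is grayscale; the pixel values, the factor of 4 and the helper names are all invented for the example, and real software may use a different resampling filter than the plain block average used here.

```python
def to_bilevel(image, cutoff=128):
    """Force every pixel to pure black (0) or pure white (255)."""
    return [[0 if p < cutoff else 255 for p in row] for row in image]

def downsample(image, factor):
    """Average each factor x factor block of pixels into one output pixel."""
    out = []
    for y in range(len(image) // factor):
        row = []
        for x in range(len(image[0]) // factor):
            block = [image[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(round(sum(block) / len(block)))
        out.append(row)
    return out

def to_4bit(gray):
    """Map an 8-bit gray value onto 16 levels (0..15)."""
    return round(gray / 255 * 15)

# A 4 x 8 high-resolution patch containing the slanted edge of a letter.
high_res = [
    [30, 40, 210, 220, 225, 230, 228, 226],
    [35, 45,  60, 215, 222, 229, 231, 227],
    [30, 50,  55,  70, 218, 224, 230, 229],
    [40, 42,  48,  65,  80, 221, 227, 232],
]

bilevel = to_bilevel(high_res)
low_res = downsample(bilevel, 4)
packed = [[to_4bit(p) for p in row] for row in low_res]

print(low_res)  # [[48, 239]]
print(packed)   # [[3, 14]]
```

The two output pixels are neither pure black nor pure white. Their gray values record how much of each block the letter covered, which is how the edge detail survives at the lower resolution.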
