Wednesday 4 March 2015

Unit 38 - Colour management procedures, Storing images and files for future use

Digital Image File Types.
JPG, GIF, TIFF, PNG, BMP. What are they, and how do you choose? These and many other file types are used to encode digital images. The choices are simpler than you might think.
Part of the reason for the plethora of file types is the need for compression. Image files can be quite large, and larger files mean more disk usage and slower downloads. Compression is a term used to describe ways of cutting the size of the file. Compression schemes can be lossy or lossless.
Another reason for the many file types is that images differ in the number of colours they contain. If an image has few colours, a file type can be designed to exploit this as a way of reducing file size.
Lossy vs. Lossless compression
You will often hear the terms "lossy" and "lossless" compression. A lossless compression algorithm discards no information. It looks for more efficient ways to represent an image, while making no compromises in accuracy. In contrast, lossy algorithms accept some degradation in the image in order to achieve smaller file size.
A lossless algorithm might, for example, look for a recurring pattern in the file, and replace each occurrence with a short abbreviation, thereby cutting the file size. In contrast, a lossy algorithm might store colour information at a lower resolution than the image itself, since the eye is not so sensitive to changes in colour over a small distance.
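The lossy idea described above, storing colour at a lower resolution than brightness, can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function name `subsample_chroma` and the (brightness, colour) pixel layout are made up for the example, and real encoders work on two-dimensional blocks with separate colour channels.

```python
def subsample_chroma(pixels):
    """Keep every brightness value, but share one averaged colour value per pair.

    `pixels` is one image row as a list of (brightness, colour) tuples.
    This layout is a simplification chosen for the example.
    """
    out = []
    for i in range(0, len(pixels), 2):
        pair = pixels[i:i + 2]
        # One colour value is stored for the whole pair (integer average).
        shared_colour = sum(colour for _, colour in pair) // len(pair)
        out.extend((brightness, shared_colour) for brightness, _ in pair)
    return out


row = [(200, 90), (198, 94), (50, 120), (52, 124)]
print(subsample_chroma(row))
# [(200, 92), (198, 92), (50, 122), (52, 122)]
```

The brightness values survive untouched, while the colour values are only approximately recovered, which is why this kind of scheme is lossy.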

Number of colours
Images start with differing numbers of colours in them. The simplest images may contain only two colours, such as black and white, and will need only 1 bit to represent each pixel. Many early PC video cards would support only 16 fixed colours. Later cards would display 256 simultaneously, any of which could be chosen from a pool of 2^24, or 16 million, colours. New cards devote 24 bits to each pixel, and are therefore capable of displaying 2^24, or 16 million, colours without restriction. A few display even more. Since the eye has trouble distinguishing between similar colours, 24 bit or 16 million colours is often called True Colour.
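The relationship between bits per pixel and the number of available colours is simple arithmetic: n bits give 2^n distinct values. A small sketch (the function name `colours_for_bits` is just a label for this example):

```python
def colours_for_bits(bits_per_pixel):
    """Number of distinct colours that a given bit depth can represent."""
    return 2 ** bits_per_pixel


for bits in (1, 4, 8, 24):
    print(f"{bits:>2} bits per pixel -> {colours_for_bits(bits):,} colours")
```

This gives 2 colours for 1 bit, 16 for 4 bits, 256 for 8 bits and 16,777,216 (the "16 million") for 24 bits.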

The file types
TIFF is, in principle, a very flexible format that can be lossless or lossy. The details of the image storage algorithm are included as part of the file. In practice, TIFF is used almost exclusively as a lossless image storage format that uses no compression at all. Most graphics programs that use TIFF do not apply compression. Consequently, file sizes are quite big. (Sometimes a lossless compression algorithm called LZW is used, but it is not universally supported.)
PNG is also a lossless storage format. However, in contrast with common TIFF usage, it looks for patterns in the image that it can use to compress file size. The compression is exactly reversible, so the image is recovered exactly.
GIF creates a table of up to 256 colours from a pool of 16 million. If the image has fewer than 256 colours, GIF can render the image exactly. When the image contains many colours, software that creates the GIF uses any of several algorithms to approximate the colours in the image with the limited palette of 256 colours available. Better algorithms search the image to find an optimum set of 256 colours. Sometimes GIF uses the nearest colour to represent each pixel, and sometimes it uses "error diffusion" to adjust the colour of nearby pixels to correct for the error in each pixel.
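The "nearest colour" approach mentioned above can be sketched as follows. This is a minimal illustration, not how any particular GIF encoder is implemented: the palette is a hand-picked four-colour example, distance is plain squared RGB distance, and error diffusion is left out.

```python
def nearest_colour(pixel, palette):
    """Return the palette entry with the smallest squared RGB distance to `pixel`."""
    return min(
        palette,
        key=lambda entry: sum((a - b) ** 2 for a, b in zip(pixel, entry)),
    )


# A tiny hypothetical palette (a real GIF palette may hold up to 256 entries).
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
image = [(250, 10, 10), (20, 20, 30), (240, 240, 250)]

print([nearest_colour(px, palette) for px in image])
# [(255, 0, 0), (0, 0, 0), (255, 255, 255)]
```

Each pixel is replaced by a palette entry, so any colour that is not in the palette is only approximated.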
GIF achieves compression in two ways. First, it reduces the number of colours of colour-rich images, thereby reducing the number of bits needed per pixel, as just described. Second, it replaces commonly occurring patterns (especially large areas of uniform colour) with a short abbreviation: instead of storing "white, white, white, white, white," it stores "5 white."
Thus, GIF is "lossless" only for images with 256 colours or fewer. For a rich, true colour image, GIF may "lose" 99.998% of the colours.
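The "5 white" abbreviation described above is a form of run-length encoding, and it is easy to sketch. Treat this as a simplified stand-in: GIF itself uses the more elaborate LZW scheme, and the function names here are made up for the example. The last lines check the 99.998% figure, assuming a 256-entry palette drawn from 2^24 possible colours.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse runs of identical values into (count, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    return [value for count, value in runs for _ in range(count)]


row = ["white"] * 5 + ["black"] * 2 + ["white"] * 3
runs = rle_encode(row)
print(runs)  # [(5, 'white'), (2, 'black'), (3, 'white')]

# Lossless: decoding gives back exactly what was encoded.
assert rle_decode(runs) == row

# Share of the 24-bit colour pool that a 256-colour palette cannot hold.
lost = 1 - 256 / 2 ** 24
print(f"{lost:.3%}")  # 99.998%
```

Run-length encoding pays off only when the image has long runs of identical pixels, which is why this kind of compression suits flat graphics far better than photographs.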
JPG is optimized for photographs and similar continuous tone images that contain many, many colours. It can achieve astounding compression ratios even while maintaining very high image quality. GIF compression is unkind to such images. JPG works by analysing images and discarding kinds of information that the eye is least likely to notice. It stores information as 24 bit colour. Important: the degree of compression of JPG is adjustable. At moderate compression levels of photographic images, it is very difficult for the eye to discern any difference from the original, even at extreme magnification. Compression factors of more than 20 are often quite acceptable. Better graphics programs, such as Paint Shop Pro and Photoshop, allow you to view the image quality and file size as a function of compression level, so that you can conveniently choose the balance between quality and file size.
RAW is an image output option available on some digital cameras. Though lossless, it is a factor of three or four smaller than TIFF files of the same image. The disadvantage is that there is a different RAW format for each manufacturer, and so you may have to use the manufacturer's software to view the images. (Some graphics applications can read some manufacturers' RAW formats.)
BMP is an uncompressed proprietary format invented by Microsoft. There is really no reason to ever use this format.
PSD, PSP, etc are proprietary formats used by graphics programs. Photoshop's files have the PSD extension, while Paint Shop Pro files use PSP. These are the preferred working formats as you edit images in the software, because only the proprietary formats retain all the editing power of the programs. These packages use layers, for example, to build complex images, and layer information may be lost in the non-proprietary formats such as TIFF and JPG. However, be sure to save your end result as a standard TIFF or JPG, or you may not be able to view it in a few years when your software has changed.
Currently, GIF and JPG are the formats used for nearly all web images. PNG is supported by most of the latest generation browsers. TIFF is not widely supported by web browsers, and should be avoided for web use. PNG does everything GIF does, and better, so expect to see PNG replace GIF in the future. PNG will not replace JPG, since JPG is capable of much greater compression of photographic images, even when set for quite minimal loss of quality.


File size comparisons
Below are comparisons of the same image saved in several popular file types. (Note that there is no reason to view more than one of the TIFFs or the PNG. Since all are lossless formats, their appearance is identical.)
File type | Size
TIFF, uncompressed | 901K
TIFF, LZW lossless compression | 928K
JPG, high quality | 319K
JPG, medium quality | 188K
JPG, my usual web quality | 105K
JPG, low quality / high compression | 50K
JPG, absurdly high compression | 18K
PNG, lossless compression | 741K
GIF, lossless compression, but only 256 colours | 286K

When should you use each?
TIFF
This is usually the best quality output from a digital camera. Digital cameras often offer around three JPG quality settings plus TIFF. Since JPG always means at least some loss of quality, TIFF means better quality. However, the file size is huge compared to even the best JPG setting, and the advantages may not be noticeable.
A more important use of TIFF is as the working storage format as you edit and manipulate digital images. You do not want to go through several load, edit, save cycles with JPG storage, as the degradation accumulates with each new save. One or two JPG saves at high quality may not be noticeable, but the tenth certainly will be. TIFF is lossless, so there is no degradation associated with saving a TIFF file.
Do NOT use TIFF for web images. They produce big files, and more importantly, most web browsers will not display TIFFs.

JPG
This is the format of choice for nearly all photographs on the web. You can achieve excellent quality even at rather high compression settings. I also use JPG as the ultimate format for all my digital photographs. If I edit a photo, I will use my software's proprietary format until finished, and then save the result as a JPG.
Digital cameras save in a JPG format by default. Switching to TIFF or RAW improves quality in principle, but the difference is difficult to see. Shooting in TIFF has two disadvantages compared to JPG: fewer photos per memory card, and a longer wait between photographs as the image transfers to the card. I rarely shoot in TIFF mode.
Never use JPG for line art. On images of this kind, with areas of uniform colour and sharp edges, JPG does a poor job. These are tasks for which GIF and PNG are well suited. See JPG vs. GIF for web images.
GIF
If your image has fewer than 256 colours and contains large areas of uniform colour, GIF is your choice. The files will be small yet perfect. Here is an example of an image well-suited for GIF:
http://users.wfu.edu/matthews/misc/jpg_vs_gif/testImage.gif
Do NOT use GIF for photographic images, since it can contain only 256 colours per image.

PNG
PNG is of principal value in two applications:
  1. If you have an image that has large areas of exactly uniform colour but contains more than 256 colours, PNG is your choice. Its strategy is similar to that of GIF, but it supports 16 million colours, not just 256.
  2. If you want to display a photograph exactly without loss on the web, PNG is your choice. Later generation web browsers support PNG, and PNG is the only lossless format that web browsers support.
PNG is superior to GIF. It produces smaller files and allows more colours. PNG also supports partial transparency. Partial transparency can be used for many useful purposes, such as fades and antialiasing of text. Unfortunately, Microsoft's Internet Explorer does not properly support PNG transparency, so for now web authors must avoid using transparency in PNG images.

Other formats
When using graphics software such as Photoshop or Paint Shop Pro, working files should be in the proprietary format of the software. Save final results in TIFF, PNG, or JPG.
Use RAW only for in-camera storage, and copy or convert to TIFF, PNG, or JPG as soon as you transfer to your PC. You do not want your image archives to be in a proprietary format. Although several graphics programs can now read the RAW format for many digital cameras, it is unwise to rely on any proprietary format for long term storage. Will you be able to read a RAW file in five years? In twenty? JPG is the format most likely to be readable in 50 years. Thus, it is appropriate to use RAW to store images in the camera and perhaps for temporary lossless storage on your PC, but be sure to create a TIFF, or better still a PNG or JPG, for archival storage.

Choosing a File Format for Digital Still Images
The choice of file formats can often prove overwhelming for someone new to the world of digital imaging. The aim of this document is to explain some of the factors that should be considered before choosing a format and suggest suitable file formats for specific applications.

Over the years, there have been a number of file formats that have been proposed and used. Every year, this choice gets larger and larger as new file formats are introduced and it is not always immediately clear which is the best one to use in any particular case. The choice will depend on a number of factors, which will vary according to the type of media and how you intend to use the file. Each stage of the process, from capture through to delivery, has its own requirements that may affect this choice.
This report provides a brief look at some of these factors and provides guidelines to making the best choice from what is available.
For a full introduction to the file formats themselves, see the JISC Digital Media advice document File Formats and Compression.

Choose a non-proprietary open 'standard'
Despite the large range of available file formats, choosing one should not be too hard as only a very few of them are normally recommended for digitisation projects. Any digitisation project will need to consider the long-term usefulness and accessibility of the images and this means choosing a file that is both an established industry 'standard' as well as a non-proprietary format. This limits the range to a much more easily considered number that includes the most common four below:
·         TIFF (Tagged Image File Format)
·         PNG (Portable Network Graphics)
·         JPEG or JFIF (Joint Photographic Experts Group File Interchange Format)
·         GIF (Graphic Interchange Format)
There can be good reasons why a project might wish or need to use another file format at some part of their project, such as some of the proprietary formats including:
·         PDF (Portable Document Format/Adobe Acrobat File)
·         PSD (Photoshop Document/Adobe Photoshop Image File)
·         The camera's native RAW file

However this is likely to come about because of some specific need of a particular project and cannot be covered here. For details of these file types and many others, please see the JISC Digital Media advice document File Formats and Compression.

Image capture is the first step in the digitisation process. When capturing images, it is important that they are all created at the highest possible quality and at a size appropriate for all subsequent uses. Errors at this point will certainly compromise the quality of the whole project and the only recovery option will be to go back and re-capture the original.
All digital capture devices originally capture values of Red, Green and Blue. The number of different describable colours (or tones of grey) will depend upon the 'bit-depth' of the device. Any modern device will be able to capture in at least 24-bit colour (or 8-bit B&W) (see the JISC Digital Media advice document The Digital Still Image), although some modern devices can capture at higher bit depths, right up to 48-bit.

Some of the more advanced cameras offer their own un-processed RAW formats. These files contain all of the original data as captured by the sensor without alteration. These images are then processed on the computer where fine adjustments can be made to the white balance, exposure and sharpness before saving in a non-proprietary format. RAW files usually contain higher bit depths than the equivalent JPEGs and TIFFs produced by the camera.
Once the capture device has created the image, it must be saved for later use.

Format requirements
A file format should be chosen that:
·         Retains all information that was created by the capture device. This will mean using a file format that can store the image in at least the same colour depth as it was created. 24-bit for colour and 8-bit for B&W should be considered the minimum although files captured with a larger bit-depth should really be archived with this information
·         Retains any capture device colour management information (ICC profile)
·         Uses (or can be set up to use) no compression
The suggested format here is: TIFF or the proprietary format of capture device.
Although it is normally advisable to avoid all proprietary file formats, there can be an argument for the temporary use of a proprietary format within the scanning software if it is able to offer some level of additional functionality. However it would still need to be converted to another open standard format before being archived.
When we mention or specify TIFF, it is important to realise that the TIFF file format comes in a range of types, supporting different functionality, such as multipages and even a choice of compressions including JPEG. So when we specify TIFF for archival purposes we always mean an uncompressed Baseline TIFF v6 with Intel byte order (PC option).
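The "Intel byte order" mentioned above is recorded in the first bytes of every TIFF file: the file starts with the two letters II (Intel, little-endian) or MM (Motorola, big-endian), followed by the number 42 stored in that byte order. A minimal sketch of checking this is shown below. The function name is made up for the example, and it inspects only the header; it does not verify that the file is an uncompressed Baseline TIFF.

```python
import struct


def tiff_byte_order(header):
    """Identify the byte order from the first four bytes of a TIFF file.

    Returns a description string, or None if the bytes are not a TIFF header.
    """
    if len(header) < 4:
        return None
    if header[:2] == b"II":
        fmt, name = "<H", "Intel (little-endian)"
    elif header[:2] == b"MM":
        fmt, name = ">H", "Motorola (big-endian)"
    else:
        return None
    # Bytes 2-3 hold the TIFF identifier 42 in the declared byte order.
    (magic,) = struct.unpack(fmt, header[2:4])
    return name if magic == 42 else None


print(tiff_byte_order(b"II*\x00"))  # Intel (little-endian)
print(tiff_byte_order(b"MM\x00*"))  # Motorola (big-endian)
print(tiff_byte_order(b"\x89PNG"))  # None

# Hypothetical usage with a real file:
# with open("master_scan.tif", "rb") as f:
#     print(tiff_byte_order(f.read(4)))
```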

There are two possible methodologies for creating a Master Archive and both have advantages, depending on the project.

Method 1 - Archive all data exactly as created by the capture device.
The Master Archive contains a copy of each image in a form as close as possible to the original captured data. This enables the project to go back to the archive knowing that they have an exact copy of everything that was originally created by the capture device for the project. It should be realised that as images are pre-optimisation, they might not look as good as those archived using Method 2. They will be in a totally original form but not necessarily the highest visual quality. With this approach, it is important to use a colour space that in no way compromises the colour gamut of the original data. This will often mean leaving the image within the capture device's own colour space, but could mean using a larger or unbounded colour space such as CIE Lab.

Method 2 - Archive an optimised version of image file.
The Master Archive contains a copy of each image after it has been prepared and optimised for use at its highest quality (see Basic Guidelines for Image Capture and Optimisation). This has the advantage of archiving the image in a ready-to-use state. The optimisation need only be done once, and all images can be handled in a consistent way. However, it is inevitable that some data will have been lost in the process and if the optimisation (see Basic Guidelines for Image Capture and Optimisation) is in any way inappropriate or badly undertaken then the project will be unable to go back to the original data and work from there. For this approach it would make sense to save the image in a colour space appropriate for the intended use of the image in the future. (Adobe RGB 1998 would be advised for print/web, but sRGB could be used if the only delivery medium was going to be the web).


Format requirements
The requirements of a file format for archiving are the same as for creation except that it should also:
·         Be an open standard file format - proprietary formats should not be used, as there is uncertainty about the ability to open the file in the future. A possible exception to this might be the Adobe Photoshop format - see below
·         Preferably not use any compression, although lossless compression may be acceptable. Be aware that one of the most common lossless compressions is LZW, which is based upon patented technology and should therefore be avoided.
Suggested formats:
·         Method 1: DNG, TIFF, PNG
·         Method 2: TIFF, PNG or possibly PSD

One way around the question of whether to archive before or after optimisation is to use the 'layers' features of Photoshop and save the image as a PSD file. This proprietary file format allows both the original image (un-optimised) and any optimisation to be stored within the same file. This effectively allows both states of the file to be archived within the same file. The PSD file is however a 'Proprietary' format and its use should therefore be approached with great care.

All image optimisation and manipulation is undertaken within image processing software. Whilst carrying out this work, it can be useful to save the image in the proprietary format of the image processing software.

Editing can be a time-consuming process and the proprietary formats offer increased functionality that enables extra information (e.g. layers, masks and channels) to be stored. This enables subsequent editing to resume from where the last session finished without having to recreate any prior work. Unfortunately using a proprietary file format in this way conflicts with the preservation requirements of our archive images. This is where archiving after optimisation can have an advantage.
On the other hand, if the image is going to require a lot of manipulation or will be made for a specific use then it can be helpful to have access to the original file before any other processing has been undertaken. This is an advantage of archiving before optimisation.
Suggested formats: Image processing proprietary formats such as PSD for Photoshop, PSP for Paint Shop Pro and PNG for Fireworks. However TIFF is still a good choice if the increased functionality of the proprietary formats is not required (the TIFF format can save some layer information but only a few programs such as Photoshop CS can read this information - so it can no longer be considered a truly open format).
However, once the image manipulation has been finished the file should be saved in a form appropriate to its subsequent use.

Choosing the correct image file format for delivery probably poses the hardest decision with the biggest variety of choice. These are just some of the issues that will need to be considered:
·         What is the intended use of the image after delivery?
·         How much image resolution is needed to convey the intellectual content to the user?
·         On what output device is the image going to be used - monitor, printer, projector?
·         What are the capabilities of the output device? What bit depth can it handle? What is the required resolution?
·         What bandwidth is available for delivery?
·         Is the image for photo-realistic or presentation use?
·         How is the image going to be delivered? CD-ROM, tape, WAP, Internet (dialup, broadband, LAN or WAN connection)?
·         Is there a requirement to add any watermarking or deal with any other digital rights management issue?
·         Do the users require the image to be provided with any colour profile or other colour management information?

With so many considerations, combined with the proliferation of file formats, each designed for a specific use, it is little wonder that this subject continues to confuse and engender debate.
With this in mind, the following are more in the form of ideas for consideration than guidelines.

Commercial printing
It is hard to give generic advice in this area. The important thing is to talk to the person doing the printing, as mistakes can be costly and it is the printer who should understand what must be provided for the agreed use. They will hopefully be able to give you specific image preparation guidelines so as to help you prepare images correctly for their workflow.
Normally the printer will want images in a high quality uncompressed format such as TIFF or within an encapsulated metafile such as EPS or PDF (although in the commercial world Quark files are also popular as many printers have an established workflow based around Quark XPress, which provides all layout and sizing, whilst the image is provided as a linked TIFF).
Remember that the printing process uses subtractive colour rather than additive colour (see the JISC Digital Media advice document The Digital Still Image) and this means the image must be printed from a CMYK file rather than an RGB one. It will therefore be necessary for either you or the printer to convert the image file from RGB to CMYK. This is rarely an easy task and should be undertaken with care by a skilled operator who understands the workings of a CMYK printing workflow. Due to problems with this process, it is becoming more common to provide the printer with an RGB file and ask them to undertake the transformation. When this is done, it is normal to use an RGB colour space that is designed to transform to CMYK easily. There are a few possibilities, but the most common and almost standard is Adobe RGB 1998.
Suggested formats: TIFF (RGB), TIFF (CMYK), EPS, PDF
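To show why the RGB to CMYK step discussed above is a genuine transformation rather than a relabelling, here is the naive textbook formula. This is emphatically not what a print workflow does: it ignores ICC profiles, inks, paper and gamut, and the function name is made up for the example. It only illustrates the move from additive to subtractive values.

```python
def rgb_to_cmyk(r, g, b):
    """Naive, device-independent RGB (0-255) to CMYK (0.0-1.0) conversion.

    Illustration only: real conversions are done through colour profiles.
    """
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0  # pure black: key (black) ink only
    # Subtractive primaries are the complements of the additive ones.
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    # Pull the shared grey component out into the black channel.
    k = min(c, m, y)
    return tuple((x - k) / (1 - k) for x in (c, m, y)) + (k,)


print(rgb_to_cmyk(255, 0, 0))      # (0.0, 1.0, 1.0, 0.0)
print(rgb_to_cmyk(255, 255, 255))  # (0.0, 0.0, 0.0, 0.0)
print(rgb_to_cmyk(0, 0, 0))        # (0.0, 0.0, 0.0, 1.0)
```

Pure red becomes full magenta plus full yellow, and white becomes no ink at all. Everything in between depends heavily on the actual press and paper, which is why the text recommends leaving the conversion to a skilled operator.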

Desktop printing
It is quite normal to have to undertake a fair amount of testing and adjusting with a desktop printer before it is possible to get the best results out of it. Most of these devices (certainly all those using ink/pigment) print in CMYK; however, they normally undertake the conversion themselves and have been designed to work best with RGB data. The exceptions to this are 'continuous tone' printers such as the dye sublimation and photo-printer types which print in RGB.
The normal desktop printers (ink-jet and colour photocopier) are designed to work happily with a range of image file formats, including JPEG compressed files. However they will still work best with the maximum amount of image data supplied by an uncompressed image such as a TIFF or PSD. Nonetheless, surprisingly good results can be obtained from JPEG compressed files as long as the quality is set at the highest setting (with a file size larger than 10% of original).
Suggested formats: TIFF (RGB), PSD, JPEG (high quality setting)

For most digitisation projects, the most common delivery format is simply a monitor with the images viewed through a web browser interface. This makes the choice of file format easy as the current selection of web browsers only support a small range of image file formats (JPEG, GIF & PNG), although this range can be extended with the use of the appropriate plug-in.

Delivering images through a web browser has some inherent advantages and unfortunately some challenges. The main advantage is that (in common with all monitor delivery) images naturally look 'good' on a monitor where their perceived 'brightness' (the light is being transmitted to you, rather than reflected) hides many small deficiencies in quality that would compromise quality if the image was printed. On the other hand, present browsers have only limited image-viewing capabilities and are unable to 'zoom' in and out of the images. This means that delivery is limited to images with pixel dimensions that fit within the user's browser - suggested standards at present are to design web pages to a size of 800 x 600 pixels giving standard image sizes of approx 512 pixels on the longest edge.
The largest limitation on the quality of images delivered on the web and the main influence on 'choice', is the need for them to be compressed to a size that makes their delivery over the limited available bandwidth possible. All the file formats supported by web browsers provide compression, however the amount and method of compression varies.
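The suggestion above of roughly 512 pixels on the longest edge translates into a simple scaling calculation that preserves the aspect ratio. A sketch follows; the function name and the choice never to enlarge small images are assumptions made for this example.

```python
def fit_longest_edge(width, height, longest=512):
    """Scale pixel dimensions so the longest edge is at most `longest`."""
    scale = longest / max(width, height)
    if scale >= 1:
        return width, height  # already small enough; do not enlarge
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_longest_edge(3000, 2000))  # (512, 341)
print(fit_longest_edge(2000, 3000))  # (341, 512)
print(fit_longest_edge(400, 300))    # (400, 300)
```

The resulting dimensions would then be passed to whatever image software performs the actual resampling.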

Web browsers currently support the following file formats:
·         JPEG (JFIF) - JPEG is not actually a file type, but a type of compression proposed by the Joint Photographic Experts Group. It is used within the JFIF file format, which uses the file extension .jpg and which we colloquially call 'JPEG'. It is a lossy compression and will provide the best quality and lowest file size for continuous tone images. The amount of compression given to the file is chosen at the time of saving the file and allows for variation in quality against file size: as a rule of thumb, it is normally considered that a file compressed with JPEG to 10% of its original size will be visually acceptable with no obvious compression artefacts. However, it is common, if required, to compress right down to 2-4% if the lower quality is acceptable.
·         GIF - The Graphic Interchange Format is an 8-bit (and under) indexed file type offering a range of only 256 (or fewer) different colours (these can either be a standard selection or an image-dependent selection by user-choice). It was designed in the early days of the Internet by CompuServe and works best for use with simple images using block colours, such as graphics, logos and banners. GIF uses lossless LZW compression; the amount of compression will depend totally on the type of image being saved. A full colour continuous tone image is unlikely to compress to less than 30% of its original size, however a solid colour vector image should compress far more. The GIF file format supports layers allowing it to offer both transparency and animation.
·         PNG - The Portable Network Graphics (colloquially called 'PING') file is an open source 'standard' that was introduced to overcome the possible patent problems associated with the GIF format (the LZW patent expired in 2004). It is normally used in either an 8-bit indexed version or as a 24-bit full colour version, although there is also an infrequently used 48-bit version. This makes it a very versatile format offering either the advantages of lossless compression in full colour (as an archive format) or as a GIF replacement in 8-bit form. However it cannot compete with the JPEG in terms of producing high quality and small, full colour images for viewing on the web. The compression available from PNG in 24-bit mode is typical for a lossless compression, providing a file of about 60-75% of the original size, and in 8-bit mode it is much the same as GIF. PNG supports transparency (even variable opacity) but is not able to provide animation.
·         The JPEG 2000 (j2k or jp2) format was developed to replace the popular JPEG format; it makes use of wavelet compression, which can use either lossless or lossy methods of compression. While it doesn't offer any significant increase in compression ratios over normal JPEG there is less of the blockiness and artifacting associated with standard JPEG compression. While JPEG 2000 is not as widely supported as was first hoped, it is slowly gaining in popularity; however, it looks unlikely that it will replace JPEG in the near future.
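The rule-of-thumb percentages quoted above (JPEG at about 10% or 2-4% of the original, 24-bit PNG at about 60-75%) can be turned into rough file size estimates. This is back-of-envelope arithmetic only: real results depend heavily on image content, and the function names and the 512 x 384 example image are assumptions for the sketch.

```python
def uncompressed_bytes(width, height, bits_per_pixel=24):
    """Raw size of the pixel data, ignoring headers and metadata."""
    return width * height * bits_per_pixel // 8


def estimated_size(width, height, fraction, bits_per_pixel=24):
    """Rough compressed size for a given fraction of the uncompressed size."""
    return round(uncompressed_bytes(width, height, bits_per_pixel) * fraction)


w, h = 512, 384  # hypothetical web-sized image
print("Uncompressed 24-bit:", uncompressed_bytes(w, h), "bytes")
for label, fraction in [
    ("JPEG, visually acceptable (10%)", 0.10),
    ("JPEG, heavy compression (2%)", 0.02),
    ("JPEG, heavy compression (4%)", 0.04),
    ("PNG, 24-bit lossless (60%)", 0.60),
    ("PNG, 24-bit lossless (75%)", 0.75),
]:
    print(f"{label}: about {estimated_size(w, h, fraction)} bytes")
```

For this example image the raw data is 589,824 bytes, so a 10% JPEG would be around 59 KB while a lossless PNG would be somewhere in the region of 350 to 440 KB.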

Suggested formats and relevant uses: JPEG, PNG, GIF
It is quite legitimate to use any of these file formats for web delivery, however they do have particular strengths and weaknesses that should be considered in your choice. The table below sets out some of the more common needs, the best choice and the reason for making your choice:
Need or Use | Recommended File Type | Reason
Normal continuous-tone full colour image at the highest quality | JPEG or PNG | PNG will allow you to deliver an image at the highest quality using lossless compression. However file size will be very large (approx 60% of original). JPEG at its best quality setting should be visually identical but provide greater compression (approx 10-25% of original).
Normal continuous-tone full colour image at highest compression | JPEG | JPEG will allow compression of the image down to approx 2-4% of the original size. At this compression, quality is likely to suffer, but in some cases this can be acceptable.
A web banner or logo with 8-bit or less colour | PNG or GIF | Both PNG and GIF offer the best compression for file size. PNG is 'patent' free, but might have problems with older browsers.
Continuous-tone greyscale image | JPEG, PNG or GIF | As greyscale is only 8-bit anyway, all of the formats should provide comparable quality, however JPEG is likely to provide highest compression (with corresponding drop in quality).
Black and White bi-tonal images | PNG or GIF | In this case, GIF or PNG should provide equal quality. JPEG is not recommended as it will give a file size larger than PNG/GIF due to it being unable to store less than 8-bit greyscale.
Image or logo with transparent layers | PNG or GIF | Both PNG and GIF support transparency. PNG is non-patented. PNG also offers multi-layers and variable-transparency. Note this is not supported in older browsers.
A full colour image with lossless compression | PNG | As stated above, only PNG allows you to deliver a losslessly compressed image.
Animated image | GIF | At present only GIF can support animation.
A zoomable or streamable image | JPEG, JP2, VFZ | This will largely depend upon server software, however it is hoped that browsers will be able to provide this with newer file types such as JPEG 2000 or VFZ.
A file with reliable image metadata tagging | JPEG, PNG, JP2 | At present this is not supported by the current web browsers, however JPEG and PNG both do support IPTC data. JPEG 2000 also has an XML-based inbuilt metadata system, which should hopefully be readable by future web browsers.
A file with integral rights management | VFZ, JP2 | So far all these systems will need some server-side software and plug-ins within the user's browser, however again it is hoped that JPEG 2000 and next generation browsers will be able to provide this functionality.

File formats for PowerPoint or other multimedia programs
As long as the intended delivery format is still using a monitor, all the file formats recommended for use within a web browser will still be good choices. However if MS PowerPoint is being used to create posters or some other printed media, it might well be better to consider some of the image file formats suggested in the section for Commercial printing or Desktop printing.

The main influence on choice will be the available bandwidth for the delivery of this material. If there are bandwidth restrictions then it will make sense to use some of the file formats suggested for web delivery, however if the presentation is to be delivered locally then there is no reason to not use images of a correspondingly higher quality.
Suggested formats for monitor delivery: JPEG, PNG and GIF (at compression rate to suit delivery bandwidth and PC performance)
Suggested formats for print delivery: JPEG - High Quality, TIFF, PNG and GIF

