Gordon W. Paynter
This report was submitted in partial fulfilment of the requirements for
the course 0657.420 "Report of an Investigation" at The University of Waikato.
October 1995
Copyright 1995 Gordon W. Paynter
Every day, new uses for computer graphics are devised. Even home computers work with detailed images, not just the text and numeric information they have manipulated in the past. The increased use of pictures in computing has led to many advances in hardware and software, and strained computer resources in many ways. One area that is particularly affected is the computer's storage space--graphics require a great deal of it, and most people have a limited amount.
In 1988, Michael Barnsley proposed fractal image compression as a technique to store images in a small amount of space. His idea was based on his work with fractal mathematics, and his observation that most natural scenes contain affine redundancy, or sub-sections that look very similar.
The advantages of fractal compression are that it can be used to create very small files that decompress very quickly. These qualities are becoming more and more important in applications such as multimedia and animation. The disadvantages of the technique are that compressing images is computation intensive, time consuming, and of unpredictable quality.
This report explains the fractal compression and decompression algorithms and describes and evaluates an implementation of fractal compression.
Reference
Gordon W. Paynter (1995) "Fractal Image Compression." 0657.420 Report of an Investigation, Department of Computer Science, The University of Waikato, Hamilton, New Zealand.
Notes:
Figure 3 is broken. I have discovered that a lot of people are actually reading this document because they keep mailing me and requesting Figure 3. However, I have been unable to replace it.
In recent years, the storage and manipulation of images has become a highly computerised task. Paintings, photographs, videos, and other types of graphic are being used in almost every kind of computer application. A limiting factor in the use of graphics, however, is their size: a lot of device space is needed to store large images, and a high bandwidth is needed to transmit them. This restriction has led to the development of techniques such as fractal image compression to store pictures in smaller files.
Image compression techniques can be broadly grouped into two types: lossless and lossy. Lossless techniques are those that compress and then decompress images without losing any data. The decompressed image is identical to the original. For most images these techniques have compression ratios between 2:1 and 10:1 (Barnsley & Sloan, 1988). Lossy techniques on the other hand, do not always recreate an image exactly. Image quality is traded off for higher compression ratios. The most commonly used lossy image format is the Joint Photographic Expert Group (JPEG) standard, which can restore high quality images at ratios of 5:1, and recognisable results at ratios exceeding 50:1(Pennebaker & Mitchell, 1993).
Fractal image compression is a lossy technique proposed by Michael Barnsley and Alan Sloan in 1988, based on their work at the Georgia Institute of Technology (Barnsley & Sloan, 1988). Since then it has been the subject of much research, and several variations on the original algorithm have been proposed (though few have been implemented).
The aims of this investigation were to implement and evaluate a fractal image compression algorithm and to compare its performance to the JPEG standard.
The first article suggesting fractal image compression was in BYTE magazine in 1988 (Barnsley & Sloan, 1988). The article claimed fractal compression was possible, and potentially worthwhile, but all the examples of compression were calculated by hand.
In 1989, A. Jacquin, a graduate student of Barnsley’s, submitted a PhD. thesis with a computer implementation of the compression algorithm (Jacquin, 1989). This algorithm has since been patented by Barnsley, and is fully described in Fractal Image Compression (Barnsley & Hurd, 1993). Barnsley and Sloan have continued to research fractal image compression for their business, Iterated Systems Inc., but have not published their findings.
Since 1992 several papers and series of lecture notes by Yuval Fisher of the University of California have appeared, and contain many improvements on and alternatives to Barnsley's algorithms (Fisher 1992, 1994b). These ideas are combined in a recent book titled Fractal image Compression: Theory and Application to digital images (Fisher, 1994a).
The next section of this report contains a general introduction to fractal image compression. It does not attempt to explain the mathematical basis of fractal image compression or why it works, it merely outlines the steps in Barnsley's simple compression and decompression algorithms. Section 3 examines some of the shortcomings of Barnsley's approach, and suggests several improvements. The techniques presented in this section are based on Fisher's work.
Section 4 is a description of the implementation of programs for fractal compression (PRESS) and fractal decompression (UNPRESS). An analysis of the implementation, with comparisons to other lossy image compression techniques, is carried out in Section 5. Section 6 contains the conclusions of this project.
Fractal image compression is based on the idea that any real-world world image contains affine redundancy, or self-similarity, and that this redundancy can be exploited to compress images. Barnsley used ferns, clouds, and other natural phenomena with obvious repeating motifs as examples. Computer generated fractal compression relies on far more abstract affine redundancies. Small portions of an image are used as the basis of relationships which are difficult, if not impossible, to spot with the human eye. Figure 1 shows some examples of affine redundancy. The white bordered section is similar to the black bordered sections, but is exactly twice as large.
A description of fractal image compression and decompression is given below. This simple compression algorithm provides a flexible basis for more advanced scaling, transformation, and searching techniques. It is based on Barnsley's work, and a mathematical justification can be found in his book (Barnsley & Hurd, 1993). Barnsley published code to find affine redundancy in 256-level grey-scale images whose dimensions were multiples of the domain block dimensions. He used the tiled domain blocks, mean-shifting brightness adjustment, and exhaustive search described below.
The simplest compression implementations start by taking an image and dividing it into a number of tiles, or domain blocks. Usually these tiles are four or eight pixels square but any size is possible--the only proviso is that they completely cover the image. Larger domain blocks will result in smaller files, but the recreated image will be of lower quality.
[Barnsley refers to his tiles as domain blocks because there is exactly one range block for each--they are the domain of the matching function. Fisher refers to the domain blocks as range blocks (and vice versa) because transformations are applied to range blocks to generate domain blocks--the domain blocks form the range of the transformation function. This report uses Barnsley's definitions throughout.]
For each domain block, the computer searches the image to find a range block that is similar to the domain block, but on a different scale. The domain block is said to map to this range block, or to match it. The range block must be larger than the domain block.
To simplify the compression (and reduce the amount of data that is required to store the image), the set of range blocks is usually defined to be all those blocks with sides that are exactly twice the length of the domain block sides. Throughout this document, the word block is used to describe a small image that is a subsection of a larger one.
In many cases two parts of the image will be similar when transformed by some means. For example, a domain block may look like a range block that has been reflected through its y axis. This reflection is an example of an affine transformation. There are eight different affine transformations, and each has a clear visual parallel. They are shown in Figure 2.
Generally, each domain block is compared to each range block under each of the eight affine transformations, but it will rarely be possible to find a perfect match between a given domain block and one of the transformed range blocks. Therefore, each time two blocks are compared, the distance between them is calculated. The distance is defined as the sum of the squares of the difference between corresponding pixels in each block. A distance of zero between two blocks means that they are identical. The matching range block is the one whose distance from the domain block is smallest.
The output of the compression process is the set of matches. For every domain block, the location of the corresponding range block and the transformation applied to it are recorded.
Brightness, in a grey-scale image, is the intensity of a pixel, represented by an integer value. The algorithm described so far is impractical because no account is taken of the different levels of brightness in different parts of the image. This means that the matches are unlikely to be accurate, and that the decompression process cannot produce a recognisable result because the intensity of every pixel in the output image will be the same.
Barnsley solved this problem by introducing a brightness adjustment as a transformation of the range blocks. Before calculating the distance between a pair of blocks, the mean shift is calculated. This value is the amount by which every pixel in the range block must be increased before the range block's average intensity is the same as that of the domain block.
Barnsley's approach to matching a domain block with a range block is to consider every possible combination of domain block and transformed range block. (In the implementation the number of affine transformations was restricted to three: flipping through the x axis, flipping through the y axis, and flipping through the line y= x). This is a very inefficient algorithm, but is simple and, if extended to the full set of affine transformations, will always find the best possible match for a domain block.
Decompression is a simpler and very much faster process than compression. Images that take hours or days to compress can be decompressed in seconds. Reconstruction starts with a blank image--or any image--of the same size as the original. We will call this the step 0 image, and use it to construct the step 1 image.
Each domain block in the step 1 image is approximated by applying the transformations associated with that domain block to the a range block taken from the step 0 image. The range block is found by taking the same sub-section of the step 0 image that the best-match range block occupied in the original image. The step 1 image will be slightly closer to the initial image.
A step 2 image is now constructed from the step 1 image in the same way that the step 1 image was created from the step 0 image. This process is repeated until the step n image and the step n-1 image are indistinguishable, and repeating the process will add no new detail. This image is called the fractal attractor of the transformations, and is what we use to approximate the original image.
Generally, the fractal attactor is found in less than 16 iterations. Figure 3 shows an image being rebuilt in the decompression process. The top images are the step 0 image (blank), and the result after one, two, three, four, and fifteen iterations of the decompression algorithm. Note that most of the detail has been added by the fourth iteration.
The ideas in this section are drawn from Fisher's publications (Fisher 1992, 1994a, 1994b). Fisher uses the quad-tree partition, least squares brightness, and patterned search techniques described below to encode and decode square, 256-level grey-scale images whose dimensions are powers of two.
The quad-tree partition is an alternative method for dividing the image to be compressed into domain blocks. Rather than tiling the image with small blocks of regular sizes, the image is split into large squares, and these squares are split into four smaller squares if a suitably accurate match cannot be found. Matches are then attempted for these sub-squares, and if none can be found, these sub-squares are, in turn, divided into sub-squares as is shown in Figure 4. The process continues until a minimum square size is reached, when a match must be made, however bad.
The advantage of this approach is that large areas with little detail that are easily matched can be coded very quickly using only a little storage space. One difficulty in implementing the quad-tree partition, however, is that unless the input images are square, and some power of two in width and height, the rules for performing subdivisions and fitting squares become complex. Alternative methods of defining domain blocks are HV and triangular partitioning. Both are extensions of the quad-tree method.
The HV method starts with the complete image as the only domain block. There is no range block that this can be matched to, so it is split into two smaller blocks, either horizontally or vertically, in such a way as the split falls on a naturally occurring edge--a major change in the brightness--in the image. Both the domain blocks created in this fashion are recursively split in this manner until a matchable domain block is found. Another advantage of the HV method is that larger domain blocks are likely because at each recursion the domain blocks are halved in size (on average) instead of quartered.
The triangulation method works in a similar fashion. The initial image is split into two triangular domain blocks, and each of these is recursively split into two or four sub-triangles along edges in the image. Although this method will probably create superior sets of domain blocks, transformations such as rotation and reflection become quite complex when dealing with variable scalene triangles. Possible HV and triangular partitions are shown in Figure 5.
In order to change the brightness of the range blocks, and to allow better matches, Barnsley used the mean shift technique described in Section 2.2. Fisher suggests using least squares regression to find a relationship between corresponding pixel values in the domain block and the range block. This method estimates not only the block's brightness, but also the variability of the pixels--that is, their spread around the mean, or the amount of contrast.
The regression yields a brightness adjustment b and a contrast adjustment c. Any pixel r in a range block can be approximated from the corresponding pixel d in the domain block using an expression of the form d = c.r + b. At the same time that b and c are found, the mean square error of the match is calculated. This is an estimate of how good an approximation is provided by the formula.
Performing a least squares regression is a more complex process than comparing the means, and thus takes longer. However, it does more than replace the brightness adjustment in Barnsley's code: by using the mean square error as a distance measure it replaces the comparison operations as well. The least squares method requires that both the contrast adjustment and the brightness adjustment be stored, making the storage file larger (particularly since the contrast adjustment is a floating point value).
Searching the range blocks for suitable matches for a domain blocks is a process that can be optimised in many ways. The approach Fisher takes is to pre-process all the range blocks and divide them into classes based on the relative brightness of their four quadrants. Rather than searching the entire image for a match, it is only necessary to compare a domain block to the range blocks in the class that most closely resembles the domain block.
The search is performed faster in this manner, but there is a small possibility that the best match for a given domain block will be missed. The program must also store the range blocks, requiring considerably more of the computer's memory.
Barnsley made no attempt to store the output of his program, as it was not intended to perform a realistic compression, merely to demonstrate the fractal transform process. Fisher builds a file by writing data bit by bit to the file in the most efficient manner possible--approximately 27 bits per transformation (Fisher, 1994a). Extra information is required to describe the structure of the quad-tree, but the use of that structure will usually result in there being fewer domain blocks to store.
The main aim of this project was the implementation of a practical compression program using fractal techniques. There were no fractal image compression programs in the public domain when the project was initiated, though there are several now. Two main programs, named PRESS and UNPRESS, have been developed, and can be used to compress and decompress images and obtain useful results.
The PRESS program takes a portable greymap (PGM) graphics file as input and outputs an encoded transformation format (ETF) file. The unpress program takes an ETF file and from it recreates a PGM file. PGM graphics files were chosen because they perform no compression, they exactly match the manner in which the graphic is stored in memory, and they can easily be created, viewed, and manipulated by the Sun SPARCStation. They contain a short header describing the image dimensions, and a body that consists of the pixel values dumped row by row in binary format.
The PRESS program performs fractal image compression. For every domain block in the image, a range block is found that closely approximates it. The output is a file describing which range block is associated with each domain block, and what transformations were applied to the range block to make it match the domain block.
Domain blocks are found by dividing the image into tiles of a fixed size. The default dimension is four pixels square, but this can be altered at the command line to give different compression ratios and image qualities. A quad-tree partition was not used because I was unaware of the technique when the original code was written, and I felt that the amount of time that would be required to change it would be better spent on other improvements.
The range blocks are always square and are have sides exactly twice as long as the domain block sides. In theory, the set of range blocks consists of all the squares in the image with the appropriate dimensions. But in practice only every second possible block in every second row is used.
Section 2 describes the range blocks as being chosen from the image, then reduced, and then compared to the domain blocks. However, in the implementation, the entire image is reduced at the start of the compression, and the range blocks are then chosen from the reduced image with no need for further scaling. This greatly increases the speed of the program, but because every pixel in the reduced image is the average of a 2(2 square of pixels in the original, the number of range blocks is reduced by a factor of four.
The PRESS program uses all eight of the affine transformations described in Section 2.1. Every range block is compared to each domain block under each of these transformations.
The PRESS program originally used Barnsley's mean shifting technique, but this has problems when the brightness adjustments are very small. Because the mean shift sets the brightness relative to the brightness in other parts of the image, the result image may have an incorrect average intensity if the initial, blank image used in decompression is different in overall brightness to the final image and the mean shift is very small.
Figure 6 shows what may happen in this case. The top image was compressed (with an early version of PRESS) and decompressed. Although the shape of the rose is unmistakable in the picture, the grey levels are clearly not right. Adding a constant amount to every pixel in the image results in the middle image, which is a much better approximation of the Rose image shown at the bottom.
This problem has been fixed by storing the mean brightness of the domain block, rather than the difference in brightness between the range block and domain block. An advantage of this solution is that it will result in a slightly smaller output file because the mean of the domain block is in the range [0, 255] and the mean shift is in the range [-255,255]. A further benefit is that it takes fewer iterations to reach a stable image in the decompression process. A possible disadvantage is that some of the purity of the algorithm is lost--we store information that describes what part of an image looks like, rather than describing it in terms of other parts of the image.
One of the most obvious reasons that the least squares shifting algorithm is superior to Barnsley's simple mean adjustment technique is that the former makes allowances for the different range of pixel values, or contrast, of the two blocks being compared. This allows almost matches between blocks with similar patterns but different contrasts, such as those in Figure 7.
A version of PRESS has been developed that allows for contrast using the standard deviation of a block. When a range block is compared to a domain block, both its brightness (mean pixel value) and contrast (standard deviation of pixel values) are adjusted so that they are the same as the domain block. Similarly, in the decompression process, the range blocks are adjusted so that both these parameters are the same in the approximated domain block as they were in the corresponding original domain block.
In order to calculate the distance between a domain block and a reduced, shifted, and transformed range block, PRESS calculates the sum of the squares of the differences between their corresponding pixel values. If the blocks are identical, then the distance between them will be zero. The range block that best matches a domain block is the one that is the the smallest distance from it in terms of this metric.
The PRESS program searches for matches using the same exhaustive technique that Barnsley used. There are many ways that this technique could be improved, but I decided to concentrate on optimising for file size, rather than speed. PRESS searches for a match for each domain block in turn, and only continues to the next domain block if a perfect match is found, or if all the possible range blocks have been searched.
Suppose we wish to compress a w(h image, using domain blocks of size d. Assuming w and h are multiples of d, there are a total of (w/d)((h/d) domain blocks and (w/2-d+1)((h/2-d+1) range blocks. The number of comparisons required is, at worst, the number of domain blocks multiplied by the number of range blocks multiplied by the number of affine transformations (eight). In a practical example, we may want to compress a 320(200 pixel image using 8(8 domain blocks. There will be 1000 domain blocks, 14229 range blocks, and up to 113, 832, 000 comparisons.
Fractal transform format (FTF) files describe the output of the compression process (in fact, they were the output of the compression process in early versions of the program), and the inputs to the decompression process. They are in a human-readable text format for debugging purposes, and are defined in imitation of the PGM format as is shown in figure 8.
The last line is repeated once for each domain block. In this format, each number is stored in three or four bytes, and each domain block is stored in approximately sixteen bytes including whitespace.
Encoded Transforms Format (ETF) files contain the same information as FTF files, but the information is compressed by adaptive arithmetic encoding (Bell, Cleary, & Witten, 1990). This method for storing information is used because it is lossless, gives high compression ratios and because the implementation lends itself to the information being compressed. In particular, the ability to dynamically determine different sets of symbols to be compressed allows the press and unpress programs to create symbol tables for the range block co-ordinates that are dependent on image size. The outputs of press are non-random, and suitable for lossless compression.
The UNPRESS program reconstructs an image by iteratively refining each domain block until as much detail as possible has been added. There is very little that can be done to optimise UNPRESS, and it is essentially the same as the implementations described in Sections 2, 3, allowing for the contrast adjustment described in Section 4.1.5 and file format changes described in Section 4.1.7.
Writing the PRESS program has led to several interesting problems, both in the algorithms and their implementation. Because every domain block in the image is derived from many others, and, in turn, could be the basis for any or all of the other blocks, a small error will usually turn the entire image to meaningless snow. Some, such as the problems with the mean shift technique (see Section 4.1.3) have only had a partial effect on the output. Others are described below.
Figure 9 shows a "grainy photograph" effect which was the result of a variable being initialised in every iteration of the 90 degrees rotation transformation, effectively rendering it useless. The effect on the image is of interest because the structure of the picture is maintained by the transformations, but many errors are introduced in the detail.
The other problem that has effected the output of many early images occurs in an inconsistency between the Xview program and the C printf function. In C, the command printf("\n") outputs two bytes, a carriage return byte (13) and a line feed byte (10). Xview, when reading a PGM file, will accept this standard on each line of the header except the last, when it will read the carriage return byte, but considers the line feed byte to be part of the image data. The effect is that the top left hand pixel of the image produced has value 10, and that the leftmost column of pixels is misaligned.
The quality of the press program can be assessed along three principle dimensions. These are the amount of compression performed, the quality of the reconstructed image, and the time taken to perform compression and decompression. Little effort has been made to optimise PRESS for speed because it is unimportant in many applications and because it was ignored by the algorithms PRESS is compared to. However, the compression times are listed in Appendix B to give an indication of PRESS' time efficiency.
The Waterloo Montreal Verona Fractal Research Initiative is a collaborative project of the Universities of Waterloo and Montreal in Canada and the University of Verona in Italy. The University of Waterloo has provided WWW and FTP sites containing a library of sample images, a collection of lossy compression programs, and an analysis of each program based on the images (Kominek, 1995). The aim of the project is to provide a library of compression tests that avoid the problems usually encountered when comparing graphics compression:
The test suite is called the "Waterloo Repertoire", and consists of three sets of images in greyscale and colour. Although PRESS can compress any greyscale image, it has only been evaluated with the first set of greyscale images. This contains twelve 256-level greyscale images that are 256 pixels square and contain a mixture of natural and synthetic images. Each of the synthetic images is chosen to test a particular property of digitised images.
The sample images are assessed by their compression ratio and quality. The compression ratio is the size of the raw data (in this case 256(256 bytes) divided by the compressed file size. Image quality is measured with the root mean square (RMS) error of pixels in the two images. Points near the top of the graph have low error, those near the bottom have high error. The left of the graph contains areas of low compression, and the right areas of high compression. Obviously, points near the top right are most desirable, and those near the bottom left are least desirable. The scale of the rate-distortion graphs changes from image to image.
In order to provide a basis for comparisons, compression curves for three other image compression implementations are been plotted in the same axis These are a standard commercial JPEG implementation by Images Incorporated (JPEG), the Independent JPEG Group implementation of the JPEG technique (IJPG), and Said and Pearlman's Wavelet based compression implementation(SAPB). SAPB performs consistently better than any other program evaluated by the fractal research initiative. These have been chosen to represent "good" compression (JPEG), "better" compression (IJPG), and "the best" compression (SAPB).
Two other fractal image compression programs have been evaluated on these images: Fisher's quad-tree implementation (TRNA and TRNB) and Barnsley's most recent (commercial) releases (FIFA and FIFB). Fisher's results are included on figures 10 and 11 for comparison to PRESS. TRNA uses a three-level quad-tree, and TRNB a four-level quad-tree.
There are two versions of the press program that give useful results--version 1 and version 2. The difference is that version 1 does not take the contrast of domain blocks into account and version 2 does. Their results, which are often similar, are referred to as v1 and v2 respectively. The full set of results are listed in Appendix A and plotted on rate-distortion graphs in Appendix B.
Figure 10, the rate-distortion graph for the Bird image, is typical of a natural image. At low compression ratios the JPEG and IJPG methods are superior to the fractal-based coders, but for when compression ratios exceed 25:1 they begin to fall off sharply. Although the fractal-based coders are initially less reliable, they tend to have a flatter curve overall, and are superior at high compression ratios. The SAPB has the best of both--high compression and low error.
Figure 11 shows the same graph for Circles, a highly "synthetic" image consisting of successively smaller circles in different shades of grey. This image is particularly well-suited to fractal compression as it has almost no texture. All the techniques have curves of a fairly similar shape to the ones they had on the Bird image, but the JPEG, IJPG and SAPB curves are much lower on the graph (they have greater RMS error) and the fractal curves are higher on the graph--the error in the compression is lower.
Because there is no way of telling how soon press will find matches for domain blocks, and there is no way of telling how accurate these matches will be when they are found, it is very difficult to estimate how much any given image will compress, and what quality the quality of the result will be.
This section describes some of the weaknesses of the press program, and suggests some ways that they may be remedied.
First, the compression process is limited by the accuracy of matches between domain blocks and range blocks. Perfect matches simply do not exist for some sections of the image. This is illustrated in Figure 12--there are no range blocks that match the domain blocks that fall along the main horizontal edge. This problem is exaggerated in the decompression process. If a block cannot be properly approximated in the compression, it cannot be exactly rebuilt in the decompression (and neither can any other domain blocks that are related to it through range blocks).
When large domain blocks are used (just how large changes from image to image, and is a function on image size) the problem is exaggerated and the quality of the results can be so low that the output is of no practical value.
Second, diminishing returns are inherent to improvements in the compression process--this is illustrated by the fact that in many of the rate-distortion graphs of version 1 and version 2 of PRESS have very similar curves. In other words, by adding the standard deviation parameter the image quality has been increased but the compression ratio has decreased. Figure 13 shows an enlargement of the bird image compressed with both versions of press. Version 2 (right) is a higher quality result (particularly along the beak) but the compressed file is 18% larger. An equivalent increase in quality at the expense of file size may have been possible using version 1 with slightly smaller domain blocks.
Third, when domain blocks become sufficiently large, it becomes possible to see the shapes of the domain blocks in the output image, as in Figure 14. These flaws can possibly be removed by processing the result (Fisher, 1994a), or turned into an advantage by domain block sets like the HV partition that divide domain blocks along edges in the image.
Fourth, when the range blocks are reduced to the size of the domain blocks, some of the detail in the range blocks is lost. This can be seen in Figure 15 , where the car at the base of the pyramid is almost non-existent in the compressed image (right)..
This problem is particularly evident in the compression of images containing thin lines, as the line intensity and shape will often be lost when the range blocks are reduced. Figure 16 shows that this problem is particularly bad in version 1 of PRESS (left) but that it can often be minimised by the contrast adjustment. Note that the thin lines are sharpened by the contrast adjustment in version 2 of press but the thicker line in the bottom left still shows obvious artefacts of the compression. This problem arises because the thin lines are reasonably approximated by reducing the thick line, but there is nothing that can be reduced to give an accurate approximation of the thick line. It is worse when compressing images that contain lines that are all the same thickness, such as a screen of text. The standard deviation adjustment helps to sharpen the image, but in doing so the shapes are often distorted.
One solution to this problem is to change the way that range blocks are reduced. Usually, they are reduced by taking the mean of every 2(2 pixel block. In addition to the usual range-block set, version 1b of PRESS attempts to find matches in the set of range blocks found by taking the top left pixel of every 2(2 pixel block. Blocks reduced in this way tend to have sharper edges but are less true to form and may omit very thin lines. In effect, the set of range blocks has been doubled in size. Because they are sharper, the new range blocks are more suitable to compressing the text image than the old range blocks. Version 1b (v1b) worked very well on the synthetic images, but had little effect on complex and natural images other than increasing the file size slightly and doubling the compression time.
Figure 17 shows part of the original Text image (top left) and the effects of each compression method. The standard compression (v1, top right) of text looks "washed out"--all the lines are blurred when the pixels are averaged. Allowing for the contrast (v2, bottom left) eliminates this problem, giving a much sharper image. However it does not recreate the shape of the letters very well. Changing the compression technique (v1b, bottom right) maintains the shape of the letters and the contrast in the image.
It is a little harder to highlight the strengths of fractal compression because high quality reconstructed images tend to be indistinguishable to the human eye. The following areas are those where fractal compression consistently excels.
First, synthetic images with little or no texture information are compressed very well because they contain a lot of affine redundancy. They tend to be of a high quality and compress very quickly. For example, the squares image compresses with a RMS error of 0. Exceptions are images with lines or intricate detail.
Second, many of the rate-distortion graphs for natural images show that JPEG based compression schemes perform best at low compression ratios, but fractal compression gives better results at higher compression ratios. There is evidence to support the idea that this is because fractal compression creates the detail of images in a "natural" manner. At high compression ratios the approximation is close enough to give a good result, but with small blocks it often gets the finest details incorrect. This quality (called scale independence) can be used to enlarge images without the "blocky" effect created by simply increasing the size of each pixel. This leads to the idea that when an image is recreated at a larger size than the original was when it was compressed, the decompression process creates more detail than was in the original picture, and therefore has an infinite compression ratio.
There are several areas where improvements could increase the performance of the press program. These include
This project began by conducting an investigation of fractal image compression, a new, patented, and largely untested technique that exploits affine redundancy to store digitised graphics (Sections 2 and 3). Next, a fractal compression algorithm was designed and implemented (Section 4) Finally, the system was analysed and tested, providing an informal comparison of fractal image compression to other techniques (Section 5).
PRESS appears to work best at high compression ratios and with synthetic images--which is ironic, since Barnsley suggested fractal-based compression after observing affine redundancy in natural scenes. Fractal compression is not good at storing the smallest details of an image, because they are often smoothed out when the range blocks are reduced. This means that very low error is difficult to achieve, even with very small domain blocks. Fisher's fractal compression program usually performs better than PRESS, but has the same strengths and weaknesses.
PRESS performs comparably to some of the best compression programs in the world--and in some cases better. These results indicate that fractal image compression outperforms JPEG at high compression ratios but generally takes longer to compress an image.