Resampling of images and video sequences
Moscow State University, 2001
Done with Alex Lukin:
Git:
https://github.com/mashimello/mashimello
Bitmap Images
Photoshop and other graphics programs generate bitmap images. Bitmaps use a grid (two-dimensional array) to represent the image. Each cell (pixel) has its own coordinates and color. When we edit bitmaps we are working with pixels, not objects or shapes.
Bitmap (raster) images depend on their size (resolution), because they consist of a finite set of pixels. Therefore, bitmaps appear pixelated when zoomed in on the screen or when printed. Bitmaps are good for representing the soft transitions found in photographs and painted images.
Resolution of digital image
Research addresses the problem of changing the resolution of digital images for various classes of images, such as digital photographs, computer generated graphics, text, etc. The algorithm is based on the principle of maximizing the preservation of image information in the frequency and spatial domains to achieve maximum image clarity and minimum aliasing.
In image rendering — we often see anti-aliasing settings. For example, in 3d Studio max in the rendering dialog, you can select a lot of anti-aliasing settings. An ordinary 3d studio max user does not want to go into the internal structure of this algorithm, but only wants to get the highest quality image without visible artifacts.
Image resolution is determined by the number of pixels that carry color information. A pixel is a unit of information displayed on a screen. Each pixel reflects a certain amount of information. The more pixels there are in the image, the more detailed the image itself.
Resolution can be represented in different ways:
1. The number of samples (dots) per inch (samples per inch -spi, scanners)
2. The number of dots per inch (Pixels per inch (ppi, monitors)
3. Dots per inch (dpi, printers)
The more dots in the image, the more detail and smoother transitions in images, but the file size also increases with increasing resolution.
It should be understood that the resolution specified in the technical specification of devices such as printers is a different concept, since printers use multiple dots of different colors to print a single dot on an image. In other words, a 600 dpi printer does not mean that it prints 600 dots per inch (600 ppi).
The danger of changing the image resolution is that decreasing or increasing the resolution adds or removes pixels in the image.
Reducing image resolution is called down-sampling, and increasing image size is called up-sampling.
The problem of changing the resolution of images and resampling
Previous research
The main approaches to resizing images are: interpolation (linear, bilinear, bicubic), filtering (filters with a separable kernel and an inseparable kernel) and others, less common: fractal scaling, c-spline and others.
Most users are satisfied with the results of existing resampling tools. Most do not even have doubts that the result they get after resampling the image in a program such as Adobe Photoshop can be significantly improved.
We also compared our results with resampling algorithms built into popular software programs.
Aliasing and anti-aliasing
Anti-aliasing is a general problem in computer graphics. Almost any program that displays graphics to a screen or other device includes operation of rasterization.
The problem of aliasing also arises when the image is resized, for example, when we reduce the image, we actually reduce the sampling rate. This leads to the fact that some frequencies that were previously below the Nyquist frequency are now above the new Nyquist frequency. There is a need for anti-aliasing.
It follows from the Nyquist-Shannon sampling theorem that for the correct restoration of the original signal from the digitized one, it is necessary that the spectrum of the original signal does not contain frequencies higher than half the sampling frequency (Nyquist frequency). Therefore, before digitizing the signal, all frequencies above the Nyquist frequency are usually filtered from it. This is called anti-aliasing. For images, this leads to the fact that small details are lost, the picture is blurred. But on the other hand, after restoring the signal to analog form, we get the correct transmission of low frequencies. If we do not use anti-aliasing, then we will not only lose the upper frequencies, but also spoil the lower ones. In this case, frequencies above the Nyquist frequency will float to the lower part of the spectrum and superimpose on the frequencies already present there. This is called aliasing.
The quality of the bitmap is highly dependent on the quality of anti-aliasing. Without proper anti-aliasing, the image looks very unrealistic, with “torn” edges and other artifacts. With the correct anti-aliasing, the picture looks natural, with smooth edges, the text is easy to read, and most of the unwanted effects and artifacts are eliminated. A few years ago, good anti-aliasing took too much CPU time to render, but today programs can significantly improve the quality of their graphics thanks to anti-aliasing.
The anti-aliasing process when rendering an image of XxY resolution is similar to rendering an image of 3Xx3Y and then resampling it to the target resolution of XxY.
Resampling is one of the primary and most common operations in the tasks of pre-press graphics, compositing, digital photography and other tasks in one way or another related to image resizing. The quality of images after resizing strongly depends on the quality of the algorithm.
An overview of popular filters and interpolation algorithms
The results of many standard resampling filters have been tested. Not all of them are suitable for all classes of processed images. The best ones are bicubic interpolation, Lanczos, Mitchell, and Catmull-Rom.
Bicubic interpolation is widely accepted for high quality resampling. The result after bicubic interpolation is clear, not strongly influenced by step overshoot, there is no “sampling frequency ripple” effect. But anti-aliasing is not enough. Also, the bicubic method does not work well with fine details in images.
The Lanczos filter is a sinc function multiplied by a weight window with three ripples. Since the second ripple is quite large, the filter produces a large “step overshoot”. And the third ripple is the source of a little “ringing”. The main advantages of the Lanczos filter are high image clarity and no aliasing. But since the spatial characteristics of the filter are far from ideal, the filter cannot be considered universal.
The Mitchell filter has excellent performance in the spatial domain, acceptable aliasing, but generates a rather blurry image, therefore, it cannot be considered universal either.
Catmull-Rom spline is the best of the listed options for day-to-day work. It does a good job of aliasing, the image is sharp enough and the step overshoot is minimized.
Anti-aliasing and resampling filters in 3DS MAX, Photoshop, ACDSee
3DS MAX
The Kinetix 3D Studio MAX program has a large set of anti-aliasing filters. Anti-aliasing in 3DS MAX is performed by rendering an image with a higher resolution and then down-sampling using the selected filter.
Area — Original filter 3ds max resized.
Blackman — 5x5 filter for sharp images.
Blend — A blend of a harsh filter with a Gaussian filter.
Catmull-Rom is a 5x5 filter that gives good results on boundaries.
Cubic is 5x5 filter based on a cubic spline.
Mitchell-Netravali — A filter that adjusts the balance between blur and ringing.
Quadratic is a 3x3 filter based on a quadratic spline.
Sharp Quadratic — 3x3 filter for sharp edges.
Adobe Photoshop
Bicubic interpolation, Bilinear interpolation, Nearest neighbor method
ACDSee
Popular program ACDSee provides a wide range of algorithms for resampling images.
- Rectangular filter
- Triangular filter
- Bicubic interpolation
- Bell
- Lanczos filter
- Mitchell filter
Resampling algorithm properties.
There are some common artifacts that occur in images being resampled. The quality of the resampling algorithm can be considered the extent to which the possibility of the appearance of any of the unrevealed artifacts in the resulting image is minimized.
Blurring of the image
Aliasing.
Occurs when we are trying to sample (sample) an image with too low a sampling rate. The high frequencies of the original image are converted to the low frequencies of the resulting image.
Spatial Frequency Ripple (SFR) occurs when resampling of a homogeneous area of an image gives an uneven distribution of color in this area in the resulting image.
Ringing (Gibs effect) is an artifact in which the brightness levels near the sharp edges of the original image begin to oscillate in the resulting image.
Step overshoot this is a ring with only one oscillation.
Ringing is not always perceived as a negative effect, because with one oscillation, this effect sharpens the edges of the image
Sub-pixel image shifts occur when the resampling algorithm is implemented incorrectly or inaccurately. After changing the image resolution, the resulting image is shifted compared to the original.
Methods for assessing the quality of resampling algorithms
To assess the quality of resampling of filters with different kernels, strict rules must be followed for this assessment. One popular way is to stretch the image and then downsize to the original size. Then you need to compare the resulting image with the original using one of the standard PSNR metrics or another. You can also use metrics that take into account the perception of color by the human eye. All these metrics are very sensitive to the sharpness of the processed image, but insensitive to effects such as aliasing or ringing. Therefore, it is necessary to develop special algorithms for comparing the quality of resampling with filters using different kernels.
One can use Gabor filters to decompose (decompose) an image into perceptual frequencies and compare the two images with weights at each frequency.
Lets compare of a reduced and then enlarged image with the original image. It is important to note that quality assessment models perform well only on some subclass of images. For example, a monochrome image, such as text, is not heavily distorted by the step overshoot effect, but images with sharp transitions will be sensitive to ringing effects.
Various resampling algorithms:
1. Original image,
2. Down-sampled image image
3. Bicubic interpolation (poor anti-aliasing, torn edges, large overshoot),
4. Lanczos filter (too large ringing and overshoot),
5. Mitchell filter (normal overshoot, but blurry),
6. Catmull-rom filter (blurry),
Objective
To develop a high-quality method of resampling and image processing to achieve maximum quality in accordance with existing metrics and visual perception. Adapt it for the tasks of processing video sequences.
When zoomed out — the most complete preservation of the image information (clarity), the absence of aliasing and other artifacts.
When zoomed in, there is no jigging effect, outliers near sharp edges and other artifacts, and the image is clear.
DSP approach to resampling and building a resampling filter.
The proposed approach uses the DSP approach to resampling, since a raster image can be thought of as an evenly sifted continuous image.
The criterion of ideal reconstruction is laid down in Nyquest’s theorem and is expressed in terms of the image spectrum, which means that when we need to build resampling and ani-aliasing filters, we need to pay more attention to the frequency rather than the spatial response of the filter. But the convolution operation makes it possible for us to connect these areas together, and use an already known technique to optimize the filter’s operation.
The proposed algorithm is based on standard signal processing technology. The algorithm allows you to resize the image by a fractional number of times. The standard algorithm has been modified to meet all resampling requirements: border correction has been implemented, sub-pixel shifts have been eliminated. To achieve high performance and computational efficiency, polyphase filtering is applied. The problem of optimizing the quality of the algorithm’s operation is reduced to the problem of constructing an optimal filter kernel.
Construction of filter
Let us consider the construction of an anti-aliasing filter. The main parameters of the filter are the degree of suppression of unwanted spectral components, the flatness of the frequency and phase characteristics in the pass band and the width of the transition band (between transmission and suppression).
The filter for the resampling task must meet the following requirements: the frequency response must be flat in the pass band, and provide sufficient rejection in the stop band to prevent visible aliasing. Frequencies that are multiples of the sampling frequency must be suppressed particularly strongly to avoid DC aliasing. The phase response of the filter should be linear to prevent asymmetry in the resulting image. The impulse response of the filter should contain as few side lobes as possible, and their amplitude should be as small as possible (to prevent the occurrence of the Gibbs effect).
Unfortunately, all of these criteria cannot be met at the same time.
An ideal low-pass filter is a Sinc function that has many large amplitude side lobes, resulting in ringing. The Gaussian filter kernel (Gaussian kernel), good only in the spatial domain, poorly manifests itself in the frequency domain, which leads to blurring of the image.
Filtering digital data is implemented using the convolution operation with the filter kernel. An ideal low-pass filter has an infinite convolution kernel. One way to approximate an ideal filter is to truncate the infinite convolution kernel to a finite segment. However, this method is bad in that the frequency response of the filter at the same time acquires unwanted ripple (with an amplitude of 9%) near the transition band. Moreover, the amplitude of the pulsations does not change with increasing convolution length. However, the biggest drawback of this filter is how it transforms sudden changes in brightness in the image. After filtering, brightness ripples appear near the boundaries. This makes the filter unusable for graphics.
There are ways to avoid ripple in the filter’s frequency response and to improve the stop band rejection by slightly broadening the transition band. The most common method is the weighting function method. This method involves not truncating the infinite convolution kernel of an ideal filter, but multiplying it by a weight function (window) of a special type, which smoothly vanishes outside the specified interval. A lot of weight windows have been invented, they all affect the frequency response of the filter in different ways. We used the Kaiser window because it allows to adjust the trade-off between stop band suppression and transition bandwidth. In addition, it has one of the best characteristics among weight windows.
So, using the Kaiser window, we have built a filter with very good frequency characteristics. But when we applied it to resampling images, we found that the brightness ripple near the contrast edges did not disappear. Thus, we concluded that when constructing a filter, it is necessary to take into account not only its frequency response, but also how it changes the details spatially. When constructing filters for processing an audio signal, only their frequency response is important, since the frequency spectrum of an audio signal is much more important to hearing than a specific waveform. But for visual perception, it is the spatial domain (specific waveform) that is essential, and not the frequency domain. Thus, the task of constructing a good anti-aliasing filter for an image requires a solution taking into account both the frequency and spatial characteristics of the filter. Note that the resulting filter with a Kaiser window would be perfect for anti-aliasing an audio signal.
There are several ways to build a filter that will satisfy all resampling requests. One is a window overlay on the sinc function, the other uses linear programming to optimize the frequency and spatial response of the filter. The Remez exchange method is often used to build optimal FIR filters with equal ripple, which are not suitable for resampling images, since they do not optimize the impulse response of the filter, which leads to a step overshoot effect. Despite its advantages, linear programming is a rather slow way to optimize filter performance. (it should be borne in mind that we need to carry out the filter construction procedure every time we want to perform the resampling operation, since the filter width depends on the compression or stretching ratio of the image). Linear programming helps to minimize the length of the filter kernel, and the length of the filter is not so critical in our case.
In the Impress algorithm we can independently change 3 main parameters:
Smoothing amount. The Smoothing amount is controlled by the cutoff frequency. Increasing the amount of blur will result in a “blurry” not sharp picture, decreasing the amount of blur will result in a sharp picture, but will also add an aliasing effect.
The suppression in the suppression band is controlled by the width of the Kaiser window. Increasing the parameter results in less aliasing and less SFR; decreasing this parameter will reduce the bandwidth, which will suppress the aliasing effect, but increase the SFR effect.
The number of side lobes in the filter kernel determines the trade-off between frequency and spatial characteristics of the filters.
The more spatial ripple, the clearer and alias-free the image we will get at the output, but at the same time we will increase the ringing effect. The less side ripple in the filter core, the less ringing effect we get.
The suppression of one of the undesirable effects leads to the manifestation of the other. It is necessary to combine these parameters so as to minimize the appearance of each of the artifacts, and to make the filter optimal for the task of changing the image resolution.
A set of parameters that give the correct result, which were selected empirically, are saved in special presets. For example, the “normal” preset is used by default and is currently considered the best for most image classes and results expected from the resampling algorithm. The result of the algorithm’s work with this preset is close in quality to bicubic interpolation, but shows comparatively better results in anti-aliasing. The “Sharp” preset emphasizes the sharpness of the image. There are also several presets for dealing with “step overshoot” and “ringing” effects.
Implementation of the Impress
Algorithm The algorithm resamples a one-dimensional signal x [n] of length N into another one-dimensional signal y [n] of length M.
Method
The following steps explain the method used by the algorithm. To increase the speed of the algorithm, some of the optimizations described below can be used.
Choosing the compression / stretching ratio
First, you need to choose the optimal coefficients for the resampling operation. The greatest common divisor K for N and M is found, such that N / K and M / K are greater than 64 (so that the frequency response properties of the resampling filter can be predicted. As a result, the stretch factor is chosen U = M / K, and the compression ratio D = N / K.
Construction of the resampling filter
The resampling filter is a low-pass filter with a cutoff frequency F*min (1/D,1/U), where F is half the sampling frequency. The filter kernel is a sinc function multiplied by the Kaiser weight window. There are several parameters that greatly affect the quality of the filtering.
NLobes is the number of lobes per one side of the filter kernel (may be fractional). If nLobes = 1 then the filter does not create the Gibbs effect near the brightness drops. If nLobes ≤ 2, then a mild Gibbs effect will be observed (one side lobe). If nLobes> 2 then the ringing will be visible to unwanted degree.
SmoothingAmount is responsible for the width of the filter.The wider the filter, the lower the cutoff frequency, and the more blur appears during the filtering process. The narrower the filter, the clearer the image will be at the output. SmoothingAmount is a fractional number that varies in octet 1.
Suppr is the amount of the suppression of the side lobes of the filter (in both impulse and frequency response). This is the β parameter of the Kaiser window. It adjusts the trade-off between the cutoff slope of the frequency response of the filter and the degree of suppression in the notch band. Typical Suppr values are in the region of 2–4.
ES (“ExtraSharpness”) is a parameter that determines the degree of sharpening. The idea of the method is to amplify the frequency response of the high-pass resampling filter in the pass-band. To do this, the Gaussian kernel is subtracted from the usual one built by our algorithm.
The main problem with the choice of these parameters is that the result of the filter operation depends on a combination of the choice of these parameters in a very complex, non-linear way.
After choosing the above parameters, the filter is constructed in the spatial domain as follows. The filter width (number of coefficients) is selected as
. And the values of the filter coefficients are as follows:
Here i = 0, …, L-1, and
.
Kaiser window:
where I(x) is the Bessel function of the zero order
Parameter B depends on the suppression requirements of the low-pass filter. Kaiser calculated a simple formula for calculating B and N if known As (minimum suppression in the suppression band) and delta F (normalized transition band width).
Kernels and frequency response of filters (illustrations)
Amplitude-frequency characteristics of filters.
Interpolation / Filtering
Decimation To resample the original signal, an interpolation / filtering / decimation sequence is used. First, the original signal is filled with zeros U times. Each value of the original signal is multiplied by U and then U-1 zeros are written. Before each value of the original signal P = round (U / 2) zeros, and after the last element UP zeros to minimize sub-pixel shifts. The length of the resulting N * U.
The next step is to filter the received signal using the built anti-aliasing filter. Correction at the boundaries at this step is equivalent to copying the boundary values to the left and right beyond the boundaries of the processing interval.
In conclusion, you need to decimate the received signal by D times. To do this, each D-1 of D values samples from the signal in a is discarded in a symmetric form (as in interpolation). The length of the received signal is M. The resampling operation is over.
Optimization
The described method produces high-quality resampling when the resampling filter parameters are correctly selected. A huge amount of calculation is unacceptable for most applications can draw several optimizations to improve performance of the algorithm dozens of times without any loss of accuracy-..
Polyphase filtering
Polyphase filtering is a technique that avoids the huge amount of calculations that have to be done during the interpolation process / filtering / approximation to receive the final signal without storing the intermediate signal of long length in memory. Polyphase filtering allows you to express the values of the resulting signal directly through the values of the original signal and the values of the Resampling filter. Using this technique, large values of U and D can be selected without sacrificing the performance of the algorithm. The time spent on the resampling operation depends only on the lengths of the input and output signals and the resampling filter parameters.
Coefficient caching
It is obvious that in the process of resampling images, the operation of one-dimensional resampling is repeated a great many times (for each row and column) with the same parameters. To reduce the number of recalculations, all resampling filter coefficients are stored in memory. This allows for repeated resampling operations, in which all the necessary weights are already prepared in advance in memory.
ExtraSharp Modification (High definition)
A special set of parameters for the high definition filter has been created. For this, a special core and the “Extra sharpness” preset were created.
The idea of the method is to amplify the frequency response of the high-pass resampling filter in the passband. To do this, the Gaussian kernel is subtracted from the usual one built by our algorithm. This results in subtraction of high frequencies.
The formula for obtaining the filter can be written as follows:
Where i = 0, …, L-1, and
.
ES (“Extra Sharpness”) is a parameter that determines the degree of sharpening increase and can be changed in the range from 0 (original filter) to 0.5 (for maximum sharpness).
Let us illustrate this modification graphically.
Blue color shows the graph of an unprocessed filter, and pink shows the Gaussian kernel, which will be subtracted (ExtraSharpness = 0.2).
Filter after subtracting Gaussian
kernel Kernel after normalization procedure
As you can see from the diagrams, the new filter has less attenuation in the pass-band, but good attenuation in the suppression band, which is reflected in better image clarity.
It is important to note that the length of the filter has not increased, consequently, and the time required for filtering has not increased.
Eliminate the Sampling Frequency Ripple effect.
The Sampling Frequency Ripple effect occurs due to the aliasing of the constant component of the signal (DC) during resampling. To prevent it, the suppression of frequencies that are multiples of the cutoff frequency of the filter should be maximized.
When solving this problem, you should vary the parameter values so that they are as close as possible to those obtained in previous experiments and saved settings. Otherwise, we will get other artifacts that arise in resampling tasks: the image may turn out to be too blurry or have large ripples near the borders.
In previous studies, the authors have struggled with this effect in two ways:
Calculated the filter kernel so that frequencies that are multiples of the sampling rate were analytically assumed to be 0. This approach leads to the use of limited technologies such as cubic splines. The main disadvantage of this approach is that it is very difficult to optimize the filter kernel in the frequency domain, since in the spatial domain we are limited to a narrow class of kernels.
Another method of constructing a filter uses its desired frequency response. This method requires the use of sinc functions and weight windows. The main issue is the choice of the weight window. Unser et al. suggested using 2 weight windows: Dirichlet (rectangular window) and Hanning window. But the rectangular window has shown poor results in eliminating the SFR effect. And the Hanning window eliminates this effect, but does not allow adjusting any parameters. Other weighting windows were used in order to further suppress frequencies that are multiples of the sampling frequency, but more suppression in the suppression frequency leads to a wider bandwidth, which makes the image blurry
Our approach is trying to use the merits of each method.
The resampling algorithm of the “Impress” program is controlled by 3 parameters:
Smoothing Amount — the degree of blurring of the image, which affects the cutoff frequency of the anti-aliasing filter. The higher the Smoothing Amount parameter, the lower the cutoff frequency of the anti-aliasing filter. Changing the Smoothing Amount proportionally “stretches” or “compresses” the frequency axis, and all the lobes in the frequency response of the filter (and most importantly, the minimums of the spectrum) are shifted to the left or right, respectively.
Thus, by changing the Smoothing Amount parameter, we can achieve that at the desired frequency there will be a minimum of the filter’s frequency response. However, at the same time, not the minimums of the frequency response may fall on other multiples of the cutoff frequencies we need. It follows that, by changing only the cutoff frequency of the filter, it is impossible to achieve minima in the frequency response of the filter at frequencies that are multiples of the cutoff frequency.
Another parameter is Ringing Suppression. This parameter affects the width of the frequency response side lobes. So, having fixed the Smoothing Amount parameter at the desired frequency (for example, 1KHz), we can then change the Ringing Suppression parameter until we achieve the fact that at another frequency we need (say, 2x1KHz = 2KHz) there will also be a minimum.
But it should be added that when changing the Ringing Suppression parameter, one must keep in mind that all parameters have a nontrivial effect on the distribution of the minimums of the filter’s frequency response. The task of eliminating the Sampling Frequency Ripple effect is made easier by the fact that we do not need to provide minima at all multiple frequencies — just a few first multiples of frequencies are enough, since from the construction of the filter it follows that even the maxima of the spectrum ripple at high frequencies will be negligible in their absolute value.
Analytical Rationale
Main Assumption: In the filter kernel, the frequencies that are multiples of the sampling rate should be zero.
The filter built using the Kaiser window uses the parameter B not only to determine the suppression in the suppression band, but also the bandwidth. A large value of this parameter gives more suppression in the suppression band and a less sharp roll-off (wider bandwidth) Fig. 1. Attenuation of 40 dB means that the aliasing amplitude has decreased by about 100 times.
The Sampling Frequency Ripple elimination algorithm suppresses frequencies that are multiples of the sampling frequency.
Resampling with poly-phase structure
Function PreparePoly
Function PreparePoly prepares coefficients for a one-dimensional signal resampling operation. The result is stored in three arrays: PMap, LMap, and CMap. The inputs are the length of the original 1D signal, the length of the 1D output, and a filter kernel prepared by the function BuildScaledOPTIMALKernel.
Function DoResampling
This function directly resamples a 1D digital signal using the PMap, LMap, and CMap arrays prepared by the function PreparePoly. The algorithm can be represented as follows.
To calculate each next value of point [i] of the output signal, you need to calculate the following result: PMap[i] — the value of the point in the original signal, LMap[i] contains the length of the result, the array at the current point (CMapI) contains the LMap[i] coefficients of the result … After the result has been calculated, we save it as a point in the output signal.
Function PreparePoly
Function PreparePoly contains 3 similar, each of which prepares coefficients. The first cycle prepares the coefficients for the left point, for which it is necessary to perform boundary correction (the signal virtually continues by copying beyond the boundaries of the left point). The second cycle is responsible for intermediate points for which no boundary correction is needed. The third loop processes the border on the right.
In each cycle, the corresponding part of the coefficients is PMap, LMap, and CMap calculated and stored for later use in the function DoResampling. It would be possible to combine these 3 loops into one, but that would require performing checks on each iteration of such a loop (checking the left and right bounds).
At each iteration (for each point of the output signal), the intermediate results are calculated (the initial index in the original signal, the initial index in the polyphase filter), and we cache the coefficients, going along the filter kernel with a step KU (stretching coefficient).
Detailed description of the polyphase structure
Let X[0, …, Len-1] be the original signal,
KU and KD are the compression ratios and (integers),
Y[0, …, NewLen-1] is the output signal,
Kernel[0, …, KerSize-1] — filter kernel. KerSize must be odd to ensure that the center point is unique.
The interpolation / filtering / decimation approach uses an intermediate array Z[0, …, Len*KU — 1] to store the intermediate signal to be filtered. Let’s look at the direct implementation first.
Direct implementation
Polyphase Implementation
Algorithm explanation
The variable PMapI stores the current index of the array Y, the element of which is to be calculated. Consider the case where PMapI = 1 (calculating the second output pixel). Let KernelSize = 19, KU = 7, KD = 11.
The variable ConvCenter stores the current index in the array Znew, corresponding to the pixel PMapI in the array Y . Here ConvCenter = 16.
Now we need to calculate the dot product of Z[7],…, Z[25] with the kernel Kernel[0],…, Kernel[18]. According to the concept of the polyphase framework, you need to skip zero pixels in the array Z . dot product must be calculated as follows
What can be written
We obtain SPixel = 1 and ConvI = 3.
Now we can calculate a general expression for SPixel and ConvI
Boundary correction
When the above dot product is considered
, the index [SPixel+k] is compared with 0 and with Len. If SPixel+k <0, then put SPixel+k = 0, i.e. duplicate the leftmost pixel beyond the left border. If SPixel + K > Len, then put SPixel+k = Len-1, i.e. duplicate the right pixel.
Building adaptive and compositional filters.
The idea of the method
Resampling algorithms using various filters have been described above. It’s interesting to think about when to apply one filter and when to use another. This primarily depends on the result that we want to get. Those. for the sharpest resulting image, it is recommended to use the filter with the highest parameter Sharpness, and for a blurry one, it is necessary to increase the Smoothing Amount. Secondly, the choice of the filter depends on the class of images being resampled. For example, we created special settings for resizing monochrome images (for testing, we built in our program an algorithm for converting images to monochrome using the method of Error Diffusion. For images with smooth transitions, it is better to use an un-sharp filter, etc.
But the end user considers changing the size of the picture as a daily, intermediate task and does not want to think about the filter parameters and other internal details of the operation of our algorithm. Which, he also gets a result that does not meet his expectations.
Here comes the idea of automatic selection of filter parameters based on the analysis of the original image. This can be done in several ways. contrast, view the histogram of colors, gradients in different directions and decide which image we are working with, and based on this, choose the optimal filter parameters. The good thing about this method is that it is fast, i.e. we only analyze the image once, and then run the algorithm we are used to (i.e., we don’t even need to modify the code). But the main disadvantage of the approach is that usually in the same image there are different details, i.e. it usually contains both smooth transitions and sharp edges, and so on. Hence, it should be concluded that this approach is not universal and does not give qualitative results.
Another, more complicated, variant of the solution to the problem posed is to select the filter parameters at each point of the resampled image. To do this, when executing the resampling algorithm at each iteration (for each pixel), it is necessary to analyze the neighborhood of the considered pixel and select the filter parameters. Moreover, the size of this neighborhood must be greater than or equal to the filter width. The proposed method solves the problem described above, since an appropriate filter is applied to different parts of the image. The main disadvantage of this adaptive filtering approach is that it will work thousands of times slower than the method described above. This is due to the fact that we need to process a huge amount of information (analyze the area width * height times), and also because we cannot use a speed-optimized algorithm if the filter at each point is different. In addition to the unacceptably low speed of work, there are also problems with the borders of the image.
Implementation of the method
The proposed method will work not much slower (2–3 times) the first, but also have much more control than the second. The idea of the method arose from the very formulation of the problem: in some areas of the image, you need to use a filter with some settings, and in others — with others. To begin with, it is proposed to analyze the original image and identify these areas on it. For convenience, consider the case when we need to use a smooth filter on blurry areas of the image and a sharp filter near the edges of the image.
To do this, first create a mask image that will be used to determine where to use which filter.
To do this, first apply the border selection filter to the image.
At this point, it should be noted that we do not want to get sharp transitions between the areas of application of different filters. In other words, the goal is not to see where one filter ended its action and another began its action. Therefore, it is not enough to have only a mask on which the selected contours will be fixed — it is necessary to obtain the areas of the borders and the borders themselves.
To get these areas, let’s expand the contours by applying to the resulting image. With contours (pre-grayscale) Dilatation algorithm. And then we will blur the resulting image, for example, using a Gaussian filter. Let’s get the mask we need (alpha channel). The mask acquisition process is illustrated below.
Recall that we will differentiate the areas according to whether they relate to the contours of the image or not, and we will use 2 filters (sharp and blurring).
The next step is to resample the mask and image itself. And we resample the original image itself twice — with a sharp and blurring filter. The image will be resampled using the default filter (the most universal setting is Normal). Next, you need to use a mask for the two resulting ones to mount the desired final image. The black points of the mask will be the result of the blur filter, and the white ones — the sharp one. In the rest, we calculate the color by the formula C (x, y) = A (x, y) * alpha + (1-alpha) * B (x, y).
Illustration of the method
Let us illustrate the described algorithm.
The proposed method has many advantages over the first two. First, the time required to execute the algorithm does not greatly exceed the execution time of the first method (you need to resample the image 3 times and get a mask). Secondly, you can use various other masks, not just the one described in our example. For example, you can select areas where changes in color are insignificant or any other (objects, desired color areas). You can load a pre-molded mask. This is a polka dot way to see the difference between the two filters. Third, the algorithm can be parallelized, since all three tasks are completely independent.
Adaptation of the algorithm for resampling video sequences
Resampling for video sequences imposes its own requirements on the algorithms.
First, very often, for example, when viewing a digital signal of one resolution, say 320x240, on a monitor with a resolution of 1024x768, the priority requirement is the speed of the algorithm. The algorithm must be able to process images on the fly (real time), i.e. from 24 to 30 images per second depending on the video format. In this case, the nearest neighbor algorithm or bilinear interpolation is usually used.
The “impression” algorithm is primarily intended for high-quality resampling, so it is primarily not applicable for on-the-fly processing.
When resampling video sequences, there are many problems associated primarily with the device itself and the video signal format. Typically, for smaller volumes, the digital video signal is compressed using some kind of video compression method. Such a signal must first be converted into a sequence of independent frames, which is performed by codec programs and programs such as AVI-Constructor, Adobe Premiere, Discreet 3DS MAX, Virtual DUB. Sometimes the digitized video signal is split into semi-fields. The image consists of two fields: one field contains even, and the other — odd lines of the picture (interlaced). To resample such a signal, you first need to deinterlacing the signal, i.e. combine half-fields into one static frame.
Results
Examples of the algorithm and comparison
Example 1.
Reducing Image 1 by various algorithms up to a resolution of 200x300.
Then the image was enlarged to 1200x1800 resolution, and the region in which the comparison was made was cut out from it.
Example 2.
Image for magnification
Example 3.
An image with a resolution of 1500x1125 is first reduced, then enlarged by various algorithms. Then the regions in the images are compared.
As you can see from the figures, the Impress: Normal preset gives excellent results for most image classes and presents an excellent compromise for resampling artifacts.
The Extra Sharp and Sharp presets do a great job with sharp images.
Summary
A number of classes in C ++ were developed for working with images using the described methods. In particular, it is the СImageStore class for storing and loading 24-bit and 32-bit images in the Windows system format and a basic set of operations. Also, an interface class CImage was created (containing about 60 functions), encapsulating the functions of the “main” and giving the user ample opportunities for image processing, and the programmer — the convenience of expanding the functionality.
The described algorithm is built into the Impress program and is successfully used in practical work with images.
Substantial work has been successfully carried out to optimize the performance of the algorithm. The performance of the algorithm was equal to the performance of other popular algorithms.
Most of the methods for changing the image resolution have been studied.
A set of filters for resampling images has been developed
Work has been done to assess the quality of various resampling algorithms.
A program has been written that implements resampling with the obtained filters.
The algorithm has been tested, which has shown its excellent properties.
Download compiled version of Impress windows application