# Automatic regularization parameter selection: controlled wavelet domain sparsity

Summary: Choosing the regularization parameter is a hard problem for which many approaches exist. Here we discuss a recently developed, fully automatic method called controlled wavelet domain sparsity (CWDS) in the context of X-ray tomography. The approach uses sparsity-promoting regularization with respect to an orthogonal wavelet basis. The minimization problem is solved using the iterative soft-thresholding algorithm, and the regularization parameter is selected under the assumption that an a priori known level of sparsity is available. The regularization parameter is then varied during the iterations of the algorithm so that the final reconstruction has the desired level of sparsity.

Author: Juho Rimpeläinen.

Software: Tested on both Matlab R2017a and Octave (version 4.2.1)

Literature: It is recommended that you check out the previous posts about X-ray tomography. For more information on the reconstruction algorithm, see . The CWDS method was introduced in .

## The algorithm

We study CT imaging by minimizing the following penalty functional:

$$ \min_{f} \left\{ \tfrac{1}{2}\,\| A f - m \|_2^2 + \mu \| W f \|_1 \right\}, $$

where $A$ is the tomographic measurement matrix, $m$ is the measured data, $W$ denotes the digital implementation of a two-dimensional wavelet transform, and $\mu > 0$ is the regularization parameter. This can be minimized using the iterative soft-thresholding algorithm, namely by the iterations

$$ f^{(i+1)} = W^T S_\mu\!\left( W\!\left( f^{(i)} + A^T\!\left( m - A f^{(i)} \right) \right) \right), $$

with the soft-thresholding operator $S_\mu$, which applies the following function to each component of the vector:

$$ S_\mu(w) = \operatorname{sign}(w)\,\max(|w| - \mu,\, 0). $$

Additionally, we make sure that each pixel of the iterate is nonnegative simply by setting the negative elements to zero. Note, however, that it is uncertain how enforcing nonnegativity this way affects the convergence of the algorithm. For this reason, it might be preferable to use, for example, the primal-dual fixed point algorithm, which allows easy nonnegativity constraints. We also encourage the reader to check out  for a useful generalization of the iterative soft-thresholding algorithm. However, we deem the iterative soft-thresholding algorithm sufficient for demonstrating the method for selecting the regularization parameter.
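As a rough illustration, one iteration of this scheme can be sketched in Python (the code accompanying this post is written in Matlab/Octave, so this is only a sketch). A random orthogonal matrix `Q` stands in for the wavelet transform $W$, and the toy system `A`, `m` and the unit step size are assumptions for demonstration only:

```python
import numpy as np

def soft_threshold(w, mu):
    # Componentwise S_mu(w) = sign(w) * max(|w| - mu, 0).
    return np.sign(w) * np.maximum(np.abs(w) - mu, 0.0)

def ista_step(f, A, m, mu, Q):
    # One iterative soft-thresholding step; Q is an orthogonal
    # stand-in for the wavelet transform W, so Q.T is its inverse.
    g = f + A.T @ (m - A @ f)        # gradient step on the data-fit term
    w = soft_threshold(Q @ g, mu)    # shrink the transform coefficients
    f_next = Q.T @ w                 # back to the image domain
    return np.maximum(f_next, 0.0)   # nonnegativity by projection to zero

# Tiny demo: ||A|| is scaled below 1 so the plain gradient step
# (step size 1) is stable.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))
A /= np.linalg.norm(A, 2) * 1.1
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))  # orthogonal stand-in for W
f_true = np.abs(rng.standard_normal(6))
m = A @ f_true
f = np.zeros(6)
for _ in range(50):
    f = ista_step(f, A, m, 0.01, Q)
```

After the loop, `f` is a nonnegative approximate solution whose wavelet (here: `Q`-domain) coefficients have been repeatedly shrunk by $\mu$.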

The method for selecting the parameter μ is based on the assumption that an a priori known estimate of the sparsity level of the object being imaged is available. This could be obtained, for example, from a set of previously computed reconstructions. Here we assume that the estimate is expressed as the ratio of nonzero wavelet coefficients of the reconstruction. On each iteration of the algorithm we can easily measure the current ratio of nonzero coefficients, and we then adjust μ so that the desired ratio is reached. This could be achieved using a proportional-integral-derivative (PID) controller; however, even simple integral control seems to be sufficient. At each iteration μ is updated using

$$ \mu^{(i+1)} = \mu^{(i)} + \alpha \left( y(i) - y_{\mathrm{prior}} \right), $$

where α is a non-negative tuning parameter, y(i) is the ratio of nonzero coefficients at the i-th iteration, and yprior is the a priori known estimate for the ratio. The parameter α is a step size: too large a value results in oscillatory behavior both in the ratio of nonzero coefficients and in the values of μ, whereas with too small a value, reaching the desired ratio takes a long time. There are probably countless ways to select α, but here we consider a very simple and intuitive approach. The basic idea is to first select a deliberately large value for α and decrease it each time the difference e = y(i) – yprior changes sign. This way we can ensure (at least to some degree) that the desired ratio of nonzero coefficients will be reached in reasonable time. Our implementation of this idea decreases α multiplicatively: each time e changes sign, α is replaced by cα for a fixed factor 0 < c < 1.
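The control scheme can be sketched as follows (a Python sketch, not the Matlab implementation in cwds.m; the decrease factor `c = 0.5` and the tolerance `tol` are assumed values for illustration):

```python
import numpy as np

def nonzero_ratio(w, tol=1e-8):
    # y(i): fraction of wavelet coefficients whose magnitude exceeds tol.
    return np.mean(np.abs(w) > tol)

def update_mu(mu, alpha, y, y_prior):
    # Integral control: raise mu when the iterate has too many nonzero
    # coefficients (y > y_prior), lower it when it is too sparse.
    return mu + alpha * (y - y_prior)

def update_alpha(alpha, e, e_prev, c=0.5):
    # Shrink the step size each time the sparsity error e = y - y_prior
    # changes sign (c = 0.5 is an assumed factor, not the value in cwds.m).
    if e * e_prev < 0:
        alpha = c * alpha
    return alpha
```

In the full algorithm these updates would be applied once per iteration, after measuring the current ratio of nonzero coefficients of the thresholded iterate.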

This simple tuning strategy has some drawbacks. First, it requires a reasonable guess for the initial α: it should be sufficiently large, but preferably not too large, as it can then take a very long time to decrease it to a good level. We solve this by computing the mean of the largest wavelet coefficients of the back-projection reconstruction and using some fraction of that mean as the initial value. The thinking here is that the back-projection reconstruction should contain most of the major features of the object, and thus it can be used to find a somewhat reasonable range for μ, which in turn helps us find an α large enough for the scheme above to work. Additionally, our method might not always perform well: if α decreases too quickly, one can end up in a situation where y(i) is still far from yprior, but α is so small that μ changes very slowly.
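This initialization heuristic can be sketched like so (a minimal Python sketch; the coefficient count n(1 − yprior) and the 10% fraction follow the experimental setup described below, and the back-projection coefficients are assumed to be passed in as an array):

```python
import numpy as np

def initial_alpha(bp_coeffs, y_prior, fraction=0.1):
    # Mean of the n*(1 - y_prior) largest-magnitude wavelet coefficients
    # of the back-projection reconstruction, scaled by a fraction.
    w = np.sort(np.abs(np.ravel(bp_coeffs)))[::-1]  # sort descending
    k = int(round(w.size * (1.0 - y_prior)))
    return fraction * w[:k].mean()
```

For example, with coefficients `[4, 3, 2, 1]` and `y_prior = 0.5`, the mean of the two largest magnitudes is 3.5, giving an initial α of 0.35.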

## Computational implementation and results

Next, let us briefly inspect the code (download links above) before looking at the results. The bulk of the code is in the function cwds.m, which contains the reconstruction algorithm as well as the implementation of the control scheme. The function Smu_wavelet_oper.m implements the operator Sμ defined above; it in turn uses the functions wavetrans2D.m, wavetrans2D_inv.m and Smu.m, which implement the wavelet transform, the inverse wavelet transform and the componentwise soft-thresholding. Finally, the file test.m sets up the experiment: loading the data, setting the values of the different parameters, and defining the orthonormal wavelet basis used.

In the following computational examples we use the 328×328 walnut dataset obtainable from here (documentation in arXiv). We use a desired sparsity level of yprior = 0.32. As the initial value for α, we take 10% of the mean of the n(1 − yprior) largest wavelet coefficients of the back-projection reconstruction, where n is the total number of wavelet coefficients. In the first example, we use 30 projection directions. Running the file test.m produces the figures below, showing the evolution of the ratio of nonzero wavelet coefficients of the iterate, the relative square norm change, and the thresholding parameter as the iteration progresses.

Let us also look at the reconstruction from more complete data, easily obtained by setting `Nang = 120` in test.m. The figures can be seen below. The results are fairly similar to the ones above, besides the obvious improvement in reconstruction quality; however, the results are obtained slightly faster (113 iterations vs. 151 iterations), and the peak in the value of μ is lower. This is due to a better initial value for α, obtained from the more complete data.