CSCI 421
Numerical Computing, Spring 2025
Homework Assignment 7
1. (20 points) Implement the (normalized) power method (AG p. 289), inverse iteration (AG p. 294) and the Rayleigh quotient iteration (AG p. 295). They all use the notation μ_k = v_k^T A v_k for the Rayleigh quotient (note that the usual Rayleigh quotient denominator v_k^T v_k is always one). In inverse iteration, for efficiency you need to do the LU factorization of the matrix A − αI outside the loop and then use forward and back substitution (using \ is fine) inside the loop. This is not possible for the Rayleigh quotient iteration because the shifted matrix depends on the iteration counter k. In all cases compute the eigenvalue residual norm ‖A v_k − μ_k v_k‖ at the end of each pass through the loop and terminate when either the residual norm is less than 10^{-14} or the iteration counter k reaches 100. Also print k, μ_k and the eigenvalue residual norm at the end of the loop using fprintf.
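As a rough illustration, here is a minimal MATLAB sketch of the inverse iteration only; the function name, argument list and iteration cap are illustrative, and the power and Rayleigh quotient iterations follow the same pattern (with the Rayleigh quotient iteration refactoring the shifted matrix every pass).

    function [mu, v, resids] = invIter(A, alpha, v0, tol, maxit)
    % Inverse iteration with a fixed shift alpha.
    % The LU factorization of A - alpha*I is done once, outside the loop.
    n = length(v0);
    [L, U, P] = lu(A - alpha*eye(n));     % P*(A - alpha*I) = L*U
    v = v0 / norm(v0);
    resids = [];
    for k = 1:maxit
        w = U \ (L \ (P*v));              % solve (A - alpha*I) w = v
        v = w / norm(w);                  % normalize
        mu = v' * A * v;                  % Rayleigh quotient (v'*v = 1)
        resids(k) = norm(A*v - mu*v);     % eigenvalue residual norm
        if resids(k) < tol
            break
        end
    end
    fprintf('k = %d, mu = %g, residual norm = %g\n', k, mu, resids(k));
    end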
Run all 3 iterations on the tridiagonal matrix T computed from calling tridiagExample.m, with inputs N=100 and nonsym=0.1. Set v_0 to the vector of all ones. For inverse iteration, use α = 4. Using semilogy, plot the residual norms of all 3 methods on the same plot. Explain carefully how the plot demonstrates what we discussed about the convergence of these 3 methods in class. Note that the Rayleigh quotient iteration does not find the largest eigenvalue of T, unlike the other 2 methods.
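A possible driver is sketched below; it assumes tridiagExample(N, nonsym) returns T, and the three iteration function names and output lists are illustrative (each is assumed to return its residual history, as in the sketch above).

    T  = tridiagExample(100, 0.1);                % assumed calling convention
    v0 = ones(100, 1);
    [~, ~, rPow] = powerIter(T, v0, 1e-14, 100);  % illustrative names
    [~, ~, rInv] = invIter(T, 4, v0, 1e-14, 100);
    [~, ~, rRQ ] = rqIter(T, v0, 1e-14, 100);
    semilogy(1:length(rPow), rPow, 1:length(rInv), rInv, 1:length(rRQ), rRQ)
    legend('power method', 'inverse iteration', 'Rayleigh quotient iteration')
    xlabel('iteration k'), ylabel('eigenvalue residual norm')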
2. (20 pts) As I mentioned in class, if we first compute the eigenvalues of a matrix A and then later we decide we want the eigenvector corresponding to a computed eigenvalue λ, we can do this very effectively by solving the equation (A − λI)x = b, where b is chosen randomly. This amounts to just one step of inverse iteration (the inverse power method) using the shift λ. The reason it works is that λ is not exactly an eigenvalue of A and hence A − λI is not exactly singular. It is very badly conditioned, so there is a lot of rounding error incurred in solving the equation, but luckily, and very surprisingly, the rounding error is almost entirely in the magnitude of the solution x, not the direction. But we don't care about the magnitude because we are going to normalize x anyway. And the reason the resulting x is very close to an eigenvector is the same reason that the inverse power iteration converges so fast: see AG p. 293 and my notes. Verify these claims as follows (a sketch putting these steps together is given after the list):
• choose a random real square matrix of odd order, say 11 or 101, so that it must have at least one real eigenvalue
• compute its eigenvalues with evalues = eig(A)
• choose one of the computed real eigenvalues for λ (using real or imag) and solve (A − λI)x = b using \, where b is chosen randomly – note that Matlab gives you a warning about A − λI being nearly singular
• normalize x by dividing by its 2-norm
• check the eigenvalue-eigenvector residual ‖Ax − λx‖_2: it should be around machine precision
• another much more expensive way to find the eigenvector is to compute the SVD of B = A − λI. The eigenvector of A corresponding to the eigenvalue λ is one of the left or right singular vectors of B: which one and why?
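Put together, the first five steps might look like the following sketch (the way the real eigenvalue is selected is just one possibility):

    n = 11;
    A = randn(n);                             % random real matrix of odd order
    evalues = eig(A);
    realEvals = evalues(imag(evalues) == 0);  % the computed real eigenvalues
    lambda = realEvals(1);                    % pick one of them
    b = randn(n, 1);
    x = (A - lambda*eye(n)) \ b;              % expect a near-singular warning here
    x = x / norm(x);                          % normalize to unit 2-norm
    fprintf('residual = %g\n', norm(A*x - lambda*x))  % should be near machine precision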
3. (30 pts) This concerns the SVD Block Power method for approximating the largest r singular values and corresponding singular vectors of an m × n matrix A, where r is less than n, usually much less:
Initialize V_0 to an n × r matrix
For k = 1, 2, ...
    Set Û = A V_{k-1}
    Get reduced QR factorization Û = Q_U R_U
    Set U_k = Q_U
    Set V̂ = A^T U_k
    Get reduced QR factorization V̂ = Q_V R_V
    Set V_k = Q_V
This is also given on p. 22-6 of my lecture notes for April 14, but it is not discussed by AG.
(a) The signs of the diagonal entries of R in a QR factorization are arbitrary: although Gram-Schmidt (which we are not using) chooses them to be positive, Householder (which Matlab's qr uses) does not. Since we'd like them to be positive in the next part, write a short Matlab function [Q,R]=qrPosDiagR(A) which computes the reduced Q and R factors of A by calling Matlab's qr(A,0) and then modifies the Q and R factors so that the diagonal entries of R are all positive (or zero, but that's unlikely), by changing the sign of the corresponding row of R and the corresponding column of Q, so that the equation A = QR still holds (check this!). One way to see that this "works" is to observe that QR = Q S^2 R = (QS)(SR), where S is a diagonal matrix of ±1's, so S^2 = I.
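A minimal sketch of such a function, following the sign-flipping idea just described:

    function [Q, R] = qrPosDiagR(A)
    % Reduced QR factorization with the diagonal entries of R made nonnegative.
    [Q, R] = qr(A, 0);
    s = sign(diag(R));
    s(s == 0) = 1;            % leave any (unlikely) zero diagonal entries alone
    S = diag(s);
    Q = Q * S;                % flip the corresponding columns of Q ...
    R = S * R;                % ... and rows of R, so (QS)(SR) = Q S^2 R = QR = A
    end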
(b) Implement the SVD Block Power method given above.
Use qrPosDiagR for both QR factorizations in the loop. It turns out that for random matrices, both the upper triangular matrices R_U and R_V converge to diagonal matrices, so compute the matrix 2-norm of the strictly upper triangular parts of both R_U and R_V (either by using triu(·,1) or by subtracting off the diagonal entries) and terminate the loop when the maximum of these two norms is less than a tolerance, tol. The inputs to this function should be an m × n matrix A (it should not matter whether m > n, m = n or m < n), an integer r defining the number of columns in U_k and V_k that is usually much less than m and n, and the tolerance tol. Before the loop, set the initial matrix V_0 to the n × r matrix whose columns are the first r columns of the n × n identity matrix, obtained by eye(n,r); you can get m and n from size(A). This block power method can be slow, like the block power iteration for eigenvalues, but it works pretty well when r is small. Run it on this matrix A, setting r = 3 and setting tol to say 10^{-6} (you can also try other values if you like). Now compare the diagonal entries of the final R_U and R_V to the largest r singular values of A, which you can compute from Matlab's svd. Are they good approximations to the singular values? How do the final U_k and V_k compare with the corresponding singular vectors computed by Matlab's svd?
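A sketch of part (b) is given below; the function name, the output list and the cap on the number of iterations are illustrative. It uses qrPosDiagR from part (a) and also returns V_{k-1}, which is convenient for the residual check in part (c).

    function [U, V, RU, RV, Vprev] = svdBlockPower(A, r, tol)
    % SVD block power method for the r largest singular values/vectors of A.
    [~, n] = size(A);
    V = eye(n, r);                           % V_0: first r columns of identity
    for k = 1:1000                           % arbitrary safeguard on iterations
        Vprev = V;                           % keep V_{k-1} for part (c)
        [U, RU] = qrPosDiagR(A * Vprev);     % U_k and R_U
        [V, RV] = qrPosDiagR(A' * U);        % V_k and R_V
        offdiag = max(norm(triu(RU, 1)), norm(triu(RV, 1)));
        if offdiag < tol                     % strictly upper parts nearly zero
            break
        end
    end
    end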
(c) It is not hard to see from the SVD notes that the r largest singular values and the corresponding singular vectors of A, defined by

    Σ̃ = diag(σ_1, ..., σ_r), Ũ = [u_1, ..., u_r], Ṽ = [v_1, ..., v_r],

satisfy A Ṽ = Ũ Σ̃ and A^T Ũ = Ṽ Σ̃. Explain why, when the block power method terminates, it must be the case that

    ‖A V_{k-1} − U_k D_U‖ ≤ tol and ‖A^T U_k − V_k D_V‖ ≤ tol,

where D_U and D_V are respectively the diagonals of R_U and R_V, and also check to make sure that this is actually the case on the example you tested. These small residuals help to explain why the approximations to the singular values and singular vectors computed by the block power method are meaningful provided the tolerance condition is satisfied.
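The numerical check might look like the following, assuming the block power function also returns V_{k-1}, as in the sketch after part (b):

    [U, V, RU, RV, Vprev] = svdBlockPower(A, 3, 1e-6);
    DU = diag(diag(RU));
    DV = diag(diag(RV));
    fprintf('%g  %g\n', norm(A*Vprev - U*DU), norm(A'*U - V*DV))  % both <= tol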
(d) What is the operation count for each iteration of the loop, in terms of r, m, n using Big O? Explain your reasoning.
4. (30 pts) Download WallOfWindows.jpg and read it into Matlab:
A = imread('WallOfWindows.jpg','jpg')
Display it with image(A). If you type whos A you will see that A is a 3456 × 4608 × 3 three-dimensional “matrix” (or tensor) of unsigned 8-bit integers. Split these into red, green and blue components, each a 3456 × 4608 matrix of unsigned integers, via A_red=A(:,:,1), etc, and convert these to double precision matrices using double.
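In MATLAB this might look like the following (the component variable names are illustrative):

    A = imread('WallOfWindows.jpg', 'jpg');
    image(A)                            % display the original picture
    A_red   = double(A(:,:,1));         % each a 3456 x 4608 double matrix
    A_green = double(A(:,:,2));
    A_blue  = double(A(:,:,3));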
(a) Compute the reduced (economy-sized) SVD of each color component matrix separately using Matlab's svd(M,0). Then, for each component matrix, compute its nearest rank r approximation (see p. 6–7 of Notes on Eigenvalues and Singular Values), for r = 10, 20, 30 and 100. Recombine them into the 3456 × 4608 × 3 uint8 format, and then display the resulting four pictures using image, labelling them appropriately. How well do they approximate the original picture in your opinion? This shows that SVD is a useful approach to image compression: instead of storing the whole matrix, you need only to store the part of the SVD needed to construct the rank r approximation: how much storage does this require, in terms of m, n and r?
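One way to form and display the rank r approximation, sketched for a single value of r and using the standard SVD truncation from the notes (variable names are illustrative):

    r = 10;                               % repeat for r = 20, 30, 100
    [Ur, Sr, Vr] = svd(A_red, 0);   Rr = Ur(:,1:r)*Sr(1:r,1:r)*Vr(:,1:r)';
    [Ug, Sg, Vg] = svd(A_green, 0); Gr = Ug(:,1:r)*Sg(1:r,1:r)*Vg(:,1:r)';
    [Ub, Sb, Vb] = svd(A_blue, 0);  Br = Ub(:,1:r)*Sb(1:r,1:r)*Vb(:,1:r)';
    B = uint8(cat(3, Rr, Gr, Br));        % recombine into 3456 x 4608 x 3 uint8
    figure, image(B), title(sprintf('rank %d approximation', r))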
(b) The problem with the method just described is that it is expensive to call svd on such a large matrix and it would take much too long for a much larger matrix. Can you get pictures that are just as good using your block power method from the previous question, with less computational expense, again using r = 10, 20, 30 and 100? For the singular values, you could use the diagonal of either the final R_U or R_V or an average of the two, and for the singular vectors, use the final U_k and V_k. You may find you don't have to make the tolerance in the block power method very small to get good pictures. How big can you make tol and still get good pictures, in your opinion, and how does the time required compare to the time required using svd?
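For example, the rank r reconstruction of one component could be assembled from the block power method outputs roughly as follows (names as in the sketches for question 3, with r and tol set beforehand, and averaging the two diagonal estimates of the singular values):

    [Uk, Vk, RU, RV] = svdBlockPower(A_red, r, tol);
    sigmas = 0.5*(diag(RU) + diag(RV));   % approximate singular values
    Rr = Uk * diag(sigmas) * Vk';         % approximate rank r reconstruction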