*This page is designed as a place where group members can communicate
with each other.*

*by Adam
Huang, April 26*

To improve the parallelism of Howard's original code, which uses a global GMRES solver with an incomplete LU preconditioner, I have been working on the Domain Decomposition approach to the solver. I applied the partitioning routines from the METIS 2.0 package. This is a DD library developed by George Karypis of the CS Department of the University of Minnesota. The algorithms implemented by METIS are based on the multilevel graph partitioning scheme. I took the lazy approach here and just applied the routines to the assembled global matrix instead of to the mesh.

The problem is how to solve this system. Different approaches have been explored:

- The Schur complement approach----In this approach the
subdomain blocks are factorized and the system is reduced to the Schur
complement, E-D*Invers(B)*C, which has the order of the number of interface
nodes. GMRES is then applied to this reduced system. Without the preconditioner
the convergence is very slow (see the graph). Then I used the ILUT factor of E
as a preconditioner, and noticed a dramatic improvement in the convergence.
Also with this approach, I followed a suggestion by Dr. Sameh about computing
the diagonal elements of the complement D*Invers(B)*C. What I found out is that
the magnitude of these diagonal elements is one order smaller than that of the
E matrix. Hence the expensive operation of computing the diagonal elements of
D*Invers(B)*C only gives marginal improvement of convergence.

*Schur complement approach: Convergence with and without preconditioner* - The block diagonal approach----With this approach the block
diagonal ILUT is used as a preconditioner for the global matrix. This approach
gives a high degree of parallelism; for instance, with 4 processors I got a
speedup of about 2.5. However, the convergence is even slower than with the
Schur complement approach because we are solving a much larger system.

*Convergence with block diagonal preconditioner* - Block Jacobi approach----I also tried to implement the block
Jacobi iteration taking into account the cyclical-2 property of the matrix. I
don't get any convergence with this approach. I suspect the reason for this is
that the matrix is not at all diagonal-dominant, of course there is the
possibility that there are errors in my implementations.

Overall the Domain Decomposition approach I'm taking has not shown any speedup over the original global GMRES solver when running in a sequential mode, because of the slow convergence of the new system. Of course we get quite a lot of parallelism. Larger problems need to be tested to see if there is some benefit.

*by Matt Knepley, June 26*

I hope to implement the improvements listed below by July 15 (this should include testing).

- Settle on an acceptable preconditioner. ILUT seems fine to me, but we could see if Block SSOR is less costly.
- Implement Deflated GMRES. I have code to do this here and have tested it. I will integrate it into the libraries this week.
- Tune Inexact Newton Iteration. We could start out with something easy, and then there is a nice article in the new SISC by Eisenstat and Walker about adaptively choosing tolerances.

I hope to implement the improvements listed below by July 15 (this should include testing).

- Get CVS up and running at UMN. This is going fine.
- Install SPARSKIT as a vendor product. This seems straightforward.

I hope to implement the improvements listed below by August 1.

- Standardize test cases. This seems to have already been done by Howard, but I need to familiarize myself with the procedure.
- Expand test criteria. Right now, tests seem to only look for physical predictions. We should also be looking toward benchmarks across different solvers and preconditioners.

| AEM Home | Institute of Technology |

| Academics |
**Research** | People |
Information | Contact
AEM |

| Major Projects |
**by Specialty** |
Other Programs |

*Last updated October 16, 2000*