## EECS 358 - Introduction to Parallel Computing

**CATALOG DESCRIPTION:** Introduction to parallel computing for scientists and engineers. Shared memory parallel architectures and programming; distributed memory message-passing architectures and programming; data-parallel architectures and programming.

**REQUIRED TEXTS:** A. Grama, A. Gupta, G. Karypis, and V. Kumar, *Introduction to Parallel Computing*, 2nd edition, Addison Wesley, 2003.

**REFERENCE TEXTS:**

• I. Foster, *Designing and Building Parallel Programs*, Addison Wesley 1995.

• B. Bauer, *Practical Parallel Programming*, Academic Press 1992.

• W. Gropp et al., *Using MPI: Portable Parallel Programming with the Message-Passing Interface*, MIT Press 1994.

• C. Koelbel et al., *The High Performance Fortran Handbook*, MIT Press 1994.

**COURSE COORDINATOR:** Prof. Gokhan Memik

**COURSE GOALS:** To provide an introduction to the field of parallel computing. The course gives an overview of the three basic types of parallel computing (shared memory, distributed memory message-passing, and data parallel), with hands-on programming experience on actual parallel machines.

**PREREQUISITES:** EECS 361 and EECS 230 *or* EECS 231 *or* EECS 211

**PREREQUISITES BY TOPIC:**

- An overview of computer architecture
- Basic concepts of processors, ALUs, memories, caches, input-output
- Basic to intermediate programming of serial computers using C or Fortran
- Simple data structures such as arrays and linked lists
- Some knowledge of scientific and engineering applications

**DETAILED COURSE TOPICS:**

**Week 1:** Introduction to parallel computing: motivation for parallel computing, options of parallel computing, economics of parallel computing, basic concepts of parallel algorithms. Introduction to parallel programming: data and task parallelism, coarse and fine grain parallelism, performance of parallel programs, load balancing and scheduling, analysis of simple parallel programs.

**Week 2:** Overview of shared memory parallel architectures: memory organization, interconnect organization, cache coherence, case studies of machines such as SGI Challenge, IBM J-30, HP/Convex Exemplar. Introduction to shared memory parallel programming: shared memory model, process creation and destruction, mutual exclusion, locks, barriers.

**Week 3:** Explicit shared memory programming: static and dynamic loop scheduling, loop parallelization strategies. Shared memory parallel programming: use of the PTHREADS library, case studies of explicit parallel programming, computation of PI, matrix multiplication, solution of partial differential equations.
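As an illustration of the Week 3 topics, the following is a minimal sketch (not course material) of an explicit shared memory program that estimates PI with PTHREADS. It assumes the midpoint rule applied to the integral of 4/(1+x^2) over [0,1] and a static block schedule of the loop iterations. The names `pi_task`, `pi_worker`, and `parallel_pi` are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Work description for one thread: a contiguous block of intervals. */
typedef struct {
    long begin;      /* first interval index (inclusive) */
    long end;        /* last interval index (exclusive) */
    long n;          /* total number of intervals */
    double partial;  /* this thread's share of the estimate */
} pi_task;

/* Midpoint rule on the integral of 4/(1+x^2) over [0,1], which equals pi. */
static void *pi_worker(void *arg)
{
    pi_task *t = (pi_task *)arg;
    double h = 1.0 / (double)t->n;
    double sum = 0.0;
    for (long i = t->begin; i < t->end; i++) {
        double x = h * ((double)i + 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    t->partial = sum * h;  /* written only by this thread, so no lock is needed */
    return NULL;
}

/* Estimate pi with n intervals split statically across nthreads threads. */
double parallel_pi(long n, int nthreads)
{
    assert(n > 0 && nthreads > 0);
    pthread_t *tid = malloc((size_t)nthreads * sizeof *tid);
    pi_task *task = malloc((size_t)nthreads * sizeof *task);
    assert(tid != NULL && task != NULL);

    for (int k = 0; k < nthreads; k++) {
        /* Static block schedule: thread k gets [k*n/p, (k+1)*n/p). */
        task[k].begin = (long)k * n / nthreads;
        task[k].end = (long)(k + 1) * n / nthreads;
        task[k].n = n;
        task[k].partial = 0.0;
        int rc = pthread_create(&tid[k], NULL, pi_worker, &task[k]);
        assert(rc == 0);
    }

    double pi = 0.0;
    for (int k = 0; k < nthreads; k++) {
        pthread_join(tid[k], NULL);  /* wait, then combine the partial results */
        pi += task[k].partial;
    }
    free(task);
    free(tid);
    return pi;
}
```

Calling `parallel_pi(1000000, 4)` from a `main` function and compiling with `-pthread` should give an estimate close to 3.14159. Each thread writes only its own `partial` field and the sums are combined after the join, which avoids a shared accumulator protected by a lock; a dynamic schedule would instead hand out smaller chunks from a shared counter.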

**Week 4:** Implicit shared memory parallel programming: use of compiler directives for parallel programming; DOALL, DOACROSS, and PRAGMA directives for loop-level parallelism; parallel programming examples using directives.

**Week 5:** Distributed memory multicomputer architectures: overview of distributed memory parallel machines, message passing schemes, store-and-forward versus wormhole routing, interconnection networks, case studies of parallel machines such as Intel Paragon, IBM SP-2, Thinking Machines CM-5. Global communication operations in distributed memory machines: one-to-all broadcast, reduction, shift, scatter, gather operations, performance analysis of these operations on various parallel architectures.

**Week 6:** Introduction to message-passing programming: basics of message passing, global and local addresses, single-program multiple-data (SPMD) programs, introduction to the Message Passing Interface (MPI). Intermediate concepts in message-passing programming: global and local addresses, loop scheduling for parallel loops. Advanced message-passing concepts: topologies and decompositions, case studies of example applications, matrix multiplication, solution of partial differential equations.
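To make the Week 6 notion of global and local addresses concrete, here is a minimal sketch (not course material) that assumes a one-dimensional block decomposition of `n` array elements over `p` processes, with any remainder spread over the lowest-numbered ranks. It contains no MPI calls; in an SPMD program the values of `rank` and `p` would typically come from `MPI_Comm_rank` and `MPI_Comm_size`. All function names are hypothetical.

```c
#include <assert.h>

/* Number of elements owned by `rank` when n elements are split over p ranks. */
long block_size(long n, int p, int rank)
{
    assert(n >= 0 && p > 0 && rank >= 0 && rank < p);
    return n / p + (rank < n % p ? 1 : 0);
}

/* Global index of the first element owned by `rank`. */
long block_start(long n, int p, int rank)
{
    assert(n >= 0 && p > 0 && rank >= 0 && rank < p);
    long q = n / p, r = n % p;
    return rank * q + (rank < r ? rank : r);
}

/* Translate a local index on `rank` into a global index. */
long local_to_global(long n, int p, int rank, long local)
{
    assert(local >= 0 && local < block_size(n, p, rank));
    return block_start(n, p, rank) + local;
}

/* Rank that owns a given global index. */
int owner_of(long n, int p, long global)
{
    assert(p > 0 && global >= 0 && global < n);
    long q = n / p, r = n % p;
    long cut = r * (q + 1);  /* first index held by a rank with only q elements */
    if (global < cut)
        return (int)(global / (q + 1));
    return (int)(r + (global - cut) / q);
}

/* Translate a global index into the local index on its owning rank. */
long global_to_local(long n, int p, long global)
{
    return global - block_start(n, p, owner_of(n, p, global));
}
```

For example, with `n = 10` and `p = 4` the block sizes are 3, 3, 2, 2, so global index 7 lives on rank 2 at local index 1. A cyclic or block-cyclic decomposition would use different mapping formulas but the same split between owner and local offset.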

**Week 7:** Introduction to SIMD parallel architectures: single-instruction multiple-data stream architectures, control and data units, interconnection networks, case studies of machines such as Thinking Machines CM-2, CM-5 and MasPar MP-2. Introduction to data parallel programming: Fortran-90, array sections, array operations, array intrinsic operations.

**Week 8:** Introduction to High Performance Fortran (HPF): FORALL statements, INDEPENDENT directives, simple parallel programs. High Performance Fortran data distribution and alignment directives, simple parallel programming examples, matrix multiplication, solution of partial differential equations.

**Week 9:** Methodology for parallel algorithm design: concurrency, locality, scalability, modularity; partitioning, agglomeration, communication, mapping, performance analysis of parallel algorithms. Parallel matrix algorithms: matrix representations, parallel dense matrix operations, matrix-vector and matrix-matrix multiplication, solution of linear systems of equations.

**Week 10:** Parallel sparse matrix solvers: sparse matrix representations, parallel iterative methods, parallel direct methods. Parallel search algorithms: optimization methods, parallel best-first search, parallel depth-first search, speedup anomalies.

**COMPUTER USAGE:** Students get hands-on parallel programming experience on three parallel machines in the Electrical and Computer Engineering Department: a 16-processor IBM SP-2 distributed memory machine, an 8-processor IBM J-40 shared memory machine, and an 8-processor SGI Origin 2000 distributed shared memory multiprocessor. In addition, students will use the machines in the Wilkinson Lab as a cluster and will have access to a dedicated 16-processor cluster for the final project.

**HOMEWORK ASSIGNMENTS:**

**Homework 1:** Design problems dealing with shared memory parallel programming, examples of program transformations to parallelize loops, explicit and implicit parallel programming using both PTHREADS and the SGI directives.

**Homework 2:** Design problems dealing with distributed memory message-passing parallel programming, use of MPI, analysis of communication patterns.

**Homework 3:** Design problems dealing with data parallel programming, use of High Performance Fortran, data layouts and alignments.

**Homework 4:** Design of parallel algorithms for various problems including matrix operations on dense and sparse matrices, analysis of parallel algorithms.

**LABORATORY PROJECTS:**

**Lab 1:** Development of a parallel program for solving a system of linear equations by Gaussian elimination, using both explicit parallel programming with PTHREADS and implicit parallel programming with SGI directives; experiments on the SGI Origin 2000 and IBM J-40 shared memory multiprocessors.

**Lab 2:** Development of a parallel program for solving a system of linear equations by Gaussian elimination, using message-passing parallel programming in C or Fortran with MPI; experiments on the IBM SP-2 distributed memory multiprocessor and the SGI Origin 2000 multiprocessor to test portability.

**Lab 3:** Development of a parallel program for solving a system of linear equations by Gaussian elimination, using data parallel programming with High Performance Fortran; experiments on the IBM SP-2 distributed memory multiprocessor.

**GRADES:**

- Four homeworks - 20%
- Three labs - 20%
- Midterm exam - 30%
- Final exam - 30%

**COURSE OBJECTIVES:** When a student completes this course, s/he should be able to:

- Solve a given problem using parallel computing. Analyze the problem for various ways of parallelization, and design the best parallel algorithm.
- Have a broad understanding of shared memory parallel architectures and programming.
- Design a shared memory parallel program for a given parallel algorithm using both explicit and implicit parallel programming, measure real speedups, identify bottlenecks, and devise improvements to the parallel program.
- Have a broad understanding of distributed memory parallel architectures and programming.
- Design a message-passing distributed memory parallel program for a given parallel algorithm using the portable Message-Passing Interface (MPI), measure real speedups, identify bottlenecks, and devise improvements to the parallel program.
- Have a broad understanding of data parallel architectures and programming.
- Design a data parallel program for a given parallel algorithm using High Performance Fortran (HPF), measure real speedups, identify bottlenecks, and devise improvements to the parallel program.

**ABET CONTENT CATEGORY:** 100% Engineering (Design component).