Parallel computing and GPU programming with Julia¶

Introduction¶

Alexis Montoison


There are many types of parallelism:

  • Instruction level parallelism (e.g. SIMD)
  • Multi-threading (shared memory)
  • Multi-processing (shared system memory)
  • Distributed processing (typically no shared memory)

And then there are highly-parallel hardware accelerators like GPUs.

Important: At the center of any efficient parallel code is a fast serial code!!!

When to go parallel?¶

  • If parts of your (optimized!) serial code aren't fast enough.
    • note that parallelization typically increases the code complexity.
  • If your system has multiple execution units (CPU cores, GPU streaming multiprocessors, ...).
    • particularly important on large supercomputers but also already on modern desktop computers and laptops.

How many CPU threads / cores do I have?¶

In [ ]:
using Hwloc
Hwloc.num_physical_cores()

Note that there may be more than one CPU thread per physical CPU core (e.g. hyperthreading).

In [ ]:
Sys.CPU_THREADS
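
Note that the number of threads Julia actually uses is fixed at startup (e.g. julia --threads=4 or the JULIA_NUM_THREADS environment variable) and can be queried with Threads.nthreads(); it may well be smaller than Sys.CPU_THREADS. As a quick check:

In [ ]:
using Base.Threads
nthreads()  # number of threads this Julia session was started with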

Amdahl's law¶

Naive strong scaling expectation: I have 4 cores, give me my 4x speedup!

If $p$ is the fraction of a code that can be parallelized, then the maximal theoretical speedup achievable with $n$ cores is given by $$ F(n) = \frac{1}{1 - p + p / n} $$

In [ ]:
using Plots

# Amdahl's law: maximal speedup when a fraction p of the code is parallelized over n cores
F(p, n) = 1 / (1 - p + p/n)

pl = plot()
for p in (0.5, 0.7, 0.9, 0.95, 0.99)
    plot!(pl, n -> F(p,n), 1:128, lab="p=$p", lw=2,
        legend=:topleft, xlab="number of cores", ylab="parallel speedup", frame=:box)
end
pl
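
For a concrete feel of these curves: with p = 0.95 the speedup saturates at 1/(1 - p) = 20, no matter how many cores are used. A quick numeric check, reusing F from the cell above:

In [ ]:
F(0.95, 16), F(0.95, 10^6)  # ≈ 9.1 on 16 cores, ≈ 20 in the many-core limit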

Parallel computing in Julia¶

Julia provides support for all types of parallelism mentioned above:

  • Instruction level parallelism (e.g. SIMD) → @simd, SIMD.jl, ...
  • Multi-threading (shared memory) → Base.Threads, ThreadsX.jl, FLoops.jl, ...
  • Multi-processing (shared system memory) → Distributed.jl, MPI.jl, ...
  • Distributed processing (typically no shared memory) → Distributed.jl, MPI.jl, ...
  • GPU programming → CUDA.jl, AMDGPU.jl, oneAPI.jl, KernelAbstractions.jl, ...
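
To give a flavor of the shared-memory model, here is a minimal multi-threading sketch with Base.Threads; it assumes Julia was started with several threads (e.g. julia --threads=4) to actually run in parallel, and simply runs serially with a single thread.

In [ ]:
using Base.Threads

# @threads splits the loop iterations across the available threads.
# Each iteration writes to its own slot of results, so there is no data race.
results = zeros(100)
@threads for i in 1:100
    results[i] = sum(abs2, rand(1_000))  # independent chunk of work
end
(nthreads(), sum(results))

Keeping iterations independent, each writing to a distinct slot, is the simplest way to avoid data races; higher-level patterns such as parallel reductions are what ThreadsX.jl and FLoops.jl provide on top of Base.Threads.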

Reference: JuliaUCL24