Teaching

Introduction to parallel programming

A hands-on session that starts from a Python image-processing code, offloads its hot spot to a C function, and then optimizes that function using OpenMP task parallelism, then OpenMP data parallelism, and finally vector instructions.

Subject / Source and results

Here are some results on a dual-X5670 (2×6 cores @2.9GHz):

  • From naive Python to naive C: 233x
  • From naive C to task-based parallelism: 2.1x (493x over Python)
  • From task-based parallelism to data parallelism: 7.6x (3758x over Python)
  • From data parallelism to SSE2 vectorization: 13.46x (50612x over Python)
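
To give a rough idea of what the C-side steps listed above can look like, here is a minimal sketch. It is not the course code: the per-pixel kernel (inverting an 8-bit grayscale image) and the function names `invert_serial`, `invert_tasks`, `invert_parallel_for` and `invert_parallel_simd` are hypothetical, chosen only for illustration. The last variant uses the OpenMP `simd` clause as a vectorization hint to the compiler, which is a simpler stand-in for the hand-written SSE2 code mentioned in the results.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hot spot: invert every pixel of an 8-bit grayscale image. */
void invert_serial(const uint8_t *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(255 - src[i]);
}

/* Task parallelism: one thread creates a task per chunk of pixels,
 * and the other threads of the team execute those tasks. */
void invert_tasks(const uint8_t *src, uint8_t *dst, size_t n, size_t chunk)
{
    if (chunk == 0)
        chunk = n;  /* avoid an infinite loop on a zero chunk size */

    #pragma omp parallel
    #pragma omp single
    {
        for (size_t start = 0; start < n; start += chunk) {
            size_t len = (n - start < chunk) ? n - start : chunk;
            #pragma omp task firstprivate(start, len)
            invert_serial(src + start, dst + start, len);
        }
    }   /* implicit barrier: all tasks are finished past this point */
}

/* Data parallelism: the loop iterations are split among the threads. */
void invert_parallel_for(const uint8_t *src, uint8_t *dst, size_t n)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++)
        dst[i] = (uint8_t)(255 - src[i]);
}

/* Data parallelism plus a vectorization hint (assumes OpenMP 4.0 or later).
 * The compiler picks the vector instructions; this is not explicit SSE2. */
void invert_parallel_simd(const uint8_t *src, uint8_t *dst, size_t n)
{
    #pragma omp parallel for simd
    for (long i = 0; i < (long)n; i++)
        dst[i] = (uint8_t)(255 - src[i]);
}
```

With GCC, something like `gcc -O2 -fopenmp` enables the pragmas; without `-fopenmp` they are ignored and every variant runs sequentially, which is handy for checking that all four produce the same image. The speedups obtained this way depend on the kernel and the machine, and are not expected to match the figures above.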

Introduction to Object-Oriented Programming with Java

A 40-hour, 5-day seminar. More details to come…