Description
WENO schemes are, in general, nonlinear procedures (because of the smoothness-indicator computation) that can take a non-negligible amount of CPU time. However, WENO procedures typically operate on a very small stencil which, at first glance, does not seem to offer much room for parallel scaling.
As Rouson clearly states in his great book, premature optimization can be very dangerous: for the moment we do not care about possible performance bottlenecks on parallel architectures. However, I would like to use this issue to discuss future strategies for supporting parallel architectures.
As a first guess, I suppose that the main parallel features the library should provide as soon as possible are:
- being thread safe: WENO generally operates on a small stencil that is typically a slice of a larger block; the procedures operating on this parent block are generally well suited to shared-memory (threaded) parallelism, so ensuring the thread safety of our library could be crucial;
- exploiting vectorization: this could greatly speed up the interpolation.
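To make the thread-safety point concrete, here is a minimal, language-agnostic sketch written in Python. It is not the library's API: the names `weno3_interface`, `reconstruct_serial` and `reconstruct_threaded`, the choice of a third-order scheme with the classic linear weights 1/3 and 2/3, and the value of `EPS` are all assumptions made for illustration. The stencil-level procedure is a pure function (no module-level or saved state is modified), so the loop over the parent block can be handed to several threads without any locking.

```python
from concurrent.futures import ThreadPoolExecutor

EPS = 1.0e-6  # small constant avoiding division by zero (assumed value)


def weno3_interface(um, u0, up):
    """Third-order WENO value at the right interface of the central cell.

    Hypothetical helper: it reads only its arguments and the constant EPS,
    and writes only local variables, so concurrent calls cannot interfere.
    """
    # Candidate polynomials on the two 2-point sub-stencils.
    p0 = -0.5 * um + 1.5 * u0
    p1 = 0.5 * u0 + 0.5 * up
    # Smoothness indicators: the source of the nonlinearity.
    beta0 = (u0 - um) ** 2
    beta1 = (up - u0) ** 2
    # Unnormalized nonlinear weights built from the linear weights 1/3, 2/3.
    a0 = (1.0 / 3.0) / (EPS + beta0) ** 2
    a1 = (2.0 / 3.0) / (EPS + beta1) ** 2
    return (a0 * p0 + a1 * p1) / (a0 + a1)


def reconstruct_serial(block):
    """Apply the stencil procedure to every interior cell of the parent block."""
    return [weno3_interface(block[i - 1], block[i], block[i + 1])
            for i in range(1, len(block) - 1)]


def reconstruct_threaded(block, n_threads=4):
    """Same loop, with the interior cells distributed over a thread pool."""
    def one_cell(i):
        return weno3_interface(block[i - 1], block[i], block[i + 1])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # pool.map returns results in the order of the input indices.
        return list(pool.map(one_cell, range(1, len(block) - 1)))
```

The threaded version is only meant to show that the results do not depend on how the parent-block loop is split among threads; in CPython it is not expected to run faster. The same property (each cell depends only on its own stencil, with no shared mutable state) is also what would let a compiler vectorize the equivalent loop in a compiled language.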
In this context, your experience (I am thinking of Zaak, Andrea, Francesco, Rouson, Muller, Americi and many other members of our group) is very important. Please feel free to post any pertinent comments.