Specialized linear algebra library, tailored for computational materials science codes.
SPLA provides specialized functions for distributed matrix multiplication, as required in some computational materials science codes. It aims to overlap computation and communication as much as possible and, if compiled with GPU support, accepts any combination of host and device pointers. C++, C and Fortran interfaces are available.