Portable and efficient thread pool implementation
conda install mark.harfouche::pthreadpool
pthreadpool provides functionality similar to OpenMP's #pragma omp parallel for, but with additional features.
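To make the comparison with #pragma omp parallel for concrete, here is a minimal sketch of the pattern a thread pool of this kind replaces. The loop body becomes a per-index task function that receives a context pointer, and a dispatcher runs it over a range. The names add_context, add_task, task_1d_fn, and run_1d are hypothetical and chosen only for illustration. They are not pthreadpool's API, and the dispatcher here is built on the OpenMP pragma itself rather than on a thread pool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context struct: everything the loop body needs to read or write. */
struct add_context {
    const float *a;
    const float *b;
    float *sum;
};

/* The loop body, expressed as a task invoked once per index i. */
static void add_task(void *context, size_t i) {
    struct add_context *ctx = context;
    assert(ctx != NULL);
    ctx->sum[i] = ctx->a[i] + ctx->b[i];
}

/* Hypothetical task signature: a context pointer plus a 1D index. */
typedef void (*task_1d_fn)(void *context, size_t i);

/*
 * Hypothetical dispatcher (NOT pthreadpool's API): calls task(context, i)
 * for every i in [0, range). With OpenMP enabled the iterations are split
 * across threads; without it the pragma is ignored and the loop runs serially.
 */
static void run_1d(task_1d_fn task, void *context, size_t range) {
    long n = (long)range;
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        task(context, (size_t)i);
    }
}
```

Calling run_1d(add_task, &ctx, n) with a filled-in add_context adds the two arrays element by element. Building with an OpenMP flag such as -fopenmp (GCC, Clang) makes the loop parallel. A thread pool library offers the same function-plus-context-plus-range shape without requiring OpenMP support from the compiler.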