An asynchronous scheduler using MPI for Adaptive
Adaptive Scheduler solves the following problem: you need to run a few hundred learners and can use more than 1,000 cores. A centrally managed process that is responsible for all the workers (as with dask or ipyparallel) is not an option, because more than 1,000 cores is too many for it to handle. You also don't want to run dask or ipyparallel inside a job script, because they write job scripts of their own, and you would end up with a job script that runs code that creates job scripts. With adaptive-scheduler you only need to define the learners, and it then takes care of running (and restarting) the jobs on the cluster.
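
To make the "define the learners, let the scheduler run and restart the jobs" idea concrete, here is a toy sketch in plain Python. It is not adaptive-scheduler's API or implementation: `ToyLearner`, `run_job`, and `manage` are hypothetical names invented for this illustration, job crashes are simulated with a random number generator, and jobs run one after another rather than concurrently on a cluster.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass
class ToyLearner:
    """Hypothetical stand-in for a learner: a name plus a progress counter."""

    name: str
    npoints: int = 0
    goal: int = 50

    def done(self) -> bool:
        return self.npoints >= self.goal


def run_job(learner: ToyLearner, rng: random.Random, fail_prob: float = 0.2) -> bool:
    """Simulate one cluster job that works on a single learner.

    Returns True if the job finished the learner and False if it "crashed".
    Progress made before a crash is kept, as if the learner were saved
    periodically.
    """
    while not learner.done():
        if rng.random() < fail_prob:
            return False  # simulated crash (assumption: e.g. node failure)
        learner.npoints += 10
    return True


def manage(learners: list[ToyLearner], seed: int = 0) -> dict[str, int]:
    """Keep (re)starting one job per unfinished learner until all are done.

    Returns how many times a job was started for each learner.
    """
    rng = random.Random(seed)
    starts = {learner.name: 0 for learner in learners}
    while any(not learner.done() for learner in learners):
        for learner in learners:
            if not learner.done():
                starts[learner.name] += 1
                run_job(learner, rng)
    return starts


if __name__ == "__main__":
    learners = [ToyLearner(f"learner_{i}") for i in range(5)]
    print(manage(learners))
```

The point of the sketch is the division of labour: the user only supplies the list of learners, while the managing loop tracks which learners are unfinished and starts a fresh job for each of them, so a crashed job costs only the progress made since the last save.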