Estimate an optimal dynamic treatment regime using Interactive Q-learning.
To install this package with conda, run:
conda install r::r-iqlearn