Implements reinforcement learning environments and algorithms as described in Sutton & Barto (1998, ISBN:0262193981). The Q-Learning algorithm can be used with function approximation, eligibility traces (Singh & Sutton (1996) <doi:10.1007/BF00114726>) and experience replay (Mnih et al. (2013) <arXiv:1312.5602>).
| Label | Latest Version |
|---|---|
| main | 0.2.1 |
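
As a rough illustration of the Q-Learning update named in the description, here is a minimal tabular sketch in plain Python. It does not use this package's API. The corridor environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions made for the example. Function approximation, eligibility traces, and experience replay are deliberately left out.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in state 0 and receives reward 1 when it reaches
    the last state, which ends the episode. Actions: 0 = left, 1 = right.
    """
    rng = random.Random(seed)
    moves = (-1, 1)
    terminal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        while s != terminal:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])

            # Deterministic transition, clipped to the corridor bounds.
            s_next = min(max(s + moves[a], 0), terminal)
            r = 1.0 if s_next == terminal else 0.0

            # Q-learning target: bootstrap from the greedy value of the
            # next state; the terminal state has value zero.
            bootstrap = 0.0 if s_next == terminal else gamma * max(q[s_next])
            q[s][a] += alpha * (r + bootstrap - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(range(2), key=lambda a: row[a]) for row in q[:-1]]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

The update is off-policy: the behaviour is epsilon-greedy, while the target always uses the maximum action value in the next state. In this toy corridor the greedy policy should end up moving right in every non-terminal state.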