Near-optimality for multi-action multi-resource restless bandits with many arms
We consider multi-action restless bandits with multiple resource constraints, also referred to as weakly coupled Markov decision processes. This problem is important in recommender systems, active learning, revenue management, and many other areas. In theory, an optimal policy can be found by sol...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis Book |
Language: | English |
Published: | Ann Arbor, Michigan : ProQuest Information and Learning, 2022 |
Subjects: |