Institute for Social Research API | From Ads to Interventions: Contextual Bandits in Mobile Health

The first paper on contextual bandits was written by Michael Woodroofe in 1979 (Journal of the American Statistical Association, 74(368), 799-806, 1979), but the term “contextual bandits” was coined only in 2008 by Langford and Zhang (Advances in Neural Information Processing Systems, pages 817-824, 2008). Woodroofe’s motivating application was clinical trials, whereas modern interest in the problem has been driven to a great extent by problems on the internet, such as online ad and news article placement. We have now come full circle, because contextual bandits provide a natural framework for sequential decision making in mobile health. We will survey the contextual bandits literature with a focus on the modifications needed to adapt existing approaches to the mobile health setting. We discuss specific challenges in this direction, such as good initialization of the learning algorithm, finding interpretable policies, assessing the usefulness of tailoring variables, computational considerations, robustness to failure of assumptions, and dealing with variables that are costly to acquire or missing.