A generalization error for Q-learning