We study bandit algorithms for e-commerce recommender systems. The question we pose is whether it is necessary to consider reinforcement learning effects in recommender systems. A key reason to introduce a recommender system on a product page of an e-commerce site is to increase the order value by improving the chance of an upsell. If the recommender system merely predicts the next purchase, it may have no positive effect on the order value at all, since it then predicts sales that would have happened independently of the recommender system. What we are really looking for are the false negatives of such a predictor, i.e., purchases that happen only as a consequence of the recommender system. These purchases constitute the entire uplift and should manifest as reinforcement learning effects. This effect cannot be observed in a simulation of the site, since a simulation replays user behavior recorded without the recommender's influence and therefore contains no reinforcement learning effects. An attribution model must capture this uplift to guarantee an increased order value; however, such an attribution model is impractical due to data sparsity. Given this starting point, we study some standard attribution models for e-commerce recommender systems and describe how these fare when applied in a reinforcement learning algorithm, both in a simulation and on live sites.
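
To make the distinction between prediction and uplift concrete, one way to formalize it (our notation, introduced here purely for illustration) is

\[
\mathrm{uplift}(i) \;=\; \Pr[\text{purchase of } i \mid i \text{ recommended}] \;-\; \Pr[\text{purchase of } i \mid i \text{ not recommended}].
\]

A pure next-purchase predictor ranks items by the first term alone, so it can score highly on items for which the second term is just as large, i.e., items with zero uplift.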
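The following is a minimal sketch, assuming a Bernoulli Thompson-sampling bandit and a toy two-arm user model, of how the choice of attribution model interacts with a reinforcement learning algorithm; the arm names, the `uplift` parameter, and both attribution functions are our own illustrative assumptions, not the algorithms studied here.

```python
import random

def thompson_pick(successes, failures):
    """Sample each arm's Beta posterior and pick the arm with the largest draw."""
    samples = {arm: random.betavariate(s + 1, failures[arm] + 1)
               for arm, s in successes.items()}
    return max(samples, key=samples.get)

def simulate(attribute, uplift, rounds=10_000, seed=0):
    """Run the bandit against a toy user model.

    `uplift` is the extra purchase probability caused by arm "b"'s
    recommendation; with uplift=0.0 the environment contains no
    reinforcement learning effect, mirroring the point in the text.
    """
    random.seed(seed)
    arms = ["a", "b"]
    successes = {arm: 0 for arm in arms}
    failures = {arm: 0 for arm in arms}
    for _ in range(rounds):
        arm = thompson_pick(successes, failures)
        organic = random.random() < 0.05                   # purchase that happens anyway
        caused = arm == "b" and random.random() < uplift   # purchase caused by the rec
        if attribute(organic, caused):
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

# "Last-touch"-style attribution credits every observed purchase to the shown
# recommendation, whether or not the recommendation caused it.
last_touch = lambda organic, caused: organic or caused
# An ideal (but data-hungry) uplift attribution credits only caused purchases.
uplift_only = lambda organic, caused: caused

if __name__ == "__main__":
    print(simulate(last_touch, uplift=0.0))    # no RL effect: both arms look alike
    print(simulate(uplift_only, uplift=0.02))  # RL effect: arm "b" wins clearly
```

Under last-touch attribution in an environment without reinforcement learning effects (uplift = 0), the reward distribution is identical across arms, so the bandit has nothing to learn; only an uplift-aware attribution in an environment with a genuine causal effect separates the arms.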