A general feature of reinforcement learning is the trade-off between exploration, in which the system tries out new kinds of actions to see how effective they are, and exploitation, in which the system makes use of actions that are known to yield a high reward.
Christopher M. Bishop - «Pattern Recognition and Machine Learning» (2006)
Exploration is where we gather more information that might lead us to better decisions in the future.
Exploitation is where we make the best decision given the information we currently have.
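
One common way to make the trade-off concrete is an epsilon-greedy strategy on a multi-armed bandit. The sketch below is only an illustration and is not taken from Bishop's text: the function name `epsilon_greedy`, its parameters, and the Bernoulli reward model are all assumptions chosen for the example. With probability `epsilon` the agent explores (picks a random arm); otherwise it exploits (picks the arm with the highest estimated reward so far).

```python
import random


def epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """Hypothetical helper: run epsilon-greedy on a Bernoulli bandit.

    true_means[i] is the (assumed) probability that arm i pays a reward of 1.
    Returns per-arm pull counts, per-arm reward estimates, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a random arm to gather more information.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: take the best arm given current estimates
            # (ties are broken in favour of the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


# Pure exploitation (epsilon=0) never leaves the first arm it tries,
# even though the second arm always pays out in this toy setup.
greedy_counts, _, _ = epsilon_greedy([0.0, 1.0], epsilon=0.0, steps=100)

# A small amount of exploration lets the agent discover the better arm.
mixed_counts, _, _ = epsilon_greedy([0.0, 1.0], epsilon=0.1, steps=1000)
```

In this toy setup, setting `epsilon` to 0 corresponds to exploitation only, and setting it to 1 corresponds to exploration only; values in between trade some immediate reward for information about the other arms.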