Want to train a machine translation system? Train it on a gazillion pairs of sentences of parallel corpora, and that creates a lot of breakthrough results. Increasingly, I'm seeing results on small data where you want to try to take in results even if you have 1,000 images.