Transformer architecture from the paper 'Attention Is All You Need' (Vaswani et al., 2017).
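
The core operation of that architecture is scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V. The following is a minimal sketch of that formula in plain Python, not the paper's reference implementation. The function names `softmax` and `scaled_dot_product_attention` are chosen here for illustration, and the sketch assumes a single head with no masking, no batching, and no learned projections.

```python
import math


def softmax(xs):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    queries, keys, and values are lists of row vectors (lists of floats).
    keys and values must contain the same number of rows.
    """
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    d_v = len(values[0])
    output = []
    for q in queries:
        # Scaled dot product of this query with every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Attention output: weighted sum of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return output
```

When all keys are identical, every key receives the same weight, so each output row is the plain average of the value vectors. This gives a quick sanity check for the sketch.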