Dependent Transformers

Optimus Prime in Transformers: Age of Extinction (2014)

Transformer-based models have surpassed RNN-based neural machine translation models in translation quality. While they achieve high scores when trained on large amounts of data, there is still scope for improvement in low- and moderate-resource settings. We propose incorporating syntactic information explicitly into the training process using dependency parse trees. We explore two methods of augmenting the training data with parse trees and report results on subsets of the Europarl corpus. We also evaluate how well the self-attention matrices capture syntax by inducing dependency trees from them. We find that in this training regime, the use of explicit syntactic information does not improve the performance of Transformer models in a low-resource setting. Our data and trained models are available here.
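The tree-induction step can be illustrated with a small sketch. This is not the exact procedure from the paper: it assumes self-attention weights are read as head-dependent arc scores and decoded greedily per token, whereas the actual evaluation may pick a specific layer/head and use a proper maximum-spanning-tree decoder (e.g. Chu-Liu/Edmonds).

```python
import numpy as np

def induce_tree_from_attention(attn, root=0):
    """Greedily induce a dependency tree from a self-attention matrix.

    attn: (n, n) array where attn[i, j] is the attention weight token i
    places on token j, interpreted here as the score of the arc j -> i.
    Returns `heads`, where heads[i] is the head index of token i and the
    root points to itself.
    """
    scores = attn.astype(float).copy()
    np.fill_diagonal(scores, -np.inf)   # a token cannot head itself
    heads = scores.argmax(axis=1)       # highest-scoring head per token
    heads[root] = root                  # fix the chosen root
    return heads.tolist()

# Toy example with 4 tokens and a few strong attention links.
attn = np.array([
    [0.10, 0.60, 0.20, 0.10],
    [0.70, 0.05, 0.15, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.20, 0.20, 0.55, 0.05],
])
print(induce_tree_from_attention(attn, root=1))  # -> [1, 1, 1, 2]
```

Greedy head selection can produce cycles on adversarial inputs, so an MST decoder is the safer choice when comparing induced trees against gold parses with UAS.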

Ujwal Narayan
SDE (ML)

My research interests include narrative understanding, applications of NLP to long documents, language theory, and making LLMs more interpretable, with a focus on factuality.
