Hierarchical Generalization without Hierarchical Bias

Photo by David Ballew on Unsplash

Our project studies how recurrent neural networks perform on tasks that require hierarchical generalisation. We focus on the specific case of transforming English declarative sentences into polar questions. We generate the declarative sentences with context-free grammars and transform them with a deterministic rule to produce the data. We then train seq2seq models on the generated data. We evaluate the trained models on a test set with sentences similar to those seen during training, and on a generalisation set where high accuracy requires the model to have learned the correct hierarchical rule. We repeat the task for multiple models and observe the effect of different random initialisations on the network.
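To make the data setup concrete, here is a minimal sketch of how such sentence pairs could be produced. It is not the project's actual grammar or code: the vocabulary, the function names (`noun_phrase`, `declarative`, `move_main`, `move_first`) and the hand-written generator standing in for a context-free grammar are all assumptions made for illustration. It contrasts the hierarchical rule (front the main-clause auxiliary) with a linear alternative (front the first auxiliary), which agree on simple sentences and differ once the subject contains a relative clause.

```python
import random

# Hypothetical toy vocabulary; the project's real grammar is not given in the text.
NOUNS = ["dog", "cat", "bird"]
AUXES = ["can", "will", "could"]
VERBS = ["swim", "sleep", "run"]


def noun_phrase(rng, with_relative_clause):
    """Build 'the N' or 'the N that AUX V' as a list of tokens."""
    tokens = ["the", rng.choice(NOUNS)]
    if with_relative_clause:
        tokens += ["that", rng.choice(AUXES), rng.choice(VERBS)]
    return tokens


def declarative(rng, subject_rc=False):
    """Return (tokens, index of the main-clause auxiliary)."""
    subject = noun_phrase(rng, subject_rc)
    tokens = subject + [rng.choice(AUXES), rng.choice(VERBS)]
    return tokens, len(subject)


def move_main(tokens, main_aux_index):
    """Hierarchical rule: front the auxiliary of the main clause."""
    rest = tokens[:main_aux_index] + tokens[main_aux_index + 1:]
    return [tokens[main_aux_index]] + rest


def move_first(tokens):
    """Linear rule: front whichever auxiliary comes first in the string."""
    first = next(i for i, tok in enumerate(tokens) if tok in AUXES)
    return [tokens[first]] + tokens[:first] + tokens[first + 1:]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the generated data is reproducible
    for subject_rc in (False, True):
        sentence, main_idx = declarative(rng, subject_rc)
        print(" ".join(sentence))
        print("  main-aux rule: ", " ".join(move_main(sentence, main_idx)))
        print("  first-aux rule:", " ".join(move_first(sentence)))
```

Under this sketch, training pairs would use sentences on which both rules give the same question, while the generalisation set would use subject relative clauses, where only the main-auxiliary rule yields the correct output.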

Ujwal Narayan
SDE (ML)

My research interests include narrative understanding, applications of NLP over long documents, language theory, and exploring LLMs and making them more interpretable with a focus on factuality.
