Large Language Models (LLMs) have advanced rapidly in capabilities such as text and code understanding, leading to their widespread use in applications including healthcare, education, and search. Due to the critical nature …
The Securities and Exchange Board of India (SEBI) is the regulatory body for securities and commodities in India. SEBI creates and enforces regulations that must be followed by all listed companies. To the best of our knowledge, this is the first …
This work presents a system that uses language models to semantically process SEBI documents and produce enriched regulations containing timelines of amendments and cross-references to legal case files.
Character arcs are important theoretical devices employed in literary studies to understand character journeys, identify tropes across literary genres, and establish similarities between narratives. This work addresses the novel task of …
This work explores the gains from Task Adaptive Pretraining (TAPT) prior to fine-tuning Transformer-based architectures. It builds upon an architecture that takes emojis and segmented hashtags into consideration for classification and experimentally demonstrates the performance improvements due to TAPT.
This work leverages Transformer language models to identify hate speech in a multilingual setting. With a pre-trained multilingual Transformer-based text encoder at the base, the approach successfully identifies and classifies hate speech from multiple languages.
This paper linguistically analyzes what constitutes an event in this language and the challenges faced with discourse-level annotation and representation due to the language's rich derivational morphology; the analysis can be used for semantic annotation and corpus development for other tasks in the language.