Natural Language Inference
What is Natural Language Inference?
Given two sentences $S_1$ and $S_2$, Natural Language Inference (NLI) is the task of determining whether a hypothesis is true given a premise. Consider the following examples, with $S_1$ acting as the hypothesis and $S_2$ acting as the premise.
$S_1$ | $S_2$ | NLI Tag |
---|---|---|
Some men are playing a sport | 22 male players are playing cricket | Entailment |
Harry Potter hates all sports | All the students at Hogwarts except Hermione and some Ravenclaw boys love quidditch | Contradiction |
Rory is the last centurion | Rose has been on the TARDIS | Neutral |
NLI need not be a strictly logical conclusion. The task is loosely defined: if an average person proficient in the language can verify the hypothesis given the premise, we say that the premise entails the hypothesis, and similarly for the other relation labels.
NLI thus seems necessary for what has been termed the “Holy Grail of NLP”, Natural Language Understanding (NLU).
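As a rough sketch, the three labelled pairs from the table above could be held in a small Python structure like the one below. The names `Label`, `NLIExample` and `EXAMPLES` are illustrative assumptions, not part of any standard NLI library.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The three NLI tags used in the table."""
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class NLIExample:
    hypothesis: str  # S_1 in the table
    premise: str     # S_2 in the table
    label: Label


EXAMPLES = [
    NLIExample(
        hypothesis="Some men are playing a sport",
        premise="22 male players are playing cricket",
        label=Label.ENTAILMENT,
    ),
    NLIExample(
        hypothesis="Harry Potter hates all sports",
        premise=("All the students at Hogwarts except Hermione and "
                 "some Ravenclaw boys love quidditch"),
        label=Label.CONTRADICTION,
    ),
    NLIExample(
        hypothesis="Rory is the last centurion",
        premise="Rose has been on the TARDIS",
        label=Label.NEUTRAL,
    ),
]

for example in EXAMPLES:
    print(f"{example.label.value}: {example.premise} -> {example.hypothesis}")
```

An NLI system is then any function that takes the premise and the hypothesis and tries to predict the label.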
What makes NLI hard?
Consider the second example in the table above. To generate the right answer, a system would need to make inferences along the following lines:
- Harry Potter is a student in Hogwarts
- Harry Potter is in Gryffindor
- Harry Potter is not Hermione
- Harry Potter is not in Ravenclaw
- Quidditch is a sport
- Harry Potter is a student at Hogwarts and is neither Hermione nor a Ravenclaw boy, so he must love quidditch, which is a sport
- But the hypothesis states that he hates all sports, and thus hates quidditch
Only then can you identify the contradiction and label it correctly. Thus you not only need to get the semantic component right, but you must also pay attention at the pragmatic level and enhance your model or algorithm with ontologies and with knowledge of metonymy, hypernymy and hyponymy.
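To make the chain of reasoning concrete, here is a minimal sketch in Python. The background facts (`HOGWARTS_HOUSE`, `SPORTS`) and both function names are hand-written assumptions for this one example. A real system would have to obtain such knowledge from an ontology or learn it.

```python
from typing import Optional

# Hypothetical background knowledge a system would need. None of it is
# stated in the premise or in the hypothesis themselves.
HOGWARTS_HOUSE = {
    "Harry Potter": "Gryffindor",
    "Hermione Granger": "Gryffindor",
}
SPORTS = {"quidditch"}


def loves_quidditch(person: str) -> Optional[bool]:
    """Apply the premise; return None when the premise does not decide."""
    house = HOGWARTS_HOUSE.get(person)
    if house is None:
        return None  # not known to be a Hogwarts student
    if person == "Hermione Granger" or house == "Ravenclaw":
        # Conservative reading: anyone who might fall under the
        # exceptions named in the premise is left undecided.
        return None
    return True


def label_hates_all_sports(person: str) -> str:
    """Label the hypothesis '<person> hates all sports' against the premise."""
    if "quidditch" in SPORTS and loves_quidditch(person):
        # The person loves at least one sport, so "hates all sports" fails.
        return "contradiction"
    return "neutral"


print(label_hates_all_sports("Harry Potter"))
```

Every step in the bullet list above corresponds to a lookup or a condition here, which shows how much world knowledge is hidden behind a single label.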
Beyond this, a good NLI system also needs a variety of other cognitive abilities, such as arithmetic and color abstraction. Consider the following sentence.
Two men and a woman were driving a teal car
Then the hypothesis “three people were driving a bluish vehicle” is also true.
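A toy sketch of the two abilities involved, counting and abstraction, might look like the following. The dictionaries `NUMBER_WORDS` and `HYPERNYMS` and the helper names are assumptions made for this single sentence and are nowhere near a general solution.

```python
# Hand-written lookup tables, assumed for illustration only.
NUMBER_WORDS = {"a": 1, "one": 1, "two": 2, "three": 3}
HYPERNYMS = {
    "man": "person", "men": "person",
    "woman": "person", "women": "person",
    "car": "vehicle",   # a car is a kind of vehicle
    "teal": "bluish",   # color abstraction
}


def generalize(word: str) -> str:
    """Map a word to its more general term, or return it unchanged."""
    word = word.lower()
    return HYPERNYMS.get(word, word)


def count_people(sentence: str) -> int:
    """Add up the quantities that directly precede a word for a person."""
    tokens = sentence.lower().split()
    total = 0
    for quantity, noun in zip(tokens, tokens[1:]):
        if quantity in NUMBER_WORDS and generalize(noun) == "person":
            total += NUMBER_WORDS[quantity]
    return total


premise = "Two men and a woman were driving a teal car"
print(count_people(premise), generalize("teal"), generalize("car"))
```

The arithmetic (2 + 1 = 3) and the two abstractions (teal to bluish, car to vehicle) are exactly what the hypothesis relies on.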
Applications of NLI
All semantic tasks stand to benefit greatly from NLI. Among these, some areas of particular significance are
- Question Answering
- Search and Information Retrieval
- Automatic Summarization
NLI can also be used for paraphrase detection. Two sentences are paraphrases of each other if $S_1$ entails $S_2$ and vice versa. That is to say, if $S_1$ can be inferred from $S_2$ and $S_2$ can be inferred from $S_1$, then they are semantically equivalent and thus are paraphrases of each other.
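The bidirectional definition translates almost directly into code. In the sketch below, `is_paraphrase` accepts any entailment function, and `toy_entails` is only a crude stand-in (a word-subset check) assumed for demonstration. It is not a real NLI model.

```python
from typing import Callable

# An entailment function takes (premise, hypothesis) and returns True
# when the premise entails the hypothesis.
EntailmentFn = Callable[[str, str], bool]


def is_paraphrase(s1: str, s2: str, entails: EntailmentFn) -> bool:
    """Two sentences are paraphrases if each one entails the other."""
    return entails(s1, s2) and entails(s2, s1)


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Crude stand-in: every hypothesis word must occur in the premise."""
    return set(hypothesis.lower().split()) <= set(premise.lower().split())


print(is_paraphrase("john met mary today", "today john met mary", toy_entails))
print(is_paraphrase("the black cat sat", "the cat sat", toy_entails))
```

In the second pair the entailment holds in one direction only, so the sentences are not treated as paraphrases.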
Paraphrase detection has a very interesting application in the evaluation of Machine Translation (MT) systems [1]. MT systems are usually evaluated with metrics like BLEU, but the issue is that models now often optimize for the BLEU score itself rather than for the underlying semantic content.
How is NLI being done right now?
This post explores non-neural approaches to inference. Neural architectures deserve a separate blog post, and will be part 2 of this series.
Non-Neural Approaches
Bag Of Words
Here you try to map each word in the premise to a word in the hypothesis. You essentially have two bags of words, try to find the most similar pairs, and check whether the meaning of the hypothesis words is subsumed by the meaning of the premise words.
Advantages
It is fairly robust and can deal with lexical dissimilarities such as
- Increased ⇒ Grows
- Reported ⇒ Saw
- Companies ⇒ Google
Disadvantages
- For sentences like “Ram killed Ravan” and “Ravan killed Ram”, it gives the same result because the sentences contain identical words. The theta roles, or semantic roles, are ignored by this approach.
- Another shortcoming of this approach is the handling of negation. “Luke, I am your father” and “Luke, I am not your father” are very similar on the surface, but they convey opposite things.
- Quantifiers are another major issue. Switching “most” to “every” drastically changes the meaning, but the two words still have very high similarity scores.
It is possible to mitigate some of these errors by giving the system syntactic and semantic information about the action roles and about the domains and constraints of each word, but this makes the system brittle and very tedious to maintain and improve. Moreover, it cannot generalize well to situations not encountered before. In spite of all these issues, for most general use cases it remains a decent baseline and can be used in a pinch.
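A minimal sketch of such a baseline is shown below. It assumes a tiny hand-written similarity table (`LEXICON`) in place of real word vectors, and scores a pair by averaging, over the hypothesis words, the best match found in the premise. Both the table values and the function names are illustrative assumptions.

```python
# Hypothetical similarity table: (premise word, hypothesis word) -> score.
# A real system would use a thesaurus or word embeddings instead.
LEXICON = {
    ("increased", "grows"): 0.9,
    ("reported", "saw"): 0.7,
}


def word_similarity(premise_word: str, hypothesis_word: str) -> float:
    """Return 1.0 for identical words, else a table lookup (default 0.0)."""
    if premise_word == hypothesis_word:
        return 1.0
    return LEXICON.get((premise_word, hypothesis_word), 0.0)


def bow_score(premise: str, hypothesis: str) -> float:
    """Average, over hypothesis words, of the best-matching premise word."""
    premise_words = premise.lower().split()
    hypothesis_words = hypothesis.lower().split()
    if not premise_words or not hypothesis_words:
        return 0.0
    best_matches = [
        max(word_similarity(p, h) for p in premise_words)
        for h in hypothesis_words
    ]
    return sum(best_matches) / len(best_matches)


# Word order is ignored, so swapping the roles changes nothing.
print(bow_score("Ram killed Ravan", "Ravan killed Ram"))
# The negation is simply an extra premise word that nothing penalizes.
print(bow_score("Luke I am not your father", "Luke I am your father"))
```

Both calls return a perfect score, which reproduces the semantic-role and negation failures described above.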
Logic Approaches
There are mainly two types of logical inferences.
- First-order logic
- Natural logic
In first-order logic, you create axioms or rules and use them to infer conclusions. But these formal approaches are not very adaptive and cannot handle the complexities and intricacies: