Abstract:
In this paper, we address the problem of Rationale Extraction (RE) in Natural Language Processing: given a context ($C$), a related question ($Q$), and its answer ($A$), the task is to find the best sentence-level rationale ($R^*$). This rationale is loosely defined as the subset of sentences of the context $C$ minimally required to produce $A$.
We constructed a dataset in which each entry is a quadruple ($C$, $Q$, $A$, $R^*$) in order to explore different methods in the particular case where the answer consists of one or more full sentences.
The methods studied are based on TF-IDF scores, embedding similarity, classifiers, and attention; they were evaluated using a sentence-level overlap metric akin to Intersection over Union (IoU).
Results show that the classifier-based approach achieved the best scores. Additionally, we observe that finding $R^*$ becomes increasingly difficult as the number of sentences in the context grows. Finally, for the attention-based method, we highlight a correlation between its performance and the ability of the underlying large language model to produce, given $C$ and $Q$, an answer similar to $A$.