Abstract:
In this paper, we address the problem of Rationale Extraction (RE) in Natural Language Processing: given a context ($C$), a related question ($Q$), and its answer ($A$), the task is to find the best sentence-level rationale ($R^*$). This rationale is loosely defined as the subset of sentences of the context $C$ minimally required to produce $A$.
We constructed a dataset in which each entry is a quadruple ($C$, $Q$, $A$, $R^*$) in order to explore different methods in the particular case where the answer consists of one or more full sentences.
The methods studied are based on TF-IDF scores, embedding similarity, classifiers, and attention; they were evaluated using a sentence-level overlap metric akin to Intersection over Union (IoU).
Results show that the classifier-based approach achieved the best scores. Additionally, we observe that finding $R^*$ becomes increasingly difficult as the number of sentences in the context grows. Finally, for the attention-based method, we highlight a correlation between its performance and the ability of the underlying large language model to produce, given $C$ and $Q$, an answer similar to $A$.