Abstract :
[en] Software and regulatory environments today face large volumes of unlabeled, unstructured natural-language text, such as agile User Stories (US) and reporting obligations, which must be accurately interpreted, categorized, and structured to support downstream analysis, reuse, and compliance. Natural language processing (NLP) techniques are among the most effective solutions to these challenges. This thesis addresses key issues in applying NLP techniques to two specific domains: agile software development, focusing on US classification, and regulatory compliance, focusing on information extraction (IE) from reporting obligations. Our contributions include a novel evaluation of fine-tuning for US classification, a hybrid IE pipeline for regulatory texts, and a publicly released annotated dataset of reporting obligations to facilitate future research.
Institution :
ULiège - University of Liège [HEC Management School, ULiège], Liège, Belgium