BERT; Llama 2; LLMs; Mistral; Neural networks; Random forests; Soft goals; SVM; Text classification; User stories; Theoretical Computer Science; Computer Science (all)
Abstract :
[en] We address the problem of classifying Capabilities, Tasks, Hard-goals, and Soft-goals in user stories. Such a classification is essential for generating Rationale Trees. Several articles have attempted to classify various aspects of user stories in the past; however, classifying the Capability, Task, Hard-goal, and Soft-goal classes has been largely overlooked. To this end, we present three pipelines. The first two rely on standard machine learning methods and differ in how they represent features, i.e. bag-of-words vs. embeddings from deep learning models. Our third pipeline explores a recent NLP development, viz. few-shot classification with two LLMs, Mistral and Llama. Our experiments reveal that using deep learning embeddings as features for classical machine learning methods significantly improves performance, even on minority classes; such features could therefore help alleviate class imbalance and data sparsity issues. We also found that Mistral outperformed Llama, although its performance remained far below that of the classical machine learning methods. We believe our work is novel, as we are the first to study the problem of classifying Capabilities, Tasks, Hard-goals, and Soft-goals, and to investigate how LLMs perform on this problem.
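For illustration, here is a minimal sketch of the second pipeline described in the abstract: encoding user stories with a pretrained sentence encoder and training a classical classifier on the resulting embeddings. The encoder name, the toy stories, and their labels are illustrative assumptions, not the authors' dataset or exact setup.

# Minimal sketch of pipeline 2: deep-learning embeddings as features for a
# classical classifier. The encoder choice and toy data are assumptions.
from sentence_transformers import SentenceTransformer
from sklearn.svm import SVC

# Tiny illustrative training set (not the paper's data).
stories = [
    "As a user, I want to reset my password so that I can regain access.",
    "As an admin, I want to export monthly reports so that audits are faster.",
    "As a visitor, I want the site to feel trustworthy.",
    "As a developer, I want to refactor the login module.",
]
labels = ["Capability", "Capability", "Soft-goal", "Task"]

# Encode each story into a dense sentence embedding.
encoder = SentenceTransformer("all-mpnet-base-v2")  # assumed encoder choice
X = encoder.encode(stories)

# Train a linear SVM on the embeddings; a Random Forest could be swapped in.
clf = SVC(kernel="linear")
clf.fit(X, labels)

# Classify an unseen story.
new_story = "As a user, I want to search the catalogue by keyword."
print(clf.predict(encoder.encode([new_story]))[0])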
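Likewise, a minimal sketch of the third pipeline (few-shot classification with an LLM), assuming a Mistral instruction model served behind an OpenAI-compatible endpoint; the base_url, model identifier, and in-context examples below are illustrative assumptions, not the authors' prompt.

# Minimal sketch of pipeline 3: few-shot prompting of an LLM classifier.
# Endpoint, model name, and the few-shot examples are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")  # assumed local server

prompt = """Classify each user story as Capability, Task, Hard-goal, or Soft-goal.

Story: As a user, I want to upload a photo to my profile.
Label: Capability

Story: As a visitor, I want the checkout to feel secure.
Label: Soft-goal

Story: As a user, I want to search the product catalogue by keyword.
Label:"""

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",  # assumed model identifier
    messages=[{"role": "user", "content": prompt}],
    max_tokens=5,
    temperature=0.0,  # deterministic output for classification
)
print(response.choices[0].message.content.strip())  # expected: "Capability"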
Disciplines :
Computer science
Author, co-author :
Chuor, Porchourng ; Université de Liège - ULiège > HEC Liège Research > HEC Liège Research: Business Analytics & Supply Chain Mgmt
References :
Brown, T., Mann, B., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901 (2020)
Casanueva, I., Temčinas, T., Gerz, D., Henderson, M., Vulić, I.: Efficient intent detection with dual sentence encoders (2020)
Chitra, S.G.: Classification of low-level tasks to high-level tasks using JIRA data. Ph.D. thesis, Universidade do Porto (Portugal) (2021)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding (2019)
Elsadig, M., Ibrahim, A.O., et al.: Intelligent deep machine learning cyber phishing URL detection based on BERT features extraction. Electronics 11(22), 3647 (2022)
Gao, L., Biderman, S., et al.: The Pile: an 800GB dataset of diverse text for language modeling (2020)
García-Díaz, J.A., Pan, R., Valencia-García, R.: Leveraging zero and few-shot learning for enhanced model generality in hate speech detection in Spanish and English. Mathematics 11(24), 5004 (2023)
Gomes, L., da Silva Torres, R., Côrtes, M.L.: BERT- and TF-IDF-based feature extraction for long-lived bug prediction in FLOSS: a comparative study. Inf. Softw. Technol. 160, 107217 (2023)
Heng, S.: Impact of unified user-story-based modeling on agile methods: aspects on requirements, design and life cycle management. Ph.D. thesis, Université catholique de Louvain, Belgium (2017)
Honnibal, M., Montani, I., Van Landeghem, S., Boyd, A.: spaCy: industrial-strength natural language processing in Python (2020)
Jiang, A.Q., Sablayrolles, A., et al.: Mistral 7B (2023)
Jurisch, M., Böhm, S., James-Schulz, T.: Applying machine learning for automatic user story categorization in mobile enterprises application development (2020)
Li, L., Gong, B.: Prompting large language models for malicious webpage detection. In: 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning (PRML), pp. 393–400 (2023)
Loukas, L., Stogiannidis, I., Malakasiotis, P., Vassos, S.: Breaking the bank with ChatGPT: few-shot text classification for finance (2023)
Man, R., Lin, K.: Sentiment analysis algorithm based on BERT and convolutional neural network. In: 2021 IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), pp. 769–772. IEEE (2021)
Occhipinti, A., Rogers, L., Angione, C.: A pipeline and comparative study of 12 machine learning models for text classification. Expert Syst. Appl. 201, 117193 (2022)
OpenAI: GPT-4 technical report (2024)
Ouyang, L., Wu, J., Jiang, X., et al.: Training language models to follow instructions with human feedback. In: Advances in Neural Information Processing Systems, vol. 35, pp. 27730–27744. Curran Associates, Inc. (2022)
Peters, M.E., Ruder, S., Smith, N.A.: To tune or not to tune? Adapting pretrained representations to diverse tasks. ACL 2019, 7 (2019)
Petukhova, A., Matos-Carvalho, J.P., Fachada, N.: Text clustering with LLM embeddings (2024)
Poumay, J., Ittoo, A.: A comprehensive comparison of word embeddings in event & entity coreference resolution (2021)
Raffel, C., Shazeer, N., et al.: Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21(140), 1–67 (2020)
Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manage. 24(5), 513–523 (1988)
Shahid, M.: Splitting user stories using supervised machine learning (2020)
Song, K., Tan, X., Qin, T., Lu, J., Liu, T.Y.: MPNet: masked and permuted pre-training for language understanding (2020)
Szeghalmy, S., Fazekas, A.: A comparative study of the use of stratified cross-validation and distribution-balanced stratified cross-validation in imbalanced learning. Sensors 23(4), 2333 (2023)
Touvron, H., Martin, L., et al.: Llama 2: open foundation and fine-tuned chat models (2023)
Wautelet, Y., Heng, S., Hintea, D., Kolp, M., Poelmans, S.: Bridging user story sets with the use case model. In: Link, S., Trujillo, J.C. (eds.) ER 2016. LNCS, vol. 9975, pp. 127–138. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-47717-6_11
Wautelet, Y., Heng, S., Kolp, M., Mirbel, I.: Unifying and extending user story models. In: Proceedings of the 26th International Conference on Advanced Information Systems Engineering, CAiSE 2014, Thessaloniki, Greece, 16–20 June 2014, pp. 211–225 (2014)
Wautelet, Y., Heng, S., Kolp, M., Mirbel, I., Poelmans, S.: Building a rationale diagram for evaluating user story sets. In: 2016 IEEE Tenth International Conference on Research Challenges in Information Science (RCIS), pp. 1–12. IEEE (2016)
Wenzek, G., et al.: CCNet: extracting high quality monolingual datasets from web crawl data (2019)
Zhu, Y., Kiros, R., et al.: Aligning books and movies: towards story-like visual explanations by watching movies and reading books. In: The IEEE International Conference on Computer Vision (ICCV), December 2015