Abstract :
[en] Our project applies digital methods of so-called “distant reading”, i.e., computational techniques for analyzing large collections of text, to identify patterns in a large body of academic publications on two regional languages in the Low Countries. In the first stage of this project, we scraped hundreds of academic publications about the Frisian and Limburgish languages, published from 1800 onwards in English, Frisian, German, and Dutch and indexed in Google Scholar.
Google Scholar is one of the most comprehensive search engines and aggregators of academic publications and scholarly literature [1]. However, although its search engine offers very high recall of academic publications [2], its precision is often insufficient for systematic reviews or accurate analyses. To address this imprecision and extract relevant publications, we combined precise search query logic, which filters for publications on Frisian and Limburgish, with a layer of relevance filtering by a Large Language Model. A random sample of the resulting publications is verified manually. This workflow yields an extensive list of academic literature on the Frisian and Limburgish languages in English, Frisian, German, and Dutch.
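The three filtering stages described above (query logic, relevance filtering, random sampling for manual checks) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not our actual implementation: the names `Publication`, `build_query`, `filter_relevant`, `sample_for_review`, and `keyword_stub` are hypothetical, the Boolean query string is generic rather than Google Scholar's exact syntax, and a simple keyword check stands in for the Large Language Model so that the example runs without external services.

```python
import random
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Publication:
    """Hypothetical record for one scraped search result."""
    title: str
    year: int
    snippet: str


def build_query(language_terms: Iterable[str], topic_terms: Iterable[str]) -> str:
    """Join synonyms with OR inside each group and require both groups with AND."""
    languages = " OR ".join(f'"{term}"' for term in language_terms)
    topics = " OR ".join(f'"{term}"' for term in topic_terms)
    return f"({languages}) AND ({topics})"


def filter_relevant(
    pubs: Iterable[Publication],
    is_relevant: Callable[[Publication], bool],
) -> List[Publication]:
    """Keep only the publications that the relevance judge accepts."""
    return [pub for pub in pubs if is_relevant(pub)]


def sample_for_review(pubs: List[Publication], k: int, seed: int = 0) -> List[Publication]:
    """Draw a reproducible random sample for manual verification."""
    rng = random.Random(seed)
    return rng.sample(pubs, min(k, len(pubs)))


def keyword_stub(pub: Publication) -> bool:
    """Placeholder for the LLM relevance judgment (assumption, not a real model call)."""
    text = f"{pub.title} {pub.snippet}".lower()
    return "language" in text or "dialect" in text


if __name__ == "__main__":
    print(build_query(["Frisian", "Frysk"], ["language", "dialect"]))
    candidates = [
        Publication("Frisian language policy", 1998, ""),
        Publication("Frisian horse breeding", 2005, ""),
    ]
    kept = filter_relevant(candidates, keyword_stub)
    print(sample_for_review(kept, k=5))
```

Passing the relevance judge as a function keeps the pipeline independent of any particular model: in a sketch like this, `keyword_stub` could be replaced by a function that sends the title and snippet to a language model and parses a yes/no answer.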
In our analysis of these publications over time, we have focused on the following six patterns: (1) the number of academic publications on Frisian and Limburgish; (2) the types of publications, such as (peer-reviewed or non-peer-reviewed) academic books, blogs, journals, and theses; (3) the languages in which the publications have been written; (4) the main topics or themes of the publications; (5) the main authors of the publications; (6) the attitudes of the publications towards the status of Frisian and Limburgish as regional languages. Our poster presentation shares our first findings and invites the audience to discuss possible correlations, explanations, and limitations.
[1] Gusenbauer M. Beyond Google Scholar, Scopus, and Web of Science: An evaluation of the backward and forward citation coverage of 59 databases' citation indices. Res Syn Meth. 2024; 15(5): 802-817. doi:10.1002/jrsm.1729
[2] Gehanno JF, Rollin L, Darmoni S. Is the coverage of Google Scholar enough to be used alone for systematic reviews. BMC Med Inform Decis Mak. 2013; 13: 7. doi:10.1186/1472-6947-13-7