Scientific congresses and symposiums : Unpublished conference/Abstract
Arts & humanities : Classical & oriental studies
The Ramses Project. Exploring Ancient Egyptian linguistic data using a richly annotated corpus
Winand, Jean [Université de Liège - ULiège > Services généraux (Faculté de philosophie et lettres) > Doyen de la Faculté de Philosophie et lettres]
Polis, Stéphane [Université de Liège - ULiège > Département des sciences de l'antiquité > Egyptologie]
Exploring Ancient Languages Through Corpora
14-16 June 2012
Dag Trygve Truslew Haug
[en] Ancient Egyptian ; corpus ; linguistics ; annotation
[en] The Ramses project, developed at the University of Liège since 2006, aims to build a richly annotated historical corpus of Late Egyptian texts and, more broadly, of all written material whose linguistic registers attest Late Egyptian evolutionary features, from the 18th dynasty down to the Third Intermediate Period (ca. 1350-700 BCE). It has been designed specifically as a tool for research in Egyptian linguistics. For each text, the corpus includes the relevant graphemic information (hieroglyphic transcription with transliteration), the linguistic information (complete morpho-syntactic analysis) and a full set of metadata (description and categorization of the corpus, plus bibliographical references). Starting in 2013, we will progressively provide online access to the Ramses corpus. From a technical point of view, Ramses is a relational SQL database in which the texts are represented and stored in XML. Currently, ca. 1350 texts have been included in the database and annotated in several respects: they have been encoded in hieroglyphic script, translated into French and/or English, and annotated for part of speech, lemmatization and morphological analysis. At the end of 2011 the corpus consisted of slightly more than 300 000 words (it is expected to grow to more than 1 million words in the coming years), which amounts to ca. 8000 lemmata, 14 000 inflexions and 45 000 spellings.
In this paper, we review the experience of the Ramses Project in building a richly annotated corpus of an ancient language with a complex writing system. Particular emphasis is placed on the new avenues of research that a tool like Ramses opens up for the study of ancient text languages.
First, we present the state of the art in Egyptology and the reasons for launching such a project. Second, we introduce the editing software and the annotation scheme. Third, we present a series of case studies (study of classifiers; relation between the graphemic and morphological levels; valency pattern alternation; diaphasic variation) in order to highlight the capabilities of the search engine and the research questions it makes tractable in Ancient Egyptian linguistics.
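The architecture described in the abstract (a relational SQL database holding XML-encoded texts, with lemmata, inflexions and spellings as distinct levels) can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the table names, the column names, the `<w spelling="..."/>` markup and the sample rows are hypothetical and are not taken from the actual Ramses schema or data.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one table per annotation level, plus the XML-encoded texts.
SCHEMA = """
CREATE TABLE lemmata    (id INTEGER PRIMARY KEY, form TEXT NOT NULL, pos TEXT NOT NULL);
CREATE TABLE inflexions (id INTEGER PRIMARY KEY,
                         lemma_id INTEGER NOT NULL REFERENCES lemmata(id),
                         label TEXT NOT NULL);
CREATE TABLE spellings  (id INTEGER PRIMARY KEY,
                         inflexion_id INTEGER NOT NULL REFERENCES inflexions(id),
                         signs TEXT NOT NULL);
CREATE TABLE texts      (id INTEGER PRIMARY KEY, title TEXT NOT NULL, body_xml TEXT NOT NULL);
"""


def build_demo_db():
    """Create an in-memory database with a few invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO lemmata VALUES (1, 'sDm', 'verb')")
    conn.executemany(
        "INSERT INTO inflexions VALUES (?, 1, ?)",
        [(1, "infinitive"), (2, "imperative")],
    )
    conn.executemany(
        "INSERT INTO spellings VALUES (?, ?, ?)",
        [(1, 1, "spelling-a"), (2, 1, "spelling-b"), (3, 2, "spelling-c")],
    )
    # Each word token in the XML points at a spelling row by id.
    body = '<text><w spelling="1"/><w spelling="3"/><w spelling="1"/></text>'
    conn.execute("INSERT INTO texts VALUES (1, 'demo text', ?)", (body,))
    conn.commit()
    return conn


def lemma_counts(conn, text_id):
    """Count (lemma, inflexion) pairs attested in one text."""
    (body_xml,) = conn.execute(
        "SELECT body_xml FROM texts WHERE id = ?", (text_id,)
    ).fetchone()
    counts = {}
    for word in ET.fromstring(body_xml).iter("w"):
        # Resolve spelling -> inflexion -> lemma through the relational tables.
        key = conn.execute(
            """SELECT lemmata.form, inflexions.label
               FROM spellings
               JOIN inflexions ON inflexions.id = spellings.inflexion_id
               JOIN lemmata ON lemmata.id = inflexions.lemma_id
               WHERE spellings.id = ?""",
            (int(word.get("spelling")),),
        ).fetchone()
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    print(lemma_counts(build_demo_db(), 1))
```

Keeping spellings, inflexions and lemmata in separate tables is one plausible way to let a search move between the graphemic and the morphological level, which is the kind of query the case studies rely on; the real system may be organized differently.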

File(s) associated to this reference

Fulltext file(s):

Open access
Polis&Winand_2012_Exploring.pdf (Publisher postprint, 3.28 MB)

