Collecting data for the Rhapsodie treebank

Anne Lacheret-Dujour; Paola Pietrandrea; Olivier Baude; Nicolas Obin; Anne-Catherine Simon; Atanas Tchobanov

doi:10.1075/scl.89.02lac

Book chapter. Year: 2019

Anne Lacheret-Dujour

Role: Author
PersonId: 17513
IdHAL: anne-lacheret-dujour
ORCID: 0000-0002-1573-7270

Modèles, Dynamiques, Corpus

Paola Pietrandrea

Role: Author
PersonId: 919368

Olivier Baude

Role: Author
PersonId: 4228
IdHAL: olivier-baude
ORCID: 0000-0002-8627-5229
IdRef: 105682640

Modèles, Dynamiques, Corpus

Nicolas Obin

Role: Author
PersonId: 7042
IdHAL: nicolas-obin
ORCID: 0000-0002-5236-5306
IdRef: 157523799

Analyse et synthèse sonores [Paris]

Anne-Catherine Simon

Role: Author
PersonId: 859090

Université Catholique de Louvain = Catholic University of Louvain

Atanas Tchobanov

Role: Author
PersonId: 19315
IdHAL: atanas-tchobanov
ORCID: 0000-0002-0091-1766
IdRef: 067280110

Modèles, Dynamiques, Corpus

Abstract

This chapter is devoted to the development of the Rhapsodie repository. We describe the selection of data to be annotated and the principles used to document the data, and we discuss the theoretical assumptions underlying the Rhapsodie project. The aim was to provide a corpus for studying the interface between discourse, syntax, and prosody in French, and the variation of intonosyntactic features according to discourse genre in the marking of informational structure as well as expressivity in unelicited speech. At the beginning of the Rhapsodie project, such data were under-represented and the need for spoken corpora of this type in French was strongly felt. Consequently, several challenges had to be addressed. First, we discuss the obstacles and challenging questions we faced in developing a well-balanced corpus of different discourse genres produced in different speech situations, such as the nature of the data and the type of information to include in the metadata. Then, we present the sources from which the samples were extracted, the legal and ethical issues, and the methodology adopted to encode the metadata.

Domains

Linguistics

Contributor: Guillaume Sioly

https://hal.parisnanterre.fr/hal-04088531

Submitted on: Thursday, 4 May 2023, 11:33:08

Last modified on: Wednesday, 30 October 2024, 13:28:28

Dates and versions

hal-04088531, version 1 (04-05-2023)

Identifiers

HAL Id: hal-04088531, version 1
DOI: 10.1075/scl.89.02lac

Cite

Anne Lacheret-Dujour, Paola Pietrandrea, Olivier Baude, Nicolas Obin, Anne-Catherine Simon, et al. Collecting data for the Rhapsodie treebank. Rhapsodie, 89, John Benjamins Publishing Company, pp. 7-20, 2019, Studies in Corpus Linguistics, 9789027262929. ⟨10.1075/scl.89.02lac⟩. ⟨hal-04088531⟩

Collections

CNRS MODYCO IRCAM STMS SORBONNE-UNIVERSITE SU-SCIENCES UNIV-PARIS-LUMIERES UNIV-PARIS-NANTERRE
