
The notion of sentence and other discourse units in corpus annotation

Abstract: The notion of sentence, as it is defined in syntactic, semantic, graphic and prosodic terms, is not a suitable maximal unit for the prosodic and syntactic annotation of spoken corpora. Still, this notion is taken as a reference in many syntactic and prosodic annotation systems. We present here the modular approach we adopted for the annotation of the Rhapsodie corpus of spoken French, which led us to distinguish three types of elementary units operating in discourse (government units, illocutionary units, and intonational periods) and to annotate them separately. We describe the types of interactions identified among these various levels of cohesion. On this basis, we propose a reappraisal of the traditional notion of sentence and define two additional types of discourse units, which we consider to be the minimal and the maximal span for the notion of sentence.
Document type: Book sections
Contributor: Administrateur Hal Nanterre
Submitted on: Tuesday, February 16, 2021 - 5:42:15 PM
Last modification on: Tuesday, March 2, 2021 - 10:24:45 AM


Paola Pietrandrea, Sylvain Kahane, Anne Lacheret-Dujour, Frédéric Sabio. The notion of sentence and other discourse units in corpus annotation. Tommaso Raso; Heliana Mello. Spoken Corpora and Linguistic Studies, pp.331-364, 2014, ⟨10.1075/scl.61.12pie⟩. ⟨hal-03143343⟩


