Paragraph Level Annotation

Version 0.1 (07/07/2004)

OpenText.org Proposal July 2004

Editors:
Matthew Brook O'Donnell
Stanley E. Porter
Jeffrey T. Reed

Copyright (c) OpenText.org 2001-2004

Abstract

The paragraph is the third level of analysis in the OpenText.org model, built upon the levels of the clause and the word group. This document outlines the linguistic analysis of the paragraph and its components and describes the XML elements and attributes used for its annotation. Many of the features analyzed at this level are meta-elements that summarize the lower levels of annotation and can be calculated automatically.

Status of this document

This document is the initial proposal of the paragraph level annotation scheme. It is currently under review and comments are requested. Please post comments to the OpenText.org forum.

Table of contents

1. Introduction

2. Definitions

3. Features analyzed at the paragraph level
3.1. (Field)
3.2. (Tenor)
3.3. Paragraph boundaries, clause level and connections, and paragraph conjunctions (Mode)


1. Introduction

a. There are few formal orthographic conventions for marking paragraph boundaries in ancient manuscripts. Sections or pericopes utilized in modern editions often have origins in early MSS and codices but are not based upon linguistic criteria. The foundational task for annotation at the paragraph level is the identification of boundaries and shifts in discourse that might support the demarcation of a paragraph unit.

b. A paragraph is made up of a series of clauses. Connections or linkages between clauses and their level of function are analyzed at the paragraph level. A basic distinction is made between primary and secondary clauses and their function within discourse. Primary clauses provide the developmental flow of information, while secondary clauses develop themes and concepts introduced in primary clauses.

c. The notions of clause independence and dependence in Greek have been overly influenced by working in translation and a reliance upon logical analysis. A dependent clause is often defined as a clause that 'cannot stand on its own'. For instance, a clause introduced with ἵνα is usually classified as dependent because the translation 'in order that X' seems to depend on a previous clause. However, the same argument could be used to classify clauses beginning with γάρ or οὖν as dependent.

d. Within paragraphs there are often groups of clauses that function together but cannot be understood as distinct paragraphs. Examples of such clause groups include a conditional construction or a speech event (with a frame, e.g. 'she said', and content, e.g. the substance of the speech process). These kinds of clause groups are not considered a separate level of discourse between the clause and the paragraph.

e. In the OpenText.org discourse model, the features analyzed and annotated at the paragraph level divide into textual and meta-textual features. Textual features are words or points in the discourse that receive actual marking, while meta-textual features summarize linguistic patterns and values within the paragraph (e.g. the number of primary and secondary clauses within a given paragraph).
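
As an illustration of how a meta-textual feature might be calculated automatically, the following minimal Python sketch counts the primary and secondary clauses in one paragraph. It is not part of the annotation scheme: the `Clause` class, its `clause_id` and `level` fields, the function name and the sample data are all hypothetical, and the sketch assumes that each clause has already been classified as primary or secondary at the clause level.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Clause:
    """Hypothetical stand-in for an annotated clause (not the OpenText.org schema)."""
    clause_id: str  # assumed identifier, for illustration only
    level: str      # assumed to be either "primary" or "secondary"


def summarize_clause_levels(clauses):
    """Return the number of primary and secondary clauses in a paragraph."""
    counts = Counter(clause.level for clause in clauses)
    # A Counter returns 0 for a missing key, so both values are always present.
    return {"primary": counts["primary"], "secondary": counts["secondary"]}


# Invented sample paragraph: two primary and three secondary clauses.
paragraph = [
    Clause("c1", "primary"),
    Clause("c2", "secondary"),
    Clause("c3", "primary"),
    Clause("c4", "secondary"),
    Clause("c5", "secondary"),
]

summary = summarize_clause_levels(paragraph)
print(summary)  # {'primary': 2, 'secondary': 3}
```

A summary of this kind could be stored alongside the paragraph as a meta-textual value, whereas textual features would remain attached to the words or points in the discourse that mark them.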

2. Definitions

  • [d1] This document assumes the analysis of the clause and its components as a separate level of analysis. The definition and boundaries of a clause are as defined elsewhere (see Clause Level Annotation).
  • [d2] A paragraph is a unit of discourse (e.g. a series of clauses) exhibiting internal cohesion with a distinct function within the surrounding co-text. The boundaries of a paragraph can be identified through a series of feature criteria that combine to create a discourse shift.

3. Features analyzed at the paragraph level

a. Though not formally recognized in the current elements and attributes, it is helpful to divide the features analyzed at the paragraph level according to whether they belong to the field, tenor or mode of discourse. Also a distinction is made between textual and meta-textual features.

3.1. (Field)

a.

3.2. (Tenor)

a.

3.3. Paragraph boundaries, clause level and connections, and paragraph conjunctions (Mode)

a. The boundaries of each paragraph are ascertained on the basis of the presence of one or more feature criteria outlined below.

b.