Second semester

Textual Analysis

Objectives

Apply textual statistics methods to corpora of different types (open-ended questions, word associations, press articles, web pages, interviews, etc.) using dedicated software (IRaMuTeQ, Spad, the R tm package, the R Commander plug-in R.TeMiS).

Interpret and present the results, adapting how they are reported to what has been requested.

Reporting of results.

Course outline

A look back at the origin and development of text statistics methods.

The role of text statistics in text mining.

Presentation of different types of text corpora, their collection and formatting.

Different stages in processing a text corpus: vocabulary reduction, construction of the associated lexicon (lemmatization), different lexical tables and their statistical processing.
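
As an illustration of these stages, here is a minimal Python sketch that goes from raw texts to a lexical table (texts in rows, lemmas in columns). It is not taken from the course software: the toy corpus, the stop-word list, the lemma dictionary and the function names are all assumptions made for the example, and dedicated tools rely on far larger dictionaries.

```python
from collections import Counter

# Hypothetical toy corpus: three short answers to an open-ended question.
corpus = {
    "resp1": "The trains are late and the stations are crowded",
    "resp2": "Late trains, crowded buses",
    "resp3": "The bus is cheap",
}

# Assumed stop-word list and lemma dictionary, for illustration only.
STOP_WORDS = {"the", "and", "are", "is"}
LEMMAS = {"trains": "train", "stations": "station", "buses": "bus"}


def tokenize(text):
    """Lowercase, strip punctuation, drop stop words, map forms to lemmas."""
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return [LEMMAS.get(w, w) for w in words if w and w not in STOP_WORDS]


def lexical_table(docs):
    """Return the sorted lexicon and one row of lemma counts per text."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    lexicon = sorted(set().union(*counts.values()))
    table = {name: [c[term] for term in lexicon] for name, c in counts.items()}
    return lexicon, table


lexicon, table = lexical_table(corpus)
print(lexicon)
for name, row in table.items():
    print(name, row)
```

Each row of the resulting table counts how often each lemma of the lexicon occurs in one text; tables of this kind are the input to the statistical processing mentioned above.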

Results and interpretation aids: characteristic vocabulary, words in their context of use, outputs of multivariate analyses or classifications, and support for post-coding an open-ended question.

Implementation of an analysis using several software packages: IRaMuTeQ (Alceste method), specific R packages (such as R.TeMiS), and Spad.

Reporting of results.

Prerequisites

None specified