New Junior Research Group in Digital Humanities


Some "topics" (for example, themes and motifs) from French classical and Enlightenment plays, visualised as word clouds. The larger a word is displayed, the more important it is in the respective topic. (Graphic: C. Schöch, CLiGS)

Visualisation of the topical similarities of 375 plays using principal component analysis, based on 80 "topics". Each point represents a play (red = tragedy, green = comedy, blue = tragicomedy). It becomes clear that tragicomedy combines elements of tragedy and comedy, but also that there are some works to which such a categorisation does not apply. This is where it is definitely worthwhile for the philologist to pick up the book. (Graphic: C. Schöch, CLiGS)

Dr. Christof Schöch, head of the CLiGS group. (Photo: Marco Bosch)

The BMBF (Federal Ministry of Education and Research) is funding the setup of a junior research group at the Department of Computer Philology at the University of Würzburg. Over the next four years, the team will devote itself to "Computational Literary Genre Stylistics" (CLiGS) in the field of Romance philology.

At the Institute of German Philology, a new junior research group is being formed under the guidance of Christof Schöch. It belongs to the Department of Computer Philology and Modern German Literature of Professor Fotis Jannidis and is funded by the Federal Ministry of Education and Research (BMBF) with about 1.8 million euros over the coming four years. The department will cooperate closely with Professor Andreas Hotho from the Department of Information Technology and with Professor Brigitte Burrichter from Romance Philology.

"The largest share of the funding for the Department of Digital Humanities is made up of funds for setting up infrastructures. There are comparatively few pure research projects such as the junior research group projects. This is exactly why this sophisticated and ambitious project of 'computational literary genre stylistics' represents something special," says Professor Jannidis.

Enabling a Larger Basis for Analysis and Interpretation

As head of the group, Christof Schöch, together with his team, would like to use the funded project to lay foundations that make it possible to approach questions of literary studies in a totally new way, by combining comprehensive text data, innovative methods of analysis and humanistic interpretation.

So far, studies in literary scholarship have mainly been carried out on only a few works, simply because there is not enough time to read all comedies from one era and to put them in relation to each other, or to compare them with all tragedies from the same time period. "Furthermore, we see, for example, that it is always the same literary works that are used in literary descriptions," says Schöch. This results in supposedly general statements about entire text genres and eras that are based on a comparatively small body of data.

This is where computational literary genre stylistics comes in. In future, it will be possible to compare a considerably larger number of texts analytically. This works, for example, with software that independently recognises certain recurrent stylistic elements and types of expression in texts, based on the words and word groups they contain, independently of information on the author or of existing classifications of the work as a whole. The computer can then output the detected patterns, and the philologist has starting points for further literary research. If in doubt, he or she will take a certain book from the shelf and read it with a specific question in mind.
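To give a rough idea of how texts can be compared purely on the basis of their words, without any author or genre labels, the following minimal Python sketch builds a word-frequency profile for each text and ranks pairs of texts by cosine similarity. It is an illustration only, not the software used by the CLiGS group: the function names, the similarity measure and the three toy snippets are assumptions made for this example.

```python
import math
import re
from collections import Counter


def relative_frequencies(text):
    """Map each lower-cased word to its share of all words in the text."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cosine_similarity(freq_a, freq_b):
    """Cosine of the angle between two sparse word-frequency vectors."""
    shared = set(freq_a) & set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_pairs(texts):
    """Rank all pairs of texts by similarity, highest first."""
    profiles = {name: relative_frequencies(t) for name, t in texts.items()}
    names = sorted(profiles)
    pairs = [
        (cosine_similarity(profiles[a], profiles[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    return sorted(pairs, reverse=True)


# Invented toy snippets, not taken from any of the project's collections.
toy_texts = {
    "play_a": "la mort et la gloire du roi",
    "play_b": "la gloire du roi et la mort du prince",
    "play_c": "le valet rit et le mariage amuse le valet",
}
ranking = most_similar_pairs(toy_texts)
for score, first, second in ranking:
    print(f"{first} ~ {second}: {score:.2f}")
```

With real plays, the pairs at the top of such a ranking would be candidates for a closer look by the philologist; the numbers themselves do not replace the reading.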

The researchers are hoping for new methodological approaches to the problem of separating the stylistic signals of authorship and genre in literary texts or, for example, to the automatic detection of description, narration and argumentation in narrative texts. It is conceivable that new genre concepts as well as new methods may emerge from this work, or at least that existing classifications may be called into question, explains Schöch.

Five Spanish and French Text Collections as a Basis

Five comprehensive text collections form the basis of the study. Christof Schöch himself is dedicated to French and Spanish classical theatre and the theatre of the Enlightenment. An existing collection of French plays contains about 750 texts. Further text collections relate to Spanish theatre of the Golden Age, the French novel of the Enlightenment, the Spanish novel of the 19th and early 20th century and the Latin American novel of the 19th century.

Whilst one of the collections used by Christof Schöch is already well prepared, the researchers still have to carry out groundwork on the other collections. "It really is not so easy to have access to well-prepared data," says Schöch, who himself studied Romance studies, English language and literature and psychology in Freiburg and Tours.

In order for the computers to be able to use the data, the texts have to be prepared in a uniform XML-based format. This task still has to be carried out for four collections. A by-product of the project work is that the processed collections will then be prepared and available for other researchers with totally different research questions. However, Schöch, who has been coordinating the "Digital Romance" study group in the Association of German Romance Studies (DRV) since the beginning of 2014, points out: "This is a project where the focus is on the analysis and interpretation, not the digitisation."
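The benefit of a uniform XML-based format can be illustrated with a short Python sketch that reads a play and separates its metadata from the spoken text. The article does not name the project's actual schema, so the element names used here (`play`, `title`, `genre`, `speech`), the `speaker` attribute and the sample document are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, invented markup; not the project's actual format.
SAMPLE = """<play>
  <title>Exemple</title>
  <genre>comedy</genre>
  <speech speaker="VALET">Monsieur, le souper est servi.</speech>
  <speech speaker="MAITRE">Enfin!</speech>
</play>"""


def load_play(xml_string):
    """Return title, genre and (speaker, text) pairs from one play document."""
    root = ET.fromstring(xml_string)
    speeches = [
        (speech.get("speaker"), (speech.text or "").strip())
        for speech in root.iter("speech")
    ]
    return {
        "title": root.findtext("title", default=""),
        "genre": root.findtext("genre", default=""),
        "speeches": speeches,
    }


play = load_play(SAMPLE)
# Join only the spoken text, leaving out metadata and speaker names.
spoken_text = " ".join(text for _, text in play["speeches"])
print(play["title"], play["genre"], spoken_text)
```

Once every collection follows the same structure, the same few lines can feed any of them into an analysis, which is also what makes the prepared collections reusable for other research questions.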

Close Cooperation Between Romance Philologists and Computer Scientists

"Of course, the focus of the study group is on the knowledge gained in both literary studies and computing. Nevertheless, the training of young researchers is also very important to us, as is networking in Germany and beyond," says Schöch.

The project is of particular significance not only for the Julius Maximilian University of Würzburg, which is a strong digital humanities base comprising the Department of Computer Philology, the Kallimachos centre for digital humanities, the Digitisation Centre of the library and numerous activities in information technology, but also for Romance studies. "In Germany there are currently two to three Romance studies projects that employ digital methods and are of comparable size," says Schöch. "The competition for the junior research group projects was really very strong, but Christof Schöch managed to convince the reviewers with his excellent and innovative project, which makes me very happy," says the head of the department, Fotis Jannidis.

The humanities scholar Schöch, who has been preparing the project for a year and who supervised it in the application phase, is working closely together with the Information Technology Department in the management of the junior research group. In addition to Schöch, a postdoctoral researcher from this field will also be working in the group. The group is furthermore made up of three doctoral students in Romance studies as well as a doctoral student in information technology and several assistants. "The interdisciplinary structure is one of the special features of the group. It is important to make sure that neither information technology nor Romance studies takes sole control," says the 37-year-old computer philologist, who earned his doctorate in 2008.

"It is one of the main tasks of the Digital Humanities Department to establish the exchange between the disciplines," says Schöch. The researcher knows from everyday experience that this is not always easy. The working methods of philologists and computer scientists are actually rather different. But this is exactly where the Romance philologist also sees the opportunity in such projects: "We complement each other, we approach each other methodologically and we learn an enormous amount from each other," says Schöch.

The study group forms an interface between French and Spanish literary studies on one side and text mining and machine learning on the other. "We want to contribute towards adapting contemporary computational methods for new application areas and firmly anchoring them in the range of methods of Romance literary studies," explains Schöch.

Contact

Dr. Christof Schöch
Department of Computer Philology
E-Mail: christof.schoech@uni-wuerzburg.de

Related links:

Website of the CLiGS project: clgs.hypotheses.org

Project description on the website of the University of Würzburg:
www.germanistik.uni-wuerzburg.de/lehrstuehle/computerphilologie/forschung/cligs/

Information on Christof Schöch on the website of the University of Würzburg: www.germanistik.uni-wuerzburg.de/lehrstuehle/computerphilologie/mitarbeiter/schoech/

By: Marco Bosch

08.06.2015, 13:32