TRANSLECTURES – Transcription and Translation of Video Lectures

Online educational repositories of video lectures are rapidly growing on the basis of increasingly available and standardised infrastructure. A well-known example is the VideoLectures web portal, a free and open-access repository of educational video lectures and a major player in the development of the widely used Opencast Matterhorn platform for educational video management. As in other repositories, video lectures in VideoLectures need to be transcribed and translated to make them accessible to speakers of different languages and to people with disabilities. However, also as in other repositories, most lectures in VideoLectures are neither transcribed nor translated, because there are no efficient solutions for producing transcriptions and translations at a reasonable level of accuracy. The aim of transLectures is to develop innovative, cost-effective solutions to produce accurate transcriptions and translations in VideoLectures, with generality across other Matterhorn-related repositories. Our starting hypothesis is that current automatic speech recognition and machine translation technology is only a relatively small gap away from achieving sufficiently accurate results in the kind of audiovisual object collections we are considering. To close this gap, transLectures will follow two main research lines. First, we will study how to deal better with object variability through massive adaptation of general-purpose models using available lecture-specific knowledge. Second, as we think that sufficiently accurate results are unlikely to be obtained with fully automatic methods alone, we will explore how to reach the desired levels of accuracy through intelligent interaction with users. At the same time, our goal is not to end up with a system prototype that is evaluated only in the lab and can hardly be used in real-life settings. Instead, we will develop tools that work with Matterhorn, so that we can evaluate them in a real-life setting.

DURATION: 1 November 2011 – 31 October 2014