Contribution Details

Type	Bachelor's Thesis
Scope	Discipline-based scholarship
Title	Real-Time Crowdsourced Speech-to-Text Subtitling
Organization Unit	Dynamic and Distributed Information Systems (Abraham Bernstein)
Authors	Nicola Staub
Supervisors	Abraham Bernstein
Language	English
Institution	University of Zurich
Faculty	Faculty of Economics, Business Administration and Information Technology
Number of Pages	68
Date	2014
Abstract Text	Speech recognition systems still often produce unconvincing results, while professional transcribers are not available on demand and charge high rates. The goal of this bachelor's thesis is to develop a speech-to-text subtitling algorithm that combines the number-crunching capabilities and scalability of computer systems with the creativity and high-level cognitive abilities of human beings, delivering robust quality while keeping costs and processing time to a minimum. Using Amazon's Mechanical Turk crowdsourcing platform, two entire conference speeches were transcribed by non-experts, with astonishing findings. The thesis compares the subtitles produced by the newly developed algorithm and by two baseline algorithms with one another, as well as with captions generated by professional stenographers and by computerized speech recognition systems. The comparison focuses on quality, costs, and total processing time.
Summary (translated from German)	While speech recognition systems often generate unusable results, professional typists for transcription services are not available around the clock and are expensive. The goal of this thesis is to develop an algorithm that converts spoken language into text by combining the enormous power and scalability of computers with the versatile cognitive abilities of humans. The algorithm is intended to generate high-quality results while keeping processing time and costs to a minimum. With the help of Amazon's Mechanical Turk crowdsourcing platform, two complete conference talks were transcribed by laypeople alone, with astonishing results. This thesis compares the resulting subtitles of the self-developed algorithm and of two baseline algorithms both with one another and with benchmarks from professional stenographers or computer-based speech recognition systems. The focus is on quality, costs, and the total processing time required.