Yue Ding, On Isotropy Calibration of Transformer models, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
There have been many works on interpreting the Transformer, the state-of-the-art model architecture in natural language processing (NLP). Recent research reveals that the embedding space of Transformer models is highly anisotropic, i.e., the embeddings occupy only a narrow cone. Previous works (Mu et al., 2017; Liu et al., 2019) show that improving the isotropy of static embeddings (e.g., Word2Vec or GloVe) improves their performance on downstream tasks. Based on this, several studies propose calibration methods to address the anisotropy of the contextualized embeddings of Transformers. However, Cai et al. (2021) show that the embedding space consists of many clusters and that these clusters are locally isotropic. Luo (2020) reports that the embedding vectors of some Transformer models have 'spikes' at consistent indices, which distorts our understanding of the embedding space. Overall, we believe that additional isotropy calibration does not help to improve the performance of Transformers. To better understand Transformers, we conduct an empirical evaluation and find that in most cases calibration improves the isotropy of the model but decreases the scores on downstream tasks. In other words, better isotropy does not provide consistent improvements across models and tasks.
References
Jiaqi Mu, Suma Bhat, and Pramod Viswanath. All-but-the-top: Simple and effective postprocessing for word representations. arXiv preprint arXiv:1702.01417, 2017.
Tianlin Liu, Lyle Ungar, and Joao Sedoc. Unsupervised post-processing of word vectors via conceptor negation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 6778–6785, 2019.
Xingyu Cai, Jiaji Huang, Yuchen Bian, and Kenneth Church. Isotropy in the contextual embedding space: Clusters and manifolds. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=xYGNO86OWDH.
Ziyang Luo. Catch the "tails" of BERT. arXiv preprint arXiv:2011.04393, 2020.
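The calibration methods evaluated in the thesis typically follow the "all-but-the-top" recipe of Mu et al. (2017): centre the embeddings and remove their dominant principal components. A minimal sketch on synthetic vectors (the toy data, dimensionality, and the single removed component are illustrative assumptions, not taken from the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_cosine(emb, n_pairs=10000):
    """Average cosine similarity over random vector pairs.
    Near 0 suggests isotropy; near 1 the anisotropic 'narrow cone'."""
    idx = rng.integers(0, len(emb), size=(n_pairs, 2))
    a, b = emb[idx[:, 0]], emb[idx[:, 1]]
    sims = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return float(sims.mean())

# Toy anisotropic embeddings: a large shared offset plus small noise.
emb = 5.0 + 0.5 * rng.normal(size=(1000, 32))

# All-but-the-top-style calibration: centre, then remove the top principal component.
centered = emb - emb.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
calibrated = centered - (centered @ vt[:1].T) @ vt[:1]
```

After calibration, the average pairwise cosine similarity drops from close to 1 to close to 0: exactly the isotropy gain that, according to the evaluation above, does not translate into consistent downstream improvements.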
|
|
Adrian Lars Benjamin Iten, Compact Models for the X-Stance Task, University of Zurich, Faculty of Business, Economics and Informatics, 2020. (Bachelor's Thesis)
Machine learning models have brought great progress in natural language processing. We develop different compact models to determine the stance (favor or against) of a politician towards a question based on their comment. We compare simple classification models that classify single words as well as more complex transformer encoder models, which can consider the whole context of the sequence for classification. We implement a web application to perform error analysis on the models and to visualize the data set. The compact models cannot outperform pre-trained deeper models, but achieve moderate to good performance. |
|
Selin Fabel, Schneewittchens Stiefmutter: Automatic Generation of Questions based on Textual Material, University of Zurich, Faculty of Business, Economics and Informatics, 2020. (Master's Thesis)
Examinations in the form of questions and answers have always been a popular way to assess a student's knowledge growth. However, manual question generation is both time-consuming and error-prone, and students often lack realistic practice material. This thesis therefore presents a tool called Schneewittchens Stiefmutter (Snow White's Stepmother), which automates the steps of parsing arbitrary textual material, extracting question components, filtering unwanted parts, and generating reasonable questions. In addition, this thesis offers a simple, re-runnable script that builds a large question-component corpus for German by means of the knowledge base Wikidata.
The tool produces questions that differ in type and complexity and cover the first level of Bloom's question taxonomy. As the tool's name suggests, one may pose the resulting questions to an oracle - or, better, use them for examination and learning purposes, especially in reading comprehension or geographical quizzes. |
|
Markus Göckeritz, Eager Machine Translation, University of Zurich, Faculty of Business, Economics and Informatics, 2020. (Master's Thesis)
Eager machine translation is a recent approach to simultaneous machine translation introduced by Press and Smith (2018).
To improve translation quality, Press and Smith (2018) use beam search. With beam search, however, the final translation can only be returned once the entire source sentence has been processed, which effectively makes eager translation a non-simultaneous process.
We apply sequence-level knowledge distillation to the eager translation model and show that it eliminates the need for beam search in eager translation. We furthermore report a significant improvement in both translation quality and translation speed: our model outperforms the original eager translation model by more than 2 BLEU and translates at almost twice the speed.
We confirm that the distilled data set is more deterministic, more parallel, and more monotonic than the original training data, but show that this increase in determinism, parallelism, and monotonicity does not by itself explain the superior translation quality. |
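Sequence-level knowledge distillation, as used here, replaces each reference translation in the training data with the teacher's own decoded output; the student is then trained on the resulting corpus. A minimal sketch of the data-generation step; `teacher_translate` is a hypothetical stand-in for the actual teacher model's beam-search decoding, and the two-sentence corpus is invented:

```python
def teacher_translate(src):
    # Hypothetical placeholder for the teacher model's beam-search output.
    return src.upper()

def distill(parallel_corpus):
    """Replace each reference target with the teacher's translation of its source."""
    return [(src, teacher_translate(src)) for src, _ in parallel_corpus]

corpus = [("ein test", "a test"), ("noch einer", "another one")]
distilled = distill(corpus)
```

Because the teacher's outputs are simpler and more regular than human references, a student trained on them can decode greedily, which is what removes the need for beam search at inference time.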
|
Eva Viktoria Apati-Nagy, Topic Modeling for Real-Life Language Use of Adults Varying in Age, University of Zurich, Faculty of Business, Economics and Informatics, 2020. (Bachelor's Thesis)
This thesis provides a text analysis of transcripts of oral communication, where data sparsity, the characteristics of spoken language, and the manual processing of the utterances present challenges. I conduct an empirical study based on topic modeling, an unsupervised machine learning method, to discover the abstract topics that occur in the corpus and to evaluate the results. The applied techniques include text normalization, Non-negative Matrix Factorization, Latent Dirichlet Allocation, cosine similarity, semantic coherence, and exclusivity. Non-negative Matrix Factorization using term frequencies and the Frobenius norm provides the most promising model. The generated topics uncover some differences in language use between the age groups in the dataset: young people talk more about entertainment and studies, while older people talk more about purchases and city life. Elderly people talk about time more in terms of the past, while young people talk more in terms of the future. The presented solution can be used in further statistical analyses of the dataset. |
|
Andreas Schaufelbühl, Morphological Inflection of Terminology for Constrained Neural Machine Translation, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Master's Thesis)
Nowadays, Neural Machine Translation systems generally achieve remarkable translation quality. Recently proposed constrained decoding approaches even allow the inclusion of pre-defined terms in Neural Machine Translation output. However, the appropriate inflection of these terms is an open problem: their base form is placed in the output without any modification, which may lead to grammatically incorrect results. We examine the use of a stand-alone sequence-to-sequence model to predict the correct inflected form of a term given its base form in the target language and the source sentence. We show that good results can be achieved in terms of overall accuracy and that the method has limited success in handling rare word forms. |
|
Bernard Silvan Schroffenegger, Entwicklung eines Werkzeugs zum Vergleich von verschiedenen Korpusversionen, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Bachelor's Thesis)
This thesis addresses the conception and implementation of a software tool for numerically comparing different corpus versions that contain annotations typical of empirical linguistic research. The comparison takes place on three levels: a global level, which does not differentiate between individual documents but compares the corpora as a whole; a subordinate level, which abstracts from the contents of these documents; and a lower level, which compares the contents themselves.
The thesis first briefly discusses the concept of a language as well as the characteristics of corpora, especially those used in corpus linguistics. A requirements analysis for the desired software then serves as the foundation for developing suitable solution approaches and their implementation. The result is a stable, transparent framework written entirely in Python, together with a corresponding program, largely free of redundancies and therefore a solid basis for numerous extensions. |
|
Simon Clematide, Manfred Klenner, Martin Volk, Searching Answers. Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday, MV-Wissenschaft, Münster, 2009. (Book/Research Monograph)
|
|
Manfred Klenner, Nominal Anaphora. Can we Tame the Beasts, In: Searching Answers. Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday, MV-Wissenschaft, Münster, p. 77 - 84, 2009. (Book Chapter)
|
|
Martin Volk, The Automatic Translation of Film Subtitles. A Machine Translation Success Story?, In: Resourceful Language Technology: Festschrift in Honor of Anna Sågvall Hein, Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, p. 202 - 214, 2008. (Book Chapter)
|
|
C. Stocker, D. Macher, R. Studler, N. Bubenhofer, D. Crvelin, R. Liniger, Martin Volk, Studien-CD Linguistik. Multimediale Einführungen und Interaktive Übungen zur Germanistischen Sprachwissenschaft, Niemeyer Verlag, Tübingen, 2004. (Book/Research Monograph)
|
|
Martin Volk, Markup of a Test Suite with SGML, In: Linguistic Databases. CSLI., p. 59 - 76, 1998. (Book Chapter)
Recently, there have been various attempts to set up a test suite covering the syntactic phenomena of a natural language (cp. Flickinger et al. 1989, Nerbonne et al. 1993). The latest effort is the TSNLP project (Test Suite for Natural Language Processing) within the Linguistic Research and Engineering (LRE) framework sponsored by the European Union (cp. Balkan et al. 1994). These test suites are meant for testing NLP software regarding its coverage of syntactic phenomena. Volk 1995 showed that a well-organised test suite can also be used to support incremental grammar development and grammar documentation. The key issues in the organisation of a test suite are ease of extensibility and interchangeability as well as the avoidance of redundancy. We have implemented a test suite that is optimised for the avoidance of redundancy, and we report on the trade-off for extensibility and interchangeability. |
|
Martin Volk, The Automatic Translation of Idioms. Machine Translation vs. Translation Memory Systems, St. Augustin, 1998. (Book Chapter)
Translating idioms is one of the most difficult tasks for human translators and translation machines alike. The main problems consist in recognizing an idiom and in distinguishing idiomatic from non-idiomatic usage. Recognition is difficult since many idioms can be modified and others can be discontinuously spread over a clause. But with the help of systematic idiom collections and special rules the recognition of an idiom candidate is always possible. The distinction between idiomatic and non-idiomatic usage is more problematic. Sometimes this can be done by means of special words that are only used in an idiom. But in general this distinction is a question of semantics and pragmatics and therefore beyond the abilities of current translation systems. In this paper we investigate the requirements for automatically recognizing idioms and we check whether idiom recognition is possible within current translation systems, i.e. machine translation and translation memory systems. This is of current interest since the developers of translation systems have started to include huge idiom collections in their products. |
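The claim that idiom candidates can always be recognised with the help of systematic collections and special rules can be illustrated with a pattern that tolerates inflection and discontinuity; the idiom, the allowed verb forms, and the 30-character gap limit are illustrative choices, not taken from the paper:

```python
import re

# One entry of a toy idiom collection: "kick the bucket",
# allowing inflected forms of the verb and a short discontinuity.
IDIOM = re.compile(r"\bkick(?:s|ed|ing)?\b.{0,30}?\bthe bucket\b")

def is_idiom_candidate(sentence):
    """Flag a sentence as containing an idiom *candidate* (not yet
    disambiguated from literal usage - that step needs semantics)."""
    return IDIOM.search(sentence.lower()) is not None
```

Note that the literal sentence "He kicked the bucket down the stairs" would also be flagged, which is precisely the idiomatic-vs-literal distinction the paper argues is beyond current translation systems.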
|
Martin Volk, Einsatz einer Testsatzsammlung im Grammar Engineering, Universität Koblenz, 1995. (Dissertation)
Natural language systems (from grammar checkers to machine translation) comprise complex formal grammars. Because of this complexity, their construction requires an engineering approach, referred to as "Grammar Engineering". A fundamental resource in the grammar engineering process is a test sentence collection: a systematic collection of sentences of the language in which each sentence exemplifies a distinct grammatical problem. Such a sentence collection can support the development of formal grammars in many ways. We show how incremental grammar testing can be organised with the help of a test sentence collection. The presentation of a grammar testing environment implemented in Prolog demonstrates its practical feasibility. |
|
Martin Volk, Was ist Linguistic Engineering?, KI (4), 1994. (Journal Article)
|
|
Martin Volk, Parsing with ID/LP and PS rules, In: Natural Language Processing and Speech Technology. Results of the 3rd KONVENS Conference (Bielefeld), Mouton de Gruyter, Berlin. (Conference or Workshop Paper)
|
|
Michael Jung, Dirk Richarz, Martin Volk, GTU - Eine Grammatik-Testumgebung, In: Proceedings of KONVENS-94, Vienna, Austria. (Conference or Workshop Paper)
|
|
Stephan Mehl, Hagen Langer, Martin Volk, Statistische Verfahren zur Zuordnung von Präpositionalphrasen, In: Proceedings of KONVENS-98, Bonn, Germany. (Conference or Workshop Paper)
Numerous recent studies for English show that statistical analyses of large corpora and treebanks can provide good heuristics for the attachment of prepositional phrases. Corresponding investigations for German have so far failed due to a lack of data. We show, however, that by including additional factors, good results can also be expected for German. We examine the influence of different weights for verbs and nouns, the effects of a preceding lexical disambiguation step, and the coupling of lexical and grammatical preferences. |
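The kind of weighted lexical association heuristic examined here can be sketched by comparing verb+preposition against noun+preposition co-occurrence counts; the counts, example words, and weight value below are invented for illustration:

```python
from collections import Counter

# Toy co-occurrence counts, standing in for statistics from a large corpus.
verb_prep = Counter({("essen", "mit"): 30, ("sehen", "mit"): 5})
noun_prep = Counter({("pizza", "mit"): 3, ("fernrohr", "mit"): 20})

def attach(verb, noun, prep, verb_weight=1.0):
    """Decide PP attachment by comparing weighted association counts
    of (verb, preposition) versus (noun, preposition)."""
    v_score = verb_weight * verb_prep[(verb, prep)]
    n_score = noun_prep[(noun, prep)]
    return "verb" if v_score >= n_score else "noun"
```

Raising `verb_weight` shifts the decision boundary towards verb attachment, which is one way to model the different attachment preferences of verbs and nouns that the abstract mentions.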
|
Stephan Mehl, Britta Heidemann, Martin Volk, Zur Problematik der maschinellen Übersetzung von Nebensätzen zwischen den Sprachen Englisch und Deutsch, In: Evaluation of the Linguistic Performance of Machine Translation Systems. Proceedings of the Workshop at the KONVENS-98, Bonn, Germany. (Conference or Workshop Paper)
At first glance, commercially available machine translation systems appear to handle an astonishing variety of syntactic constructions. Only a detailed analysis reveals their specific deficits.
We therefore compiled a dedicated test suite of 384 sentences (226 EN -> DE, 158 DE -> EN), each containing a different type of subordinate clause. These include indirect declarative and interrogative clauses, adverbial clauses, relative clauses, as well as infinitive, participial, and gerund constructions.
Subordinate clauses are particularly well suited for such a study because numerous syntactic factors play a role in their translation, among them:
* delimiting the constituent
* determining the function of the subordinate clause
* performing the syntactic transfer in translation
Since the structure and function of subordinate clauses are more clearly recognisable from surface cues (punctuation, conjunctions) in German than in English, problems arise primarily when translating from English into German. Our study therefore focuses on this translation direction.
We examined the PC systems Langenscheidts T1 (GMS), Personal Translator Plus (IBM, von Rheinbaben & Busch), Power Translator (Globalink), and Systran (MySoft). Only in a few cases do certain constructions appear to be entirely unknown to all systems (e.g., English participial subordinate clauses). In all other cases, at least one system shows that it is in principle possible to handle the phenomenon correctly. Most translation errors result from incorrect delimitation of the subordinate clause from the main clause, confusion of subordinate clause types, missing semantic analysis, and, among generation errors, incorrect word order in the target language.
In brief, the study yielded the following results:
1. The most difficult subordinate clause constructions for the translation systems are infinitive constructions, participial adverbial clauses, and gerunds. Relative clauses are translated well, even when the relative pronoun is omitted.
2. Of the systems examined, Personal Translator Plus masters the most subordinate clause constructions. Langenscheidts T1 performs very inconsistently, sometimes astonishingly well and sometimes completely wrong.
3. The translations of subordinate clauses from German into English tend to be better than those in the opposite direction. |
|
Martin Volk, Stephan Mehl, Hagen Langer, Hybride NLP-Systeme und das Problem der PP-Anbindung, In: Berichtsband des Workshops, Freiburg, Germany. (Conference or Workshop Paper)
The problem of attachment ambiguities of prepositional phrases has been studied often and from various angles, ranging from language technology to psycholinguistics, yet it can still be regarded as unsolved. PP attachment is a central problem because prepositional phrases are anything but a marginal linguistic phenomenon: German newspaper texts contain on average about one prepositional phrase per sentence, and in technical texts this figure can be considerably higher. |
|