Stefan Christiani, A study on activity and location recognition using various sensors, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Bachelor's Thesis)
In our everyday life, we move through different environments and undertake different activities. In some of those moments we can handle disturbances; in others they become intolerable. Because of this, context-sensitive mobile phones may become an important part of our future. This paper presents an experiment in which a series of sensors was analysed for their capacity to predict the context of the mobile phone to which they were attached. This test device was then used to record data in real-world scenarios, and the accuracy of the resulting predictions was measured. The results show that both activities and locations can be detected quite reliably in real-world conditions and that certain sensors fare better than others. |
Nicolas Bettenburg, Sascha Just, Adrian Schröter, Cathrin Weiss, Rahul Premraj, Thomas Zimmermann, Quality of Bug Reports in Eclipse, In: Proceedings of the 2007 OOPSLA Workshop on Eclipse Technology eXchange, ACM, New York, NY, USA, October 2007. (Conference or Workshop Paper)
The information in bug reports influences the speed at which bugs are fixed. However, bug reports differ in their quality of information. We conducted a survey among the ECLIPSE developers to determine the information in reports that they widely used and the problems they frequently encountered. Our results show that steps to reproduce and stack traces are most sought after by developers, while inaccurate steps to reproduce and incomplete information pose the largest hurdles. Surprisingly, developers are indifferent to bug duplicates. Such insight is useful for designing new bug tracking tools that guide reporters in providing more helpful information. We also present a prototype of a quality-meter tool that measures the quality of bug reports by scanning their content. |
Anthony Lymer, Adaptivität im E-Learning: Entwicklung eines Ajax-basierten Eintrittstests für den Einsatz in Lernplattformen, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Bachelor's Thesis)
The aim of this thesis is to develop an adaptive assessment test for the CasIS portal, an existing e-learning system that supports users electronically in working on case studies. The thesis covers the assessment test and its interfaces to the portal. The test is meant to support a user before he works on a case study by estimating his current knowledge and advising him accordingly. After the test has assessed the candidate's competence, he is offered learning material so that he can prepare optimally for the subsequent case study. The test is not to be understood as an obstacle; rather, it offers an opportunity to obtain information about one's current level of knowledge. The work comprises both the development of an authoring tool for creating tests and a delivery engine, which presents questions to the candidate and then offers him preparation modules. |
Simon Ferndriger, Default Inheritance for OWL-S / Extending the OWL-S (Web Ontology Language for Services) with default logic, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Master's Thesis)
Currently proposed Web Service Technologies allow describing services syntactically and semantically such that users and software agents are able to discover, invoke, compose and monitor these services with a high degree of automation. Thereby, the services can be connected with an ontology-based semantic description. Up to the present however, none of these standards defines a concrete and self-contained way of connecting these services among each other. This thesis demonstrates how web service creation and web service discovery can benefit from such connections among services and how these benefits can be accomplished by introducing Inheritance Relationships (IR) for OWL-S (OWL-S: Semantic Markup for Web Services, 2004) using ideas from computer science about inheritance. For service creation, this thesis provides the possibility to share specific elements among these services. This sharing is expected to substantially reduce the amount of work necessary for creating and maintaining services. For service discovery, an interpretation of IRs among these services is provided in order to discover service substitutes. These substitutes increase the choice of a service user or the availability of a specific service. Together with the developed prototype, the thesis demonstrates the basic feasibility of applying inheritance for OWL-S by illustrating several use cases. In addition, the thesis provides a basis for further tool development. |
Ursula D'Onofrio, Ginseng Goes Italian, Adding Multilingualism to a Natural Language Search Engine, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Master's Thesis)
The popularity of the Internet is growing steadily and with it the amount of data shared
online. As a consequence, it is getting more and more difficult to find and organize the
data on the Web. With the Semantic Web a way to structure this data has been provided,
and to query it, SQL-like query languages were developed. Since these formal languages are not likely to be used by “standard” users, the development of natural language interfaces for querying Semantic Web knowledge bases has become a popular subject. The Ginseng project approaches this subject by offering a controlled natural language query interface for the Semantic Web to the user. In this thesis, the extension of the Ginseng application to a multilingual system as well as the implementation of a user interface for managing the semantic annotation of ontologies in Ginseng are presented. |
Isabelle Guyon, Jiwen Li, Theodor Mador, Patrick A. Pletscher, Gerold Schneider, Markus Uhr, Competitive baseline methods set new standards for the NIPS 2003 feature selection benchmark, Pattern Recognition Letters, Vol. 28 (12), 2007. (Journal Article)
We used the datasets of the NIPS 2003 challenge on feature selection as part of the practical work of an undergraduate course on feature
extraction. The students were provided with a toolkit implemented in Matlab. Part of the course requirements was that they should
outperform given baseline methods. The results were beyond expectations: the students matched or exceeded the performance of the
best challenge entries and achieved very effective feature selection with simple methods. We make available to the community the results
of this experiment and the corresponding teaching material. These results also provide a new baseline for researchers in feature selection. |
Peter Höltschi, Ein regel- und statistikbasiertes Empfehlungssystem für das Masterstudium in Informatik, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Bachelor's Thesis)
This Bachelor's thesis describes the specification, design and prototype implementation of a rule- and statistics-based recommendation system for planning the master's studies in informatics at the University of Zurich. The system supports informatics students by automatically generating study plans. On the one hand, this guarantees compliance with the study regulations. On the other hand, students quickly get a picture of what their master's studies could look like. For this to work, the student has to provide the data from his transcript of records and state his preferences regarding the course of study and the choice of modules. Based on this data, the system generates the study plans using several filtering and sorting functions. In an evaluation, students were asked to create a study plan manually and to provide the data needed to generate study plans automatically. An analysis of the results and a comparison of the manually and automatically generated study plans showed that the quality of the generated plans depends strongly on the quality and quantity of the preferences the student provides. The evaluation also showed that the system should be equipped with additional features for optimal use. |
Roman Zweifel, Developing a Web Portal for Case Studies, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Master's Thesis)
In the last few years, the e-learning offerings of institutions of higher education have increased. They provide entire courses and learning materials to their students over the internet, which thereby makes an additional method of learning possible.
The more learning resources a portal contains, the more difficult it becomes for students to find the right materials and courses. Today the main approach is to offer a full-text search or a categorisation of course materials. Newer developments such as faceted browsing or semantic annotation are rarely used. This thesis describes the CasIS portal, which was developed for the master's studies in Computer Science at the University of Zurich. It provides case studies from different areas of Information Systems. Being able to find the right resources is very important for usability. With the aid of the Semantic Web and a faceted browsing tool, students can find the appropriate resources easily. |
Abraham Bernstein, Jayalath Ekanayake, Martin Pinzger, Improving defect prediction using temporal features and non linear models, In: Proceedings of the International Workshop on Principles of Software Evolution, IEEE Computer Society, Dubrovnik, Croatia, 2007-09-01. (Conference or Workshop Paper published in Proceedings)
Predicting the defects in the next release of a large software system is a very valuable asset for the project manager to plan her resources. In this paper we argue that temporal features (or aspects) of the data are central to prediction performance. We also argue that the use of non-linear models, as opposed to traditional regression, is necessary to uncover some of the hidden interrelationships between the features and the defects and maintain the accuracy of the prediction in some cases. Using data obtained from the CVS and Bugzilla repositories of the Eclipse project, we extract a number of temporal features, such as the number of revisions and number of reported issues within the last three months. We then use these data to predict both the location of defects (i.e., the classes in which defects will occur) as well as the number of reported bugs in the next month of the project. To that end we use standard tree-based induction algorithms in comparison with the traditional regression. Our non-linear models uncover the hidden relationships between features and defects, and present them in easy to understand form. Results also show that using the temporal features our prediction model can predict whether a source file will have a defect with an accuracy of 99% (area under ROC curve 0.9251) and the number of defects with a mean absolute error of 0.019 (Spearman’s correlation of 0.96). |
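The temporal features mentioned in this abstract (revisions and reported issues within the last three months) can be illustrated with a minimal sketch. It is not the authors' extraction pipeline: the event format, the function names `count_recent` and `temporal_features`, and the use of a 90-day window as a stand-in for "three months" are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def count_recent(events, as_of, window_days=90):
    """Count events per file inside the window (as_of - window_days, as_of].

    events: iterable of (file_name, event_date) pairs (hypothetical format).
    """
    start = as_of - timedelta(days=window_days)
    return Counter(name for name, day in events if start < day <= as_of)


def temporal_features(revisions, issues, as_of, window_days=90):
    """Build one feature row per file that had any activity in the window."""
    rev = count_recent(revisions, as_of, window_days)
    iss = count_recent(issues, as_of, window_days)
    files = sorted(set(rev) | set(iss))
    # Counter returns 0 for files missing from one of the two sources.
    return {f: {"revisions": rev[f], "issues": iss[f]} for f in files}


# Made-up example data, not taken from the Eclipse repositories.
revisions = [
    ("Parser.java", date(2007, 1, 10)),
    ("Parser.java", date(2007, 3, 1)),
    ("Lexer.java", date(2006, 6, 1)),  # too old, falls outside the window
]
issues = [("Parser.java", date(2007, 2, 15))]

features = temporal_features(revisions, issues, as_of=date(2007, 3, 31))
print(features)
```

Rows of this kind could then be passed to any learner; the abstract names tree-based induction algorithms, which this sketch does not attempt to reproduce.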
Philippe Hungerbühler, The Influence of SPAM on Performance, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Bachelor's Thesis)
Almost every Internet user knows about the problem of SPAM. At work, especially, it costs time
to sort out irrelevant emails. This thesis deals with the problem of SPAM and its consequences
on productivity at work. For this reason an experiment has been conducted to examine the distraction
of SPAM and its perception. A few hypotheses, stated in advance, have been reviewed
on the basis of this experiment. The results and the interpretation are presented and discussed in this
thesis. |
Michael Imhof, Entwicklung eines RDF Parsers für transaktionsbasierte Daten, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Bachelor's Thesis)
The Java RDF Parser (JRP) is a program that reads files in RDF format and extracts transactional data from them, which can then be stored in a database. This thesis is about the development of JRP and gives the reader an insight into the design of the code, the database schema and connection, as well as an evaluation of Jena, the Java library that is used to parse the data. Naturally, the program was tested with real data and in this way demonstrated that it works correctly. Unfortunately, nothing can yet be said about the scalability of the parser, because no large datasets were available for performance tests. |
Matthias Linherr, Data Mining auf Kundendaten, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Master's Thesis)
The aim of this thesis is to implement a platform to enable the alumni associations to
analyse their member-databases. Using statistical methods and data-mining algorithms,
this platform should allow the visualization and appraisal of member behaviour and
member structure. Four different alumni organisations utilise this platform in the form of a
web-based application to maintain their databases. They form the base of the following
Katharina Reinecke, Abraham Bernstein, Culturally Adaptive Software: Moving Beyond Internationalization, In: Proceedings of the HCI International (HCII), Springer, Beijing, China, July 2007. (Conference or Workshop Paper)
So far, culture has played a minor role in the design of software. Our experience with imbuto, a program designed for Rwandan agricultural advisors, has shown that cultural adaptation increased efficiency, but was extremely time-consuming and, thus, prohibitively expensive. In order to bridge the gap between cost savings on the one hand, and international usability on the other, this paper promotes the idea of culturally adaptive software. In contrast to manual localization, adaptive software is able to acquire details about an individual's cultural identity during use. Combining insights from the related fields of international usability, user modeling and user interface adaptation, we show how research findings can be exploited for an integrated approach to automatically adapt software to the user's cultural frame. |
Sinja Helfenstein, Visualizing Labor Market Dynamics based on Social Security Records: A Combination of Temporal and Visual Data Mining, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Master's Thesis)
The goal of this thesis is the understanding of temporal patterns in the Austrian Social Security Database to derive labor market dynamics. As these structures are very complex, conventional data mining approaches turned out to be inadequate for interpretation and knowledge discovery. The main challenge is the intuitive representation of the time dimension. Therefore, we keep the time dimension by generating movies of concatenated probabilistic model visualizations. Using this combination of temporal and visual data mining allows us to identify various effects such as seasonal hiring cycles, gender and age-related employment dynamics, and demographic influences. |
Domenic Benz, Voraussage von Benutzerverhalten in dynamischen Umgebungen, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Master's Thesis)
The increasing proliferation of mobile phones has a significant influence on our daily lives. Although the increasing use of mobile devices has brought several advantages, it also has the negative effect of unwanted disturbances and interruptions. It is desirable that a mobile phone be able to adapt to the situation it is currently in. For such an adaptation to become possible, the mobile phone needs information about its current context. To achieve this goal, software was implemented that gathers data from a variety of sensors on a mobile phone. This software was then used in a prototype experiment, in which we try to determine whether it is possible to predict a user's activity and location based on the collected data. The software implemented in this thesis and the results of the experiment help to prepare and conduct follow-up experiments in the field of context awareness and human interruptibility research. |
Christoph Kiefer, Imprecise SPARQL: Towards a Unified Framework for Similarity-Based Semantic Web Tasks, In: Proceedings of 2nd Knowledge Web PhD Symposium (KWEPSY) colocated with the 4th Annual European Semantic Web Conference (ESWC), June 2007. (Conference or Workshop Paper)
This proposal explores a unified framework to solve Semantic Web tasks that often require similarity measures, such as RDF retrieval, ontology alignment, and semantic service matchmaking. Our aim is to see how far it is possible to integrate user-defined similarity functions (UDSF) into SPARQL to achieve good results for these tasks. We present some research questions, summarize the experimental work conducted so far, and present our research plan that focuses on the various challenges of similarity querying within the Semantic Web. |
Christoph Kiefer, Abraham Bernstein, Jonas Tappolet, Analyzing Software with iSPARQL, In: Proceedings of the 3rd International Workshop on Semantic Web Enabled Software Engineering (SWESE 2007), Springer, June 2007. (Conference or Workshop Paper)
Dennis Weiss, Mining Customer Networks and Inter-Product Relations in Internet / Digital Entertainment Provider Data, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Master's Thesis)
Today’s telecommunication companies have at their disposal large quantities of detailed transaction data. Methods of data mining can be utilized for generating information on product use, customer behaviour and interaction between customers. Cross-selling analyses, customer segmentation and social network analysis represent only some of the practices that can be employed to facilitate direct marketing procedures. This thesis illustrates an approach for identifying customer groups and their networks, from which management implications may be derived by means of propositional as well as relational data mining. In this context, triple play customers (i.e., subscribers of broadband internet, fixed-line telephony and digital TV) were segmented on the basis of data generated from product use. In addition, network analysis and the search for multi-relational patterns provided further insight into both customer types and their respective needs. |
Cathrin Weiss, Rahul Premraj, Thomas Zimmermann, Andreas Zeller, How Long will it Take to Fix This Bug?, In: Proceedings of the Fourth International Workshop on Mining Software Repositories, IEEE Computer Society, May 2007. (Conference or Workshop Paper)
Predicting the time and effort for a software problem has long been a difficult task. We present an approach that automatically predicts the fixing effort, i.e., the person-hours spent on fixing an issue. Our technique leverages existing issue tracking systems: given a new issue report, we use the Lucene framework to search for similar, earlier reports and use their average time as a prediction. Our approach thus allows for early effort estimation, helping in assigning issues and scheduling stable releases. We evaluated our approach using effort data from the JBoss project. Given a sufficient number of issue reports, our automatic predictions are close to the actual effort; for issues that are bugs, we are off by only one hour, beating naive predictions by a factor of four. |
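The prediction step described in this abstract (retrieve similar earlier reports, then average their effort) can be sketched as follows. This is only an illustration under stated assumptions: the paper uses the Lucene framework for retrieval, whereas this sketch swaps in a plain Jaccard token overlap, and the names `tokenize`, `jaccard`, `predict_effort` and the parameter `k` are hypothetical.

```python
import re
from statistics import mean


def tokenize(text):
    """Lower-case a report and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_effort(new_report, history, k=3):
    """Average the effort of the k most similar earlier reports.

    history: list of (report_text, effort_in_hours) pairs.
    Returns None when no earlier report shares a token with the new one.
    NOTE: Jaccard overlap is a stand-in for Lucene's similarity search.
    """
    query = tokenize(new_report)
    scored = [(jaccard(query, tokenize(text)), hours) for text, hours in history]
    scored = [(score, hours) for score, hours in scored if score > 0.0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    return mean(hours for _, hours in top) if top else None


# Made-up example reports and efforts, not data from the JBoss project.
history = [
    ("NullPointerException when saving session", 4.0),
    ("NullPointerException on session timeout", 6.0),
    ("Update documentation for installer", 1.0),
]
estimate = predict_effort("NullPointerException in session cache", history, k=2)
print(estimate)
```

Returning `None` when nothing similar is found mirrors the abstract's caveat that predictions depend on a sufficient number of earlier issue reports.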
Cathrin Weiss, Rahul Premraj, Thomas Zimmermann, Andreas Zeller, Predicting Effort to fix Software Bugs, In: Proceedings of the 9th Workshop Software Reengineering, May 2007. (Conference or Workshop Paper)