H C Gall, G Reif, ICSE 2009 Tutorial - Semantic Web Technologies in Software Engineering, In: 31st International Conference on Software Engineering, 2009-05-16. (Conference or Workshop Paper)
Over the years, the software engineering community has developed various tools to support the specification, development, and maintenance of software. Many of these tools use proprietary data formats to store artifacts, which hampers interoperability. On the other hand, the Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. Ontologies are used to define the concepts in the domain of discourse and their relationships, and as such provide the formal vocabulary applications use to exchange data. Beyond the Web, the technologies developed for the Semantic Web have proven useful in other domains as well, especially when data is exchanged between applications from different parties. Software engineering is one of these domains, and recent research shows that Semantic Web technologies are able to reduce the barriers of proprietary data formats and enable interoperability.
In this tutorial, we present Semantic Web technologies and their application in software engineering. We discuss the current status of ontologies for software entities, bug reports, and change requests, as well as semantic representations for software and its documentation. This way, architecture, design, code, or test models can be shared across application boundaries, enabling a seamless integration of engineering results. |
|
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, Brendan Murphy, Does distributed development affect software quality? An empirical case study of Windows Vista, In: 31st International Conference on Software Engineering, IEEE, Vancouver, 2009-05-16. (Conference or Workshop Paper published in Proceedings)
It is widely believed that distributed software development is riskier and more challenging than collocated development. Prior literature on distributed development in software engineering and other fields discusses various challenges, including cultural barriers, expertise transfer difficulties, and communication and coordination overhead. We evaluate this conventional belief by examining the overall development of Windows Vista and comparing the post-release failures of components that were developed in a distributed fashion with those that were developed by collocated teams. We found a negligible difference in failures. This difference becomes even less significant when controlling for the number of developers working on a binary. To investigate whether less complex components are more distributed, we also examine component characteristics such as code churn, complexity, dependency information, and test code coverage, and find very little difference between distributed and collocated components. Further, we examine the software process and phenomena that occurred during the Vista development cycle and present ways in which the development process used may be insensitive to geography by mitigating the difficulties identified in prior work in this area. |
|
Sandro Boccuzzo, H C Gall, CocoViz with ambient audio software exploration, In: 31st International Conference on Software Engineering, IEEE, Vancouver, 2009-05-16. (Conference or Workshop Paper)
For ages, we have used our ears side by side with our eyes to gather additional information, leading and supporting us in what we see. Nowadays, numerous software visualization techniques exist that aim to facilitate program comprehension. In this paper, we discuss how we can support such software comprehension visualizations with ambient audio and lead users to identify relevant aspects. We use the cognitive visualization techniques and audio concepts described in our previous work to create an ambient audio software exploration (AASE) out of program entities (packages, classes, etc.) and their mapped properties. The concepts were implemented in an extended version of our tool CocoViz. Our first results with the prototype show that with this combination of visual and aural means we can provide additional information to guide users during program comprehension tasks. |
|
Matthias Hert, Gerald Reif, Harald Gall, Personal Knowledge Mapping with semantic web technologies, In: 1st International Workshop on Personal Knowledge Management at the 5th Conference on Professional Knowledge Management, 2009-03-25. (Conference or Workshop Paper published in Proceedings)
Semantic Web technologies promise great benefits for Personal Knowledge Management (PKM) and Knowledge Management (KM) in general when data needs to be exchanged or integrated. However, the Semantic Web also introduces new issues rooted in its distributed nature as multiple ontologies exist to encode data in the Personal Information Management (PIM) domain. This poses problems for applications processing this data as they would need to support all current and future PIM ontologies. In this paper, we introduce an approach that decouples applications from the data representation by providing a mapping service which translates Semantic Web data between different vocabularies. Our approach consists of the RDF Data Transformation Language (RDTL) to define mappings between different but related ontologies and the prototype implementation RDFTransformer to apply mappings. This allows the definition of mappings that are more complex than simple one-to-one matches. |
|
Marc Körsgen, Analysis Broker, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2009. (Bachelor's Thesis)
Today, software analysis is a common instrument in the software development environment. As a result, many software analysis tools have been developed. Ghezzi et al. [GG08] from the Department of Informatics of the University of Zurich focused on the problem that the possibility to access and combine these tools is strongly limited. Their approach is to hide each tool behind a web service interface and publish these on a centralized broker. Furthermore, these analysis services are semantically annotated and can thus be categorized according to a predefined Software Analysis Taxonomy.
This bachelor thesis deals with the development of a software prototype that implements the desired features, such as indexing and categorizing analysis services. The goal is to implement a web service that delivers these features. Besides the web service, a Flash-based visualization that accesses the service should give human users the possibility to administrate and browse the service catalog.
At the beginning of this paper, the Software Analysis Taxonomy, according to which the analysis services are categorized, is defined and extended to satisfy the needs of storing web services and their parameters. Furthermore, a structure describing how the analysis services have to be specified in their SAWSDL is defined.
The main part of this thesis focuses on the actual implementation of the Analysis Broker. It gives an overview of the concepts behind the catalog service and its visualization. On the basis of several examples, the paper illustrates the functionality and the design decisions. |
|
Franziska Schait, iChart, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2009. (Bachelor's Thesis)
The goal of this thesis is to create a dynamic, updating chart widget to display frequently changing,
web based data. The widget shall be designed in iWidget format so that it can be embedded
into IBM Lotus Mashup Center. The widget extracts predefined values from a web based XML
data source via XPath and stores them as a JSON array. Using the dojo toolkit, the widget creates
a chart displaying the previously extracted values. The chart updates at a user-defined interval.
Thus, dynamic values of multiple attributes can be displayed over time. The values
can be displayed either as a line chart or as a bar chart. To conclude the thesis, the functionality
of the widget is validated. |
|
H C Gall, Beat Fluri, Martin Pinzger, Change analysis with evolizer and ChangeDistiller, IEEE Software, Vol. 26 (1), 2009. (Journal Article)
|
|
Michael Jehle, Kevin Leopold, Linard Moll, Anthony Lymer, Software Evolution Recognition and Visualization Information Service, No. IFI-2009.06, Version: 1, 2009. (Technical Report)
|
|
Jan Bielik, SEON - Designing Software Engineering Ontologies, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2009. (Master's Thesis)
In the field of informatics an ontology provides a commonly shared vocabulary for a domain of discourse, which is used in order to retain the intended meaning of data.
The aim of this thesis is to describe a structured set of software engineering ontologies that represents the data found in revision control systems, issue tracking systems and the static source code information underlying object-oriented programming languages. It presents an approach to an object-oriented metrics library, which identifies design disharmonies in object-oriented software projects from annotations created with the software engineering ontologies presented in this thesis.
For this purpose, we first present the state of the art in ontology development. We stress the need for comprehensive methodologies and present in detail the available ones that are frequently applied and most intuitive to use. These methodologies should enable a straightforward development of ontologies that are commonly shared and widely used in their specific domain of discourse. Next, we analyze existing ontologies in the field of software engineering with regard to their suitability for integration into our software engineering ontologies.
In a second stage, we outline the various requirements that our software engineering ontologies should fulfill and describe the system implementations and the object-oriented programming languages that are investigated for their representations in OWL. We present SEON, our structured set of software engineering ontologies. These ontologies describe the data of revision control systems, issue tracking systems and the static source code information of object-oriented programming languages. Our software engineering ontologies are subsequently used to annotate these kinds of information for further software analyses.
Finally, we present our object-oriented metrics library, which evaluates the design of object-oriented software projects on the basis of data annotated with our software engineering ontologies and points out possible weaknesses in the design of these software projects. |
|
Sebastian Müller, Change Prism: A Java Visualization for Software Changes, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2009. (Bachelor's Thesis)
There are already some visualizations available on the open market that are able to visualize source code files, packages, and authors in a three-dimensional space. However, none of these visualizations focus on the individual developer and his influence on the entire project, or on the relationship between the work of different developers. The visualization described in this Bachelor thesis aims to do this.
Inspired by blank, a new visualization was developed. The visualization follows the same principle in order to map the code files which a developer has modified into a three-dimensional diagram. This offers the opportunity to uncover the "path" a developer takes through the software system.
The idea was implemented in Java as a standalone application. The necessary metadata about the developers, classes and packages are extracted from a version control system via an interface from a local database.
The subsequent analysis of the developed visualization has shown that interpreting the various "developer paths" through the system allows valid conclusions to be drawn about the origin and meaning of individual parts (classes and packages) in relation to the whole system. It was also possible to discover dependencies between individual developers, which suggests that software developers influence each other in their work. |
|
Michael Küchler, Coupling in ChangePrism: Enhancing ChangePrism with Coupling, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2009. (Bachelor's Thesis)
Coupling describes dependencies between different modules of a software project, which can be an important factor with respect to past and future development. Still, ChangePrism, an application developed at the s.e.a.l. lab of the University of Zurich for the visualization of software development, so far visualizes the development of software only on the basis of data collected by a versioning system. This thesis describes an enhancement of ChangePrism that lets the tool include information about coupling in its visualization. The new data used by ChangePrism is a FAMIX model of the analyzed project's sources, and the existing visualization is enriched with that data in different ways. The new functionality of ChangePrism was finally evaluated by analyzing the Eclipse JDT Core project. |
|
Sazzadul Alam, Sandro Boccuzzo, Richard Wettel, Philippe Dugerdil, Harald Gall, Michele Lanza, EvoSpaces - Multi-dimensional Navigation Spaces for Software Evolution, In: Human Machine Interaction, Springer, Berlin, p. 167 - 192, 2009. (Book Chapter)
In software development, a major difficulty comes from the intrinsic complexity of software systems, whose size can easily reach millions of lines of code. But software is an intangible artifact that does not have any natural visual representation. While many software visualization techniques have been proposed in the literature, they are often difficult to interpret. In fact, the user of such representations is confronted with an artificial world that contains and represents intangible objects. The goal of our EVOSPACES project was to investigate effective visual metaphors (i.e., analogies) between natural objects and software objects so that we can exploit the cognitive understanding of the user. The difficulty of the approach is that the common-sense expectations about the displayed world should also apply to the world of software objects. To solve this common-sense representation problem for software objects, our project addressed both the small scale (i.e., the level of individual objects) and the large scale (i.e., the level of groups of objects). After many experiments we settled on a "city" metaphor: at the small scale we included different houses and their shapes as visual objects to cover size, structure, and history. At the large scale we arrange the different types of houses in districts and include their history in diverse layouts. The user is then able to use the EVOSPACES virtual software city to navigate and explore all kinds of aspects of a city and its houses: size, age, historical evolution, changes, growth, restructuring, and evolution patterns such as code smells or architectural decay. To this end, we have developed a software environment named EVOSPACES as a plug-in to Eclipse so that visual metaphors can quickly be implemented in an easily navigable virtual space. Due to the large amount of information, we complemented the flat 2D world with a full-fledged immersive 3D representation.
In this virtual software city, the dimensions and appearance of the buildings can be set according to software metrics. The user of the EVOSPACES environment can then explore a given software system by navigating through the corresponding virtual city. |
|
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, Brendan Murphy, Does distributed development affect software quality? an empirical case study of Windows Vista, Communications of the ACM, Vol. 52 (8), 2009. (Journal Article)
Existing literature on distributed development in software engineering and other fields discusses various challenges, including cultural barriers, expertise transfer difficulties, and communication and coordination overhead. Conventional wisdom, in fact, holds that distributed software development is riskier and more challenging than collocated development. We revisit this belief, empirically studying the overall development of Windows Vista and comparing the post-release failures of components that were developed in a distributed fashion with those that were developed by collocated teams. We found a negligible difference in failures. This difference becomes even less significant when controlling for the number of developers working on a binary. Furthermore, we also found that component characteristics (such as code churn, complexity, dependency information, and test code coverage) differ very little between distributed and collocated components. Finally, we examine the software process used during the Vista development cycle and examine how it may have mitigated some of the difficulties of distributed development identified in prior work in this area. |
|
Tsuyoshi Ito, Das Evolutionäre Lernspiel-Konzept: Eine Kombination aus Game-based Learning und Web 2.0, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2009. (Dissertation)
Continuous knowledge acquisition, so-called lifelong learning, is regarded in today's society as indispensable for remaining competitive, both during education and throughout one's professional career. According to the unanimous opinion of educational researchers, this requires new learning methods that support learners better than classical ones. One such innovative learning method is game-based learning, which uses playful components to intrinsically motivate learners to engage with the learning content. However, the development and production of such learning games is considered very costly and is not yet established. This frequently leads to losses in quality, attributable to budgets that are too small for the implementation or to superficial cooperation between learning content producers and game designers.
This thesis therefore presents and evaluates a concept that, in the style of current web platforms, integrates user-generated content as well as feedback and rating functions into a learning game, in order to lower development costs on the one hand and to create emergent entertainment on the other. The latter arises from the creativity of active users. This concept was inspired by Mitchel Resnick's constructivist Lifelong Kindergarten approach and is based on the idea that learners understand and can apply the learning content by creating content themselves. |
|
K Wahler, A framework for integrated process and object life cycle modeling, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2009. (Dissertation)
|
|
Marc Vontobel, Purple Leaf - Evaluation of the Adoption of New Features in a Web-Based Social Network, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2008. (Bachelor's Thesis)
Purple Leaf is a social network that offers its members several possibilities to personalize their exclusive events by providing them with unique online services. After the size of our platform suddenly increased from the 300 initially invited guests to a multiple of that, we were obliged to completely revise the platform and enlarge our range of services. To embed these new services smoothly into the existing web presence, we fully restructured the application and moved it to a modern web framework. After that makeover, we designed five further services with which we aimed to increase the customer loyalty and the entertainment value of our platform. Because new features are often not instantly accepted by existing users, we developed an integrated concept for boosting the acceptance of novel functionality. This concept is based on the technology acceptance model developed by Davis (1986). The model postulates that the actual use of a new feature is based solely on external factors: on the one hand, factors that influence the 'perceived ease of use', and on the other hand, factors that have an impact on the 'perceived usefulness'. In order to foster the perceived ease of use, we developed several usability concepts and tried to figure out how Web 2.0 features can help to simplify different processes. Besides the creation of intuitive user interfaces and simple procedures, we worked on an elaborate data and application structure that itself also contributed substantially to the simplicity of the new functionality. After we had embedded the services into our Internet portal, we started to analyze the acceptance of one new feature: 'The Most Favored Guest'. This service allows every signed-up member to define a personal list of favored guests for an upcoming event. Once the selected users are informed about their selection, they, in turn, have the chance to define their own list.
After a first round of selection, we tried to boost the personal acceptance of our members by providing specific incentives. Besides the active interventions in the process of adoption, we also analyzed a passive phenomenon: does some kind of peer pressure exist within virtual cliques? If so, some interesting changes in common marketing strategies might emerge that could narrow down the target audience to a few single users of the network. In addition, we visualized some of the encountered situations and put them together in an illustrated book as a supplement to this paper. |
|
Jonas Zuberbühler, Change Commander, Recommending Corrective Method Invocation Changes, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2008. (Master's Thesis)
Our investigations of bug fixes in Eclipse showed that a significant number of bugs were fixed by moving invocations of certain methods into the then-part or else-part of if-statements with similar conditions. Based on this finding, we leverage such context changes applied in the past to support developers while they add invocations of the same method.
In this thesis we describe a recommendation system approach. We leverage the fine-grained change information extracted by ChangeDistiller from the development history of a software project and extract patterns among method calls that were moved into an if-statement. Based on these patterns, we prepare suitable context change recommendations for method invocations. In doing so, we aim to assist the software development process and to help prevent bugs.
Furthermore, we present ChangeCommander, an Eclipse plug-in that implements our approach to recommend insertions of particular if-statements before calling a method. ChangeCommander is seamlessly integrated into the build process of the Eclipse IDE. It provides visual feedback on recommendations to the developer, presents context change suggestions by highlighting affected method invocations in the source code, and provides automated code adaptation support.
To demonstrate the usefulness of our approach, we conduct a simulation of our recommender on several software systems. For each system we can suggest context changes that were applied to fix a bug. This demonstrates that we can support the development process. |
|
Beat Fluri, Change distilling. Enriching software evolution analysis with fine-grained source code change histories, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2008. (Dissertation)
Software systems have to evolve over their life cycle or they become progressively less useful. The reasons why software is continuously changed are manifold: features are added or adapted because of changing requirements; bugs have to be fixed because of faults in the software; or the software has to be migrated because of modernization. One negative effect of this continuing change is the software aging phenomenon. As software is changed by people unaware of the initial design concepts and, mostly, under time pressure, it becomes larger, more complex, and less understandable. As a result, in the last decade, several techniques have been developed to understand the negative impact of continuing change by analyzing change in general and source code change in particular.
The approaches developed so far suffer from the coarse-grained information available for changes. They rely on data provided by versioning systems, which keep track of changes by storing the text differences of a particular file. Changes at the level of source code entities are not considered. In addition, a precise definition and a classification of source code changes are still missing. Both are key to extracting and analyzing source code changes, and eventually understanding the negative impact of continuing change. We therefore claim: extracting, classifying, and analyzing fine-grained source code changes from the history of software systems provides useful insights into problems of continuing change and can identify support mechanisms to reduce them.
The key contribution of this dissertation is change distilling, a methodology to define, classify, extract, and analyze fine-grained source code changes. Change distilling provides a taxonomy of source code changes which defines source code change types according to tree edit operations in the abstract syntax tree. Our change distilling algorithm applies tree differencing pairwise on subsequent versions of abstract syntax trees to extract the tree edit operations.
We provide three empirical experiments to show the benefits of extracting fine-grained source code change types. First, we analyze the source code and comment co-change behavior in the evolution of eight software systems. We show that in cases where comments are adapted to source code changes, the related changes happen in the same revision. We also show that in half of these software systems API comments are adapted several revisions after the source code change happened.
Second, we explore whether certain change types appear frequently together. For that we use hierarchical agglomerative clustering to discover change type patterns and present a catalogue of change type patterns. The results from a commercial software system show that certain control flow changes are due to source code cleanup activities, that exception flow is used differently in different system parts, and that API convention changes are spread over many releases.
Third, we investigate whether methods exist whose invocations are significantly more affected by context and update changes than other methods, and whether we can reveal change patterns among these invocation changes. We develop an approach that ranks how often context and update changes were applied to invocations of a particular method and whether these changes were bug fixes. In addition, we extract patterns of context and update changes to assess whether they can be used to provide valuable change suggestions.
The results of our three software evolution experiments provide enough evidence that the analysis of change types helps in understanding software evolution and provides means to support developers in their daily work. |
|
Beat Fluri, J Zuberbühler, H C Gall, Recommending method invocation context changes, In: International Workshop on Recommender Systems for Software Engineering (RSSE 2008), Association for Computing Machinery (ACM), New York, 2008-11-10. (Conference or Workshop Paper published in Proceedings)
Our investigations of bug fixes in Eclipse showed that a significant number of bugs were fixed by moving invocations of certain methods into the then-part or else-part of if-statements with similar conditions. Based on this finding, we leverage such context changes applied in the past to support developers while they add invocations of the same method. In this paper we present ChangeCommander, an Eclipse plugin that implements our approach to recommend insertions of particular if-statements before calling a method. ChangeCommander presents context change suggestions by highlighting affected method invocations in the source code and provides automated code adaptation support. |
|
Martin Pinzger, N Nagappan, B Murphy, Can Developer-Module Networks Predict Failures?, In: ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2008-11-09. (Conference or Workshop Paper published in Proceedings)
Software teams should follow a well-defined goal and keep their work focused. Work fragmentation is bad for efficiency and quality. In this paper we empirically investigate the relationship between the fragmentation of developer contributions and the number of post-release failures. Our approach is to represent developer contributions with a developer-module network that we call a contribution network. We use network centrality measures to measure the degree of fragmentation of developer contributions. Fragmentation is determined by the centrality of software modules in the contribution network. Our claim is that central software modules are more likely to be failure-prone than modules located in surrounding areas of the network. We analyze this hypothesis by exploring the network centrality of Microsoft Windows Vista binaries using several network centrality measures as well as linear and logistic regression analysis. In particular, we investigate which centrality measures are significant for predicting the probability and number of post-release failures. Results of our experiments show that central modules are more failure-prone than modules located in surrounding areas of the network. Results further confirm that the number of authors and the number of commits are significant predictors of the probability of post-release failures. For predicting the number of post-release failures, the closeness centrality measure is most significant. |
|