Contribution Details

Type Bachelor's Thesis
Scope Discipline-based scholarship
Title Design and Implementation of a System for Reproducible Machine and Deep Learning Models
Organization Unit
Authors
  • Viachaslau Berasneu
Supervisors
  • Alberto Huertas Celdran
  • Jan Von der Assen
  • Burkhard Stiller
Language
  • English
Institution University of Zurich
Faculty Faculty of Business, Economics and Informatics
Date 2023
Abstract Text In recent years, small and midsize enterprises (SMEs) have become increasingly reliant on technology but lag in cybersecurity investment. This leaves them vulnerable to malware attacks, which increasingly target companies rather than individuals and have great economic impact. This project proposes and implements a prototype tool that allows machine learning models to be trained, stored, and tested within the SecBox sandbox environment. Both classification and anomaly detection models are implemented with Scikit-learn, in order to predict known malware types (binary and multiclass classification) and to detect the presence of unseen malware in real time during SecBox execution. The models are trained on the system call and resource usage logs that the SecBox records during file execution, which are transformed into suitable formats using frequency-based and sequence-based data preprocessing. Model reproducibility is ensured by generating configuration files that reference the random seeds, the datasets used in training, and other model parameters, and that can be used to re-train the same model. To evaluate and compare model performance, each model type is tested in a realistic scenario, the execution of Monti ransomware within the SecBox, by creating a confusion matrix and calculating accuracy, precision, recall, and F1-score from the model predictions. The system call classifier models show the best performance when classifying Monti malware samples, and the project concludes by identifying several relevant research areas for further investigation.
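The reproducibility idea described in the abstract (a configuration file that records the random seed, a reference to the training data, and the model parameters, and that suffices to re-train the same model) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `dataset_fingerprint`, `train_with_config`, and `retrain_from_config`, the config keys, the choice of `RandomForestClassifier`, and the synthetic data standing in for frequency-based system call features are all assumptions made for illustration.

```python
import hashlib
import json

import numpy as np
from sklearn.ensemble import RandomForestClassifier


def dataset_fingerprint(X, y):
    """Hash the training arrays so a config can reference the exact data used."""
    digest = hashlib.sha256()
    digest.update(np.ascontiguousarray(X).tobytes())
    digest.update(np.ascontiguousarray(y).tobytes())
    return digest.hexdigest()


def train_with_config(X, y, seed, params):
    """Train a classifier and return it together with a JSON-serialisable config."""
    model = RandomForestClassifier(random_state=seed, **params)
    model.fit(X, y)
    config = {
        "model_type": "RandomForestClassifier",  # hypothetical key names
        "random_seed": seed,
        "dataset_sha256": dataset_fingerprint(X, y),
        "params": params,
    }
    return model, config


def retrain_from_config(X, y, config):
    """Re-train from a stored config, refusing to run on different data."""
    if dataset_fingerprint(X, y) != config["dataset_sha256"]:
        raise ValueError("dataset does not match the one recorded in the config")
    model = RandomForestClassifier(
        random_state=config["random_seed"], **config["params"]
    )
    model.fit(X, y)
    return model


# Synthetic stand-in for per-execution system call counts (assumption, not SecBox data).
rng = np.random.default_rng(0)
X = rng.integers(0, 50, size=(200, 8)).astype(float)
y = (X[:, 0] + X[:, 1] > 50).astype(int)

model, config = train_with_config(
    X, y, seed=42, params={"n_estimators": 20, "max_depth": 4}
)

# Round-trip the config through JSON, as if it had been written to and read from disk.
config_json = json.dumps(config, indent=2)
restored = retrain_from_config(X, y, json.loads(config_json))

# With the same seed, data, and parameters, both models should predict identically.
same_predictions = bool(np.array_equal(model.predict(X), restored.predict(X)))
```

Storing a hash of the data rather than the data itself keeps the config small while still letting the re-training step detect a mismatched dataset. A real system would likely also record library versions, since results can differ across Scikit-learn releases.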