Contribution Details

Type Master's Thesis
Scope Discipline-based scholarship
Title Work Task Classification from Job Ads onto O*NET: Hierarchy-Aware and Cross-lingual Transfer Approach
Organization Unit
Authors
  • Jinqiao Li
Supervisors
  • Ann-Sophie Gnehm
  • Simon Clematide
  • Martin Volk
Language
  • English
Institution University of Zurich
Faculty Faculty of Business, Economics and Informatics
Date 2023
Abstract Text This project applies a hierarchy-aware, cross-lingual approach to classifying job tasks (e.g., "Verpackungsarbeiten allgemein und in Medizinaltechnik", i.e., general packaging work, including in medical technology) from German job advertisements into the English-language O*NET ontology, a complex ontology with three hierarchical levels and fine-grained classes. Two methods, machine translation and multilingual models, are tested to bridge the language gap. The project consists of two sets of experiments: local classifier experiments using transformer-based models at each hierarchical level, and global hierarchical models trained on the O*NET data. The work yields several key findings. First, domain adaptation proves effective, with job-domain-specific language models outperforming general-domain models. Translation quality also influences classification performance, with DeepL outperforming the SJMM engine. Second, state-of-the-art models (TextRNN, TextRCNN, HMCN, HiAGM) are used as global hierarchical models for task classification. These models effectively incorporate hierarchical information and address inconsistencies and overfitting through recursive regularization. Furthermore, the best model configurations from both series of experiments are selected to generate predictions for job advertisement data, resulting in reliable classification under the O*NET hierarchical ontology. A human post-evaluation, conducted by a German-speaking domain expert, validates the accuracy of the models' predictions. Overall, although this project extensively tests the feasibility of hierarchy-aware classification models, the transformer-based flat model Job-GBERT proves to be the more suitable option for hierarchical classification of job advertisement data, given its domain specificity.