Contribution Details

Type Master's Thesis
Scope Discipline-based scholarship
Title Machine Learning Approach to Polkadot’s Validator Selection Algorithm
Organization Unit
Authors
  • Ben Domenic James Murphy
Supervisors
  • Claudio Tessone
  • Matija Piskorec
Language
  • English
Institution University of Zurich
Faculty Faculty of Business, Economics and Informatics
Date 2023
Abstract Text Polkadot’s validator selection process employs an iterative algorithm that depends on the size of the staking system. As Polkadot’s staking network grows, I propose a machine learning alternative to the current implementation that is less dependent on scale. The algorithm, sequential Phragmén, aims to reduce a graph of nominator-validator edges to a subset of validators, the active set, and to distribute the stake backing them as evenly as possible. The goal of this thesis is either to produce superior results, and thereby improve overall security, or to provide solutions of equal quality in less time. To achieve this goal, a pipeline is set up that gathers data and transforms it into a form suitable for machine learning models. Predictions are made and then adjusted to fit the requirements set by Polkadot. The adjusted results are scored and ultimately compared to the solutions found by sequential Phragmén. An analysis of the training data reveals that the active set remains highly static, with only 10 validators changing on average from era to era. This lack of diversity raises concerns regarding potential attack vectors for adversaries. Furthermore, many nominators were observed to act inefficiently. Many do not exercise their right to nominate up to 16 validators, which would maximize their chance of having a validator included in the active set. Additionally, many include validators that are not eligible targets. This occurs because nominators frequently neglect their duty to actively maintain their validator preferences: they set them once and do not update them. Validators that were once eligible become inactive (intentionally or unintentionally) yet remain part of the nominators’ preferences.
The prediction task was split into three models: the first predicts the next active set, the second predicts the total stake each validator receives, and the third predicts the individual stake distribution. The results show that the first two models train well and produce satisfactory results. However, the learning curves of the third model reveal a bias, which makes its predictions suboptimal. The source of the bias is likely the substantial change in target values caused by even a slight shift in the active set. We conclude that a supervised approach is unlikely to outperform sequential Phragmén under the described conditions. Therefore, we recommend exploring an unsupervised approach in further research. Furthermore, we recommend developing a tool for nominators, which could increase convenience and, as a consequence, the security of the overall staking system.
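
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of sequential Phragmén in its commonly described weighted form. It is an illustration only: it is neither the thesis's code nor Polkadot's production implementation. The function name `seq_phragmen`, its parameters, and the toy data are all hypothetical.

```python
from fractions import Fraction


def seq_phragmen(budgets, approvals, seats):
    """Illustrative sequential Phragmen (hypothetical helper, not Polkadot's code).

    budgets:   {nominator: stake}
    approvals: {nominator: set of validators the nominator backs}
    seats:     size of the active set to elect
    Returns (elected validators in election order, {(nominator, validator): stake}).
    """
    candidates = sorted({v for vs in approvals.values() for v in vs})
    # Total stake of all nominators approving each candidate.
    approval_stake = {
        c: sum(Fraction(budgets[n]) for n, vs in approvals.items() if c in vs)
        for c in candidates
    }
    load = {n: Fraction(0) for n in budgets}  # per-nominator load
    edge_load = {}                            # load carried on each (nominator, validator) edge
    elected = []

    for _ in range(min(seats, len(candidates))):
        best, best_score = None, None
        for c in candidates:
            if c in elected:
                continue
            # Score: the load level the backers of c would reach if c were elected.
            backed = sum(budgets[n] * load[n] for n, vs in approvals.items() if c in vs)
            score = (1 + backed) / approval_stake[c]
            if best_score is None or score < best_score:
                best, best_score = c, score
        elected.append(best)
        # Raise every backer of the winner to the winner's score.
        for n, vs in approvals.items():
            if best in vs:
                edge_load[(n, best)] = best_score - load[n]
                load[n] = best_score

    # Split each nominator's budget over its elected validators, proportional to edge load.
    stakes = {(n, c): budgets[n] * w / load[n] for (n, c), w in edge_load.items()}
    return elected, stakes


# Toy data (invented for illustration): three nominators, three validators, two seats.
budgets = {"n1": 10, "n2": 20, "n3": 30}
approvals = {"n1": {"A", "B"}, "n2": {"B", "C"}, "n3": {"A", "C"}}
elected, stakes = seq_phragmen(budgets, approvals, seats=2)
```

On this toy input the sketch elects C first and then A, and nominator n3 splits its 30 units evenly between the two. Each round rescans all remaining candidates and their edges, which illustrates the dependence on system size that the abstract mentions.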