A reduced variance unsupervised ensemble learning algorithm based on modern portfolio theory

Unlu, Ramazan; Xanthopoulos, Petros

doi:10.1016/j.eswa.2021.115085

Unlu R., Xanthopoulos P.

EXPERT SYSTEMS WITH APPLICATIONS, vol. 180, 2021 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 180
Publication Date: 2021
DOI: 10.1016/j.eswa.2021.115085
Journal Name: EXPERT SYSTEMS WITH APPLICATIONS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Computer & Applied Sciences, INSPEC, Metadex, Public Affairs Index, Civil Engineering Abstracts
Keywords: consensus clustering, ensemble learning, internal quality measures, Markowitz's portfolio theory, CONSENSUS
Affiliated with Abdullah Gül University: Yes

Abstract

Unsupervised ensemble learning, or consensus clustering, has gained popularity due to its ability to combine multiple clustering solutions into a single solution that is robust and often performs better than the individual ones. Several approaches to consensus clustering exist, including voting and weighted voting schemes. Although several algorithms have been proposed for adjusting the weights of a consensus clustering, all of them are tuned based on some performance characteristic associated with clustering accuracy. In this paper, we propose a method for setting the weights that takes into consideration intra-algorithmic variability, i.e., the fact that some algorithms provide solutions with very different performance over multiple runs. The methodology is inspired by modern portfolio theory, and more specifically by the Markowitz model for asset allocation, in which one tries to identify the most efficient portfolio by solving a convex optimization problem. Here, efficiency is defined as the minimum amount of risk for a given expected return. We apply this method to different datasets and compare the results with respect to performance and robustness. The proposed scheme appears to achieve competitive average performance with very low variability.
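
The Markowitz-style weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each clustering algorithm is treated as an "asset" whose "return" is a quality score observed over repeated runs, and it solves the equality-constrained minimum-variance problem (weights sum to one, expected score equals a target) through its KKT linear system. The score values, the target, and the function names `mean_and_covariance`, `solve_linear`, `min_variance_weights`, and `portfolio_variance` are all hypothetical. Non-negativity of the weights is not enforced here, so some weights may come out negative; a full treatment would add that constraint and use a quadratic programming solver.

```python
from statistics import mean


def mean_and_covariance(scores):
    """Per-algorithm mean score and sample covariance across repeated runs."""
    k, n = len(scores), len(scores[0])
    mus = [mean(s) for s in scores]
    cov = [[sum((scores[i][t] - mus[i]) * (scores[j][t] - mus[j])
                for t in range(n)) / (n - 1)
            for j in range(k)] for i in range(k)]
    return mus, cov


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def min_variance_weights(mus, cov, target):
    """Minimize w' C w subject to sum(w) = 1 and mus' w = target.

    Stationarity of the Lagrangian gives 2 C w + l1 * mus + l2 * 1 = 0,
    which is stacked with the two constraints into one linear system.
    """
    k = len(mus)
    kkt = [[2 * cov[i][j] for j in range(k)] + [mus[i], 1.0] for i in range(k)]
    kkt.append(list(mus) + [0.0, 0.0])
    kkt.append([1.0] * k + [0.0, 0.0])
    rhs = [0.0] * k + [target, 1.0]
    return solve_linear(kkt, rhs)[:k]


def portfolio_variance(w, cov):
    """Variance w' C w of the weighted combination."""
    k = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(k) for j in range(k))


# Hypothetical quality scores of three clustering algorithms over five runs.
scores = [
    [0.80, 0.60, 0.90, 0.50, 0.70],  # high average, high variability
    [0.66, 0.64, 0.65, 0.67, 0.63],  # moderate average, very stable
    [0.55, 0.60, 0.58, 0.62, 0.60],  # lower average, fairly stable
]
mus, cov = mean_and_covariance(scores)
weights = min_variance_weights(mus, cov, target=0.66)
print("weights:", [round(w, 4) for w in weights])
print("variance:", portfolio_variance(weights, cov))
```

Sweeping `target` over a range of values and recording the resulting variance would trace out the analogue of an efficient frontier, from which a weight vector with an acceptable trade-off between average score and variability could be selected.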