Optimal Clustering of Central Bank Role Profile Descriptions

Aidan Wade, Markus Hofmann

Abstract


The Central Bank of Ireland maintains a set of role profiles that are used when recruiting new staff, but which also contain information about current skill levels in the bank and could support project planning. The roles are created manually according to a semi-structured template, and their volume makes them increasingly hard to manage. This calls for an NLP solution that finds similar roles and applies an appropriate grouping. Different pre-processing and dimension reduction methods are tested using K-Means and Hierarchical Agglomerative Clustering (HAC), evaluated with the Davies-Bouldin and Silhouette clustering metrics. The results suggest an optimal number of clusters in the range 70 to 130, but the correct value is subjective and requires subject matter expertise.
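The evaluation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy role texts, the TF-IDF representation, the use of truncated SVD for dimension reduction, the Ward linkage, and the helper name `score_clusterings` are all assumptions made for illustration, using scikit-learn. The sketch clusters the reduced vectors with K-Means and HAC for several candidate cluster counts and records both metrics for each.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import davies_bouldin_score, silhouette_score

# Hypothetical toy role descriptions (not taken from the bank's role profiles).
roles = [
    "Senior economist responsible for monetary policy analysis and forecasting",
    "Economist supporting inflation forecasting and macroeconomic research",
    "Research economist modelling monetary policy transmission",
    "Software engineer building data pipelines and internal applications",
    "Developer maintaining databases and data pipelines for reporting",
    "Data engineer designing cloud applications and databases",
    "Banking supervisor assessing credit risk at regulated firms",
    "Supervisor reviewing capital adequacy and credit risk of banks",
    "Risk analyst inspecting regulated firms for prudential compliance",
    "HR partner managing recruitment and staff development",
    "Recruitment specialist coordinating hiring and onboarding of staff",
    "Learning adviser planning staff development and training",
]


def score_clusterings(texts, k_values, n_components=5, seed=0):
    """Score K-Means and HAC for each candidate k (hypothetical helper)."""
    # Assumed pre-processing: TF-IDF with English stop words removed.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    # Assumed dimension reduction: truncated SVD on the sparse TF-IDF matrix.
    svd = TruncatedSVD(n_components=n_components, random_state=seed)
    reduced = svd.fit_transform(tfidf)

    results = {}
    for k in k_values:
        models = {
            "kmeans": KMeans(n_clusters=k, n_init=10, random_state=seed),
            "hac": AgglomerativeClustering(n_clusters=k, linkage="ward"),
        }
        for name, model in models.items():
            labels = model.fit_predict(reduced)
            results[(name, k)] = {
                # Silhouette: higher is better, bounded by [-1, 1].
                "silhouette": silhouette_score(reduced, labels),
                # Davies-Bouldin: lower is better, non-negative.
                "davies_bouldin": davies_bouldin_score(reduced, labels),
            }
    return results


scores = score_clusterings(roles, k_values=range(2, 7))
for (name, k), metrics in sorted(scores.items()):
    print(f"{name:6s} k={k}  silhouette={metrics['silhouette']:.3f}  "
          f"davies_bouldin={metrics['davies_bouldin']:.3f}")
```

On a real corpus the candidate range would be much wider (the abstract reports 70 to 130), and the two metrics need not agree on a single best value, which is consistent with the abstract's point that the final choice needs subject matter expertise.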

Keywords


NLP, Clustering, K-Means, HAC, Role Descriptions
