Studying the methods of data transformation in the context of increasing the effectiveness of credit scoring models

doi:10.33111/nfmte.2019.094

Neuro-Fuzzy Modeling Techniques in Economics

ISSN 2415-3516

Yuriy Kleban

Abstract: This article searches for the most effective approach to pre-processing borrower characteristics in order to improve the accuracy of predicting defaults on credit obligations. Three main ways of feeding data to the inputs of credit scoring models are analyzed: using the initial explanatory variables without transformation, converting categorical characteristics into a set of dummy variables, and binning the indicators with calculation of the weight of evidence (WOE) for each category.
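As an illustration of the second and third input formats, the sketch below encodes one categorical characteristic as dummy variables and as WOE values. It is a minimal sketch and not the author's code: the function names, the toy `housing` data and the sign convention WOE = ln(share of goods / share of bads) are assumptions, and every category is assumed to contain at least one good and one bad loan.

```python
import math
from collections import Counter

def one_hot(values):
    """Encode each category as a 0/1 indicator vector (dummy variables)."""
    levels = sorted(set(values))
    rows = [[1 if v == lvl else 0 for lvl in levels] for v in values]
    return rows, levels

def woe_table(values, defaults):
    """Return ({category: WOE}, information value).

    defaults[i] is 1 for a defaulted loan and 0 otherwise.
    Assumed convention: WOE = ln(share of goods / share of bads).
    """
    goods = Counter(v for v, d in zip(values, defaults) if d == 0)
    bads = Counter(v for v, d in zip(values, defaults) if d == 1)
    total_good, total_bad = sum(goods.values()), sum(bads.values())
    woe, iv = {}, 0.0
    for cat in sorted(set(values)):
        g = goods[cat] / total_good   # share of all goods in this category
        b = bads[cat] / total_bad     # share of all bads in this category
        # Assumes g > 0 and b > 0; empty cells would need smoothing or merging.
        woe[cat] = math.log(g / b)
        iv += (g - b) * woe[cat]
    return woe, iv

# Hypothetical toy data: housing status and default flag.
housing = ["own", "own", "own", "rent", "rent", "rent", "rent", "own"]
default = [0, 0, 1, 1, 1, 0, 1, 0]

dummies, levels = one_hot(housing)
woe, iv = woe_table(housing, default)
woe_encoded = [woe[h] for h in housing]   # one numeric input instead of two dummies
```

The dummy encoding produces one model input per category, whereas the WOE encoding keeps a single numeric input whose value already reflects the category's observed default risk.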

To draw conclusions about the systematic impact of these approaches, perceptron-type neural network models were built in 10 repeated iterations for each of the three methods of preparing input factors. Every scoring model was evaluated on a wide range of integral and point performance indicators.
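The abstract does not list the individual indicators, so the following is only a sketch of what an integral and a point measure could look like. The Gini coefficient (named in the keywords) is computed as 2·AUC − 1; accuracy at a fixed cutoff stands in for a point measure. The function names, the cutoff of 0.5 and the toy scores are assumptions.

```python
def auc(scores, defaults):
    """Probability that a random defaulter is scored above a random
    non-defaulter; ties count as one half."""
    bad = [s for s, d in zip(scores, defaults) if d == 1]
    good = [s for s, d in zip(scores, defaults) if d == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))

def gini(scores, defaults):
    """Integral measure: Gini coefficient derived from the AUC."""
    return 2 * auc(scores, defaults) - 1

def accuracy(scores, defaults, cutoff=0.5):
    """Point measure: share of correct decisions at one cutoff."""
    hits = sum((s >= cutoff) == bool(d) for s, d in zip(scores, defaults))
    return hits / len(defaults)

# Hypothetical predicted default probabilities and true outcomes.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
defaults = [1, 1, 1, 0, 0, 0]

model_gini = gini(scores, defaults)
model_accuracy = accuracy(scores, defaults)
```

An integral measure summarizes ranking quality over all cutoffs, while a point measure depends on the cutoff chosen, which is why the two kinds of indicator can disagree for the same model.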

By almost all criteria, the experimental results showed the advantage of the methodological approach to data pre-processing proposed by the author: dividing quantitative variables into categories while ensuring a trend in their weight-of-evidence values and observing restrictions on the number of observations in each group.
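The abstract names the two constraints (a WOE trend and a minimum group size) but not the procedure that enforces them. The sketch below is therefore not the author's algorithm; it is a simple greedy merge of adjacent bins, written under the same assumed WOE convention, ln(share of goods / share of bads). The function names, the `min_count` and `direction` parameters and the synthetic age data are all hypothetical.

```python
import math

def merge_with_next(bins, i):
    """Fuse bin i and bin i + 1 into a single bin, in place."""
    nxt = bins.pop(i + 1)
    bins[i] = {"upper": nxt["upper"],
               "good": bins[i]["good"] + nxt["good"],
               "bad": bins[i]["bad"] + nxt["bad"]}

def monotone_woe_bins(values, defaults, min_count, direction=1):
    """Greedy binning: direction=1 asks for increasing WOE, -1 for decreasing."""
    pairs = sorted(zip(values, defaults))
    total_bad = sum(d for _, d in pairs)
    total_good = len(pairs) - total_bad
    if total_good == 0 or total_bad == 0:
        raise ValueError("need at least one good and one bad loan")

    def woe(b):
        return math.log((b["good"] / total_good) / (b["bad"] / total_bad))

    # Start from one bin per distinct value so ties never straddle a cut point.
    bins = []
    for v, d in pairs:
        if not bins or bins[-1]["upper"] != v:
            bins.append({"upper": v, "good": 0, "bad": 0})
        bins[-1]["bad" if d else "good"] += 1

    while len(bins) > 1:
        # Constraint 1: every bin is large enough and holds both outcomes.
        weak = next((i for i, b in enumerate(bins)
                     if b["good"] + b["bad"] < min_count
                     or b["good"] == 0 or b["bad"] == 0), None)
        if weak is not None:
            merge_with_next(bins, min(weak, len(bins) - 2))
            continue
        # Constraint 2: WOE moves strictly in the requested direction.
        broken = next((i for i in range(len(bins) - 1)
                       if (woe(bins[i + 1]) - woe(bins[i])) * direction <= 0), None)
        if broken is None:
            break
        merge_with_next(bins, broken)

    return [dict(b, woe=woe(b)) for b in bins]

# Synthetic data: default rate falls with age, with some noise per age.
ages, flags = [], []
for age in range(20, 60):
    bad_count = 5 - (age - 20) // 10 + (age % 3) - 1
    for k in range(10):
        ages.append(age)
        flags.append(1 if k < bad_count else 0)

result = monotone_woe_bins(ages, flags, min_count=20)
```

Each returned bin carries its upper bound, its counts and its WOE. Merging only adjacent bins preserves the ordering of the quantitative variable, so the resulting WOE sequence can be read as a trend across its range.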

Key words: scoring model, neural network, creditworthiness, binning, weight of evidence (WOE), information value (IV), Gini coefficient

UDC: 519.86+330.46

JEL: C45 C51 C52 C53

To cite paper

In APA style

Kleban, Y. (2019). Studying the methods of data transformation in the context of increasing the effectiveness of credit scoring models. Neuro-Fuzzy Modeling Techniques in Economics, 8, 94-123. http://doi.org/10.33111/nfmte.2019.094

With transliteration

Kleban, Y. (2019) Doslidzhennia sposobiv transformatsii danykh v konteksti pidvyshchennia efektyvnosti modelei kredytnoho skorynhu [Studying the methods of data transformation in the context of increasing the effectiveness of credit scoring models]. Neuro-Fuzzy Modeling Techniques in Economics, no. 8. pp. 94-123. http://doi.org/10.33111/nfmte.2019.094 [in Ukrainian] (accessed 31 Oct 2025).

# 8 / 2019
