kvmboard.blogg.se

German credit data set arff download









This research aimed at the case of customers’ default payments in Taiwan and compared the predictive accuracy of the probability of default among six data mining methods. From the perspective of risk management, the predictive accuracy of the estimated probability of default is more valuable than the binary classification result (credible or not credible clients). Because the real probability of default is unknown, this study presented the novel “Sorting Smoothing Method” to estimate it. With the real probability of default as the response variable (Y) and the predictive probability of default as the independent variable (X), the simple linear regression result (Y = A + BX) shows that the forecasting model produced by the artificial neural network has the highest coefficient of determination; its regression intercept (A) is close to zero, and its regression coefficient (B) is close to one. Therefore, among the six data mining techniques, the artificial neural network is the only one that can accurately estimate the real probability of default.

This research employed a binary variable, default payment (Yes = 1, No = 0), as the response variable. This study reviewed the literature and used the following 23 variables as explanatory variables:

X1: Amount of the given credit (NT dollar): it includes both the individual consumer credit and his/her family (supplementary) credit.
X3: Education (1 = graduate school; 2 = university; 3 = high school; 4 = others).
X4: Marital status (1 = married; 2 = single; 3 = others).
We tracked the past monthly payment records (from April to September, 2005) as follows: X6 = the repayment status in September, 2005; X7 = the repayment status in August, 2005; ...; X11 = the repayment status in April, 2005. The measurement scale for the repayment status is: -1 = pay duly; 1 = payment delay for one month; 2 = payment delay for two months; ...; 8 = payment delay for eight months; 9 = payment delay for nine months and above.
X12-X17: Amount of bill statement (NT dollar). X12 = amount of bill statement in September, 2005; X13 = amount of bill statement in August, 2005.
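The regression check described above (Y = A + BX, with A near zero and B near one indicating well-calibrated probabilities) can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name fit_line is hypothetical, the numbers are made up for the example and are not results from the study, and the Y values would in practice come from the Sorting Smoothing Method, which is not specified here.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x.

    Returns (a, b, r_squared). Hypothetical helper, not from the study.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b = sxy / sxx                      # regression coefficient B
    a = y_bar - b * x_bar              # regression intercept A
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / syy            # coefficient of determination
    return a, b, r2


# Made-up illustrative values, NOT data or results from the study.
predicted = [0.05, 0.10, 0.20, 0.35, 0.50, 0.70]       # X: model's predicted probability
estimated_real = [0.06, 0.09, 0.22, 0.33, 0.52, 0.69]  # Y: estimated real probability

a, b, r2 = fit_line(predicted, estimated_real)
print(f"A = {a:.3f}, B = {b:.3f}, R^2 = {r2:.3f}")
```

A model whose predicted probabilities track the estimated real probabilities closely would give a high R^2 together with A close to 0 and B close to 1; a model that ranks clients well but outputs badly scaled probabilities could still score well on classification while failing this check.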









