A simulated data set containing information on ten thousand customers. The aim here is to predict which customers will default on their credit card debt.
Fitting a logistic regression model with default as the response and balance as the predictor:
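
A minimal sketch of such a fit, assuming the data are the Default data set shipped in the ISLR2 package and that parsnip's glm engine is used. The object name balance_fit is hypothetical; it is not the logregfit object used later, whose exact specification is not shown here.

```r
library(tidymodels)

# Assumption: the data are ISLR2::Default (10,000 simulated customers).
data("Default", package = "ISLR2")

# Logistic regression via parsnip; with the glm engine the second
# factor level of the response ("Yes") is the modelled event.
balance_fit <- logistic_reg() |>
  set_engine("glm") |>
  fit(default ~ balance, data = Default)

# Coefficient table on the log-odds scale
tidy(balance_fit)
```
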
default: “Yes” is Positive

| | Actual Positive/Event | Actual Negative/Non-event |
|---|---|---|
| Predicted Positive/Event | True Positive (TP) | False Positive (FP) |
| Predicted Negative/Non-event | False Negative (FN) | True Negative (TN) |
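
The four cells of this table are the building blocks of the usual binary classification metrics. As a reference sketch, using the standard textbook definitions:

$$
\begin{aligned}
\text{accuracy} &= \frac{TP + TN}{TP + FP + FN + TN} \\
\text{recall (sensitivity)} &= \frac{TP}{TP + FN} \\
\text{precision} &= \frac{TP}{TP + FP} \\
\text{specificity} &= \frac{TN}{TN + FP} \\
\text{NPV} &= \frac{TN}{TN + FN} \\
F_1 &= \frac{2\,TP}{2\,TP + FP + FN} \\
\text{MCC} &= \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\end{aligned}
$$
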

default_test_wpreds <- default_test |>
  mutate(
    knn_preds = predict(knnfit, new_data = default_test, type = "class")$.pred_class,
    logistic_preds = predict(logregfit, new_data = default_test, type = "class")$.pred_class
  )
default_test_wpreds |> head() |> kable()

| default | student | balance | income | knn_preds | logistic_preds |
|---|---|---|---|---|---|
| No | No | 729.5265 | 44361.63 | No | No |
| No | Yes | 808.6675 | 17600.45 | No | No |
| No | Yes | 1220.5838 | 13268.56 | No | No |
| No | No | 237.0451 | 28251.70 | No | No |
| No | No | 606.7423 | 44994.56 | No | No |
| No | No | 286.2326 | 45042.41 | No | No |

yardstick

yardstick is a package that ships with tidymodels and is meant for model evaluation. Its metric functions share a common form:

metricname(data, truth, estimate, ...)

Pass the column of true values in for truth and the predicted values in for estimate.

default_test_wpreds |>
  binary_metrics(truth = default, estimate = knn_preds, event_level = "second") |>
  kable()

| .metric | .estimator | .estimate |
|---|---|---|
| accuracy | binary | 0.9757500 |
| recall | binary | 0.3798450 |
| precision | binary | 0.7424242 |
| specificity | binary | 0.9956084 |
| npv | binary | 0.9796645 |
| mcc | binary | 0.5206828 |
| f_meas | binary | 0.5025641 |
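
binary_metrics is not defined in this excerpt. One plausible definition, offered as an assumption rather than the original code, is a yardstick metric_set bundling the seven metrics that appear in the output. The toy tibble below is made up purely to keep the sketch self-contained.

```r
library(yardstick)
library(tibble)

# Assumed definition: a metric set listing the metrics in the order
# they appear in the reported output.
binary_metrics <- metric_set(accuracy, recall, precision, specificity, npv, mcc, f_meas)

# Hypothetical toy data; factor levels are ordered No, Yes, like default.
lv <- c("No", "Yes")
toy <- tibble(
  truth = factor(c("Yes", "Yes", "No", "No", "No", "Yes"), levels = lv),
  pred  = factor(c("Yes", "No", "No", "No", "Yes", "Yes"), levels = lv)
)

# A single metric follows metricname(data, truth, estimate, ...).
recall(toy, truth = truth, estimate = pred, event_level = "second")

# The metric set is called the same way and returns one row per metric.
toy_metrics <- binary_metrics(toy, truth = truth, estimate = pred, event_level = "second")
toy_metrics
```

By default yardstick treats the first factor level as the event, so event_level = "second" is what makes "Yes" the positive class here.
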

default_test_wpreds |>
  binary_metrics(truth = default, estimate = logistic_preds, event_level = "second") |>
  kable()

| .metric | .estimator | .estimate |
|---|---|---|
| accuracy | binary | 0.9730000 |
| recall | binary | 0.3023256 |
| precision | binary | 0.6842105 |
| specificity | binary | 0.9953500 |
| npv | binary | 0.9771747 |
| mcc | binary | 0.4437097 |
| f_meas | binary | 0.4193548 |
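
As a sanity check, the reported estimates can be reproduced from confusion-matrix counts. The counts below are not given in the source: they are back-derived from the printed metrics under the assumption of a 4,000-row test set with 129 actual defaulters. The helper name metrics_from_counts is hypothetical.

```r
# Hypothetical helper: standard metrics from the four confusion-matrix counts.
metrics_from_counts <- function(tp, fp, fn, tn) {
  c(
    accuracy    = (tp + tn) / (tp + fp + fn + tn),
    recall      = tp / (tp + fn),
    precision   = tp / (tp + fp),
    specificity = tn / (tn + fp),
    npv         = tn / (tn + fn),
    mcc         = (tp * tn - fp * fn) /
      sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    f_meas      = 2 * tp / (2 * tp + fp + fn)
  )
}

# Assumed counts, inferred from the reported metrics (not stated in the source).
knn_metrics      <- metrics_from_counts(tp = 49, fp = 17, fn = 80, tn = 3854)
logistic_metrics <- metrics_from_counts(tp = 39, fp = 18, fn = 90, tn = 3853)

# Side-by-side comparison, rounded like the kable() output
round(rbind(knn = knn_metrics, logistic = logistic_metrics), 7)
```

If these inferred counts are right, both models are highly accurate mainly because defaults are rare: each correctly flags fewer than 40% of the actual defaulters, which is why recall, MCC, and F-measure are far lower than accuracy.
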