On March 5th at 10 am in JAAC, Kyle Mayer will give a guest lecture about his role as a data analyst at Micron. Kyle has a bachelor's in engineering and a master's in physics.
Bio: Kyle Mayer is a Senior Data Analytics Engineer at Micron Technology with over 15 years of experience. He currently supports the Global Quality organization with engineering and data science projects. Kyle specializes in modeling complex physical processes, delivering insights that drive business value. Leveraging his expertise in semiconductor manufacturing, Kyle has developed tools and processes that have prevented over $100M in lost revenue. His passion for applying artificial intelligence to solve complex problems and develop automated systems highlights his commitment to innovation. In his spare time, Kyle enjoys spending time with his kids and going on hikes.
A simulated data set containing information on ten thousand customers. The aim here is to predict which customers will default on their credit card debt.
Fitting a logistic regression model with default as the response and balance as the predictor:
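A minimal sketch of such a fit, assuming the parsnip interface from tidymodels (which the later `predict(..., type = "prob")` calls and the `.pred_Yes` column suggest). The data frame `toy_customers` and its simulated values are hypothetical stand-ins, not the course data; only the names `logregfit`, `default`, and `balance` come from the notes.

```r
library(tidymodels)

set.seed(1)

# Hypothetical stand-in data: simulated balances and default labels
n <- 1000
toy_customers <- tibble(balance = runif(n, 0, 2500)) |>
  mutate(
    # Higher balance -> higher simulated chance of default
    default = factor(
      if_else(runif(n) < plogis(-10 + 0.0055 * balance), "Yes", "No"),
      levels = c("No", "Yes")
    )
  )

# Logistic regression with default as the response and balance as the predictor
logregfit <- logistic_reg() |>
  set_engine("glm") |>
  fit(default ~ balance, data = toy_customers)

# Coefficient table, then predicted class probabilities for a few rows
tidy(logregfit)
probs <- predict(logregfit, new_data = head(toy_customers), type = "prob")
probs
```

With the glm engine the second factor level ("Yes") is the modeled event, so `.pred_Yes` is the estimated probability of default.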
default: “Yes” is Positive

# obtain probability predictions
default_test_wprobs <- default_test |>
  mutate(
    knn_probs = predict(knnfit, new_data = default_test, type = "prob") |> pull(.pred_Yes),
    logistic_probs = predict(logregfit, new_data = default_test, type = "prob") |> pull(.pred_Yes)
  )
default_test_wprobs |> head() |> kable()

| default | student | balance | income | knn_probs | logistic_probs |
|---|---|---|---|---|---|
| No | No | 729.5265 | 44361.63 | 0 | 0.0012842 |
| No | Yes | 808.6675 | 17600.45 | 0 | 0.0019883 |
| No | Yes | 1220.5838 | 13268.56 | 0 | 0.0190870 |
| No | No | 237.0451 | 28251.70 | 0 | 0.0000843 |
| No | No | 606.7423 | 44994.56 | 0 | 0.0006514 |
| No | No | 286.2326 | 45042.41 | 0 | 0.0001107 |

threshold <- 0.5 # set threshold
default_test_wprobs <- default_test_wprobs |>
  mutate(
    knn_preds = as_factor(if_else(knn_probs > threshold, "Yes", "No")),
    logistic_preds = as_factor(if_else(logistic_probs > threshold, "Yes", "No"))
  )
default_test_wprobs |> head() |> kable()

| default | student | balance | income | knn_probs | logistic_probs | knn_preds | logistic_preds |
|---|---|---|---|---|---|---|---|
| No | No | 729.5265 | 44361.63 | 0 | 0.0012842 | No | No |
| No | Yes | 808.6675 | 17600.45 | 0 | 0.0019883 | No | No |
| No | Yes | 1220.5838 | 13268.56 | 0 | 0.0190870 | No | No |
| No | No | 237.0451 | 28251.70 | 0 | 0.0000843 | No | No |
| No | No | 606.7423 | 44994.56 | 0 | 0.0006514 | No | No |
| No | No | 286.2326 | 45042.41 | 0 | 0.0001107 | No | No |
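
The thresholding step above amounts to a simple decision rule. In the notation below, $\hat{p}(x)$ is a model's estimated probability of default for a customer $x$ and $t$ is the threshold; both symbols are introduced here for illustration and do not appear in the slides.

$$
\hat{y}(x) =
\begin{cases}
\text{Yes} & \text{if } \hat{p}(x) > t \\
\text{No} & \text{otherwise}
\end{cases}
$$

With $t = 0.5$, none of the six customers shown is predicted to default, since the largest estimated probability among them is about 0.019.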

threshold <- 0.1 # set threshold
default_test_wprobs <- default_test_wprobs |>
  mutate(knn_preds = as_factor(if_else(knn_probs > threshold, "Yes", "No")))
roc_metrics(default_test_wprobs, truth = default, estimate = knn_preds, event_level = "second") |> kable()

| .metric | .estimator | .estimate |
|---|---|---|
| accuracy | binary | 0.9060000 |
| sensitivity | binary | 0.7364341 |
| specificity | binary | 0.9116507 |
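
The definition of `roc_metrics` is not shown in these notes. Given the three rows it returns, one plausible reconstruction (an assumption, not the original code) is a yardstick `metric_set()` of accuracy, sensitivity, and specificity. The sketch below applies it to a small made-up tibble so the numbers can be checked by hand: sensitivity is the share of true "Yes" cases predicted "Yes", and specificity is the share of true "No" cases predicted "No".

```r
library(tidymodels)

# Assumed definition: the notes call roc_metrics() without showing where it comes from
roc_metrics <- metric_set(accuracy, sensitivity, specificity)

# Toy truth labels and predicted probabilities (illustrative values, not course data)
toy <- tibble(
  default = factor(c("No", "No", "No", "No", "Yes", "Yes", "Yes", "No"),
                   levels = c("No", "Yes")),
  probs = c(0.05, 0.20, 0.60, 0.10, 0.80, 0.30, 0.95, 0.40)
)

threshold <- 0.5
toy <- toy |>
  # Set the level order explicitly so it always matches the truth column
  mutate(preds = factor(if_else(probs > threshold, "Yes", "No"),
                        levels = c("No", "Yes")))

# event_level = "second" treats "Yes" (the second level) as the positive class
res <- roc_metrics(toy, truth = default, estimate = preds, event_level = "second")
res
```

On this toy data, two of the three defaulters are flagged (sensitivity 2/3) and four of the five non-defaulters are cleared (specificity 0.8), for an accuracy of 0.75. The sketch uses `factor(..., levels = ...)` rather than `as_factor()` only to keep the level order fixed even if no row is predicted "Yes"; that is a design choice of this example.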

threshold <- 0.9 # set threshold
default_test_wprobs <- default_test_wprobs |>
  mutate(knn_preds = as_factor(if_else(knn_probs > threshold, "Yes", "No")))
roc_metrics(default_test_wprobs, truth = default, estimate = knn_preds, event_level = "second") |> kable()

| .metric | .estimator | .estimate |
|---|---|---|
| accuracy | binary | 0.9685000 |
| sensitivity | binary | 0.0310078 |
| specificity | binary | 0.9997417 |
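
Comparing the two KNN results, raising the threshold from 0.1 to 0.9 drops sensitivity from about 0.74 to about 0.03 while specificity rises from about 0.91 to nearly 1. The sketch below shows the same trade-off on a small made-up example using base R only; the function `sens_spec()` and the toy vectors are hypothetical and are not part of the course code.

```r
# Toy labels and predicted probabilities of "Yes" (illustrative values only)
truth <- factor(c("No", "No", "No", "No", "Yes", "Yes", "Yes", "No"),
                levels = c("No", "Yes"))
probs <- c(0.05, 0.20, 0.60, 0.10, 0.80, 0.30, 0.95, 0.40)

# Sensitivity and specificity at one threshold, treating "Yes" as positive
sens_spec <- function(threshold, truth, probs) {
  pred_yes <- probs > threshold
  is_yes <- truth == "Yes"
  data.frame(
    threshold = threshold,
    sensitivity = sum(pred_yes & is_yes) / sum(is_yes),
    specificity = sum(!pred_yes & !is_yes) / sum(!is_yes)
  )
}

# Evaluate the same predictions at several thresholds
tradeoff <- do.call(
  rbind,
  lapply(c(0.1, 0.5, 0.9), sens_spec, truth = truth, probs = probs)
)
print(tradeoff)
```

As the threshold increases, fewer customers are labeled "Yes", so sensitivity can only fall or stay the same while specificity can only rise or stay the same.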













