Transforms the SHAP values of categorical variables in a data frame into one-hot encoded (wide) format. Renames the "BIAS" column to lowercase "bias".

Usage

shap_to_onehot(shap, wide_input_frame, iblm_model)

Arguments

shap

Data frame containing raw SHAP values from XGBoost.

wide_input_frame

Wide format input data frame (one-hot encoded).

iblm_model

Object of class 'iblm'.

Value

A data frame in which the SHAP values for categorical variables are in wide format. The "bias" column is moved to the start.
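
To illustrate the idea of the wide format, the following base R sketch spreads one categorical SHAP column across its one-hot columns. It is not the package's implementation: the toy data, the prefix-based column matching, and the choice to multiply by the 0/1 indicators are all assumptions made for illustration, chosen to be consistent with the output shown under Examples, where each row has at most one non-zero value per categorical variable.

```r
# Hypothetical toy inputs (not taken from the package data).
shap <- data.frame(
  BIAS   = c(-0.04, -0.04, -0.04),
  VehAge = c(-0.12, -0.19, 0.06),
  Area   = c(-0.018, 0.016, 0.003)  # one SHAP column for the whole factor
)
wide_input <- data.frame(
  VehAge = c(3, 10, 1),
  AreaA  = c(0, 1, 0),  # 0/1 indicators for the observed level
  AreaD  = c(0, 0, 1),
  AreaE  = c(1, 0, 0)
)

# Assumption: one-hot columns are named "<variable><level>".
area_cols <- grep("^Area", names(wide_input), value = TRUE)

# Each row's SHAP value lands in the column of its observed level,
# and every other level gets zero.
area_wide <- as.data.frame(
  lapply(wide_input[area_cols], function(indicator) indicator * shap$Area)
)

# Rename "BIAS" to "bias" and place it first.
shap_wide <- data.frame(bias = shap$BIAS, shap["VehAge"], area_wide)

# The per-row total is unchanged by the reshaping.
all.equal(rowSums(shap_wide), rowSums(shap))
```

Under these assumptions the reshaping only redistributes each value, so the row sums of the wide frame equal those of the original SHAP frame.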

Examples

df_list <- freMTPLmini |> split_into_train_validate_test(seed = 9000)

iblm_model <- train_iblm_xgb(
  df_list,
  response_var = "ClaimRate",
  family = "poisson"
)

shap <- extract_booster_shap(iblm_model$booster_model, df_list$test)

wide_input_frame <- data_to_onehot(df_list$test, iblm_model)

shap_wide <- shap_to_onehot(shap, wide_input_frame, iblm_model)

shap_wide |> dplyr::glimpse()
#> Rows: 3,764
#> Columns: 24
#> $ bias          <dbl> -0.03920801, -0.03920801, -0.03920801, -0.03920801, -0.0…
#> $ VehPower      <dbl> -9.513386e-03, -1.709264e-02, 1.250970e-02, 5.950155e-03…
#> $ VehAge        <dbl> -0.1181081235, -0.1903213263, 0.0632225871, 0.0886134803…
#> $ DrivAge       <dbl> -0.0106765414, 0.0178347062, -0.0178089067, 0.1007688642…
#> $ BonusMalus    <dbl> -0.009905483, 0.035999734, -0.025095381, 0.007291305, -0…
#> $ AreaA         <dbl> 0.000000000, 0.016070237, 0.000000000, 0.000000000, 0.00…
#> $ AreaB         <dbl> 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.000000…
#> $ AreaC         <dbl> 0.000000000, 0.000000000, 0.000000000, -0.009702625, -0.…
#> $ AreaD         <dbl> 0.000000000, 0.000000000, 0.003410609, 0.000000000, 0.00…
#> $ AreaE         <dbl> -0.017667860, 0.000000000, 0.000000000, 0.000000000, 0.0…
#> $ AreaF         <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
#> $ VehBrandB1    <dbl> 0.000000000, -0.074893273, 0.000000000, 0.000000000, 0.0…
#> $ VehBrandB10   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
#> $ VehBrandB11   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
#> $ VehBrandB12   <dbl> -0.06775799, 0.00000000, 0.00000000, 0.00000000, 0.00000…
#> $ VehBrandB13   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
#> $ VehBrandB14   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
#> $ VehBrandB2    <dbl> 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 
#> $ VehBrandB3    <dbl> 0.0000000, 0.0000000, -0.0293063, 0.0000000, 0.0000000, 
#> $ VehBrandB4    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
#> $ VehBrandB5    <dbl> 0.00000000, 0.00000000, 0.00000000, 0.04647116, 0.000000…
#> $ VehBrandB6    <dbl> 0.000000000, 0.000000000, 0.000000000, 0.000000000, 0.00…
#> $ VehGasDiesel  <dbl> 0.0016007025, 0.0040426101, 0.0229403824, 0.0000000000, 
#> $ VehGasRegular <dbl> 0.000000000, 0.000000000, 0.000000000, -0.006539818, 0.0…