brulee_multinomial_reg() fits a multinomial regression model.
Usage
brulee_multinomial_reg(x, ...)
# S3 method for default
brulee_multinomial_reg(x, ...)
# S3 method for data.frame
brulee_multinomial_reg(
  x,
  y,
  epochs = 20L,
  penalty = 0.001,
  mixture = 0,
  validation = 0.1,
  optimizer = "LBFGS",
  learn_rate = 1,
  momentum = 0,
  batch_size = NULL,
  class_weights = NULL,
  stop_iter = 5,
  verbose = FALSE,
  ...
)
# S3 method for matrix
brulee_multinomial_reg(
  x,
  y,
  epochs = 20L,
  penalty = 0.001,
  mixture = 0,
  validation = 0.1,
  optimizer = "LBFGS",
  learn_rate = 1,
  momentum = 0,
  batch_size = NULL,
  class_weights = NULL,
  stop_iter = 5,
  verbose = FALSE,
  ...
)
# S3 method for formula
brulee_multinomial_reg(
  formula,
  data,
  epochs = 20L,
  penalty = 0.001,
  mixture = 0,
  validation = 0.1,
  optimizer = "LBFGS",
  learn_rate = 1,
  momentum = 0,
  batch_size = NULL,
  class_weights = NULL,
  stop_iter = 5,
  verbose = FALSE,
  ...
)
# S3 method for recipe
brulee_multinomial_reg(
  x,
  data,
  epochs = 20L,
  penalty = 0.001,
  mixture = 0,
  validation = 0.1,
  optimizer = "LBFGS",
  learn_rate = 1,
  momentum = 0,
  batch_size = NULL,
  class_weights = NULL,
  stop_iter = 5,
  verbose = FALSE,
  ...
)
Arguments
- x
Depending on the context:
A data frame of predictors.
A matrix of predictors.
A recipe specifying a set of preprocessing steps created from recipes::recipe().
The predictor data should be standardized (e.g. centered or scaled).
- ...
Options to pass to the learning rate schedulers via set_learn_rate(). For example, the reduction or steps arguments to schedule_step() could be passed here.
- y
When x is a data frame or matrix, y is the outcome specified as:
A data frame with 1 factor column (with three or more levels).
A matrix with 1 factor column (with three or more levels).
A factor vector (with three or more levels).
- epochs
An integer for the number of epochs of training.
- penalty
The amount of weight decay (i.e., L2 regularization).
- mixture
The proportion of the lasso penalty (type: double, default: 0.0). A value of mixture = 1 corresponds to a pure lasso model, while mixture = 0 indicates ridge regression (a.k.a. weight decay).
- validation
The proportion of the data randomly assigned to a validation set.
- optimizer
The method used in the optimization procedure. Possible choices are 'LBFGS' and 'SGD'. Default is 'LBFGS'.
- learn_rate
A positive number that controls how quickly the model moves along the descent path. Values around 0.1 or less are typical. (optimizer = "SGD" only)
- momentum
A positive number usually on [0.50, 0.99] for the momentum parameter in gradient descent. (optimizer = "SGD" only)
- batch_size
An integer for the number of training set points in each batch. (optimizer = "SGD" only)
- class_weights
Numeric class weights (classification only). The value can be:
A named numeric vector (in any order) where the names are the outcome factor levels.
An unnamed numeric vector assumed to be in the same order as the outcome factor levels.
A single numeric value that is applied to the least frequent class in the training data; all other classes receive a weight of one.
- stop_iter
A non-negative integer for how many iterations with no improvement before stopping.
- verbose
A logical that prints out the iteration history.
- formula
A formula specifying the outcome term(s) on the left-hand side, and the predictor term(s) on the right-hand side.
- data
When a recipe or formula is used, data is specified as:
A data frame containing both the predictors and the outcome.
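The three accepted forms of class_weights can be pictured as all resolving to one weight per outcome level. The base R sketch below is illustrative only: expand_class_weights() is a hypothetical helper written for this example, not a brulee function, and it simply follows the argument description above. Treating NULL as "every class gets a weight of one" is an assumption of the sketch.

```r
# Hypothetical helper (not part of brulee): resolve the three documented
# forms of `class_weights` into one named weight per outcome level.
expand_class_weights <- function(class_weights, y) {
  lvls <- levels(y)
  if (is.null(class_weights)) {
    # Assumption for this sketch: no weights means every class gets 1.
    return(setNames(rep(1, length(lvls)), lvls))
  }
  if (!is.null(names(class_weights))) {
    # Named vector: reorder to match the factor levels.
    stopifnot(setequal(names(class_weights), lvls))
    return(class_weights[lvls])
  }
  if (length(class_weights) == 1) {
    # Single value: applies to the least frequent class only.
    wts <- setNames(rep(1, length(lvls)), lvls)
    counts <- table(y)
    wts[names(which.min(counts))] <- class_weights
    return(wts)
  }
  # Unnamed vector: assumed to follow the order of the factor levels.
  stopifnot(length(class_weights) == length(lvls))
  setNames(class_weights, lvls)
}

# Made-up outcome with class "c" as the least frequent level.
y <- factor(c("a", "a", "a", "b", "b", "c"), levels = c("a", "b", "c"))

w_named  <- expand_class_weights(c(c = 3, a = 1, b = 2), y)
w_plain  <- expand_class_weights(c(1, 2, 3), y)
w_single <- expand_class_weights(5, y)
```

In this sketch the named and unnamed vectors end up as the same per-level weights, and the single value is assigned to level "c" only.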
Value
A brulee_multinomial_reg object with elements:
models_obj: a serialized raw vector for the torch module.
estimates: a list of matrices with the model parameter estimates per epoch.
best_epoch: an integer for the epoch with the smallest loss.
loss: A vector of loss values (MSE for regression, negative log-likelihood for classification) at each epoch.
dim: A list of data dimensions.
parameters: A list of some tuning parameter values.
blueprint: The hardhat blueprint data.
Details
This function fits a linear combination of coefficients and predictors to model the log of the class probabilities. The training process optimizes the cross-entropy loss function.
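As a rough illustration of that model form, the base R sketch below computes one linear predictor per class, converts them to probabilities with a softmax, and evaluates the cross-entropy loss. It only shows the arithmetic. brulee does this through torch, and the coefficient values, the data, and the helper names softmax() and cross_entropy() are all made up for the example.

```r
# Base R sketch of the model form: linear predictors -> softmax -> cross-entropy.
# All numbers below are made up for illustration.
softmax <- function(eta) {
  # Subtract each row's maximum for numerical stability.
  z <- exp(eta - apply(eta, 1, max))
  z / rowSums(z)
}

cross_entropy <- function(prob, y) {
  # Mean negative log-probability of the observed class.
  idx <- cbind(seq_along(y), as.integer(y))
  -mean(log(prob[idx]))
}

x <- cbind(intercept = 1, x1 = c(-1, 0, 1), x2 = c(0.5, -0.5, 0))

# One column of coefficients per class (filled column by column).
beta <- matrix(
  c(0,  1, -1,
    0, -1,  1,
    0,  0,  0),
  nrow = 3,
  dimnames = list(c("intercept", "x1", "x2"), c("A", "B", "C"))
)
y <- factor(c("B", "A", "C"), levels = c("A", "B", "C"))

eta  <- x %*% beta       # one linear predictor per class
prob <- softmax(eta)     # each row sums to one
loss <- cross_entropy(prob, y)
```

The log of each row of prob equals the corresponding linear predictors minus a per-row normalizing constant, which is the sense in which the linear combination models the log of the class probabilities.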
By default, training halts when the validation loss increases for at least stop_iter iterations. If validation = 0 the training set loss is used.
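One plausible reading of this stopping rule is a patience counter that resets whenever the loss improves. The sketch below assumes that reading. stopping_epoch() is a hypothetical helper, not brulee's internal code, and the loss values are invented.

```r
# Hypothetical helper (not brulee code): find the epoch at which training
# would halt, given per-epoch losses and a patience of `stop_iter`.
stopping_epoch <- function(loss, stop_iter = 5) {
  best <- Inf
  poor <- 0
  for (epoch in seq_along(loss)) {
    if (loss[epoch] < best) {
      best <- loss[epoch]   # improvement: remember it and reset the counter
      poor <- 0
    } else {
      poor <- poor + 1      # no improvement this epoch
    }
    if (poor >= stop_iter) {
      return(epoch)
    }
  }
  length(loss)              # ran through every epoch without stopping early
}

# Invented validation losses: the minimum so far occurs at epoch 3.
val_loss <- c(1.00, 0.80, 0.70, 0.72, 0.71, 0.73, 0.74, 0.75, 0.60)

stopped_at <- stopping_epoch(val_loss, stop_iter = 5)
best_epoch <- which.min(val_loss[seq_len(stopped_at)])
```

With these values, training would halt before the late improvement at epoch 9 is ever seen, which is the usual trade-off when choosing a small stop_iter.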
The predictor data should all be numeric and encoded in the same units (e.g. standardized to the same range or distribution) before training. If there are factor predictors, use a recipe or formula to create indicator variables (or some other method) to make them numeric.
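To make the preprocessing concrete, here is a small base R sketch of the two steps: expanding a factor into indicator columns and putting all columns on a common scale. It plays roughly the role that step_dummy() and step_normalize() play in a recipe. The data frame is made up, and base R's model.matrix() and scale() are used here only for illustration.

```r
# Base R sketch: turn a factor predictor into indicator columns, then
# put every column on a common scale. The data are made up.
dat <- data.frame(
  size  = c(10, 20, 30, 40),
  color = factor(c("red", "blue", "green", "blue"))
)

# model.matrix() expands `color` into 0/1 indicator columns; the intercept
# column is dropped because it carries no predictor information.
x <- model.matrix(~ size + color, data = dat)[, -1, drop = FALSE]

# Center and scale each column to mean 0 and standard deviation 1.
x_std <- scale(x)
```

The first factor level ("blue") acts as the reference level, so only indicators for "green" and "red" appear in the result.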
The model objects are saved for each epoch so that the number of epochs can be efficiently tuned. Both the coef() and predict() methods for this model have an epoch argument (which defaults to the epoch with the best loss value).
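A minimal sketch of the epoch argument follows, assuming brulee and a torch backend are installed. The choice of iris, the seed, and epochs = 10 are arbitrary. Using epoch = 1 assumes that at least one epoch was stored, which should hold for any completed fit.

```r
# Prepare a small three-class problem from a built-in data set.
x <- as.data.frame(scale(iris[, 1:4]))  # standardized numeric predictors
y <- iris$Species                       # factor with three levels

# Fit only when brulee and a torch backend are available.
if (requireNamespace("brulee", quietly = TRUE) && torch::torch_is_installed()) {
  set.seed(1)
  fit <- brulee::brulee_multinomial_reg(x, y, epochs = 10)

  print(fit$best_epoch)                    # epoch used by default
  print(coef(fit))                         # coefficients at the best epoch
  print(coef(fit, epoch = 1))              # coefficients stored for epoch 1
  print(head(predict(fit, x, epoch = 1)))  # predictions from that same epoch
}
```

Because every epoch's parameters are kept, comparing predictions across epochs does not require refitting the model.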
The use of the L1 penalty (a.k.a. the lasso penalty) does not force parameters to be strictly zero (as it does in packages such as glmnet). The zeroing out of parameters is a specific feature of the optimization method used in those packages.
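The difference can be seen on a toy one-parameter problem. The sketch below is not brulee or glmnet code. It contrasts a plain (sub)gradient descent loop with the soft-thresholding update that coordinate-descent lasso solvers are generally built on. The values of z, lambda, the step size, and the iteration count are arbitrary.

```r
# Toy one-parameter problem: minimize 0.5 * (b - z)^2 + lambda * abs(b).
# When abs(z) <= lambda, the exact minimizer is b = 0.
z <- 0.3
lambda <- 0.5

# (1) Plain (sub)gradient descent: hovers near zero without landing on it.
b_gd <- 1
for (i in 1:200) {
  grad <- (b_gd - z) + lambda * sign(b_gd)
  b_gd <- b_gd - 0.1 * grad
}

# (2) Soft-thresholding, the closed-form update typical of
# coordinate-descent lasso solvers: returns an exact zero.
soft_threshold <- function(z, lambda) sign(z) * max(abs(z) - lambda, 0)
b_soft <- soft_threshold(z, lambda)
```

Here b_gd ends up small but nonzero, while b_soft is exactly zero, so coefficients from a gradient-based fit may need an explicit threshold if sparsity is the goal.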
Examples
# \donttest{
if (torch::torch_is_installed()) {
  library(recipes)
  library(yardstick)

  data(penguins, package = "modeldata")
  penguins <- penguins %>% na.omit()

  set.seed(122)
  in_train <- sample(1:nrow(penguins), 200)
  penguins_train <- penguins[ in_train, ]
  penguins_test  <- penguins[-in_train, ]

  rec <- recipe(island ~ ., data = penguins_train) %>%
    step_dummy(species, sex) %>%
    step_normalize(all_predictors())

  set.seed(3)
  fit <- brulee_multinomial_reg(rec, data = penguins_train, epochs = 5)
  fit

  predict(fit, penguins_test) %>%
    bind_cols(penguins_test) %>%
    conf_mat(island, .pred_class)
}
#>            Truth
#> Prediction  Biscoe Dream Torgersen
#>   Biscoe        49     2         3
#>   Dream         11    38         6
#>   Torgersen      9     8         7
# }