Splits numeric features into equally spaced bins. See `graphics::hist()` for details.
Values that fall outside the training data range during prediction are assigned to the lowest or highest bin, respectively.
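
The binning idea can be sketched in base R, independent of the pipeline machinery. This is a minimal illustration, not the package's actual implementation: it assumes the outermost breakpoints returned by `hist()` are widened to `-Inf` and `Inf` (which matches the `$state$breaks` printed in the example at the end of this page) and that values are then binned with `cut()`. The variable names `x_train` and `x_new` are made up for this sketch.

```r
# Training data: one numeric feature.
x_train = iris$Petal.Length

# Equally spaced breakpoints as chosen by hist() (Sturges' rule by default).
breaks = graphics::hist(x_train, breaks = "Sturges", plot = FALSE)$breaks

# Assumption: open up the outermost bins so that unseen values still land in a bin.
breaks[1] = -Inf
breaks[length(breaks)] = Inf

# New values, two of which lie outside the training range.
x_new = c(0.1, 1.4, 5.2, 42)
binned = cut(x_new, breaks = breaks, include.lowest = TRUE, ordered_result = TRUE)

# 0.1 ends up in the lowest bin, 42 in the highest; no NA is produced.
print(binned)
```

Without the widened outer breaks, `cut()` would return `NA` for 0.1 and 42.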

`R6Class` object inheriting from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

PipeOpHistBin$new(id = "histbin", param_vals = list())

`id` :: `character(1)`
Identifier of resulting object, default `"histbin"`.

`param_vals` :: named `list`
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default `list()`.
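
As a small sketch of the two constructor arguments, assuming the mlr3pipelines package is installed: the id `"histbin_scott"` is an arbitrary name chosen for illustration, and `"Scott"` is one of the algorithm names that `hist()` accepts for `breaks`.

```r
library("mlr3pipelines")

# Default construction: id "histbin", no hyperparameters overridden.
pop_default = PipeOpHistBin$new()

# Custom id plus a hyperparameter setting passed through param_vals.
pop_scott = PipeOpHistBin$new(
  id = "histbin_scott",
  param_vals = list(breaks = "Scott")
)

pop_default$id
pop_scott$id
pop_scott$param_set$values$breaks
```

Distinct ids matter when several copies of the same PipeOp appear in one graph, since ids are expected to be unique there.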

Input and output channels are inherited from `PipeOpTaskPreproc`.

The output is the input `Task` with all affected numeric features replaced by their binned versions.

The `$state` is a named `list` with the `$state` elements inherited from `PipeOpTaskPreproc`, as well as:

`breaks` :: `list`
List of intervals representing the bins for each numeric feature.
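
The following sketch shows how the trained `$state` might be inspected and reused at prediction time, assuming mlr3 and mlr3pipelines are installed. The split into the first 100 and the last 50 rows of `iris` is an arbitrary choice for illustration, picked so that the prediction data contains petal measurements larger than anything seen during training.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
pop = po("histbin")

# Train on rows 1-100 (setosa and versicolor) ...
train_task = task$clone(deep = TRUE)$filter(1:100)
# ... and predict on rows 101-150 (virginica), which has larger petal values.
test_task = task$clone(deep = TRUE)$filter(101:150)

pop$train(list(train_task))

# One vector of breakpoints per binned numeric feature.
pop$state$breaks

# Prediction reuses the stored breakpoints instead of recomputing them.
binned = pop$predict(list(test_task))[[1]]$data()
anyNA(binned)
```

Values beyond the training range are expected to fall into the outermost bins rather than become missing.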

The parameters are the parameters inherited from `PipeOpTaskPreproc`, as well as:

`breaks` :: `character(1)` | `numeric` | `function`
Either a `character(1)` string naming an algorithm to compute the number of cells, a `numeric(1)` giving the number of breaks for the histogram, a `numeric` vector giving the breakpoints between the histogram cells, or a `function` to compute the vector of breakpoints or to compute the number of cells. Default is algorithm `"Sturges"` (see `grDevices::nclass.Sturges()`). For details see `hist()`.
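
The four accepted forms of `breaks` can be tried directly against base R's `hist()`, which is what the description above refers to. This is a sketch of `hist()` behaviour only; that the PipeOp passes the value through unchanged is an assumption here. The variable names are made up for illustration.

```r
x = iris$Sepal.Width

# 1. Name of an algorithm that computes the number of cells.
b_name = hist(x, breaks = "Scott", plot = FALSE)$breaks

# 2. A single number: hist() treats it as a suggestion and rounds
#    the breakpoints to "pretty" values.
b_count = hist(x, breaks = 5, plot = FALSE)$breaks

# 3. A numeric vector of explicit breakpoints; it must span range(x).
b_vec = hist(x, breaks = seq(2, 4.5, by = 0.5), plot = FALSE)$breaks

# 4. A function returning the breakpoints (or the number of cells).
quarter_steps = function(v) seq(floor(min(v)), ceiling(max(v)), by = 0.25)
b_fun = hist(x, breaks = quarter_steps, plot = FALSE)$breaks

list(name = b_name, count = b_count, vector = b_vec, fun = b_fun)
```

Only the explicit vector and a function returning breakpoints give full control over bin edges; the other two forms leave the final placement to `hist()`.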

Uses the `graphics::hist` function.

Only methods inherited from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

https://mlr3book.mlr-org.com/list-pipeops.html

Other PipeOps: `PipeOpEnsemble`, `PipeOpImpute`, `PipeOpTargetTrafo`, `PipeOpTaskPreprocSimple`, `PipeOpTaskPreproc`, `PipeOp`, `mlr_pipeops_boxcox`, `mlr_pipeops_branch`, `mlr_pipeops_chunk`, `mlr_pipeops_classbalancing`, `mlr_pipeops_classifavg`, `mlr_pipeops_classweights`, `mlr_pipeops_colapply`, `mlr_pipeops_collapsefactors`, `mlr_pipeops_colroles`, `mlr_pipeops_copy`, `mlr_pipeops_datefeatures`, `mlr_pipeops_encodeimpact`, `mlr_pipeops_encodelmer`, `mlr_pipeops_encode`, `mlr_pipeops_featureunion`, `mlr_pipeops_filter`, `mlr_pipeops_fixfactors`, `mlr_pipeops_ica`, `mlr_pipeops_imputeconstant`, `mlr_pipeops_imputehist`, `mlr_pipeops_imputelearner`, `mlr_pipeops_imputemean`, `mlr_pipeops_imputemedian`, `mlr_pipeops_imputemode`, `mlr_pipeops_imputeoor`, `mlr_pipeops_imputesample`, `mlr_pipeops_kernelpca`, `mlr_pipeops_learner`, `mlr_pipeops_missind`, `mlr_pipeops_modelmatrix`, `mlr_pipeops_multiplicityexply`, `mlr_pipeops_multiplicityimply`, `mlr_pipeops_mutate`, `mlr_pipeops_nmf`, `mlr_pipeops_nop`, `mlr_pipeops_ovrsplit`, `mlr_pipeops_ovrunite`, `mlr_pipeops_pca`, `mlr_pipeops_proxy`, `mlr_pipeops_quantilebin`, `mlr_pipeops_randomprojection`, `mlr_pipeops_randomresponse`, `mlr_pipeops_regravg`, `mlr_pipeops_removeconstants`, `mlr_pipeops_renamecolumns`, `mlr_pipeops_replicate`, `mlr_pipeops_scalemaxabs`, `mlr_pipeops_scalerange`, `mlr_pipeops_scale`, `mlr_pipeops_select`, `mlr_pipeops_smote`, `mlr_pipeops_spatialsign`, `mlr_pipeops_subsample`, `mlr_pipeops_targetinvert`, `mlr_pipeops_targetmutate`, `mlr_pipeops_targettrafoscalerange`, `mlr_pipeops_textvectorizer`, `mlr_pipeops_threshold`, `mlr_pipeops_tunethreshold`, `mlr_pipeops_unbranch`, `mlr_pipeops_updatetarget`, `mlr_pipeops_vtreat`, `mlr_pipeops_yeojohnson`, `mlr_pipeops`

    library("mlr3")
    library("mlr3pipelines")  # provides po() and PipeOpHistBin

    task = tsk("iris")
    pop = po("histbin")

    # Original data: four numeric features.
    task$data()
    #>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
    #>   1:    setosa          1.4         0.2          5.1         3.5
    #>   2:    setosa          1.4         0.2          4.9         3.0
    #>   3:    setosa          1.3         0.2          4.7         3.2
    #>   4:    setosa          1.5         0.2          4.6         3.1
    #>   5:    setosa          1.4         0.2          5.0         3.6
    #>  ---
    #> 146: virginica          5.2         2.3          6.7         3.0
    #> 147: virginica          5.0         1.9          6.3         2.5
    #> 148: virginica          5.2         2.0          6.5         3.0
    #> 149: virginica          5.4         2.3          6.2         3.4
    #> 150: virginica          5.1         1.8          5.9         3.0

    # Training replaces each numeric feature by its bin.
    pop$train(list(task))[[1]]$data()
    #>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
    #>   1:    setosa   [-Inf,1.5]  [-Inf,0.2]      (5,5.5]   (3.4,3.6]
    #>   2:    setosa   [-Inf,1.5]  [-Inf,0.2]      (4.5,5]     (2.8,3]
    #>   3:    setosa   [-Inf,1.5]  [-Inf,0.2]      (4.5,5]     (3,3.2]
    #>   4:    setosa   [-Inf,1.5]  [-Inf,0.2]      (4.5,5]     (3,3.2]
    #>   5:    setosa   [-Inf,1.5]  [-Inf,0.2]      (4.5,5]   (3.4,3.6]
    #>  ---
    #> 146: virginica      (5,5.5]   (2.2,2.4]      (6.5,7]     (2.8,3]
    #> 147: virginica      (4.5,5]     (1.8,2]      (6,6.5]   (2.4,2.6]
    #> 148: virginica      (5,5.5]     (1.8,2]      (6,6.5]     (2.8,3]
    #> 149: virginica      (5,5.5]   (2.2,2.4]      (6,6.5]   (3.2,3.4]
    #> 150: virginica      (5,5.5]   (1.6,1.8]      (5.5,6]     (2.8,3]

    # The state stores the breakpoints learned for each feature.
    pop$state
    #> $breaks
    #> $breaks[[1]]
    #>  [1] -Inf  1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0  6.5  Inf
    #>
    #> $breaks[[2]]
    #>  [1] -Inf  0.2  0.4  0.6  0.8  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  Inf
    #>
    #> $breaks[[3]]
    #> [1] -Inf  4.5  5.0  5.5  6.0  6.5  7.0  7.5  Inf
    #>
    #> $breaks[[4]]
    #>  [1] -Inf  2.2  2.4  2.6  2.8  3.0  3.2  3.4  3.6  3.8  4.0  4.2  Inf
    #>
    #>
    #> $dt_columns
    #> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"
    #>
    #> $affected_cols
    #> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"
    #>
    #> $intasklayout
    #>              id    type
    #> 1: Petal.Length numeric
    #> 2:  Petal.Width numeric
    #> 3: Sepal.Length numeric
    #> 4:  Sepal.Width numeric
    #>
    #> $outtasklayout
    #>              id    type
    #> 1: Petal.Length ordered
    #> 2:  Petal.Width ordered
    #> 3: Sepal.Length ordered
    #> 4:  Sepal.Width ordered
    #>
    #> $outtaskshell
    #> Empty data.table (0 rows and 5 cols): Species,Petal.Length,Petal.Width,Sepal.Length,Sepal.Width
    #>