kosmos.ml.datasets.income_dataset

Classes

class IncomeDataset(*, min_max_scaler: bool = True)

Bases: kosmos.ml.datasets.dataset.SLDataset

Adult Income (Census) dataset — binary classification (>50K vs <=50K).

Notes

  • Instances: 48,842 (32,561 train + 16,281 test)

  • Features: 14 (a mix of numerical and categorical); one-hot encoding of the categorical features yields more columns

  • Classes: 2 (imbalanced; ~24% >50K, ~76% <=50K)
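The growth in column count from one-hot encoding, mentioned in the notes above, can be illustrated with a minimal pure-Python sketch. The toy rows and the `one_hot` helper are hypothetical and are not part of kosmos; how the library actually performs the encoding is not stated here.

```python
# Toy rows with one numerical and one categorical feature
# (illustrative values only, not taken from the real dataset files).
rows = [
    {"age": 39, "workclass": "State-gov"},
    {"age": 50, "workclass": "Self-emp"},
    {"age": 38, "workclass": "Private"},
]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per distinct category.

    Hypothetical helper for illustration; not the kosmos implementation.
    """
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every column except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


encoded_rows = one_hot(rows, "workclass")
print(len(rows[0]), "->", len(encoded_rows[0]))  # 2 -> 4
```

Each categorical feature contributes one column per distinct category, so the encoded feature matrix is wider than the 14 raw features.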

Initialize the dataset.

Parameters:

min_max_scaler (bool) – Whether to apply min-max scaling to the features.
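As a rough sketch of what min-max scaling does to a single feature column, assuming the conventional target range [0, 1]: the helper `min_max_scale` below is hypothetical, and the scaler that `IncomeDataset` uses internally may differ in range or implementation.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0.

    Hypothetical helper; assumes the conventional [0, 1] target range.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(value - lo) / (hi - lo) for value in values]


ages = [17, 38, 90]  # illustrative values only
scaled = min_max_scale(ages)
print(scaled)
```

Scaling of this kind puts features with very different units on a comparable range, which tends to help gradient-based models.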


Properties

property class_names

Return human-readable class labels: 0 -> <=50K, 1 -> >50K.
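The label mapping can be used to turn integer predictions into readable labels. The sketch below uses a stand-in list in place of the real property, on the assumption that `class_names` is a sequence indexed by class label; the return type is not stated in this reference.

```python
# Stand-in for IncomeDataset().class_names (assumed to be index-addressable).
class_names = ["<=50K", ">50K"]

# Hypothetical integer predictions from some classifier.
predictions = [0, 1, 0, 0]

# Map each predicted class index to its human-readable label.
readable = [class_names[label] for label in predictions]
print(readable)  # ['<=50K', '>50K', '<=50K', '<=50K']
```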

property input_dimension

Number of feature columns.

property output_dim

Number of distinct classes.