API reference

uitnodigingsregel

Uitnodigingsregel: dropout prediction models for student intervention.

detect_separator(file_path, target_column='Dropout')

Detect the CSV separator by trying common delimiters.

Reads the first 5 rows with each candidate separator and returns the first separator that yields more than one column, one of which is the target column.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| file_path | str \| Path | Path to the CSV file. | required |
| target_column | str | Expected column name used to validate the separator. | 'Dropout' |

Returns:

| Type | Description |
| --- | --- |
| str | Detected separator character; falls back to ',' if no candidate matches. |
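
The detection strategy described above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the function name `detect_separator_sketch`, the candidate list, and the use of the standard-library `csv` module are all assumptions, since the reference only documents the behaviour.

```python
import csv
import tempfile
from itertools import islice
from pathlib import Path

# Assumption: the real candidate list is not documented; these are common choices.
CANDIDATE_SEPARATORS = [",", ";", "\t", "|"]


def detect_separator_sketch(file_path, target_column="Dropout"):
    """Hypothetical stand-in for detect_separator, using only the stdlib."""
    for sep in CANDIDATE_SEPARATORS:
        with open(file_path, newline="", encoding="utf-8") as handle:
            # The header plus the first 5 data rows are enough to judge the split.
            rows = list(islice(csv.reader(handle, delimiter=sep), 6))
        if not rows:
            continue
        header = [name.strip() for name in rows[0]]
        # Accept the separator only if it splits the header into several
        # columns and the expected target column is among them.
        if len(header) > 1 and target_column in header:
            return sep
    # Documented fallback when no candidate matches.
    return ","


with tempfile.TemporaryDirectory() as tmp:
    semicolon_file = Path(tmp) / "students.csv"
    semicolon_file.write_text("Age;Grade;Dropout\n20;7.5;0\n22;5.0;1\n", encoding="utf-8")
    detected = detect_separator_sketch(semicolon_file)

    no_target_file = Path(tmp) / "other.csv"
    no_target_file.write_text("a|b\n1|2\n", encoding="utf-8")
    fallback = detect_separator_sketch(no_target_file)
```

Validating against the target column, rather than only counting columns, keeps a stray comma inside a semicolon-separated file from being mistaken for the delimiter.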

load_settings(config_file=None, settings_type='default')

Load settings from a YAML config file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config_file | str \| None | Path to the YAML config file. If None, the package metadata config is used. | None |
| settings_type | str | Which settings block to load ('default' or 'custom'). | 'default' |

Returns:

| Type | Description |
| --- | --- |
| dict | Dictionary of settings values. |
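
As a rough illustration of selecting a settings block, the sketch below parses a YAML file with PyYAML and returns one top-level block. The file layout (top-level `default` and `custom` keys), the setting names inside each block, and the helper name `load_settings_sketch` are assumptions made for the example; the reference does not document the config schema, and the sketch does not reproduce the fallback to the package metadata config.

```python
import tempfile
from pathlib import Path

import yaml  # PyYAML

# Hypothetical config layout: one top-level block per settings type.
EXAMPLE_CONFIG = """\
default:
  random_seed: 42
  dropout_column: Dropout
custom:
  random_seed: 7
  dropout_column: Dropout
"""


def load_settings_sketch(config_file, settings_type="default"):
    """Hypothetical stand-in for load_settings; requires an explicit path."""
    with open(config_file, encoding="utf-8") as handle:
        config = yaml.safe_load(handle)
    if settings_type not in config:
        raise KeyError(f"No settings block named {settings_type!r}")
    return config[settings_type]


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.yaml"
    config_path.write_text(EXAMPLE_CONFIG, encoding="utf-8")
    default_settings = load_settings_sketch(config_path)
    custom_settings = load_settings_sketch(config_path, settings_type="custom")
```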

train_lasso(dataset_train_scaled, random_seed, dropout_column, alpha_range, model_path=Path('models/lasso_regression.joblib'))

Train a Lasso regression model with grid search over alpha values.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset_train_scaled | DataFrame | Scaled training DataFrame with features and target. | required |
| random_seed | int | Random state for reproducibility. | required |
| dropout_column | str | Name of the target column. | required |
| alpha_range | list[float] | List of alpha values to search. | required |
| model_path | Path | Path to save the trained model. | Path('models/lasso_regression.joblib') |

Returns:

| Type | Description |
| --- | --- |
| Lasso | Best-fit Lasso model. |
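
A grid search over alpha with scikit-learn might look like the sketch below. It mirrors the documented signature, but the internals are assumptions: the 5-fold cross-validation, the default scoring, and the name `train_lasso_sketch` are not taken from the package, and the data is synthetic.

```python
from pathlib import Path
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV


def train_lasso_sketch(dataset_train_scaled, random_seed, dropout_column, alpha_range, model_path):
    """Hypothetical stand-in for train_lasso; the real internals may differ."""
    X = dataset_train_scaled.drop(columns=[dropout_column])
    y = dataset_train_scaled[dropout_column]
    # Assumption: 5-fold cross-validation with the estimator's default scoring.
    search = GridSearchCV(Lasso(random_state=random_seed), {"alpha": alpha_range}, cv=5)
    search.fit(X, y)
    # Persist the refitted best estimator so it can be reused for prediction.
    model_path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(search.best_estimator_, model_path)
    return search.best_estimator_


# Synthetic, already-scaled data purely for illustration.
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(60, 3)), columns=["f1", "f2", "f3"])
features["Dropout"] = (features["f1"] + rng.normal(scale=0.5, size=60) > 0).astype(int)

alpha_range = [0.001, 0.01, 0.1]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "lasso_regression.joblib"
    model = train_lasso_sketch(features, 42, "Dropout", alpha_range, path)
    reloaded = joblib.load(path)
```

The returned model carries the selected value in `model.alpha`, which is one of the entries in `alpha_range`.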

train_random_forest(dataset_train, random_seed, dropout_column, rf_parameters, model_path=Path('models/random_forest_regressor.joblib'))

Train a Random Forest regressor with grid search hyperparameter tuning.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset_train | DataFrame | Training DataFrame with features and target. | required |
| random_seed | int | Random state for reproducibility. | required |
| dropout_column | str | Name of the target column. | required |
| rf_parameters | dict | Parameter grid for GridSearchCV. | required |
| model_path | Path | Path to save the trained model. | Path('models/random_forest_regressor.joblib') |

Returns:

| Type | Description |
| --- | --- |
| RandomForestRegressor | Best-fit RandomForestRegressor model. |
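
To show what an `rf_parameters` grid could look like, here is a small sketch that runs `GridSearchCV` over a `RandomForestRegressor` directly. The grid values, the 3-fold cross-validation, and the synthetic data are assumptions for illustration; only the estimator type, the use of a parameter grid, and the joblib file name come from the reference.

```python
from pathlib import Path
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Hypothetical grid; keys must be valid RandomForestRegressor parameters.
rf_parameters = {"n_estimators": [10, 20], "max_depth": [2, 4]}

# Synthetic training data with a binary dropout target.
rng = np.random.default_rng(1)
train = pd.DataFrame(rng.normal(size=(60, 3)), columns=["f1", "f2", "f3"])
train["Dropout"] = (train["f1"] - train["f2"] > 0).astype(int)

X = train.drop(columns=["Dropout"])
y = train["Dropout"]

# Assumption: 3-fold cross-validation; the package may use a different setup.
search = GridSearchCV(RandomForestRegressor(random_state=42), rf_parameters, cv=3)
search.fit(X, y)
best_rf = search.best_estimator_

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "random_forest_regressor.joblib"
    joblib.dump(best_rf, model_path)
    saved = model_path.exists()

# With a 0/1 target, the regressor's predictions are averages of 0/1 labels,
# so they can be read as risk scores between 0 and 1.
risk_scores = best_rf.predict(X)
```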

train_svm(dataset_train_scaled, random_seed, dropout_column, svm_parameters, model_path=Path('models/support_vector_machine.joblib'))

Train an SVM classifier with grid search hyperparameter tuning.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset_train_scaled | DataFrame | Scaled training DataFrame with features and target. | required |
| random_seed | int | Random state for reproducibility. | required |
| dropout_column | str | Name of the target column. | required |
| svm_parameters | dict | Parameter grid for GridSearchCV. | required |
| model_path | Path | Path to save the trained model. | Path('models/support_vector_machine.joblib') |

Returns:

| Type | Description |
| --- | --- |
| SVC | Best-fit SVC model with probability estimates enabled. |
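
Because the returned SVC has probability estimates enabled, it can produce a per-student dropout probability through `predict_proba`. The sketch below illustrates this with scikit-learn directly; the grid values, the 3-fold cross-validation, and the synthetic data are assumptions, and saving the model to `model_path` is omitted for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical grid; keys must be valid SVC parameters.
svm_parameters = {"C": [0.1, 1.0], "kernel": ["linear", "rbf"]}

# Synthetic, already-scaled training data with a binary dropout target.
rng = np.random.default_rng(2)
train_scaled = pd.DataFrame(rng.normal(size=(80, 2)), columns=["f1", "f2"])
train_scaled["Dropout"] = (train_scaled["f1"] + train_scaled["f2"] > 0).astype(int)

X = train_scaled.drop(columns=["Dropout"])
y = train_scaled["Dropout"]

# probability=True is what makes predict_proba available on the fitted model.
search = GridSearchCV(SVC(probability=True, random_state=42), svm_parameters, cv=3)
search.fit(X, y)
best_svm = search.best_estimator_

# Column order of predict_proba follows best_svm.classes_, so look up
# the column that corresponds to the dropout label 1.
proba = best_svm.predict_proba(X)
dropout_index = list(best_svm.classes_).index(1)
dropout_probability = proba[:, dropout_index]
```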