Models

Base

studentprognose.models.base

BaseForecaster

Bases: ABC

Abstract base class for time-series models.

SARIMAForecaster is currently the only implementation. The abstraction is intentional: future models (e.g. Prophet, ETS) can implement the same interface, which makes them interchangeable in the ensemble pipeline.
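The interchangeability described above can be illustrated with a minimal sketch. The abstract methods of the real BaseForecaster are not listed on this page, so the `predict` method, its signature, the `LastValueForecaster` class and the `run_ensemble` helper are all hypothetical names chosen for illustration only.

```python
from abc import ABC, abstractmethod


class BaseForecaster(ABC):
    """Stand-in for the documented base class; the method name is hypothetical."""

    @abstractmethod
    def predict(self, history: list[float], horizon: int) -> list[float]:
        """Return `horizon` forecasts given the observed history."""


class LastValueForecaster(BaseForecaster):
    """Toy model that repeats the last observed value (not part of the package)."""

    def predict(self, history: list[float], horizon: int) -> list[float]:
        if not history:
            return []
        return [history[-1]] * horizon


def run_ensemble(models: list[BaseForecaster], history: list[float], horizon: int) -> list[list[float]]:
    # Every model is called through the same interface, so models can be swapped freely.
    return [model.predict(history, horizon) for model in models]


forecasts = run_ensemble([LastValueForecaster()], [10.0, 12.0, 15.0], horizon=3)
print(forecasts)  # [[15.0, 15.0, 15.0]]
```

Because `BaseForecaster` declares an abstract method, instantiating it directly raises a `TypeError`; only subclasses that implement the full interface can be used in the pipeline.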

SARIMA

studentprognose.models.sarima

SARIMAForecaster

Bases: BaseForecaster

Unified SARIMA forecaster used by both individual and cumulative strategies.

predict_with_sarima_cumulative(data_cumulative, row, predict_year, predict_week, pred_len, skip_years=0, already_printed=False, min_training_year=2016)

Predicts pre-registrations with SARIMA per programme/origin/week for cumulative data.

Returns:

| Name | Type | Description |
|------|------|-------------|
| list | list | Predictions for each future week, or an empty list on error. |

predict_with_sarima_individual(data_individual, row, predict_year, predict_week, max_year, numerus_fixus_list, data_exog=None, already_printed=False)

Predicts the number of students with SARIMA per programme/origin/week for individual data.

Returns:

| Name | Type | Description |
|------|------|-------------|
| list | list | Predictions for each future week, or an empty list on error. |
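Both SARIMA methods are documented as returning an empty list when something goes wrong, so a caller needs to tell "no forecast" apart from a real one. The sketch below illustrates that return contract only. It does not call the package: `safe_forecast` and `seasonal_naive` are hypothetical helpers, and the seasonal-naive rule is a deliberately simple stand-in for an actual SARIMA fit.

```python
def seasonal_naive(series, pred_len, season=52):
    """Stand-in for a SARIMA fit: repeat the value from one season earlier."""
    if len(series) < season:
        raise ValueError("not enough history for one full season")
    start = len(series) - season
    return [series[start + (i % season)] for i in range(pred_len)]


def safe_forecast(fit_and_predict, series, pred_len):
    """Run a forecast callable and fall back to an empty list on any error."""
    try:
        return list(fit_and_predict(series, pred_len))
    except Exception:
        return []


history = list(range(60))  # 60 weekly observations: 0, 1, ..., 59
ok = safe_forecast(seasonal_naive, history, 3)
too_short = safe_forecast(seasonal_naive, [1, 2, 3], 3)

print(ok)         # [8, 9, 10]
print(too_short)  # []

# Callers can treat the empty list as "no forecast available".
if not too_short:
    print("forecast skipped")
```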

XGBoost Classifier

studentprognose.models.xgboost_classifier

predict_applicant(data, predict_year, predict_week, max_year, data_cumulative=None, configuration=None)

Train an XGBoost classifier to predict individual applicant enrollment probability.

Returns:

| Type | Description |
|------|-------------|
| ndarray \| float | np.ndarray: predicted enrollment probabilities, or np.nan if there is no test data. |

XGBoost Regressor

studentprognose.models.xgboost_regressor

predict_with_xgboost(train, test, data_studentcount, extra_numeric_cols=None)

Train an XGBoost regressor to predict student counts from cumulative pre-application data.

Returns:

| Type | Description |
|------|-------------|
| ndarray \| float | np.ndarray: rounded integer predictions, or np.nan if there is no student count data. |
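Both predict_applicant and predict_with_xgboost are documented as returning either an array or the scalar np.nan, so calling code has to check which one it received before indexing into the result. A minimal sketch of such a check follows. The `is_missing` helper is a hypothetical name, and a plain list stands in for the np.ndarray so that the example runs without NumPy.

```python
import math


def is_missing(result) -> bool:
    """True when a model returned the scalar NaN sentinel instead of an array."""
    # np.nan is a plain Python float, so this check needs no NumPy import and
    # never evaluates the truth value of a whole array.
    return isinstance(result, float) and math.isnan(result)


no_data = float("nan")     # same kind of value as np.nan
predictions = [12, 7, 30]  # stand-in for an np.ndarray of predictions

print(is_missing(no_data))      # True
print(is_missing(predictions))  # False
```

Testing the type first avoids writing `if np.isnan(result):`, which raises a `ValueError` when `result` is an array with more than one element.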

Ratio model

studentprognose.models.ratio

predict_with_ratio(data, data_cumulative, data_studentcount, numerus_fixus_list, predict_year)

Predict student influx using the ratio between pre-registrants and actual enrollments.

Uses a 3-year historical average ratio.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| data | DataFrame | The main output data (will be modified in place and returned). | required |
| data_cumulative | DataFrame | Cumulative pre-application data. | required |
| data_studentcount | DataFrame \| None | Actual student count data. | required |
| numerus_fixus_list | dict | Dict of numerus fixus programmes and their caps. | required |
| predict_year | int | The year being predicted. | required |

Returns:

| Type | Description |
|------|-------------|
| DataFrame | pd.DataFrame: The data with the Prognose_ratio column added. |
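The arithmetic behind a 3-year average ratio can be sketched in plain Python. This is not the package's implementation. It assumes the ratio is enrollments divided by pre-registrants, that the per-year ratios are averaged with equal weight, and that a numerus fixus cap simply limits the forecast. The function name, its arguments and the sample numbers are all hypothetical.

```python
from statistics import mean


def ratio_forecast(pre_registrants, enrollments, predict_year,
                   current_pre_registrants, cap=None, n_years=3):
    """Forecast enrollments from an average historical enrollment/pre-registrant ratio."""
    years = range(predict_year - n_years, predict_year)
    # One ratio per historical year; skip years with missing or zero pre-registrants.
    ratios = [
        enrollments[y] / pre_registrants[y]
        for y in years
        if y in enrollments and pre_registrants.get(y)
    ]
    if not ratios:
        return None  # no usable history
    forecast = current_pre_registrants * mean(ratios)
    if cap is not None:
        # Assumption: a numerus fixus programme cannot exceed its cap.
        forecast = min(forecast, cap)
    return round(forecast)


# Made-up figures for one programme, measured at the same week each year.
pre = {2021: 200, 2022: 250, 2023: 400}
enrolled = {2021: 100, 2022: 125, 2023: 200}

print(ratio_forecast(pre, enrolled, 2024, current_pre_registrants=300))           # 150
print(ratio_forecast(pre, enrolled, 2024, current_pre_registrants=300, cap=120))  # 120
```

Each historical year yields a ratio of 0.5 here, so 300 current pre-registrants translate to a forecast of 150 students, or 120 once the cap applies.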