Models

Base

studentprognose.models.base

BaseForecaster

Bases: ABC

Abstract base class for time-series models.

SARIMAForecaster is currently the only implementation. The abstraction is intentional: future models (e.g. Prophet, ETS) can implement the same interface, which makes them interchangeable in the ensemble pipeline.
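To illustrate why a shared interface makes models interchangeable, here is a minimal sketch. The class `BaseForecasterSketch`, its `fit`/`predict` method names, the two toy models, and `ensemble_forecast` are all hypothetical; the actual abstract methods of BaseForecaster are not documented on this page.

```python
from abc import ABC, abstractmethod


class BaseForecasterSketch(ABC):
    """Illustrative stand-in; the real BaseForecaster's methods may differ."""

    @abstractmethod
    def fit(self, series):
        """Learn from a list of historical values and return self."""

    @abstractmethod
    def predict(self, horizon):
        """Return a list with `horizon` forecast values."""


class LastValueForecaster(BaseForecasterSketch):
    """Hypothetical toy model: repeats the last observed value."""

    def fit(self, series):
        self._last = float(series[-1])
        return self

    def predict(self, horizon):
        return [self._last] * horizon


class MeanForecaster(BaseForecasterSketch):
    """Hypothetical toy model: repeats the historical mean."""

    def fit(self, series):
        self._mean = sum(series) / len(series)
        return self

    def predict(self, horizon):
        return [self._mean] * horizon


def ensemble_forecast(models, series, horizon):
    """Average the forecasts of any models that share the interface."""
    forecasts = [model.fit(series).predict(horizon) for model in models]
    return [sum(step) / len(step) for step in zip(*forecasts)]
```

Because `ensemble_forecast` only relies on `fit` and `predict`, a new model can be added to the list without changing the calling code.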

SARIMA

studentprognose.models.sarima

SARIMAForecaster

Bases: BaseForecaster

Unified SARIMA forecaster used by both individual and cumulative strategies.

Uses statsforecast ARIMA (CSS-ML estimation) as backend.

predict_with_sarima_cumulative(data_cumulative, row, predict_year, predict_week, pred_len, skip_years=0, already_printed=False, min_training_year=2016, forecaster_factory=None)

Predicts pre-applications per programme/origin/week for cumulative data.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| forecaster_factory | callable \| None | Callable that returns a fresh BaseForecaster object. It is called once per invocation so that joblib parallelisation works safely. Default: SARIMAForecaster with default orders. | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| list | list | Predictions for each future week, or an empty list on error. |
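The factory pattern behind forecaster_factory can be sketched as follows. This is not the package's code: `DampedForecaster`, its `fit`/`predict` methods, and `forecast_one` are hypothetical names used only to show why a callable that builds a fresh object per call avoids shared state between parallel workers, and how an empty list can signal failure.

```python
from functools import partial


class DampedForecaster:
    """Hypothetical stand-in for a BaseForecaster subclass; it keeps state after fit()."""

    def __init__(self, damping=1.0):
        self.damping = damping
        self.last_value = None

    def fit(self, series):
        self.last_value = series[-1]  # raises IndexError on an empty series
        return self

    def predict(self, horizon):
        return [self.last_value * self.damping] * horizon


def forecast_one(series, horizon, forecaster_factory=None):
    """Forecast one series with a model instance created just for this call."""
    factory = forecaster_factory if forecaster_factory is not None else DampedForecaster
    try:
        model = factory()  # fresh object, so no fitted state leaks between calls
        return model.fit(series).predict(horizon)
    except Exception:
        return []  # mirrors the documented "empty list on error" behaviour


# A configured factory: partial() delays construction until the call site.
half_damped = partial(DampedForecaster, damping=0.5)
```

Passing a callable such as `half_damped` rather than an already constructed model means each task builds its own object, which is the property the parameter description relies on for joblib.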

predict_with_sarima_individual(data_individual, row, predict_year, predict_week, max_year, numerus_fixus_list, data_exog=None, already_printed=False)

Predicts the number of students with SARIMA per programme/origin/week for individual data.

Returns:

| Name | Type | Description |
| --- | --- | --- |
| list | list | Predictions for each future week, or an empty list on error. |

XGBoost Classifier

studentprognose.models.xgboost_classifier

predict_applicant(data, predict_year, predict_week, max_year, data_cumulative=None, configuration=None)

Train an XGBoost classifier to predict individual applicant enrollment probability.

Returns:

| Type | Description |
| --- | --- |
| ndarray \| float | np.ndarray: predicted enrollment probabilities, or np.nan if there is no test data. |
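Since the return value is either an array of probabilities or np.nan, callers need to handle both cases. The helper below is a hypothetical sketch, not part of the package; summing the probabilities into an expected count is an assumed aggregation chosen for illustration.

```python
import math


def expected_enrollments(probabilities):
    """Turn the documented return value into an expected head count.

    Returns None when the classifier had no test data (np.nan).
    """
    # np.nan is a plain Python float, so this check works without numpy.
    if isinstance(probabilities, float) and math.isnan(probabilities):
        return None
    # Works for lists and for numpy arrays alike.
    return float(sum(probabilities))
```

For example, three applicants with probabilities 0.5, 0.25 and 0.25 would contribute one expected enrollment in total.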

XGBoost Regressor

studentprognose.models.xgboost_regressor

predict_with_xgboost(train, test, data_studentcount, extra_numeric_cols=None, regressor=None, config=None)

Train a regressor to predict student counts from cumulative pre-application data.

Returns:

| Type | Description |
| --- | --- |
| ndarray | (predictions, importance_dict): predictions rounded to integers. |
| dict[str, float] \| None | importance_dict is None if the model does not support feature importances. |
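The shape of this return value can be sketched as below. `package_result` and `top_features` are hypothetical helpers, and the lookup of a `feature_importances_` attribute (the convention used by scikit-learn style estimators) is an assumption about how support is detected, not the package's confirmed logic.

```python
def package_result(model, feature_names, raw_predictions):
    """Build a (predictions, importance_dict) tuple like the one documented above."""
    predictions = [int(round(p)) for p in raw_predictions]
    # Assumption: models that support importances expose `feature_importances_`.
    importances = getattr(model, "feature_importances_", None)
    if importances is None:
        return predictions, None
    return predictions, {
        name: float(value) for name, value in zip(feature_names, importances)
    }


def top_features(importance_dict, n=3):
    """Return the n most important features, or an empty list if unsupported."""
    if importance_dict is None:
        return []
    ranked = sorted(importance_dict.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

Callers that unpack the tuple should check for None before ranking or plotting importances, as `top_features` does.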

Ratio model

studentprognose.models.ratio

predict_with_ratio(data, data_cumulative, data_studentcount, numerus_fixus_list, predict_year)

Predict student influx using the ratio between pre-registrants and actual enrollments.

Uses a 3-year historical average ratio.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | The main output data (will be modified in place and returned). | required |
| data_cumulative | DataFrame | Cumulative pre-application data. | required |
| data_studentcount | DataFrame \| None | Actual student count data. | required |
| numerus_fixus_list | dict | Dict of numerus fixus programmes and their caps. | required |
| predict_year | int | The year being predicted. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | pd.DataFrame: The data with Prognose_ratio column added. |
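The idea of a 3-year average ratio can be illustrated with plain numbers. This sketch makes several assumptions that the page does not confirm: the ratio is taken as enrollments divided by pre-registrations, the yearly ratios are averaged with equal weight, and the numerus fixus cap is applied as an upper bound. The function name and the `history` structure are hypothetical.

```python
def ratio_forecast_sketch(history, current_preregistrations, predict_year,
                          cap=None, n_years=3):
    """Forecast enrollments from pre-registrations using a historical average ratio.

    history maps year -> (preregistrations, enrollments) for one programme.
    """
    # Use only the n_years immediately before the year being predicted.
    years = [y for y in range(predict_year - n_years, predict_year) if y in history]
    ratios = [
        history[y][1] / history[y][0]  # assumed direction: enrollments / pre-registrations
        for y in years
        if history[y][0] > 0
    ]
    if not ratios:
        return None  # no usable history
    average_ratio = sum(ratios) / len(ratios)
    forecast = current_preregistrations * average_ratio
    if cap is not None:
        forecast = min(forecast, cap)  # assumed handling of numerus fixus programmes
    return forecast
```

With a history in which half of the pre-registrants enrolled in each of the three preceding years, 300 current pre-registrations would yield a forecast of 150, or the cap if that is lower.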