Models
Base
studentprognose.models.base
BaseForecaster
Bases: ABC
Abstract base class for time-series models.
Currently, SARIMAForecaster is the only implementation. The abstraction is
intentional: future models (e.g. Prophet, ETS) can implement the same
interface so that they are interchangeable in the ensemble pipeline.
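A minimal sketch of what such an interchangeable interface could look like. The method names `fit` and `predict`, the class `ForecasterSketch`, and the toy `NaiveForecaster` are assumptions for illustration; this reference does not list the actual abstract methods of BaseForecaster.

```python
from abc import ABC, abstractmethod


class ForecasterSketch(ABC):
    """Hypothetical stand-in for BaseForecaster; method names are assumed."""

    @abstractmethod
    def fit(self, series: list[float]) -> "ForecasterSketch":
        """Estimate the model on a historical series."""

    @abstractmethod
    def predict(self, horizon: int) -> list[float]:
        """Return `horizon` future values."""


class NaiveForecaster(ForecasterSketch):
    """Toy implementation: repeats the last observed value."""

    def fit(self, series):
        self._last = float(series[-1])
        return self

    def predict(self, horizon):
        return [self._last] * horizon


def run_pipeline(forecaster: ForecasterSketch, series, horizon):
    # The pipeline depends only on the abstract interface,
    # so any subclass can be swapped in.
    return forecaster.fit(series).predict(horizon)


forecast = run_pipeline(NaiveForecaster(), [10, 12, 15], horizon=3)
```

Because `run_pipeline` only touches the abstract methods, a second subclass could be passed in without changing the calling code.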
SARIMA
studentprognose.models.sarima
SARIMAForecaster
Bases: BaseForecaster
Unified SARIMA forecaster used by both individual and cumulative strategies.
Uses statsforecast ARIMA (CSS-ML estimation) as backend.
predict_with_sarima_cumulative(data_cumulative, row, predict_year, predict_week, pred_len, skip_years=0, already_printed=False, min_training_year=2016, forecaster_factory=None)
Predicts pre-registrations per programme/origin/week for cumulative data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| forecaster_factory | `callable \| None` | Callable that returns a fresh BaseForecaster object. It is called once per invocation so that joblib parallelization works safely. Default: SARIMAForecaster with default orders. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| list | `list` | Predictions per future week, or an empty list on error. |
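The role of forecaster_factory can be illustrated with a small sketch. It uses the standard library's `ThreadPoolExecutor` as a stand-in for joblib, and `ToyForecaster` and `forecast_one` are hypothetical names. The point is only that each task calls the factory and therefore gets its own model object instead of sharing mutable fitted state.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyForecaster:
    """Hypothetical model with mutable fitted state."""

    def fit(self, series):
        self.mean = sum(series) / len(series)
        return self

    def predict(self, horizon):
        return [self.mean] * horizon


def forecast_one(series, horizon, forecaster_factory=None):
    # Fall back to a default factory when none is supplied.
    factory = forecaster_factory or ToyForecaster
    model = factory()  # fresh object per call: no state shared across workers
    try:
        return model.fit(series).predict(horizon)
    except Exception:
        return []  # mirrors the documented behaviour: empty list on error


# The empty series triggers a ZeroDivisionError inside fit().
all_series = [[1, 2, 3], [10, 20, 30], []]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda s: forecast_one(s, 2, ToyForecaster), all_series))
```

Passing a factory rather than a ready-made instance keeps the choice of model configurable while avoiding a single object being fitted concurrently by several workers.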
predict_with_sarima_individual(data_individual, row, predict_year, predict_week, max_year, numerus_fixus_list, data_exog=None, already_printed=False)
Predicts the number of students with SARIMA per programme/origin/week for individual data.
Returns:
| Name | Type | Description |
|---|---|---|
| list | `list` | Predictions for each future week, or an empty list on error. |
XGBoost Classifier
studentprognose.models.xgboost_classifier
predict_applicant(data, predict_year, predict_week, max_year, data_cumulative=None, configuration=None)
Train an XGBoost classifier to predict individual applicant enrollment probability.
Returns:
| Type | Description |
|---|---|
| `ndarray \| float` | np.ndarray: predicted enrollment probabilities, or np.nan if no test data. |
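Because the return value is either an array or the scalar np.nan, callers need to branch before using it. A minimal sketch, assuming a hypothetical helper `expected_enrollments` and assuming that summing the probabilities is an acceptable way to get an expected head count (the pipeline's actual aggregation is not described in this reference):

```python
import numpy as np


def expected_enrollments(probabilities):
    """Sum probabilities into an expected count; return None without test data."""
    # np.nan is a plain Python float, so this check separates it from an array.
    if isinstance(probabilities, float) and np.isnan(probabilities):
        return None
    return float(np.sum(probabilities))


# Made-up probabilities for three applicants.
with_data = expected_enrollments(np.array([0.9, 0.5, 0.1]))
without_data = expected_enrollments(np.nan)
```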
XGBoost Regressor
studentprognose.models.xgboost_regressor
predict_with_xgboost(train, test, data_studentcount, extra_numeric_cols=None, regressor=None, config=None)
Trains a regressor to predict student counts from cumulative pre-registration data.
Returns:
| Type | Description |
|---|---|
| `ndarray` | (predictions, importance_dict): predictions rounded to integers. |
| `dict[str, float] \| None` | importance_dict is None if the model does not support feature importances. |
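Since importance_dict may be None, code that reports feature importances should guard for that case. A minimal sketch with a hypothetical helper `top_features`; the feature names and values are made up for illustration and are not output of predict_with_xgboost.

```python
from typing import Optional


def top_features(
    importance_dict: Optional[dict[str, float]], n: int = 3
) -> list[tuple[str, float]]:
    """Return the n most important features, or an empty list if unavailable."""
    if importance_dict is None:
        return []  # the model does not expose feature importances
    ranked = sorted(importance_dict.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


# Illustrative values only.
example = {"week": 0.2, "cumulative_total": 0.7, "origin": 0.1}
ranked = top_features(example, n=2)
empty = top_features(None)
```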
Ratio model
studentprognose.models.ratio
predict_with_ratio(data, data_cumulative, data_studentcount, numerus_fixus_list, predict_year)
Predict student influx using the ratio between pre-registrants and actual enrollments.
Uses a 3-year historical average ratio.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | `DataFrame` | The main output data (will be modified in place and returned). | required |
| data_cumulative | `DataFrame` | Cumulative pre-application data. | required |
| data_studentcount | `DataFrame \| None` | Actual student count data. | required |
| numerus_fixus_list | `dict` | Dict of numerus fixus programmes and their caps. | required |
| predict_year | `int` | The year being predicted. | required |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | pd.DataFrame: The data with Prognose_ratio column added. |
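The ratio idea can be sketched in plain Python. The direction of the ratio (enrollments divided by pre-registrants), the treatment of a numerus fixus cap as an upper bound, the helper name `ratio_forecast`, and all numbers are assumptions for illustration; this reference only states that a 3-year historical average ratio is used.

```python
def ratio_forecast(history, current_preregistrants, cap=None, n_years=3):
    """history: list of (year, preregistrants, enrollments) tuples, any order."""
    recent = sorted(history)[-n_years:]  # keep the most recent n_years
    ratios = [enrolled / prereg for _, prereg, enrolled in recent if prereg > 0]
    if not ratios:
        return None  # no usable history
    avg_ratio = sum(ratios) / len(ratios)
    forecast = current_preregistrants * avg_ratio
    if cap is not None:
        forecast = min(forecast, cap)  # assumed: the cap acts as an upper bound
    return round(forecast)


history = [
    (2020, 300, 90),   # ignored: older than the 3 most recent years
    (2021, 200, 100),  # ratio 0.50
    (2022, 250, 100),  # ratio 0.40
    (2023, 400, 240),  # ratio 0.60
]
uncapped = ratio_forecast(history, current_preregistrants=300)
capped = ratio_forecast(history, current_preregistrants=300, cap=120)
```

With an average ratio of 0.50, 300 pre-registrants give a forecast of 150 students, which the assumed cap of 120 then limits to 120.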