Models
Base
studentprognose.models.base
BaseForecaster
Bases: ABC
Abstract base class for time-series models.
SARIMAForecaster is currently the only implementation. The abstraction is
intentional: future models (e.g. Prophet, ETS) can implement the same
interface so that they are interchangeable in the ensemble pipeline.
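To illustrate why a shared interface makes models interchangeable, here is a minimal sketch. The class `BaseForecasterSketch`, its `fit` and `predict` methods, the two toy models, and the averaging in `ensemble_forecast` are all assumptions made for illustration; the actual methods of `BaseForecaster` and the way the ensemble pipeline combines models are not documented on this page and may differ.

```python
from abc import ABC, abstractmethod


class BaseForecasterSketch(ABC):
    """Hypothetical stand-in for BaseForecaster; real method names may differ."""

    @abstractmethod
    def fit(self, series):
        """Learn from a historical series and return self."""

    @abstractmethod
    def predict(self, horizon):
        """Return a list with one forecast per future step."""


class LastValueForecaster(BaseForecasterSketch):
    """Toy model (assumption): repeats the last observed value."""

    def fit(self, series):
        if not series:
            raise ValueError("series must not be empty")
        self._last = float(series[-1])
        return self

    def predict(self, horizon):
        return [self._last] * horizon


class MeanForecaster(BaseForecasterSketch):
    """Toy model (assumption): repeats the historical mean."""

    def fit(self, series):
        if not series:
            raise ValueError("series must not be empty")
        self._mean = sum(series) / len(series)
        return self

    def predict(self, horizon):
        return [self._mean] * horizon


def ensemble_forecast(models, series, horizon):
    """Average the forecasts of any models that share the interface."""
    forecasts = [model.fit(series).predict(horizon) for model in models]
    return [sum(step) / len(step) for step in zip(*forecasts)]
```

Because `ensemble_forecast` only calls `fit` and `predict`, a new model can be added to the list without changing the combining code.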
SARIMA
studentprognose.models.sarima
SARIMAForecaster
predict_with_sarima_cumulative(data_cumulative, row, predict_year, predict_week, pred_len, skip_years=0, already_printed=False, min_training_year=2016)
Predicts pre-registrations with SARIMA per programme/origin/week for cumulative data.
Returns:
| Name | Type | Description |
|---|---|---|
| list | list | predictions for each future week, or empty list on error. |
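Since an empty list signals an error, callers need to check for it before using the result. The following is a minimal sketch of such a guard; the helper `label_predictions` is hypothetical and not part of `studentprognose`, and keying the result by "weeks ahead" is an assumption about how a caller might want to organise the values.

```python
def label_predictions(predictions):
    """Map each prediction to its offset in weeks ahead (1, 2, ...).

    Hypothetical helper: returns an empty dict when the forecaster
    reported an error by returning an empty list.
    """
    if not predictions:
        return {}
    return {offset: value for offset, value in enumerate(predictions, start=1)}
```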
predict_with_sarima_individual(data_individual, row, predict_year, predict_week, max_year, numerus_fixus_list, data_exog=None, already_printed=False)
Predicts the number of students with SARIMA per programme/origin/week for individual data.
Returns:
| Name | Type | Description |
|---|---|---|
| list | list | predictions for each future week, or empty list on error. |
XGBoost Classifier
studentprognose.models.xgboost_classifier
predict_applicant(data, predict_year, predict_week, max_year, data_cumulative=None, configuration=None)
Train an XGBoost classifier to predict individual applicant enrollment probability.
Returns:
| Type | Description |
|---|---|
| ndarray \| float | np.ndarray: predicted enrollment probabilities, or np.nan if no test data. |
XGBoost Regressor
studentprognose.models.xgboost_regressor
predict_with_xgboost(train, test, data_studentcount, extra_numeric_cols=None)
Train an XGBoost regressor to predict student counts from cumulative pre-application data.
Returns:
| Type | Description |
|---|---|
| ndarray \| float | np.ndarray: rounded integer predictions, or np.nan if no student count data. |
Ratio model
studentprognose.models.ratio
predict_with_ratio(data, data_cumulative, data_studentcount, numerus_fixus_list, predict_year)
Predict student influx using the ratio between pre-registrants and actual enrollments.
Uses a 3-year historical average ratio.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | The main output data (will be modified in place and returned). | required |
| data_cumulative | DataFrame | Cumulative pre-application data. | required |
| data_studentcount | DataFrame \| None | Actual student count data. | required |
| numerus_fixus_list | dict | Dict of numerus fixus programmes and their caps. | required |
| predict_year | int | The year being predicted. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | pd.DataFrame: The data with Prognose_ratio column added. |
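The idea behind a 3-year average ratio can be sketched in plain Python for a single programme. This is not the library's implementation: the function `predict_with_ratio_sketch`, its dict-based inputs, the direction of the ratio (enrollments divided by pre-registrants), and the way the numerus fixus cap is applied are all assumptions made for illustration.

```python
def predict_with_ratio_sketch(pre_registrants, enrollments,
                              current_pre_registrants, predict_year,
                              cap=None, n_years=3):
    """Forecast enrollments for one programme from a historical ratio.

    pre_registrants / enrollments: dicts mapping year -> count (assumed shape).
    cap: optional numerus fixus limit for the programme (assumed behaviour).
    """
    # Use the n_years immediately preceding the prediction year.
    years = range(predict_year - n_years, predict_year)
    ratios = [
        enrollments[year] / pre_registrants[year]
        for year in years
        # Skip years with missing data or zero pre-registrants.
        if year in enrollments and pre_registrants.get(year)
    ]
    if not ratios:
        return None  # no usable history

    average_ratio = sum(ratios) / len(ratios)
    forecast = current_pre_registrants * average_ratio
    if cap is not None:
        forecast = min(forecast, cap)
    return round(forecast)
```

For example, if in each of the three preceding years half of the pre-registrants enrolled, 300 current pre-registrants yield a forecast of 150, or 120 when a cap of 120 applies.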