
Technical details
This article describes the objects created and the calculations made
by the functions of wisselstroom.
Reading in data with read_bek_data()
A bekostigingsbestand can be considered a container for 5 different
sub files whose records are intermingled. Each sub file type has its
own interpretation of the columns. The first step in working with a
bekostigingsbestand is to read it in with the function read_bek_data().
While reading the data, this function performs some checks on the
filename, so be sure to use the original filename and the original data.
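The exact filename checks are not described here. As a rough illustration of why the original filename matters, the sketch below splits the example filename into parts. The helper parse_bek_filename() is hypothetical and not part of wisselstroom, and the meaning assigned to each part is an assumption based on the example file.

```r
# Hypothetical helper, not part of wisselstroom: split a file name such as
# "VLPBEK_2025_20240115_99XX.csv" into its four underscore-separated parts.
parse_bek_filename <- function(path) {
  stem  <- tools::file_path_sans_ext(basename(path))
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
  if (length(parts) != 4) {
    stop("unexpected file name: ", basename(path))
  }
  # The interpretation of the parts below is an assumption.
  list(
    type           = parts[1],
    year_funding   = parts[2],
    date_retrieval = as.Date(parts[3], format = "%Y%m%d"),
    brin_own       = parts[4]
  )
}

info <- parse_bek_filename("VLPBEK_2025_20240115_99XX.csv")
info$type
info$date_retrieval
```

A renamed file would fail a check of this kind, which is one reason to keep the filename as delivered.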
Code to make the object
The following code reads in a bekostigingsbestand:
# To read in your institution specific bekostigingsbestand,
# change path_to_file accordingly.
path_to_file <- file.path(system.file("extdata", package = 'wisselstroom'),
"VLPBEK_2025_20240115_99XX.csv")
# read the data and place it in an R data.frame
my_bek_data <- read_bek_data(path_to_file)
A closer look at the object
The resulting object my_bek_data is a data.frame with an extra attribute
denoting the type of bekostigingsbestand that was read in:
class(my_bek_data)
#> [1] "data.frame"
attributes(my_bek_data)$comment
#> [1] "VLPBEK"
Every row in the original csv file becomes a row in the data.frame. The
separator in the csv file splits each row into columns. All 25 columns
are read in as character. The columns are given generic names (V1, V2,
V3, …), since the content can have a different meaning depending on the
sub file type. To gain insight into wisselstromen, we do not need all
the content of the bekostigingsbestanden. In the example file, the
content that is not needed is x-ed out.
Have a look at the first 10 rows and 9 columns:
my_bek_data[1:10,1:9]
#> V1 V2 V3 V4 V5 V6 V7 V8 V9
#> 1 VLP 99XX 2025 20240115
#> 2 BLB 700010101 800010101
#> 3 BRD 700010101 800010101 99XX xxxxxxxx N xx xxxx 31001
#> 4 BRD 700010101 800010101 99XX xxxxxxxx J xx xxxx 31001
#> 5 BRD 700010101 800010101 99XX xxxxxxxx N xx xxxx 31001
#> 6 BRR 700010101 800010101 99XX xxxxxxxx N xxxxx xxxx x
#> 7 BRD 700010093 800010093 99XX xxxxxxxx N xx 31002
#> 8 BRD 700010078 800010078 99XX xxxxxxxx N xx 31003
#> 9 BLB 700010068 800010068
#> 10 BRD 700010068 800010068 99XX xxxxxxxx N xx 31004
The first column denotes the sub file type of the row:
- the first row, and only this row, is a “VLP” record (voorlooprecord) with metadata
- “BLB” records (bekostigingsloopbaan student) contain metadata regarding a student’s study career in HE
- “BRD” records (bekostigingsresultaat deelname) contain details about a specific enrolment of a student
- “BRR” records (bekostigingsresultaat resultaat) contain details about a specific degree obtained by a student
- the last row, and only this row, is a “SLR” record (sluitrecord) containing the number of BLB, BRD and BRR rows
This data.frame is useful if you want to see what the csv actually
contains. To gain wisselstroom insights, we do not use all columns, nor
all sub file types. We only need part of these data. Extracting that
part is the next step.
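Because the first column holds the sub file type, base R is enough to see how the record types are mixed. The sketch below uses a small toy data.frame as a stand-in for my_bek_data; its values are made up for illustration.

```r
# Toy stand-in for my_bek_data: all columns character, V1 holds the sub file type.
toy_bek <- data.frame(
  V1 = c("VLP", "BLB", "BRD", "BRD", "BRR", "SLR"),
  V2 = c("99XX", "700010101", "700010101", "700010101", "700010101", ""),
  V3 = c("2025", "800010101", "800010101", "800010101", "800010101", ""),
  stringsAsFactors = FALSE
)

# Count the rows per sub file type.
type_counts <- table(toy_bek$V1)
type_counts

# Split into one data.frame per sub file type.
by_type <- split(toy_bek, toy_bek$V1)
nrow(by_type$BRD)
```

The same two calls work on a real my_bek_data object, since it is an ordinary data.frame.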
Disentangle the data with make_flow_basics()
The previous step led to a data.frame in which all the sub file types
are still together. In this step the sub file types are separated, and
the data we need for our insights is extracted. Some new variables are
added, based on the data in the file.
Code to make the object
Assuming the object my_bek_data was already made in the previous step,
this line of code processes my_bek_data with the function
make_flow_basics() into a flow_basics object:
# make the my_flow_basics object from the my_bek_data object
my_flow_basics <- make_flow_basics(my_bek_data)
A closer look at the object
The newly made my_flow_basics object is of class flow_basics:
class(my_flow_basics)
#> [1] "flow_basics"
The structure of this object is a list containing 5 elements:
str(my_flow_basics)
#> List of 5
#> $ type : chr "VLPBEK"
#> $ brin_own : chr "99XX"
#> $ date_retrieval: Date[1:1], format: "2024-01-15"
#> $ enrolments :'data.frame': 181 obs. of 12 variables:
#> ..$ student_id : chr [1:181] "b700010101" "b700010101" "b700010101" "b700010093" ...
#> ..$ year_funding : chr [1:181] "2025" "2025" "2025" "2025" ...
#> ..$ BRIN : chr [1:181] "99XX" "99XX" "99XX" "99XX" ...
#> ..$ program_code : chr [1:181] "31001" "31001" "31001" "31002" ...
#> ..$ program_level : chr [1:181] "HBO-BA" "HBO-BA" "HBO-BA" "HBO-BA" ...
#> ..$ program_phase : chr [1:181] "D" "B" "B" "D" ...
#> ..$ date_enrolment : Date[1:181], format: "2022-09-01" "2023-09-01" ...
#> ..$ date_disenrolment: Date[1:181], format: "2023-06-25" "2024-08-31" ...
#> ..$ enrolment_form : chr [1:181] "S" "S" "S" "S" ...
#> ..$ program_form : chr [1:181] "DT" "DT" "DT" "DT" ...
#> ..$ sector : chr [1:181] "GEDRAG_EN_MAATSCHAPPIJ" "GEDRAG_EN_MAATSCHAPPIJ" "GEDRAG_EN_MAATSCHAPPIJ" "" ...
#> ..$ academic_year : chr [1:181] "2022/2023" "2023/2024" "2022/2023" "2022/2023" ...
#> ..- attr(*, "comment")= chr "VLPBEK"
#> $ degrees :'data.frame': 33 obs. of 10 variables:
#> ..$ student_id : chr [1:33] "b700010101" "b700010003" "b700010057" "b700010089" ...
#> ..$ year_funding : chr [1:33] "2025" "2025" "2025" "2025" ...
#> ..$ BRIN : chr [1:33] "99XX" "99XX" "99XX" "99XX" ...
#> ..$ program_code : chr [1:33] "31001" "31015" "31003" "31004" ...
#> ..$ program_level : chr [1:33] "HBO-BA" "HBO-BA" "HBO-BA" "HBO-BA" ...
#> ..$ program_phase : chr [1:33] "D" "D" "B" "B" ...
#> ..$ date_graduation: Date[1:33], format: "2023-06-25" "2023-07-07" ...
#> ..$ program_form : chr [1:33] "DT" "VT" "VT" "VT" ...
#> ..$ sector : chr [1:33] "GEDRAG_EN_MAATSCHAPPIJ" "TECHNIEK" "ECONOMIE" "TECHNIEK" ...
#> ..$ academic_year : chr [1:33] "2022/2023" "2022/2023" "2022/2023" "2022/2023" ...
#> ..- attr(*, "comment")= chr "VLPBEK"
#> - attr(*, "class")= chr "flow_basics"
The elements of the list:
- type: either “VLPBEK”, “DEFBEK” or “HISBEK”
- brin_own: the administrative number of the HEI concerned
- date_retrieval: the date the original bekostigingsbestand was made
- enrolments: a data.frame containing enrolment data
- degrees: a data.frame containing data on obtained degrees
If you want to address one of the 5 items in the my_flow_basics object,
you can do that with the $ operator, for instance:
- my_flow_basics$brin_own to get a character string containing the own brin
- my_flow_basics$enrolments to get a data.frame containing the enrolment data
type, brin_own, date_retrieval
The VLP record of the original data set is translated into these 3 items.
enrolments
- All the BRD records (enrolments) of the original data set are placed in the enrolments data.frame of the flow_basics object.
- Only the columns relevant for wisselstroom insights are included.
- A column academic_year is added, calculated from date_enrolment.
- A column student_id is calculated from two original columns in my_bek_data, V2 and V3:
  - V2 contains the BSN of the student
  - V3 contains the onderwijsnummer (educational number) of the student
  - if V2 has a value, student_id gets that value, prefixed with a “b”, short for BSN
  - if V2 does not have a value, student_id gets the value of V3, prefixed with an “e”, short for educational number
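The student_id rule can be sketched in a few lines of base R. The helper make_student_id() is hypothetical and not the package's own code, and it assumes that a missing BSN shows up as an empty string or NA.

```r
# Hypothetical helper illustrating the student_id rule; not the package's code.
# Assumption: a missing BSN is either an empty string or NA.
make_student_id <- function(bsn, onderwijsnummer) {
  has_bsn <- !is.na(bsn) & nzchar(bsn)
  ifelse(has_bsn, paste0("b", bsn), paste0("e", onderwijsnummer))
}

ids <- make_student_id(
  bsn             = c("700010101", "", NA),
  onderwijsnummer = c("800010101", "800010093", "800010078")
)
ids
#> [1] "b700010101" "e800010093" "e800010078"
```

The prefix keeps the two number ranges apart, so a BSN and an onderwijsnummer with the same digits cannot be confused.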
degrees
- All the BRR records (degrees) of the original data set are placed in the degrees data.frame of the flow_basics object.
- Only the columns relevant for wisselstroom insights are included.
- A column academic_year is added, calculated from date_graduation.
- A column student_id is calculated from two original columns in my_bek_data, V2 and V3:
  - V2 contains the BSN of the student
  - V3 contains the onderwijsnummer (educational number) of the student
  - if V2 has a value, student_id gets that value, prefixed with a “b”, short for BSN
  - if V2 does not have a value, student_id gets the value of V3, prefixed with an “e”, short for educational number
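How academic_year follows from a date can be sketched as below. The helper academic_year_of() is hypothetical, and the boundary of 1 September is an assumption. It does agree with the example output, where 2022-09-01 and 2023-06-25 both fall in 2022/2023.

```r
# Hypothetical helper; assumes academic years run from 1 September to 31 August.
academic_year_of <- function(date) {
  year  <- as.integer(format(date, "%Y"))
  month <- as.integer(format(date, "%m"))
  start <- ifelse(month >= 9, year, year - 1L)
  paste0(start, "/", start + 1L)
}

years <- academic_year_of(as.Date(c("2022-09-01", "2023-06-25", "2023-09-01")))
years
#> [1] "2022/2023" "2022/2023" "2023/2024"
```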
Remarks
- The rows containing the SLR record (totals per record type) and the BLB records (study career totals per student) are not needed for the wisselstroom calculation, hence ignored.
- This function also works for HISBEK data, so it can help you browse through those data. The HISBEK data itself is at the moment not further implemented in the package wisselstroom.
This object is useful if you want to have an overview of the
enrolments or the degrees. To gain wisselstroom insights, we need to
make a flow_insights object. This is the next step.
Make the insights with make_flow_insights()
The previous step led to a flow_basics object with the disentangled
data needed for flow insights. In this step the actual calculations are
done. Enrolments are condensed, and some new variables, including degree
information, are added, based on the data in the file.
The function make_flow_insights() only works for a flow_basics object
of type “VLPBEK” or “DEFBEK”, not for type “HISBEK”.
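In a script that handles several files, the type can be checked before calling make_flow_insights(). The helper can_make_insights() below is hypothetical, and the toy object only mimics the type element and class of a real flow_basics object.

```r
# Toy stand-in for a flow_basics object; only the 'type' element matters here.
toy_basics <- structure(list(type = "HISBEK"), class = "flow_basics")

# Hypothetical check, meant to run before calling make_flow_insights().
can_make_insights <- function(x) {
  inherits(x, "flow_basics") && x$type %in% c("VLPBEK", "DEFBEK")
}

can_make_insights(toy_basics)
#> [1] FALSE
```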
Code to make the object
Assuming the object my_flow_basics was already made in the previous
step, this line of code processes my_flow_basics with the function
make_flow_insights() into a flow_insights object:
# make the flow_insights object from the my_flow_basics object
my_flow_insights <- make_flow_insights(my_flow_basics)
A closer look at the object
The newly made my_flow_insights object is of class flow_insights:
class(my_flow_insights)
#> [1] "flow_insights"
The structure of this object is a list containing 8 elements:
str(my_flow_insights)
#> List of 8
#> $ type : chr "VLPBEK"
#> $ brin_own : chr "99XX"
#> $ date_retrieval : Date[1:1], format: "2024-01-15"
#> $ enrolments_degrees_compact: tibble [163 × 35] (S3: tbl_df/tbl/data.frame)
#> ..$ academic_year : chr [1:163] "2022/2023" "2022/2023" "2022/2023" "2022/2023" ...
#> ..$ student_id : chr [1:163] "b700010000" "b700010003" "b700010005" "b700010006" ...
#> ..$ BRIN : chr [1:163] "99XX" "99XX" "99XX" "99XX" ...
#> ..$ program_code : chr [1:163] "31009" "31015" "31022" "31015" ...
#> ..$ program_level : chr [1:163] "HBO-BA" "HBO-BA" "HBO-BA" "HBO-BA" ...
#> ..$ program_form : chr [1:163] "VT" "VT" "VT" "VT" ...
#> ..$ date_enrolment : Date[1:163], format: "2022-09-01" "2022-09-01" ...
#> ..$ date_disenrolment : Date[1:163], format: "2023-08-31" "2023-08-31" ...
#> ..$ date_graduation_D : Date[1:163], format: NA "2023-07-07" ...
#> ..$ date_graduation_B : Date[1:163], format: NA NA ...
#> ..$ date_graduation_M : Date[1:163], format: NA NA ...
#> ..$ date_graduation_A : logi [1:163] NA NA NA NA NA NA ...
#> ..$ enrolment_type : chr [1:163] "single" "single" "single" "single" ...
#> ..$ enrolment : chr [1:163] "HBO-BA-99XX-31009" "HBO-BA-99XX-31015" "HBO-BA-99XX-31022" "HBO-BA-99XX-31015" ...
#> ..$ situation_degree : chr [1:163] NA "D" NA "D" ...
#> ..$ situation_brin : chr [1:163] "brin_own" "brin_own" "brin_own" "brin_own" ...
#> ..$ situations_brin : chr [1:163] "brin_own" "brin_own" "brin_own" "brin_own" ...
#> ..$ situations_level : chr [1:163] "HBO-BA" "HBO-BA" "HBO-BA" "HBO-BA" ...
#> ..$ situations_degree : chr [1:163] "" "D" "" "D" ...
#> ..$ n_enrolments : int [1:163] 1 1 1 1 1 1 2 2 1 1 ...
#> ..$ all_enrolments : chr [1:163] "HBO-BA-99XX-31009" "HBO-BA-99XX-31015" "HBO-BA-99XX-31022" "HBO-BA-99XX-31015" ...
#> ..$ final_degree : logi [1:163] FALSE FALSE FALSE FALSE FALSE FALSE ...
#> ..$ n_final_degrees_minyear : int [1:163] 0 0 0 0 0 0 1 1 0 0 ...
#> ..$ all_enrolments_otheryear : chr [1:163] "HBO-BA-99XX-31009" "HBO-BA-99XX-31015" NA "HBO-BA-99XX-31015" ...
#> ..$ n_enrolments_otheryear : num [1:163] 1 1 0 1 1 0 1 1 1 1 ...
#> ..$ situations_brin_otheryear: chr [1:163] "brin_own" "brin_own" "outside HE" "brin_own" ...
#> ..$ suffix : chr [1:163] "(1_1)" "(1_1)" "(1_0)" "(1_1)" ...
#> ..$ enrolment_is_in_bothyears: logi [1:163] TRUE TRUE FALSE TRUE FALSE FALSE ...
#> ..$ student_is_in_bothyears : logi [1:163] TRUE TRUE FALSE TRUE TRUE FALSE ...
#> ..$ all_final_degrees_minyear: logi [1:163] FALSE FALSE FALSE FALSE FALSE FALSE ...
#> ..$ any_final_degrees_minyear: logi [1:163] FALSE FALSE FALSE FALSE FALSE FALSE ...
#> ..$ all_new_otheryear : logi [1:163] FALSE FALSE NA FALSE TRUE NA ...
#> ..$ any_new_otheryear : logi [1:163] FALSE FALSE NA FALSE TRUE NA ...
#> ..$ flow : chr [1:163] "stay" "stay" "stop" "stay" ...
#> ..$ flow_direction : chr [1:163] "flow_to" "flow_to" "flow_to" "flow_to" ...
#> $ switches : tibble [9 × 8] (S3: tbl_df/tbl/data.frame)
#> ..$ from_academic_year: chr [1:9] "2022/2023" "2022/2023" "2022/2023" "2022/2023" ...
#> ..$ from_brin : chr [1:9] "73CC" "73CC" "74DD" "99XX" ...
#> ..$ from_program : chr [1:9] "51021" "51025" "51025" "31005" ...
#> ..$ to_academic_year : chr [1:9] "2023/2024" "2023/2024" "2023/2024" "2023/2024" ...
#> ..$ to_enrolments : chr [1:9] "HBO-BA-99XX-31007" "HBO-BA-99XX-31009" "HBO-BA-99XX-31009" "HBO-BA-71AA-31005" ...
#> ..$ total_switch : int [1:9] 1 1 1 1 1 1 1 1 1
#> ..$ with_prop : int [1:9] 0 0 0 0 0 0 0 0 0
#> ..$ other : int [1:9] 1 1 1 1 1 1 1 1 1
#> $ stacks : tibble [1 × 7] (S3: tbl_df/tbl/data.frame)
#> ..$ from_academic_year: chr "2022/2023"
#> ..$ from_brin : chr "99XX"
#> ..$ from_program : chr "31009"
#> ..$ with_degree : chr "B"
#> ..$ to_academic_year : chr "2023/2024"
#> ..$ to_enrolments : chr "WO-MA-77GG-61031"
#> ..$ total_stack : int 1
#> $ summary_situations_brin : tibble [6 × 3] (S3: tbl_df/tbl/data.frame)
#> ..$ situation_brin_2022/2023: chr [1:6] "brin_own" "brin_own" "brin_own" "brin_own & other HE" ...
#> ..$ situation_brin_2023/2024: chr [1:6] "brin_own" "other HE" "outside HE" "other HE" ...
#> ..$ n_students : int [1:6] 49 4 24 1 2 24
#> $ summary_situations_level : tibble [10 × 3] (S3: tbl_df/tbl/data.frame)
#> ..$ situation_level_2022/2023: chr [1:10] "HBO-AD" "HBO-BA" "HBO-BA" "HBO-BA" ...
#> ..$ situation_level_2023/2024: chr [1:10] "outside HE" "HBO-BA" "WO-MA" "outside HE" ...
#> ..$ n_students : int [1:10] 1 50 1 17 1 2 6 2 18 6
#> - attr(*, "class")= chr "flow_insights"
The elements of the list:
- type: either “VLPBEK” or “DEFBEK”
- brin_own: the administrative number of the HEI concerned
- date_retrieval: the date the original bekostigingsbestand was made
- enrolments_degrees_compact: a data.frame with compact enrolment data, adorned with degree data if applicable
- switches: a data.frame containing data on switches (from a program without a degree to another program)
- stacks: a data.frame containing data on stacks (from a program with a degree to another program)
- summary_situations_brin: the brin situation per academic year, summarised over students
- summary_situations_level: the level situation per academic year, summarised over students
type, brin_own, date_retrieval
These items are copied from the flow_basics object that was used as
argument.
enrolments_degrees_compact
- An enrolment is seen as the combination of academic year, student, program and institution.
- A DEFBEK or VLPBEK contains data from two academic years.
- In the enrolments data.frame in flow_basics, there can be multiple rows for a student enrolled in the same program at the same institution in the same academic year, differing only in dates of enrolment or disenrolment, or in program form. In this data.frame these rows are condensed into one.
- To this starting data.frame the graduation dates are added, including that of a propedeutic degree when present. Some extra variables are added, some at enrolment level, some at student level.
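The condensing step can be sketched in base R on toy data. This is not the package's implementation: the toy values are made up, and the " & " separator for combined program forms is an assumption.

```r
# Toy enrolments: two rows for the same enrolment in 2022/2023, one in 2023/2024.
toy_enrolments <- data.frame(
  academic_year     = c("2022/2023", "2022/2023", "2023/2024"),
  student_id        = c("b700010101", "b700010101", "b700010101"),
  BRIN              = c("99XX", "99XX", "99XX"),
  program_code      = c("31001", "31001", "31001"),
  program_form      = c("DT", "VT", "DT"),
  date_enrolment    = as.Date(c("2022-09-01", "2023-02-01", "2023-09-01")),
  date_disenrolment = as.Date(c("2023-01-31", "2023-08-31", "2024-08-31")),
  stringsAsFactors  = FALSE
)

# One group per academic year, student, institution and program.
keys   <- c("academic_year", "student_id", "BRIN", "program_code")
groups <- split(toy_enrolments, toy_enrolments[keys], drop = TRUE)

# Keep the earliest enrolment date, the latest disenrolment date,
# and combine the program forms (separator is an assumption).
compact <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    g[1, keys],
    program_form      = paste(sort(unique(g$program_form)), collapse = " & "),
    date_enrolment    = min(g$date_enrolment),
    date_disenrolment = max(g$date_disenrolment),
    stringsAsFactors  = FALSE
  )
}))
rownames(compact) <- NULL
compact
```

The three toy rows collapse into two, one per academic year, which is the granularity at which the flows are determined.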
Variables in enrolments_degrees_compact:
variables | explanations |
---|---|
academic_year | academic year of date_enrolment |
student_id | unique identifier student |
BRIN | institution of higher education of enrolment |
program_code | isat/croho of study program |
program_level | level of program, one of HBO-AD, HBO-BA, HBO-MA, WO-BA, WO-MA |
program_form | form of enrolment, combination of VT (full-time), DT (part-time), DU (dual) |
date_enrolment | earliest enrolment date in this academic year for this enrolment |
date_disenrolment | latest disenrolment date in this academic year for this enrolment |
date_graduation_D | graduation date propedeutic exam |
date_graduation_B | graduation date bachelor exam |
date_graduation_M | graduation date master exam |
date_graduation_A | graduation date associate degree exam |
enrolment_type | single (when student only has one enrolment in this academic year), else multiple |
enrolment | combination of program level, BRIN and program code |
situation_degree | when awarded, the type of degree: D, B, M, A |
situation_brin | one of ‘brin_own’ or ‘other HE’ |
situations_brin | combination of situation_brin over all enrolments of student in this academic year |
situations_level | combination of program_level over all enrolments of student in this academic year |
situations_degree | combination of situation_degree over all enrolments of student in this academic year |
n_enrolments | number of enrolments of this student in this academic year |
all_enrolments | combination of enrolment over all enrolments of student in this academic year |
final_degree | TRUE if situation_degree is B, M or A, else FALSE |
n_final_degrees_minyear | the number of final degrees obtained in first year |
all_enrolments_otheryear | combination of enrolment over all enrolments of student in the other academic year in the data |
n_enrolments_otheryear | number of enrolments of this student in the other academic year in the data |
situations_brin_otheryear | combination of situation_brin over all enrolments of student in the other academic year in the data |
suffix | (n_enrolments _ n_enrolments_otheryear) |
enrolment_is_in_bothyears | TRUE or FALSE |
student_is_in_bothyears | TRUE or FALSE |
all_final_degrees_minyear | TRUE if all enrolments of student in first year of data end in final degree |
any_final_degrees_minyear | TRUE if any enrolment of student in first year of data ends in final degree |
all_new_otheryear | TRUE if all enrolments of student in other year are not in this year |
any_new_otheryear | TRUE if any enrolment of student in other year is not in this year |
flow | one of stay, switch, start, stop, stack, special |
flow_direction | direction of the flow, as seen from first year in data or last year |
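Two of the simpler derived variables, final_degree and suffix, can be reproduced on toy data from the explanations in the table. The sketch below is not the package's code. It only mirrors the definitions given above.

```r
# Toy rows with the columns that final_degree and suffix are derived from.
toy_compact <- data.frame(
  student_id             = c("b700010000", "b700010003", "b700010005"),
  situation_degree       = c(NA, "D", "B"),
  n_enrolments           = c(1L, 1L, 1L),
  n_enrolments_otheryear = c(1, 1, 0),
  stringsAsFactors       = FALSE
)

# final_degree: TRUE only for a bachelor, master or associate degree.
toy_compact$final_degree <- toy_compact$situation_degree %in% c("B", "M", "A")

# suffix: enrolment counts in this year and in the other year.
toy_compact$suffix <- paste0(
  "(", toy_compact$n_enrolments, "_", toy_compact$n_enrolments_otheryear, ")"
)
toy_compact[, c("student_id", "final_degree", "suffix")]
```

A propedeutic degree (D) does not count as a final degree, so the second toy student keeps final_degree equal to FALSE.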
switches
A data.frame that summarises enrolments_degrees_compact by counting the
switches from the first of the two academic years in the data to the
second academic year. A student can switch to more than one program.
Variables in switches:
variables | explanations |
---|---|
from_academic_year | the academic year of the enrolment that stops |
from_brin | the brin of the enrolment that stops |
from_program | the program code of the enrolment that stops |
to_academic_year | the following academic year |
to_enrolments | all enrolments in the following academic year |
total_switch | the number of times this switch occurs in the data |
with_prop | and how often with a propedeutic exam |
other | and how often without a propedeutic exam |
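The relation between the three count columns (total_switch = with_prop + other) can be illustrated with a toy aggregation. The input below is made up, with one row per switching student, and has_prop is a hypothetical helper column, not a column of the package.

```r
# Toy input: one row per switching student; has_prop is a hypothetical flag
# for having obtained a propedeutic exam in the program that was left.
toy_switchers <- data.frame(
  from_brin     = c("99XX", "99XX", "99XX"),
  from_program  = c("31005", "31005", "31007"),
  to_enrolments = c("HBO-BA-71AA-31005", "HBO-BA-71AA-31005", "HBO-BA-99XX-31009"),
  has_prop      = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Each row counts as one switch; with_prop counts the rows with a propedeutic exam.
toy_switchers$total_switch <- 1L
toy_switchers$with_prop    <- as.integer(toy_switchers$has_prop)

counts <- aggregate(
  cbind(total_switch, with_prop) ~ from_brin + from_program + to_enrolments,
  data = toy_switchers,
  FUN  = sum
)
counts$other <- counts$total_switch - counts$with_prop
counts
```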
stacks
A data.frame that summarises enrolments_degrees_compact by counting the
stacks from the first of the two academic years in the data to the
second academic year.
Variables in stacks:
variables | explanations |
---|---|
from_academic_year | the academic year of the enrolment that stops |
from_brin | the brin of the enrolment that stops |
from_program | the program code of the enrolment that stops |
with_degree | the final degree that was obtained for that program |
to_academic_year | the following academic year |
to_enrolments | all enrolments in the following academic year |
total_stack | the number of times this stack occurs in the data |