Data collected during the 2005-2006 school year from two public schools in Portugal. The data come from two sources: school records (e.g. grades, number of school absences) and self-report questionnaires (e.g. workday and weekend alcohol consumption, parents' jobs, quality of family relationships, frequency of going out with friends). The data were originally collected and first studied by Cortez and Silva (2008).
student
A data frame with 395 observations and 45 variables:
alc
student's alcohol consumption (binary: 0 = low, 1 = high)
schoolMS
student's school (binary: 0 = Gabriel Pereira, 1 = Mousinho da Silveira)
sexM
student's gender (binary: 0 = female, 1 = male)
age
student's age (standardized)
addressU
student's home address type (binary: 0 = rural, 1 = urban)
famsizeLE3
family size (binary: 0 = greater than 3, 1 = less than or equal to 3)
PstatusT
parents' cohabitation status (binary: 0 = living apart, 1 = living together)
Mjobat_home
mother's area of professional activity (binary: 0 = others, 1 = at home)
Mjobhealth
mother's area of professional activity (binary: 0 = others, 1 = health care related)
Mjobservices
mother's area of professional activity (binary: 0 = others, 1 = public services)
Mjobteacher
mother's area of professional activity (binary: 0 = others, 1 = teacher)
Fjobat_home
father's area of professional activity (binary: 0 = others, 1 = at home)
Fjobhealth
father's area of professional activity (binary: 0 = others, 1 = health care related)
Fjobservices
father's area of professional activity (binary: 0 = others, 1 = public services)
Fjobteacher
father's area of professional activity (binary: 0 = others, 1 = teacher)
reasoncourse
reason for choosing the school (binary: 0 = others, 1 = course preference)
reasonhome
reason for choosing the school (binary: 0 = others, 1 = close to home)
reasonreputation
reason for choosing the school (binary: 0 = others, 1 = school reputation)
guardianfather
student's guardian (binary: 0 = others, 1 = father)
guardianmother
student's guardian (binary: 0 = others, 1 = mother)
traveltime2
home to school travel time (binary: 0 = others, 1 = 15 to 30 min)
traveltime3
home to school travel time (binary: 0 = others, 1 = more than 30 min)
studytime2
weekly study time (binary: 0 = others, 1 = between 2 and 5 hours)
studytime3
weekly study time (binary: 0 = others, 1 = between 5 and 10 hours)
studytime4
weekly study time (binary: 0 = others, 1 = more than 10 hours)
failures1
number of past class failures (binary: 0 = others, 1 = one)
failures2
number of past class failures (binary: 0 = others, 1 = two or more)
schoolsupyes
extra educational school support (binary: 0 = no, 1 = yes)
famsupyes
family educational support (binary: 0 = no, 1 = yes)
paidyes
extra paid classes (binary: 0 = no, 1 = yes)
activitiesyes
extra-curricular activities (binary: 0 = no, 1 = yes)
nurseryyes
attended nursery school (binary: 0 = no, 1 = yes)
higheryes
willing to take higher education (binary: 0 = no, 1 = yes)
internetyes
Internet access at home (binary: 0 = no, 1 = yes)
romanticyes
involved in a romantic relationship (binary: 0 = no, 1 = yes)
famrel
standardized quality of family relationship (original data from 1 to 5)
freetime2
free time after school (binary: 0 = others, 1 = low)
freetime3
free time after school (binary: 0 = others, 1 = medium)
freetime4
free time after school (binary: 0 = others, 1 = high)
freetime5
free time after school (binary: 0 = others, 1 = very high)
goout
standardized frequency of going out with friends (original data from 1 to 5)
absences
standardized number of school absences (original data from 0 to 75)
G1
standardized first period grade (original data from 3 to 19)
G2
standardized second period grade (original data from 0 to 19)
G3
standardized final period grade (original data from 0 to 20)
Cortez P, Silva AMG (2008). “Using data mining to predict secondary school student performance.” In Proceedings of 5th Annual Future Business Technology Conference, Porto, 2008, 5-12.
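The variable list above uses two encodings: binary indicator columns for categorical answers and standardized versions of numeric ones. This page does not say how the columns were prepared, so the following is only a minimal sketch on invented toy data. It assumes treatment (dummy) coding via `model.matrix()` and standardization with `scale()`, which uses the sample mean and sample standard deviation. The data frame `toy` and its values are hypothetical and are not taken from `student`.

```r
# Hypothetical toy data; values are invented for illustration only.
toy <- data.frame(
  age       = c(15, 16, 17, 18),
  sex       = factor(c("F", "M", "M", "F")),
  studytime = factor(c(1, 2, 3, 4))
)

# Standardize a numeric column: subtract the mean, divide by the sample sd.
# Assumption: the dataset's "standardized" columns follow this convention.
age_std <- as.numeric(scale(toy$age))

# Treatment coding: the first factor level is the baseline ("others" or 0),
# and every other level gets its own 0/1 indicator column.
X <- model.matrix(~ sex + studytime, data = toy)

colnames(X)
# "(Intercept)" "sexM" "studytime2" "studytime3" "studytime4"
```

Under this coding a row with all indicators of a group equal to 0 belongs to the baseline level. For example, `studytime2 = studytime3 = studytime4 = 0` would correspond to the lowest study-time category, which has no column of its own.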