Data collected during the 2005-2006 school year from two public schools in Portugal. The data comes from two sources: school records (e.g. grades, number of school absences) and self-reported questionnaires (e.g. workday and weekend alcohol consumption, parents' jobs, quality of family relationships, frequency of going out with friends). It was originally collected and first studied in Cortez and Silva (2008).
student: A data frame with 395 observations and 45 variables:
alc: student's alcohol consumption (binary: 0 = low, 1 = high)
schoolMS: student's school (binary: 0 = Gabriel Pereira, 1 = Mousinho da Silveira)
sexM: student's gender (binary: 0 = female, 1 = male)
age: student's age (standardized)
addressU: student's home address type (binary: 0 = rural, 1 = urban)
famsizeLE3: family size (binary: 0 = greater than 3, 1 = less than or equal to 3)
PstatusT: parents' cohabitation status (binary: 0 = living apart, 1 = living together)
Mjobat_home: mother's area of professional activity (binary: 0 = others, 1 = at home)
Mjobhealth: mother's area of professional activity (binary: 0 = others, 1 = health care related)
Mjobservices: mother's area of professional activity (binary: 0 = others, 1 = public services)
Mjobteacher: mother's area of professional activity (binary: 0 = others, 1 = teacher)
Fjobat_home: father's area of professional activity (binary: 0 = others, 1 = at home)
Fjobhealth: father's area of professional activity (binary: 0 = others, 1 = health care related)
Fjobservices: father's area of professional activity (binary: 0 = others, 1 = public services)
Fjobteacher: father's area of professional activity (binary: 0 = others, 1 = teacher)
reasoncourse: reason for choosing the school (binary: 0 = others, 1 = course preference)
reasonhome: reason for choosing the school (binary: 0 = others, 1 = close to home)
reasonreputation: reason for choosing the school (binary: 0 = others, 1 = school reputation)
guardianfather: student's guardian (binary: 0 = others, 1 = father)
guardianmother: student's guardian (binary: 0 = others, 1 = mother)
traveltime2: home to school travel time (binary: 0 = others, 1 = 15 to 30 min)
traveltime3: home to school travel time (binary: 0 = others, 1 = more than 30 min)
studytime2: weekly study time (binary: 0 = others, 1 = 2 to 5 hours)
studytime3: weekly study time (binary: 0 = others, 1 = 5 to 10 hours)
studytime4: weekly study time (binary: 0 = others, 1 = more than 10 hours)
failures1: number of past class failures (binary: 0 = others, 1 = one)
failures2: number of past class failures (binary: 0 = others, 1 = two or more)
schoolsupyes: extra educational school support (binary: 0 = no, 1 = yes)
famsupyes: family educational support (binary: 0 = no, 1 = yes)
paidyes: extra paid classes (binary: 0 = no, 1 = yes)
activitiesyes: extra-curricular activities (binary: 0 = no, 1 = yes)
nurseryyes: attended nursery school (binary: 0 = no, 1 = yes)
higheryes: willing to pursue higher education (binary: 0 = no, 1 = yes)
internetyes: Internet access at home (binary: 0 = no, 1 = yes)
romanticyes: involved in a romantic relationship (binary: 0 = no, 1 = yes)
famrel: quality of family relationships (standardized; original data from 1 to 5)
freetime2: free time after school (binary: 0 = others, 1 = low)
freetime3: free time after school (binary: 0 = others, 1 = medium)
freetime4: free time after school (binary: 0 = others, 1 = high)
freetime5: free time after school (binary: 0 = others, 1 = very high)
goout: frequency of going out with friends (standardized; original data from 1 to 5)
absences: number of school absences (standardized; original data from 0 to 75)
G1: first period grade (standardized; original data from 3 to 19)
G2: second period grade (standardized; original data from 0 to 19)
G3: final period grade (standardized; original data from 0 to 20)
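The variable names and codings above are consistent with indicator (dummy) coding of categorical fields and z-score standardization of numeric ones. The following is a minimal sketch of how such columns could be produced in base R; it is not necessarily the preprocessing used to build this data set. The toy data frame raw, its values, the reference level "other", and the use of scale() with the sample standard deviation are all assumptions made for illustration.

```r
# Toy stand-in for a few raw questionnaire fields (values are made up,
# not taken from the actual data set).
raw <- data.frame(
  sex  = factor(c("F", "M", "M", "F")),
  Mjob = factor(c("at_home", "teacher", "other", "health"),
                levels = c("other", "at_home", "health", "services", "teacher")),
  age  = c(15, 16, 18, 17)
)

# model.matrix() expands each factor into 0/1 indicator columns and drops the
# first level as the reference category, so a row with every Mjob indicator
# equal to 0 corresponds to the assumed reference level "other".
X <- model.matrix(~ sex + Mjob, data = raw)[, -1]  # remove the intercept column

# scale() centres a numeric vector to mean 0 and rescales it to standard
# deviation 1 (assumed here to be what "standardized" means).
age_std <- as.numeric(scale(raw$age))

colnames(X)
```

Under these assumptions the indicator columns are named by pasting the factor name and the level together, which matches names such as sexM and Mjobat_home in the list above.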
Cortez P, Silva AMG (2008). “Using data mining to predict secondary school student performance.” In Proceedings of 5th Annual Future Business Technology Conference, Porto, 2008, 5-12.