Things I’ve Worked On: Select Saylor Reports

I’ve done a good deal of work for Saylor Academy that may have value for other educational organizations, so, in the spirit of open education, I am sharing a selection of the resulting reports here. The methods range from simple to relatively advanced and from parametric to non-parametric, so there should be a little something for everyone.


Data Report 1: Course Feature Selection

The aim of this analysis is to assess the quality of courses and to determine which course features best predict course quality, using both parametric (OLS) and non-parametric (support vector machine) methods. This information can be used by the education team to identify which dimensions of courses to consider, by the communications team to determine which courses to promote, and by the tech team to determine which features to improve or prioritize. This analysis is a first step, not the last, in analyzing courses and their features. The methods can be improved with additional data (both new samples and new dimensions, particularly new input variables) or complemented with new models, so I welcome any suggestions.

In part 2 of the report, new variables were added after soliciting input from the Saylor team on the input variables and results of the first analysis. Including these variables in a regression introduces considerable collinearity; therefore, principal components regression and LASSO regression are used to reduce the dimensionality of the final models, which focus specifically on completions.
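To illustrate why an L1 penalty helps with collinear inputs, here is a minimal from-scratch sketch of LASSO fitted by coordinate descent. It is not the code used in the report: the function names, the toy data, and the penalty value `alpha` are all assumptions made for illustration, and a real analysis would more likely rely on an established statistics library. Principal components regression is not covered here.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / (2n)) * ||y - Xb||^2 + alpha * ||b||_1, with no intercept.

    X is a list of rows, y a list of targets. Illustrative only.
    """
    n, p = len(X), len(X[0])
    coefs = [0.0] * p
    residual = list(y)  # equals y - Xb while every coefficient is zero
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual, adding back its own contribution.
            rho = sum(c * (r + c * coefs[j]) for c, r in zip(col, residual)) / n
            new_coef = soft_threshold(rho, alpha) / norm
            delta = new_coef - coefs[j]
            residual = [r - c * delta for c, r in zip(col, residual)]
            coefs[j] = new_coef
    return coefs


# Toy data (hypothetical): x2 duplicates x1 exactly, x3 is unrelated to y.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = list(x1)
x3 = [1.0, -2.0, 2.0, -2.0, 1.0]
X = [list(row) for row in zip(x1, x2, x3)]
y = [3.0 * v for v in x1]

coefs = lasso_coordinate_descent(X, y, alpha=0.1)
print(coefs)
```

On this toy data, the fit keeps a single coefficient near 2.95 for the first column and leaves the duplicate and the unrelated column at (approximately) zero. With an exact duplicate the split between the two columns is not unique; this particular update order simply happens to favor the first one.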


Data Report 4: Learner Speed

This report looks at the amount of time spent on Saylor Academy courses among both complete course engagements (i.e., course enrollments that led to a course completion as of the data download date) and incomplete course engagements (i.e., course enrollments that did not lead to a course completion as of the download date). The results provide timeframes during which Saylor can encourage continued engagement in a course, as well as a general sense of the course length Saylor students are willing to tolerate.
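As a rough illustration of the comparison described above, the sketch below splits engagements into complete and incomplete groups and summarizes the days spent in each. The record layout, the dates, and the use of a "last activity" date to measure time on incomplete engagements are assumptions for this example, not details taken from the report.

```python
from datetime import date
from statistics import median

# Hypothetical records: (enrolled_on, last_activity_on, completed)
engagements = [
    (date(2017, 1, 1), date(2017, 1, 15), True),
    (date(2017, 1, 10), date(2017, 3, 1), True),
    (date(2017, 2, 1), date(2017, 2, 3), False),
    (date(2017, 2, 5), date(2017, 2, 25), False),
    (date(2017, 3, 1), date(2017, 3, 31), True),
]


def days_spent(records, completed):
    """Days between enrollment and last activity for one group of engagements."""
    return [(last - start).days for start, last, done in records if done == completed]


complete_days = days_spent(engagements, True)
incomplete_days = days_spent(engagements, False)

print("complete:", sorted(complete_days), "median:", median(complete_days))
print("incomplete:", sorted(incomplete_days), "median:", median(incomplete_days))
```

Medians are used here because time-to-completion data tend to be skewed by a few very long engagements; other summaries, such as quartiles, would fit the same structure.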


Data Report 8: Review 2017

This report reviews Saylor Academy’s main data points for 2017. In particular, it examines traffic, registrations, enrollments, and completions through comparisons both across years and within 2017.


Data Report 9: Course Drain

This report investigates the ‘drain’ of users from individual courses using the following metrics and the ratios between them: traffic, enrollment, attempts (of the final), and passes (of the final). Understanding the drain of specific courses will allow us to identify the most and least successful courses in terms of user retention, as well as specific bottlenecks in courses.
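The ratios between adjacent stages can be sketched as a simple funnel calculation. The class, field names, course code, and counts below are hypothetical and only meant to show the idea; the report's own definitions of each metric may differ.

```python
from dataclasses import dataclass


@dataclass
class CourseCounts:
    """Hypothetical per-course counts; field names are illustrative."""
    course: str
    traffic: int
    enrollments: int
    attempts: int
    passes: int


def drain_ratios(c):
    """Return the share of users retained at each step of the funnel."""
    def ratio(num, den):
        # Guard against empty stages so a course with no traffic does not crash.
        return num / den if den else None

    return {
        "traffic_to_enrollment": ratio(c.enrollments, c.traffic),
        "enrollment_to_attempt": ratio(c.attempts, c.enrollments),
        "attempt_to_pass": ratio(c.passes, c.attempts),
    }


def bottleneck(c):
    """Name the step with the lowest retention, or None if nothing is computable."""
    ratios = {k: v for k, v in drain_ratios(c).items() if v is not None}
    return min(ratios, key=ratios.get) if ratios else None


example = CourseCounts("EXAMPLE101", traffic=1000, enrollments=200, attempts=50, passes=40)
print(drain_ratios(example))
print(bottleneck(example))
```

Comparing each step to the one before it, rather than to total traffic, makes it easier to see where in a given course users drop off most sharply.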


Data Report 17: Early Adopter Saylor Profiles

In this exploratory data analysis, I use a number of student features to group recent Saylor profiles. In particular, I group students based on their reason for joining, subject matter, country, and enrollment-completion dyads. This information is helpful for understanding the profile of students who have adopted Saylor Academy at this stage in our development, and for identifying potential paths of least resistance going forward.