Sarah Lim

Computer Science

Northwestern University

sarah [at] (this domain) [dot] com

Photograph taken in the Smoky Mountains
Sk8er Boi was a pretty good song


I am a research intern at Microsoft Research Cambridge, UK, working with Gavin Smyth and Siân Lindley in the Future of Work group within Human Experience and Design. Starting in October, I will join the Early Product Development group at Khan Academy as a software engineer and researcher.

My research interests lie at the intersection of human-computer interaction (HCI), programming languages, software engineering, and computing education. I particularly enjoy working on developer tools and programming environments to scaffold and support non-expert programmers.

I recently graduated with a BA in Computer Science from Northwestern University, where I was supervised by Haoqi Zhang, Nell O'Rourke, and Jason Hartline. My studies were generously supported by the Google Lime Scholarship, Palantir Women in Technology Scholarship, Microsoft Tuition Scholarship, and Box Engineering Diversity Scholarship. Thanks, companies!

Lately (or not-so-lately), I've gotten into browser engines, type systems, crossword puzzles, classical music and jazz, policy debate, document preparation, cognitive disability advocacy, and certain games.


Aug 2018

New paper at UIST 2018! Ply: A Visual Web Inspector for Learning from Professional Webpages significantly extends our previous work on visual regression pruning.

Jul 2018

Started an internship at MSR Cambridge. Here, "rocket" is arugula and "maths" is math.

Oct 2017

I'm joining Khan Academy Early Product Development full-time after graduation!

May 2017

Spoke at the Northwestern Big Ideas Forum, "How We Learn About Learning," with professors Nell O'Rourke and David Uttal, and fellow undergrad Gabby Ashenafi.

May 2017

Ply wins the CHI 2017 Student Research Competition! Northwestern Engineering has a nice write-up about the whole thing.

Apr 2017

Received a Microsoft Tuition Scholarship for 2017-18.

Jan 2017

Ply: Visual Regression Pruning for Web Design Source Inspection is accepted to the CHI 2017 SRC.


Recent escapades in research, development, and coursework.

Ply: Visual web inspection

Delta Lab


CSS is syntactically straightforward, but has a steep learning curve and complicated semantics. Inspecting the source of existing webpages can help illustrate concepts, but such webpages are typically too complex to serve as useful learning materials. Drawing inspiration from prior research in both software engineering and the learning sciences, we present a new web inspection tool capable of pruning irrelevant CSS and identifying implicit dependencies between properties. Supervised by Haoqi Zhang and Nell O'Rourke. UIST 2018, Berlin, Germany.

Evaluating peer graders

Northwestern University


Most of the literature on peer grading focuses on inferring a true grade from a set of noisy reports. We study a different problem: inferring the skill and effort of reviewers from the same reports. Supervised by Jason Hartline.

Tracing WebAssembly function calls

EECS 396: Systems Programming in Rust


Meg Grasse and I, with mentorship from Nick Fitzgerald and Jim Blandy, developed a proof-of-concept tool for instrumenting WebAssembly binaries written in Rust to log function calls at runtime.

Classroom exercise reports

Khan Academy


I developed new exercise reports to help teachers visualize student progress and work through problems in the classroom. Mentored by John Resig during my internship at Khan Academy.

Visual regression pruning

Delta Lab


We introduce a visual significance heuristic for removing irrelevant CSS source code during web design reverse-engineering tasks. CHI 2017 Student Research Competition Winner, Denver, Colorado.
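One way to picture the pruning idea is the rough sketch below. It assumes that "irrelevant" means "removing the declaration leaves the rendered page unchanged"; the blurb does not spell out the actual heuristic, so this is an illustration and not the project's implementation. The `render` callable and the toy renderer are hypothetical stand-ins for taking a real screenshot.

```python
def prune_declarations(declarations, render):
    """Greedily drop declarations whose removal leaves the rendering unchanged.

    `render` is a hypothetical callable that maps a list of CSS declarations
    to something comparable, such as screenshot bytes.
    """
    baseline = render(declarations)
    kept = list(declarations)
    for decl in declarations:
        candidate = [d for d in kept if d != decl]
        if render(candidate) == baseline:
            kept = candidate  # no visible change, so the declaration is pruned
    return kept


def toy_render(decls):
    """Toy renderer (assumption): only color and font-size affect the output."""
    visible = {"color", "font-size"}
    return tuple(sorted(d for d in decls if d.split(":")[0] in visible))


css = ["color: red", "zoom: 1", "font-size: 12px", "cursor: auto"]
pruned = prune_declarations(css, toy_render)
print(pruned)  # ['color: red', 'font-size: 12px']
```

A real tool would compare browser screenshots instead of tuples, and would likely need a tolerance for small pixel differences.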

Guiding Web Inspection with Tutorial Keyword Frequency

Delta Lab


In order to bridge the gap between web design tutorials and real-world examples, we extend a web inspector to highlight CSS properties frequently mentioned across a given set of tutorials. Google Scholars' Retreat 2016, Mountain View, California.
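A minimal sketch of the frequency step might look like the following. It assumes "frequently mentioned" means the number of tutorials that mention a property at least once; the property list and the sample tutorial text are made up for illustration and are not taken from the project.

```python
import re
from collections import Counter

# Hypothetical, tiny subset of CSS property names to look for.
CSS_PROPERTIES = ["display", "position", "float", "margin", "padding"]


def property_frequencies(tutorials, properties=CSS_PROPERTIES):
    """Count how many tutorials mention each property at least once."""
    counts = Counter()
    for text in tutorials:
        # Lowercase words, keeping hyphens so names like "font-size" survive.
        words = set(re.findall(r"[a-z-]+", text.lower()))
        for prop in properties:
            if prop in words:
                counts[prop] += 1
    return counts


tutorials = [
    "Use float: left and add margin to separate the columns.",
    "Set position: relative, then tweak margin and padding.",
]
freq = property_frequencies(tutorials)
print(freq.most_common(1))  # [('margin', 2)]
```

An inspector could then highlight the properties with the highest counts when they appear on the inspected element.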

SVG Charting Library



An opinionated Ember.js addon to replace Highcharts with native SVG and DOM APIs. Released the addon as a company-wide multiproduct. I worked on this project during my internship at LinkedIn, under the mentorship of Cody Coats and Michail Yasonik.

Predicting the Popularity of User-Generated Discussion Questions

EECS 349: Machine Learning


Using Python with the Reddit API and NLTK library, we collect information about AskReddit posts over a two-week period to analyze what makes a question popular. Alternating decision trees achieve 72.9819% accuracy with 10-fold cross-validation, an improvement over the ZeroR baseline of 51.0708%. Features related to the language of the question, time and day of posting, and initial commenting behavior prove most informative. With Sameer Srivastava, Jennie Werner, and Aiqi Liu.
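For readers unfamiliar with ZeroR: it always predicts the most common class, so its accuracy is simply the share of that class. The sketch below shows the calculation on made-up labels (not the project's data) and, as a simplification, computes it over the whole set instead of per cross-validation fold.

```python
from collections import Counter


def zero_r_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    _, majority_count = Counter(labels).most_common(1)[0]
    return majority_count / len(labels)


# Hypothetical labels for illustration only.
labels = ["popular"] * 51 + ["unpopular"] * 49
baseline = zero_r_accuracy(labels)
print(f"ZeroR baseline: {baseline:.2%}")  # ZeroR baseline: 51.00%
```

A baseline near 51% indicates roughly balanced classes, which is why a classifier has to beat it by a clear margin to be informative.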


Northwestern Debate Institute


End-to-end Google Apps Script-based pipeline for publishing practice debate comments to individual students' feedback pages. Previously, instructors needed to manually edit the feedback pages for all four students in order to provide feedback from practice rounds. Deployed at the 2015 Northwestern Debate Institute and subsequently adopted for the entire program in 2016.


I was a teaching assistant every quarter beginning my sophomore year, sometimes for two courses at once. An asterisk (*) marks terms in which I was a head teaching assistant.