Erin Dahlgren

Simple Problem In Machine Learning

When I was a bachelor's and master's student in linguistics, I was very interested in computational methods. But back around 2010, when neural networks were hot and new, linguists and computer scientists were just starting to work together on problems like automatic language translation and speech-to-text.

When I returned in 2019, things had changed dramatically. I met a large community of computer science students applying neural networks and other machine learning models to natural language problems. It was exciting to see how the line between linguistics and computer science had blurred.

As exciting as this was to see, many of these students talked about a basic problem that affected them every day: keeping track of their experiments. Everyone seemed to do it in a different way: logging values to files, taking notes by hand, keeping spreadsheets updated, and so on. I learned that there could be a relatively simple solution to this problem if someone cared enough to solve it with academics in mind.

Through Bold Public Code, I’m developing a project to standardize how machine learning experiments are tracked over time. Our first goal is to create something dead simple that everyone doing machine learning in an academic setting in Chicago will use. Once we learn from that, we’ll decide if and how to expand the project to other users.

This project is maintained by edahlgren