Past Work
Fret Ferret
Instead of hiring teams of data scientists and machine learning engineers to design, train, deploy, and maintain our machine learning models, we used the Bandito™ production-first AutoML platform to provide self-contained reinforcement learning services. Bandito™ allowed us to breeze through developing 32 AI-optimized mini-games for learning to improvise on the guitar, each of which automatically learns the student's skill level and personalizes its questions in real time. Called Fret Ferret™, our tool is currently being used at the Eastman School of Music in Rochester, New York, to advance its guitar students, and we have been invited to demonstrate our technology at ACM CHI 2022, the top conference on computer-human interaction, in New Orleans in May.
The First Transformer Model in Amazon Alexa
Our founder led the team that put the first transformer-based model into Amazon Alexa. This engineering feat, along with an improved language modeling pipeline design, overcame hurdles with natural language understanding (NLU) that had previously prevented exciting new products like Alexa’s Music Conversations from coming to market. With his guidance, Music Conversations saw astonishing adoption and quickly became adored by its users. Its success led our CEO to advise Amazon’s headquarters on how to bring his approaches to more businesses that depend on Amazon Alexa but lack large in-house teams to build their own chatbots.
Bandits and Recommender Systems
Personalization and recommender systems present unique challenges that can be addressed with intelligent bandit design. These algorithms require less feature engineering, and use embeddings to represent users, queries, and items, making them the go-to for products with a history of interactions to build upon. When working with new users or when substantial historical data has not been recorded, bandits can help collect this data efficiently. Moreover, while recommender systems have evolved to enormously complex extremes to accommodate signals as quickly as possible, bandits paired with recommender systems can capture user intent in the moment with little additional effort, providing a tremendous improvement over traditional approaches. This document outlines the recommender system problem, how bandits have been used to address issues in the past, and how we would do it differently.
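To make the data-collection point concrete, here is a minimal sketch of one common bandit strategy, Thompson sampling with Beta-Bernoulli arms. It is an illustration only, not the approach described in the document itself; the class name `BetaBernoulliBandit`, the `simulate` helper, and the click-through rates are all hypothetical.

```python
import random


class BetaBernoulliBandit:
    """Thompson sampling over arms with binary (click / no-click) rewards."""

    def __init__(self, n_arms, seed=None):
        # Beta(1, 1) is a uniform prior over each arm's unknown success rate.
        self.successes = [1.0] * n_arms
        self.failures = [1.0] * n_arms
        self.rng = random.Random(seed)

    def select(self):
        # Draw one plausible success rate per arm and play the best draw.
        draws = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # Conjugate update: each observation increments one Beta parameter.
        if reward:
            self.successes[arm] += 1.0
        else:
            self.failures[arm] += 1.0


def simulate(true_rates, rounds, seed=0):
    """Run the bandit against made-up click rates; return pulls per arm."""
    env = random.Random(seed)
    bandit = BetaBernoulliBandit(len(true_rates), seed=seed + 1)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = bandit.select()
        reward = env.random() < true_rates[arm]
        bandit.update(arm, reward)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Hypothetical rates for three items shown to new users.
    print(simulate([0.1, 0.5, 0.9], rounds=2000))
```

Because uncertain arms occasionally produce high draws, the bandit keeps exploring items it knows little about while sending most traffic to the item that currently looks best.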
Blueprints for Reinforcement Learning in Customer-Facing Applications
This work outlines primary considerations for designing production-worthy reinforcement learning algorithms in customer-facing settings. In it we (1) establish the notation, (2) discuss how policies are evaluated, (3) show how policies can be defined by evaluation functions alone, (4) demonstrate how we train an evaluation function for our policy, even from customer interactions driven by another policy, known as off-policy training, (5) show how we can train the policy directly from customer interactions, and (6) demonstrate how we can combine both approaches for the current state-of-the-art in the field. We also (7) provide practical considerations under the hood for building our policies and evaluation functions, as well as (8) additional resources to help the reader dive more deeply into the field and discover implementations that can be used right now.
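Points (3) and (4) can be illustrated with a small sketch. We assume tabular Q-learning as the off-policy method, which is a standard technique and not necessarily the one used in the write-up. The states, actions, rewards, and function names below are hypothetical.

```python
from collections import defaultdict


def q_learning_from_logs(transitions, actions, alpha=0.1, gamma=0.9, epochs=200):
    """Fit an action-value table from logged (s, a, r, s') tuples.

    The logs may come from any behavior policy; the max over next actions
    is what makes the update off-policy.
    """
    q = defaultdict(float)
    for _ in range(epochs):
        for state, action, reward, next_state in transitions:
            if next_state is None:  # None marks the end of an episode
                best_next = 0.0
            else:
                best_next = max(q[(next_state, a)] for a in actions)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
    return q


def greedy_policy(q, state, actions):
    """A policy defined by the evaluation function alone."""
    return max(actions, key=lambda a: q[(state, a)])


ACTIONS = ["easy", "hard"]

# Made-up interactions logged while some other policy chose the questions.
LOGS = [
    ("new", "easy", 0.0, "engaged"),
    ("new", "hard", 0.0, None),
    ("engaged", "easy", 0.2, None),
    ("engaged", "hard", 1.0, None),
]

if __name__ == "__main__":
    q = q_learning_from_logs(LOGS, ACTIONS)
    for s in ("new", "engaged"):
        print(s, "->", greedy_policy(q, s, ACTIONS))
```

In this toy log, the learned values favor an easy question for a new student and a hard one for an engaged student, even though no logged episode was generated by that policy.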
Exactly-Solved Bayesian Models
This work addresses a gap we found in the machine learning literature on how to produce an equivalent Bayesian linear regression (BLR) given a kernel for a Gaussian process (GP). Studying these two models side-by-side is critical for pushing any Bayesian application to production, since they are the most expressive models in circulation with exactly-solved deterministic update rules, and understanding them gives us valuable insight into modern Bayesian approximations. In addition, finding the equivalent Bayesian linear regression to a Gaussian process can provide substantial modeling benefits including (1) enormous efficiency savings with large amounts of data, (2) access to incremental (sequential or online) computation methods, (3) access to vetted approximation methods for efficiency such as the Laplace approximation, (4) easier interpretability, and (5) the ability to effortlessly combine multiple models into one without using bespoke software libraries. This work provides a 25-page write-up and example code to help the reader strengthen their understanding.
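The equivalence is easiest to see in the well-known direction, from features to kernel: a BLR with a standard normal prior on its weights and a GP whose kernel is the inner product of the same features give identical predictive means. The sketch below checks this numerically in plain Python. It does not reproduce the kernel-to-BLR construction from the write-up; the feature map, data, and function names are assumptions chosen for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def features(x):
    # Hypothetical feature map: bias plus linear term.
    return [1.0, x]


def kernel(a, b):
    # Kernel induced by the feature map: k(a, b) = phi(a) . phi(b).
    return sum(u * v for u, v in zip(features(a), features(b)))


def blr_mean(xs, ys, x_star, noise_var):
    """Predictive mean of BLR with prior w ~ N(0, I)."""
    Phi = [features(x) for x in xs]
    d = len(Phi[0])
    # Weight posterior mean solves (Phi^T Phi + noise_var * I) w = Phi^T y.
    A = [[sum(p[i] * p[j] for p in Phi) + (noise_var if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(p[i] * y for p, y in zip(Phi, ys)) for i in range(d)]
    w = solve(A, b)
    return sum(wi * fi for wi, fi in zip(w, features(x_star)))


def gp_mean(xs, ys, x_star, noise_var):
    """Predictive mean of a zero-mean GP: k_*^T (K + noise_var * I)^{-1} y."""
    n = len(xs)
    K = [[kernel(xs[i], xs[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, ys)
    return sum(kernel(x_star, x) * a for x, a in zip(xs, alpha))


if __name__ == "__main__":
    xs, ys = [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.2, 2.8]
    print(blr_mean(xs, ys, 1.5, 0.25), gp_mean(xs, ys, 1.5, 0.25))
```

Note the efficiency difference the text alludes to: the BLR solves a system whose size is the number of features (2 here), while the GP solves one whose size is the number of data points.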
COVID-19
We implemented three different models of regional growth in COVID-19 cases using Bayesian inference, and developed an automation pipeline to update results daily. Our models make it possible to identify likely outbreaks, as well as to define the ranges of likely values describing the disease's spread, such as where test capping may be occurring. All the code used is provided freely under the MIT license on GitHub to allow other researchers to build on our findings and tools. Click “learn more” to see the GitHub.
Harmony Classifier
We built a synthetic data generation pipeline using Ableton to train a deep learning model to detect musical notes in raw audio. Paired with a simple interface displaying those notes on a piano and guitar, our tool helps musicians play along with the music that they hear. Click “learn more” to see the GitHub.
Document Surfer
Using Latent Semantic Indexing and D3.js, we built an innovative document corpus viewer that shows relevant items in a timeline, performs automatic keyword detection, and allows the user to easily and enjoyably explore topics. This work was presented at SIGGRAPH and is ideal for working with email. Click “learn more” to see the GitHub.
Popular Music Analysis
In collaboration with Michael Cuthbert at MIT, we extended the Music21 library to accommodate needs in popular music, built a novel music visualization tool, and performed hierarchical analysis on the library of The Beatles. We found, for example, that their music almost always includes one or two unusual chords, one unusual scale degree, and consciously chosen rhythmic structures to contrast the verse, chorus, and bridge sections of their songs. You can see a video of our presentation at SIGGRAPH and our write-up for IEEE VIS 2012 and ECML PKDD. We also spoke at The Quantified Self. Click “learn more” to see the GitHub.
Social Media
Under our founder’s leadership at Twitter, his team published several articles and presentations regarding community behavior and product measurement:
Quantum Physics, Solar Physics, and Algorithms
Under the tutelage of Maxim Olshanii, Alan Aspuru-Guzik, and Eric Heller, our founder published 5 peer-reviewed papers examining algorithms for quantum simulation and interpretation. They are:
Quantum flux and reverse engineering of quantum wave functions
Algorithm for efficient elastic transport calculations for arbitrary device geometries
Threshold for Chaos and Thermalization in the One-Dimensional Mean-Field Bose-Hubbard Model
You can read his thesis on the quantum-classical interface conducted with Eric Heller, Abbott and James Lawrence Professor of Chemistry and Professor of Physics at Harvard University:
Before that, he worked in solar physics: