Hi! I'm a software engineer at Zipline, where I work on behavior planning software for autonomous medical delivery drones. I previously worked at SpaceX, where I led software development for fairing recovery as well as their laser-based inter-satellite communication system. I am interested in the development of autonomous systems in the real world, particularly in real-time environments with limited computational resources.
In 2017 I graduated with an M.S. in Computer Science from Stanford University, where I did research on reinforcement learning at the Stanford Intelligent Systems Laboratory. In May 2015 I earned a double B.S. in Computer Science and Electrical and Computer Engineering from Cornell University. In the past, I have interned at SpaceX as an embedded systems firmware developer and at Cisco as a software developer, and have conducted undergraduate computer architecture research at Cornell's Computer Systems Laboratory.
Projects I have worked on outside of work span domains from FPGA development to natural language understanding. Click the project titles to read the reports -- author names are in alphabetical order. Be sure to check out the demo videos!
We present a novel, multi-agent reinforcement learning formulation of multi-object tracking that treats creating, propagating, and terminating object tracks as actions in a sequential decision-making problem. Our experiments show an improvement in tracking accuracy over similar state-of-the-art, rule-based approaches on a popular multi-object tracking dataset.
Subvocalization is a phenomenon observed while subjects read or think, characterized by involuntary facial and laryngeal muscle movements. By measuring this muscle activity using surface electromyography (EMG), it may be possible to perform automatic speech recognition (ASR) and enable silent, hands-free human-computer interfaces. In our work, we describe the first approach toward end-to-end, session-independent subvocal speech recognition by leveraging character-level recurrent neural networks (RNNs) and the connectionist temporal classification loss (CTC).
We show the limitations of using language models in a generative manner by training discriminator models to distinguish between real and synthetic text produced by language models. Our best discriminative model reaches 90% accuracy on the test set, demonstrating the weaknesses of using language models as generators. We propose the use of generative adversarial networks (GANs) to improve synthetic text generation. We describe architectural changes to recurrent language models that are necessary for their use in GANs, and pre-train these generative architectures using variational auto-encoders.
We use stacked bidirectional LSTMs to impute missing statistics in genome-wide association (GWA) studies. GWA studies correlate single-nucleotide polymorphisms (SNPs) with complex traits by aggregating summary association statistics across many individuals. In our experiment involving the imputation of missing p-values across approximately one million SNPs and 11 traits, our method reduces the mean-squared logarithmic error on imputed p-values by 22.5% when compared to traditional statistical imputation techniques.
Formalized fully-nested interactive POMDPs, a method for optimizing a player's policy in a turn-based, partially-observable game against an unknown rational opponent. Motivated fully-nested I-POMDPs by introducing the game of partially-observable nim, reducing it to a POMDP, and solving it using SARSOP. Showed empirically that increasing the level of a fully-nested I-POMDP does not become intractable for this game.
We investigate the problem of style transfer: Given a document D in a style S, and a separate style S', can we produce a new document D' in style S' that preserves the meaning of D? We describe a novel style transfer approach that does not rely on parallel or pseudo-parallel corpora, making use of anchoring-based paraphrase extraction and recurrent neural language models. Language models implemented in Torch7.
Implemented a virtual reality system that achieves the illusion of depth on an ordinary display, requiring no special equipment other than a webcam and a computer. The display simulates motion parallax and a changing field of view, functioning as a virtual "window" into a 3D scene. We use Haar cascade classifiers, camera models, and Kalman filtering to track the user's head in 3D in real time and update the display according to an off-axis projection model. We extend our system with a gesture recognition pipeline that allows for object or scene orbiting. Uses OpenCV and OpenGL. Demo video.
Designed and trained a fully-convolutional neural network to predict future optical flow from a single video frame. Extended the pipeline with iterative frame warping to generate video predictions in raw pixel space. Implemented in Torch7 and trained on AWS. Demo video.
Developed an FPGA-based automatic table tennis score keeper, implemented fully in hardware logic without the use of a CPU. By analyzing a video feed of a live game, the system awards points in real time with no direct user input. The project has been featured on Hackaday and the October 2015 issue of IEEE Computer Magazine. Check out our demo video!
Wrote and optimized algorithms in C/C++ to benchmark a novel high-performance, energy-efficient parallel computing microarchitecture by mapping them to a research ISA. Contributions acknowledged in two accepted 2014 IEEE MICRO papers (1, 2) authored by the group.
Programmed and debugged the command and data handling board of a high-agility nanosatellite set to launch in 2017. The board serves as the satellite's central router and interfaces with the flight computer, radio, and numerous sensors such as a gyroscope, a spectrometer and a star tracker.
Cornell University, Fall 2014
Cornell University, Spring 2014
Cornell University, Fall 2013