Projects

video code install instructions slides

Winning submission at Cal Hacks 5.0: display goggles that help Alzheimer’s patients locate personal effects using object detection and speech recognition.

Implementation description |>| Trained a Mask R-CNN object detection classifier on a grayed-out ImageNet dataset to sustain a real-time inference/classification rate of at least 30 fps; optimized the setup of models (e.g. YOLO v1-3) and datasets (e.g. MS COCO) on a high-latency Android embedded system |>| Built an Android application and custom scripts (for parsing and relaying camera input) and root-installed them on Moverio augmented reality display goggles to stream camera input, draw bounding boxes around the objects to be detected, and output the result to the device's display feed |>| Contributed an optimized low-latency embedded-system implementation that performs speech recognition, video streaming/display, and image processing with minimal visible stutter
Domain/application description |>| |>|

Execution of philosophical counselling through knowledge graphing, natural language processing, and sentiment approximation

video paper poster

BreakupBot, a therapeutic chatbot for recovering from romantic breakups, organically acquired 200+ users of varying demographics within its first week.

Implementation description |>| Built an Android application and JSON-based API that receives user text input and returns counselling-based responses |>| Filtered the categorized text corpus on real-time user variables (lover type, heart-rate-based sentiment approximation) before running a hidden Markov chain text generator |>| Used a random forest classification model and hierarchical clustering to bucket users into John A. Lee's six types of lovers based on preliminary text input |>| Adopted a script that estimates heart rate from camera image input by measuring the difference between signal peaks over time intervals |>| Scripted web scrapers to pull high-rated responses from love-related forums; constructed a knowledge graph from the corpora to facilitate filtering for the text generator
Domain/application description |>| |>|
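
The heart-rate step above (measuring the spacing between signal peaks) can be sketched roughly as follows. This is a minimal illustration, not the adopted script: the function names, the fixed `threshold`, and the assumption that the camera signal has already been reduced to one brightness value per frame are all hypothetical.

```python
from typing import List, Sequence


def find_peaks(signal: Sequence[float], threshold: float) -> List[int]:
    """Return indices of local maxima that exceed `threshold` (hypothetical helper)."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]:
            peaks.append(i)
    return peaks


def estimate_bpm(signal: Sequence[float], fps: float, threshold: float) -> float:
    """Estimate beats per minute from the mean time between consecutive peaks."""
    peaks = find_peaks(signal, threshold)
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to measure an interval")
    # Convert frame gaps between peaks into seconds.
    intervals = [(b - a) / fps for a, b in zip(peaks, peaks[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval
```

A real signal would likely need smoothing before peak detection; the sketch only shows the interval-to-rate conversion.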

Automated source code obfuscation and privacy-preserving execution through sequence-to-sequence networks

pdf

Contributed (1) a quantitative framework for evaluating obfuscated code and (2) a privacy-preserving system that uses seq2seq models to obfuscate plaintext and execute the obfuscated ciphertext. Submitted to ICASSP 2020.

Implementation description |>| Implemented a reversible character-embedded encoder-decoder model that takes plaintext input, recursively generates obfuscated code to ensure the execution program can run the obfuscated code without error, then returns the obfuscated code, h5 model files, and char/word-to-index dictionaries |>| Set up an experimental pipeline to obfuscate benchmark source code and to compare/plot the defined metrics between benchmark-obfuscated and seq2seq-obfuscated code
Domain/application description |>| |>|
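
To illustrate the char-to-index dictionaries mentioned above, here is a minimal sketch of how such a mapping might be built and used to encode text for a character-level model. The function names and the choice to reserve index 0 for padding are assumptions made for illustration; they are not taken from the project's code.

```python
from typing import Dict, Iterable, List, Tuple

PAD_INDEX = 0  # assumption: index 0 is reserved for padding


def build_char_index(corpus: Iterable[str]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Map every distinct character in the corpus to a positive integer, and back."""
    chars = sorted(set("".join(corpus)))
    char_to_index = {ch: i + 1 for i, ch in enumerate(chars)}
    index_to_char = {i: ch for ch, i in char_to_index.items()}
    return char_to_index, index_to_char


def encode(text: str, char_to_index: Dict[str, int], length: int) -> List[int]:
    """Turn text into a fixed-length index list, truncating or padding as needed."""
    indices = [char_to_index[ch] for ch in text[:length]]
    return indices + [PAD_INDEX] * (length - len(indices))


def decode(indices: Iterable[int], index_to_char: Dict[int, str]) -> str:
    """Invert `encode`, dropping padding positions."""
    return "".join(index_to_char[i] for i in indices if i != PAD_INDEX)
```

Keeping both directions of the mapping is what lets an encoded sequence be turned back into source text after the model has processed it.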

Motif detection and clustering of franchise location network graphs

code

slides

Clustering network motifs of successful franchise locations to identify franchise expansion patterns.

Implementation description |>| Hybrid implementation of motif-detection, bridge-detection, and clustering algorithms that yields sequential coordinates of geographical locations depending on the category of product/business, based on a network de-anonymization framework |>| Developed a REST API to run the algorithm and pass JSON-formatted output to a Ruby on Rails frontend |>| Contributions include the use of network graphs for time-independent pattern interpolation and a recursive backtesting method that runs and validates the motifs on training/testing franchises
Domain/application description |>| The project holds value from a geographical information systems (GIS) or operations perspective, as it autonomously generates expansion patterns, including non-obvious ones. This would help new companies expand their businesses in a structured way that has "proven" successful. |>|
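
As an illustration of the bridge-detection component named above, the sketch below finds bridges (edges whose removal disconnects the graph) with the standard depth-first low-link method. It is a generic textbook version under the assumption of a small, simple undirected graph; the function name and edge representation are hypothetical, and the project's hybrid algorithm is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]
_NO_PARENT = object()  # sentinel so that any hashable value can be a node label


def find_bridges(edges: Iterable[Edge]) -> List[Edge]:
    """Return the edges whose removal would disconnect the undirected graph."""
    graph: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)

    discovery: Dict[Hashable, int] = {}  # order in which each node was first visited
    low: Dict[Hashable, int] = {}        # earliest discovery time reachable from the subtree
    bridges: List[Edge] = []
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        discovery[node] = low[node] = counter
        counter += 1
        for neighbour in graph[node]:
            if neighbour is parent or neighbour == parent:
                continue
            if neighbour in discovery:
                # Back edge: the subtree can reach an earlier node.
                low[node] = min(low[node], discovery[neighbour])
            else:
                dfs(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                if low[neighbour] > discovery[node]:
                    bridges.append((node, neighbour))

    for node in list(graph):
        if node not in discovery:
            dfs(node, _NO_PARENT)
    return bridges
```

The recursive form is fine for small location graphs; a large network would need an iterative traversal to avoid Python's recursion limit.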

GAN-MC simulation for HIV sequence prediction

paper

poster

Predicting future HIV sequences from initial strains, using Monte Carlo sampling of mutations and generative adversarial networks to prune the predictions.

Implementation description |>| Applied mutations (addition, substitution, etc.) through Monte Carlo sampling to the listed initial strain sequences (source: Stanford HIV database); built an adversarial network to generate adversarial sequences and a discriminator/classification network to identify valid subsequent sequences, which is used to prune the MC mutations
Domain/application description |>| A further contribution of the mutation prediction algorithm is classifying HIV antiretroviral medication for specific strains of HIV, thus optimizing patients' medication intake with respect to viral drug resistance and elimination of the virus |>|
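
The Monte Carlo mutation step described above could look roughly like the following sketch, which only generates candidate sequences and leaves the pruning network out. The nucleotide alphabet, the inclusion of deletion as a mutation type, the uniform choice among mutation types, and all function names are assumptions for illustration.

```python
import random
from typing import List

NUCLEOTIDES = "ACGT"  # assumption: sequences are nucleotide strings


def mutate(sequence: str, rng: random.Random) -> str:
    """Apply one randomly chosen point mutation to a non-empty sequence."""
    kind = rng.choice(["substitution", "addition", "deletion"])
    pos = rng.randrange(len(sequence))
    if kind == "substitution":
        choices = [n for n in NUCLEOTIDES if n != sequence[pos]]
        return sequence[:pos] + rng.choice(choices) + sequence[pos + 1:]
    if kind == "addition":
        return sequence[:pos] + rng.choice(NUCLEOTIDES) + sequence[pos:]
    # Deletion: never shrink the sequence to an empty string.
    if len(sequence) == 1:
        return sequence
    return sequence[:pos] + sequence[pos + 1:]


def sample_candidates(seed_sequence: str, n_samples: int, n_steps: int,
                      seed: int = 0) -> List[str]:
    """Draw `n_samples` candidates, each reached by `n_steps` random mutations."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_samples):
        current = seed_sequence
        for _ in range(n_steps):
            current = mutate(current, rng)
        candidates.append(current)
    return candidates
```

In the pipeline described above, each candidate would then be scored by the discriminator and discarded if it is judged an invalid successor.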

Instrumental note generation through object-impact detection

code

video

video

Synthesis of musical notes based on tapping food with utensils, with each dish assigned to a different instrument.

Implementation description |>| Trained ResNet and YOLO object detection models on labelled food images; paired food categories with instruments and sub-categories with different notes, and mapped the x-axis location across the sub-category image to distinct notes |>| Trained a separate object detection model to identify utensils and calculate the proximity between utensils and the food item (distance ≈ 0 implies impact) |>| Human-computer interaction contribution in terms of augmenting a dining experience with sound, visuals, and physical action.
Domain/application description |>| |>|
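
The impact test and the x-axis note mapping described above can be sketched as below. This is a hedged illustration only: the box format, the pixel `tolerance`, the use of the utensil box's horizontal center as the contact point, and the function names are all assumptions rather than details from the project.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def box_gap(a: Box, b: Box) -> float:
    """Euclidean gap between two axis-aligned boxes; 0.0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return (dx * dx + dy * dy) ** 0.5


def note_for_impact(utensil: Box, food: Box, notes: Sequence[str],
                    tolerance: float = 2.0) -> Optional[str]:
    """Return the note to play if the utensil is touching the food, else None."""
    if box_gap(utensil, food) > tolerance:
        return None  # too far apart to count as an impact
    width = food[2] - food[0]
    if width <= 0:
        return notes[0]
    # Map the utensil's horizontal position across the food box to a note index.
    contact_x = (utensil[0] + utensil[2]) / 2.0
    fraction = min(max((contact_x - food[0]) / width, 0.0), 1.0)
    index = min(int(fraction * len(notes)), len(notes) - 1)
    return notes[index]
```

Box-to-box distance is a coarse proxy for contact, since a utensil hovering above the plate can overlap the food box in the image plane.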

Value-based decision-making predictions through time-series ECoG signal models

code

poster

data

Predicting the likelihood of acting or not acting through computational models based on (i) the expected value to be gained and (ii) neural ECoG signals.

Implementation description |>| Built deep learning models (multilayer perceptron, LSTM, R-CNN) with PyTorch to generate ECoG decision-making distributions and predict decision classifications based on initial ECoG signals and potential gainable values |>| Built visualization functions to plot MATLAB-stored ECoG signals recorded from epilepsy patients performing gambling tasks
Domain/application description |>| |>|

Polysemy word tagging tool

code

Tagging words with multiple definitions to study concept/word learning among children.

Implementation description |>| Loaded corpora from childes-db, loaded tagging functions and text data from SemCor, and built an interactive tool using JavaScript and jQuery for Mechanical Turk users to tag polysemous words, in order to develop computational models of children's concept learning
Domain/application description |>| |>|

### Past Hackathon Wins/Submissions

### Past School Projects

### Side Projects

### Open-source Contributions