Siddhartha Datta

Commented-out Note-to-self #1: Add an inspiring paragraph describing my vision and accolades, maybe after dinner –

Commented-out Note-to-self #2: comment out Note-to-self #1

Highlighted projects


Mask R-CNN object detection to augment human peripheral search

Winning submission at Cal Hacks 5.0: display goggles that help Alzheimer’s patients locate personal effects using object detection and speech recognition.

[video] [code] [slides]

Trained a Mask R-CNN object detection classifier on a grayed-out ImageNet dataset to sustain a real-time inference/classification rate of at least 30 fps; optimized the setup of models (e.g. YOLO v1-v3) and datasets (e.g. MS COCO) on a high-latency Android embedded system

Built an Android application and custom scripts (for parsing and relaying camera input) and root-installed them onto Moverio augmented reality display goggles to stream camera input, apply bounding boxes around objects to be detected, and output to the display feed of the device

Contributed an optimized low-latency embedded-system implementation that performs speech recognition, video streaming/display, and image processing with minimal visible stutter

Learn more

Execution of philosophical counselling through knowledge graphing, natural language processing, and sentiment approximation

BreakupBot, a therapeutic chatbot for recuperating from romantic breakups, organically acquired 200+ users of varying demographics within its first week.

[app] [code] [video] [paper] [poster]

Built an Android application and JSON-based API that receives user text input and returns counselling-based responses

Filtered the categorized text corpus on real-time user variables (lover type, heart-rate-based sentiment approximation) before running a hidden Markov chain text generator

Used a random forest classification model and hierarchical clustering to bucket users into John A. Lee’s six types of lovers based on preliminary text input

Adopted a script that estimates heart rate from camera image input based on measurement of signal peak differences at time intervals

Scripted web scrapers to pull high-rated responses from love-related forums; constructed knowledge graph from corpora to facilitate filtering for text generator
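The heart-rate step above can be illustrated with a minimal sketch. It assumes the camera frames have already been reduced to a one-dimensional brightness signal sampled at a known rate; the function name, the peak rule, and the mean-level threshold are illustrative assumptions, not the script that was actually adopted.

```python
import math


def estimate_heart_rate(signal, sample_rate_hz):
    """Estimate beats per minute from the mean spacing between local peaks.

    Hypothetical helper: `signal` is assumed to be a list of brightness
    values, one per camera frame, and `sample_rate_hz` the frame rate.
    """
    if len(signal) < 3:
        return None
    mean_level = sum(signal) / len(signal)
    # A sample counts as a peak if it rises above its left neighbour,
    # does not fall below its right neighbour, and exceeds the mean level.
    peaks = [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
        and signal[i] > mean_level
    ]
    if len(peaks) < 2:
        return None  # not enough beats to measure an interval
    # Time between consecutive peaks, in seconds.
    intervals = [(b - a) / sample_rate_hz for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


if __name__ == "__main__":
    # Synthetic 1.2 Hz pulse sampled at 30 frames per second for 10 seconds.
    demo = [math.sin(2 * math.pi * 1.2 * i / 30.0) for i in range(300)]
    print(estimate_heart_rate(demo, 30.0))
```

A real camera signal would need smoothing before peak picking; the sketch skips that to keep the interval arithmetic visible.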

Learn more


Dialogue-based automated image editing

GIF-editing chatbot with computer vision functionality.


Messenger bot, dialogue facilitated by quick reply buttons to execute modularized vision functionality (fast style transfer, segmented style transfer, CycleGAN, first order motion transfer, foreground object removal).

Learn more

Music composition through object-impact detection

Synthesis of musical notes based on tapping food with utensils, with each dish assigned to a different instrument.

[code] [video] [video]

Trained ResNet and YOLO object detection models on labelled food images; paired food categories with instruments and sub-categories with different notes, and encoded x-axis location across the sub-category image with distinct notes

Trained a separate object detection model to identify utensils and calculate proximity between utensils and food items (distance ≈ 0 implies impact)

Human-computer interaction contribution in terms of augmenting a dining experience with sound, visuals and physical action.
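As a rough illustration of the impact rule and the x-axis note encoding described above, the sketch below works on axis-aligned bounding boxes given as (x_min, y_min, x_max, y_max). The function names, the pixel tolerance, and the note list are assumptions made for this example rather than details of the project's code.

```python
import math


def box_distance(a, b):
    """Gap between two boxes (x_min, y_min, x_max, y_max); 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)


def is_impact(utensil_box, food_box, tolerance=2.0):
    """Treat a near-zero gap between utensil and food as a tap (tolerance is assumed)."""
    return box_distance(utensil_box, food_box) <= tolerance


def note_for_tap(tap_x, food_box, notes):
    """Map the horizontal tap position across the food box to one of `notes`."""
    x_min, _, x_max, _ = food_box
    if x_max <= x_min:
        return notes[0]  # degenerate box: fall back to the first note
    frac = min(max((tap_x - x_min) / (x_max - x_min), 0.0), 1.0)
    index = min(int(frac * len(notes)), len(notes) - 1)
    return notes[index]


if __name__ == "__main__":
    fork = (40.0, 10.0, 55.0, 60.0)
    dish = (50.0, 50.0, 150.0, 120.0)
    if is_impact(fork, dish):
        print(note_for_tap(fork[2], dish, ["C4", "D4", "E4", "F4"]))
```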

Learn more

Inductive inference of decision-making motifs from trade execution networks

Prediction of client trade execution through application of reinforcement learning in rational agent decision-making


Client inferencing engine built for an investment bank; maps traders to stocks with 70+ markets/cognitive features to interpolate trade execution motifs

Network graph between clients and stocks: client-client edges were weighted functions of client-specific properties (e.g. fund type); stock-stock edges were weighted on stock-specific properties (e.g. sector, average daily volume); client-stock edges were weighted on behavioural/decision-making/trading features (e.g. likelihood of trading with volatility, risk aversion relative to news, momentum/contrarian, preferences for certain sectors, regret minimization, confidence bias)

Edge weight matrices were calculated using reinforcement learning based on historical trades taken by each client on each stock

Output identified the cognitive decision-making motifs adopted by each individual trader relative to other traders, and could even identify successful situation-specific motifs based on the cumulative profit made with those motifs (i.e. clustered different decision-making motifs to specific scenarios/environments)

Back-testing indicated ~7x% accuracy in predicting client execution on any given day (accuracy rising exponentially as time horizon increases to 2 weeks); successfully captured value-based decision-making cognition of rational agents

Automated source code obfuscation and privacy-preserving execution through sequence-to-sequence networks

Contributed (1) quantitative framework for evaluating obfuscated code; (2) privacy-preserving system that uses seq2seq models to obfuscate plaintext and execute obfuscated ciphertext.

[arXiv] [code]

Implemented a reversible character-embedded encoder-decoder model that takes plaintext input, recursively generates obfuscated code to ensure the execution program can run the obfuscated code without error, then returns the obfuscated code, h5 model files, and char/word-to-index dictionaries

Set up experimental pipeline to obfuscate benchmark source code, and compare/plot defined metrics between benchmark obfuscated and seq2seq obfuscated code

Learn more

GAN-MC simulation for HIV sequence prediction

Predicting future HIV sequences given initial strains through use of Monte Carlo in mutation, and generative adversarial networks to prune predictions.

[paper] [poster]

Applied mutations (addition, substitution, etc.) through Monte Carlo upon listed initial strain sequences (source: Stanford HIV database); built an adversarial network to generate adversarial sequences, and a discriminator/classification network to identify valid subsequent sequences to prune MC-mutations

A further contribution of the mutation prediction algorithm is classifying HIV antiretroviral medication for specific strains of HIV, thus optimizing medication intake for patients in terms of viral drug resistance and elimination of the virus
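The Monte Carlo mutation and pruning loop described above might look roughly like the sketch below. The per-base mutation rates are made-up placeholders, and the trained discriminator is replaced by a caller-supplied `is_plausible` predicate, so this shows only the sampling structure, not the project's model.

```python
import random

NUCLEOTIDES = "ACGT"


def mutate(sequence, rng, p_sub=0.01, p_ins=0.002, p_del=0.002):
    """Apply random substitutions, insertions and deletions (rates are assumed)."""
    out = []
    for base in sequence:
        r = rng.random()
        if r < p_del:
            continue  # deletion: drop this base
        if r < p_del + p_sub:
            # substitution: swap in a different nucleotide
            out.append(rng.choice([n for n in NUCLEOTIDES if n != base]))
        else:
            out.append(base)
        if rng.random() < p_ins:
            out.append(rng.choice(NUCLEOTIDES))  # insertion after this base
    return "".join(out)


def sample_candidates(seed_sequence, n_samples, is_plausible, seed=0):
    """Draw mutated sequences and keep those the predicate accepts.

    `is_plausible` is a hypothetical stand-in for the discriminator network.
    """
    rng = random.Random(seed)
    candidates = (mutate(seed_sequence, rng) for _ in range(n_samples))
    return [c for c in candidates if is_plausible(c)]


if __name__ == "__main__":
    strain = "ACGTTGCAACGTTGCA"
    kept = sample_candidates(strain, 100, lambda s: len(s) == len(strain))
    print(len(kept), "of 100 mutated sequences kept the original length")
```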

Learn more

Motif detection and clustering of franchise location network graphs

Clustering network motifs of successful franchise locations, to consequently identify franchise expansion patterns.

[code] [slides]

Hybrid implementation of motif-detection, bridge-detection, and clustering algorithms to yield sequential coordinates of geographical locations depending on category of product/business, based on network de-anonymization framework

Developed REST API to run algorithm and pass JSON-formatted output to Ruby on Rails frontend

Contributions of this work include the use of network graphs in time-independent pattern interpolation and a recursive backtesting method for running/validating the motifs across training/testing franchises

Learn more


Open-sourced electrophysiological signal visualization library built for Python


Defined modules/functions for general-purpose data processing, analysis and modeling of ECoG signal data, including parsing of MATLAB files into Python, different feature engineering techniques for multi-electrode time series data, different visualization techniques, and pre-built class-based decision-making classification models

Learn more

Last updated: 31/08/2020 | Archived projects