Hi, I’m Andre.

Welcome to my portfolio! Below you can find personal projects related to my academic and professional interests, which range from product data science to machine learning research in NLP and multimodal learning. You can visit my GitHub page to learn more.

Multimodal Graph Learning

Inductive Graph-based Learning of Multimodal Representations

Research project developing inductive graph-based learning approaches to improve the performance of multimodal (image + text) data representations for applications in Zillow Search. This was a joint effort with fellow student researchers Adi Srikanth, David Roth, and Tanya Naheta, under the supervision of the Zillow Applied Science team and NYU.

[Read More]

SimCSE Validation

Independent Validation and Extension of SimCSE Research Paper (Gao, Yao, and Chen 2021)

Contains work completed by me, Adi Srikanth, and Jin Ishizuka validating and extending experimental results from SimCSE: Simple Contrastive Learning of Sentence Embeddings by Gao, Yao, and Chen from Princeton University. Our work involved three parts: replicating experimental results from the paper; developing a neural sentiment classifier on scraped Twitter data to compare the performance of BERT-generated and SimCSE-generated sentence embeddings as inputs; and comparing the BERT-based and SimCSE-based sentence encodings using feature permutation analysis during sentiment classification.
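For readers unfamiliar with feature permutation analysis, here is a minimal sketch of the general idea, not the project's actual code. It shuffles one input feature at a time and measures how much classification accuracy drops. The names `accuracy` and `permutation_importance`, the `predict` callable, and the toy data in the usage note are all hypothetical and chosen for illustration only.

```python
import random


def accuracy(predict, X, y):
    """Share of rows for which predict(row) equals the true label."""
    return sum(1 for row, label in zip(X, y) if predict(row) == label) / len(y)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled.

    X is a list of feature rows (lists), y is a list of labels, and predict
    is any callable mapping one row to a label (hypothetical interface).
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances
```

As a usage example, a toy classifier such as `lambda row: int(row[0] > 0)` ignores its second feature, so that feature's importance comes out as exactly zero, while the first feature's importance is non-negative. With sentence embeddings, each feature would be one embedding dimension.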

Click [Read More] below to access source code and a full summary of data sources, methods, and results in the README of the project's GitHub repository.

[Read More]

Movie Recommender

Collaborative Filtering Movie Recommender System

Project exploring various ML-based approaches for building a movie recommender system. The focus was on designing, building, and evaluating latent factor models that could train in parallel on medium-to-large (1+ GB) distributed datasets on a high-performance computing cluster. Models were trained on interaction data (ratings by 280,000 unique viewers on 58,000 unique movies) and evaluated on how closely each model's top 100 recommendations for a user matched that user's true top 100 movies ranked by rating.
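One common way to score this kind of top-100 comparison is precision at k, averaged over users. The sketch below assumes that metric purely for illustration; the project's README is the authority on which ranking metric was actually used. The function names and the dictionary-based inputs are hypothetical.

```python
def precision_at_k(recommended, relevant, k=100):
    """Fraction of the top-k recommended items that appear in the relevant set."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    relevant_set = set(relevant)
    hits = sum(1 for item in top_k if item in relevant_set)
    return hits / k


def mean_precision_at_k(recs_by_user, truth_by_user, k=100):
    """Average precision at k over every user that has recommendations.

    recs_by_user maps a user id to a ranked list of recommended movie ids;
    truth_by_user maps a user id to that user's true top-rated movie ids.
    Both structures are assumptions made for this sketch.
    """
    scores = [
        precision_at_k(recs, truth_by_user.get(user, []), k)
        for user, recs in recs_by_user.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, with `k=4`, recommending `[1, 2, 3, 4]` to a user whose true favorites are `[2, 4, 9]` yields two hits out of four. Note that precision at k ignores the order within the top k; rank-aware metrics such as MAP or NDCG are alternatives when ordering matters.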

Click [Read More] below to access source code and a full summary of data sources, methods, and results in the README of the project's GitHub repository.

[Read More]