M.S. Bioinformatics Graduate • Seeking PhD & Research Roles in BioML
Georgia Institute of Technology
I am a bioinformatics researcher working at the intersection of machine learning and computational biology. My work focuses on geometric, topological, and physics-informed machine learning methods, leveraging protein language models and generative modeling for biomolecular dynamics and drug discovery.
I'm interested in designing practical computational tools and models that bridge machine learning with protein dynamics. Outside research, you'll usually find me experimenting with filmmaking, animation, and motion design, because storytelling, whether in science or cinema, is half the craft.
Relevant Coursework: Machine Learning, Machine Learning for Graphs, Deep Reinforcement Learning, Bioinformatics Algorithms, Complex Systems
Relevant Coursework: Bioinformatics, Molecular Modeling and Drug Design, Proteomics and Protein Engineering, Cell Biology, Biochemical Thermodynamics
A fine-tuned Transformer model mapping gene expression perturbations to molecular SMILES for targeted cancer drug design; achieved 23% novel scaffold generation with high pathway alignment and QED >0.75.
Built a 1D-CNN + bidirectional GRU classifier on GaMD trajectories to classify protein conformational states. Analyzed inter-subunit pathways using dynamic cross-correlation and estimated transition kinetics with corrected state-transition matrices.
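As a rough illustration of the kinetics step, a state-transition matrix can be estimated by counting consecutive state pairs in a sequence of per-frame labels and normalizing each row. This is a minimal sketch, not the project's code: the function name `transition_matrix`, the toy label sequence, and the optional `pseudocount` are assumptions for illustration, and the correction applied in the project is not reproduced here.

```python
def transition_matrix(states, n_states, pseudocount=0.0):
    """Row-normalized transition counts from a sequence of integer state labels."""
    # counts[a][b] = number of observed a -> b transitions (plus optional pseudocount)
    counts = [[pseudocount] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row instead of dividing by zero
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical per-frame state labels, e.g. from a conformational-state classifier
labels = [0, 0, 1, 1, 1, 0, 2, 2, 0]
T = transition_matrix(labels, n_states=3)

for row in T:
    print([round(p, 3) for p in row])
```

Each row of `T` is an empirical probability distribution over the next state at a lag of one frame; longer lag times or reversibility constraints would need additional handling.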
Developed a pseudo-log-likelihood typicality framework for protein language models (ESM2, ProtBERT) combined with TwoNN intrinsic dimensionality to audit evolutionary coverage and detect distributional mismatches in embedding representations.
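For context, TwoNN estimates intrinsic dimensionality from the ratio of each point's second- to first-nearest-neighbor distance. The sketch below uses the maximum-likelihood form of the estimator, d = N / sum(log(r2 / r1)), on synthetic data. It is an illustrative assumption rather than the framework's implementation: the function name `twonn_dimension` and the toy point cloud are hypothetical, and the brute-force neighbor search only suits small inputs.

```python
import math
import random


def twonn_dimension(points):
    """Maximum-likelihood TwoNN estimate: d = N / sum(log(r2 / r1))."""
    log_ratios = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, sorted ascending
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0.0:  # skip exact duplicates, where the ratio is undefined
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)


rng = random.Random(0)
# Toy data: a 2-D plane linearly embedded in 5-D space
plane = []
for _ in range(300):
    u, v = rng.random(), rng.random()
    plane.append([u, v, u + v, u - v, 2.0 * u])

estimate = twonn_dimension(plane)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

The estimate should land near 2 even though the points live in five ambient dimensions; on real embeddings, a much lower intrinsic than ambient dimension is the usual observation.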
Implemented an SE(3)-equivariant GNN for free-energy surface prediction from biomolecular ensembles, extended with a latent Markov transition model to capture conformational kinetics. Designed as a fast surrogate scorer in FEP-based virtual screening.
A dual-encoder DTI model with state-conditioned hypergraph fusion. The system encodes ligands via pharmacophore-aware hypergraphs and protein pockets via state-specific residue graphs (agonist vs antagonist), fusing representations through cross-attention for ligand toxicity and functional state classification.
A comprehensive review of physics-informed machine learning approaches for modeling and predicting biomolecular dynamics, bridging traditional molecular simulations with modern deep learning techniques.
Some study & quick-reference notes I made as a Teaching Assistant at GT.
Fast, multi-layer protein language model embedding extractor for ESM-2 and ESM-C. Supports mean-pooled residue embeddings, Flash Attention 2, bfloat16 precision, and SLURM array jobs for large-scale representation learning.
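Mean pooling reduces a variable-length matrix of per-residue embeddings to one fixed-size vector per sequence by averaging over real residues and ignoring padding. A minimal pure-Python sketch of that idea follows; the function name `mean_pool` and the toy values are assumptions for illustration and are not taken from the tool itself.

```python
def mean_pool(residue_embeddings, attention_mask):
    """Average per-residue vectors, ignoring positions where the mask is 0."""
    dim = len(residue_embeddings[0])
    total = [0.0] * dim
    kept = 0
    for vector, keep in zip(residue_embeddings, attention_mask):
        if keep:
            kept += 1
            for k, value in enumerate(vector):
                total[k] += value
    if kept == 0:
        raise ValueError("mask excludes every position")
    return [t / kept for t in total]


# Toy example: three positions with 2-D embeddings; the last position is padding
embeddings = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(embeddings, mask)
print(pooled)
```

Masking matters in batched extraction: without it, padding vectors would bias the pooled representation of shorter sequences.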
View on GitHub
Proof-of-concept pipeline for detecting cryptic drug-binding pockets on proteins using conformational ensembles sampled via BioEMU. Ranks pockets by druggability across transient states using a graph attention network, aiming to capture binding sites invisible to single-structure methods.
View on GitHub
Python automation tool for batch submitting protein sequences to SwissModel using a single structural template. Features parallel multithreaded processing, built-in rate-limit handling, and both CLI and GUI modes.
View on GitHub
Comprehensive drug discovery suite with ADMET prediction, molecular visualization, physicochemical property calculation, druglikeness evaluation, and integration with ChemSpider and PubChem databases.
View on GitHub
I'm open to collaborations, research opportunities, and discussions about bioinformatics and machine learning. Whether you have a project in mind or just want to chat about computational biology, feel free to reach out!