About Me

Aaryesh Deshpande

M.S. Bioinformatics Graduate • Seeking PhD & Research Roles in BioML

Georgia Institute of Technology

I am a bioinformatics researcher working at the intersection of machine learning and computational biology. My work focuses on geometric, topological, and physics-informed machine learning methods, leveraging protein language models and generative modeling for biomolecular dynamics and drug discovery.

I'm interested in designing practical computational tools and models that bridge machine learning with protein dynamics. Outside research, you'll usually find me experimenting with filmmaking, animation, and motion design, because storytelling, whether in science or cinema, is half the craft.

Based in Atlanta, GA
Actively seeking PhD & Research roles in BioML

Education

Georgia Institute of Technology

Aug 2024 - May 2026

M.S. Bioinformatics

Relevant Coursework: Machine Learning, Machine Learning for Graphs, Deep Reinforcement Learning, Bioinformatics Algorithms, Complex Systems

GITAM University

Dec 2020 - Apr 2024

B.Tech Biotechnology

Relevant Coursework: Bioinformatics, Molecular Modeling and Drug Design, Proteomics and Protein Engineering, Cell Biology, Biochemical Thermodynamics

Research Interests

ML for Computational Biology

Protein Language Models

Computational Biophysics

Graph Neural Networks

Physics-Informed Neural Networks

Bioinformatics Software Design

Projects

Transformer-Based Generation of Novel Drug-like Molecules

July 2023 - Dec 2023

Fine-tuned a Transformer model to map gene expression perturbations to molecular SMILES strings for targeted cancer drug design; achieved 23% novel scaffold generation with high pathway alignment and QED > 0.75.

Transformers • GEO • Drug Design • Generative Model

Allosteric Dynamics in Bacterial Phosphofructokinase

Nov 2024 - May 2025

Built a 1D-CNN + bidirectional GRU classifier on Gaussian accelerated molecular dynamics (GaMD) trajectories to classify protein conformational states. Analyzed inter-subunit pathways using dynamic cross-correlation and estimated transition kinetics with corrected state-transition matrices.

Molecular Dynamics • CNN + GRU • Allostery
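
For readers unfamiliar with dynamic cross-correlation, the sketch below shows the standard normalized form, C_ij = <dr_i . dr_j> / sqrt(<dr_i^2> <dr_j^2>), on a toy trajectory. It is only an illustration of the general technique, not the code used in this project; the function name, the nested-list trajectory layout, and the three-atom example are assumptions made for the demo.

```python
import math

def dynamic_cross_correlation(traj):
    """Normalized cross-correlation of atomic displacements.

    traj[t][i] is the (x, y, z) position of atom i in frame t
    (a hypothetical layout chosen for this sketch).
    """
    n_frames = len(traj)
    n_atoms = len(traj[0])
    # Mean position of each atom over all frames.
    mean = [[sum(frame[i][k] for frame in traj) / n_frames for k in range(3)]
            for i in range(n_atoms)]
    # Per-frame displacement of each atom from its mean position.
    disp = [[[frame[i][k] - mean[i][k] for k in range(3)]
             for i in range(n_atoms)] for frame in traj]

    def avg_dot(i, j):
        # Time average of the dot product of displacements of atoms i and j.
        return sum(sum(d[i][k] * d[j][k] for k in range(3))
                   for d in disp) / n_frames

    corr = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        for j in range(n_atoms):
            norm = math.sqrt(avg_dot(i, i) * avg_dot(j, j))
            corr[i][j] = avg_dot(i, j) / norm if norm > 0 else 0.0
    return corr

# Toy trajectory: atoms 0 and 1 move together, atom 2 moves the opposite way.
toy_traj = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
toy_corr = dynamic_cross_correlation(toy_traj)
```

In this toy case the first two atoms are perfectly correlated and the third is perfectly anti-correlated with them. A real analysis would typically align frames first and use a vectorized library such as NumPy or MDAnalysis.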

Protein Language Model Typicality Landscapes

June 2025 - Dec 2025

Developed a pseudo-log-likelihood typicality framework for protein language models (ESM-2, ProtBERT), combined with TwoNN intrinsic-dimensionality estimation, to audit evolutionary coverage and detect distributional mismatches in embedding representations.

Protein Language Models • ESM-2 • Intrinsic Dimensionality
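
As a rough illustration of the pseudo-log-likelihood idea, the sketch below sums log p(x_i | sequence with position i masked) over all positions. How the project turns this quantity into a typicality score is not described here, so this shows only the commonly used definition. The `masked_log_probs` callable and the uniform toy model are hypothetical stand-ins for a real masked language model such as ESM-2.

```python
import math

def pseudo_log_likelihood(seq, masked_log_probs):
    """Sum over positions of log p(x_i | sequence with position i masked).

    masked_log_probs(seq, i) is a hypothetical callable that returns a
    dict {token: log-probability} for position i when it is masked.
    """
    total = 0.0
    for i, token in enumerate(seq):
        total += masked_log_probs(seq, i)[token]
    return total

def toy_uniform_model(seq, i):
    # Stand-in for a real masked language model: uniform over 20 amino acids.
    alphabet = "ACDEFGHIKLMNPQRSTVWY"
    return {aa: math.log(1.0 / len(alphabet)) for aa in alphabet}

sequence = "MKTAYIAK"
pll = pseudo_log_likelihood(sequence, toy_uniform_model)
# Length-normalized score, often easier to compare across sequences.
per_residue_pll = pll / len(sequence)
```

With a real model, each position needs its own masked forward pass, so the cost grows with sequence length; batching the masked copies is the usual way to keep it practical.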

SE(3)-Equivariant GNN for Free-Energy Landscape Prediction

April 2025 - September 2025

Implemented an SE(3)-equivariant GNN for free-energy surface prediction from biomolecular ensembles, extended with a latent Markov transition model to capture conformational kinetics. Designed as a fast surrogate scorer in FEP-based virtual screening.

Graph Neural Networks • SE(3)-Equivariant • PIML

State-Conditioned Fusion Hypergraph for PPARγ Ligand Toxicity

Sept 2025 - Feb 2026

A dual-encoder drug-target interaction (DTI) model with state-conditioned hypergraph fusion. The system encodes ligands as pharmacophore-aware hypergraphs and protein pockets as state-specific residue graphs (agonist vs. antagonist), then fuses the two representations through cross-attention to predict ligand toxicity and classify functional state.

Hypergraphs • DTI Model • Cross-Attention • Toxicity Prediction • Pharmacophore

Papers & Writing

Learning Biomolecular Motion: The Physics-Informed Machine Learning Paradigm

arXiv:2511.06585 • November 2025

A comprehensive review of physics-informed machine learning approaches for modeling and predicting biomolecular dynamics, bridging traditional molecular simulations with modern deep learning techniques.

Review Paper • Physics-Informed ML • Biomolecular Dynamics

Notes

Some study & quick-reference notes I made as a Teaching Assistant at GT.

Tools & Software

Active

esm-embed

Fast, multi-layer protein language model embedding extractor for ESM-2 and ESM-C. Supports mean-pooled residue embeddings, Flash Attention 2, bfloat16 precision, and SLURM array jobs for large-scale representation learning.

Python • ESM-2 • ESM-C • Flash Attention • SLURM
View on GitHub
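
To make "mean-pooled residue embeddings" concrete, here is a minimal sketch of masked mean pooling: per-residue vectors are averaged over real residues only, skipping padding and special tokens. It does not reflect esm-embed's actual interface; the function name, the list-of-lists layout, and the mask convention are assumptions made for illustration.

```python
def mean_pool(residue_embeddings, mask):
    """Average per-residue vectors over positions where mask is 1.

    residue_embeddings: list of L vectors, each a list of D floats.
    mask: list of L ints, 1 for real residues, 0 for padding/special tokens.
    """
    kept = [vec for vec, m in zip(residue_embeddings, mask) if m]
    if not kept:
        raise ValueError("mask selects no residues")
    dim = len(kept[0])
    # Column-wise mean over the kept residue vectors.
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

# Two real residues followed by one padding position that must be ignored.
embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(embeddings, mask)  # [2.0, 3.0]
```

The result is one fixed-size vector per protein regardless of sequence length, which is what makes pooled embeddings convenient as input to downstream models.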
Experimental

ensemble-pocket-finder

Proof-of-concept pipeline for detecting cryptic drug-binding pockets on proteins using conformational ensembles sampled via BioEMU. Ranks pockets by druggability across transient states using a graph attention network, aiming to capture binding sites invisible to single-structure methods.

Python • BioEMU • Graph Attention • Torch-Geometric • Pocket Detection
View on GitHub
Legacy

Swiss Model Batch Processor

Python automation tool for batch-submitting protein sequences to SwissModel against a single structural template. Features parallel multithreaded processing, built-in rate-limit handling, and both CLI and GUI modes.

Python • PyQt • SwissModel • Multithreading
View on GitHub

GU Drug Pro Toolkit

Comprehensive drug discovery suite with ADMET prediction, molecular visualization, physicochemical property calculation, drug-likeness evaluation, and integration with the ChemSpider and PubChem databases.

Python • RDKit • ADMET • Drug Discovery • GUI
View on GitHub

Skills & Technologies

Programming

Python • Bash • SQL • C • MATLAB • JavaScript • HTML/CSS

Software & Tools

AMBER • AutoDock Vina • PyMOL • SLURM • Git • Docker • Nextflow • SvelteKit • PyQt

Libraries & Frameworks

PyTorch • Torch-Geometric • JAX • HuggingFace Transformers • ESM & ProtBERT • Scikit-learn • XGBoost • NumPy / SciPy • Biopython • RDKit • DeepChem • MDAnalysis

Get in Touch

Let's Connect

I'm open to collaborations, research opportunities, and discussions about bioinformatics and machine learning. Whether you have a project in mind or just want to chat about computational biology, feel free to reach out!