Shantanu Kumar

Software Engineer / ML
Building on the web

I craft digital experiences with a focus on intelligent systems, performance, and robust engineering. Turning complex algorithms into elegant interfaces.

About Me

I'm a Full-Stack and AI Engineer deeply interested in building intelligent systems from scratch. Currently studying Computer Science with an AI/ML Minor at Bennett University, I span the stack from high-performance frontend interfaces to complex, agentic backend architectures.

Whether I'm implementing Transformers from first principles, orchestrating RAG pipelines, or optimizing web applications for thousands of users, I focus on rigorous engineering and on turning complex algorithms into robust, deployed products.

Greater Noida, India
B.Tech CS (AI/ML Minor) @Bennett Univ.
+91 9229977256

All Projects

Vision-Language Transformer

A PaliGemma-style Vision-Language Transformer implemented from first principles, including ViT, contrastive learning, RoPE, GQA, KV-cache, and autoregressive decoding.

Python ◦ First-Principles
View Repository
Foundation LLM Pipeline

Building an end-to-end LLM pipeline from a base Transformer through supervised fine-tuning, reward modeling, and PPO-based RLHF.

Python ◦ RLHF
View Repository
TheKourse

An AI-powered interactive learning platform with contextual agent explanations, automated project scoring, and an in-browser code execution engine.

TypeScript ◦ Startup
View Repository
Vanilla PyTorch Transformer

Implemented the original Transformer architecture from scratch in PyTorch to gain a deep understanding of self-attention, masking, and encoder-decoder stacks.

Python ◦ PyTorch
View Repository
Agentic Blog Generation

An agentic blog generation system using LangGraph and FastAPI, orchestrating multi-step LLM workflows with tool use, memory, and validation pipelines.

Python ◦ LangGraph
View Repository
ResNet-34 Audio CNN

A ResNet-style CNN for audio spectrogram classification with end-to-end inference deployment using PyTorch, torchaudio, FastAPI, and Modal.

Python ◦ FastAPI
View Repository
Multi-Agent Justice (VIDHI-AI)

Vidhi AI is a multi-agent courtroom simulation modeling legal reasoning through role-specific AI agents, grounded in statutory knowledge using RAG.

Python ◦ RAG
View Repository
Gigabyte Phantasia

Built the landing page for Gigabyte Phantasia, a 24-hour continuous hackathon. The design is inspired by a Vegas theme.

Python ◦ FastAPI
View Repository

Tech Stack

Languages

Python
JavaScript
TypeScript
C++
SQL

AI / ML

PyTorch
Transformers
PPO-RLHF
MoE
Vision Transformers

Frameworks

Next.js
React
FastAPI
LangChain
LangGraph

Infrastructure

MongoDB
Supabase
Modal
Git
Docker
