Shreyas Kapse Portfolio

About

AI Engineer with a production backend foundation — building LLM-powered systems using LangChain, LangGraph, and RAG pipelines on top of scalable Java/Spring Boot infrastructure.

Hands-on with agentic workflows, vector databases, hybrid retrieval, and LLM evaluation (RAGAS, LangSmith). Currently at Predii extending LLM retrieval systems and building AI-driven data pipelines.

My backend engineering background — microservices, event-driven systems, async processing — gives me an edge in shipping AI that actually works in production, not just in notebooks.

Technical Skills

Technologies & Tools

AI / ML

LangChain, LangGraph, RAG Pipelines, Qdrant, BM25, Vector Databases, LangSmith, RAGAS, Pydantic, Streamlit

LLM Providers

Ollama, Google Gemini, HuggingFace

Programming Languages

Python, Java, JavaScript, TypeScript, SQL

Backend & Frameworks

Spring Boot, FastAPI, Node.js, NestJS, Microservices Architecture

Databases & Caching

PostgreSQL, MongoDB, MySQL, Redis, Qdrant, Azure Cosmos DB

Messaging & Streaming

RabbitMQ, Apache Kafka, Event-Driven Architecture

DevOps & Tools

Git, Docker, CI/CD, JMeter, Load Testing

Testing

JUnit, Mockito, Integration Testing

Work Experience

Professional Journey

Software Development Engineer

Predii, Pune

Nov 2024 – Present

Extended the LLM-based retrieval system by integrating Azure Cosmos DB as a vector store and migrating retrieval calls to async execution, reducing response latency for AI-driven workflows.
Built a Python-based JMeter automation tool that generates .jmx test plans from a YAML config (endpoint, load, threads, rounds), executes load tests, logs request/response pairs, and produces structured HTML reports.
Extended a Java/RabbitMQ data processing pipeline that parses raw repair order data (JSON, CSV, TSV, MongoDB) into structured output for LLM and analytics systems; contributed a Date-Time enrichment plugin, a REST plugin, and a Precedence plugin for entity extraction.
Built a Git automation tool to streamline multi-repo release workflows and contributed to the upgrade of core services from Java 13 to Java 21/25.
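
As an illustration of the async retrieval migration mentioned above, here is a minimal sketch of running several I/O-bound lookups concurrently with `asyncio.gather`. It is not the Predii implementation: `fetch_chunks` is a hypothetical stand-in for a vector-store query, and the simulated delay is an assumption made only so the example is runnable.

```python
import asyncio


async def fetch_chunks(source: str, delay: float = 0.05) -> list[str]:
    """Hypothetical stand-in for one vector-store lookup (I/O-bound)."""
    await asyncio.sleep(delay)  # simulates network latency; not a real store call
    return [f"{source}-chunk-{i}" for i in range(2)]


async def retrieve_all(sources: list[str]) -> list[str]:
    """Run all lookups concurrently and flatten the results in input order."""
    # gather() schedules every coroutine at once, so total wall time is
    # roughly the slowest lookup rather than the sum of all lookups.
    results = await asyncio.gather(*(fetch_chunks(s) for s in sources))
    return [chunk for chunks in results for chunk in chunks]


if __name__ == "__main__":
    print(asyncio.run(retrieve_all(["manuals", "orders", "bulletins"])))
```

`asyncio.gather` returns results in the order the awaitables were passed, so downstream code can still rely on a stable ordering even though the calls overlap.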

Software Development Engineer Intern

LogiQuad Solutions, Pune

Jan 2024 – Jun 2024

Designed and implemented RESTful APIs using Spring Boot to handle concurrent requests with low-latency responses under production load conditions.
Integrated Redis caching layer to reduce repeated database hits, improving response times for read-heavy operations.
Built a JUnit/Mockito testing suite achieving 85%+ code coverage (measured via IntelliJ Coverage), reducing production defects.
Delivered 10+ cross-platform features in React Native, improving mobile app stability and reducing crash rates.

Research Intern

VIT Pune

Jul 2023 – Dec 2023

Engineered an Android app using Kotlin, MVVM, and Firebase, securing a $500 research grant.
Conducted user behavior and UX research to improve app usability, enhancing navigation for new users.

Head of Cyber Security

Team Quark, VIT Pune

Feb 2023 – Jun 2023

Implemented security measures and proactive defenses to protect software systems and sensitive data from cyber threats.
Developed and enforced security policies and protocols, ensuring compliance with industry best practices.
Led cybersecurity initiatives and trained team members, strengthening the organization's overall security posture.

Selected Work

Projects

YouTube RAG Bot – AI Video Q&A System

Python · LangChain · FastAPI · Qdrant · BM25 · HuggingFace · LangSmith · RAGAS · Chrome Extension

RAG pipeline over YouTube transcripts using hybrid retrieval (BM25 + dense vector search), query expansion, and cross-encoder reranking. Delivers timestamp-grounded answers via a Chrome Extension embedded in the YouTube UI. Evaluated with RAGAS (Faithfulness: 0.75, Relevancy: 0.71) and traced end-to-end via LangSmith.
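
One common way to merge a BM25 ranking with a dense-vector ranking is reciprocal rank fusion (RRF). The sketch below is an assumption about how such a merge could look, not a description of this project's code: the description above does not say which fusion method is used, and the chunk IDs and the constant `k = 60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical transcript-chunk IDs returned by each retriever.
bm25_ranking = ["t3", "t1", "t2"]
dense_ranking = ["t1", "t4", "t3"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Chunks that appear in both lists accumulate score from each, so they rise above chunks found by only one retriever. The fused list would then be passed to the reranking step.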

View Code

Retrieval

Hybrid BM25 + Semantic Search + Reranking

Evaluation

RAGAS · Faithfulness 0.75 · Relevancy 0.71

Extension

Chrome Extension with Timestamp Jump UI

AI PR Reviewer – Multi-Agent GitHub Code Review System

Python · LangGraph · LangChain · FastAPI · Google Gemini · Ollama · Pydantic · LangSmith

GitHub App that reviews pull requests using a parallel multi-agent LangGraph pipeline — security, bug risk, and performance agents run simultaneously with typed Pydantic outputs aggregated into a structured merge recommendation. Supports Ollama and Gemini 2.0 Flash with a configurable human-in-the-loop gate, LangSmith tracing, and SQLite checkpointing.
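
The fan-out and aggregate shape described above can be sketched with the standard library alone. This is a simplified illustration, not the project's LangGraph graph: the three agent functions are hypothetical keyword checks standing in for LLM calls, and a frozen dataclass stands in for the Pydantic models.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str
    severity: str  # "low" or "high" in this sketch
    note: str


# Hypothetical agents: each inspects the diff independently of the others.
async def security_agent(diff: str) -> Finding:
    risky = "eval(" in diff
    return Finding("security", "high" if risky else "low",
                   "eval() on input" if risky else "no issues")


async def bug_agent(diff: str) -> Finding:
    risky = "except:" in diff
    return Finding("bug_risk", "high" if risky else "low",
                   "bare except" if risky else "no issues")


async def performance_agent(diff: str) -> Finding:
    risky = "sleep(" in diff
    return Finding("performance", "high" if risky else "low",
                   "blocking sleep" if risky else "no issues")


async def review(diff: str) -> dict:
    """Run all agents concurrently, then aggregate into one recommendation."""
    findings = await asyncio.gather(
        security_agent(diff), bug_agent(diff), performance_agent(diff)
    )
    blocked = any(f.severity == "high" for f in findings)
    return {
        "recommendation": "request_changes" if blocked else "approve",
        "findings": list(findings),
    }


if __name__ == "__main__":
    print(asyncio.run(review("result = eval(user_input)")))
```

Typed findings keep the aggregation step simple: it only reads the `severity` field and never parses free text.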

View Code

Architecture

Parallel Multi-Agent LangGraph Pipeline

Integration

GitHub App · JWT · HMAC-SHA256 Webhooks

Control

Configurable Human-in-the-Loop Gate
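
The HMAC-SHA256 webhook check listed under Integration can be sketched as follows. GitHub sends the signature in the `X-Hub-Signature-256` header as `sha256=<hex digest>` computed over the raw request body; the function name, the secret, and the payload here are hypothetical and are not taken from the project.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, payload: bytes, signature_header: str) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw body."""
    expected = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


if __name__ == "__main__":
    secret = b"example-secret"        # hypothetical webhook secret
    body = b'{"action": "opened"}'    # hypothetical raw request body
    header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    print(verify_signature(secret, body, header))         # valid delivery
    print(verify_signature(secret, body + b" ", header))  # tampered body
```

The check must run against the raw bytes of the request, before any JSON parsing, because re-serializing the payload can change whitespace and invalidate the digest.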

Education & Research

Academic Background

B.Tech in Computer Engineering

Vishwakarma Institute of Technology (VIT), Pune

Sep 2021 – Jun 2024

CGPA: 8.6/10

Relevant Coursework: Data Structures & Algorithms, Object-Oriented Programming, Database Management Systems, Computer Networks, Operating Systems, Artificial Intelligence, Cyber Security, Compiler Design

Publications

"IOT Smart Stand for Smart Phones" — Published in IEEE INCOFT (International Conference on Futuristic Technologies) 2022
"Prototype Design for Smart Birdnest and Related Android App" — Book Chapter 8, Interactive Media with Next-Gen Technologies and Their Usability Evaluation

AI Engineer Building LLM Systems, RAG Pipelines, and Agentic Workflows

Resume

Articles & Blog Posts

Chapter 2: Filter, Map, and Reduce — The Superpowers of Java 8 Streams

Java 8 Features: A Complete Guide for Modern Java Developers

What is Synchron?

Contact

Email

Phone

GitHub

LinkedIn