cv
Aspiring AI/ML Researcher with extensive experience in developing and deploying scalable AI-driven solutions.
General Information
| Full Name | Shreyas S |
| Languages | English, Hindi, Kannada, Telugu, Malayalam |
Education
-
2027 Amaravati, AP
M.Tech in Computer Science and Engineering
Vellore Institute of Technology
Experience
-
JUN'2025 SEP'2025 AI Engineer
Puch AI - Parallelized speech-to-text (STT) pipeline on H100 GPUs with CUDA streams, cutting median latency by 18% and boosting throughput by 7% at scale.
- Reduced text-to-speech (TTS) decoder latency while improving audio quality for end users.
- Migrated the Model Context Protocol (MCP) integration from a single-server setup to multiple concurrent MCP servers per user, scaling access to millions of Indian users.
- Improved multilingual translation for 11 Indic languages with robust language & script detection, strict validation, and safe fallbacks, lowering misclassification errors in live traffic.
- Researched and shipped audio generation inside text-to-video workflows - launch post reached 700k+ impressions across LinkedIn and Twitter.
-
MAR'2025 JUN'2025 ML Intern
ScoreTravel - Developed ML microservices for intelligent data preprocessing and deduplication, implementing text extraction and similarity algorithms to optimize downstream model costs.
- Benchmarked image processing libraries and conducted performance analysis to identify optimal solutions for high-throughput data pipelines.
- Conducted R&D experiments to determine optimal similarity thresholds and performance baselines, setting up WandB for experiment tracking and model comparison.
- Evaluated the solution against Vision Language Models (VLMs) through comparative performance analysis to validate the approach.
-
SEP'2024 DEC'2024 Student Researcher
BMI department of Emory University - Developed dynamic input size calculation based on sampling rate and EEG segment duration, enhancing data flexibility.
- Automated HDF5 dataset loading for seamless integration across multiple EEG configurations.
- Integrated vector quantization training with optimized codebook and distributed data gathering for synchronized multi-node performance.
- Engineered a DeepLIFT pipeline with a custom model wrapper to compute and visualize feature attributions for MDD EEG data.
- Designed visualization tools (heatmaps, box plots, and time-resolved plots) to analyze channel-wise EEG feature importance.
-
MAY'2024 SEP'2024 Google Summer of Code (GSoC) Contributor
BMI department of Emory University - Preprocessed and cleaned 5 large EEG datasets from various sources
- Customized and adapted multiple codebases to integrate with our preprocessed EEG data
- Accelerated distributed processing and model training across multiple nodes using PyTorch and the HuggingFace Trainer API, streamlining model implementation and reducing training time by 20%
- Developed and trained a flexible foundational EEG model, then fine-tuned it to reach 70% accuracy on downstream tasks
- Revised multiple generative pretrained transformer (GPT) architectures, offering a choice among them based on need and reducing computation load by 10%
- Used PyTorch Captum's DeepLift and gradient-based attribution methods to interpret model behavior, and implemented model profiling for the codebase to monitor resource utilization
-
SEP'2023 APR'2024 AI/ML Researcher
DigitalFortress Private Limited - Directed development of a natural language query agent that retrieves data from a MongoDB collection using Natural Language Processing and Retrieval-Augmented Generation (RAG), improving user interaction and data retrieval
- Developed and designed pipelines for data preparation and testing for a novel anti-spoof detection model by combining 3 datasets totaling over 55GB, utilizing various filters to enhance data quality
- Deployed the trained model on a cloud machine and exposed it through an open Flask API, resulting in a 38% reduction in processing time for data analysis tasks
- Pioneered a data download pipeline in Python that retrieved and processed 3,000+ videos, reducing data acquisition time by 50%
-
MAR'2023 (Active) Student Researcher
Center of Excellence in AI and Robotics (AIR), VIT Amaravati - Developed advanced text-based inpainting pipelines, enabling seamless context-aware image modifications using Stable Diffusion.
- Engineered multi-face clustering solutions with DBSCAN, optimizing face grouping and identity preservation in complex image datasets.
- Fine-tuned Stable Diffusion models, enhancing image generation quality, control, and consistency for diverse tasks.
- Optimized inference pipelines, implementing efficient GPU memory management and accelerated batch processing.
- Integrated custom embeddings and LoRA fine-tuning, allowing for style adaptation and domain-specific image synthesis.
- Developed a state-of-the-art OCR pipeline to extract artefacts from mechanical drawings.
-
MAR'2023 JUN'2023 Deep Learning and Data Pipelining Intern
DUBSYNC.AI - Automated video trimming pipeline using a modified PyTube, eliminating manual effort.
- Stored processed videos directly in Google Drive with a fallback mechanism for failed files.
- Built an image extraction pipeline using S3FD to generate datasets from videos.
- Enhanced CodeFormer architecture, upgrading from ESRGAN 2x to ESRGAN 4x for improved image quality.
Projects
-
ComicStrips (Text-to-Comic Strip Generation)
- Developed a Python-based Reddit crawler and scraped over 1,000 images, then fine-tuned the flux-dev model using Low-Rank Adaptation (LoRA) for text-to-comic strip generation
- Built an automatic data annotation pipeline that captions the scraped images with a large vision model during data preparation
- Deployed the model on Hugging Face, achieving 20,000+ downloads, highlighting the project's reach and user interest
-
Adhoc (Automated Documentation Tool)
- Created a command-line tool to automate codebase documentation with local large language models, generating professional outputs in LaTeX, Markdown, and Word formats
- Released as a Python pip package and adopted by peers, simplifying the generation of detailed, professional documentation for codebase changes
-
AnimeStudio (Anime Edits)
- Fine-tuned a Stable Diffusion model for real-time Ghibli-style image transformations (1.3k+ conversions at sub-100ms median latency), and deployed a secure proxy with Upstash Redis for IP-based rate limiting, scaling to 11GB+ of bandwidth globally with minimal error rates.
- Deployed a production-grade FastAPI backend with background jobs and automated cleanup, handling 10k+ daily requests from 40+ countries (~140 unique users/day in the first week).
Technical Skills
| Languages | Python, R, Java, SQL (MySQL, PostgreSQL), NoSQL (MongoDB, Redis, Cassandra, Neo4j) |
| Technologies | PyTorch, scikit-learn, NumPy, Pandas, OpenCV |
| Specializations | Deep Learning, Data Pipelines, LLMs, Data Analysis, Statistical Modeling and Deployment. |
Achievements
-
2025 - Academic Rank 7th in the Department of Computer Science and Engineering after 5 semesters.
- Finished as Finalist in Adobe India Hackathon 2025 (Top 7 out of 260k participants).
-
2024 - First student, and one of only two, selected for Google Summer of Code (GSoC) from VIT Amaravati.
-
2023 - Led a team of 5 international teammates to a Top 50 finish in HackSquad 2023 (Top 60 winners out of 2500+ teams).