cv

Aspiring AI/ML Researcher with extensive experience in developing and deploying scalable AI-driven solutions.

General Information

Full Name Shreyas S
Languages English, Hindi, Kannada, Telugu, Malayalam

Education

  • 2027

    Amaravati, AP

    M.Tech in Computer Science and Engineering
    Vellore Institute of Technology

Experience

  • JUN'2025 SEP'2025
    AI Engineer
    Puch AI
    • Parallelized speech-to-text (STT) pipeline on H100 GPUs with CUDA streams, cutting median latency by 18% and boosting throughput by 7% at scale.
    • Reduced text-to-speech (TTS) decoder latency while improving audio quality for end users.
    • Implemented Model Context Protocol (MCP) support and migrated it from a single-server setup to multiple concurrent MCP servers per user, scaling access to millions of Indian users.
    • Improved multilingual translation for 11 Indic languages with robust language & script detection, strict validation, and safe fallbacks, lowering misclassification errors in live traffic.
    • Researched and shipped audio generation inside text-to-video workflows; the launch post reached 700k+ impressions across LinkedIn and Twitter.
  • MAR'2025 JUN'2025
    ML Intern
    ScoreTravel
    • Developed ML microservices for intelligent data preprocessing and deduplication, implementing text extraction and similarity algorithms to optimize downstream model costs.
    • Benchmarked image processing libraries and conducted performance analysis to identify optimal solutions for high-throughput data pipelines.
    • Conducted R&D experiments to determine optimal similarity thresholds and performance baselines, setting up WandB for experiment tracking and model comparison.
    • Evaluated solution against Vision Language Models (VLMs) through comparative performance analysis to validate approach effectiveness.
  • SEP'2024 DEC'2024
    Student Researcher
    BMI department of Emory University
    • Developed dynamic input size calculation based on sampling rate and EEG segment duration, enhancing data flexibility.
    • Automated HDF5 dataset loading for seamless integration across multiple EEG configurations.
    • Integrated vector quantization training with optimized codebook and distributed data gathering for synchronized multi-node performance.
    • Engineered a DeepLIFT pipeline with a custom model wrapper to compute and visualize feature attributions for MDD EEG data.
    • Designed visualization tools (heatmaps, box plots, and time-resolved plots) to analyze channel-wise EEG feature importance.
  • MAY'2024 SEP'2024
    Google Summer of Code (GSoC) Contributor
    BMI department of Emory University
    • Preprocessed and cleaned 5 large EEG datasets from various sources.
    • Customized and adapted multiple codebases to integrate with our preprocessed EEG data.
    • Accelerated distributed processing and model training across multiple nodes using PyTorch and the HuggingFace Trainer API, streamlining model implementation and reducing training time by 20%.
    • Developed and trained a flexible EEG foundation model, then fine-tuned it to reach 70% accuracy on downstream tasks.
    • Revised multiple generative pretrained transformer (GPT) architectures so that any of them can be selected based on need, reducing computation load by 10%.
    • Used Captum's DeepLIFT and gradient-based attribution methods to interpret model predictions, and added profiling to the codebase to monitor resource utilization.
  • SEP'2023 APR'2024
    AI/ML Researcher
    DigitalFortress Private Limited
    • Directed the development of a natural language query agent that retrieves data from a MongoDB collection using Natural Language Processing and Retrieval-Augmented Generation (RAG), improving user interaction and data retrieval.
    • Designed and developed data preparation and testing pipelines for a novel anti-spoof detection model, combining 3 datasets totaling over 55 GB and applying various filters to enhance data quality.
    • Deployed the trained model on a cloud machine and exposed it through an open Flask API, resulting in a 38% reduction in processing time for data analysis tasks.
    • Built a Python data download pipeline that retrieved and processed over 3,000 videos, reducing data acquisition time by 50%.
  • MAR'2023 (Active)
    Student Researcher
    Center of Excellence in AI and Robotics (AIR), VIT Amaravati
    • Developed advanced text-based inpainting pipelines, enabling seamless context-aware image modifications using Stable Diffusion.
    • Engineered multi-face clustering solutions with DBSCAN, optimizing face grouping and identity preservation in complex image datasets.
    • Fine-tuned Stable Diffusion models, enhancing image generation quality, control, and consistency for diverse tasks.
    • Optimized inference pipelines, implementing efficient GPU memory management and accelerated batch processing.
    • Integrated custom embeddings and LoRA fine-tuning, allowing for style adaptation and domain-specific image synthesis.
    • Developed a state-of-the-art OCR pipeline to extract artefacts from mechanical drawings.
  • MAR'2023 JUN'2023
    Deep Learning and Data Pipelining Intern
    DUBSYNC.AI
    • Automated video trimming pipeline using a modified PyTube, eliminating manual effort.
    • Stored processed videos directly in Google Drive with a fallback mechanism for failed files.
    • Built an image extraction pipeline using S3FD to generate datasets from videos.
    • Enhanced CodeFormer architecture, upgrading from ESRGAN 2x to ESRGAN 4x for improved image quality.

Projects

  • ComicStrips (Text-to-Comic Strip Generation)
    • Developed a Python-based Reddit crawler, scraped over 1,000 images, and fine-tuned the flux-dev model using Low-Rank Adaptation (LoRA) for text-to-comic strip generation.
    • Built an automatic annotation pipeline that captions the scraped images with a large vision model during data preparation.
    • Deployed the model on Hugging Face, achieving 20,000+ downloads, highlighting the project's reach and user interest.
  • Adhoc (Automated Documentation Tool)
    • Created a command-line tool to automate codebase documentation with local large language models, generating professional outputs in LaTeX, Markdown, and Word formats.
    • Released as a pip-installable Python package, widely adopted by peers, that simplifies generating detailed, professional documentation for codebase changes with minimal effort.
  • AnimeStudio (Anime Edits)
    • Fine-tuned a Stable Diffusion model for real-time Ghibli-style image transformations (1.3k+ conversions at sub-100ms median latency), and deployed a secure proxy with Upstash Redis for IP-based rate limiting, scaling to 11 GB+ of bandwidth globally with minimal error rates.
    • Deployed a production-grade FastAPI backend with background jobs and automated cleanup, handling 10k+ daily requests in 40+ countries (~140 unique users/day in the first week).

Technical Skills

Languages Python, R, Java, SQL (MySQL, PostgreSQL), NoSQL (MongoDB, Redis, Cassandra, Neo4j)
Technologies PyTorch, scikit-learn, NumPy, Pandas, OpenCV
Specializations Deep Learning, Data Pipelines, LLMs, Data Analysis, Statistical Modeling and Deployment.

Achievements

  • 2025
    • Academic Rank 7th in the Department of Computer Science and Engineering after 5 semesters.
    • Finalist in the Adobe India Hackathon 2025 (Top 7 out of 260k participants).
  • 2024
    • First student, and one of only two, selected for Google Summer of Code (GSoC) from VIT Amaravati.
  • 2023
    • Led a team of 5 international teammates to achieve a Top 50 finish in HackSquad 2023 (Top 60 winners out of 2,500+ teams).