cv

Aspiring AI/ML Researcher with extensive experience in developing and deploying scalable AI-driven solutions.

General Information

Full Name Shreyas S
Languages English, Hindi, Kannada, Telugu, Malayalam

Education

  • 2027

    Amaravati, AP

    M.Tech in Computer Science and Engineering
    Vellore Institute of Technology

Experience

  • SEP'2024 - DEC'2024
    Student Researcher
    BMI department of Emory University
    • Developed dynamic input size calculation based on sampling rate and EEG segment duration, enhancing data flexibility.
    • Automated HDF5 dataset loading for seamless integration across multiple EEG configurations.
    • Integrated vector quantization training with optimized codebook and distributed data gathering for synchronized multi-node performance.
    • Engineered a DeepLIFT pipeline with a custom model wrapper to compute and visualize feature attributions for MDD EEG data.
    • Designed visualization tools (heatmaps, box plots, and time-resolved plots) to analyze channel-wise EEG feature importance.
  • MAY'2024 - SEP'2024
    Google Summer of Code (GSoC) Contributor
    BMI department of Emory University
    • Preprocessed and cleaned 5 large EEG datasets from various sources
    • Customized and adapted multiple codebases to integrate with our preprocessed EEG data
    • Accelerated distributed processing and model training across multiple nodes using PyTorch and the HuggingFace Trainer API, streamlining model implementation and reducing training time by 20%
    • Developed and trained a flexible EEG foundation model, then fine-tuned it to reach 70% accuracy on downstream tasks
    • Revised multiple generative pretrained transformer (GPT) architectures so that any of them can be selected as needed, reducing computational load by 10%
    • Utilized PyTorch Captum's DeepLift and gradient-based attribution algorithms to interpret model predictions, and implemented model profiling for the codebase to monitor and ensure proper resource utilization
  • SEP'2023 - APR'2024
    AI/ML Researcher
    DigitalFortress Private Limited
    • Directed development of a natural language query agent that retrieves data from a MongoDB collection using Natural Language Processing and Retrieval Augmented Generation (RAG), improving user interaction and data retrieval
    • Designed and developed data preparation and testing pipelines for a novel anti-spoof detection model, combining 3 datasets totaling over 55 GB and applying various filters to enhance data quality
    • Deployed the trained model on a cloud machine and exposed it through an open Flask API, resulting in a 38% reduction in processing time for data analysis tasks
    • Pioneered a data downloading channel in Python that efficiently retrieved and processed 3,000+ videos, reducing data acquisition time by 50%
  • MAR'2023 (Active)
    Student Researcher
    Center of Excellence in AI and Robotics (AIR), VIT Amaravati
    • Developed advanced text-based inpainting pipelines, enabling seamless context-aware image modifications using Stable Diffusion.
    • Engineered multi-face clustering solutions with DBSCAN, optimizing face grouping and identity preservation in complex image datasets.
    • Fine-tuned Stable Diffusion models, enhancing image generation quality, control, and consistency for diverse tasks.
    • Optimized inference pipelines, implementing efficient GPU memory management and accelerated batch processing.
    • Integrated custom embeddings and LoRA fine-tuning, allowing for style adaptation and domain-specific image synthesis.
    • Implemented a novel multi-resolution OCR method to localize text in complex vector diagrams.
  • NOV'2024 (Active)
    ML Infra/Backend
    Stealth Startups

Projects

  • ComicStrips (Text-to-Comic Strip Generation)
    • Developed a Python-based Reddit crawler and scraped over 1,000 images, then fine-tuned the flux-dev model using Low-Rank Adaptation (LoRA) for text-to-comic strip generation
    • Built an automatic data annotation pipeline that captions the scraped images with a large vision model during data preparation
    • Deployed the model on Hugging Face, achieving 20,000+ downloads, highlighting the project's reach and user interest
  • Adhoc (Automated Documentation Tool)
    • Created a command-line tool to automate codebase documentation with local large language models, generating professional outputs in LaTeX, Markdown, and Word formats
    • Released as a Python pip package, widely adopted by peers, that simplifies generating detailed, professional documentation for codebase changes with minimal effort
  • AnimeStudio (Anime Edits)
    • Fine-tuned a Stable Diffusion model for real-time Ghibli-style image transformations (1.3k+ conversions at sub-100ms median latency), and deployed a secure proxy with Upstash Redis for IP-based rate limiting, scaling to 11 GB+ of bandwidth globally with minimal error rates.
    • Deployed a production-grade FastAPI backend with background jobs and automated cleanup, handling 10k+ daily requests from 40+ countries (~140 unique users/day in the first week)

Technical Skills

Languages Python, R, Java, SQL (MySQL, PostgreSQL), NoSQL (MongoDB, Redis, Cassandra, Neo4j)
Technologies PyTorch, scikit-learn, NumPy, Pandas, OpenCV
Specializations Deep Learning, Data Pipelines, LLMs, Data Analysis, Statistical Modeling, Deployment

Achievements

  • 2025
    • Ranked 7th academically in the Department of Computer Science and Engineering after 5 semesters.
  • 2024
    • First student, and one of only two, selected for Google Summer of Code (GSoC) from VIT Amaravati
  • 2023
    • Led a team of 5 international teammates to a Top 50 finish in HackSquad 2023 (Top 60 winners out of 2,500+ teams).