cv

Aspiring AI/ML Researcher with extensive experience in developing and deploying scalable AI-driven solutions.

General Information

Full Name Shreyas S
Languages English, Hindi, Kannada, Telugu, Malayalam

Education

  • 2027

    Amaravati, AP

    M.Tech in Computer Science and Engineering
    Vellore Institute of Technology

Experience

  • SEP'2024 - DEC'2024
    Student Researcher
    BMI department of Emory University
    • Developed dynamic input size calculation based on sampling rate and EEG segment duration, enhancing data flexibility.
    • Automated HDF5 dataset loading for seamless integration across multiple EEG configurations.
    • Integrated vector quantization training with optimized codebook and distributed data gathering for synchronized multi-node performance.
    • Engineered a DeepLIFT pipeline with a custom model wrapper to compute and visualize feature attributions for MDD EEG data.
    • Designed visualization tools (heatmaps, box plots, and time-resolved plots) to analyze channel-wise EEG feature importance.
  • MAY'2024 - SEP'2024
    Google Summer of Code (GSoC) Contributor
    BMI department of Emory University
    • Preprocessed and cleaned 5 large EEG datasets from various sources
    • Customized and adapted multiple codebases to integrate with our preprocessed EEG data
    • Accelerated distributed processing and model training across multiple nodes using PyTorch and the Hugging Face Trainer API, streamlining model implementation and reducing training time by 20%
    • Developed and trained a flexible EEG foundation model, then fine-tuned it to achieve 70% accuracy on downstream tasks
    • Revised multiple generative pretrained transformer (GPT) architectures so that any of them can be selected as needed, reducing computation load by 10%
    • Used PyTorch Captum's DeepLIFT and gradient descent algorithms to interpret model behavior, and implemented model profiling for the codebase to monitor and ensure proper resource utilization
  • SEP'2023 - APR'2024
    AI/ML Researcher
    DigitalFortress Private Limited
    • Directed a natural language query agent that efficiently retrieves data from a MongoDB collection using Natural Language Processing and Retrieval-Augmented Generation (RAG), resulting in improved user interaction and data retrieval
    • Developed and designed pipelines for data preparation and testing for a novel anti-spoof detection model by combining 3 datasets totaling over 55GB, utilizing various filters to enhance data quality
    • Deployed the trained model on a cloud machine and exposed open API access through a Flask API, resulting in a 38% reduction in processing time for data analysis tasks
    • Pioneered a data downloading channel in Python that efficiently retrieved and processed over 3,000 videos, reducing data acquisition time by 50%
  • MAR'2023 - Present
    Student Researcher
    Center of Excellence in AI and Robotics (AIR), VIT Amaravati
    • Developed advanced text-based inpainting pipelines, enabling seamless context-aware image modifications using Stable Diffusion.
    • Engineered multi-face clustering solutions with DBSCAN, optimizing face grouping and identity preservation in complex image datasets.
    • Fine-tuned Stable Diffusion models, enhancing image generation quality, control, and consistency for diverse tasks.
    • Optimized inference pipelines, implementing efficient GPU memory management and accelerated batch processing.
    • Integrated custom embeddings and LoRA fine-tuning, allowing for style adaptation and domain-specific image synthesis.
    • Implemented a novel multi-res OCR method to localize text from complex vector diagrams.

Projects

  • ComicStrips (Text-to-Comic Strip Generation)
    • Developed a Python-based Reddit crawler and scraped over 1,000 images, then fine-tuned the flux-dev model using Low-Rank Adaptation (LoRA) for text-to-comic strip generation
    • Prepared an automatic data annotation pipeline that captions scraped images using a large vision model during data preparation
    • Deployed the model on Hugging Face, achieving 20,000+ downloads, highlighting the project's reach and user interest
  • Adhoc (Automated Documentation Tool)
    • Created a command-line tool to automate codebase documentation with local large language models, generating professional outputs in LaTeX, Markdown, and Word formats
    • Released as a Python pip package and widely adopted by peers, simplifying the generation of detailed, professional documentation for codebase changes with minimal effort

Technical Skills

Languages Python, R, Java, SQL (MySQL, PostgreSQL), NoSQL (MongoDB, Redis, Cassandra, Neo4j)
Technologies PyTorch, scikit-learn, NumPy, Pandas, OpenCV
Specializations Deep Learning, Data Pipelines, LLMs, Data Analysis, Statistical Modeling and Deployment.

Achievements

  • 2025
    • Academic Rank 8th in the Department of Computer Science and Engineering after 5 semesters.
  • 2024
    • First student, and one of only two, selected for Google Summer of Code (GSoC) from VIT Amaravati
  • 2023
    • Led a team of 5 international teammates to achieve a Top 50 finish in HackSquad 2023 (Top 60 winners out of 2500+ teams).