cv
Aspiring AI/ML Researcher with extensive experience in developing and deploying scalable AI-driven solutions.
General Information
Full Name | Shreyas S |
Languages | English, Hindi, Kannada, Telugu, Malayalam |
Education
-
2027 Amaravati, AP
M.Tech in Computer Science and Engineering
Vellore Institute of Technology
Experience
-
SEP'2024 - DEC'2024 Student Researcher
BMI department of Emory University
- Developed dynamic input size calculation based on sampling rate and EEG segment duration, enhancing data flexibility.
- Automated HDF5 dataset loading for seamless integration across multiple EEG configurations.
- Integrated vector quantization training with optimized codebook and distributed data gathering for synchronized multi-node performance.
- Engineered a DeepLIFT pipeline with a custom model wrapper to compute and visualize feature attributions for MDD EEG data.
- Designed visualization tools (heatmaps, box plots, and time-resolved plots) to analyze channel-wise EEG feature importance.
-
MAY'2024 - SEP'2024 Google Summer of Code (GSoC) Contributor
BMI department of Emory University
- Preprocessed and cleaned 5 large EEG datasets from various sources
- Customized and adapted multiple codebases to integrate with our preprocessed EEG data
- Accelerated distributed processing and model training across multiple nodes using PyTorch and the Hugging Face Trainer API, streamlining model implementation and reducing training time by 20%
- Developed and trained a flexible EEG foundation model, then fine-tuned it to reach 70% accuracy on downstream tasks
- Revised multiple generative pretrained transformer (GPT) architectures so that any of them can be selected based on need, reducing computational load by 10%
- Used PyTorch Captum's DeepLIFT and gradient-based attribution methods to interpret model behavior, and added model profiling to the codebase to monitor resource utilization
-
SEP'2023 - APR'2024 AI/ML Researcher
DigitalFortress Private Limited
- Directed the development of a natural language query agent that retrieves data from a MongoDB collection using Natural Language Processing and Retrieval-Augmented Generation (RAG), improving user interaction and data retrieval
- Designed and developed data preparation and testing pipelines for a novel anti-spoofing detection model, combining 3 datasets totaling over 55GB and applying various filters to improve data quality
- Deployed the trained model on a cloud machine and exposed it through an open Flask API, resulting in a 38% reduction in processing time for data analysis tasks
- Built a data download pipeline in Python that retrieved and processed over 3,000 videos, reducing data acquisition time by 50%
-
MAR'2023 - Present Student Researcher
Center of Excellence in AI and Robotics (AIR), VIT Amaravati
- Developed advanced text-based inpainting pipelines, enabling seamless context-aware image modifications using Stable Diffusion.
- Engineered multi-face clustering solutions with DBSCAN, optimizing face grouping and identity preservation in complex image datasets.
- Fine-tuned Stable Diffusion models, enhancing image generation quality, control, and consistency for diverse tasks.
- Optimized inference pipelines, implementing efficient GPU memory management and accelerated batch processing.
- Integrated custom embeddings and LoRA fine-tuning, allowing for style adaptation and domain-specific image synthesis.
- Implemented a novel multi-res OCR method to localize text from complex vector diagrams.
Projects
-
ComicStrips (Text-to-Comic Strip Generation)
- Developed a Python-based Reddit crawler and scraped over 1,000 images, then fine-tuned the flux-dev model using Low-Rank Adaptation (LoRA) for text-to-comic strip generation
- Built an automatic data annotation pipeline that captions the scraped images with a large vision model during data preparation
- Deployed the model on Hugging Face, achieving 20,000+ downloads, highlighting the project's reach and user interest
-
Adhoc (Automated Documentation Tool)
- Created a command-line tool to automate codebase documentation with local large language models, generating professional outputs in LaTeX, Markdown, and Word formats
- Released as a pip-installable Python package and widely adopted by peers, simplifying the generation of detailed, professional documentation for codebase changes with minimal effort
Technical Skills
Languages | Python, R, Java, SQL (MySQL, PostgreSQL), NoSQL (MongoDB, Redis, Cassandra, Neo4j) |
Technologies | PyTorch, scikit-learn, NumPy, Pandas, OpenCV |
Specializations | Deep Learning, Data Pipelines, LLMs, Data Analysis, Statistical Modeling and Deployment. |
Achievements
-
2025 - Academic Rank 8th in the Department of Computer Science and Engineering after 5 semesters.
-
2024 - First of only two students from VIT Amaravati selected for Google Summer of Code (GSoC)
-
2023 - Led a team of 5 international teammates to achieve a Top 50 finish in HackSquad 2023 (Top 60 winners out of 2500+ teams).