Sai Hemanth Kilaru.
Data Scientist & AI Engineer
I build intelligent
systems.
Bridging
data and decisions.
Beyond the Code.
I am a Data Scientist and AI Engineer driven by the challenge of transforming complex, unstructured data into actionable intelligence. My background spans predictive analytics, generative AI, and distributed systems.
Whether it's fine-tuning leading LLMs for medical precision, building autonomous agents, or architecting robust data pipelines, I thrive at the intersection of rigorous research and practical application.
Currently pursuing my studies at the University of Arizona, I am constantly exploring the bleeding edge of machine learning to build faster, smarter, and more reliable AI solutions.
AI & ML
Building intelligent systems with LLMs and robust algorithms.
Data Engineering
Constructing scalable pipelines and structured architectures.
Full-Stack Dev
Crafting seamless user experiences from backend to frontend.
Optimization
Making models faster, cheaper, and more accurate.
Work Experience
Data Science Intern
ORBO (in association with Teachnook) · Remote · Mar 2023 – Apr 2023 · 2 mos
Built an image analysis tool with OpenCV and Python that improved data accuracy by 40% and cut analysis time by 50%. Restructuring workflows and team collaboration reduced debugging time by 30%. This internship gave me hands-on experience with machine learning models, scikit-learn, and application development, with a focus on delivering fast, accurate results.
Responsibilities
- ▸Created an image analysis tool using OpenCV and Python.
- ▸Restructured workflows and optimized team collaboration.
- ▸Gained real-world expertise in scikit-learn and application development.
Impact
- ✓Improved data accuracy by 40%.
- ✓Cut analysis time by 50%.
- ✓Reduced debugging time by 30%.
Machine Learning Intern
AICTE · India · Sep 2023 – Nov 2023
Worked on end-to-end machine learning pipeline development for supervised classification problems, focusing on model optimization, reproducibility, and workflow efficiency.
Responsibilities
- ▸Engineered complete ML pipelines including preprocessing, feature engineering, model training, validation, and evaluation.
- ▸Tuned hyperparameters of scikit-learn classification models, improving validation performance by 10–15%.
- ▸Standardized experimentation workflows and documentation, reducing model iteration cycles by 25%.
- ▸Implemented structured model evaluation using cross-validation and performance metrics benchmarking.
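The tuning and cross-validation steps above can be sketched in a few lines. This is an illustrative, pure-Python stand-in rather than the internship's actual code: the dataset, the 1-D k-nearest-neighbors classifier, and the names `kfold_indices`, `knn_predict`, and `cv_accuracy` are all assumptions made for the example. With scikit-learn, the same idea is usually expressed through `GridSearchCV`.

```python
from collections import Counter


def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs for contiguous folds."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def knn_predict(train_x, train_y, x, n_neighbors):
    """Majority vote among the n_neighbors closest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: (abs(train_x[i] - x), i))[:n_neighbors]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def cv_accuracy(xs, ys, n_neighbors, n_folds=3):
    """Mean validation accuracy across folds for one hyperparameter value."""
    scores = []
    for train, val in kfold_indices(len(xs), n_folds):
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        hits = sum(knn_predict(tx, ty, xs[i], n_neighbors) == ys[i] for i in val)
        scores.append(hits / len(val))
    return sum(scores) / len(scores)


# Hypothetical 1-D dataset, interleaved so every fold contains both classes.
xs = [1.0, 9.0, 2.0, 8.0, 1.5, 8.5, 2.5, 7.5, 3.0, 7.0, 0.5, 9.5]
ys = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Grid search: score each candidate value, then keep the best one.
grid = [1, 3, 5]
results = {k: cv_accuracy(xs, ys, k) for k in grid}
best_k = max(grid, key=lambda k: results[k])
```

Each candidate is scored only on folds it was not fitted on, which is what makes the comparison between hyperparameter values fair.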
Impact
- ✓Improved reproducibility of ML experiments.
- ✓Reduced development time through structured workflow design.
- ✓Delivered optimized classification models ready for deployment.
Data Analytics Intern
AICTE · India · May 2023 – Jul 2023
Designed and implemented cloud-based data analytics solutions using AWS infrastructure and analytics services, with a focus on secure, scalable architectures.
Responsibilities
- ▸Built and managed cloud-based analytics pipelines using Amazon EC2, S3, RDS, IAM, and CloudFront.
- ▸Architected secure virtual environments using Amazon VPC and security groups to ensure data isolation and access control.
- ▸Implemented data ingestion, transformation, and querying workflows using AWS-native services.
- ▸Evaluated AWS pricing models and optimized cost-performance trade-offs for analytics workloads.
- ▸Applied data lake concepts including collection, storage, and processing patterns for large-scale structured datasets.
Impact
- ✓Reduced manual data handling effort by 30% through automation.
- ✓Improved access reliability and infrastructure security.
- ✓Gained hands-on experience with production-like cloud analytics architecture.
Coding Domain Member
SRM ASV · Chennai, Tamil Nadu, India · On-site · Apr 2022 – Sep 2022 · 6 mos
At SRM ASV, I joined the coding domain, where I learned the fundamentals of Machine Learning and Ubuntu through hands-on work. I completed the training period and developed core technical skills, including problem-solving and coding. I was also assigned to a club project, which could not be completed because the club became inactive. Even so, the experience gave me a solid foundation in tech and teamwork.
Responsibilities
- ▸Gained hands-on experience in Machine Learning and Ubuntu.
- ▸Successfully completed the technical training period.
- ▸Developed key technical skills including problem-solving and coding.
- ▸Contributed to a club-level team project.
Impact
- ✓Built a solid foundation in technology.
- ✓Enhanced teamwork and collaborative problem-solving skills.
Selected Work
Autonomous LLM-Powered Data Insights Engine
Jun 2025 – Sep 2025
Built a fully autonomous AI system that ingests arbitrary datasets and generates deterministic, validated analytical insights and publication-ready visualizations. Reduced analysis time from hours to minutes, decreased hallucinated outputs by 40%, and increased dataset robustness by 60%.
Bio-Inspired Routing in Social Wasps
Oct 2025 – Present
Simulated decentralized feeding behavior in Ropalidia marginata and benchmarked heuristic efficiency against TSP solutions. Demonstrated that greedy local heuristics achieve near-optimal efficiency, with implications for swarm robotics.
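As a rough illustration of this kind of benchmark (not the project's simulation), the sketch below compares a greedy nearest-neighbor route against the brute-force optimal TSP tour on a handful of points. The coordinates and the function names `tour_length`, `greedy_tour`, and `optimal_tour` are hypothetical.

```python
import itertools
import math


def tour_length(points, order):
    """Total length of a closed tour visiting points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def greedy_tour(points, start=0):
    """Nearest-neighbor heuristic: always move to the closest unvisited point."""
    unvisited = set(range(len(points))) - {start}
    order = [start]
    while unvisited:
        last = points[order[-1]]
        nearest = min(unvisited, key=lambda j: (math.dist(last, points[j]), j))
        order.append(nearest)
        unvisited.remove(nearest)
    return order


def optimal_tour(points, start=0):
    """Exact TSP by brute force; only feasible for a handful of points."""
    others = [i for i in range(len(points)) if i != start]
    best = min(
        itertools.permutations(others),
        key=lambda perm: tour_length(points, [start, *perm]),
    )
    return [start, *best]


# Hypothetical food-source coordinates, with the nest at index 0.
sites = [(0, 0), (2, 1), (5, 0), (6, 4), (1, 5), (3, 3)]
greedy = tour_length(sites, greedy_tour(sites))
optimal = tour_length(sites, optimal_tour(sites))
efficiency = optimal / greedy  # 1.0 means the heuristic matched the optimum
```

The ratio `efficiency` is one simple way to express how close a local rule gets to the global optimum. Brute force scales factorially, so larger instances would need a proper TSP solver as the baseline.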
Consumer Electronics Recommendation Analytics
Feb 2025 – May 2025
Analyzed 1,000+ Best Buy products to identify key drivers of recommendation percentage. Used OLS, Ridge, and Lasso regression along with HuggingFace BART zero-shot sentiment analysis. Detected seasonal peaks in April and November.
Crowdfunding Campaign Success Prediction
Feb 2025 – May 2025
Trained ensemble models (Random Forest, Gradient Boosting) on 15K+ real GoFundMe campaigns using TF-IDF vectorization. Achieved 99% accuracy and 0.999 ROC-AUC. Identified gratitude and personal tone as top success drivers.
Mistral-7B LoRA Medical Fine-Tuning
Aug 2024 – Nov 2024
Fine-tuned Mistral-7B on 256K medical samples using LoRA/PEFT on a Colab T4 GPU. Achieved a 20%+ improvement in response relevance over the base zero-shot model, evaluated via ROUGE-L and BERTScore.
Early Sepsis Prediction (IEEE Published)
Aug 2023 – Dec 2023
Trained Random Forest and Gradient Boosting models on 40K+ patient records with imputation and normalization. Achieved 99% accuracy, reduced false negatives, and deployed the results as a Streamlit and PowerBI dashboard. Published in IEEE Xplore.
AI-Powered Self-Proctoring Platform
Jan 2024 – May 2024
Built a real-time proctoring system using YOLOv5 for phone detection and OpenCV for gaze and face orientation tracking. It generates a live productivity score to enforce academic integrity without human oversight.
Healthcare Demand Forecasting
Oct 2024 – Dec 2024
Processed 50K simulated patient records using Random Forest and multinomial regression for revenue segmentation. Key insight: the 65+ age group contributes 29% of total billing. Deployed with interactive Quarto dashboards.
Netflix Global Content Strategy Dashboard
Built a Tableau dashboard revealing Netflix's global content strategy: 68% Movies vs 32% TV Shows, exponential growth from 2014–2019, and TV-MA as the dominant rating. Provides actionable insights for content investment decisions.
Early Detection of Neurological Disorders (CNN)
May 2023 – Jul 2023
Classified Alzheimer's, Parkinson's, and Brain Tumors from MRI scans using CNN and transfer learning. Applied histogram equalization and image preprocessing pipelines to enhance feature extraction from medical imagery.
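For readers unfamiliar with histogram equalization, here is a minimal pure-Python sketch of the idea on a tiny grayscale image. It is an illustration only, not the project's preprocessing pipeline. The function name `equalize_histogram` and the sample pixel values are assumptions, and in practice a library routine such as OpenCV's `cv2.equalizeHist` would normally be used.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function of the intensity histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Flat image: every pixel has the same value, nothing to stretch.
        return [row[:] for row in image]

    # Lookup table that spreads the occupied intensities over the full range.
    lut = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


# Hypothetical low-contrast patch: all values sit between 100 and 104.
low_contrast = [
    [100, 101, 102],
    [101, 102, 103],
    [102, 103, 104],
]
stretched = equalize_histogram(low_contrast)
```

After equalization the darkest pixel maps to 0 and the brightest to 255, so subtle intensity differences in the original patch become much easier for a downstream model to pick up.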
YouTube Performance Analytics Dashboard
Apr 2022 – Jun 2022
Built an automated ETL pipeline for YouTube channel analytics featuring difference-from-median benchmarking and 30-day cumulative growth tracking. Enables content creators to make data-driven production decisions.
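The two metrics named above are simple to compute. The sketch below shows one plausible reading of them, using made-up view counts. The function names `diff_from_median` and `cumulative_growth` are hypothetical and not taken from the project.

```python
from statistics import median


def diff_from_median(values):
    """Each value minus the median of the whole series."""
    m = median(values)
    return [v - m for v in values]


def cumulative_growth(daily_views, window=30):
    """Running total of views over the first `window` days after upload."""
    total = 0
    running = []
    for v in daily_views[:window]:
        total += v
        running.append(total)
    return running


# Hypothetical 30-day view totals for five videos (median is 1200).
video_views = [1200, 900, 4300, 1500, 700]
benchmark = diff_from_median(video_views)  # [0, -300, 3100, 300, -500]

# Hypothetical daily views for one video's first four days.
daily = [100, 80, 60, 40]
growth = cumulative_growth(daily)  # [100, 180, 240, 280]
```

Benchmarking against the median rather than the mean keeps a single viral video from distorting what counts as a typical performance.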
Airbnb Market Analysis Dashboard
Interactive Tableau dashboard analyzing Airbnb listing data to uncover pricing trends, neighborhood demand, and seasonal occupancy patterns. Helps hosts and investors optimize pricing strategies for maximum revenue.
Web Scraping Pipeline (Amazon & Stanford)
Engineered web scrapers for Amazon product listings and Stanford faculty pages using Python and BeautifulSoup. Built clean, parsed data pipelines with structured output for downstream analysis and research.
Taxi Management System
Full-stack ride management platform with driver/passenger authentication, ride booking & scheduling, in-app ratings, messaging, and a loyalty reward program. Includes advanced routing for efficient dispatch.
Technical Toolkit.
A comprehensive overview of the languages, frameworks, and tools I use to build scalable, intelligent architectures.
Programming Languages
Machine Learning & AI
Data Science & Analytics
LLM & Generative AI
Data Engineering
Cloud & DevOps
Data Visualization & BI
Developer Tools
Research & Methods
Let's Build the Future.
Open for Data Science, AI/ML roles, and cutting-edge research collaborations. Reach out to discuss how we can turn complex problems into intelligent solutions.
© 2026 Sai Hemanth Kilaru. Built with Next.js & Framer Motion.