AI, ML & Data Science

~/portfolio $ Hi, I'm R V Kiran Krishna Doddi

AI/ML Engineer & Data Scientist

Applying machine learning and spectral informatics at the J. Hachmann Research Group (SUNY Buffalo), while building latency-optimized AI systems for real-world impact. Ready to engineer production-grade solutions.

4 Production Systems
6+ ML Models Deployed
13M+ Records Processed
0.94 Best ARI Score

01. About Me

I specialize in designing and deploying end-to-end Machine Learning systems, scalable data pipelines, and agentic AI architectures.

Having completed my Master of Science in Data Science from the University at Buffalo (SUNY), I focus on bridging the gap between advanced models and production systems. My practical engineering background allows me to construct workflows that are not just mathematically sound, but also performant, reliable, and secure.

Currently, I work as a Research Assistant at the J. Hachmann Research Group, applying machine learning to complex chemical datasets. Alongside my research, I focus on open-source contributions and building high-performance AI systems optimized to slash latency and solve real-world problems.

Education

Jan 2025 — May 2026

M.S. in Data Science

University at Buffalo, SUNY, New York

June 2020 — April 2024

B.Tech. in Computer Science & Electronic Engineering

KIIT University, India

02. Professional Experience

February 2025 — Present

Research Assistant — Applied ML & Spectral Informatics

University at Buffalo | J. Hachmann Research Group

  • Multimodal Classification: Benchmarked six clustering and ensemble algorithms (including K-Means, HDBSCAN, Random Forest, and XGBoost) across 152+ samples and 7,000+ features, achieving an Adjusted Rand Index (ARI) of up to 0.94, validated through bootstrap resampling.
  • Latent Structure Accuracy: Cross-checked quantitative findings with PCA, LDA, UMAP, t-SNE, and Isomap projections, verifying the recovered structure against ground-truth material signatures across FTIR, ATR-FTIR, and NIR modalities.
  • Pre-processing Pipelines: Optimized model generalization and reduced misclassification rates by constructing robust spectral preprocessing chains integrating SNV normalization, Savitzky-Golay filtering, and derivative transformations.
  • Cross-Functional Analytics: Translated complex high-dimensional ML outputs into publication-ready visualizations, accelerating strategic research decisions for three cross-functional scientific and stakeholder teams.
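
For readers unfamiliar with the metric above, here is a minimal, dependency-free sketch of how an Adjusted Rand Index and a bootstrap interval around it can be computed. It is an illustration only, not the research group's code: the function names, the percentile-interval choice, and the default of 1,000 resamples are assumptions.

```python
import random
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI from the contingency table of two labelings (1.0 = identical partitions)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Count pairs of samples that share a cell, a true cluster, or a predicted cluster.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case: one big cluster or all singletons
    return (sum_cells - expected) / (max_index - expected)


def bootstrap_ari(labels_true, labels_pred, n_resamples=1000, seed=0):
    """Percentile 95% interval for ARI, resampling samples with replacement.

    Assumption: a simple percentile bootstrap; other schemes are possible.
    """
    rng = random.Random(seed)
    n = len(labels_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            adjusted_rand_index(
                [labels_true[i] for i in idx], [labels_pred[i] for i in idx]
            )
        )
    scores.sort()
    low = scores[int(0.025 * (n_resamples - 1))]
    high = scores[int(0.975 * (n_resamples - 1))]
    return low, high
```

ARI ignores the names of the cluster labels, so a clustering that swaps label 0 and label 1 still scores 1.0 against the ground truth.
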
November 2022 — April 2023

Machine Learning Intern — Autonomous Steering Angle Prediction

IBM Skills Build & Coincent Academic Internship (Remote)

  • Neural Steering Predictions: Designed, trained, and deployed a 9-layer TensorFlow Convolutional Neural Network (CNN) with 250K+ parameters on 45,500+ driving frames, achieving real-time steering angle predictions validated with MAE and MSE metrics.
  • Robustness to Distribution Shifts: Engineered a complex data augmentation pipeline simulating view translations, lane-shift recovery patterns, and rotational corrections, improving CNN stability under varied weather and road conditions.
  • Performance Assessment: Authored extensive validation assessments explaining network behavior and activation mapping across 80/20 train-validation splits, aligning technical and non-technical stakeholders on deployment readiness.
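
As a rough illustration of the view-translation idea in the augmentation bullet above, the sketch below shifts a frame sideways and nudges the steering label to match. Everything specific here is an assumption made for the example: the `shift_frame` name, the `angle_per_px` correction factor, the sign convention (content shifted right means steer right), and the use of nested lists instead of TensorFlow tensors.

```python
def shift_frame(frame, shift_px, angle, angle_per_px=0.004, fill=0):
    """Translate a frame horizontally and adjust its steering label.

    frame: list of rows, each a list of pixel values.
    shift_px: positive moves image content to the right, negative to the left.
    angle_per_px: hypothetical steering correction per shifted pixel.
    """
    width = len(frame[0])
    if abs(shift_px) >= width:
        raise ValueError("shift must be smaller than the frame width")
    shifted = []
    for row in frame:
        if shift_px >= 0:
            # Pad on the left, drop the pixels pushed off the right edge.
            new_row = [fill] * shift_px + row[: width - shift_px]
        else:
            # Drop pixels on the left, pad on the right.
            new_row = row[-shift_px:] + [fill] * (-shift_px)
        shifted.append(new_row)
    # Assumed convention: the label moves linearly with the shift.
    return shifted, angle + shift_px * angle_per_px
```

Applying random shifts like this during training exposes the network to off-center views it would rarely see in recorded driving, which is one common way to teach recovery behavior.
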

03. Selected Projects

Open Source Contribution

WikiMCP

Git-backed multi-user wiki MCP server — long-term memory for any AI.

  • Ranked Context Retrieval: Reduced AI wiki lookup latency from ~90s to ~29s (~68% improvement) by replacing repeated full-page reads with relevance-ranked snippets.
  • BM25-Style Scoring Layer: Built a dependency-free BM25-style scoring layer in Python with title and heading boosts to optimize search relevance.
  • MCP Tool Integration: Added search_pages() and retrieve_context() MCP tools to help AI clients find high-signal content with fewer tool calls and less context overhead.
  • Open Source Contribution: Opened PR #23, since merged into mohith-das/wikimcp, contributing directly to the MCP ecosystem.
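
To make the scoring idea above concrete, here is a minimal sketch of BM25-style ranking with title and heading boosts, using only the standard library. It is not the code merged into the repository: the `rank_pages` name, the page dictionary layout (`title`, `headings`, `body`), and the boost and `k1`/`b` values are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_pages(query, pages, k1=1.5, b=0.75, title_boost=2.0, heading_boost=1.5):
    """Return (score, title) pairs sorted from most to least relevant.

    pages: list of dicts with 'title' (str), 'headings' (list of str), 'body' (str).
    """
    if not pages:
        return []
    # Index title, headings, and body together so title-only matches still score.
    docs = [tokenize(" ".join([p["title"], *p["headings"], p["body"]])) for p in pages]
    avg_len = sum(len(d) for d in docs) / len(docs) or 1.0
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    query_terms = set(tokenize(query))
    ranked = []
    for page, doc in zip(pages, docs):
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (len(docs) - df + 0.5) / (df + 0.5))
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        # Multiplicative boosts when a query term appears in the title or a heading.
        if query_terms & set(tokenize(page["title"])):
            score *= title_boost
        if query_terms & set(tokenize(" ".join(page["headings"]))):
            score *= heading_boost
        ranked.append((score, page["title"]))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked
```

Returning only the top-ranked pages (or snippets from them) is what lets a client avoid reading every page in full, which is the source of the latency saving described above.
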
Python, BM25 Algorithm, Model Context Protocol (MCP), RAG, Wikipedia API
Risk Modeling & Analytics

TransGuard AI

Financial Risk Modeling and Analytics Platform

  • Scalable ELT Pipeline: Processed 13,314,061 financial transaction records across 2,000 users, building scalable pipelines and REST APIs on Databricks/PySpark for real-time scoring.
  • Behavioral Feature Engineering: Crafted 20+ behavioral risk features across 7 fraud categories using dbt, Random Forest, and ensemble methods, evaluating outputs with precision/recall analysis.
  • Anomaly Detection Dashboard: Flagged 211,393 anomalous records (1.59% anomaly rate), delivering interactive KPI dashboards built with FastAPI, Streamlit, and Plotly.
Python, PySpark, Databricks, PostgreSQL, dbt, FastAPI, Streamlit, Plotly
MLOps & Drift Diagnostics

DriftGuard

NASA CMAPSS Turbofan Engine Drift Monitoring and Root-Cause Diagnosis Platform

  • Regime-Aware Feature Monitoring: Monitored feature drift (PSI, Kolmogorov-Smirnov, Wasserstein distance) and operating regime drift (Jensen-Shannon divergence) across 21 sensors using the NASA CMAPSS FD002 dataset.
  • Automated Root-Cause Analysis: Designed a pattern-based diagnosis model to isolate feature and prediction drift, distinguishing operating-condition shifts from physical engine degradation.
  • Action Engine & Persistence: Built an automated decision engine (retrain vs. investigate vs. ignore) connected to a SQLite database for historical logging and alerting.
  • Simulated Fault Injection: Created a validation harness to inject synthetic fault patterns, visualized on an interactive Streamlit performance monitoring dashboard.
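
As a small illustration of one of the drift metrics above, the sketch below computes a Population Stability Index (PSI) for a single sensor and maps it to one of the three actions. It is a simplified stand-in, not the DriftGuard implementation: the function names, the equal-width binning, and the 0.1 / 0.25 thresholds (a common rule of thumb) are assumptions.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference window and a current window of one feature.

    Bins are equal-width over the reference range; out-of-range values in
    `current` are clamped into the first or last bin.
    """
    low, high = min(reference), max(reference)
    width = (high - low) / n_bins or 1.0  # avoid zero width for constant features

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - low) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def recommend_action(psi, investigate_at=0.1, retrain_at=0.25):
    """Hypothetical three-way policy keyed on PSI alone."""
    if psi >= retrain_at:
        return "retrain"
    if psi >= investigate_at:
        return "investigate"
    return "ignore"
```

A real decision engine would likely combine several signals (for example, regime drift versus sensor drift) before choosing an action; keying on PSI alone just keeps the example short.
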
Python, Streamlit, SQLite, Scikit-learn, NumPy, Pandas, MLOps, Statistical Tests

04. Technical Skills

A comprehensive breakdown of my engineering capabilities and tools.

Languages

Python, SQL, C++, Java, R

LLM & Agentic AI

LangGraph, LangChain, Groq API, OpenAI API, ChromaDB, MCP Protocol, RAG, RAGAS, NLP

ML & Data Science

PyTorch, TensorFlow, Scikit-learn, NumPy, Feature Engineering, Statistical Analysis, Clustering, Anomaly Detection

DBs, MLOps & Platforms

dbt, PySpark, Databricks, PostgreSQL, MongoDB, Firebase, FastAPI, Nginx, Docker, AWS, DigitalOcean, Git, Streamlit

05. Get In Touch

Interested in collaborating or hiring? Shoot me a message!

Connect with me

I am open to roles involving AI Engineering, Machine Learning Engineering, and Data Science. Feel free to contact me directly!
