OM LAKHIA

DATA SCIENTIST

M.S. Statistical Data Science

SPECIALIZATION

Python • SQL • Machine Learning

ABOUT

DATA SCIENTIST & ANALYST

I TURN DATA INTO STORIES THAT DRIVE DECISIONS. MY PASSION LIES IN UNCOVERING HIDDEN PATTERNS WITHIN COMPLEX DATASETS AND TRANSFORMING THEM INTO CLEAR, ACTIONABLE INSIGHTS THAT BUSINESS LEADERS CAN ACT ON WITH CONFIDENCE.

FROM ENGINEERING PYTHON/SQL PIPELINES FOR 500+ BLOOMBERG DATASETS TO BUILDING PREDICTIVE MODELS THAT ACHIEVE 98% ACCURACY, I BRIDGE THE GAP BETWEEN RAW DATA AND STRATEGIC ACTION. MY WORK HAS IMPROVED DATA RELIABILITY BY 35%, INCREASED CUSTOMER RETENTION BY 12%, AND REDUCED REPORTING CYCLES BY 35% ACROSS MULTIPLE ORGANIZATIONS.

MY APPROACH

I BELIEVE EVERY DATASET TELLS A STORY. WHETHER ANALYZING 15,000+ USER SESSIONS TO UNCOVER CHURN PATTERNS OR CONSOLIDATING 200+ FRAGMENTED DATASETS INTO UNIFIED PIPELINES, I FOCUS ON EXTRACTING MEANINGFUL NARRATIVES THAT RESONATE WITH STAKEHOLDERS.

I USE PYTHON, SQL, TABLEAU, AND POWER BI TO CREATE DASHBOARDS THAT DON'T JUST DISPLAY DATA; THEY TELL COMPELLING STORIES THAT INFLUENCE BUDGET ALLOCATION, MARKETING STRATEGY, AND OPERATIONAL DECISIONS.

98% FORECAST ACCURACY
500+ DATASETS ENGINEERED
35% EFFICIENCY IMPROVEMENT

EXPERIENCE

BLOOMBERG RESEARCH LAB ASSISTANT

San Francisco State University
San Francisco, CA
Aug 2025 – Present • Academic
Engineered Python/SQL pipelines in Jupyter to clean 500+ Bloomberg datasets, improving financial data reliability by 35%
Generated demand forecasts using regression and time-series models, achieving 98% accuracy for retention and sales planning
Designed Tableau and Seaborn dashboards to surface key financial and operational metrics for decision-makers
Applied causal inference to evaluate policy experiments; findings guided compliance and investment decisions
Authored GitHub-based documentation, ensuring reproducibility and facilitating research collaboration

DATA ANALYTICS INTERN

UL Solutions
Fremont, CA
Jun 2025 – Aug 2025 • Industry
Consolidated 200+ fragmented datasets into Snowflake using Databricks pipelines, creating reliable inputs for regulatory reporting
Implemented predictive models (random forests, regression) to assess anomalies, improving KPI reliability by 82%
Developed Tableau and Plotly dashboards highlighting irregularities, enabling proactive action by business leaders
Automated recurring financial/operational reports in Python and Excel VBA, reducing delivery cycles by 35%
Partnered with BI and Finance teams to validate assumptions, document models, and prepare executive-ready deliverables

DATA SCIENCE INTERN

Kintu Designs IT
Surat, India
Dec 2023 – May 2024 • Industry
Analyzed 15,000+ user sessions with clustering/regression in Jupyter to uncover churn patterns and demand trends
Defined KPIs and built SQL-driven Power BI dashboards, improving visibility into marketing and profitability metrics
Supported campaign analysis by embedding predictive insights, increasing customer retention by 12%
Documented reproducible pipelines in notebooks, ensuring transparency for analytics adoption across teams
Presented results to cross-functional stakeholders, influencing budget allocation and marketing strategy

DATA ANALYST INTERN

Brainy Beams
Ahmedabad, India
May 2023 – Jun 2023 • Industry
Validated 5,000+ records with SQL/Python reconciliation, improving dataset accuracy by 60%
Built Excel/Power BI dashboards highlighting churn and LTV metrics for senior management
Converted technical results into concise KPIs, enabling non-technical teams to adopt data-driven practices
Prepared financial/operational reports that strengthened confidence in forecasts and strategy decisions
Collaborated with managers to refine metrics, ensuring dashboards aligned with business priorities

SKILLS

PROGRAMMING & DATABASES

Python (Pandas, NumPy, Scikit-learn)
SQL
R
PostgreSQL
MySQL

ANALYTICS & ML

Data Modeling
Regression & Classification
Forecasting
Hypothesis Testing

VISUALIZATION

Tableau
Power BI
Excel (Pivot Tables, Charts)

DATA MANAGEMENT

ETL Pipelines
Data Preprocessing
Data Cleaning & Validation
Documentation

COLLABORATION

Stakeholder Engagement
Teamwork
Requirement Gathering
Presentation of Insights

Featured Projects

Explore my portfolio of data science projects with advanced analytics, interactive visualizations, and comprehensive performance metrics.

Forecasting & Analytics
Forecasting Sales and Demand

Built econometric and ML forecasting models to predict demand patterns, supporting marketing and finance planning. Communicated results with Tableau dashboards, enabling data-driven supply chain and sales decisions.

Python • R • SQL • Time Series • Econometrics • Tableau • Forecasting

Customer Analytics
Churn and Retention Analysis

Modeled churn-prone segments with decision trees and random forests, improving customer retention insights. Defined retention KPIs and delivered Power BI dashboards for leadership decision-making.

Python • SQL • Jupyter • Decision Trees • Random Forest • Power BI • Clustering

Business Intelligence
Revenue and KPI Dashboard

Automated SQL + Excel pipelines into Tableau dashboards tracking revenue, profitability, and operational KPIs. Enabled stakeholders to detect anomalies and validate KPI integrity, increasing confidence in forecasts.

Tableau • Excel • SQL • Python • ETL • Data Visualization • KPI Tracking

Machine Learning
Stock Price Prediction using SARIMAX

Built a time series forecasting model using SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous variables) to predict stock price trends.

Python • SARIMAX • Pandas • NumPy • Matplotlib • Time Series Analysis

Healthcare Analytics
Heart Disease Prediction

Trained a machine learning classification model on clinical features to predict cardiovascular disease risk, supporting early intervention and preventive healthcare strategies.

Python • Random Forest • Scikit-learn • Pandas • ROC/AUC Analysis

Health Research
Seasonal Health Patterns Analysis

Analyzed seasonal variation in aging-associated health measures, examining Alzheimer's risk factors and mental health patterns across time periods.

Python • Statistical Analysis • Pandas • Seaborn • Time Series • Healthcare Data

Explore More Projects

Visit my GitHub profile to see additional projects, contributions, and ongoing research in data science and machine learning.


Education & Certifications

Education

In Progress
3.7 GPA

Master of Science in Statistical Data Science

San Francisco State University

San Francisco, CA
Aug 2024 – Present
Relevant Coursework
Advanced Statistical Methods • Machine Learning & Predictive Analytics • Big Data Analytics • Statistical Computing with R/Python • Time Series Analysis • Bayesian Statistics • Data Mining & Pattern Recognition • Experimental Design
Completed
3.96 GPA

Bachelor of Technology in Computer Science and Engineering

CHARUSAT University

India
Aug 2020 – May 2024
Relevant Coursework
Artificial Intelligence & Machine Learning • Data Structures & Algorithms • Database Management Systems • Computer Vision & Image Processing • Natural Language Processing • Software Engineering • Operating Systems • Computer Networks • Web Development • Object-Oriented Programming

Certifications

Recent • Apr 2025

Data Engineering on AWS – Foundations

AWS

Recent • Dec 2024

Storytelling with Data

DataCamp

Continuous Learning: I'm committed to staying current with the latest developments in data science and analytics. Currently pursuing additional certifications in cloud computing and advanced machine learning techniques.

Get In Touch

I'm always interested in discussing new opportunities, collaborations, or just connecting with fellow data enthusiasts. Feel free to reach out!

Contact Information

Location

San Francisco, CA

Ready to Collaborate?

Whether you're looking for a data scientist to join your team, need consultation on a project, or want to discuss the latest trends in machine learning, I'd love to hear from you.