Available for new projects

James Brooks

Analytics Engineer

I model data, build semantic layers, and close the gap between raw sources and reliable decisions — across automotive, media, and healthcare, with a decade of domain expertise and AI/ML in the mix.

10+ yr experience
4 industries
46% best A/B lift

What I work with

Modeling
SQL dbt DAX Python
Warehouses
Amazon Redshift Google BigQuery Teradata Oracle MongoDB Hive
BI Delivery
Power BI Tableau Tableau Prep Sigma Qlik Sense
Analytics & AI
A/B Testing Amplitude Adobe Analytics Machine Learning NLP GenAI

Selected projects

EasyVisa — Visa Certification Predictor

ML

Ensemble classifier trained on 775K+ OFLC employer applications to predict US work visa certification outcomes. Tuned Random Forest achieved F1=0.82 with 87% recall — wage-to-prevailing-rate ratio surfaced as the dominant approval signal.
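
The two reported metrics pin down a third. A minimal sketch, assuming the F1 and recall above refer to the same class and decision threshold, that solves the F1 definition for precision. The function name is illustrative, and because the inputs are rounded the result is approximate.

```python
def precision_from_f1_and_recall(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for precision P."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("No valid precision exists for this F1/recall pair")
    return f1 * recall / denominator


# Values as reported in the project summary (rounded).
implied_precision = precision_from_f1_and_recall(f1=0.82, recall=0.87)
print(f"Implied precision: {implied_precision:.3f}")  # Implied precision: 0.775
```

Read this way, roughly three in four predicted certifications would be correct, with recall favored over precision.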

Python scikit-learn Random Forest XGBoost GBM

SuperKart — Retail Sales Forecasting

ML

Predicted quarterly revenue for a multi-city supermarket chain using tree-based ensemble regressors. Tuned XGBoost delivered R²=0.90 and MAPE=6.45%, enabling reliable inventory planning and data-driven procurement decisions at scale.
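
For readers less familiar with the two metrics quoted above, here is a small dependency-free sketch of how R² and MAPE are computed. The revenue figures are hypothetical placeholders, not SuperKart data, and the helper names are illustrative.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical quarterly revenue values, for illustration only.
actual = [100.0, 200.0, 400.0, 500.0]
predicted = [110.0, 190.0, 380.0, 520.0]

print(f"MAPE = {mape(actual, predicted):.2f}%")    # MAPE = 6.00%
print(f"R^2  = {r_squared(actual, predicted):.2f}")  # R^2  = 0.99
```

MAPE reports the typical miss as a percentage of actual revenue, which makes it easier to relate to inventory planning than squared-error metrics.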

Python XGBoost scikit-learn Pandas Seaborn

AllLife Bank — Loan Propensity Model

ML

Classified which bank customers are likely to purchase personal loans, optimizing for recall to catch every potential buyer. Pre-pruned decision tree hit AUC=0.97 with perfect recall — income and monthly credit spend were the strongest predictors.

Python scikit-learn Decision Tree Pandas

Clinical Decision Support — Medical RAG

NLP

RAG pipeline over the Merck Medical Manuals to assist clinicians with diagnostics, drug lookups, treatment protocols, and critical care queries. LangChain + ChromaDB retrieval grounds a local Llama LLM in vetted medical sources, reducing hallucination risk in high-stakes scenarios.

Python LangChain ChromaDB Llama HuggingFace

HelmNet — Safety Helmet Detector

Deep Learning

Transfer learning pipeline (VGG-16 + FFNN) to detect helmet compliance from construction site camera images. Trained on 631 images across varied lighting and angles with data augmentation — achieved 100% test accuracy and a perfect confusion matrix.

Python TensorFlow VGG-16 Keras OpenCV

FoodHub — Order Data Analysis

Analytics

Exploratory analysis of a food aggregator's order dataset to surface demand patterns by cuisine type, delivery timing, and customer ratings. Identified a 5.87-minute weekday delivery lag and proposed targeted promotions to close the gap and drive retention.

Python Pandas Matplotlib Seaborn

Career

Apr 2024 — now

Senior Business Intelligence Engineer

· Ford Motor Company — San Diego, CA

Owning the full analytics lifecycle — data modeling, semantic layer design, and stakeholder enablement across multiple business functions. Led Qlik Sense → Power BI migration for Ford Pro Charging, redesigning the underlying data model to give ops teams real-time diagnostic metrics. Designed an AI-powered CX model with NLP-based survey classification to surface recurring issues from unstructured data. Currently building a predictive sales model layering customer propensity scores into a self-serve reporting layer for Solution Managers.

Power BI DAX NLP Python SQL

May 2022 — Mar 2024

Senior Data Analyst / Tableau Developer

· Warner Bros. Discovery (MotorTrend Group) — San Diego, CA

Established the analytics foundation for MotorTrend's new FAST business vertical from zero — defined KPIs, modeled and ingested data into Amazon Redshift, and built the reporting layer on top. Led the org's first A/B testing program using Amplitude, producing a 46% increase in ad impressions served. Cut reporting turnaround by 30% by automating daily delivery workflows and eliminating manual extraction.
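
A brief sketch of how a figure like the 46% above is typically derived, assuming it is a relative lift of the variant over the control at comparable exposure. The impression counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def relative_lift(control: float, variant: float) -> float:
    """Relative change of the variant metric over the control metric."""
    if control <= 0:
        raise ValueError("Control metric must be positive")
    return (variant - control) / control


# Hypothetical impressions served per arm (not actual test data).
control_impressions = 10_000
variant_impressions = 14_600

lift = relative_lift(control_impressions, variant_impressions)
print(f"Relative lift: {lift:.0%}")  # Relative lift: 46%
```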

Tableau Amplitude Amazon Redshift A/B Testing SQL

Dec 2019 — May 2022

Senior Data Analyst

· Kaiser Permanente (KP OnCall) — San Diego, CA

Modeled and consolidated complex data streams from Genesys, Clarity, and KPATHS into unified reporting views — eliminating data silos and building a single source of truth for healthcare contact center operations. Partnered directly with business stakeholders to translate operational requirements into analytically rigorous, production-grade reporting.

Tableau SQL Python Genesys

Apr 2015 — Mar 2019

Business Intelligence Analyst

· General Motors — Austin, TX

Subject matter expert for CX data — engineered the data models and reporting platform underpinning customer contact operations across 10+ countries. Built composite datasets supporting a data monetization initiative and wrote custom SQL to surface mobile app usage patterns, directly informing product roadmap decisions.

Tableau Cognos SQL Teradata

Education

Post Graduate Program — AI/ML: Business Applications

· University of Texas at Austin

Coursework covered Python, machine learning, NLP, and neural networks. I also hold a BS in Management Information Systems from Florida State University, a GCP Cloud Digital Leader certification, and a Tableau Desktop 10 Qualified Associate certification.

Python Machine Learning NLP GCP dbt

Let's work together

Looking for analytics engineering roles — data modeling, dbt/SQL transformation layers, semantic layers, and BI delivery. Open to a conversation if you're building something that needs clean, reliable data.