● Available · Open to AI Engineer, Data Engineer & Data Analyst roles

Building the AI layer of modern software.

I'm Ankesh Babal. I ship LLM pipelines.

Location: Auckland, NZ
Degree: MSc Analytics, AUT
Focus: LLMs · RAG · Pipelines
TECH STACK I WORK WITH
Python · OpenAI · Gemini · Anthropic · HuggingFace · FastAPI · Streamlit · Pandas · NumPy · scikit-learn · Azure · GCP · AWS · Databricks · Snowflake · MySQL · PostgreSQL · Power BI
About

Practical AI, built for real operations.

I'm an AI & data-focused engineer with a Master's in Analytics from Auckland University of Technology. My work sits where machine learning meets real operations — building models that are useful, pipelines that don't break, and systems that actually make it to production.

During my recent internship at Dealtable.ai, I built an end-to-end LLM system supporting venture capital investment decisions — processing 10K+ documents, designing a scoring and evaluation framework, and delivering outputs through a production-ready API.

I enjoy the full AI stack: from raw data ingestion on Azure and GCP, through retrieval-augmented generation, to the last mile where a model output becomes a decision. I'm now looking for a role where I can grow as an AI engineer and ship production AI systems.

Production AI systems shipped: 4
Public GitHub repositories: 11
ML classification accuracy (SVM): 96%
BI reporting time reduced: 20%
Architecture

How I built the Dealtable LLM pipeline.

Real-time VC investment decisions, served through a production API.

End-to-end LLM decision system

  • DECISION API (output): FastAPI · Real-time · LIVE
  • LLM + SCORING (inference): Gemini · Eval framework
  • RAG + EMBEDDINGS (context): Vector search · Top-k retrieval
  • DATA INGESTION (input): 10K+ VC documents
API Latency: <1s
Retrieval: RAG
LLM Core: Gemini
Pipeline Stages: 5
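
The retrieval layer in this architecture relies on top-k vector search. A minimal sketch of that idea follows, assuming cosine similarity over tiny hand-written vectors. The document names, the 3-dimensional vectors and both function names are illustrative assumptions, not the Dealtable code; a real system would get embeddings from a model and store them in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy "embeddings"; real ones come from an embedding model.
DOCS = {
    "pitch_deck": [0.9, 0.1, 0.0],
    "term_sheet": [0.1, 0.9, 0.0],
    "market_report": [0.7, 0.2, 0.1],
}

if __name__ == "__main__":
    print(top_k([1.0, 0.0, 0.0], DOCS, k=2))
```

The retrieved ids would then be used to pull the matching text chunks into the LLM prompt as context.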

End-to-End Data Engineering Stack

How I ship data from raw sources to dashboards · cloud-native

  • Source: Data Sources (APIs · Files · DB)
  • Ingestion: Data Factory (ETL pipelines)
  • Transform: Databricks (Spark · PySpark)
  • Analytics: Warehouse (Snowflake · BigQuery)
  • BI: Power BI (DAX · Dashboards)
  • ML: ML Models (sklearn · XGBoost)
  • LLM: Gemini + RAG (API · Real-time)
  • Serve: FastAPI (Production API)
Technical Stack

The tools, every day.

Languages, frameworks and platforms I use to move ideas from prototype to production.

AI & Machine Learning

  • Python
  • Pandas
  • NumPy
  • LLMs
  • Gemini API
  • RAG
  • Generative AI
  • Model evaluation
  • scikit-learn

Data Engineering

  • SQL
  • Data pipelines
  • Feature engineering
  • ETL
  • Workflow automation
  • Streamlit
  • FastAPI

Analytics & BI

  • Power BI
  • DAX
  • DAX Studio
  • Excel
  • Tableau
  • Dashboards

Cloud Platforms

  • AWS (Cloud Practitioner)
  • GCP — BigQuery
  • GCP — Storage
  • Azure — Data Lake
  • Azure — Data Factory
  • Databricks

Databases

  • MySQL
  • PostgreSQL
  • Snowflake modelling
  • BigQuery
  • Data warehousing

Business & Ops

  • Data validation
  • Reconciliation
  • Exception handling
  • MILP optimisation
  • Stakeholder reporting
Experience

Where I've worked.

Jul 2025 — Oct 2025 · Internship

AI / LLM Engineer Intern

Dealtable.ai
  • Built an end-to-end LLM pipeline using the Gemini API and retrieval-augmented generation to support VC investment decisions.
  • Ingested, cleaned and structured 10K+ documents into a queryable knowledge base with vector embeddings.
  • Designed a multi-criteria scoring framework to evaluate model outputs, flag high-risk cases and reduce false positives.
  • Shipped a sub-second FastAPI decision endpoint consumed by the investment team in real time.
  • Engineered prompt templates and retrieval strategies that improved output consistency across document types.
Jun 2024 — Present

Operations Data & Admin Analyst

BP Connect Lincoln
  • Analysed daily transaction data across multiple revenue streams, ensuring accuracy before financial processing.
  • Built inventory tracking reports and computed procurement KPIs to optimise stock management decisions.
  • Performed data reconciliation and exception analysis — identified discrepancies and escalated for resolution.
  • Applied structured data validation processes across operations, strengthening reporting controls.
Dec 2022 — May 2023 · Internship

Data Analyst Intern

AtliQ Technologies
  • Designed & built Power BI dashboards to monitor trends, anomalies and performance risks.
  • Automated and debugged data-cleaning workflows, reducing manual effort.
  • Standardised BI reporting — cut processing time by 20%.
Selected Work

Things I've built.

A mix of AI systems, data pipelines, dashboards and ML projects. Source on GitHub.

01 · AI / LLM

LLM Pipeline for VC Investment Decisions

End-to-end LLM system at Dealtable.ai. Ingested and structured 10K+ VC documents, built a retrieval-augmented generation pipeline with the Gemini API, designed a multi-criteria scoring framework, and shipped a sub-second decision API via FastAPI serving real-time investment insights.
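
One common way to build a multi-criteria scoring framework is a weighted average with a risk threshold. The sketch below illustrates that pattern only. The criteria names, weights and threshold are hypothetical; the criteria actually used at Dealtable.ai are not described on this page.

```python
# Hypothetical criteria and weights, chosen only for illustration.
WEIGHTS = {"groundedness": 0.5, "completeness": 0.3, "consistency": 0.2}
RISK_THRESHOLD = 0.6  # assumed cut-off: outputs scoring below this get flagged


def score_output(criteria_scores, weights=WEIGHTS):
    """Weighted average of per-criterion scores, each expected in [0, 1]."""
    missing = set(weights) - set(criteria_scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * criteria_scores[name] for name in weights)
    return weighted / total_weight


def evaluate(criteria_scores, threshold=RISK_THRESHOLD):
    """Return the overall score and whether the output should be flagged."""
    overall = score_output(criteria_scores)
    return {"score": round(overall, 3), "high_risk": overall < threshold}


if __name__ == "__main__":
    print(evaluate({"groundedness": 0.9, "completeness": 0.8, "consistency": 0.7}))
    print(evaluate({"groundedness": 0.3, "completeness": 0.6, "consistency": 0.9}))
```

Keeping the weights and the threshold as plain data makes it easy to tune them against reviewed cases without touching the scoring logic.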

Python · Gemini API · RAG · FastAPI · Vector DB · Prompt engineering
02 · Data Engineering

Transit Analytics & AI-Ready Pipeline

Full transit analytics platform processing GTFS static & real-time trip updates. Built a 6-page Streamlit dashboard (headways, delays, reliability, heatmaps, route comparison, performance score). Designed cloud expansion on Azure Data Lake + Data Factory + Databricks.
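
Headway, the first dashboard metric listed, is the gap between consecutive arrivals at a stop. Here is a minimal sketch of that calculation, assuming arrival times in the GTFS `HH:MM:SS` format for a single stop and route. The function names and sample times are illustrative and not taken from the project.

```python
def to_seconds(hhmmss):
    """Parse a GTFS-style HH:MM:SS time.

    GTFS allows hours above 23 for trips that run past midnight,
    so the value is converted by arithmetic rather than a clock type.
    """
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def headways_minutes(arrival_times):
    """Gaps in minutes between consecutive arrivals at one stop."""
    ordered = sorted(to_seconds(t) for t in arrival_times)
    return [(later - earlier) / 60
            for earlier, later in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    arrivals = ["08:00:00", "08:12:00", "08:20:30", "08:45:00"]  # sample data
    gaps = headways_minutes(arrivals)
    print("headways (min):", gaps)
    print("mean headway (min):", sum(gaps) / len(gaps))
```

Comparing scheduled headways with the ones observed from real-time trip updates is one way to derive delay and reliability figures.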

Python · Streamlit · Azure · Databricks · GTFS
03 · Data Engineering

Uber Data Engineering Project

Cloud-native data pipeline on Uber trip data — automated ingestion via Mage, transformation layer in Python, and analytics served through BigQuery on GCP. Full medallion architecture (bronze → silver → gold).
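
The bronze → silver → gold layering can be illustrated in plain Python. This is a sketch under stated assumptions: the record fields (`trip_id`, `vendor`, `fare`) and both function names are hypothetical, and the actual project used Mage and BigQuery rather than in-memory lists.

```python
from collections import defaultdict

# Bronze: raw records as ingested, including a duplicate and a malformed row.
bronze = [
    {"trip_id": "t1", "vendor": "A", "fare": "12.50"},
    {"trip_id": "t2", "vendor": "B", "fare": "8.00"},
    {"trip_id": "t2", "vendor": "B", "fare": "8.00"},  # duplicate
    {"trip_id": "t3", "vendor": "A", "fare": "n/a"},   # unparseable fare
    {"trip_id": "t4", "vendor": "A", "fare": "7.50"},
]


def to_silver(rows):
    """Silver: typed fares, one row per trip_id, unparseable rows dropped."""
    seen, cleaned = set(), []
    for row in rows:
        try:
            fare = float(row["fare"])
        except ValueError:
            continue  # skip rows whose fare cannot be parsed
        if row["trip_id"] in seen:
            continue  # skip duplicates
        seen.add(row["trip_id"])
        cleaned.append({"trip_id": row["trip_id"],
                        "vendor": row["vendor"],
                        "fare": fare})
    return cleaned


def to_gold(rows):
    """Gold: analytics-ready aggregate of trips and revenue per vendor."""
    summary = defaultdict(lambda: {"trips": 0, "revenue": 0.0})
    for row in rows:
        summary[row["vendor"]]["trips"] += 1
        summary[row["vendor"]]["revenue"] += row["fare"]
    return dict(summary)


silver = to_silver(bronze)
gold = to_gold(silver)

if __name__ == "__main__":
    print(gold)
```

Each layer reads only from the one before it, so raw data stays untouched and any cleaning rule can be changed and replayed.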

Python · GCP · BigQuery · Mage
04 · Machine Learning

Obesity Disease Detection (ML)

Classification models predicting obesity risk — compared SVM, Random Forest and XGBoost. SVM achieved 96.44% accuracy. Built a GUI for real-time input and prediction, targeted at healthcare screening workflows.

Python · scikit-learn · XGBoost · Tkinter
05 · Analytics

Business Report 360 — Power BI

Interactive dashboard covering 5 business domains (Finance, Sales, Marketing, Supply Chain, Executive). MySQL source, snowflake modelling, DAX measures, DAX Studio performance tuning, and Power BI Service with auto-refresh & row-level security.

Power BI · DAX · MySQL · Snowflake
More work · +6

6 more projects on GitHub

Expense Tracker (FastAPI + Streamlit), SQL Music Store analysis, AtliQ Hospitality dashboards, MILP route optimisation, and more. All source code public.

GitHub: @Ankesh-babal
Credentials

Education & certifications.

Education

Master's in Analytics

Auckland University of Technology (AUT)
Jul 2024 — Nov 2025 · Auckland, New Zealand

Bachelor of Engineering — Information Technology

Pune University
Graduated 2023 · Maharashtra, India

Certifications

  • AWS Certified Cloud Practitioner
  • Microsoft Power BI Data Analyst Associate
  • SQL for Data Science
  • Python — Beginner to Advanced
What I bring

Why teams want me.

End-to-end delivery
Prototype → Production

I don't just build models; I build systems that ship. From raw data ingestion to a production API, I own the full pipeline and make sure it actually works in the real world.

📊 Battle-tested with real data
Not just Kaggle projects

I've built production LLM pipelines over large-scale document sets, designed scoring frameworks for investment decisions, and shipped dashboards across finance, sales and supply chain. I work with real business data, not toy datasets.

☁️ Multi-cloud fluency
AWS · Azure · GCP

AWS certified. Hands-on with Azure Data Factory, GCP BigQuery and Databricks. I don't just know cloud theory: I've built and deployed pipelines on all three major platforms.
● Currently learning

Production AI systems at scale

Deep-diving into LangGraph, vLLM inference optimization, evaluation frameworks, and making LLM outputs reliable enough for real-money decisions.

Get in touch

Let's build something useful.

Open to AI Engineer, Data Engineer and Data Analyst roles — full-time, contract or freelance.
