Portfolio — Data Engineering and Analytics

Darshan
Senthil

I build data systems that people trust. Over 3 years, across four organizations, I have turned fragile manual workflows into pipelines that run quietly, accurately, and at scale — the way data infrastructure should.

140+
Hours Saved Monthly
100K+
Records Processed
40%
Self-Serve Adoption Lift
3.96
M.S. GPA, Rutgers
Scroll to explore
01 — My Story

The path that brought
me here

I did not start out wanting to be a data engineer. I started out wanting to understand why things break. At HireStar, I watched recruiters spend hours copying candidate information from resumes into spreadsheets. Nobody questioned it. It was just how things were done. I built a pipeline that did it automatically, and when it worked, something clicked for me.

At Vue.ai I learned what data means at enterprise scale. Fifty clients depending on the same metrics layer. One wrong definition and three teams are making decisions on bad numbers. I became obsessive about data quality, not because someone told me to, but because I had seen what happens when you are not.

Rutgers took me deeper into AI and healthcare data, where accuracy is not a preference, it is a requirement. And at WPTI, I built the infrastructure that finance and operations teams rely on every day. That is the work I find most meaningful. Not building impressive things for their own sake, but building things that other people actually depend on.

That is why BlackRock matters to me. The WTS platform reaches millions of everyday investors. The fund data this team maintains shapes real financial decisions for real people. I want to be part of a team where getting it right is non-negotiable.

The best data infrastructure is the kind nobody notices because it never fails.
Darshan Senthil — Personal Philosophy
2020
ML and Analytics Engineer
HireStar.io — Hyderabad
Built my first production pipeline. Learned that automation is only valuable when people trust it.
2021 — 2022
Data Analyst
Vue.ai — Chennai
Enterprise scale. 50+ clients. Built the metrics layer that product teams relied on daily.
2022 — 2024
M.S. Computer Science
Rutgers University — GPA 3.96
LLMs, NLP, healthcare data. Learned to work at the edge of what is possible.
2024 — Now
Data Engineer
WPTI — New York
Full ownership. Pipelines, governance, AI automation. Building infrastructure people trust.

02 — Why BlackRock

This is not a backup plan.
This is the target.

BlackRock manages over $10 trillion in assets. The WTS platform this team supports reaches millions of retail investors every day through iShares ETFs, mutual funds, and public-facing fund data. That is not just infrastructure work. That is infrastructure that matters.

01
The Mission Resonates
BlackRock's stated purpose is helping more people experience financial wellbeing. The WTS team makes that possible by ensuring the data millions of investors see is accurate, timely, and reliable. That is a purpose I want to contribute to directly.
02
The WTS Team Specifically
This team is described internally as a strategic asset for transforming the wealth management industry. Not a support function. Not back office. A team central to BlackRock's future. That is where I want to grow, on something that the organization genuinely depends on.
03
Scale I Have Not Worked At Yet
After the acquisitions of GIP, HPS, and Preqin, BlackRock is building something larger and more complex than it has ever been. 2026 is described as their first full year as a unified platform. Joining now means being part of building the infrastructure for that next chapter.

03 — Impact in Numbers

Every number here is
something real.

120+
Hours saved per month at WPTI through pipeline automation
40%
Increase in self-serve data adoption after dbt implementation
30%
Reduction in compliance reporting discrepancies
15min
Data issue detection time, down from several hours
60%
Faster ETL processing at Rutgers School of Public Health
100K+
Records classified by LLM pipeline with 45% less manual effort
22%
Campaign ROI increase across 15+ enterprise clients at Vue.ai
50+
Enterprise clients served through SQL metrics layer
04 — How I Think

My approach to every
data problem

Step 01
Understand before building
I talk to the people who live with the problem before I touch a line of code. At WPTI, spending two weeks with the finance team before building anything saved months of rework later.
Step 02
Make it auditable
Every pipeline I build has documentation, monitoring, and a clear data lineage. If something breaks at 2am, whoever is on call should be able to understand exactly what happened and why.
Step 03
Build for the person after you
The test of good infrastructure is not whether it works today. It is whether the next person can maintain it without calling you. I write code and documentation like I am about to leave tomorrow.
Step 04
Accuracy is non-negotiable
Especially here. When fund data reaches millions of investors making real financial decisions, getting it wrong is not a bug. It is a trust problem. I take data quality personally.
05 — Selected Projects

Work that speaks
for itself

01
Python OpenAI Redshift Guardrails
WPTI — 2024
Agentic Slack Bot for Natural Language Data Access
The Problem
Non-technical staff were sending data requests to analysts constantly, creating a bottleneck and consuming 20+ hours of analyst time every month.
What I Built
An agentic Slack bot using OpenAI and Guardrails that translates plain English questions into live Redshift queries, with safety constraints to prevent misuse.
The Outcome
Staff get answers in seconds. Analysts got their time back. The bot handles questions that used to require a meeting to answer.
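To give a sense of how the safety layer works, here is a minimal sketch of the read-only gate that sits between the model and the warehouse. The table names and the `llm_to_sql` / `run_query` hooks are illustrative stand-ins, not the production code.

```python
import re

# Illustrative guardrail: allow only single, read-only SELECT statements
# against an approved set of tables before anything reaches Redshift.
ALLOWED_TABLES = {"finance.invoices", "ops.shipments"}  # hypothetical names
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|grant|truncate)\b", re.I
)

def is_safe_query(sql: str) -> bool:
    """Reject anything that is not one read-only SELECT on allowed tables."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:                      # no statement stacking
        return False
    if not stripped.lower().startswith("select"):
        return False
    if FORBIDDEN.search(stripped):
        return False
    hits = re.findall(r"\bfrom\s+([\w.]+)|\bjoin\s+([\w.]+)", stripped, re.I)
    referenced = {t.lower() for pair in hits for t in pair if t}
    return referenced.issubset(ALLOWED_TABLES)

def answer(question: str, llm_to_sql, run_query) -> str:
    """Translate plain English to SQL, gate it, then execute."""
    sql = llm_to_sql(question)   # e.g. an OpenAI call behind Guardrails
    if not is_safe_query(sql):
        return "Sorry, I can only run read-only queries on approved tables."
    return run_query(sql)        # e.g. a read-only Redshift cursor
```

The interesting production work lives in the prompt design and the Guardrails schemas; the point of the gate is simpler: generated SQL never runs unvetted.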
02
Airflow dbt Docker EC2
Personal Project — 2024
NYC Taxi Batch ELT Pipeline
The Problem
Public NYC taxi data is real, messy, and updated monthly at scale. I wanted to prove I could handle it end to end with production-grade batch pipeline architecture.
What I Built
End-to-end ELT pipeline: Python ingestion, Airflow orchestration, dbt transformations across raw, staging, and mart layers, containerized with Docker and deployed on EC2.
The Outcome
Processes millions of monthly records with automated quality checks at every stage. A complete, documented system built to the same standard I apply at work.
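The quality gates between the raw and staging layers can be sketched like this. The record shape and field names are simplified assumptions for illustration, not the actual TLC schema.

```python
import datetime as dt

# Illustrative quality gates for one monthly taxi batch, the kind of check
# that runs between raw ingestion and the staging layer.

def monthly_key(month: dt.date) -> str:
    """Partition key for one monthly file, e.g. 'yellow_tripdata_2024-03'."""
    return f"yellow_tripdata_{month:%Y-%m}"

def passes_quality_gates(trip: dict) -> bool:
    """A trip survives staging only if its basic invariants hold."""
    try:
        return (
            trip["pickup"] < trip["dropoff"]   # time moves forward
            and trip["fare"] >= 0              # no negative fares
            and trip["distance"] >= 0
        )
    except (KeyError, TypeError):
        return False                           # malformed rows are rejected

def split_batch(trips):
    """Route each record to the staging layer or a quarantine table."""
    good, quarantined = [], []
    for t in trips:
        (good if passes_quality_gates(t) else quarantined).append(t)
    return good, quarantined
```

Quarantining instead of silently dropping is the design choice that matters: bad records stay visible, so a spike in the quarantine table becomes an alert, not a mystery.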
03
OpenAI LangChain RAG FastAPI
Personal Project — 2024
OncoGuide — Domain Constrained Health AI
The Problem
Generic LLMs hallucinate in clinical contexts. Healthcare AI needs to stay within validated boundaries or it becomes dangerous.
What I Built
A RAG-based LLM assistant with Guardrails, FastAPI backend, and curated reference data that grounds every response in validated clinical content.
The Outcome
A production-ready system that demonstrates how to make AI reliable in high-stakes domains. The same principles apply anywhere data accuracy is non-negotiable.
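The core constraint, answer only from validated content, reduces to a retrieval threshold. This sketch uses naive word overlap in place of embedding similarity, and the corpus entries are invented examples, not real clinical guidance.

```python
# Minimal sketch of the "stay inside validated content" rule: the assistant
# answers only when retrieval finds a sufficiently relevant curated passage,
# otherwise it declines. Corpus entries below are invented placeholders.
CURATED = [
    "Chemotherapy schedules must be confirmed with the treating oncologist.",
    "Grade 3 neutropenia generally warrants a dose delay per protocol.",
]

def relevance(query: str, passage: str) -> float:
    """Stand-in for embedding similarity: fraction of query words present."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def retrieve(query: str, threshold: float = 0.3):
    scored = sorted(((relevance(query, p), p) for p in CURATED), reverse=True)
    best_score, best = scored[0]
    return best if best_score >= threshold else None

def answer(query: str) -> str:
    context = retrieve(query)
    if context is None:
        return "I can only answer questions covered by validated clinical content."
    # A real system would pass `context` to the LLM as grounding material.
    return f"Per validated guidance: {context}"
```

Refusing below the threshold is the whole point: in a clinical domain, "I cannot answer that" is a correct output, and a confident hallucination is not.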
06 — Original Data Analysis

I analyzed BlackRock's
own iShares data

Using publicly available iShares ETF data, I took the five largest iShares ETFs by AUM and compared their expense ratios against 5-year performance. The insight: the lowest-cost funds do not always outperform on raw returns, but they consistently outperform on risk-adjusted returns over a 5-year window. This is the kind of data story the WTS team helps make possible for everyday investors.

Source: BlackRock iShares public fund data. Analysis by Darshan Senthil.
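Risk-adjusted here means Sharpe-style: excess return over a risk-free rate, divided by return volatility. A minimal sketch with invented monthly return series (not actual iShares figures) shows why a steadier low-cost series can win even when raw averages are similar.

```python
import statistics

def sharpe(monthly_returns, risk_free_annual=0.02):
    """Annualized Sharpe ratio from a series of monthly returns."""
    rf_monthly = risk_free_annual / 12
    excess = [r - rf_monthly for r in monthly_returns]
    mean = statistics.mean(excess)
    vol = statistics.stdev(excess)
    return (mean / vol) * (12 ** 0.5)   # annualize by sqrt(12)

# Invented series for illustration only: similar average return,
# very different volatility.
low_cost  = [0.010, 0.012, -0.004, 0.009, 0.011, 0.008]    # steadier
high_cost = [0.030, -0.020, 0.025, -0.015, 0.028, -0.010]  # bigger swings
```

With these inputs the steadier series scores a far higher Sharpe ratio, which is the mechanism behind the finding above: low fees plus low drag compound into better risk-adjusted outcomes.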

$550B
IVV
S&P 500
$420B
AGG
Bonds
$340B
IEFA
Intl Dev
$105B
IWF
Growth
$78B
IEMG
EM
Key Finding
0.03%
IVV expense ratio — one of the lowest in the market. Yet it consistently delivers top-quartile risk-adjusted returns over 5 years. Low cost compounds.
Data Quality Observation
Daily
iShares fund data updates daily across hundreds of fields. Even a one-day lag in NAV data can cause investor confusion. This is the precision the WTS team maintains.
What This Means
Trust
Millions of investors compare these funds daily. The accuracy of this data is not a technical detail. It is the foundation of every financial decision they make.
07 — My 90 Day Vision

What I would do
in my first 90 days

Day 1 — 30
Learn the system before touching it
  • Shadow every team that interacts with WTS fund data to understand the full picture
  • Map every manual touchpoint in the current iShares ETF data workflow from source to website
  • Read every piece of documentation that exists and identify where the gaps are
  • Understand the data feeds from internal and external providers before proposing anything
Day 31 — 60
Add value without creating risk
  • Write detailed documentation for any process that currently exists only in someone's head
  • Identify the top three data quality risks in the current production system and propose fixes
  • Start handling end-user support requests independently to build product knowledge fast
  • Partner with the engineering team on one concrete process improvement initiative
Day 61 — 90
Start building for the long term
  • Propose a scalable data feed solution for one new product area based on what I have learned
  • Begin cross-team training so other WTS members can navigate the fund data systems independently
  • Deliver a written assessment of where automation could replace manual data population tasks
  • Have a clear point of view on where I can contribute most in the next 6 months
08 — Skills

What I bring to
the table

Data Infrastructure
Python SQL PySpark dbt Apache Airflow AWS Glue Kafka FastAPI Docker
Data Platforms
Amazon Redshift Snowflake BigQuery PostgreSQL AWS S3 EC2 MongoDB
AI and Automation
OpenAI API LangChain RAG Pipelines Guardrails Mistral MLflow scikit-learn
Analytics and Communication
Tableau Power BI Looker A/B Testing Data Modeling Requirements Docs Stakeholder Mgmt
09 — Let's Talk

I would love to show you what I can build for the WTS team.

I am open to relocating to Wilmington and ready to contribute from day one. If you are looking for someone who takes data accuracy as seriously as you do, I think we have a lot to talk about.

Get in Touch View Full Portfolio