Portfolio — Data Engineering and Analytics

Darshan
Senthil

I build data systems that people trust. Over 3 years, across four organizations, I have turned fragile manual workflows into pipelines that run quietly, accurately, and at scale — the way data infrastructure should.

140+
Hours Saved Monthly
100K+
Records Processed
40%
Self-Serve Adoption Lift
3.96
M.S. GPA, Rutgers
Scroll to explore
01 — My Story

The path that brought
me here

I did not start out wanting to be a data engineer. I started out wanting to understand why things break. At HireStar, I watched recruiters spend hours copying candidate information from resumes into spreadsheets. Nobody questioned it. It was just how things were done. I built a pipeline that did it automatically, and when it worked, something clicked for me.

At Vue.ai I learned what data means at enterprise scale. Fifty clients depending on the same metrics layer. One wrong definition and three teams are making decisions on bad numbers. I became obsessive about data quality, not because someone told me to, but because I had seen what happens when you are not.

Rutgers took me deeper into AI and healthcare data, where accuracy is not a preference, it is a requirement. And at WPTI, I built the infrastructure that finance and operations teams rely on every day. That is the work I find most meaningful. Not building impressive things for their own sake, but building things that other people actually depend on.

That is why BlackRock matters to me. The WTS platform reaches millions of everyday investors. The fund data this team maintains shapes real financial decisions for real people. I want to be part of a team where getting it right is non-negotiable.

The best data infrastructure is the kind nobody notices because it never fails.
Darshan Senthil — Personal Philosophy
2020
ML and Analytics Engineer
HireStar.io — Hyderabad
Built my first production pipeline. Learned that automation is only valuable when people trust it.
2021 — 2022
Data Analyst
Vue.ai — Chennai
Enterprise scale. 50+ clients. Built the metrics layer that product teams relied on daily.
2022 — 2024
M.S. Computer Science
Rutgers University — GPA 3.96
LLMs, NLP, healthcare data. Learned to work at the edge of what is possible.
2024 — Now
Data Engineer
WPTI — New York
Full ownership. Pipelines, governance, AI automation. Building infrastructure people trust.

02 — Why BlackRock

This is not a backup plan.
This is the target.

BlackRock manages over $10 trillion in assets. The WTS platform this team supports reaches millions of retail investors every day through iShares ETFs, mutual funds, and public-facing fund data. That is not just infrastructure work. That is infrastructure that matters.

01
The Mission Resonates
BlackRock's stated purpose is helping more people experience financial wellbeing. The WTS team makes that possible by ensuring the data millions of investors see is accurate, timely, and reliable. That is a purpose I want to contribute to directly.
02
The WTS Team Specifically
This team is described internally as a strategic asset for transforming the wealth management industry. Not a support function. Not back office. A team central to BlackRock's future. That is where I want to grow, on something that the organization genuinely depends on.
03
Scale I Have Not Worked At Yet
After the acquisitions of GIP, HPS, and Preqin, BlackRock is building something larger and more complex than it has ever been. 2026 is described as their first full year as a unified platform. Joining now means being part of building the infrastructure for that next chapter.

03 — Impact in Numbers

Every number here is
something real.

120+
Hours saved per month at WPTI through pipeline automation
40%
Increase in self-serve data adoption after dbt implementation
30%
Reduction in compliance reporting discrepancies
15min
Data issue detection time, down from several hours
60%
Faster ETL processing at Rutgers School of Public Health
100K+
Records classified by LLM pipeline with 45% less manual effort
22%
Campaign ROI increase across 15+ enterprise clients at Vue.ai
50+
Enterprise clients served through SQL metrics layer
04 — How I Think

My approach to every
data problem

Step 01
Understand before building
I talk to the people who live with the problem before I touch a line of code. At WPTI, spending two weeks with the finance team before building anything saved months of rework later.
Step 02
Make it auditable
Every pipeline I build has documentation, monitoring, and a clear data lineage. If something breaks at 2am, whoever is on call should be able to understand exactly what happened and why.
Step 03
Build for the person after you
The test of good infrastructure is not whether it works today. It is whether the next person can maintain it without calling you. I write code and documentation like I am about to leave tomorrow.
Step 04
Accuracy is non-negotiable
Especially here. When fund data reaches millions of investors making real financial decisions, getting it wrong is not a bug. It is a trust problem. I take data quality personally.
05 — Selected Projects

Work that speaks
for itself

01
Python OpenAI Redshift Guardrails
WPTI — 2024
Agentic Slack Bot for Natural Language Data Access
The Problem
Non-technical staff were sending data requests to analysts constantly, creating a bottleneck and consuming 20+ hours of analyst time every month.
What I Built
An agentic Slack bot using OpenAI and Guardrails that translates plain English questions into live Redshift queries, with safety constraints to prevent misuse.
The Outcome
Staff get answers in seconds. Analysts got their time back. The bot handles questions that used to require a meeting to answer.
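To give a sense of how the safety layer works, here is a minimal sketch of the read-only gate that sits between the model and the warehouse. The table names and the `llm_to_sql` / `run_query` hooks are illustrative stand-ins, not the production code.

```python
import re

# Illustrative guardrail: allow only single, read-only SELECT statements
# against an approved set of tables before anything reaches Redshift.
ALLOWED_TABLES = {"finance.invoices", "ops.shipments"}  # hypothetical names
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|grant|truncate)\b", re.I
)

def is_safe_query(sql: str) -> bool:
    """Reject anything that is not one read-only SELECT on allowed tables."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:                      # no statement stacking
        return False
    if not stripped.lower().startswith("select"):
        return False
    if FORBIDDEN.search(stripped):
        return False
    hits = re.findall(r"\bfrom\s+([\w.]+)|\bjoin\s+([\w.]+)", stripped, re.I)
    referenced = {t.lower() for pair in hits for t in pair if t}
    return referenced.issubset(ALLOWED_TABLES)

def answer(question: str, llm_to_sql, run_query) -> str:
    """Translate plain English to SQL, gate it, then execute."""
    sql = llm_to_sql(question)   # e.g. an OpenAI call behind Guardrails
    if not is_safe_query(sql):
        return "Sorry, I can only run read-only queries on approved tables."
    return run_query(sql)        # e.g. a read-only Redshift cursor
```

The interesting production work lives in the prompt design and the Guardrails schemas; the point of the gate is simpler: generated SQL never runs unvetted.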
02
Airflow dbt Docker EC2
Personal Project — 2024
NYC Taxi Batch ELT Pipeline
The Problem
Public NYC taxi data is real, messy, and updated monthly at scale. I wanted to prove I could handle it end to end with production-grade batch pipeline architecture.
What I Built
End-to-end ELT pipeline: Python ingestion, Airflow orchestration, dbt transformations across raw, staging, and mart layers, containerized with Docker and deployed on EC2.
The Outcome
Processes millions of monthly records with automated quality checks at every stage. A complete, documented system built to the same standard I apply at work.
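The quality gates between the raw and staging layers can be sketched like this. The record shape and field names are simplified assumptions for illustration, not the actual TLC schema.

```python
import datetime as dt

# Illustrative quality gates for one monthly taxi batch, the kind of check
# that runs between raw ingestion and the staging layer.

def monthly_key(month: dt.date) -> str:
    """Partition key for one monthly file, e.g. 'yellow_tripdata_2024-03'."""
    return f"yellow_tripdata_{month:%Y-%m}"

def passes_quality_gates(trip: dict) -> bool:
    """A trip survives staging only if its basic invariants hold."""
    try:
        return (
            trip["pickup"] < trip["dropoff"]   # time moves forward
            and trip["fare"] >= 0              # no negative fares
            and trip["distance"] >= 0
        )
    except (KeyError, TypeError):
        return False                           # malformed rows are rejected

def split_batch(trips):
    """Route each record to the staging layer or a quarantine table."""
    good, quarantined = [], []
    for t in trips:
        (good if passes_quality_gates(t) else quarantined).append(t)
    return good, quarantined
```

Quarantining instead of silently dropping is the design choice that matters: bad records stay visible, so a spike in the quarantine table becomes an alert, not a mystery.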
03
OpenAI LangChain RAG FastAPI
Personal Project — 2024
OncoGuide — Domain Constrained Health AI
The Problem
Generic LLMs hallucinate in clinical contexts. Healthcare AI needs to stay within validated boundaries or it becomes dangerous.
What I Built
A RAG-based LLM assistant with Guardrails, FastAPI backend, and curated reference data that grounds every response in validated clinical content.
The Outcome
A production-ready system that demonstrates how to make AI reliable in high-stakes domains. The same principles apply anywhere data accuracy is non-negotiable.
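The core constraint, answer only from validated content, reduces to a retrieval threshold. This sketch uses naive word overlap in place of embedding similarity, and the corpus entries are invented examples, not real clinical guidance.

```python
# Minimal sketch of the "stay inside validated content" rule: the assistant
# answers only when retrieval finds a sufficiently relevant curated passage,
# otherwise it declines. Corpus entries below are invented placeholders.
CURATED = [
    "Chemotherapy schedules must be confirmed with the treating oncologist.",
    "Grade 3 neutropenia generally warrants a dose delay per protocol.",
]

def relevance(query: str, passage: str) -> float:
    """Stand-in for embedding similarity: fraction of query words present."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def retrieve(query: str, threshold: float = 0.3):
    scored = sorted(((relevance(query, p), p) for p in CURATED), reverse=True)
    best_score, best = scored[0]
    return best if best_score >= threshold else None

def answer(query: str) -> str:
    context = retrieve(query)
    if context is None:
        return "I can only answer questions covered by validated clinical content."
    # A real system would pass `context` to the LLM as grounding material.
    return f"Per validated guidance: {context}"
```

Refusing below the threshold is the whole point: in a clinical domain, "I cannot answer that" is a correct output, and a confident hallucination is not.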
06 — Original Data Analysis

I analyzed BlackRock's
own iShares data

Using publicly available iShares ETF data, I took the five largest iShares ETFs by AUM and compared their expense ratios against 5-year performance. The insight: the lowest-cost funds do not always outperform on raw returns, but they consistently outperform on risk-adjusted returns over a 5-year window. This is the kind of data story the WTS team helps make possible for everyday investors.

Source: BlackRock iShares public fund data. Analysis by Darshan Senthil.
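Risk-adjusted here means Sharpe-style: excess return over a risk-free rate, divided by return volatility. A minimal sketch with invented monthly return series (not actual iShares figures) shows why a steadier low-cost series can win even when raw averages are similar.

```python
import statistics

def sharpe(monthly_returns, risk_free_annual=0.02):
    """Annualized Sharpe ratio from a series of monthly returns."""
    rf_monthly = risk_free_annual / 12
    excess = [r - rf_monthly for r in monthly_returns]
    mean = statistics.mean(excess)
    vol = statistics.stdev(excess)
    return (mean / vol) * (12 ** 0.5)   # annualize by sqrt(12)

# Invented series for illustration only: similar average return,
# very different volatility.
low_cost  = [0.010, 0.012, -0.004, 0.009, 0.011, 0.008]    # steadier
high_cost = [0.030, -0.020, 0.025, -0.015, 0.028, -0.010]  # bigger swings
```

With these inputs the steadier series scores a far higher Sharpe ratio, which is the mechanism behind the finding above: low fees plus low drag compound into better risk-adjusted outcomes.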

$550B
IVV
S&P 500
$420B
AGG
Bonds
$340B
IEFA
Intl Dev
$105B
IWF
Growth
$78B
IEMG
EM
Key Finding
0.03%
IVV expense ratio — one of the lowest in the market. Yet it consistently delivers top-quartile risk-adjusted returns over 5 years. Low cost compounds.
Data Quality Observation
Daily
iShares fund data updates daily across hundreds of fields. Even a one-day lag in NAV data can cause investor confusion. This is the precision the WTS team maintains.
What This Means
Trust
Millions of investors compare these funds daily. The accuracy of this data is not a technical detail. It is the foundation of every financial decision they make.
07 — My 90 Day Vision

What I would do
in my first 90 days

Day 1 — 30
Learn the system before touching it
  • Shadow every team that interacts with WTS fund data to understand the full picture
  • Map every manual touchpoint in the current iShares ETF data workflow from source to website
  • Read every piece of documentation that exists and identify where the gaps are
  • Understand the data feeds from internal and external providers before proposing anything
Day 31 — 60
Add value without creating risk
  • Write detailed documentation for any process that currently exists only in someone's head
  • Identify the top three data quality risks in the current production system and propose fixes
  • Start handling end-user support requests independently to build product knowledge fast
  • Partner with the engineering team on one concrete process improvement initiative
Day 61 — 90
Start building for the long term
  • Propose a scalable data feed solution for one new product area based on what I have learned
  • Begin cross-team training so other WTS members can navigate the fund data systems independently
  • Deliver a written assessment of where automation could replace manual data population tasks
  • Have a clear point of view on where I can contribute most in the next 6 months
08 — Skills

What I bring to
the table

Data Infrastructure
Python SQL PySpark dbt Apache Airflow AWS Glue Kafka FastAPI Docker
Data Platforms
Amazon Redshift Snowflake BigQuery PostgreSQL AWS S3 EC2 MongoDB
AI and Automation
OpenAI API LangChain RAG Pipelines Guardrails Mistral MLflow scikit-learn
Analytics and Communication
Tableau Power BI Looker A/B Testing Data Modeling Requirements Docs Stakeholder Mgmt
09 — Let's Talk

I would love to show you what I can build for the WTS team.

I am open to relocating to Wilmington and ready to contribute from day one. If you are looking for someone who takes data accuracy as seriously as you do, I think we have a lot to talk about.

Get in Touch View Full Portfolio