Kushagra Bharti
Student | Software Engineer | ML Enthusiast.
I am a student and software builder who enjoys learning and expanding my skillset.
Primary Sources
- [Canonical portfolio homepage](https://www.kushagrabharti.com): Public visual portfolio homepage.
- [AI-readable HTML profile](https://www.kushagrabharti.com/ai): Full semantic profile with experience, projects, education, writings, creative work, and crawler notes.
- [Plain-text llms.txt](https://www.kushagrabharti.com/llms.txt): This generated Markdown guide for automated readers.
Key facts:
- I am a student and software engineer, but I do not fit cleanly into one lane. I move between machine learning, AI agents, full-stack products, research tooling, data systems, computer vision, optimization, trading experiments, and the occasional hardware or film project.
- A lot of my work starts with a question I cannot leave alone. Can LLM agents actually plan over a full game? Can pose tracking be cleaned up enough for real lab workflows? Can a product keep artifacts and context instead of turning everything into another chat thread?
- I like building the whole loop: the core engine, the UI, the data model, the tests, the telemetry, the failure cases, and the writeup. I do not enjoy stopping at a demo if the interesting part is still hidden.
- MonopolyBench is my main AI research bet right now: a deterministic multi-agent environment for studying long-horizon planning, negotiation, deception, and bias through full Monopoly games.
- At UT Southwestern, I have been working on computer vision for behavioral neuroscience: DeepLabCut/SuperAnimal pipelines, pose cleanup, behavior scoring, QC outputs, and CSV/XLSX scorecards researchers can actually inspect.
- I have worked in real company environments too. At Abilitie, I contributed to an LLM role-play training product with React, TypeScript, provider plumbing, telemetry, prompt work, open-source model fine-tuning, latency improvements, and cost reduction. At Glydr.gg, I led technical and product direction for a customer-facing configuration hub with React/Vite, Fastify, Postgres, Steam auth, admin tooling, and Railway deployment.
- I have also built products and systems outside research: Pact, Beyond Chat, NovelBench, PseudoLawyer, Arachne, a personal portfolio/tracker, quant trading tooling, and smaller ML, hardware, and algorithm projects.
- I care about legibility. If a model makes a decision, I want traces. If a benchmark gives a score, I want the run artifacts. If a pipeline produces a number, I want to know where it came from and where it can fail.
- I care about taste too. The interface matters. The data model matters. The story matters. A thing can pass tests and still feel wrong.
- Film is part of the same instinct for me. Framing, pacing, selection, and restraint show up in software more than people admit.
- I am looking for work where I can learn quickly, own hard problems, build real systems, and stay honest about what is broken.
Contact and External Profiles
- Email: mailto:kushagrabharti@gmail.com
- LinkedIn: https://www.linkedin.com/in/kushagra-bharti/
- GitHub: https://github.com/kushagrabharti
- Medium: https://medium.com/@kushagrabharti
- X: https://x.com/IamKushagraB
- Film Portfolio: https://drive.google.com/file/d/1m3aFLAK4TE29ybbdOzObLS8zrrX3oJwM/view?usp=sharing
Values, Writings, and Predictions
01 perpetual learning
Category: value
Summary: Learning as the discipline of staying corrigible before comfort, fluency, and success become inertia.
Perpetual Learning
Learning is the refusal to fossilize.
I believe comfort is a slow form of surrender. Knowledge does not simply vanish; it sediments. Skill, left untested, ossifies into reflex. The mind begins to mistake familiarity for truth, and a little fluency can masquerade as depth.
I want to remain an apprentice to reality: corrected by it, sharpened by it, unsettled by it. I want to keep moving before competence becomes ceremony, before success becomes anesthesia, before the self I have built becomes a room I can no longer leave.
So I return to the discipline:
* Learn before knowledge becomes sediment.
* Adapt before skill becomes architecture.
* Compete before comfort dulls the edge.
* Question before confidence becomes disguise.
* Begin again before achievement becomes sleep.
What does not learn, hardens.
What does not adapt, is buried.
What does not question itself, decays first.
What does not contend, rots.
What does not begin again, dies.
02 kinetic agency
Category: belief
Summary: Turning abundant intelligence into motion through fast feedback, real shipping, and active adaptation.
Kinetic Agency
Agency is the ability to turn uncertainty into motion.
Intelligence is becoming infrastructure. AI can answer, generate, debug, scaffold, summarize, and compress. That changes the game: the scarce thing is no longer access to intelligence, but the will to use it.
Inaction is not neutral. In a fast world, delay compounds. Tools change. Stack choices change. Models change. The half-life of advantage keeps shrinking. What matters is not being perfectly prepared, but staying active enough to adapt.
I care about doing: building, shipping, testing, accepting criticism, revising, moving. Learn the foundations, then put them to work. Progress comes through contact with reality.
AI should not replace agency. It should accelerate it. It should make me faster, broader, more experimental, more dangerous to my own inertia.
How I try to operate:
* Act while the feedback loop is still short.
* Treat plans as drafts and reality as the editor.
* Learn new tools faster than old tools can become habits.
* Let AI compress the distance between intention and execution.
* Reconfigure quickly; attachment is slower than adaptation.
* Keep moving. The terrain changes whether you do or not.
Intelligence is becoming common.
Agency is what turns it into value.
03 discernment
Category: thought
Summary: Taste as selection: the software and cinematic grammar of what to remove, automate, and make invisible.
Discernment
Film taught me that the smallest choices carry the most weight.
Hold a scene two seconds too long and it starts to plead. Cut too early and it loses its nerve. Move the frame a few inches and the background becomes evidence. Leave the line unsaid and silence does the work.
Kuleshov’s lesson stayed with me: nothing means alone. Meaning comes from placement, timing, omission.
Software became the same grammar under stricter conditions.
Taste in software is not just how something appears. It is the instinct to hate repeated effort enough to remove it. To make the machine anticipate instead of merely respond. To build with enough pride that the work can survive being used, read, extended, and blamed.
The small choices decide the system: what gets automated, what gets named, what errors become impossible, what complexity disappears, what power becomes effortless.
AI makes generation abundant, but it does not remove taste. It raises the standard for it. More can be made now, which means more has to be chosen, shaped, compressed, and made worth using.
The work is still the cut: what to automate, what to simplify, what to make invisible, what to make impossible to misunderstand.
Taste is the art of deciding what the world can do without.
04 predictions
Category: prediction
Summary: Notes on agents as cognitive infrastructure, democratized capability, and freedom as the next luxury.
Predictions
Some predictions and ideas I have about the future.
Agents Replace Everything
Cognitive work has always depended on human operators: people moving context, making judgments, using tools, and carrying tasks across fragmented systems.
Agents become the interface layer for cognitive work, turning intent into action across law, finance, software, marketing, research, and every field built on information.
The old unit was the app, the file, the meeting, the dashboard. The new unit is the work getting done.
Democratization of Access & Ability
For most of history, serious invention required rare access: experts, labs, capital, credentials, teams, and time. Most people had ideas they could not test.
AI turns intelligence into infrastructure, making research, analysis, creation, experimentation, and discovery available to far more people.
The lab becomes less of a place and more of a capability. More access means more attempts, more strange ideas, more local problems solved, and more breakthroughs from places nobody was watching.
Freedom Becomes Luxury
The world is getting faster, louder, and more optimized. Utility is everywhere, stimulation is endless, and attention is constantly being pulled apart.
The next luxury is freedom: control over your own attention, time, energy, and state of mind.
The best products, places, communities, and experiences will feel less like services and more like worlds: spaces that give people room to focus, wander, play, rest, belong, and breathe again.
Experience Links
Machine Learning Engineer Intern at UT Southwestern Medical Center, Tsai Lab
Date Range: Feb 2026 - Present
Category: Research
Timeline Tone: active
Summary: Fine-tuning DeepLabCut/SuperAnimal for lab-specific 3-chamber mouse-behavior video and building an end-to-end computer vision pipeline for pose extraction, behavioral scoring, QC, and researcher-facing scorecards.
Highlights:
- Context: Working in the Tsai Lab at UT Southwestern on computer vision tooling for behavioral neuroscience, specifically 3-chamber mouse-behavior videos used to study social interaction and experimental phenotypes.
- Problem: Off-the-shelf pose estimation and manual script-based analysis were not enough for reliable lab review; the workflow needed lab-specific model adaptation, consistent post-processing, interpretable QC, and researcher-friendly outputs.
- Model architecture: Adapted the DeepLabCut/SuperAnimal pose-estimation stack to domain-specific 3-chamber mouse footage by hand-annotating lab frames and fine-tuning the base model against the visual conditions and behavioral setup used in the lab.
- Core implementation: Built the end-to-end ML analysis pipeline from raw video to pose extraction, likelihood-aware filtering, dropped-keypoint interpolation, behavioral metric generation, QC reports, and CSV/XLSX scorecards.
- Evaluation and benchmarks: Improved pose-track stability by 56.9% using fine-tuning, likelihood filtering, interpolation, and hardened post-processing across benchmark lab video clips.
- Impact and outcome: Turned a fragile, script-heavy analysis workflow into a reproducible computer vision pipeline that produces interpretable pose tracks, behavioral summaries, and lab-review-ready scorecards instead of raw model outputs.
- Technical depth: Corrected behavioral scoring logic around discrimination index, ambiguous cup contact, interpolation boundaries, chamber occupancy, occlusion/body-length flags, and low-confidence pose summaries so downstream metrics better match experimental definitions.
- What made it hard: The project sits at the intersection of model adaptation, noisy animal video, experimental behavioral definitions, and researcher usability; the system needed to expose uncertainty instead of hiding unreliable trials behind clean-looking outputs.
Tags: Machine Learning, Computer Vision, DeepLabCut, SuperAnimal, Markerless Pose Estimation, Model Fine-Tuning, Domain Adaptation, Behavioral Neuroscience, Behavioral Phenotyping, Mouse Behavior Analysis, Video Analysis, Pose Tracking, Likelihood Filtering, Keypoint Interpolation, QC Tooling, Research Software, Python, OpenCV, pandas, NumPy, HDF5, CSV/XLSX
Link: https://labs.utsouthwestern.edu/tsai-lab
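The likelihood-aware filtering and dropped-keypoint interpolation steps described above can be illustrated with a minimal sketch. This is not the lab pipeline itself: the function names, the 0.6 likelihood threshold, and the 5-frame gap limit are assumptions chosen for illustration, and the sketch handles a single coordinate track with the standard library only.

```python
from typing import List, Optional


def mask_low_confidence(
    values: List[float], likelihoods: List[float], threshold: float = 0.6
) -> List[Optional[float]]:
    """Replace coordinates whose likelihood is below `threshold` with None.

    The threshold is an illustrative assumption, not a documented setting.
    """
    return [v if p >= threshold else None for v, p in zip(values, likelihoods)]


def interpolate_gaps(
    values: List[Optional[float]], max_gap: int = 5
) -> List[Optional[float]]:
    """Linearly fill interior runs of None that are at most `max_gap` frames long.

    Leading gaps, trailing gaps, and long gaps stay None so a QC step can
    flag them instead of hiding them behind a smooth-looking track.
    """
    out = list(values)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # first index after the gap
        if start == 0 or end == n or (end - start) > max_gap:
            continue  # boundary or long gap: leave unfilled
        left, right = out[start - 1], out[end]
        span = end - (start - 1)
        for k in range(start, end):
            t = (k - (start - 1)) / span
            out[k] = left + t * (right - left)
    return out


def missing_fraction(values: List[Optional[float]]) -> float:
    """Share of frames still missing after cleanup, usable as a simple QC number."""
    return sum(v is None for v in values) / len(values) if values else 0.0
```

Keeping boundary and long gaps as `None` is the design choice that matters here: it lets a scorecard report how much of a trial was unreliable rather than interpolating across it.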
Undergraduate Researcher at UT Dallas, CAIR Lab
Date Range: Apr 2025 - Present
Category: Research
Timeline Tone: active
Summary: Built MonopolyBench, a deterministic multi-agent LLM benchmark for evaluating tool-calling agents on long-horizon planning, negotiation, deception, bias, memory, and schema-constrained decision-making.
Highlights:
- Context: Working with the UT Dallas CAIR Lab on agentic AI evaluation, focusing on long-running multi-agent environments where models must plan, negotiate, remember state, and take schema-valid actions over many turns.
- Problem: Many LLM-agent evaluations are short, single-agent, or hard to reproduce; MonopolyBench creates a deterministic, replayable environment where agent behavior can be inspected at the level of prompts, tool calls, state transitions, and strategic decisions.
- System architecture: Built MonopolyBench as an authoritative rules engine plus multi-agent arena where tool-calling LLM agents play complete Monopoly games through schema-bound actions rather than unconstrained free text.
- Core implementation: Implemented deterministic game mechanics including seeded dice/cards, turn order, legal action menus, property ownership, rent, auctions, jail, trades, liquidation, bankruptcy, strict validation, corrective retries, and deterministic fallbacks.
- Evaluation design: Designed each decision around game state, recent history, memory, legal action schemas, model responses, parsed actions, tool-call traces, applied events, snapshots, summaries, and replayable artifacts.
- Impact and outcome: Built the benchmark infrastructure for a forthcoming agentic AI research paper studying planning, negotiation, deception, and bias in long-horizon, multi-agent game environments.
- Technical depth: Added full run telemetry so experiments are debuggable instead of black-box: prompts, raw responses, parsed actions, retries, fallbacks, validation failures, legal menus, state snapshots, and final summaries are all logged for analysis.
- What made it hard: The system has to keep LLMs constrained without making the game trivial; every tool-call path needs strict validation, reproducibility, and fallback behavior while preserving meaningful strategic freedom.
Tags: LLM Evaluation, Agentic AI, AI Agents, Multi-Agent Systems, Tool Calling, Function Calling, Agent Benchmarking, Deterministic Simulation, Long-Horizon Planning, Negotiation, Deception, Bias Evaluation, Schema-Bound Actions, JSON Schema, Event Sourcing, Replayable Artifacts, Telemetry, Python, FastAPI, WebSockets, React, TypeScript, Vite, Zod, Pytest
Link: https://cairatutd.github.io/
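The strict validation, corrective retry, and deterministic fallback loop described above could look roughly like the sketch below. It is a simplified illustration, not MonopolyBench code: the action names, the `{"action": ..., "args": ...}` shape, the retry prompt text, and the `decide` helper are all hypothetical.

```python
import json
from typing import Callable, Dict, List, Optional, Tuple


def validate_action(raw: str, legal_actions: Dict[str, List[str]]) -> Optional[dict]:
    """Return the parsed action if it is a JSON object naming a legal action
    with exactly the required argument names; otherwise return None."""
    try:
        action = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(action, dict):
        return None
    name = action.get("action")
    if not isinstance(name, str) or name not in legal_actions:
        return None
    args = action.get("args", {})
    if not isinstance(args, dict) or set(args) != set(legal_actions[name]):
        return None
    return {"action": name, "args": args}


def decide(
    agent: Callable[[str], str],
    legal_actions: Dict[str, List[str]],
    fallback: dict,
    max_retries: int = 2,
) -> Tuple[dict, List[dict]]:
    """Ask the agent for an action, retrying with a corrective prompt on
    invalid output, and use a fixed fallback if every attempt fails.

    Returns the chosen action plus a per-attempt trace for later inspection.
    """
    trace: List[dict] = []
    prompt = "Choose one legal action."
    for attempt in range(max_retries + 1):
        raw = agent(prompt)
        action = validate_action(raw, legal_actions)
        trace.append({"attempt": attempt, "raw": raw, "valid": action is not None})
        if action is not None:
            return action, trace
        prompt = "Invalid action. Legal actions: " + ", ".join(sorted(legal_actions))
    return fallback, trace
```

Because the fallback is a fixed value rather than a random choice, a run that exhausts its retries still replays identically.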
Software Engineer at Glydr.gg
Date Range: Jan 2026 - May 2026
Category: Industry
Timeline Tone: past
Summary: Led engineering for Glydr.gg's Railway-deployed, microservice-based configuration platform serving 1,000+ users, spanning public discovery, Steam authentication, admin tooling, Control Panel imports, CI/CD, and distributed config delivery.
Highlights:
- Context: Glydr.gg needed a customer-facing configuration hub for discovering, publishing, importing, and managing game-server / controller configuration payloads across public users, admins, and Control Panel workflows.
- Problem: The product needed more than static config files; it required authenticated user flows, official/admin publishing, stable versioned imports, repeatable deployments, and safe handoffs into the Control Panel without exposing large or private payloads in URLs.
- System architecture: Led engineering for a Railway-deployed, microservice-based platform serving 1,000+ users, with separate frontend, backend API, worker, and database services built around React/Vite, Fastify, PostgreSQL, Drizzle ORM, and GitHub Actions CI/CD.
- Core implementation: Built public discovery, Steam authentication, admin tooling, Control Panel imports, config detail pages, official config publishing, private uploads, import success states, and backend-owned session flows.
- Data model: Modeled the platform with relational tables for users, Steam identities, sessions, games, categories, configs, immutable config versions, imports, handoff tokens, background jobs, and audit logs.
- Security and correctness: Engineered config delivery with checksum-versioned payloads, tokenized imports, throttling, validation guards, HTTP-only cookies, CSRF checks, admin allowlists, stable checksum parsing, and invalid-payload rejection.
- Impact and outcome: Turned config sharing/importing into a repeatable product workflow for real users instead of an ad hoc file handoff, while giving the team deployment automation and backend validation for safer iteration.
- What made it hard: The platform had to bridge product UX and backend correctness: users needed simple one-click imports, while the system needed stable versioning, safe auth boundaries, replayable imports, and repeatable deploys.
Tags: Software Engineering, Technical Leadership, Product Engineering, Full-Stack Development, Platform Engineering, Microservices, TypeScript, React, Vite, Fastify, PostgreSQL, Drizzle ORM, Railway, GitHub Actions, CI/CD, Steam OpenID, Authentication, CSRF, HTTP-Only Cookies, Rate Limiting, Checksum Validation, Config Versioning, Admin Tooling, API Design
Link: https://glydr.gg/
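One way to picture the checksum-versioned payloads and stable checksum parsing mentioned above is the sketch below. It is an assumption-laden illustration rather than the platform's code (which is TypeScript): the `sha256:<hex>` format, the canonical JSON encoding, and every function name here are hypothetical.

```python
import hashlib
import hmac
import json
import string


def canonical_bytes(payload: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal payloads
    always hash to the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_version(payload: dict) -> dict:
    """Build an immutable version record keyed by the payload checksum."""
    data = canonical_bytes(payload)
    return {
        "checksum": "sha256:" + hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "payload": payload,
    }


def parse_checksum(value: str) -> str:
    """Accept 'sha256:<64 hex chars>' and return the lowercase digest.

    Raises ValueError for any other shape (hypothetical format).
    """
    algo, sep, digest = value.partition(":")
    if algo != "sha256" or not sep or len(digest) != 64:
        raise ValueError("unsupported checksum format")
    if not all(c in string.hexdigits for c in digest):
        raise ValueError("checksum is not hexadecimal")
    return digest.lower()


def verify_import(version: dict, expected_checksum: str) -> bool:
    """Reject an import whose payload no longer matches the expected checksum."""
    expected = parse_checksum(expected_checksum)
    actual = hashlib.sha256(canonical_bytes(version["payload"])).hexdigest()
    return hmac.compare_digest(actual, expected)
```

Hashing a canonical encoding means key order in the uploaded JSON does not create spurious new versions.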
Independent Researcher (Optimization + ML) at UT Dallas
Date Range: Apr 2025 - Nov 2025
Category: Research
Timeline Tone: past
Summary: Built paper-faithful optimization solvers and a solver-labeled dataset pipeline for 1D drone coverage planning, with benchmarks and QC gates to support supervised, GNN, and RL experiments on optimal tour planning.
Highlights:
- Implemented 4 paper-faithful solvers for 1D drone coverage planning (greedy + DP), including exact plan reconstruction so solver outputs can become usable training labels.
- Built an end-to-end data pipeline from instance generation → gold labels → featurization hooks → QC, enabling ML training on optimal solutions instead of heuristic approximations.
- Measured labeling throughput on a verified run: 370 labeled instances in 2.71s (~136 samples/s), writing ~280KB of JSONL data with automated validation of feasibility and coverage constraints.
- Configured defaults for a 67,000-sample labeled dataset across train/test/shifted/extrap/stress splits, supporting generalization and distribution-shift evaluation.
- Benchmarked exact solver scaling: dp_full stays under 1s up to n=1024 segments and reaches n=4096 in 8.41s, giving practical ceilings for exact-label generation.
- Maintained correctness gates with 162 collected tests, including plan round-trip tests and oracle cross-checks for solver behavior and reconstruction validity.
- Exposed ML-ready surfaces including gold labelers, legality masks, featurization hooks, and candidate metadata for future supervised learning, GNN, and RL experiments.
Tags: Optimization, Algorithms, Dynamic Programming, Greedy Algorithms, Computational Geometry, Coverage Planning, Drone Routing, Exact Algorithms, Plan Reconstruction, Dataset Generation, Programmatic Labeling, Featurization, Data QC, Benchmarking, Reproducible Research, Python, NumPy, PyTorch, Pytest, JSONL, GNN, Reinforcement Learning
Link: https://personal.utdallas.edu/~daescu/
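The plan round-trip tests and feasibility/coverage validation mentioned above can be sketched on a deliberately simplified model. Everything specific here is an assumption for illustration: tours are plain `(lo, hi)` intervals, feasibility is a maximum span, coverage means each segment lies inside some tour, and the JSONL record shape is hypothetical. The actual solvers follow the paper's formulation, which this sketch does not reproduce.

```python
import json
from typing import List, Tuple

Interval = Tuple[float, float]


def plan_to_jsonl_line(instance_id: str, tours: List[Interval]) -> str:
    """Serialize one labeled plan as a single JSONL line."""
    return json.dumps({"id": instance_id, "tours": [list(t) for t in tours]}, sort_keys=True)


def plan_from_jsonl_line(line: str) -> Tuple[str, List[Interval]]:
    """Parse a JSONL line back into (instance_id, tours)."""
    record = json.loads(line)
    return record["id"], [(t[0], t[1]) for t in record["tours"]]


def check_plan(segments: List[Interval], tours: List[Interval], max_span: float) -> List[str]:
    """Return a list of QC problems; an empty list means the plan passes.

    Simplified rules (assumptions): a tour is feasible when hi - lo <= max_span,
    and a segment is covered when it lies entirely inside at least one tour.
    """
    problems: List[str] = []
    for i, (lo, hi) in enumerate(tours):
        if lo > hi:
            problems.append(f"tour {i}: reversed interval")
        elif hi - lo > max_span:
            problems.append(f"tour {i}: span exceeds limit")
    for j, (a, b) in enumerate(segments):
        if not any(lo <= a and b <= hi for lo, hi in tours):
            problems.append(f"segment {j}: not covered")
    return problems
```

A round-trip test then asserts that parsing a written line returns the same plan, and the QC gate refuses to emit a label whenever `check_plan` returns any problem.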
Software Engineering Intern at Abilitie
Date Range: May 2024 - Aug 2024
Category: Industry
Timeline Tone: past
Summary: Owned Llama 3.1 fine-tuning, structured-output engineering, cost optimization, telemetry, and latency work for Abilitie AI Cases, reducing LLM cost per conversation by 70% across 27 role-play configurations and reaching 1.0s p95 TTFT.
Highlights:
- Context: Worked on Abilitie AI Cases, an enterprise LLM role-play training product for scenario-based leadership, communication, and decision-making practice.
- Problem: The product needed more reliable role-play behavior, lower LLM cost, structured model outputs, better latency visibility, and smoother perceived responsiveness across many scenario configurations.
- Model architecture: Owned end-to-end Llama 3.1 fine-tuning on proprietary role-play conversations, scenario data, and structured-output targets to improve domain coaching and JSON/tool-calling behavior.
- Core implementation: Built React/TypeScript chat flows, scenario configuration pages, end-state UI fixes, streaming/loading states, and Azure/AWS-backed provider request plumbing around production role-play flows.
- Structured output system: Replaced brittle free-text responses with schema-constrained JSON outputs so model responses could be validated, rendered as deterministic product state, and retried when they failed format expectations.
- Evaluation and optimization: Reduced LLM cost per conversation by 70% across 27 role-play configurations through model migration, prompt compression of roughly 20%, schema-constrained outputs, and retry reduction of roughly 8%.
- Telemetry architecture: Built DynamoDB telemetry for TTFT, TTLT, token throughput, token counts, retries, errors, provider/model metadata, and per-request traces, making latency and reliability problems diagnosable instead of anecdotal.
- Latency impact: Optimized 3-second idle prefetching and stale-response invalidation to reach 1.0s p95 TTFT in the monitored role-play flow.
- Safety and robustness: Ran prompt-injection testing and hardening iterations to reduce out-of-format, unsafe, or scenario-breaking model behavior in customer-facing role-play interactions.
- What made it hard: The work required balancing cost, latency, model quality, schema validity, role-play realism, and user experience; optimizing one metric in isolation would have been easy, but the product needed all of them to hold together.
Tags: Software Engineering, LLM Product Engineering, LLMs, Llama 3.1, Fine-Tuning, Structured Outputs, JSON Schema, Tool Calling, Prompt Engineering, Prompt Injection Testing, Model Evaluation, Cost Optimization, Latency Optimization, Telemetry, Observability, DynamoDB, AWS, Azure, React, TypeScript, Material UI, Streaming UX, Product Engineering
Link: https://www.abilitie.com/case-challenges
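The TTFT/TTLT telemetry and p95 figure described above can be illustrated with a small sketch. It is not the production telemetry code: the function names, the record fields, and the nearest-rank percentile method are assumptions, and storage (DynamoDB in the real system) is left out entirely.

```python
import math
import time
from typing import Callable, Dict, Iterable, List, Optional


def timed_stream(
    tokens: Iterable[str], clock: Callable[[], float] = time.monotonic
) -> Dict[str, Optional[float]]:
    """Consume a token stream and record time to first token (TTFT),
    time to last token (TTLT), and the token count."""
    start = clock()
    first: Optional[float] = None
    count = 0
    for _ in tokens:
        now = clock()
        if first is None:
            first = now
        count += 1
    end = clock()
    return {
        "ttft_s": None if first is None else first - start,
        "ttlt_s": end - start,
        "tokens": count,
    }


def percentile_nearest_rank(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Injecting the clock keeps the measurement testable, and a p95 over per-request `ttft_s` values is one way a figure like the 1.0s p95 TTFT could be tracked over time.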
Dorm Proctor at St. Stephen's Episcopal School
Date Range: Aug 2021 - May 2022
Category: Leadership
Timeline Tone: past
Summary: Supported younger boarding students through peer mentorship, safety training, and community-building in a residential school environment.
Highlights:
- Supported new students as they transitioned into boarding school life, helping them adjust to routines, expectations, and the social environment.
- Served as a peer mentor for academic, personal, and social challenges, using training from counselors and dorm staff to respond with discretion and care.
- Worked with dorm parents, administrators, and counselors to help maintain a safe, welcoming, and inclusive dorm environment.
- Completed safety and emergency training, including fire protocols and campus response procedures.
- Balanced proctor responsibilities with academics and extracurriculars, learning how to be dependable in a community-facing leadership role.
- Tried to be the person younger students could come to when something felt confusing, stressful, or just awkward about living away from home.
Tags: Leadership, Mentorship, Peer Support, Student Life, Counseling, Community Building, Safety Training, Communication
Link: https://www.sstx.org/boarding/boarding-student-support
Project Source Links
MonopolyBench
Summary: A deterministic long-horizon LLM-agent benchmark for evaluating planning, negotiation, deception, memory, bias, and asset-management behavior.
Highlights:
- Context: Built MonopolyBench as a long-horizon agent-evaluation environment in the same research direction as Vending-Bench and Beer Game-style AI-agent studies: structured simulations where LLM agents must make repeated economic decisions under constraints instead of answering isolated prompts.
- Problem: Most LLM benchmarks are short, single-agent, or difficult to replay. MonopolyBench tests whether tool-calling agents can sustain coherent planning across a full game involving cash, assets, rent, debt, trades, auctions, liquidity pressure, bankruptcy, and adversarial incentives.
- Research direction: The current benchmark is Monopoly-specific, but the roadmap is to generalize it into a real-estate and asset-management agent benchmark where models manage property portfolios, negotiate deals, handle liquidity constraints, price assets, and respond to strategic counterparties.
- System architecture: Built an authoritative Monopoly rules engine plus multi-agent arena where models act through schema-bound tool calls rather than free-form text, keeping every decision constrained, validatable, inspectable, and replayable.
- Core implementation: Implemented deterministic dice/cards, turn order, legal action menus, property ownership, rent, auctions, jail, trades, liquidation, bankruptcy, strict action validation, corrective retries, and deterministic fallbacks.
- Agent interface: Designed each decision around full game state, recent history, memory, legal action schemas, private/public messages, persona configuration, and tool-call responses so model behavior can be evaluated at the decision level.
- Telemetry and replay: Logged prompts, tool schemas, raw responses, parsed actions, retries, fallbacks, legal menus, state snapshots, applied events, and summaries so runs are debuggable instead of black-box.
- Evaluation design: Built the benchmark to support model-vs-model comparisons, win-rate rankings, TrueSkill-style ratings, micro-decision suites, negotiation/bluffing analysis, replay auditing, and controlled bias experiments through player descriptors.
- Impact and outcome: MonopolyBench turns a familiar economic board game into a deterministic agent-evaluation lab for studying long-horizon planning, strategic bargaining, memory, rule-following, and failure recovery.
- What made it hard: The benchmark has to preserve strategic freedom while preventing invalid actions, hallucinated game state, illegal trades, broken turn order, and untraceable model behavior; every run needs to be both game-correct and research-analyzable.
Tags: LLM Evaluation, Agent Harness, AI Agents, Multi-Agent Systems, Tool Calling, Function Calling, Agent Benchmarking, Long-Horizon Planning, Economic Simulation, Game Theory, Real Estate Benchmarking, Asset Management, Negotiation, Deception, Bias Evaluation, Schema-Bound Actions, Deterministic Simulation, Event Sourcing, Replayable Artifacts, Telemetry, Python, FastAPI, WebSockets, React, TypeScript, Vite, Zod, JSON Schema, Pytest
Link: https://github.com/KushagraBharti/MonopolyBench
Thumbnail: /portfolio/projects/monopoly-llm-benchmark.svg
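The deterministic dice and replayable event log described above rest on two ideas: a private seeded random generator and state that is rebuilt purely from logged events. The toy below is a sketch of those two ideas only, far simpler than the real rules engine; the event shape, the two-player loop, and the function names are hypothetical.

```python
import random
from typing import Dict, List


def play(seed: int, turns: int) -> List[dict]:
    """Simulate a toy two-player roll-and-move loop and return its event log.

    A private random.Random(seed) keeps dice independent of global state,
    so the same seed always yields the same log.
    """
    rng = random.Random(seed)
    events: List[dict] = []
    for turn in range(turns):
        dice = [rng.randint(1, 6), rng.randint(1, 6)]
        events.append({"turn": turn, "player": turn % 2, "dice": dice})
    return events


def apply_events(events: List[dict], board_size: int = 40) -> Dict[int, int]:
    """Rebuild player positions from the event log alone (replay)."""
    positions = {0: 0, 1: 0}
    for event in events:
        player = event["player"]
        positions[player] = (positions[player] + sum(event["dice"])) % board_size
    return positions
```

Because `apply_events` never touches the random generator, a stored log can be replayed and audited without re-running any model calls.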
F1 Reinforcement Learning
Summary: A custom Formula 1 simulation lab with telemetry-calibrated physics, GPU evolutionary search, BC/SAC policy training, and a 78.683s learned lap.
Highlights:
- Context: Built F1RL as a custom Formula 1 racing AI lab around Monza, combining reinforcement learning, physics simulation, GPU evolutionary search, behavior cloning, SAC fine-tuning, telemetry, replay, and real Formula 1 reference data instead of training on an off-the-shelf toy environment.
- Problem: A racing policy has to solve long-horizon control: accelerate aggressively, brake late but stably, rotate through chicanes, use curbs without abusing track limits, recover from imperfect lines, avoid wall/off-track failures, and complete a full lap quickly. The challenge was not just producing motion; it was producing fast, legal, repeatable racing behavior under a simulator that actually punishes bad vehicle dynamics.
- Final thesis: The project evolved from a custom Gymnasium/SB3 PPO environment into a full hybrid RL/search platform: real F1 telemetry-calibrated physics, CUDA/GPU evolutionary search for discovery, CPU-reranked source trajectories for trust, behavior cloning for policy imitation, and project-native PyTorch SAC for learned-policy promotion.
- Simulator architecture: Built a Gymnasium-compatible Monza environment around a shared simulator, with normalized observations, discrete/continuous/multidiscrete control modes, checkpoint-valid lap logic, Stable-Baselines3 PPO hooks, benchmark tooling, telemetry output, replay support, and learned-policy evaluation paths.
- Physics V2 architecture: Rebuilt the simulator physics into an explicit opt-in `physics_model=v2` contract with metadata for physics model, physics version, and calibration ID so V1 and V2 artifacts remain separate and comparable instead of silently mixing benchmark categories.
- Vehicle dynamics: Implemented a richer F1-style vehicle model with tire slip-angle behavior, tire-force saturation, weight transfer, automatic gear/torque curves, brake-bias instability, curb/surface grip modifiers, and oriented car-body collision checks.
- Tire model: Added slip-angle and tire-force behavior so the car must manage corner entry, understeer/oversteer, trail braking, rotation, and high-speed turning instead of simply staying below a flat grip cap.
- Weight transfer model: Added braking, acceleration, and cornering load-shift behavior so heavy braking can destabilize the rear, smooth brake release matters, throttle timing changes exit behavior, and Monza braking zones like Rettifilo, Roggia, Ascari, and Parabolica become more meaningful.
- Powertrain model: Added a gear/torque curve so acceleration depends on speed/RPM behavior instead of giving the car flat acceleration everywhere, making straights, corner exits, and top-speed behavior more realistic and easier to calibrate against real F1-like pacing.
- Brake and surface model: Added brake-bias / brake-lock tendencies plus asphalt, curb, grass/gravel, and wall/outside surface behavior so late braking, brake-and-steer inputs, curb usage, and legal track-limit behavior have real tradeoffs.
- Collision model: Improved collision detection using the car body and orientation instead of only center-point legality, making wall contact, apex clipping, side impacts, and narrow track-edge cases more trustworthy in replay.
- Calibration architecture: Calibrated V2 against FastF1 Monza references and added OpenF1 sanity-check evidence, using real timing, speed, braking, and lap reference data as the benchmark anchor while keeping the result clearly scoped to a calibrated top-down 2D simulator.
- Final V2 target: Established a FastF1-derived V2 benchmark threshold of 79.327s from the 2024 Italian GP Qualifying NOR lap 11 reference, then treated V2 learned-policy success as a separate benchmark category from the earlier V1 results.
- Observation design: Engineered racing observations so controllers and learned policies receive meaningful driving state: speed, heading error, lateral error, lap progress, ray distances, lookahead heading, target speed, brake demand, target-speed drops, braking-gate proximity, curvature, and section-aware features.
- PPO infrastructure: Built and maintained Gymnasium/SB3 PPO infrastructure with PyTorch/CUDA support, vectorized environments, checkpointing, TensorBoard logging, curriculum/state-library starts, metadata-aware eval, reward overrides, and benchmark tooling. PPO remains useful infrastructure, but it is no longer the headline result.
- Search strategy: Used CUDA-scale evolutionary controller search as the discovery engine, evaluating large populations of driving controllers, scoring full-lap behavior, preserving elite candidates, applying mutation/crossover, and pushing the population toward faster valid laps.
- GPU ES scale: Ran a staged V2 GPU evolutionary-search campaign under the new physics contract, scaling through 1000x5, 1000x10, 1000x25, and 1000x50 stages before producing CPU-reranked V2 source trajectories.
- Largest speed-search result: Ran a 2000-population, 150-generation, 25k-step GPU-fused search that evaluated 300,000 controller candidates and found a 91.233s candidate under the previous V1/V2 transition milestone before the final learned-policy pipeline closed under Physics V2.
- V2 ES result: Under explicit Physics V2, the trusted GPU ES source path found a CPU-reranked 78.6833s lap from generation 46, candidate 724, with selected parity passing at 0 reason mismatches and 0 valid-lap mismatches.
- Correctness contract: Treated GPU search as a proposal generator, not the oracle. The trusted workflow is GPU proposals -> CPU postcheck/rerank -> selected telemetry -> replay/GIF/highlight. This prevents raw GPU-only rows from being promoted without replay validation.
- Dataset export: Exported a V2-only transition dataset from CPU-replayed V2 source candidates instead of mixing V1/V2 data, preserving source run path, physics model, physics version, calibration ID, observation profile, source candidate bucket, lap outcome, and action schema.
- V2 dataset stats: Built a CPU-replayed V2 dataset with 16 source candidates, 50,128 transitions, 8 valid laps, a fastest source lap of 78.65s, and balanced examples across valid laps, early failures, and mid-frontier states.
- Action representation fix: Discovered that dominance-control assumptions lost information because V2 source telemetry used meaningful simultaneous throttle/brake behavior, so the learned-policy path switched to independent throttle/brake/steer control.
- Behavior cloning: Trained an independent-control behavior-cloned policy on V2 source data, reaching low train/validation action error and preserving the source behavior well enough for CPU V2 evaluation.
- SAC learned-policy path: Built a project-native PyTorch SAC workflow that loads verified ES datasets and BC checkpoints, uses twin critics and target networks, preloads replay buffers with ES transitions, supports online CPU MonzaSim rollouts, writes checkpoint/eval telemetry, and selects `best_policy.pt` by valid CPU eval lap.
- SAC promotion detail: The final saved policy is a SAC workflow checkpoint that preserves and promotes the valid BC initialization. This is important: the final artifact is a neural policy checkpoint evaluated through the learned-policy infrastructure, not a replayed JSONL trajectory.
- Final learned-policy result: Promoted a learned SAC checkpoint under explicit `physics_model=v2`, deterministic normal-start CPU MonzaSim evaluation, and `observation_profile=racing_v2`, completing a valid lap in 78.683333s against the 79.327s V2 benchmark threshold.
- Promotion proof: The final learned-policy promotion reached 1/1 valid lap, `lap_complete`, 4661 steps, final progress 5793.000137m, final speed 270.830kph, and a 78.683333s lap under CPU V2 evaluation.
- Replay and visual proof: Generated replayable final highlights and exact pygame-renderer GIFs for the V2 GPU ES source lap and the V2 learned-policy promotion lap, keeping the visuals tied to real replay telemetry instead of approximate custom renders.
- Highlight system: Curated final local highlights under `artifacts/highlights/v2-fastf1-final-20260606`, including selected GPU ES traces, learned-policy promotion telemetry, a 1000-trace learned-policy swarm, and GIF exports.
- Validation and preservation: Preserved bulk artifacts externally, kept curated highlights local, and completed final validation with ruff, pyright, pytest, hardware checks, CUDA availability, and Warp torch interop smoke tests before committing and pushing the Physics V2 pipeline.
- Impact and outcome: Converted the project into a serious racing AI system: explicit versioned physics, real telemetry calibration, GPU-scale search, verified trajectory datasets, behavior cloning, SAC learned-policy promotion, replayable evidence, and a learned policy that beats the internal V2 FastF1-derived simulator benchmark.
- What made it hard: The hard part was not calling PPO, running evolution, or training a network. It was building a simulator where physics mattered, preserving benchmark contracts across V1/V2, designing state and reward signals, scaling GPU search, verifying GPU candidates on CPU, exporting trustworthy datasets, preserving fast behavior through BC/SAC, and producing a final learned policy that survives deterministic normal-start evaluation.
Tags: Reinforcement Learning, Physics Simulation, Gymnasium, Stable-Baselines3, PPO, SAC, Behavior Cloning, PyTorch, CUDA, GPU Evolutionary Search, Evolutionary Algorithms, Genetic Algorithms, Controller Search, Policy Optimization, Dataset Distillation, Simulation, Physics V2, Vehicle Dynamics, Tire Slip Angle, Tire Force Curve, Weight Transfer, Torque Curve, Brake Bias, Surface Modeling, Curb Physics, Collision Detection, Bicycle Model, Control, Racing AI, Formula 1, Monza, FastF1, OpenF1, Telemetry, Replay Systems, OpenCV, Python, NumPy, Pygame, Ray-Cast Sensors, Reward Shaping, Reward Function Tuning, Multi-Profile Scoring, Mutation, Crossover, Elite Selection, Survival Floors, Curriculum Learning, State Libraries, Benchmarking, Experiment Tracking, Long-Horizon Control, ML Systems, Optimization
Link: https://github.com/KushagraBharti/F1-ReinforcementLearning
Thumbnail: /portfolio/projects/f1-optimization.png
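Illustrative sketch for the search strategy and correctness contract in the F1 project above: elite-preserving evolutionary search produces proposals with a cheap scorer, and a stricter scorer re-checks a shortlist before anything is promoted. This is a toy under stated assumptions, not the repository's code. `fast_score` stands in for the GPU scorer, `trusted_score` for the CPU replay, and every name, objective, and parameter here is hypothetical.

```python
import random


def fast_score(genome):
    """Cheap proposal score (lower is better); stands in for a GPU batch scorer."""
    return sum((g - 0.5) ** 2 for g in genome)


def trusted_score(genome):
    """Stricter check; stands in for a CPU replay. Returns None for an invalid lap."""
    if any(g < 0.0 or g > 1.0 for g in genome):
        return None
    return sum((g - 0.5) ** 2 for g in genome)


def evolve(pop_size=40, genes=4, generations=30, elite=4, sigma=0.1, seed=0):
    """Elite selection + one-point crossover + Gaussian mutation on a toy objective."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(genes)] for _ in range(pop_size)]
    history = []  # best proposal score seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fast_score)
        history.append(fast_score(pop[0]))
        elites = pop[:elite]  # elites survive unchanged
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.sample(elites, 2)
            cut = rng.randrange(1, genes)
            children.append([g + rng.gauss(0.0, sigma) for g in a[:cut] + b[cut:]])
        pop = elites + children
    return pop, history


def rerank(proposals, keep=5):
    """Shortlist by the cheap score, then keep only candidates the trusted check accepts."""
    shortlist = sorted(proposals, key=fast_score)[:keep]
    checked = [(trusted_score(g), g) for g in shortlist]
    valid = [(s, g) for s, g in checked if s is not None]
    return sorted(valid, key=lambda pair: pair[0])


population, history = evolve()
promoted = rerank(population)
```

Because elites are carried over unchanged, the best proposal score never gets worse between generations, and a candidate that only looks good to the cheap scorer is dropped at the rerank step.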
IMC Prosperity 4 Quant Trading Competition
Summary: A top-6%-worldwide IMC Prosperity 4 trading system spanning fair-value market making, options pricing, residual signals, DP oracles, and replay diagnostics.
Highlights:
- Context: Competed in IMC Prosperity 4, a multi-round global algorithmic trading competition with both algorithmic and manual trading components, finishing top 6% worldwide with 203,249 XIREC.
- Leaderboard result: Finished #1088 overall, #1404 algorithmic, #893 manual, and #295 country, using a combination of algorithmic strategies, manual puzzle optimization, and round-by-round research iteration.
- Problem: Each round introduced new products, market mechanics, position limits, state constraints, and feedback windows; the challenge was to build strategies that could capture edge without overfitting local or portal-window artifacts.
- Research architecture: Engineered the repo as a reproducible quant research system with round-scoped strategies, final submissions, official feedback archives, replay logs, candidate scorecards, diagnostic scripts, process notes, and strategy-lineage records.
- Round 1 / 2 strategy: Built fair-value market-making engines for ASH_COATED_OSMIUM and INTARIAN_PEPPER_ROOT, combining stationary fair value, top-book imbalance, inventory-skewed quoting, drift/carry accumulation, visible fills, and position-limit enforcement.
- Round 2 mechanism design: Modeled the Market Access Fee and manual allocation challenge as expected-value optimization problems, separating algorithmic trading edge from manual puzzle PnL and fee mechanics.
- Round 3 strategy: Built multi-asset options/fair-value logic for HYDROGEL_PACK, VELVETFRUIT_EXTRACT, and VEV vouchers using dynamic fair-value estimation, underlying-implied voucher anchors, Black-Scholes-style pricing, strike-specific volatility assumptions, expiry decay, inventory skew, and selective take/quote thresholds.
- Round 4 strategy: Added volatility-smile diagnostics, implied-volatility reasoning, VFE/voucher role audits, participant-flow signals, Mark-driven mechanics, stale-inventory analysis, and anti-overfit labels for distinguishing robust improvements from portal-window-only gains.
- Round 5 strategy: Scaled to a 50-product universe with residual/stat-arb strategies across Galaxy Sounds, Sleep Pods, Microchips, Pebbles, Robots, UV Visors, Translators, Panels, Oxygen Shakes, and Snack Packs.
- Round 5 alpha families: Built PEBBLES synthetic fair value, TRANSLATOR/PEBBLES anchors, MICROCHIP/PANEL/UV/SLEEP/OXYGEN relative value, ROBOT/GALAXY momentum, product-specific reversal, category-relative residuals, and selective passive/taker execution.
- Dynamic programming research: Built DP hindsight oracles over discrete inventory states to compute optimal trade schedules, upper-bound execution paths, and missed-edge attribution under inventory and fill constraints.
- Signal distillation: Used DP oracle outputs to study causal z-score, anti-trend, imbalance, residual-spread, and category-relative signals without pretending the hindsight oracle itself was a valid live strategy.
- Backtesting and validation: Ran hundreds of candidate/probe backtests using Kevin/Xeeshan/Rust-style replay tooling, official-window extraction, product/category/block PnL attribution, fill-sequence comparison, inventory-path inspection, drawdown diagnostics, and state-size checks.
- State constraint engineering: Diagnosed official/local replay mismatch from oversized traderData; later candidates used aliases, delta-encoded integer histories, half-tick scaling, residual scaling, cache trimming, and state-cap checks to survive the 50,000-character official traderData ceiling.
- Overfit control: Explicitly rejected strategies that looked strong in portal windows but failed full-history replay, including candidates where aggressive local PnL collapsed under broader replay or transfer-risk checks.
- Metrics and outcomes: Tracked archived algorithmic results across final/best official artifacts including 89,306.8125 Round 1 algo PnL, 80,708 displayed Round 2 algo PnL after fees, 76,114.025390625 Round 3 algo PnL, 50,966.40673828125 Round 4 algo PnL, and 118,855.008789062 best stored Round 5 official algo PnL.
- Manual optimization: Optimized manual decisions separately from algorithmic strategy, including a Round 1 manual portfolio worth 87,995.10 PnL and a 1st-place manual rank, plus a Round 2 allocation worth 164,664 manual PnL.
- Impact and outcome: Built a serious quant research workflow rather than a one-file trading bot: strategy design, oracle research, signal generation, candidate promotion, replay diagnostics, drawdown analysis, and official constraint handling were all part of the system.
- What made it hard: The challenge was not just finding profitable signals; it was separating robust alpha from overfit, simulator quirks, state-size failures, fill-model mismatch, regime transfer, and feedback-window traps.
Tags: Quantitative Trading, Backtesting, Algorithmic Trading, IMC Prosperity, Top 6% Worldwide, Global Competition, Dynamic Programming, DP Hindsight Oracle, Market Making, Inventory-Skewed Quoting, Fair Value Modeling, Black-Scholes, Options Pricing, Volatility Smile, Statistical Arbitrage, Residual Signals, Drift / Carry, PnL Attribution, Drawdown Analysis, Inventory Management, Execution Strategy, Market Microstructure, Signal Research, State Serialization, Python, Data Analysis, Research Infrastructure
Link: https://github.com/KushagraBharti/IMC-Prosperity-4
Thumbnail: /portfolio/projects/imc-prosperity.png
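Illustrative sketch for the DP hindsight oracle in the IMC Prosperity project above: a dynamic program over discrete inventory states that finds the best achievable PnL on a known price path under a position limit. It is a simplified assumption-laden version (single product, trades fill at the observed price, an optional flat per-unit fee, leftover inventory marked at the last price). The function name and fill model are hypothetical, and, as the highlights note, an oracle like this is an upper bound for research, not a live strategy.

```python
def hindsight_oracle(prices, limit, fee=0.0):
    """Best final PnL with inventory constrained to [-limit, limit].

    State: current inventory. Transition: move to any allowed inventory at the
    current price, paying `fee` per unit traded. Cost is O(T * (2*limit+1)^2).
    """
    neg_inf = float("-inf")
    states = range(-limit, limit + 1)
    # best[inv] = max cash reachable while holding `inv` units
    best = {inv: (0.0 if inv == 0 else neg_inf) for inv in states}
    for price in prices:
        nxt = {inv: neg_inf for inv in states}
        for inv, cash in best.items():
            if cash == neg_inf:
                continue  # unreachable state
            for target in states:
                qty = target - inv  # positive = buy, negative = sell
                value = cash - qty * price - fee * abs(qty)
                if value > nxt[target]:
                    nxt[target] = value
        best = nxt
    last = prices[-1]
    # Mark any remaining inventory to the final price.
    return max(cash + inv * last for inv, cash in best.items())


print(hindsight_oracle([10, 12, 9, 11], limit=1))           # 7.0
print(hindsight_oracle([10, 12, 9, 11], limit=1, fee=1.0))  # 2.0
```

In the first call the oracle goes long before the rise, flips short before the drop, and flips long again, earning 2 + 3 + 2 = 7. With a fee of 1 per unit, those five traded units cost 5, so the best net result falls to 2.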
Pact
Summary: A hackathon-winning mobile accountability app where users turn goals into escrow-backed commitments, submit proof, and rely on trusted validators.
Highlights:
- Context: Built Pact as a consumer accountability product where users turn personal goals into financially backed commitments instead of vague intentions.
- Problem: Most accountability apps rely on reminders or social pressure; Pact makes accountability concrete by combining financial stakes, proof submission, trusted validators, and explicit outcome resolution.
- Product architecture: Designed the full pact lifecycle from creation to stake lock, proof upload, validator voting, stake release/forfeit, cancellation, and resolved-state lockout.
- On-chain architecture: Implemented Solana escrow flows on devnet/mainnet so commitment stakes could be connected to actual on-chain escrow behavior rather than only mocked UI state.
- Backend architecture: Built Fastify APIs with Supabase Auth, PostgreSQL, and Supabase Storage for authenticated users, pact state, validator membership, proof image uploads, and lifecycle enforcement.
- Core implementation: Implemented pact creation, stake flow integration, proof-image submission, validator voting, majority-based resolution, creator self-vote prevention, membership checks, and resolved/cancelled pact lockout rules.
- Security and correctness: Enforced critical state transitions server-side so protected actions never trust client-supplied user identity, creator IDs, validator IDs, pact state, proof ownership, or resolution authority.
- UX design: Built a React Native/Expo mobile interface around commitment status, financial stakes, proof review, validator transparency, pending actions, and clear next-step prompts.
- Impact and outcome: Shipped Pact as a complete mobile hackathon product and won 1st place in the HackSMU Solana Track.
- What made it hard: The project had to coordinate mobile UX, backend state, validator logic, proof artifacts, and on-chain escrow semantics while preventing users from bypassing lifecycle rules from the client.
Tags: React Native, Full-Stack Mobile, TypeScript, Expo, Expo Router, Fastify, Supabase, PostgreSQL, Supabase Auth, Supabase Storage, Solana, On-Chain Escrow, Web3, SPL Token, Mobile Development, Full-Stack Development, FinTech, Consumer Social, Product Engineering, State Machines, Server-Side Validation, System Design, Hackathon Winner, 1st Place
Link: https://github.com/KushagraBharti/Pact
Thumbnail: /portfolio/projects/pact.png
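Illustrative sketch for the Pact lifecycle rules above: server-side vote handling with creator self-vote prevention, validator membership checks, majority-based resolution, and lockout once a pact is resolved. The actual backend is Fastify/TypeScript; this Python version is only a minimal model, and the class, state names, and tie rule (a full vote with no majority fails) are assumptions made for the example.

```python
from dataclasses import dataclass, field


class PactError(Exception):
    """Raised when a lifecycle rule rejects an action."""


@dataclass
class Pact:
    creator_id: str
    validator_ids: frozenset
    state: str = "active"  # hypothetical states: active, succeeded, failed, cancelled
    votes: dict = field(default_factory=dict)  # validator_id -> bool


def cast_vote(pact, voter_id, approve):
    """Record a vote and resolve the pact once a strict majority exists.

    `voter_id` is assumed to come from the authenticated session on the
    server, never from a client-supplied field.
    """
    if pact.state != "active":
        raise PactError("pact is resolved or cancelled")
    if voter_id == pact.creator_id:
        raise PactError("creator cannot vote on their own pact")
    if voter_id not in pact.validator_ids:
        raise PactError("voter is not a validator for this pact")
    pact.votes[voter_id] = approve
    total = len(pact.validator_ids)
    yes = sum(1 for v in pact.votes.values() if v)
    no = len(pact.votes) - yes
    if yes * 2 > total:
        pact.state = "succeeded"
    elif no * 2 > total:
        pact.state = "failed"
    elif len(pact.votes) == total:
        pact.state = "failed"  # assumption: a tie does not release the stake
    return pact.state


pact = Pact(creator_id="creator", validator_ids=frozenset({"v1", "v2", "v3"}))
cast_vote(pact, "v1", True)   # still "active"
cast_vote(pact, "v2", True)   # "succeeded"; later votes are rejected
```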
Go Web Crawler
Summary: A high-concurrency Go web crawler with worker pools, host-aware frontier scheduling, PostgreSQL persistence, and 100,000-page public-web crawl runs.
Highlights:
- Context: Built a systems-heavy web crawler to crawl large web graphs, extract page content, preserve discovery structure, and make crawl runs inspectable instead of treating crawling as a black-box HTTP loop.
- Problem: A naive crawler can overload one domain, duplicate URLs, lose frontier state, hide failures, or produce unreadable output; Arachne needed bounded concurrency, host-aware scheduling, canonical deduplication, structured persistence, and reproducible benchmarks.
- Crawler architecture: Built the crawler core in Go around goroutine worker pools, channel-based scheduling, global/per-host concurrency controls, and a host-partitioned breadth-first frontier to avoid single-domain starvation while expanding large web graphs.
- Traversal model: Modeled crawl expansion as BFS-style frontier traversal: fetched pages emit outgoing links, links are canonicalized and deduplicated, new tasks are assigned to host queues, and workers pull tasks through bounded scheduling controls.
- Core implementation: Implemented HTTP fetching, HTML link extraction, canonical URL normalization, duplicate suppression, frontier scheduling, page/content capture, discovery-edge construction, crawl error recording, and run-level metadata.
- Persistence architecture: Persisted frontier state, fetched pages, extracted content, graph edges, crawl errors, and run metadata in PostgreSQL so crawl datasets can be queried, resumed, and analyzed instead of existing only as transient memory.
- Evaluation and benchmarks: Benchmarked public-web crawling at 10,003 pages in 15.39s, roughly 649.8 pages/sec, and separately completed 100,000 successful HTML pages.
- Impact and outcome: Turned the project from a standard crawler demo into a crawl dataset system with explicit graph structure, persistence, throughput metrics, and observable failure surfaces.
- Technical depth: The high-signal engineering is in the scheduler: goroutines, channels, breadth-first frontier semantics, host partitioning, canonical dedupe, bounded worker pools, and persistent graph/data state.
- What made it hard: Large public-web crawling breaks simple assumptions quickly; the crawler has to handle duplicate pressure, frontier growth, host imbalance, fetch failures, HTML variance, persistence overhead, and benchmark reproducibility.
Tags: Go Concurrency, Goroutines, Go, Channels, Concurrency, HTTP, BFS Frontier, Frontier Scheduling, Worker Pools, Host-Partitioned Scheduling, URL Canonicalization, Deduplication, HTML Parsing, PostgreSQL, Graph Data, Large-Scale Web Data, Benchmarking, Systems Engineering, Backend Engineering, Performance Engineering, TypeScript, Next.js, React, Server-Sent Events
Link: https://github.com/KushagraBharti/Web-Crawler-Go
Thumbnail: /portfolio/projects/arachne.png
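Illustrative sketch for the crawler's traversal model above: URLs are canonicalized, deduplicated, and assigned to per-host queues, and tasks are handed out by rotating across hosts so one domain cannot monopolize the frontier. The crawler itself is written in Go with goroutines and channels; this single-threaded Python version only models the data structure. The class name and the specific normalization rules (lowercase host, drop default ports and fragments) are assumptions, not the project's exact rules.

```python
from collections import OrderedDict, deque
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url):
    """Normalize a URL so trivially different spellings deduplicate."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    default = (scheme == "http" and port == 80) or (scheme == "https" and port == 443)
    if port and not default:
        host = f"{host}:{port}"
    path = parts.path or "/"
    return urlunsplit((scheme, host, path, parts.query, ""))  # fragment dropped


class HostFrontier:
    """FIFO queue per host, served round-robin across hosts."""

    def __init__(self):
        self.queues = OrderedDict()  # host -> deque of canonical URLs
        self.seen = set()

    def push(self, url):
        url = canonicalize(url)
        if url in self.seen:
            return False  # duplicate suppressed
        self.seen.add(url)
        host = urlsplit(url).netloc
        self.queues.setdefault(host, deque()).append(url)
        return True

    def pop(self):
        if not self.queues:
            return None
        host, queue = next(iter(self.queues.items()))
        url = queue.popleft()
        del self.queues[host]
        if queue:
            self.queues[host] = queue  # move this host to the back of the rotation
        return url


frontier = HostFrontier()
for link in ["http://a.com/1", "http://a.com/2", "http://b.com/1", "HTTP://A.com:80/1#top"]:
    frontier.push(link)
order = [frontier.pop() for _ in range(3)]
# order alternates hosts: a.com/1, b.com/1, a.com/2
```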
NovelBench
Summary: A live multi-stage LLM benchmark where frontier models generate, critique, revise, and vote on creative prompts under pressure.
Highlights:
- Context: Built NovelBench to evaluate whether LLMs can produce, critique, revise, and judge creative ideas through a structured workflow rather than a single one-shot answer.
- Problem: Many creative AI comparisons are anecdotal or one-turn; NovelBench creates a repeatable arena where models face the same prompts, critique each other anonymously, revise under feedback, and vote on outputs.
- Benchmark architecture: Designed a multi-stage pipeline where 2–8 models generate ideas, anonymously critique peer outputs, revise with aggregated feedback, and vote on final submissions.
- Evaluation design: The workflow tests several capabilities at once: original idea generation, critique quality, feedback integration, self-revision, comparative judgment, and resistance to model-identity bias.
- Production architecture: Built durable run orchestration around append-only events, persisted artifacts, leaderboard snapshots, archive pages, detailed run pages, replayable traces, and stage-level failure tolerance.
- Model execution: Integrated OpenRouter-backed model calls with structured output normalization, model selection, workflow retries, scoring logic, failure handling, and leaderboard updates.
- Public deployment: Launched NovelBench as a live public benchmark site, currently tracking 42 benchmark runs, 219 generated ideas, 1,033 critiques written, and 17 total models.
- Product surface: Built the public Next.js interface with landing pages, live arena flows, searchable archive, leaderboard views, and detailed run pages so benchmark results can be inspected rather than hidden behind logs.
- Impact and outcome: NovelBench turns model creativity into a traceable multi-stage evaluation process, making it possible to compare models on generation, critique, revision, and judging behavior instead of vibe-based screenshots.
- What made it hard: The benchmark has to coordinate multiple models, preserve anonymity, normalize structured outputs, survive partial model failures, keep event history durable, and produce rankings that remain understandable to users.
Tags: LLM Evaluation, Evaluation Infrastructure, TypeScript, Next.js, React, Convex, OpenRouter, LLMs, AI Benchmarking, Creative Evaluation, Multi-Model Evaluation, Workflow Orchestration, Realtime Systems, Structured Outputs, Anonymous Critique, Model Voting, Leaderboard Systems, Prompt Engineering, Product Engineering, Research Infrastructure, Solo Project
Link: https://github.com/KushagraBharti/NovelBench
Thumbnail: /portfolio/projects/novel-bench.png
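Illustrative sketch for NovelBench's anonymous critique and voting stages above: submissions are shown under shuffled neutral labels so judges cannot see which model wrote what, and votes are mapped back to models only at tally time. This is a minimal model of the idea, not the project's implementation (which is TypeScript). The label format, function names, and tie-break order are hypothetical.

```python
import random
from collections import Counter


def anonymize(submissions, seed):
    """Map model names to neutral labels in a shuffled order.

    `submissions` is a dict of model name -> generated text.
    Returns (blinded, label_to_model); only `blinded` is shown to judges.
    """
    rng = random.Random(seed)
    models = sorted(submissions)
    rng.shuffle(models)
    label_to_model = {f"Entry {chr(ord('A') + i)}": m for i, m in enumerate(models)}
    blinded = {label: submissions[m] for label, m in label_to_model.items()}
    return blinded, label_to_model


def tally(votes, label_to_model):
    """Count votes per model; unknown labels are ignored, ties break by name."""
    counts = Counter(label_to_model[v] for v in votes if v in label_to_model)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


blinded, mapping = anonymize({"model-x": "idea 1", "model-y": "idea 2"}, seed=1)
ranking = tally(list(blinded), mapping)  # one vote per entry, so a tie
```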
AutoHDR ML Lens Correction
Summary: A geometry-first computer vision system for automatic lens correction, combining a staged ResNet34 hybrid CNN with Brown-Conrady camera geometry.
Highlights:
- Context: Built AutoHDR as a competition-grade computer vision system for automatic lens correction on paired distorted/corrected image data.
- Problem: Pure image-to-image models can learn visually plausible corrections without respecting camera geometry; AutoHDR combined analytic lens distortion modeling with learned residual correction to keep the system grounded in optical structure.
- Model architecture: Trained a staged ResNet34 hybrid CNN with two prediction paths: a Brown-Conrady parameter head for global camera distortion coefficients and a 2-channel residual-flow decoder for local learned displacement.
- Geometry architecture: Fused analytic Brown-Conrady camera geometry and learned residual flow into a differentiable grid_sample warp, making lens correction a structured geometric transformation rather than unconstrained pixel generation.
- Training pipeline: Trained on 23,118 paired images using cloud H200 compute, moving through param-only calibration, hybrid training, and final fine-tuning stages.
- Inference pipeline: Ran deterministic full-batch inference over 1,000 test images with safety/fallback checks, submission packaging, and reproducible output generation.
- Evaluation and benchmarks: Scored 89.42 on the leaderboard, placing roughly top 25 to top 30 out of about 200 participants, and earned CTO review for the model architecture and training approach.
- Quality and validation: Measured stage progression during cloud training, including validation-loss reduction and training-warning cleanup, and validated the codebase with 108/108 passing tests across geometry contracts, inference fallback behavior, training hooks, QA tooling, and submission flows.
- Impact and outcome: Built a serious geometry-guided ML system that treated training, inference, scoring, QA, and packaging as one product pipeline rather than a visual demo.
- What made it hard: The technical challenge was balancing analytic camera geometry with learned correction capacity while keeping the warp differentiable, stable, reproducible, and strong enough to perform on hidden test images.
Tags: Computer Vision, ResNet34 CNN, Deep Learning, PyTorch, CNNs, ResNet34, Image Geometry, Lens Distortion Correction, Brown-Conrady Model, Optical Flow, Residual Flow, Warping, grid_sample, Model Training, Staged Training, Benchmarking, H200, Reproducible Systems, Testing, QA Tooling, Competition Engineering
Link: https://github.com/KushagraBharti/AutoHDR-LensCorrection
Thumbnail: /portfolio/projects/autohdr-ml-lens-correction.png
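Illustrative sketch for the geometry described in the AutoHDR project above: the Brown-Conrady model maps an ideal normalized image coordinate to its distorted position using radial coefficients (k1, k2, k3) and tangential coefficients (p1, p2), and a learned residual displacement can be added on top before sampling. The coefficient convention follows the common form used by OpenCV. Adding the residual directly to the analytic coordinate is an assumption made for this example, and the project's actual warp is built on PyTorch `grid_sample` rather than this scalar code.

```python
def brown_conrady(x, y, k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    """Distort one normalized point (x, y) with radial + tangential terms."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return xd, yd


def hybrid_sample_point(x, y, coeffs, residual=(0.0, 0.0)):
    """Where to sample the distorted image for corrected-image point (x, y).

    Hypothetical composition: analytic Brown-Conrady position plus a small
    learned per-pixel offset (the 2-channel residual flow).
    """
    xd, yd = brown_conrady(x, y, **coeffs)
    return xd + residual[0], yd + residual[1]


# With all coefficients at zero the mapping is the identity.
print(brown_conrady(0.3, -0.2))
# Positive k1 pushes points away from the center in proportion to r^2.
print(hybrid_sample_point(1.0, 0.0, {"k1": 0.1}, residual=(0.01, 0.0)))
```

Keeping the global coefficients separate from the local residual means the residual only has to explain what the analytic model cannot.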
Beyond Chat
Summary: An artifact-first AI workspace with specialized studios, tool-calling runs, durable context, model comparison, and storage-backed outputs.
Highlights:
- Context: Built Beyond Chat as an artifact-first AI workspace, not a single chat box, with dedicated workflows for chat, writing, research, image, data, finance, artifacts, settings, and model comparison.
- Problem: Traditional chat apps discard most value after the conversation ends; Beyond Chat turns useful model outputs into durable, searchable, reusable artifacts that can be attached to later workflows.
- Product architecture: Designed the product around specialized studios, saved artifacts, reusable context, provider-aware runs, tool traces, storage-backed files, and cross-studio handoffs.
- Agentic architecture: Implemented explicit run records, step timelines, provider metadata, tool-call metadata, failure states, artifact provenance, and saved outputs so workflows are traceable instead of opaque.
- Context system: Built Context Builder as the connective layer across studios, letting users attach saved artifacts to Chat, Compare, Research, Writing, Image, Data, and Finance workflows.
- Model orchestration: Engineered OpenRouter-backed chat and compare flows with model selection, parallel model comparison, streaming responses, retry handling, selected-result handoffs, and one-click artifact saves.
- Research workflow: Built Research Studio around Exa search, producing source-backed research reports with visible workflow steps and explicit provider-missing behavior when search is unavailable.
- Data workflow: Built Data Studio with Supabase Storage uploads, workspace-scoped file paths, CSV/XLS/XLSX parsing, preview/profile generation, analysis, chart/table rendering, and separate artifact saves.
- Writing workflow: Built Writing Studio with launch templates, targeted edit mode, bounded context, document generation, assistant suggestions, Compare integration, and provenance-linked saves.
- Infrastructure: Implemented Supabase-authenticated runtime infrastructure across FastAPI and React, including JWT-backed request context, workspace bootstrap, ownership checks, private storage, signed URLs, and RLS hardening.
- Deployment: Fully deployed the product through Vercel, Railway, and Supabase, with working demos for the core studios, provider integrations, artifact flows, and authenticated storage-backed workflows.
- Operational features: Added provider status reporting, disconnected-safe UI states, Stripe billing endpoints, export-to-Markdown/PDF, multi-artifact bundle export, and failure-aware UI behavior.
- Impact and outcome: Beyond Chat demonstrates a complete AI product architecture around durable artifacts, tool-calling runs, provider orchestration, reusable context, and multi-studio workflows rather than throwaway chat messages.
- What made it hard: The hard part was coordinating auth, storage, model providers, file parsing, context reuse, streaming UX, artifact provenance, workflow state, and provider failures into one coherent AI workspace.
Tags: RAG, Artifact Systems, TypeScript, React, Vite, Tailwind CSS, Python, FastAPI, Supabase, PostgreSQL, Supabase Auth, Supabase Storage, Vercel, Railway, OpenRouter, Exa, Stripe, LLMs, Agentic AI, AI Agents, Tool Calling, Workflow Orchestration, Context Engineering, Model Comparison, Data Analysis, Full-Stack Development, System Design, Product Engineering
Link: https://github.com/KushagraBharti/Beyond-Chat
Thumbnail: /portfolio/projects/beyond-chat.png
Kaggle Titanic ML
Summary: A complete Titanic ML pipeline covering data cleaning, feature engineering, model comparison, EDA, and a documented learning report.
Highlights:
- Context: Built this as a foundational machine-learning project while learning the complete supervised ML workflow from scratch in JupyterLab.
- Problem: The goal was not to hide behind a final score; it was to understand and document the full path from messy tabular data to cleaned features, trained models, comparisons, mistakes, and lessons.
- Data workflow: Loaded the Titanic dataset, inspected missingness, cleaned values, encoded categorical features, extracted passenger titles, and engineered features such as FamilySize and IsAlone.
- Exploratory analysis: Studied survival patterns across passenger class, sex, age, family structure, fare, embarkation, and title-derived social signals using notebook-driven EDA.
- Modeling workflow: Trained and compared Logistic Regression, SVM, KNN, Decision Tree, Random Forest, Naive Bayes, Perceptron, and SGD models using scikit-learn.
- Evaluation and benchmarks: Reached roughly 86.76% training accuracy with Decision Tree / Random Forest baselines while using comparisons to understand bias, variance, overfitting, and model behavior.
- Documentation: Wrote the project as a full learning notebook with notes, reasoning, experiments, mistakes, and observations rather than only publishing final cleaned code.
- Impact and outcome: This project became the foundation for later ML work by forcing the full mental model: data cleaning, feature engineering, EDA, model selection, evaluation, and communication.
- What made it valuable: The project is intentionally fundamental; it shows the building blocks behind later ML systems and preserves the learning process instead of pretending the first ML project was advanced research.
Tags: Machine Learning, Feature Engineering, Foundational ML, Kaggle, Titanic, Jupyter, Python, pandas, scikit-learn, Data Cleaning, EDA, Model Comparison, Random Forest, Decision Tree, Logistic Regression, SVM, KNN, Naive Bayes, Matplotlib, Seaborn, Learning Notes
Link: https://github.com/KushagraBharti/Kaggle-Titanic-Solution
Thumbnail: /portfolio/projects/kaggle-titanic-ml.png
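Illustrative sketch for the Titanic feature engineering above: deriving FamilySize, IsAlone, and a passenger title from one raw row. The column names match the public Kaggle dataset. The formulas shown (FamilySize = SibSp + Parch + 1, IsAlone when FamilySize is 1, title taken from the text between the comma and the first period in Name) are the common convention and are assumed here rather than copied from the notebook.

```python
import re


def engineer(row):
    """Return a copy of `row` with FamilySize, IsAlone, and Title added."""
    family_size = row["SibSp"] + row["Parch"] + 1  # siblings/spouses + parents/children + self
    match = re.search(r",\s*([^.]+)\.", row["Name"])
    title = match.group(1).strip() if match else "Unknown"
    return {
        **row,
        "FamilySize": family_size,
        "IsAlone": int(family_size == 1),
        "Title": title,
    }


passenger = {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0}
print(engineer(passenger))
# {'Name': 'Braund, Mr. Owen Harris', 'SibSp': 1, 'Parch': 0,
#  'FamilySize': 2, 'IsAlone': 0, 'Title': 'Mr'}
```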
PseudoLawyer
Summary: An AI-powered contract negotiation platform with real-time multi-party chat, an AI mediator, and contract drafting from negotiation history.
Highlights:
- Built a Next.js 15 + Supabase platform where two parties can negotiate in a shared real-time chat and produce a final contract draft from the conversation.
- Implemented Supabase Auth, Postgres persistence, and Realtime subscriptions for multi-party negotiation sessions.
- Built Sudo, an AI mediator powered through OpenRouter / Claude 3.5 Sonnet, to respond to negotiation context and help parties move toward agreement.
- Implemented a trigger-based Ask Sudo flow so the mediator can be invoked when the conversation needs help instead of constantly interrupting the chat.
- Built a contract generation route that loads the template, participants, and recent negotiation messages before drafting the final agreement.
- Modeled the domain with profiles, templates, negotiations, participants, messages, and contracts in a 6-table Postgres schema.
- Measured OpenRouter chat-completion latency at p50/p95 = 897ms/981ms over 5 runs for the AI layer.
- Built a polished dark UI with role-based flows, dashboard pages, negotiation detail views, contract detail views, typing/loading states, and simple download support.
Tags: Next.js 15, TypeScript, Supabase, Supabase Auth, Supabase Realtime, PostgreSQL, OpenRouter, Claude 3.5 Sonnet, LLMs, AI Mediator, Contract Generation, Legal Tech, Tailwind CSS, Framer Motion, Full-Stack Development
Link: https://github.com/KushagraBharti/PseudoLawyer
Thumbnail: /portfolio/projects/pseudo-lawyer.png
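Illustrative sketch for the latency measurement in the PseudoLawyer highlights above: computing p50 and p95 from a small set of timed runs. The project does not state which percentile method it used, so the nearest-rank method here is an assumption, and the sample values are made up for the example rather than taken from the real measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds from five timed calls.
latencies_ms = [900, 850, 980, 870, 920]
print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))  # 900 980
```

With only five samples, nearest-rank p95 is simply the slowest run, which is worth keeping in mind when reading a p95 figure from a small benchmark.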
Personal Portfolio Website
Summary: A full-stack portfolio and personal tracker with a public showcase, private authenticated surfaces, live widgets, and machine-readable content for humans and AI systems.
Highlights:
- Built a public portfolio with a hero section, about content, education, experiences, featured projects, other projects, and dedicated project/experience data models.
- Added a private tracker surface alongside the public site so personal workflows can live behind authentication without leaking into the portfolio experience.
- Centralized content in TypeScript modules so public cards, detailed pages, and AI-facing snapshots stay aligned from the same source of truth.
- Generated an `llms.txt` / structured AI-facing view so the site can be read cleanly by humans and models.
- Integrated live GitHub and weather widgets with backend caching and graceful fallbacks so the homepage remains dynamic without depending on fragile client-side calls.
- Built the app as a responsive React/Vite/Tailwind interface with Express APIs, motion, reusable components, and verification for build, lint, unit, integration, and live flows.
Tags: TypeScript, Node.js, Express, React, Vite, Tailwind CSS, Framer Motion, Bun, Vitest, REST API, Full-Stack Development, API Integration, Live Widgets, System Design, Testing
Link: https://github.com/KushagraBharti/Personal-Site
Thumbnail: /portfolio/projects/personal-site.png
Algorithmic Trading Quantitative Test Environment
Summary: A modular quant testbench for strategy development, backtesting, risk metrics, visualization, and Alpaca paper-trading execution.
Highlights:
- Built a modular quant pipeline covering market data ingestion, feature engineering, strategy signals, transaction-cost-aware backtesting, risk metrics, trade logs, and visualization.
- Integrated Alpaca market data and paper-trading APIs to connect research code with simulated execution workflows.
- Implemented a moving-average crossover strategy and feature layer with returns, moving averages, signals, trades, strategy returns, and cumulative returns.
- Modeled transaction costs in the backtester and computed Sharpe ratio, max drawdown, and final return for consistent strategy evaluation.
- Generated CSV trade logs and Matplotlib equity/signal plots for auditability and faster strategy debugging.
- Benchmarked the full evaluation loop at ~1.54M bars/sec on a 199,951-bar seeded synthetic dataset, showing the research loop is fast enough for iteration.
- Current limitations: slippage modeling, parameter sweeps, position sizing, and walk-forward validation are not fully implemented yet.
Tags: Python, Alpaca API, Pandas, NumPy, Matplotlib, Algorithmic Trading, Backtesting, Paper Trading, Risk Metrics, Sharpe Ratio, Max Drawdown, Quant Research
Link: https://github.com/KushagraBharti/Quant-Test-Environment
Thumbnail: /portfolio/projects/quant-test-environment.png
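Illustrative sketch for the quant testbench above: a moving-average crossover signal, a backtest that charges a transaction cost on every position change, and the Sharpe ratio and max drawdown computed from the resulting equity curve. This is a dependency-free toy, not the repository's pandas pipeline. The window lengths, the proportional cost model, the long-or-flat rule, and the 252-period annualization are all assumptions.

```python
import math


def sma(values, window):
    """Simple moving average; None until enough bars are available."""
    return [
        None if i + 1 < window else sum(values[i + 1 - window:i + 1]) / window
        for i in range(len(values))
    ]


def backtest(prices, fast=2, slow=3, cost=0.001):
    """Long when the fast MA is above the slow MA, flat otherwise."""
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    position, equity, curve = 0, 1.0, [1.0]
    for t in range(1, len(prices)):
        f, s = fast_ma[t - 1], slow_ma[t - 1]  # signal uses the previous bar only
        target = 1 if (f is not None and s is not None and f > s) else 0
        if target != position:
            equity *= 1.0 - cost  # pay the cost when the position changes
            position = target
        equity *= 1.0 + position * (prices[t] / prices[t - 1] - 1.0)
        curve.append(equity)
    return curve


def max_drawdown(curve):
    """Largest peak-to-trough decline as a fraction of the peak."""
    peak, worst = curve[0], 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def sharpe(curve, periods_per_year=252):
    """Annualized mean/std of per-bar returns, with a zero risk-free rate."""
    rets = [b / a - 1.0 for a, b in zip(curve, curve[1:])]
    if len(rets) < 2:
        return 0.0
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return 0.0 if var == 0 else mean / math.sqrt(var) * math.sqrt(periods_per_year)


curve = backtest([1, 2, 3, 4, 5], cost=0.0)
print(round(curve[-1], 4), max_drawdown(curve), sharpe([1.0, 1.0, 1.0]))
```

Using the previous bar's averages to set the position avoids lookahead, which is the usual first thing to verify in a crossover backtest.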
Northstar Agentic Financial Memory Platform
Summary: A memory-first AI wealth-management prototype where a local agent loads durable financial context, portfolio snapshots, and tool traces to produce explainable scenario analysis.
Highlights:
- Built Northstar around durable user context rather than a static portfolio dashboard: onboarding answers compile into readable memory, structured context packets, and a graph of financial preferences.
- Designed the runtime around one visible local agent, North, that preloads memory, context packets, and portfolio snapshots before responding.
- Implemented specialist tools for market/news research, financial data, filings, portfolio context, deterministic scenario checks, and trust receipts.
- Built a chat-first React/Vite interface with quick actions, markdown-rendered answers, a memory transparency modal, and a JSONL-style trace panel.
- Engineered onboarding as a high-signal financial memory compiler: a 44-question intake structures goals, values, risk comfort, communication style, tax context, and approval boundaries.
- Created a graph-first dashboard that visualizes durable user context as memory nodes instead of generic financial charts.
- Added deterministic demo reliability with seeded holdings/transactions/tax lots, simulated Plaid-style import, market-check fallbacks, and capped external calls.
- Built the Express + TypeScript backend with Supabase/Postgres-ready persistence, OpenRouter/OpenAI-compatible model execution, memory/status APIs, agent streaming routes, and local JSON/JSONL mirrors.
- Focused the architecture on agentic workflows, contextual memory, tool use, observability, audit trails, human-in-the-loop guardrails, and grounded retrieval/context packets.
Tags: TypeScript, React, Vite, Express, Supabase, PostgreSQL, OpenRouter, LLMs, Agentic AI, AI Agents, Tool Calling, Contextual Memory, RAG, JSONL Tracing, Observability, Guardrails, FinTech, Portfolio Analytics, Scenario Analysis, Full-Stack Development, System Design
Link: https://github.com/YuvrajKashyap/northstar
Thumbnail: /portfolio/projects/northstar.png
Age & Gender Recognition
Summary: A real-time OpenCV demo that detects faces from video and predicts age/gender using pre-trained Caffe DNN models.
Highlights:
- Built a real-time face detection pipeline using OpenCV’s DNN module.
- Used pre-trained Caffe models for age and gender prediction, with approximate reported accuracy of 71% for gender and 62% for age in the project context.
- Tuned confidence thresholds and padding around detected face regions to improve prediction stability.
- Rendered bounding boxes and prediction labels directly on the video stream for immediate visual feedback.
Tags: Python, OpenCV, DNN, Caffe, Face Detection, Real-Time Processing, Computer Vision
Link: https://github.com/KushagraBharti/Gender-Age-Detection
Thumbnail: /portfolio/projects/age-gender-recognition.png
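The confidence-threshold and padding step mentioned in the Age & Gender Recognition highlights could look something like the sketch below. It assumes the face detector reports a confidence score plus corner coordinates normalized to [0, 1], which is common for SSD-style detectors; the function name, the 0.7 threshold, and the 20-pixel padding are hypothetical values, not the project's tuned settings.

```python
def padded_face_boxes(detections, frame_w, frame_h, conf_threshold=0.7, padding=20):
    """Filter detections by confidence and pad each box, clamped to the frame.

    detections: iterable of (confidence, x1, y1, x2, y2) with coordinates
    normalized to [0, 1]. This input layout is an assumption for the example.
    Returns a list of (left, top, right, bottom) pixel boxes.
    """
    boxes = []
    for conf, x1, y1, x2, y2 in detections:
        if conf < conf_threshold:
            continue  # drop weak detections before running age/gender models
        left = max(0, int(x1 * frame_w) - padding)
        top = max(0, int(y1 * frame_h) - padding)
        right = min(frame_w - 1, int(x2 * frame_w) + padding)
        bottom = min(frame_h - 1, int(y2 * frame_h) + padding)
        if right > left and bottom > top:
            boxes.append((left, top, right, bottom))
    return boxes
```

Padding gives the age and gender classifiers some context around the face, and clamping keeps the crop inside the frame when a face sits near an edge.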
PCB Design Project
Summary: A hardware project where I designed, ordered, assembled, and tested custom PCBs as part of a senior independent project.
Highlights:
- Designed multiple PCBs in EasyEDA, moving from schematic capture to board layout and manufacturing files.
- Ordered boards and components through JLCPCB/LCSC, learning the practical constraints around cost, availability, package type, and manufacturability.
- Worked through design challenges involving ATmega328 variants, SMD/THT parts, capacitive-touch buttons, power routing, and component placement.
- Assembled and tested the boards after delivery, gaining hands-on soldering, debugging, and hardware bring-up experience.
Tags: PCB Design, Circuit Design, EasyEDA, JLCPCB, LCSC, Electronics, Hardware, Soldering, Embedded Systems
Link: https://drive.google.com/drive/folders/1Zpps2I5CSq7O7xIUTwsn9uJrs2zYMwTK?usp=sharing
Thumbnail: /portfolio/projects/pcb-design-project.png
Self-Driving Car Project
Summary: An Arduino-based RC car rebuild with ultrasonic sensors and a custom obstacle-avoidance control loop.
Highlights:
- Repurposed an RC car by rebuilding its internals with an Arduino Uno, motor shield, ultrasonic sensors, and custom wiring.
- Wrote C++ control logic to read ultrasonic distance data and perform obstacle detection/avoidance.
- Learned the hardware/software debugging loop: wiring, sensor noise, motor control, soldering, and physical-world failure cases.
Tags: Arduino, C++, Self-Driving, Autonomous Vehicle, RC Car, Electronics, Ultrasonic Sensors, Hardware
Link: https://drive.google.com/drive/folders/1Ma02iYvhobL4ckcy6yOPA300WvIpOeDD?usp=sharing
Thumbnail: /portfolio/projects/self-driving-car-project.png
CircuitSeer (Circuit Solver)
Summary: A computer vision circuit-analysis tool that detects components, traces wiring, and helps solve simple circuit diagrams.
Highlights:
- Built CircuitSeer through the AI Mentorship Program at UT Dallas as a team project combining object detection and classical computer vision.
- Focused on component recognition using a fine-tuned YOLOv5 model to detect resistors, capacitors, diodes, inductors, and power sources.
- Integrated line-detection work using Canny Edge Detection and Hough Transform to help trace wiring between detected components.
- Connected the detection outputs to downstream logic for simple series/parallel resistance and capacitance analysis.
- Built the project with Python and Flask so users could upload circuit diagrams and receive structured analysis through a web interface.
Tags: Python, YOLOv5, Flask, OpenCV, Computer Vision, Object Detection, Canny Edge Detection, Hough Transform, Circuit Analysis
Link: https://github.com/Hteam121/circuit-seer
Thumbnail: /portfolio/projects/circuit-seer.png
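The downstream series/parallel analysis mentioned in the CircuitSeer highlights rests on standard combination rules, sketched here with hypothetical helper names. How CircuitSeer represents detected components and their connections is not described above, so this example simply takes plain lists of values.

```python
def _check(values):
    """Validate that there is at least one value and all are positive."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("expected one or more positive component values")
    return values


def series_resistance(ohms):
    """Resistors in series add directly."""
    return sum(_check(ohms))


def parallel_resistance(ohms):
    """Resistors in parallel: the reciprocals add."""
    return 1.0 / sum(1.0 / r for r in _check(ohms))


def series_capacitance(farads):
    """Capacitors in series combine like resistors in parallel."""
    return 1.0 / sum(1.0 / c for c in _check(farads))


def parallel_capacitance(farads):
    """Capacitors in parallel add directly."""
    return sum(_check(farads))


if __name__ == "__main__":
    print(series_resistance([100, 220]))
    print(parallel_resistance([100, 100]))
```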
DataDrive: Unified Insights for Data & Fuel Optimization
Summary: A full-stack ML analytics dashboard for exploring Toyota vehicle data, fuel-efficiency predictions, clustering, and interactive visualizations.
Highlights:
- Built a Flask + React analytics dashboard over Toyota vehicle data, combining model-backed predictions with interactive exploration.
- Implemented regression and K-Means clustering pipelines for fuel-efficiency and vehicle-segmentation analysis.
- Built backend API endpoints for prediction, car detail retrieval, and clustering results.
- Added data cleaning, feature engineering, and model evaluation workflows to keep the ML layer reproducible.
- Built interactive React/D3 visualizations and a 3D car viewer to make the model outputs easier to explore.
- Integrated GPT-powered explanation generation and Pinata storage as experimental transparency/auditability features.
Tags: Flask, Python, Machine Learning, Linear Regression, KMeans, React, D3.js, Three.js, Data Analytics, APScheduler, SHAP, OpenAI, Pinata
Link: https://github.com/KushagraBharti/HACKUTD-Data-Drive
Thumbnail: /portfolio/projects/data-drive.png
Maze Traversal
Summary: A small recursive DFS maze solver in Python that traces a path from start to exit through a grid-based maze.
Highlights:
- Implemented a recursive depth-first search solver for mazes represented as nested lists.
- Marked the solution path with directional arrows to visually trace movement from start to exit.
- Added file loading, start-position detection, intermediate maze printing, and execution-time measurement.
- Used the project to practice recursion, backtracking, grid traversal, and simple algorithm visualization.
Tags: Python, Depth-First Search, Recursion, Maze Solving, Backtracking, Algorithms
Link: N/A
Thumbnail: /portfolio/projects/maze-traversal.png
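A recursive DFS solver of the kind described in the Maze Traversal entry can be sketched as below. The cell symbols (`#` for walls, `S` for start, `E` for exit), the function names, and the neighbor order are assumptions for illustration; the original project's file format and symbols are not specified above.

```python
# Map each move (row delta, column delta) to the arrow drawn on the path.
ARROWS = {(-1, 0): "^", (1, 0): "v", (0, -1): "<", (0, 1): ">"}


def find_start(maze, start="S"):
    """Return the (row, col) of the start cell in a nested-list maze."""
    for r, row in enumerate(maze):
        for c, cell in enumerate(row):
            if cell == start:
                return r, c
    raise ValueError("maze has no start cell")


def solve(maze, r, c, visited=None):
    """Depth-first search with backtracking; marks the found path in place."""
    if visited is None:
        visited = set()
    if not (0 <= r < len(maze) and 0 <= c < len(maze[r])):
        return False
    if maze[r][c] == "#" or (r, c) in visited:
        return False
    if maze[r][c] == "E":
        return True
    visited.add((r, c))
    for (dr, dc), arrow in ARROWS.items():
        if solve(maze, r + dr, c + dc, visited):
            # Only cells on the successful path get an arrow, so dead ends
            # explored along the way stay unmarked.
            if maze[r][c] != "S":
                maze[r][c] = arrow
            return True
    return False


if __name__ == "__main__":
    rows = ["#####", "#S  #", "# # #", "#  E#", "#####"]
    maze = [list(row) for row in rows]
    solve(maze, *find_start(maze))
    print("\n".join("".join(row) for row in maze))
```

Each arrow shows the direction taken out of that cell toward the exit. Because the search is recursive, very large mazes can hit Python's recursion limit; an explicit stack avoids that at the cost of slightly more bookkeeping.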
Point Cloud Down Sampler
Summary: A point-cloud processing project comparing a from-scratch voxel downsampler with Open3D’s built-in voxel grid method.
Highlights:
- Implemented a custom voxelization algorithm that assigns 3D points to discrete grid cells by flooring their coordinates after scaling by the voxel size.
- Reduced dense point clouds while preserving overall shape structure for downstream visualization and analysis.
- Compared the custom implementation against Open3D’s high-performance `voxel_down_sample` method.
- Built a small pipeline to load CSV point clouds, convert them into PCD format, downsample, and export processed outputs.
- Used the project to understand the tradeoff between writing geometry code from scratch and relying on optimized library primitives.
Tags: Python, Pandas, Open3D, Voxelization, Point Cloud, Downsampling, 3D Data, Geometry
Link: N/A
Thumbnail: /portfolio/projects/point-cloud-down-sampler.png
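The floor-based voxel grouping described in the Point Cloud Down Sampler entry can be illustrated with a small standard-library sketch. Reducing each occupied voxel to the centroid of its points is an assumption here (it mirrors the averaging behavior Open3D documents for `voxel_down_sample`); the project's own reduction rule and function names may differ.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Group (x, y, z) points by voxel index and return one centroid per voxel."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    cells = defaultdict(list)
    for x, y, z in points:
        # Flooring the scaled coordinate gives the integer grid cell,
        # and it handles negative coordinates correctly.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        cells[key].append((x, y, z))

    downsampled = []
    for key in sorted(cells):  # sorted only to make the output order deterministic
        pts = cells[key]
        count = len(pts)
        downsampled.append(tuple(sum(p[i] for p in pts) / count for i in range(3)))
    return downsampled


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.2, 0.0, 0.0), (-0.1, 0.0, 0.0)]
    print(voxel_downsample(cloud, voxel_size=1.0))
```

Using `math.floor` rather than `int()` matters for negative coordinates: `int(-0.1)` truncates to 0, which would merge points on both sides of the origin into one cell.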
Film and Creative Work
Section Summary: Stories and taste make us human, and I enjoy telling stories through the lens.
Filmmaking Profile:
- A film I made was screened at AMC Theatres in Times Square!
- I love filmmaking, have directed two short films, and have contributed to other productions as a videographer and editor.
- Film Portfolio: https://drive.google.com/file/d/1m3aFLAK4TE29ybbdOzObLS8zrrX3oJwM/view?usp=sharing
01 St. Stephen's Dining Hall Documentary
Slug: st-stephens-dining-hall-documentary
Title: St. Stephen's Dining Hall Documentary
Short Title: Dining Hall Documentary
Subtitle: 2022
Year: 2022
Genre: Documentary
Duration: 10 min
Summary: A documentary on the dining hall staff and the people behind the daily experience.
Description: A documentary following the St. Stephen's dining hall staff from the start of their day to the end, combining observational footage, intimate interviews, and a close look at the full dining hall experience.
Roles: Director, Cinematographer, Editor
Recognition / Notes:
- Nominated for The All-American High School Film Festival 2023.
- Screened at AMC Theatres in New York City.
Type: video
Platform: youtube
Watch URL: https://youtu.be/WM6RvRfDCX4
Embed URL: https://www.youtube-nocookie.com/embed/WM6RvRfDCX4
Actions:
- Watch: https://youtu.be/WM6RvRfDCX4
- Festival Selection: https://www.hsfilmfest.com/2023-official-selections
02 The PB&J Documentary
Slug: the-pbj-documentary
Title: The PB&J Documentary
Short Title: The PB&J Documentary
Subtitle: 2023
Year: 2023
Genre: Documentary
Duration: 19 min
Summary: A comedic documentary about obsession, mentorship, and the perfect PB&J sandwich.
Description: A comedic documentary following Liam and Edison as they chase the perfect PB&J through restaurants, roadside discoveries, and a boutique in San Antonio before the whole mentor-protege dynamic starts to unravel.
Roles: Director, Cinematographer, Editor
Recognition / Notes: N/A
Type: video
Platform: youtube
Watch URL: https://youtu.be/FS8l8G2p7PM
Embed URL: https://www.youtube-nocookie.com/embed/FS8l8G2p7PM
Actions:
- Watch: https://youtu.be/FS8l8G2p7PM
03 RTMS Semesterly Recap
Slug: rtms-recap
Title: RTMS Semesterly Recap
Short Title: RTMS Semesterly Recap
Subtitle: 2018
Year: 2018
Genre: Recap
Duration: 3 min
Summary: A semester photo montage focused on rhythm, pacing, and raw editing craft.
Description: A semesterly recap film from Ras Tanura Middle School built as a photo montage. It has no traditional narrative, but it highlights editing instincts, visual sequencing, and the ability to build momentum through rhythm alone.
Roles: Editor, Photographer, Story Builder
Recognition / Notes: N/A
Type: video
Platform: drive
Watch URL: https://drive.google.com/file/d/1az0x6mwBzTXJEPBC7zhBQk9_DGO_8GwN/view?usp=sharing
Embed URL: https://drive.google.com/file/d/1az0x6mwBzTXJEPBC7zhBQk9_DGO_8GwN/preview
Actions:
- Watch: https://drive.google.com/file/d/1az0x6mwBzTXJEPBC7zhBQk9_DGO_8GwN/view?usp=sharing
Optional
- [Structured portfolio JSON](https://www.kushagrabharti.com/portfolio.json): Static structured portfolio snapshot.
- [Live public portfolio API](https://www.kushagrabharti.com/api/portfolio): Backend-owned public portfolio snapshot.
- [Live public API llms.txt](https://www.kushagrabharti.com/api/portfolio/llms.txt): Runtime-generated plain-text profile.
- [Robots policy](https://www.kushagrabharti.com/robots.txt): Crawler permissions for the public portfolio surface.
- [Sitemap](https://www.kushagrabharti.com/sitemap.xml): Indexable public portfolio URLs.
- [Build metadata](https://www.kushagrabharti.com/version.json): Static export generation metadata and canonical export URLs.