Two decades turning emerging technology into adopted, revenue-generating reality — and a track record of founding and scaling technical teams from the ground up. I've grown a global org from 4 to 130+ architects, engineers and scientists, shipped decision systems influencing $2B+ in annual outcomes, and today lead Applied AI & agentic go-to-market across APJ, based in Sydney.
I work where applied AI meets the enterprise — a technical pre-sales / solutions architect at heart who translates model capability into systems people actually adopt, trust, and pay for.
My career is a through-line of decision intelligence: econometric engines at Amazon, agentic supply-chain planning at AWS, and decision-science platforms at Aera serving the Fortune 1000. I lead from both sides of the table: setting product vision and strategy, then running the deep-dive workshops, POCs and architecture that win the technical decision.
I'm a builder who leads. When a service has gaps, I build the missing pieces myself — data-onboarding pipelines, validators, benchmarking harnesses, agentic prototypes and live grounded-LLM apps — so the customer sees a complete solution. That keeps my technical judgment honest and my strategy grounded in what the technology can truly do.
Rather than describe my work, I shipped it: a conversational work-sample agent I built end-to-end. It is a managed LLM grounded in a private, hot-swappable skill file and retrieval-augmented, so its knowledge can be updated without a redeploy. It's a working POC of exactly the pre-sales pattern I help enterprises build: a real production LLM use case, not a slide. Ask about leadership scope, scaling teams, technical depth, or hands-on AI builds.
Drag the concurrency slider through 120K real inferences on live AWS hardware: Trainium2 vs A10G GPU. Compare p50/p99 latency, throughput and cost per million inferences. 5.5× faster, 54% cheaper at production load.
A live consumer AI SaaS: a Bedrock-backed engine generates top-percentile adaptive test questions with trap distractors, auto-scoring & progress dashboards.
AI-native building-services engineering: a CAD/DXF geometry parser, a plumbing-code rule engine, and automated drawing generation. (The demo page is a sample and may error.)
Shipped production LLM apps grounded in private knowledge stores with retrieval — dataset/SOP generators, pricing tools, and grounded chat agents (this very page is one). Real production LLM use cases, not demos.
Defined decision-engine architecture for systems making 500M+ daily decisions — non-functional requirements, trade-offs, and operational health at 99.9% uptime.
MS Electrical Engineering, USC · BE Electronics & Communications, Gujarat University. Two decades from semiconductor systems to applied-AI leadership.
The hard problem in enterprise AI is no longer capability — it's adoption: turning model capability into systems organisations trust, deploy, and run safely at scale. That's the exact seam I've worked my entire career — and I've done it as a founding team-builder, not just an operator.
Built a technical org from 4 to 130+, every hire one-on-one, across solution architects, engineers and scientists — and stood up 3 greenfield offices. I thrive in unstructured, zero-to-one environments and grow the people in them.
Two decades of technical pre-sales / solutions architecture — use-case scoping, deep-dive workshops, POC execution, technical-champion building and ROI validation. Recognised for "x-ray vision into the core mechanics of a customer's problem" and for building the artifacts that fill product gaps so customers see a complete solution. Executive presence with both engineers and the C-suite.
Sydney-based, leading applied-AI GTM across APJ — from ANZ's regulated industries (financial services, government, resources, healthcare) to high-growth markets in Japan, Korea, India and SEA. I ship real production LLM systems with reliability and safety built in, not bolted on.
Real measured data — 120,000 inferences on live AWS hardware. DistilBERT SST-2 sentiment classification (67M params, seq-len 128), modelled on super-app feedback at scale. AWS Trainium2 (trn2.xlarge, Neuron SDK + torch.compile) vs NVIDIA A10G (g5.xlarge). Total cost to reproduce: ~$3 in 15 min.
The A10G is a general-purpose GPU optimised for training flexibility; every inference still pays for hardware it doesn't use. Trainium2's NeuronCores execute a model compiled ahead of time (torch.compile → Neuron) into fixed instruction streams, with no kernel-launch overhead and deterministic scheduling. The payoff isn't just median speed: at 128 concurrent requests the GPU's p50→p99 spread is 197 ms (991→1,188 ms) while Trainium2's is 10 ms (179→189 ms). That latency consistency is what holds an SLA under load, and at 256 concurrent requests the GPU dropped 29 requests while Trainium2 dropped zero. Unit economics combine instance price and throughput: a lower hourly price and more inferences per second together yield a 54% lower cost per million inferences.
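To make the p50→p99 comparison concrete, here is a minimal sketch of how such a spread can be computed from raw per-request latencies. It is not the benchmark harness itself: the function names `percentile` and `tail_spread` are hypothetical, and the nearest-rank percentile definition is an assumption, since the page does not say which method the harness uses.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile of a list of latencies (assumed method)."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def tail_spread(samples_ms):
    """Return (p50, p99, p99 - p50) in milliseconds."""
    p50 = percentile(samples_ms, 50)
    p99 = percentile(samples_ms, 99)
    return p50, p99, p99 - p50


# Spreads quoted above at 128 concurrent requests (figures from the text).
gpu_spread_ms = 1188 - 991      # A10G: 197 ms
trainium_spread_ms = 189 - 179  # Trainium2: 10 ms
```

A narrow spread means the slowest 1% of requests look much like the typical one, which is the property an SLA depends on under load.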
Methodology & cost basis. Cost per million is computed as hourly_cost ÷ (throughput × 3600) × 10⁶, with throughput in inferences per second and on-demand pricing of $0.758/hr (trn2.xlarge) and $1.006/hr (g5.xlarge). It therefore reflects both instance price and measured throughput at each concurrency level. The headline 54% cheaper corresponds to the 256-concurrent stress test (the paper's reference point); at moderate load the gap is wider, because the hourly price ratio is fixed and the throughput gap between the two instances is larger there. Absolute per-million figures are small by design, since DistilBERT (67M params) is a deliberately lightweight, well-proven workload, so the durable signal for production is throughput-per-dollar and latency-under-load, not the raw cent figure. All numbers are measured on real EC2 instances (120,000 inferences); the harness is open-source and reproducible in ~15 min for ~$3.
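The cost formula above can be sketched in a few lines. This is an illustration under stated assumptions, not the harness code: the function names are hypothetical, throughput is assumed to be in inferences per second, and the only real inputs are the two hourly prices quoted in the text. The throughput ratio it derives is implied by the formula and the 54% headline, not a measured figure.

```python
SECONDS_PER_HOUR = 3600

# On-demand hourly prices quoted in the text (USD).
TRN2_HOURLY_USD = 0.758
G5_HOURLY_USD = 1.006


def cost_per_million(hourly_cost_usd, throughput_per_sec):
    """hourly_cost / (throughput * 3600) * 1e6, as stated in the methodology."""
    if throughput_per_sec <= 0:
        raise ValueError("throughput must be positive")
    inferences_per_hour = throughput_per_sec * SECONDS_PER_HOUR
    return hourly_cost_usd / inferences_per_hour * 1_000_000


def relative_saving(cost_a, cost_b):
    """Fraction by which cost_a undercuts cost_b (0.54 means 54% cheaper)."""
    return 1 - cost_a / cost_b


def implied_throughput_ratio(saving, price_a, price_b):
    """Throughput ratio (a over b) that a given saving implies under the formula."""
    return (price_a / price_b) / (1 - saving)


# Derived, not measured: a 54% saving at these prices implies about 1.64x throughput.
ratio_at_256 = implied_throughput_ratio(0.54, TRN2_HOURLY_USD, G5_HOURLY_USD)
```

Because the price ratio is constant, any change in the cost gap across concurrency levels comes entirely from the throughput ratio, which is why throughput-per-dollar is the figure to watch.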