Geddy Dukes
ML Engineer who builds AI for high-stakes environments. I specialize in systems where hallucinations aren't acceptable and every decision needs an audit trail—from training 67M-parameter models from scratch to shipping production underwriting platforms for $30M portfolios.
What I Bring to Your Team
Most ML engineers can train models. Few understand how to deploy them where errors cost money or violate regulations.
Ship production systems fast
Built a complete financial analysis platform and AI underwriting system while simultaneously managing a $30M loan portfolio. Comfortable across the full stack—PyTorch training loops to React frontends to PostgreSQL schemas. I prototype rapidly and ship incrementally.
Domain expertise in finance
Spent years as a credit analyst underwriting deals up to $3M and managing compliance workflows across ~$100M in programs. I understand lending requirements, risk assessment, and regulatory constraints—not just the ML. This means I can build AI systems that actually work in regulated environments without months of onboarding.
Safety-first, neuro-symbolic architecture
Every production system I've built separates probabilistic extraction from deterministic decisions using neuro-symbolic design patterns. Complete audit trails, provenance tracking, and confidence thresholds at every layer. I know when to use LLMs and when to use rules—and I build the infrastructure to enforce that boundary.
Featured Projects
TinyLLM – Language Model from Scratch
67M parameter transformer trained entirely on consumer hardware — proving deep understanding of architecture, not just API calls.
Built a GPT-style language model from scratch including RoPE positional encodings, RMSNorm, and SwiGLU activations. Designed a custom continual learning system with replay buffers and evaluation gating to prevent catastrophic forgetting. Achieved 93.94% exact-match accuracy on CLI command generation. This project demonstrates first-principles understanding of transformer architecture and training dynamics — the kind of depth that informs how I design and debug production LLM systems.
Key capabilities:
- Full transformer implementation (RoPE, RMSNorm, SwiGLU)
- Continual learning with replay buffers
- Evaluation gating to prevent catastrophic forgetting
- Trained on consumer GPU
- 93.94% exact-match accuracy on CLI commands
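To make the continual-learning design concrete, here is a minimal sketch of the replay-buffer-plus-evaluation-gating pattern. All names are illustrative, not TinyLLM's actual API, and the gate is simplified to a single held-out score:

```python
import random

class ReplayBuffer:
    """Reservoir-style buffer that retains a uniform sample of past training examples."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Reservoir sampling: each example survives with probability capacity/seen
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.items[idx] = example

    def sample(self, k):
        return random.sample(self.items, min(k, len(self.items)))


def gated_update(train_step, evaluate, checkpoint, restore, new_batch, buffer, min_score):
    """Mix new data with replayed examples, then keep the update only if the
    held-out score on prior tasks stays at or above min_score."""
    checkpoint()                       # snapshot weights before the update
    mixed = new_batch + buffer.sample(len(new_batch))
    train_step(mixed)                  # one optimization step on the mixed batch
    if evaluate() < min_score:         # gate: score the held-out suite for old tasks
        restore()                      # regression detected: roll the update back
        return False
    for ex in new_batch:
        buffer.add(ex)
    return True
```

The gate is the important part: an update that degrades old-task accuracy is rejected outright rather than averaged away, which is what prevents catastrophic forgetting from accumulating silently.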
Underwriting Agent – AI-Native Credit Decision Platform
AI underwriting platform designed to reduce loan analysis from weeks to days while maintaining full regulatory compliance for CDFIs.
End-to-end underwriting workspace that automates financial document processing, surfaces risks with complete provenance tracking, and generates audit-ready credit memos. Built for CDFIs where every lending decision requires documentation and regulatory compliance. The challenge: using LLMs for analysis while maintaining the auditability required in financial services. The solution: strict separation between probabilistic document extraction and deterministic decision logic, with every recommendation tied directly to source documents and calculations.
Key capabilities:
- Agent-orchestrated workflow with deterministic + LLM hybrid analysis
- Three-tier OCR pipeline for financial documents
- Automated risk surfacing with full source provenance
- End-to-end auditability for regulatory compliance
- Applicant portals + analyst dashboards
Claims & Policy Decision Engine
Neuro-symbolic decision engine that converts probabilistic LLM outputs into deterministic, auditable decisions for regulated workflows.
Policy and claims decision system designed for environments where "the model said so" isn't good enough. Uses LLMs to interpret unstructured documents, then enforces strict validation, confidence thresholds, and deterministic rule evaluation before any decision is made. Every output includes complete audit trails showing which documents informed which conclusions. A working proof-of-concept for the neuro-symbolic architecture that now underpins the underwriting platform — demonstrating the pattern at a smaller scale before productizing it.
Key capabilities:
- Neuro-symbolic architecture: probabilistic extraction → deterministic validation
- Confidence-gated decision workflows
- Complete document provenance tracking
- Deterministic rule enforcement
- Audit-ready decision logs
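The core pattern is small enough to sketch. The following is an illustrative toy, not the engine's actual code; the field names, thresholds, and the DSCR rule are assumptions chosen for the example:

```python
from dataclasses import dataclass, field

@dataclass
class Extraction:
    """A single LLM-proposed value, carried with its provenance and confidence."""
    field_name: str
    value: float
    source_doc: str
    confidence: float

@dataclass
class Decision:
    outcome: str                       # "approve", "deny", or "review"
    audit_trail: list = field(default_factory=list)

def decide(extractions, min_confidence=0.9, dscr_floor=1.25):
    """Deterministic gate: any low-confidence extraction forces human review;
    otherwise a fixed rule (a debt-service-coverage floor) makes the decision."""
    trail, values = [], {}
    for ex in extractions:
        trail.append(f"{ex.field_name}={ex.value} from {ex.source_doc} (conf={ex.confidence:.2f})")
        if ex.confidence < min_confidence:
            trail.append(f"gate: {ex.field_name} below confidence threshold -> review")
            return Decision("review", trail)
        values[ex.field_name] = ex.value
    dscr = values["net_operating_income"] / values["debt_service"]
    trail.append(f"rule: DSCR {dscr:.2f} vs floor {dscr_floor}")
    return Decision("approve" if dscr >= dscr_floor else "deny", trail)
```

The LLM never decides anything here; it only proposes `Extraction` values. The decision itself is a pure function of validated inputs, so the audit trail fully explains every outcome.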
AgentFT – Agent Evaluation Framework
Testing infrastructure that makes AI agent reliability measurable, repeatable, and debuggable. Everyone is building agents — almost no one knows how to test them.
Evaluation harness that treats AI agents as testable software artifacts. Supports task abstractions, async execution, composite judging, and trace-level logging to detect regressions across versions and workflows. Critical for production deployments where you need to know if your agent got worse after an update. Published as open-source packages on both npm and PyPI — because good developer experience matters.
Key capabilities:
- Async evaluation engine for agent tasks
- Regression detection across versions
- Trace-level observability
- Composite judge patterns
- Published on npm and PyPI
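The shape of the harness can be sketched in a few lines. This is a simplified illustration of the idea, not AgentFT's published API; the `suite`/`judges` structure is an assumption for the example:

```python
import asyncio

async def run_case(agent, case):
    """Execute one task case and judge the output; returns (case_id, passed, trace)."""
    output = await agent(case["input"])
    passed = all(judge(case, output) for judge in case["judges"])
    return case["id"], passed, {"input": case["input"], "output": output}

async def evaluate(agent, suite):
    """Run all cases concurrently and return {case_id: passed}."""
    results = await asyncio.gather(*(run_case(agent, c) for c in suite))
    return {cid: passed for cid, passed, _ in results}

def regressions(baseline, candidate):
    """Cases that passed under the baseline run but fail under the candidate."""
    return sorted(cid for cid, ok in baseline.items() if ok and not candidate.get(cid, False))
```

Running the same suite against two agent versions and diffing the pass maps is what turns "the agent feels worse" into a concrete list of regressed case IDs.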
Financial Analysis Platform
Real-time financial computation engine that powered underwriting decisions across millions in loan volume.
Production fintech platform built while managing a $30M CDFI portfolio. Supports real-time financial analysis, metric computation, and compliance workflows with sub-second latency. Architected deterministic calculation pipelines with role-based access control and full audit logging. This became the deterministic financial core later integrated into the AI underwriting platform — proving the value of building reliable calculation engines before layering AI on top.
Key capabilities:
- Real-time financial metric computation
- Deterministic calculation pipelines
- RBAC + complete audit logs
- Compliance workflow automation
- Sub-second latency on decisions spanning millions of dollars in loan volume
Research Discovery Agent
Provenance-tracked research extraction system that prioritizes correctness and traceability over fluency.
Research discovery platform that converts unstructured academic papers into a structured, queryable knowledge graph while enforcing provenance and confidence at every step. LLMs propose extractions, which are then validated deterministically before graph ingestion. Built to explore how to combine LLM flexibility with symbolic validation — every claim in the knowledge graph can be traced back to its source document with a confidence score.
Key capabilities:
- Provenance-first document extraction
- Confidence-gated claims before graph insertion
- Neuro-symbolic validation layer
- Queryable knowledge graph with relationship tracking
- Full source traceability
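The ingestion gate can be illustrated with a toy graph. Names and thresholds here are assumptions for the sketch, not the platform's actual schema:

```python
class ProvenanceGraph:
    """Tiny knowledge graph where every edge carries its source document and confidence."""
    def __init__(self, min_confidence=0.8):
        self.min_confidence = min_confidence
        self.edges = []  # (subject, relation, object, source, confidence)

    def ingest(self, claim):
        """Deterministic validation gate applied before an LLM-proposed claim is stored."""
        required = {"subject", "relation", "object", "source", "confidence"}
        if not required <= claim.keys():
            return False  # reject malformed extractions outright
        if claim["confidence"] < self.min_confidence:
            return False  # below the confidence threshold: do not ingest
        self.edges.append((claim["subject"], claim["relation"], claim["object"],
                           claim["source"], claim["confidence"]))
        return True

    def trace(self, subject):
        """Every stored claim about a subject, with full source traceability."""
        return [e for e in self.edges if e[0] == subject]
```

Because provenance is a required field rather than optional metadata, a claim with no source simply cannot enter the graph, which is what makes every downstream query traceable by construction.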
Professional Experience
Credit Analyst & Catalytic Capital Program Manager
Community Vision Capital & Consulting
- Built production financial spreading platform (TypeScript/Next.js/PostgreSQL/Supabase) with RBAC, audit logs, and real-time calculation pipelines — used by underwriting teams to analyze loan applications across millions in volume
- Managed $30M catalytic capital program and personally underwrote loan deals up to $3M, gaining deep hands-on experience in lending compliance, risk assessment, and regulatory requirements
- Encoded complex financial rules as deterministic backend logic, creating reliable automated workflows that became the foundation for later AI underwriting systems
- Integrated lending logic directly into internal tooling, reducing manual analysis overhead and standardizing underwriting processes
Finance → software builder: translated complex domain rules into auditable production workflows.
Catalytic Capital Program Analyst
Community Vision
- Improved program operational efficiency by ~20% through data modeling and financial analysis optimization
- Coordinated budgeting and performance tracking across ~$100M in programs
- Managed AHP and AHEAD compliance cycles, building deep expertise in regulatory requirements and audit processes
Finance Project Manager
Dorjil Company
- Built Python/SQL automation tools that secured $400K+ in operating subsidies for affordable housing portfolios
- Led multi-year audit that recovered $2M in overdue operating subsidies
- Automated financial data pipelines, achieving ~25% efficiency improvements in reporting and compliance workflows
Full-Stack Software Engineer (Contract)
Blindly
- Implemented ID parsing and verification system (Node.js/React/Redux) for identity authentication workflows
- Shipped production authentication features in fast-paced, cross-functional startup environment
Core Skills
Engineering
AI Systems
Delivery & Leadership
Let's work together
If your current AI strategy is "just use a bigger prompt," let's talk about building a system that actually survives an audit. I'm looking for ML engineering roles where I can build AI systems that ship to production and handle real consequences. I'm particularly interested in companies working in regulated industries: fintech, healthcare, insurance, and legal.
Based in SF Bay Area · Available immediately · Open to in-person and remote
geddydukes@gmail.com · 707-799-1271