In this post, we explore a multi-agent system built with LangGraph and Google's Gemini models that transforms a non‑technical specification into working code, end to end. By breaking the process into specialized agents, we achieve a robust, transparent, and extensible workflow for Gen AI–driven software development. This work was done as the capstone project for the 5-Day Intensive GenAI Workshop hosted by Kaggle and Google.

1. Project Overview

The "Gen AI Agent for Code Generation" notebook demonstrates:

  • Parsing a raw user spec into structured requirements
  • Planning a high‑level architecture (modules, folder/file structure)
  • Generating code stubs and implementation based on that plan
  • Reviewing the generated code for correctness and completeness
  • Iteratively fixing bugs until approval

All agents communicate via a shared GraphState TypedDict, orchestrated by LangGraph's StateGraph.
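The notebook doesn't reproduce the full state definition here, but a minimal sketch of such a GraphState (field names are illustrative, not taken verbatim from the notebook) might look like:

```python
from typing import TypedDict

class GraphState(TypedDict):
    # Raw user specification as entered by the user
    spec: str
    # Structured requirements produced by the Requirement Parser
    requirements: dict
    # Architecture plan emitted by the Planner / Architect
    plan: dict
    # Generated files: filename -> file content
    code: dict
    # Latest review: approval status, comments, missing features
    review: dict
    # Remaining bug-fix attempts before the loop gives up
    fix_attempts_left: int

# Each LangGraph node receives the state and returns a partial update
def requirement_parser(state: GraphState) -> dict:
    # (A Gemini call would parse state["spec"] here; stubbed for illustration)
    return {"requirements": {"description": state["spec"]}}
```

Because every node reads from and writes to the same typed state, each agent's inputs and outputs stay inspectable between steps.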

2. Agent Pipeline

```mermaid
flowchart TD
    A[Requirement Parser] --> B[Planner / Architect]
    B --> C[Dev Agent]
    C --> D[Quality Gate]
    D -- needs fix? --> E[Bug Fixer]
    E --> D
    D -- approved --> F([End])
```

  1. Requirement Parser
     • Uses Gemini 2.0 to extract the system description, functional & non‑functional requirements, and the tech stack (defaults to Python + FastAPI).
     • Outputs a JSON schema (RequirementSchema).
  2. Planner / Architect
     • Runs on the more capable Gemini 2.5 Pro model.
     • Decomposes requirements into modules and a folder/file structure (ArchitechurePlan).
  3. Dev Agent
     • Generates code stubs and skeletons per module.
     • Constrains output to a FinalCode schema (filename + content).
  4. Quality Gate
     • Reviews the code against the requirements and plan.
     • Produces a Review schema: approval status, comments, missing features.
  5. Bug Fixer
     • Triggered if the review is negative and fix attempts remain.
     • Revises the code until it passes review or the retry limit is exhausted.
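The Quality Gate / Bug Fixer cycle can be sketched in plain Python as a conditional-routing function plus a driver loop (the `review_fn` and `fix_fn` callables stand in for the actual LLM-backed agents; names are hypothetical):

```python
def quality_gate_route(review: dict, fix_attempts_left: int) -> str:
    """Conditional edge: send code back to the Bug Fixer or finish."""
    if review.get("approved"):
        return "end"
    if fix_attempts_left > 0:
        return "bug_fixer"
    return "end"  # give up after exhausting retries

def run_review_loop(code: dict, review_fn, fix_fn, max_attempts: int = 3):
    """Drive the Quality Gate / Bug Fixer cycle until approval or retry limit."""
    attempts_left = max_attempts
    review = review_fn(code)
    while quality_gate_route(review, attempts_left) == "bug_fixer":
        code = fix_fn(code, review)
        attempts_left -= 1
        review = review_fn(code)
    return code, review
```

In the actual notebook this loop is expressed as a conditional edge in the LangGraph graph rather than a `while` loop, but the control flow is the same.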

3. Key Technical Highlights

  • Structured Output with Pydantic: Every agent binds to a Pydantic schema to enforce predictable JSON responses.
  • Model Selection & Retry Logic:
     • THINKING_MODEL (Gemini 2.5 Pro) for planning and code-heavy tasks
     • GENERAL_MODEL (Gemini 2.0 Flash) for parsing and review
     • A simple retry loop handles transient API throttling (429/503).
  • LangGraph for Orchestration: Using StateGraph, we declare nodes, edges, and conditional transitions in a declarative graph.
  • Few‑shot & Implicit Function Calling: Prompts are structured to mimic function signatures, guiding models to emit JSON without explicit function-calling APIs.
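The retry behavior described above can be sketched as a small wrapper. The error class and back-off values here are illustrative stand-ins; the notebook's actual implementation wraps the Gemini client's own exceptions:

```python
import time

RETRYABLE_CODES = {429, 503}  # rate limiting / service unavailable

class TransientAPIError(Exception):
    """Stand-in for the API client's throttling / unavailable errors."""
    def __init__(self, code: int):
        super().__init__(f"API error {code}")
        self.code = code

def call_with_retry(fn, max_retries: int = 3, base_delay: float = 1.0):
    """Retry fn on 429/503 with exponential back-off; re-raise anything else."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientAPIError as err:
            if err.code not in RETRYABLE_CODES or attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Keeping the retry logic in one wrapper means every agent's model call gets the same throttling behavior without duplicating error handling per node.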

4. Demonstration

Given the user spec:

Build a collaborative note‑taking app with user auth, shared notebooks, real‑time collaboration, version history, tags, and search.

The pipeline:

  1. Parses the above into a list of requirements.
  2. Architects a microservice‑style FastAPI backend + React frontend.
  3. Generates stub code files (app/main.py, auth.py, notes.py, etc.).
  4. Reviews the code for missing features (e.g., conflict resolution in real‑time edits).
  5. Fixes any gaps until approval.
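For illustration, the parsed output of step 1 might look roughly like the following. The field names and values are hypothetical, shaped after the RequirementSchema described above rather than copied from an actual run:

```json
{
  "system_description": "Collaborative note-taking app",
  "functional_requirements": [
    "User authentication",
    "Shared notebooks",
    "Real-time collaboration",
    "Version history",
    "Tags and search"
  ],
  "non_functional_requirements": ["Low-latency sync for concurrent edits"],
  "tech_stack": {"backend": "Python + FastAPI", "frontend": "React"}
}
```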

5. Conclusion

By modularizing each step — requirements, planning, code generation, review, and fixes — this Gen AI agent pipeline offers:

  • Traceability: Clear handoff and data schemas between agents
  • Extensibility: Swap in new models or add specialized nodes (e.g., security audit)
  • Automation: From spec to code with minimal human intervention