Windsurf Cascade Mode: Complete Guide to the Agentic AI Assistant
TL;DR
- Cascade is Windsurf’s agentic AI that plans, edits multiple files, and runs terminal commands in a single flow.
- Three modes (Write, Chat, and Plan) each serve a distinct purpose; toggle between them with Ctrl + . (Ctrl and the period key).
- Rules (.windsurfrules) and Memories give Cascade persistent, project-aware context across sessions.
Overview
Cascade is the agentic AI assistant built into Windsurf, Codeium’s AI code editor. Unlike line-level autocomplete or single-file chat, Cascade operates at the project level: it indexes your entire codebase, builds a multi-step plan, then executes coordinated edits across multiple files while running terminal commands as needed.
This guide covers how to activate Cascade, use its three modes effectively, configure rules and memories for persistent context, and avoid common pitfalls. It assumes you have Windsurf installed and a project open.
- Vendor: Codeium
- Pricing: Free · Pro $15/mo · Team $35/mo
- Platforms: macOS, Linux, Windows
- Flagship models: SWE-1.6, SWE-1.5
- Config file: .windsurfrules
Prerequisites
Before using Cascade, confirm the following:
- Windsurf installed on macOS, Linux, or Windows. Download from the official site.
- A Windsurf account — free tier includes 25 credits. Pro ($15/mo) and Team ($35/mo) plans increase limits.
- A project open in the editor. Cascade indexes your codebase on open; a project with files gives you something to work with immediately.
- Optional: API keys for BYOK (Bring Your Own Key) if you want access to models beyond the defaults, such as Claude 4 Opus or GPT-5.1-Codex Max.
No additional CLI tools or plugins are required. Cascade ships as a core feature of the editor.
Opening Cascade
Open the Cascade panel with a keyboard shortcut or the UI icon.
Any text selected in the editor or terminal is passed to Cascade as context when the panel opens. This is faster than manually pasting snippets into the prompt.
Understanding the Three Modes
Cascade has three distinct modes, each designed for a different type of interaction. Choosing the right mode matters — it determines whether Cascade can modify your code or only discuss it.
Write Mode
Write mode is Cascade’s primary agentic mode. It can:
- Create, modify, and delete code across multiple files
- Execute terminal commands and read their output
- Show diffs before applying changes
- Roll back to any checkpoint if something goes wrong
When you prompt Cascade in Write mode, it generates a step-by-step plan, then executes it. You see diffs for each file change and can accept or reject them individually.
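To make the diff-then-checkpoint workflow concrete, here is a minimal sketch of the idea in Python. It is an illustrative model only, not Windsurf's implementation: the `Workspace` class and its `preview`, `checkpoint`, `apply_edit`, and `rollback` methods are hypothetical names invented for this example.

```python
from __future__ import annotations

import difflib
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Toy in-memory workspace (hypothetical): maps file paths to contents."""

    files: dict[str, str] = field(default_factory=dict)
    checkpoints: list[dict[str, str]] = field(default_factory=list)

    def preview(self, path: str, new_content: str) -> str:
        """Return a unified diff of the proposed change without applying it."""
        old = self.files.get(path, "")
        return "".join(
            difflib.unified_diff(
                old.splitlines(keepends=True),
                new_content.splitlines(keepends=True),
                fromfile=f"a/{path}",
                tofile=f"b/{path}",
            )
        )

    def checkpoint(self) -> int:
        """Snapshot the current files and return the checkpoint index."""
        self.checkpoints.append(dict(self.files))
        return len(self.checkpoints) - 1

    def apply_edit(self, path: str, new_content: str | None) -> None:
        """Write a file, or delete it when new_content is None."""
        if new_content is None:
            self.files.pop(path, None)
        else:
            self.files[path] = new_content

    def rollback(self, index: int) -> None:
        """Restore the snapshot taken at `index` and drop later checkpoints."""
        self.files = dict(self.checkpoints[index])
        del self.checkpoints[index + 1:]


ws = Workspace({"app.py": "print('v1')\n"})
diff = ws.preview("app.py", "print('v2')\n")  # review the change first
before = ws.checkpoint()                       # checkpoint 0: original state
ws.apply_edit("app.py", "print('v2')\n")       # step 1: modify a file
ws.apply_edit("util.py", "X = 1\n")            # step 2: create a file
ws.rollback(before)                            # undo both steps at once
```

The point of the sketch is that a rollback restores every file touched since the checkpoint, which is why it is safer than reverting a multi-file edit by hand.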
Chat Mode
Chat mode is read-only. Cascade answers questions about your codebase, explains code, discusses architecture, and helps debug — but makes no modifications. Use this when you need to understand before you act.
Plan Mode
Plan mode creates detailed implementation plans without writing code. It’s useful for scoping complex features before committing to implementation. Think of it as a design doc generator that understands your actual codebase.
How Cascade Understands Your Codebase
Cascade’s context engine is what separates it from a generic LLM chat wrapper. When you open a project, Windsurf immediately indexes every file — not just the ones currently open.
Each file and function is converted into a 768-dimensional vector embedding that captures its semantic meaning. When you prompt Cascade, it runs a similarity search (using Codeium’s proprietary M-Query retrieval) against this index and pulls the most relevant code into the prompt context.
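The general shape of embedding-based retrieval can be sketched in a few lines of Python. This is not M-Query, which is proprietary; it uses plain cosine similarity as a stand-in, and the file names and 3-dimensional vectors are made up for illustration (real embeddings would have 768 dimensions, per the description above).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k entries whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical toy index: path -> embedding (3 dimensions instead of 768).
index = {
    "auth/login.ts": [0.9, 0.1, 0.0],
    "auth/session.ts": [0.8, 0.3, 0.1],
    "ui/button.tsx": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a prompt such as "fix the login flow".
query = [1.0, 0.2, 0.0]
results = top_k(query, index, k=2)
```

Here `results` contains the two auth files and leaves out the unrelated UI component, which is the behavior that lets a prompt pull in relevant code that is not currently open.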
The context system operates on multiple layers simultaneously:
- RAG-based codebase index: semantic search across all project files
- Real-time action tracking: your edits, terminal commands, clipboard contents
- Memories: auto-generated persistent context from prior sessions
- Rules: user-defined instructions from .windsurfrules and AGENTS.md
A visual context window indicator in the UI shows how much of the available context is consumed. When it’s near capacity, start a new session to avoid silent truncation.
Configuring Rules for Better Output
Rules are the single highest-impact configuration for Cascade’s output quality. They tell Cascade how your project works, what conventions to follow, and what patterns to avoid.
Project-level rules
Create a .windsurfrules file in your project root:
```markdown
# Project Rules

## Stack
- Framework: Next.js 15 with App Router
- Language: TypeScript (strict mode)
- Styling: Tailwind CSS v4
- Database: PostgreSQL via Drizzle ORM

## Conventions
- Use named exports, not default exports
- Error boundaries on every route segment
- No barrel files (index.ts re-exports)

## Anti-patterns
- Never use `any` type
- No inline styles
- Do not install new dependencies without asking
```
Workspace rules
For more granular control, create files in .windsurf/rules/. Each file scopes to a specific domain — auth.md, testing.md, api.md — so Cascade pulls only what’s relevant.
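If you want to set up that directory in one step, a small script can scaffold it. The sketch below assumes only the layout named above (.windsurf/rules/ with auth.md, testing.md, and api.md); the `scaffold_rules` helper and the placeholder file contents are hypothetical, and the demo runs in a temporary directory so it does not touch a real project.

```python
import tempfile
from pathlib import Path

# Hypothetical starter content; replace with your project's real conventions.
RULE_FILES = {
    "auth.md": "# Auth rules\n- Describe authentication conventions here\n",
    "testing.md": "# Testing rules\n- Describe testing conventions here\n",
    "api.md": "# API rules\n- Describe API conventions here\n",
}


def scaffold_rules(project_root: Path) -> list[Path]:
    """Create .windsurf/rules/ with one starter file per domain.

    Existing files are left untouched; only newly created paths are returned.
    """
    rules_dir = project_root / ".windsurf" / "rules"
    rules_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name, body in RULE_FILES.items():
        target = rules_dir / name
        if not target.exists():  # never overwrite rules someone already wrote
            target.write_text(body, encoding="utf-8")
            created.append(target)
    return created


# Demo in a throwaway directory standing in for a project root.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_rules(Path(tmp))
    created_names = sorted(p.name for p in created)
    second_run = scaffold_rules(Path(tmp))  # nothing new to create
```

To use it on a real project, call `scaffold_rules(Path("."))` from the project root and commit the resulting files so the team shares them.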
Global rules
Create a global_rules.md for preferences that apply across all your projects. These cover personal style: indentation, commit message format, preferred libraries.
Well-structured rules produce dramatically better output. Invest 10 minutes writing them and every subsequent Cascade interaction improves.
Using Memories for Persistent Context
Cascade automatically generates Memories during your interactions. These persist across sessions and capture patterns like frequently referenced functions, preferred libraries, and project-specific idioms.
Key details:
- Auto-generated and local-only — Memories live on your machine, not synced to your team.
- Persistent across sessions — Cascade remembers what it learned last time you worked on this project.
- Can become stale — After major refactors, Memories may reference outdated patterns. Clear them periodically.
For knowledge you want the entire team to share, write it as a Rule (version-controlled) or add it to AGENTS.md rather than relying on auto-generated Memories.
Terminal Integration and Turbo Mode
Cascade can execute terminal commands directly. By default, it asks for approval before running each command. Turbo Mode changes this — commands auto-execute unless they’re on your deny list.
Windsurf uses a dedicated zsh shell profile for agent execution, separate from your regular terminal. This improves reliability and prevents Cascade from interfering with your active terminal sessions.
Configure allow/deny lists to control what auto-executes:
- Allow list: Commands that always run without prompting (e.g., npm test, git status)
- Deny list: Commands that always require manual approval (e.g., rm -rf, git push --force)
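The interaction between the default mode, Turbo Mode, and the two lists can be written down as a small decision function. This is a sketch of the behavior described above, not Windsurf's actual matching logic: the `decide` and `matches` helpers are hypothetical, and both the token-prefix matching and the rule that the deny list wins over the allow list are assumptions.

```python
import shlex


def matches(command: str, patterns: list[str]) -> bool:
    """True when the command starts with the tokens of any pattern.

    Assumption: prefix matching on shell tokens. A flag that appears later
    in the command (e.g. a trailing --force) would not be caught.
    """
    tokens = shlex.split(command)
    for pattern in patterns:
        pattern_tokens = shlex.split(pattern)
        if tokens[: len(pattern_tokens)] == pattern_tokens:
            return True
    return False


def decide(command: str, allow: list[str], deny: list[str], turbo: bool = False) -> str:
    """Return "run" to auto-execute or "ask" to require manual approval."""
    if matches(command, deny):
        return "ask"  # assumption: the deny list overrides everything else
    if turbo or matches(command, allow):
        return "run"  # Turbo Mode, or an explicitly allowed command
    return "ask"      # default mode: approve each command by hand


allow = ["npm test", "git status"]
deny = ["rm -rf", "git push --force"]

allowed_default = decide("npm test -- --watch", allow, deny)            # "run"
unknown_default = decide("git push origin main", allow, deny)           # "ask"
unknown_turbo = decide("git push origin main", allow, deny, turbo=True)  # "run"
denied_turbo = decide("rm -rf build", allow, deny, turbo=True)          # "ask"
```

The last case is the one that matters when you enable Turbo Mode: a destructive command still stops for approval as long as it is on the deny list.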
Choosing a Model
Cascade supports multiple underlying models. Your choice affects speed, quality, and credit consumption.
| Model | Speed | Best for | Notes |
|---|---|---|---|
| SWE-1.6 | Fast | General agentic coding | Codeium’s latest frontier model |
| SWE-1.5 | Fast | Free-tier users | Free version available |
| Claude Sonnet 4.6 | Medium | Complex reasoning | Promotional pricing |
| Gemini 3 Flash | Fast | Quick iterations | Available to all users |
| GPT-5.1-Codex Max | Varies | Heavy reasoning tasks | Low/Medium/High reasoning tiers |
Arena Mode lets you run two models side-by-side with hidden identities, so you can blind-test which performs best for your specific workflow. Useful for deciding between SWE-1.6 and a third-party model on your actual codebase.
Tips and Best Practices
- Start sessions clean. When the context window indicator is above 80%, open a new Cascade session. Stuffed context degrades output quality.
- Use @-mentions liberally. Instead of hoping Cascade finds the right file via indexing, explicitly reference files with @filename. Precision beats recall.
- Write rules as bullet points. Long prose in .windsurfrules confuses the model. Numbered lists and short declarative statements work best.
- Use checkpoints as undo. Cascade creates visual checkpoints at each step. If a multi-file edit goes wrong, roll back to a specific checkpoint instead of manually reverting.
- Parallel sessions for independent tasks. Since Wave 13, you can run multiple Cascade sessions simultaneously using Git worktrees. Use this for tasks that don’t touch the same files.
- Clear stale Memories after refactors. If you renamed a module or restructured directories, old Memories may cause Cascade to reference paths that no longer exist.
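The tip about writing rules as bullet points can be checked mechanically. The following is a minimal sketch of such a check, not a Windsurf feature: the `lint_rules` function is hypothetical, and the 20-word limit is an arbitrary threshold chosen for illustration.

```python
import re

# A line counts as a list item if it starts with "- ", "* ", or "1. " style numbering.
LIST_ITEM = re.compile(r"^(?:[-*]\s|\d+\.\s)")


def lint_rules(text: str, max_words: int = 20) -> list[str]:
    """Flag lines in a rules file that are prose or overly long list items."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and headings are fine
        if not LIST_ITEM.match(stripped):
            problems.append(f"line {number}: not a bullet or numbered item")
        elif len(stripped.split()) > max_words:
            problems.append(f"line {number}: longer than {max_words} words")
    return problems


# Hypothetical rules file with one prose sentence mixed in.
sample = """# Project Rules
## Conventions
- Use named exports, not default exports
We generally prefer that people avoid barrel files because they slow down builds.
1. Never use `any` type
"""

problems = lint_rules(sample)
```

For the sample above, `problems` reports only line 4, the prose sentence. To check a real file, pass in `Path(".windsurfrules").read_text()` (with `pathlib.Path` imported) from the project root.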