Colin Soleim · Apr 15, 2026
If you've tried Claude Code on a large Rails codebase, you've probably noticed the gap between the demos and reality. The demos show Claude refactoring a 50-line file or writing a CRUD controller from scratch. Your codebase has 400 models, a domain model that took two years to understand, soft deletes on every table, and three different patterns for filter state persistence that nobody documented. Out of the box, Claude doesn't know any of that.
The difference between Claude Code being "kind of useful" and Claude Code being genuinely productive on a large codebase is context and configuration: telling the tool what it needs to know about your specific system, giving it access to your data, and building repeatable workflows for the tasks you do every week.
The good news is that Rails monoliths are unusually well-suited for AI-assisted development. A RubyKaigi 2025 analysis by Rodrigo Serradura found that Ruby ranks #1 in token efficiency with Claude Code at $0.36 per task across 18 languages tested. Y Combinator CEO Garry Tan has called Rails' convention-over-configuration approach "LLM catnip" because the predictable file structure means the AI's first guess about where code lives is usually correct. When your models are in app/models, your controllers are in app/controllers, and your tests mirror the same structure, Claude doesn't waste tokens searching.
Shopify, Thoughtbot, and our team at NextLink Labs have all invested heavily in Claude Code tooling for Rails. The approaches are converging. Here's what actually works.
The single most important file in your Claude Code setup is CLAUDE.md in your project root. Claude reads it at the start of every conversation. On a small project, you can skip it and Claude will figure things out by reading the code. On a codebase with 100K+ lines, that doesn't work. There are too many files, too many implicit conventions, and too many places where the "obvious" approach is wrong because of some historical decision.
Thoughtbot found this when they used Claude Code to build TellaDraft, a production Rails app, in a two-week sprint. Their CLAUDE.md started small and grew over the sprint as they discovered gaps. They added stack specifics (Rails, Postgres, Devise), API integration details for ElevenLabs and WhisperAI, coding style preferences, and testing conventions. When Claude approached context limits mid-session, they prompted it to reread the CLAUDE.md and scan the codebase, which got it back on track.
We've landed on a similar approach. Here's what we include in ours for a Rails 7.2 monorepo with about 80 models:
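The full file is specific to our app, but an abbreviated sketch gives the flavor. Every entry below is illustrative rather than copied verbatim, and the section headings are just one reasonable way to organize it:

```markdown
# CLAUDE.md (abbreviated, illustrative)

## Stack
- Rails 7.2, Ruby, Postgres, Solid Cache

## Conventions that aren't obvious from the code
- Every table uses soft deletes via a `deleted_at` column; records are
  hidden by default scopes, so "missing" rows are usually soft-deleted.
- Filter state persistence has three historical patterns; new code must
  use the session-backed one, not the older query-param approaches.

## Testing
- Don't stub the object under test; prefer real records over mocks.
- A test that mocks everything and asserts on the mock is not a test.
```

The point is density: each line encodes something Claude would otherwise get wrong, and nothing it could learn by opening a file.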
What we don't put in CLAUDE.md is equally important. Don't dump your entire schema, your full API documentation, or your README. Every token in CLAUDE.md is context that Claude carries through the entire conversation. One practitioner recommends only documenting tools and APIs used by 30% or more of your team, and keeping the file under 15KB. We follow the same rule: document what Claude can't infer from reading the code, skip anything it can figure out by reading the relevant source file.
We treat CLAUDE.md the same way we treat our README. Whenever a new developer joins the project and gets stuck on something, we update it. When Claude gets something wrong because of missing context, add that context to CLAUDE.md so it doesn't happen again.
We wrote a separate article about building a custom MCP server to connect Claude Code to your Rails database, so we won't repeat the implementation details here. But we want to emphasize why this matters more on large codebases than small ones.
On a small app, Claude can read your models and migrations and build a reasonable mental model of your data. On a large app with 80+ tables, polymorphic associations, STI, and years of migrations, that doesn't scale. Claude needs to be able to ask the database directly: "What columns does this table actually have? What are the real values in this enum column? How many records match this condition?"
The MCP server we built gives Claude three tools: list_tables, describe_table, and execute_query. All queries run through a Rails controller that enforces read-only transactions with a 30-second timeout. Claude can explore the schema, run ad-hoc queries, and verify its assumptions without you switching to psql.
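The controller's core is small: vet the incoming SQL, then run it inside a read-only transaction with the timeout. A simplified sketch of the guard (the `SqlGuard` name and regex are illustrative, not our actual code; the transaction-level `READ ONLY` and `statement_timeout` shown in the comment are what actually enforce safety, since a regex can't catch everything, such as writes hidden inside a `WITH` clause):

```ruby
# Hypothetical first-pass guard applied before Claude's SQL reaches Postgres.
module SqlGuard
  READ_ONLY = /\A\s*(select|with|explain|show)\b/i

  def self.check!(sql)
    raise ArgumentError, "only read-only queries are allowed" unless sql.match?(READ_ONLY)
    raise ArgumentError, "multiple statements are not allowed" if sql.strip.chomp(";").include?(";")
    sql
  end
end

# In the Rails controller, the vetted query then runs roughly like this,
# which is the real enforcement layer:
#
#   ActiveRecord::Base.transaction do
#     conn = ActiveRecord::Base.connection
#     conn.execute("SET TRANSACTION READ ONLY")
#     conn.execute("SET LOCAL statement_timeout = '30s'")
#     conn.exec_query(sql)
#   end
```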
Shopify takes the MCP concept further. Their engineering team connects AI tools to internal wikis, product management tools, and data warehouses through MCP servers, all respecting existing access controls. For their 2.8-million-line monolith, a coding agent that can only see source files is working with a fraction of the context it needs. MCP servers fill the gap by giving the agent access to the surrounding systems: the database, the issue tracker, the deployment logs.
For our setup, the most immediately useful application is debugging data issues. If a student isn't showing up on a dashboard, Claude can trace the problem from the cached computation in Solid Cache, through the precompute job's query, down to the actual database records, and tell you exactly where the discrepancy is. Without database access, it can only guess.
Claude Code skills are markdown files in ~/.claude/skills/ that define step-by-step workflows. They're not useful for one-off tasks (just ask Claude directly for those). They're useful for processes you run regularly that involve multiple steps, external data sources, and consistent output formats.
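Structurally, a skill is just markdown with a short frontmatter block. A hypothetical example, with the skill name, thresholds, and steps invented for illustration:

```markdown
---
name: error-triage
description: Weekly triage of new exceptions from the error tracker
---

## Steps
1. Pull the past week's errors grouped by fingerprint.
2. For any fingerprint with more than 50 occurrences, find the owning
   file and its most recent author via `git log`.
3. Draft one ticket per fingerprint using the output format below.

## Output format
- Title: "<error class> in <file>"
- Body: occurrence count, first/last seen, suspected commit
```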
We have seven custom skills. Here are the three that get the most use:
Shopify takes a similar approach with Roast, an open-source Ruby framework they extracted from their internal AI tooling. Roast structures AI workflows using YAML config files and markdown prompt templates. Each workflow is a series of steps that can be directory-based, shell commands, inline AI prompts, or custom Ruby classes. Steps can even run in parallel using nested arrays in the YAML.
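Based on that description, a Roast workflow config looks roughly like the sketch below. The step names and commands are invented, and you should check the Roast README for the exact schema; the shape, though, is a named workflow whose steps mix shell commands, prompt directories, and parallel groups:

```yaml
# Illustrative Roast workflow (names invented; see the Roast docs)
name: flaky_test_triage
steps:
  - $(bundle exec rspec --only-failures)  # shell command step
  - analyze_failure                       # directory-based prompt step
  - - check_recent_commits                # nested array: these two
    - check_similar_tests                 # steps run in parallel
  - write_summary
```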
Here's what Shopify actually uses Roast for internally on their 2.8-million-line Rails monolith:
The philosophy behind both approaches (our skills, Shopify's Roast) is the same: non-determinism is the enemy of reliability. A vague prompt like "analyze the errors and create tickets" will give you different results every time. A structured workflow with explicit steps, specific thresholds, and defined output formats gives you consistent, reviewable output. The skill is a runbook, not a prompt.
If you're working on a codebase that connects to shared staging or production infrastructure, you need to think about what Claude is allowed to run. The default permission model asks you to approve each command, which gets tedious fast. The alternative, approving everything, is dangerous when a rails db:drop is one autocomplete away from rails db:migrate.
Claude Code supports a permission whitelist in .claude/settings.json (a sibling settings.local.json holds personal overrides and stays out of version control). You explicitly list the commands, tools, and patterns that are allowed to run without prompting. Everything else still requires approval.
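The file is plain JSON. A trimmed example of the shape, where the specific command patterns are ours to illustrate and you'd adapt them to your own toolchain:

```json
{
  "permissions": {
    "allow": [
      "Bash(bundle exec rspec:*)",
      "Bash(bundle exec rubocop:*)",
      "Bash(git diff:*)",
      "Bash(git log:*)"
    ],
    "deny": [
      "Bash(rails db:drop:*)",
      "Bash(git push:*)"
    ]
  }
}
```

The `:*` suffix is prefix matching, so `Bash(bundle exec rspec:*)` covers any arguments after the command.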
This file gets checked into the repository, so every developer on the team gets the same guardrails. If someone needs a new command whitelisted, they add it to the file and it goes through code review like any other change.
Shopify takes guardrails a step further. Even with all their AI tooling, they still require human PR review on every change. AI doesn't check in code automatically. Their CTO Farhan Thawar reports roughly 20% productivity improvements across the engineering org, but he's careful to note they measure this through demos and shipped features, not lines of code. As he puts it: "Code is cheap now. I don't want code, I want solutions."
Thoughtbot came to a similar conclusion during their TellaDraft sprint. Every single change was reviewed before committing, and every commit served as a quality control checkpoint. They also found that Claude sometimes generates tests that "just mock everything and test the mock," which passes CI but tests nothing useful. Their fix was coaching Claude through test-writing with specific examples of what a meaningful test looks like in their CLAUDE.md, and catching the rest in review.
CLAUDE.md instructions are advisory. Claude reads them, usually follows them, but can drift, especially in long sessions with lots of context. If you have a rule that absolutely cannot be violated, use a hook instead.
Hooks are scripts that run automatically at specific points in Claude's workflow. Unlike CLAUDE.md, they're deterministic. A pre-commit hook that blocks credential files will block them every time, regardless of how long the conversation has been running or how much context has been compacted.
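Hooks are configured in the same settings file. A sketch of a PreToolUse hook that blocks edits to credential files — the jq/grep pipeline is illustrative, and you should confirm the field names against the current hooks documentation; the mechanism is that the hook receives the pending tool call as JSON on stdin, and a non-zero blocking exit code stops the action:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path' | grep -qiE '(\\.env|credentials)' && exit 2 || exit 0"
          }
        ]
      }
    ]
  }
}
```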
The most useful hooks for enterprise Rails work:
If your Rails app has a separate frontend (React, Next.js, etc.) or you're running multiple services, .claude/launch.json tells Claude Code how to start and preview them.
Here's our configuration:
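A sketch of the shape, where the key names are assumptions and only the absolute shim paths and ports reflect the notes below:

```json
{
  "servers": {
    "rails": {
      "command": "/Users/dev/.asdf/shims/bundle exec rails server -p 3100",
      "url": "http://localhost:3100"
    },
    "frontend": {
      "command": "/Users/dev/.asdf/shims/npm run dev",
      "cwd": "./frontend",
      "url": "http://localhost:3001"
    }
  }
}
```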
Two things to note. First, we use absolute paths to the asdf shims for bundle and npm. Claude Code doesn't source your shell profile when it starts a server, so if you use asdf, rbenv, or nvm, the default bundle or npm commands may resolve to the wrong version or fail entirely. Absolute paths fix this.
Second, if your frontend makes API calls to the Rails backend, make sure your CORS configuration allows the preview URL. Our React app runs on port 3001 and makes requests to localhost:3100. CORS is configured in config/application.rb to allow http://localhost:3001. We document this in CLAUDE.md because it's the kind of thing that causes confusing failures when Claude starts both servers and tries to preview the app.
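That CORS setup is the standard rack-cors middleware configuration. A sketch, assuming the rack-cors gem is in the Gemfile and with an illustrative resource path:

```ruby
# config/application.rb — allow the React dev server to call the API
config.middleware.insert_before 0, Rack::Cors do
  allow do
    origins "http://localhost:3001"
    resource "/api/*",
             headers: :any,
             methods: %i[get post patch put delete options head]
  end
end
```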
After using this setup daily for several months and watching other engineers adopt it, we've noticed a few common mistakes.
If you're working on a large Rails codebase and want to get the most out of Claude Code, start with these three things: a CLAUDE.md that documents what Claude can't infer from the code, an MCP server that gives it read-only access to your database, and skills for the multi-step workflows you run every week.
Add permission whitelisting and hooks as you get comfortable with the tool. The setup takes a day. The productivity gain compounds over months, because every convention you document and every workflow you automate saves time on every future session.
If you're interested in the broader case for Rails in 2026, we wrote about why we still build with Ruby on Rails and the ecosystem improvements that keep it competitive. Claude Code is one more reason the framework keeps working for us.