AI-powered coding agents have begun to reshape software development, promising unprecedented efficiency and autonomy. Yet this transformative technology, exemplified by tools like Anthropic’s Claude Code, carries a financial barrier that has begun to alienate the very community it aims to empower. Claude Code, a sophisticated terminal-based AI agent, can write, debug, and deploy code autonomously, and it has captured the imagination of software developers globally. But its pricing, ranging from $20 to $200 per month depending on usage, has ignited a growing backlash among programmers.
In response to these escalating costs and restrictive usage policies, a compelling free alternative has rapidly gained prominence. Goose, an open-source AI agent developed by Block (the financial technology company formerly known as Square), offers a suite of functionalities nearly identical to Claude Code. Crucially, Goose distinguishes itself by running entirely on a user’s local machine, eliminating subscription fees, cloud dependencies, and the frustrating rate limits that plague commercial offerings.
"Your data stays with you, period," affirmed Parth Sareen, a software engineer, during a recent livestream demonstration of Goose. This statement encapsulates the core appeal of the platform: it grants developers unparalleled control over their AI-powered workflow, including the invaluable ability to operate offline—even during a flight. The project’s popularity has soared, evidenced by over 26,100 stars on GitHub, the leading code-sharing platform. With 362 contributors and 102 releases since its inception, including the latest version 1.20.1 shipped on January 19, 2026, Goose demonstrates a development velocity that rivals many commercial products. For developers increasingly frustrated by Claude Code’s pricing structure and stringent usage caps, Goose represents a rare and increasingly vital offering in the AI industry: a genuinely free, no-strings-attached solution for serious development work.
Anthropic’s New Rate Limits Spark a Developer Revolt
To fully grasp the significance of Goose, it is essential to understand the ongoing controversy surrounding Claude Code’s pricing and usage restrictions. Anthropic, a San Francisco-based artificial intelligence company founded by former OpenAI executives, integrates Claude Code into its various subscription tiers. The free plan, for instance, offers no access to the coding agent whatsoever. The Pro plan, priced at $17 per month with annual billing (or $20 monthly), imposes a severe limitation of just 10 to 40 prompts every five hours—a constraint that many professional developers can exhaust within mere minutes of intensive coding.
Even Anthropic’s premium Max plans, costing $100 and $200 per month, which offer more generous allowances (50 to 200 prompts and 200 to 800 prompts respectively, along with access to Anthropic’s most powerful model, Claude 4.5 Opus), still come with restrictions that have inflamed the developer community. The situation reached a boiling point in late July when Anthropic announced new weekly rate limits. Under this revised system, Pro users are allocated 40 to 80 hours of Sonnet 4 usage per week, while Max users on the $200 tier receive 240 to 480 hours of Sonnet 4, plus an additional 24 to 40 hours of Opus 4. Nearly five months later, the frustration among users remains palpable.
The primary issue stems from the deceptive nature of these "hours," which are not literal hours of usage but rather token-based limits. These limits fluctuate wildly depending on factors such as codebase size, conversation length, and the complexity of the code being processed. Independent analyses suggest that the actual per-session limits translate to roughly 44,000 tokens for Pro users and approximately 220,000 tokens for the $200 Max plan. Tokens, in this context, refer to chunks of text or code that the AI processes.
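Those per-session figures become more concrete with a back-of-the-envelope conversion. A common heuristic is that one token covers roughly four characters of English text or code; the exact ratio is tokenizer-specific, so the numbers below are estimates only, and the average line length is an assumption:

```python
# Rough back-of-the-envelope: how much code fits in a session's token budget?
# Heuristic: ~4 characters per token (tokenizer-specific; an estimate only).
CHARS_PER_TOKEN = 4
AVG_CHARS_PER_LINE = 40  # assumed average source-code line length

def approx_lines_of_code(token_budget: int) -> int:
    """Estimate how many lines of code a token budget can cover."""
    return token_budget * CHARS_PER_TOKEN // AVG_CHARS_PER_LINE

pro_session = 44_000    # reported per-session limit, Pro plan
max_session = 220_000   # reported per-session limit, $200 Max plan

print(approx_lines_of_code(pro_session))   # roughly 4,400 lines
print(approx_lines_of_code(max_session))   # roughly 22,000 lines
```

By this estimate, a single Pro session budget covers only a few thousand lines of context, which helps explain why developers working in large codebases exhaust their limits so quickly.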
"It’s confusing and vague," one developer wrote in a widely circulated analysis. "When they say ’24-40 hours of Opus 4,’ that doesn’t really tell you anything useful about what you’re actually getting." The backlash across platforms like Reddit and various developer forums has been intense. Numerous users have reported hitting their daily limits within 30 minutes of focused coding, leading some to cancel their subscriptions entirely, denouncing the new restrictions as "a joke" and "unusable for real work."
Anthropic has attempted to defend these changes, asserting that the limits impact fewer than five percent of users and primarily target individuals running Claude Code "continuously in the background, 24/7." However, the company has not clarified whether this five percent figure refers to Max subscribers specifically or to the broader base of all Claude Code users—a crucial distinction that significantly affects the perceived impact of the policy.
How Block Built a Free AI Coding Agent That Works Offline
Goose adopts a fundamentally different approach to addressing the challenges of AI-assisted coding. Developed by Block, the payments company helmed by Jack Dorsey, Goose is what engineers refer to as an "on-machine AI agent." This architecture stands in stark contrast to Claude Code, which relies on sending user queries to Anthropic’s remote servers for processing. Instead, Goose is designed to run entirely on a user’s local computer, leveraging open-source language models that developers download and manage themselves.
The project’s documentation highlights its ambition to go "beyond code suggestions," enabling the agent to "install, execute, edit, and test with any LLM." That concluding phrase, "any LLM," is the pivotal differentiator. Goose is inherently model-agnostic, providing unparalleled flexibility. Developers can choose to connect Goose to Anthropic’s Claude models if they possess API access, or integrate it with OpenAI’s GPT-5, Google’s Gemini, or route it through services like Groq or OpenRouter. Most importantly, Goose empowers users to run it entirely locally using tools such as Ollama, which facilitates the downloading and execution of open-source models directly on personal hardware.
The practical implications of this local setup are profound. By operating independently of cloud services, developers are freed from subscription fees, usage caps, rate limits, and any concerns about their proprietary code being transmitted to external servers. All interactions with the AI remain securely on the user’s machine. Sareen underscored this benefit during his demonstration, noting, "I use Ollama all the time on planes—it’s a lot of fun!" This illustrates how local models liberate developers from the constraints of constant internet connectivity.
What Goose Can Do That Traditional Code Assistants Can’t
Goose functions as either a command-line tool or a desktop application, capable of autonomously executing complex development tasks. Its capabilities extend far beyond simple code completion, encompassing the ability to build entire projects from scratch, write and execute code, debug failures, orchestrate workflows across multiple files, and interact with external APIs—all without requiring constant human intervention.
This advanced functionality is rooted in what the AI industry terms "tool calling" or "function calling." This mechanism allows a language model to request specific actions from external systems. When a developer instructs Goose to create a new file, run a test suite, or check the status of a GitHub pull request, the agent doesn’t merely generate text describing the desired action. Instead, it actively executes those operations within the development environment.
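The loop behind this behavior can be sketched in a few lines. The following is a minimal illustration of the tool-calling pattern, not Goose's actual internals: the tool names, the JSON action format, and the canned "model" are all hypothetical stand-ins for a real LLM that emits structured tool calls.

```python
# Minimal sketch of a tool-calling agent loop, in the spirit of tools like
# Goose. The tool set, action format, and scripted "model" are hypothetical
# illustrations, not Goose's real API.
import json
from pathlib import Path

TOOLS = {
    "create_file": lambda path, content: Path(path).write_text(content),
    "read_file": lambda path: Path(path).read_text(),
}

def run_agent(model_step, request: str) -> str:
    """Ask the model for the next action, execute it, feed back the result."""
    observation = request
    while True:
        action = json.loads(model_step(observation))  # model emits a JSON action
        if action["tool"] == "done":
            return action["result"]
        result = TOOLS[action["tool"]](**action["args"])  # execute the tool call
        observation = f"tool {action['tool']} returned: {result!r}"

# A canned "model" that writes a file, reads it back, then finishes:
script = iter([
    '{"tool": "create_file", "args": {"path": "hello.py", "content": "print(42)"}}',
    '{"tool": "read_file", "args": {"path": "hello.py"}}',
    '{"tool": "done", "result": "created and verified hello.py"}',
])
print(run_agent(lambda obs: next(script), "create hello.py"))
```

The essential point is that the model's output is executed, not merely displayed, and each tool's result is fed back so the model can decide its next step.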
The efficacy of this capability largely depends on the underlying language model. According to the Berkeley Function-Calling Leaderboard, which evaluates models on their proficiency in translating natural language requests into executable code and system commands, Claude 4 models from Anthropic currently demonstrate superior performance in tool calling. However, newer open-source models are rapidly closing this gap. Goose’s documentation highlights several promising options with robust tool-calling support, including Meta’s Llama series, Alibaba’s Qwen models, Google’s Gemma variants, and DeepSeek’s reasoning-focused architectures.
Furthermore, Goose integrates with the Model Context Protocol (MCP), an emerging standard designed to connect AI agents to various external services. Through MCP, Goose can access databases, search engines, file systems, and third-party APIs, significantly expanding its operational capabilities beyond what the base language model alone provides.
Setting Up Goose with a Local Model
For developers seeking a completely free, privacy-preserving AI coding environment, setting up Goose involves three primary components: the Goose agent itself, Ollama (a tool for running open-source models locally), and a compatible language model.
Step 1: Install Ollama
Ollama is an open-source project specifically designed to simplify the intricate process of running large language models on personal hardware. It handles the complex tasks of downloading, optimizing, and serving these models through a user-friendly interface. To begin, download and install Ollama from ollama.com. Once installed, models can be pulled with a single command. For coding tasks, Qwen 2.5 is recommended for its strong tool-calling capabilities. An example command would be: ollama run qwen2.5. The selected model will automatically download and commence running on your machine.
Step 2: Install Goose
Goose is available as both a desktop application and a command-line interface (CLI). The desktop version offers a more visual experience, while the CLI caters to developers who prefer working exclusively in the terminal. Installation instructions vary by operating system but typically involve downloading from Goose’s GitHub releases page or utilizing a package manager. Block provides convenient pre-built binaries for macOS (supporting both Intel and Apple Silicon), Windows, and Linux.
Step 3: Configure the Connection
Within Goose Desktop, navigate to Settings, then select "Configure Provider," and choose Ollama. Verify that the API Host is set to http://localhost:11434, which is Ollama’s default port, and then click Submit. For the command-line version, execute goose configure, select "Configure Providers," choose Ollama, and enter the desired model name when prompted. With these steps completed, Goose is now successfully connected to a language model running entirely on your hardware, poised to execute complex coding tasks without any subscription fees or external dependencies.
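Before pointing Goose at Ollama, it can be useful to confirm that the server is actually listening on the default port. This small convenience check (not part of Goose's own setup) queries Ollama's /api/tags endpoint, which lists locally installed models:

```python
# Quick sanity check: is an Ollama server reachable on its default port?
# /api/tags is Ollama's endpoint for listing locally installed models.
import json
import urllib.request
import urllib.error

def ollama_reachable(host: str = "http://localhost:11434") -> bool:
    """Return True if an Ollama server responds on the given host."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=2) as resp:
            models = json.load(resp).get("models", [])
            print(f"Ollama is up; {len(models)} model(s) installed.")
            return True
    except (urllib.error.URLError, OSError):
        print("No Ollama server found - is it installed and running?")
        return False

ollama_reachable()
```

If the check fails, Goose will show the same symptom (no available models), so this is usually the first thing to verify.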
The RAM, Processing Power, and Trade-offs You Should Know About
A natural question arises regarding the necessary computer specifications for running Goose locally. Operating large language models on personal hardware demands significantly more computational resources than typical software applications. The primary constraint is memory—specifically, RAM on most systems, or VRAM if a dedicated graphics card is employed for acceleration.
Block’s documentation suggests that 32 gigabytes of RAM establishes "a solid baseline for larger models and outputs." For Mac users, the computer’s unified memory serves as the main bottleneck. For Windows and Linux users equipped with discrete NVIDIA graphics cards, GPU memory (VRAM) becomes more critical for accelerating model inference.
However, it’s not strictly necessary to possess expensive, high-end hardware to get started. Smaller models, with fewer parameters, can operate effectively on more modest systems. Qwen 2.5, for example, is available in multiple sizes, with its smaller variants capable of running efficiently on machines equipped with 16 gigabytes of RAM. "You don’t need to run the largest models to get excellent results," Sareen emphasized, recommending that developers start with a smaller model to test their workflow and then scale up as needed. For context, an entry-level Apple MacBook Air with 8 gigabytes of RAM would likely struggle with most capable coding models, whereas a MacBook Pro with 32 gigabytes—an increasingly common configuration among professional developers—can handle them comfortably.
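These memory guidelines follow from simple arithmetic: a model's weight footprint is roughly its parameter count times the bytes stored per parameter, which quantization shrinks. The sketch below uses a hypothetical 7-billion-parameter model; real memory usage is higher (context cache, runtime overhead), so treat these as lower bounds.

```python
# Rough lower-bound estimate of model weight memory at different precisions.
# Real usage is higher (KV cache, runtime overhead); these are ballpark figures.
def weight_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate weight footprint in gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return round(bytes_total / 1e9, 1)

# A hypothetical 7B-parameter model at common precisions:
print(weight_gb(7, 16))  # fp16: 14.0 GB of weights -> wants a 32 GB machine
print(weight_gb(7, 4))   # 4-bit quantized: 3.5 GB -> fits comfortably in 16 GB
```

This is why quantized variants of the same model, which Ollama serves by default, make local coding agents viable on mid-range laptops.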
Goose vs. Claude Code: Key Trade-offs
Even with a local LLM setup, Goose is not a perfect drop-in substitute for Claude Code. The comparison involves genuine trade-offs that developers must weigh.
Model Quality: Claude 4.5 Opus, Anthropic’s flagship model, currently maintains its position as arguably the most capable AI for intricate software engineering tasks. It demonstrates exceptional proficiency in comprehending complex codebases, adhering to nuanced instructions, and generating high-quality code with minimal revisions. While open-source models have advanced dramatically, a discernible gap persists, particularly when tackling the most challenging and abstract coding problems. One developer who transitioned to the $200 Claude Code plan described the difference succinctly: "When I say ‘make this look modern,’ Opus knows what I mean. Other models give me Bootstrap circa 2015."
Context Window: Claude Sonnet 4.5, accessible via Anthropic’s API, boasts a massive one-million-token context window. This capacity is sufficient to load entire large codebases without the need for complex chunking or manual context management. In contrast, most local models are typically limited to 4,096 or 8,192 tokens by default, though many can be configured for longer contexts at the expense of increased memory consumption and slower processing speeds.
Speed: Cloud-based services like Claude Code operate on dedicated server hardware specifically optimized for AI inference, resulting in rapid response times. Local models, running on consumer-grade laptops, generally process requests more slowly. This difference in speed can be significant for iterative development workflows that demand rapid changes and immediate AI feedback.
Tooling Maturity: Claude Code benefits from Anthropic’s dedicated engineering resources, which ensure features like prompt caching (capable of reducing costs by up to 90 percent for repeated contexts) and structured outputs are polished and comprehensively documented. Goose, while actively developed with 102 releases to date, relies on community contributions and may exhibit less refinement in specific areas compared to its commercial counterparts.
How Goose Stacks Up Against Cursor, GitHub Copilot, and the Paid AI Coding Market
Goose enters a highly competitive market for AI coding tools but carves out a distinctive niche. Cursor, a popular AI-enhanced code editor, charges $20 per month for its Pro tier and $200 for Ultra, mirroring Claude Code’s Max plans. At the Ultra level, however, Cursor provides approximately 4,500 Sonnet 4 requests per month, a substantially different allocation model from Claude Code’s hourly resets.
Other open-source projects, such as Cline and Roo Code, offer AI coding assistance but with varying degrees of autonomy and tool integration. Many of these projects primarily focus on code completion and suggestions rather than the agentic task execution that defines both Goose and Claude Code. Enterprise offerings from major cloud providers like Amazon’s CodeWhisperer and GitHub Copilot are typically aimed at large organizations with complex procurement processes and dedicated budgets, making them less relevant to individual developers and small teams seeking lightweight, flexible tools.
Goose’s unique value proposition lies in its combination of genuine autonomy, model agnosticism, local operation, and zero cost. The tool is not designed to compete with commercial offerings solely on the grounds of polish or raw model quality. Instead, its primary competitive edge is freedom—both financial and architectural.
The $200-a-Month Era for AI Coding Tools May Be Ending
The market for AI coding tools is undergoing rapid evolution. Open-source models are improving at an astonishing pace, consistently narrowing the performance gap with proprietary alternatives. Emerging models like Moonshot AI’s Kimi K2 and z.ai’s GLM 4.5 now benchmark near the levels of Claude Sonnet 4—and they are freely available.
If this trajectory continues, the quality advantage that currently justifies Claude Code’s premium pricing may erode significantly. Anthropic would then face increasing pressure to compete on features, user experience, and seamless integration rather than solely on raw model capability.
For the time being, developers are presented with a clear choice. Those who demand the absolute best model quality, can afford premium pricing, and are willing to accept usage restrictions may continue to prefer Claude Code. However, those who prioritize cost-effectiveness, data privacy, offline access, and maximum flexibility now have a genuine and robust alternative in Goose. The mere existence of a zero-dollar, open-source competitor offering comparable core functionality to a $200-per-month commercial product is itself remarkable. It reflects both the accelerating maturation of open-source AI infrastructure and a growing demand among developers for tools that respect their autonomy and control.
Goose is not without its limitations. It requires a more technical setup than commercial alternatives, depends on hardware resources that not every developer possesses, and its model options, while rapidly improving, still trail the very best proprietary offerings on the most complex tasks. Yet, for a burgeoning community of developers, these limitations are acceptable trade-offs for something increasingly rare and valuable in the AI landscape: a tool that truly belongs to them.
Goose is available for download at github.com/block/goose. Ollama is available at ollama.com. Both projects are free and open source.