The artificial intelligence coding revolution, for all its promised productivity, arrives with a significant caveat: its often-exorbitant cost. Anthropic’s Claude Code, a terminal-based AI agent known for its ability to autonomously write, debug, and deploy code, has captured the attention of software developers worldwide. However, its tiered pricing structure—ranging from $20 to $200 per month depending on usage—has ignited a growing backlash among the very programmers it aims to empower.
Amidst this discontent, a compelling free alternative is rapidly gaining momentum. Dubbed Goose, this open-source AI agent, developed by Block (the financial technology company formerly known as Square), provides functionality nearly identical to Claude Code. Crucially, Goose distinguishes itself by running entirely on a user’s local machine, eliminating subscription fees, cloud dependencies, and restrictive rate limits that typically reset every few hours.
"Your data stays with you, period," affirmed Parth Sareen, a software engineer, during a recent livestream demonstration of the tool. This statement encapsulates Goose’s core appeal: it grants developers unparalleled control over their AI-powered workflow, including the invaluable capability to work entirely offline, even in environments like an airplane.
The project’s popularity has soared since its launch. Goose has amassed more than 26,100 stars on GitHub, the premier code-sharing platform. With 362 contributors and 102 releases to its name, the project shows a development velocity that rivals many commercial software products. Its latest version, 1.20.1, shipped on January 19, 2026. For developers frustrated by Claude Code’s pricing and stringent usage caps, Goose is a rarity in the AI industry: a genuinely free, no-strings-attached option for serious development work.
Anthropic’s New Rate Limits Spark a Developer Revolt
To fully grasp the significance of Goose, it helps to understand the ongoing controversy surrounding Claude Code’s pricing and usage policies. Anthropic, the San Francisco-based artificial intelligence company founded by former OpenAI executives, bundles Claude Code into its subscription tiers. The free plan offers no access to Claude Code at all. The Pro plan, priced at $17 per month with annual billing (or $20 monthly), restricts users to 10 to 40 prompts every five hours, a cap many professional developers exhaust within minutes of intensive coding.
Higher-tier Max plans, costing $100 and $200 per month, offer more generous allowances, providing 50 to 200 prompts and 200 to 800 prompts respectively, alongside access to Anthropic’s most powerful model, Claude 4.5 Opus. However, even these premium subscriptions come with restrictions that have inflamed the developer community.
The situation escalated in late July when Anthropic announced new weekly rate limits. Under this revised system, Pro users are allocated 40 to 80 hours of Sonnet 4 usage per week. Max users on the $200 tier receive 240 to 480 hours of Sonnet 4, plus an additional 24 to 40 hours of Opus 4. Nearly five months later, the widespread frustration among developers remains palpable.
The primary issue lies in the ambiguity of these "hours." They do not represent actual clock hours but rather token-based limits that fluctuate dramatically depending on factors such as codebase size, conversation length, and the inherent complexity of the code being processed. Independent analyses suggest that the actual per-session limits translate to approximately 44,000 tokens for Pro users and around 220,000 tokens for the $200 Max plan.
"It’s confusing and vague," one developer wrote in a widely shared analysis. "When they say ‘24-40 hours of Opus 4,’ that doesn’t really tell you anything useful about what you’re actually getting."
The backlash has been fierce across platforms like Reddit and various developer forums. Numerous users have reported hitting their daily limits within as little as 30 minutes of intensive coding. Others have gone as far as canceling their subscriptions entirely, denouncing the new restrictions as "a joke" and "unusable for real work." Anthropic has publicly defended these changes, asserting that the limits affect fewer than five percent of users and are specifically aimed at individuals running Claude Code "continuously in the background, 24/7." However, the company has not clarified whether this figure refers to five percent of Max subscribers or five percent of its entire user base—a distinction with significant implications for how the impact is perceived.
How Block Built a Free AI Coding Agent That Works Offline
Goose takes a fundamentally different approach to the same core problem of AI-assisted coding. Built by Block, the payments company led by Jack Dorsey, Goose is what engineers call an "on-machine AI agent." In contrast to Claude Code, which sends user queries to Anthropic’s remote servers for processing, Goose runs entirely on a user’s local computer, leveraging open-source large language models (LLMs) that users download and control directly.
The project’s documentation clearly states its ambition to go "beyond code suggestions," aiming to "install, execute, edit, and test with any LLM." That last phrase—"any LLM"—is the critical differentiator. Goose is meticulously designed to be model-agnostic.
This flexibility means developers can connect Goose to Anthropic’s Claude models if they have API access, integrate it with OpenAI’s GPT-5 or Google’s Gemini, or route it through services like Groq or OpenRouter. But where Goose truly shines is in its ability to run entirely locally using tools such as Ollama, which simplifies downloading and executing open-source models directly on a user’s own hardware.
The practical implications of this local setup are profound. With a local configuration, developers are freed from subscription fees, usage caps, and rate limits. Furthermore, there are no concerns about proprietary code being transmitted to external servers, as all conversations and data interactions with the AI remain securely on the user’s machine. "I use Ollama all the time on planes—it’s a lot of fun!" Sareen noted during his demonstration, underscoring how local models liberate developers from the constraints of internet connectivity and cloud dependence.
What Goose Can Do That Traditional Code Assistants Can’t
Goose operates primarily as a command-line tool or a desktop application, engineered to autonomously perform complex development tasks. Its capabilities extend far beyond simple code completion or suggestions. Goose can initiate and build entire projects from scratch, write and execute code, debug failures, orchestrate intricate workflows across multiple files, and seamlessly interact with external application programming interfaces (APIs)—all with minimal human oversight.
This advanced functionality relies heavily on a mechanism known in the AI industry as "tool calling" or "function calling." This refers to the ability of a language model to intelligently request and trigger specific actions from external systems. When a developer instructs Goose to create a new file, run a test suite, or check the status of a GitHub pull request, the AI doesn’t merely generate text describing what should happen; it actively executes those operations within the local development environment.
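The loop described above can be sketched in a few lines of Python. This is a simplified illustration of the pattern, not Goose's actual implementation: the tool registry, the `create_file` stub, and the JSON format of the model's output are all hypothetical.

```python
import json

# Hypothetical tool registry mapping tool names to local functions.
# A real agent like Goose exposes many such tools (file I/O, shell, git).
def create_file(path: str, contents: str) -> str:
    # In a real agent this would write to disk; stubbed here.
    return f"created {path} ({len(contents)} bytes)"

TOOLS = {"create_file": create_file}

# A tool call as a model might emit it: a structured request,
# not prose describing what should happen.
model_output = json.dumps({
    "tool": "create_file",
    "arguments": {"path": "hello.py", "contents": "print('hi')\n"},
})

def dispatch(raw: str) -> str:
    """Parse the model's structured tool call and run the matching function."""
    call = json.loads(raw)
    fn = TOOLS[call["tool"]]
    return fn(**call["arguments"])

print(dispatch(model_output))  # → created hello.py (12 bytes)
```

In practice the agent feeds the tool's return value back into the conversation, letting the model decide the next step, which is what makes the workflow autonomous rather than a one-shot completion.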
The effectiveness of this capability is significantly influenced by the underlying language model. According to the Berkeley Function-Calling Leaderboard, which benchmarks models on their proficiency in translating natural language requests into executable code and system commands, Claude 4 models from Anthropic currently demonstrate superior performance in tool calling. However, newer open-source models are rapidly closing this gap. Goose’s documentation highlights several promising options with robust tool-calling support, including Meta’s Llama series, Alibaba’s Qwen models, Google’s Gemma variants, and DeepSeek’s reasoning-focused architectures.
The tool also integrates with the Model Context Protocol (MCP), an emerging standard designed to connect AI agents to various external services. Through MCP, Goose can access and leverage databases, search engines, file systems, and a wide array of third-party APIs, significantly extending its capabilities beyond what the base language model alone could provide.
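On the wire, MCP messages follow JSON-RPC 2.0, and an agent invokes a server-exposed tool via the `tools/call` method. The sketch below shows roughly how such a request is framed; the tool name and arguments are hypothetical, and a real client would also handle initialization and capability negotiation.

```python
import json

# Minimal sketch of framing an MCP tool invocation as a JSON-RPC 2.0
# request. The "query_database" tool and its arguments are invented
# for illustration; real tools come from whatever MCP server is attached.
def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = mcp_tool_call(1, "query_database", {"sql": "SELECT count(*) FROM users"})
print(msg)
```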
Setting Up Goose with a Local Model
For developers keen on a completely free, privacy-preserving, and locally controlled AI coding setup, the process involves three primary components: Goose itself, Ollama (the tool for running open-source models locally), and a compatible language model.
Step 1: Install Ollama
Ollama is an open-source project designed to simplify running large language models on personal hardware. It handles downloading, optimizing, and serving models behind a simple interface. To begin, download and install Ollama from ollama.com. Once installed, users can pull models with a single command. For coding tasks, Qwen 2.5 is frequently recommended for its strong tool-calling support.
ollama run qwen2.5
The model downloads automatically and starts running on your machine.
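Once running, Ollama serves a local HTTP API on port 11434 that clients such as Goose talk to. The sketch below builds a request for Ollama's documented `/api/chat` endpoint; the network call itself is left commented out so the example stands alone without a running server.

```python
import json

# Build a chat request for Ollama's local /api/chat endpoint.
# "stream": False asks for a single complete response instead of
# a stream of partial chunks.
payload = {
    "model": "qwen2.5",
    "messages": [{"role": "user", "content": "Write a Python hello world."}],
    "stream": False,
}
body = json.dumps(payload)

# To actually send it (requires Ollama running locally):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/chat",
#     data=body.encode(), headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())

print(body)
```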
Step 2: Install Goose
Goose is available both as a dedicated desktop application and a command-line interface (CLI). The desktop version offers a more visual and intuitive experience, while the CLI caters to developers who prefer to work entirely within the terminal. Installation instructions vary slightly by operating system but generally involve downloading pre-built binaries from Goose’s GitHub releases page or utilizing a package manager. Block provides official binaries for macOS (supporting both Intel and Apple Silicon), Windows, and Linux.
Step 3: Configure the Connection
In Goose Desktop, the process is straightforward: navigate to Settings, then Configure Provider, and select Ollama. Confirm that the API Host is set to http://localhost:11434 (Ollama’s default port) and click Submit. For the command-line version, execute goose configure, select "Configure Providers," choose Ollama, and enter the desired model name when prompted. With these steps completed, Goose is now fully connected to a language model running entirely on your local hardware, poised to execute complex coding tasks without any subscription fees or external dependencies.
The RAM, Processing Power, and Trade-offs You Should Know About
A natural and critical question arises: what kind of computer hardware is necessary to run such a setup? Running large language models locally inherently demands substantially more computational resources than typical software applications. The primary constraint is memory—specifically, RAM on most general-purpose systems, or VRAM if a dedicated graphics card is being utilized for acceleration.
Block’s official documentation suggests that 32 gigabytes of RAM provides "a solid baseline for larger models and outputs." For Mac users, this refers to the computer’s unified memory, which serves as the primary bottleneck. For Windows and Linux users with discrete NVIDIA graphics cards, the GPU memory (VRAM) becomes more crucial for effective acceleration.
That said, developers do not need top-tier hardware to get started. Smaller models with fewer parameters run effectively on much more modest systems. Qwen 2.5, for instance, comes in multiple sizes, and its smaller variants function capably on machines with 16 gigabytes of RAM. "You don’t need to run the largest models to get excellent results," Sareen emphasized. The common recommendation is to begin with a smaller model to establish and test the workflow, then scale up as needs dictate. For context, an entry-level Apple MacBook Air with 8 gigabytes of RAM would likely struggle with most capable coding models, while a MacBook Pro with 32 gigabytes of unified memory, an increasingly common configuration among professional developers, handles them comfortably.
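A back-of-the-envelope way to reason about these requirements: a model's weights occupy roughly its parameter count times the bytes per parameter, before KV-cache and runtime overhead. The sketch below applies that heuristic; exact figures vary by quantization format and runtime, so treat the numbers as rough guides rather than loader requirements.

```python
# Rough memory estimate for a model's weights: parameters times
# bytes per parameter. Real usage is higher once the KV cache and
# runtime overhead are included.
def approx_weights_gb(params_billion: float, bits_per_param: int) -> float:
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return round(bytes_total / 1e9, 1)

for size_b in (7, 14, 32):
    # 4-bit quantization is a common default for local inference.
    print(f"{size_b}B params @ 4-bit ≈ {approx_weights_gb(size_b, 4)} GB")
```

By this estimate a 7-billion-parameter model quantized to 4 bits needs around 3.5 GB for weights alone, which is why such models fit comfortably on 16 GB machines, while 32-billion-parameter models push toward the 32 GB baseline the documentation recommends.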
Where Claude Code Still Holds the Edge: The Trade-offs to Weigh
While Goose, particularly when paired with a local LLM, offers compelling advantages, it is not presented as a perfect, direct substitute for Claude Code. The comparison involves genuine trade-offs that developers must carefully consider.
Model Quality: Claude 4.5 Opus, Anthropic’s flagship model, currently remains arguably the most capable AI for sophisticated software engineering tasks. It excels at comprehending intricate codebases, meticulously following nuanced instructions, and consistently producing high-quality code on the initial attempt. While open-source models have made dramatic advancements, a discernible gap persists, especially when tackling the most challenging and abstract coding problems. One developer who transitioned to the $200 Claude Code plan described the difference bluntly: "When I say ‘make this look modern,’ Opus knows what I mean. Other models give me Bootstrap circa 2015."
Context Window: Claude Sonnet 4.5, accessible via the API, boasts an exceptionally large one-million-token context window. This capacity is often sufficient to load entire large codebases into memory without requiring complex chunking strategies or manual context management. In contrast, most local models are typically limited to context windows of 4,096 or 8,192 tokens by default, although many can be configured for longer contexts at the expense of increased memory usage and potentially slower processing speeds.
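To judge whether a codebase fits a given context window, a common rule of thumb is roughly one token per four characters of English text or code. The sketch below applies that heuristic; real counts depend on the specific model's tokenizer, so it is an estimate, not a measurement.

```python
# Rough token estimate: ~4 characters per token for English text/code.
# Actual counts vary by tokenizer; use this only for ballpark sizing.
def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

source_chars = 200_000  # e.g., a mid-sized codebase concatenated
tokens = approx_tokens("x" * source_chars)
print(tokens)  # prints 50000

# Roughly 50,000 tokens overflows a default 8,192-token local context
# but fits easily in a one-million-token cloud context.
print(tokens <= 8_192)
```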
Speed: Cloud-based services like Claude Code operate on dedicated server hardware specifically optimized for AI inference, resulting in generally faster processing times. Local models, running on consumer-grade laptops or workstations, typically process requests more slowly. This difference in speed can be significant for iterative development workflows where developers are making rapid changes and expecting immediate AI feedback.
Tooling Maturity: Claude Code benefits from Anthropic’s dedicated engineering resources, which ensure its features are highly polished and well-documented. Features such as prompt caching (which can significantly reduce costs by up to 90 percent for repeated contexts) and structured outputs are often robust and refined. Goose, while under active development with 102 releases to date, relies on community contributions and may, in certain specific areas, lack an equivalent level of refinement or feature completeness.
How Goose Stacks Up Against Cursor, GitHub Copilot, and the Paid AI Coding Market
Goose enters a competitive and rapidly expanding market of AI coding tools, yet it carves out a distinctive and valuable position.
Consider Cursor, a popular AI-enhanced code editor. Its pricing structure, with a Pro tier at $20 per month and an Ultra tier at $200, closely mirrors Claude Code’s Max plans. However, Cursor provides approximately 4,500 Sonnet 4 requests per month at the Ultra level, representing a substantially different allocation model compared to Claude Code’s hourly resets.
Other open-source projects like Cline and Roo Code also offer AI coding assistance but with varying levels of autonomy and tool integration. Many of these projects tend to focus more on intelligent code completion rather than the comprehensive, agentic task execution that defines both Goose and Claude Code.
Meanwhile, enterprise-focused offerings such as Amazon’s CodeWhisperer and GitHub Copilot target large organizations with complex procurement processes and dedicated IT budgets. These solutions are generally less relevant to individual developers and small teams who prioritize lightweight, flexible, and often free tools.
Goose’s unique value proposition lies in its powerful combination of genuine autonomy, model agnosticism, purely local operation, and zero cost. The tool is not attempting to compete directly with commercial offerings solely on the grounds of polish or raw model quality. Instead, its primary competitive advantage rests on providing freedom—both financial and architectural—to its users.
The $200-a-Month Era for AI Coding Tools May Be Ending
The market for AI coding tools is undergoing rapid and continuous evolution. Open-source models, in particular, are improving at an astonishing pace, steadily narrowing the performance gap with their proprietary counterparts. Recent examples like Moonshot AI’s Kimi K2 and z.ai’s GLM 4.5 are now benchmarking remarkably close to Claude Sonnet 4 levels—and they are freely available for use.
If this trajectory continues, the significant quality advantage that currently justifies Claude Code’s premium pricing may gradually erode. Anthropic, and other proprietary AI providers, would then face increasing pressure to compete on factors beyond raw model capability, such as innovative features, superior user experience, and seamless integration.
For the immediate future, developers are presented with a clear choice. Those who demand the absolute best in model quality, can comfortably afford premium pricing, and are willing to accept usage restrictions may continue to prefer Claude Code. However, for a rapidly growing segment of developers who prioritize cost-efficiency, data privacy, the ability to work offline, and unparalleled flexibility, Goose offers a genuine and compelling alternative.
The very existence of a zero-dollar, open-source competitor that offers comparable core functionality to a $200-per-month commercial product is, in itself, a remarkable testament to the current state of the industry. It reflects both the accelerated maturation of open-source AI infrastructure and a strong, growing appetite among developers for tools that genuinely respect and empower their autonomy.
Goose is not without its limitations. It typically requires more technical setup proficiency than commercial alternatives, and its optimal performance depends on hardware resources that not every developer possesses. Its model options, while improving at breakneck speed, may still lag behind the very best proprietary offerings for the most complex and nuanced tasks.
Yet, for a burgeoning community of developers, these limitations are increasingly seen as acceptable trade-offs for something profoundly rare in the contemporary AI landscape: a tool that truly belongs to them.
Goose is available for download at github.com/block/goose. Ollama is available at ollama.com. Both projects are free and open source.