The landscape of artificial intelligence-powered coding is undergoing a significant transformation, marked by both groundbreaking innovation and growing user discontent over cost and control. While powerful AI agents like Anthropic’s Claude Code have revolutionized software development by offering autonomous code generation, debugging, and deployment, their premium pricing models and restrictive usage policies have ignited a rebellion among the very programmers they aim to empower. This has paved the way for a compelling, free, and open-source alternative: Goose, an on-machine AI agent developed by Block (formerly Square), which promises comparable functionality without subscription fees, cloud dependency, or arbitrary rate limits.
Claude Code, a terminal-based AI agent, has captivated software developers globally with its advanced capabilities. However, its pricing structure, ranging from $20 to $200 per month based on usage, has become a flashpoint for frustration. Developers are increasingly questioning the value proposition, particularly as a robust, no-cost alternative emerges, challenging the commercial dominance in this nascent but rapidly evolving sector. Goose, in contrast, runs entirely on a user’s local machine, offering complete data privacy and the freedom to operate offline, even in environments like an airplane. As Parth Sareen, a software engineer, highlighted during a recent livestream demonstration, "Your data stays with you, period," a statement that encapsulates Goose’s core appeal and commitment to user autonomy.
The project’s popularity has soared, evident in its impressive traction on GitHub, the premier code-sharing platform. Goose currently boasts over 26,100 stars, indicating strong community interest and adoption. With 362 contributors and 102 releases since its inception, including the latest version 1.20.1 shipped on January 19, 2026, Goose demonstrates a development velocity that rivals many commercial products. For developers who have expressed frustration with Claude Code’s pricing tiers and usage caps, Goose represents a refreshing and increasingly rare offering in the AI industry: a genuinely free, unrestricted option for serious and professional development work.
Anthropic’s New Rate Limits Spark a Developer Revolt
To fully grasp the significance of Goose, it is crucial to understand the controversy surrounding Claude Code’s pricing and usage policies. Anthropic, the San Francisco-based artificial intelligence company founded by former OpenAI executives, integrates Claude Code into its various subscription plans. The free tier offers no access to Claude Code, while the Pro plan, priced at $17 per month with annual billing (or $20 monthly), imposes severe limitations of just 10 to 40 prompts every five hours. This constraint has proven highly impractical for serious developers, who often exhaust these limits within minutes during intensive coding sessions.
The Max plans, available at $100 and $200 per month, offer more generous allowances, providing 50 to 200 prompts and 200 to 800 prompts respectively, alongside access to Anthropic’s most powerful model, Claude 4.5 Opus. However, even these premium tiers are not immune to restrictions, which have further inflamed the developer community. In late July, Anthropic introduced new weekly rate limits, replacing the five-hour reset system. Under this new regime, Pro users receive 40 to 80 hours of Sonnet 4 usage per week, while Max users on the $200 tier are allocated 240 to 480 hours of Sonnet 4, plus an additional 24 to 40 hours of Opus 4. Nearly five months later, the widespread frustration among developers remains palpable.
The primary issue stems from the ambiguity of these "hours," which are not literal time units but rather token-based limits. These limits fluctuate considerably depending on factors such as codebase size, conversation length, and the complexity of the code being processed by the AI. Independent analyses suggest that the actual per-session limits translate to approximately 44,000 tokens for Pro users and around 220,000 tokens for the $200 Max plan. This lack of clarity has been a major point of contention. As one developer noted in a widely circulated analysis, "It’s confusing and vague. When they say ’24-40 hours of Opus 4,’ that doesn’t really tell you anything useful about what you’re actually getting."
The backlash has been particularly fierce on platforms like Reddit and various developer forums. Many users have reported hitting their daily limits within as little as 30 minutes of intensive coding, leading some to cancel their subscriptions entirely, labeling the new restrictions as "a joke" and "unusable for real work." Anthropic has attempted to defend these changes, asserting that the limits affect fewer than five percent of users and are primarily aimed at individuals who run Claude Code "continuously in the background, 24/7." However, the company has not clarified whether this figure refers to five percent of Max subscribers or five percent of its entire user base, a distinction that holds significant implications for the perceived impact of the restrictions.
How Block Built a Free AI Coding Agent That Works Offline
Goose adopts a fundamentally different approach to solving the challenges of AI-assisted coding. Developed by Block, the financial technology company spearheaded by Jack Dorsey, Goose is engineered as an "on-machine AI agent." Unlike cloud-based services such as Claude Code, which necessitate sending user queries to remote servers for processing, Goose is designed to operate entirely on a user’s local computer. This is achieved by leveraging open-source language models that users can download and manage directly on their hardware.
The project’s documentation highlights its ambition to go "beyond code suggestions," enabling the agent to "install, execute, edit, and test with any LLM." This crucial phrase, "any LLM," underscores Goose’s key differentiator: its model-agnostic architecture. Developers are not locked into a single proprietary model. Instead, they have the flexibility to connect Goose to a wide array of language models. This includes Anthropic’s Claude models (provided the user has API access), as well as proprietary models from OpenAI (such as GPT-5) and Google (Gemini). Furthermore, Goose can route requests through third-party services like Groq or OpenRouter.
Crucially, Goose empowers users to run language models entirely locally using tools like Ollama. Ollama simplifies the process of downloading and executing open-source models directly on personal hardware. The practical ramifications of this local setup are substantial. It eliminates subscription fees, usage caps, and rate limits, while also assuaging concerns about proprietary code being transmitted to external servers. All interactions with the AI remain confined to the user’s machine, ensuring complete data privacy. Sareen emphasized the liberation this offers during a demonstration, noting, "I use Ollama all the time on planes – it’s a lot of fun!" This highlights how local models free developers from the constraints of internet connectivity, enabling uninterrupted work.
What Goose Can Do That Traditional Code Assistants Can’t
Goose functions as either a command-line tool or a desktop application, capable of autonomously executing complex development tasks. Its capabilities extend far beyond simple code suggestions; it can build entire projects from inception, write and execute code, debug failures, orchestrate workflows across multiple files, and interact seamlessly with external APIs, often without continuous human intervention.
This advanced functionality is rooted in a core AI industry concept known as "tool calling" or "function calling." This mechanism allows a language model to request and trigger specific actions from external systems or tools. For instance, when a developer instructs Goose to create a new file, run a test suite, or check the status of a GitHub pull request, the AI doesn’t merely generate text describing the desired outcome. Instead, it actively executes these operations within the user’s environment.
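The mechanics can be sketched in a few lines of Python. This is an illustrative toy, not Goose’s actual implementation: the model emits a structured request naming a tool and its arguments, and the agent looks that tool up in a registry and executes it in the local environment. The tool names and registry below are hypothetical.

```python
import json

# Hypothetical registry mapping tool names to plain Python functions.
# A real agent like Goose wires these to file, shell, and API operations.
TOOLS = {
    "create_file": lambda path, content: f"wrote {len(content)} bytes to {path}",
    "run_tests": lambda suite: f"ran suite '{suite}': all passing",
}

def dispatch(tool_call_json: str) -> str:
    """Execute one tool call emitted by the model as JSON."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Asked to "create hello.py", the model responds not with prose but with
# a structured call, which the agent executes on the user's machine:
result = dispatch('{"name": "create_file", '
                  '"arguments": {"path": "hello.py", "content": "print(1)"}}')
print(result)  # -> wrote 8 bytes to hello.py
```

The result of each call is fed back to the model, which decides the next step, giving the agent its autonomous, multi-step character.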
The effectiveness of this capability heavily relies on the sophistication of the underlying language model. According to the Berkeley Function-Calling Leaderboard, which benchmarks models on their ability to translate natural language into executable code and system commands, Anthropic’s Claude 4 models currently exhibit superior performance in tool calling. However, the gap is rapidly closing as newer open-source models gain ground. Goose’s documentation specifically highlights several open-source options with robust tool-calling support, including Meta’s Llama series, Alibaba’s Qwen models, Google’s Gemma variants, and DeepSeek’s reasoning-focused architectures.
Furthermore, Goose integrates with the Model Context Protocol (MCP), an emerging standard designed to facilitate connections between AI agents and various external services. Through MCP, Goose can access and leverage databases, search engines, file systems, and a broad spectrum of third-party APIs. This integration significantly expands its operational capabilities, allowing it to perform tasks that go far beyond what the base language model can achieve in isolation.
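Under the hood, MCP messages are JSON-RPC 2.0. As a rough sketch (the "query_db" tool and its arguments are hypothetical, and real clients also perform an initialization handshake first), a client invoking a server-side tool sends a "tools/call" request shaped like this:

```python
import itertools
import json

# MCP traffic is JSON-RPC 2.0; an agent invokes a server's tool with the
# "tools/call" method. The "query_db" tool name here is hypothetical.
_ids = itertools.count(1)

def mcp_tools_call(tool: str, arguments: dict) -> str:
    """Build one JSON-RPC request an MCP client would send to a server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = mcp_tools_call("query_db", {"sql": "SELECT COUNT(*) FROM users"})
print(msg)
```

Because the protocol is standardized, any MCP-compliant server, whether it fronts a database, a search engine, or a file system, becomes usable by Goose without bespoke integration work.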
Setting Up Goose with a Local Model
For developers seeking a completely free, private, and self-contained AI coding environment, setting up Goose with a local language model involves three primary components: Goose itself, Ollama (a tool for locally running open-source models), and a compatible language model.
Step 1: Install Ollama
Ollama is an open-source project designed to streamline the process of running large language models on personal hardware. It simplifies the typically complex tasks of downloading, optimizing, and serving these models via an intuitive interface. Users can download and install Ollama from ollama.com. Once installed, models can be pulled with a single command. For coding-specific tasks, Qwen 2.5 is often recommended due to its strong tool-calling capabilities. A simple command like ollama run qwen2.5 will automatically download and start the model on the user’s machine.
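Once a model is running, Ollama serves an HTTP API on localhost port 11434, which is what agents like Goose talk to. The sketch below assembles a request body for Ollama’s /api/chat endpoint; the actual network call is left commented out because it requires a running Ollama server.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_chat_request(model: str, prompt: str) -> bytes:
    """Assemble the JSON body Ollama's /api/chat endpoint expects."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one complete response instead of chunks
    }).encode()

body = build_chat_request("qwen2.5", "Write a function that reverses a string.")

# With Ollama running locally, the request never leaves the machine:
# req = urllib.request.Request(OLLAMA_URL, data=body,
#                              headers={"Content-Type": "application/json"})
# reply = json.loads(urllib.request.urlopen(req).read())
# print(reply["message"]["content"])
```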
Step 2: Install Goose
Goose is available as both a desktop application and a command-line interface (CLI). The desktop version offers a more visual user experience, while the CLI caters to developers who prefer working exclusively within the terminal. Installation instructions vary by operating system, typically involving downloading pre-built binaries from Goose’s GitHub releases page or utilizing a package manager. Block provides binaries for macOS (supporting both Intel and Apple Silicon), Windows, and Linux, ensuring broad compatibility.
Step 3: Configure the Connection
To connect Goose Desktop to Ollama, users navigate to Settings, then Configure Provider, and select Ollama. They must confirm that the API Host is set to http://localhost:11434, which is Ollama’s default port, and then click Submit. For the command-line version, the process involves running goose configure, selecting "Configure Providers," choosing Ollama, and entering the desired model name when prompted. With these steps completed, Goose is fully connected to a language model running entirely on the local hardware, ready to execute complex coding tasks without any reliance on external services, subscription fees, or cloud dependencies.
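For reference, the configure flow persists these choices to Goose’s configuration file (on macOS and Linux, typically ~/.config/goose/config.yaml). The keys below are a sketch based on Goose’s documented provider settings and may differ between releases; running goose configure generates this file for you.

```yaml
# Sketch of Goose provider settings; normally written by `goose configure`.
GOOSE_PROVIDER: ollama     # use the local Ollama server as the backend
GOOSE_MODEL: qwen2.5       # any model already pulled with Ollama
OLLAMA_HOST: localhost     # Ollama listens on port 11434 by default
```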
The RAM, Processing Power, and Trade-offs You Should Know About
A natural question arises regarding the hardware requirements for running large language models locally. Executing these models on personal machines demands substantially greater computational resources compared to typical software applications. The primary bottleneck is memory—specifically, RAM on most systems, or VRAM (video RAM) if a dedicated graphics card is utilized for acceleration.
Block’s documentation suggests that 32 gigabytes of RAM provides "a solid baseline for larger models and outputs." For Mac users, the unified memory of the system serves as the main constraint. Conversely, for Windows and Linux users equipped with discrete NVIDIA graphics cards, GPU memory (VRAM) becomes more critical for accelerating model inference. However, expensive, high-end hardware isn’t always a prerequisite to get started. Smaller models, with fewer parameters, can operate effectively on more modest systems. For instance, Qwen 2.5 offers multiple size variants, with its smaller configurations capable of running efficiently on machines with 16 gigabytes of RAM.
Sareen emphasized during his demonstration that "You don’t need to run the largest models to get excellent results." The practical advice for developers is to begin with a smaller model to establish and test their workflow, then incrementally scale up to larger models as specific needs dictate. For context, an entry-level Apple MacBook Air with 8 gigabytes of RAM would likely struggle with most capable coding models. However, a MacBook Pro with 32 gigabytes of unified memory, which is increasingly common among professional developers, can comfortably handle these computational demands.
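A back-of-the-envelope calculation shows why these numbers work out. A model’s weight footprint is roughly its parameter count times the bytes per parameter at a given quantization level, plus headroom for the context cache and runtime; the 25 percent overhead factor below is a rough assumption, not a measured value.

```python
def model_ram_gb(params_billion: float, bits_per_param: int,
                 overhead: float = 1.25) -> float:
    """Rough memory estimate: parameters x bytes per parameter, plus ~25%
    headroom for the KV cache and runtime (a coarse rule of thumb)."""
    weight_bytes = params_billion * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

# A 7B model quantized to 4 bits needs roughly 4.4 GB, comfortable on a
# 16 GB machine; a 32B model at 4 bits wants about 20 GB, which is why
# Block suggests 32 GB as a solid baseline for larger models.
print(f"{model_ram_gb(7, 4):.1f} GB")
print(f"{model_ram_gb(32, 4):.1f} GB")
```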
Model Quality, Context, and Speed: Where Claude Code Still Leads
While Goose with a local LLM offers compelling advantages, it is important to acknowledge that it is not a perfect, one-to-one substitute for commercial offerings like Claude Code. Developers must consider real trade-offs between the two approaches.
Model Quality: Anthropic’s flagship model, Claude 4.5 Opus, is widely regarded as one of the most capable AI models for complex software engineering tasks. It excels at comprehending intricate codebases, adhering to nuanced instructions, and generating high-quality code with remarkable accuracy on the first attempt. While open-source models have made significant strides, a performance gap persists, particularly when tackling the most challenging and sophisticated development problems. As one developer who transitioned to the $200 Claude Code plan succinctly put it, "When I say ‘make this look modern,’ Opus knows what I mean. Other models give me Bootstrap circa 2015."
Context Window: Claude Sonnet 4.5, available via API, boasts an exceptionally large one-million-token context window. This capacity is sufficient to load entire large codebases, eliminating the need for manual chunking or complex context management strategies. In contrast, most local models are typically limited to context windows of 4,096 or 8,192 tokens by default. While many can be configured for longer contexts, this often comes at the expense of increased memory usage and slower processing speeds.
Speed: Cloud-based services like Claude Code operate on dedicated server hardware specifically optimized for AI inference, resulting in very fast response times. Local models, running on consumer-grade laptops, generally process requests more slowly. This difference in speed can be a critical factor in iterative development workflows where rapid changes and immediate AI feedback are essential.
Tooling Maturity: Claude Code benefits from Anthropic’s substantial engineering resources, which contribute to its polished features, such as prompt caching (which can reduce costs by up to 90 percent for repeated contexts) and well-documented structured outputs. Goose, despite its active development with 102 releases to date, relies heavily on community contributions. Consequently, it may exhibit less refinement or feature parity in specific areas compared to its commercially backed counterparts.
How Goose Stacks Up Against Cursor, GitHub Copilot, and the Paid AI Coding Market
Goose enters a highly competitive market for AI coding tools, yet it carves out a distinctive niche. Cursor, for example, an AI-enhanced code editor, charges $20 per month for its Pro tier and $200 for Ultra, mirroring Claude Code’s Max plans. Cursor’s Ultra level provides approximately 4,500 Sonnet 4 requests per month, a substantially different allocation model from Claude Code’s hourly resets.
Other open-source projects such as Cline and Roo Code also offer AI coding assistance but often with varying degrees of autonomy and tool integration. Many of these alternatives tend to focus primarily on code completion functionalities rather than the comprehensive agentic task execution that defines both Goose and Claude Code. Meanwhile, enterprise-focused offerings like Amazon’s CodeWhisperer and GitHub Copilot are typically aimed at large organizations, which have complex procurement processes and dedicated budgets for such solutions, making them less relevant to individual developers and small teams seeking agile, cost-effective tools.
Goose’s unique value proposition lies in its combination of genuine autonomy, model agnosticism, local operation, and zero cost. It is not designed to compete directly with commercial offerings on the basis of sheer polish or the absolute highest model quality. Instead, its primary competitive edge is rooted in offering unparalleled freedom—both financial and architectural—to its users.
The $200-a-Month Era for AI Coding Tools May Be Ending
The market for AI coding tools is undergoing rapid evolution. Open-source models are continually improving, narrowing the performance gap with proprietary alternatives at an accelerating pace. Emerging models like Moonshot AI’s Kimi K2 and z.ai’s GLM 4.5 are now benchmarking near the levels of Anthropic’s Claude Sonnet 4, and critically, they are freely available.
Should this trajectory continue, the quality advantage that currently justifies Claude Code’s premium pricing may erode significantly. In such a scenario, Anthropic would face increasing pressure to compete on factors beyond raw model capability, focusing instead on advanced features, superior user experience, and seamless integrations.
For the time being, developers face a clear choice. Those who prioritize the absolute best model quality, are able to afford premium pricing, and are willing to accept usage restrictions may continue to prefer Claude Code. However, for a growing segment of developers who value cost-effectiveness, privacy, offline accessibility, and architectural flexibility, Goose presents a genuine and powerful alternative. The very existence of a zero-dollar open-source competitor offering comparable core functionality to a $200-per-month commercial product is a remarkable development. It underscores both the increasing maturity of open-source AI infrastructure and a strong appetite among developers for tools that genuinely respect their autonomy and control.
Goose is not without its limitations. It typically requires more technical setup expertise than commercial alternatives and depends on hardware resources that not every developer possesses. Furthermore, its model options, while rapidly advancing, may still lag behind the very best proprietary offerings for the most complex and nuanced coding tasks. However, for a burgeoning community of developers, these trade-offs are acceptable in exchange for something increasingly rare in the contemporary AI landscape: a tool that truly belongs to them, offering freedom from external dependencies and recurring costs.
Goose is available for download at github.com/block/goose. Ollama is available at ollama.com. Both projects are free and open source.