My path to Ferrum started before Ferrum existed.
I had been using VS Code-based AI tools and extensions, and the GUI workflow felt heavy: clumsy panels, extra interface layers, background processes, and too many extensions running inside the same environment where I keep my repositories.
Then I tried Claude Code. It was powerful, yet the interface felt noisy to me: too many features, too much going on, distracting UX bugs like terminal flickering, and too many details around the actual work.
In autumn 2025 I started using OpenCode, and I liked it immediately. It felt cleaner, was easy to install, easy to set up, and quickly became my main coding agent.
For a while I enjoyed the pace of development. Every week brought new options and capabilities. After using it every day, I saw my priority clearly: I wanted a stable environment more than an endless stream of features.
I often keep several agent terminals open, one per project. In that situation, small things matter: memory footprint, startup time, terminal behavior, copy and paste, scrollback, prompt size, and clear edits.
OpenCode gave me a clean starting point. Over time, its memory footprint and small UX bugs started to matter. Pi was leaner and its harness felt closer to what I wanted, yet the footprint and auto-update model still made me uneasy for a tool running inside my repositories.
I wanted the shape of a Unix tool: a small native binary, stable output, normal copy and paste, long scrollback, simple tab completion, predictable sessions, and a system prompt I could understand and override.
Ferrum became that answer: a Rust-native, Linux-only coding agent designed to stay plain, solid, and useful. Ferrum is iron. It just has to do the work.
The first version was deliberately narrow: a focused tool rather than a platform. Its job was to run a prompt, expose a few local tools, persist a session, and get out of the way.
The repository history shows how compressed the first days were. The first commit was on May 28, 2026: Initial Ferrum MVP. That same day it gained enough of the surrounding pieces (MCP, CI, release notes, image input, and error handling) to become usable.
It was rough, but it worked.
A week later, something important happened: I realized I could use Ferrum to develop Ferrum.
That changed the project. It stopped being an experiment I worked on from the outside and became a tool I could improve from the inside. If I missed a feature, I could add it. If a behavior annoyed me, I could fix it. If the prompt was too heavy, I could make it smaller. The feedback loop became very short.
At some point I stopped reaching for the other agents first. Ferrum had become consistent enough that I saw no practical performance disadvantage in daily work. Then I started seeing the advantages.
The improvements that made Ferrum feel like my daily driver were not exotic.
Memory footprint was the first obvious difference. I often run one agent per project, in multiple terminals. When each instance is a long-lived companion to a working tree, memory usage is not an abstract benchmark.
A simple measurement captures the kind of difference I could feel every day. On my machine, just asking each agent for its version showed the shape of the problem:
| command | max RSS | elapsed |
|---|---|---|
| ferrum --version | 7.3-7.8 MB | 0.00s |
| pi --version | 140-152 MB | 0.41-0.62s |
| opencode --version | 197-200 MB | 0.42-0.53s |
Versions tested: Ferrum 0.4.19, Pi 0.75.5, OpenCode 1.15.13.
This measures startup and runtime overhead, not agent quality. For a command-line tool I open all day, often more than once, that matters.
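Numbers like these can be collected in several ways. The following is a minimal sketch, assuming Linux and Python 3.9 or later; `measure` is an illustrative helper, not the script behind the table above. It starts a command, waits for it with `os.wait4`, and reads the peak resident set size the kernel recorded for that child.

```python
import os
import subprocess
import time


def measure(argv):
    """Run argv once and return (max_rss_mb, elapsed_seconds) for that child."""
    start = time.perf_counter()
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # wait4 reaps this specific child and returns its resource usage.
    _, status, usage = os.wait4(proc.pid, 0)
    elapsed = time.perf_counter() - start
    # Tell the Popen object the child is already reaped.
    proc.returncode = os.waitstatus_to_exitcode(status)
    # On Linux, ru_maxrss is reported in kilobytes (assumption: Linux only).
    return usage.ru_maxrss / 1024, elapsed
```

Calling `measure(["ferrum", "--version"])` a few times and noting the lowest and highest values would give a range like the ones in the table. The wall-clock figure includes process spawn overhead, so very small elapsed times are only approximate.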
Terminal stability mattered just as much. Ferrum is line-oriented and plain by design: stable output, scrollback, copy and paste, and normal shell habits.
The system prompt became another turning point. During benchmarking, Ferrum was faster than OpenCode but still slower than Pi in some tasks. Looking at the prompts explained part of it: OpenCode's was huge, while Pi's was much slimmer.
That changed Ferrum's design. The default prompt became smaller and completely overridable. Project behavior belongs in AGENTS.md, local docs, and session context. The global prompt should make the harness work, not consume the project’s context window.
The runtime shape mattered too. Ferrum has Rust dependencies like any real program, but opening a new agent terminal starts one native executable. That is the shape I wanted.
Ferrum executes exposed tools directly, without asking for confirmation before every call. Many agents build a workflow around approval prompts. My interactive workflow is different: Ferrum is YOLO by default. It has one state: exposed tools can run.
For read-only work, I expose only read/search tools. In an interactive session inside a Git repository, editing tools are part of the normal workflow. A wrong edit is recoverable; Git gives me the rollback path.
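The "one state" idea can be sketched as an allow-list: a session gets a table of exposed tools, and anything in that table runs without a confirmation step. This is an illustration only. The tool names, `expose`, and `call` are hypothetical and do not reflect Ferrum's actual code or configuration.

```python
from typing import Callable, Dict, Iterable

# Hypothetical tools, stubbed so the sketch runs without touching disk.
ALL_TOOLS: Dict[str, Callable[[str], str]] = {
    "read": lambda arg: f"read {arg}",
    "search": lambda arg: f"search {arg}",
    "edit": lambda arg: f"edit {arg}",
}


def expose(names: Iterable[str]) -> Dict[str, Callable[[str], str]]:
    """Build a session's tool table from an allow-list of tool names."""
    return {name: ALL_TOOLS[name] for name in names}


def call(tools: Dict[str, Callable[[str], str]], name: str, arg: str) -> str:
    """Run an exposed tool directly; tools outside the table cannot run."""
    if name not in tools:
        raise KeyError(f"tool not exposed: {name}")
    return tools[name](arg)


# A read-only session exposes no editing tool at all.
read_only = expose(["read", "search"])
```

In this shape there is no approval mode to toggle. The only decision is which tools the session receives at the start.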
Rollback is only half of the workflow. The other half is understanding what changed before I continue.
This is why I spent time on diff modes. Human review is the workflow, so diffs are first-class UI. Ferrum can render edits as a unified patch, a compact patch, full old/new replacement blocks, word-level changes, or a side-by-side view. The selected mode follows the session, and color can be enabled or disabled separately.
The mode matters. Patch view is good for context, full blocks are better for replacements, word diff catches subtle changes, and side-by-side helps with larger edits. The agent can move quickly, but the human still reviews. The UI should make that review precise and clear.
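To make the difference between two of these modes concrete, here is a small sketch using Python's standard `difflib`. It is not how Ferrum renders diffs; the `[-old-]` and `{+new+}` markers and both function names are assumptions chosen for illustration.

```python
import difflib


def unified(old: str, new: str) -> list:
    """Line-level unified patch: changed lines plus surrounding context."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


def word_diff(old: str, new: str) -> str:
    """Word-level diff: removed words as [-x-], added words as {+x+}."""
    a, b = old.split(), new.split()
    out = []
    matcher = difflib.SequenceMatcher(None, a, b)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        if i1 < i2:  # words present only in the old text
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if j1 < j2:  # words present only in the new text
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)
```

For the edit `timeout = 30 seconds` to `timeout = 300 seconds`, the unified form shows the whole line removed and re-added, while `word_diff` returns `timeout = [-30-] {+300+} seconds`, which points directly at the one value that changed.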
After several months of using coding agents, my preferred workflow became simple: one agent per project, with human review.
Each project has its own repository, its own AGENTS.md, its own docs, its own session history, and its own rhythm. The agent works best close to the code: reading local context, using local tools, and producing changes I can review.
This is also how I want Ferrum to evolve. Pi influenced the harness shape: tools, sessions, context files, skills-like instructions. OpenCode showed how productive a clean command-line agent could be. Ferrum takes those lessons selectively, keeping the ideas that fit my workflow and leaving the rest outside.
That selectivity is the freedom I wanted: to learn from the main agents without chasing every feature they add.
Benchmarks helped me understand whether the small boring version was good enough.
In one local benchmark snapshot, Ferrum, Pi, and OpenCode all passed the same five coding tasks with access to the same GPT-5.5 model family. That was the important part: Ferrum handled the tasks I cared about.
The resource profile, however, was different:
| agent | passed | mean time | mean RSS |
|---|---|---|---|
| Ferrum | 5/5 | 20.04s | 29.4 MB |
| Pi | 5/5 | 23.86s | 180.3 MB |
| OpenCode | 5/5 | 68.29s | 420.6 MB |
This was a small local benchmark rather than a universal ranking. Network latency, model behavior, prompt scaffolding, and tool differences all matter. Still, it was enough evidence for me. Ferrum was doing the work, and it was doing it with the kind of footprint I wanted.
Ferrum stays intentionally incomplete: Linux-only v1, no themes, no marketplace, no rich plugin UI, no auto-update machinery, and no compatibility promise with other agents. Those choices keep the project light. Most features that did land came from actual use, not from a checklist.
The source code and releases are on Codeberg:
https://codeberg.org/ominiverdi/ferrum
There is also a GitHub mirror:
https://github.com/ominiverdi/ferrum
Ferrum will stay narrow: a solid Linux CLI for one project session at a time, with clear edits and human review.
The model matters, but the daily value is in the harness around it. Ferrum is meant to be solid like iron: plain, reliable, and ready to work.