The myth: “it's text, so it's accessible”
There is a persistent misconception among sighted developers: if an application runs in a terminal, it is inherently accessible. The logic assumes that because there are no graphics, no complex DOM, and no WebGL canvases, the content is just raw ASCII text that a screen reader can easily parse.
The reality is different. Modern Text User Interfaces (TUIs) are often more hostile to accessibility than poorly coded graphical interfaces. The very tools designed to improve the Developer Experience (DX) in the terminal, frameworks like Ink (JS/React), Bubble Tea (Go), or tcell, are actively destroying the experience for blind users.
The Architectural Flaw: Stream vs. Grid
To understand the failure, we must distinguish between two distinct concepts often conflated under “terminal apps”: the CLI (Command Line Interface) and the TUI.
The CLI (The Stream): This operates on a standard input/output model (stdin/stdout). You type a command, the system appends the result below, and the cursor moves down. This is linear and chronological. For a screen reader, specifically kernel-level readers like Speakup, this is ideal.
The TUI (The Grid): This treats the terminal window not as a stream of text, but as a 2D grid of pixels, where every character cell is a pixel. It abandons the temporal flow for a spatial layout.
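To make the difference concrete, here is a minimal sketch of the bytes each model sends to the terminal. The function names `emit_stream` and `emit_grid` are made up for this illustration; the only real protocol detail is the standard cursor-position sequence (`ESC [ row ; col H`).

```python
ESC = "\x1b"

def emit_stream(lines):
    """CLI style: append each line; the cursor only ever moves down."""
    return "".join(line + "\n" for line in lines)

def emit_grid(cells):
    """TUI style: place each piece of text at an absolute (row, col).

    Uses the cursor-position sequence ESC [ row ; col H (1-based).
    """
    return "".join(f"{ESC}[{row};{col}H{text}" for row, col, text in cells)

stream = emit_stream(["$ build", "ok"])
grid = emit_grid([(24, 60, "12:01"), (1, 1, "Title"), (10, 5, "ok")])

print(repr(stream))  # plain text and newlines only
print(repr(grid))    # every write is preceded by a cursor jump
```

The stream output contains no escape sequences at all, so reading it top to bottom reproduces the order in which things happened. The grid output has no such order: the write sequence is whatever the renderer chose.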
Case Study: The gemini-cli Madness
Let's look at a concrete example: gemini-cli, a tool written in Node.js using the Ink framework. On the surface, it looks like a simple chat interface. But underneath, Ink is trying to reconcile a React component tree into a terminal grid.
When you use this tool with Speakup (Linux) or NVDA (Windows), the application doesn't just fail; it actively spams you.
Because the framework treats the screen as a reactive canvas, every update triggers a redraw. When the AI is “thinking,” the tool updates a timer or a spinner. To do this, it moves the hardware cursor to the timer location, writes the new time, and moves it back.
For a sighted user, this happens instantly. For a screen reader user, this is what you hear: “Responding... Time elapsed 1s... Responding... Time elapsed 2s... [Fragment of chat history]... Responding...”
It drives the screen reader mad. The cursor is teleporting all over the screen to update status indicators, spinners, and history. Speakup tries to read whatever is under the cursor at that exact millisecond. You end up hearing random bits of conversation mixed with timer updates, making it impossible to focus on what you are actually typing.
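The interleaving can be illustrated with a toy model. This is not how Speakup or NVDA are implemented; it only assumes a reader that speaks whatever text appears wherever the cursor lands. The row numbers, the tick rate, and the helper names (`frames`, `heard`, `wanted`) are all invented for the sketch.

```python
STATUS_ROW, INPUT_ROW = 1, 24  # assumed layout: status on top, input at bottom

def frames(typed, ticks_per_key=2):
    """Yield (row, text) writes: status updates interleaved with key echoes."""
    elapsed = 0
    for ch in typed:
        for _ in range(ticks_per_key):
            elapsed += 1
            yield (STATUS_ROW, f"Responding... Time elapsed {elapsed}s")
        yield (INPUT_ROW, ch)

def heard(writes):
    """Toy reader: speaks the text at every position the cursor visits."""
    return [text for _row, text in writes]

def wanted(writes):
    """What the user actually needs: only the echo on the input line."""
    return [text for row, text in writes if row == INPUT_ROW]

events = list(frames("hi"))
print(heard(events))
print(wanted(events))
```

In this model, typing two characters produces six announcements, four of which are timer noise. The gap between `heard` and `wanted` grows with every extra status indicator the interface animates.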
Worse, let's pretend that you've somehow managed well with Speakup so far, but that you want to do some work with NVDA. Maybe you want to paste an error you're getting on Windows. So you open your terminal, SSH into your Linux box, attach to your screen session, and paste your text.
The result is an immediate crash of the screen reader (NVDA) or massive system instability. Why? Every time you type a character or paste text, the application triggers a state change. The framework decides it needs to re-render the interface. Because the conversation history is part of that state, the application attempts to redraw or recalculate the layout for thousands of lines of text instantly. The more messages you have in a conversation, the worse this gets. And no, you can't avoid it with Insert+5 (NVDA+5), the key combination that is supposed to stop NVDA from announcing dynamic content changes.
The Lag Loop
Furthermore, frameworks like Ink running on single-threaded environments (like Node.js) suffer from massive performance degradation when the history grows. If you paste a large block of text, the system has to calculate the diff for thousands of lines.
This causes input lag. You press a key, and you wait. You can wait up to 10 seconds for a single character to echo back. The system is too busy calculating how to redraw the screen to actually process your input.
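A back-of-the-envelope model shows why the lag scales with history. It assumes, as a simplification, that each keystroke costs work proportional to the number of history lines. The numbers are illustrative and are not measurements of Ink or gemini-cli.

```python
def full_rerender_work(history_lines, keystrokes):
    """Lines examined if every keystroke re-lays-out the whole history."""
    return history_lines * keystrokes

def append_only_work(keystrokes):
    """Lines touched if each keystroke only updates the input line."""
    return keystrokes

for history in (100, 5_000):
    print(history, full_rerender_work(history, 200), append_only_work(200))
```

Under this assumption, pasting 200 characters into a session with 5,000 lines of history means a million line visits, against 200 for an interface that only touches the input line.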
Why The “Old Guard” Works (nano, vim, menuconfig)
Sighted developers often ask: “If TUIs are bad, why do you use nano, vim, or menuconfig?”
The answer is not that these tools handle the cursor perfectly by default. The answer is that they allow you to hide the cursor entirely.
1. Hiding the Cursor (nano, vim)
In tools like nano or vim, usability depends on turning off features that track cursor position. If you run nano with options that show the cursor position (like --constantshow), or if you use vim without specific configuration, the experience is broken.
When the cursor is visible and tracking is active, Speakup prioritizes the cursor's location update over the character echo. Instead of hearing the letter “a” when you type it, you hear “Column 2”. You type “b”, and you hear “Column 3”.
These older tools succeed because they allow you to disable this noise. You can configure them to suppress the visual cursor or status bar updates, forcing the screen reader to rely on the character input stream rather than the noisy coordinate updates. Modern frameworks rarely offer a “no-cursor” or “headless” mode; they assume the visual cursor is essential.
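As a rough sketch of what such an opt-out could look like in an application, the code below gates status redraws behind a switch. The environment variable `TUI_QUIET` and the function names are hypothetical. The escape sequences themselves are standard: `ESC [ ? 25 l` and `ESC [ ? 25 h` hide and show the cursor, and `ESC 7` and `ESC 8` save and restore its position. Whether hiding the cursor is enough for a particular screen reader is not something this sketch can guarantee.

```python
import os

HIDE_CURSOR = "\x1b[?25l"  # make the cursor invisible
SHOW_CURSOR = "\x1b[?25h"  # make the cursor visible again

def quiet_mode(env=None):
    """True when the hypothetical TUI_QUIET variable is set and not "0"."""
    env = os.environ if env is None else env
    return env.get("TUI_QUIET", "") not in ("", "0")

def status_update(text, row, col, env=None):
    """Return the bytes for a status redraw, or nothing in quiet mode."""
    if quiet_mode(env):
        return ""  # no cursor jump, so nothing for the reader to chase
    # save cursor, jump to the status cell, write, restore cursor
    return f"\x1b7\x1b[{row};{col}H{text}\x1b8"

def startup_sequence(env=None):
    """Bytes to send once at startup: hide the cursor only in quiet mode."""
    return HIDE_CURSOR if quiet_mode(env) else ""

print(repr(status_update("1s", 1, 70, {})))
print(repr(status_update("1s", 1, 70, {"TUI_QUIET": "1"})))
```

The point of the design is that the quiet path emits no bytes at all for decorative updates, rather than emitting them more politely.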
2. Single Column Focus (menuconfig)
Tools like the Linux kernel's menuconfig work because they enforce a strict, single-column focus. Even though there are borders and titles, the active area is a vertical list. The cursor stays pinned to that list. It doesn't jump to the bottom right to update a clock, then to the top left to update a title. The spatial complexity is kept low enough that the screen reader never gets “lost.”
3. The Lost Art of Scrolling Regions (Irssi)
Irssi is the gold standard for accessible chat, but not because of luck. Irssi was built over 20 years with a custom rendering engine that utilizes VT100 Scrolling Regions.
When a new message arrives in Irssi:
1. It tells the terminal driver: “Define a scrolling region from line 1 to 23.”
2. It sends a command: “Scroll up.” The terminal moves the bits up.
3. It draws the new text at the bottom of that region.
Crucially, it handles this in a way that minimizes interference with the input line. It relies on the terminal's hardware capabilities rather than rewriting every character on the screen manually. Modern frameworks ignore these hardware features in favor of “diffing” the screen state and rewriting characters, which is computationally heavier and hostile to accessibility.
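The three steps can be sketched as a single escape-sequence string. This is not Irssi's code, only an illustration using standard VT100 sequences, with the region bounds (lines 1 to 23) taken from the example above. The function name `append_in_scroll_region` is made up, and the sketch scrolls by sending a line feed at the bottom margin, which is one common way to trigger the scroll.

```python
ESC = "\x1b"

def append_in_scroll_region(message, top=1, bottom=23):
    """Build the bytes that scroll `message` into the region top..bottom.

    Lines below `bottom` (for example an input line) are outside the
    region and are never rewritten.
    """
    return "".join([
        f"{ESC}7",                 # save the cursor (it sits on the input line)
        f"{ESC}[{top};{bottom}r",  # set the scrolling region
        f"{ESC}[{bottom};1H",      # move to the last line of the region
        "\n",                      # line feed at the bottom margin scrolls up
        "\r" + message,            # draw the new text on the freed line
        f"{ESC}8",                 # restore the cursor to the input line
    ])

print(repr(append_in_scroll_region("<alice> hi")))
```

The application sends a few dozen bytes per message regardless of how long the history is, and the terminal does the scrolling itself.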
The “Stale Bot” excuse: A Case Study in Neglect
Google and the maintainers of gemini-cli pretend to care about accessibility. “Pretend” is the operative word here.
If you look at the repository, critical accessibility regressions like Issue #3435 and Issue #11305 have been left to rot. There is no discussion, no roadmap, and no fix.
Even worse is the fate of Issue #1553, which was supposed to track these accessibility failures. It didn't get solved; it got silenced. It was closed automatically by a bot with this generic dismissal:
> Hello! As part of our effort to keep our backlog manageable and focus on the most active issues, we are tidying up older reports. It looks like this issue hasn't been active for a while, so we are closing it for now.
This is unacceptable. Closing an accessibility report because the maintainers haven't touched it in months is not “tidying up”; it is hiding evidence. It effectively says that if a bug is ignored long enough, it ceases to exist. It boosts the project's “Closed Issues” metric while leaving the actual software unusable for blind users.
Conclusion
If you are building for the terminal and care about accessibility, stop using declarative UI frameworks that treat the terminal like a canvas.
The “modern” TUI stack has optimized for the developer's ability to write React-like code at the expense of the machine's ability to render text efficiently.
If you cannot guarantee that your application allows the user to hide the cursor, or if you rely on aggressive redrawing to show spinners and timers, you are building an inaccessible tool.
For the blind user, a dumb, linear CLI stream is infinitely superior to a “smart” TUI that lags, spams, and scatters the cursor across the screen.