I've been thinking about Gemtext content vs. presentation directives.
It's a pretty interesting situation. In an "ideal" world, Gemtext would have no way for authors to specify visual attributes for the content, and clients could freely style pages as they see fit.
However, even Solderpunk's 100-line Python example client supports ANSI styling, since it's something that a terminal emulator handles for you automatically. One has to specifically prevent ANSI control sequences from reaching the terminal to avoid this. This leaves us in a situation where many terminal-based clients, including the one you write yourself, can take it as a given that ANSI control sequences work for things like colored ASCII/Unicode art and highlighting words for emphasis.
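To make the "handled automatically" part concrete, here is a minimal sketch of what such a line looks like from the client's side. The `emphasize` helper and the sample sentence are made up for illustration; the only real thing here is the SGR sequence itself (ESC, `[`, parameters, `m`).

```python
ESC = "\x1b"  # the escape character that starts every ANSI control sequence


def emphasize(word: str) -> str:
    """Wrap a word in SGR 'bold' (1) and 'reset' (0) sequences.

    Hypothetical helper, for illustration only.
    """
    return f"{ESC}[1m{word}{ESC}[0m"


line = f"This word is {emphasize('important')}."

# A client that simply prints the line hands the escapes to the terminal,
# which renders the word in bold if it supports SGR 1.
print(line)

# repr() shows the control sequences that are actually present in the text.
print(repr(line))
```

The client does nothing special in the first `print`; the styling comes entirely from the terminal emulator interpreting the bytes it receives.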
Of course, this is all thanks to the environment where the program is running. The confluence of history has brought us a system where one can insert hidden control sequences in text, in a standardized fashion, and have it modify text appearance (and cursor position) on screen.
The Gemini protocol should be agnostic of such things, as it doesn't assume the use of a terminal emulator. In practice, though, with Gemini being so heavily text-focused, the terminal is one of the foremost environments to use, and sometimes even the preferable one. Thus I think it would help for one of two things to happen: either ANSI styling is gotten rid of altogether, or it is fully embraced.
Getting rid of ANSI styling altogether would resolve the ambiguity neatly, but it could also make writing a Gemtext parser more complicated, and restrict the potential applications for Gemtext. Parsing is perhaps the more serious issue: it's pretty trivial to use regular expressions to recognize the sequences and skip them, but a regex library is another dependency that may not be available to everyone.
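For what it's worth, skipping the sequences without a regex library doesn't take much code either. The following is a rough sketch, not a complete filter: `strip_ansi` is a name I made up, and it only handles CSI sequences (ESC `[` followed by parameter bytes and a final byte), which is the family that colors and cursor movement belong to. Other escape types would need their own handling.

```python
ESC = "\x1b"


def strip_ansi(text: str) -> str:
    """Remove CSI control sequences from text without using regexes.

    Assumption: only CSI sequences (ESC '[' ... final byte) are recognized.
    A lone ESC is dropped so it never reaches the terminal.
    """
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch != ESC:
            out.append(ch)
            i += 1
            continue
        i += 1  # drop the ESC itself
        if i < n and text[i] == "[":
            i += 1  # skip the '[' that introduces a CSI sequence
            # Parameter and intermediate bytes lie in 0x20-0x3F.
            # This loop is lenient and accepts them in any order.
            while i < n and 0x20 <= ord(text[i]) <= 0x3F:
                i += 1
            # A final byte in 0x40-0x7E ends the sequence.
            if i < n and 0x40 <= ord(text[i]) <= 0x7E:
                i += 1
    return "".join(out)


print(strip_ansi("plain \x1b[1;31mred and bold\x1b[0m plain"))
```

A client could run each line through something like this before display, or only do so when the user has turned styling off.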
Fully embracing ANSI styling has its own fallout effects. The source text may become unreadable in a normal text editor. The size of the content expands if control sequences are used heavily, with each sequence being several bytes long. There is no single agreed-upon way to interpret all of the sequences (the colors are perhaps the least ambiguous ones). There's the whole inline vs. external styling problem, akin to tags vs. CSS. And of course, as discussed before, screen readers may produce garbled nonsense when encountering these sequences, unless the client ensures they are filtered out.
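To put a rough number on the size point, here is a small sketch that measures the overhead of styling a single word. The word and the choice of bold red are arbitrary examples; the sequences themselves are the usual SGR ones.

```python
plain = "warning"
styled = "\x1b[1;31m" + plain + "\x1b[0m"  # bold red, then reset

plain_bytes = len(plain.encode("utf-8"))
styled_bytes = len(styled.encode("utf-8"))
overhead = styled_bytes - plain_bytes

# The opening sequence is 7 bytes and the reset is 4 bytes.
print(plain_bytes, styled_bytes, overhead)  # prints "7 18 11"
```

So one emphasized seven-letter word costs more in control bytes than in actual text. For the occasional highlight this hardly matters, but it adds up quickly in colored ASCII art where the color changes every few characters.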
Perhaps the status quo is preferable, despite the ambiguity. The user gets to have the final say on the matter, in their selection of environment, client, and configuration settings.