AI and the Environment: What Continio Actually Does About It
Every AI request uses real compute. Here's the honest position, and the specific technical choices we make to use less of it.
Every AI product uses real compute. Compute uses real energy. If you use Continio regularly, you are contributing to that. I think you deserve a straight answer about what that means and what, if anything, is being done about it.
This is not a sustainability report. I do not have verified emissions figures. What I can give you is an honest account of how the product is designed and what technical choices we make that bear on the question.
The honest position
Continio is not zero-impact. No AI product is. The question worth asking is not "is this environmentally perfect?" but "is this better than the alternative workflow, and are deliberate choices being made to reduce the impact?"
On both counts, I think the answer is yes. Here is why.
Continuity reduces waste
The largest source of wasted compute in AI is repetition. Every time a user opens a new chat and re-explains who they are, what they are working on, and what has already been decided, that is compute spent re-processing information the system should already know. Multiply that across millions of conversations and you have an enormous amount of redundant inference.
Continio is built around continuity precisely because of this. The product remembers your context so you do not have to repeat it. Fewer redundant tokens means less energy per useful response. This is not a green marketing claim. It is a structural property of how the product works.
Prompt caching
Continio uses prompt caching for the system instructions sent to the model on every message. Reading a prompt from cache costs a fraction of the energy of processing it from scratch. This is the single most impactful technical choice we make: the instructions that shape every response are processed once, then read from cache for all subsequent messages in a session. On a typical session this saves around 90% of the compute cost of the system prompt.
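For the curious, here is roughly what this looks like against Anthropic's Messages API, where a cache_control marker on the system block asks the provider to cache it. This is a minimal sketch, not our production code; the model name and prompt text are placeholders.

```typescript
// Minimal sketch of prompt caching via the Anthropic Messages API.
// The system prompt is marked with cache_control so the provider
// processes it once, then serves it from cache on later requests.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const SYSTEM_PROMPT = "You are Continio..."; // stand-in for the real ~3,500-token prompt

async function respond(userMessage: string) {
  return client.messages.create({
    model: "claude-3-5-sonnet-latest", // placeholder model name
    max_tokens: 1024,
    system: [
      {
        type: "text",
        text: SYSTEM_PROMPT,
        // First request writes the cache; subsequent requests in the
        // session read it at a fraction of the processing cost.
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: [{ role: "user", content: userMessage }],
  });
}
```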
The system prompt was also compressed by 50% in March 2026, from roughly 7,000 tokens to 3,500. The behaviour is unchanged; the cost and environmental footprint of every session are not. The two changes compound: a smaller prompt that is also cached costs a fraction of what it did six months ago.
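The compounding is easy to check with back-of-envelope arithmetic, using the 50% compression and the ~90% cache saving described above. The exact ratio is illustrative, not a measured figure.

```typescript
// Back-of-envelope: how the two changes compound.
const originalTokens = 7_000;
const compressedTokens = 3_500; // 50% compression (March 2026)
const cachedReadCost = 0.1;     // cached reads at ~10% of full processing cost

const before = originalTokens;                   // 7,000 token-equivalents per message
const after = compressedTokens * cachedReadCost; // ~350 token-equivalents per message

console.log(after / before); // ≈ 0.05 — roughly 5% of the original prompt cost
```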
Ranked memory injection
Continio stores a growing picture of who you are and what you're working on. A naive system would inject all of that into every message. Continio doesn't. Instead, it scores each memory item by relevance, recency, and authority, and selects only the most pertinent ones for each request. A practical question might use four items. A reflective conversation might use twelve. Either way, the rest stays quiet. This is a meaningful reduction in tokens per request, which translates directly to less compute per useful response.
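To make that concrete, here is a sketch of what relevance-weighted selection can look like. The field names, weights, and example data are hypothetical, not our actual implementation; a real system would tune the weights and derive the scores from embeddings and timestamps.

```typescript
// Illustrative ranked memory injection: score each stored item on
// relevance, recency, and authority, then inject only the top few.
interface MemoryItem {
  text: string;
  relevance: number; // similarity to the current request, 0..1
  recency: number;   // decays with age, 0..1
  authority: number; // explicitly stated by the user vs inferred, 0..1
}

function selectMemories(items: MemoryItem[], limit: number): MemoryItem[] {
  return items
    .map((item) => ({
      item,
      // Hypothetical weighting; real weights would be tuned.
      score: 0.5 * item.relevance + 0.3 * item.recency + 0.2 * item.authority,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(({ item }) => item);
}

// A practical question might take 4 items; a reflective conversation, 12.
const memories: MemoryItem[] = [
  { text: "Works as a nurse", relevance: 0.2, recency: 0.9, authority: 1.0 },
  { text: "Renovating a flat", relevance: 0.8, recency: 0.7, authority: 0.9 },
  { text: "Prefers metric units", relevance: 0.6, recency: 0.4, authority: 0.8 },
];
const injected = selectMemories(memories, 2); // the rest stays quiet
```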
Model routing
Not every message needs the largest, most capable model. Simple factual lookups go to a lightweight model. Complex, nuanced, or emotional conversations get the full model. This is automatic on every request. It's primarily a quality and cost decision, but lighter models use meaningfully less compute and it's worth saying so honestly.
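The shape of the idea, sketched below with a crude stand-in heuristic where the real router uses a proper classifier. The model names and the heuristic itself are placeholders.

```typescript
// Illustrative model routing: a cheap check decides whether a
// message needs the full model or a lightweight one.
type ModelTier = "light" | "full";

function routeModel(message: string): ModelTier {
  // Stand-in heuristic; a real router would classify the request.
  const looksComplex =
    message.length > 400 ||
    /\b(feel|worried|decide|should i|trade-?off)\b/i.test(message);
  return looksComplex ? "full" : "light";
}

const MODEL_BY_TIER: Record<ModelTier, string> = {
  light: "claude-3-5-haiku-latest",  // lightweight model, placeholder name
  full: "claude-3-5-sonnet-latest",  // full model, placeholder name
};

const model = MODEL_BY_TIER[routeModel("What's the capital of Portugal?")];
// → the lightweight model: a simple factual lookup doesn't need the big one
```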
What we do not claim
We are not carbon neutral. We have not published a verified emissions figure because we do not have the data to do so accurately, and publishing a number we cannot stand behind would be worse than saying nothing.
Continio runs on infrastructure from Vercel, Railway, and Supabase. The AI inference runs on Anthropic and OpenAI infrastructure, both of which have published net-zero and renewable energy commitments. I am not claiming credit for their commitments, but it is worth knowing that the underlying infrastructure is not running on coal.
What I commit to
As the product grows, efficiency improvements come before feature additions. Routing lighter tasks to lighter models, caching more aggressively, eliminating redundant processing. These are engineering priorities, not afterthoughts.
The relevant question is not whether an AI product has zero environmental cost. Nothing does. The question is whether the product is designed to be as efficient as possible per unit of useful output, and whether the people building it are honest about where things stand.
I think we are. You can hold me to that.