uncloseai.
Reverse Retrieval Augmented Generation
Client-Side Context Injection for Small Language Models
How live DOM extraction makes 8B models punch above their weight class.
russell@unturf, cthegray, TimeHexOn, foxhop
Abstract
Traditional Retrieval Augmented Generation (RAG) requires a server-side pipeline: chunk documents, embed them into vectors, store them in a database, and retrieve by similarity search at query time. This architecture requires dedicated infrastructure, incurs indexing latency, and demands ongoing maintenance of embedding models and vector stores.
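For contrast, the traditional pipeline can be sketched in a few dozen lines. This is a toy illustration, not production code: hashEmbed is a hashed bag-of-words stand-in for a real embedding model, and the "vector store" is a plain array, but the chunk/embed/store/retrieve steps are the ones the abstract describes.

```javascript
// Toy stand-in for an embedding model: hash each word into a small vector.
function hashEmbed(text, dims = 16) {
  const v = new Array(dims).fill(0);
  for (const word of text.toLowerCase().match(/\w+/g) ?? []) {
    let h = 0;
    for (const c of word) h = (h * 31 + c.charCodeAt(0)) >>> 0;
    v[h % dims] += 1;
  }
  return v;
}

// Cosine similarity between two vectors of equal length.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Step 1: chunk documents into fixed-size word windows.
function chunk(doc, size = 40) {
  const words = doc.split(/\s+/);
  const chunks = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}

// Steps 2-3: embed every chunk and store text + vector together
// (the "vector database", here just an in-memory array).
function buildIndex(docs) {
  return docs.flatMap((d) => chunk(d)).map((text) => ({ text, vec: hashEmbed(text) }));
}

// Step 4: at query time, rank stored chunks by similarity to the query.
function retrieve(index, query, k = 2) {
  const qv = hashEmbed(query);
  return index
    .map((e) => ({ ...e, score: cosine(e.vec, qv) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((e) => e.text);
}
```

Every step above runs on the server before the model ever sees the prompt; Reverse RAG removes all four.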
Reverse Retrieval Augmented Generation (Reverse RAG) inverts this entirely. Instead of the server fetching documents to augment the prompt, the client extracts live content from the page the user is currently viewing and injects it directly into the conversation context. The data comes to the model. No vector database. No embeddings. No indexing pipeline. No server-side retrieval.
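A minimal sketch of the client side follows. The function names are illustrative, not the uncloseai.js API: in a browser the page text would come from something like document.body.innerText; here the extraction is written over a raw HTML string so the idea stands on its own.

```javascript
// Strip scripts and tags from raw HTML and collapse whitespace.
// In a real browser context this is simply document.body.innerText.
function extractPageText(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, "")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Build the chat request: the live page content is injected as a
// system message, so the model needs no server-side retrieval at all.
function buildMessages(pageText, userQuestion, maxChars = 8000) {
  return [
    {
      role: "system",
      content:
        "Answer using the content of the page the user is viewing:\n\n" +
        pageText.slice(0, maxChars),
    },
    { role: "user", content: userQuestion },
  ];
}
```

The resulting messages array is what a client would send to an OpenAI-compatible chat endpoint: the small model receives the full, fresh page text rather than pre-indexed chunks.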
uncloseai.js implements this technique under the AGPL-3.0-only license. It serves as the entrypoint of a modular application that adds a machine learning chat interface to any webpage. By feeding the model the full, fresh content of whatever page the user visits, small 8B-parameter models produce answers that rival those of much larger models on page-specific questions.
Read the full whitepaper (PDF)
Citation
russell@unturf, cthegray, TimeHexOn, foxhop. "Reverse Retrieval Augmented
Generation: Client-Side Context Injection for Small Language Models."
uncloseai.com, 2026. https://uncloseai.com/reverse-retrieval-augmented-generations-rag.html