
Will the Larger Context Window Kill RAG?

This Midjourney-generated illustration symbolizes the contrast between RAG and the large context window
Published on October 31, 2024

"640 KB ought to be enough for anybody" — Bill Gates, 1981

"There were 5 Exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days" — Eric Schmidt, 2010

"Information is the oil of the 21st century, and analytics is the combustion engine." — Peter Sondergaard, 2011

"The context window will kill RAG" — Every second AI specialist, 2024

Disclaimer:
There is no solid proof that the quotes above are accurate, and the text below is purely the author’s own imagination. It assumes that a wonderful future is just around the corner: a super-duper chip gets invented and resolves the memory bottleneck, LLMs become cheaper and faster, and the hallucination problem is solved. So this text should not be taken as ultimate truth.

Lately, there’s been a lot of buzz around the arrival of LLMs with large context windows — millions of tokens. Some people are already saying that this will make RAG obsolete.

But is that really the case?

Are we so sure that larger context windows will always keep up with the exponential growth of data? According to estimates, the total amount of data in the world doubles every two to three years. At some point, even these huge context windows might start looking a bit too cramped.

Let’s say we’re talking about a million tokens right now — that’s roughly 2,000 pages of English text. Think of 20 contracts, each a hundred pages long. Not that impressive if we’re talking about large-scale company archives. Even 10 million tokens is only about 20,000 pages of English text — and for Slavic or East Asian languages, where tokenizers are far less efficient, noticeably less.
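The page math above can be sketched in a few lines. The 500-tokens-per-page figure is a rough assumption for English prose, not a tokenizer measurement — real counts vary with language, layout, and model:

```python
# Back-of-envelope estimate of how many pages fit in a context window.
# ASSUMPTION: ~500 tokens per page of English prose (so 1M tokens ~ 2,000 pages).
TOKENS_PER_PAGE_EN = 500

def pages_that_fit(context_tokens: int, tokens_per_page: int = TOKENS_PER_PAGE_EN) -> int:
    """Rough page count a given context window can hold."""
    return context_tokens // tokens_per_page

print(pages_that_fit(1_000_000))   # ~2,000 pages
print(pages_that_fit(10_000_000))  # ~20,000 pages
# Slavic or East Asian text tokenizes less efficiently, so fewer pages fit.
```

Against a corporate archive of hundreds of thousands of pages, even the 10-million-token figure stays small.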

So, we’re not talking about fitting an entire corporate database into a single context just yet. Instead, it’s more about reducing the requirement for search accuracy. You can just grab a broad set of a few hundred relevant documents, and let the model do the fact extraction on its own.
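The “grab a broad set and let the model sort it out” idea can be sketched as a toy retriever that ranks loosely, then packs documents until the context budget runs out. The word-overlap scoring and one-token-per-word cost are stand-ins for a real retriever and tokenizer, purely for illustration:

```python
# Sketch of relaxed retrieval for a large context window: instead of hunting
# for the one perfect passage, rank documents loosely and pack as many as fit.
# ASSUMPTIONS: naive word-overlap scoring, one token per word.

def score(query: str, doc: str) -> int:
    # Count shared words between query and document (a crude relevance proxy).
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_context(query: str, docs: list[str], token_budget: int = 1_000_000) -> str:
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    picked, used = [], 0
    for doc in ranked:
        cost = len(doc.split())          # crude token estimate
        if used + cost > token_budget:
            break                        # context window is full
        picked.append(doc)
        used += cost
    return "\n\n".join(picked)           # this whole blob goes into the prompt

docs = [
    "memo about office plants",
    "contract about payment terms",
    "contract amendment on payment schedule",
]
context = build_context("payment terms in contracts", docs)
```

The point is that retrieval precision matters less when the budget is huge: marginally relevant documents ride along, and fact extraction is delegated to the model.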

But here’s what’s important. We’re still in the early days of RAG. Right now, RAG handles information retrieval well but struggles with more complex analytical tasks, like the ones in the infamous FinanceBench. And if we’re talking about creative tasks that need deep integration with unique, user-specific content, RAG is still hovering at the edge of what’s possible. In other words, at this stage, a million tokens feel like more of a “buffer” than a solution.

But the larger context windows might give RAG a major boost! Here’s why:

• Tackling more complex tasks. As context windows grow, RAG will be able to handle much more sophisticated analytical and creative challenges, weaving internal data together to produce insights and narratives.

• Blending internal and external data. With larger context, RAG will be able to mix internal company data with real-time info from the web, unlocking new possibilities for hybrid use cases.

• Keeping interaction context intact. Longer contexts mean keeping the entire conversation history alive, turning interactions into richer dialogues that are deeply rooted in “your” data.

So, what’s next? Once people and companies have tools to find and analyze all their stored data, they’re going to start digitizing everything. Customer calls, online and offline behavior patterns, competitor info, logs from every single meeting… You name it. Data volumes will start skyrocketing again, and no context window — no matter how big — will ever be able to capture it all.

And that’s when we’ll be heading into the next RAG evolution, which will need even more advanced techniques to keep up.


Yuri Vorontsov
Copyright © 2024, QuePasa.ai. All rights reserved.