As AI models - large language models (LLMs) in particular - become more powerful and widely commoditized, the need for accurate, context-rich data grows ever more critical. In regulated industries this is even more essential, as organizations must also ensure explainability and governance. For the modern organization, Contextual Retrieval-Augmented Generation (RAG) delivers trusted data and organizational context alongside models like LLMs, but without the complexity of alternative "RAG" approaches.
In this rapidly evolving landscape, remember that your AI is only as good as the data it's built upon.
In November 2022, OpenAI introduced ChatGPT, putting the power of a large language model behind a simple, easy-to-access chat user interface.
The copilot was born: a means of engaging with - prompting - a Generative Pre-trained Transformer (GPT) model, trained on a global corpus of data and fed real-time input to produce an accurate response, prediction, or action.
Since then, a Generative AI lexicon has emerged, centered on Large Language Models (LLMs), a subset of Foundation Models, which can generate art, music, code, and more, as well as text. LLMs understand your prompts and author text responses.
Typically, LLMs require compute-hungry hardware such as GPUs (Graphics Processing Units), so named in an earlier decade when they dominated "embarrassingly parallel" image transformations in gaming systems. That parallelism translated well to the inner workings of neural networks, setting the preconditions for the transformer architectures that power LLMs. Compute-efficient DeepSeek LLMs are now challenging the GPU paradigm: they can run on commodity CPU hardware through advanced algorithmic implementations, task-specific optimizations, and careful attention to hardware integration. By improving the mathematics and the low-level engineering, they challenge the prevailing "throw compute at the problem" zeitgeist.
Yet in all cases, LLMs are only as good as the data they're trained on, which likely does not include your organization's proprietary data. Only by integrating your organization's proprietary data alongside LLMs can GenAI deliver enterprise value, while deploying an architecture that provides protective guardrails to mitigate data leakage and hallucination risks, the latter being a function of Garbage In, Garbage Out (GIGO).
From around May 2023, the term Retrieval-Augmented Generation (RAG) gained popularity.
RAG is a process, or pipeline, whereby an organization's applications and databases augment LLM prompts and outputs. Simple on the surface, it can be complex underneath, typically requiring searchable vector embeddings to be structured in columnar or vector databases, or in technical computing environments like Python or MATLAB. Embeddings are numerical representations, generated by neural networks like those used in LLMs themselves, that encode an understanding of objects or words. They're powerful, but storage- and compute-heavy, and they bring an additional - and quite bloated - data layer into your stack, sitting between your enterprise data environment and your GenAI workflows.
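To make that extra layer concrete, here is a minimal sketch of a naive RAG lookup. The embed() stub and the sample documents are purely illustrative stand-ins for a real embedding model and a real document set; the point is the duplicated vector layer the pattern requires.

```python
# Minimal naive-RAG sketch. embed() is a hypothetical stand-in for a
# real embedding model; the documents are illustrative only.
import numpy as np

def embed(text: str) -> np.ndarray:
    # A real pipeline would call an embedding model here; a seeded
    # random vector merely keeps this sketch self-contained.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)  # 384 dims, a common embedding size

# Step 1: duplicate internal documents into a vector store up front.
documents = [
    "Michael Greene is flagged as a high risk individual.",
    "Q3 transaction report for corporate accounts.",
]
vector_store = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Step 2: return the k documents closest to the query by cosine similarity."""
    q = embed(query)
    def cosine(v: np.ndarray) -> float:
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
    ranked = sorted(vector_store, key=lambda pair: cosine(pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# Step 3: stitch retrieved text into the LLM prompt ("augmentation").
context = "\n".join(retrieve("What risks are associated with Michael Greene?"))
prompt = f"Context:\n{context}\n\nQuestion: tell me about Michael Greene's risks."
```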
Graph aficionados, meanwhile, have evangelized GraphRAG, the application of graph technologies - most popularly knowledge graphs, repositories of relationship information governed by a user-defined ontology or set of rules. GraphRAG also entails the overhead of converting graph structures into searchable vector embeddings, and I would make the case that graphs are just one (highly useful) tool. GraphRAG aficionados would rightly point out that, once vectorized into searchable form, graph data can form part of the aggregated searchable vector layer. However, it bloats the vector store and search middleware layer, which, as noted, is storage- and compute-intensive.
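A toy illustration of that extra GraphRAG step: relationship triples must first be flattened into text and then embedded before they become searchable, duplicating into the vector layer knowledge you already hold as a graph. The triples and helper below are illustrative, not drawn from any real ontology or registry.

```python
# Toy GraphRAG preprocessing: graph edges are serialized to text so they
# can be embedded and stored alongside everything else in the vector layer.
# The triples below are illustrative, not drawn from a real registry.
triples = [
    ("Michael Greene", "IS_DIRECTOR_OF", "Offshore Holdings Ltd"),
    ("Offshore Holdings Ltd", "REGISTERED_IN", "High-risk jurisdiction"),
]

def triple_to_text(subject: str, predicate: str, obj: str) -> str:
    """Serialize one graph edge into a sentence suitable for embedding."""
    return f"{subject} {predicate.replace('_', ' ').lower()} {obj}."

# Each sentence then goes through the same embed-and-store step as any
# other document - a second copy of knowledge you already hold as a graph.
flattened = [triple_to_text(*t) for t in triples]
print(flattened)
```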
Thus I recommend a simpler approach: Contextual RAG. At the highest level, it adds extra explanatory context to RAG pipelines beyond graphs and vectors, but it does not depend on them or the bloat they bring. Instead, it takes a copilot, and the global knowledge of the LLM, directly into your decision intelligence layer, which mines your data and knowledge layers. No data duplication through vectors or middleware is required: just a copilot, an LLM, and a decent decision intelligence and/or enterprise data architecture that takes your organization's knowledge and applies your guardrails.
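As a minimal sketch of that flow, assuming hypothetical llm() and query_decision_intelligence() helpers standing in for your LLM provider and your governed enterprise API - note that no vector store appears anywhere:

```python
# A minimal Contextual RAG sketch. llm() and query_decision_intelligence()
# are hypothetical stand-ins for an LLM provider and your enterprise
# decision-intelligence API; no vector store appears anywhere.

def llm(prompt: str) -> str:
    """Stand-in for an LLM call; canned replies keep the sketch runnable."""
    return "Michael Greene" if prompt.startswith("Extract") else prompt

def query_decision_intelligence(entity: str) -> dict:
    """Stand-in for the governed enterprise layer and its guardrails."""
    return {"entity": entity, "risk_flag": "high", "network_score": 285}

def contextual_rag(user_prompt: str) -> str:
    # 1. Use the LLM's global knowledge to identify the entity in the prompt.
    entity = llm(f"Extract the person named in: {user_prompt}")
    # 2. Take the prompt straight to the decision intelligence layer, where
    #    entity resolution, graph-derived scores, and guardrails already live.
    facts = query_decision_intelligence(entity)
    # 3. Have the LLM phrase the governed facts as the final response.
    return llm(f"Answer '{user_prompt}' using only these facts: {facts}")

print(contextual_rag("What risks are associated with Michael Greene?"))
```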
Let’s explore an example prompt: “Can you tell me about Michael Greene? What risks are associated with him and the transactions he has made?”
This prompt investigates a potentially risky customer, perhaps as part of a Perpetual KYC (Know Your Customer) investigation. Perpetual KYC is a pertinent use case because the discipline requires you to continuously check, update and maintain customer and counterparty records, which an LLM won't do.
LLMs, used standalone, can offer generic information and some (public) pointers, but that's about it. For example:
“Michael Green is known for his expertise in market structures and passive investment strategies, which he sees as posing significant risks to the financial system. His concern..."
Note the spelling of the name - Green versus Greene - irrespective of whether this is actually the information we would expect of said Michael Green. Is this really the Michael Greene we're investigating, or is it a hallucination?
A common RAG pipeline will attempt to answer by converting your internal data, in its various formats, into a middleware vector layer. Basic or naive RAG can surface more relevant information, but it adds complexity and other challenges.
With GraphRAG, contextual graph knowledge about Michael Greene's networks and relationships can address some of these challenges better. However, a lot rests on how well the knowledge graph layer integrates and contextualizes your enterprise data, and how well it converts that context to vectors. Such approaches are, as noted, storage- and compute-intensive.
Remember, your AI is only as good as your data foundation and the strong contextual knowledge layer and/or full-blooded decision intelligence capability you may have built on top of it. If you have one, why not use it?
Much better, then, when answering whether Michael Greene - rather than the prospectively hallucinatory Michael Green - presents risk, to get meaningful responses drawn directly from across your data estate. Informed by the enterprise data layer, the response reads: "Michael Greene is a customer of the bank and is flagged as a high risk individual.
Michael Greene is linked to offshore and corporate registry documents that indicate personal risk.
Overall, with a score of 285, Michael Greene is deemed a risky individual in this network."
Here, the copilot simply queries the firm's enterprise decision intelligence layer, which in this case is predicated on full entity resolution that goes beyond simple matching, plus graphs and ranked scores created from those graphs. No embeddings are required: just a straightforward prompt, leveraging the global knowledge contained in the LLM, taken directly to the decision intelligence application where it is assessed and guard-railed.
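To illustrate why entity resolution matters here, a toy sketch follows. The customer records, the similarity threshold, and the string matcher are all illustrative assumptions; real entity resolution goes far beyond string similarity, weighing dates of birth, addresses, and network links.

```python
# Toy entity-resolution step: the misspelled "Michael Green" from the
# LLM resolves to the governed record for Michael Greene. Records,
# threshold, and matcher are all illustrative assumptions.
from difflib import SequenceMatcher

customer_records = [
    {"name": "Michael Greene", "risk_score": 285},
    {"name": "Michaela Greenwood", "risk_score": 40},
]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def resolve_entity(query_name: str) -> dict | None:
    """Match a queried name to a customer record; real entity resolution
    also weighs DOBs, addresses, and network links, not just strings."""
    best = max(customer_records, key=lambda r: similarity(query_name, r["name"]))
    return best if similarity(query_name, best["name"]) > 0.8 else None

record = resolve_entity("Michael Green")  # the LLM's misspelled variant
if record:
    print(f"Resolved to {record['name']}, network risk score {record['risk_score']}")
```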
Should your enterprise layer also cover other areas where LLMs perform notoriously poorly, such as time-series intelligence, geospatial querying, or full-blooded mathematics, you can follow the same approach. Don't waste time and cost turning simple-yet-challenging time series or hard maths into inappropriate, costly vectors. Just take your copilot straight to the intelligence platform, be it decision intelligence, time-series intelligence, geospatial intelligence, or mathematical intelligence.
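Here is the same pattern sketched for time series, assuming an illustrative transaction table and a hypothetical anomaly helper the copilot would call directly, rather than vectorizing raw ticks:

```python
# Time-series variant of the pattern: the copilot maps "show me unusual
# transactions" to a direct engine query. The DataFrame and threshold
# logic are illustrative; no embeddings of the raw series are created.
import pandas as pd

txns = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=6, freq="W"),
    "amount": [120.0, 95.0, 15000.0, 80.0, 21000.0, 110.0],
})

def anomalous_transactions(df: pd.DataFrame, z: float = 1.0) -> pd.DataFrame:
    """Flag amounts more than z standard deviations above the mean - the
    kind of arithmetic an LLM fumbles but an engine computes exactly."""
    threshold = df["amount"].mean() + z * df["amount"].std()
    return df[df["amount"] > threshold]

# The copilot routes the natural-language question straight to this call.
print(anomalous_transactions(txns))
```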
Contextual RAG harnesses your organization's data and innate knowledge to ensure the public inference of the LLM is managed from the prompt, drawing directly on your enterprise's decision intelligence simply and efficiently. It helps democratize querying, letting subject matter experts engage directly via a copilot, lessening requirements for user interfaces or dashboards while negating the need for complicated, expensive vector databases. We have exemplified decision intelligence here, but the approach applies to all forms of intelligence.
Your AI copilot is only as good as your data foundation, so use your data comprehensively and wisely. Contextual RAG, informed by your data and knowledge layer and expressed simply and easily, makes for a powerful, effective contextual combination. In many cases, there is no need for over-complicated vectors.