RANKWITHME.AI

You already have the answers. We help the internet find them.

Structure before ads — your business, clearly defined, permanently visible

RESEARCH — 2026.03.03 WISE · RUBAIAT & JAMIL · 2025
Knowledge Extraction · LLM Architecture · Linked Data · University of Idaho · IEEE · arXiv:2506.17580 · cs.IR · cs.AI · cs.DL
How LLMs Actually Process Knowledge — And What a New Paper on Scientific Knowledge Extraction Taught Us
A reading of WISE: Workflow for Intelligent Scientific Knowledge Extraction by Sajratul Yakin Rubaiat and Hasan M. Jamil, University of Idaho (2025)
arXiv:2506.17580 — Context-Aware Scientific Knowledge Extraction on Linked Open Data using Large Language Models Read Paper →
WHY WE READ THIS PAPER 01 / 05

Every morning we sit down with a piece of research that connects to the problem we are working on. Today's paper is one of the most resonant we have encountered.

WISE — Workflow for Intelligent Scientific Knowledge Extraction — is a system built by researchers at the University of Idaho to address a specific challenge: how do you extract deep, accurate, non-redundant knowledge from an exponentially expanding web of interconnected sources, when the tools available — search engines, general-purpose LLMs — keep falling short at the edges of complex queries?

We are grateful this paper exists. Not because it validates anything we are doing, but because it illuminates the same class of problem from a completely different angle — scientific literature extraction — and does so with rigor, transparency, and results that speak for themselves. Rubaiat and Jamil published their methodology openly. They showed their math. They ran honest comparisons against established baselines and reported what they found without inflation. That is exactly the kind of work the research community needs more of, and exactly the kind of work we want to learn from.

We work on federated knowledge graph architecture for the American business economy, public law, and civic infrastructure. The problems we encounter are not the same as theirs. But reading their work carefully, we found ourselves recognizing the shape of familiar challenges described in a new language. That kind of cross-domain resonance is rare and worth paying attention to.

THE PROBLEM THEY SET OUT TO SOLVE 02 / 05

Start with a single gene. HBB. One authoritative source knows about it. That source links to 24 others. One of those links to hundreds more. Within a few traversal steps you have an exponentially expanding tree of interconnected knowledge — and no intelligent way to navigate it without either drowning in volume or stopping too early and missing what matters most.

Traditional search engines return a list and step back. The researcher does the rest manually.

General-purpose LLMs offer synthesized answers — but those answers are constrained by context window. Even the most capable models available can only hold so much simultaneously. The researchers found that GPT-4o, with its 128,000-token context window, could realistically process around eight sources at once before running out of room. This is not a criticism of any model. It is an honest accounting of a structural ceiling that affects every LLM-based retrieval system operating at scale.

The result, in domains where completeness matters — medicine, biology, materials science, social research — is answers that are confident but incomplete. The rare condition gets missed. The edge case goes unrecorded. The nuanced connection between distant sources never gets made.

WISE was built to do what a skilled expert researcher actually does: identify the most valuable leads, discard what is already understood, follow depth where it is warranted, stop when further exploration stops returning meaningful new knowledge, and surface something genuinely comprehensive at the end.

HOW WISE WORKS — THE FOUR OPERATIONS 03 / 05

WISE operates as a tree. A query is submitted. An initial set of sources is retrieved. Then four operations run recursively, layer by layer, until a stopping condition is met.
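Read as pseudocode, that layer-by-layer loop might look roughly like the sketch below. The function names, the frontier handling, and the threshold semantics are our illustrative assumptions, not the paper's implementation:

```python
# Illustrative sketch of a WISE-style traversal loop. The helper callables
# (filter_fn, score_fn, fetch_links) are placeholders for the paper's
# LLM-based components.

def traverse(query, initial_sources, fetch_links, filter_fn, score_fn, threshold):
    container = []                    # growing knowledge container
    frontier = list(initial_sources)  # sources awaiting evaluation
    while frontier:
        # 1. Filtering: reduce each source to query-relevant content.
        filtered = [(src, filter_fn(query, src)) for src in frontier]
        # 2. Scoring: rank sources by unique contribution to the container.
        filtered.sort(key=lambda pair: score_fn(pair[1], container), reverse=True)
        best_src, best_content = filtered[0]
        # 3. Threshold check: stop once the best lead hits diminishing returns.
        if score_fn(best_content, container) < threshold:
            break
        # 4. Consolidation: merge the winner, then expand along its links,
        #    keeping the runner-up sources available for later layers.
        container.append(best_content)
        frontier = fetch_links(best_src) + [src for src, _ in filtered[1:]]
    return container
```

The loop terminates either when the frontier is exhausted or when the highest-scoring remaining source falls below the threshold, mirroring the stopping condition described below.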

Filtering
Before any scoring happens, WISE strips each source down to only what is relevant to the query. A source that is 8,000 words in raw form might yield 350 words of query-relevant content after filtering. The researchers found an average reduction of over 80 percent across all sources. This is not information loss. It is signal isolation. Everything downstream operates on cleaner material as a result.
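A minimal sketch of the signal-isolation idea (the paper filters with an LLM; this toy version just keeps sentences that share a content word with the query):

```python
def filter_source(query: str, source_text: str) -> str:
    """Toy stand-in for WISE's filtering step: keep only sentences that
    share a content word with the query. (The paper uses an LLM, not term
    overlap; this only illustrates the idea of query-specific reduction.)"""
    query_terms = {w.lower().strip(".,;:") for w in query.split() if len(w) > 3}
    kept = []
    for sentence in source_text.split(". "):
        words = {w.lower().strip(".,;:") for w in sentence.split()}
        if words & query_terms:
            kept.append(sentence.strip())
    return ". ".join(kept)
```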
Scoring
After filtering, each source receives a score based on one question: how much does this source add to what has already been collected? Not how popular the source is. Not how authoritative its domain. Purely: what unique knowledge does it contribute relative to the growing container of everything already gathered? The formula combines local contribution relative to the source's own size and global contribution relative to the full knowledge container — dampened logarithmically to keep scoring meaningful across all traversal layers.
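The paper gives the exact formula; the sketch below is only one plausible reading of "local plus log-dampened global contribution," with the word-set novelty measure being our simplification:

```python
import math

def contribution_score(filtered_text: str, container_text: str) -> float:
    """One plausible reading of the scoring idea (not the paper's formula):
    a local term (share of this source's words that are new) plus a global
    term (count of new words, log-dampened by container size so that scores
    stay comparable across traversal layers)."""
    source_words = set(filtered_text.lower().split())
    if not source_words:
        return 0.0
    known_words = set(container_text.lower().split())
    new_words = source_words - known_words
    local = len(new_words) / len(source_words)
    global_ = len(new_words) / math.log(len(known_words) + 2)
    return local + global_
```

A fully redundant source scores zero under this reading, which is the behavior the scoring question demands: popularity and authority never enter the calculation.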
Threshold Check
When the highest-scoring available source can no longer clear a minimum threshold, the system terminates. Diminishing returns have been reached. Further traversal would consume compute and return noise. The researchers set their threshold empirically at T = 20, watching where score curves flattened across successive layers.
Consolidation
At each layer, filtered content from the highest-scoring sources gets merged into the growing knowledge container. This container — dense, deduplicated, query-specific — is what the system returns at the end. An LLM fuses it into a coherent final answer. The result is comprehensive and non-redundant by design.
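A toy version of a non-redundant merge (WISE fuses content with an LLM; exact-match sentence deduplication here is our simplification) might look like:

```python
def consolidate(container: str, new_content: str) -> str:
    """Toy deduplicating merge: append only sentences the container does
    not already hold. Illustrates the non-redundant-by-design property,
    not the paper's LLM-based fusion."""
    seen = {s.strip().lower() for s in container.split(". ") if s.strip()}
    additions = []
    for sentence in new_content.split(". "):
        key = sentence.strip().lower()
        if key and key not in seen:
            additions.append(sentence.strip())
            seen.add(key)
    if not additions:
        return container
    prefix = container + ". " if container else ""
    return prefix + ". ".join(additions)
```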
System                 Diseases Found   Recall
WISE                   16               0.84
ChatGPT (GPT-4o)       9                0.47
ChatGPT with Search    7                0.36
Google Search          3                0.15
Gemini                 2                0.10

WISE's output also scored lower on ROUGE and BLEU overlap metrics than every other system: it was not just finding more, it was surfacing genuinely different information that no other system turned up.

WHAT THIS PAPER ILLUMINATED FOR US 04 / 05

We work on a different problem in a different context. But reading this paper carefully, several things came into focus that we want to share honestly.

The challenge of traversing an interconnected knowledge graph without drowning in redundancy is not specific to scientific literature. Any system that needs to reason across a large, heterogeneous corpus of structured entities faces the same fundamental tension: breadth versus depth, volume versus signal, traversal cost versus completeness.

What WISE demonstrates clearly is that the relationship between pieces of knowledge matters as much as the pieces themselves. A system that scores sources by unique contribution — that actively measures what each new source adds relative to everything already known — produces dramatically better results than a system that ranks by popularity or authority alone. The graph structure is not just a storage mechanism. It is a reasoning surface. The edges between entities carry meaning that the entities themselves cannot carry alone.

The researchers also identify knowledge graph integration as a direction they want to explore further — moving from a text-based knowledge container toward a node-and-edge representation where relationships are preserved explicitly rather than merged into accumulating text. Their preliminary experiments — representing the HBB gene entry as a graph of 56 nodes and 55 edges and filtering it to an 11-node, 16-edge subgraph aligned with a specific query — showed that structured representations can preserve relational meaning that text containers lose.
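Purely as an illustration of that direction (the node labels, relation names, and substring relevance test below are hypothetical, not drawn from the paper's experiment), query-aligned subgraph pruning can be sketched as:

```python
def filter_subgraph(nodes, edges, query_terms):
    """Prune a labeled graph to a query-aligned subgraph: keep nodes whose
    label mentions a query term, and keep edges only between kept nodes.
    nodes: {node_id: label}; edges: [(src, dst, relation)]."""
    kept = {node_id for node_id, label in nodes.items()
            if any(term in label.lower() for term in query_terms)}
    kept_edges = [(u, v, rel) for u, v, rel in edges if u in kept and v in kept]
    return kept, kept_edges
```

The point of the structured form is visible even in this toy: the surviving edges still say *how* the kept entities relate, which a flattened text container would have merged away.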

We find this direction genuinely exciting and we look forward to seeing where their future work goes.

WHY WE PUBLISH THESE READINGS 05 / 05

RankWithMe.ai is a learning resource. We are building the most machine-readable map of the American business economy that has ever existed, and we are doing it openly — publishing our research, our methodology, our specification, and our thinking as we go.

Part of that commitment is honest engagement with the research community. Papers like this one represent years of careful work by researchers who deserve to be read, cited, and built upon. We read them, we share what we learn, and we point anyone who finds this useful back to the primary source.

If you work in knowledge graph architecture, LLM-based retrieval, linked data systems, or any adjacent field — this paper is worth your time. The full text is available on arXiv. The researchers are at the University of Idaho, Department of Computer Science. Their work is real, their results are independently verifiable, and their methodology is described with enough clarity to build on.

We are grateful they published it.

Read the full paper — arXiv:2506.17580 →
REFERENCES · PRIMARY SOURCES
[1]
Sedlakova, J. et al. — Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review. PLOS Digital Health, vol. 2, no. 10, pp. 1–22, 2023. doi.org/10.1371/journal.pdig.0000347
[2]
Luo, J., Wu, M., Gopukumar, D., & Zhao, Y. — Big data application in biomedical research and health care: A literature review. Biomedical Informatics Insights, vol. 8, 2016. doi.org/10.4137/BII.S31559
[3]
Zhai, C. — Large language models and future of information retrieval: Opportunities and challenges. Proceedings of the 47th International ACM SIGIR Conference, 2024, pp. 481–490.
[4]
Salemi, A. & Zamani, H. — Towards a search engine for machines: Unified ranking for multiple retrieval-augmented large language models. SIGIR '24, ACM, 2024, pp. 741–751. doi.org/10.1145/3626772.3657733
[5]
Ziems, N., Yu, W., Zhang, Z., & Jiang, M. — Large language models are built-in autoregressive search engines. arXiv preprint, 2023. arXiv:2305.09612
[6]
HUGO Gene Nomenclature Committee (HGNC) — HBB Gene Symbol Report, 2024. Root source node used in WISE traversal experiment. genenames.org — HGNC:4827
[7]
Clinical Genome Resource (ClinGen) — HBB Gene, Clinical Genome Knowledge Base, 2024. clinicalgenome.org — HGNC:4827
[8]
National Center for Biotechnology Information (NCBI) — Sickle Cell Anemia, NCBI Bookshelf, 2024. ncbi.nlm.nih.gov/books/NBK1435
[9]
Garcia, G.L. et al. — A review on scientific knowledge extraction using large language models in biomedical sciences, 2024. arXiv:2412.03531
[10]
Alshami, A. et al. — Harnessing the power of ChatGPT for automating systematic review process: Methodology, case study, limitations, and future directions. Systems, vol. 11, no. 7, 2023. mdpi.com/2079-8954/11/7/351
[11]
Saxena, S. et al. — Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents. IEEE International Conference on Big Data, 2022, pp. 2364–2369. doi.ieeecomputersociety.org
[12]
Sneyd, A. & Stevenson, M. — Modelling stopping criteria for search results using Poisson processes. EMNLP-IJCNLP 2019, Association for Computational Linguistics, pp. 3484–3489. aclanthology.org/D19-1351
[13]
Parmentier, M. & Legay, A. — Adaptive stopping algorithms based on concentration inequalities. AISoLA 2024, Springer-Verlag, 2025, pp. 336–353. doi.org/10.1007/978-3-031-75434-0_23
[14]
Vaswani, A. et al. — Attention is all you need, 2017. The foundational transformer architecture paper underlying modern LLMs. arXiv:1706.03762
[15]
OpenAI — GPT-4o, 2024. Context window of up to 128,000 tokens. Baseline system used in WISE comparative evaluation. platform.openai.com/docs/models
[16]
UniProt Consortium — Hemoglobin subunit beta (HBB), 2024. Primary test source in WISE filtering experiment — 8,249 words reduced to 355 after query-specific filtering. uniprot.org/uniprotkb/P68871
[17]
Han, J. — Mining knowledge at multiple concept levels. Proceedings of the 4th International Conference on Information and Knowledge Management, 1995, pp. 19–24.
[18]
Sharma, D. et al. — A brief review on search engine optimization. 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), IEEE, 2019, pp. 687–692.
[19]
Jumper, J. et al. — Highly accurate protein structure prediction with AlphaFold. Nature, vol. 596, no. 7873, pp. 583–589, 2021. Referenced in WISE scoring experiments as high-authority source.
[20]
Shahriar, S. et al. — Putting GPT-4o to the sword: A comprehensive evaluation of language, vision, speech, and multimodal proficiency. Applied Sciences, vol. 14, no. 17, 2024. mdpi.com/2076-3417/14/17/7782
[21]
OpenAI — Introducing ChatGPT Search, 2024. openai.com/index/introducing-chatgpt-search
[22]
Sun, W. et al. — Is ChatGPT good at search? Investigating large language models as re-ranking agents. arXiv preprint, 2023. arXiv:2304.09542
[23]
Gemini Team, Google — Gemini: A family of highly capable multimodal models. arXiv preprint, 2023. arXiv:2312.11805
[24]
Piasecki, J., Waligora, M., & Dranseika, V. — Google search as an additional source in systematic reviews. Science and Engineering Ethics, vol. 24, pp. 809–810, 2018.
[25]
Cilibrasi, R.L. & Vitanyi, P.M. — The Google similarity distance. IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 3, pp. 370–383, 2007.
[26]
Lin, C.-Y. — ROUGE: A package for automatic evaluation of summaries. Text Summarization Branches Out, ACL, 2004. Evaluation metric used in WISE comparative analysis.
[27]
Papineni, K., Roukos, S., Ward, T., & Zhu, W.J. — BLEU: A method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the ACL, 2002, pp. 311–318. Evaluation metric used alongside ROUGE in WISE analysis.
[28]
Gill, J.K. et al. — Large language model based framework for automated extraction of genetic interactions from unstructured data. PLOS ONE, vol. 19, no. 5, 2024. doi.org/10.1371/journal.pone.0303231
[29]
Ahmed, M. et al. — Identifying protein-protein interaction using tree LSTM and structured attention. IEEE 13th International Conference on Semantic Computing (ICSC), 2019, pp. 224–231.
[30]
Zhang, Y. et al. — Neural network-based approaches for biomedical relation classification: A review. Journal of Biomedical Informatics, vol. 99, 2019. doi.org/10.1016/j.jbi.2019.103294
[31]
Lee, J. et al. — BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, vol. 36, no. 4, pp. 1234–1240, 2020.
[32]
Zhou, D., Zhong, D., & He, Y. — Biomedical relation extraction: From binary to complex. Computational and Mathematical Methods in Medicine, vol. 2014. onlinelibrary.wiley.com
[33]
Sarmah, B. et al. — HybridRAG: Integrating knowledge graphs and vector retrieval augmented generation for efficient information extraction. Proceedings of the 5th ACM International Conference on AI in Finance, 2024, pp. 608–616.
[34]
Buehler, M.J. — Accelerating scientific discovery with generative knowledge extraction, graph-based representation, and multimodal intelligent graph reasoning. Machine Learning: Science and Technology, vol. 5, no. 3, 2024. dx.doi.org/10.1088/2632-2153/ad7228
[35]
Zuluaga, M. et al. — Tree of Science (ToS): A web-based tool for scientific literature recommendation. Search less, research more. Issues in Science and Technology Librarianship, no. 100, 2022. journals.library.ualberta.ca
[36]
Huang, W., Zhao, X., & Huang, X. — Embedding and extraction of knowledge in tree ensemble classifiers. Machine Learning, vol. 111, no. 5, pp. 1925–1958, 2022. doi.org/10.1007/s10994-021-06068-6
[37]
Yu, S. et al. — Multi-source knowledge pruning for retrieval-augmented generation: A benchmark and empirical study, 2024. arXiv:2409.13694
[38]
Fan, S. et al. — WorkflowLLM: Enhancing workflow orchestration capability of large language models, 2024. arXiv:2411.05451
[39]
Hong, Z. et al. — Challenges and advances in information extraction from scientific literature: A review. JOM, vol. 73, no. 11, pp. 3383–3400, 2021. doi.org/10.1007/s11837-021-04902-9
[40]
Xie, T. et al. — ByteScience: Bridging unstructured scientific literature and structured data with auto fine-tuned large language model in token granularity, 2024. arXiv:2411.12000
Funding acknowledgment: This research was supported in part by NIH IDeA grant P20GM103408, NSF CSSI grant OAC 2410668, and US Department of Energy grant DE-0011014.
RankWithMe.ai logo
SYSTEM STATUS
Page: RESEARCH
Date: 2026.03.03
Paper: WISE
Authors: RUBAIAT · JAMIL
Institution: U OF IDAHO
Domain: cs.IR · cs.AI
arXiv: 2506.17580
Recall (WISE): 0.84
Baseline Top: 0.47
Text Reduction: 80%+
JSON-LD: PENDING
Root-LD: PENDING
Human Verified: TRUE
Build: 2026-PROD Method: ENTITY-FIRST Status: OPERATIONAL
Structure before ads. Always.