Property Graphs vs. RDF — Which Actually Matters for AI
Daniel Davis, Co-creator at TrustGraph
If you’re building AI infrastructure and you Google “graph database,” you find Neo4j. It’s fast, intuitive, and everywhere. So many teams building context layers for AI agents start with Neo4j — then run into a fundamental limitation that forces them to start over.
Daniel Davis has been through this migration. TrustGraph initially used Neo4j for graph storage. Then he realized the architecture didn’t support what context graphs actually need to do. So he switched to RDF with Cassandra. The difference isn’t academic — it’s the distinction between a database that works great for many things and one actually designed for how AI systems need to reason about information reliability.
“Neo4j is optimized for graph pattern matching,” Daniel explains. “It’s brilliant at that. But it’s fundamentally a property graph architecture. RDF is optimized for statements about statements — reification. Those are different problems.”
This distinction matters because choosing the wrong architecture for context infrastructure is like building a shopping mall on a foundation designed for a house. It might work initially, but when you add load, the cracks appear.
How property graphs work (and why they’re seductive)
Property graphs are intuitive. You have nodes. Each node has properties (attributes). You connect nodes with edges that also have properties. It maps naturally onto human thinking: “Fred is a dog. Fred has four legs. Fred lives with Alice.”
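To make the model concrete, here is a minimal sketch in plain Python. It is not how Neo4j stores data internally; the `Node` and `Edge` classes, the `neighbors` helper, and the `since` value are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str   # key of the start node
    kind: str     # relationship type
    target: str   # key of the end node
    properties: dict = field(default_factory=dict)


# Nodes and edges each carry their own property map.
nodes = {
    "fred": Node("Dog", {"name": "Fred", "legs": 4}),
    "alice": Node("Person", {"name": "Alice"}),
}
edges = [Edge("fred", "LIVES_WITH", "alice", {"since": 2021})]  # illustrative value


def neighbors(key, kind):
    """Follow outgoing edges of one relationship type from a node."""
    return [e.target for e in edges if e.source == key and e.kind == kind]


print(neighbors("fred", "LIVES_WITH"))  # ['alice']
```

Everything known about Fred hangs off the `fred` node or its edges, which is what makes traversal so direct.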
Neo4j made property graphs fast and accessible. You can write a query like “find all dogs owned by people named Alice” and get an answer quickly, because traversal and pattern matching are what the engine is built for.
“Neo4j dominates because it’s fast and the query language is human-readable,” Daniel says. “For relationship discovery — ‘show me all the connections between person A and person B’ — property graphs are genuinely excellent.”
But here’s where it breaks for AI context infrastructure: property graphs assume properties belong to entities. Nodes have attributes. Edges have attributes. But there’s no clean way to ask “who said this property existed, and when, and how confident are they?”
In Neo4j, you might store: “Fred has_legs 4” with properties like timestamp and source. But the structure treats the property as belonging to the node, not as a statement that can be questioned, evaluated, and compared against other statements.
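A simplified sketch of that limitation, assuming a node is just a key/value map (the `set_property` helper and the `_source` / `_timestamp` naming scheme are hypothetical, not a Neo4j convention):

```python
fred = {"name": "Fred"}


def set_property(node, key, value, source, timestamp):
    """Store a value plus its provenance as sibling properties on the node."""
    node[key] = value
    node[f"{key}_source"] = source
    node[f"{key}_timestamp"] = timestamp


set_property(fred, "legs", 4, "Daniel", "2026-03-01")
set_property(fred, "legs", 5, "Chris", "2026-03-02")

# One value per key: the second write replaced the first,
# so Daniel's observation can no longer be compared against Chris's.
print(fred)
```

You can work around this with list-valued properties or extra nodes, but at that point you are modeling claims by hand rather than getting them from the data model.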
The RDF model: making statements queryable
RDF works differently. It stores triples: subject-predicate-object. “Fred has-legs 4” becomes a statement — a fact you can reference, evaluate, and make other statements about.
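As a rough sketch, a triple store can be pictured as a set of three-part tuples queried by pattern. The `Triple` type and `match` function below are illustrative stand-ins, not a real RDF library or SPARQL.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


store = {
    Triple("Fred", "is-a", "Dog"),
    Triple("Fred", "has-legs", "4"),
    Triple("Fred", "lives-with", "Alice"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    )


print(match(p="has-legs"))
```

Every fact has the same shape, so “what do we know about Fred” and “who has four legs” are the same kind of query with different slots left open.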
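Where property graphs store properties on nodes, RDF treats the statement itself as the unit of data. This opens up a crucial capability: reification, the ability to make statements about statements. In practice that is done with the standard reification vocabulary, named graphs, or RDF-star.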
Instead of “Fred has-legs 4,” you store:
- “Daniel stated that Fred has-legs 4 on March 1, 2026” (claim)
- “This claim comes from observation” (source type)
- “Competing claims from other sources: Chris stated Fred has-legs 5 on March 2, 2026” (contradiction)
- “Daniel’s historical accuracy on animal observations: 92%” (credibility)
Now the system can evaluate not just the fact, but the metadata around the fact. Is Fred actually growing legs, or is there an observation discrepancy? Which source is more reliable? When did the information last change?
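Here is a minimal sketch of those reified claims as plain tuples. The `rdf:Statement`, `rdf:subject`, `rdf:predicate`, and `rdf:object` terms come from the standard RDF reification vocabulary; the `ex:` predicates, the claim identifiers, and the helper functions are assumptions made for this example.

```python
from collections import defaultdict

triples = [
    ("claim1", "rdf:type", "rdf:Statement"),
    ("claim1", "rdf:subject", "Fred"),
    ("claim1", "rdf:predicate", "has-legs"),
    ("claim1", "rdf:object", "4"),
    ("claim1", "ex:statedBy", "Daniel"),
    ("claim1", "ex:statedOn", "2026-03-01"),
    ("claim1", "ex:sourceType", "observation"),
    ("claim2", "rdf:type", "rdf:Statement"),
    ("claim2", "rdf:subject", "Fred"),
    ("claim2", "rdf:predicate", "has-legs"),
    ("claim2", "rdf:object", "5"),
    ("claim2", "ex:statedBy", "Chris"),
    ("claim2", "ex:statedOn", "2026-03-02"),
]


def describe(claim_id):
    """Collect every predicate/object pair attached to one claim."""
    return {p: o for s, p, o in triples if s == claim_id}


def contradictions():
    """Group claims by (subject, predicate); keep groups that disagree on the object."""
    claim_ids = sorted(
        s for s, p, o in triples if p == "rdf:type" and o == "rdf:Statement"
    )
    groups = defaultdict(dict)
    for cid in claim_ids:
        d = describe(cid)
        groups[(d["rdf:subject"], d["rdf:predicate"])][cid] = d["rdf:object"]
    return {k: v for k, v in groups.items() if len(set(v.values())) > 1}


print(contradictions())
```

Both claims survive side by side, and the metadata about each one is just more triples in the same store.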
“This is the architectural difference,” Daniel emphasizes. “With property graphs, the metadata lives outside the core structure. With RDF, the metadata is part of the structure. It’s queryable, comparable, evaluable.”
The production problem: why property graphs break
Most AI teams start with Neo4j because it’s the default choice. They build a prototype that works: store facts, retrieve similar information, connect related entities. In the lab, it functions fine.
Then they go to production. The AI agent needs to make decisions about whether to trust information. It needs to understand when data contradicts, when sources disagree, when information is stale. Suddenly the property graph architecture becomes a liability.
“You end up bolting metadata onto the side of the property graph,” Daniel explains. “You have a separate table tracking source credibility. Another table tracking temporal information. Another tracking observation changes. It works, but you’re fighting against the architecture instead of working with it.”
Neo4j can technically handle this — you can store whatever you want. But it’s not optimized for it. Queries asking “show me statements that are contradicted by newer information from more credible sources” become expensive. You’re forcing the property graph to do something it wasn’t designed for.
With RDF, these queries are more natural because contradiction, source quality, and temporal information can all be expressed as triples in the same graph as the facts they describe. The architecture supports it natively.
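To show what that query is actually asking, here is a small sketch over claim records in plain Python. It is one possible reading of “contradicted by newer information from more credible sources,” not TrustGraph’s implementation. Daniel’s 0.92 score comes from the example earlier in the article; Chris’s score, the `Claim` class, and the `superseded` function are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    source: str
    stated_on: date


# Chris's score is a made-up number for the example.
credibility = {"Daniel": 0.92, "Chris": 0.97}

claims = [
    Claim("Fred", "has-legs", "4", "Daniel", date(2026, 3, 1)),
    Claim("Fred", "has-legs", "5", "Chris", date(2026, 3, 2)),
]


def superseded(claims, credibility):
    """Return claims contradicted by a newer claim from a more credible source."""
    def beats(new, old):
        return (
            (new.subject, new.predicate) == (old.subject, old.predicate)
            and new.value != old.value
            and new.stated_on > old.stated_on
            and credibility[new.source] > credibility[old.source]
        )
    return [old for old in claims if any(beats(new, old) for new in claims)]


print(superseded(claims, credibility))
```

The query only works because value, source, and date are all attached to the claim itself rather than scattered across side tables.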
The Neo4j dilemma
This creates an awkward situation for Neo4j users building context infrastructure. The database is excellent for what it was designed for — relationship traversal, pattern matching, network analysis. But for context infrastructure, you eventually need something else.
“The people I know at Neo4j are probably going to unfollow me after this,” Daniel says with a knowing laugh, “but you have to admit that graph databases are fundamentally just rows and columns too. Neo4j optimizes for relationship traversal. If you’re optimizing for provenance evaluation, you need a different architecture.”
Some teams try to extend Neo4j into a context layer. It’s possible but awkward — like using a sports car for hauling lumber. It can technically work, but it’s the wrong tool.
Others start with property graphs and end up migrating to RDF when production requirements force the issue. It’s expensive. You’ve built queries optimized for one architecture, then have to translate them to another.
“This is why it’s critical to choose the right architecture early,” Daniel advises. “If you’re building AI infrastructure where credibility, provenance, and temporal reasoning matter, RDF isn’t optional. It’s foundational.”
Why RDF is less popular (and why that matters)
RDF has been a W3C Recommendation since 1999, with a major revision in 2004. It’s designed for exactly this use case: managing statements, their sources, their metadata, their contradictions. So why isn’t it as widely used as property graphs?
“RDF has a serious marketing problem,” Daniel jokes. “Neo4j marketed property graphs brilliantly. RDF got marketed by semantic web people as ‘the future of the internet.’ Most engineers looked at that and moved on to something simpler.”
There’s also a genuine usability issue. RDF requires thinking differently about data. You’re not traversing a graph of connected entities. You’re querying statements. The mental model is different. Triple stores, and RDF layers built on general-purpose stores like Cassandra, aren’t as easy to explore interactively as Neo4j.
But for AI context infrastructure specifically, those marketing and usability challenges are worth solving. “When you need to reason about information reliability, RDF’s complexity is actually a feature, not a bug,” Daniel explains.
TrustGraph’s choice: Cassandra + RDF
When TrustGraph dropped Neo4j support, they moved to Apache Cassandra for storage with Qdrant for vector similarity. The choice wasn’t random.
Cassandra gives them distributed, fault-tolerant storage at scale. Qdrant gives them semantic similarity on top of the triples. But the core architecture is RDF — every piece of information is a statement that can be queried, contradicted, and evaluated.
“Cassandra at the storage layer, RDF at the data model layer, Qdrant for retrieval. That’s the stack optimized for ‘how do I build context infrastructure that lets AI agents reason about trustworthiness,’” Daniel describes.
This architecture lets TrustGraph do things property graphs fundamentally can’t: track competing claims from different sources, measure consistency over time, evaluate source credibility based on historical accuracy, and represent nuance in a way that an AI system can actually reason about.
Practical implications for builders
If you’re building AI infrastructure and considering databases, here’s the question to ask yourself: do you need to track and evaluate competing claims and their sources, or just traverse relationships?
If it’s the latter, Neo4j is excellent. Go there. Don’t overthink it.
If it’s the former — if your AI agent needs to understand whether information is trustworthy, when it became stale, or how it contradicts other information — RDF is foundational. The choice between property graphs and RDF is really a choice between “how quickly can I query the graph” and “how completely can I represent provenance and credibility?”
“For context infrastructure, provenance wins,” Daniel concludes. “You can always add speed and optimization. But if the architecture doesn’t support credibility tracking from the bottom up, you’ll keep hitting walls in production.”
FAQ
Is RDF harder to learn than property graphs?
For basic relationship queries, yes. Neo4j’s Cypher query language is more intuitive than SPARQL (the RDF query language). But the learning curve reflects the underlying difference: property graphs are optimized for intuitive traversal, RDF for complex statement evaluation. If your use case is relationship traversal, property graphs are simpler. If it’s provenance evaluation, RDF’s complexity is worth it.
Can Neo4j represent reification?
Technically, you can store reification-like data in Neo4j by creating nodes that represent claims and edges that represent their properties. But it’s not native to the architecture. You’re fighting the database instead of working with it. RDF supports reification natively because statements and statements-about-statements are the core unit.
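A rough sketch of that workaround, using plain dictionaries in place of a real property graph (the `Claim` and `Source` labels, the `ABOUT` and `STATED_BY` relationship types, and the `claims_about` helper are hypothetical modeling choices):

```python
nodes = {
    "fred": {"label": "Dog", "name": "Fred"},
    "daniel": {"label": "Source", "name": "Daniel"},
    # The claim becomes its own node so it can carry provenance.
    "claim1": {
        "label": "Claim",
        "predicate": "has_legs",
        "value": 4,
        "stated_on": "2026-03-01",
    },
}
edges = [
    ("claim1", "ABOUT", "fred"),
    ("claim1", "STATED_BY", "daniel"),
]


def claims_about(entity):
    """Find claims about an entity; each lookup needs an extra hop via the Claim node."""
    result = []
    for src, kind, dst in edges:
        if kind == "ABOUT" and dst == entity:
            claim = nodes[src]
            sources = [
                nodes[d]["name"]
                for s, k, d in edges
                if s == src and k == "STATED_BY"
            ]
            result.append((claim["predicate"], claim["value"], sources))
    return result


print(claims_about("fred"))  # [('has_legs', 4, ['Daniel'])]
```

It works, but “Fred has four legs” is no longer a property of Fred. You have rebuilt a statement model on top of nodes and edges, and every query has to know about that convention.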
What’s the performance difference between property graphs and RDF?
On traversal queries (“find connections between A and B”), property graphs are typically faster because they’re optimized for that. On statement evaluation queries (“find claims that are contradicted by newer, more credible information”), RDF is typically faster because the structure supports it. The architecture difference means they have different performance profiles for different query types.
Do I have to choose between Neo4j and RDF?
In theory, you could use both — Neo4j for relationship analysis, RDF for provenance. But it creates data synchronization problems. You’re maintaining two copies of related information in different formats. It’s expensive and error-prone. Most teams that try this end up picking one or the other.
Can Cassandra or Postgres replace a specialized triple store?
You can store RDF triples in any database that supports rows and columns. But a specialized triple store like GraphDB or Virtuoso is optimized for RDF queries (SPARQL) and reification. If you’re building on Cassandra, you’re managing RDF indexing yourself, which adds complexity. It’s a tradeoff: Cassandra is more flexible and distributed; a triple store is more optimized.
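One common way to manage that indexing yourself is to store each triple under several orderings, so that any lookup pattern has an index to hit. The sketch below keeps three in-memory orderings (subject-first, predicate-first, object-first). It illustrates the general technique only and is not TrustGraph’s actual Cassandra schema; the `TripleIndex` class and its method names are assumptions.

```python
from collections import defaultdict


class TripleIndex:
    """Keeps three orderings of every triple so each lookup has a matching index."""

    def __init__(self):
        self.spo = defaultdict(set)  # subject   -> {(predicate, object)}
        self.pos = defaultdict(set)  # predicate -> {(object, subject)}
        self.osp = defaultdict(set)  # object    -> {(subject, predicate)}

    def add(self, s, p, o):
        # Every write goes to all three orderings.
        self.spo[s].add((p, o))
        self.pos[p].add((o, s))
        self.osp[o].add((s, p))

    def by_subject(self, s):
        return sorted((s, p, o) for p, o in self.spo.get(s, ()))

    def by_predicate(self, p):
        return sorted((s, p, o) for o, s in self.pos.get(p, ()))

    def by_object(self, o):
        return sorted((s, p, o) for s, p in self.osp.get(o, ()))


index = TripleIndex()
index.add("Fred", "has-legs", "4")
index.add("Fred", "lives-with", "Alice")
print(index.by_object("Alice"))  # [('Fred', 'lives-with', 'Alice')]
```

The cost is visible in `add`: each triple is written three times. That write amplification and the consistency of the copies are the complexity you take on when the database does not do it for you.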
Is the “everything is just rows and columns” comment true?
Yes and no. At the storage layer, every database stores rows or columns. But at the logical layer (how you query and reason about data), the architecture matters. Property graphs optimize for traversal. RDF optimizes for statement evaluation. The optimization difference is real, not just marketing.
How do I know if my use case needs RDF?
Ask: do I need to evaluate competing claims? Track source credibility? Represent temporal changes? Measure consistency? If you’re doing any of these, RDF’s statement-based architecture is foundational. If you’re just finding relationships between entities, property graphs work fine.
What happens if I start with Neo4j and realize I need RDF later?
You’ll need to migrate. It’s expensive. All your queries need rewriting. Relationships stored as property attributes need converting to reifiable statements. This is why choosing architecture early matters — switching later is painful.
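The data side of that conversion can be sketched as below. This is an assumed shape for a migration step, not a real export tool: the `properties_to_claims` function, the record fields, and the `"neo4j-export"` source label are all hypothetical.

```python
def properties_to_claims(node_id, properties, source, stated_on):
    """Turn each key/value property on a node into a standalone claim record."""
    return [
        {
            "subject": node_id,
            "predicate": key,
            "object": value,
            "source": source,        # provenance the property graph never recorded
            "stated_on": stated_on,
        }
        for key, value in sorted(properties.items())
    ]


claims = properties_to_claims(
    "Fred",
    {"legs": 4, "species": "dog"},
    source="neo4j-export",   # placeholder: the original source is unknown
    stated_on="2026-03-01",
)
for claim in claims:
    print(claim)
```

The awkward part is the `source` argument. If the original graph never recorded who asserted each property, the migration has to fill in a placeholder, and that provenance cannot be recovered after the fact.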
Can vector databases replace graph databases for AI context?
Vector databases are optimized for semantic similarity (“find similar embeddings”). Graph databases are optimized for relationships or statements. They solve different problems. For context infrastructure, you need statements (RDF) or at least relationships (property graphs) because similarity alone can’t represent credibility, contradiction, or provenance.