Why HTML Canvas Beats Graphics Rendering — And Why It Matters for AI
Steve Ruiz, Founder & CEO at tldraw
Most designers and developers assume that a tool like Figma or Miro — something that feels like a real-time, interactive canvas — must be built on custom graphics rendering. It’s the industry standard: Figma, Miro, Excalidraw, all of them render everything as bitmaps because that’s how you squeeze performance out of pixel-heavy interactions.
Steve Ruiz, the founder and CEO of tldraw, made a different bet. Instead of graphics rendering, he built an infinite canvas using HTML and CSS — the same building blocks as any website. The result: a tool that feels like a whiteboard but can contain videos, interactive forms, entire chat windows, code editors, even instances of tldraw itself nested inside the canvas.
The bet worked. Not because HTML is faster at rendering pixels — it isn’t. But because standard web technology unlocks something graphics rendering can’t: the ability to put anything on the canvas, including the parts of the web that actually matter.
The Performance Myth That Blocked Innovation
For decades, developers assumed that canvas-like interactions — panning, zooming, dragging objects — required custom rendering pipelines. Graphics APIs like WebGL and Canvas 2D are built for speed, specifically for pushing raw pixels to the screen as fast as possible.
“Most of the types of apps that act this way — Figma, Miro, Whimsical, Excalidraw — what you’re seeing is basically just an image being repainted with the contents of the canvas,” Steve explains. “There’s a million reasons why you would do that. It’s the same thing a video game would do for performance.”
But Ruiz suspected the assumption was too conservative. “My bet with tldraw was, no, I could probably just make normal website stuff just as fast and just as performant.”
He was right. Not perfectly right — HTML and CSS canvases won’t compete with graphics rendering on raw speed. But they’re fast enough for what people actually want to do: compare dashboards, take notes on visualizations, iterate on wireframes, collaborate in real time. The 60-frame-per-second rendering that Figma optimizes for isn’t the constraint anymore. The constraint is what you can put on the canvas.
Why That Distinction Matters
Here’s the critical difference: if your canvas is just a bitmap, you can only put images on it. But if your canvas is built on HTML and CSS, you can put anything — videos, interactive forms, entire chat interfaces, even other applications.
“If I could just make HTML and CSS and these building blocks of normal websites work like a canvas works,” Steve says, “then I would be able to put anything onto that canvas. Video, GIF, interactive stuff like forms, entire other chat, or code editors, or text editors, or even instances of tldraw itself on the canvas inside of itself.”
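The core trick behind a DOM-based canvas is simpler than it sounds: instead of repainting a bitmap, you apply a single CSS transform to a container element, and every child inside it remains a real, interactive HTML element. The sketch below illustrates the idea with a hypothetical camera model — the names and API are illustrative, not tldraw’s actual implementation:

```typescript
// A minimal sketch (not tldraw's real API) of an HTML/CSS infinite canvas:
// pan and zoom are expressed as one CSS transform on a container div,
// so children stay live DOM elements (videos, forms, iframes, ...).

type Camera = { x: number; y: number; z: number } // pan offset + zoom factor

// Build the CSS transform string applied to the canvas container.
function cameraToTransform(c: Camera): string {
  return `scale(${c.z}) translate(${c.x}px, ${c.y}px)`
}

// Map a screen-space point (e.g. a pointer event) back into canvas space,
// so hit-testing and dragging work correctly at any zoom level.
function screenToCanvas(px: number, py: number, c: Camera): { x: number; y: number } {
  return { x: px / c.z - c.x, y: py / c.z - c.y }
}

const camera: Camera = { x: -100, y: -50, z: 2 }
console.log(cameraToTransform(camera))       // "scale(2) translate(-100px, -50px)"
console.log(screenToCanvas(200, 100, camera)) // { x: 200, y: 100 }
```

Because the browser’s compositor handles the transform, the elements on the canvas never stop being the web: a `<video>` keeps playing, a form keeps accepting input, all while you pan and zoom around them.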
That sentence is easy to read but hard to grok. Let’s make it concrete: Observable, a data visualization platform built by the creators of D3.js, uses tldraw to let data scientists build queryable, interactive canvases where SQL queries, database connections, and visualizations all live as interactive objects on a single plane. You can annotate a graph, move queries around, and let changes ripple across dependencies — none of that would be possible if the canvas were just a bitmap.
Or take the work happening at Hightouch, a customer data platform. Designers there are shipping features built on top of tldraw’s canvas because they can prototype multi-stage AI workflows — text inputs, image outputs, decision nodes — all in one visual space. The canvas becomes both the design tool and the product.
Those products aren’t possible with graphics rendering. They’re only possible because tldraw chose infrastructure over micro-optimizations.
The Community Proof
Ruiz regularly demos what the community has built on top of tldraw. A programmer named Grant Coté built a liquid simulation in the browser using GPU acceleration. Then he used tldraw as the engine — tldraw’s geometry system for collision detection, its interaction model for dragging, its rendering layer for visualization. The result: you can draw a container, pour virtual liquid into it with your finger, and watch it behave realistically. All with code, all on the canvas.
“Grant used tldraw as a kind of an engine for working with this liquid simulation,” Steve says. “Using that same geometry system here, mapped into the shader or whatever. Now you can kind of just take this and create like vessels and pour things out.”
Those 800+ community projects — from Chinese Bible study groups using tldraw for collaborative markup, to Russian language tutoring sessions, to circuit diagram tools, to AI agent orchestration — are only possible because the foundation is open enough to support them.
What This Teaches About Infrastructure
The lesson here goes beyond canvas technology. Ruiz made the counterintuitive move: instead of optimizing for the use case he knew (whiteboards and design tools), he optimized for extensibility. Instead of choosing the fastest possible engine, he chose the most flexible one.
“My bet has always been that by making the canvas available, people will do really interesting things with the canvas,” he says. “It’s not just whiteboarding. It’s data visualization, tutoring, interviews, Bible study, code editing. The fundamental use case keeps going and going and going.”
That’s the hidden power of choosing HTML and CSS over graphics rendering. You don’t optimize for performance. You optimize for surprise. And in a world where AI is expanding what’s possible every few months, that flexibility matters more than pixel-perfect speed.
FAQ
What does “building a canvas on HTML and CSS” actually mean?
Instead of using graphics rendering APIs (WebGL, Canvas 2D) that create bitmaps, tldraw uses standard web technologies so the canvas can contain real web elements — videos, forms, text editors, other interactive components. It’s visually indistinguishable from a graphics-rendered canvas but architecturally unlimited.
Why don’t other products like Figma or Miro use HTML and CSS instead of graphics rendering?
Figma and Miro optimize for design tools specifically, where pixel-perfect control and raw performance matter. tldraw prioritized extensibility over single-product optimization, which works if your business model is infrastructure, not applications.
Can HTML canvases really match graphics rendering for performance?
Fast enough, yes. tldraw stays smooth “even with gigantic documents,” as Steve puts it, because the real constraint isn’t the rendering engine — it’s the interaction model. People don’t need 144 fps; they need responsive panning and intuitive object manipulation.
How does tldraw’s approach work for collaborative, multiplayer features?
By using standard web architecture, tldraw can implement synchronization and multiplayer as a first-class feature, not a performance afterthought. The “real building blocks” of the web include websockets, event systems, and state management already designed for multi-user experiences.
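To make the sync idea concrete, here is a hedged sketch of last-writer-wins record merging, the simplest form of the state reconciliation a multiplayer canvas needs. tldraw’s real sync engine is more sophisticated; the types and names below are illustrative assumptions, not its actual API:

```typescript
// Illustrative last-writer-wins merge for multiplayer canvas state.
// Each shape record carries a logical clock; on receiving an update
// (e.g. over a websocket), the higher clock wins and stale edits are dropped.

type ShapeRecord = { id: string; x: number; y: number; clock: number }

// Merge an incoming record into the local store.
function mergeRecord(store: Map<string, ShapeRecord>, incoming: ShapeRecord): void {
  const local = store.get(incoming.id)
  if (!local || incoming.clock > local.clock) {
    store.set(incoming.id, incoming)
  }
}

const store = new Map<string, ShapeRecord>()
mergeRecord(store, { id: 'shape:a', x: 0, y: 0, clock: 1 })
mergeRecord(store, { id: 'shape:a', x: 50, y: 20, clock: 3 }) // newer: applied
mergeRecord(store, { id: 'shape:a', x: 99, y: 99, clock: 2 }) // stale: ignored
console.log(store.get('shape:a')) // { id: 'shape:a', x: 50, y: 20, clock: 3 }
```

The point is that nothing here is exotic: maps, event handlers, and websockets are standard web building blocks, which is exactly why a DOM-based canvas can treat multiplayer as a first-class feature.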
What’s an example of something you couldn’t build on a graphics-rendered canvas but can on tldraw?
Observable’s canvas where SQL queries, data connections, and visualizations all interact as live objects. Or a nested instance of tldraw inside tldraw. Or a tutoring session where an AI agent sees what the student is drawing and responds in real time.
Is tldraw slower than Figma because it uses HTML instead of graphics rendering?
For Figma’s specific use case (design tools), possibly. For the things tldraw is designed for (collaboration, extensibility, mixed-media content), the tradeoff is worth it. The “slowness” is negligible for real-world usage.
What does “appropriately over-engineered” mean in this context?
Steve’s phrase for why tldraw’s synchronization engine is faster and more robust than any single product needs. Because tldraw is an SDK, not an app, every customer benefits from that engineering investment. For an individual product, it would be wasteful. For infrastructure, it becomes a competitive advantage.
Does this architecture scale to real products, or is it just a dev toy?
Real products: Shopify uses tldraw for internal collaboration. ClickUp embeds tldraw for canvas-based task management. Google integrated Make Real (tldraw’s AI sketching tool) into their design workflows. It’s production infrastructure, not a demo.
Why does AI change the equation for canvas infrastructure?
AI models benefit from spatial interfaces because they can see and manipulate multiple objects simultaneously. Chat is one-dimensional (message → reply → message). A canvas is multi-dimensional — agents can work in parallel, see each other’s contributions, and coordinate without waiting for turn-based conversation.
What’s the takeaway for builders deciding between performance optimization and extensibility?
Ruiz’s bet suggests that if you’re building infrastructure (not a single product), extensibility compounds. Optimize for surprise, not for the use case you predict. The market will find uses you didn’t anticipate.
Watch the full conversation
Hear Steve Ruiz share the full story on Heroes Behind AI.
Watch on YouTube