Interactive Learning Path

Three Paths to Real-Time Avatars

From fundamentals to implementation. Each track builds from core concepts to working code, with interactive demos along the way. Choose your path based on your goals.

Which path is right for you?

Want photorealism of a specific person?

Start with Gaussian Splatting -- capture once, render in real-time forever.

Need precise animation control?

Start with MetaHuman -- industry-standard rigs with live face tracking.

Want any face from one photo, or building a production app?

Start with Video Generation -- diffusion synthesis and streaming providers via WebRTC.

Start Here · 30 min

End-to-End Real-Time Avatar

Build a complete conversational avatar system from scratch. Learn how audio flows through STT, LLM, TTS, and avatar rendering, with working code for each of the three approaches.

Voice AI Pipeline · Latency Optimization · LiveKit Integration · Production Deployment
Build the complete system →
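
As a rough picture of how one conversational turn flows through STT, LLM, TTS, and avatar rendering, here is a minimal sketch. Every stage function is a hypothetical stand-in (no real speech, language, or rendering service is called), and the names `run_turn` and `STAGES` are assumptions made for illustration. The point is only the shape: each stage consumes the previous stage's output, and timing each stage shows where latency accumulates.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical stand-ins for real services; each one just transforms its input.
def speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio is already a transcript

def generate_reply(text: str) -> str:
    return f"You said: {text}"  # placeholder for an LLM call

def text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")  # placeholder for synthesized audio

def render_avatar(speech: bytes) -> str:
    return f"<frames for {len(speech)} bytes of speech>"  # placeholder for video frames

STAGES: List[Tuple[str, Callable]] = [
    ("stt", speech_to_text),
    ("llm", generate_reply),
    ("tts", text_to_speech),
    ("avatar", render_avatar),
]

def run_turn(audio: bytes):
    """Run one turn through every stage, recording per-stage time in milliseconds."""
    data = audio
    timings = []
    for name, stage in STAGES:
        start = time.perf_counter()
        data = stage(data)
        timings.append((name, (time.perf_counter() - start) * 1000.0))
    return data, timings

result, timings = run_turn(b"hello")
```

In a real system each stage would typically stream partial results to the next one rather than wait for completion, which is where most latency optimization happens.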

Deep Dive: Choose Your Track

Intermediate · 45 min

Gaussian Splatting

3D Gaussian primitives for photorealistic rendering

Learn how millions of fuzzy 3D blobs can represent photorealistic scenes and avatars, rendering at 60+ FPS on consumer hardware.

3D Gaussians · Covariance Matrix · Spherical Harmonics · Differentiable Rendering
Start learning →
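
To make the "fuzzy 3D blob" idea concrete, here is a minimal sketch of how strongly one Gaussian influences a point in space. It assumes an axis-aligned Gaussian (a diagonal covariance matrix given by three per-axis scales), so the rotation part of a full covariance matrix is left out. The function name `gaussian_weight` is hypothetical.

```python
import math

def gaussian_weight(point, mean, scales):
    """Unnormalized falloff of an axis-aligned 3D Gaussian.

    Returns 1.0 at the center and decays toward 0 with distance.
    `scales` holds the standard deviation along x, y, and z.
    """
    squared_distance = sum(
        ((p - m) / s) ** 2 for p, m, s in zip(point, mean, scales)
    )
    return math.exp(-0.5 * squared_distance)

center = (0.0, 0.0, 0.0)
at_center = gaussian_weight(center, center, (1.0, 1.0, 1.0))
one_unit_away = gaussian_weight((1.0, 0.0, 0.0), center, (1.0, 1.0, 1.0))
# Stretching the blob along x makes the same point fall deeper inside it.
stretched = gaussian_weight((1.0, 0.0, 0.0), center, (2.0, 1.0, 1.0))
```

A renderer would multiply this falloff by each Gaussian's opacity and color before blending many of them together.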
Beginner · 30 min

MetaHuman Pipeline

Game-engine rigged avatars with precise control

Understand how game engines use skeletal rigs, blendshapes, and real-time face tracking to animate detailed 3D characters.

Blendshapes · Skeletal Rigs · Live Link · Audio2Face
Start learning →
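
Blendshapes are commonly modeled as weighted offsets added to a neutral mesh. The sketch below shows that linear blend on a toy two-vertex "mesh". The shape names, vertex data, and the function `apply_blendshapes` are all made up for illustration and are not taken from any particular rig.

```python
def apply_blendshapes(neutral, deltas, weights):
    """Return neutral vertices plus the weighted sum of per-shape offsets.

    neutral: list of (x, y, z) vertices
    deltas:  shape name -> list of (dx, dy, dz), one per vertex
    weights: shape name -> weight, usually in the range 0..1
    """
    result = [list(vertex) for vertex in neutral]
    for name, weight in weights.items():
        for index, delta in enumerate(deltas[name]):
            for axis in range(3):
                result[index][axis] += weight * delta[axis]
    return result

# Toy data: two vertices and two hypothetical shapes.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deltas = {
    "jawOpen": [(0.0, -1.0, 0.0), (0.0, -0.5, 0.0)],
    "smile": [(0.2, 0.0, 0.0), (0.0, 0.0, 0.0)],
}

half_open = apply_blendshapes(neutral, deltas, {"jawOpen": 0.5})
```

A face tracker would supply a fresh set of weights every frame, and the same blend would run again to produce the new pose.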
Advanced · 60 min

Video Generation

Diffusion synthesis and streaming infrastructure

Explore diffusion-based talking head synthesis and the WebRTC streaming infrastructure that delivers avatars to any device in real time.

Diffusion Process · Latent Space · WebRTC · Provider Integration
Start learning →

How These Guides Work

Level 1: Practical Start

Each track starts with "what does this do?" — practical explanations you can understand in 60 seconds. No prerequisites beyond basic programming.

Level 2: Deep Dive

Curious about the math? Click any concept to drill deeper. Each explanation links to its prerequisites, so you can go as deep as needed.

Level 3: Interactive Demos

Every concept has tweakable parameters. Manipulate a Gaussian, reorder alpha layers, watch diffusion denoise — learning by doing.
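
For a taste of what reordering alpha layers does, here is a minimal sketch of front-to-back alpha compositing on single grayscale values. It assumes each layer is a `(color, alpha)` pair listed nearest-first, and the function name `composite` is hypothetical. Swapping two layers changes the result because each layer only receives the light that the layers in front of it let through.

```python
def composite(layers):
    """Blend (color, alpha) layers ordered front to back.

    Each layer contributes its color scaled by its own alpha and by the
    transmittance left over after all layers in front of it.
    """
    color = 0.0
    transmittance = 1.0  # fraction of light not yet blocked
    for layer_color, alpha in layers:
        color += transmittance * alpha * layer_color
        transmittance *= 1.0 - alpha
    return color

white_in_front = composite([(1.0, 0.5), (0.0, 0.5)])
black_in_front = composite([(0.0, 0.5), (1.0, 0.5)])
```

With the white layer in front the pixel comes out at 0.5, and with the black layer in front it drops to 0.25.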

Quick Comparison

| Approach | Latency | Quality | Setup Cost | Best For |
|---|---|---|---|---|
| Gaussian Splatting | ~16 ms | Photorealistic | High (capture) | Static scenes, known identities |
| MetaHuman | ~16 ms | High-quality 3D | Medium (setup) | Games, precise animation control |
| Video Generation | 100-800 ms | Photorealistic | Low (1 photo / API) | Any face, production apps |
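
To put the latency figures in perspective, here is a small sketch of the arithmetic. It assumes the ~16 ms figure is read as a per-frame render time and that the display runs at 60 FPS. The helper names are hypothetical.

```python
def frame_budget_ms(fps):
    """Time available to produce one frame at the given frame rate."""
    return 1000.0 / fps

def frames_of_delay(latency_ms, fps):
    """How many display frames pass while waiting out the given latency."""
    return latency_ms / frame_budget_ms(fps)

budget_60 = frame_budget_ms(60)  # about 16.7 ms, in line with the ~16 ms rows
low_delay = frames_of_delay(100, 60)
high_delay = frames_of_delay(800, 60)
```

Under those assumptions, the 100-800 ms range for video generation corresponds to roughly 6 to 48 frames of delay at 60 FPS, while the two rendering approaches fit within a single frame.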

See It In Action

Try real-time avatar demos spanning different approaches.

LiveKit + Hedra

Diffusion-based streaming avatar with voice

Launch Demo →

Rapport MetaHuman

Unreal Engine pixel-streamed avatar

Launch Demo →

These approaches are converging

The future isn't picking one approach -- it's combining them. MetaHumans enhanced by generative models. Gaussian avatars driven by parametric rigs. Understanding all three paths helps you build hybrid systems that take the best of each.
