MetaHuman Pipeline
UE 5.7 rigged avatars with source-backed control paths
What is MetaHuman?
MetaHuman is Epic Games' framework for creating photorealistic digital humans in Unreal Engine. Think of it as a sophisticated puppet system: a skeleton of virtual bones controls a high-detail 3D mesh, while blendshapes handle subtle facial deformations. Live face tracking drives the expressions: ARKit uses your iPhone's TrueDepth camera to estimate 52 blendshape coefficients at 60 FPS and streams them to the MetaHuman. Combined with Audio2Face for lip sync, you get real-time digital humans with precise, frame-by-frame control.
The Core Idea in 30 Seconds
Skeletal Rig
700+ bones control mesh deformation hierarchically
52 Blendshapes
ARKit standard for facial expressions at 60 FPS
Live Link
iPhone face tracking streams directly to UE5
The Marionette Metaphor
Bones
Wooden crossbars
Joints
Strings connecting bars
Mesh
Puppet's cloth/skin
Blendshapes
Facial expressions overlay
Skeletal Animation
Child bones inherit parent transforms
Vertices follow bones based on weight (purple = high)
Forward vs Inverse Kinematics
Rotate joints → end position
Rotation propagation down chain
Parent rotation: 0°
Angle constraints prevent bad poses
45° (range: -30° to 120°)
Smooth interpolation without gimbal lock
Facial Animation
Linear blend between stored poses
Anatomical facial muscle controls
Combine multiple deformations
Dynamic skin detail based on expression
Wrinkle intensity: 0%
Audio phonemes map to mouth shapes
/a/ → jaw: 80%, lips: 30%
Blendshape weights per viseme
Audio to Animation
Frequency-time representation
Amplitude over time for lip sync
Realistic Materials
Physically-based roughness/metallic
Surface detail without geometry
Soft contact shadows
Performance Optimization
Detail level based on camera distance
Distance: 50m → LOD1
Linear, ease, step transitions
ARKit 52 Blendshape Playground
Apple ARFaceAnchor.BlendShapeLocation — 52 coefficients mapped to FACS Action Units.
Each wᵢ ∈ [0,1] linearly blends a basis deformation Bᵢ onto the neutral mesh x₀.
f(x) = x₀ + Σ wᵢ · Bᵢ
Linear blend: neutral mesh x₀ + weighted sum of 52 basis deformations. Mapped to FACS Action Units (Ekman & Friesen, 1978). 3D model: Three.js facecap.glb with ARKit morph targets.
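The linear blend can be sketched in a few lines of Python. This is a minimal illustration, not the Three.js or Unreal implementation: the function name blend_mesh, the two-vertex mesh, and the delta values are all hypothetical, and each basis deformation is assumed to be stored as a per-vertex offset from the neutral mesh.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def blend_mesh(
    neutral: Sequence[Vec3],
    deltas: Sequence[Sequence[Vec3]],
    weights: Sequence[float],
) -> List[Vec3]:
    """Return x0 + sum_i(w_i * B_i) for every vertex of the mesh."""
    if len(deltas) != len(weights):
        raise ValueError("need exactly one weight per basis deformation")
    blended = []
    for v, (x, y, z) in enumerate(neutral):
        for w, basis in zip(weights, deltas):
            w = min(1.0, max(0.0, w))  # keep w_i inside [0, 1]
            dx, dy, dz = basis[v]
            x, y, z = x + w * dx, y + w * dy, z + w * dz
        blended.append((x, y, z))
    return blended


# Toy data (hypothetical): two vertices, two basis deformations.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
jaw_open = [(0.0, -1.0, 0.0), (0.0, -0.5, 0.0)]
smile = [(0.2, 0.1, 0.0), (0.0, 0.0, 0.0)]

blended = blend_mesh(neutral, [jaw_open, smile], [0.5, 1.0])
```

With a real face, the same loop would run over 52 basis deformations and thousands of vertices, which is why engines evaluate it on the GPU.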
Skinning Weight Visualizer
See how skinning weights determine mesh deformation. Click a bone to select it, then rotate to see how weights blend the deformation.
Pose Presets
Bones
Weight Color Scale
Vertices with higher weight for the selected bone move more when that bone rotates. Weights are normalized so they sum to 1 per vertex.
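A minimal 2D sketch of weighted skinning, assuming the common linear blend skinning approach: each bone transforms the vertex on its own, and the results are averaged by the normalized weights. The helper names and the two-bone arm are hypothetical, not taken from the visualizer.

```python
import math
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Bone = Tuple[Vec2, float]  # (pivot position, rotation in degrees)


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale the weights so they sum to 1 for this vertex."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("a vertex needs at least one positive weight")
    return [w / total for w in weights]


def rotate_about(point: Vec2, pivot: Vec2, angle_deg: float) -> Vec2:
    """Rotate a point around a bone pivot."""
    a = math.radians(angle_deg)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    return (
        pivot[0] + dx * math.cos(a) - dy * math.sin(a),
        pivot[1] + dx * math.sin(a) + dy * math.cos(a),
    )


def skin_vertex(vertex: Vec2, bones: Sequence[Bone], weights: Sequence[float]) -> Vec2:
    """Blend the per-bone transformed positions by normalized weight."""
    x = y = 0.0
    for (pivot, angle), w in zip(bones, normalize(weights)):
        px, py = rotate_about(vertex, pivot, angle)
        x, y = x + w * px, y + w * py
    return (x, y)


# Upper arm stays still; forearm rotates 90 degrees about the elbow at (1, 0).
bones = [((0.0, 0.0), 0.0), ((1.0, 0.0), 90.0)]
vertex = (2.0, 0.0)

full = skin_vertex(vertex, bones, [0.0, 1.0])  # follows the forearm only
half = skin_vertex(vertex, bones, [1.0, 1.0])  # split evenly between bones
```

The vertex with full forearm weight swings all the way with the bone, while the evenly weighted vertex lands halfway between the two candidate positions.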
Face Tracking Simulator
Simulate how ARKit extracts 52 blendshapes from face tracking. Move your mouse over the canvas to control head pose, and use the sliders for expressions.
Expression
Blendshapes (0 active)
// ARKit blendshape output (simplified)
{}
How Real Face Tracking Works
- iPhone TrueDepth projects 30,000 infrared dots onto your face
- ARKit processes the depth map to extract 52 blendshape coefficients
- Each coefficient ranges from 0.0 to 1.0, representing muscle activation
- Data streams at 60 FPS via Live Link to Unreal Engine
Audio-to-Expression Demo
Hello
Higher = smoother transitions
Current Blendshapes:
This demonstrates how phonemes (speech sounds) map to visemes (mouth shapes). Neural audio-to-expression models learn these mappings from video data automatically.
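A hand-written lookup table is the simplest stand-in for what a neural model learns. The sketch below is an assumption-heavy illustration: only the /a/ row reuses the values shown on this page (jaw 80%, lips 30%); the other rows, the two-channel pose, and the exponential smoothing are hypothetical choices made for the example.

```python
from typing import Dict, List

Pose = Dict[str, float]

# Hypothetical viseme table. Only the "a" row comes from this page;
# the "o" and "m" rows are made-up placeholders.
VISEMES: Dict[str, Pose] = {
    "a": {"jaw": 0.8, "lips": 0.3},
    "o": {"jaw": 0.5, "lips": 0.7},
    "m": {"jaw": 0.0, "lips": 1.0},
}
REST: Pose = {"jaw": 0.0, "lips": 0.0}


def smooth_toward(current: Pose, target: Pose, smoothing: float) -> Pose:
    """Exponential smoothing: higher values give slower, smoother transitions."""
    if not 0.0 <= smoothing < 1.0:
        raise ValueError("smoothing must be in [0, 1)")
    return {
        key: smoothing * current[key] + (1.0 - smoothing) * target[key]
        for key in current
    }


def animate(phonemes: List[str], smoothing: float = 0.5) -> List[Pose]:
    """Produce one smoothed mouth pose per input phoneme."""
    pose = dict(REST)
    frames = []
    for phoneme in phonemes:
        target = VISEMES.get(phoneme, REST)  # unknown sounds relax to rest
        pose = smooth_toward(pose, target, smoothing)
        frames.append(pose)
    return frames


frames = animate(["a", "a", "m"], smoothing=0.5)
```

Holding /a/ for two frames moves the jaw halfway to its target each frame, so the mouth eases open instead of snapping.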
Bone Hierarchy Explorer
Explore how skeletal animation works. Click bones to select them, then adjust rotation. Notice how child bones inherit parent transformations.
Key Insight
When you rotate a parent bone, all children follow. This is forward kinematics. MetaHuman uses 700+ bones with this hierarchy to create lifelike movement.
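Forward kinematics over a bone hierarchy can be sketched as follows. The Bone class, the three-bone arm, and the 2D simplification are hypothetical; a real rig stores full 3D transforms per bone.

```python
import math
from typing import Optional, Tuple


class Bone:
    """A 2D bone whose rotation is stored relative to its parent."""

    def __init__(self, name: str, length: float, parent: Optional["Bone"] = None):
        self.name = name
        self.length = length
        self.parent = parent
        self.local_angle = 0.0  # degrees, relative to the parent bone

    def world_angle(self) -> float:
        """Accumulate rotations from the root down to this bone."""
        parent_angle = self.parent.world_angle() if self.parent else 0.0
        return parent_angle + self.local_angle

    def start(self) -> Tuple[float, float]:
        """A bone starts where its parent ends (the root starts at the origin)."""
        return self.parent.end() if self.parent else (0.0, 0.0)

    def end(self) -> Tuple[float, float]:
        x, y = self.start()
        a = math.radians(self.world_angle())
        return (x + self.length * math.cos(a), y + self.length * math.sin(a))


shoulder = Bone("shoulder", 1.0)
elbow = Bone("elbow", 1.0, parent=shoulder)
wrist = Bone("wrist", 0.5, parent=elbow)

# Rotating only the root swings the whole chain.
shoulder.local_angle = 90.0
tip = wrist.end()
```

Only the shoulder's local angle changed, yet the elbow and wrist both point straight up because their world rotation is the sum of every ancestor's rotation.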
Expression Blending Mixer
Mix multiple expressions together. Real faces blend emotions - you can be happy-surprised or sad-angry. Blendshapes add linearly then clamp to [0,1].
Presets
Expression Mix
No active blendshapes
How Expression Blending Works
final_blendshape[i] = clamp(Σ(expression_weight × blendshape_value), 0, 1)
Each expression defines target values for relevant blendshapes. When you mix expressions, the blendshape values add together (then clamp). This is how MetaHuman and ARKit create nuanced expressions from simple building blocks.
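The add-then-clamp rule can be sketched directly. The blendshape names below are real ARKit coefficient names, but the expression presets and their values are hypothetical, chosen so that one channel overflows and gets clamped.

```python
from typing import Dict

# Hypothetical presets: each expression lists target values per blendshape.
EXPRESSIONS: Dict[str, Dict[str, float]] = {
    "happy": {"mouthSmileLeft": 0.8, "mouthSmileRight": 0.8, "cheekSquintLeft": 0.4},
    "surprised": {"jawOpen": 0.6, "browInnerUp": 0.9, "eyeWideLeft": 0.7},
    "excited": {"mouthSmileLeft": 0.5, "jawOpen": 0.3},
}


def mix_expressions(
    expressions: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """final[i] = clamp(sum(expression_weight * blendshape_value), 0, 1)."""
    totals: Dict[str, float] = {}
    for name, weight in weights.items():
        for shape, value in expressions[name].items():
            totals[shape] = totals.get(shape, 0.0) + weight * value
    return {shape: min(1.0, max(0.0, total)) for shape, total in totals.items()}


mixed = mix_expressions(
    EXPRESSIONS, {"happy": 1.0, "excited": 1.0, "surprised": 0.5}
)
```

Here mouthSmileLeft sums to 1.3 and is clamped to 1.0, while jawOpen receives contributions from two expressions and stays within range.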
Inverse vs Forward Kinematics
FK: Set joint angles → calculate end position. IK: Set target position → solve for angles. MetaHuman uses IK for realistic hand/foot placement.
IK Mode
Drag anywhere to move the target. The arm automatically solves for joint angles using the FABRIK algorithm.
More iterations = more accurate but slower
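A minimal 2D sketch of FABRIK (Forward And Backward Reaching Inverse Kinematics), assuming an unconstrained chain with a fixed root. The function name, default tolerance, and the two-segment arm are hypothetical and are not taken from the demo's source.

```python
import math
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]


def _place(origin: Vec2, toward: Vec2, length: float) -> Vec2:
    """Return the point `length` away from origin in the direction of `toward`."""
    d = math.dist(origin, toward)
    if d == 0.0:
        return (origin[0] + length, origin[1])  # degenerate case: pick +x
    t = length / d
    return (origin[0] + (toward[0] - origin[0]) * t,
            origin[1] + (toward[1] - origin[1]) * t)


def fabrik(joints: Sequence[Vec2], target: Vec2,
           iterations: int = 10, tolerance: float = 1e-3) -> List[Vec2]:
    """Solve joint positions so the last joint reaches the target if possible."""
    pts = [tuple(p) for p in joints]
    lengths = [math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    root = pts[0]

    # Target out of reach: stretch the chain straight toward it.
    if math.dist(root, target) >= sum(lengths):
        for i, length in enumerate(lengths):
            pts[i + 1] = _place(pts[i], target, length)
        return pts

    for _ in range(iterations):
        if math.dist(pts[-1], target) < tolerance:
            break
        # Backward pass: pin the end effector to the target, work toward the root.
        pts[-1] = target
        for i in range(len(pts) - 2, -1, -1):
            pts[i] = _place(pts[i + 1], pts[i], lengths[i])
        # Forward pass: pin the root back in place, work toward the end effector.
        pts[0] = root
        for i in range(len(pts) - 1):
            pts[i + 1] = _place(pts[i], pts[i + 1], lengths[i])
    return pts


arm = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
solved = fabrik(arm, target=(0.5, 1.2), iterations=20)
stretched = fabrik(arm, target=(0.0, 3.0))
```

Each iteration preserves segment lengths exactly and only moves joint positions, which is why raising the iteration count trades time for accuracy. Joint angles can be recovered afterward from consecutive positions.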
Comparison
FK
- Simple to compute
- Direct control
- Animation curves
IK
- Goal-oriented
- Foot placement
- Hand targets
Level of Detail (LOD) System
MetaHuman uses LOD to maintain performance. Close to the camera, a character renders with more triangles and higher-resolution textures; far away, it switches to a simplified mesh. The transition is seamless.
Current LOD Stats
All LOD Levels
Performance Impact
LOD 0 to LOD 4 is a 100x reduction in triangles. In a scene with multiple MetaHumans, this is essential for maintaining 60 FPS.
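Distance-based LOD selection can be sketched with a sorted list of cutoffs. Every number below is a hypothetical placeholder: the distance thresholds and triangle counts are not MetaHuman's real values, and were picked only so that 50 m maps to LOD 1 and LOD 4 has 100x fewer triangles than LOD 0, matching the figures quoted on this page.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical cutoffs in meters for LOD 0..3; anything farther uses LOD 4.
LOD_MAX_DISTANCE_M = [25.0, 75.0, 150.0, 300.0]

# Hypothetical triangle counts per LOD (LOD 4 is 100x smaller than LOD 0).
LOD_TRIANGLES = [100_000, 50_000, 20_000, 5_000, 1_000]


def select_lod(distance_m: float) -> int:
    """Pick the LOD index for a character at the given camera distance."""
    if distance_m < 0.0:
        raise ValueError("distance cannot be negative")
    return bisect_right(LOD_MAX_DISTANCE_M, distance_m)


def scene_triangles(distances_m: Sequence[float]) -> int:
    """Total triangles rendered for characters at these distances."""
    return sum(LOD_TRIANGLES[select_lod(d)] for d in distances_m)


near_mid_far = scene_triangles([10.0, 50.0, 1000.0])
all_at_lod0 = 3 * LOD_TRIANGLES[0]
```

Comparing near_mid_far with all_at_lod0 shows the saving for a three-character scene. Engines typically select LOD by projected screen size rather than raw distance, so treat the distance rule as a simplification.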
Wrinkle Map System
MetaHuman uses wrinkle maps for facial detail. Each expression drives specific wrinkle regions that blend on top of the base skin texture.
Expression Drivers
Texture Layers
MetaHuman blends multiple texture layers: base albedo, normal map, roughness, and wrinkle maps. Wrinkle maps are driven by blendshape values, creating realistic skin deformation during expressions.
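One way to picture the blendshape-driven layer is a per-region mask that fades a wrinkle normal over the base normal. The sketch below is an assumption, not MetaHuman's material graph: the region names and the region-to-driver table are hypothetical (the driver names are real ARKit coefficients), and taking the strongest driver as the mask is a simplification.

```python
import math
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical mapping from wrinkle regions to the blendshapes that drive them.
WRINKLE_DRIVERS: Dict[str, List[str]] = {
    "forehead": ["browInnerUp", "browOuterUpLeft", "browOuterUpRight"],
    "brow_furrow": ["browDownLeft", "browDownRight"],
    "crows_feet": ["eyeSquintLeft", "eyeSquintRight",
                   "cheekSquintLeft", "cheekSquintRight"],
}


def wrinkle_masks(blendshapes: Dict[str, float]) -> Dict[str, float]:
    """Each region's intensity is its strongest driver, clamped to [0, 1]."""
    return {
        region: max(min(1.0, max(0.0, blendshapes.get(name, 0.0)))
                    for name in drivers)
        for region, drivers in WRINKLE_DRIVERS.items()
    }


def blend_normal(base: Vec3, wrinkle: Vec3, mask: float) -> Vec3:
    """Lerp from the base normal to the wrinkle normal, then renormalize."""
    mixed = [b + (w - b) * mask for b, w in zip(base, wrinkle)]
    length = math.sqrt(sum(c * c for c in mixed))
    return (mixed[0] / length, mixed[1] / length, mixed[2] / length)


masks = wrinkle_masks({"browInnerUp": 0.6, "eyeSquintLeft": 0.2})
normal = blend_normal((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), 0.5)
```

With a neutral face every mask is 0, so the base normal map is used unchanged; raising the brows fades in the forehead wrinkles only.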
Corrective Blendshapes
When multiple blendshapes activate together, their deformations can combine incorrectly. Corrective blendshapes fix these artifacts automatically.
Toggle correctives and combine both sliders to see the artifact
How It Works
Problem: Blink moves eyelids down. Look down also moves eye region down. Combined = double deformation artifact.
Solution: Corrective shape (blink_lookDown) activates when both are > 0, counteracting the excess deformation.
Corrective Formula
corrective_weight = blendA × blendB
The corrective is sculpted to exactly cancel the artifact when both blendshapes are at 100%, and scales proportionally for partial activations.
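The product rule can be checked on a single vertex. This is a one-dimensional toy model with hypothetical numbers: blink and look-down each push the eyelid down, and the corrective delta is chosen to cancel the excess when both are at 100%.

```python
def clamp01(value: float) -> float:
    return min(1.0, max(0.0, value))


def corrective_weight(blend_a: float, blend_b: float) -> float:
    """corrective_weight = blendA x blendB, as in the formula above."""
    return clamp01(blend_a) * clamp01(blend_b)


# Hypothetical vertical offsets for one eyelid vertex at full weight.
BLINK_DELTA = -1.0       # fully closes the lid
LOOK_DOWN_DELTA = -0.4   # also drags the lid down
CORRECTIVE_DELTA = 0.4   # sculpted to cancel the excess when both are at 100%


def eyelid_offset(blink: float, look_down: float, use_corrective: bool = True) -> float:
    """Sum the two base shapes, optionally adding the corrective shape."""
    offset = clamp01(blink) * BLINK_DELTA + clamp01(look_down) * LOOK_DOWN_DELTA
    if use_corrective:
        offset += corrective_weight(blink, look_down) * CORRECTIVE_DELTA
    return offset


artifact = eyelid_offset(1.0, 1.0, use_corrective=False)  # lid overshoots
fixed = eyelid_offset(1.0, 1.0)                           # back to a clean close
partial = eyelid_offset(0.5, 0.5)                         # scaled correction
```

Because the weight is a product, the corrective contributes nothing when either input is 0 and reaches full strength only when both are fully active.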
MetaHuman Usage
MetaHuman has hundreds of corrective shapes for common expression combinations: smile+blink, frown+jawOpen, etc. These are pre-computed and activate automatically.
Joint Constraints
Real joints have physical limits. Elbows can't bend backward. MetaHuman enforces these constraints to prevent unnatural poses.
Joint Limits
Constraint Types
Hinge: Single axis rotation (elbow, knee)
Ball: Multi-axis with cone limits (shoulder, hip)
Saddle: Two-axis with asymmetric limits (thumb)
In Animation
Constraints prevent impossible poses during IK solving and motion retargeting. They also help with collision avoidance (elbow not going through torso).
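Two of the constraint types above can be sketched as clamps applied after a pose is proposed. The function names are hypothetical; the default hinge range of -30° to 120° reuses the example range shown earlier on this page, and the cone limit is one common way to model a ball joint, not necessarily what MetaHuman does.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def clamp_hinge(angle_deg: float, min_deg: float = -30.0, max_deg: float = 120.0) -> float:
    """Hinge joint: a single rotation angle kept inside its allowed range."""
    return max(min_deg, min(max_deg, angle_deg))


def _normalize(v: Vec3) -> Vec3:
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)


def clamp_cone(direction: Vec3, axis: Vec3, max_angle_deg: float) -> Vec3:
    """Ball joint: keep the bone direction within a cone around the rest axis."""
    d, a = _normalize(direction), _normalize(axis)
    cos_angle = max(-1.0, min(1.0, sum(dc * ac for dc, ac in zip(d, a))))
    limit = math.radians(max_angle_deg)
    if math.acos(cos_angle) <= limit:
        return d  # already inside the cone
    # Part of the direction that is perpendicular to the axis.
    perp = tuple(dc - cos_angle * ac for dc, ac in zip(d, a))
    perp_len = math.sqrt(sum(c * c for c in perp))
    if perp_len < 1e-12:
        return a  # pointing exactly backward: no unique answer, fall back to axis
    # Rotate back onto the cone surface, staying in the same plane.
    return tuple(
        math.cos(limit) * ac + math.sin(limit) * pc / perp_len
        for ac, pc in zip(a, perp)
    )


elbow = clamp_hinge(150.0)
shoulder = clamp_cone((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 45.0)
```

An IK solver would call these after each iteration so that the solution never passes through an impossible pose.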
Eye Gaze & Tracking
Eyes are crucial for believable avatars. Control gaze direction, blink, and pupil dilation.
Gaze Direction
Controlled by eye bone rotation. ARKit provides eyeLookIn/Out/Up/Down blendshapes for detailed control.
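A rough sketch of turning the four left-eye coefficients into a bone rotation. The coefficient names are real ARKit names, but the maximum angles and the sign convention (positive yaw is away from the nose, positive pitch is up) are assumptions made for this example.

```python
from typing import Dict, Tuple


def left_eye_rotation(
    coeffs: Dict[str, float],
    max_yaw_deg: float = 30.0,    # hypothetical range of horizontal motion
    max_pitch_deg: float = 25.0,  # hypothetical range of vertical motion
) -> Tuple[float, float]:
    """Convert opposing eyeLook coefficients into (yaw, pitch) in degrees."""

    def get(name: str) -> float:
        return min(1.0, max(0.0, coeffs.get(name, 0.0)))

    yaw = (get("eyeLookOutLeft") - get("eyeLookInLeft")) * max_yaw_deg
    pitch = (get("eyeLookUpLeft") - get("eyeLookDownLeft")) * max_pitch_deg
    return (yaw, pitch)


yaw, pitch = left_eye_rotation({"eyeLookOutLeft": 0.5, "eyeLookDownLeft": 1.0})
```

Each pair of opposing coefficients collapses into one signed angle, which is what an eye bone needs.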
Pupil Response
Dilates with emotion and lighting. Small detail that adds significant realism to digital humans.
Hair Simulation
Real-time hair dynamics using simplified physics. Each strand responds to wind, gravity, and stiffness.
In MetaHuman
Real hair simulation uses thousands of guide strands with interpolation, collision detection, and GPU-accelerated physics. Groom assets define the hair's look and behavior.
Secondary Motion
Secondary motion adds physics-based follow-through to primary animation. Watch earrings and hair react to head movement.
Animation Principles
Secondary motion follows Disney's principles: drag, overlap, and follow-through. Spring physics creates natural-looking motion that reacts to the primary animation.
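The spring behavior can be sketched as a damped spring that chases the primary animation, integrated once per frame. The stiffness, damping, and 60 FPS timestep are hypothetical tuning values, and the one-dimensional earring is a stand-in for a full 3D bone chain.

```python
from typing import List, Sequence


def simulate_follow(
    targets: Sequence[float],
    stiffness: float = 120.0,  # hypothetical spring constant
    damping: float = 12.0,     # hypothetical damping coefficient
    dt: float = 1.0 / 60.0,    # one frame at 60 FPS
) -> List[float]:
    """Return the follower position per frame as it chases the target."""
    position = targets[0]
    velocity = 0.0
    positions = []
    for target in targets:
        # Spring pulls toward the target; damping resists motion.
        acceleration = stiffness * (target - position) - damping * velocity
        velocity += acceleration * dt  # semi-implicit Euler: velocity first
        position += velocity * dt
        positions.append(position)
    return positions


# The head snaps from 0 to 1 at frame 5 and then holds still for two seconds.
head = [0.0] * 5 + [1.0] * 120
earring = simulate_follow(head)
```

The earring lags at first (drag), swings past the head's final position (overlap), and then settles (follow-through). Lower damping makes the swing last longer.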
Facial Muscle System
Anatomically-based facial animation. Click muscles to activate them and create expressions.
Muscle-Based Animation
FACS (Facial Action Coding System) maps muscle activations to Action Units. MetaHuman uses this for physically plausible facial animation driven by blendshapes.
The Animation Pipeline
Refined with metahuman-evolver output from Unreal Engine 5.7 source scans.
Identity Authoring
MetaHumanCharacter + MetaHumanIdentity modules assemble DNA-backed character assets
Dependency Hot Paths
- MetaHumanAnimator -> MetaHumanCoreTechLib (35)
- MetaHumanAnimator -> RigLogic (8)
- MetaHumanCharacter -> MetaHumanSDK (10)
- MetaHumanLiveLink -> MetaHumanCoreTechLib (9)
- Top module hubs: MetaHumanCoreTech (20), MetaHumanCore (19), RigLogicModule (15)
Evolver Signals
- Cycle 11 scan: 12 plugins, 70 modules, 2898 source files, 248 internal module edges.
- Hub plugins: MetaHumanAnimator (28 modules), MetaHumanCoreTechLib (5), MetaHumanLiveLink (7).
- Official watch: 5/5 Epic MetaHuman docs endpoints reachable in latest cycle.
- One-line docs include tracker model tags: hyprface-0.1.4 and wav2face-0.0.10.
Key Concepts
Master these five building blocks of the MetaHuman system.
Blendshapes (Morph Targets)
Deformed versions of a mesh representing different expressions. Blend between neutral and target shapes using 0-1 weights to create smooth animations.
Skeletal Rig
A hierarchy of virtual bones that deform the mesh. Moving the shoulder bone cascades to the arm, hand, and fingers. Weight painting determines how strongly each bone affects nearby vertices.
Live Link / ARKit
Real-time streaming from iPhone TrueDepth camera to Unreal Engine. 30,000 infrared dots map to 52 blendshape weights at 60 FPS, giving frame-accurate expression control.
UE5 Rendering
Lumen for global illumination, ray-traced hair strands, and subsurface scattering for realistic skin. 8 LOD levels balance quality and performance.
Audio2Face
NVIDIA's AI that generates facial animation from audio. A neural network maps speech to 72 blendshapes in real-time, enabling automatic lip sync without motion capture.
Build It Yourself
Set up MetaHuman with Live Link face tracking in Unreal Engine 5.
Download from Epic Games Launcher and create a new project
# Download from launcher.unrealengine.com
# Create new Third Person or Blank project
# Enable MetaHuman plugin in Edit > Plugins
Step 1 of 5
Resources
Complete UE 5.7 Architecture Dossier
Full module-by-module implementation map spanning this repo, the UE source tree, and official Epic docs.
Open architecture documentation
When to Use MetaHuman
Use When
- You need frame-accurate animation control
- You're building for desktop/console with good GPUs
- You want to use existing Unreal Engine workflows
- You need deterministic, repeatable output
- You're integrating with game or simulation systems
Avoid When
- You need photorealism of a real person (use Gaussian Splatting)
- You want web/mobile deployment without powerful hardware
- You need one-shot avatar from any photo (use Generative Video)
- You're building a lightweight voice AI app (use Streaming)
- You have limited GPU resources
Best Use Case
Production environments requiring precise control, deterministic animation, and integration with game engine workflows
Common Misconceptions
MetaHuman has completely crossed the uncanny valley
Actually: While impressive in stills, animation often reveals the illusion. Micro-expressions, subtle skin deformations, and natural asymmetry are still challenging.
MetaHumans are easy to run on any hardware
Actually: They're computationally expensive: RTX 3070 drops to 20 FPS with 10 MetaHumans. 8K textures, ray-traced hair, and 700-bone rigs require significant GPU power.
Blendshapes and bones are interchangeable
Actually: They serve different purposes: blendshapes for soft tissue (face muscles), bones for rigid structures (limbs). MetaHuman uses both together.
ARKit captures all facial expressions accurately
Actually: 52 blendshapes are a simplification. They miss micro-expressions, tongue tracking is binary, and the 60 FPS cap misses fast movements.
Audio2Face replaces traditional animation
Actually: It excels at lip sync but can't generate head movement or body gestures. Best for NPCs and first drafts, not final film quality.
See MetaHuman in Action
Try a real-time MetaHuman avatar powered by Unreal Engine pixel streaming. Cloud-rendered on GPU and delivered to your browser via WebRTC.
Launch Rapport Demo →
Ready to Go Deeper?
Explore ARKit blendshapes or see how MetaHuman compares to other approaches.