LiveFace: Real-Time Photorealistic Facial Animation on Low-End Mobile Devices via Compact Per-Avatar Neural Decoders and Universal Compositor-Upscaler

preprint OA: closed
This is a preprint and has not been peer reviewed. Data may be preliminary.

Authorea · 10 April 2026 · V1 (latest version)

Authors: Dmitry Rodin (ORCID 0009-0002-4929-9085) and Nikita Rodin

https://doi.org/10.22541/au.177585093.33533340/v1

388 views · 153 downloads

Abstract

We present LiveFace, a modular neural rendering system that achieves photorealistic talking-head animation at 30 fps on low-end mobile devices with as little as ~10 GFLOPS of compute (e.g., Qualcomm Snapdragon 439).
Prior photorealistic facial animation systems either require cloud infrastructure with 100M+ parameter models (HeyGen, D-ID, Synthesia) or demand desktop-class GPUs (MetaHuman, Audio2Face), while on-device alternatives sacrifice realism for stylized cartoon aesthetics (Apple Memoji, Samsung AR Emoji). LiveFace bridges this gap through three key contributions: (1) a decomposed per-avatar decoder architecture that factorizes the face into four independently rendered regions (mouth, eyes, hair, and body), each handled by a compact neural decoder (1.3-5.7M parameters) augmented with a 128-dimensional learnable identity embedding; (2) a universal compositor-upscaler (~7M parameters) shared across all avatars that composites the decoded patches onto a 9:16 portrait canvas and upscales to 360×640 (or 384×672) in a single forward pass; and (3) a video-driven knowledge distillation pipeline that uses RAVDESS emotional speech videos as driving sources for LivePortrait (~300M parameters) to generate diverse, naturalistic training data for the student decoders. The MouthDecoder supports dual-input conditioning, with both viseme-based (audio-driven) and landmark-based (MediaPipe Face Mesh) modes, enabling flexible integration with different upstream pipelines. A per-frame quality filter employing Haar cascade face detection, Laplacian blur scoring, and SSIM comparison ensures training data integrity by rejecting approximately 0.6% of generated frames. A working V3 prototype has been trained and validated, demonstrating that the architecture successfully produces photorealistic output from compact per-avatar models. The full system comprises ~20M INT8 parameters with a total inference latency of ~19 ms per frame, enabling real-time, fully offline operation on commodity mobile hardware without any cloud dependency.
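The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: it uses the approximate numbers quoted above (30 fps, ~19 ms latency, ~10 GFLOPS, ~20M INT8 parameters), and every function and constant name is hypothetical rather than taken from the LiveFace code. It assumes one byte per INT8 weight and treats the quoted GFLOPS figure as a peak throughput, which a real device is unlikely to sustain.

```python
# Back-of-the-envelope budget check using only figures quoted in the abstract.
# All names are illustrative assumptions, not identifiers from LiveFace itself.

TARGET_FPS = 30             # stated frame rate
LATENCY_MS = 19.0           # stated total inference latency per frame (approximate)
DEVICE_GFLOPS = 10.0        # stated compute floor (approximate, treated as peak)
TOTAL_PARAMS = 20_000_000   # stated total parameter count (approximate)
BYTES_PER_INT8_PARAM = 1    # assumption: one byte per quantized weight


def frame_budget_ms(fps: int) -> float:
    """Wall-clock time available per frame at the given frame rate."""
    return 1000.0 / fps


def headroom_ms(fps: int, latency_ms: float) -> float:
    """Time left per frame after inference (for capture, audio, display, ...)."""
    return frame_budget_ms(fps) - latency_ms


def ops_per_frame(device_gflops: float, fps: int) -> float:
    """Upper bound on operations per frame if peak throughput were sustained."""
    return device_gflops * 1e9 / fps


def int8_weight_megabytes(params: int) -> float:
    """Weight storage only; activations and runtime buffers are not counted."""
    return params * BYTES_PER_INT8_PARAM / 1e6


if __name__ == "__main__":
    print(f"frame budget : {frame_budget_ms(TARGET_FPS):.1f} ms")
    print(f"headroom     : {headroom_ms(TARGET_FPS, LATENCY_MS):.1f} ms")
    print(f"ops per frame: {ops_per_frame(DEVICE_GFLOPS, TARGET_FPS) / 1e6:.0f} M")
    print(f"INT8 weights : {int8_weight_megabytes(TOTAL_PARAMS):.0f} MB")
```

Under these assumptions, a 30 fps target allows about 33.3 ms per frame, so a ~19 ms inference pass leaves roughly 14 ms for everything else, and ~20M INT8 parameters correspond to about 20 MB of weights.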
Supplementary Material
File: arxiv_paper_liveface_v2_en.pdf (429.92 KB)

Version history
V1: 10 April 2026

Copyright
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Keywords: avatar computer vision facial animation knowledge distillation mobile neural rendering on-device ai real-time

Authors and affiliations
Dmitry Rodin (ORCID 0009-0002-4929-9085)
Nikita Rodin, Texas Tech University

Citation
Dmitry Rodin, Nikita Rodin. LiveFace: Real-Time Photorealistic Facial Animation on Low-End Mobile Devices via Compact Per-Avatar Neural Decoders and Universal Compositor-Upscaler. Authorea. 10 April 2026. DOI: https://doi.org/10.22541/au.177585093.33533340/v1

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00