Unity3D LipSync Asset Reviews

Technical Tech
Oct 4, 2018

Getting 3D characters to speak dialogue is a massive pain point for developers. It’s 2018 and I’m taking a look at three assets that will hopefully get the lip syncing done painlessly.

My three current options are Salsa with Random Eyes (€31.27), LipSync Pro (€31.27) and Oculus SDK LipSync (€0.00). There was lots of talk on the Unity forums about Cheshire, with people saying it’s a great option, but the developer pulled it from the Asset Store a year or so ago.

Salsa with Random Eyes

This is a pretty simple solution, but it’s better than a random or looping jaw animation. It uses three blendshapes (small, medium and large), analyses the audio and triggers a blend between them. The Random Eyes part makes your character blink. There are some pre-built setup scripts for various characters, which take a lot of the pain out and make it pretty quick to set up.
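To give a feel for how little information this kind of approach works with, here is a minimal Python sketch of loudness-driven shape picking. It is not Salsa’s actual algorithm: the threshold values and the `pick_shape` name are my own assumptions, purely for illustration.

```python
import math

# Hypothetical loudness thresholds (RMS level, 0..1), checked loudest first.
# These are illustrative values, not Salsa's real settings.
THRESHOLDS = [(0.30, "large"), (0.12, "medium"), (0.03, "small")]


def rms(samples):
    """Root-mean-square level of one window of audio samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def pick_shape(samples):
    """Return the mouth shape for this audio window, or None for a closed mouth."""
    level = rms(samples)
    for threshold, shape in THRESHOLDS:
        if level >= threshold:
            return shape
    return None
```

Because only loudness is looked at, an “oo” and an “ee” at the same volume get the same mouth shape, which goes some way to explaining the simple look of the results.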

This is a very basic way of animating the face and the results look simple. It’s fine for background non-playable characters but looks very stilted and “computery”. It’s quick and easy to implement but far from believable.

Random Eyes is a nice touch and stops the characters from looking really fixed, but this is far too simple for most modern audiences.

LipSync Pro

This is phoneme based (9 different ones) and, annoyingly, it groups all the sibilants (CDGK.. ThYZ) together under the same blend shape, where it could really do with breaking them down a bit.

There are multiple components. The lip sync tool gives you an in-editor way to refine the automated lip syncing via triggers on the audio file, plus ways to trigger additional blendshapes (emotions) and animations (gestures). It can group various blend shape weights together for each phoneme. It seems very mature and well thought through. You can drive face bones rather than blend shapes, or do UV offsets or sprite swapping.

The emotions animation effectively moves this from being just a lipsync solution to a full facial animation studio. You could pose any number of face animations, store them as triggers and blend between them.

You can also specify bone transforms as well as blend shapes. This is a must-have for me and something neither of the other solutions could manage. I was using an iClone character to test, which has a great number of facial blend shapes but also has a jaw bone and a tongue bone, which need animating during lipsync.

With LipSync Pro you can define each phoneme with as many blendshapes and bone animations as you need.
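Conceptually, each phoneme ends up as a small bundle of blendshape weights plus bone values, and the system blends between bundles over time. The Python sketch below shows that idea only. It is not LipSync Pro’s API: `PhonemePose`, the phoneme keys, the blendshape names and the single `jaw_angle` value are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class PhonemePose:
    """One phoneme's target pose (hypothetical structure for illustration)."""
    blendshapes: dict = field(default_factory=dict)  # shape name -> weight 0..100
    jaw_angle: float = 0.0  # degrees the jaw bone is rotated open


# Illustrative poses; the names and numbers are made up.
POSES = {
    "AI": PhonemePose({"Mouth_Open": 70.0, "Mouth_Smile": 15.0}, jaw_angle=12.0),
    "MBP": PhonemePose({"Mouth_Lips_Tight": 100.0}, jaw_angle=0.0),
}


def blend(a, b, t):
    """Linearly blend from pose a (t=0) to pose b (t=1).

    A blendshape missing from one pose is treated as weight 0 in that pose.
    """
    names = set(a.blendshapes) | set(b.blendshapes)
    shapes = {
        n: (1 - t) * a.blendshapes.get(n, 0.0) + t * b.blendshapes.get(n, 0.0)
        for n in names
    }
    return PhonemePose(shapes, (1 - t) * a.jaw_angle + t * b.jaw_angle)
```

Halfway between the two poses above, `blend(POSES["AI"], POSES["MBP"], 0.5)` has the jaw half open and the lips half tightened, so the blendshapes and the bone move together rather than being animated separately.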

That, coupled with the audio event manager, makes this a clear winner. My only issue was that for automatic phoneme detection from audio I had to switch to a Windows machine, as the automation plugin is broken in the current Mac High Sierra build.

It took about an hour for me to get the blendshapes and jaw bones set up and the results were very good. It looked as good as I think is possible with an automated system. The next level up is something like iClone Faceware, but that’s a lot of additional time and investment.

Oculus LipSync

Its timing seems really good. It triggers a few too many animations, which gives the model a kind of chattering look, but it does a pretty good job of automatically generating visemes. It can also create an asset containing the timings and visemes, which is helpful if you plan to do more manual tweaking.

For 3D models it takes one blend shape for each viseme (info on visemes here: https://developer.oculus.com/documentation/audiosdk/latest/concepts/audio-ovrlipsync-viseme-reference/), which is a bit of a limitation as you’ll need to create these yourself if they’re not already keyed. A lot of models come with lip morphs but not necessarily viseme morphs, so you either generate them (time consuming) or write a script to blend others together (also time consuming). Any automated lipsync solution will need this though.
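The “blend others together” script boils down to a recipe table: each viseme becomes a weighted mix of morphs the model already has. Here is a rough Python sketch of that mapping. The viseme keys (`sil`, `PP`, `FF`, `aa`, `oh`) are taken from the Oculus viseme set, but the morph names and the weights are hypothetical and would need tuning per model.

```python
# Hypothetical recipes: Oculus viseme -> weighted mix of existing lip morphs.
# Only a few visemes are shown; morph names and amounts are made up.
VISEME_RECIPES = {
    "sil": {},
    "PP": {"Lips_Closed": 1.0},
    "FF": {"Lower_Lip_In": 0.8, "Jaw_Open": 0.1},
    "aa": {"Jaw_Open": 0.8},
    "oh": {"Jaw_Open": 0.5, "Lips_Pucker": 0.6},
}


def morph_weights(viseme_weights):
    """Turn per-viseme weights (0..1) into per-morph weights, clamped to 0..1."""
    out = {}
    for viseme, weight in viseme_weights.items():
        # Visemes without a recipe contribute nothing.
        for morph, amount in VISEME_RECIPES.get(viseme, {}).items():
            out[morph] = out.get(morph, 0.0) + weight * amount
    return {morph: min(1.0, value) for morph, value in out.items()}
```

The clamp matters when two active visemes share a morph: without it, “aa” and “oh” together would push the jaw past fully open.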

For models with textured lips it can do texture flipping, swapping out a texture on a material. There’s no built-in solution for UV offsets.

It’s free and it sort of works if you can fit in with its options. It could do with a way to simplify the results or at least tweak them a little. Its pre-generated analysis was pretty good though. Results are “fine”: not great, just fine.

With some additional work, such as a way to refine the automated results and manage additional blend shapes a little, this could be a great solution.
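As an idea of the kind of refinement I mean, the chattering could probably be tamed by a small clean-up pass over the generated timings. The Python sketch below assumes the timings have been exported as a list of `(start, end, viseme)` tuples in seconds. That format, the `drop_short_events` name and the 0.06 second cut-off are my own assumptions, not part of the Oculus SDK.

```python
def drop_short_events(events, min_duration=0.06):
    """Merge very short viseme events into their predecessor.

    events: list of (start_seconds, end_seconds, viseme), sorted by start time.
    An event shorter than min_duration is absorbed by the previous event, and
    neighbours that end up with the same viseme are joined into one event.
    A short event at the very start is kept, as there is nothing to merge into.
    """
    cleaned = []
    for start, end, viseme in events:
        if cleaned and (end - start) < min_duration:
            # Too short to read as a mouth shape: extend the previous event.
            prev_start, _, prev_viseme = cleaned[-1]
            cleaned[-1] = (prev_start, end, prev_viseme)
        elif cleaned and cleaned[-1][2] == viseme:
            # Same viseme as the previous event: join them.
            prev_start, _, _ = cleaned[-1]
            cleaned[-1] = (prev_start, end, viseme)
        else:
            cleaned.append((start, end, viseme))
    return cleaned
```

For example, a 30 ms “E” sandwiched between two “aa” events disappears and the two “aa” events become one longer hold, which is the sort of simplification a hand animator would make anyway.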

Conclusion

This comparison was easy to make: LipSync Pro is an easy winner, loaded with flexibility and options. It’s a professional-level asset that does what it should, and in time saved it’s easily worth its asking price.

If it has to be free, then Oculus is your best option (or maybe the free LipSync Lite option from Rogo). It’s a good starting point to build upon, but I think you will need to build some tools for your workflow.

If your choice is between Salsa and LipSync Pro, well, at the same price there really isn’t any comparison. Salsa is a really simplistic option with none of the depth or options of LipSync Pro.
