Visualizing Voice in Microsoft Teams
When it comes to meetings, there are two types of people: those who turn their video on and those who do not. In this post I will talk about how I designed a visual/motion system to give audio-only participants an equal voice in meetings. "Visual voice" is the term I will use for the halo that appears around audio-only avatars as they speak. Watch the screen recording below to get an idea of how the visual voice system behaves across desktop and mobile experiences.
The above represents the end result after many design iterations. Let's break it down, starting with the avatar. When a user begins talking, their avatar image comes alive by lifting off the canvas while animating from its 84% base opacity to 100% opacity. I really wanted to drive home the notion of active vs. inactive presence, and I achieved this through very subtle shifts in scale along with the appearance of a drop shadow to reinforce a sense of depth. Want to try it out yourself? Just mouse over the avatars to simulate active speaking. I highly recommend viewing this prototype in Safari, though Chrome will work as well. Click to view the interactive desktop prototype here.
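The active vs. inactive treatment described above can be sketched as a simple style function. This is a minimal illustration, not the shipped implementation; the function name, scale amount, shadow values, and transition timing are all assumptions, while the 84%/100% opacity values come from the post.

```javascript
// Illustrative sketch of the active-speaker avatar treatment: inactive
// avatars rest at 84% opacity; an active speaker animates to full opacity,
// scales up subtly, and gains a drop shadow to reinforce depth.
// All values except the 84%/100% opacities are assumed for the example.
function avatarStyle(isSpeaking) {
  return {
    opacity: isSpeaking ? 1 : 0.84,                      // 84% base -> 100% active
    transform: isSpeaking ? "scale(1.04)" : "scale(1)",  // subtle lift off the canvas
    boxShadow: isSpeaking
      ? "0 4px 12px rgba(0, 0, 0, 0.35)"                 // shadow appears only when active
      : "none",
    transition: "all 200ms ease-out",                    // assumed timing
  };
}
```

Applying these styles on a speaking-state change is enough to reproduce the lift-off effect with a plain CSS transition.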
Moving on, let's talk about the secondary and arguably equally important counterpart to the visual voice animation: the "living" background. As you may have noticed by now, the background heats up and cools off at a slightly slower rate than the visual voice ring. This was intentional. The background was designed to follow the flow of the conversation, while visual voice was intended to be a more deterministic indicator of the current active speaker. Overall, I think the slight lag in the heat up and cool off helps soften the edges around abrupt speaker changes in real-world meeting scenarios. Let's see how this system scales to mobile by clicking here. How to use: Mouse over or tap an avatar (if you're using a touch-enabled device) to simulate speaking, and tap anywhere along the screen to invoke the meeting call controls and header.
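One way to get this "ring reacts fast, background lags behind" behavior is to drive both from the same speaking signal but with different smoothing factors. The sketch below is an assumption about how it could work, not how the actual prototypes are built; the function names and coefficients are illustrative.

```javascript
// Hypothetical sketch of the heat up / cool off lag: both the visual voice
// ring and the background chase the same speaking target (1 = speaking,
// 0 = silent), but the background uses a smaller smoothing factor, so it
// heats up and cools off more slowly than the ring.
function makeHeatTracker(smoothing) {
  let level = 0; // 0 = fully cooled off, 1 = fully heated up
  return function update(isSpeaking) {
    const target = isSpeaking ? 1 : 0;
    level += smoothing * (target - level); // exponential approach per frame
    return level;
  };
}

const ring = makeHeatTracker(0.5);        // ring reacts quickly
const background = makeHeatTracker(0.15); // background trails the ring

ring(true);       // -> 0.5 after one frame of speech
background(true); // -> 0.15, still "cool" while the ring is already hot
```

The lag falls out of the math for free: the smaller factor smooths over abrupt speaker changes, which is exactly the softening effect described above.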
An essential part of this system is how light is used to signify activity. You will notice that the background has a subtle light-casting gradient. I created this using three-dimensional lights in Cinema 4D. I wanted to accurately emulate physical light, and I achieved this by creating a light ring that cast light onto physical surfaces behind it. Once I was satisfied with the results, I used blending modes to screen the light-cast render over the solid purple background layer.
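For reference, the screen blend mode mentioned above has a simple per-channel definition: result = 1 − (1 − base)(1 − blend), with channels in the 0..1 range. The sketch below shows why screening a light render over a solid color reads as light: it can only lighten, never darken. The purple color value is illustrative, not the actual product color.

```javascript
// Screen blend mode, per color channel in the 0..1 range:
// result = 1 - (1 - base) * (1 - blend).
// Black in the blend layer leaves the base untouched; anything brighter
// lightens it, which is why a screened light-cast render reads as emitted light.
function screen(base, blend) {
  return 1 - (1 - base) * (1 - blend);
}

// Screen a light render over each channel of a background color.
function screenColor(baseRgb, blendRgb) {
  return baseRgb.map((channel, i) => screen(channel, blendRgb[i]));
}

// Example: a soft white glow screened over an illustrative purple.
const purple = [0.36, 0.22, 0.62];
const glow = [0.3, 0.3, 0.3];
const lit = screenColor(purple, glow); // every channel brighter than the base
```

Doing this in a compositing app with a Screen layer produces the same result as this per-channel math.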


In future iterations I hope to use the active speaker's audio signal to range map the scale of the visual voice ring. I hope this will complete the system by adding that much-needed human element to the calling grid. That's it! If you like, you can check out another desktop prototype I created to see how this system scales around content sharing in Teams. Watch the video below or try it out yourself by clicking here.
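The range mapping idea could be sketched as a linear remap from a normalized audio level onto a ring scale range. Everything here is an assumption about the future iteration: the function names, the 0..1 input range, and the 1.0–1.15 scale bounds are illustrative.

```javascript
// Hypothetical range mapping for the future iteration described above:
// linearly remap a value from an input range to an output range, clamped
// so a noisy audio level never pushes the ring outside its bounds.
function rangeMap(value, inMin, inMax, outMin, outMax) {
  const t = (value - inMin) / (inMax - inMin);
  const clamped = Math.min(1, Math.max(0, t));
  return outMin + clamped * (outMax - outMin);
}

// Map a normalized audio level (0 = silence, 1 = loudest) onto an
// assumed ring scale range of 1.0 (resting) to 1.15 (peak).
function ringScale(audioLevel) {
  return rangeMap(audioLevel, 0, 1, 1.0, 1.15);
}

ringScale(0);   // -> 1, ring at rest during silence
ringScale(0.5); // -> 1.075, mid-volume speech
ringScale(1);   // -> 1.15, ring at its assumed maximum
```

Feeding this scale into the same transition used for the avatar would let the ring breathe with the speaker's voice.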
Another component of visualizing voice in calling experiences is giving users audio-reactive feedback on their microphone levels. I built a prototype using the getUserMedia API to drive animations based on microphone levels streaming over the browser. Click here to view. Note: this prototype only works in Chrome and Firefox, and you will need to allow access to your microphone in order to properly drive the animation with your voice.
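A minimal version of that microphone plumbing can be put together with getUserMedia and the Web Audio API's AnalyserNode. This is a generic sketch of the technique, not the prototype's actual code; the `computeLevel` helper name and the scale formula in the comment are assumptions.

```javascript
// Turn a buffer of time-domain audio samples (values in -1..1) into a single
// RMS loudness value: 0 when silent, growing with volume.
function computeLevel(samples) {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length);
}

// Browser-only wiring, guarded so the pure helper above runs anywhere.
// Requires the user to grant microphone access, as noted in the post.
if (typeof navigator !== "undefined" && navigator.mediaDevices) {
  navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
    const ctx = new AudioContext();
    const analyser = ctx.createAnalyser();
    ctx.createMediaStreamSource(stream).connect(analyser);
    const samples = new Float32Array(analyser.fftSize);

    (function tick() {
      analyser.getFloatTimeDomainData(samples);
      const level = computeLevel(samples);
      // Drive any animation from `level`, e.g. (hypothetical element):
      // ringElement.style.transform = `scale(${1 + level})`;
      requestAnimationFrame(tick);
    })();
  });
}
```

Reading the time-domain buffer once per animation frame keeps the visual tightly coupled to the voice without any extra event plumbing.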