
The Cocktail Party Effect and Digital Fitness Audio
A 1950s Cocktail Party
Let me take you back in time to a 1950s cocktail party. Help yourself to the devilled eggs and cheese with crackers; ambrosia is for dessert. I imagine you and I will be drinking Singapore Slings. I lean over to whisper a few words in your ear, but can you hear what I'm saying?
Rock Around the Clock by Bill Haley & His Comets is blasting out of the jukebox. It's so noisy in this busy, crowded room; do you have any idea what I'm saying at all?
Amazingly, despite all of the different conversations happening at this swish cocktail party, plus the loud rock and roll music, you somehow can single out my individual words in amongst the cacophony of different musical frequencies. How so?
"You somehow can single out my individual words in amongst the cacophony of different musical frequencies."
The Cocktail Party Effect
The cocktail party effect (or problem) was first described by Colin Cherry in 1953. He identified how the brain can focus on a single sound or conversation in a noisy environment like a crowded cocktail party, while filtering out other competing sounds. This selective auditory attention allows someone to follow a single speaker despite the background music, and demonstrates how the brain separates audio streams and prioritises important information.
Further research by Neville Moray found that this attention can be broken by important new information arriving on a different audio stream that we were not consciously following, such as someone mentioning our name or an emergency announcement.
"This selective auditory attention allows someone to follow a single speaker despite the background music."
How the Brain Separates Sound
The brain uses the slight differences in the timing and intensity of sound arriving at each ear to work out a sound's location. This spatial mapping is crucial for separating sound sources. Most current audio devices struggle to automatically replicate the human brain's ability to isolate specific voices from a single, blended audio track. On TV, all sounds (dialogue, music, effects) are typically mixed into one or two channels, making it difficult for the brain to use its natural spatial processing or selective attention filter effectively.
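To give a feel for how small those timing differences are, here is a minimal sketch in Python. It uses a simplified straight-line model that ignores the way sound bends around the head, and the ear spacing and speed of sound are assumed round figures, not measurements.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: roughly the speed of sound in air at room temperature
EAR_SPACING_M = 0.18        # assumed: illustrative distance between the two ears


def interaural_time_difference(angle_deg: float) -> float:
    """Approximate gap in arrival time (seconds) between the two ears.

    angle_deg is the direction of the sound source measured from straight
    ahead: 0 is directly in front, 90 is directly to one side.
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(angle_deg))
    return extra_path_m / SPEED_OF_SOUND_M_S
```

Under these assumptions, a voice directly to one side (90 degrees) reaches the far ear roughly half a millisecond after the near ear, and a voice straight ahead arrives at both ears at the same time. The brain works with gaps this small.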
Many modern TVs and sound systems include "speech focus," "clear voice," or "dialogue enhancement" settings. These features use signal processing algorithms to boost the frequency range of human speech. Advanced hearing aids can sync with TV audio systems or smart glasses to stream a focused, filtered sound directly to the user's ears, often with impressive results when combined with visual cues.
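As a rough illustration of what "boosting the frequency range of human speech" can mean, the sketch below applies a single peaking equaliser band using the widely published Audio EQ Cookbook biquad formulas. It is not how any particular TV works. The function names are made up for this example, and the 2 kHz centre, 6 dB boost and 16 kHz sample rate are assumed values chosen only to make the idea concrete.

```python
import math


def peaking_eq_coefficients(sample_rate, centre_hz, gain_db, q):
    """Normalised biquad coefficients for one peaking EQ band."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    # Divide through by a[0] so the filter can be run directly.
    return [x / a[0] for x in b], [x / a[0] for x in a]


def apply_biquad(samples, b, a):
    """Run the samples through the filter one at a time."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def boost_speech_band(samples, sample_rate=16000, centre_hz=2000.0,
                      gain_db=6.0, q=1.0):
    """Lift a band around centre_hz (assumed speech region), leave the rest alone."""
    b, a = peaking_eq_coefficients(sample_rate, centre_hz, gain_db, q)
    return apply_biquad(samples, b, a)
```

With these settings a 2 kHz tone comes out about twice as loud in amplitude, while a 100 Hz bass tone passes through almost unchanged. Real dialogue enhancement is usually more sophisticated than one fixed band, but the principle of treating frequency ranges differently is the same.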
"Most current audio devices struggle to automatically replicate the human brain's ability to isolate specific voices from a single, blended audio track."
The Digital Workout Challenge
But in the world of digital workouts, the words spoken by the instructor and the lyrics sung by the singer often clash at key points. Coaching from the instructor is a vital component of the experience for both safety and motivational reasons. Separately, it is easy to underestimate how much the lyrics contribute to the overall group fitness experience. Previous studies have shown that motivational lyrics can make a difference to performance, enjoyment and remembered pleasure, and many instructors are known for their music choices. Variations in musical intensity within a composition can have a similar positive influence on the group fitness experience, and these musical changes can also be hidden by the instructor's voice.
"Coaching from the instructor is a vital component of the experience for both safety and motivational reasons."
Our Technical Solution
At Johnson Digital we have been working on a clever technical solution to this problem. We add an audio engineering step in post-production to ensure that the instructor's voice is clearly heard over the lyrics of the music, whilst maintaining attentional focus on the music and lyrics when the instructor isn't speaking.
Traditionally, the same effect would have been achieved by noticeably turning the music down when the instructor speaks, using a function called audio ducking that has been around for decades. You might have heard this when an announcement is made at a wedding disco. The music clearly dips in volume when the mic channel is activated by speech. Not a great experience!
"The music clearly dips in volume when the mic channel is activated by speech."
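For readers who like to see the mechanics, here is a minimal sketch of traditional audio ducking as described above. This is the old approach, not the Johnson Digital technique. It assumes the audio has already been split into short frames with one voice level per frame, and the function name, threshold, ducked gain and attack and release rates are all illustrative values invented for this example.

```python
def ducking_gains(voice_levels, threshold=0.1, duck_gain=0.3,
                  attack=0.5, release=0.1):
    """Return one music gain per frame (1.0 means full volume).

    While the voice level is above the threshold the gain drops quickly
    towards duck_gain; once the voice stops it recovers more slowly.
    """
    gains = []
    gain = 1.0
    for level in voice_levels:
        target = duck_gain if level > threshold else 1.0
        # Fast when turning the music down, slower when bringing it back.
        rate = attack if target < gain else release
        gain += rate * (target - gain)
        gains.append(gain)
    return gains
```

Multiplying each frame of music by its gain before mixing in the voice gives the familiar wedding disco dip: the whole track drops to 30% of its level within a few frames of speech, then creeps back up afterwards.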
We don't have to reduce the volume of the music to achieve total instructor clarity. We've ensured that there is no obvious dip in volume in any frequency component of the sound. In other words, you can't tell that the overall musical experience has been tampered with in this way. The outcome is simply that both the music and the instructor's voice sound perfectly clear.
"You can't tell that the overall musical experience has been tampered with in this way."
Implications for In-Person Group Exercise
We believe this clever technique has broader implications for in-person group exercise. The battle between the mic and the music in most group fitness studios with archaic, decades-old sound systems typically leads to each channel being increased in volume to compete with the other. The instructor turns up the mic to be heard over the music, then turns up the music as the class progresses and everyone's hearing adapts to the volume. The instructor can then no longer be heard over the music, so the mic goes up again.
If the instructor doesn't have access to the mic volume control, they shout louder instead. That is bad news for participants' hearing and for the instructor's own voice. Improving the acoustics of the room can help: reducing overall reverberation (echo) by adding soft surfaces or acoustic panels will make speech easier to understand.
"The battle between the mic and the music typically leads to each channel being increased in volume to compete with the other."
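To show why soft surfaces help with reverberation, here is a small sketch based on Sabine's formula, a standard acoustics rule of thumb that estimates how long sound takes to die away in a room. The studio dimensions and absorption coefficients below are hypothetical round numbers, not measurements of any real room.

```python
SABINE_CONSTANT = 0.161  # seconds per metre, for rooms measured in metres


def reverberation_time(volume_m3, surfaces):
    """Estimate reverberation time in seconds using Sabine's formula.

    surfaces is a list of (area in square metres, absorption coefficient)
    pairs, where 0 reflects everything and 1 absorbs everything.
    """
    total_absorption = sum(area * coeff for area, coeff in surfaces)
    return SABINE_CONSTANT * volume_m3 / total_absorption


# Hypothetical 10 m x 8 m x 3 m studio: 240 cubic metres, 268 square metres of surface.
bare_room = [(268.0, 0.05)]                     # assumed: all hard, reflective surfaces
treated_room = [(228.0, 0.05), (40.0, 0.80)]    # assumed: 40 square metres of panels added

rt_bare = reverberation_time(240.0, bare_room)
rt_treated = reverberation_time(240.0, treated_room)
```

With these made-up figures, covering 40 square metres of wall with absorbent panels cuts the estimated reverberation time from nearly three seconds to under one second, which is the kind of change that makes spoken instructions much easier to follow.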
Why It Matters
If you want your digital fitness participants to work harder, enjoy the experience more, connect more closely with the instructor, engage with your music choices and exercise as safely as possible, you should be using our post-production sound engineering techniques.
So, once you have sobered up from the cocktails, learn more about what we offer at -
Check out my next blog for more about improving the in-club acoustic experience.



