Why are you walking with your head down, arms swinging listlessly by your sides, dragging your feet? Did you just have an argument? Did you check your bank balance? Get an unexpected bill? Or do you always walk like that?
An interesting new study from the University of North Carolina, Chapel Hill, suggests AI may be able to guess your “inner state” from your gait. But there’s clearly a long way to go in this research.
Here’s the short version of what the researchers did. They created a simplified representation of a human being, a “skeleton mesh,” built from movement features (e.g. speed, acceleration, jerkiness) for sixteen body joints. Using their own video dataset, as well as publicly available datasets, they mapped visualizations of the way people walked onto the skeleton mesh model. They then had 688 Mechanical Turk volunteers classify sample sets of videos by emotion (happy, angry, sad, neutral). This provided a dataset the researchers used to train a machine learning model to identify emotions from skeleton mesh walking behavior.
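To make the pipeline a little more concrete, here is a minimal sketch of the idea in Python. It is not the researchers’ code: the synthetic data, the per-joint speed/acceleration/jerk features, and the off-the-shelf random-forest classifier are illustrative assumptions standing in for whatever feature extraction and model the paper actually uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

EMOTIONS = ["happy", "angry", "sad", "neutral"]  # the four labels used in the study
N_JOINTS = 16                                    # body joints tracked per frame


def gait_features(joints):
    """Summarize one walking clip.

    joints: array of shape (frames, N_JOINTS, 3), the x/y/z position of each
    joint in each video frame. Returns average per-joint speed, acceleration
    and jerk magnitudes, a rough stand-in for the paper's movement features.
    """
    velocity = np.diff(joints, axis=0)           # frame-to-frame displacement
    acceleration = np.diff(velocity, axis=0)
    jerk = np.diff(acceleration, axis=0)
    feats = []
    for series in (velocity, acceleration, jerk):
        magnitude = np.linalg.norm(series, axis=2)   # (frames - k, N_JOINTS)
        feats.append(magnitude.mean(axis=0))         # average over the clip
    return np.concatenate(feats)                     # 3 * N_JOINTS features


# Hypothetical training data: each clip is a (frames, 16, 3) joint track and
# each label is one of the four MTurk-assigned emotions.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(120, N_JOINTS, 3)).cumsum(axis=0) for _ in range(200)]
labels = rng.choice(EMOTIONS, size=200)

X = np.stack([gait_features(c) for c in clips])
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
print(model.predict(X[:5]))  # predicted emotion for the first five clips
```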
Deep breath. Once trained, how well did the model work? When the researchers applied it to their own video dataset and compared its output with the MTurk crew’s evaluations, they reported accuracy of over 80 percent, a big advance on earlier experimental attempts.
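Continuing the sketch above (same assumptions, same hypothetical data), the headline figure is essentially this kind of agreement rate between the model’s predictions and the human labels on held-out clips:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical held-out clips, labelled by MTurk raters rather than the model.
test_clips = [rng.normal(size=(120, N_JOINTS, 3)).cumsum(axis=0) for _ in range(50)]
mturk_labels = rng.choice(EMOTIONS, size=50)

X_test = np.stack([gait_features(c) for c in test_clips])
predictions = model.predict(X_test)

print("accuracy:", accuracy_score(mturk_labels, predictions))        # the "over 80 percent" figure
print(confusion_matrix(mturk_labels, predictions, labels=EMOTIONS))  # per-emotion breakdown
```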
Of course, there’s nothing novel in the idea that we can convey what is actually a very large and subtle array of emotions and attitudes through body language alone. That’s something I heard from evolutionary psychologist and body language expert Mark Bowden at the Demandbase ABM Innovation Summit earlier this year (see below for a TED talk from Bowden). Humans, largely unconsciously, respond to the way the people around them behave, deriving nuanced insights from posture, facial expression, and hand movements. While it’s impressive that a machine learning model appears to be able to detect three very basic emotions, it’s at a very early stage of being able to do what we do all the time.
Let’s also talk about the walks themselves.
The researchers’ own video dataset is not derived from what one might call real-life episodes of walking. They recruited a group of university students and asked them to walk as if happy, angry, sad or neutral. The observation in the study that non-actors are as adept at this as actors is interesting, but doesn’t really address what seems to be an important limitation here. Both the MTurk observers and the machine learning model were rating the behavior not of happy (or angry, or sad) people walking, but of people walking as if they were happy, angry, or sad. The missing element is evidence that these performative gaits, as one might call them, are good proxies for real-life gaits. In other words, do people genuinely experiencing those emotions actually walk the same way as people performing them?
A fake walk might not be as extreme as in the Monty Python sketch, but it might well be exaggerated.
The researchers themselves point out other limitations. In real-life situations, video might not capture all 16 joints. Walking might be affected by carrying bags, or by mobile phone use. And they recognize that gait data would usefully be combined with data on speech and facial expressions.
An expert like Mark Bowden can tell a great deal more by observing a holistic set of behavioral characteristics than AI can yet derive from gait or gestures. The obvious appeal of assigning the task to machine learning is that there aren’t many Bowdens to go around. As ever, it’s about scale. But if headlines about this study prompt you to think a new way of knowing how an audience is feeling is now available, we’re not quite there yet.