Google Summer of Code: Week 5–6 Excitement modelling: Analysing Sound and Subtitles

Sound Energy Analysis
The next component in excitement modelling was analysing sound. During a match, the cheering of the crowd and the excitement in the commentators' voices mark the most exciting events.
To capture this aspect, the volume was approximated by calculating the root-mean-square (RMS) of the sound array for each subclip (at a frame rate of 10).
The resulting volume curve was then convolved with a smoothing Kaiser window to obtain a curve with a few distinct peaks.
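The two steps above (RMS volume per chunk, then Kaiser smoothing) can be sketched roughly as follows. This is only an illustration using NumPy on a raw sample array, not the exact code from the project: the function names, the window length and the beta value are my own assumptions, and "a frame rate of 10" is read here as ten volume values per second.

```python
import numpy as np


def rms_volume(samples, sample_rate, chunks_per_second=10):
    """RMS volume of consecutive chunks.

    samples: array of shape (n,) or (n, channels).
    chunks_per_second=10 is an assumed reading of "frame rate of 10".
    """
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]
    chunk = sample_rate // chunks_per_second
    n_chunks = len(samples) // chunk
    # Drop the incomplete tail, then put one chunk (all channels) per row.
    trimmed = samples[: n_chunks * chunk].reshape(n_chunks, -1)
    return np.sqrt((trimmed ** 2).mean(axis=1))


def smooth(volume, window_len=51, beta=14.0):
    """Convolve the volume curve with a normalised Kaiser window.

    window_len and beta are illustrative values, not tuned ones.
    """
    window_len = min(window_len, len(volume))
    window = np.kaiser(window_len, beta)
    window /= window.sum()
    return np.convolve(volume, window, mode="same")


# Synthetic example: 60 s of quiet noise with a loud burst at 30-33 s.
rate = 1000
rng = np.random.default_rng(0)
audio = 0.1 * rng.standard_normal(60 * rate)
audio[30 * rate: 33 * rate] *= 10.0

volume = rms_volume(audio, rate)
curve = smooth(volume)
print("loudest moment (s):", np.argmax(curve) / 10)
```

With a real clip, the sample array would come from the audio track of the video instead of the synthetic signal used here.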

This component was inspired by Zulko’s blog but, unlike his approach, I smoothed the sound curve. This curve will later be combined with the previously developed components to produce the final highlights.

Subtitle testing: Sentiment analysis
The commentators discuss the match in detail, so subtitles can serve as a useful source for analysing how the match proceeds, where a goal was scored or where someone commits a foul.
The attempt was to get the polarity of statements from the subtitles. The language processing library used for sentiment analysis was TextBlob.
So, I tried to look for clusters of statements with high positive polarity. The model was able to detect a few crucial moments, for example, statements with high positive polarity like:
“THAT’S A CLASSIC FINISH. GOOD PLAY FROM URRUTI.
WOW. GOOD FINISH FROM THE KID.” which was accompanied by a goal.
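To make the idea of "clusters of statements with high positive polarity" concrete, here is a minimal sketch. It assumes the subtitles are already parsed into (start time in seconds, text) pairs. The polarity call is TextBlob's `sentiment.polarity`, but the threshold, the maximum gap between statements and the minimum cluster size are hypothetical values I picked for illustration, not the ones used in the project.

```python
def polarity(text):
    """Polarity in [-1, 1] from TextBlob (imported only when called)."""
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def positive_clusters(subtitles, score=polarity, threshold=0.5,
                      max_gap=10.0, min_size=2):
    """Group highly positive statements that occur close together.

    subtitles: iterable of (start_seconds, text).
    threshold, max_gap and min_size are illustrative assumptions.
    Returns a list of (first_start, last_start) spans.
    """
    clusters, current = [], []
    for start, text in sorted(subtitles):
        if score(text) < threshold:
            continue
        # A long silence between positive statements closes the cluster.
        if current and start - current[-1] > max_gap:
            if len(current) >= min_size:
                clusters.append((current[0], current[-1]))
            current = []
        current.append(start)
    if len(current) >= min_size:
        clusters.append((current[0], current[-1]))
    return clusters
```

The `score` argument can be swapped for any other function that maps a statement to a number, which makes it easy to compare different scoring methods on the same subtitles.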
But some moments were missed because of negative polarity, where the commentators expressed praise in negatively worded statements, like:
“YOU’RE NOT GOING TO BEAT THAT.” or “ABSOLUTELY NOTHING A GOALKEEPER CAN DO.” which were also accompanied by a goal.
Another problem, I think, was that the commentators would often start discussing something positive or negative about an unrelated topic (like a player's performance over the season), which caused false-positive clusters. For example:
“HIS TECHNIQUE IS GREAT. EXCITED ABOUT THIS KID. GOOD SEASON FOR HIM.”
This made the analysis tougher, and the results of sentiment analysis on the subtitles were not satisfactory.
Next, I will try things like fuzzy string search on the subtitles, looking for statements with exclamations or words like goal/score.
A close miss, though.
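As a rough idea of what the fuzzy keyword search mentioned above could look like, here is a small sketch using Python's standard `difflib` module. I have not settled on a library yet, so `difflib` is just a stand-in, and the keyword list and the similarity cutoff are hypothetical.

```python
import difflib
import re

# Hypothetical keyword list; a real one would need tuning on actual subtitles.
KEYWORDS = ("goal", "score", "finish")


def keyword_hits(subtitles, keywords=KEYWORDS, cutoff=0.8):
    """Return subtitles that contain an exclamation or a near-match keyword.

    subtitles: iterable of (start_seconds, text).
    cutoff is the difflib similarity ratio (0 to 1) required for a match.
    Returns a list of (start_seconds, text, matched_keywords).
    """
    hits = []
    for start, text in subtitles:
        words = re.findall(r"[a-z']+", text.lower())
        matched = sorted(
            kw for kw in keywords
            if difflib.get_close_matches(kw, words, n=1, cutoff=cutoff)
        )
        if matched or "!" in text:
            hits.append((start, text, matched))
    return hits
```

With this cutoff, variants like "GOALS" or "SCORES" match their keyword while "GOALKEEPER" does not, so a line like "ABSOLUTELY NOTHING A GOALKEEPER CAN DO." would still be missed by this approach.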

Conclusion
The sound curve was successfully obtained and will be used in the overall highlights detection. The subtitle processing did not work well with the sentiment analysis method, so I will experiment with other methods for it.