Is big data only for the big?

Joanna Szenk
Getindata Blog

--

Would you consider using your customers’ images and videos in your marketing campaign? What are the challenges for an app with 47M users? Do you really need to visit the elderly to take good care of them? Are you curious how to monitor the spread of a pandemic, or how to measure impressions in podcasts? Read about Flowbox, Lifesum, Nectarine Health, Xolaris and Acast: medium-sized Swedish companies that make great use of big data technologies, just like the globally renowned Spotify, King or Truecaller.

Flowbox — marketing technology

Flowbox developed a platform that allows companies to collect public visual content from configurable sources and social media channels. Images or videos created by customers or brand ambassadors can be used in a company’s digital channels, such as its website, online shop or ads, once their authors have approved. Several customers report that this kind of content achieves a higher click-through rate than ordinary campaign material and is faster and cheaper to create.

“We’re currently building a model, using advanced statistical algorithms, that will sort the image flows in our platform in a way that ensures that the most engaging and relevant images are the ones that users see first”, says Erik Lundberg, a data scientist at Flowbox. This scoring algorithm takes into account how many times a piece of content was viewed and clicked through, as well as how recent it is.

In terms of technology, Flowbox uses Apache Spark for data retrieval and complex SQL queries to calculate Flowscore, a newly developed algorithm that sorts flows based on an advanced statistical model.
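To make the idea of ranking by views, clicks and recency concrete, here is a minimal Python sketch. It is not Flowscore, whose statistical model is not described here. It assumes a smoothed click-through rate multiplied by an exponential recency decay, and every name and parameter in it (`ContentStats`, `engagement_score`, `half_life_days`, `prior_views`, `prior_ctr`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContentStats:
    content_id: str
    views: int
    clicks: int
    age_days: float


def engagement_score(item, half_life_days=30.0, prior_views=20, prior_ctr=0.02):
    """Illustrative score: smoothed click-through rate times a recency decay.

    All parameters are assumptions made for this sketch.
    """
    # Smoothing pulls items with very few views towards a prior CTR,
    # so one lucky click on two views does not dominate the ranking.
    smoothed_ctr = (item.clicks + prior_ctr * prior_views) / (item.views + prior_views)
    # The weight halves every `half_life_days`, so older content sinks.
    recency = 0.5 ** (item.age_days / half_life_days)
    return smoothed_ctr * recency


def rank(items):
    """Return items ordered from highest to lowest score."""
    return sorted(items, key=engagement_score, reverse=True)
```

With these assumptions, a fresh image with a solid track record ranks above a fresh image with only a handful of views, which in turn can rank above an equally popular but month-old image.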

The company is planning more advanced machine learning projects. “One future project is to create filters that allow customers to more easily find the content that is relevant to them”, explains Lundberg. “One company might for example want pictures with only people on, while another might just want outdoor pictures.”

Lifesum — health tracking app

Lifesum created a mobile app that helps people keep track of what they eat, their exercise, habits, weight and body measurements. It can also recommend a one- to three-week meal plan that meets your goals and preferences. It is offered under a freemium model, meaning you can use the app for free but get a richer product by subscribing to a premium plan.

Lifesum relies on data analytics to provide the best possible experience to its users, and it has a lot of data for a company of its size. Its data sets encompass a massive database of food items that come in many varieties, user behaviour data and subscription information. It’s worth highlighting that the Lifesum app has recently exceeded 47 million users globally. “With this amount of data you would be helpless with any form of calculation if you didn’t have access to platforms such as Google Cloud BigQuery or Snowflake”, said Samuel Troilius, Vice President Data & Analytics at Lifesum.

On the AI side, the app offers image recognition features. Tracking your meals requires a certain level of determination, so Lifesum wants to make it as easy as possible. You can take a photo of your dish and the app will propose several tags to record what’s on your plate. “The challenges here are that there is so much noise in the pictures, the model requires a lot of training and in case of an app with the global reach you need the multi-language support”, Troilius commented.

Lifesum also launched predictive tracking of meals this summer, leveraging the on-client ML capabilities that Apple’s and Google’s development kits provide.

Nectarine Health — remote care solution

Nectarine Health developed its remote care solution to assist seniors living independently and their caregivers. Data collected by the company’s proprietary wristband flows directly to its Google Cloud-based platform via nodes placed around the living space. Once the data reaches the platform, various artificial intelligence-based algorithms start processing it. The software learns from the collected information and can detect patterns and anomalies. For instance, the system can detect that the wearer has been up unusually long at night or that a fall has occurred, and alert caregivers.

Image from Nectarine Health

Longer-term analysis of the data can also help optimize treatment. All of this significantly improves caregivers’ operational efficiency.
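As an illustration of what detecting an unusually long night-time wake period could look like, here is a minimal Python sketch. It is not Nectarine Health's algorithm, which is not described here. It assumes a simple z-score against the wearer's own recent baseline, and the function name, window, threshold and minimum spread are all hypothetical.

```python
from statistics import mean, stdev


def unusual_nights(awake_minutes, window=7, threshold=3.0, min_sigma=5.0):
    """Return indices of nights whose awake time is far above the recent baseline.

    awake_minutes: minutes awake per night, oldest first.
    window, threshold, min_sigma: assumptions made for this sketch.
    """
    flagged = []
    for i in range(window, len(awake_minutes)):
        baseline = awake_minutes[i - window:i]
        mu = mean(baseline)
        # Floor the spread so that a very regular sleeper does not
        # trigger alerts on a deviation of only a few minutes.
        sigma = max(stdev(baseline), min_sigma)
        z = (awake_minutes[i] - mu) / sigma
        if z > threshold:
            flagged.append(i)
    return flagged
```

A per-wearer baseline is the point of the design: 90 minutes awake may be normal for one person and alarming for another.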

As is usual for IoT systems, Nectarine Health collects a lot of data (around 4 billion samples per week) and needs to store terabytes of it. The movement data points from end users are streamed in real time.

“The open-source solution Spark Streaming turned out to be optimally suitable for our requirements. Organized and stored data points are then available for further pattern detection algorithms”, explained Niklas Fürderer, AI/ML Engineer at Nectarine Health. “Our focus here is accuracy over speed while maintaining costs to a minimum”, Fürderer added.

In the AI and machine learning domain, the company applies a wide range of services and frameworks and constantly adds new research-based solutions. Fürderer gave the example of using the latest version of TensorFlow combined with many supporting libraries, as well as suitable ML model tracking and serving solutions. “That way, we are ready for testing out newly trained models and evaluating them with real use cases”, he said.

Xolaris — public safety technology

Xolaris creates systems that enable its customers to gain insights from mobile networks in order to save lives in times of public threat and to detect criminal and terrorist activity. With enough historical data collected over time, machine learning can be used to recognize certain patterns of cellular activity, such as position, movement and phone usage, as suspect and flag them for human review.

Xolaris’s most relevant product at present is REACT™, a decision support platform that helps government and health authorities understand how their measures affect people’s movements and ultimately the spread of infection during a pandemic. The platform measures the number of people in each location at a given time, generates heatmaps and density measurements, and detects movements of individuals to monitor self-isolation. REACT’s current users are governments in the Middle East and Asia.

“Our big data solutions are used to collect billions of anonymous data events per 24h in real-time from the mobile operator networks and then apply different algorithms and AI solutions”, said Amalendu Parasnis, Xolaris founder and CEO.
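The density measurement mentioned above boils down to counting distinct devices per location and time window. Here is a minimal Python sketch of that aggregation. It says nothing about how REACT is built. The event layout (an anonymous token, a cell identifier and a Unix timestamp), the one-hour bucket and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def density_by_cell(events, bucket_seconds=3600):
    """Count distinct anonymous tokens per (cell, time bucket).

    events: iterable of (token, cell_id, unix_ts) tuples; this layout
    is hypothetical and chosen only for the sketch.
    """
    seen = defaultdict(set)
    for token, cell_id, unix_ts in events:
        # Align the timestamp to the start of its bucket.
        bucket_start = unix_ts - unix_ts % bucket_seconds
        # A set ensures a device that reports many times counts once.
        seen[(cell_id, bucket_start)].add(token)
    return {key: len(tokens) for key, tokens in seen.items()}
```

The resulting counts per cell and hour are the kind of input a heatmap needs. At billions of events per day the same grouping would run on a distributed engine rather than in a single Python process.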

Acast — podcast marketplace

Acast enables producers to host podcasts for free and monetize them via its ad-supported platform. Whenever a listener requests a new episode via any of the various podcast apps, the podcast file is merged with an audio ad tailored to the listener’s profile. Because there are so many podcast apps and no common standards, advanced analytics are needed to measure the number of impressions the advertiser is buying. To achieve that, Acast’s data pipelines collect massive data streams from episode access requests, join them with data on the size of the files actually uploaded for each request, and further process and classify the data. The service currently serves over 125 million streams per month.
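The join between access requests and file sizes can be pictured with a small Python sketch. It is not Acast's pipeline or classification logic, which are not detailed here. It assumes that a listener counts as one impression per episode once the bytes sent to them reach a given fraction of the file, and the `Request` type, the `min_fraction` threshold and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    listener: str
    episode: str
    bytes_sent: int


def count_impressions(requests, file_sizes, min_fraction=0.5):
    """Count one impression per (listener, episode) with enough bytes delivered.

    file_sizes: mapping of episode -> file size in bytes.
    min_fraction: assumed threshold, chosen only for this sketch.
    """
    # Podcast apps often fetch a file in several partial requests,
    # so add up the bytes sent per listener and episode first.
    delivered = {}
    for req in requests:
        key = (req.listener, req.episode)
        delivered[key] = delivered.get(key, 0) + req.bytes_sent

    impressions = {}
    for (listener, episode), total in delivered.items():
        size = file_sizes.get(episode)
        if not size:
            continue  # unknown or empty file: cannot classify
        if total / size >= min_fraction:
            impressions[episode] = impressions.get(episode, 0) + 1
    return impressions
```

Summing partial requests is a simplification: a production pipeline would also have to handle overlapping byte ranges, retries and automated traffic.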

Similarly processed data also feeds Acast’s portal for content creators. It lets them draw insights about their listener base, such as the cities where podcasts were requested and the devices used.

Acast also uses podcasting data internally, to respond quickly to clients’ expectations and to grow the business. The company looks for patterns such as trending topics or shows that attract a larger audience, and tries to detect anomalies and malicious behaviour.

All of this is possible thanks to big data technologies such as Apache Spark, run on several flavours of AWS infrastructure: EMR (Amazon’s managed Hadoop distribution), AWS Glue jobs or Spark on AWS Fargate. GetInData has recently assisted Acast in its successful data system migration to these technologies.
