Have you ever dealt with a dataset that required you to work with list values? If so, you will understand how painful this can be. If you have not, you had better prepare for it.
If you look closely, you will find that lists are everywhere! Here are some practical problems where you will probably encounter list values.
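As a minimal sketch of what working with list values can look like, the snippet below counts how often each element appears across a list-valued column. The sample rows, the `fruits` column, and the helper `count_list_values` are hypothetical and only serve as illustration; they are not taken from the article.

```python
from collections import Counter

# Hypothetical sample data: every row stores a list in the "fruits" column.
rows = [
    {"name": "Ann", "fruits": ["apple", "banana"]},
    {"name": "Ben", "fruits": ["banana"]},
    {"name": "Cara", "fruits": ["apple", "cherry"]},
]


def count_list_values(rows, column):
    """Count every element that occurs in the list-valued column."""
    # Flatten the per-row lists into one stream of values and tally them.
    return Counter(value for row in rows for value in row[column])


counts = count_list_values(rows, "fruits")
print(counts["apple"], counts["banana"], counts["cherry"])
```

Flattening first keeps the counting logic independent of how many elements each row holds, including rows with an empty list.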
Histograms are the most common method for visualizing the distribution of a variable. A simple histogram can be very useful to get a first glance at the data. Compared to other prominent plot types like pie, bar, or line plots, however, they are rather boring to look at. Most importantly, they require a statistical background to be interpreted correctly. Still, before we ditch the histogram entirely, let’s try to make it more beautiful, richer in information, and easier to interpret!
In this tutorial, we’ll take a standard matplotlib histogram and improve it aesthetically as well as add some…
Cronbach’s alpha is the dominant measure of scale reliability in psychology and the social sciences. Its importance can’t be overstated and you’ll find it in almost every quantitative empirical study you’ll read. Surprisingly, there hasn’t been a single article published on Medium covering its application in Python. On top of that, none of the common data science libraries like NumPy, pandas, or scikit-learn feature a Cronbach’s alpha measure. This article will show you exactly how to do it.
First, I’ll explain some of the theory and math behind Cronbach's alpha. Then, we’ll quickly get to hands-on coding.
Whenever we use multiple items to…
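As a rough sketch of the calculation, assuming the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score) with sample variances, the following uses only the Python standard library. The function name `cronbach_alpha` and the toy scores are hypothetical and are not the article's own implementation.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of columns: one list of scores per item,
    with respondents in the same order in every column.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(column) != n for column in items):
        raise ValueError("all items need the same number of respondents")

    # Sample variance (n - 1 denominator) of each single item.
    item_variances = [variance(column) for column in items]
    # Sample variance of each respondent's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: two items answered by four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
print(round(alpha, 2))  # 0.75
```

Identical items yield an alpha of 1, while weakly related items push the value towards 0 or below.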
Are you a music lover or a programmer? Chances are, you’re both, just like me! When I started using Spotipy, I had little programming experience and wanted to explore computational audio analysis. Now that I’m knee-deep in programming and data science, I’m starting to appreciate Spotipy for creating amazing datasets for my data science projects.
Spotipy is a Python library that makes it easier for users to access the Spotify Web API and retrieve…
Music streaming services and their community features have had a rough marriage. After a multitude of shockingly unsuccessful advances, Spotify has shut down nearly all of its community features while Apple keeps and Tidal expands them. Their features typically revolve around sharing music, playlists, and personal listening histories on social media or in the streaming services’ own communities. As a student of musicology with a special interest in the economics and psychology of everyday music listening, the dogmatic insistence on this approach has always made me frown. I believe Spotify is on the right track in shutting down their community features…
In scientific studies, displaying error bars in your descriptive visualizations is inevitable. Holding information about the variability of your data, they are a necessary complement to your mean scores. However, scientific visualizations tend to be more beautiful on the inside than on the outside.
As data scientists, we are taught to use attractive visualizations to tell stories. We are advised to remove anything that distracts the viewer from the main point we are trying to make. …
Why you need to learn this
It is often said that various forms of data preparation make up more than 80% of a data scientist’s job. This is a valuable piece of information given that new people getting into the field seem to think it is all about expensive computers and complex deep learning models. You might be able to impress your coworkers in the data team with those things, but your managers and customers generally don’t care much about technical details. However, it is crucial to get your work noticed and understood correctly by exactly these people. …
I was recently listening to a podcast about AI in the music industry. When we think of AI and music, it’s typically music classification and generation that come to mind first. In this podcast, however, Alex Jacobi from “With Love and Data” blew my mind with a completely different approach. With his team, he developed an ML model that lets media producers find the right music for their ads, movies, etc. much more efficiently. When they search for a few keywords, the model will also recommend them music that is not tagged with any of the keywords, but with keywords…
Listening to music can be a deeply emotional experience. To some, this means getting hit by a certain song on the radio and revisiting their first kiss with their partner, or the loss of a loved one. For others, music is a means to shut out their surroundings in stressful situations and to hide themselves in their own auditory bubble. Since experiencing music is such an individual process, different kinds of personalities tend to listen to music in different ways.
In my previous article “How Your Personality Predicts the Way You Listen to Music” I outlined how personality is connected…
If you are learning statistics for data analysis, chances are you have come across the concept of continuous and categorical variables. Some of you may know the difference between the two, and some of you may even know the different analytical approaches the two require.
For this article, however, your statistical background won’t matter much, as I will give you a short introduction to the theory first. You’ll learn how to tell categorical from continuous data and what categorizing continuous data is even good for. …
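To illustrate what categorizing continuous data can mean in practice, here is a minimal sketch that sorts a continuous value into labeled bins. The helper `categorize`, the age cut-offs, and the labels are hypothetical choices made for this example and are not taken from the article.

```python
from bisect import bisect_right


def categorize(value, edges, labels):
    """Map a continuous value to a category label.

    `edges` are ascending cut-off points; `labels` needs exactly one
    more entry than `edges`. A value equal to an edge falls into the
    upper category.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    # bisect_right returns how many edges are <= value,
    # which is the index of the matching label.
    return labels[bisect_right(edges, value)]


# Hypothetical cut-offs: under 18, 18 to under 65, 65 and above.
age_edges = [18, 65]
age_labels = ["minor", "adult", "senior"]

ages = [4.5, 18, 37.2, 64.9, 65, 80]
groups = [categorize(age, age_edges, age_labels) for age in ages]
print(groups)
# ['minor', 'adult', 'adult', 'adult', 'senior', 'senior']
```

The continuous ages lose detail in the process, but the resulting groups can be counted and compared like any other categorical variable.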
Exploring the intersection of data science, musicology, and economics. Special interest in classification, visualization, and the psychology of music.