By Alind Vats

As a music enthusiast, I have always wondered what kinds of music other people find interesting.

Surely it had to be different from what you hear on the radio or at mainstream nightclubs. I decided to do a little digging and anonymously surveyed 59 students about their music tastes. I then used Spotify’s developer tools to collect data on the songs named in the responses.

We posted the survey on the UNSW Discussion and Blitz 2019 groups on Facebook, and I also asked a few friends to fill it out. I tried my best to get a random mix of respondents from within UNSW. The data I decided to use came from the responses to the question “What is a song you’ve listened to a lot lately?”

I think the answer to this question indicates the general type of music a person finds interesting and worthy of multiple listens. I had theorised that most of the music would be recent, and looking at the data, I was mostly right. We’re almost seven months into 2019, and almost half the tracks in the data were released in 2019. Only about a sixth of the tracks were released before 2010.
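For anyone who wants to try a similar breakdown, here is a minimal Python sketch of the release-year tally. It is not the script used for this article. It assumes release dates arrive as strings at year, month or day precision (the way Spotify reports album release dates), and the sample dates are made up for illustration.

```python
def release_year(release_date: str) -> int:
    """Return the year from a 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' string."""
    return int(release_date[:4])


def share_released(release_dates, predicate):
    """Fraction of tracks whose release year satisfies the predicate."""
    years = [release_year(d) for d in release_dates]
    if not years:
        return 0.0
    return sum(1 for y in years if predicate(y)) / len(years)


# Hypothetical release dates, NOT the survey data.
sample_dates = ["2019-03-29", "2019-05", "2018-11-02", "2007", "2019-01-18", "2015-06-30"]

share_2019 = share_released(sample_dates, lambda y: y == 2019)
share_pre_2010 = share_released(sample_dates, lambda y: y < 2010)
print(f"Released in 2019: {share_2019:.0%}, before 2010: {share_pre_2010:.0%}")
```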

I analysed the artists behind these songs, and expected to see multiple occurrences of popular artists like Ed Sheeran, Ariana Grande, Billie Eilish, Post Malone and Drake. This was only somewhat reflected in the data. Surprisingly, no artist appeared more than twice. The artists who appeared twice included Top 10 artists on the Billboard Chart like Ed Sheeran, Billie Eilish and Lizzo, but also Paramore.
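Counting artist appearances is a small job in Python. The sketch below is an illustration rather than the code behind this article. It assumes each track is a dictionary with an "artists" list of {"name": ...} entries, which mirrors the shape of a Spotify track object, and the track and artist names are placeholders.

```python
from collections import Counter


def artist_counts(tracks):
    """Count how many surveyed tracks each artist appears on."""
    counts = Counter()
    for track in tracks:
        for artist in track["artists"]:
            counts[artist["name"]] += 1
    return counts


# Placeholder tracks, NOT the survey responses.
sample_tracks = [
    {"name": "Track A", "artists": [{"name": "Artist One"}]},
    {"name": "Track B", "artists": [{"name": "Artist Two"}, {"name": "Artist One"}]},
    {"name": "Track C", "artists": [{"name": "Artist Three"}]},
]

counts = artist_counts(sample_tracks)
top_count = max(counts.values())
most_frequent = sorted(name for name, n in counts.items() if n == top_count)
print(top_count, most_frequent)
```

Featured artists are counted here as well as lead artists; counting only the first entry in each "artists" list would be an equally reasonable choice.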

Some popular artists missing from the data were Post Malone, BTS, Khalid, Drake and Ariana Grande. In fact, most of the artists in the data aren’t even in the Top 100 Billboard Chart right now. This suggests that even though students are mostly listening to music that came out in the past decade, they are mostly listening to artists who are not currently mainstream.

Spotify’s API gives a lot of information about tracks, playlists and artists. One attribute it provides for a track is its popularity. The popularity of a track is a value between 0 and 100, with 100 being the most popular. The popularity is calculated by an algorithm and is based, for the most part, on the total number of plays the track has had and how recent those plays are. Generally speaking, songs that are being played a lot now will have a higher popularity than songs that were played a lot in the past. This is what the popularity distribution of the data I surveyed looks like.

You can see that the data peaks from 60-80 and also has an unexpected bump from 30-40. For comparison, have a look at the popularity distribution of the Hot 100 playlist below. The playlist features this week’s most popular songs (a song is popular if it sells and streams a lot).

The peak, more or less, is 70-90, which is higher than the peak for UNSW students. The Hot 100 distribution is also more concentrated around its peak and less widely spread than the students’ tastes. So, UNSW students are listening both to songs that are popular today and to songs that are not.
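The popularity comparison above boils down to sorting tracks into ranges of 10 points and counting them. Here is a rough Python sketch of that tally, assuming each track is a dictionary with an integer "popularity" field from 0 to 100 as in Spotify's track objects. The scores are invented for illustration and the helper names are my own.

```python
from collections import Counter

BIN_WIDTH = 10  # each range covers 10 popularity points


def popularity_bins(tracks):
    """Tally tracks into ranges keyed by their lower edge (0, 10, ..., 90)."""
    bins = Counter()
    for track in tracks:
        score = track["popularity"]
        # A score of exactly 100 is folded into the top (90+) range.
        start = min(score // BIN_WIDTH * BIN_WIDTH, 100 - BIN_WIDTH)
        bins[start] += 1
    return bins


def print_histogram(bins):
    """Print one row of '#' marks per popularity range."""
    for start in range(0, 100, BIN_WIDTH):
        label = f"{start}-{start + BIN_WIDTH - 1}"
        print(f"{label:<6}{'#' * bins[start]}")


# Invented scores, NOT the survey data.
sample_tracks = [{"popularity": p} for p in (72, 65, 38, 81, 67, 100, 33)]

bins = popularity_bins(sample_tracks)
print_histogram(bins)
```

Running the same two functions on the survey tracks and on the Hot 100 tracks gives two text histograms that can be compared side by side.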

Valence, an attribute provided by Spotify’s API, is a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry). This is what the distribution of valence looks like among the student data.

The majority of the plot lies below 0.5 and the peak is between 0.2 and 0.3, which means that UNSW students lean heavily towards songs that sound sad or angry. In comparison, this is what the valence distribution of the Hot 100 playlist looks like.

Surprisingly, the plot is almost symmetric, with values below 0.5 slightly more common. The peak, more or less, is 0.4-0.7, which indicates tracks that sound neutral. In general, the tone of popular Hot 100 music is more positive than that of the music UNSW students listen to.
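The two numbers quoted for valence, the share of tracks below 0.5 and the location of the peak, can be computed directly. The sketch below is one way to do it, assuming each track's audio features are a dictionary with a "valence" value between 0.0 and 1.0. The values are made up, and the function name is hypothetical.

```python
from collections import Counter
from statistics import median


def valence_summary(features):
    """Summarise valence: share below 0.5, median, and the fullest 0.1-wide range."""
    values = [f["valence"] for f in features]
    if not values:
        raise ValueError("need at least one track")
    # Bin index 0..9; a valence of exactly 1.0 is folded into the top bin.
    bins = Counter(min(int(v * 10), 9) for v in values)
    peak_bin, _ = bins.most_common(1)[0]
    return {
        "share_below_half": sum(1 for v in values if v < 0.5) / len(values),
        "median": median(values),
        "peak_range": (peak_bin / 10, (peak_bin + 1) / 10),
    }


# Made-up valence values, NOT the survey data.
sample_features = [
    {"valence": v} for v in (0.25, 0.31, 0.22, 0.64, 0.48, 0.71, 0.28, 0.55)
]

summary = valence_summary(sample_features)
print(summary)
```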

Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable. This is what the distribution of danceability is within the student data.

There’s a discernible peak from 0.7-0.8, which indicates a fairly high level of danceability.

This is what the danceability distribution of the Hot 100 playlist looks like.

There is a peak from 0.7-0.8 and a heavy majority of the values are above 0.6. Even though both graphs peak in the same place, the Hot 100 distribution is weighted more towards high danceability than the UNSW students’ distribution. Perhaps this is related to the fact that UNSW students listen to sadder songs than the average listener.
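One way to probe the hunch that sadder songs are less danceable is to measure, track by track, how valence and danceability move together. The sketch below computes a Pearson correlation coefficient by hand. Treating this as a track-level question is an assumption of the sketch, the feature values are invented, and no such result was computed for this article.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented audio features, NOT the survey data.
sample_features = [
    {"valence": 0.25, "danceability": 0.55},
    {"valence": 0.31, "danceability": 0.72},
    {"valence": 0.64, "danceability": 0.78},
    {"valence": 0.48, "danceability": 0.61},
    {"valence": 0.71, "danceability": 0.80},
]

r = pearson(
    [f["valence"] for f in sample_features],
    [f["danceability"] for f in sample_features],
)
print(f"valence vs danceability: r = {r:.2f}")
```

A value near 1 would mean happier tracks tend to be more danceable, a value near 0 would mean no linear relationship, and with only a few dozen tracks any value should be read with caution.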

Although I found some interesting results, the analysis has some limitations. A major one is the small sample size of 59 students, which is insufficient to generalise to a student body of over 53,000 students. Some of the graphs are also slightly misaligned; please assume the start of each blue box is aligned with a labelled (or inferred) number on the x-axis. Another limitation is that the whole project is based on a single song per person. If I had a set of the songs each student listens to most frequently, I would have a wider range of data. Finally, survey respondents might not always give fully accurate responses, whether deliberately or inadvertently.
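To put the sample-size limitation in numbers, the sketch below works out a textbook 95% margin of error for a proportion estimated from 59 responses out of a population of 53,000. It assumes a simple random sample, which a Facebook survey is not, so the real uncertainty is likely larger. The function name and defaults are my own.

```python
from math import sqrt


def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    n          -- sample size
    population -- size of the population sampled from
    p          -- assumed proportion (0.5 is the worst case)
    z          -- z-score for the confidence level (1.96 for 95%)
    """
    standard_error = sqrt(p * (1 - p) / n)
    # Finite population correction; close to 1 when n is tiny relative to the population.
    correction = sqrt((population - n) / (population - 1))
    return z * standard_error * correction


moe = margin_of_error(n=59, population=53_000)
print(f"Margin of error: about {moe:.1%}")
```

Under those assumptions the margin comes out at roughly 13 percentage points either way, so a finding like "almost half the tracks were released in 2019" could plausibly sit anywhere from the mid-thirties to the low sixties in percentage terms.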

This was an exciting study, and I would love to do more data dives like it. Shout out to Blitz for helping me with the survey and giving me a platform to publish my findings. Shout out as well to Spotify’s API for helping me gather the data, and to Python and scikit-learn for helping me analyse and visualise it.

