Ben’s Blog

The New Jazz Album Data Visualization (Portfolio Edition)


For the previous version of this visualization, see my old blog post here. I wanted to improve both the quality of the code and the visualization itself; see this post for more details. In short, this is a visualization of Wikipedia articles on jazz albums, clustered using TF-IDF features and k-means. Colors represent k-means clusters in the 3000-dimensional TF-IDF feature space, and the size of each data point represents how often the album is referenced across a corpus of more than 100,000 articles on jazz artists, songs, genres, or other albums.
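To make the TF-IDF plus k-means idea concrete, here is a minimal standard-library sketch. It is not the project's actual code: the function names (`tfidf_vectors`, `kmeans`), the whitespace tokenizer, the smoothed IDF formula, the farthest-point centroid initialization, and the four toy "articles" are all assumptions chosen for illustration. Only the 3000-feature cap mirrors the description above.

```python
import math
from collections import Counter


def tfidf_vectors(docs, max_features=3000):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for raw text docs."""
    tokenized = [doc.lower().split()for doc in docs]  # naive tokenizer (assumption)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Keep the terms found in the most documents, capped at max_features.
    vocab = sorted(df, key=lambda t: (-df[t], t))[:max_features]
    n_docs = len(docs)
    # Smoothed inverse document frequency (one common variant, an assumption).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=20):
    """Plain k-means; returns one cluster label per input vector."""
    # Deterministic farthest-point initialization (an assumption, used here
    # instead of random seeding so the toy example is reproducible).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy stand-ins for article text (hypothetical data).
docs = [
    "bebop saxophone quartet bebop",
    "bebop saxophone trumpet quintet",
    "fusion electric synthesizer funk",
    "fusion electric guitar funk fusion",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)
```

In the visualization, each cluster label would map to a color. With real article text you would want a proper tokenizer and stop-word handling, which this sketch leaves out.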

[Image: jazz-albums-1900]

The Python source for this project is available here.
