Category: Python
-
Explore Your Music Library with Python
Hi all! ’Tis the season for Spotify Wrapped, that is, if you subscribe to Spotify. I used to, but I left for two reasons: the algorithm was not good enough at making recommendations that stuck with me long-term, and their lossless audio option kept getting delayed. However, this means I miss out on the annual… Read more
-
Deployment-Ready Flask App
For me, deployment is one of the most exciting parts of the development process: knowing code is live and serving a real-world purpose. Unfortunately, while developing, the gulf between my local development branch and a live production environment feels oceanic. That’s why planning for deployment from the start can help accelerate your development! So today,… Read more
-
Searching for Jazz
I may have mentioned in one of my first posts that one of my long-term goals is to create a music discovery/recommendation algorithm. The music streaming apps are great at managing playlists and streaming music. But I want a seamless experience for discovering new music. And, without having to listen to the full song, I… Read more
-
Deducing Poets by their Poetry
In case you didn’t read my tutorial on predicting jazz genres with extreme gradient boosting, check it out now! In short, I used Wikipedia articles on jazz albums to predict the genre of the albums. The method was over 80% accurate on the test set. So it gave a very reliable way of extrapolating genre data… Read more
-
Tracking E-Book Downloads from Project Gutenberg
Background on my work tracking e-book downloads from Project Gutenberg. Update: The code is still available, but I have turned off the cron job to live update the data. In the era of e-readers and digital subscriptions, Project Gutenberg is an online library publishing great e-books for free. Their selection emphasizes classic literature from around… Read more
-
Predicting Jazz Genres with XGBoost Classifiers
Introduction Welcome to the first tutorial post on my blog! Any post that I tag “Tutorial” should suit a wider audience, and I hold it to a higher standard for reproducibility. I want to eliminate the problem of developers searching for answers in documentation, only to find that things don’t quite line up with how their system is set… Read more
-
The Best Git Repository of All Time
In this post, I’m going to proselytize for open source software and what I’m calling The Best Git Repository of All Time. I will preview how it’s going to help me deploy a full-stack web application. The Problem: The reason I’ve been so focused on web-crawling in recent blogs is that I have a problem. There’s a… Read more
-
Sci-Fi Novel Data Visualization
I collected Wikipedia articles for nearly 900 science fiction novels, and clustered them with k-means on TF-IDF features. The dimensions are reduced for visualization via t-SNE, the colors represent the k-means clusters, and the dot size represents the prominence of an article among the other data. Read more about the methodology in this post.… Read more
-
Is Jazz Dead?
Background Recently, I attended a concert by Marcos Valle put on by the record label Jazz Is Dead. It reminded me to make good on my promise to upgrade my jazz data projects by extracting new data from the wiki pages. The main obstacle to doing this at the time I extracted the data was… Read more
-
A Wiki Crawling Reflection: The Return of L. Ron
Introduction Hey all! Maybe you’ve seen my previous post on visualizations of jazz-themed Wikipedia articles. I also posted the code for the project here on GitHub. I just wanted to go behind-the-scenes a little on how that project works and brainstorm some of the other things you can do with it! Some of this, I… Read more