Hacker News
Mapping Your Music Collection (christianpeccei.com)
99 points by rbanffy on March 13, 2015 | 22 comments


This is a really interesting article. A problem I've discussed on HN before is the "dynamic playlist."[0] Creating a music map like this provides a complete and elegant solution to this problem. Choose starting and ending tiles on the map, find any path between those tiles and you have your playlist. I might try my hand at implementing this soon.
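As a rough sketch of the path idea, assuming the map is a rectangular grid with one track per tile and that moving between side-by-side tiles counts as a smooth transition: a breadth-first search returns one shortest path between the two chosen tiles. The grid contents and the `playlist_path` name are made up for illustration and are not from the article.

```python
from collections import deque

def playlist_path(grid, start, goal):
    """Return the tracks along a shortest 4-neighbor path from start to goal.

    grid is a list of rows of track names; start and goal are (row, col).
    Returns an empty list if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}  # maps each visited tile to its predecessor
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    if goal not in came_from:
        return []
    # Walk back from the goal to the start, then reverse.
    path = []
    cell = goal
    while cell is not None:
        path.append(grid[cell[0]][cell[1]])
        cell = came_from[cell]
    return path[::-1]

# Hypothetical 3x3 map; the track names are placeholders.
tiles = [
    ["track_a", "track_b", "track_c"],
    ["track_d", "track_e", "track_f"],
    ["track_g", "track_h", "track_i"],
]
playlist = playlist_path(tiles, (0, 0), (2, 2))
```

Any path works for the playlist; the shortest one is just the easiest to compute. A longer playlist could come from a random walk or from routing through chosen intermediate tiles.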

In the past I had focused on using listener-generated data from sites like Last.fm, but if it turns out that musical similarity can be accurately determined using statistical methods, that makes the whole thing much, much simpler.

[0] https://news.ycombinator.com/item?id=7630167


I haven't really looked into it yet, but it seems that the "statistical features" here are mostly about replay-gain, which makes the whole thing much less interesting and isn't a "complete solution" whatsoever.




Thank you, that's much better... didn't think to check there.


I was exploring this area recently, in a more specific way: I'd like to try to implement a DJ bot. But I am not sure machine learning is the best way: I would need a very big set of mix playlists by professional DJs to train a machine against. And then, if it worked, it would simply ape DJs. What I want to do is different: a "bot-flavored" playlist generator. A DJ that pushes tracks one after the other in an interesting way, but not in a human way. For instance, my bot-jockey could rightly decide to play a classical music track just after some deep house, because some features match. And this could actually be a good choice that a human DJ would not have made.

By the way, in the article, they avoid using external data, but for my project, I am trying to use AcousticBrainz data, and also explored Echonest's. Not sure where it will bring me though.
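To make the "features are matching" idea concrete, here is a minimal sketch, assuming each track has already been reduced to a small numeric feature vector. The bot greedily plays whichever unplayed track is closest to the current one, ignoring genre entirely. The track names, the three feature values per track, and the function names are all invented for illustration; real vectors would come from whatever source is used (AcousticBrainz, for example).

```python
import math

def next_track(current, candidates, features):
    """Pick the candidate whose feature vector is closest to the current track."""
    return min(candidates, key=lambda name: math.dist(features[current], features[name]))

def bot_playlist(start, features, length):
    """Greedy nearest-neighbor chain: never repeats a track, ignores genre labels."""
    playlist = [start]
    remaining = set(features) - {start}
    while remaining and len(playlist) < length:
        chosen = next_track(playlist[-1], remaining, features)
        playlist.append(chosen)
        remaining.remove(chosen)
    return playlist

# Made-up feature vectors (not real measurements), one tuple per track.
features = {
    "deep_house_1": (0.80, 0.20, 0.50),
    "classical_1": (0.75, 0.25, 0.45),
    "rock_1": (0.10, 0.90, 0.90),
    "ambient_1": (0.60, 0.10, 0.30),
}
mix = bot_playlist("deep_house_1", features, 3)
```

With these invented numbers the classical track follows the deep house one simply because their vectors are close, which is the kind of non-human choice described above.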


that reminds me of this project: MusicBox: Mapping and visualizing music collections (http://thesis.flyingpudding.com/)

there is a commercial product for unique ID provided by http://the.echonest.com/


One of the people involved in Echo Nest has put together (roughly speaking) multi dimensional genre maps based on their data:

http://www.furia.com/page.cgi?type=log&id=377#id377

http://everynoise.com/engenremap.html


It also reminded me of Elias Pampalk's Islands of Music project: http://www.ofai.at/~elias.pampalk/music/


Nice find. I remember seeing that. I just was amazed by the MusicBox. :)


While the technique and premise are interesting, there are still some oddities, such as Beethoven in the middle of rock songs (http://i.imgur.com/HgwuREl.png). I guess there are others.

I wonder how this would compare with Apple Genius, which I never used.


I get an "over quota" error :(


Now I'm getting it ;(


It's back on line now


And back offline now!


It seems to be running on Google App Engine. It's probably hitting the daily budget.


note to self: never, ever run anything on Google App Engine.


It's more like "set your budgets accordingly". I run a couple of sites there, and only once did I hit a quota limit: on a site where I had not set a budget, which then exceeded the free tier's bandwidth limit. I like it especially for students because it encourages good practices.


Any chance on releasing the code as a single script, so I don't need to copy-paste the snippets you posted and fill in the blanks myself?


So many cool things you can do with numpy and machine learning in Python; this article makes me determined to play with it.

However, I don't know where to start. I'm not particularly good at math, calculus or statistics. To me, machine learning and numpy still sound like stuff only scientists are qualified to use, people with PhDs. Regardless, I'm amazed by stuff like this article and inclined to venture into it.


Shoot me an email if you find a starting point. Amazed as you are and in the same spot regarding lack of skills.


I just dived into some of Udacity's data science classes and it's very easy to get started with numpy. I'm just scratching the surface but it still feels amazingly powerful. Pandas is another great tool that goes hand in hand with data; it's like Excel inside your Python REPL with SQL tools and more.

An online class is one way to start; another would be picking a dataset and a goal for it and diving in. Kaggle.com has a number of content projects and a great deal of past data science stuff that's open sourced; I've been browsing it for ideas and approaches. Another source of inspiration might be local open data initiatives: your city, county, or state might have a pile of data available for interesting projects.

I think half the battle, at least for me, is not deep statistics or other maths skills, it is just diving in and trying stuff.



