MusicMiner

Introduction

This document gives a brief introduction to the technology and source code of the Databionic MusicMiner for interested developers. More information can be found in the Javadoc and the source itself. If you have any questions or ideas for new features, please contact us on the mailing lists.

Compiling

This section describes what needs to be done to compile and set up MusicMiner. The description covers Linux, Windows, and OS X Tiger. It might work on other operating systems as well; please report any experiences.

Environment variables	Many of the following steps require setting environment variables. Windows: Open the Control Panel, then the System icon, then the Advanced tab. Press the Environment Variables button and add new variables or change existing ones, e.g. enter `MUSICMINER_HOME` as the name and the install path as the value. Linux, OS X: Add e.g. `export MUSICMINER_HOME=~/musicminer` to the file `~/.bash_profile` if you use the bash shell, or `setenv MUSICMINER_HOME ~/musicminer` to the file `~/.cshrc` if you use tcsh.
Java	Download and install the JDK (version 1.5 or later) for Linux, Windows, or OS X. Set the environment variable `JAVA_HOME` to the installation path; for OS X this is `/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Home`. Add `JAVA_HOME/bin` to the `PATH` environment variable.
Maven	Download and unpack or install the Maven 1.0.2 build tool. Note: Do not use Maven 2.0; it does not work well with the Torque plugin yet. Set the environment variable `MAVEN_HOME` to the installation path. Add `MAVEN_HOME/bin` to the `PATH` environment variable. Install the Torque plugin: `maven plugin:download -DartifactId=maven-torque-plugin -DgroupId=torque -Dversion=3.2`. More help is available in the Maven quickstart guide.
MusicMiner	Get the latest sources from CVS, e.g. with `cvs -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/musicminer login` followed by `cvs -z3 -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/musicminer co musicminer`. Set the environment variable `MUSICMINER_HOME` to the installation path, i.e. the folder that contains the `src` folder. Add `MUSICMINER_HOME/bin` to the `PATH` environment variable. More help is available in the CVS manual.
Audio tools	The sox program and one or more of mplayer, lame, and mpg123 are required by MusicMiner. Windows: The binaries are included. Linux: See the installation instructions. OS X: Install sox, mplayer, and mpg123 via the Fink project. Comprehensive documentation is to be found at the Fink website. Please ensure that you enable the unstable branch of Fink and compile the latest packages from source where necessary. The version of lame supplied by Fink is deprecated. The most recent binaries and associated libraries can be found here. In order to audition tracks directly from a MusicMiner map or playlist, it is currently necessary to install the XMMS multimedia system via Fink. Please check the Fink F.A.Q. for help with setting up XMMS.
Compiling	Change to the `src` folder and call `maven jar` to compile the sources. For repeated compiling it is faster to start `maven console` and type `jar` or any other Maven command at the prompt that appears.
Installing	Call `maven install` to copy all files necessary to run MusicMiner to `MUSICMINER_HOME`. Call `mmdb -c` to create the database. Call `mmdb -i` to see whether MusicMiner is working. Only on OS X: Launch X11 (required by XMMS). Call `mminer` to launch the MusicMiner GUI. See the user manual on how to add songs to the database.
Documentation	Optionally call `maven javadoc` to create the source code documentation. Optionally call `maven site:generate` to create this HTML documentation.
Misc	The file `etc/musicminer.conf` contains settings you might want to change, e.g. the log4j settings for debugging or the database user and password. If you really want to change the database schema (this may break a lot of things!), edit the file `src/schema/musicminer-schema.xml` and call `maven torque` to regenerate the *.sql files and the data access classes. Then create the database again, and be prepared to lose all database contents!

If something doesn't work as described above, please ask for assistance; we might have forgotten to mention something.
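
Before starting a build it can help to confirm that the variables named in the steps above are actually visible to Java. The following is a minimal sketch, not part of MusicMiner: the class `EnvCheck` and its method `missing` are hypothetical names introduced for this example, and the check only tests that each variable is set, not that its path is valid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of MusicMiner: reports unset build variables. */
final class EnvCheck {

    /** Variable names taken from the setup steps in this section. */
    static final String[] REQUIRED = {"JAVA_HOME", "MAVEN_HOME", "MUSICMINER_HOME"};

    /** Returns the names from REQUIRED that are missing or blank in env. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<String>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // System.getenv() returns the environment of the current process.
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All required variables are set.");
        } else {
            System.out.println("Not set: " + missing);
        }
    }
}
```

Run it from the same shell or command prompt that will later run `maven`, since variables set in a different session are not visible.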

Audio Features

MusicMiner describes the sound of each song using features extracted from its audio data, an approach directly based on the latest research results on audio similarity. Each song is described by a fixed-length vector of real-valued features. The conversion from *.mp3 to raw audio data is performed using e.g. mplayer and sox. A pure Java processing of the audio files is planned. The actual audio features are then extracted using the Value Series Plugin of the Yale machine learning software. Yale offers a modular processing concept based on XML. We had to extend the Value Series Plugin to fit our purpose, however. These changes are already incorporated into the latest CVS version of the plugin.

Music Maps

The MusicMaps used to visualize the sound space are U-Matrix visualizations of Emergent Self-Organizing Maps based on the extracted audio features. Emergence is the ability of a system to develop high-level structures through the cooperation of many elementary processes. A popular example of an emergent phenomenon is the so-called La Ola in a sports stadium. A large number of people perform the simple task of standing up and cheering for a short moment. In this way a wave rolling through the crowd is formed. The wave is not visible to someone who is part of it, only from a distance. Transferring the principles of self-organization into data analysis leads to Emergent Self-Organizing Maps (ESOM). An ESOM consists of a large number of prototypes describing different sounding points in the sound space. During the training, the prototypes are iteratively adjusted to the structures of the personal music collection. The result is a low-dimensional projection that preserves the perceptual topology of the input space as well as possible.
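
The training idea described above can be sketched as a single online update step. This is a minimal illustration and not the code of the Databionic ESOM Tools: the class `EsomSketch`, its method names, the Euclidean distance, the square neighborhood, and the planar (non-wrapping) grid are all assumptions made for the example.

```java
/** Illustrative online SOM training step; all names here are hypothetical. */
final class EsomSketch {

    /** Squared Euclidean distance between two feature vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns {row, col} of the prototype closest to the input vector. */
    static int[] bestMatch(double[][][] grid, double[] input) {
        int[] best = {0, 0};
        double bestDist = Double.POSITIVE_INFINITY;
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                double d = squaredDistance(grid[r][c], input);
                if (d < bestDist) {
                    bestDist = d;
                    best = new int[] {r, c};
                }
            }
        }
        return best;
    }

    /**
     * Moves the best match and every prototype at most `radius` grid cells
     * away from it a fraction `rate` of the way toward the input vector.
     */
    static void trainStep(double[][][] grid, double[] input, int radius, double rate) {
        int[] bm = bestMatch(grid, input);
        int rowEnd = Math.min(grid.length - 1, bm[0] + radius);
        for (int r = Math.max(0, bm[0] - radius); r <= rowEnd; r++) {
            int colEnd = Math.min(grid[r].length - 1, bm[1] + radius);
            for (int c = Math.max(0, bm[1] - radius); c <= colEnd; c++) {
                for (int i = 0; i < input.length; i++) {
                    grid[r][c][i] += rate * (input[i] - grid[r][c][i]);
                }
            }
        }
    }

    public static void main(String[] args) {
        // A 1x3 map with one-dimensional prototypes, trained on a single input.
        double[][][] grid = {{{0.0}, {10.0}, {20.0}}};
        trainStep(grid, new double[] {9.0}, 0, 0.5);
        System.out.println(grid[0][0][0] + " " + grid[0][1][0] + " " + grid[0][2][0]);
    }
}
```

In a full training run this step would be repeated over all songs for several epochs while `radius` and `rate` shrink, so that the map first orders itself coarsely and then adapts to fine structure.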

The U-Matrix is the canonical display of ESOM. The local differences in sound are displayed at each map position as a height value, creating a 3D landscape of the high-dimensional sound space. The height will be large in areas where no or few songs reside, creating mountain ranges for large perceptual differences in sound. The height will be small in areas with many similar sounding songs. Thus homogeneous groups of music are depicted as valleys. Using the same coloring commonly found on geographical maps, an intuitive visualization of the sound space of your personal music collection is created.
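
The height computation can be sketched as follows: the height at a map position is the average distance between its prototype and the prototypes of its direct neighbors. This is a simplified illustration under stated assumptions, not the implementation in the ESOM Tools: the class `UMatrixSketch` is hypothetical, and the four-cell neighborhood, the Euclidean distance, and the planar grid without wrap-around are choices made for the example.

```java
import java.util.Arrays;

/** Illustrative U-Matrix height computation; all names here are hypothetical. */
final class UMatrixSketch {

    /** Euclidean distance between two prototypes of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns one height per map position: the mean distance from the
     * prototype at that position to its up, down, left and right neighbors.
     */
    static double[][] heights(double[][][] grid) {
        int rows = grid.length;
        int cols = grid[0].length;
        int[][] offsets = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
        double[][] h = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double sum = 0.0;
                int count = 0;
                for (int[] o : offsets) {
                    int nr = r + o[0];
                    int nc = c + o[1];
                    // Positions outside the grid are skipped (no wrap-around).
                    if (nr >= 0 && nr < rows && nc >= 0 && nc < cols) {
                        sum += distance(grid[r][c], grid[nr][nc]);
                        count++;
                    }
                }
                h[r][c] = count == 0 ? 0.0 : sum / count;
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Three similar prototypes and one distant one on a 1x4 map.
        double[][][] grid = {{{0.0}, {1.0}, {1.0}, {10.0}}};
        System.out.println(Arrays.toString(heights(grid)[0]));
    }
}
```

Positions next to the distant prototype receive large heights (a mountain), while positions among similar prototypes receive small heights (a valley).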

We have implemented the latest research results on ESOM in the Databionic ESOM Tools, a software package for scientists interested in using ESOM technology for visualization, clustering, and classification. A simplified version offering fewer options is integrated in the MusicMiner to offer an intuitive way of navigating personal music collections.

Music Database

The database that stores the meta information on songs (such as artist and title), the extracted audio features, and the MusicMaps is accessed through the Torque database abstraction layer. In theory, this allows many different backend databases to be used. Currently we are using the pure Java HSQL database because it can easily be distributed and set up with the MusicMiner installer. If you are interested in integrating support for other databases, e.g. MySQL, please contact us on the mailing lists.

The database schema is designed to be very flexible. Most information, e.g. genres, artists, and albums, is stored in separate tables to reduce redundancy. A song can be in the database without an actual file being present. In fact, a song can also be linked to several audio files at different locations, though this is not heavily used within MusicMiner yet. Playlists can contain songs, albums, and other playlists. Ratings can be stored per user and song or per user and playlist.

Import/Export

The import and export of metadata is based on XML. The dom4j library is used for this purpose. An example XML file looks as follows:

<mm>
  <artist id="197" name="Bob Marley" sortname="Bob Marley"/>
  <artist id="198" name="The Wailers" sortname="The Wailers"/>
  <playlist id="141">
    <name>Legend</name>
  </playlist>
  <media id="2" name="localhost"/>
  <song id="844" playtime="233" position="3">
    <name>Could You Be Loved</name>
    <added>2005-03-23</added>
    <comment></comment>
    <artist ref="197"/>
    <artist ref="198"/>
    <album ref="141"/>
    <user id="3" name="mminer"/>
    <file id="894" size="5282829">
      <url>file:///home/mm/reggae/Bob Marley &amp; The Wailers - 03 - Could You Be Loved.mp3</url>
      <bitrate>320</bitrate>
      <mpegid>1</mpegid>
      <layer>III</layer>
      <frequency>44100</frequency>
      <channelmode>Joint Stereo (Stereo)</channelmode>
      <emphasis>none</emphasis>
      <framecount>8998</framecount>
      <media ref="2"/>
    </file>
  </song>
</mm>

Not all database fields are supported yet. If you are interested in completing the XML support of MusicMiner, please contact us on the mailing lists.
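
Reading such a file can be sketched as follows. MusicMiner itself uses dom4j; to keep this example free of external dependencies it swaps in the DOM parser that ships with the JDK, so it shows the structure of the format rather than MusicMiner's own import code. The class `MmXmlSketch` and its methods are hypothetical, the sample document is a trimmed version of the example above, and the sketch assumes every `song` element has a `name` child.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustrative reader for the export format; all names here are hypothetical. */
final class MmXmlSketch {

    /** Returns one "title: artist, artist" line per song, resolving artist refs. */
    static List<String> songSummaries(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable XML document", e);
        }
        Element root = doc.getDocumentElement();

        // Top-level <artist id="..." name="..."/> elements define the artists.
        Map<String, String> artistNames = new HashMap<String, String>();
        for (Element artist : children(root, "artist")) {
            artistNames.put(artist.getAttribute("id"), artist.getAttribute("name"));
        }

        // Inside a <song>, <artist ref="..."/> points back to those definitions.
        List<String> result = new ArrayList<String>();
        for (Element song : children(root, "song")) {
            String title = children(song, "name").get(0).getTextContent();
            List<String> names = new ArrayList<String>();
            for (Element ref : children(song, "artist")) {
                names.add(artistNames.get(ref.getAttribute("ref")));
            }
            result.add(title + ": " + String.join(", ", names));
        }
        return result;
    }

    /** Direct child elements of parent that have the given tag name. */
    static List<Element> children(Element parent, String tag) {
        List<Element> result = new ArrayList<Element>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (nodes.item(i) instanceof Element && tag.equals(nodes.item(i).getNodeName())) {
                result.add((Element) nodes.item(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<mm>"
                + "<artist id=\"197\" name=\"Bob Marley\"/>"
                + "<artist id=\"198\" name=\"The Wailers\"/>"
                + "<song id=\"844\"><name>Could You Be Loved</name>"
                + "<artist ref=\"197\"/><artist ref=\"198\"/></song>"
                + "</mm>";
        for (String line : songSummaries(xml)) {
            System.out.println(line);
        }
    }
}
```

Note that XML parsers reject a raw `&` in text content, so file names containing it need to be written with `&amp;` in an import file.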

MusicMiner Packages

databionics.mm	All command line applications and the MusicMiner graphical user interface including the definitions for command line options in the `*Options` classes.
databionics.mm.add	Some helper classes for parsing filenames and ID3 tags and merging the information for adding them to the database.
databionics.mm.afe	Audio feature extraction. Includes conversion of audio files and an interface to Yale that performs the feature extraction.
databionics.mm.gui	Everything around the graphical user interface.
databionics.mm.om	The object model as generated by Torque, plus many convenience functions added for easy object-oriented use in the rest of MusicMiner.
databionics.mm.xml	Some helper classes for XML import with dom4j.