M U S E:

MUsic Search Engine

 

Information Storage & Retrieval

CPSC-670

Fall 2000

Chia-Tung Chen

Feihong Wang

Gaurav Maini

Luis Francisco-Revilla


Introduction

MUSE is a digital library built on the mg system. It provides full-text retrieval over a collection of songs encoded as MP3 files that contain the audio, the lyrics, and metadata. The metadata includes title, artist, album, genre, tempo, mood, and situation. Users can search the collection by title, artist, album, lyrics, or all of these at once. This is an improvement over the current state of the Web, where users usually need to conduct separate searches to locate a song’s audio and its lyrics. In addition, MUSE supports exploratory behavior by allowing users to browse the contents by genre, tempo, mood, or situation. This again goes beyond typical Web indexing systems, which provide only limited browsing by genre or artist name.

Motivation

Many people have a treasured and growing MP3 collection. MP3s are the way that people are personalizing their digital audio collections. But these collections grow at such a pace that keeping track of the individual audio pieces within them becomes a slow and painstaking task.

 

Searching for MP3s is a painful job, since the search is generally by filename. We have therefore put together a demo collection of songs in various languages (e.g. English and Spanish), compressed in the MP3 format. We also insert the lyrics and other useful information about each song, such as title, artist, album, genre, tempo, mood, and situation, as text within the MP3 file.

 

We then build a collection from the text extracted from the MP3 files and full-text index it using mg as the backend. This enables the user to search for any particular song within the collection by merely entering a few words from the lyrics of the song, or by entering the title, artist, album, etc.

 

This approach helps the user maintain a sizeable number of MP3 files and retrieve the exact song they are looking for without having to remember the filename or even the title of the song.

What good is a collection from which retrieval is slow and in some cases futile? Clearly this was sufficient motivation to propel us into building an index of MP3 encoded songs.

MP3

MP3 stands for MPEG-1 Audio Layer III. It is an audio-only compression format. MP3 uses two compression techniques, one lossy and one lossless. The lossy compression uses a psychoacoustic model to discard audio components that the human ear cannot perceive. The lossless compression then removes the remaining redundancy using Huffman coding. Typically, an MP3 file is around one-tenth the size of the corresponding uncompressed audio source.
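As a rough check of the one-tenth figure, the arithmetic can be sketched as follows. The CD-quality source parameters (44.1 kHz, 16 bits, stereo) are standard; the 128 kbit/s MP3 bitrate and the four-minute song length are assumptions chosen for illustration, not values taken from the MUSE collection.

```python
# Rough size comparison between uncompressed CD-quality audio and MP3.
# The 128 kbit/s bitrate and the 240-second duration are illustrative
# assumptions, not measurements from the MUSE collection.

SAMPLE_RATE_HZ = 44_100      # CD-quality sampling rate
BITS_PER_SAMPLE = 16
CHANNELS = 2                 # stereo
MP3_BITRATE_BPS = 128_000    # assumed encoding bitrate
DURATION_S = 240             # assumed song length in seconds


def pcm_bitrate_bps():
    """Bitrate of the uncompressed source in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def size_bytes(bitrate_bps, seconds):
    """Size in bytes of a constant-bitrate stream of the given length."""
    return bitrate_bps * seconds // 8


pcm_size = size_bytes(pcm_bitrate_bps(), DURATION_S)
mp3_size = size_bytes(MP3_BITRATE_BPS, DURATION_S)
ratio = pcm_size / mp3_size

print(f"uncompressed: {pcm_size} bytes")
print(f"mp3:          {mp3_size} bytes")
print(f"ratio:        {ratio:.1f} : 1")
```

Under these assumptions the ratio comes out at roughly 11 to 1, which is consistent with the one-tenth figure; other bitrates would give proportionally different ratios.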

 

The basic MP3 compression is a two-pass process with the following steps:

 

·        Break the signal into "frames".

·        Analyze the signal to determine its spectral energy distribution. This step breaks the signal into sub-bands, which can be processed independently for optimal results.

·        The encoding bitrate is taken into account, and the maximum number of bits that can be allocated to each frame is calculated. This step determines how much of the available audio data will be stored, and how much will be left out.

·        The frequency spread for each frame is compared to mathematical models of human psychoacoustics, which are stored in the codec as a reference table. From this model, it can be determined which frequencies need to be rendered accurately, since they'll be perceptible to humans, and which ones can be dropped or allocated fewer bits.

·        Finally, the bitstream is compressed using a Huffman coder.

 


Figure 1. ID3v1 and ID3v2 specifications

 

MP3 files are composed of a series of frames. The metadata is located at the beginning or the end of the file and is encoded as ID3 tags. There are two variants of the ID3 specification, ID3v1 and ID3v2: an ID3v1 tag is a fixed 128-byte block at the end of the file, while an ID3v2 tag is normally placed at the beginning and supports more features. Although the differences between them are considerable, virtually all modern MP3 players can handle files with tags in either format. Figure 1 shows an example of ID3v1 and ID3v2.
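To make the tag layout concrete, here is a minimal sketch of reading an ID3v1 tag. It relies on the standard ID3v1 layout (a 128-byte block at the end of the file starting with "TAG", followed by fixed-width title, artist, album, year, and comment fields and a one-byte genre index). The function name `parse_id3v1` is hypothetical, and this is not the parser used by MUSE. Note that tempo, mood, and situation are not ID3v1 fields, so this sketch does not cover them.

```python
def parse_id3v1(data):
    """Return the ID3v1 fields found in the last 128 bytes of `data`.

    `data` is the full content of an MP3 file as bytes. Returns None when
    no ID3v1 tag is present.
    """
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),       # 30 bytes
        "artist": text(tag[33:63]),     # 30 bytes
        "album": text(tag[63:93]),      # 30 bytes
        "year": text(tag[93:97]),       # 4 bytes
        "comment": text(tag[97:127]),   # 30 bytes
        "genre_index": tag[127],        # index into the ID3v1 genre list
    }
```

In practice the function would be called on the bytes read from a file opened in binary mode; reading only the last 128 bytes of the file would be enough.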

 

 

In addition to the metadata, it is possible to use a tag to insert the lyrics inside the audio file. The lyrics are embedded between the audio and the ID3 tag. The lyrics section begins with "LYRICSBEGIN" and ends with "LYRICSEND", with the lyrics between these keywords. Figure 2 shows the Lyrics3 specification for including lyrics inside MP3 files.

The following simple rules apply to the lyrics inserted between the keywords:

 

·        The keywords "LYRICSBEGIN" and "LYRICSEND" must not be present in the lyrics.

·        The text is encoded with the ISO-8859-1 character set.

·        A byte in the text must not have the binary value 255.

·        The maximum length of the lyrics is 5100 bytes.

·        Newlines are made with CR+LF sequence.
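The rules above can be sketched as a small extractor and validator. This is an illustrative sketch only: the function names `extract_lyrics` and `check_lyrics` are hypothetical, and the code assumes the whole file is available as bytes. It is not the extraction code used to build the MUSE collection.

```python
BEGIN = b"LYRICSBEGIN"
END = b"LYRICSEND"
MAX_LYRICS_BYTES = 5100


def extract_lyrics(data):
    """Return the lyrics embedded in `data` (bytes of an MP3 file), or None."""
    start = data.rfind(BEGIN)          # the lyrics block sits near the end
    if start == -1:
        return None
    start += len(BEGIN)
    end = data.find(END, start)
    if end == -1:
        return None
    # The text is ISO-8859-1 with CR+LF line breaks.
    return data[start:end].decode("latin-1").replace("\r\n", "\n")


def check_lyrics(text):
    """Return a list of rule violations for `text`; empty if it is valid."""
    normalized = text.replace("\r\n", "\n").replace("\n", "\r\n")
    try:
        raw = normalized.encode("latin-1")
    except UnicodeEncodeError:
        return ["text cannot be encoded as ISO-8859-1"]
    problems = []
    if BEGIN in raw or END in raw:
        problems.append("text contains a reserved keyword")
    if b"\xff" in raw:
        problems.append("text contains a byte with value 255")
    if len(raw) > MAX_LYRICS_BYTES:
        problems.append("text is longer than 5100 bytes")
    return problems
```

The validator converts line breaks to CR+LF before measuring the length, since that is the form in which the text would be stored.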

 

Collection construction

Constructing the collection involved two steps. First, the MP3 files were parsed and all tags were extracted. An MP3 file can be broadly divided into audio compressed with the MP3 algorithm and uncompressed text that makes up the ID3 tag and the Lyrics3 tag.

 

Our approach to building the collection is to extract this uncompressed textual information and full-text index it using mg. In addition, mg compresses the text, which reduces storage requirements, although the text embedded in an MP3 file is insignificant in size compared to the audio data.

 

 

 

 

     Figure 2. Lyrics specifications 

 

The following is an example of the uncompressed text that we modify in order to build our collection:

 

Title: Always

Artist: Bon Jovi

Album: Cross Road – The Best of Bon Jovi

Genre: Rock

Tempo: Moderate

Mood: Morose

Situation: Heartbreak

 

Lyrics:

This romeo is bleeding, but you can't see his blood

Its nothing but some feelings, that this old dog kicked up

Its been raining since you left me, now I'm drowning in the flood

You see I've always been a fighter, but without you, I give up

Now I can't sing a love song, like the way its meant to be

Well, I guess I'm not that good anymore

But baby, thats just me

 

NOTE: Only a part of the complete lyrics is provided in the above example

 

In the next step, we mark up the above information in HTML and provide the HTML document to mg for compression and indexing.

 

To enable searching by title, artist, album, genre, etc., we break up the text of each ID3 field into individual words, prefix each word with 777tagname777, and place the result in the comment section of the HTML document. For example, the artist name “Bon Jovi” is stored as:

 

777artist777Bon

777artist777Jovi

 

This approach enables searching with partial matches on the text associated with a tag. In the above example, the document will be retrieved if the user searches on artist=Bon or on artist=Jovi. When the user chooses to search by artist, the search text is broken down into individual words, each word is prefixed with 777artist777, and the result is passed to mgquery as a ranked query. An identical approach is used for the other tags. If the user searches by lyrics, the text entered by the user is passed to mgquery as is, with no prefixing.
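The prefixing scheme can be sketched in a few lines. The function names `prefix_terms` and `build_query` are hypothetical, and splitting on whitespace is an assumption about how words are separated; the MUSE server itself is written in C, so this is only an illustration of the idea.

```python
def prefix_terms(tag, text):
    """Split `text` into words and prefix each one with 777<tag>777."""
    return ["777{0}777{1}".format(tag, word) for word in text.split()]


def build_query(field, user_text):
    """Build the query text for a search restricted to `field`.

    Lyrics searches are passed through unchanged; every other field is
    rewritten into prefixed terms so that it only matches the
    corresponding tag words stored in the HTML comment.
    """
    if field == "lyrics":
        return user_text
    return " ".join(prefix_terms(field, user_text))


# Terms written into the HTML comment at indexing time.
print(prefix_terms("artist", "Bon Jovi"))

# Query text produced at search time.
print(build_query("artist", "Jovi"))
print(build_query("lyrics", "romeo is bleeding"))
```

Because the same function produces both the indexed terms and the query terms, a query for a single word of a multi-word value, such as "Jovi", matches the stored term for that word.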

 

A final step is the addition of links to the MP3 files and to the lyric files. These links allow the user to retrieve the audio file or the full lyrics.

 

Using the approach described above, the sample MP3 text marked up in HTML looks as follows:

 

<!-- >777title777Always

     777artist777Bon

     777artist777Jovi

     777album777Cross

     777album777Road

     777album777-

     777album777The

     777album777Best

     777album777Of

     777album777Bon

     777album777Jovi

     777genre777Rock

     777tempo777Moderate

     777mood777Morose

     777situation777Heartbreak -->

<p>

<font color="#0000FF" size="4" face="Courier New">

<a href="http://www.csdl.tamu.edu/~l0f0954/mp3/Always.mp3">

<img src="http://www.csdl.tamu.edu/~l0f0954/academic/cpsc670/listen.gif" border="0"></a>

<a href="http://www.csdl.tamu.edu/~l0f0954/mp3/lyrics/Always.html">

<img src="http://www.csdl.tamu.edu/~l0f0954/academic/cpsc670/lyric.gif" border="0"></a>

Always

</font>

<br>

<font size="2" face="Courier">

&nbsp;&nbsp;&nbsp;&nbsp;

Bon Jovi

<br>

&nbsp;&nbsp;&nbsp;&nbsp;

Cross Road - The Best of Bon Jovi

</font>

</p>

<pre>

This romeo is bleeding, but you can't see his blood

Its nothing but some feelings, that this old dog kicked up

Its been raining since you left me, now I'm drowning in the flood

You see I've always been a fighter, but without you, I give up

Now I can't sing a love song, like the way its meant to be

Well, I guess I'm not that good anymore

But baby, thats just me

</pre>

 

Each MP3 file now corresponds to one HTML file. A single HTML file is one document in the collection. A directory containing the HTML files is provided to mgbuild to build the collection.

System

The system design is based on a straightforward architecture, as shown in Figure 3. Client-server communication is accomplished using CGI technology. The CGI program executes mg on demand, sending and receiving information through the file system.


 


Figure 3. System architecture.

 

This architecture was selected over the alternatives for its simplicity and for its wide use on the World Wide Web. The system is accessible at the URL:

http://www.csdl.tamu.edu/~l0f0954/academic/cpsc670/cover.html

Client

The client is implemented using standard Web technology and was optimized to work on Internet Explorer 5. HTML pages, forms, and JavaScript are employed to support interaction with the user. Figure 4 shows the main search page.

 

Figure 4. MUSE main page

 

Server

The server is a CGI program implemented in C. It processes each request as follows:

 

1.      Get query term(s) and search options from the client.

2.      Parse the query term and determine type of query. By default MUSE uses ranked queries. However, if the query term includes characters like “&” or “|” then it automatically switches to boolean queries.

3.      Create input file for mg based on the user's terms and options.

4.      Run mg.

5.      Retrieve and parse output from mg.

6.      Create and format results.

7.      Send query results to the Client.
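Step 2 of the process above, choosing between ranked and boolean queries, can be sketched as follows. The text only says that characters "like" “&” or “|” trigger boolean mode, so treating exactly these two characters as the operators is an assumption, and the function name `query_type` is hypothetical. The actual server is implemented in C.

```python
# Characters assumed to mark a boolean query; the exact set used by MUSE
# is not listed in the text.
BOOLEAN_OPERATORS = "&|"


def query_type(query):
    """Return "boolean" if the query contains a boolean operator, else "ranked"."""
    if any(ch in BOOLEAN_OPERATORS for ch in query):
        return "boolean"
    return "ranked"


print(query_type("always"))
print(query_type("romeo & bleeding"))
print(query_type("rock | pop"))
```

Keeping ranked queries as the default means that users who type plain words never need to know about the boolean syntax.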

 

 

 

 

 

 

 

Using the System

The user starts searching the system by loading the MUSE Search Interface, as shown in Figure 5.

 

Figure 5. MUSE search interface

 

To do a full-text search on the whole collection, the collection field is left as “(all collections)”, which means that MUSE will search by title, artist, album, and lyrics. In this case the user has typed “always”. The results of this query are shown in Figure 6.

 

Figure 6. Search results

 

In Figure 6, the two icons to the left of the title of the song (in blue) are links to the song (MP3 files) and the complete lyrics of the song (HTML files).

 

The interface also supports browsing by catalogue. To browse by genre, for example Genre=Rock, “Rock” is selected from the genre drop-down menu and the “Search” button is clicked. Figure 7 illustrates this process.

 

Figure 7. Browsing by genre

 

Figure 8. Browsing results

 

To search within a catalogue, the user can enter some text in the text field to the left of the “Search” button.

 

Figure 9. Browsing results