Monday, September 07, 2009

Windows Media to MP3 Conversion for Mac OS X and Linux


For the past couple of years, my girlfriend has been amazingly (astonishingly) patient about a whole slew of .wma files that we've got on the network drive: backups of her CD collection made when she was a Windows user. We managed to save them right before the computer died, but she hasn't been able to listen to them when she's booted into Ubuntu or Mac OS X.

Late last month, after getting back from two weeks abroad, Marjorie said that she'd really like to have access to her music collection again (the CDs are cumbersome and stored away in boxes for our impending move back to Colorado). So I did some digging around and found some immediately helpful links (two years ago, a few Google searches had turned up results suggesting that too much effort was involved).

I started out by trying a couple of free Mac OS X GUI applications, but these ended up being quite horrible: either they did not offer the functionality I wanted, they were buggy to the point of being unusable, or they rendered audio with unlistenable artifacts.

In the end, I had to use mplayer and lame in combination. After googling around and some trial and error, I discovered the combination of mplayer options that would successfully extract the audio data from .wma files and dump them as .wav files.

I started with a shell script, but quickly changed to Python, since there were several locations for the .wma files, and none of them on nice paths. I've used this script several times since then, when more .wma files were discovered, and have yet to encounter any issues in sound quality. One nice-to-have would be to extract .wma metadata and save it in the new .mp3 files as ID3 tags...
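For the tag-writing half of that wish, lame can set ID3 fields itself through its --tt, --ta, --tl, --ty and --tn options. Below is a minimal sketch, not part of the script, that turns a dictionary of tags into those options. The id3_flags helper and the dictionary keys are names made up for this illustration, and it assumes the tags have already been read from the .wma file by some other means, which is the part that remains unsolved here.

```python
# Hypothetical mapping from tag names (chosen for this sketch) to the
# lame command-line options that set the corresponding ID3 fields.
ID3_FLAGS = (
    ("title", "--tt"),
    ("artist", "--ta"),
    ("album", "--tl"),
    ("year", "--ty"),
    ("track", "--tn"),
)


def id3_flags(tags):
    """Return a list of lame options for every tag present in `tags`."""
    flags = []
    for key, flag in ID3_FLAGS:
        value = tags.get(key)
        # skip tags that are missing or empty
        if value:
            flags.extend([flag, str(value)])
    return flags


if __name__ == "__main__":
    # made-up tag values, purely for illustration
    example = {"title": "Some Song", "artist": "Some Artist", "track": 3}
    print(id3_flags(example))
```

The resulting list would be spliced into the lame command line ahead of the input and output filenames.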

Anyway, here's the code:

#!/usr/bin/python
import os
import re
import subprocess
import sys

# script configuration
if sys.platform == "darwin":
    MPLAYER_PATH = os.path.join(
        "/Applications/Non-Standard/Audio and Video",
        "MPlayer OS X 2.app/Contents/Resources/mplayer.app/Contents/MacOS")
    # lame was manually installed into /usr/bin
    LAME_PATH = "/usr/bin"
# "linux2" on Python 2, "linux" on Python 3
elif sys.platform.startswith("linux"):
    MPLAYER_PATH = "/usr/bin"
    LAME_PATH = "/usr/bin"
MPLAYER = os.path.join(MPLAYER_PATH, "mplayer")
LAME = os.path.join(LAME_PATH, "lame")
DUMP_FILE = "audiodump.wav"
BACKUP_DIR = "wma"
WORKING_DIR = "/tmp"


def make_audio_dump(filename):
    """
    Use mplayer to dump the audio contents of the .wma files as .wav files.
    """
    command = ("\"%s\" -nosound -vo null -vc dummy -af resample=44100 -aid 1 "
               "-ao pcm:waveheader \"%s\"" % (MPLAYER, filename))
    subprocess.call(command, shell=True)


def convert_wav_to_mp3(input, output):
    """
    Use lame to convert the .wav files to .mp3 files. Remove the raw .wav file
    when done.
    """
    command = "\"%s\" -b 256 -h \"%s\" -o \"%s\"" % (LAME, input, output)
    subprocess.call(command, shell=True)
    os.unlink(input)


def convert_wma_to_mp3(wma_filename):
    """
    Given a .wma filename, get a filename for the new .mp3 file based on this,
    convert the original to a .wav and then that to an .mp3 file.
    """
    # raw string, so the backslash reaches the regex engine untouched
    mp3_filename = re.sub(r"\.wma$", ".mp3", wma_filename)
    make_audio_dump(wma_filename)
    convert_wav_to_mp3(DUMP_FILE, mp3_filename)


def has_wma_files(filenames):
    """
    Given a list of filenames, check to see if any of them have the .wma file
    extension. If so, return a true value; otherwise, a false one.
    """
    for filename in filenames:
        if filename.endswith(".wma"):
            return True
    return False


def convert_wma_files(path):
    """
    Walk a given file system directory and all its child directories in order
    to find .wma files. If found, convert them to .mp3 files and back up the
    originals.
    """
    for dir, subdirs, filenames in os.walk(path):
        # we don't want to convert files that have already been converted
        if os.path.basename(dir) == BACKUP_DIR:
            continue
        # if there's nothing to do, move on
        if not has_wma_files(filenames):
            continue
        # define and create the backup dir, if it hasn't been already
        backup_dir = os.path.join(dir, BACKUP_DIR)
        if not os.path.exists(backup_dir):
            os.mkdir(backup_dir)
        for filename in sorted(filenames):
            # on mac os x samba shares, sometimes ._*.wma files are present;
            # skip these
            if filename.startswith("."):
                continue
            if filename.endswith(".wma"):
                # parenthesized so it runs under both Python 2 and Python 3
                print("Dumping audio for %s ..." % filename)
                wma_filename = os.path.join(dir, filename)
                wma_backup = os.path.join(backup_dir, filename)
                convert_wma_to_mp3(wma_filename)
                os.rename(wma_filename, wma_backup)


if __name__ == "__main__":
    # resolve the path before changing directory, so that a relative
    # argument still points at the right place
    path = os.path.abspath(sys.argv[1])
    os.chdir(WORKING_DIR)
    convert_wma_files(path)
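
One caveat about the not-so-nice paths: the script wraps each filename in double quotes and hands the whole string to the shell, which copes with spaces but would still trip over a filename containing a double quote, a dollar sign or a backtick. A possible variation, sketched here as an untested alternative rather than what the script does, is to pass each command to subprocess.call as a list so that no shell is involved. The helper names mplayer_command, lame_command and run are invented for this sketch; the flags are copied unchanged from the script.

```python
import subprocess


def mplayer_command(mplayer, wma_path):
    """Build the audio-dump command as an argument list (same flags as the script)."""
    return [mplayer, "-nosound", "-vo", "null", "-vc", "dummy",
            "-af", "resample=44100", "-aid", "1",
            "-ao", "pcm:waveheader", wma_path]


def lame_command(lame, wav_path, mp3_path):
    """Build the encoding command as an argument list (same flags as the script)."""
    return [lame, "-b", "256", "-h", wav_path, "-o", mp3_path]


def run(command):
    """Run a command list without a shell and return its exit status."""
    # Without shell=True, each list item reaches the program as exactly one
    # argument, so quotes, spaces and dollar signs in paths need no escaping.
    return subprocess.call(command)
```

With helpers like these, make_audio_dump would reduce to run(mplayer_command(MPLAYER, filename)), and convert_wav_to_mp3 would change in the same way.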


Hope someone else finds this useful and their significant others don't have to wait 2 years for their music!


2 comments:

Unknown said...

Thanks, this has been very helpful!

ΤΖΩΤΖΙΟΥ said...

re "Shell script" and "nice paths": it helps a lot if at the beginning of your bash/dash/sh/ksh script you insert a:

IFS="<tab>
"

(that is: quote, actual tab character, newline, quote)

From that point on, your shell script will choke on filenames only if they contain <tab> or <lf> characters.