Ubuntu – Bulk export/embed album art in Banshee

banshee

I have a well-structured music library in Banshee. For years I just used folders, so I've always been pretty good at maintaining a strict filing system. I say this not to brag (it did waste a lot of my time, after all) but to explain that my end goal should be possible.

Until Banshee, I never really had any use for album art, so when I started using it, I used its Album Art Finder to (painstakingly) go through all 8000-odd albums. My understanding is that Banshee has these files squirrelled away in a cache directory somewhere with a meaningless name attached to them.

I've recently moved into the world of the Squeezebox. It's awesome but I'm having problems getting it to see the existing album art because Banshee has it locked away in its own directories rather than putting it in the "right" place.

So I'm looking for one of two solutions, both parsing Banshee's database to:

  1. Preferred: Copy the art file out as /artist/album/cover.jpg (the Squeezebox server will understand this).
  2. Embed the art into each MP3/FLAC/OGG/etc (this requires all formats to support blob metadata)

Edit: Just found all the art in ~/.cache/media-art with names like album-f952aa94b80de0b31b8979d70d5605e2.jpg as I suspected.

If there's a good way of correlating "f952aa94b80de0b31b8979d70d5605e2" to an artist, that's what I'm really after.

Best Answer

Based on the MD5 lookup in Oli's script (thanks!), I've written a Python script that uses the eyeD3 module to find MP3s, look up the album artwork from Banshee's cache, and embed the artwork inside the MP3s. It skips any files that already have artwork embedded.

It's not perfect, but it worked on about 90% of my MP3s, and you can manually handle any exceptions using EasyTag. As it stands, the script expects the MP3s to be two directory levels deep from the target directory (music root/artist/album). Once it's done, the script prints a report highlighting any files it couldn't process or for which it couldn't find artwork.
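
The cache lookup the script depends on can be tried on its own, without eyeD3. This is a minimal sketch written for Python 3; it assumes, as the script does, that the cache file name is the MD5 of the artist and album joined by a tab character. The function names and the `cache_dir` parameter are hypothetical, chosen only for illustration.

```python
import hashlib
import os

# Cache location as reported in the question's edit.
CACHE_DIR = os.path.expanduser("~/.cache/media-art")

def cache_art_name(artist, album):
    """Return the cache file name the script would look for."""
    # Assumption: the key is "artist<TAB>album", hashed as UTF-8.
    key = "%s\t%s" % (artist, album)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return "album-%s.jpg" % digest

def cache_art_path(artist, album, cache_dir=CACHE_DIR):
    """Return the full path of the cached image, or None if it is absent."""
    path = os.path.join(cache_dir, cache_art_name(artist, album))
    return path if os.path.exists(path) else None
```

Calling `cache_art_name` with an artist and album exactly as they appear in the tags gives the name to look for in ~/.cache/media-art. If the file is not there, the tags probably differ from what Banshee hashed.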

You'll need Python 2 and the eyeD3 module installed to run the script (it uses print statements and the older eyeD3.Tag API):

#! /usr/bin/env python

import os, sys, glob, eyeD3, hashlib

CACHE_FILE_PREFIX = os.getenv("HOME") + "/.cache/media-art/album-"

def embedAlbumArt(dir = "."):
    artworkNotFoundFiles = []
    errorEmbeddingFiles = []
    noMetadataFiles = []
    mp3s = findMP3Files(dir)

    for mp3 in mp3s:
        print "Processing %s" % mp3

        tag = eyeD3.Tag()
        hasMetadata = tag.link(mp3)

        if not hasMetadata:
            print "No Metadata - skipping."
            noMetadataFiles.append(mp3)
            continue

        if hasEmbeddedArtwork(tag):
            print "Artwork already embedded - skipping."
            continue

        artworkFilename = findAlbumArtworkFile(tag)

        if not artworkFilename:
            print "Couldn't find artwork file - skipping."
            artworkNotFoundFiles.append(mp3)
            continue

        print "Found artwork file: %s" % (artworkFilename)

        wasEmbedded = embedArtwork(tag, artworkFilename)

        if wasEmbedded:
            print "Done.\n"
        else:
            print "Failed to embed.\n"
            errorEmbeddingFiles.append(mp3)

    if artworkNotFoundFiles:
        print "\nArtwork not found for:\n"
        print "\n".join(artworkNotFoundFiles)

    if errorEmbeddingFiles:
        print "\nError embedding artwork in:\n"
        print "\n".join(errorEmbeddingFiles)

    if noMetadataFiles:
        print "\nNo Metadata found for files:\n"
        print "\n".join(noMetadataFiles)

def findMP3Files(dir = "."):
    # Matches <dir>/<artist>/<album>/*.mp3 only (two levels deep).
    pattern = "/".join([dir, "*/*", "*.mp3"])
    mp3s = glob.glob(pattern)
    mp3s.sort()
    return mp3s

def hasEmbeddedArtwork(tag):
    # True if the tag already carries at least one image frame.
    return len(tag.getImages()) > 0

def findAlbumArtworkFile(tag):
    # Cache files are looked up by the MD5 of "artist<TAB>album".
    key = "%s\t%s" % (tag.getArtist(), tag.getAlbum())
    md5 = getMD5Hash(key)
    filename = CACHE_FILE_PREFIX + md5 + ".jpg"
    if os.path.exists(filename):
        return filename
    return None

def getMD5Hash(string):
    string = string.encode("utf-8")
    md5 = hashlib.md5()
    md5.update(string)
    return md5.hexdigest()

def embedArtwork(tag, artworkFilename):
    # Any failure while adding or writing the image counts as "not embedded".
    try:
        tag.addImage(eyeD3.ImageFrame.FRONT_COVER, artworkFilename)
        return tag.update()
    except Exception:
        return False

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print "Usage: %s path" % (sys.argv[0])
    else:
        embedAlbumArt(sys.argv[1])
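
For the question's preferred option (copying each image out as artist/album/cover.jpg), the same lookup can be used without touching the tags at all. This is a rough sketch for Python 3, not a tested tool: it assumes the library is laid out as music root/artist/album and that the directory names match the artist and album strings Banshee hashed, which will not hold for every album. `export_covers`, `cache_art_path` and their parameters are hypothetical names.

```python
import hashlib
import os
import shutil

def cache_art_path(artist, album, cache_dir):
    """Path the cache is assumed to use: album-<md5 of artist, tab, album>.jpg."""
    key = "%s\t%s" % (artist, album)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, "album-%s.jpg" % digest)

def export_covers(music_root, cache_dir):
    """Copy cached art to <music_root>/<artist>/<album>/cover.jpg.

    Returns (copied, missing): two lists of album directories.
    Albums that already have a cover.jpg are left alone.
    """
    copied, missing = [], []
    for artist in sorted(os.listdir(music_root)):
        artist_dir = os.path.join(music_root, artist)
        if not os.path.isdir(artist_dir):
            continue
        for album in sorted(os.listdir(artist_dir)):
            album_dir = os.path.join(artist_dir, album)
            if not os.path.isdir(album_dir):
                continue
            target = os.path.join(album_dir, "cover.jpg")
            if os.path.exists(target):
                continue  # never overwrite existing artwork
            source = cache_art_path(artist, album, cache_dir)
            if os.path.exists(source):
                shutil.copyfile(source, target)
                copied.append(album_dir)
            else:
                missing.append(album_dir)
    return copied, missing
```

A call such as `export_covers("/path/to/music", os.path.expanduser("~/.cache/media-art"))` returns the albums that received a cover and those with no matching cache file, so the misses can be handled by hand in the same way as the report from the embedding script.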