In my previous post I wrote about foobar2000 and album art, but the FofR Theme can also show artist images. For reasons explained below, and just for fun and to learn Python, I started writing a Python script that downloads artist images from Discogs through their Web API.
There’s already an application for downloading artist images, but the server it depends on is currently offline. Then there’s the Discogs component for foobar2000, which can be used for tagging and for downloading album art and artist images. But the artist images, at least, are named after the Discogs artist ID and not the artist name. That’s not useful for me unless I retag all my audio files to contain this ID, which is something I don’t dare to do.
So this is my first attempt at writing a Python application. I’m very impressed by how easily and quickly it went, combining code from some tutorials (sorry, I can’t remember them all for credit and copyright :/) with some peeks at the reference documentation.
The only complications were charset issues. Firstly, I didn’t know how Python handles special characters internally (it’s Unicode :). Secondly, I was using Cygwin’s Python, which doesn’t seem to handle special characters at all on input or output, as its native charset is set to us-ascii (7-bit characters only).
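The us-ascii versus cp850 difference can be reproduced in a few lines. This is only a small sketch, written for Python 3 (the script in this post is Python 2), and the helper name `try_encode` is made up for illustration:

```python
# Sketch for Python 3; try_encode is a hypothetical helper, not part of the script.
def try_encode(text, charset):
    """Return the encoded bytes, or None if the charset cannot represent the text."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None


name = "DJ \u00d6tzi"  # "DJ Ötzi", written with an escape so the source stays ASCII

# us-ascii only covers 7-bit characters, so the O-umlaut cannot be encoded.
print(try_encode(name, "us-ascii"))

# cp850 (the Windows console charset mentioned in this post) has a byte for it.
print(try_encode(name, "cp850"))
```

The same text object encodes fine or fails depending purely on the target charset, which is why the interpreter's console encoding matters so much for a script that prints artist names.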
Well, and one other source of confusion was using tabs to indent the source code, which resulted in “weird” interpreter errors.
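For what it's worth, newer interpreters are more explicit about this. A minimal sketch, assuming Python 3, which rejects an inconsistent mix of tabs and spaces with a `TabError` (the Python 2 used for this script was more lenient about mixing them, which is where the odd errors came from):

```python
# Sketch for Python 3: one line is indented with a tab, the next with eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
    result = "compiled"
except TabError as err:
    # TabError is a subclass of SyntaxError, so it carries a .msg attribute.
    result = "TabError: %s" % err.msg

print(result)
```

Sticking to spaces only (or tabs only) throughout a file avoids the problem entirely.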
So now I have switched to Windows Python (charset cp850) and all is fine. This script should also run nicely under Linux and the like.
This script is very, very ugly and not at all failsafe. It’s just a starting point and, as I said above, I’m a total beginner in Python. Just for your amusement, here’s the code so far. Expect updates sometime. :)
#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
import urllib2, gzip, cStringIO
import urllib
import re
import xml.sax.handler
import getopt, sys

apikey = "111"
#artistname = u"DJ Ötzi"
stdout_encoding = sys.stdout.encoding or sys.getfilesystemencoding()
fs_encoding = sys.getfilesystemencoding()
print stdout_encoding

def xml2obj(src):
    """
    A simple function to converts XML data into native Python object.
    """

    non_id_char = re.compile('[^_0-9a-zA-Z]')
    def _name_mangle(name):
        return non_id_char.sub('_', name)

    class DataNode(object):
        def __init__(self):
            self._attrs = {}    # XML attributes and child elements
            self.data = None    # child text data
        def __len__(self):
            # treat single element as a list of 1
            return 1
        def __getitem__(self, key):
            if isinstance(key, basestring):
                return self._attrs.get(key,None)
            else:
                return [self][key]
        def __contains__(self, name):
            return self._attrs.has_key(name)
        def __nonzero__(self):
            return bool(self._attrs or self.data)
        def __getattr__(self, name):
            if name.startswith('__'):
                # need to do this for Python special methods???
                raise AttributeError(name)
            return self._attrs.get(name,None)
        def _add_xml_attr(self, name, value):
            if name in self._attrs:
                # multiple attribute of the same name are represented by a list
                children = self._attrs[name]
                if not isinstance(children, list):
                    children = [children]
                    self._attrs[name] = children
                children.append(value)
            else:
                self._attrs[name] = value
        def __str__(self):
            return self.data or ''
        def __repr__(self):
            items = sorted(self._attrs.items())
            if self.data:
                items.append(('data', self.data))
            return u'{%s}' % ', '.join([u'%s:%s' % (k,repr(v)) for k,v in items])

    class TreeBuilder(xml.sax.handler.ContentHandler):
        def __init__(self):
            self.stack = []
            self.root = DataNode()
            self.current = self.root
            self.text_parts = []
        def startElement(self, name, attrs):
            self.stack.append((self.current, self.text_parts))
            self.current = DataNode()
            self.text_parts = []
            # xml attributes --> python attributes
            for k, v in attrs.items():
                self.current._add_xml_attr(_name_mangle(k), v)
        def endElement(self, name):
            text = ''.join(self.text_parts).strip()
            if text:
                self.current.data = text
            if self.current._attrs:
                obj = self.current
            else:
                # a text only node is simply represented by the string
                obj = text or ''
            self.current, self.text_parts = self.stack.pop()
            self.current._add_xml_attr(_name_mangle(name), obj)
        def characters(self, content):
            self.text_parts.append(content)

    builder = TreeBuilder()
    if isinstance(src,basestring):
        xml.sax.parseString(src, builder)
    else:
        xml.sax.parse(src, builder)
    return builder.root._attrs.values()[0]

def downloadartistimage(uri, filename):
    fp = urllib2.urlopen(uri)
    op = open(filename, "wb")
    n = 0
    while 1:
        s = fp.read(8192)
        if not s:
            break
        op.write(s)
        n = n + len(s)
    fp.close()
    op.close()
    for k, v in fp.headers.items():
        print k, "=", v
    print "copied", n, "bytes from", fp.url
    return 0

try:
    opts, args = getopt.getopt(sys.argv[1:], "ha:v", ["help", "artist="])
except getopt.GetoptError:
    # print help information and exit:
    print("no argument given")
    sys.exit(2)

verbose = False
for o, a in opts:
    if o == "-v":
        verbose = True
    if o in ("-h", "--help"):
        print("no argument given")
        sys.exit()
    if o in ("-a", "--artist"):
        artistname = a.decode(fs_encoding)

requesturi = "http://www.discogs.com/artist/%s?f=xml&api_key=%s" % (urllib.quote_plus(artistname.encode('utf-8')), apikey)
print "Requesting: %s" % requesturi
request = urllib2.Request(requesturi)
request.add_header('Accept-Encoding', 'gzip')
response = urllib2.urlopen(request)
data = response.read()
unzipped_data = gzip.GzipFile(fileobj = cStringIO.StringIO(data)).read()
# print(unzipped_data)
data_obj = xml2obj(unzipped_data)
images = data_obj.artist.images

primaryfound = False
bigsecondarysize = 0
for image in images.image:
    print "Type: %s URL: %s" % (image.type, image.uri)
    if image.type == "primary":
        primaryfound = True
        fn = u"%s.%s" % (artistname, image.uri.rpartition('.')[2])
        print u"Downloading primary image as %s from %s".encode(stdout_encoding) % (fn, image.uri)
        downloadartistimage(image.uri, fn)
        continue
    if image.type == "secondary":
        if (image.width + image.height) > bigsecondarysize:
            bigsecondarysize = image.width + image.height
            bigsecondary = image
        continue

if not primaryfound:
    fn = u"%s.%s" % (artistname, bigsecondary.uri.rpartition('.')[2])
    print u"Falling back to secondary as %s sized %sx%s at %s".encode(stdout_encoding) % (fn, bigsecondary.width, bigsecondary.height, bigsecondary.uri)
    downloadartistimage(bigsecondary.uri, fn)

print "All done! :)"
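The selection rule at the end of the script (take the primary image if there is one, otherwise the largest secondary) can be isolated into a small function. This is only a sketch in Python 3 with made-up sample data: `pick_image`, `local_filename`, the `example.invalid` URLs and the first image's dimensions are all hypothetical stand-ins, and plain dictionaries replace the objects that `xml2obj` returns. Since XML attribute values arrive as strings, the sketch converts width and height with `int()` before comparing them, whereas the script adds them exactly as they come from the parser.

```python
# Sketch for Python 3; names and sample data are illustrative assumptions.
def pick_image(images):
    """Return the first primary image, else the largest secondary, else None."""
    best_secondary = None
    best_size = 0
    for image in images:
        if image["type"] == "primary":
            return image
        if image["type"] == "secondary":
            # Attribute values are strings, so convert before doing arithmetic.
            size = int(image["width"]) + int(image["height"])
            if size > best_size:
                best_size = size
                best_secondary = image
    return best_secondary


def local_filename(artistname, uri):
    """Name the file after the artist, keeping the extension from the URI."""
    return "%s.%s" % (artistname, uri.rpartition(".")[2])


sample = [
    {"type": "secondary", "uri": "http://example.invalid/a.jpg",
     "width": "300", "height": "298"},
    {"type": "secondary", "uri": "http://example.invalid/b.jpeg",
     "width": "528", "height": "531"},
]

chosen = pick_image(sample)
print(local_filename("Black Sabbath", chosen["uri"]))
```

Returning `None` when there are no images at all also gives the caller a chance to skip the download instead of failing on an undefined variable.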
And now two usage examples of the script:
C:\Dokumente und Einstellungen\scheff\Eigene Dateien\python\pydiscogs>example-04.py -a "Aphex Twin"
cp850
Requesting: http://www.discogs.com/artist/Aphex+Twin?f=xml&api_key=111
Type: secondary URL: http://www.discogs.com/image/A-45-005.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1094774583.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1097005597.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1098171105.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1107949060.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1122852930.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949071.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949078.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949085.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949091.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1129512422.jpeg
Type: primary URL: http://www.discogs.com/image/A-45-1176664580.jpeg
Downloading primary image as Aphex Twin.jpeg from http://www.discogs.com/image/A-45-1176664580.jpeg
content-length = 141117
set-cookie = sid=5c3847142265e10e296934b877585749; path=/; expires=Sun, 03-Dec-2017 00:07:28 GMT; domain=.discogs.com
server = Apache
connection = close
reproxy-status = yes
date = Thu, 06 Dec 2007 00:07:28 GMT
content-type = image/jpeg
copied 141117 bytes from http://www.discogs.com/image/A-45-1176664580.jpeg
All done! :)

C:\Dokumente und Einstellungen\scheff\Eigene Dateien\python\pydiscogs>example-04.py -a "Black Sabbath"
cp850
Requesting: http://www.discogs.com/artist/Black+Sabbath?f=xml&api_key=111
Type: secondary URL: http://www.discogs.com/image/A-144998-1098725461.jpg
Type: secondary URL: http://www.discogs.com/image/A-144998-1147641856.jpeg
Falling back to secondary as Black Sabbath.jpeg sized 528x531 at http://www.discogs.com/image/A-144998-1147641856.jpeg
content-length = 45353
set-cookie = sid=e47b9acbe8257ca4ad7fe6944a36fef1; path=/; expires=Sun, 03-Dec-2017 00:07:08 GMT; domain=.discogs.com
server = Apache
connection = close
reproxy-status = yes
date = Thu, 06 Dec 2007 00:07:08 GMT
content-type = image/jpeg
copied 45353 bytes from http://www.discogs.com/image/A-144998-1147641856.jpeg
All done! :)

C:\Dokumente und Einstellungen\scheff\Eigene Dateien\python\pydiscogs>