Bembel-B Blog

2009/04/26

My FTPSync.pl Native Windows Build

I needed a Windows command line tool for syncing local files with an FTP server and couldn’t find any lightweight ones. There’s FTPSync.pl for Unix/Linux with a Cygwin build available (seems to be gone now). Instead of using that, I built an EXE of FTPSync.pl with ActivePerl for Windows.

Main reason for not using the Cygwin build is that using different versions of cygwin.dll simultaneously isn’t possible, which could happen easily when for example additionally having a full Cygwin install with running services. Stand alone Cygwin builds need to be shipped with the cygwin.dll which it is linked to.
Also that Cygwin build is outdated.

So, to make short things short, here you can download ftpsync-1.3.01.W32.zip (and my previous build ftpsync-1.2.33.W32.zip).

As far as I remember the build was straight forward. I installed ActivePerl 5.10, added a 3rd party PPM Repository (bribes.org I think) for installing the PAR Packer. Then all I had to do was issuing pp -o ftpsync.pl.exe ftpsync.pl in the directory with the FTPSync.pl sources.
You can reduce the EXE file size using UPX if you need to.
So far the EXE is working without problems.

ChangeLog

[2009-04-27: New build of latest version.]

Advertisements

2007/12/06

Downloading Artist Images from Discogs with Python

In my previous post I told about foobar2000 and album art, but there are also artist images to be shown with the FofR Theme. For reasons explained later and just for fun and learning Python, I started to write a Python script for downloading artist images from Discogs through their Web API.

Discogs Logo
There’s already some application for downloading artist images, but the server it depends on is currently offline. Then there’s the Discogs component for foobar2000, which can be used for tagging and downloading album art and artist images. But at least for the artist images these will be named with the Discogs artist id, and not the artist names. That’s not useful for me, except I’d retag all my audiofiles to contain this ID. A thing I don’t dare to do.

So this is my first approach on writing a Python application. I’m very impressed how easy and fast it went, combining some tutorials’ code (sorry, can’t remember them all for credit and copyright :/) and some peeks at the code reference.
The only complications were charset issues: Firstly because I didn’t know about the inner workings of Python with special chars (it’s Unicode :). Secondly because I was using Cygwin’s Python which don’t seem to handle (output/input) any special chars at all, as its native charset is set to us-ascii (only 7 Bit chars).
Well, and one other confusion came from using tabs to indent the sourcecode, resulting in “weird” interpreter errors.
So now I switched to Windows Python (charset cp850) and all is fine. This script should also run nicely under Linux and alike.

This script is very, very ugly and totally not failsafe. It’s just a starting point and as I told above, I’m a total beginner in Python. Just for your amusement, here’s the code so far. Expect updates sometime. :)

#!/usr/bin/python
# -*- coding: iso-8859-15 -*-

import urllib2, gzip, cStringIO
import urllib
import re
import xml.sax.handler
import getopt, sys

apikey = "111"

#artistname = u"DJ Ötzi"
           
stdout_encoding = sys.stdout.encoding or sys.getfilesystemencoding()
fs_encoding = sys.getfilesystemencoding()
print stdout_encoding

def xml2obj(src):
    """
    A simple function to converts XML data into native Python object.
    """

    non_id_char = re.compile('[^_0-9a-zA-Z]')
    def _name_mangle(name):
        return non_id_char.sub('_', name)

    class DataNode(object):
        def __init__(self):
            self._attrs = {}    # XML attributes and child elements
            self.data = None    # child text data
        def __len__(self):
            # treat single element as a list of 1
            return 1
        def __getitem__(self, key):
            if isinstance(key, basestring):
                return self._attrs.get(key,None)
            else:
                return [self][key]
        def __contains__(self, name):
            return self._attrs.has_key(name)
        def __nonzero__(self):
            return bool(self._attrs or self.data)
        def __getattr__(self, name):
            if name.startswith('__'):
                # need to do this for Python special methods???
                raise AttributeError(name)
            return self._attrs.get(name,None)
        def _add_xml_attr(self, name, value):
            if name in self._attrs:
                # multiple attribute of the same name are represented by a list
                children = self._attrs[name]
                if not isinstance(children, list):
                    children = [children]
                    self._attrs[name] = children
                children.append(value)
            else:
                self._attrs[name] = value
        def __str__(self):
            return self.data or ''
        def __repr__(self):
            items = sorted(self._attrs.items())
            if self.data:
                items.append(('data', self.data))
            return u'{%s}' % ', '.join([u'%s:%s' % (k,repr(v)) for k,v in items])

    class TreeBuilder(xml.sax.handler.ContentHandler):
        def __init__(self):
            self.stack = []
            self.root = DataNode()
            self.current = self.root
            self.text_parts = []
        def startElement(self, name, attrs):
            self.stack.append((self.current, self.text_parts))
            self.current = DataNode()
            self.text_parts = []
            # xml attributes --> python attributes
            for k, v in attrs.items():
                self.current._add_xml_attr(_name_mangle(k), v)
        def endElement(self, name):
            text = ''.join(self.text_parts).strip()
            if text:
                self.current.data = text
            if self.current._attrs:
                obj = self.current
            else:
                # a text only node is simply represented by the string
                obj = text or ''
            self.current, self.text_parts = self.stack.pop()
            self.current._add_xml_attr(_name_mangle(name), obj)
        def characters(self, content):
            self.text_parts.append(content)

    builder = TreeBuilder()
    if isinstance(src,basestring):
        xml.sax.parseString(src, builder)
    else:
        xml.sax.parse(src, builder)
    return builder.root._attrs.values()[0]

def downloadartistimage(uri, filename):
    fp = urllib2.urlopen(uri)
    op = open(filename, "wb")
    n = 0
    while 1:
        s = fp.read(8192)
        if not s:
            break
        op.write(s)
        n = n + len(s)
    fp.close()
    op.close()
    for k, v in fp.headers.items():
        print k, "=", v
    print "copied", n, "bytes from", fp.url
    return 0

try:
    opts, args = getopt.getopt(sys.argv[1:], "ha:v", ["help", "artist="])
except getopt.GetoptError:
    # print help information and exit:
    print("no argument given")
    sys.exit(2)
verbose = False
for o, a in opts:
    if o == "-v":
        verbose = True
    if o in ("-h", "--help"):
        print("no argument given")
        sys.exit()
    if o in ("-a", "--artist"):
        artistname = a.decode(fs_encoding)

requesturi = "http://www.discogs.com/artist/%s?f=xml&api_key=%s" % (urllib.quote_plus(artistname.encode('utf-8')), apikey)
print "Requesting: %s" % requesturi
request = urllib2.Request(requesturi)
request.add_header('Accept-Encoding', 'gzip')
response = urllib2.urlopen(request)
data = response.read()
unzipped_data = gzip.GzipFile(fileobj = cStringIO.StringIO(data)).read()
# print(unzipped_data)

data_obj = xml2obj(unzipped_data)
images = data_obj.artist.images

primaryfound = False
bigsecondarysize = 0
for image in images.image:
    print "Type: %s URL: %s" % (image.type, image.uri)
    if image.type == "primary":
        primaryfound = True
        fn = u"%s.%s" % (artistname, image.uri.rpartition('.')[2])
        print u"Downloading primary image as %s from %s".encode(stdout_encoding) % (fn, image.uri)
        downloadartistimage(image.uri, fn)
        continue
    if image.type == "secondary":
        if (image.width + image.height) > bigsecondarysize:
            bigsecondarysize = image.width + image.height
            bigsecondary = image
        continue

if not primaryfound:
    fn = u"%s.%s" % (artistname, bigsecondary.uri.rpartition('.')[2])
    print u"Falling back to secondary as %s sized %sx%s at %s".encode(stdout_encoding) % (fn, bigsecondary.width, bigsecondary.height, bigsecondary.uri)
    downloadartistimage(bigsecondary.uri, fn)

print "All done! :)"

And now two usage examples:

C:\Dokumente und Einstellungen\scheff\Eigene Dateien\python\pydiscogs>example-04.py -a "Aphex Twin"
cp850
Requesting: http://www.discogs.com/artist/Aphex+Twin?f=xml&api_key=111
Type: secondary URL: http://www.discogs.com/image/A-45-005.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1094774583.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1097005597.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1098171105.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1107949060.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1122852930.jpg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949071.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949078.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949085.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1126949091.jpeg
Type: secondary URL: http://www.discogs.com/image/A-45-1129512422.jpeg
Type: primary URL: http://www.discogs.com/image/A-45-1176664580.jpeg
Downloading primary image as Aphex Twin.jpeg from http://www.discogs.com/image/A-45-1176664580.jpeg
content-length = 141117
set-cookie = sid=5c3847142265e10e296934b877585749; path=/; expires=Sun, 03-Dec-2017 00:07:28 GMT; domain=.discogs.com
server = Apache
connection = close
reproxy-status = yes
date = Thu, 06 Dec 2007 00:07:28 GMT
content-type = image/jpeg
copied 141117 bytes from http://www.discogs.com/image/A-45-1176664580.jpeg
All done! :)

C:\Dokumente und Einstellungen\scheff\Eigene Dateien\python\pydiscogs>example-04.py -a "Black Sabbath"
cp850
Requesting: http://www.discogs.com/artist/Black+Sabbath?f=xml&api_key=111
Type: secondary URL: http://www.discogs.com/image/A-144998-1098725461.jpg
Type: secondary URL: http://www.discogs.com/image/A-144998-1147641856.jpeg
Falling back to secondary as Black Sabbath.jpeg sized 528x531 at http://www.discogs.com/image/A-144998-1147641856.jpeg
content-length = 45353
set-cookie = sid=e47b9acbe8257ca4ad7fe6944a36fef1; path=/; expires=Sun, 03-Dec-2017 00:07:08 GMT; domain=.discogs.com
server = Apache
connection = close
reproxy-status = yes
date = Thu, 06 Dec 2007 00:07:08 GMT
content-type = image/jpeg
copied 45353 bytes from http://www.discogs.com/image/A-144998-1147641856.jpeg
All done! :)

C:\Dokumente und Einstellungen\scheff\Eigene Dateien\python\pydiscogs>

2007/04/29

Preparing MP3s and Cover Art for SanDisk Sansa e200 Portable Audio Player

Last year I purchased a flash based 2 GB DAP SanDisk Sansa e250 to join my good ole 20 GB HDD player Creative Nomad Jukebox Zen Firewire. Sadly the Sansa is quite picky when it comes to ID3 tags.

SanDisk Sansa e200

By experimenting and reading up I found out how to retag my MP3s using Linux (and probably also Windows using Cygwin) so that they all will be fully recognized. I also found a procedure to nicely resize cover scans which can be displayed when playing a corresponding track.

Prerequisites

For ID3 tagging I’m using the command line application eyeD3, as it handles ID3v2.4 and – yes – it’s commandline driven. I made an RPM for Fedora Core 5. The original spec file didn’t work because of using the wrong python version. I’ve put my edited spec file and the RPM into my box.net share.

To handle the cover scans I use ImageMagick.

Preparation

For better performance one should copy the files to be uploaded to your player somewhere on your harddrive.

Copy your MP3s to some working directory and finally cd into it

mkdir -p ~/wrk/mp3new
cd /windows/g/mp3
cp -r _electronic/Aphex\ Twin/Aphex_Twin-...I_Care_Because_You_Do-1995/ /windows/g/mp3
cp -r _metal/Colonel\ Claypool\'s\ Bucket\ of\ Bernie\ Brains\ -\ Big\ Eyeball\ in\ the\ Sky/ /windows/g/mp3
cp -r _metal/Primus/primus-pork_soda-1993/ /windows/g/mp3
cd ~/wrk/mp3new

Retagging

The Sansa preferres ID3v2 tags over ID3v1 and can read up to ID3v2.3 tags. But it doesn’t seem to like some of the special fields possible with ID3v2.3 and therefore these will have to be removed. The following steps may seem a bit complicated, but it’s the only way to do the cleanup with eyeD3 currently.

1.) Convert to v1.x

find -type f -and  \( -name "*.mp3" -or -name "*.MP3" \) -print0 | xargs -0 eyeD3 --to-v1

2.) Remove v2

find -type f -and  \( -name "*.mp3" -or -name "*.MP3" \) -print0 | xargs -0 eyeD3 --remove-v2

3.) Convert to v2.3

find -type f -and  \( -name "*.mp3" -or -name "*.MP3" \) -print0 | xargs -0 eyeD3 --to-v2.3

4.) Remove v1

find -type f -and  \( -name "*.mp3" -or -name "*.MP3" \) -print0 | xargs -0 eyeD3 --remove-v1

To make things easier, I’ll try to condense all this to one single command next time. So stay tuned..

Cover Art

My procedure will only keep the front cover scans, where available. If you’d like to keep other images too, you’d have to follow the alternative procedure 1b/2b instead of 1a/2a.
Please note: even png images will be renamed to folder.jpg ImageMagick will correctly recognize them as pngs when converting.

If you’d prefer not to use folder.jpg for the resized cover images you may instead use Album Art.jpg as filename.

1a.) List all jpegs and png files. Then manually delete all unneeded files!

find -type f -and \( -name "*.jpg" -or -name "*.JPG" -or -name "*.png" \)

2a.) Rename cover images to folder.jpg

OLDIFS=$IFS ; IFS=$'\n' ; \
for fn in `find -type f -and \( -name "*.jpg" -or -name "*.JPG" -or -name "*.png" \) -not -name "folder.jpg"` ; \
do mv -v "$fn" "${fn/`basename "$fn"`/folder.jpg}" ; \
done ; IFS=$OLDIFS

1b/2b.) Alternative: To keep other images as well, you’d have to manually rename (or copy to keep the original) the front cover images to folder.jpg

3.) Proportionally resize cover images to max 200 x 200 px

OLDIFS=$IFS ; IFS=$'\n' ; \
for fn in `find -type f -name folder.jpg` ; \
do mogrify -verbose -resize '200x200>' "$fn" ; \
done ; IFS=$OLDIFS

I must admit this isn’t the most elegant way. Maybe I’ll manage to make some little GUI and trying to automatically chose the front cover image files. We’ll see..

Uploading to the Player

Nothing special here. Just plug in your Sansa in MSC mode. Then move the files to the player.

mv -v * /media/Sansa\ e250/MUSIC

If you’re replacing files on your player, you’ll have to purge the tag database in order to let the player recognize the changes.

rm /media/Sansa\ e250/SYSTEM/DATA/PP5000.DAT

ChangeLog

[070923 Fix backslashes (“\\” markup was meanwhile rendered as “\\” and not “\” as before.]
[080120 Add “Sansa e200” category and link to it.]

Blog at WordPress.com.