Python 2/3 Compatible Source and PyXB

I started work on PyXB just over five years ago. At the time, Python 3.0 had just come out, but was far too new to hassle with, so I made Python 2.4 the minimum required version.

In September 2011 people started to hint they’d like Python 3 support, but it looked like it’d be an awful lot of work, and nobody asked officially, so I just kept it in the back of my mind. In June 2012 the noise was getting harder to ignore, so I logged the request but didn’t take it further.

Over the next year or so PyXB’s unicode support got stronger, and I started understanding exactly how much easier it’d be to do XML with a proper distinction between text (i.e., unicode) and data (i.e., octet sequences). Python 2 did this poorly, but the difference is deeply embedded in Python 3. In September 2013 I finally created a branch for Python 3 off the 1.2.3 release. This involved running 2to3 over the source, then running a second script to fix the resulting errors. This was good enough to make available for folks who could build from the repository, but I couldn’t support packaging a version because converting the source was too complex to run on an end-user’s machine.
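To make the text/data distinction concrete, here is a small sketch of how Python 3 behaves. It is only an illustration, not PyXB code, and the helper name can_concatenate is made up for this example.

```python
# Python 3 behaviour: str holds text (code points), bytes holds data (octets).
text = "na\u00efve"            # five code points, one of them non-ASCII
data = text.encode("utf-8")    # the same content as an octet sequence


def can_concatenate(a, b):
    """Hypothetical helper: report whether a + b is allowed."""
    try:
        a + b
    except TypeError:
        return False
    return True


print(len(text), len(data))
print(can_concatenate(text, data))
```

The two lengths differ because the non-ASCII character takes two octets in UTF-8, and Python 3 refuses to mix the two types silently, which is exactly the discipline XML processing benefits from.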

While investigating an installation problem that ultimately turned out to be a bug in pip, I discovered six. Six is a single module, released under the MIT license, that can be integrated into a Python package to allow the same source code to work under both Python 2 and Python 3. No more running 2to3. No more fixing up the mess 2to3 makes when it changes pyxb.utils.unicode to pyxb.utils.str.
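For a feel of what single-source compatibility looks like, here is a rough hand-rolled sketch of the kind of shim six provides. The names PY3, text_type, binary_type, to_text and to_data are defined locally for illustration; six offers similar constants (six.PY3, six.text_type, six.binary_type), but this is not six's implementation and not what PyXB actually does.

```python
import sys

# Decide once, at import time, which interpreter family we are on.
PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
    binary_type = bytes
else:
    text_type = unicode  # noqa: F821 (name only exists on Python 2)
    binary_type = str


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it if it is an octet sequence."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value


def to_data(value, encoding="utf-8"):
    """Return value as octets, encoding it if it is text."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value
```

The rest of the code then refers only to text_type and binary_type, so the same file imports cleanly on either interpreter with no 2to3 pass.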

As of today, the next branch of PyXB passes all tests using Python version 2.6 up through 3.4.0rc1 without source-code changes. Well, ok, some unit tests fail because whitespace in formatted XML changed in 2.7; the unittest.TestCase.assertRaises context manager feature isn’t handled in 2.6, 3.0, or 3.1; and I haven’t tested 3.0.1 because hacking its configure script so it can build a functional hashlib module on Ubuntu 12.04 isn’t worth the effort. Nonetheless, PyXB itself works fine.
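One way around the assertRaises limitation is the older callable form, which takes the exception class, the callable, and its arguments. The sketch below uses a made-up function parse_positive and a made-up test class; it is not taken from PyXB's test suite.

```python
import unittest


def parse_positive(text):
    """Hypothetical function under test: parse a strictly positive integer."""
    value = int(text)
    if value <= 0:
        raise ValueError("expected a positive integer: %r" % (text,))
    return value


class ParsePositiveTest(unittest.TestCase):
    def test_rejects_zero(self):
        # Callable form: no "with" block, so no context manager is needed.
        self.assertRaises(ValueError, parse_positive, "0")

    def test_accepts_positive(self):
        self.assertEqual(parse_positive("7"), 7)
```

The trade-off is that the callable form gives no handle on the raised exception, so tests that inspect the exception afterwards still need the context-manager form.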

There’s more work to be done. A packaged PyXB includes generated bindings for about 186 namespaces. When building from the repository those can be generated with the same Python that’ll be running them, so they might include Unicode literals, which aren’t going to work across the gap where Python 3 didn’t support the unicode prefix (u'text') until version 3.3. But the big hurdle has been overcome, and the next PyXB release should support all Python versions from 2.6 onward.
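One possible way for generated code to avoid the u'text' prefix is to wrap plain literals in a helper, in the spirit of six.u(). The function u below is a simplified local sketch, not six's implementation and not necessarily the approach PyXB will take.

```python
import sys

if sys.version_info[0] >= 3:
    def u(literal):
        # Python 3: a plain literal is already text, escapes already resolved.
        return literal
else:
    def u(literal):
        # Python 2: a plain literal is octets with \uXXXX left unprocessed,
        # so decode the escapes here. Simplification: literals containing
        # escaped backslashes would need extra care.
        return literal.decode("unicode_escape")


name = u("caf\u00e9")  # no u'' prefix, so this parses on every version
print(len(name))
```

Because the literal carries no prefix, the same generated file parses on 2.6 through 3.x, including the 3.0 to 3.2 releases that reject u'text'.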
