Sunday, June 17, 2012

PIL, text and RTL

Recently, I needed to create an image using the python imaging library (PIL). Among other things, the image contains text in Hebrew that should fit in a predefined column width. This presented a few issues, first I needed to handle RTL language in PIL and secondly I needed to break the line of text into several lines so they don't exceed the defined column width.
This is a trivial task in HTML but not so much in a graphic library like PIL.

Drawing RTL Text

In PIL, texts are drawn on an image via the text method of the ImageDraw instance. However, writing an RTL text would result in reverse order result. You might be tempted to just reverse the str array, but that would cause erroneous results when other symbols appear in the text, like numbers and parenthesis.
Fortunately, there is a great python library for that called pybidi that reorders the text symbols according to the language they are written in.

The other issue we are facing is that the text is positioned relatively to the top left corner. In an RTL scenario you would probably want to align the text to the top right corner of the block, so that different lines would start on the same horizontal position.
PIL doesn't have a straightforward method for doing this so we need to hack it a bit. The idea is to calculate where the text block would end if it will start where we want it and then feed that ending position to the text method. Luckily, ImageDraw has a method (called textsize) to calculate the width (in pixels) that a rendered line of text will require without actually rendering it. So the final method is:
from bidi.algorithm import get_display
class ImageDrawRTL(ImageDraw.ImageDraw):
    def text_rtl(self, pos, text, font, fill):
        text = get_display(text)
        width, height = self.textsize(text, font = font)
        self.text((pos[0]-width, pos[1]), text,  font = font, fill = fill)
        return width, height
ImageDraw.ImageDraw = ImageDrawRTL
Notice I am using a subclass of the original ImageDraw and I replace the class on the module since the instance is created by an internal factory method.

Breaking a text line

The other issue I had to work out is that the line of text I am given, might be rendered outside the given column width. So I need to break the single line of text into multiple lines and render them one after the other.
To calculate the optimal breaks in the text, I used a recursive binary search, mainly because it's the coolest way I could think of. There is probably a faster way of doing this by trying to estimate the break position and look for a space near it. In any case, this is what I came up with :
def text_break(self, text, font, column_width, space_index = None, space_indexes = None, start_space_index = 0, end_space_index = None):
    if space_indexes is None: # Do this once to save some time
        space_indexes = [m.start() for m in re.finditer(' ', text)] + [len(text)]
    if space_index is None:
        space_index = len(space_indexes) - 1
    if end_space_index is None:
        end_space_index = len(space_indexes)
    index = space_indexes[space_index]
    width, _ = self.textsize(text[:index], font = font)
    if width <= column_width:
        if index == len(text): # Entire text can be inserted in a single column
            return [text]
        # Check if the next word can also be inserted
        width, _ = self.textsize(text[:space_indexes[space_index + 1]], font = font)
        if width <= column_width: # Next word can also be inserted in this column so this is not the breaking point
            return self.text_break(text, font, column_width, space_index = int(math.ceil(float(space_index + end_space_index)/2)), space_indexes = space_indexes, start_space_index = space_index, end_space_index = end_space_index)
        else: # This is the breaking point, so break the text
            return [text[:index]] + self.text_break(text[index+1:], font, column_width)
    else: # Text is too big
        return self.text_break(text, font, column_width, space_index = int(math.floor(float((start_space_index + space_index)/2))), space_indexes = space_indexes, start_space_index = start_space_index, end_space_index = space_index)
Again, you need to inject this method to the ImageDraw instance (by inheritance or otherwise).

So this is it for today, hope you've found it useful...

Saturday, June 16, 2012

My first post

As the title implies, this is my first blog post ever, and as such it should be quite boring.

I am a physicit by my academic background and a programmer by my actual work, so I guess you can say I'm an engineer. I spend most of my time between server (python) and web development.

I am writing this blog to keep my legacy alive for future generations. Well, not really, future generations will probably look back at this and think "wow, they were all idiots!", and in many ways, they will be correct. But seriously, from time to time I encounter some very weird ungooglable problems that I need to work out, and it would be cool if someone else would find my solutions helpful in any way.

Well enough chitchat then, let's get to work!