2009-04-27

Note to Self: Parsing the Accepts header

Here's an easy-to-follow snippet for the next time you need to parse an Accepts header.

Back to flipping out...

2009-04-13

Project Euler: Problem 17

I recently finished up Problem 17, and since the problem was fairly straightforward, I used it as an opportunity to explore some of Python's language features that I've been meaning to look into.


# problem17.py
"""
Find the solution to `Problem 17`_ at `Project Euler`_.

.. _Problem 17: http://projecteuler.net/index.php?section=problems&id=17
.. _Project Euler: http://projecteuler.net/
"""
__docformat__ = "restructuredtext en"

import re

_NUMBER_NAMES = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
                6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
                11: "eleven", 12: "twelve", 13: "thirteen", 14: "fourteen",
                15: "fifteen", 16: "sixteen", 17: "seventeen",
                18: "eighteen", 19: "nineteen", 20: "twenty",
                21: "twenty-one", 22: "twenty-two", 23: "twenty-three",
                24: "twenty-four", 25: "twenty-five", 26: "twenty-six",
                27: "twenty-seven", 28: "twenty-eight", 29: "twenty-nine",
                30: "thirty", 31: "thirty-one", 32: "thirty-two",
                33: "thirty-three", 34: "thirty-four", 35: "thirty-five",
                36: "thirty-six", 37: "thirty-seven", 38: "thirty-eight",
                39: "thirty-nine", 40: "forty", 41: "forty-one",
                42: "forty-two", 43: "forty-three", 44: "forty-four",
                45: "forty-five", 46: "forty-six", 47: "forty-seven",
                48: "forty-eight", 49: "forty-nine", 50: "fifty",
                51: "fifty-one", 52: "fifty-two", 53: "fifty-three",
                54: "fifty-four", 55: "fifty-five", 56: "fifty-six",
                57: "fifty-seven", 58: "fifty-eight", 59: "fifty-nine",
                60: "sixty", 61: "sixty-one", 62: "sixty-two",
                63: "sixty-three", 64: "sixty-four", 65: "sixty-five",
                66: "sixty-six", 67: "sixty-seven", 68: "sixty-eight",
                69: "sixty-nine", 70: "seventy", 71: "seventy-one",
                72: "seventy-two", 73: "seventy-three", 74: "seventy-four",
                75: "seventy-five", 76: "seventy-six", 77: "seventy-seven",
                78: "seventy-eight", 79: "seventy-nine", 80: "eighty",
                81: "eighty-one", 82: "eighty-two", 83: "eighty-three",
                84: "eighty-four", 85: "eighty-five", 86: "eighty-six",
                87: "eighty-seven", 88: "eighty-eight", 89: "eighty-nine",
                90: "ninety", 91: "ninety-one", 92: "ninety-two",
                93: "ninety-three", 94: "ninety-four", 95: "ninety-five",
                96: "ninety-six", 97: "ninety-seven", 98: "ninety-eight",
                99: "ninety-nine"}

_CHARACTERS_WE_CARE_ABOUT = re.compile("\w")

def _words_from_num(num):
    """
    Convert ``num`` to its (British) English phrase equivalent.

    If ``num`` is greater than 9,999 then raise an ``Exception``.
    >>> _words_from_num(115)
    'one hundred and fifteen'
    """
    if num >= 10000:
        raise Exception, 'This function only supports numbers less than 10000.'

    parts_list = []
    if num >= 1000:
        thousands = num // 1000
        parts_list.append(_NUMBER_NAMES[thousands])
        parts_list.append(" thousand")
        num -= thousands * 1000
    if num >= 100:
        hundreds = num // 100
        parts_list.append(_NUMBER_NAMES[hundreds])
        parts_list.append(" hundred")
        num -= hundreds * 100
    if num:
        if parts_list:
            parts_list.append(" and")
        parts_list.extend([" ", _NUMBER_NAMES[num]])

    return "".join(parts_list)

def _count_characters_we_care_about(string_to_count):
    """
    Count the characters in ``string_to_count``, excluding things like hyphens and spaces.

    >>> _count_characters_we_care_about("one hundred and twenty-three")
    24
    """
    return len(_CHARACTERS_WE_CARE_ABOUT.findall(string_to_count))

def problem_17(upper_bound = 1000):
    """
    Find the solution to `Problem 17`_ at `Project Euler`_.

    .. _Problem 17: http://projecteuler.net/index.php?section=problems&id=17
    .. _Project Euler: http://projecteuler.net/

    >>> problem_17(2)
    6
    """
    converted_nums = (_words_from_num(num) for num in xrange(1, upper_bound + 1))
    lengths = (_count_characters_we_care_about(phrase) for phrase in converted_nums)
    return sum(lengths)

if __name__ == '__main__':
    print problem_17()

Generator Expressions

The quick version: They're like list comprehensions, only lazy. The longer version resides in PEP 289. The verdict: They rock. It didn't make a big difference in this problem, but in general, it's nice that you don't have to allocate enough memory to contain an entire list. One problem with them is that once you've consumed an element of the generator expression, you can't get it back, so they aren't well-suited for problems where you need to iterate across the data more than once.

Regexes

Regexes are hardly unique to Python, but this is the first time I've ever used them in Python. One feature I was excited to try—and that Python pioneered—is named capturing groups, but the regex I used in this example didn't need them.

Docstrings

I've written docstrings before, but this is the first time I've tried to generate stand-alone documentation from them. To do the generation, I used epydoc. The original PEP 257 defines docstrings in terms of plaintext, but PEP 287 establishes reStructuredText as a richer alternative. Since I was generating documentation from the docstrings, I decided to go with reStructuredText. To avoid typing --docformat restructuredtext every time I invoked epydoc again:

__docformat__ = "restructuredtext en"

While it took a while to get used to it (I've gotten pretty set in my Markdown ways), I really like it so far. In fact, I liked it enough to write this entire blog post in it and simply post the generated HTML into Blogger.

Back to flipping out…

2009-04-03

Note to Self: Statistical Analysis Inside Oracle

If you need to do statistical analysis on data in your Oracle database, DBMS_STAT_FUNCS is your friend, especially the SUMMARY function.

Back to flipping out...

Sneak Attack: namedtuple

Python 2.6 introduced namedtuple, and I have to say: It rocks.

Back to flipping out...