## 2009-04-27

### Note to Self: Parsing the Accepts header

## 2009-04-17

### Sneak Attack: Importing CSV Files to PostgreSQL Databases

## 2009-04-13

### Project Euler: Problem 17

#### Generator Expressions

#### Regexes

#### Docstrings

## 2009-04-03

### Note to Self: Statistical Analysis Inside Oracle

### Sneak Attack: namedtuple

## About Me

## Hank's Media

The Home for People Who Like to Flip Out and Write Code

Here's an easy-to-follow snippet for the next time you need to parse an HTTP `Accept` header.
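A minimal sketch of what such a parser might look like, assuming only the basic `type/subtype;q=value` syntax matters. The name `parse_accept` is hypothetical, and this version ignores media-type specificity and quoted parameter values, so treat it as a starting point rather than a complete implementation.

```python
def parse_accept(header_value):
    """Return (media_type, q) pairs, highest quality first.

    Hypothetical helper for illustration; it ignores media-type
    specificity and quoted parameter values.
    """
    entries = []
    for position, item in enumerate(header_value.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue  # skip empty items such as a trailing comma
        quality = 1.0  # q defaults to 1 when it is omitted
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value.strip())
                except ValueError:
                    pass  # malformed q value: keep the default
        entries.append((-quality, position, media_type))
    # Sort by descending quality; ties keep their original order.
    entries.sort()
    return [(media_type, -neg_q) for neg_q, _, media_type in entries]


print(parse_accept("text/html;q=0.8, application/json, */*;q=0.1"))
# [('application/json', 1.0), ('text/html', 0.8), ('*/*', 0.1)]
```

Sorting on `(-quality, position)` keeps the client's original ordering among media types that share the same `q` value.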

Back to flipping out...

Posted by Hank Gay at 06:10

Labels: http, note-to-self, python

Importing CSV Files to PostgreSQL Databases.

Back to flipping out...

Posted by Hank Gay at 11:43

Labels: postgresql, sql

I recently finished up Problem 17, and since the problem was fairly straightforward, I used it as an opportunity to explore some of Python's language features that I've been meaning to look into.

```
# problem17.py
"""
Find the solution to `Problem 17`_ at `Project Euler`_.
.. _Problem 17: http://projecteuler.net/index.php?section=problems&id=17
.. _Project Euler: http://projecteuler.net/
"""
__docformat__ = "restructuredtext en"
import re
_NUMBER_NAMES = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
11: "eleven", 12: "twelve", 13: "thirteen", 14: "fourteen",
15: "fifteen", 16: "sixteen", 17: "seventeen",
18: "eighteen", 19: "nineteen", 20: "twenty",
21: "twenty-one", 22: "twenty-two", 23: "twenty-three",
24: "twenty-four", 25: "twenty-five", 26: "twenty-six",
27: "twenty-seven", 28: "twenty-eight", 29: "twenty-nine",
30: "thirty", 31: "thirty-one", 32: "thirty-two",
33: "thirty-three", 34: "thirty-four", 35: "thirty-five",
36: "thirty-six", 37: "thirty-seven", 38: "thirty-eight",
39: "thirty-nine", 40: "forty", 41: "forty-one",
42: "forty-two", 43: "forty-three", 44: "forty-four",
45: "forty-five", 46: "forty-six", 47: "forty-seven",
48: "forty-eight", 49: "forty-nine", 50: "fifty",
51: "fifty-one", 52: "fifty-two", 53: "fifty-three",
54: "fifty-four", 55: "fifty-five", 56: "fifty-six",
57: "fifty-seven", 58: "fifty-eight", 59: "fifty-nine",
60: "sixty", 61: "sixty-one", 62: "sixty-two",
63: "sixty-three", 64: "sixty-four", 65: "sixty-five",
66: "sixty-six", 67: "sixty-seven", 68: "sixty-eight",
69: "sixty-nine", 70: "seventy", 71: "seventy-one",
72: "seventy-two", 73: "seventy-three", 74: "seventy-four",
75: "seventy-five", 76: "seventy-six", 77: "seventy-seven",
78: "seventy-eight", 79: "seventy-nine", 80: "eighty",
81: "eighty-one", 82: "eighty-two", 83: "eighty-three",
84: "eighty-four", 85: "eighty-five", 86: "eighty-six",
87: "eighty-seven", 88: "eighty-eight", 89: "eighty-nine",
90: "ninety", 91: "ninety-one", 92: "ninety-two",
93: "ninety-three", 94: "ninety-four", 95: "ninety-five",
96: "ninety-six", 97: "ninety-seven", 98: "ninety-eight",
99: "ninety-nine"}
# \w matches word characters, so spaces and hyphens are skipped.
_CHARACTERS_WE_CARE_ABOUT = re.compile(r"\w")


def _words_from_num(num):
    """
    Convert ``num`` to its (British) English phrase equivalent.

    If ``num`` is greater than 9,999 then raise an ``Exception``.

    >>> _words_from_num(115)
    'one hundred and fifteen'
    """
    if num >= 10000:
        raise Exception('This function only supports numbers less than 10000.')
    parts_list = []
    if num >= 1000:
        thousands = num // 1000
        parts_list.append(_NUMBER_NAMES[thousands] + " thousand")
        num -= thousands * 1000
    if num >= 100:
        hundreds = num // 100
        parts_list.append(_NUMBER_NAMES[hundreds] + " hundred")
        num -= hundreds * 100
    if num:
        # British usage: "and" joins the remainder to any larger part.
        if parts_list:
            parts_list.append("and")
        parts_list.append(_NUMBER_NAMES[num])
    # Joining on spaces avoids stray leading or missing spaces.
    return " ".join(parts_list)


def _count_characters_we_care_about(string_to_count):
    """
    Count the characters in ``string_to_count``, excluding things like hyphens and spaces.

    >>> _count_characters_we_care_about("one hundred and twenty-three")
    24
    """
    return len(_CHARACTERS_WE_CARE_ABOUT.findall(string_to_count))


def problem_17(upper_bound=1000):
    """
    Find the solution to `Problem 17`_ at `Project Euler`_.

    .. _Problem 17: http://projecteuler.net/index.php?section=problems&id=17
    .. _Project Euler: http://projecteuler.net/

    >>> problem_17(2)
    6
    """
    # Both steps are generator expressions, so no intermediate list is built.
    # range is lazy on Python 3; on Python 2, xrange is the lazy equivalent.
    converted_nums = (_words_from_num(num) for num in range(1, upper_bound + 1))
    lengths = (_count_characters_we_care_about(phrase) for phrase in converted_nums)
    return sum(lengths)


if __name__ == '__main__':
    print(problem_17())
```

The quick version: generator expressions are like list comprehensions, only lazy. The longer version resides in PEP 289. The verdict: They rock. Laziness didn't make a big difference in this problem, but in general, it's nice that you don't have to allocate enough memory to hold an entire list. One drawback is that once you've consumed an element of a generator expression, you can't get it back, so they aren't well suited to problems where you need to iterate over the data more than once.
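To make the "consume once" caveat concrete, here is a small sketch comparing a list comprehension with a generator expression over the same values. The variable names are made up for illustration.

```python
numbers = range(1, 6)

# A list comprehension builds the whole list up front.
squares_list = [n * n for n in numbers]

# A generator expression produces values one at a time, on demand.
squares_gen = (n * n for n in numbers)

first_total = sum(squares_gen)         # consumes the generator
second_total = sum(squares_gen)        # nothing is left to consume
list_total_again = sum(squares_list)   # a list can be reused freely

print(first_total, second_total, list_total_again)
# 55 0 55
```

If the data is needed more than once, materializing it with `list(...)` trades the memory savings for reusability.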

Regexes are hardly unique to Python, but this is the first time I've used them in Python. One feature I was excited to try, and one that Python pioneered, is named capturing groups, but the regex I used in this example didn't need them.

I've written docstrings before, but this is the first time I've tried to generate stand-alone documentation from them. To do the generation, I used epydoc. The original PEP 257 defines docstrings in terms of plaintext, but PEP 287 establishes reStructuredText as a richer alternative. Since I was generating documentation from the docstrings, I decided to go with reStructuredText. To avoid typing `--docformat restructuredtext` every time I invoked `epydoc`, I added this line to the module:

`__docformat__ = "restructuredtext en"`

While reStructuredText took a while to get used to (I've gotten pretty set in my Markdown ways), I really like it so far. In fact, I liked it enough to write this entire blog post in it and simply post the generated HTML into Blogger.

Back to flipping out…

Posted by Hank Gay at 21:32

Labels: project_euler, python

If you need to do statistical analysis on data in your Oracle database, `DBMS_STAT_FUNCS` is your friend, especially the `SUMMARY` procedure.

Back to flipping out...

Posted by Hank Gay at 14:08

Labels: note-to-self, Oracle, statistics

Python 2.6 introduced namedtuple, and I have to say: It rocks.
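A quick sketch of why, using a made-up `Post` record: you get tuple behavior plus readable field names, without writing a class.

```python
from collections import namedtuple

# Hypothetical record type for illustration.
Post = namedtuple("Post", ["title", "date", "labels"])

post = Post(title="Sneak Attack: namedtuple",
            date="2009-04-03",
            labels=("python", "sneak_attack"))

# Fields are available by name or by position.
print(post.title, post[1])

# It still unpacks like a plain tuple.
title, date, labels = post

# Instances are immutable; _replace returns a modified copy.
updated = post._replace(date="2009-04-04")
print(updated.date, post.date)
# 2009-04-04 2009-04-03
```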

Back to flipping out...

Posted by Hank Gay at 08:08

Labels: python, sneak_attack


- Hank Gay
- I like to spend time with my beautiful wife, our daughter, and our dog. I'm also a fan of geek humor, and I've been known to flip out and write code… elegant code, if I'm really lucky.


- Technical
- Practical Django Projects, Second Edition
- Nontechnical
- Brave New World
- Music
- American IV: The Man Comes Around
- Movie
- Buffy the Vampire Slayer: The Complete First Season