Hpricot provides a nice DOM traversal library for web scraping. When using Hpricot to parse XML docs, I discovered that it automatically downcases XML node names. XML files generated by Microsoft Excel contain nodes with capital letters, so after parsing they have to be looked up by their lowercased names.
Example:
<Workbook>
becomes:
<workbook>
so the node is found with:
doc/:workbook
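To make the lookup concrete, here's a rough sketch of how I'd search for an Excel node after Hpricot has lowercased it. The sample XML string and the helpers `downcased_selector` and `count_nodes` are made up for illustration and are not part of Hpricot. The sketch assumes the hpricot gem may or may not be installed, so `count_nodes` returns nil when it can't be loaded.

```ruby
# Made-up sample resembling Excel's XML output, which capitalizes node names.
SAMPLE_XML = '<Workbook><Worksheet><Table><Row><Cell>42</Cell></Row></Table></Worksheet></Workbook>'

# Hypothetical helper (not part of Hpricot): the name to search for once
# the default parser has lowercased the tags.
def downcased_selector(name)
  name.to_s.downcase
end

# Hypothetical helper: parse with Hpricot's default parser and count the
# nodes matching the lowercased name. Returns nil if the gem is missing.
def count_nodes(xml, name)
  require 'hpricot'
  doc = Hpricot(xml)
  (doc / downcased_selector(name)).size
rescue LoadError
  nil
end

puts downcased_selector('Workbook')
puts count_nodes(SAMPLE_XML, 'Workbook').inspect
```

Hpricot also has an `Hpricot.XML` entry point meant for XML input, which I believe keeps the original casing, but I haven't verified that here, so check it against your installed version.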
Kind of a small thing, but I'm documenting it for myself.