I was curious whether anyone with experience in this sort of thing could suggest a good library for scraping the data from this database of books: http://bit.ly/i8bvJS

The goal is to pull down each book's title, author, and 'Guided Reading' level.

Python or R libraries would be ideal, as those are the languages/platforms I use the most.

Based on the structure of that particular page, I would guess that some tools are better or easier than others, so I would love any advice that points me in the right direction!

asked 02 Mar '11, 22:13

almartin

ScraperWiki is a service that lets users write their own scrapers or request that volunteers write one for them.

It currently supports scrapers written in Python, Ruby, and PHP, and also provides an API.
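For the Python route, here is a minimal sketch of the extraction step. It assumes, purely for illustration, that the listing is a plain HTML table with a header row; the real page behind the link may be structured quite differently. `SAMPLE_HTML`, `TableRowParser`, and `extract_books` are hypothetical names, and the book rows are placeholder data. Only the standard library's `html.parser` is used.

```python
from html.parser import HTMLParser

# Hypothetical sample markup with placeholder data; the real page's
# structure is an assumption and should be checked before reuse.
SAMPLE_HTML = """
<table>
  <tr><th>Title</th><th>Author</th><th>Guided Reading</th></tr>
  <tr><td>Example Book One</td><td>Author A</td><td>K</td></tr>
  <tr><td>Example Book Two</td><td>Author B</td><td>R</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_books(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


books = extract_books(SAMPLE_HTML)
for book in books:
    print(book["Title"], "|", book["Author"], "|", book["Guided Reading"])
```

On a real page, the HTML would come from a download step (for example `urllib.request.urlopen`), and the column names would need to match whatever the site actually uses.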


answered 03 Mar '11, 01:52

naesk

Also see the answers to the GetTheData question "What tools or services are good for scraping data from websites?"


answered 03 Mar '11, 16:45

psychemedia ♦♦


Asked: 02 Mar '11, 22:13

Seen: 862 times

Last updated: 03 Mar '11, 16:45
