I want to see a simple horizontal bar graph comparing the average career length for male and female film stars.

I imagine this data could be obtained from IMDb with a bit of processing.

Issues

  • Determining the start and end of a career. The basic approach would be to take the first and last film. However, it may be better to use the dates of, e.g., the third and third-to-last films, to prevent bias from a one-off appearance 20 years later.
  • Would early deaths need to be omitted to prevent bias?
  • May want to focus on just Hollywood film stars initially (and require at least one 'studio' film).
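
To make the first issue concrete, here is a minimal Python sketch of the two ways of measuring a career. The function name `career_span_years`, the `trim` parameter and the filmography are made up for illustration; they are not taken from IMDb or any other dataset.

```python
from datetime import date


def career_span_years(release_dates, trim=0):
    """Return career length in years, ignoring `trim` films at each end.

    trim=0 uses the first and last film; trim=2 uses the third and
    third-to-last film. Returns None when too few films remain.
    """
    ordered = sorted(release_dates)
    if len(ordered) < 2 * trim + 2:
        return None
    start, end = ordered[trim], ordered[-1 - trim]
    return (end - start).days / 365.25


# Made-up filmography: six films in a row, then a one-off 20 years later.
films = [date(1980, 1, 1), date(1981, 1, 1), date(1982, 1, 1),
         date(1983, 1, 1), date(1984, 1, 1), date(1985, 1, 1),
         date(2005, 1, 1)]

naive = career_span_years(films)            # first to last film
trimmed = career_span_years(films, trim=2)  # third to third-to-last film
```

With this filmography the naive span is about 25 years, while the trimmed span is about 2 years, so the choice of rule matters a great deal for short filmographies.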

asked 03 Aug '11, 20:24

rgrp ♦♦


A less complete source of data than IMDb would be Freebase. Its advantage is that it supports structured queries, which means less processing. Here is a query that does most of the work. It returns a list of actresses, their films and the release dates of each film in each region. From there, it's possible to work out a career range for each actress:

[{
  "type": "/film/actor",
  "name": null,
  "/people/person/gender" : "Female",
  "/film/actor/film" : [{
    "film" : {
      "name" : null,
      "release_date_s" : [{
        "film_release_region" : [{"name" : null }],
        "release_date" : null
      }]
    }
  }]
}]

The query should be fairly easy to understand: Freebase fills in any blanks (null or []) and treats given values, such as "Female", as constraints.
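
Since the comparison needs both genders, the same query can be generated from a template. This is only a sketch: `build_query` is a hypothetical helper, and the gender value "Male" is an assumption by analogy with "Female". It only builds the query text and does not call Freebase.

```python
import json


def build_query(gender):
    """Return the query above as a Python structure for one gender."""
    return [{
        "type": "/film/actor",
        "name": None,                     # blank: Freebase fills this in
        "/people/person/gender": gender,  # given value: acts as a constraint
        "/film/actor/film": [{
            "film": {
                "name": None,
                "release_date_s": [{
                    "film_release_region": [{"name": None}],
                    "release_date": None,
                }],
            }
        }],
    }]


# None is serialised as JSON null, matching the blanks in the query.
female_query = json.dumps(build_query("Female"))
male_query = json.dumps(build_query("Male"))  # "Male" is an assumed value
```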

Here is an excerpt from the response to the query:

{
  "code":          "/api/status/ok",
  "result": [
    {
      "/film/actor/film": [
        {
          "film": {
            "name": "The Talented Mr. Ripley",
            "release_date_s": [{
              "film_release_region": [{
                "name": "United States of America"
              }],
              "release_date": "1999-12-25"
            }]
          }
        },
        ...
        {
          "film": {
            "name": "The Avengers",
            "release_date_s": [{
              "film_release_region": [{
                "name": "United States of America"
              }],
              "release_date": "2012-05-04"
            }]
          }
        }
      ],
      "/people/person/gender": "Female",
      "name":          "Gwyneth Paltrow",
      "type":          "/film/actor"
    },
...
  ],
  "status":        "200 OK",
  "transaction_id": "cache;cache03.p01.sjc1:8101;2011-08-15T05:05:12Z;0031"
}

Permalink to live call: http://tinyurl.com/3jqeq9c
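
As a rough sketch of the remaining processing, the response can be reduced to an earliest and latest release date per name. The helpers `parse_date` and `career_ranges` are hypothetical, and the handling of partial dates such as "1999" is an assumption about what the data may contain. The sample input holds only the two films visible in the excerpt, so the range it yields is not a real career range.

```python
from datetime import date


def parse_date(text):
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'; missing parts default to 1."""
    parts = [int(p) for p in text.split("-")]
    parts += [1] * (3 - len(parts))
    return date(*parts)


def career_ranges(response):
    """Map each name to (earliest, latest) release date over all films."""
    ranges = {}
    for actor in response["result"]:
        dates = [
            parse_date(release["release_date"])
            for credit in actor["/film/actor/film"]
            for release in credit["film"]["release_date_s"]
            if release.get("release_date")  # skip releases without a date
        ]
        if dates:
            ranges[actor["name"]] = (min(dates), max(dates))
    return ranges


def film(name, released):
    """Build one film entry in the same shape as the response excerpt."""
    return {"film": {"name": name, "release_date_s": [{
        "film_release_region": [{"name": "United States of America"}],
        "release_date": released}]}}


# Only the two films shown in the excerpt; the elided ones are left out.
excerpt = {"result": [{
    "name": "Gwyneth Paltrow",
    "/film/actor/film": [film("The Talented Mr. Ripley", "1999-12-25"),
                         film("The Avengers", "2012-05-04")],
}]}

ranges = career_ranges(excerpt)
```

Taking the minimum over all regions means a film counts from its earliest release anywhere; restricting to one region would need an extra filter on `film_release_region`.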

answered 03 Aug '11, 22:25

timClicks ♦♦

edited 15 Aug '11, 06:15

Any details of what the query would be for that? Also, what about getting this directly from DBpedia? Give me a query and a definite +1 ;-)

(09 Aug '11, 11:21) rgrp ♦♦