    U.S. movies with gender-disambiguated actors, directors, and producers

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Amaral Lab (2023). U.S. movies with gender-disambiguated actors, directors, and producers [Dataset]. http://doi.org/10.6084/m9.figshare.4967876.v1
    Available download formats: txt
    Dataset updated: May 30, 2023
    Dataset provided by: figshare
    Authors: Amaral Lab
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These datasets contain complete genre, cast, director, and producer information about 15,425 U.S.-produced movies released between 1894 and 2011.

    The initial movie year, title, and genre information was obtained from IMDb.com by Wasserman et al. (Cross-evaluation of metrics to estimate the significance of creative works, PNAS, 2015). That dataset was expanded by Moreira et al. (forthcoming, 2017) to include movie budget, gender composition, cast, director, and producer information.

    Assigning gender to individuals

    The gender of actors is explicitly mentioned in their individual biographical pages, so we are able to fully determine their gender. For producers and directors who do not also have acting credits, we use indirect methods to assign a gender. If present, we parse the individual's biographical text for gender-specific pronouns (he/his/him/himself, or she/her/hers/herself). If the number of male-specific pronouns exceeds that of female-specific ones, we assume the individual is male; if female-specific pronouns are more numerous, we assume the individual is female. If the previous attempt is inconclusive, we use the Python package gender-guesser (version 0.4.0) to "guess" the gender based on the first name of the individual. The output of gender-guesser is one of "female", "mostly female", "androgynous", "unknown", "mostly male", or "male". We only assign a gender if the guess is either "male" or "female". If we still have not been able to assign a gender, we try to find a photograph of the individual. If all attempts fail, we mark the individual's gender as "undetermined".

    actors.json - Contains the following information about 225,754 actors:
      _id - unique IMDb identifier of individual.
      name - individual's name.
      gender - individual's gender.
      movies_list - list of movie ids individual was cast in. Matches _id field in movies.json.

    directors.json - Contains the following information about 6,895 directors:
      _id - unique IMDb identifier of individual.
      name - individual's name.
      movies_list - list of (year, movie_id, type) triplets. type is one of 'director', 'main_casting', or 'secondary_casting'. Remaining fields match year and _id from movies.json.
      gender - director gender: male, female, or undetermined.
      first_movie - year of first movie directed.
      last_movie - year of last movie directed.
      male_count - number of male-specific pronouns (he/his/him/himself) from director's IMDb bio page.
      female_count - number of female-specific pronouns (she/her/hers/herself) from director's IMDb bio page.
      actor_credits - True (False) if director has (does not have) "Actor" credits in IMDb filmography.
      actress_credits - True (False) if director has (does not have) "Actress" credits in IMDb filmography.

    movies.json - Contains the following information about 15,425 movies:
      _id - unique IMDb identifier of movie.
      adjusted_budget - movie budget, if present in IMDb, adjusted for 2014 inflation. Only present for about 36% of movies.
      all_actors - list of (gender, url, name) triplets for each actor in cast. Each triplet matches gender, _id, and name from actors.json, respectively.
      director - list of (name, url, type, gender) quadruplets for each director in the movie. type is one of 'director', 'main_casting', or 'secondary_casting'. Remaining fields match name, _id, and gender from directors.json, respectively.
      producer - list of (name, url, role, gender) quadruplets for each producer in the movie. role indicates specific producer role: producer, associate producer, executive producer, line producer, etc. Remaining fields match name, _id, and gender from producers.json, respectively.
      gender_percent - integer percent of female actors in movie.
      genre - list of movie genres.
      year - year when movie was released.
      title - title of movie.

    producers.json - Contains the following information about 25,557 producers:
      _id - unique IMDb identifier of individual.
      name - individual's name.
      movies_list - list of (role, year, movie_id) triplets. role indicates specific producer role: producer, associate producer, executive producer, line producer, etc. Remaining fields match year and _id from movies.json.
      gender - producer gender: male, female, or undetermined.
      first_movie - year of first movie produced in any producer role.
      last_movie - year of last movie produced in any producer role.
      first_producer_movie - year of first movie produced as a "producer". Only present if individual has at least one credit as "producer".
      last_producer_movie - year of last movie produced as a "producer". Only present if individual has at least one credit as "producer".
      first_executive_movie - year of first movie produced as an "executive producer". Only present if individual has at least one credit as "executive producer".
      last_executive_movie - year of last movie produced as an "executive producer". Only present if individual has at least one credit as "executive producer".
      first_associate_movie - year of first movie produced as an "associate producer". Only present if individual has at least one credit as "associate producer".
      last_associate_movie - year of last movie produced as an "associate producer". Only present if individual has at least one credit as "associate producer".
      male_count - number of male-specific pronouns (he/his/him/himself) from producer's IMDb bio page.
      female_count - number of female-specific pronouns (she/her/hers/herself) from producer's IMDb bio page.
      actor_credits - True (False) if producer has (does not have) "Actor" credits in IMDb filmography.
      actress_credits - True (False) if producer has (does not have) "Actress" credits in IMDb filmography.
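
The gender-assignment steps described for directors and producers can be illustrated with a short sketch. This is not the authors' code: the function names (count_pronouns, assign_gender), the tokenization rule, and the name_guesser parameter are assumptions made for illustration. The name guesser is passed in as a plain callable so the example runs without third-party packages, and the photograph step, which is manual, is not modeled.

```python
import re
from typing import Callable, Optional, Tuple

# Pronoun sets as listed in the dataset description.
MALE_PRONOUNS = {"he", "his", "him", "himself"}
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}


def count_pronouns(bio_text: str) -> Tuple[int, int]:
    """Return (male_count, female_count) for whole-word pronoun matches.

    Assumption: words are lowercased runs of ASCII letters; the original
    tokenization is not documented.
    """
    words = re.findall(r"[a-z]+", bio_text.lower())
    male = sum(word in MALE_PRONOUNS for word in words)
    female = sum(word in FEMALE_PRONOUNS for word in words)
    return male, female


def assign_gender(
    bio_text: Optional[str],
    first_name: Optional[str],
    name_guesser: Optional[Callable[[str], str]] = None,
) -> str:
    """Hypothetical cascade: pronoun majority, then first-name guess."""
    # Step 1: pronoun majority in the biography, if a biography exists.
    if bio_text:
        male, female = count_pronouns(bio_text)
        if male > female:
            return "male"
        if female > male:
            return "female"
    # Step 2: first-name guess; only unambiguous answers are accepted.
    if name_guesser is not None and first_name:
        guess = name_guesser(first_name)
        if guess in ("male", "female"):
            return guess
    # Step 3 (looking for a photograph) is manual and not modeled here.
    return "undetermined"
```

With the gender-guesser package installed, the callable could plausibly be the get_gender method of a gender_guesser.detector.Detector instance; that wiring is an assumption about how the package was used, not something the description states. The two counts returned by count_pronouns correspond to the male_count and female_count fields in directors.json and producers.json.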


