This dataset was created by Bojan Tunguz
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset was created by Bojan Tunguz
Released under Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
List of major cities in the world
The data is extracted from geonames, a very exhaustive list of worldwide toponyms.
This datapackage lists only cities above 15,000 inhabitants. Each city is associated with its country and subcountry to reduce the number of ambiguities. Subcountry can be the name of a state (e.g. in the United Kingdom or the United States of America) or the major administrative section (e.g. "region" in France). See the admin1 field on the geonames website for further information about subcountry.
Notice that:
* Some cities, like Vatican City or Singapore, are a whole state, so they don't belong to any subcountry. Their subcountry is therefore N/A.
* There is no guarantee that a city has a unique name within a country and subcountry (at the time of writing, there are about 60 ambiguities). However, the source data's primary key, geonameid, is provided for each city.
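The ambiguity described above can be checked with a short script. This is a minimal sketch, not the datapackage's own tooling: it assumes the CSV has columns called `name`, `country`, `subcountry` and `geonameid` (only the last three are named in the description; `name` is an assumption), and the sample rows and ids are invented placeholders rather than real geonames records.

```python
import csv
import io
from collections import defaultdict

# Invented placeholder rows; ids and column names are assumptions for illustration.
SAMPLE = """name,country,subcountry,geonameid
Springfield,United States,Illinois,1001
Springfield,United States,Illinois,1002
Springfield,United States,Missouri,1003
Singapore,Singapore,N/A,1004
"""


def find_ambiguous(rows):
    """Group rows by (name, country, subcountry) and keep groups with more than one geonameid."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["name"], row["country"], row["subcountry"])
        groups[key].append(row["geonameid"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ambiguous = find_ambiguous(rows)

for (name, country, subcountry), ids in ambiguous.items():
    # The geonameid is what tells the otherwise identical entries apart.
    print(f"{name}, {subcountry}, {country}: geonameids {', '.join(ids)}")
```

To run the same check on a downloaded copy of the data, replace `io.StringIO(SAMPLE)` with an open file handle, adjusting the column names if the real header differs.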
You can run the script yourself to update the data and publish it to GitHub: see the scripts README.
All data is licensed under the Creative Commons Attribution License, as is the original data from geonames. This means you have to credit geonames when using the data. While no credit is formally required, a link back or credit to Lexman and the Open Knowledge Foundation is much appreciated.
All source code is licensed under the MIT licence.
https://www.usa.gov/government-works/
This dataset was created by Bojan Tunguz
Released under U.S. Government Works
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Bojan Tunguz
Released under CC0: Public Domain
This dataset was created by Bojan Tunguz
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This database presents population and other demographic estimates and projections from 1960 to 2050, covering more than 200 economies. It includes population data by various age groups, sex, urban/rural; fertility data; mortality data; and migration data.
This dataset was created by Bojan Tunguz
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Country, regional and world GDP in current US dollars ($). "Regional" refers to collections of countries, e.g. Europe & Central Asia.
The data is sourced from the World Bank, which in turn lists as sources: World Bank national accounts data, and OECD National Accounts data files.
This dataset was created by Bojan Tunguz
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Residential property price statistics from different countries. Contains property price indicators (real series are the nominal price series deflated by the consumer price index), both in levels and in growth rates. Can be used for property market analysis.
This data comes from the Bank for International Settlements (BIS).
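The relationship between the nominal and real series, and between levels and growth rates, can be illustrated as follows. This is a rough sketch under stated assumptions: the function names `deflate` and `growth_rates`, the rebasing to 100, and the sample figures are all hypothetical and are not taken from the BIS data or its methodology notes.

```python
def deflate(nominal, cpi, base=100.0):
    """Return a real price index: each nominal value divided by the CPI, rescaled to `base`."""
    if len(nominal) != len(cpi):
        raise ValueError("nominal and cpi must have the same length")
    return [n / c * base for n, c in zip(nominal, cpi)]


def growth_rates(series):
    """Return period-over-period growth in percent (one fewer element than the input)."""
    return [(curr / prev - 1.0) * 100.0 for prev, curr in zip(series, series[1:])]


# Hypothetical index values for three consecutive periods.
nominal = [100.0, 110.0, 121.0]
cpi = [100.0, 105.0, 110.0]

real = deflate(nominal, cpi)
nominal_growth = growth_rates(nominal)
real_growth = growth_rates(real)

print("real index:", [round(x, 2) for x in real])
print("nominal growth %:", [round(x, 2) for x in nominal_growth])
print("real growth %:", [round(x, 2) for x in real_growth])
```

In this made-up example, nominal prices grow by about 10% per period while real prices grow by roughly 5%, because part of the nominal increase is absorbed by consumer price inflation.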
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains data behind the story Comic Books Are Still Made By Men, For Men And About Men.
The data comes from Marvel Wikia and DC Wikia. Characters were scraped on August 24. Appearance counts were scraped on September 2. The month and year of the first issue each character appeared in were pulled on October 6.
The data is split into two files, for DC and Marvel, respectively: dc-wikia-data.csv and marvel-wikia-data.csv. Each file has the following variables:
Variable | Definition |
---|---|
page_id | The unique identifier for that character's page within the wikia |
name | The name of the character |
urlslug | The unique url within the wikia that takes you to the character |
ID | The identity status of the character (Secret Identity, Public Identity, [Marvel only: No Dual Identity]) |
ALIGN | If the character is Good, Bad or Neutral |
EYE | Eye color of the character |
HAIR | Hair color of the character |
SEX | Sex of the character (e.g. Male, Female, etc.) |
GSM | If the character is a gender or sexual minority (e.g. Homosexual characters, bisexual characters) |
ALIVE | If the character is alive or deceased |
APPEARANCES | The number of appearances of the character in comic books (as of Sep. 2, 2014. Number will become increasingly out of date as time goes on.) |
FIRST APPEARANCE | The month and year of the character's first appearance in a comic book, if available |
YEAR | The year of the character's first appearance in a comic book, if available |
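As a quick illustration of how the variables above might be used, the sketch below counts characters and sums appearances by the SEX column. It is only an example under assumptions: the sample rows are invented placeholders, the exact category labels in the real files may differ from the "Male" and "Female" values used here, and blank APPEARANCES values are assumed to mean the count is unavailable.

```python
import csv
import io
from collections import Counter

# Invented placeholder rows using a subset of the column names from the table above.
SAMPLE = """page_id,name,SEX,ALIGN,APPEARANCES
1,Character A,Male,Good,120
2,Character B,Female,Bad,45
3,Character C,Male,Neutral,
4,Character D,Female,Good,300
"""


def summarize_by_sex(rows):
    """Return (character counts, total appearances) keyed by the SEX column."""
    counts = Counter()
    appearances = Counter()
    for row in rows:
        sex = row["SEX"] or "Unknown"
        counts[sex] += 1
        # Skip blank appearance counts instead of treating them as zero-valued data.
        if row["APPEARANCES"]:
            appearances[sex] += int(float(row["APPEARANCES"]))
    return counts, appearances


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts, appearances = summarize_by_sex(rows)

for sex in sorted(counts):
    print(f"{sex}: {counts[sex]} characters, {appearances[sex]} appearances")
```

To run this against the real data, open dc-wikia-data.csv or marvel-wikia-data.csv with `open(path, newline="", encoding="utf-8")` and pass the file handle to `csv.DictReader` in place of the in-memory sample.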
This dataset was created by Bojan Tunguz
https://www.usa.gov/government-works/
Provisional count of deaths involving coronavirus disease 2019 (COVID-19) by county of occurrence, in the United States, 2020-2021.
National Center for Health Statistics
Deaths with confirmed or presumed COVID-19, coded to ICD–10 code U07.1. Counties included in this table have more than one (1) death overall at the time of analysis. The numbers of deaths reported in this table are the total numbers of deaths received and coded as of the date of analysis and do not represent all deaths that occurred in that period. Data for this period are incomplete because of the time lag between when a death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes.
This dataset was created by Bojan Tunguz