66 datasets found
  1. Geonames - All Cities with a population > 1000

    • public.opendatasoft.com
    • data.smartidf.services
    • +2 more
    csv, excel, geojson +1
    Updated Mar 10, 2024
    Cite
    (2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
    Available download formats: csv, json, geojson, excel
    Dataset updated
    Mar 10, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All cities with a population > 1000 or seats of administrative divisions (ca. 80,000).

    Sources and Contributions

    • Sources: GeoNames aggregates over a hundred different data sources.
    • Ambassadors: GeoNames Ambassadors help in many countries.
    • Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
    • Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.

    Enrichment: add country name

  2. First names and last names associating with countries

    • figshare.com
    zip
    Updated Jan 25, 2023
    Cite
    Mike Thelwall (2023). First names and last names associating with countries [Dataset]. http://doi.org/10.6084/m9.figshare.21954467.v1
    Available download formats: zip
    Dataset updated
    Jan 25, 2023
    Dataset provided by
    figshare
    Authors
    Mike Thelwall
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    First names and last names by country according to affiliations in journal articles 2001-2021 as recorded in Scopus. For 200 countries, there is a complete list of all first names and all last names of at least one researcher with a national affiliation in that country. Each file also records the number of researchers with that name in the country, the proportion of researchers with that name in the country compared to the world, and the number of researchers with that name in the world.

    For example, for the USA:

    Name      Authors in USA   Proportion in USA   Total
    Sadrach   3                1.000               3
    Rangsan   1                0.083               12
    Parry     6                0.273               22
    Howard    2008             0.733               2739
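
The "Proportion in USA" values in the example above are consistent with dividing the country count by the world total. A minimal Python sketch that reproduces the four rows under that assumption (the function name `proportion_in_country` is hypothetical and not part of the dataset):

```python
def proportion_in_country(authors_in_country: int, authors_in_world: int) -> float:
    """Share of researchers with a given name who are affiliated with the country.

    Assumption: proportion = country count / world count, rounded to 3 decimals.
    """
    if authors_in_world <= 0:
        raise ValueError("world total must be positive")
    return round(authors_in_country / authors_in_world, 3)


# Rows copied from the USA example: (name, authors in USA, total in the world)
usa_rows = [
    ("Sadrach", 3, 3),
    ("Rangsan", 1, 12),
    ("Parry", 6, 22),
    ("Howard", 2008, 2739),
]

proportions = {
    name: proportion_in_country(in_usa, in_world)
    for name, in_usa, in_world in usa_rows
}
```

A proportion of 1.000, as for Sadrach, means every recorded researcher with that name has a USA affiliation.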

    Only the first parts of double last names are included. For example, Rodriquez Gonzalez, Maria would have only Rodriquez recorded.
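
The truncation rule above could be sketched as follows. This assumes names arrive as "Last names, First names" and that the parts of a double last name are separated by whitespace; how the dataset treats hyphenated names is not stated, so they are left intact here. The helper name `first_last_name_part` is hypothetical.

```python
def first_last_name_part(author: str) -> str:
    """Return the first part of the last name from a 'Last names, First names' string."""
    # Everything before the first comma is treated as the last-name field.
    last_names = author.split(",", 1)[0].strip()
    parts = last_names.split()
    return parts[0] if parts else ""


recorded = first_last_name_part("Rodriquez Gonzalez, Maria")
```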

    This is from the paper: "Can national researcher mobility be tracked by first or last name uniqueness"

    List of countries Afghanistan; Albania; Algeria; Angola; Argentina; Armenia; Australia; Austria; Azerbaijan; Bahamas; Bahrain; Bangladesh; Barbados; Belarus; Belgium; Belize; Benin; Bermuda; Bhutan; Bolivia; Bosnia and Herzegovina; Botswana; Brazil; Brunei Darussalam; Bulgaria; Burkina Faso; Burundi; Cambodia; Cameroon; Canada; Cape Verde; Cayman Islands; Central African Republic; Chad; Chile; China; Colombia; Congo; Costa Rica; Cote d'Ivoire; Croatia; Cuba; Cyprus; Czech Republic; Democratic Republic Congo; Denmark; Djibouti; Dominican Republic; Ecuador; Egypt; El Salvador; Eritrea; Estonia; Ethiopia; Falkland Islands (Malvinas); Faroe Islands; Federated States of Micronesia; Fiji; Finland; France; French Guiana; French Polynesia; Gabon; Gambia; Georgia; Germany; Ghana; Greece; Greenland; Grenada; Guadeloupe; Guam; Guatemala; Guinea; Guinea-Bissau; Guyana; Haiti; Honduras; Hong Kong; Hungary; Iceland; India; Indonesia; Iran; Iraq; Ireland; Israel; Italy; Jamaica; Japan; Jordan; Kazakhstan; Kenya; Kuwait; Kyrgyzstan; Laos; Latvia; Lebanon; Lesotho; Liberia; Libyan Arab Jamahiriya; Liechtenstein; Lithuania; Luxembourg; Macao; Macedonia; Madagascar; Malawi; Malaysia; Maldives; Mali; Malta; Martinique; Mauritania; Mauritius; Mexico; Moldova; Monaco; Mongolia; Montenegro; Morocco; Mozambique; Myanmar; Namibia; Nepal; Netherlands; New Caledonia; New Zealand; Nicaragua; Niger; Nigeria; North Korea; North Macedonia; Norway; Oman; Pakistan; Palau; Palestine; Panama; Papua New Guinea; Paraguay; Peru; Philippines; Poland; Portugal; Puerto Rico; Qatar; Reunion; Romania; Russia; Russian Federation; Rwanda; Saint Kitts and Nevis; Samoa; San Marino; Saudi Arabia; Senegal; Serbia; Seychelles; Sierra Leone; Singapore; Slovakia; Slovenia; Solomon Islands; Somalia; South Africa; South Korea; South Sudan; Spain; Sri Lanka; Sudan; Suriname; Swaziland; Sweden; Switzerland; Syrian Arab Republic; Taiwan; Tajikistan; Tanzania; Thailand; Timor-Leste; Togo; Trinidad and Tobago; Tunisia; 
    Turkey; Uganda; Ukraine; United Arab Emirates; United Kingdom; United States; Uruguay; Uzbekistan; Vanuatu; Venezuela; Viet Nam; Virgin Islands (U.S.); Yemen; Yugoslavia; Zambia; Zimbabwe

  3. Worldwide COVID-19 Data from WHO (2025 Edition)

    • kaggle.com
    Updated Jul 3, 2025
    Cite
    Adil Shamim (2025). Worldwide COVID-19 Data from WHO (2025 Edition) [Dataset]. https://www.kaggle.com/datasets/adilshamim8/worldwide-covid-19-data-from-who
    Dataset updated
    Jul 3, 2025
    Dataset provided by
    Kaggle
    Authors
    Adil Shamim
    Description

    Dataset Overview

    This dataset contains global COVID-19 case and death data by country, collected directly from the official World Health Organization (WHO) COVID-19 Dashboard. It provides a comprehensive view of the pandemic’s impact worldwide, covering the period up to 2025. The dataset is intended for researchers, analysts, and anyone interested in understanding the progression and global effects of COVID-19 through reliable, up-to-date information.

    Source Information

    • Website: WHO COVID-19 Dashboard
    • Organization: World Health Organization (WHO)
    • Data Coverage: Global (by country/territory)
    • Time Period: Up to 2025

    The World Health Organization is the United Nations agency responsible for international public health. The WHO COVID-19 Dashboard is a trusted source that aggregates official reports from countries and territories around the world, providing daily updates on cases, deaths, and other key metrics related to COVID-19.

    Dataset Contents

    • Country/Region: The name of the country or territory.
    • Date: Reporting date.
    • New Cases: Number of new confirmed COVID-19 cases.
    • Cumulative Cases: Total confirmed COVID-19 cases to date.
    • New Deaths: Number of new confirmed deaths due to COVID-19.
    • Cumulative Deaths: Total deaths reported to date.
    • Additional fields may include population, rates per 100,000, and more (see data files for details).
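
The relationship between the "New Cases" and "Cumulative Cases" fields listed above can be sketched as a running sum. This is a minimal illustration with made-up numbers, not WHO data; the function names are hypothetical, and real files may deviate from a strict running sum when countries revise earlier reports.

```python
from itertools import accumulate


def cumulative(daily_counts):
    """Running total of daily counts, e.g. New Cases -> Cumulative Cases."""
    return list(accumulate(daily_counts))


def is_consistent(daily_counts, cumulative_counts):
    """Check that a cumulative series matches the running sum of the daily series."""
    return cumulative(daily_counts) == list(cumulative_counts)


new_cases = [5, 0, 12, 7]  # illustrative values only
cumulative_cases = cumulative(new_cases)
```

The same helpers apply to the "New Deaths" and "Cumulative Deaths" fields.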

    How to Use

    This dataset can be used for:

    • Tracking the spread and trends of COVID-19 globally and by country
    • Modeling and forecasting pandemic progression
    • Comparative analysis of the pandemic’s impact across countries and regions
    • Visualization and reporting

    Data Reliability

    The data is sourced from the WHO, widely regarded as the most authoritative source for global health statistics. However, reporting practices and data completeness may vary by country and may be subject to revision as new information becomes available.

    Acknowledgements

    Special thanks to the WHO for making this data publicly available and to all those working to collect, verify, and report COVID-19 statistics.

  4. Famous Celebrity Name Misspellings

    • kaggle.com
    Updated Jan 22, 2023
    Cite
    The Devastator (2023). Famous Celebrity Name Misspellings [Dataset]. https://www.kaggle.com/datasets/thedevastator/famous-celebrity-name-misspellings
    Dataset updated
    Jan 22, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    Famous Celebrity Name Misspellings

    Aggregated data from The Gyllenhaal Experiment

    By data.world's Admin [source]

    About this dataset

    This dataset contains aggregated spellings and misspellings of the names of 15 famous celebrities. Ever wonder if people can actually spell someone's name correctly? Now you can see for yourself with this compiled data from The Pudding's interactive spelling experiment called The Gyllenhaal Experiment! Interesting to see which names get misspelled more than others - some are easy to guess, some are surprising! With the data provided here, you can start uncovering trends in name-spelling habits. Visualize the data and start analyzing how unique or common each celebrity is with respect to spelling - who stands out? Who blends in? Check it out today and explore a side of celebrity life that hasn't been seen before!

    How to use the dataset

    This dataset contains misspellings of the names of 15 famous celebrities. It can be used for a variety of research and analysis purposes, including exploring human language, understanding how names are misspelled, or generating data visualizations.

    To get the most out of this dataset, you will need to familiarize yourself with its columns. The dataset consists of two columns: “data” and “updated”. The “data” column contains the misspellings associated with each celebrity name. The “updated” column records the date on which the data was last changed.

    To use this dataset for your own research and analysis purposes, you may find it useful to filter out certain types of responses or patterns in order to focus more closely on particular trends or topics of interest; for example, if you are interested in exploring how spellings vary by region then you might wish to group together similar responses regardless of whether they exactly match one celebrity name over another (i.e., categorizing all spellings that follow a certain phonetic pattern). You can also separate different types of responses into separate groups in order to explore different aspects such as popularity (i.e., looking at which misspellings occurred most frequently).
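
Counting which misspellings occur most frequently, as suggested above, might look like the sketch below. The sample responses are hypothetical strings, not rows from data-all.csv, and the actual layout of the “data” column may need extra parsing before this step.

```python
from collections import Counter


def spelling_counts(responses):
    """Count spellings after trimming whitespace and lower-casing."""
    normalized = (r.strip().lower() for r in responses if r.strip())
    return Counter(normalized)


# Hypothetical responses, for illustration only
responses = ["Gyllenhaal", "gyllenhal", "Gyllenhal ", "gylenhaal", "gyllenhal"]
counts = spelling_counts(responses)
top_spelling, top_count = counts.most_common(1)[0]
```

Normalizing case and whitespace first keeps trivially different responses in one group; grouping by phonetic pattern would need a further step that is not shown here.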

    Research Ideas

    • Creating an interactive quiz for users to test their spelling ability by challenging them to spell names correctly from the celebrity dataset.
    • Building a dictionary database of the misspellings, fans’ nicknames and phonetic spellings of each celebrity so that people can find more information about them more easily and accurately.
    • Measuring the popularity of individual celebrities by tracking the frequency with which their names are misspelled.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    See the dataset description for more information.

    Columns

    File: data-all.csv

    | Column name | Description |
    |:------------|:------------|
    | data | Misspellings of celebrity names. (String) |
    | updated | Date when the misspelling was last updated. (Date) |

    Acknowledgements

    If you use this dataset in your research, please credit data.world's Admin.

  5. Dataset of books called Denying democracy : how the IMF and World Bank take...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Denying democracy : how the IMF and World Bank take power from people [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Denying+democracy+%3A+how+the+IMF+and+World+Bank+take+power+from+people
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is Denying democracy : how the IMF and World Bank take power from people. It features 7 columns including author, publication date, language, and book publisher.

  6. Country Mapping - ISO, Continent, Region

    • kaggle.com
    Updated Dec 15, 2019
    Cite
    Andrada (2019). Country Mapping - ISO, Continent, Region [Dataset]. https://www.kaggle.com/andradaolteanu/country-mapping-iso-continent-region/notebooks
    Dataset updated
    Dec 15, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Andrada
    Description

    Context

    I needed this dataset to map some countries in the analysis: Advanced Global Warming Analysis with Plotly. Feel free to use this mapping for whatever cool analysis you're doing. :)

    Content

    • name - Country name in English
    • alpha-2 - ISO code formed of 2 letters
    • alpha-3 - ISO code formed of 3 letters (use this in your plotly maps ;) )
    • country code - unique
    • region - the continent the country belongs to
    • sub-region - subcontinent
    • intermediate region
    • codes for region/ subregion/ intermediate region
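
As a rough sketch of how the columns above can be used, the snippet below builds an alpha-2 to alpha-3 lookup with Python's standard `csv` module. The inline sample is an assumption for illustration: it uses a subset of the columns and three rows typed by hand, so the header names and values should be checked against the actual file.

```python
import csv
import io

# Hand-typed sample with a subset of the columns; not read from the real file.
SAMPLE_CSV = """name,alpha-2,alpha-3,country-code,region,sub-region
France,FR,FRA,250,Europe,Western Europe
Japan,JP,JPN,392,Asia,Eastern Asia
United States,US,USA,840,Americas,Northern America
"""


def build_alpha2_to_alpha3(csv_text):
    """Map 2-letter ISO codes to 3-letter ISO codes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["alpha-2"]: row["alpha-3"] for row in reader}


alpha3_by_alpha2 = build_alpha2_to_alpha3(SAMPLE_CSV)
```

With the real file, `io.StringIO(csv_text)` would be replaced by an open file handle.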

    Acknowledgements

    Dataset was taken from lukes on GitHub: https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv. I made only some small changes to the country names to match my needs in the dataset (e.g. United States of America changed to United States).

  7. World Population Statistics - 2023

    • kaggle.com
    Updated Jan 9, 2024
    Cite
    Bhavik Jikadara (2024). World Population Statistics - 2023 [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/world-population-statistics-2023
    Dataset updated
    Jan 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bhavik Jikadara
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description
    • The current US Census Bureau world population estimate in June 2019 shows that the current global population is 7,577,130,400 people on Earth, which far exceeds the world population of 7.2 billion in 2015. Our estimate based on UN data shows the world's population surpassing 7.7 billion.
    • China is the most populous country in the world with a population exceeding 1.4 billion. It is one of just two countries with a population of more than 1 billion, with India being the second. As of 2018, India has a population of over 1.355 billion people, and its population growth is expected to continue through at least 2050. By the year 2030, India is expected to become the most populous country in the world. This is because India’s population will grow, while China is projected to see a loss in population.
    • The following 11 countries that are the most populous in the world each have populations exceeding 100 million. These include the United States, Indonesia, Brazil, Pakistan, Nigeria, Bangladesh, Russia, Mexico, Japan, Ethiopia, and the Philippines. Of these nations, all are expected to continue to grow except Russia and Japan, which will see their populations drop by 2030 before falling again significantly by 2050.
    • Many other nations have populations of at least one million, while there are also countries that have just thousands. The smallest population in the world can be found in Vatican City, where only 801 people reside.
    • In 2018, the world’s population growth rate was 1.12%. Every five years since the 1970s, the population growth rate has continued to fall. The world’s population is expected to continue to grow larger but at a much slower pace. By 2030, the population will exceed 8 billion. In 2040, this number will grow to more than 9 billion. In 2055, the number will rise to over 10 billion, and another billion people won’t be added until near the end of the century. The current annual population growth estimates from the United Nations are in the millions - estimating that over 80 million new lives are added yearly.
    • This population growth will be significantly impacted by nine specific countries which are situated to contribute to the population growth more quickly than other nations. These nations include the Democratic Republic of the Congo, Ethiopia, India, Indonesia, Nigeria, Pakistan, Uganda, the United Republic of Tanzania, and the United States of America. Particularly of interest, India is on track to overtake China's position as the most populous country by 2030. Additionally, multiple nations within Africa are expected to double their populations before fertility rates begin to slow entirely.

    Content

    • This dataset contains historical population data for every country/territory in the world, along with parameters such as area, continent, capital, density, population growth rate, population rank, and world population percentage.

    Dataset Glossary (Column-Wise):

    • Rank: Rank by Population.
    • CCA3: 3-letter Country/Territories Code.
    • Country/Territories: Name of the Country/Territories.
    • Capital: Name of the Capital.
    • Continent: Name of the Continent.
    • 2022 Population: Population of the Country/Territories in the year 2022.
    • 2020 Population: Population of the Country/Territories in the year 2020.
    • 2015 Population: Population of the Country/Territories in the year 2015.
    • 2010 Population: Population of the Country/Territories in the year 2010.
    • 2000 Population: Population of the Country/Territories in the year 2000.
    • 1990 Population: Population of the Country/Territories in the year 1990.
    • 1980 Population: Population of the Country/Territories in the year 1980.
    • 1970 Population: Population of the Country/Territories in the year 1970.
    • Area (km²): Area size of the Country/Territories in square kilometers.
    • Density (per km²): Population Density per square kilometer.
    • Growth Rate: Population Growth Rate by Country/Territories.
    • World Population Percentage: The population percentage by each Country/Territories.
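
The derived columns in the glossary above can be illustrated with a short sketch. It assumes the usual definitions (density as population divided by area, world percentage as population divided by world population); the dataset does not state which population year it uses for these columns, and the numbers below are hypothetical.

```python
def density_per_km2(population: int, area_km2: float) -> float:
    """Population density in people per square kilometer."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return population / area_km2


def world_share_percent(population: int, world_population: int) -> float:
    """Share of the world population, in percent."""
    if world_population <= 0:
        raise ValueError("world population must be positive")
    return 100.0 * population / world_population


# Hypothetical territory, for illustration only
population = 2_000_000
area_km2 = 50_000
world_population = 8_000_000_000

density = density_per_km2(population, area_km2)
share = world_share_percent(population, world_population)
```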

  8. Total population worldwide 1950-2100

    • statista.com
    • ai-chatbox.pro
    Updated Feb 24, 2025
    Cite
    Statista (2025). Total population worldwide 1950-2100 [Dataset]. https://www.statista.com/statistics/805044/total-population-worldwide/
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    The world population surpassed eight billion people in 2022, having doubled from its figure less than 50 years previously. Looking forward, it is projected that the world population will reach nine billion in 2038, and 10 billion in 2060, but it will peak around 10.3 billion in the 2080s before it then goes into decline.

    Regional variations

    The global population has seen rapid growth since the early 1800s, due to advances in areas such as food production, healthcare, water safety, education, and infrastructure; however, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which will rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate).

    Changing projections

    The United Nations releases its World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen a notable decline in projections of when the global population will peak, and at what number. Previous reports in the 2010s had suggested a peak of over 11 billion people, and that population growth would continue into the 2100s; however, an earlier and lower peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolonged development arc in Sub-Saharan Africa.

  9. Worldwide Soundscapes project meta-data

    • zenodo.org
    Updated Dec 9, 2022
    Cite
    Kevin F.A. Darras; Rodney Rountree; Steven Van Wilgenburg; Amandine Gasc; 松海 李; 黎君 董; Yuhang Song; Youfang Chen; Thomas Cherico Wanger (2022). Worldwide Soundscapes project meta-data [Dataset]. http://doi.org/10.5281/zenodo.7415473
    Dataset updated
    Dec 9, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kevin F.A. Darras; Rodney Rountree; Steven Van Wilgenburg; Amandine Gasc; 松海 李; 黎君 董; Yuhang Song; Youfang Chen; Thomas Cherico Wanger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Worldwide Soundscapes project is a global, open inventory of spatio-temporally replicated soundscape datasets. This Zenodo entry comprises the data tables that constitute its (meta-)database, as well as their description.

    The overview of all sampling sites can be found on the corresponding project on ecoSound-web, as well as a demonstration collection containing selected recordings. More information on the project can be found here and on ResearchGate.

    The audio recording criteria justifying inclusion into the meta-database are:

    • Stationary (no transects, towed sensors or microphones mounted on cars)
    • Passive (unattended, no human disturbance by the recordist)
    • Ambient (no spatial or temporal focus on a particular species or direction)
    • Spatially and/or temporally replicated (multiple sites sampled at least at one common daytime or multiple days sampled at least in one common site)

    The individual columns of the provided data tables are described in the following. Data tables are linked through primary keys; joining them will result in a database.

    datasets

    • dataset_id: incremental integer, primary key
    • name: name of the dataset. if it is repeated, incremental integers should be used in the "subset" column to differentiate them.
    • subset: incremental integer that can be used to distinguish datasets with identical names
    • collaborators: full names of people deemed responsible for the dataset, separated by commas
    • contributors: full names of people who are not the main collaborators but who have significantly contributed to the dataset, and who could be contacted for in-depth analyses, separated by commas.
    • date_added: when the dataset was added (DD/MM/YYYY)
    • URL_open_recordings: if recordings (even only some) from this dataset are openly available, indicate the internet link where they can be found.
    • URL_project: internet link for further information about the corresponding project
    • DOI_publication: DOI of corresponding publications, separated by comma
    • core_realm_IUCN: The core realm of the dataset. Datasets may have multiple realms, but the main one should be listed. Datasets may contain sampling sites from different realms in the "sites" sheet. IUCN Global Ecosystem Typology (v2.0): https://global-ecosystems.org/
    • medium: the physical medium the microphone is situated in
    • protected_area: Whether the sampling sites were situated in protected areas or not, or only some.
    • GADM0: For datasets on land or in territorial waters, Global Administrative Database level0
      https://gadm.org/
    • GADM1: For datasets on land or in territorial waters, Global Administrative Database level1
      https://gadm.org/
    • GADM2: For datasets on land or in territorial waters, Global Administrative Database level2
      https://gadm.org/
    • IHO: For marine locations, the sea area that encompasses all the sampling locations according to the International Hydrographic Organisation. Map here: https://www.arcgis.com/home/item.html?id=44e04407fbaf4d93afcb63018fbca9e2
    • locality: optional free text about the locality
    • latitude_numeric_region: study region approximate centroid latitude in WGS84 decimal degrees
    • longitude_numeric_region: study region approximate centroid longitude in WGS84 decimal degrees
    • sites_number: number of sites sampled
    • year_start: starting year of the sampling
    • year_end: ending year of the sampling
    • deployment_schedule: description of the sampling schedule, provisional
    • temporal_recording_selection: list environmental exclusion criteria that were used to determine which recording days or times to discard
    • high_pass_filter_Hz: frequency of the high-pass filter of the recorder, in Hz
    • variable_sampling_frequency: Does the sampling frequency vary? If it does, write "NA" in the sampling_frequency_kHz column and indicate it in the sampling_frequency_kHz column inside the deployments sheet
    • sampling_frequency_kHz: frequency the microphone was sampled at (sounds up to half that frequency will be recorded)
    • variable_recorder:
    • recorder: recorder model used
    • microphone: microphone used
    • freshwater_recordist_position: position of the recordist relative to the microphone during sampling (only for freshwater)
    • collaborator_comments: free-text field for comments by the collaborators
    • validated: This cell is checked if the contents of all sheets are complete and have been found to be coherent and consistent with our requirements.
    • validator_name: name of person doing the validation
    • validation_comments: validators: please insert the date when someone was contacted
    • cross-check: this cell is checked if the collaborators confirm the spatial and temporal data after checking the corresponding site maps, deployment and operation time graphs found at https://drive.google.com/drive/folders/1qfwXH_7dpFCqyls-c6b8RZ_fbcn9kXbp?usp=share_link

    datasets-sites

    • dataset_ID: primary key of datasets table
    • dataset_name: lookup field
    • site_ID: primary key of sites table
    • site_name: lookup field

    sites

    • site_ID: unique site IDs, larger than 1000 for compatibility with ecoSound-web
    • site_name: name or code of sampling site as used in respective projects
    • latitude_numeric: exact numeric degrees coordinates of latitude
    • longitude_numeric: exact numeric degrees coordinates of longitude
    • topography_m: elevation for sites on land, depth (negative) for marine sites, in meters
    • freshwater_depth_m
    • realm: Ecosystem type according to IUCN GET https://global-ecosystems.org/
    • biome: Ecosystem type according to IUCN GET https://global-ecosystems.org/
    • functional_group: Ecosystem type according to IUCN GET https://global-ecosystems.org/
    • comments
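
The statement that the tables are linked through primary keys can be illustrated with plain Python dictionaries. This sketch resolves the sites of one dataset through the datasets-sites link table; the rows are invented for illustration, only a few of the documented columns are included, and the helper name `sites_for_dataset` is hypothetical.

```python
# Invented rows using a few of the documented column names
sites = [
    {"site_ID": 1001, "site_name": "A1", "latitude_numeric": 10.5, "longitude_numeric": -84.0},
    {"site_ID": 1002, "site_name": "A2", "latitude_numeric": 10.6, "longitude_numeric": -84.1},
    {"site_ID": 1003, "site_name": "B1", "latitude_numeric": 48.1, "longitude_numeric": 11.6},
]
datasets_sites = [
    {"dataset_ID": 1, "site_ID": 1001},
    {"dataset_ID": 1, "site_ID": 1002},
    {"dataset_ID": 2, "site_ID": 1003},
]


def sites_for_dataset(dataset_id, link_rows, site_rows):
    """Follow the link table from a dataset key to its site records."""
    sites_by_id = {site["site_ID"]: site for site in site_rows}
    return [
        sites_by_id[link["site_ID"]]
        for link in link_rows
        if link["dataset_ID"] == dataset_id and link["site_ID"] in sites_by_id
    ]


site_names = [s["site_name"] for s in sites_for_dataset(1, datasets_sites, sites)]
```

The same pattern extends to the deployments table, which also carries a dataset_ID column.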

    deployments

    • dataset_ID: primary key of datasets table
    • dataset_name: lookup field
    • deployment: use identical subscript letters to denote rows that belong to the same deployment. For instance, you may use different operation times and schedules for different target taxa within one deployment.
    • start_date_min: earliest date of deployment start, double-click cell to get date-picker
    • start_date_max: latest date of deployment start, if applicable (only used when recorders were deployed over several days), double-click cell to get date-picker
    • start_time_mixed: deployment start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording start time for continuous recording deployments. If multiple start times were used, you should mention the latest start time (corresponds to the earliest daytime from which all recorders are active). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
    • permanent: is the deployment permanent (in which case it would be ongoing and the end date or duration would be unknown)?
    • variable_duration_days: is the duration of the deployment variable? in days
    • duration_days: deployment duration per recorder (use the minimum if variable)
    • end_date_min: earliest date of deployment end, only needed if duration is variable, double-click cell to get date-picker
    • end_date_max: latest date of deployment end, only needed if duration is variable, double-click cell to get date-picker
    • end_time_mixed: deployment end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording end time for continuous recording deployments.
    • recording_time: does the recording last from the deployment start time to the end time (continuous) or at scheduled daily intervals (scheduled)? Note: we consider recordings with duty cycles to be continuous.
    • operation_start_time_mixed: scheduled recording start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
    • operation_duration_minutes: duration of operation in minutes, if constant
    • operation_end_time_mixed: scheduled recording end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
    • duty_cycle_minutes: duty cycle of the recording (i.e. the fraction of minutes when it is recording), written as "recording(minutes)/period(minutes)". For example: "1/6" if the recorder is active for 1 minute and standing by for 5 minutes.
    • sampling_frequency_kHz: only indicate the sampling frequency if it is variable within a particular dataset so that we need to code different frequencies for different deployments
    • recorder
    • subset_sites: If the deployment was not done in all the sites of the
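    The duty_cycle_minutes and *_time_mixed fields described above follow small textual formats ("1/6", "HH:MM", "sunrise-60"). A minimal sketch of how they might be parsed is given below. The function names and the returned structures are illustrative assumptions and are not part of the project's scripts. The offset after a solar daytime is assumed to be in minutes, as the "sunrise-60" example for one hour before sunrise suggests.

```python
import re
from fractions import Fraction


def parse_duty_cycle(text):
    """Parse "recording(minutes)/period(minutes)" into the recording fraction."""
    recording, period = (int(part) for part in text.split("/"))
    if period <= 0 or not 0 < recording <= period:
        raise ValueError(f"invalid duty cycle: {text!r}")
    return Fraction(recording, period)


def parse_mixed_time(text):
    """Parse "HH:MM" or a solar daytime with an optional +/- offset (assumed minutes)."""
    clock = re.fullmatch(r"(\d{1,2}):(\d{2})", text)
    if clock:
        hours, minutes = int(clock.group(1)), int(clock.group(2))
        if hours > 23 or minutes > 59:
            raise ValueError(f"invalid clock time: {text!r}")
        return {"kind": "clock", "minutes_after_midnight": hours * 60 + minutes}
    solar = re.fullmatch(r"(sunrise|sunset|noon|midnight)([+-]\d+)?", text)
    if solar:
        # A missing offset means the solar event itself.
        offset = int(solar.group(2) or 0)
        return {"kind": "solar", "event": solar.group(1), "offset_minutes": offset}
    raise ValueError(f"unrecognised time value: {text!r}")


# Active for 1 minute out of every 6-minute period.
print(parse_duty_cycle("1/6"))
# One hour before sunrise.
print(parse_mixed_time("sunrise-60"))
```

    Returning a Fraction keeps the duty cycle exact, so it can be multiplied by a deployment duration without rounding.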

  10. Dataset of books called People and education in the Third World

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called People and education in the Third World [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=People+and+education+in+the+Third+World
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description

    This dataset is about books. It has 1 row and is filtered where the book is People and education in the Third World. It features 7 columns including author, publication date, language, and book publisher.

  11. aggregate-data-italian-cities-from-wikipedia

    • kaggle.com
    Updated May 20, 2020
    Cite
    alepuzio (2020). aggregate-data-italian-cities-from-wikipedia [Dataset]. https://www.kaggle.com/alepuzio/aggregatedataitaliancitiesfromwikipedia/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 20, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    alepuzio
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    This dataset is the result of my study of web scraping English Wikipedia in R and my tests of regression and classification modeling in R.

    Content

    The content was created by reading the relevant English Wikipedia articles about Italian cities. I did not run any NLP analysis; I only read the tables with the data, and I ranked every city from 0 to N in every aspect. In these values, 0 means "*the city is not ranked in this aspect*" and N means "*the city is in first place, in descending order of importance, in this aspect*". If there is no ranking in a particular aspect (for example, only the existence of airports/harbours is known, with no additional data about traffic or size), then 0 means "*no existence*" and N means "*there are N airports/harbours*". The only non-numeric column is the one with the names of the cities in English, with some exceptions (for example, "*Bra (CN)*") kept for simplicity.

    Acknowledgements

    I thank the Wikimedia Foundation for its work and its mission, and for making available the cover image of this dataset (please read the article "The Ideal city (painting)"). I also thank StackOverflow and Cross-Validated for being the most important hubs of technical knowledge in the world, and everyone on Kaggle for their suggestions.

    Inspiration

    As a beginner in data analysis and modeling (OK, I passed the statistics exam at Politecnico di Milano (Italy), but I have not worked on this topic for more than 10 years and my memory is getting old ^_^), I focused on data cleaning, dataset building and building the simplest models.

    You can use this dataset to work out which city is good to live in, or expand it with other data from Wikipedia (not only by reading the tables but also by reading the text and extracting data from the unstructured text).

  12. Dataset of books called Between heaven and earth : the religious worlds...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Between heaven and earth : the religious worlds people make and the scholars who study them [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Between+heaven+and+earth+%3A+the+religious+worlds+people+make+and+the+scholars+who+study+them
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Earth
    Description

    This dataset is about books. It has 2 rows and is filtered where the book is Between heaven and earth : the religious worlds people make and the scholars who study them. It features 7 columns including author, publication date, language, and book publisher.

  13. Data from: DOO-RE: A dataset of ambient sensors in a meeting room for...

    • figshare.com
    zip
    Updated Feb 23, 2024
    Cite
    Hyunju Kim (2024). DOO-RE: A dataset of ambient sensors in a meeting room for activity recognition [Dataset]. http://doi.org/10.6084/m9.figshare.24558619.v3
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 23, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Hyunju Kim
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We release the DOO-RE dataset, which consists of data streams from 11 types of ambient sensors, collected 24/7 in a real-world meeting room. 4 types of ambient sensors, called environment-driven sensors, measure continuous state changes in the environment (e.g. sound), and 4 types of sensors, called user-driven sensors, capture user state changes (e.g. motion). The remaining 3 types of sensors, called actuator-driven sensors, check whether the attached actuators are active (e.g. projector on/off). The values of each sensor are automatically collected by IoT agents which are responsible for each sensor in our IoT system. A part of the collected sensor data stream representing a user activity is extracted as an activity episode in the DOO-RE dataset. Each episode's activity labels are annotated and validated by cross-checking and the consent of multiple annotators. A total of 9 activity types appear in the space: 3 based on single users and 6 based on group (i.e. 2 or more people) users. As a result, DOO-RE comprises 696 labeled episodes of single and group activities from the meeting room. DOO-RE is a novel dataset created in a public space that retains the properties of a real-world environment and has the potential to be useful for developing powerful activity recognition approaches.

  14. Dataset of books called Baptists through the centuries : a history of a...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Baptists through the centuries : a history of a global people [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Baptists+through+the+centuries+%3A+a+history+of+a+global+people
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 2 rows and is filtered where the book is Baptists through the centuries : a history of a global people. It features 7 columns including author, publication date, language, and book publisher.

  15. GBIF Backbone Taxonomy

    • gbif.org
    • smng.net
    • +3more
    Updated Nov 17, 2023
    Cite
    GBIF Secretariat (2023). GBIF Backbone Taxonomy [Dataset]. http://doi.org/10.15468/39omei
    Explore at:
    Dataset updated
    Nov 17, 2023
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The GBIF Backbone Taxonomy is a single, synthetic management classification with the goal of covering all names GBIF is dealing with. It's the taxonomic backbone that allows GBIF to integrate name based information from different resources, no matter if these are occurrence datasets, species pages, names from nomenclators or external sources like EOL, Genbank or IUCN. This backbone allows taxonomic search, browse and reporting operations across all those resources in a consistent way and to provide means to crosswalk names from one source to another.

    It is updated regularly through an automated process in which the Catalogue of Life acts as a starting point, also providing the complete higher classification above families. Additional scientific names only found in other authoritative nomenclatural and taxonomic datasets are then merged into the tree, thus extending the original catalogue and broadening the backbone's name coverage. The GBIF Backbone Taxonomy also includes identifiers for Operational Taxonomic Units (OTUs) drawn from the barcoding resources iBOL and UNITE.

    International Barcode of Life project (iBOL), Barcode Index Numbers (BINs). BINs are connected to a taxon name and its classification by taking into account all names applied to the BIN and picking names with at least 80% consensus. If there is no consensus of name at the species level, the selection process is repeated moving up the major Linnaean ranks until consensus is achieved.
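    The BIN name selection described above (pick a name with at least 80% consensus, otherwise move up the ranks) can be sketched as follows. This is only an illustration under stated assumptions and not GBIF's actual implementation: the function name, the record layout, the list of ranks, and the choice to measure consensus as a share of all records are hypothetical.

```python
from collections import Counter

# Assumed ordering of the "major Linnaean ranks", from lowest to highest.
RANKS = ("species", "genus", "family", "order", "class", "phylum", "kingdom")


def consensus_name(records, threshold=0.8):
    """Return (rank, name) for the lowest rank whose most common name
    reaches the threshold share of all records, or None if no rank does.

    records: one dict per name applied to the BIN, mapping rank -> name.
    """
    if not records:
        return None
    for rank in RANKS:
        counts = Counter(r[rank] for r in records if r.get(rank))
        if not counts:
            continue
        name, count = counts.most_common(1)[0]
        # Assumption: consensus is the share among all records for the BIN.
        if count / len(records) >= threshold:
            return rank, name
    return None


# Hypothetical BIN: species names disagree (3 vs 2), the genus is unanimous.
records = (
    [{"species": "Aus bus", "genus": "Aus"}] * 3
    + [{"species": "Aus cus", "genus": "Aus"}] * 2
)
print(consensus_name(records))  # ('genus', 'Aus')
```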

    UNITE - Unified system for the DNA based fungal species, Species Hypotheses (SHs). SHs are connected to a taxon name and its classification based on the determination of the RefS (reference sequence) if present or the RepS (representative sequence). In the latter case, if there is no match in the UNITE taxonomy, the lowest rank with 100% consensus within the SH will be used.

    The GBIF Backbone Taxonomy is available for download at https://hosted-datasets.gbif.org/datasets/backbone/ in different formats together with an archive of all previous versions.

    The following 105 sources have been used to assemble the GBIF backbone, with the number of names from each source given after its title:

    • Catalogue of Life Checklist - 4766428 names
    • International Barcode of Life project (iBOL) Barcode Index Numbers (BINs) - 635951 names
    • UNITE - Unified system for the DNA based fungal species linked to the classification - 611208 names
    • The Paleobiology Database - 212054 names
    • World Register of Marine Species - 188857 names
    • The Interim Register of Marine and Nonmarine Genera - 183894 names
    • The World Checklist of Vascular Plants (WCVP) - 131891 names
    • GBIF Backbone Taxonomy - 114350 names
    • TAXREF - 109374 names
    • The Leipzig catalogue of vascular plants - 75380 names
    • ZooBank - 73549 names
    • Integrated Taxonomic Information System (ITIS) - 68377 names
    • Plazi.org taxonomic treatments database - 61346 names
    • Genome Taxonomy Database r207 - 60545 names
    • International Plant Names Index - 52329 names
    • Fauna Europaea - 45077 names
    • The National Checklist of Taiwan (Catalogue of Life in Taiwan, TaiCoL) - 36193 names
    • Dyntaxa. Svensk taxonomisk databas - 35892 names
    • The Plant List with literature - 32692 names
    • United Kingdom Species Inventory (UKSI) - 29643 names
    • Artsnavnebasen - 29208 names
    • The IUCN Red List of Threatened Species - 21221 names
    • Afromoths, online database of Afrotropical moth species (Lepidoptera) - 13961 names
    • Brazilian Flora 2020 project - Projeto Flora do Brasil 2020 - 13829 names
    • Prokaryotic Nomenclature Up-to-Date (PNU) - 10079 names
    • Checklist Dutch Species Register - Nederlands Soortenregister - 8814 names
    • ICTV Master Species List (MSL) - 7852 names
    • Cockroach Species File - 6020 names
    • GRIN Taxonomy - 5882 names
    • Taxon list of fungi and fungal-like organisms from Germany compiled by the DGfM - 4570 names
    • Catalogue of Afrotropical Bees - 3623 names
    • Catalogue of Tenebrionidae (Coleoptera) of North America - 3327 names
    • Checklist of Beetles (Coleoptera) of Canada and Alaska. Second Edition. - 3312 names
    • Systema Dipterorum - 2850 names
    • Catalogue of the Pterophoroidea of the World - 2807 names
    • The Clements Checklist - 2675 names
    • Taxon list of Hymenoptera from Germany compiled in the context of the GBOL project - 2496 names
    • IOC World Bird List, v13.2 - 2366 names
    • Official Lists and Indexes of Names in Zoology - 2310 names
    • National checklist of all species occurring in Denmark - 1922 names
    • Myriatrix - 1876 names
    • Database of Vascular Plants of Canada (VASCAN) - 1822 names
    • Taxon list of vascular plants from Bavaria, Germany compiled in the context of the BFL project - 1771 names
    • Orthoptera Species File - 1742 names
    • A list of the terrestrial fungi, flora and fauna of Madeira and Selvagens archipelagos - 1602 names
    • Aphid Species File - 1565 names
    • World Spider Catalog - 1561 names
    • Taxon list of Jurassic Pisces of the Tethys Palaeo-Environment compiled at the SNSB-JME - 1270 names
    • Backbone Family Classification Patch - 1143 names
    • GBIF Algae Classification - 1100 names
    • International Cichorieae Network (ICN): Cichorieae Portal - 975 names
    • Psocodea Species File - 803 names
    • New Zealand Marine Macroalgae Species Checklist - 787 names
    • Annotated checklist of endemic species from the Western Balkans - 754 names
    • Taxon list of animals with German names (worldwide) compiled at the SMNS - 503 names
    • Catalogue of the Alucitoidea of the World - 472 names
    • Lygaeoidea Species File - 462 names
    • Catálogo de Plantas y Líquenes de Colombia - 422 names
    • GBIF Backbone Patch - 317 names
    • Phasmida Species File - 259 names
    • Cortinariaceae fetched from the Index Fungorum API - 234 names
    • Coreoidea Species File - 233 names
    • GTDB supplement - 139 names
    • Mantodea Species File - 119 names
    • Endemic species in Taiwan - 93 names
    • Taxon list of Araneae from Germany compiled in the context of the GBOL project - 88 names
    • Species of Hominidae - 78 names
    • Taxon list of Sternorrhyncha from Germany compiled in the context of the GBOL project - 77 names
    • Taxon list of mosses from Germany compiled in the context of the GBOL project - 75 names
    • Mammal Species of the World - 73 names
    • Plecoptera Species File - 71 names
    • Species Fungorum Plus - 64 names
    • Catalogue of the type specimens of Cosmopterigidae (Lepidoptera: Gelechioidea) from research collections of the Zoological Institute, Russian Academy of Sciences - 47 names
    • Species named after famous people - 41 names
    • Dermaptera Species File - 36 names
    • Taxon list of Trichoptera from Germany compiled in the context of the GBOL project - 34 names
    • True Fruit Flies (Diptera, Tephritidae) of the Afrotropical Region - 33 names
    • Range and Regularities in the Distribution of Earthworms of the Earthworms of the USSR Fauna. Perel, 1979 - 32 names
    • Taxon list of Diplura from Germany compiled in the context of the GBOL project - 30 names
    • Lista de referencia de especies de aves de Colombia - 2022 - 24 names
    • Taxon list of Auchenorrhyncha from Germany compiled in the context of the GBOL project - 20 names
    • Catalogue of the type specimens of Polycestinae (Coleoptera: Buprestidae) from research collections of the Zoological Institute, Russian Academy of Sciences - 19 names
    • Taxon list of Thysanoptera from Germany compiled in the context of the GBOL project - 19 names
    • Lista de especies de vertebrados registrados en jurisdicción del Departamento del Huila - 18 names
    • Taxon list of Microcoryphia (Archaeognatha) from Germany compiled in the context of the GBOL project - 15 names
    • Catalogue of the type specimens of Bufonidae and Megophryidae (Amphibia: Anura) from research collections of the Zoological Institute, Russian Academy of Sciences - 12 names
    • Grylloblattodea Species File - 11 names
    • Coleorrhyncha Species File - 9 names
    • Taxon list of liverworts from Germany compiled in the context of the GBOL project - 9 names
    • Embioptera Species File - 7 names
    • Taxon list of Pisces and Cyclostoma from Germany compiled in the context of the GBOL project - 6 names
    • Taxon list of Pteridophyta from Germany compiled in the context of the GBOL project - 6 names
    • Taxon list of Siphonaptera from Germany compiled in the context of the GBOL project - 5 names
    • The Earthworms of the Fauna of Russia. Perel, 1997 - 5 names
    • Taxon list of Zygentoma from Germany compiled in the context of the GBOL project - 4 names
    • Asiloid Flies: new taxa of Diptera: Apioceridae, Asilidae, and Mydidae - 3 names
    • Taxon list of Protura from Germany compiled in the context of the GBOL project - 3 names
    • Taxon list of hornworts from Germany compiled in the context of the GBOL project - 2 names
    • Chrysididae Species File - 1 names
    • Taxon list of Dermaptera from Germany compiled in the context of the GBOL project - 1 names
    • Taxon list of Diplopoda from Germany in the context of the GBOL project - 1 names
    • Taxon list of Orthoptera (Grashoppers) from Germany compiled at the SNSB - 1 names
    • Taxon list of Pscoptera from Germany compiled in the context of the GBOL project - 1 names
    • Taxon list of Pseudoscorpiones from Germany compiled in the context of the GBOL project - 1 names
    • Taxon list of Raphidioptera from Germany compiled in the context of the GBOL project - 1 names

  16. Global Sanctions Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jan 15, 2025
    Cite
    Bright Data (2025). Global Sanctions Dataset [Dataset]. https://brightdata.com/products/datasets/global-sanctions
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Jan 15, 2025
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    This dataset provides in-depth information on individuals who are included in international sanctions lists and are currently facing economic sanctions from various countries and international organizations. Key data attributes include first name, last name, citizenship, passport details, address, date of proscription and reason for listing. This information helps organizations ensure compliance with sanctions regulations and avoid the risks of doing business with sanctioned entities.

    Popular attributes:

    ✔ Financial Intelligence

    ✔ Credit Risk Analysis

    ✔ Compliance

    ✔ Bank Data Enrichment

    ✔ Account Profiling

  17. Global B2B people Data | 720M+ LinkedIn Profiles | Verified & Bi-Weekly...

    • opendatabay.com
    .undefined
    Updated Jun 5, 2025
    Cite
    Forager (2025). Global B2B people Data | 720M+ LinkedIn Profiles | Verified & Bi-Weekly Updates [Dataset]. https://www.opendatabay.com/data/premium/5ff38f72-201c-469b-aa7c-5cba9ddb2ac3
    Explore at:
    Available download formats: .undefined
    Dataset updated
    Jun 5, 2025
    Dataset authored and provided by
    Forager
    Area covered
    Synthetic Data Generation
    Description

    🌍 Global B2B Person Dataset | 755M+ LinkedIn Profiles | Verified & Bi-Weekly Updated

    Access the world’s most comprehensive professional dataset, enriched with over 755 million LinkedIn profiles. The Forager.ai Global B2B Person Dataset delivers work-verified professional contacts with 95%+ accuracy, refreshed every two weeks. Ideal for recruitment, sales, research, and talent mapping, it provides direct access to decision-makers, specialists, and executives across industries and geographies.

    Dataset Features

    Full Name & Job Title: Up-to-date first/last name with current professional role.

    Emails & Phone Numbers: AI-validated work and personal email addresses, plus mobile numbers.

    Company Info: Current employer name, industry, and company size (employee count).

    Career History: Detailed work history with job titles, durations, and role progressions.

    Skills & Endorsements: Extracted from public LinkedIn profiles.

    Education & Certifications: Universities, degrees, and professional certifications.

    Location & LinkedIn URL: City, country, and direct link to public LinkedIn profile.

    Distribution

    Data Volume: 755M+ total profiles, with 270M+ containing full contact information.

    Formats Available: CSV, JSON via S3 or Snowflake; API for real-time access.

    Access Methods: REST API, Enrichment API (lookup), full dataset delivery, or custom solutions.

    Usage

    This dataset is ideal for a variety of applications:

    Executive Recruitment: Source passive talent, build role-based maps, and assess mobility.

    Sales Intelligence: Find decision-makers, personalize outreach, and trigger campaigns on job changes.

    Market Research: Understand talent concentration by company, geography, and skill set.

    Partnership Development: Identify key stakeholders in target firms for business development.

    Talent Mapping & Strategic Hiring: Build full organizational charts and skill distribution heatmaps.

    Coverage

    Geographic Coverage: Global – including North America, EMEA, LATAM, and APAC.

    Time Range: Continuously updated; profiles refreshed bi-weekly.

    Demographics: Cross-industry coverage of seniority levels from entry-level to C-suite, across all sectors.

    License

    CUSTOM

    Who Can Use It

    Recruiters & Staffing Firms: For building target lists and sourcing niche talent.

    Sales & RevOps Teams: For targeting by department, title, or decision-making authority.

    VCs & PE Firms: To assess leadership teams and monitor executive movement.

    Data Scientists & Analysts: To train models for job mobility, hiring trends, or org structure prediction.

    B2B Platforms: For enriching internal databases and powering account-based marketing (ABM).

  18. Dataset of books called The four freedoms histories, or, The people we are :...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called The four freedoms histories, or, The people we are : a history for boys and girls. Vol.4, Great Britain and the world, 1870-1963: the age of competition [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+four+freedoms+histories%2C+or%2C+The+people+we+are+%3A+a+history+for+boys+and+girls.+Vol.4%2C+Great+Britain+and+the+world%2C+1870-1963%3A+the+age+of+competition
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Great Britain, United Kingdom
    Description

    This dataset is about books. It has 1 row and is filtered where the book is The four freedoms histories, or, The people we are : a history for boys and girls. Vol.4, Great Britain and the world, 1870-1963: the age of competition. It features 7 columns including author, publication date, language, and book publisher.

  19. Popular White Last Names in the US

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Popular White Last Names in the US [Dataset]. https://www.johnsnowlabs.com/marketplace/popular-white-last-names-in-the-us/
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Area covered
    United States
    Description

    This dataset lists popular last names among the White population of the United States.

  20. Worldwide Soundscapes project metadata and analysis scripts

    • zenodo.org
    • data.niaid.nih.gov
    csv, zip
    Updated May 7, 2025
    Cite
    Kevin F.A. Darras; Rodney Rountree; Steven Van Wilgenburg; Amandine Gasc; Songhai Li; Lijun Dong; Youfang Chen; Thomas Cherico Wanger (2025). Worldwide Soundscapes project metadata and analysis scripts [Dataset]. http://doi.org/10.5281/zenodo.14216871
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    May 7, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kevin F.A. Darras; Rodney Rountree; Steven Van Wilgenburg; Amandine Gasc; Songhai Li; Lijun Dong; Youfang Chen; Thomas Cherico Wanger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Worldwide Soundscapes project is a global, open inventory of spatio-temporally replicated passive acoustic monitoring meta-datasets (i.e. meta-data collections). This Zenodo entry comprises the data tables that constitute its (meta-)database, as well as their description. Additionally, R scripts are provided to replicate the analysis published in [placeholder].

    The overview of all sampling sites and timelines can be found on the corresponding project on ecoSound-web, as well as a demonstration collection containing selected recordings. The recordings of this collection were annotated and analysed to explore macro-ecological trends.

    The audio recording criteria justifying inclusion into the meta-database are:

    • Stationary (no transects, towed sensors or microphones mounted on cars)
    • Passive (unattended, no human disturbance by the recordist)
    • Ambient (no directional microphone or triggered recordings, non-experimental conditions)
    • Spatially and/or temporally replicated (i.e. multiple sites sampled at the same time and/or multiple days - covering the same daytime - sampled at the same site)

    The individual columns of the provided data tables are described in the following. Data tables are linked through primary keys; joining them will result in a database. The data shared here only includes validated collections.

    Changes from version 4.0.0

    Added link to the published synthesis.

    Meta-database CSV files

    collections

    • collection_id: unique integer, primary key
    • name: name of the dataset. If it is repeated, incremental integers should be used in the "subset" column to differentiate them.
    • ecoSound-web_link: link of validated meta-collection on ecoSound-web
    • primary_contributors: full names of people deemed corresponding contributors who are responsible for the dataset
    • secondary_contributors: full names of people who are not primary contributors but who have significantly contributed to the dataset, and who could be contacted for in-depth analyses
    • date_added: when the dataset was added (YYYY-MM-DD)
    • URL_open_recordings: internet link for openly-available recordings from this collection
    • URL_project: internet link for further information about the corresponding project
    • DOI_publication: Digital Object Identifiers of corresponding publications
    • core_realm_IUCN: The main, core realm of the dataset according to IUCN Global Ecosystem Typology (v2.0): https://global-ecosystems.org/
    • medium: the physical medium the microphone is situated in
    • locality: optional free text about the locality
    • contributor_comments: free-text field for comments by the primary contributors

    collections-sites

    • dataset_ID: primary key of collections table
    • site_ID: primary key of sites table

    sites

    • site_ID: unique integer, primary key
    • site_name: internal name or code of sampling site as used in respective projects
    • latitude_numeric: site's numeric degrees of latitude
    • longitude_numeric: site's numeric degrees of longitude
    • blurred_coordinates: whether latitude and longitude coordinates are inaccurate, boolean. Coordinates may be blurred with random offsets, rounding, snapping, etc. Indicate the blurring method inside the comments field
    • topography_m: vertical position of the microphone relative to sea level, in meters. For sites on land: elevation. For marine sites: depth (negative). Only indicate if the values were measured by the collaborator.
    • freshwater_depth_m: microphone depth, only used for sites inside freshwater bodies that also have an elevation value above the sea level
    • realm: Ecosystem type: main realm according to IUCN GET https://global-ecosystems.org/
    • biome: Ecosystem type: main biome according to IUCN GET https://global-ecosystems.org/
    • functional_group: Ecosystem type: main functional group according to IUCN GET https://global-ecosystems.org/
    • contributor_comments: free text field for contributor comments
    • GADM_0: Global ADMinistrative Database level 0 classification of terrestrial site or marine site that is within territorial waters. Source: https://gadm.org/download_world.html
    • IHO: International Hydrographic Organization classification of marine site. Source: https://marineregions.org/downloads.php
    • WDPA: World Database on Protected Areas classification of the site. Source: https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA
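
    The statement above that the tables are linked through primary keys can be illustrated with a small join of collections, collections-sites and sites. This is a minimal sketch, not the project's own script: the sample rows are invented, the function name is hypothetical, and it assumes that dataset_ID in the collections-sites table holds the collection_id of the collections table.

```python
# Invented sample rows that use the column names described above.
collections = [{"collection_id": 1, "name": "Example collection"}]
collections_sites = [
    {"dataset_ID": 1, "site_ID": 10},
    {"dataset_ID": 1, "site_ID": 11},
]
sites = [
    {"site_ID": 10, "site_name": "A", "latitude_numeric": 1.5, "longitude_numeric": 103.8},
    {"site_ID": 11, "site_name": "B", "latitude_numeric": 1.6, "longitude_numeric": 103.9},
]


def join_collections_to_sites(collections, collections_sites, sites):
    """Inner-join the three tables through the collections-sites link table."""
    collections_by_id = {row["collection_id"]: row for row in collections}
    sites_by_id = {row["site_ID"]: row for row in sites}
    joined = []
    for link in collections_sites:
        # Assumption: dataset_ID refers to collections.collection_id.
        collection = collections_by_id.get(link["dataset_ID"])
        site = sites_by_id.get(link["site_ID"])
        if collection is None or site is None:
            continue  # skip links whose key has no matching row
        joined.append({
            "collection_id": collection["collection_id"],
            "collection_name": collection["name"],
            "site_ID": site["site_ID"],
            "site_name": site["site_name"],
            "latitude_numeric": site["latitude_numeric"],
            "longitude_numeric": site["longitude_numeric"],
        })
    return joined


rows = join_collections_to_sites(collections, collections_sites, sites)
for row in rows:
    print(row["collection_name"], row["site_name"])
```

    Indexing each table by its primary key first keeps the join linear in the number of rows; the same pattern would extend to the deployments table through its dataset_ID column.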

    deployments

    • dataset_ID: primary key of datasets table
    • deployment: identical subscript letters to denote rows that belong to the same deployment. For instance, you may use different operation times and schedules for different target taxa within one deployment.
    • subset_site_ID: If the deployment was not done in all the sites of the corresponding collection, site IDs where the deployment was conducted
    • start_date: date of deployment start
    • start_time_mixed: deployment start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset). Corresponds to the recording start time for continuous recording deployments. If multiple start times were used, you should mention the latest start time (corresponds to the earliest daytime from which all recorders are active). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
    • permanent: whether the deployment is permanent, boolean
    • end_date: date of deployment end (date when last scheduled operation starts)
    • end_time_mixed: deployment end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording end time for continuous recording deployments.
    • operation_mode: continuous: recording takes place from the deployment start date-time to deployment end date-time.
      periodical: recording takes place periodically (i.e., with duty cycle) from the deployment start date-time to deployment end date-time.
      scheduled: recording takes place during scheduled daily time intervals (optionally with duty cycle)
    • duty_cycle_minutes: duty cycle of the recording (i.e., the fraction of minutes when it is recording), written as "recording(minutes)/period(minutes)". Empty if no duty cycle is used. For example: "1/6" if the recorder is active for 1 minute and standing by for 5 minutes
    • operation_start_time_mixed: only for scheduled recordings: start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
    • operation_duration_minutes: only for scheduled recordings: duration of operation in minutes, if constant
    • operation_end_time_mixed: only for scheduled recordings: end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Only required if durations are variable. Do not use when end times are ambiguous (for instance, if a recording could be 1 hour or 25 hours long because the end is on the next day). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
    • high_pass_filter_Hz: frequency of the high-pass filter of the recorder if applied, in Hz. Otherwise, write "none". This may be called a "low-cut" filter too.
    • bit_depth: sampling bit depth of the recordings. Often constant for a particular recorder
    • channels: number of recorded audio channels
    • sampling_frequency_kHz: frequency at which the microphone signal was sampled by the recorder, in kHz (only sounds up to half that frequency will be recorded)
    • recorder: recorder used for deployment
    • microphone: microphone used for deployment
    • target_taxa: main IUCN animal taxa that were studied with this deployment, using the exact IUCN Red list names (http://www.iucnredlist.org/), separated by commas. Only genera, families, orders, and classes are accepted. Empty if there was no taxonomic focus (i.e., general soundscapes were the study focus).
    • contributor_comments: free text field for contributor comments
    • exact_recordings: whether the deployment data here have been superseded by inserting more exact recording date-time ranges into the meta-collection on ecoSound-web
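
The mixed time fields and the duty-cycle notation above can be read programmatically. The following is a minimal Python sketch, not part of the dataset's own tooling: the function names (parse_time_mixed, parse_duty_cycle, nyquist_khz) are hypothetical, and it assumes that solar offsets are whole minutes written directly after the solar daytime (as in "sunrise-60") and that all four solar daytimes are accepted for every *_time_mixed field.

```python
import re

# Solar daytimes named in the field descriptions above.
SOLAR_EVENTS = {"sunrise", "sunset", "noon", "midnight"}

# Either HH:MM, or a word optionally followed by a signed minute offset.
_MIXED = re.compile(r"^(?:(\d{1,2}):(\d{2})|([a-z]+)([+-]\d+)?)$")


def parse_time_mixed(value):
    """Parse a *_time_mixed value.

    Returns ("clock", minutes_after_midnight) for HH:MM values, or
    ("solar", event, offset_minutes) for values such as "sunrise-60".
    """
    match = _MIXED.match(value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    hours, minutes, event, offset = match.groups()
    if hours is not None:
        h, m = int(hours), int(minutes)
        if h > 23 or m > 59:
            raise ValueError(f"invalid clock time: {value!r}")
        return ("clock", h * 60 + m)
    if event not in SOLAR_EVENTS:
        raise ValueError(f"unknown solar daytime: {event!r}")
    return ("solar", event, int(offset or 0))


def parse_duty_cycle(value):
    """Parse "recording/period" (minutes) into the recorded fraction.

    Returns None for an empty field (no duty cycle used).
    """
    text = value.strip()
    if not text:
        return None
    recording, period = (float(part) for part in text.split("/"))
    if period <= 0 or not 0 < recording <= period:
        raise ValueError(f"invalid duty cycle: {value!r}")
    return recording / period


def nyquist_khz(sampling_frequency_khz):
    """Highest recordable sound frequency: half the sampling frequency."""
    return sampling_frequency_khz / 2
```

For example, parse_time_mixed("sunrise-60") yields ("solar", "sunrise", -60), and parse_duty_cycle("1/6") yields the fraction of each 6-minute period spent recording.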

    recordings (partial download from ecoSound-web)

    • recording_id: primary key of the recordings table
    • collection_id: ID of the collection the recording belongs to
    • name: name of the recording
    • site_id: ID of the site the recording belongs to
    • recorder_id: ID of the recorder used for the recording (internal ecoSound-web code)
    • microphone_id: ID of the microphone used for the recording (internal ecoSound-web code)
    • recording_gain: recording gain applied for amplifying the audio signal, in decibels
    • duty_cycle_recording: fraction of the recording period when the recorder is actively recording audio
    • duty_cycle_period: period of the duty cycle, i.e., time between the starts of two subsequent recordings
    • note: comments (contains the target taxon)
    • file_date: date of the recording

(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/

Geonames - All Cities with a population > 1000

Explore at:
14 scholarly articles cite this dataset (View in Google Scholar)
csv, json, geojson, excel
Available download formats
Dataset updated
Mar 10, 2024
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

All cities with a population > 1000 or seats of adm div (ca 80.000)

Sources and Contributions

Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
