5 datasets found
  1. o

    Geonames - All Cities with a population > 1000

    • public.opendatasoft.com
    • data.smartidf.services
    • +2more
    csv, excel, geojson +1
    Updated Mar 10, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
    Explore at:
    csv, json, geojson, excelAvailable download formats
    Dataset updated
    Mar 10, 2024
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All cities with a population > 1000 or seats of adm div (ca 80.000)Sources and ContributionsSources : GeoNames is aggregating over hundred different data sources. Ambassadors : GeoNames Ambassadors help in many countries. Wiki : A wiki allows to view the data and quickly fix error and add missing places. Donations and Sponsoring : Costs for running GeoNames are covered by donations and sponsoring.Enrichment:add country name

  2. Z

    Data for manuscript "The Prevalence of Terms Denoting Far-right and Far-left...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 22, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rozado, David (2022). Data for manuscript "The Prevalence of Terms Denoting Far-right and Far-left Political Extremism in U.S. and U.K. News Media" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5437015
    Explore at:
    Dataset updated
    Mar 22, 2022
    Dataset authored and provided by
    Rozado, David
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United Kingdom, United States
    Description

    This data set belongs to an academic manuscript examining longitudinally (2000-2019) the prevalence of terms denoting far-right and far-left political extremism in a large corpus of more than 32 million written news and opinion articles from 54 news media outlets popular in the United States and the United Kingdom.

    The textual content of news and opinion articles from the 54 outlets listed in the main manuscript is available in the outlet's online domains and/or public cache repositories such as Google cache (https://webcache.googleusercontent.com), The Internet Wayback Machine (https://archive.org/web/web.php), and Common Crawl (https://commoncrawl.org). We used derived word frequency counts from these sources. Textual content included in our analysis is circumscribed to articles headlines and main body of text of the articles and does not include other article elements such as figure captions.

    Targeted textual content was located in HTML raw data using outlet specific xpath expressions. Tokens were lowercased prior to estimating frequency counts. To prevent outlets with sparse text content for a year from distorting aggregate frequency counts, we only include outlet frequency counts from years for which there is at least 1 million words of article content from an outlet. This threshold was chosen to maximize inclusion in our analysis of outlets with sparse amounts of articles text per year.

    Yearly frequency usage of a target word in an outlet in any given year was estimated by dividing the total number of occurrences of the target word in all articles of a given year by the number of all words in all articles of that year. This method of estimating frequency accounts for variable volume of total article output over time.

    The list of compressed files in this data set is listed next:

    -analysisScripts.rar contains the analysis scripts used in the main manuscript

    -articlesContainingTargetWords.rar contains counts of target words in outlets articles as well as total counts of words in articles

    Usage Notes

    In a small percentage of articles, outlet specific XPath expressions failed to properly capture the content of the article due to the heterogeneity of HTML elements and CSS styling combinations with which articles text content is arranged in outlets online domains. As a result, the total and target word counts metrics for a small subset of articles are not precise. In a random sample of articles and outlets, manual estimation of target words counts overlapped with the automatically derived counts for over 90% of the articles.

    Most of the incorrect frequency counts were minor deviations from the actual counts such as for instance counting the word "Facebook" in an article footnote encouraging article readers to follow the journalist’s Facebook profile and that the XPath expression mistakenly included as the content of the article main text. Some additional outlet-specific inaccuracies that we could identify occurred in "The Hill" and "Newsmax" news outlets where XPath expressions had some shortfalls at precisely capturing articles’ content. For "The Hill", in years 2007-2009, XPath expressions failed to capture the complete text of the article in about 40% of the articles. This does not necessarily result in incorrect frequency counts for that outlet but in a sample of articles’ words that is about 40% smaller than the total population of articles words for those three years. In the case of "NewsMax", the issue was that for some articles, XPath expressions captured the entire text of the article twice. Notice that this does not result in incorrect frequency counts. If a word appears x times in an article with a total of y words, the same frequency count will still be derived when our scripts count the word 2x times in the version of the article with a total of 2y words.

    To conclude, in a data analysis of 32 million articles, we cannot manually check the correctness of frequency counts for every single article and hundred percent accuracy at capturing articles’ content is elusive due to the small number of difficult to detect boundary cases such as incorrect HTML markup syntax in online domains. Overall however, we are confident that our frequency metrics are representative of word prevalence in print news media content (see Figure 1 in the main manuscript for illustration of the accuracy of the frequency counts).

  3. Instagram: most popular posts as of 2024

    • statista.com
    • de.statista.com
    • +1more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stacy Jo Dixon, Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    Instagram’s most popular post

                  As of April 2024, the most popular post on Instagram was Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the reigning post, which was 'Photo of an Egg'. Originally posted in January 2021, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, which was a photo by Kylie Jenner’s daughter totaling 18 million likes.
                  After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.
    
                  Instagram’s most popular accounts
    
                  As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.
    
                  Instagram influencers
    
                  In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.
    
                  Instagram around the globe
    
                  Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 and 57.1, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.
    
  4. g

    CARMA, United Kingdom Power Plant Emissions, United Kingdom, 2000/ 2007/...

    • geocommons.com
    Updated May 2, 2008
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data (2008). CARMA, United Kingdom Power Plant Emissions, United Kingdom, 2000/ 2007/ Future [Dataset]. http://geocommons.com/search.html
    Explore at:
    Dataset updated
    May 2, 2008
    Dataset provided by
    data
    CARMA
    Description

    All the data for this dataset is provided from CARMA: Data from CARMA (www.carma.org) This dataset provides information about Power Plant emissions in the United Kingdom. Power Plant emissions from all power plants in the UK were obtained by CARMA for the past (2000 Annual Report), the present (2007 data), and the future. CARMA determine data presented for the future to reflect planned plant construction, expansion, and retirement. The dataset provides the name, company, parent company, city, state, zip, county, metro area, lat/lon, and plant id for each individual power plant. Only Power Plants that had a listed longitude and latitude in CARMA's database were mapped. The dataset reports for the three time periods: Intensity: Pounds of CO2 emitted per megawatt-hour of electricity produced. Energy: Annual megawatt-hours of electricity produced. Carbon: Annual carbon dioxide (CO2) emissions. The units are short or U.S. tons. Multiply by 0.907 to get metric tons. Carbon Monitoring for Action (CARMA) is a massive database containing information on the carbon emissions of over 50,000 power plants and 4,000 power companies worldwide. Power generation accounts for 40% of all carbon emissions in the United States and about one-quarter of global emissions. CARMA is the first global inventory of a major, sector of the economy. The objective of CARMA.org is to equip individuals with the information they need to forge a cleaner, low-carbon future. By providing complete information for both clean and dirty power producers, CARMA hopes to influence the opinions and decisions of consumers, investors, shareholders, managers, workers, activists, and policymakers. CARMA builds on experience with public information disclosure techniques that have proven successful in reducing traditional pollutants. Please see carma.org for more information http://carma.org/region/detail/201

  5. g

    Energy Information Administration, Energy Consumption by Sector and State,...

    • geocommons.com
    Updated May 29, 2008
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data (2008). Energy Information Administration, Energy Consumption by Sector and State, USA, 2005 [Dataset]. http://geocommons.com/search.html
    Explore at:
    Dataset updated
    May 29, 2008
    Dataset provided by
    EIA - Energy Information Administration
    data
    Description

    This dataset displays the energy consumption figures for each US state during the year 2005. This data is divided by sector, which includes residential, commercial, industrial, transportation, and total consumption figures. These statistics are displayed on a trillion British Thermal Unit (Btu) scale. Texas and California are consistantly atop the list as the highest consumers of energy across all sectors. With Texas being th highest total consumer. On the other end of the spectrum Vermont is the lowest total consumer of energy in 2005.

  6. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/

Geonames - All Cities with a population > 1000

Explore at:
16 scholarly articles cite this dataset (View in Google Scholar)
csv, json, geojson, excelAvailable download formats
Dataset updated
Mar 10, 2024
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

All cities with a population > 1000 or seats of adm div (ca 80.000)Sources and ContributionsSources : GeoNames is aggregating over hundred different data sources. Ambassadors : GeoNames Ambassadors help in many countries. Wiki : A wiki allows to view the data and quickly fix error and add missing places. Donations and Sponsoring : Costs for running GeoNames are covered by donations and sponsoring.Enrichment:add country name

Search
Clear search
Close search
Google apps
Main menu