29 datasets found
  1. Twitter users in the United States 2019-2028

    • statista.com
    • ai-chatbox.pro
    Updated Jun 13, 2024
    Cite
    Statista Research Department (2024). Twitter users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
    Explore at:
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    United States
    Description

    The number of Twitter users in the United States was forecast to increase continuously between 2024 and 2028, by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach a new peak of 85.08 million users in 2028. Notably, the number of Twitter users has increased continuously over the past years. User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Twitter users in countries like Canada and Mexico.
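
    As a quick sanity check, the quoted gain and percentage are mutually consistent:

```python
# Forecast peak in 2028 minus the 2024-2028 gain gives the 2024 base;
# the gain over that base matches the quoted +5.32 percent.
base_2024 = 85.08 - 4.3           # million users
pct_gain = 4.3 / base_2024 * 100  # percent increase over the base
```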

  2. An aggregated dataset of day 3 post-inoculation viral titer measurements...

    • catalog.data.gov
    • data.cdc.gov
    Updated Jul 4, 2025
    Cite
    Centers for Disease Control and Prevention (2025). An aggregated dataset of day 3 post-inoculation viral titer measurements from influenza A virus-infected ferret tissues [Dataset]. https://catalog.data.gov/dataset/an-aggregated-dataset-of-day-3-post-inoculation-viral-titer-measurements-from-influenza-a-
    Explore at:
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    Data from influenza A virus (IAV) infected ferrets (Mustela putorius furo) provides invaluable information towards the study of novel and emerging viruses that pose a threat to human health. This gold standard animal model can recapitulate many clinical signs of infection present in IAV-infected humans, supports virus replication of human and zoonotic strains without prior adaptation, and permits evaluation of virus transmissibility by multiple modes. While ferrets have been employed in risk assessment settings for >20 years, results from this work are typically reported in discrete stand-alone publications, making aggregation of raw data from this work over time nearly impossible. Here, we describe a dataset of 333 ferrets inoculated with 107 unique IAV, conducted by a single research group (NCIRD/ID/IPB/Pathogenesis Laboratory Team) under a uniform experimental protocol. This collection of ferret tissue viral titer data on a per-individual ferret level represents a companion dataset to 'An aggregated dataset of serially collected influenza A virus morbidity and titer measurements from virus-infected ferrets'. However, care must be taken when combining datasets at the level of individual animals (see PMID 40245007 for guidance on best practices for comparing datasets comprised of serially collected and fixed-timepoint in vivo-generated data).

    See publications using and describing these data for more information:
    Kieran TJ, Sun X, Tumpey TM, Maines TR, Belser JA. 202X. Spatial variation of infectious virus load in aggregated day 3 post-inoculation respiratory tract tissues from influenza A virus-infected ferrets. Under peer review.
    Kieran TJ, Sun X, Maines TR, Belser JA. 2025. Predictive models of influenza A virus lethal disease: insights from ferret respiratory tract and brain tissues. Scientific Reports, in press.
    Bullock TA, Pappas C, Uyeki TM, Brock N, Kieran TJ, Olsen SJ, Davis CD, Tumpey TM, Maines TR, Belser JA. 2025. The (digestive) path less traveled: influenza A virus and the gastrointestinal tract. mBio, in press.
    Kieran TJ, Sun X, Maines TR, Beauchemin CAA, Belser JA. 2024. Exploring associations between viral titer measurements and disease outcomes in ferrets inoculated with 125 contemporary influenza A viruses. J Virol 98: e01661-23. https://doi.org/10.1038/s41597-024-03256-6

    Related dataset:
    Kieran TJ, Sun X, Creager HM, Tumpey TM, Maines TR, Belser JA. 2025. An aggregated dataset of serial morbidity and titer measurements from influenza A virus-infected ferrets. Sci Data, 11(1):510. https://doi.org/10.1038/s41597-024-03256-6 https://data.cdc.gov/National-Center-for-Immunization-and-Respiratory-D/An-aggregated-dataset-of-serially-collected-influe/cr56-k9wj/about_data

    Other relevant publications for best practices on data handling and interpretation:
    Kieran TJ, Maines TR, Belser JA. 2025. Eleven quick tips to unlock the power of in vivo data science. PLoS Comput Biol, 21(4):e1012947. https://doi.org/10.1371/journal.pcbi.1012947
    Kieran TJ, Maines TR, Belser JA. 2025. Data alchemy, from lab to insight: Transforming in vivo experiments into data science gold. PLoS Pathog, 20(8):e1012460. https://doi.org/10.1371/journal.ppat.1012460

  3. Twitter Tweets Sentiment Dataset

    • kaggle.com
    Updated Apr 8, 2022
    Cite
    M Yasser H (2022). Twitter Tweets Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/twitter-tweets-sentiment-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 8, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    M Yasser H
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description


    Description:

    Twitter is an online social media platform where people share their thoughts as tweets. It is observed that some people misuse it to tweet hateful content. Twitter is trying to tackle this problem, and we shall help it by creating a strong NLP-based classifier model to distinguish negative tweets and block them. Can you build a strong classifier model to predict the same?

    Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

    Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.
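
    That quote-stripping step can be sketched in pandas; the two-row CSV here is a toy stand-in for the real file:

```python
import io
import pandas as pd

# Toy CSV mimicking the described layout (textID, text, sentiment).
raw = 'textID,text,sentiment\nabc123,"""good day""",positive\n'
df = pd.read_csv(io.StringIO(raw))
# Strip stray leading/trailing quotes left in the text field.
df["text"] = df["text"].str.strip('"')
```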

    You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)

    Columns:

    1. textID - unique ID for each piece of text
    2. text - the text of the tweet
    3. sentiment - the general sentiment of the tweet

    Acknowledgement:

    The dataset is downloaded from Kaggle Competitions:
    https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

    Objective:

    • Understand the dataset and clean it up (if required).
    • Build classification models to predict the Twitter sentiments.
    • Compare the evaluation metrics of various classification algorithms.

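
    As a starting point for that objective, a minimal baseline sketch with scikit-learn on a toy corpus (the real inputs would be the dataset's text and sentiment columns):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus; substitute the dataset's text/sentiment columns.
texts = ["I love this", "great day", "I hate this", "awful day"] * 5
labels = ["positive", "positive", "negative", "negative"] * 5

# TF-IDF features feeding a logistic-regression classifier.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
pred = clf.predict(["what a great day"])
```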
  4. Tweet Sentiment's Impact on Stock Returns

    • kaggle.com
    Updated Jan 16, 2023
    Cite
    The Devastator (2023). Tweet Sentiment's Impact on Stock Returns [Dataset]. https://www.kaggle.com/datasets/thedevastator/tweet-sentiment-s-impact-on-stock-returns
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 16, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Tweet Sentiment's Impact on Stock Returns

    862,231 Labeled Instances

    By [source]

    About this dataset

    This dataset contains 862,231 labeled tweets and associated stock returns, providing a comprehensive look into the impact of social media on company-level stock market performance. For each tweet, researchers have extracted data such as the date of the tweet and its associated stock symbol, along with metrics such as last price and various returns (1-day return, 2-day return, 3-day return, 7-day return). Also recorded are volatility scores for both 10 day intervals and 30 day intervals. Finally, sentiment scores from both Long Short - Term Memory (LSTM) and TextBlob models have been included to quantify the overall tone in which these messages were delivered. With this dataset you will be able to explore how tweets can affect a company's share prices both short term and long term by leveraging all of these data points for analysis!

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    In order to use this dataset, users can utilize descriptive statistics such as histograms or regression techniques to establish relationships between tweet content & sentiment with corresponding stock return data points such as 1-day & 7-day returns measurements.

    The primary fields used for analysis include the tweet text (TWEET), stock symbol (STOCK), date (DATE), and the closing price at the time of the tweet (LAST_PRICE), plus a range of volatility measures, 10-day volatility (VOLATILITY_10D) and 30-day volatility (VOLATILITY_30D), which capture changes in market fluctuation during different periods around when Twitter reactions occur. Additionally, sentiment polarity analysis undertaken via two machine-learning approaches, LSTM polarity (LSTM_POLARITY) and TextBlob polarity, provides insight into whether people are expressing positive or negative sentiments about each company at given times, which could in turn influence stock prices over shorter-term periods like 1-day returns (1_DAY_RETURN) and 2-day returns (2_DAY_RETURN) or longer-term horizons like 7-day returns (7DAY_RETURNS). Finally, the MENTION field indicates whether names or acronyms associated with companies were specifically mentioned in each tweet, which gives extra insight into whether company-specific context was present within individual tweets ("company relevancy").
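
    As an illustrative sketch of that kind of analysis, the following correlates the LSTM sentiment column with 1-day returns on a toy frame; with the real file you would read reduced_dataset-release.csv instead:

```python
import pandas as pd

# Toy frame using two of the column names described above;
# the values here are fabricated for illustration only.
df = pd.DataFrame({
    "LSTM_POLARITY": [0.8, -0.5, 0.3, -0.9, 0.6],
    "1_DAY_RETURN": [0.02, -0.01, 0.01, -0.03, 0.015],
})
# Pearson correlation between tweet sentiment and next-day return.
corr = df["LSTM_POLARITY"].corr(df["1_DAY_RETURN"])
```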

    Research Ideas

    • Analyzing the degree to which tweets can influence stock prices. By analyzing relationships between variables such as tweet sentiment and stock returns, correlations can be identified that could be used to inform investment decisions.
    • Exploring natural language processing (NLP) models for predicting future market trends based on textual data such as tweets. Through testing and evaluating different text-based models using this dataset, better predictive models may emerge that can give investors advance warning of upcoming market shifts due to news or other events.
    • Investigating the impact of different types of tweets (positive/negative, factual/opinionated) on stock prices over specific time frames. By studying correlations between the sentiment or nature of a tweet and its effect on stocks, insights may be gained into what sort of news or events have a greater impact on markets in general

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: reduced_dataset-release.csv

    | Column name | Description |
    |:------------|:------------|
    | TWEET | Text of the tweet. (String) |
    | STOCK | Company's stock mentioned in the tweet. (String) |
    | DATE | Date the tweet was posted. (Date) |
    | LAST_PRICE | Company's last price at the time of tweeting. (Float) |
    ...

  5. ERA5 post-processed daily statistics on single levels from 1940 to present

    • cds.climate.copernicus.eu
    grib
    Updated Jul 23, 2025
    + more versions
    Cite
    ECMWF (2025). ERA5 post-processed daily statistics on single levels from 1940 to present [Dataset]. http://doi.org/10.24381/cds.4991cf48
    Explore at:
    grib (available download formats)
    Dataset updated
    Jul 23, 2025
    Dataset provided by
    European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
    Authors
    ECMWF
    License

    https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/cc-by/cc-by_f24dc630aa52ab8c52a0ac85c03bc35e0abc850b4d7453bdc083535b41d5a5c3.pdf

    Time period covered
    Jan 1, 1940 - Jul 17, 2025
    Description

    ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations, and when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product. This catalogue entry provides post-processed ERA5 hourly single-level data aggregated to daily time steps. In addition to the data selection options found on the hourly page, the following options can be selected for the daily statistic calculation:

    • The daily aggregation statistic (daily mean, daily max, daily min, daily sum*)
    • The sub-daily frequency sampling of the original data (1 hour, 3 hours, 6 hours)
    • The option to shift to any local time zone in UTC (no shift means the statistic is computed from UTC+00:00)

    *The daily sum is only available for the accumulated variables (see ERA5 documentation for more details). Users should be aware that the daily aggregation is calculated during the retrieval process and is not part of a permanently archived dataset. For more details on how the daily statistics are calculated, including demonstrative code, please see the documentation. For more details on the hourly data used to calculate the daily statistics, please refer to the ERA5 hourly single-level data catalogue entry and the documentation found therein.
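
    A hedged sketch of what a retrieval request for these options might look like via the CDS API; the dataset identifier and option keys below are assumptions, not confirmed names, so check the CDS documentation for the exact spelling before submitting:

```python
# Hypothetical request dictionary mirroring the options listed above.
request = {
    "product_type": "reanalysis",
    "variable": "2m_temperature",
    "year": "2024",
    "month": "01",
    "day": ["01", "02"],
    "daily_statistic": "daily_mean",  # daily mean / max / min / sum
    "frequency": "1_hourly",          # sub-daily sampling: 1h / 3h / 6h
    "time_zone": "utc+00:00",         # no shift: statistic in UTC+00:00
}
# Submitting it would look roughly like (dataset name is an assumption):
# import cdsapi
# cdsapi.Client().retrieve("derived-era5-single-levels-daily-statistics",
#                          request, "era5_daily.grib")
```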

  6. ECE657AW20-ASG4-Coronavirus

    • kaggle.com
    Updated Jul 9, 2025
    Cite
    MarkCrowley (2025). ECE657AW20-ASG4-Coronavirus [Dataset]. https://www.kaggle.com/markcrowley/ece657aw20asg4coronavirus/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Kaggle
    Authors
    MarkCrowley
    Description

    COVID-19 Data for Analysis and Machine Learning

    There are lots of datasets online, with more growing every day, to help us all get a handle on this pandemic. Here are just a few links to data we've found that students in ECE 657A, and anyone else who finds their way here, can play with to practice their machine learning skills. The main dataset is the COVID-19 dataset from Johns Hopkins University. This data is perfect for time series analysis and Recurrent Neural Networks, the final topic in the course. This dataset will be left public so anyone can see it, but to join you must request the link from Prof. Crowley or be in the ECE 657A W20 course at the University of Waterloo.

    For ECE 657A W20 Students

    Your bonus grade for assignment 4 comes from creating a kernel from this dataset, writing up some useful analysis, and publishing that notebook. You can do any kind of analysis you like, but some good places to start are:
    • Analysis: feature extraction and analysis of the data to look for patterns that aren't evident from the original features (this is hard for the simple spread/infection/death data since there aren't that many features).
    • Other data: utilize any other datasets in your kernels by loading data about the countries themselves (population, density, wealth, etc.) or their responses to the situation. Tip: if you open a new notebook related to this dataset you can easily add new data available on Kaggle and link that to your analysis.
    • HOW'S MY FLATTENING COVID19 DATASET: this dataset has a lot more files and includes a lot of what I was talking about, so if you produce good kernels there you can also count them for your asg4 grade. https://www.kaggle.com/howsmyflattening/covid19-challenges
    • Predict: make predictions about confirmed cases, deaths, recoveries, or other metrics for the future. You can test your models by training on the past and predicting on the following days, then post a prediction for tomorrow or the next few days given ALL the data up to this point. Hopefully the datasets we've linked here will update automatically so your kernels would update as well.
    • Create tasks: you can make your own "Tasks" as part of this Kaggle and propose your own solution. Then others can try solving them as well.
    • Groups: students can do this assignment either in the same groups they had for assignment 3 or individually.

    Suggest other datasets

    We're happy to add other relevant data to this Kaggle; in particular it would be great to integrate live data on the following:
    • Progression of each country/region/city in "days since X level", such as days since 100 confirmed cases; see the link for a great example of such a dataset being plotted. I haven't seen a live link to a CSV of that data, but we could generate it.
    • Mitigation policies enacted by local governments in each city/region/country. These are the dates when a region first enacted Level 1, 2, 3, 4 containment, started encouraging social distancing, or closed different levels of schools, pubs, restaurants, etc.
    • The hidden positives: this would be a dataset, or a method for estimating one, as described by Emtiyaz Khan in this Twitter thread. The idea is: how many unreported or unconfirmed cases are there in any region, and can we build an estimate of that number using other regions with widespread testing as a baseline, together with death rates, which are like observations of a process with a hidden variable, the true infection rate.
    • Paper discussing one way to compute this: https://cmmid.github.io/topics/covid19/severity/global_cfr_estimates.html
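
    The "days since X level" re-indexing suggested above can be sketched in a few lines of pandas (toy numbers, not real case counts):

```python
import pandas as pd

# Toy cumulative confirmed-case series for one region.
cases = pd.Series([10, 40, 90, 150, 400, 900],
                  index=pd.date_range("2020-03-01", periods=6))
crossed = cases[cases >= 100].index[0]     # first day at >= 100 cases
days_since = (cases.index - crossed).days  # day offsets from that date
aligned = cases[days_since >= 0]           # series on the "days since" axis
```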

  7. Instagram: most popular posts as of 2024

    • statista.com
    • davegsmith.com
    Updated Jun 17, 2025
    Cite
    Stacy Jo Dixon (2025). Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Instagram’s most popular post

    As of April 2024, the most popular post on Instagram was Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the reigning post, 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world's most popular Instagram post at that time, a photo of Kylie Jenner's daughter totaling 18 million likes. After several cryptic posts published by the account, World Record Egg revealed itself to be part of a mental health campaign aimed at the pressures of social media use.

    Instagram's most popular accounts

    As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.

    Instagram influencers

    In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada, and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.

    Instagram around the globe

    Instagram's worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube, and WhatsApp.
    
  8. FLDAS Noah Land Surface Model L4 Global Monthly 0.1 x 0.1 degree (MERRA-2...

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated Jul 3, 2025
    + more versions
    Cite
    NASA/GSFC/SED/ESD/TISL/GESDISC (2025). FLDAS Noah Land Surface Model L4 Global Monthly 0.1 x 0.1 degree (MERRA-2 and CHIRPS) V001 (FLDAS_NOAH01_C_GL_M) at GES DISC [Dataset]. https://catalog.data.gov/dataset/fldas-noah-land-surface-model-l4-global-monthly-0-1-x-0-1-degree-merra-2-and-chirps-v001-f-a740b
    Explore at:
    Dataset updated
    Jul 3, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This dataset contains a series of land surface parameters simulated from the Noah 3.6.1 model in the Famine Early Warning Systems Network (FEWS NET) Land Data Assimilation System (FLDAS). The data are in 0.10 degree resolution and range from January 1982 to present. The temporal resolution is monthly and the spatial coverage is global (60S, 180W, 90N, 180E). The FLDAS regional monthly datasets will no longer be available and have been superseded by the global monthly dataset.

    The simulation was forced by a combination of the Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2) data and Climate Hazards Group InfraRed Precipitation with Station (CHIRPS) 6-hourly rainfall data that has been downscaled using the NASA Land Data Toolkit. The simulation was initialized on January 1, 1982 using soil moisture and other state fields from a FLDAS/Noah model climatology for that day of the year.

    In November 2020, all FLDAS data were post-processed with the MOD44W MODIS land mask. Previously, some grid boxes over inland water were considered as over land and, thus, had non-missing values. The post-processing corrected this issue and masked out all model output data over inland water; the post-processing did not affect the meteorological forcing variables. More information on this can be found in the FLDAS README document, and the MOD44W MODIS land mask is available on the FLDAS Project site. If you had downloaded any FLDAS data prior to November 2020, please download the data again to receive the post-processed data.

  9. Instagram accounts with the most followers worldwide 2024

    • statista.com
    • davegsmith.com
    Updated Jun 17, 2025
    Cite
    Stacy Jo Dixon (2025). Instagram accounts with the most followers worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024.

    The Portuguese footballer is the most-followed person on the photo-sharing platform, with 628 million followers. Instagram's own account was ranked first with roughly 672 million followers.

    How popular is Instagram?

    Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States, and experts projected this figure to surpass 127 million users in 2023.

    Who uses Instagram?

    Instagram audiences are predominantly young: recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media platforms for teens and one of the social networks with the biggest reach among teens in the United States.

    Celebrity influencers on Instagram

    Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
    
  10. MAISON-LLF: Multimodal Sensor Dataset for Monitoring Older Adults Post...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 25, 2025
    + more versions
    Cite
    Abedi, Ali (2025). MAISON-LLF: Multimodal Sensor Dataset for Monitoring Older Adults Post Lower-Limb Fractures in Community Settings [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14597612
    Explore at:
    Dataset updated
    Mar 25, 2025
    Dataset provided by
    Khan, Shehroz S
    Abedi, Ali
    Chu, Charlene
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The MAISON-LLF dataset was collected from 10 older adult participants living alone in the community following lower limb fractures. Each participant contributed data for over 8 weeks, beginning from their first-week post-discharge. This resulted in a total of 574 days of continuous multimodal sensor data, complemented by biweekly clinical questionnaire data.

    The MAISON-LLF dataset is organized into a directory tree, as shown below.

    maison-llf/
    ├── sensor-data/
    │   ├── p01/...
    │   ├── p10/
    │   │   ├── acceleration-data.csv
    │   │   ├── heartrate-data.csv
    │   │   ├── motion-data.csv
    │   │   ├── position-data.csv
    │   │   ├── sleep-data.csv
    │   │   ├── step-data.csv
    ├── features/
    │   ├── p01/...
    │   ├── p10/
    │   │   ├── acceleration-features.csv
    │   │   ├── heartrate-features.csv
    │   │   ├── motion-features.csv
    │   │   ├── position-features.csv
    │   │   ├── sleep-features.csv
    │   │   ├── step-features.csv
    │   │   ├── clinical.csv
    ├── dataset/
    │   ├── all-features.csv
    │   ├── all-features-imputed.csv
    │   ├── dataset-daily.pt
    │   ├── dataset-weekly.pt
    │   ├── dataset-biweekly.pt
    │   ├──

    In ‘sensor-data’ folder, the dataset includes 60 CSV files containing data from six sensor types for 10 participants. Each file includes a ‘timestamp’ column indicating the date and time of the recorded sensor data, accurate to milliseconds (‘yyyy-MM-dd HH:mm:ss.SSS’), along with the corresponding sensor measurements. For instance, the ‘acceleration-data.csv’ files include four columns: timestamp, and x, y, and z coordinates, while the ‘heartrate-data.csv’ files contain two columns: timestamp and heart rate value.

    The dataset also includes 70 CSV files containing daily features extracted from the sensor data, along with clinical questionnaire data and physical test results. Each feature CSV file includes a timestamp column representing the date (‘yyyy-MM-dd’) of the sensor data from which the daily features were extracted, alongside the corresponding sensor features. For example, the ‘acceleration-features.csv’ files contain eight columns: timestamp and the seven acceleration features and the ‘heartrate-features.csv’ files include five columns: timestamp and the four heart rate features. Additionally, the ‘clinical.csv’ files provide values for individual items of the SIS (‘sis-01’ to ‘sis-06’), OHS (‘ohs-01’ to ‘ohs-12’), and OKS (‘oks-01’ to ‘oks-12’) questionnaires, along with their final scores (‘sis’, ‘ohs’, and ‘oks’). These files also include results for the TUG and 30-second chair stand tests. Each participant has four sets of clinical data, with each set sharing the same ‘timestamp’ corresponding to the date (‘yyyy-MM-dd’) on which the clinical data were collected.
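
    A small pandas sketch of reading one of the described sensor files and aggregating it to a daily feature; the two data rows are fabricated for illustration, matching the stated timestamp, x, y, z layout:

```python
import io
import pandas as pd

# Toy rows in the format described for acceleration-data.csv
# (timestamp to millisecond precision, then x, y, z coordinates).
raw = ("timestamp,x,y,z\n"
       "2024-01-05 12:00:00.125,0.01,-0.98,0.12\n"
       "2024-01-05 12:00:00.375,0.02,-0.97,0.11\n")
df = pd.read_csv(io.StringIO(raw), parse_dates=["timestamp"])
# Aggregate to one row per calendar day, as in the feature files.
daily = df.set_index("timestamp").resample("D").mean()
```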

    To provide a comprehensive overview of the dataset, the ‘all-features.csv’ and ‘all-features-imputed.csv’ files in ‘dataset’ folder combine all daily features, clinical data, and demographic information into single CSV files, representing the data before and after missing value imputation (as explained in subsection 2.2.4). Additionally, the Python PyTorch files are structured datasets designed to facilitate supervised and unsupervised machine learning model development for estimating clinical outcomes.

    ‘dataset-daily.pt’ in ‘dataset’ folder contains a NumPy array with dimensions num_days × num_features, representing the daily features for all 10 participants. Alongside this array, it includes a num_days IDs array that maps each day to a participant (IDs 1 to 10). Additionally, the file contains three separate num_days arrays for SIS, OHS, and OKS scores, each assigned to the corresponding days in the daily features array.

    ‘dataset-weekly.pt’ in the ‘dataset’ folder provides an array with dimensions num_weeks × 7 × num_features, which includes the weekly sequential features for all participants. This file also includes a num_weeks IDs array to identify the participant (1 to 10) associated with each week in the samples array. Similar to the daily dataset, it contains three separate num_weeks arrays for the SIS, OHS, and OKS scores, each assigned to the respective weeks in the weekly features array.

    ‘dataset-biweekly.pt’ in the ‘dataset’ folder provides an array with dimensions num_biweeks × 14 × num_features, which includes the biweekly sequential features for all participants. This file also includes a num_biweeks IDs array to identify the participant (1 to 10) associated with each biweekly period in the samples array. Similar to the daily dataset, it contains three separate num_biweeks arrays for the SIS, OHS, and OKS scores, each assigned to the respective biweekly periods in the biweekly features array.
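    The daily/weekly/biweekly tensor layouts described above can be mimicked with a plain NumPy reshape. This is only a sketch of the shapes: the array sizes and participant IDs are invented, and the real files are loaded (e.g. with torch.load) rather than built this way:

```python
import numpy as np

# Invented sizes: 28 days of 11 daily features for a single participant.
num_days, num_features = 28, 11
daily = np.arange(num_days * num_features, dtype=float).reshape(num_days, num_features)
ids = np.ones(num_days, dtype=int)  # participant ID for every day

# Weekly layout: consecutive 7-day sequences -> num_weeks x 7 x num_features,
# with one participant ID per week (mirroring the 'dataset-weekly.pt' layout).
num_weeks = num_days // 7
weekly = daily[:num_weeks * 7].reshape(num_weeks, 7, num_features)
weekly_ids = ids[:num_weeks * 7:7]
```

    The biweekly layout follows the same pattern with 14-day windows instead of 7.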

    Citation

    Cite the related pre-print:

    A. Abedi, C. H. Chu, and S. S. Khan, "Multimodal Sensor Dataset for Remote Monitoring of Older Adults Post Lower-Limb Fractures in the Community,"

  11. GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07...

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Jul 10, 2025
    Cite
    NASA/GSFC/SED/ESD/TISL/GESDISC (2025). GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) at GES DISC [Dataset]. https://catalog.data.gov/dataset/gpm-imerg-final-precipitation-l3-1-day-0-1-degree-x-0-1-degree-v07-gpm-3imergdf-at-ges-dis-f1669
    Explore at:
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Version 07 is the current version of the data set; older versions are no longer available and have been superseded by Version 07. The Integrated Multi-satellitE Retrievals for GPM (IMERG) is a NASA product estimating global surface precipitation rates at a high resolution of 0.1° every half-hour, beginning in 2000. It is part of the joint NASA-JAXA Global Precipitation Measurement (GPM) mission, using the GPM Core Observatory satellite as the standard to combine precipitation observations from an international constellation of satellites using advanced techniques. IMERG can be used for global-scale applications as well as over regions with sparse or no reliable surface observations. The fine spatial and temporal resolution of IMERG data allows them to be accumulated to the scale of the application for increased skill. IMERG has three Runs with varying latencies in response to a range of application needs: rapid-response applications (Early Run, 4-h latency), same/next-day applications (Late Run, 14-h latency), and post-real-time research (Final Run, 3.5-month latency). While IMERG strives for consistency and accuracy, satellite estimates of precipitation are expected to have lower skill over frozen surfaces, complex terrain, and coastal zones. As well, the changing GPM satellite constellation over time may introduce artifacts that affect studies focusing on multi-year changes.

    This dataset is the GPM Level 3 IMERG Final Daily 10 x 10 km product (GPM_3IMERGDF), derived from the half-hourly GPM_3IMERGHH. The derived result represents the Final estimate of the daily mean precipitation rate in mm/day. The dataset is produced by first computing the mean precipitation rate (mm/hour) in every grid cell, and then multiplying the result by 24. This minimizes the dry bias present, in versions before "07", in the simple daily totals for cells where fewer than 48 half-hourly observations are valid for the day. Such under-sampling is very rare in the combined microwave-infrared and rain gauge dataset (variable "precipitation") and appears at higher latitudes, so in most cases users of global "precipitation" data will not notice any difference. The correction is noticeable, however, in the high-quality microwave retrieval (variable "MWprecipitation"), where fewer than 48 valid half-hourly samples per day is very common. The counts of valid half-hourly samples per day have always been provided as a separate variable, and users of daily data were advised to pay close attention to that variable and use it to calculate correct daily precipitation rates. Starting with version "07", this is done in production to minimize possible misinterpretations of the data. The counts are still provided in the data, but they are only given to gauge the significance of the daily rates, and to reconstruct the simple totals if someone wishes to do so.

    The latency of the derived Final Daily product depends on the delivery of the IMERG Final Half-Hourly product GPM_3IMERGHH. Since the latter are delivered in a batch, once per month for the entire month, with up to 4 months latency, so is the latency for the Final Daily, plus about 24 hours. Thus, e.g., the Dailies for January can be expected to appear no earlier than April 2.

    The daily mean rate (mm/day) is derived by first computing the mean precipitation rate (mm/hour) in a grid cell for the data day, and then multiplying the result by 24. Thus, for every grid cell:

    Pdaily_mean = SUM{ Pi * 1[Pi valid] } / Pdaily_cnt * 24,  i = [1, Nf]

    where:
    Pdaily_cnt = SUM{ 1[Pi valid] }
    Pi - half-hourly input, in (mm/hr)
    Nf - number of half-hourly files per day, Nf = 48
    1[.] - indicator function; 1 when Pi is valid, 0 otherwise
    Pdaily_cnt - number of valid retrievals in a grid cell per day

    Grid cells for which Pdaily_cnt = 0 are set to the fill value in the Daily files. Note that Pi = 0 is a valid value. Pdaily_cnt is provided in the data files as variables "precipitation_cnt" and "MWprecipitation_cnt", for, respectively, the microwave-IR-gauge and microwave-only retrievals. They are only given to gauge the significance of the daily rates, and to reconstruct the simple totals if someone wishes to do so.

    There are various ways the daily error could be estimated from the source half-hourly random error (variable "randomError"). The daily error provided in the data files is calculated in a fashion similar to the daily mean precipitation rate. First, the mean of the squared half-hourly "randomError" for the day is computed, and the resulting (mm^2/hr) is converted to (mm^2/day). Finally, the square root is taken to get the result in (mm/day):

    Perr_daily = { SUM{ (Perr_i)^2 * 1[Perr_i valid] } / Ncnt_err * 24 }^0.5,  i = [1, Nf]
    Ncnt_err = SUM{ 1[Perr_i valid] }

    where:
    Perr_i - half-hourly input "randomError", in (mm/hr)
    Perr_daily - magnitude of the daily error, in (mm/day)
    Ncnt_err - number of valid half-hourly error estimates

    Again, the sum of squared "randomError" can be reconstructed, and other estimates can be derived using the available counts in the Daily files.
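    The daily-mean arithmetic above can be checked with a short sketch. The values are invented and this is not the production code, only the sum/count/x24 rule from the formulas:

```python
# One grid cell's 48 half-hourly rates in mm/hr; None marks invalid retrievals.
# Only 12 of 48 half-hours are valid here, so a simple total would be dry-biased.
half_hourly = [2.0] * 12 + [None] * 36

valid = [p for p in half_hourly if p is not None]
pdaily_cnt = len(valid)                     # Pdaily_cnt = SUM{1[Pi valid]}
pdaily_mean = sum(valid) / pdaily_cnt * 24  # mean mm/hr scaled to mm/day -> 48.0

# Pre-V07 style simple daily total: sum of valid rates times 0.5 hr per sample.
simple_total = sum(valid) * 0.5             # 12.0 mm, dry-biased when cnt < 48
```

    With all 48 samples valid, the two approaches agree; the V07 rule only changes cells with missing half-hours.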

  12. Atmospheric Model high resolution 15-day forecast

    • ecmwf.int
    application/x-grib
    Updated Sep 20, 2016
    Cite
    European Centre for Medium-Range Weather Forecasts (2016). Atmospheric Model high resolution 15-day forecast [Dataset]. https://www.ecmwf.int/en/forecasts/datasets/set-i
    Explore at:
    application/x-grib (1 dataset). Available download formats
    Dataset updated
    Sep 20, 2016
    Dataset authored and provided by
    European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
    License

    https://www.ecmwf.int/sites/default/files/ECMWF_Standard_Licence.pdf

    Description

    Single prediction that uses

    observations
    prior information about the Earth-system
    ECMWF's highest-resolution model
    

    The HRES direct model output offers "High Frequency products":

    4 forecast runs per day (00/06/12/18) (see dissemination schedule for details)
    Hourly steps to step 90 for all four runs.
    

    Not all post-processed products are available for the 06/18 runs or at hourly steps.

  13. GPM IMERG Final Precipitation L3 1 month 0.1 degree x 0.1 degree V07

    • oidc.rda.ucar.edu
    • data.ucar.edu
    • +1more
    Updated May 9, 2024
    Cite
    G. Huffman; E. Stocker; D. Bolvin; E. Nelkin; Jackson Tan (2024). GPM IMERG Final Precipitation L3 1 month 0.1 degree x 0.1 degree V07 [Dataset]. http://doi.org/10.5065/KRNV-Y644
    Explore at:
    Dataset updated
    May 9, 2024
    Dataset provided by
    University Corporation for Atmospheric Research
    Authors
    G. Huffman; E. Stocker; D. Bolvin; E. Nelkin; Jackson Tan
    Time period covered
    Jun 1, 2000 - Mar 1, 2025
    Area covered
    Earth
    Description

    This dataset contains Version 07 of the Integrated Multi-satellitE Retrievals for GPM (IMERG) Level 3 "Final Run" precipitation analysis at 0.1 degree, monthly resolution.

    From the official GPM IMERG site at NASA GES DISC [https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGM_07/summary]: The Integrated Multi-satellitE Retrievals for GPM (IMERG) is a NASA product estimating global surface precipitation rates at a high resolution of 0.1 degree every half-hour beginning June 2000. It is part of the joint NASA/JAXA Global Precipitation Measurement (GPM) mission, using the GPM Core Observatory satellite (for June 2014 to present) and the Tropical Rainfall Measuring Mission (TRMM) satellite (for June 2000 to May 2014) as the standard to combine precipitation observations from an international constellation of satellites using advanced techniques. IMERG can be used for global-scale applications, including over regions with sparse or no reliable surface observations. The fine spatial and temporal resolution of IMERG data allows them to be accumulated to the scale of the application for increased skill. IMERG has three Runs with varying latencies in response to a range of application needs: rapid-response applications (Early Run, 4-hour latency), same/next-day applications (Late Run, 14-hour latency), and post-real-time research (Final Run, 4-month latency). While IMERG strives for consistency and accuracy, satellite estimates of precipitation are expected to have lower skill over frozen surfaces, complex terrain, and coastal zones. As well, the changing GPM satellite constellation over time may introduce artifacts that affect studies focusing on multi-year changes.

    This dataset contains Version 07 of the Integrated Multi-satellitE Retrievals for GPM (IMERG) Level 3 "Final Run" precipitation analysis at 0.1 degree x 0.1 degree, monthly resolution. The dataset represents the Final Run estimate of the monthly mean precipitation rate in mm/day. The dataset is produced by first computing the mean precipitation rate for the month in each grid cell from half-hourly satellite-only IMERG estimates, adjusting the satellite data field to the large-area bias of the Global Precipitation Climatology Project analysis of monthly surface precipitation gauges, then combining the adjusted satellite and the GPCC gauge analysis weighted by estimated inverse random error variance. The IMERG Monthly is considered the most reliable dataset.

    See the official GPM IMERG site at NASA GES DISC [https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGM_07/summary] for the complete dataset abstract and more information.

  14. Data from: Variations in the Intensity and Spatial Extent of Tropical...

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    zip
    Updated Dec 4, 2019
    Cite
    Danielle Touma; Samantha Stevenson; Suzana J. Camargo; Daniel E. Horton; Noah S. Diffenbaugh (2019). Variations in the Intensity and Spatial Extent of Tropical Cyclone Precipitation [Dataset]. http://doi.org/10.25349/D9JP4X
    Explore at:
    zip. Available download formats
    Dataset updated
    Dec 4, 2019
    Dataset provided by
    Lamont-Doherty Earth Observatory
    University of California, Santa Barbara
    Stanford University
    Département de la Formation, de la Jeunesse et de la Culture
    Authors
    Danielle Touma; Samantha Stevenson; Suzana J. Camargo; Daniel E. Horton; Noah S. Diffenbaugh
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The intensity and spatial extent of tropical cyclone precipitation (TCP) often shapes the risk posed by landfalling storms. Here we provide a comprehensive climatology of landfalling TCP characteristics as a function of tropical cyclone strength, using daily precipitation station data and Atlantic US landfalling tropical cyclone tracks from 1900-2017. We analyze the intensity and spatial extent of ≥ 1 mm/day TCP (Z1) and ≥ 50 mm/day TCP (Z50) over land. We show that the highest median intensity and largest median spatial extent of Z1 and Z50 occur for major hurricanes that have weakened to tropical storms, indicating greater flood risk despite weaker wind speeds. We also find some signs of TCP change in recent decades. In particular, for major hurricanes that have weakened to tropical storms, Z50 intensity has significantly increased, indicating possible increases in flood risk to coastal communities in more recent years.

    Methods

    1. Station precipitation and tropical cyclone tracks

    We use daily precipitation data from the Global Historical Climatology Network (GHCN)-Daily station dataset (Menne et al., 2012) and TC tracks archived in the revised HURricane DATabase (HURDAT2) database (Landsea & Franklin, 2013). HURDAT2 is a post-storm reanalysis that uses several datasets, including land observations, aircraft reconnaissance, ship logs, radiosondes, and satellite observations to determine tropical cyclone track locations, wind speeds and central pressures (Jarvinen et al., 1984; Landsea & Franklin, 2013). We select 1256 US stations from the GHCN-Daily dataset that have observations beginning no later than 1900 and ending no earlier than 2017 (though most station records are not continuous throughout that period). These 1256 land-based stations are well distributed over the southeastern US and Atlantic seaboard (see Supporting Figure S1).

    We use the HURDAT2 Atlantic database to select locations and windspeeds of TC tracks that originated in the North Atlantic Ocean, Gulf of Mexico and Caribbean Sea, and made landfall over the continental US. Though tracks are determined at 6-hourly time steps for each storm (with additional timesteps that indicate times of landfall, and times and values of maximum intensity), we limit our analysis to track points recorded at 1200 UTC, in order to match the daily temporal resolution and times of observation of the GHCN-Daily precipitation dataset (Menne et al., 2012), as well as the diurnal cycle of TCP (Gaona & Villarini, 2018). Although this temporal matching technique may omit high values of precipitation from the analysis, it reduces the possibility of capturing precipitation that is not associated with a TC.
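    The temporal-matching step described above (keeping only the 1200 UTC track fixes so that the 6-hourly HURDAT2 points line up with daily GHCN observations) amounts to a simple filter. The track points below are invented for illustration:

```python
from datetime import datetime

# Invented 6-hourly track fixes: (time, lat, lon, max sustained wind in knots).
track = [
    (datetime(2017, 8, 27, 0), 28.8, -96.6, 60),
    (datetime(2017, 8, 27, 6), 29.0, -96.3, 55),
    (datetime(2017, 8, 27, 12), 29.1, -96.1, 50),
    (datetime(2017, 8, 27, 18), 29.2, -95.9, 45),
]

# Keep only the 1200 UTC fix for each day, matching the daily precipitation data.
daily_points = [p for p in track if p[0].hour == 12]
```

    A real pipeline would apply the same filter to each storm's full HURDAT2 record.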

    2. Tropical cyclone and Lifetime Maximum Intensity (LMI) categories

    For each daily point in the tropical cyclone track, we use the maximum sustained windspeed to place the storm into one of three Extended Saffir-Simpson categories: tropical storms (“TS”; 34-63 knots), minor hurricanes (“Min”; categories 1 and 2; 64-95 knots), and major hurricanes (“Maj”; categories 3 to 5; > 96 knots) (Schott et al., 2012). Additionally, for each track, we record the category of the lifetime maximum intensity (LMI), based on the maximum windspeed found along the whole lifetime of the track (i.e., using all available track points). LMI is a standard tropical cyclone metric, and is considered a robust measure of track intensity through time and across different types of data integrated into the HURDAT2 reanalysis (Elsner et al., 2008; Kossin et al., 2013, 2014). Therefore, for each track point, a dual category is assigned: the first portion of the classification denotes the category of the storm for a given point (hereafter “point category”), while the second denotes the LMI category. The combination of the two can thus be considered a “point-LMI category”. For example, the point on August 27, 2017 at 1200 UTC along Hurricane Harvey’s track is classified as TS-Maj because it is a tropical storm (TS) at this point but falls along a major hurricane LMI track (see starred location in Supporting Figure S2a). Given that the LMI category for a given point cannot be weaker than the point category itself, the set of possible point-LMI category combinations for each track point is TS-TS, TS-Min, TS-Maj, Min-Min, Min-Maj, and Maj-Maj. This dual classification allows us to explore climatological TCP spatial extents and intensities during the tropical cyclone lifetime. Our dual classification does not account for the timing of the point category relative to the LMI category for a given point along a track (i.e., the time-lag between the LMI and point in consideration). 
However, the majority of points selected in our analysis occur after the TC has reached its LMI and are in the weakening stage (see Supporting Table S1 for more details). This could be expected, as our analysis is focused on land-based precipitation stations, and TCs weaken over land. However, a small fraction of TC points analyzed occur over the ocean before making landfall, but are close enough to land for precipitation gauges to be impacted.
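    A minimal sketch of the dual point-LMI classification, assuming the quoted Extended Saffir-Simpson wind bins (TS 34-63 kt, Min 64-95 kt, Maj >= 96 kt); the windspeeds are invented:

```python
def ss_category(wind_kt):
    """Extended Saffir-Simpson bin for a maximum sustained windspeed (knots)."""
    if wind_kt >= 96:
        return "Maj"
    if wind_kt >= 64:
        return "Min"
    if wind_kt >= 34:
        return "TS"
    return None  # below tropical-storm strength

# Invented 1200 UTC windspeeds along one track.
track_winds = [45, 70, 100, 80, 50]

# The LMI category comes from the peak wind over the whole track.
lmi = ss_category(max(track_winds))

# Each point gets a "point-LMI" label such as 'TS-Maj'.
point_lmi = [f"{ss_category(w)}-{lmi}" for w in track_winds]
```

    Since the LMI is the track maximum, the LMI category can never be weaker than the point category, which yields the six combinations listed above.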

    3. Moving neighborhood method for TCP spatial extent and intensity

      We first find the distribution of tropical cyclone precipitation (TCP) intensity using all daily land precipitation values from all available stations in a 700 km-radius neighborhood around each point over land on each tropical cyclone track (Figure 1a and Supporting Figure S2). We then create two new binary station datasets, Z1(x) and Z50(x), which indicate whether or not a station meets or exceeds the 1 mm/day or 50 mm/day precipitation threshold, respectively, on a given day. The 50 mm/day threshold is greater than the 75th percentile of TCP across all tropical cyclone categories (Figure 1a), allowing us to capture the characteristics of heavy TCP while retaining a robust sample size. The 1 mm/day threshold captures the extent of the overall TCP around the TC track point.

    We use the relaxed moving neighborhood and semivariogram framework developed by Touma et al. (2018) to quantify the spatial extent of Z1 and Z50 TCP for each track point. Using a neighborhood with a 700 km radius around each track point, we select all station pairs that meet two criteria: at least one station has to exhibit the threshold precipitation on that given day (Z(x) = 1; blue and pink stations in Supporting Figure S2b), and at least one station has to be inside the neighborhood (black and pink stations in Supporting Figure S2b). We then calculate the indicator semivariogram, γ(h), for each station pair selected for that track point (Eq. 1):

    γ(h) = 0.5 * [Z(x+h) - Z(x)]², Eq. 1

    where h is the separation distance between the stations in the station pair. The indicator semivariogram is a function of the separation distance, and has two possible outcomes: all pairs with two threshold stations (Z(x) = Z(x+h) = 1) have a semivariogram value of 0, and all pairs with one threshold station and one non-threshold station (Z(x) = 1 and Z(x+h) = 0) have a semivariogram value of 0.5.

    We then average the semivariogram values for all station pairs for equal intervals of separation distances (up to 1000 km) to obtain the experimental semivariogram (Supporting Figure S2c). To quantify the shape of the experimental semivariogram, we fit three parameters of the theoretical spherical variogram (nugget, partial sill, and practical range) to the experimental semivariogram (Eq. 2):

    γ(h) = 0, for h = 0

    γ(h) = c + b*((3/2)(h/a) - (1/2)(h/a)³), for 0 < h ≤ a

    γ(h) = c + b, for h ≥ a, Eq. 2

    where c is the nugget, b is the partial sill, and a is the practical range (Goovaerts, 2015). The nugget quantifies measurement errors or microscale variability, and the partial sill is the maximum value reached by the spherical semivariogram (Goovaerts, 2015). The practical range is the separation distance at which the semivariogram asymptotes (Supporting Figure S2c). At this separation distance, station pairs are no longer likely to exhibit the threshold precipitation (1 mm/day or 50 mm/day) simultaneously (Goovaerts, 2015; Touma et al., 2018). Therefore, as in Touma et al. (2018), we define the length scale – or spatial extent – of TCP for that given track point as the practical range.
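    Eq. 2 can be written directly as a function. This sketch only evaluates the theoretical spherical model with invented parameters; fitting it to an experimental semivariogram would be a separate least-squares step:

```python
def spherical_variogram(h, nugget, partial_sill, practical_range):
    """Theoretical spherical semivariogram gamma(h) of Eq. 2."""
    if h == 0:
        return 0.0
    if h >= practical_range:
        return nugget + partial_sill          # the sill, reached at the range
    r = h / practical_range
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

# Invented parameters: nugget c = 0.1, partial sill b = 0.4,
# practical range a = 300 km.
gamma_mid = spherical_variogram(150, 0.1, 0.4, 300)    # 0.1 + 0.4*0.6875
gamma_sill = spherical_variogram(300, 0.1, 0.4, 300)   # c + b = 0.5
```

    The practical range returned by such a fit is what the text defines as the TCP length scale.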

    There are some subjective choices of the moving neighborhood and semivariogram framework, including the 700 km radius of neighborhood (Touma et al. 2018). Previous studies found that 700 km is sufficient to capture the extent to which tropical cyclones influence precipitation (e.g., Barlow, (2011), Daloz et al. (2010), Hernández Ayala & Matyas (2016), Kim et al. (2014), Knaff et al. (2014), Knutson et al. (2010) and Matyas (2010)). Additionally, Touma et al. (2018) showed that although the neighborhood size can slightly impact the magnitude of length scales, it has little impact on their relative spatial and temporal variations.

    4. Analysis of variations and trends

    We use Mood’s median test (Desu & Raghavarao, 2003) to test for differences in the median TCP intensity and spatial extent among point-LMI categories, adjusting p-values to account for multiple simultaneous comparisons (Benjamini & Hochberg, 1995; Holm, 1979; Sheskin, 2003). To test for changes in TCP characteristics over time, we divide our century-scale dataset into two halves, 1900-1957 and 1958-2017. First, the quartile boundaries are established using the distributions of the earlier period (1900-1957), with one-quarter of the distribution falling in each quartile. Then, we find the fraction of points in each quartile in the later period (1958-2017) to determine changes in the distribution. We also report the p-values of the Kolmogorov-Smirnov test.

  15. Number of LinkedIn users in the United Kingdom 2019-2028

    • statista.com
    Updated Nov 22, 2024
    Cite
    Statista Research Department (2024). Number of LinkedIn users in the United Kingdom 2019-2028 [Dataset]. https://www.statista.com/topics/3236/social-media-usage-in-the-uk/
    Explore at:
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    United Kingdom
    Description

    The number of LinkedIn users in the United Kingdom was forecast to increase continuously between 2024 and 2028 by a total of 1.5 million users (+4.51 percent). After an eighth consecutive year of growth, the LinkedIn user base is estimated to reach 34.7 million users, a new peak, in 2028. User figures, shown here for the platform LinkedIn, have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period and count multiple accounts by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  16. Post-processed neural and behavioral data class with reward-relative cells...

    • plus.figshare.com
    bin
    Updated May 17, 2025
    Cite
    Marielena Sosa; Mark Plitt; Lisa Giocomo (2025). Post-processed neural and behavioral data class with reward-relative cells identified [Dataset]. http://doi.org/10.25452/figshare.plus.27138633.v1
    Explore at:
    bin. Available download formats
    Dataset updated
    May 17, 2025
    Dataset provided by
    Figshare+
    Authors
    Marielena Sosa; Mark Plitt; Lisa Giocomo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains 2 pickled Python files containing the post-processed data used for: Sosa, Plitt, Giocomo, 2025. A flexible hippocampal population code for experience relative to reward. Nature Neuroscience. https://doi.org/10.1038/s41593-025-01985-4

    Hippocampal neurons were recorded in dorsal CA1 using 2-photon calcium imaging and synchronized with virtual reality behavior in head-fixed mice. This data is referred to as "post-processed" because shuffles have already been run to identify cells as "reward-relative" or not; therefore, the cell ID labels in this dataset will allow exact replication of any figures in the paper that do not require an additional shuffle of the data. The pickle or dill package in Python is required to load this dataset. See the accompanying dataset: Pre-processed neural and behavioral data, including raw fluorescence. See the code base on Github for usage and additional documentation.

    File name format: m[mouse-number-range]_expdays[list-of-day-numbers]_multiDayData_dff_[date saved, yyyymm]

    Each pickle file is a Python dictionary, either for (1) only the experimental days where a reward zone location was switched on a virtual linear track (3-5-7-8-10-12-14) or (2) all days, including days where the reward zone remained in the same location (1...14). In pickle (2), the data on the switch days are identical to pickle (1); both options are provided so that users can download a smaller file if they are only interested in the "switch" days. Each entry of the dictionary corresponds to a class object for a given experimental day, indexed by day number (e.g. [3, 5, 7, 8, 10, 12, 14]).

    Below are the most relevant attributes of the class for analyses in the paper. Additional attributes are explained in the dayData.py docstring on the Github. Values before the '--' are defaults.

    self.anim_list: list of mouse IDs included in this day
    self.place_cell_logical: 'or' -- cells were classified as place cells by having significant spatial information in the trials before OR after the reward switch
    self.force_two_sets: True -- trials were split into "set 0" before the reward switch and "set 1" after the reward switch. In animals without a reward switch, "set 0" and "set 1" correspond to the 1st and 2nd half of trials, respectively
    self.ts_key: 'dff' -- timeseries data type (dF/F) used to find place cell peaks
    self.use_speed_thr: True -- whether a running speed threshold was used to quantify neural activity
    self.speed_thr: 2 -- the speed threshold used, in cm/s
    self.exclude_int: True -- whether putative interneurons were excluded from analyses
    self.int_thresh: 0.5 -- speed correlation threshold to identify putative interneurons
    self.int_method: 'speed' -- method of finding putative interneurons
    self.reward_dist_exclusive: 50 -- distance in cm to exclude cells "near" reward
    self.reward_dist_inclusive: 50 -- distance in cm to include cells as "near" reward
    self.bin_size: 10 -- linear bin size (cm) for quantifying spatial activity
    self.sigma: 1 -- Gaussian s.d. in bins for smoothing
    self.smooth: False -- whether to smooth binned data for finding place cell peaks
    self.impute_NaNs: True -- whether to impute NaN bins in spatial activity matrices
    self.sim_method: 'correlation' -- trial-by-trial similarity matrix method: 'cosine_sim' or 'correlation'
    self.lick_correction_thr: 0.35 -- threshold to detect capacitive sensor errors and set trial licking to NaN
    self.is_switch: whether each animal had a reward switch
    self.anim_tag: string of animal ID numbers
    self.trial_dict: dictionary of booleans identifying each trial as in "set 0" or "set 1"
    self.rzone_pos: [start, stop] position of each reward zone (cm)
    self.rzone_by_trial: same as above but for each trial
    self.rzone_label: label of each reward zone (e.g. 'A', 'B')
    self.activity_matrix: spatially-binned neural activity of type self.ts_key (trials x position bins x neurons)
    self.events: original spatially-binned deconvolved events (trials x position bins x neurons) (no speed threshold applied)
    self.place_cell_masks: booleans identifying which cells are place cells in each trial set
    self.SI: spatial information for each cell in each trial set
    self.overall_place_cell_masks: single boolean identifying which cells are place cells according to self.place_cell_logical
    self.peaks: spatial bin center of peak activity for each cell in each trial set
    self.field_dict: dictionary of place field properties for each cell
    self.plane_per_cell: imaging plane of each cell (all zeros if only a single plane was imaged, otherwise 0 or 1 if two planes were imaged)
    self.is_int: boolean, whether each cell is a putative interneuron
    self.is_reward_cell: boolean, whether each cell has a peak within 50 cm of both reward zone starts
    self.is_end_cell: boolean, whether each cell has a peak in the first or last spatial bin of the track
    self.is_track_cell: boolean, whether each cell's peak stays within 50 cm of itself from trial set 0 to trial set 1
    self.sim_mat: trial-by-trial similarity matrix for place cells, licking, and speed
    self.in_vs_out_lickratio: ratio of lick rate in the anticipatory zone vs. everywhere outside the anticipatory and reward zones
    self.lickpos_std: standard deviation of licking position
    self.lick_mat: matrix of lick rate in each spatial bin (trials x position bins)
    self.cell_class: dictionary containing booleans of which cells have remapping types classified as "track", "disappear", "appear", "reward", or "nonreward_remap", where: 'track' = track-relative; 'disappear' = disappearing; 'appear' = appearing; 'reward' = remap near reward (firing peak ≤50 cm from both reward zone starts), including reward-relative; 'nonreward_remap' = remap far from reward (>50 cm from reward zone start), including reward-relative. See the Fig. 2 notebook and code docstrings for more details.
    self.pos_bin_centers: position bin centers
    self.dist_btwn_rel_null: distance between spatial firing peaks relative to reward before the switch and the "random remapping" shuffle after the switch (radians)
    self.dist_btwn_rel_peaks: distance between spatial firing peaks relative to reward before vs. after the switch (radians)
    self.reward_rel_cell_ids: integer cell indices that were identified as reward-relative after application of all criteria
    self.xcorr_above_shuf: lag, in spatial bins, of the above-shuffle maximum of the cross-correlation used to confirm cells as reward-relative (computed for all cells; NaNs indicate that the xcorr did not exceed shuffle)
    self.reward_rel_dist_along_unity: circular mean of pre-switch and post-switch spatial firing peak position relative to reward (radians)
    self.rel_peaks: spatial firing peak position relative to reward in each trial set (radians)
    self.rel_null: spatial firing peak position relative to reward, for the random-remapping shuffle post-switch (radians)
    self.circ_licks: spatially-binned licking, in circular coordinates relative to reward (trials x position bins)
    self.circ_speed: spatially-binned speed, in circular coordinates relative to reward (trials x position bins)
    self.circ_map: mean spatially-binned neural activity within each trial set, of type self.ts_key, in circular coordinates relative to reward
    self.circ_trial_matrix: spatially-binned neural activity of type self.ts_key, in circular coordinates relative to reward (trials x position bins x neurons)
    self.circ_rel_stats_across_an: metadata across the "switch" animals: 'include_ans' (list of "switch" animal names), 'rdist_to_rad_inc' (self.reward_dist_inclusive converted to radians), 'rdist_to_rad_exc' (self.reward_dist_exclusive converted to radians), 'min_pos' (minimum position bin used), 'max_pos' (maximum position bin used), 'hist_bin_centers' (bin centers used for spatial binning)
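    Since the files are pickled Python dictionaries keyed by experimental day, loading reduces to a pickle (or dill) round-trip. The class and values below are stand-ins for illustration, not the dataset's real dayData class:

```python
import pickle

class DayData:
    """Stand-in for the dataset's per-day class (attribute values invented)."""
    def __init__(self, rzone_label):
        self.rzone_label = rzone_label

# Dictionary keyed by experimental day number, as described above.
multi_day = {3: DayData(["A"]), 5: DayData(["A", "B"])}

blob = pickle.dumps(multi_day)   # what a pickled file holds on disk
restored = pickle.loads(blob)
day5 = restored[5]               # class object for experimental day 5
```

    With the real files, `pickle.load(open(path, "rb"))` (or `dill.load`) replaces the in-memory round-trip, and the dayData class definition from the GitHub code base must be importable.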

  17. MODIS Thermal (Last 7 days) - Dataset - CKAN


    • nationaldataplatform.org
    Updated Feb 28, 2024
    + more versions
    Cite
    (2024). MODIS Thermal (Last 7 days) - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/modis-thermal-last-7-days
    Explore at:
    Dataset updated
    Feb 28, 2024
    Description

    This layer presents detectable thermal activity from MODIS satellites for the last 7 days. MODIS Global Fires is a product of NASA's Earth Observing System Data and Information System (EOSDIS), part of NASA's Earth Science Data. EOSDIS integrates remote sensing and GIS technologies to deliver global MODIS hotspot/fire locations to natural resource managers and other stakeholders around the world.

    Consumption Best Practices: As a service that is subject to very high usage, avoid adding filters that use a Date/Time type field. These queries are not cacheable and will be subject to rate limiting by ArcGIS Online. To filter events by date/time, use the included "Age" fields instead, which maintain the number of days or hours since a record was created or last modified, relative to the last service update. These queries fully support response caching, allowing common query results to be supplied to many users without adding load on the service. When ingesting this service in your applications, avoid POST requests; they are not cacheable and will also be subject to rate limiting.

    Source: NASA FIRMS - Active Fire Data - for World
    Scale/Resolution: 1 km
    Update Frequency: 1/2 hour (every 30 minutes) using the Aggregated Live Feed Methodology
    Area Covered: World

    What can I do with this layer? The MODIS thermal activity layer can be used to visualize and assess wildfires worldwide. However, this dataset contains many "false positives" (e.g., oil/natural gas wells or volcanoes), since the satellite will detect any large thermal signal.

    Additional Information: MODIS stands for MODerate resolution Imaging Spectroradiometer. The MODIS instrument is on board NASA's Earth Observing System (EOS) Terra (EOS AM) and Aqua (EOS PM) satellites. The orbit of the Terra satellite goes from north to south across the equator in the morning, and Aqua passes south to north over the equator in the afternoon, resulting in global coverage every 1 to 2 days. The EOS satellites have a ±55 degree scanning pattern and orbit at 705 km with a 2,330 km swath width. It takes approximately 2-4 hours after satellite overpass for MODIS Rapid Response to process the data, and for the Fire Information for Resource Management System (FIRMS) to update the website. Occasionally, hardware errors can result in processing delays beyond the 2-4 hour range. Additional information on the MODIS system status can be found at MODIS Rapid Response.

    Attribute Information:
    Latitude and Longitude: The center point location of the (approx.) 1 km pixel flagged as containing one or more fires/hotspots (fire size is not 1 km, but variable). Stored by Point Geometry. See "What does a hotspot/fire detection mean on the ground?"
    Brightness: The brightness temperature measured (in Kelvin) using the MODIS channels 21/22 and channel 31.
    Scan and Track: The actual spatial resolution of the scanned pixel. Although the algorithm works at 1 km resolution, the MODIS pixels get bigger toward the edge of the scan. See "What does scan and track mean?"
    Date and Time: Acquisition date of the hotspot/active fire pixel and time of satellite overpass in UTC (client presentation in local time). Stored by Acquisition Date.
    Acquisition Date: Derived Date/Time field combining the Date and Time attributes.
    Satellite: Whether the detection was picked up by the Terra or Aqua satellite.
    Confidence: The detection confidence, a quality flag of the individual hotspot/active fire pixel.
    Version: The processing collection and source of the data. The number before the decimal refers to the collection (e.g., MODIS Collection 6). The number after the decimal indicates the source of the Level 1B data: data processed in near-real time by MODIS Rapid Response have the source code "CollectionNumber.0", while data sourced from MODAPS (with a 2-month lag) and processed by FIRMS using the standard MOD14/MYD14 Thermal Anomalies algorithm have a source code "CollectionNumber.x". For example, version 5.0 is collection 5 processed by MRR, and version 5.1 is collection 5 processed by FIRMS using Level 1B data from MODAPS.
    Bright.T31: Channel 31 brightness temperature (in Kelvin) of the hotspot/active fire pixel.
    FRP: Fire Radiative Power, the pixel-integrated fire radiative power in MW (megawatts). FRP provides information on the measured radiant heat output of detected fires; the amount of radiant heat energy liberated per unit time is thought to be related to the rate at which fuel is being consumed (Wooster et al. (2005)).
    DayNight: The standard processing algorithm uses the solar zenith angle (SZA) to threshold the day/night value: if the SZA exceeds 85 degrees, the pixel is assigned a night value; SZA values less than 85 degrees are assigned a day value. For the NRT algorithm, the day/night flag is assigned by ascending (day) vs. descending (night) observation. The NRT assignment of the day/night flag is expected to be amended to be consistent with the standard processing.
    Hours Old: Derived field that gives the age of a record, in hours, between the acquisition date/time and the latest update date/time. 0 = less than 1 hour ago, 1 = less than 2 hours ago, 2 = less than 3 hours ago, and so on.

    Revisions: June 22, 2022: Added the 'HOURS_OLD' field to enhance filtering, added the 'Last 7 days' layer to match the time range of the VIIRS offering, and added field-level descriptions.

    This map is provided for informational purposes and is not monitored 24/7 for accuracy and currency. If you would like to be alerted to potential issues, or simply see when this service will update next, please visit our Live Feed Status Page!
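The consumption advice above (filter on the derived "Hours Old" age field with a cacheable GET request, rather than filtering a Date/Time field or using POST) can be sketched as a query-URL builder. The endpoint below is a placeholder and the field names are assumptions based on the attribute list; the actual layer URL and schema should be taken from the service itself:

```python
from urllib.parse import urlencode

# Hypothetical feature-layer endpoint; substitute the real MODIS Thermal layer URL.
BASE = "https://services.example.com/arcgis/rest/services/MODIS_Thermal/FeatureServer/0/query"

params = {
    # Filter on the derived age field (assumed name 'HOURS_OLD') instead of a
    # Date/Time field, so the response stays cacheable and avoids rate limiting.
    "where": "HOURS_OLD <= 3",  # detections from roughly the last 3 hours
    # Attribute names taken from the description above; verify against the layer.
    "outFields": "LATITUDE,LONGITUDE,BRIGHTNESS,CONFIDENCE,FRP",
    "f": "json",
}

# Build a GET URL (POST requests are not cacheable per the best practices above).
url = BASE + "?" + urlencode(params)
```

Fetching `url` with any HTTP client then returns the matching hotspot features as JSON.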

  18. Reddit users in the United States 2019-2028

    • statista.com
    • ai-chatbox.pro
    Updated Jun 13, 2024
    Cite
    Statista Research Department (2024). Reddit users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
    Explore at:
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Statista Research Department
    Area covered
    United States
    Description

    The number of Reddit users in the United States was forecast to increase continuously between 2024 and 2028, by a total of 10.3 million users (+5.21 percent). After a ninth consecutive year of growth, the Reddit user base is estimated to reach a new peak of 208.12 million users in 2028. Notably, the number of Reddit users has increased continuously over the past years.

    User figures, shown here for the platform Reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to average monthly active users over the period, count multiple accounts held by one person only once, and encompass both logged-in and logged-out users.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of Reddit users in countries like Mexico and Canada.
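    As a quick sanity check, the quoted figures are internally consistent: subtracting the total increase from the 2028 peak gives the implied 2024 baseline, and the percentage growth follows from it.

```python
# Figures quoted in the description above (million users).
peak_2028 = 208.12   # forecast peak in 2028
increase = 10.3      # total 2024-2028 increase

baseline_2024 = peak_2028 - increase          # implied 2024 baseline
pct_growth = 100 * increase / baseline_2024   # growth relative to baseline

print(round(baseline_2024, 2), round(pct_growth, 2))  # 197.82 5.21
```

    The same check works for the Twitter entry: 85.08 - 4.3 = 80.78 million in 2024, and 4.3 / 80.78 ≈ +5.32 percent.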

  19. Instagram: distribution of global audiences 2024, by age group

    • statista.com
    • davegsmith.com
    Updated Jun 17, 2025
    + more versions
    Cite
    Stacy Jo Dixon (2025). Instagram: distribution of global audiences 2024, by age group [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    As of April 2024, almost 32 percent of global Instagram audiences were aged between 18 and 24 years, and 30.6 percent of users were aged between 25 and 34 years. Overall, 16 percent of users belonged to the 35 to 44 year age group.

    Instagram users

    With roughly one billion monthly active users, Instagram belongs to the most popular social networks worldwide. The social photo sharing app is especially popular in India and in the United States, which have 362.9 million and 169.7 million Instagram users, respectively.

    Instagram features

    One of the most popular features of Instagram is Stories. Users can post photos and videos to their Stories stream, and the content is live for others to view for 24 hours before it disappears. In January 2019, the company reported that there were 500 million daily active Instagram Stories users. Instagram Stories directly competes with Snapchat, another photo sharing app that initially became famous due to its "vanishing photos" feature. As of the second quarter of 2021, Snapchat had 293 million daily active users.
    
  20. Instagram: most used hashtags 2024

    • statista.com
    • davegsmith.com
    Updated Jun 17, 2025
    + more versions
    Cite
    Statista Research Department (2025). Instagram: most used hashtags 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Statista Research Department
    Description

    As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.

