7 datasets found
  1. Covid Twitter Sentiment Analysis Datasets

    • kaggle.com
    zip
    Updated Jan 7, 2021
    + more versions
    Cite
    MEJBAH AHAMMAD (2021). Covid Twitter Sentiment Analysis Datasets [Dataset]. https://www.kaggle.com/mejbahahammad/covid-twitter-sentiment-analysis-datasets
    Explore at:
    Available download formats: zip (111387463 bytes)
    Dataset updated
    Jan 7, 2021
    Authors
    MEJBAH AHAMMAD
    Description

    This dataset gives a cursory glimpse at the overall sentiment trend of the public discourse regarding the COVID-19 pandemic on Twitter. The live scatter plot of this dataset is available as The Overall Trend block at https://live.rlamsal.com.np. The trend graph reveals multiple peaks and drops that need further analysis. The n-grams during those peaks and drops can prove beneficial for better understanding the discourse. The dataset will be updated weekly for as long as development of the Coronavirus (COVID-19) Tweets Dataset is ongoing.

  2. Tableau Dummy Dataset for Practice

    • kaggle.com
    Updated Aug 21, 2025
    Cite
    Piush Dave (2025). Tableau Dummy Dataset for Practice [Dataset]. https://www.kaggle.com/datasets/piyushdave/tableau-dummy-dataset-for-practice
    Dataset updated
    Aug 21, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Piush Dave
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Domain-Specific Dataset and Visualization Guide

    This package contains 20 realistic datasets in CSV format across different industries, along with 20 text files suggesting visualization ideas. Each dataset includes about 300 rows of synthetic but domain-appropriate data. They are designed for data analysis, visualization practice, machine learning projects, and dashboard building.

    What’s inside

    • 20 CSV files, one for each domain:

      1. Education
      2. E-Commerce
      3. Healthcare
      4. Finance
      5. Retail
      6. Social Media
      7. Manufacturing
      8. Sports
      9. Transport
      10. Hospitality
      11. Telecom
      12. Banking
      13. Real Estate
      14. Gaming
      15. Agriculture
      16. Automobile
      17. Energy
      18. Insurance
      19. Government
      20. Entertainment

    • 20 TXT files, each listing 10 relevant graphing options for the dataset.

    • MASTER_INDEX.csv, which summarizes all domains with their column names.

    Use cases

    • Practice data cleaning, exploration, and visualization in Excel, Tableau, Power BI, or Python.
    • Build dashboards for specific industries.
    • Train beginner-level machine learning models such as classification and regression.
    • Use in classroom teaching or workshops as ready-made datasets.

    Example

    • Education dataset has columns like StudentName, Class, Subject, Marks, AttendancePercent. Suggested graphs: bar chart of average marks by subject, scatter plot of marks vs attendance percent, line chart of attendance over time.

    • E-Commerce dataset has columns like OrderDate, Product, Category, Price, Quantity, Total. Suggested graphs: line chart of revenue trend, bar chart of revenue by category, pie chart of payment mode share.
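
    As a rough illustration of the first suggested Education graph (average marks by subject), the values behind such a bar chart could be computed as in the sketch below. Only the column names come from the description above; the rows, the SAMPLE_CSV name, and the average_marks_by_subject helper are made up for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: only the column names are taken from the dataset description.
SAMPLE_CSV = """StudentName,Class,Subject,Marks,AttendancePercent
Asha,10,Math,80,92
Ben,10,Math,60,75
Chen,10,Science,90,88
Dara,10,Science,70,95
"""


def average_marks_by_subject(csv_text):
    """Return {subject: mean marks}, i.e. the bar heights for the chart."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Subject"]] += float(row["Marks"])
        counts[row["Subject"]] += 1
    return {subject: totals[subject] / counts[subject] for subject in totals}


averages = average_marks_by_subject(SAMPLE_CSV)
for subject, mean in sorted(averages.items()):
    print(f"{subject}: {mean:.1f}")
```

    The resulting dictionary can be passed to any plotting tool (Excel, Tableau, Power BI, or a Python plotting library) to draw the bars.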

  3. Annual Time Series of Air Temperature, Precipitation, and Urban Area Extent...

    • b2find.eudat.eu
    Updated Mar 20, 2024
    + more versions
    Cite
    (2024). Annual Time Series of Air Temperature, Precipitation, and Urban Area Extent in Modena, Italy - Files - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/46e3fcf8-0259-5400-ad52-45a02ed2d903
    Explore at:
    Dataset updated
    Mar 20, 2024
    Area covered
    Italy, Modena
    Description

    An uninterrupted data set of 139 annual values of local mean air temperature T, cumulative precipitation depth P, urban area extent A, global mean surface air temperature G, and global CO2 concentration C for the 1881-2019 period is shared with the scientific community. The Matlab 2021a code na.m, which performs a nonlinear analysis of the data contained in the file ts.dat, is also shared. The code loads the file ts.dat and generates the PDF files of this dataset. File README.txt contains the description of this dataset and its files.

    The shared data can be found in the ASCII text file ts.dat (as well as in dataset doi:10.1594/PANGAEA.938739, which has been created from that file). Its columns are:

    • Column 1, header year: the year.
    • Column 2, header T (°C): the local mean air temperature T in degrees Celsius observed in Modena.
    • Column 3, header P (mm): the cumulative precipitation depth P in millimeters in Modena.
    • Column 4, header A (km2): the urban area extent A of Modena in square kilometers.
    • Column 5, header G (°C): the global mean surface air temperature G in degrees Celsius, obtained by adding the GISTEMP temperature change to the average temperature observed in Modena in the 1951–1980 base period (https://data.giss.nasa.gov/gistemp/).
    • Column 6, header C (ppm): the global CO2 concentration C in parts per million, estimated from ice cores from 1881 to 1958 (https://cdiac.ess-dive.lbl.gov/trends/co2/lawdome-data.html) and observed at the Mauna Loa Observatory (latitude 19.5362°N, longitude 155.5763°W, elevation 3397.00 m asl), Hawaii, from 1959 to 2019 (https://gml.noaa.gov/ccgg/trends/data.html).

    The Matlab 2021a code na.m loads the file ts.dat and generates the following PDF files:

    • lg.pdf: Comparison between local temperature in Modena and global temperatures obtained from the NASA GISTEMP temperature change projected to Modena.
    • dm.pdf: Scatter plot matrix of T, P, A, G, and C.
    • vm.pdf: Scatter plot matrix for the first differences of T, P, A, G, and C.
    • pm.pdf: Generalized additive model predictions of T, P, A, G, and C, denoted as T', P', A', G', and C', obtained from single predictors T, P, A, G, and C.
    • gam.pdf: Generalized additive model predictions of T and G, denoted as T' and G', respectively, obtained from multiple predictors based on T, P, A, G, and C.

    The nonlinear analysis performed by using the data set contained in the ASCII text file ts.dat and the Matlab 2021a code na.m is described in Orlandini et al. 2021 and available from the authors sharing the present data set upon request.
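
    The scatter plot matrix in vm.pdf is built from first differences, that is, the change from one year to the next in each series. The following is a minimal Python sketch of that transformation, not the authors' Matlab code; the first_differences helper and the sample temperatures are assumptions made for illustration only.

```python
def first_differences(values):
    """Return the year-to-year changes x[t+1] - x[t] of an annual series."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical annual mean temperatures in °C (not values from ts.dat).
sample_temperatures = [13.0, 13.5, 13.25, 14.0]
print(first_differences(sample_temperatures))
```

    A series of n annual values yields n - 1 differences, so the 139-value series described above would give 138 differences per variable.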

  4. Scatter plot of the samples in the prostate cancer dataset contributed by...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Regina Berretta; Pablo Moscato (2023). Scatter plot of the samples in the prostate cancer dataset contributed by True et al., presenting the MPR-Statistical Complexity of each sample as a function of its Normalized Shannon Entropy. [Dataset]. http://doi.org/10.1371/journal.pone.0012262.g012
    Explore at:
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Regina Berretta; Pablo Moscato
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains the expression of 13,188 probes and 31 samples. The samples include 11 samples labelled ‘Gleason 3’ (in green), 12 ‘Gleason 4’ samples, and 8 ‘Gleason 5’ (in red). Two samples seem to be outliers to a generic trend, which is somewhat expected. We do expect samples with a ‘Gleason 3’ label to have higher values of Normalized Shannon Entropy. This is indeed the case: no sample with a ‘Gleason 3’ label has a value of Normalized Shannon Entropy lower than 0.985, while 14 samples labelled either ‘Gleason 4’ or ‘Gleason 5’ have values smaller than that threshold. In agreement with some of the caveats discussed by True et al., there exists a group of samples that, irrespective of their label, have similar values of Normalized Shannon Entropy (near 0.992). Samples 02_003E and 03_063 seem to be outliers to this trend, and in the case of 03_063 the sample is not even close to a hypothetical linear fit, which seems to be the norm for all the other samples. Figure 13 will provide further evidence on whether these two samples are outliers to the overall trend.

  5. National Universities Rankings

    • kaggle.com
    Updated Dec 3, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Devastator (2022). National Universities Rankings [Dataset]. https://www.kaggle.com/datasets/thedevastator/national-universities-rankings-explore-quality-t/discussion
    Dataset updated
    Dec 3, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    National Universities Rankings

    Analyze 1,800 U.S. Universities and their Academic Performance

    By Education [source]

    About this dataset

    Welcome to the U.S. News & World Report's 2017 National Universities Rankings, a comprehensive dataset of over 1,800 schools across the United States providing quality data on admissions criteria, cost of tuition and fees, enrollment numbers, and overall rankings! Here you'll find up-to-date information on institutes of higher learning from Princeton University at the top spot in Best National Universities to Williams College at No. 1 on the Best National Liberal Arts Colleges list.

    This collection of data is all that's needed for potential students - parents, counselors and more - to evaluate their choices in selecting a college or university that perfectly meets their needs. For instance: what is the total tuition & fees cost? What are student enrollment numbers? How have students rated this school? Which universities have been recognized as top institutions in academics by U.S. News & World Report? What admissions criteria do these schools evaluate when considering an applicant's profile? The answers lie within this dataset!

    Explore each category separately, or together with other considerations, through visuals like our scatter plot, which charts enrollment patterns against yearly expenses including room & board charges. Crucial factors such as six-year graduation rates and freshman retention rates for the nation's universities are also covered, allowing for comparison and assessment before you choose your own path.


    How to use the dataset

    This dataset contains information on the quality, tuition, and enrollment data of 1,800 U.S.-based universities ranked by U.S. News & World Report from 2017. It includes rankings from the National University and Liberal Arts College lists in addition to relevant data points like tuition fees and undergraduate enrollments for each school.

    Users can take advantage of this dataset to build models that predict ranking or cost-benefit results for students by using cost-related (tuition) metrics along with quality metrics (rankings). Alternatively, users can analyze trends between investments in higher education and outcomes (ranking), or explore the relationship between enrollments for schools of varying rank tiers.

    For more information on how rankings are calculated, please refer to the methodology explainer on the U.S. News website.

    Here is an overview of all columns included in this dataset:

    Columns:

    • Name - institution name
    • Location - city and state where located
    • Rank - ranking according to U.S. News & World Report
    • Description - snippet of text overview from U.S. News
    • Tuition and fees - combined tuition and fees for out-of-state students
    • In–state - tuition and fees for in-state students
    • Undergraduate Enrollment - number of enrolled undergraduate students

    Using this column detail as a guide, we can answer questions like ‘Which colleges give the highest ROI?’ or ‘Which college has the highest number of undergraduates?’. For statistical analysis such as correlation, a visual representation such as a scatter plot or bar graph makes it easier to analyze trends within the dataset and to explore relationships between factors such as tuition and rank.
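
    To put a number on a relationship such as tuition versus rank before plotting it, one option is the Pearson correlation coefficient. The sketch below is a plain-Python illustration under stated assumptions: the pearson helper is written for this example, and the rank and tuition values are invented, not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values standing in for the Rank and Tuition and fees columns.
ranks = [1, 2, 3, 4]
tuition = [50000, 48000, 46000, 44000]
r = pearson(ranks, tuition)
print(f"r = {r:.2f}")
```

    A value near -1 here would mean that better-ranked (lower-numbered) schools tend to charge more; real data will be far noisier than this toy sample.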

    Research Ideas

    • Developing a searchable database to help high school students identify colleges that match their criteria in terms of tuition, graduation rate, location, and rank.
    • Identifying correlations between enrollment numbers and university rank in order to better understand how the number of enrolled students affects the overall ranking of a university.
    • Comparing universities with similar rankings in order to highlight differences between programs’ tuition and fees as well as retention rates

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, ...

  6. Seasonal Fire Weather Index Analysis – Italy (2007–2024)

    • figshare.com
    csv
    Updated Sep 26, 2025
    Cite
    Luis Angel Espinosa; Giorgio Vacchiano (2025). Seasonal Fire Weather Index Analysis – Italy (2007–2024) [Dataset]. http://doi.org/10.6084/m9.figshare.30218152.v1
    Explore at:
    Available download formats: csv
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    figshare
    Authors
    Luis Angel Espinosa; Giorgio Vacchiano
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    📌 Overview

    This repository contains all the necessary scripts and input data to reproduce the analysis of the seasonal Fire Weather Index (FWI) indicators for Italy between 2007–2024 (18 years).

    The workflow consists of three R scripts that calculate mean/median FWI values, burned area statistics, and relationships between fire danger indicators and observed burned areas.

    ⚠️ Important Notes

    🔹 Dataset provenance

    The FWI dataset used here originates from the Copernicus Climate Data Store. This dataset is no longer supported by the data providers. Data and documentation are provided as is. Users are encouraged to consult the CDS Forum for any discussions or clarifications.

    The original NetCDF (nc) files contained:

    • Seasonal data: June–September (fire season)
    • Indicator: Fire Weather Index (FWI)
    • Scenario: RCP2.6
    • Aggregation: Multi-model mean case

    These NetCDF files were reprojected using Climate Data Operators (CDO) to a regular lon/lat grid before being included here.

    Reprojection grid specification:

      gridtype = lonlat
      xsize = 1000
      ysize = 460
      xfirst = -45
      xinc = 0.11
      yfirst = 22
      yinc = 0.11

    📂 Folder Structure

    After downloading, place the folder in the following path: C:/NERO2025/

    The folder should contain:

      C:/NERO2025/
      │── Script 01 C3S Analysis.R                      # Step 1 – FWI annual mean & median
      │── Script 02 Burned Areas Per Region.R           # Step 2 – Burned areas per ecoregion
      │── Script 03 Scatter Plots FWI vs Burned Area.R  # Step 3 – FWI–burned area link
      │── IB_ITA_2007_2024_v_1.shp                      # Shapefile of burned areas
      │── subsez_ecoregioni_IT_proj.shp                 # Shapefile of Italian ecoregions
      │── other shapefile files (.dbf, .shx, etc.)

    ▶️ Workflow

    Step 1: Annual FWI statistics

    Run Script 01 C3S Analysis.R
    Calculates mean and median seasonal FWI values per year for Italy and per ecoregion.
    Outputs:
    • FWI_Mean_Table.csv
    • FWI_Median_Table.csv

    Step 2: Burned areas per ecoregion

    Run Script 02 Burned Areas Per Region.R
    Intersects burned areas with Italian ecoregions.
    Calculates:
    • Absolute burned area per year & ecoregion
    • Relative burned area (% of ecoregion burned)
    Performs trend analysis using linear regression & Mann-Kendall.
    Outputs:
    • BurnedArea_Ecoregions.csv
    • RelativeBurnedArea_Ecoregions.csv
    • BurnedArea_Ecoregions_Trends.csv
    • Plots saved in BurnedArea_Trends_Plots/

    Step 3: Linking FWI and burned areas

    Run Script 03 Scatter Plots FWI vs Burned Area.R
    Merges FWI tables (Step 1) with burned area statistics (Step 2).
    Generates scatterplots with regression fits, showing the relationship between FWI and burned areas.
    Outputs:
    • Plots saved in FWI_Burned_ScatterPlots/
    • CSVs of data used for each plot

    ⚠️ Dependencies: Script 03 requires the results of Scripts 01 & 02. Therefore, always run them in sequence.
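
    Step 2 mentions a Mann-Kendall trend analysis. For readers unfamiliar with the test, the sketch below shows the basic statistic in Python. It is not the repository's R code: the mann_kendall helper is hypothetical, the burned-area values are invented, and the variance formula used here assumes there are no tied values in the series.

```python
import math


def mann_kendall(values):
    """Return (S, Z) for the Mann-Kendall trend test, without tie correction."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # S counts later-minus-earlier pairs that increase minus those that decrease.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend hypothesis, assuming no ties.
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Hypothetical yearly burned areas (arbitrary units) for one ecoregion.
burned_area = [120.0, 95.0, 150.0, 170.0, 160.0, 210.0]
s_stat, z_score = mann_kendall(burned_area)
print(f"S = {s_stat}, Z = {z_score:.2f}")
```

    A positive S suggests an increasing trend and a negative S a decreasing one; Z is compared against the standard normal distribution to judge significance.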

  7. Scatter plot showing the expression of the probe corresponding to ADA...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Regina Berretta; Pablo Moscato (2023). Scatter plot showing the expression of the probe corresponding to ADA (Adenosine deaminase), AA683578 (y-axis) and TP63 (Tumor protein p63), AA455929 (x-axis). [Dataset]. http://doi.org/10.1371/journal.pone.0012262.g007
    Explore at:
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Regina Berretta; Pablo Moscato
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All the samples that have TP63 expression are normal or nevi, with two primary melanomas still preserving TP63 expression but with higher ADA. The trend reverses for the rest of the primary melanoma samples and the metastatic ones, which all express ADA but not TP63.



Covid Twitter Sentiment Analysis Datasets

IEEE CORONAVIRUS (COVID-19) TWEETS DATASET

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)