100+ datasets found
  1. bleeding-edge-gameplay-sample

    • huggingface.co
    Updated Feb 21, 2025
    Cite
    Microsoft (2025). bleeding-edge-gameplay-sample [Dataset]. https://huggingface.co/datasets/microsoft/bleeding-edge-gameplay-sample
    Explore at:
    Dataset updated
    Feb 21, 2025
    Dataset authored and provided by
    Microsoft (http://microsoft.com/)
    Description

    This dataset contains 1,024 60-second video clips of Bleeding Edge gameplay (75GB). The data has already been processed into 300x180 videos sampled at 10 fps.
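    As a rough size check, the figures in the description imply a fixed number of frames per clip. The sketch below uses only the numbers stated above (1,024 clips, 60 seconds, 10 fps, 300x180 pixels); the variable names are illustrative and nothing here reads the actual files.

```python
# Back-of-the-envelope arithmetic from the dataset description.
CLIPS = 1024
SECONDS_PER_CLIP = 60
FPS = 10
WIDTH, HEIGHT = 300, 180

frames_per_clip = SECONDS_PER_CLIP * FPS          # frames in one 60-second clip
total_frames = CLIPS * frames_per_clip            # frames across the whole sample
pixels_per_frame = WIDTH * HEIGHT                 # pixels in one 300x180 frame
total_hours = CLIPS * SECONDS_PER_CLIP / 3600     # total gameplay duration

print(frames_per_clip, total_frames, pixels_per_frame, round(total_hours, 2))
```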

    Dataset Structure

    Data Files

    testing_dataset_part1.zip & testing_dataset_part2.zip – Contain all 1,024 60-second trajectories used for our evaluation.

    4 examples from the dataset:

    FB[…].npz – .npz file (described below)
    FB[…].mp4 – 60-second .mp4 video of the images from the .npz file.

    … See the full description on the dataset page: https://huggingface.co/datasets/microsoft/bleeding-edge-gameplay-sample.

  2. Adventure Works 2022 CSVs

    • kaggle.com
    zip
    Updated Nov 2, 2022
    Cite
    Algorismus (2022). Adventure Works 2022 CSVs [Dataset]. https://www.kaggle.com/datasets/algorismus/adventure-works-in-excel-tables
    Explore at:
    Available download formats: zip (567646 bytes)
    Dataset updated
    Nov 2, 2022
    Authors
    Algorismus
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    Adventure Works 2022 dataset

    How was this dataset created?

    On the official website, the dataset is available over SQL Server (localhost) and as CSVs to be used via Power BI Desktop running on a virtual lab (virtual machine). The first two steps of importing data were executed in the virtual lab, and the resulting Power BI tables were then copied into CSVs. Records up to the year 2022 were added as required.

    How may this dataset help you?

    This dataset is helpful if you want to work offline with Adventure Works data in Power BI Desktop to carry out the lab instructions in the training material on the official website. It is also useful if you want to work on the Power BI Desktop Sales Analysis example from the Microsoft website's PL-300 learning material.

    How to use this dataset?

    Download the CSV file(s) and import them into Power BI Desktop as tables. The CSVs are named after the tables created in the first two steps of importing data, as described in the PL-300 Microsoft Power BI Data Analyst exam lab.

  3. Data from: Delta Neighborhood Physical Activity Study

    • s.cnmilf.com
    • agdatacommons.nal.usda.gov
    • +1more
    Updated Jun 5, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Delta Neighborhood Physical Activity Study [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/delta-neighborhood-physical-activity-study-f82d7
    Explore at:
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    The Delta Neighborhood Physical Activity Study was an observational study designed to assess characteristics of neighborhood built environments associated with physical activity. It was an ancillary study to the Delta Healthy Sprouts Project and therefore included towns and neighborhoods in which Delta Healthy Sprouts participants resided. The 12 towns were located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys between August 2016 and September 2017 using the Rural Active Living Assessment (RALA) tools and the Community Park Audit Tool (CPAT). Scale scores for the RALA Programs and Policies Assessment and the Town-Wide Assessment were computed using the scoring algorithms provided for these tools via SAS software programming. The Street Segment Assessment and CPAT do not have associated scoring algorithms and therefore no scores are provided for them. Because the towns were not randomly selected and the sample size is small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

    Dataset one contains data collected with the RALA Programs and Policies Assessment (PPA) tool. Dataset two contains data collected with the RALA Town-Wide Assessment (TWA) tool. Dataset three contains data collected with the RALA Street Segment Assessment (SSA) tool. Dataset four contains data collected with the Community Park Audit Tool (CPAT). [Note: title changed 9/4/2020 to reflect study name]

    Resources in this dataset:

    • Dataset One RALA PPA Data Dictionary. File Name: RALA PPA Data Dictionary.csv. Resource Description: Data dictionary for dataset one collected using the RALA PPA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Dataset Two RALA TWA Data Dictionary. File Name: RALA TWA Data Dictionary.csv. Resource Description: Data dictionary for dataset two collected using the RALA TWA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Dataset Three RALA SSA Data Dictionary. File Name: RALA SSA Data Dictionary.csv. Resource Description: Data dictionary for dataset three collected using the RALA SSA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Dataset Four CPAT Data Dictionary. File Name: CPAT Data Dictionary.csv. Resource Description: Data dictionary for dataset four collected using the CPAT. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Dataset One RALA PPA. File Name: RALA PPA Data.csv. Resource Description: Data collected using the RALA PPA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Dataset Two RALA TWA. File Name: RALA TWA Data.csv. Resource Description: Data collected using the RALA TWA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Dataset Three RALA SSA. File Name: RALA SSA Data.csv. Resource Description: Data collected using the RALA SSA tool. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Dataset Four CPAT. File Name: CPAT Data.csv. Resource Description: Data collected using the CPAT. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
    • Data Dictionary. File Name: DataDictionary_RALA_PPA_SSA_TWA_CPAT.csv. Resource Description: This is a combined data dictionary from each of the 4 dataset files in this set.

  4. Example of a filtered Microsoft Excel spreadsheet for TaAMY2 single null...

    • datasetcatalog.nlm.nih.gov
    Updated Sep 28, 2016
    Cite
    Mieog, Jos C.; Ral, Jean-Philippe F. (2016). Example of a filtered Microsoft Excel spreadsheet for TaAMY2 single null mutant detection (selected data). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001527938
    Explore at:
    Dataset updated
    Sep 28, 2016
    Authors
    Mieog, Jos C.; Ral, Jean-Philippe F.
    Description

    Example of a filtered Microsoft Excel spreadsheet for TaAMY2 single null mutant detection (selected data).

  5. Microsoft Malware Sample

    • kaggle.com
    Updated Dec 15, 2022
    Cite
    Dheemanth Bhat (2022). Microsoft Malware Sample [Dataset]. https://www.kaggle.com/datasets/dheemanthbhat/microsoft-malware-sample
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 15, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dheemanth Bhat
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Source

    This dataset contains vectorized byte files taken from the original dataset of the Microsoft Malware Classification Challenge (BIG 2015) competition. The original dataset is from http://arxiv.org/abs/1802.10135.
    The original train and test datasets are ~18GB each. This random sample, extracted and vectorized, is just ~15MB in size.

    How is the dataset sampled?

    1. An equal number of malware byte files is randomly selected from each class (except Simda).
    2. The byte data, in hexadecimal characters, is then preprocessed.
    3. Finally, the preprocessed hex strings are vectorized using scikit-learn's CountVectorizer.

    Note: The original dataset contains only 42 byte files for malware class 5 (Simda).
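    The three sampling steps can be sketched with the standard library alone. This is not the author's pipeline: the preprocessing rule (keep two-character hex tokens, drop "??" placeholders), the function names, and the toy inputs are all assumptions, and a plain `collections.Counter` stands in for scikit-learn's CountVectorizer to keep the example dependency-free.

```python
import random
from collections import Counter

def balanced_sample(files_by_class, n_per_class, skip=frozenset(), seed=0):
    """Step 1: draw the same number of files from every class not in `skip`."""
    rng = random.Random(seed)
    return {
        label: rng.sample(files, n_per_class)
        for label, files in files_by_class.items()
        if label not in skip
    }

def tokenize_bytes(hex_dump):
    """Step 2 (assumed rule): keep two-character hex tokens, drop '??'."""
    return [tok.lower() for tok in hex_dump.split() if len(tok) == 2 and tok != "??"]

def vectorize(docs, vocabulary):
    """Step 3: count-based vectors, one row per document, one column per token."""
    rows = []
    for doc in docs:
        counts = Counter(tokenize_bytes(doc))
        rows.append([counts[tok] for tok in vocabulary])
    return rows

# Toy inputs (hypothetical file names and byte dumps).
files_by_class = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2", "b3"], "Simda": ["s1"]}
sample = balanced_sample(files_by_class, n_per_class=2, skip={"Simda"})

docs = ["00401000 56 8D 44 24 ?? 56", "00401010 8D 8D 00 ??"]
vocab = sorted({tok for d in docs for tok in tokenize_bytes(d)})
matrix = vectorize(docs, vocab)
print(vocab)
print(matrix)
```

    Skipping the under-represented class before sampling mirrors the note about Simda: with only 42 files it cannot supply the same count as the other classes.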

    Figure: balanced dataset pie chart (https://i.imgur.com/CFWhzYr.png)

  6. Sorting/selecting data in Excel with VLOOKUP()

    • figshare.com
    xlsx
    Updated Jan 18, 2016
    Cite
    Anneke Batenburg (2016). Sorting/selecting data in Excel with VLOOKUP() [Dataset]. http://doi.org/10.6084/m9.figshare.964802.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jan 18, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Anneke Batenburg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Example of how I use MS Excel's VLOOKUP() function to filter my data.

  7. Cloud to Street - Microsoft Flood Dataset - Sentinel-1

    • collections.eurodatacube.com
    + more versions
    Cite
    Sentinel Hub, Cloud to Street - Microsoft Flood Dataset - Sentinel-1 [Dataset]. https://collections.eurodatacube.com/microsoft-floods-s1/
    Explore at:
    Dataset provided by
    Sentinel Hub (https://www.sentinel-hub.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Cloud to Street - Microsoft Flood Dataset (C2S-MS Floods) is a dataset of near-coincident Sentinel-1 and Sentinel-2 data paired with water labels from 18 global flood events. These labels are derived products of MODIS sensor on board NASA's Aqua and Terra satellites produced as a part of the study, "Satellite imaging reveals increased proportion of population exposed to floods," Nature (2021), doi: 10.1038/s41586-021-03695-w. In this collection, we keep the water label which represents the maximum observed flood extent during the time period of the event. For a detailed description of the methods used to generate these labels, please refer to the original paper.

  8. bing_coronavirus_query_set

    • huggingface.co
    Updated Mar 26, 2025
    Cite
    Microsoft (2025). bing_coronavirus_query_set [Dataset]. https://huggingface.co/datasets/microsoft/bing_coronavirus_query_set
    Explore at:
    Dataset updated
    Mar 26, 2025
    Dataset authored and provided by
    Microsoft (http://microsoft.com/)
    License

    https://choosealicense.com/licenses/other/

    Description

    Dataset Card for BingCoronavirusQuerySet

      Dataset Summary
    

    Please note that you can specify the start and end date of the data. You can get start and end dates from here: https://github.com/microsoft/BingCoronavirusQuerySet/tree/master/data/2020
    Example: load_dataset("bing_coronavirus_query_set", queries_by="state", start_date="2020-09-01", end_date="2020-09-30")

    You can also load the data by country by using queries_by="country".
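    Before passing these arguments on, it can help to validate them locally. The sketch below is an illustration only: `query_set_kwargs` is a hypothetical helper, the two allowed `queries_by` values are the ones named in the description, and the ISO date format is inferred from the example call. The commented `load_dataset` line repeats the call from the dataset card and needs the Hugging Face `datasets` package plus network access.

```python
from datetime import date

# Groupings mentioned in the dataset card.
VALID_GROUPINGS = {"state", "country"}

def query_set_kwargs(queries_by, start_date, end_date):
    """Validate the arguments and return them as keyword arguments (hypothetical helper)."""
    if queries_by not in VALID_GROUPINGS:
        raise ValueError(f"queries_by must be one of {sorted(VALID_GROUPINGS)}")
    start = date.fromisoformat(start_date)  # raises ValueError on malformed dates
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return {
        "queries_by": queries_by,
        "start_date": start.isoformat(),
        "end_date": end.isoformat(),
    }

kwargs = query_set_kwargs("country", "2020-09-01", "2020-09-30")
print(kwargs)

# With `datasets` installed and network access, as in the dataset card:
# from datasets import load_dataset
# ds = load_dataset("bing_coronavirus_query_set", **kwargs)
```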

      Supported Tasks and… See the full description on the dataset page: https://huggingface.co/datasets/microsoft/bing_coronavirus_query_set.
    
  9. AdventureWorks Sample Mfg Database Tables

    • kaggle.com
    zip
    Updated Feb 24, 2023
    Cite
    Michael Brown (2023). AdventureWorks Sample Mfg Database Tables [Dataset]. https://www.kaggle.com/datasets/universalanalyst/adventureworks-sample-mfg-database-tables
    Explore at:
    Available download formats: zip (3689556 bytes)
    Dataset updated
    Feb 24, 2023
    Authors
    Michael Brown
    Description

    In order to practice writing SQL queries in a semi-realistic database, I discovered and imported Microsoft's AdventureWorks sample database into Microsoft SQL Server Express. The (fictitious) Adventure Works company represents a bicycle manufacturer that sells bicycles and accessories to global markets. Queries were written for developing and testing a Tableau dashboard.

    The dataset presented here represents a fraction of the entire manufacturing relational database. Tables within the dataset include product, purchasing, work order, and transaction data.

    The full database sample can be found on the Microsoft SQL Docs website: https://learn.microsoft.com/en-us/sql/samples/ and additionally on GitHub: https://github.com/microsoft/sql-server-samples

  10. FStarDataSet-V2

    • huggingface.co
    Updated Sep 4, 2024
    Cite
    Microsoft (2024). FStarDataSet-V2 [Dataset]. https://huggingface.co/datasets/microsoft/FStarDataSet-V2
    Explore at:
    Dataset updated
    Sep 4, 2024
    Dataset authored and provided by
    Microsoft (http://microsoft.com/)
    License

    https://choosealicense.com/licenses/cdla-permissive-2.0/

    Description

    This dataset is the Version 2.0 of microsoft/FStarDataSet.

      Primary-Objective
    

    This dataset's primary objective is to train and evaluate Proof-oriented Programming with AI (PoPAI, in short). Given a specification of a program and proof in F*, the objective of an AI model is to synthesize the implementation (see below for details about the usage of this dataset, including the input and output).

      Data Format
    

    Each of the examples in this dataset are organized as dictionaries… See the full description on the dataset page: https://huggingface.co/datasets/microsoft/FStarDataSet-V2.

  11. Sample Dataset - HR Subject Areas

    • data.niaid.nih.gov
    Updated Jan 18, 2023
    Cite
    Weber, Marc (2023). Sample Dataset - HR Subject Areas [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7447111
    Explore at:
    Dataset updated
    Jan 18, 2023
    Authors
    Weber, Marc
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset created as part of the Master Thesis "Business Intelligence – Automation of Data Marts modeling and its data processing".

    Lucerne University of Applied Sciences and Arts

    Master of Science in Applied Information and Data Science (MScIDS)

    Autumn Semester 2022

    Change log Version 1.1:

    The following SQL scripts were added:

    Index  Type              Name
    1      View              pg.dictionary_table
    2      View              pg.dictionary_column
    3      View              pg.dictionary_relation
    4      View              pg.accesslayer_table
    5      View              pg.accesslayer_column
    6      View              pg.accesslayer_relation
    7      View              pg.accesslayer_fact_candidate
    8      Stored Procedure  pg.get_fact_candidate
    9      Stored Procedure  pg.get_dimension_candidate
    10     Stored Procedure  pg.get_columns

    Scripts are based on Microsoft SQL Server Version 2017 and compatible with a data warehouse built with Datavault Builder. Data warehouse objects scripts of the sample data warehouse are restricted and cannot be shared.

  12. Microsoft Azure Predictive Maintenance

    • kaggle.com
    zip
    Updated Oct 15, 2020
    Cite
    arnab (2020). Microsoft Azure Predictive Maintenance [Dataset]. https://www.kaggle.com/arnabbiswas1/microsoft-azure-predictive-maintenance
    Explore at:
    Available download formats: zip (32497141 bytes)
    Dataset updated
    Oct 15, 2020
    Authors
    arnab
    Description

    Context

    This is an example data source which can be used for Predictive Maintenance Model Building. It consists of the following data:

    • Machine conditions and usage: The operating conditions of a machine e.g. data collected from sensors.
    • Failure history: The failure history of a machine or component within the machine.
    • Maintenance history: The repair history of a machine, e.g. error codes, previous maintenance activities or component replacements.
    • Machine features: The features of a machine, e.g. engine size, make and model, location.

    Details

    • Telemetry Time Series Data (PdM_telemetry.csv): It consists of hourly average of voltage, rotation, pressure, vibration collected from 100 machines for the year 2015.

    • Error (PdM_errors.csv): These are errors encountered by the machines while in operating condition. Since these errors don't shut down the machines, they are not considered failures. The error dates and times are rounded to the closest hour since the telemetry data is collected at an hourly rate.

    • Maintenance (PdM_maint.csv): If a component of a machine is replaced, that is captured as a record in this table. Components are replaced in two situations: 1. During a regular scheduled visit, the technician replaces the component (Proactive Maintenance). 2. A component breaks down and the technician then performs unscheduled maintenance to replace it (Reactive Maintenance). The latter is considered a failure, and the corresponding data is captured under Failures. Maintenance data has both 2014 and 2015 records. This data is rounded to the closest hour since the telemetry data is collected at an hourly rate.

    • Failures (PdM_failures.csv): Each record represents replacement of a component due to failure. This data is a subset of Maintenance data. This data is rounded to the closest hour since the telemetry data is collected at an hourly rate.

    • Metadata of Machines (PdM_Machines.csv): Model type & age of the Machines.
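    Because errors, maintenance, and failures are all rounded to the closest hour, they can be matched to hourly telemetry by machine and timestamp. The sketch below illustrates that idea with the standard library only; `round_to_hour`, the tuple layout, and the sample values are hypothetical, and rounding exact half hours upward is an assumption, since the description does not say how ties are handled.

```python
from datetime import datetime, timedelta

def round_to_hour(ts):
    """Round a timestamp to the closest full hour (half hours round up: assumption)."""
    floored = ts.replace(minute=0, second=0, microsecond=0)
    if ts - floored >= timedelta(minutes=30):
        floored += timedelta(hours=1)
    return floored

# Hypothetical hourly telemetry rows: (machine_id, timestamp, voltage).
telemetry = [
    (1, datetime(2015, 1, 1, 6), 170.0),
    (1, datetime(2015, 1, 1, 7), 165.2),
]

# Hypothetical raw failure event, rounded so it lines up with the telemetry grid.
failures = {(1, round_to_hour(datetime(2015, 1, 1, 6, 40)))}

# Label each telemetry row with whether a failure was recorded for that hour.
labeled = [(m, ts, v, (m, ts) in failures) for m, ts, v in telemetry]
for row in labeled:
    print(row)
```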

    Acknowledgements

    This dataset was available as a part of Azure AI Notebooks for Predictive Maintenance. But as of 15th Oct, 2020 the notebook (link) is no longer available. However, the data can still be downloaded using the following URLs:

    https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_telemetry.csv
    https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_errors.csv
    https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_maint.csv
    https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_failures.csv
    https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_machines.csv

    Inspiration

    Try to use this data to build Machine Learning models related to Predictive Maintenance.

  13. Data from: Delta Produce Sources Study

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Delta Produce Sources Study [Dataset]. https://catalog.data.gov/dataset/delta-produce-sources-study-51a7a
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    The Delta Produce Sources Study was an observational study designed to measure and compare food environments of farmers markets (n=3) and grocery stores (n=12) in 5 rural towns located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys from June 2019 to March 2020 using a modified version of the Nutrition Environment Measures Survey (NEMS) Farmers Market Audit tool. The tool was modified to collect information pertaining to source of fresh produce and also for use with both farmers markets and grocery stores. Availability, source, quality, and price information were collected and compared between farmers markets and grocery stores for 13 fresh fruits and 32 fresh vegetables via SAS software programming. Because the towns were not randomly selected and the sample sizes are relatively small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

    Resources in this dataset:

    • Delta Produce Sources Study dataset. File Name: DPS Data Public.csv. Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
    • Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv. Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  14. T-Drive Trajectory Data Sample - Dataset - LDM

    • service.tib.eu
    • resodate.org
    Updated Dec 2, 2024
    Cite
    (2024). T-Drive Trajectory Data Sample - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/t-drive-trajectory-data-sample
    Explore at:
    Dataset updated
    Dec 2, 2024
    Description
  15. GP Practice Prescribing Presentation-level Data - July 2014

    • digital.nhs.uk
    csv, zip
    Updated Oct 31, 2014
    + more versions
    Cite
    (2014). GP Practice Prescribing Presentation-level Data - July 2014 [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/practice-level-prescribing-data
    Explore at:
    Available download formats: csv (1.4 GB), zip (257.7 MB), csv (1.7 MB), csv (275.8 kB)
    Dataset updated
    Oct 31, 2014
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Jul 1, 2014 - Jul 31, 2014
    Area covered
    United Kingdom
    Description

    Warning: Large file size (over 1GB). Each monthly data set is large (over 4 million rows), but can be viewed in standard software such as Microsoft WordPad (save by right-clicking on the file name and selecting 'Save Target As', or equivalent on Mac OSX). It is then possible to select the required rows of data and copy and paste the information into another software application, such as a spreadsheet. Alternatively, add-ons to existing software, such as the Microsoft PowerPivot add-on for Excel, to handle larger data sets, can be used. The Microsoft PowerPivot add-on for Excel is available from Microsoft: http://office.microsoft.com/en-gb/excel/download-power-pivot-HA101959985.aspx

    Once PowerPivot has been installed, to load the large files, please follow the instructions below. Note that it may take at least 20 to 30 minutes to load one monthly file.

    1. Start Excel as normal
    2. Click on the PowerPivot tab
    3. Click on the PowerPivot Window icon (top left)
    4. In the PowerPivot Window, click on the "From Other Sources" icon
    5. In the Table Import Wizard e.g. scroll to the bottom and select Text File
    6. Browse to the file you want to open and choose the file extension you require e.g. CSV

    Once the data has been imported you can view it in a spreadsheet.

    What does the data cover?

    General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred and there is no record for a zero total. For each practice in England, the following information is presented at presentation level for each medicine, dressing and appliance (by presentation name):

    • the total number of items prescribed and dispensed
    • the total net ingredient cost
    • the total actual cost
    • the total quantity

    The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to GP practices.

    GP practices are identified only by their national code, so an additional data file, linked to the first by the practice code, provides further detail in relation to the practice. Presentations are identified only by their BNF code, so an additional data file, linked to the first by the BNF code, provides the chemical name for that presentation.
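    Selecting the required rows and linking them to the chemical names can also be done by streaming the CSV rather than opening it whole. The sketch below assumes hypothetical column names (`PRACTICE`, `BNF_CODE`, `ITEMS`) and placeholder values; the real files' headers should be checked before adapting it. An in-memory string stands in for the large monthly file.

```python
import csv
import io

def select_rows(handle, practice_code):
    """Yield only the rows for one practice, reading the file one line at a time."""
    for row in csv.DictReader(handle):
        if row["PRACTICE"] == practice_code:
            yield row

# Placeholder data standing in for the monthly prescribing file.
prescribing = io.StringIO(
    "PRACTICE,BNF_CODE,ITEMS\n"
    "P001,BNF001,3\n"
    "P002,BNF001,5\n"
    "P001,BNF002,7\n"
)

# Placeholder lookup standing in for the additional file keyed by BNF code.
chemicals = {"BNF001": "Chemical A", "BNF002": "Chemical B"}

selected = [
    (row["BNF_CODE"], chemicals[row["BNF_CODE"]], int(row["ITEMS"]))
    for row in select_rows(prescribing, "P001")
]
print(selected)
```

    Because `select_rows` is a generator, memory use stays flat regardless of how many rows the monthly file contains.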

  16. Dataset of development of business during the COVID-19 crisis

    • data.mendeley.com
    • narcis.nl
    Updated Nov 9, 2020
    Cite
    Tatiana N. Litvinova (2020). Dataset of development of business during the COVID-19 crisis [Dataset]. http://doi.org/10.17632/9vvrd34f8t.1
    Explore at:
    Dataset updated
    Nov 9, 2020
    Authors
    Tatiana N. Litvinova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To create the dataset, the top 10 countries leading in the incidence of COVID-19 in the world as of October 22, 2020 (on the eve of the second wave of the pandemic) that are represented in the Global 500 ranking for 2020 were selected: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. The arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators for all countries of the sample were found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic.

    The data is collected in a general Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible and can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the data in the dataset are formulas rather than ready-made numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship.

    The dataset includes not only tabular data, but also charts that provide data visualization. It contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020. The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship: various predicted morbidity and mortality rates can be substituted into the risk assessment tables to obtain automatically calculated consequences (changes) for the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified during and after the second wave of the pandemic to check the reliability of the forecasts made in advance and to conduct a plan-fact analysis.

    The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of the pandemic and COVID-19 crisis for international entrepreneurship.

  17. Data from: Microsoft Concept Graph: Mining Semantic Concepts for Short Text...

    • scidb.cn
    Updated Oct 16, 2020
    Cite
    Lei Ji; Yujing Wang; Botian Shi; Dawei Zhang; Zhongyuan Wang; Jun Yan (2020). Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding [Dataset]. http://doi.org/10.11922/sciencedb.j00104.00047
    Explore at:
    Dataset updated
    Oct 16, 2020
    Dataset provided by
    Science Data Bank
    Authors
    Lei Ji; Yujing Wang; Botian Shi; Dawei Zhang; Zhongyuan Wang; Jun Yan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Four tables and 23 figures of this paper. Table 1 shows the concept space comparison of existing taxonomies. Table 2 presents Hearst pattern examples. Table 3 shows the labeling guideline for conceptualization. Table 4 presents the precision of short text understanding. Figure 1 shows the framework overview. Figure 2 shows local taxonomy construction. Figure 3 shows horizontal merging. Figure 4 shows vertical merging: single sense alignment. Figure 5 shows vertical merging: multiple sense alignment. Figure 6 is a subgraph of the heterogeneous semantic network around watch. Figure 7 is the compression procedure of the typed-term co-occurrence network. Figure 8 presents an example of short text understanding. Figure 9 presents examples of the Chain model and the Pairwise model. Figure 10 is a snapshot of the Probase browser. Figure 11 is a snapshot of single instance conceptualization. Figure 12 is a snapshot of context-aware single instance conceptualization. Figure 13 shows an example of short text conceptualization. Figure 14 is the framework of topic search. Figure 15 is a snapshot of the Web tables. Figure 16 shows a query recommendation snapshot. Figure 17 shows the correlation of CTR with ads relevance score. Figure 18 presents the distribution of concepts in Microsoft Concept Graph. Figure 19 shows concept coverage of different taxonomies. Figure 20 shows precision of extracted isA pairs on 40 concepts. Figure 21 shows precision of isA pairs after each iteration. Figure 22 shows the number of discovered concepts and isA pairs after each iteration. Figure 23 shows precision and nDCG comparison.

  18. Escape Excel: A tool for preventing gene symbol and accession conversion...

    • plos.figshare.com
    xlsx
    Updated Jun 5, 2023
    Cite
    Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich (2023). Escape Excel: A tool for preventing gene symbol and accession conversion errors [Dataset]. http://doi.org/10.1371/journal.pone.0185207
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Microsoft Excel automatically converts certain gene symbols, database accessions, and other alphanumeric text into dates, scientific notation, and other numerical representations. These conversions lead to subsequent, irreversible, corruption of the imported text. A recent survey of popular genomic literature estimates that one-fifth of all papers with supplementary gene lists suffer from this issue.

    Results: Here, we present an open-source tool, Escape Excel, which prevents these erroneous conversions by generating an escaped text file that can be safely imported into Excel. Escape Excel is implemented in a variety of formats (http://www.github.com/pstew/escape_excel), including a command line based Perl script, a Windows-only Excel Add-In, an OS X drag-and-drop application, a simple web-server, and as a Galaxy web environment interface. Test server implementations are accessible as a Galaxy interface (http://apostl.moffitt.org) and simple non-Galaxy web server (http://apostl.moffitt.org:8000/).

    Conclusions: Escape Excel detects and escapes a wide variety of problematic text strings so that they are not erroneously converted into other representations upon importation into Excel. Examples of problematic strings include date-like strings, time-like strings, leading zeroes in front of numbers, and long numeric and alphanumeric identifiers that should not be automatically converted into scientific notation. It is hoped that greater awareness of these potential data corruption issues, together with diligent escaping of text files prior to importation into Excel, will help to reduce the amount of Excel-corrupted data in scientific analyses and publications.
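
To make the idea of "escaping" concrete, here is a minimal Python sketch of the general technique described in this abstract. It is not the Escape Excel implementation (which the description says is a Perl script, among other formats). The regular expressions, the 12-digit threshold, the function names `needs_escape`, `escape_field`, and `escape_line`, and the choice of the `="..."` text-formula wrapper are all assumptions made for illustration.

```python
import re

# Hypothetical month matcher covering common full and abbreviated month names.
_MONTH = (r"(?:jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?"
          r"|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)")

# Illustrative patterns for the categories named in the abstract.
# These are assumptions; a real tool would need a far more extensive rule set.
_RISKY = [
    re.compile(rf"{_MONTH}[-/ ]?\d{{1,2}}", re.IGNORECASE),  # date-like: SEPT2, MARCH1
    re.compile(rf"\d{{1,2}}[-/ ]?{_MONTH}", re.IGNORECASE),  # date-like: 1-Mar
    re.compile(r"\d{1,4}[-/]\d{1,2}(?:[-/]\d{1,4})?"),       # date-like: 2020-01-05
    re.compile(r"\d{1,2}:\d{2}(?::\d{2})?"),                 # time-like: 12:30
    re.compile(r"0\d+"),                                     # leading zeroes: 00123
    re.compile(r"\d+(?:\.\d+)?[eE][+-]?\d+"),                # looks like 2310009E13
    re.compile(r"\d{12,}"),                                  # long numeric IDs (assumed cutoff)
]


def needs_escape(value: str) -> bool:
    """Return True if the whole field matches one of the risky patterns."""
    return any(p.fullmatch(value) for p in _RISKY)


def escape_field(value: str) -> str:
    """Wrap a risky field as a text formula, doubling any embedded quotes."""
    if not needs_escape(value):
        return value
    return '="' + value.replace('"', '""') + '"'


def escape_line(line: str, sep: str = "\t") -> str:
    """Escape every field of one delimited line independently."""
    return sep.join(escape_field(field) for field in line.split(sep))
```

For example, `escape_line("SEPT2\tTP53\t00123")` wraps the first and third fields and leaves `TP53` unchanged. Over-matching is the safer failure mode in a sketch like this: wrapping a harmless string still preserves its text, whereas missing a risky one lets the conversion happen.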

  19. Data from: Delta Food Outlets Study

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated May 8, 2025
    Cite
    Agricultural Research Service (2025). Delta Food Outlets Study [Dataset]. https://catalog.data.gov/dataset/delta-food-outlets-study-2786d
    Explore at:
    Dataset updated
    May 8, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    The Delta Food Outlets Study was an observational study designed to assess the nutritional environments of 5 towns located in the Lower Mississippi Delta region of Mississippi. It was an ancillary study to the Delta Healthy Sprouts Project and therefore included towns in which Delta Healthy Sprouts participants resided and that contained at least one convenience (corner) store, grocery store, or gas station. Data were collected via electronic surveys between March 2016 and September 2018 using the Nutrition Environment Measures Survey (NEMS) tools. Survey scores for the NEMS Corner Store, NEMS Grocery Store, and NEMS Restaurant were computed using modified scoring algorithms provided for these tools via SAS software programming. Because the towns were not randomly selected and the sample sizes are relatively small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi. Dataset one (NEMS-C) contains data collected with the NEMS Corner (convenience) Store tool. Dataset two (NEMS-G) contains data collected with the NEMS Grocery Store tool. Dataset three (NEMS-R) contains data collected with the NEMS Restaurant tool.

    Resources in this dataset:

    Resource Title: Delta Food Outlets Data Dictionary. File Name: DFO_DataDictionary_Public.csv. Resource Description: This file contains the data dictionary for all 3 datasets that are part of the Delta Food Outlets Study. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset One NEMS-C. File Name: NEMS-C Data.csv. Resource Description: This file contains data collected with the Nutrition Environment Measures Survey (NEMS) tool for convenience stores. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Two NEMS-G. File Name: NEMS-G Data.csv. Resource Description: This file contains data collected with the Nutrition Environment Measures Survey (NEMS) tool for grocery stores. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

    Resource Title: Dataset Three NEMS-R. File Name: NEMS-R Data.csv. Resource Description: This file contains data collected with the Nutrition Environment Measures Survey (NEMS) tool for restaurants. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel

  20. Data Cleaning Sample

    • borealisdata.ca
    • dataone.org
    Updated Jul 13, 2023
    Cite
    Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
    Explore at:
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Borealis
    Authors
    Rong Luo
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Sample data for exercises in Further Adventures in Data Cleaning.
