11 datasets found

Data from: Current and projected research data storage needs of Agricultural...
catalog.data.gov
agdatacommons.nal.usda.gov
+2 more
Updated Apr 21, 2025
Cite
Agricultural Research Service (2025). Current and projected research data storage needs of Agricultural Research Service researchers in 2016 [Dataset]. https://catalog.data.gov/dataset/current-and-projected-research-data-storage-needs-of-agricultural-research-service-researc-f33da
Explore at:
Dataset updated
Apr 21, 2025
Dataset provided by
Agricultural Research Service (https://www.ars.usda.gov/)
Description
The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets, so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling.

The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey, which is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly.

From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate response to a data management expert in their unit, to all members of their unit, or to themselves collate responses from their unit before reporting in the survey.

Larger storage ranges cover vastly different amounts of data, so the implications here could be significant depending on whether the true amount is at the lower or higher end of the range. Therefore, we requested more detail from "Big Data users," those 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB total current data (Q5). All other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months. All other data would be considered inactive, or archival. To calculate per person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.

Resources in this dataset:

Resource Title: Appendix A: ARS data storage survey questions.
File Name: Appendix A.pdf
Resource Description: The full list of questions asked with the possible responses. The survey was not administered using this PDF but the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop down not shown here.
Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/

Resource Title: CSV of Responses from ARS Researcher Data Storage Survey.
File Name: Machine-readable survey response data.csv
Resource Description: CSV file includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This information is the same data as in the Excel spreadsheet (also provided).

Resource Title: Responses from ARS Researcher Data Storage Survey.
File Name: Data Storage Survey Data for public release.xlsx
Resource Description: MS Excel worksheet that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed.
Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
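The per-person calculation described above (high end of the reported range, divided by 1 for an individual response or by G for a group response) can be sketched as follows. This is a minimal illustration, not the working group's actual code; the function name and the example values are hypothetical and are not taken from the survey data.

```python
def per_person_storage_tb(range_high_tb: float, group_size: int = 1) -> float:
    """Per-person storage need: high end of the reported range divided by G.

    group_size is 1 for an individual response, or G (the number of
    individuals covered) for a group response.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return range_high_tb / group_size


# Hypothetical individual response whose reported range tops out at 10 TB.
individual = per_person_storage_tb(10.0)

# Hypothetical group response covering 5 people with the same range.
group = per_person_storage_tb(10.0, group_size=5)

print(individual, group)
```

Using the high end of each range gives a conservative (upper) estimate, which is presumably why the description singles out Big Data users for follow-up with actual values.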
SPORTS_DATA_ANALYSIS_ON_EXCEL
kaggle.com
zip
Updated Dec 12, 2024
Cite
Nil kamal Saha (2024). SPORTS_DATA_ANALYSIS_ON_EXCEL [Dataset]. https://www.kaggle.com/datasets/nilkamalsaha/sports-data-analysis-on-excel
Explore at:
Available download formats: zip (1203633 bytes)
Dataset updated
Dec 12, 2024
Authors
Nil kamal Saha
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
PROJECT OBJECTIVE

We are part of XYZ Co Pvt Ltd, a company in the business of organizing sports events at the international level. Countries nominate sportsmen from different departments, and our team has been given the responsibility to systematize the membership roster and generate different reports as per business requirements.

Questions (KPIs)

TASK 1: STANDARDIZING THE DATASET

Populate the FULLNAME consisting of the following fields ONLY, in the prescribed format: PREFIX FIRSTNAME LASTNAME. (Note: All UPPERCASE)

Get the COUNTRY NAME to which these sportsmen belong. Make use of the LOCATION sheet to get the required data.

Populate the LANGUAGE_SPOKEN by the sportsmen. Make use of the LOCATION sheet to get the required data.

Generate the EMAIL ADDRESS for those members who speak English, in the prescribed format: lastname.firstname@xyz.org (Note: All lowercase), and for all other members the format should be lastname.firstname@xyz.com (Note: All lowercase).

Populate the SPORT LOCATION of the sport played by each player. Make use of the SPORT sheet to get the required data.
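The FULLNAME and EMAIL ADDRESS rules in TASK 1 can be sketched outside Excel as below. This is only an illustration of the stated rules, assuming the English format is lastname.firstname@xyz.org and the fallback is lastname.firstname@xyz.com; the function names and sample people are hypothetical and do not come from the dataset.

```python
def full_name(prefix: str, first: str, last: str) -> str:
    """Build PREFIX FIRSTNAME LASTNAME, all uppercase."""
    parts = (prefix.strip(), first.strip(), last.strip())
    return " ".join(parts).upper()


def email_address(first: str, last: str, language: str) -> str:
    """Build lastname.firstname@domain, all lowercase.

    English speakers get the xyz.org domain, everyone else xyz.com.
    """
    is_english = language.strip().lower() == "english"
    domain = "xyz.org" if is_english else "xyz.com"
    return f"{last.strip()}.{first.strip()}@{domain}".lower()


# Hypothetical roster entries, for illustration only.
print(full_name("Mr.", "John", "Smith"))
print(email_address("John", "Smith", "English"))
print(email_address("Ana", "Lopez", "Spanish"))
```

The COUNTRY NAME, LANGUAGE and SPORT LOCATION lookups are plain key-based lookups against the LOCATION and SPORT sheets, so a dictionary keyed on the relevant code would play the same role here.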

TASK 2: DATA FORMATTING

Display MEMBER ID as always a 3-digit number (Note: 001, 002, ..., 020, ... etc.)

Format the BIRTHDATE as dd mmm'yyyy (Prescribed format example: 09 May' 1986)

Display the units for the WEIGHT column (Prescribed format example: 80 kg)

Format the SALARY to show the data in thousands. If SALARY is less than 100,000 then display data with 2 decimal places, else display data with one decimal place. In both cases units should be thousands (k), e.g. 87670 -> 87.67 k and 12 250 -> 123.2 k
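The TASK 2 display rules can be expressed as small formatting functions. This is a sketch under stated assumptions, not the author's Excel number formats: the function names are hypothetical, the birthdate follows the dd mmm'yyyy pattern literally (no space after the apostrophe), and Python's rounding at exact halves may differ from Excel's.

```python
from datetime import date

# Explicit month abbreviations avoid locale-dependent strftime output.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_member_id(member_id: int) -> str:
    """Zero-pad the member ID to three digits."""
    return f"{member_id:03d}"


def format_birthdate(born: date) -> str:
    """Render a date as dd mmm'yyyy."""
    return f"{born.day:02d} {MONTHS[born.month - 1]}'{born.year}"


def format_weight(weight_kg: int) -> str:
    """Append the kg unit to the weight."""
    return f"{weight_kg} kg"


def format_salary(salary: float) -> str:
    """Show salary in thousands: 2 decimals below 100,000, else 1 decimal."""
    decimals = 2 if salary < 100_000 else 1
    return f"{salary / 1000:.{decimals}f} k"


print(format_member_id(2))
print(format_birthdate(date(1986, 5, 9)))
print(format_weight(80))
print(format_salary(87670))
print(format_salary(150000))
```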

TASK 3: SUMMARIZE DATA - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)

• Create a PIVOT table in the worksheet ANALYSIS, starting at cell B3, with the following details:

In COLUMNS; Group : GENDER.

In ROWS; Group : COUNTRY (Note: use COUNTRY NAMES).

In VALUES; calculate the count of candidates from each COUNTRY and GENDER type. Remove GRAND TOTALs.

TASK 4: SUMMARIZE DATA - EXCEL FUNCTIONS (Use SPORTSMEN worksheet after attempting TASK 1)

• Create a SUMMARY table in the worksheet ANALYSIS, starting at cell G4, with the following details:

Starting from cell H4; get the distinct GENDER. Use the remove duplicates option and transpose the data.

Starting from cell G5; get the distinct COUNTRY (Note: use COUNTRY NAMES).

In the cross table, get the count of candidates from each COUNTRY and GENDER type.
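TASKS 3 and 4 both produce the same COUNTRY by GENDER count table, once with a pivot table and once with worksheet functions (in Excel, a conditional count such as COUNTIFS would typically fill the cross table). The same cross-tabulation can be sketched with the standard library as below; the roster rows are invented for illustration and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical roster rows; the real data lives in the SPORTSMEN worksheet.
roster = [
    {"COUNTRY": "India", "GENDER": "Female"},
    {"COUNTRY": "India", "GENDER": "Female"},
    {"COUNTRY": "India", "GENDER": "Male"},
    {"COUNTRY": "Kenya", "GENDER": "Male"},
]


def cross_tab(rows):
    """Count rows for every (COUNTRY, GENDER) pair, with no grand totals."""
    counts = Counter((row["COUNTRY"], row["GENDER"]) for row in rows)
    countries = sorted({row["COUNTRY"] for row in rows})  # distinct row labels
    genders = sorted({row["GENDER"] for row in rows})     # distinct column labels
    # Counter returns 0 for pairs that never occur, so empty cells become 0.
    return {c: {g: counts[(c, g)] for g in genders} for c in countries}


table = cross_tab(roster)
for country, by_gender in table.items():
    print(country, by_gender)
```

Taking the distinct values first and counting second mirrors the TASK 4 steps: remove duplicates to get the headers, then count per cell.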

TASK 5: GENERATE REPORT - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)

• Create a PIVOT table report in the worksheet REPORT, starting at cell A3, with the following information:

Change the report layout to TABULAR form.

Remove expand and collapse buttons.

Remove GRAND TOTALs.

Allow user to filter the data by SPORT LOCATION.

Process

Verified the data for missing values and anomalies, and sorted them out.

Made sure the data is consistent and clean with respect to data type, data format and values used.

Created pivot tables according to the questions asked.
ANN development + final testing datasets
data.niaid.nih.gov
resodate.org
+1 more
Updated Jan 24, 2020
Cite
Authors (2020). ANN development + final testing datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1445865
Explore at:
Dataset updated
Jan 24, 2020
Authors
Authors
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
File name definitions:

'...v_50_175_250_300...' - dataset for velocity ranges [50, 175] + [250, 300] m/s

'...v_175_250...' - dataset for velocity range [175, 250] m/s

'ANNdevelop...' - used to perform 9 parametric sub-analyses where, in each one, many ANNs are developed (trained, validated and tested) and the one yielding the best results is selected

'ANNtest...' - used to test the best ANN from each aforementioned parametric sub-analysis, aiming to find the best ANN model; this dataset includes the 'ANNdevelop...' counterpart

Where to find the input (independent) and target (dependent) variable values for each dataset/Excel file?

input values in 'IN' sheet

target values in 'TARGET' sheet

Where to find the results from the best ANN model (for each target/output variable and each velocity range)?

open the corresponding Excel file; the expected (target) vs ANN (output) results are written in the 'TARGET vs OUTPUT' sheet

Check reference below (to be added when the paper is published)

https://www.researchgate.net/publication/328849817_11_Neural_Networks_-_Max_Disp_-_Railway_Beams
Annual Retail Store Data, 2000 [Canada] [Excel]
dataverse.scholarsportal.info
borealisdata.ca
pdf, xls
Updated Nov 17, 2021
Cite
Scholars Portal Dataverse (2021). Annual Retail Store Data, 2000 [Canada] [Excel] [Dataset]. https://dataverse.scholarsportal.info/dataset.xhtml;jsessionid=1283d69ee2dd528c9011fe4a2fe3?persistentId=hdl%3A10864%2F11351&version=&q=&fileTypeGroupFacet=&fileAccess=&fileTag=%22Tables%22&fileSortField=&fileSortOrder=
Explore at:
Available download formats: xls(2165760), xls(29696), xls(2920448), pdf(76787), pdf(158404), xls(34816), xls(2754048), pdf(81084), pdf(71183), xls(34304), xls(625664), xls(2707968), xls(695808), pdf(70673), pdf(72585), xls(576512), xls(609792), xls(28672), pdf(60236), pdf(30338), pdf(87181), pdf(84140), pdf(92012), xls(610304), pdf(74439), xls(2471424), pdf(73788), xls(30208), pdf(74478), pdf(53645)
Dataset updated
Nov 17, 2021
Dataset provided by
Scholars Portal Dataverse
Area covered
Canada
Description
The annual Retail store data CD-ROM is an easy-to-use tool for quickly discovering retail trade patterns and trends. The current product presents results from the 1999 and 2000 Annual Retail Store and Annual Retail Chain surveys. This product contains numerous cross-classified data tables using the North American Industry Classification System (NAICS). The data tables provide access to a wide range of financial variables, such as revenues, expenses, inventory, sales per square footage (chain stores only) and the number of stores. Most data tables contain detailed information on industry (as low as 5-digit NAICS codes), geography (Canada, provinces and territories) and store type (chains, independents, franchises). The electronic product also contains survey metadata, questionnaires, information on industry codes and definitions, and the list of retail chain store respondents.
Data from: Impact assessment of coastal marine range shifts to support...
datadryad.org
search.dataone.org
zip
Updated May 13, 2021
Cite
Amy Henry; Cascade Sorte (2021). Impact assessment of coastal marine range shifts to support proactive management [Dataset]. http://doi.org/10.7280/D1770W
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.7280/D1770W
Dataset updated
May 13, 2021
Dataset provided by
Dryad
Authors
Amy Henry; Cascade Sorte
Time period covered
May 4, 2021
Description
Identification of study species

We identified 40 marine species with documented shifts in range limits along the coastline (<15 km from shore) of North America, including plants, invertebrates, fish, a protist, and a bird. Of these, 26 species were compiled by Sorte et al. (2010), and we added 14 species from an updated literature review. We searched Google Scholar (on 08/20/2019) using this search string: marine "range expansion" species "range shift". We reviewed titles and, when appropriate, abstracts and text of the first 600 results, identifying 12 additional species from eight papers. We added two species (Brachidontes adamsianus and Mexacanthina lugubris) from our literature files and personal observations. We excluded migratory or pelagic species with large biogeographic ranges, for which it was difficult to confirm historical native ranges.

Review of published impacts

Evidence of species’ impacts was compiled from online database searches an...
Systematic Review of the Literature on Definitions and Characterisation of...
find.data.gov.scot
dtechtive.com
csv, docx, pdf, txt +2
Updated Nov 25, 2019
Cite
University of Edinburgh. Centre for Clinical Brain Sciences (2019). Systematic Review of the Literature on Definitions and Characterisation of MS Lesions [Dataset]. http://doi.org/10.7488/ds/2715
Explore at:
Available download formats: csv(0.0003 MB), csv(0.0167 MB), txt(0.0166 MB), csv(0.0915 MB), csv(1.312 MB), csv(0.8488 MB), xls(3.412 MB), csv(1.747 MB), docx(0.1372 MB), csv(0.0188 MB), pdf(0.2025 MB), csv(0.0155 MB), xlsx(0.9941 MB), csv(0.0035 MB), csv(0.0165 MB)
Unique identifier
https://doi.org/10.7488/ds/2715
Dataset updated
Nov 25, 2019
Dataset provided by
University of Edinburgh. Centre for Clinical Brain Sciences
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
While inspecting the brain magnetic resonance imaging (MRI) scans from a sample of Multiple Sclerosis (MS) patients, blind to any clinical, cognitive and demographic information, we noticed, in a reasonable number of scans, the presence of ovoidal or circular, partially stellate, regions of signal intensities similar to that of the normal brain parenchyma in Fluid Attenuated Inversion Recovery (FLAIR), surrounded by hyperintensities in the periventricular region, seemingly corresponding in all cases to hypointense regions (i.e. with the same signal level as the cerebrospinal fluid) in T1-weighted. The ovoidal shape of these features, clearly distinctive due to their homogeneously lower signal with respect to their surroundings in the FLAIR sequence, prompted us to refer to them as FLAIR 'pseudocavities'. The idea that they could be differentially distinctive and indicative of an underlying process of different aetiology from their surroundings is not implausible. Inversion recovery imaging can potentially discriminate among tissues based on subtle differences in T1 characteristics. Specifically, the FLAIR sequence exploits the fact that many types of pathology have elevated T1 and T2 values resulting from increased free water content compared to background tissue. Higher specific absorption rate due to additional 180 degrees, together with the increased dynamic range, and the additive T1 and T2 contrast, make FLAIR highly susceptible to differentially reflect subtle pathological processes (Bydder & Young, 1985). We, hence, systematically reviewed the literature in the last 10 years (i.e. from March 1999 up to March 2019) to investigate the definitions of MS lesions used up to date and their characterisation, to establish whether what we called FLAIR 'pseudocavities' has been described previously.

This dataset consists of an Excel file (Microsoft Excel 97-2003 (.xls)) with multiple worksheets, which contain all the references found in the two databases explored (i.e. Medline and EMBASE), as well as the data extracted and the results of the analyses. Briefly, from just over a hundred studies that defined MRI lesions in MS, more than half characterised lesions based on the criteria that they were hyperintense on T2-weighted, FLAIR and PD-weighted series, and more than a quarter of the studies characterised lesions based on the criteria that they were hyperintense on T2-weighted, FLAIR and PD-weighted and that they were hypointense on T1-weighted series. The literature review confirmed that what we refer to as FLAIR 'pseudocavities' has not yet been acknowledged in the MS literature. Note: The dataset contains a master Excel spreadsheet with multiple worksheets. The data from each worksheet in the Excel file is also provided as a .csv file.
Location of Armed Conflict Onset Dataset (LACOD)
prio.org
Updated Jul 11, 2023
Cite
Peace Research Institute Oslo (PRIO) (2023). Location of Armed Conflict Onset Dataset (LACOD) [Dataset]. https://www.prio.org/data/30
Explore at:
Dataset updated
Jul 11, 2023
Dataset provided by
Peace Research Institute Oslo (PRIO)
Time period covered
1946 - 2008
Area covered
Global
Description
Dataset on the locations within a country where internal armed conflicts initially break out, 1946-2008
Fire statistics data tables
gov.uk
s3.amazonaws.com
Updated Oct 23, 2025
Cite
Ministry of Housing, Communities and Local Government (2025). Fire statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire-statistics-data-tables
Explore at:
Dataset updated
Oct 23, 2025
Dataset provided by
GOV.UK
Authors
Ministry of Housing, Communities and Local Government
Description

On 1 April 2025 responsibility for fire and rescue transferred from the Home Office to the Ministry of Housing, Communities and Local Government.

This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Ministry of Housing, Communities and Local Government (MHCLG) also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.

MHCLG has responsibility for fire services in England. The vast majority of data tables produced by the Ministry of Housing, Communities and Local Government are for England but some (0101, 0103, 0201, 0501, 1401) tables are for Great Britain split by nation. In the past the Department for Communities and Local Government (who previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (https://www.nifrs.org/home/about-us/publications/).

If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@communities.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.

Related content

Fire statistics guidance
Fire statistics incident level datasets

Incidents attended

FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 143 KB): https://assets.publishing.service.gov.uk/media/68f0f810e8e4040c38a3cf96/FIRE0101.xlsx (see also: Previous FIRE0101 tables)

FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 2.12 MB): https://assets.publishing.service.gov.uk/media/68f0ffd528f6872f1663ef77/FIRE0102.xlsx (see also: Previous FIRE0102 tables)

FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 197 KB): https://assets.publishing.service.gov.uk/media/68f20a3e06e6515f7914c71c/FIRE0103.xlsx (see also: Previous FIRE0103 tables)

FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 443 KB): https://assets.publishing.service.gov.uk/media/68f20a552f0fc56403a3cfef/FIRE0104.xlsx (see also: Previous FIRE0104 tables)

Dwelling fires attended

FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet, 192 KB): https://assets.publishing.service.gov.uk/media/68f100492f0fc56403a3cf94/FIRE0201.xlsx (see also: Previous FIRE0201 tables)

EPIMIC: A Simple Homemade Computer Program for Real-Time EPIdemiological...
plos.figshare.com
datasetcatalog.nlm.nih.gov
application/cdfv2
Updated May 31, 2023
Cite
Philippe Colson; Jean-Marc Rolain; Cédric Abat; Rémi Charrel; Pierre-Edouard Fournier; Didier Raoult (2023). EPIMIC: A Simple Homemade Computer Program for Real-Time EPIdemiological Surveillance and Alert Based on MICrobiological Data [Dataset]. http://doi.org/10.1371/journal.pone.0144178
Explore at:
Available download formats: application/cdfv2
Unique identifier
https://doi.org/10.1371/journal.pone.0144178
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Philippe Colson; Jean-Marc Rolain; Cédric Abat; Rémi Charrel; Pierre-Edouard Fournier; Didier Raoult
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background and Aims: Infectious diseases (IDs) are major causes of morbidity and mortality and their surveillance is critical. In 2002, we implemented a simple and versatile homemade tool, named EPIMIC, for the real-time systematic automated surveillance of IDs at Marseille university hospitals, based on the data from our clinical microbiology laboratory, including clinical samples, tests and diagnoses.

Methods: This tool was specifically designed to detect abnormal events as IDs are rarely predicted and modeled. EPIMIC operates using Microsoft Excel software and requires no particular computer skills or resources. An abnormal event corresponds to an increase above, or a decrease below threshold values calculated based on the mean of historical data plus or minus 2 standard deviations, respectively.

Results: Between November 2002 and October 2013 (11 years), 293 items were surveyed weekly, including 38 clinical samples, 86 pathogens, 79 diagnosis tests, and 39 antibacterial resistance patterns. The mean duration of surveillance was 7.6 years (range, 1 month-10.9 years). A total of 108,427 Microsoft Excel file cells were filled with counts of clinical samples, and 110,017 cells were filled with counts of diagnoses. A total of 1,390,689 samples were analyzed. Among them, 172,180 were found to be positive for a pathogen. EPIMIC generated a mean number of 0.5 alert/week on abnormal events.

Conclusions: EPIMIC proved to be efficient for real-time automated laboratory-based surveillance and alerting at our university hospital clinical microbiology laboratory-scale. It is freely downloadable from the following URL: http://www.mediterranee-infection.com/article.php?larub=157&titre=bulletin-epidemiologique (last accessed: 20/11/2015).
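The alert rule described in the Methods (a count above the historical mean plus 2 standard deviations, or below the mean minus 2 standard deviations) can be sketched as follows. This is an illustrative reading of that rule and not EPIMIC itself, which runs in Microsoft Excel; the function names and weekly counts are hypothetical, and the use of the sample standard deviation is an assumption because the description does not say which variant is used.

```python
from statistics import mean, stdev


def alert_thresholds(history):
    """Return (low, high) as the historical mean minus/plus 2 standard deviations.

    Assumption: sample standard deviation (statistics.stdev).
    """
    centre = mean(history)
    spread = stdev(history)
    return centre - 2 * spread, centre + 2 * spread


def is_abnormal(count, history):
    """Flag a weekly count that falls outside the threshold band."""
    low, high = alert_thresholds(history)
    return count > high or count < low


# Hypothetical weekly counts for one surveyed item.
history = [10, 12, 11, 9, 13, 10, 12, 11]

print(alert_thresholds(history))
print(is_abnormal(20, history))  # well above the upper threshold
print(is_abnormal(11, history))  # equal to the historical mean
```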
Attendance sheet Data set for University
kaggle.com
zip
Updated May 18, 2023
Cite
Ahmed Ali (2023). Attendance sheet Data set for University [Dataset]. https://www.kaggle.com/datasets/ahmedaliraja/attendance-sheet-data-set-for-university
Explore at:
Available download formats: zip (608 bytes)
Dataset updated
May 18, 2023
Authors
Ahmed Ali
Description
Context: The University Attendance Sheet Dataset is a comprehensive collection of attendance records from various university courses. This dataset is valuable for analyzing student attendance patterns, studying the impact of attendance on academic performance, and exploring factors influencing student engagement. It provides a rich resource for researchers, educators, and students interested in understanding attendance dynamics within a university setting.

Content: The dataset includes the following information:

Student ID: A unique identifier for each student.
Course ID: A unique identifier for each course.
Date: The date of the attendance record.
Attendance Status: Indicates whether the student was present, absent, or had an excused absence on a particular date.

The dataset contains records from multiple academic semesters, covering a wide range of courses across different disciplines. By examining this dataset, researchers can investigate attendance trends across different courses, identify patterns related to student performance, and explore correlations between attendance and other academic variables.

Acknowledgements: We would like to express our gratitude to the university administration, faculty members, and students who contributed to the collection and organization of this dataset. Their cooperation and support have made this dataset possible, enabling valuable insights into student attendance dynamics.

Inspiration: The inspiration behind creating this dataset stems from the recognition of the significant role attendance plays in a student's academic journey. By making this dataset available on Kaggle, we hope to facilitate research and analysis on attendance patterns, identify interventions to improve student engagement, and provide educators with valuable insights to enhance their teaching strategies. We also encourage collaboration and exploration of the dataset to uncover new findings and generate knowledge that can benefit the education community as a whole.

By leveraging the University Attendance Sheet Dataset, we aspire to contribute to the ongoing efforts to improve student success and foster an environment that promotes active participation and learning within higher education institutions.
Blinkit dataset
kaggle.com
zip
Updated Jul 18, 2024
Cite
mukesh gadri (2024). Blinkit dataset [Dataset]. https://www.kaggle.com/datasets/mukeshgadri/blinkit-dataset
Explore at:
Available download formats: zip (695160 bytes)
Dataset updated
Jul 18, 2024
Authors
mukesh gadri
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
In the case study titled "Blinkit: Grocery Product Analysis," a dataset called 'Grocery Sales' contains 12 columns with information on sales of grocery items across different outlets. Using Tableau, you as a data analyst can uncover customer behavior insights, track sales trends, and gather feedback. These insights will drive operational improvements, enhance customer satisfaction, and optimize product offerings and store layout. Tableau enables data-driven decision-making for positive outcomes at Blinkit.

The table Grocery Sales is a .CSV file and has the following columns, details of which are as follows:

• Item_Identifier: A unique ID for each product in the dataset.
• Item_Weight: The weight of the product.
• Item_Fat_Content: Indicates whether the product is low fat or not.
• Item_Visibility: The percentage of the total display area in the store that is allocated to the specific product.
• Item_Type: The category or type of product.
• Item_MRP: The maximum retail price (list price) of the product.
• Outlet_Identifier: A unique ID for each store in the dataset.
• Outlet_Establishment_Year: The year in which the store was established.
• Outlet_Size: The size of the store in terms of ground area covered.
• Outlet_Location_Type: The type of city or region in which the store is located.
• Outlet_Type: Indicates whether the store is a grocery store or a supermarket.
• Item_Outlet_Sales: The sales of the product in the particular store. This is the outcome variable that we want to predict.