The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high-performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets, so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling. The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey that is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly. From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover the data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate the response to a data management expert in their unit, to all members of their unit, or to collate responses from their unit themselves before reporting in the survey. Larger storage ranges cover vastly different amounts of data, so the implications here could be significant depending on whether the true amount is at the lower or higher end of the range. Therefore, we requested more detail from "Big Data users," the 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB total current data (Q5). All other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months. All other data would be considered inactive, or archival. To calculate per-person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response (see the sketch after the resource list below). For Big Data users we used the actual reported values or estimated likely values.
Resources in this dataset:
Resource Title: Appendix A: ARS data storage survey questions. File Name: Appendix A.pdf. Resource Description: The full list of questions asked with the possible responses. The survey was not administered using this PDF, but the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop-down not shown here. Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/
Resource Title: CSV of Responses from ARS Researcher Data Storage Survey. File Name: Machine-readable survey response data.csv. Resource Description: CSV file includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This is the same data as in the Excel spreadsheet (also provided).
Resource Title: Responses from ARS Researcher Data Storage Survey. File Name: Data Storage Survey Data for public release.xlsx. Resource Description: MS Excel worksheet that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
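A minimal sketch of the per-person storage calculation described above. The range-to-terabyte mapping and function names here are illustrative assumptions, not taken from the survey itself:

```python
# Per-person storage: high end of the reported range divided by 1 for an
# individual response, or by G (the group size) for a group response.
RANGE_HIGH_TB = {"<1 TB": 1, "1-10 TB": 10, ">10-100 TB": 100}  # hypothetical mapping

def per_person_tb(reported_range: str, group_size: int = 1) -> float:
    """High end of the reported range divided by the number of people covered."""
    return RANGE_HIGH_TB[reported_range] / max(group_size, 1)

print(per_person_tb("1-10 TB"))        # individual response -> 10.0 TB per person
print(per_person_tb(">10-100 TB", 5))  # group of 5 -> 20.0 TB per person
```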
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To create the dataset, the top 10 countries leading in the incidence of COVID-19 in the world were selected as of October 22, 2020 (on the eve of the second wave of the pandemic), which are presented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. The arithmetic averages of, and the change (increase) in, indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value, and number of employees were calculated. The arithmetic mean values of these indicators for all countries in the sample were found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic. The data are collected in a general Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. The dataset is flexible and can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the data in the dataset are not ready-made numbers but formulas, when values in the original table at the beginning of the dataset are added or changed, most of the subsequent tables are automatically recalculated and the graphs are updated. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020. The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in the risk assessment tables and obtaining automatically calculated consequences (changes) for the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified during and following the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of the pandemic and COVID-19 crisis for international entrepreneurship.
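A minimal sketch, in Python rather than Excel, of the two calculations the dataset automates: averaging a corporate indicator per country and treating second-wave morbidity as a normally distributed forecast for scenario analysis. All numbers and names below are illustrative assumptions, not values from the dataset:

```python
from statistics import mean, NormalDist

# Hypothetical per-company changes in ranking position for two countries
rank_change = {"USA": [-3, 5, 0, 2], "India": [1, -2, 4]}
country_means = {c: mean(v) for c, v in rank_change.items()}   # average per country
overall_mean = mean(country_means.values())                    # sample-wide average
print(country_means, overall_mean)

# Hypothetical forecast: daily new cases modelled as a normal distribution
forecast = NormalDist(mu=60_000, sigma=12_000)
# Probability of a mid-range morbidity scenario, usable in a risk-assessment table
print(forecast.cdf(80_000) - forecast.cdf(40_000))
```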
This dataset contains all current and active business licenses issued by the Department of Business Affairs and Consumer Protection. This dataset contains a large number of records/rows of data and may not be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Notepad or Wordpad, to view and search.
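As an alternative to paging through the export in a text editor, the full record count can be checked programmatically; a minimal sketch (the file name is hypothetical):

```python
import pandas as pd

# Stream the CSV in chunks so the row count is not limited by Excel's row cap
chunks = pd.read_csv("Business_Licenses.csv", chunksize=100_000, dtype=str)
total_rows = sum(len(chunk) for chunk in chunks)
print(f"Total license records: {total_rows}")
```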
Data fields requiring description are detailed below.
APPLICATION TYPE: 'ISSUE' is the record associated with the initial license application. 'RENEW' is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. 'C_LOC' is a change of location record. It means the business moved. 'C_CAPA' is a change of capacity record. Only a few license types may file this type of application. 'C_EXPA' only applies to businesses that have liquor licenses. It means the business location expanded.
LICENSE STATUS: 'AAI' means the license was issued.
Business license owners may be accessed at: http://data.cityofchicago.org/Community-Economic-Development/Business-Owners/ezma-pppn To identify the owner of a business, you will need the account number or legal name.
Data Owner: Business Affairs and Consumer Protection
Time Period: Current
Frequency: Data is updated daily
U.S. Government Workshttps://www.usa.gov/government-works
License information was derived automatically
Spreadsheet used to calculate hydrograph recession parameters (Minimum, Most Probable Value, and Maximum) for the Stochastic Empirical Loading and Dilution Model (SELDM). The spreadsheet was used in conjunction with the SELDM simulations used in the publication: Stonewall, A.J., and Granato, G.E., 2018, Assessing potential effects of highway and urban runoff on receiving streams in total maximum daily load watersheds in Oregon using the Stochastic Empirical Loading and Dilution Model: U.S. Geological Survey Scientific Investigations Report 2019-5053, 116 p., https://doi.org/10.3133/sir20195053
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data required to reproduce figures.
FIRE1102: Total staff numbers (full time equivalent) by role and fire and rescue authority (19 October 2023) (MS Excel Spreadsheet, 472 KB): https://assets.publishing.service.gov.uk/media/67077dab3b919067bb482f30/fire-statistics-data-tables-fire1102-191023.xlsx
FIRE1102: Total staff numbers (full time equivalent) by role and fire and rescue authority (20 October 2022) (MS Excel Spreadsheet, 461 KB): https://assets.publishing.service.gov.uk/media/652d1f486972600014ccf86e/fire-statistics-data-tables-fire1102-201022.xlsx
FIRE1102: Total staff numbers (full time equivalent) by role and fire and rescue authority (21 October 2021) (MS Excel Spreadsheet, 404 KB): https://assets.publishing.service.gov.uk/media/634e78c78fa8f5346f4fea45/fire-statistics-data-tables-fire1102-211021.xlsx
FIRE1102: Total staff numbers (full time equivalent) by role and fire and rescue authority (22 October 2020) (MS Excel Spreadsheet, 348 KB): https://assets.publishing.service.gov.uk/media/61699a16d3bf7f5601cf3038/fire-statistics-data-tables-fire1102-221020.xlsx
FIRE1102: Total staff numbers (full time equivalent) by role and fire and rescue authority (31 October 2019) (MS Excel Spreadsheet, 300 KB): https://assets.publishing.service.gov.uk/media/5f86a5a08fa8f51707a7c1ec/fire-statistics-data-tables-fire1102-311019.xlsx
FIRE1102: Total staff numbers (full time equivalent) by role and fire and rescue authority (18 October 2018) (MS Excel Spreadsheet, 251 KB): https://assets.publishing.service.gov.uk/media/5db6ff89ed915d1d02a59fe3/fire-statistics-data-tables-fire1102-181018.xlsx
FIRE1102: Total staff numbers (full time equivalent) by role and fire and rescue authority (26 October 2017) (MS Excel Spreadsheet, 276 KB): https://assets.publishing.service.gov.uk/media/5bb4dcc5ed915d076cc2ac66/fire-statistics-data-tables-fire1102.xlsx
Fire statistics data tables
Fire statistics guidance
Fire statistics
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset for the article "The current utilization status of wearable devices in clinical research". Analyses were performed using JMP Pro 16.10 and Microsoft Excel for Mac version 16 (Microsoft). The file extension "jrp" denotes a file of the statistical analysis software JMP, which contains both the analysis code and the dataset. In case JMP is not available, a "csv" file as a dataset and the JMP script (the analysis code) in "rtf" format are provided. The "xlsx" file is a Microsoft Excel file that contains the dataset and the data plotted or tabulated using Microsoft Excel functions.
Supplementary Figure 1. NCT number duplication frequency. Includes the Excel file used to create the figure (Supplementary Figure 1).
・Sfig1_NCT number duplication frequency.xlsx
Supplementary Figures 2-5. Simple and annual time series aggregation. Includes the Excel file, JMP report files, csv datasets of the JMP report files, and JMP scripts used to create the figures (Supplementary Figures 2-5).
・Sfig2-5 Annual time series aggregation.xlsx
・Sfig2 Study Type.jrp
・Sfig4device type.jrp
・Sfig3 Interventions Type.jrp
・Sfig5Conditions type.jrp
・Sfig2, 3 ,5_database.csv
・Sfig2_JMP script_Study type.rtf
・Sfig3_JMP script Interventions type.rtf
・Sfig5_JMP script Conditions type.rtf
・Sfig4_dataset.csv
・Sfig4_JMP script_device type.rtf
Supplementary Figures 6-11. Mosaic diagram of intervention by condition; Supplementary Tables 4-9. Analysis of contingency table for intervention by condition. JMP report files used to create the figures (Supplementary Figures 6-11) and tables (Supplementary Tables 4-9), including the csv dataset of the JMP report files and JMP scripts.
・Sfig6-11 Stable4-9 Intervention devicetype_conditions.jrp
・Sfig6-11_Stable4-9_dataset.csv
・Sfig6-11_Stable4-9_JMP script.rtf
Supplementary Figure 12. Distribution of enrollment. Includes the Excel file, JMP report file, csv dataset of the JMP report file, and JMP scripts used to create the figure (Supplementary Figure 12).
・Sfig12_Distribution of enrollment.jrp
・Sfig12_Distribution of enrollment.csv
・Sfig12_JMP script.rtf
The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.
The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
As it is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
Sample survey data [ssd]
The sample for Slovenia ES 2009, 2013, 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.
Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls.) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports respectively, Appendix E.
For the Slovenia 2009 ES, industry stratification was designed as follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries, sample sizes were inflated by about 17% to account for potential non-response cases when requesting sensitive financial data and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals), sample sizes were inflated by about 12% to account for under-sampling of firms in service industries.
For the Slovenia 2013 ES, industry stratification was designed as follows: the universe was stratified into one manufacturing industry and two service industries (retail and other services).
Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).
For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.
For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.
Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).
Computer Assisted Personal Interview [capi]
Questionnaires have common questions (core module) and, respectively, additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (which includes the core module plus manufacturing-specific questions). Retail firms have been interviewed using the Services questionnaire (which includes the core module plus retail-specific questions), and the residual eligible services have been covered using the Services questionnaire (which includes the core module). Each variation of the questionnaire is identified by the index variable, a0.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.
For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey), and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates in Slovenia may be selection bias and not frame inaccuracy.
For 2013, the number of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 44%.
Finally, for 2019, the number of interviews per contacted establishment was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey), and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
This Excel file contains results from the 2017 State of Narragansett Bay and Its Watershed Technical Report (nbep.org), Chapter 4: "Population." The methods for analyzing population were developed by the US Environmental Protection Agency ORD Atlantic Coastal Environmental Sciences Division in collaboration with the Narragansett Bay Estuary Program and other partners. Population rasters were generated using the USGS dasymetric mapping tool (see http://geography.wr.usgs.gov/science/dasymetric/index.htm), which uses land use data to distribute population data more accurately than simply within a census mapping unit. The 1990, 2000, and 2010 10m cell population density rasters were produced using Rhode Island state land use data, Massachusetts state land use data, Connecticut NLCD land use data, and U.S. Census data. To generate a population estimate (number of persons) for any given area within the boundaries of this raster, NBEP used the Zonal Statistics as Table tool to sum the 10m cell density values within a given zone dataset (e.g., watershed polygon layer). Results presented include population estimates (1990, 2000, 2010) as well as calculation of percent change (1990-2000; 2000-2010; 1990-2010).
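A minimal sketch of the zonal-sum step described above, using the open-source rasterstats package as an analogue of the Zonal Statistics as Table tool; the file names are hypothetical:

```python
from rasterstats import zonal_stats

# Each 10 m cell holds a population density value, so summing cells within a
# watershed polygon gives a population estimate for that zone.
stats = zonal_stats("watersheds.shp", "population_density_2010.tif", stats=["sum"])
for zone in stats:
    print(zone["sum"])  # estimated persons in each watershed polygon
```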
The Freedom of Information Act 1982 (Cth) provides a right of access to documents held by Australian Government agencies and ministers. Sometimes a charge may apply to an FOI request, and review mechanisms allow people to appeal decisions on an FOI request if they disagree. The data below shows the number of requests received since 1982-83 and the amount of charges notified and collected. This data is sourced from data provided to the Office of the Australian Information Commissioner (OAIC) and previous FOI annual reports prepared each year from 1983 onwards. The 1982-83 data covers only seven months (December 1982 to June 1983), and the data about requests for personal information compared to other requests has only been collected from 2000-01 onwards. Agencies and ministers are also required to provide to the OAIC quarterly and annual statistical returns on FOI activity, review numbers and estimated costs. The returns data since 2011-12 is available here in CSV format and in a summary Excel spreadsheet. In these files a column identified by either 'P' or 'personal' refers to FOI requests for predominantly personal information in relation to the criteria in row 1 above that heading. Similarly, the adjacent column to the right identified by 'O' or 'other' refers to all other FOI requests for that particular criteria. The column immediately adjacent, to the right of 'O', headed by 'T', contains the total of the 'P' and 'O' columns. The Excel spreadsheet is optimised for readability and contains additional processing (such as applying a salary multiplier to agency estimated staff time data). Finally, the original survey questions for each year are included: each of the columns in the CSVs can be linked to a survey question asked that year.
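A minimal sketch of checking the P / O / T column layout described above after loading one of the CSV returns; the file and column names are hypothetical:

```python
import pandas as pd

df = pd.read_csv("foi_returns_2012-13.csv")  # hypothetical file name

# For each criterion, the total column should equal personal + other requests.
assert (df["requests_T"] == df["requests_P"] + df["requests_O"]).all()
print(df[["requests_P", "requests_O", "requests_T"]].sum())
```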
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the data for the Excel Township, Minnesota population pyramid, which represents the Excel township population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.
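A minimal sketch, with purely illustrative counts, of the dependency ratios the dataset supports:

```python
# Hypothetical population counts for the three broad age groups
youth, working_age, older = 120, 480, 90  # ages 0-14, 15-64, 65+

youth_dependency = 100 * youth / working_age             # 25.0
old_age_dependency = 100 * older / working_age           # 18.75
total_dependency = 100 * (youth + older) / working_age   # 43.75
potential_support = working_age / older                  # ~5.3 working-age people per older person
print(youth_dependency, old_age_dependency, total_dependency, potential_support)
```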
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Excel Township Population by Age, which can be referred to here.
This dataset enables further analysis and comparison of Regional Trade in Goods data and contains information that includes:
Quarterly information on the number of goods exporters and importers, by UK region and destination country.
Data on number of businesses exporting or importing
Average value of exports and imports by business per region.
Export and Import value by region.
The spreadsheet provides data on businesses using both the whole number and proportion number methodology.
The spreadsheet covers:
Importers by whole number business count
Importers by proportional business count
Exporters by whole number business count
Exporters by proportional business count
The Exporters by proportional business count spreadsheet was previously produced by the Department for International Trade.
MS Excel Spreadsheet, 1.24 MB
A Microsoft Excel spreadsheet with all gravity data used in forward modelling. The data was downloaded from GADDS (Geophysical Archive Data Delivery System): see link.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data collected during a study ("Towards High-Value Datasets determination for data-driven development: a systematic literature review") conducted by Anastasija Nikiforova (University of Tartu), Nina Rizun, Magdalena Ciesielska (Gdańsk University of Technology), Charalampos Alexopoulos (University of the Aegean) and Andrea Miletič (University of Zagreb). It is being made public both to act as supplementary data for the "Towards High-Value Datasets determination for data-driven development: a systematic literature review" paper (the pre-print is available in Open Access here: https://arxiv.org/abs/2305.10234) and to allow other researchers to use these data in their own work.
The protocol is intended for the Systematic Literature Review on the topic of High-value Datasets, with the aim of gathering information on how the topic of High-value Datasets (HVD) and their determination has been reflected in the literature over the years and what has been found by these studies to date, including the indicators used in them, the stakeholders involved, data-related aspects, and frameworks. The data in this dataset were collected as a result of the SLR over Scopus, Web of Science, and the Digital Government Research library (DGRL) in 2023.
Methodology
To understand how HVD determination has been reflected in the literature over the years and what has been found by these studies to date, all relevant literature covering this topic has been studied. To this end, the SLR was carried out by searching the digital libraries covered by Scopus, Web of Science (WoS), and the Digital Government Research library (DGRL).
These databases were queried for keywords ("open data" OR "open government data") AND ("high-value data*" OR "high value data*"), which were applied to the article title, keywords, and abstract to limit the number of papers to those where these objects were primary research objects rather than merely mentioned in the body, e.g., as future work. After deduplication, 11 articles were found to be unique and were further checked for relevance. As a result, a total of 9 articles were further examined. Each study was independently examined by at least two authors.
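A minimal sketch of the deduplication step described above; the record structure is a simplifying assumption, not the authors' actual export format:

```python
# Records exported from Scopus, WoS and DGRL are merged, then collapsed on a
# normalised title so each article is kept only once before relevance checks.
records = [
    {"title": "High-Value Datasets in Open Government", "source": "Scopus"},
    {"title": "High-value datasets in open government", "source": "WoS"},
]

unique = {r["title"].strip().lower(): r for r in records}
print(len(unique), "unique articles")  # -> 1
```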
To attain the objective of our study, we developed the protocol, where the information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information.
Test procedure: Each study was independently examined by at least two authors; after an in-depth examination of the full text of the article, the structured protocol was filled in for each study. The structure of the protocol is available in the supplementary files (see Protocol_HVD_SLR.odt, Protocol_HVD_SLR.docx). The data collected for each study by two researchers were then synthesized into one final version by the third researcher.
Description of the data in this data set
Protocol_HVD_SLR provides the structure of the protocol. Spreadsheet #1 provides the filled protocol for the relevant studies. Spreadsheet #2 provides the list of results after the search over the three indexing databases, i.e., before filtering out irrelevant studies.
The information on each selected study was collected in four categories: (1) descriptive information, (2) approach- and research design- related information, (3) quality-related information, (4) HVD determination-related information
Descriptive information
1) Article number - a study number, corresponding to the study number assigned in an Excel worksheet
2) Complete reference - the complete source information to refer to the study
3) Year of publication - the year in which the study was published
4) Journal article / conference paper / book chapter - the type of the paper -{journal article, conference paper, book chapter}
5) DOI / Website- a link to the website where the study can be found
6) Number of citations - the number of citations of the article in Google Scholar, Scopus, Web of Science
7) Availability in OA - availability of an article in the Open Access
8) Keywords - keywords of the paper as indicated by the authors
9) Relevance for this study - what is the relevance level of the article for this study? {high / medium / low}
Approach- and research design-related information
10) Objective / RQ - the research objective / aim and the established research questions
11) Research method (including unit of analysis) - the methods used to collect data, including the unit of analysis (country, organisation, specific unit that has been analysed, e.g., the number of use-cases, scope of the SLR, etc.)
12) Contributions - the contributions of the study
13) Method - whether the study uses a qualitative, quantitative, or mixed methods approach
14) Availability of the underlying research data - whether there is a reference to publicly available underlying research data, e.g., transcriptions of interviews or collected data, or an explanation of why these data are not shared
15) Period under investigation - the period (or moment) in which the study was conducted
16) Use of theory / theoretical concepts / approaches - does the study mention any theory / theoretical concepts / approaches? If any theory is mentioned, how is it used in the study?
Quality- and relevance- related information
17) Quality concerns - whether there are any quality concerns (e.g., limited information about the research methods used)?
18) Primary research object - is the HVD a primary research object in the study? (primary - the paper is focused around the HVD determination; secondary - mentioned but not studied (e.g., as part of discussion, future work etc.))
HVD determination-related information
19) HVD definition and type of value - how is the HVD defined in the article and / or any other equivalent term?
20) HVD indicators - what are the indicators to identify HVD? How were they identified? (components & relationships, "input -> output")
21) A framework for HVD determination - is there a framework presented for HVD identification? What components does it consist of and what are the relationships between these components? (detailed description)
22) Stakeholders and their roles - what stakeholders or actors does HVD determination involve? What are their roles?
23) Data - what data do HVD cover?
24) Level (if relevant) - what is the level of the HVD determination covered in the article? (e.g., city, regional, national, international)
Format of the files: .xls, .csv (for the first spreadsheet only), .odt, .docx
Licenses or restrictions CC-BY
For more info, see README.txt
This dataset was created to document the scoring of a camera tow from the SS2007/02 SE MPA's survey, specifically the camera tow on Patience Hill, to derive data from the tow relating to the number of basketwork eels observed. To collect this data, video footage from the SS2007/02 SE MPA's camera tow on Patience Hill was observed on a monitor screen with lines marked on it - the video was stopped every five seconds, and marine fauna within the marked lines were counted, with the marked lines used in order to prevent double counting of the animals. This data was recorded in an Excel spreadsheet, with a list of the different species recorded and the timecodes at which they were recorded. This data was then used in conjunction with other data from station 54 relating to the depth and structure of the seafloor, in order to determine any relation between marine life numbers and the surrounding habitat.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.null/customlicense?persistentId=doi:10.18738/T8/UVOFTH
Excel spreadsheet of smartrock and corresponding hydraulic stream data used to make figures in the Pretzlav et al. paper. The spreadsheet contains two tables. First, data averaged over each of the 22 hydrographs, including thresholds of motion (derived from smartrocks). Second, data averaged over 15 minute time intervals (corresponding to USGS gaging station data intervals), including the number of smartrocks still sampling, stage (relative to the thalweg) in our study reach, shear stress, shear velocity, Shields stress, the number of smartrock motions recorded in the time interval, and the transport probability in that time interval. See publication for additional details. Calculations to make these data tables were done primarily by Kealie Pretzlav, with input from Joel Johnson. When using these data, please cite the Pretzlav et al. (2020) manuscript they are from, or cite this archive and associated DOI directly.
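A minimal sketch, using textbook depth-slope relations rather than the authors' exact workflow, of how stage can be converted into the hydraulic quantities listed above (shear stress, shear velocity, Shields stress); the slope and grain size are hypothetical values:

```python
import math

RHO_W, RHO_S, G = 1000.0, 2650.0, 9.81  # water/sediment density (kg/m^3), gravity (m/s^2)
slope, grain_d = 0.005, 0.10            # hypothetical reach slope and grain diameter (m)

def hydraulics(stage_m: float):
    tau = RHO_W * G * stage_m * slope                  # depth-slope shear stress (Pa)
    u_star = math.sqrt(tau / RHO_W)                    # shear velocity (m/s)
    shields = tau / ((RHO_S - RHO_W) * G * grain_d)    # dimensionless Shields stress
    return tau, u_star, shields

print(hydraulics(1.2))  # quantities for a 1.2 m stage under the assumed slope and grain size
```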
This page lists ad-hoc statistics released during the period July - September 2020. These are additional analyses not included in any of the Department for Digital, Culture, Media and Sport’s standard publications.
If you would like any further information please contact evidence@dcms.gov.uk.
This analysis considers businesses in the DCMS Sectors split by whether they had reported annual turnover above or below £500 million, at one time the threshold for the Coronavirus Business Interruption Loan Scheme (CBILS). Please note the DCMS Sectors totals here exclude the Tourism and Civil Society sectors, for which data is not available or has been excluded for ease of comparability.
The analysis looked at the number of businesses and the total GVA generated for both turnover bands. In 2018, an estimated 112 DCMS Sector businesses had an annual turnover of £500m or more (0.03% of the total DCMS Sector businesses). These businesses generated 35.3% (£73.9bn) of all GVA by the DCMS Sectors.
These trends are broadly similar for the wider non-financial UK business economy, where an estimated 823 businesses had an annual turnover of £500m or more (0.03% of the total) and generated 24.3% (£409.9bn) of all GVA.
The Digital Sector had an estimated 89 businesses (0.04% of all Digital Sector businesses) – the largest number – with turnover of £500m or more; and these businesses generated 41.5% (£61.9bn) of all GVA for the Digital Sector. By comparison, the Creative Industries had an estimated 44 businesses with turnover of £500m or more (0.01% of all Creative Industries businesses), and these businesses generated 23.9% (£26.7bn) of GVA for the Creative Industries sector.
MS Excel Spreadsheet, 42.5KB
This analysis shows estimates from the ONS Opinion and Lifestyle Omnibus Survey Data Module, commissioned by DCMS in February 2020. The Opinions and Lifestyles Survey (OPN) is run by the Office for National Statistics. For more information on the survey, please see the ONS website: https://www.ons.gov.uk/aboutus/whatwedo/paidservices/opinions
DCMS commissioned 19 questions to be included in the February 2020 survey relating to the public’s views on a range of data related issues, such as trust in different types of organisations when handling personal data, confidence using data skills at work, understanding of how data is managed by companies and the use of data skills at work.
The high level results are included in the accompanying tables. The survey samples adults (16+) across the whole of Great Britain (excluding the Isles of Scilly).
MS Excel Spreadsheet, 12
http://reference.data.gov.uk/id/open-government-licence
Modelled road freight vehicle movements for a base year of 2006, produced by the Base Year Freight Matrices (BYFM) study. Data consists of numbers of vehicles per average day between a set of origin-destination zone pairs. Vehicles are split into 3 categories: artics, rigids and vans. The zip file contains the data in csv format, metadata in an Excel spreadsheet and the BYFM study technical report.
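A minimal sketch of summarising the origin-destination data once the CSV is extracted from the zip; the file and column names are hypothetical:

```python
import pandas as pd

od = pd.read_csv("byfm_2006_od.csv")  # hypothetical file name within the zip

# Total modelled vehicles per average day for each vehicle category
daily_totals = od.groupby("vehicle_type")["vehicles_per_day"].sum()
print(daily_totals)  # artics, rigids, vans
```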
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please note that 2024 data are incomplete and will be updated as additional records become available. Data are complete through 12/31/2023.
Fatal and serious injury crashes are not "accidents" and are preventable. The City of Tempe is committed to reducing the number of fatal and serious injury crashes to zero. This data page provides details about the performance measure related to High Severity Traffic Crashes, as well as access to the data sets and any supplemental data. The Engineering and Transportation Department uses this data to improve safety in Tempe. This data includes vehicle/vehicle, vehicle/bicycle and vehicle/pedestrian crashes in Tempe. The data also includes the type of crash and location. This layer is used in the related Vision Zero story map, web maps and operations dashboard.
Time Zones: Please note that data is stored in Arizona time, which is UTC-07:00 (7 hours behind UTC) and does not adjust for daylight savings (as Arizona does not partake in daylight savings). The data is intended to be viewed in Arizona time. Data downloaded as a CSV may appear in UTC time and, in some rare circumstances and locations, may display online in UTC or local time zones. As a reference to check data, the record with incident number 2579417 should appear as Jan. 10, 2012 9:04 AM.
This page provides data for the High Severity Traffic Crashes performance measure. The performance measure page is available at 1.08 High Severity Traffic Crashes.
Additional Information
Source: Arizona Department of Transportation (ADOT)
Contact (author): Shelly Seyler
Contact (author) E-Mail: Shelly_Seyler@tempe.gov
Contact (maintainer): Julian Dresang
Contact (maintainer) E-Mail: Julian_Dresang@tempe.gov
Data Source Type: CSV files and Excel spreadsheets can be downloaded from the ADOT website
Preparation Method: Data is sorted to remove license plate numbers and other sensitive information
Publish Frequency: Semi-annually
Publish Method: Manual
Data Dictionary
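A minimal sketch of the time-zone note above: Arizona stays on UTC-07:00 year-round, so a fixed offset (rather than a DST-aware zone) reproduces the reference record. The UTC input value used here is inferred from the stated offset and is an assumption:

```python
from datetime import datetime, timedelta, timezone

ARIZONA = timezone(timedelta(hours=-7), "MST")  # fixed offset, no daylight savings

# Hypothetical UTC value as a CSV download might show it for incident 2579417
utc_value = datetime(2012, 1, 10, 16, 4, tzinfo=timezone.utc)
print(utc_value.astimezone(ARIZONA).strftime("%b. %d, %Y %I:%M %p"))  # Jan. 10, 2012 09:04 AM
```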
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository holds the testing and demonstration data for DataRig, an open-source software program for downloading datasets from data repositories via RESTful APIs. This repository contains 5 sample datasets.
annotations_001.txt
This data set is a tab-separated text file containing 6 columns that start on line number 7. The column headers are:
'Number' 'Start Time' 'End Time' 'Time From Start' 'Channel' 'Annotation'
There are 13 rows of data under each of these column headers, representing the start and end times of annotated events from an EEG recording file in this repository called recording_001.edf. The events describe the behavior of a mouse in 5 sec increments, with each behavior being one of 'exploring', 'grooming' or 'rest'.
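A minimal sketch of parsing annotations_001.txt as described (headers on line 7, 13 data rows); the exact pandas options are an assumption about the file layout:

```python
import pandas as pd

# Skip the 6 lines before the header row, then read the tab-separated table
ann = pd.read_csv("annotations_001.txt", sep="\t", skiprows=6)
print(list(ann.columns))          # Number, Start Time, End Time, Time From Start, Channel, Annotation
print(len(ann))                   # 13 annotated events
print(ann["Annotation"].unique()) # e.g. exploring, grooming, rest
```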
recording_001.edf
A European Data Format file consisting of 4 channels of EEG data lasting approximately 1 hour. The times in the annotations_001.txt file are referenced against this file.
sample_arr.npy
A numpy array of shape (4, 250) with values sequentially running from 0 to 1000.
sample_excel.xls
An Excel file with a single column of 10 numbers from 0-9 sequentially.
sample_text.txt
A text file with 4 rows containing 250 values per row. The values in the file run from 0 to 1000 sequentially.
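A minimal sketch of loading the remaining sample files, assuming whitespace-delimited values in the text file and an available .xls engine (e.g., xlrd) for the Excel file:

```python
import numpy as np
import pandas as pd

arr = np.load("sample_arr.npy")
print(arr.shape)                          # (4, 250), values 0 to 1000

xls = pd.read_excel("sample_excel.xls", header=None)
print(xls[0].tolist())                    # 0..9

txt = np.loadtxt("sample_text.txt")       # assumes whitespace-delimited values
print(txt.shape)                          # (4, 250)
```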