Excel spreadsheets by species (the 4-letter code is the abbreviation for the genus and species used in the study, the year 2010 or 2011 is the year the data were collected, SH indicates data for Science Hub, and the date is the date of file preparation). The data in a file are described in a read me file, which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the columns in the data set for chemical analysis. In this file, one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk, D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee, and M. Plocher. Plant reproduction is altered by simulated herbicide drift to constructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).
This dataset contains information about employee travel expenses for the year 2021. Details are provided on the employee (name, title, department), the travel (dates, location, purpose) and the cost (expenses, recoveries). Expenses are broken down in separate tabs by quarter (Q1, Q2, Q3 and Q4). The dataset is updated quarterly when expenses are prepared. Expenses for other years are available in separate datasets.
The USDA Agricultural Research Service (ARS) recently established SCINet , which consists of a shared high performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling. The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey which is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly. From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate response to a data management expert in their unit, to all members of their unit, or to themselves collate responses from their unit before reporting in the survey. 
Larger storage ranges cover vastly different amounts of data, so the implications could be significant depending on whether the true amount is at the lower or higher end of the range. Therefore, we requested more detail from "Big Data users," those 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB total current data (Q5). All other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months. All other data would be considered inactive, or archival. To calculate per person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.
Resources in this dataset:
Resource Title: Appendix A: ARS data storage survey questions.
File Name: Appendix A.pdf
Resource Description: The full list of questions asked with the possible responses. The survey was not administered using this PDF but the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop down not shown here.
Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/
Resource Title: CSV of Responses from ARS Researcher Data Storage Survey.
File Name: Machine-readable survey response data.csv
Resource Description: CSV file includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This information is the same data as in the Excel spreadsheet (also provided).
Resource Title: Responses from ARS Researcher Data Storage Survey.
File Name: Data Storage Survey Data for public release.xlsx
Resource Description: MS Excel worksheet that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed.
Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
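The per person storage rule described above (high end of the reported range, divided by 1 for an individual response or by G for a group response) can be sketched in a few lines of Python. This is only an illustration: the function name and the example responses are hypothetical and are not taken from the survey data.

```python
def per_person_storage_tb(range_high_tb: float, group_size: int = 1) -> float:
    """Return the high end of the reported range divided by the people covered.

    group_size is 1 for an individual response, or G for a group response.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return range_high_tb / group_size


# Hypothetical responses: (high end of the selected range in TB, people covered).
responses = [(1.0, 1), (10.0, 4), (100.0, 8)]
estimates = [per_person_storage_tb(high, g) for high, g in responses]
print(estimates)
```

Using the high end of each range gives a deliberately generous estimate, which suits capacity planning better than a midpoint would.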
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In "Sample Student Data", there are 6 sheets. Three sheets contain sample datasets, one for each of the three exercise protocols described (CrP Sample Dataset, Glycolytic Dataset, Oxidative Dataset). The other three sheets contain sample graphs, each created from one of the three datasets (CrP Sample Graph, Glycolytic Graph, Oxidative Graph). Each dataset and graph pair is from a different subject.
· CrP Sample Dataset and CrP Sample Graph: This is an example of a dataset and graph created from an exercise protocol designed to stress the creatine phosphate system. Here, the subject was a track and field athlete who threw the shot put for the DeSales University track team. The NIRS monitor was placed on the right triceps muscle, and the student threw the shot put six times with a minute of rest between throws. Data was collected telemetrically by the NIRS device and then downloaded after the student had completed the protocol.
· Glycolytic Dataset and Glycolytic Graph: This is an example of a dataset and graph created from an exercise protocol designed to stress the glycolytic energy system. In this example, the subject performed continuous squat jumps for 30 seconds, followed by a 90 second rest period, for a total of three exercise bouts. The NIRS monitor was placed on the left gastrocnemius muscle. Here again, data was collected telemetrically by the NIRS device and then downloaded after the subject had completed the protocol.
· Oxidative Dataset and Oxidative Graph: In this example, the dataset and graph are from an exercise protocol designed to stress the oxidative system. Here, the student held a sustained, light-intensity, isometric biceps contraction (pushing against a table). The NIRS monitor was attached to the left biceps muscle belly. Here, data was collected by a student observing the SmO2 values displayed on a secondary device; specifically, a smartphone with the IPSensorMan app displaying data. The recording student observed and recorded the data in an Excel spreadsheet, and marked in the spreadsheet the times that exercise began and ended.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Sample data for exercises in Further Adventures in Data Cleaning.
The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.
The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
National
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
As is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).
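The size bands above translate directly into a small classifier. The sketch below is illustrative only: the function name is hypothetical, and the "out of scope" label for establishments with fewer than 5 employees is an assumption, since the text does not say how such cases are treated.

```python
def size_stratum(employees: int) -> str:
    """Map an employee count to the size stratum described in the text."""
    if employees < 5:
        # Assumption: counts below the smallest band are treated as out of scope.
        return "out of scope"
    if employees <= 19:
        return "small"
    if employees <= 99:
        return "medium"
    return "large"


for count in (4, 5, 19, 20, 99, 100):
    print(count, size_stratum(count))
```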
Sample survey data [ssd]
The samples for Slovenia ES 2009, 2013, and 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.
Three levels of stratification were used in this country: industry, establishment size, and oblast (region). For Slovenia 2009 ES, the original sample designs with specific information on the industries and regions chosen are included in the attached Excel file (Sampling Report.xls). For Slovenia 2013 and 2019 ES, specific information on the industries and regions chosen is described in Appendix E of the reports "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set", respectively.
For the Slovenia 2009 ES, industry stratification was designed as follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries, sample sizes were inflated by about 17% to account for potential non-response cases when requesting sensitive financial data and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals), sample sizes were inflated by about 12% to account for undersampling of firms in service industries.
For Slovenia 2013 ES, industry stratification was designed as follows: the universe was stratified into one manufacturing industry and two service industries (retail and other services).
Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).
For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.
For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.
Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).
Computer Assisted Personal Interview [capi]
Questionnaires have common questions (core module) and, respectively, additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing-specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail-specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response.
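Because refusals are stored as the numeric code -8, an analysis that averages a variable without recoding would treat each refusal as a real value. A minimal sketch of the recoding step follows; the helper names and the example answers are hypothetical, and only the -8 code comes from the text.

```python
REFUSAL_CODE = -8  # code used for refusals, as stated in the text


def recode_refusals(values):
    """Replace the refusal code with None so it is not treated as a number."""
    return [None if v == REFUSAL_CODE else v for v in values]


def mean_of_answers(values):
    """Average only the values that were actually answered."""
    answered = [v for v in values if v is not None]
    return sum(answered) / len(answered) if answered else None


raw = [3, -8, 5, -8, 4]  # hypothetical answers to one sensitive question
clean = recode_refusals(raw)
print(clean, mean_of_answers(clean))
```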
For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates for Slovenia may be selection bias and not frame inaccuracy.
For 2013, the number of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 44%.
Finally, for 2019, the number of interviews per contacted establishments was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
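The three waves report this figure in different units: contacts per realized interview for 2009, and interviews per contact (as a percentage) for 2013 and 2019. The sketch below converts the reported numbers to a common basis for comparison. It only rearranges the figures quoted in the text; the variable names are hypothetical.

```python
# Reported figures: 2009 as contacts per interview, 2013 and 2019 as the
# share of contacted establishments that resulted in an interview.
contacts_per_interview_2009 = 6.18
interviews_per_contact = {2013: 0.25, 2019: 0.097}

# Put every wave on the "interviews per contact" basis.
yield_by_year = {2009: 1 / contacts_per_interview_2009, **interviews_per_contact}

# The reciprocal gives "contacts needed per realized interview".
contacts_by_year = {year: 1 / share for year, share in yield_by_year.items()}

for year in sorted(yield_by_year):
    print(year, round(yield_by_year[year], 3), round(contacts_by_year[year], 2))
```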
To create the dataset, the top 10 countries leading in the incidence of COVID-19 in the world were selected as of October 22, 2020 (on the eve of the second wave of the pandemic), which are presented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. The arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators for all countries of the sample were found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic. The data is collected in a general Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. The dataset is flexible and can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the data in the dataset are formulas rather than ready-made numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020.
The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.
The Nuclear Medicine National HQ System database is a series of MS Excel spreadsheets and Access Database Tables by fiscal year. They consist of information from all Veterans Affairs Medical Centers (VAMCs) performing or contracting nuclear medicine services in Veterans Affairs medical facilities. The medical centers are required to complete questionnaires annually (RCS 10-0010-Nuclear Medicine Service Annual Report). The information is then manually entered into the Access Tables, which includes: * Distribution and cost of in-house VA - Contract Physician Services, whether contracted services are made via sharing agreement (with another VA medical facility or other government medical providers) or with private providers. * Workload data for the performance and/or purchase of PET/CT studies. * Organizational structure of services. * Updated changes in key imaging service personnel (chiefs, chief technicians, radiation safety officers). * Workload data on the number and type of studies (scans) performed, including Medicare Relative Value Units (RVUs), also referred to as Weighted Work Units (WWUs). WWUs are a workload measure calculated as the product of a study's Current Procedural Terminology (CPT) code, which consists of total work costs (the cost of physician medical expertise and time), and total practice costs (the costs of running a practice, such as equipment, supplies, salaries, utilities etc). Medicare combines WWUs together with one other parameter to derive RVUs, a workload measure widely used in the health care industry. WWUs allow Nuclear Medicine to account for the complexity of each study in assessing workload, that some studies are more time consuming and require higher levels of expertise. This gives a more accurate picture of workload; productivity etc than using just 'total studies' would yield. * A detailed Full-Time Equivalent Employee (FTEE) grid, and staffing distributions of FTEEs across nuclear medicine services. 
* Information on Radiation Safety Committees and Radiation Safety Officers (RSOs). Beginning in 2011 this will include data collection on part-time and non-VA (contract) RSOs; other affiliations they may have and if so to whom they report (supervision) at their VA medical center. * Collection of data on nuclear medicine services' progress in meeting the special needs of our female veterans. * Revolving documentation of all major VA-owned gamma cameras (by type) and computer systems, their specifications and ages. * Revolving data collection for PET/CT cameras owned or leased by VA; and the numbers and types of PET/CT studies performed on VA patients whether produced on-site, via mobile PET/CT contract or from non-VA providers in the community. * Types of educational training/certification programs available at VA sites. * Ongoing funded research projects by Nuclear Medicine (NM) staff, identified by source of funding and research purpose. * Data on physician-specific quality indicators at each nuclear medicine service. * Academic achievements by NM staff, including published books/chapters, journals and abstracts. * Information from polling field sites re: relevant issues and programs Headquarters needs to address. * Results of a Congressionally mandated contracted quality assessment exercise, also known as a Proficiency study. Study results are analyzed for comparison within VA facilities (for example by mission or size), and against participating private sector health care groups. * Information collected on current issues in nuclear medicine as they arise. Radiation Safety Committee structures and membership, Radiation Safety Officer information and information on how nuclear medicine services are provided for female Veterans are examples of current issues. The database is now stored completely within MS Access Database Tables with output still presented in the form of Excel graphs and tables.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Results of subgroup meta-analysis of good knowledge, favorable attitudes, and good practices.
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Published: 25th February 2020 - The NHSBSA 'One Drug Database' project is underway to provide a consistent, single source of prescriptions information. The first release of new data was due in February 2020 (December 2019 data) on the NHSBSA website; however, this has been delayed a month. As a result this is an additional and FINAL release of PLP data by NHS Digital. This means that from March 2020 the PLP data will ONLY be available from the NHSBSA website. There will also be differences in the way data is presented. You can read more about the project, and how this affects the NHSBSA systems and reports, in the related link at the bottom of this web page. Practice level prescribing data is a list of all medicines, dressings and appliances that are prescribed by all practices in England and dispensed in the community each month. A zip file is available from which users can download and extract all 3 files locally.
What does the data cover?
Practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred and there is no record for a zero total. For each practice in England, including GP practices, the following information is presented at presentation level for each medicine, dressing and appliance (by presentation name): the total number of items prescribed and dispensed; the total net ingredient cost; the total actual cost; and the total quantity. The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to practices. Practices are identified only by their national code, so an additional data file - linked to the first by the practice code - provides further detail in relation to the practice.
Presentations are identified only by their BNF code, so an additional data file - linked to the first by the BNF code - provides the chemical name for that presentation. Warning: Large file size (over 1GB). Each monthly data set is large (over 10 million rows), but can be viewed using add-ons to existing software, such as the Microsoft PowerPivot add-on for Excel, to handle larger data sets. The Microsoft PowerPivot add-on for Excel is available using the link in the 'Related Links' section below. Once PowerPivot has been installed, to load the large files, please follow the instructions below. Note that it may take at least 20 to 30 minutes to load one monthly file.
1. Start Excel as normal
2. Click on the PowerPivot tab
3. Click on the PowerPivot Window icon (top left)
4. In the PowerPivot Window, click on the "From Other Sources" icon
5. In the Table Import Wizard e.g. scroll to the bottom and select Text File
6. Browse to the file you want to open and choose the file extension you require e.g. CSV
Once the data has been imported you can view it in a spreadsheet.
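As an alternative to a spreadsheet add-on, the three files can be linked in a script. The sketch below reads the two small lookup files into dictionaries and streams the large prescribing file one row at a time, so the full month never has to fit in memory. Every column name, code and value here is made up for illustration, the in-memory strings stand in for the real files, and the exact-match lookup on the BNF code is an assumption about how the files link.

```python
import csv
import io

# Stand-ins for the three extracted files (hypothetical columns and values).
prescribing_csv = io.StringIO(
    "PRACTICE,BNF_CODE,ITEMS\n"
    "A00001,0101010G0,12\n"
    "A00002,0101010G0,3\n"
)
practices_csv = io.StringIO(
    "PRACTICE,NAME\n"
    "A00001,Example Surgery\n"
    "A00002,Sample Practice\n"
)
chemicals_csv = io.StringIO(
    "BNF_CODE,CHEMICAL\n"
    "0101010G0,Example chemical\n"
)


def load_lookup(handle, key, value):
    """Read a small lookup file into a dict keyed by the linking code."""
    return {row[key]: row[value] for row in csv.DictReader(handle)}


def enrich(prescribing, practice_names, chemical_names):
    """Yield prescribing rows one at a time with the linked names attached."""
    for row in csv.DictReader(prescribing):
        yield {
            **row,
            "PRACTICE_NAME": practice_names.get(row["PRACTICE"]),
            "CHEMICAL": chemical_names.get(row["BNF_CODE"]),
        }


practice_names = load_lookup(practices_csv, "PRACTICE", "NAME")
chemical_names = load_lookup(chemicals_csv, "BNF_CODE", "CHEMICAL")
rows = list(enrich(prescribing_csv, practice_names, chemical_names))
print(rows[0])
```

With the real files, `open(path, newline="")` would replace the `io.StringIO` objects, and the loop over `enrich(...)` would aggregate or write out rows instead of collecting them in a list.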
Market basket analysis with Apriori algorithm
The retailer wants to target customers with suggestions on the itemsets that a customer is most likely to purchase. I was given a dataset containing a retailer's transaction data, which covers all the transactions that have happened over a period of time. The retailer will use the results to grow its business and to suggest itemsets to customers, which should increase customer engagement, improve the customer experience and help identify customer behavior. I will solve this problem using association rules, a type of unsupervised learning technique that checks for the dependency of one data item on another data item.
Association rules are most often used when you want to find associations between different objects in a set. They work by finding frequent patterns in a transaction database. They can tell you which items customers frequently buy together, and they allow the retailer to identify relationships between the items.
Assume there are 100 customers: 10 of them bought a computer mouse, 9 bought a mouse mat and 8 bought both.
- rule: bought computer mouse => bought mouse mat
- support = P(mouse & mat) = 8/100 = 0.08
- confidence = support/P(computer mouse) = 0.08/0.10 = 0.80
- lift = confidence/P(mouse mat) = 0.80/0.09 = 8.9
This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
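The three measures can be checked with a short calculation. The sketch below uses the standard definitions for a rule A => B: confidence divides the joint support by the share of transactions containing the antecedent A, and lift divides confidence by the share containing the consequent B. The function name is hypothetical; the counts are the ones from the example.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    support = n_both / n_total                    # P(A and B)
    confidence = n_both / n_antecedent            # P(B | A)
    lift = confidence / (n_consequent / n_total)  # P(B | A) / P(B)
    return support, confidence, lift


# 100 customers: 10 bought a mouse, 9 bought a mat, 8 bought both.
support, confidence, lift = rule_metrics(100, 10, 9, 8)
print(support, confidence, round(lift, 2))
```

A lift well above 1 means the two items appear together far more often than they would if purchases were independent.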
Number of Attributes: 7
First, we need to load the required libraries. I briefly describe each library.
Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.
Next, we clean the data frame by removing missing values.
To apply association rule mining, we need to convert the data frame into transaction data, so that all items that are bought together in one invoice will be in ...
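The conversion step can be illustrated outside R as well. The Python sketch below groups line items by invoice so that each invoice becomes one basket of distinct items. It is not the author's R code: the row layout (invoice number, item name) and the example values are hypothetical.

```python
from collections import defaultdict

# Hypothetical cleaned rows: one (invoice number, item) pair per line item.
rows = [
    ("536365", "mouse"),
    ("536365", "mouse mat"),
    ("536366", "keyboard"),
    ("536365", "mouse"),  # a repeated item within the same invoice
]


def to_transactions(rows):
    """Group items by invoice, keeping each item once per basket."""
    baskets = defaultdict(set)
    for invoice, item in rows:
        if invoice is None or item is None:
            continue  # skip rows with missing values
        baskets[invoice].add(item)
    return {invoice: sorted(items) for invoice, items in baskets.items()}


transactions = to_transactions(rows)
print(transactions)
```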
The Survey on Interest Rate Controls 2020 was conducted as a World Bank Group study on interest rate controls (IRCs) in lending and deposit markets around the world. The study aims to identify the different types of formal (or de jure) controls, the countries that apply them, how they implement them, and the reasons for doing so. The objective of the study is to advance knowledge on this topic by providing an evidence base for investigating the impact of IRCs on economic outcomes.
The survey investigates present IRCs in each surveyed country, the reasons why they have been applied, the framework and resources associated with their application and the details as to their level and functioning. The focus is on legal forms of control (i.e. codified into law) as opposed to de facto controls. The new database on interest rate controls, a popular form of financial repression, is based on a survey of 108 countries, representing 88 percent of global gross domestic product. The interest rate controls presented in this dataset were in effect in 2019.
Global Survey, covering 108 countries, representing 88 percent of global GDP.
Regulation at the national level.
Banking supervisors and Local Banking Associations.
Sample survey data [ssd]
Mail Questionnaire [mail]
Bank supervisors and banking associations were provided with a standard Excel file. The survey was structured in five parts, each placed in a different Excel sheet. Part A: Introduction. Countries with no IRCs in place were asked to only answer this sheet and leave the rest blank. Part B: Presented the definitions of controls, institutions, products and additional aspects that will be covered in the survey. Part C: Introduced a set of qualitative questions to describe the IRCs in place. Part D: Displayed a set of tables to quantitatively describe the IRCs in place. Part E: Laid out the final set of questions, covering sanctions and control mechanisms that support the IRCs' enforcement. The questionnaire is provided in the Documentation section in PDF and Excel formats.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sensitivity analysis of pooled prevalence with each study removed one by one.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Level of knowledge, attitude, and practice towards COVID-19 infection prevention among pregnant women in Ethiopia.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Characteristics of included studies in this systematic review and meta-analysis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Excel database is part of a study on how generalist rural journalism covers the abandonment of the countryside in a Spanish province (Valladolid). Based on the global data presented here, a content analysis of 226 journalistic pieces from two digital newspapers was carried out: El Norte de Castilla digital (the only regional newspaper in the autonomous community to which Valladolid belongs) and El Día de Valladolid digital (the strictly local newspaper). The aim was to detect the causes, consequences and solutions that journalism offers for the rural exodus in economic, political, socio-cultural and environmental terms. The study had two main objectives. The first was to find out how rural journalism presents the abandonment of the countryside to Valladolid's readers, from which two secondary objectives were derived: (1) to identify the causes, consequences and solutions that rural journalism offers the citizens of the province of Valladolid regarding the exodus from the countryside; and (2) to describe the professional practice of rural journalism on the abandonment of the countryside. The second was to establish proposals for improving rural journalism in order to facilitate its professional practice and contribute in some way to solving the problem of rural depopulation. The data were collected between 1 February and 30 November 2023. In addition to the two aforementioned digital newspapers, the data include four specialised digital agricultural magazines: ‘Empresa agraria’, ‘Campo.es’, ‘Tierras.Agricultura’ and ‘Agronews Castilla y León’. Once all the data had been incorporated, several transformations were carried out with the Power Query Editor, which arranged the data appropriately for subsequently generating pivot tables in Excel.
https://creativecommons.org/publicdomain/zero/1.0/
Airline data holds immense importance as it offers insights into the functioning and efficiency of the aviation industry. It provides valuable information about flight routes, schedules, passenger demographics, and preferences, which airlines can leverage to optimize their operations and enhance customer experiences. By analyzing data on delays, cancellations, and on-time performance, airlines can identify trends and implement strategies to improve punctuality and mitigate disruptions. Moreover, regulatory bodies and policymakers rely on this data to ensure safety standards, enforce regulations, and make informed decisions regarding aviation policies. Researchers and analysts use airline data to study market trends, assess environmental impacts, and develop strategies for sustainable growth within the industry. In essence, airline data serves as a foundation for informed decision-making, operational efficiency, and the overall advancement of the aviation sector.
This dataset comprises diverse parameters relating to airline operations on a global scale. The dataset prominently incorporates fields such as Passenger ID, First Name, Last Name, Gender, Age, Nationality, Airport Name, Airport Country Code, Country Name, Airport Continent, Continents, Departure Date, Arrival Airport, Pilot Name, and Flight Status. These columns collectively provide comprehensive insights into passenger demographics, travel details, flight routes, crew information, and flight statuses. Researchers and industry experts can leverage this dataset to analyze trends in passenger behavior, optimize travel experiences, evaluate pilot performance, and enhance overall flight operations.
The dataset provided here is a simulated example generated with the online platform Mockaroo. This web-based tool enables the creation of customizable synthetic datasets that closely resemble real data. It is primarily intended for developers, testers, and data experts who require sample data for a range of uses, including testing databases, filling applications with demonstration data, and crafting lifelike illustrations for presentations and tutorials. To explore further details, you can visit their website.
Cover Photo by: Kevin Woblick on Unsplash
Thumbnail by: Airplane icons created by Freepik - Flaticon
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT
The Albero study analyzes the personal transitions of a cohort of high school students at the end of their studies. The data consist of (a) the longitudinal social network of the students, before (n = 69) and after (n = 57) finishing their studies; and (b) the longitudinal study of the personal networks of each of the participants in the research. The two observations of the complete social network are presented in two matrices in Excel format. For each respondent, two square matrices of 45 alters of their personal networks are provided, also in Excel format. For each respondent, both psychological sense of community and frequency of commuting are provided in a SAV file (SPSS). The database allows the combined analysis of social networks and personal networks of the same set of individuals.
INTRODUCTION
Ecological transitions are key moments in the life of an individual that occur as a result of a change of role or context. This is the case, for example, of the completion of high school studies, when young people start their university studies or try to enter the labor market. These transitions are turning points that carry a risk or an opportunity (Seidman & French, 2004). That is why they have received special attention in research and psychological practice, both from a developmental point of view and in the situational analysis of stress or in the implementation of preventive strategies.
The data we present in this article describe the ecological transition of a group of young people from Alcala de Guadaira, a town located about 16 kilometers from Seville. Specifically, in the “Albero” study we monitored the transition of a cohort of secondary school students at the end of the last pre-university academic year. It is a turning point at which most of them began a metropolitan lifestyle, with more trips to the capital and a slight decrease in identification with the place of residence (Maya-Jariego, Holgado & Lubbers, 2018).
Normative transitions, such as the completion of studies, affect a group of individuals simultaneously, so they can be analyzed both individually and collectively. From an individual point of view, each student stops attending the institute, which is replaced by new interaction contexts. Consequently, the structure and composition of their personal networks are transformed. From a collective point of view, the network of friendships of the cohort of high school students enters into a gradual process of disintegration and fragmentation into subgroups (Maya-Jariego, Lubbers & Molina, 2019).
These two levels, individual and collective, were evaluated in the “Albero” study. One of the peculiarities of this database is that we combine the analysis of a complete social network with a survey of personal networks in the same set of individuals, with a longitudinal design before and after finishing high school. This allows combining the study of the multiple contexts in which each individual participates, assessed through the analysis of a sample of personal networks (Maya-Jariego, 2018), with the in-depth analysis of a specific context (the relationships within one graduating class of students at the school), through the analysis of the complete network of interactions. This potentially allows us to examine the covariation of the social network with the individual differences in the structure of personal networks.
PARTICIPANTS
The social network and personal networks of the students of the last two years of high school of an institute of Alcala de Guadaira (Seville) were analyzed. The longitudinal follow-up covered approximately a year and a half. The first wave was composed of 31 men (44.9%) and 38 women (55.1%) who live in Alcala de Guadaira, and who mostly expect to live in Alcala (36.2%) or in Seville (37.7%) in the future. In the second wave, information was obtained from 27 men (47.4%) and 30 women (52.6%).
DATA STRUCTURE AND FILE FORMATS
The data is organized in two longitudinal observations, with information on the complete social network of the cohort of students of the last year, the personal networks of each individual and complementary information on the sense of community and frequency of metropolitan movements, among other variables.
Social network
The file “Red_Social_t1.xlsx” is a valued matrix of 69 actors that gathers the relations of knowledge and friendship between the cohort of students of the last year of high school in the first observation. The file “Red_Social_t2.xlsx” is a valued matrix of 57 actors obtained 17 months after the first observation.
In order to generate each complete social network, the list of 77 students enrolled in the last year of high school was given to the respondents, who were asked to indicate the type of relationship with each person according to the following values: 1, “his/her name sounds familiar”; 2, “I know him/her”; 3, “we talk from time to time”; 4, “we have good relationship”; and 5, “we are friends.” The two resulting complete networks are represented in Figure 2. In the second observation the network is comparatively less dense, reflecting the gradual disintegration process that the student group has begun.
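As an illustration of how the density of the two valued matrices could be compared, the following minimal sketch counts the share of ordered pairs whose tie value reaches a chosen threshold. The toy 4-actor matrices, the function name `density`, the use of 0 for "no relation reported", and the threshold of 4 are all assumptions made for illustration; they are not taken from the published files, whose matrices would be loaded from the spreadsheets instead.

```python
def density(matrix, threshold=1):
    """Share of ordered pairs (i != j) whose tie value is >= threshold."""
    n = len(matrix)
    if n < 2:
        return 0.0
    ties = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] >= threshold
    )
    return ties / (n * (n - 1))


# Hypothetical valued matrices: 0 = no relation reported (assumption),
# 1-5 follow the relationship scale described above.
wave1 = [
    [0, 5, 4, 2],
    [5, 0, 4, 3],
    [4, 4, 0, 1],
    [2, 3, 1, 0],
]
wave2 = [
    [0, 5, 2, 0],
    [5, 0, 3, 1],
    [2, 3, 0, 0],
    [0, 1, 0, 0],
]

# Keep only strong ties ("we have good relationship" or "we are friends").
d1 = density(wave1, threshold=4)
d2 = density(wave2, threshold=4)
print(f"density t1 = {d1:.2f}, t2 = {d2:.2f}")
```

Lowering the threshold to 1 would count any reported acquaintance as a tie; which cut-off is appropriate depends on the research question.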
Personal networks
Also in this case the information is organized in two observations. The compressed file “Redes_Personales_t1.csv” includes 69 folders, corresponding to personal networks. Each folder includes a valued matrix of 45 alters in CSV format. Likewise, in each case a graphic representation of the network obtained with Visone (Brandes and Wagner, 2004) is included. Relationship values range from 0 (do not know each other) to 2 (know each other very well).
Second, the compressed file “Redes_Personales_t2.csv” includes 57 folders, with the information equivalent to each respondent referred to the second observation, that is, 17 months after the first interview. The structure of the data is the same as in the first observation.
Sense of community and metropolitan displacements
The SPSS file “Albero.sav” collects the survey data, together with some summary information on the network data for each respondent. The 69 rows correspond to the 69 individuals interviewed, and the 118 columns to the variables related to each of them in T1 and T2, according to the following list:
• Socio-economic data.
• Data on habitual residence.
• Information on intercity journeys.
• Identity and sense of community.
• Personal network indicators.
• Social network indicators.
DATA ACCESS
Social networks and personal networks are available in CSV format. They can be used directly with UCINET, Visone, Pajek or Gephi, among others, and can be exported as Excel or text files for use with other programs.
The visual representation of the personal networks of the respondents in both waves is available in the following album of the Graphic Gallery of Personal Networks on Flickr: <https://www.flickr.com/photos/25906481@N07/albums/72157667029974755>.
In previous work we analyzed the effects of personal networks on the longitudinal evolution of the socio-centric network. That work also includes additional details about the instruments applied. If you use the data, please cite the following reference:
The English version of this article can be downloaded from: https://tinyurl.com/yy9s2byl
CONCLUSION
The database of the “Albero” study allows us to explore the co-evolution of social networks and personal networks. In this way, we can examine the mutual dependence of individual trajectories and the structure of the relationships of the cohort of students as a whole. The complete social network corresponds to the same context of interaction: the secondary school. However, personal networks collect information from the different contexts in which the individual participates. The structural properties of personal networks may partly explain individual differences in the position of each student in the entire social network. In turn, the properties of the entire social network partly determine the structure of opportunities in which individual trajectories are displayed.
The longitudinal character and the combination of the personal networks of individuals with a common complete social network, make this database have unique characteristics. It may be of interest both for multi-level analysis and for the study of individual differences.
ACKNOWLEDGEMENTS
The fieldwork for this study was supported by the Complementary Actions of the Ministry of Education and Science (SEJ2005-25683), and was part of the project “Dynamics of actors and networks across levels: individuals,
https://www.usa.gov/government-works
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.
Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.
This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.
The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.
For more information:
NNDSS Supports the COVID-19 Response | CDC.
The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.
All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.
To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.
CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.
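The suppression rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function `suppress_rare`, the field names, and the toy records are hypothetical, and CDC's actual procedure is not specified here beyond the threshold (<5), the recoding to NA, and the fact that records are never dropped.

```python
from collections import Counter


def suppress_rare(records, keys, min_count=5, na="NA"):
    """Recode `keys` to `na` when their combination occurs < min_count times.

    Records are never removed; a new list of dicts is returned.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    cleaned = []
    for r in records:
        combo = tuple(r[k] for k in keys)
        if counts[combo] < min_count:
            # Copy the record and overwrite only the indirect identifiers.
            r = {**r, **{k: na for k in keys}}
        cleaned.append(r)
    return cleaned


# Hypothetical records: one common and one rare demographic combination.
records = (
    [{"sex": "Female", "age_group": "20-29", "race": "A"}] * 6
    + [{"sex": "Male", "age_group": "80+", "race": "B"}] * 2
)
cleaned = suppress_rare(records, keys=("sex", "age_group", "race"))
print(cleaned[0], cleaned[-1])
```

Because suppressed rows are kept, total case counts are preserved while rare demographic combinations can no longer be read off individual records.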
For questions, please contact Ask SRRG (eocevent394@cdc.gov).
COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was obtained with a web scraper from the results of the query TMPRSS2:ERG in the PubMed bibliographic database. The dataset consists of an Excel file with 915 references, including authors, PubMed ID, title, short citation, long citation, URL, abstract, DOI and journal of publication.
The data were obtained with a web scraper coded in Python for the first practical assignment of the course M2.851 - Tipología y ciclo de vida de los datos of the Master's Degree in Data Science at the UOC.
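The author's scraper is not included in this description. As a hedged alternative for reproducing a similar query, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint rather than scraping the PubMed web pages; this is a swapped-in technique, not the original method. The helper name `build_esearch_url` and the `retmax` value are assumptions, and no network request is made.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=1000):
    """Return an esearch URL that lists PubMed IDs matching `term` as JSON."""
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # query string, URL-encoded by urlencode
        "retmax": retmax,    # maximum number of IDs to return (assumed value)
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


url = build_esearch_url("TMPRSS2:ERG")
print(url)
```

The returned IDs could then be passed to the `efetch` endpoint to retrieve titles and abstracts; NCBI's usage guidelines on request rates apply.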
Excel spreadsheets by species (the 4-letter code is the abbreviation for genus and species used in the study, the year 2010 or 2011 is the year the data were collected, SH indicates data for Science Hub, and the date is the date of file preparation). The data in a file are described in a read me file, which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the columns in the data set for chemical analysis. In this file, one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk, D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee, and M. Plocher. Plant reproduction is altered by simulated herbicide drift to constructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).