Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The study is mixed methods research. Quantitative data: the datasets contain sociodemographic data on women accessing cervical cancer screening at a women's clinic. The datasets and do-files can be opened in the analytic software Stata. Qualitative data: preliminary analysis tables and reflective notes from in-depth interviews with female patients and healthcare providers.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
BACKGROUND: The Health Insurance Institute of Slovenia (ZZZS) began publishing service-related data in May 2023, following a directive from the Ministry of Health (MoH). The ZZZS website provides easily accessible information about the services provided by individual doctors, including their names. The user is provided relevant information about the doctor's employer, including whether it is a public or private institution. The data provided is useful for studying the public system's operations and identifying any errors or anomalies.
METHODS: The data for services provided in May 2023 were downloaded and analysed. The published data were cross-referenced, using each provider's RIZDDZ number, with the daily updated data on ambulatory workload published by ZZZS on June 9, 2023. These data were found to contain inaccuracies and were corrected using alerts from the zdravniki.sledilnik.org portal, so they now give an accurate picture of the current situation. The total number of services provided by each provider in the month was determined by summing the individual services and assigning them to the corresponding provider.
RESULTS: A pivot table identified 307 unique providers, 15 of which did not appear in both lists. There are 66 public providers, which account for about 72% of the contractual programme in the public system, and 241 private providers, which account for about 28%. In May 2023, public providers delivered 69% (n = 646,236) of services in the family medicine system, while private providers contributed 31% (n = 291,660); in total, public and private providers delivered 937,896 services. Three linear correlations were analysed. The first analysis, of the entire sample, yielded a high R-squared of .998 (adjusted R-squared = .996) with a significance level below 0.001. The second analysis, of data from private providers, yielded an R-squared of .904 (adjusted R-squared = .886), indicating a strong correlation between the variables, again with a significance level below 0.001. The third analysis, of data from public providers, showed the strongest explanatory power, with an R-squared of 1.000 (adjusted R-squared = 1.000) and a p-value below 0.001.
CONCLUSION: Our analysis shows a strong linear correlation between the size of the contracted programme and the number of services rendered by family medicine providers. The correlation is stronger among providers in the public system than among those in the private system. Our study found that private providers generally deliver more services than public providers. However, it is important to acknowledge that the framework used to count services has inherent flaws: issuing a prescription and resuscitating a patient are both counted as one service. It is crucial to closely monitor trends and to identify comparable databases for pairing at the secondary and tertiary levels.
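As an illustration of the aggregation and regression steps described above, the sketch below uses pandas and SciPy; it is not the original analysis workflow, and the file names and column names (rizddz, services, programme_share, sector) are placeholders for whatever the real exports contain.

```python
# Hedged sketch (not the authors' workflow): aggregate services per provider
# via the RIZDDZ identifier and fit the three linear correlations described
# in the results. File and column names are assumptions.
import pandas as pd
from scipy import stats

services = pd.read_csv("zzzs_services_may2023.csv")      # one row per service record
workload = pd.read_csv("zzzs_ambulatory_workload.csv")   # contract programme per provider

# Total services per provider, keyed on the RIZDDZ identifier.
per_provider = services.groupby("rizddz")["services"].sum().reset_index()
merged = per_provider.merge(workload, on="rizddz", how="inner")

# One regression per subsample: all providers, public only, private only.
for label, subset in [("all", merged),
                      ("public", merged[merged["sector"] == "public"]),
                      ("private", merged[merged["sector"] == "private"])]:
    fit = stats.linregress(subset["programme_share"], subset["services"])
    print(f"{label}: R^2 = {fit.rvalue**2:.3f}, p = {fit.pvalue:.3g}")
```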
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains article metadata and information about Open Science Indicators for approximately 139,000 research articles published in PLOS journals from 1 January 2018 to 30 March 2025, and for a set of approximately 28,000 comparator articles published in non-PLOS journals. This is the tenth release of this dataset, which will be updated with new versions on an annual basis. This version of the Open Science Indicators dataset shares the indicators seen in previous versions as well as fully operationalised protocol and study registration indicators, which were previously only shared in preliminary form. The v10 dataset focuses on detection of five Open Science practices by analysing the XML of published research articles:
Sharing of research data, in particular data shared in data repositories
Sharing of code
Posting of preprints
Sharing of protocols
Sharing of study registrations
The dataset provides data and code generation and sharing rates, and the location of shared data and code (whether in Supporting Information or in an online repository). It also provides preprint, protocol and study registration sharing rates as well as details of the shared output, such as publication date, URL/DOI/Registration Identifier and platform used. Additional data fields are also provided for each article analysed. This release has been run using an updated preprint detection method (see OSI-Methods-Statement_v10_Jul25.pdf for details). Further information on the methods used to collect and analyse the data can be found in the Documentation folder. Further information on the principles and requirements for developing Open Science Indicators is available at https://doi.org/10.6084/m9.figshare.21640889.
Data folders/files
Data Files folder: This folder contains the main OSI dataset files, PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv, which contain descriptive metadata (e.g. article title, publication date, author countries, taken from the article .xml files) and additional information about the Open Science Indicators derived algorithmically. The OSI-Summary-statistics_v10_Jul25.xlsx file contains the summary data for both PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv.
Documentation folder: This folder contains documentation related to the main data files. The file OSI-Methods-Statement_v10_Jul25.pdf describes the methods underlying the data collection and analysis. OSI-Column-Descriptions_v10_Jul25.pdf describes the fields used in PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv. OSI-Repository-List_v1_Dec22.xlsx lists the repositories, and their characteristics, used to identify specific repositories in the PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv repository fields. The folder also contains documentation originally shared alongside the preliminary versions of the protocol and study registration indicators, to give fuller details of their detection methods.
Contact details for further information:
Iain Hrynaszkiewicz, Director, Open Research Solutions, PLOS, ihrynaszkiewicz@plos.org / plos@plos.org
Lauren Cadwallader, Open Research Manager, PLOS, lcadwallader@plos.org / plos@plos.org
Acknowledgements: Thanks to Allegra Pearce, Tim Vines, Asura Enkhbayar, Scott Kerr and parth sarin of DataSeer for contributing to data acquisition and supporting information.
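As a minimal, hypothetical illustration of how such sharing rates could be recomputed from the main CSV, the snippet below assumes boolean indicator columns named data_sharing, code_sharing and preprint_posted; the actual field names are defined in OSI-Column-Descriptions_v10_Jul25.pdf and should be substituted before use.

```python
# Hypothetical sketch: overall sharing rates from the PLOS OSI dataset.
# The indicator column names below are placeholders, not the real schema.
import pandas as pd

plos = pd.read_csv("PLOS-Dataset_v10_Jul25.csv")

for indicator in ["data_sharing", "code_sharing", "preprint_posted"]:
    rate = plos[indicator].astype(bool).mean()
    print(f"{indicator}: {rate:.1%} of articles")
```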
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the data repository for the LO2 dataset.
Here is an overview of the contents.
lo2-data.zip
This is the main dataset. This is the completely unedited output of our data collection process. Note that the uncompressed size is around 540 GB. For more information, see the paper and the data-appendix in this repository.
lo2-sample.zip
This is a sample that contains the data used for the preliminary analysis. It contains only service logs and the most relevant metrics for the first 100 runs. Furthermore, the metrics are combined at the run level into a single CSV to make them easier to use.
data-appendix.pdf
This document contains further details and stats about the full dataset. These include file size distributions, empty file analysis, log type analysis and the appearance of an unknown file.
lo2-scripts.zip
Various scripts for processing the data to create the sample, to conduct the preliminary analysis and to create the statistics seen in the data-appendix.
Version v3: Updated data appendix introduction, added another stage in the log analysis process in loglead_lo2.py
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of data on different types of television shows based on video games from the years mentioned in the title. The data has been used in articles and conference presentations before (e.g. Kerttula 2019; Kerttula 2020). The data is free to use in any future publications with proper references to the author and the original data. Should the data be used in further research, it should be noted that the dataset is not 100% complete. The reasons for this are difficulties with language and cultural barriers. It also needs to be mentioned that some of the television shows and production companies have probably been forgotten over time, which means that a complete list would quite likely prove very difficult to gather. Some of the data included is missing classification information, because in these cases the necessary data was not available or was difficult to determine. For example, the time slot data was missing for these shows, or there was not enough information available to draw conclusions about the structure of the show. This applies only to a handful of shows, however. This data does not compromise or endanger any copyrights or personal information. All the data gathered here is publicly available from different internet sources. No personal information, such as addresses, phone numbers or contact persons, was recorded in the data. Some shows feature episodes from video repositories around the internet, but if the production company wants to take the episodes offline, it does not harm the dataset.
This dataset includes all the data files that were used for the studies in my Master's thesis, "The Choice of Aspect in the Russian Modal Construction with prixodit'sja/prijtis'". The data files are numbered so that they appear in the same order as they are presented in the thesis. They include the database and the code used for the statistical analysis. Their contents are described in the ReadMe files. The core of the work is a quantitative, empirical study of the choice of aspect by Russian native speakers in the modal construction prixodit'sja/prijtis' + inf. The hypothesis is that in this construction the aspect of the infinitive is not fully determined by grammatical context but is, to some extent, open to construal. A preliminary analysis was carried out on data gathered from the Russian National Corpus (www.ruscorpora.ru). Four hundred and forty-seven examples with the verb prijtis' were annotated manually for several factors and a statistical test (CART) was run. Results demonstrated that no grammatical factor plays a major role in the choice of one aspect over the other. Data for this study can be consulted in files 01 to 03 and include a ReadMe file, the database in .csv format and the code used for the statistical test. An experiment with native speakers was then carried out. A hundred and ten native speakers of Russian were surveyed and asked to evaluate the acceptability of the infinitive in examples with prixodit'sja/prijtis' delat'/sdelat' šag/vid/vybor. The survey presented seventeen examples from the Russian National Corpus, each shown twice: first with the same aspect as in the original version, then with the other aspect. Participants evaluated each case by choosing among "Impossible", "Acceptable" and "Excellent" ratings. They were also allowed to give their opinion about the difference between aspects in each example. A logistic regression with mixed effects was run on the answers. Data for this study can be consulted in the files from 04 to 010 and include a ReadMe file, the text and answers of the questionnaire, the database in .csv, .txt and .pdf formats, and the code used for the statistical test. Results showed that prijtis' often admits both aspects in the infinitive, while prixodit'sja is more restrictive and prefers the imperfective. Overall, "Acceptable" and "Excellent" responses outnumbered "Impossible" responses for both aspects, even when the aspect evaluated did not match the original. Personal opinions showed that the choice of aspect often depends on the meaning the speaker wants to convey. Only in very few cases was the grammatical context considered a constraint on the choice.
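A minimal sketch of a CART classification of the kind described above follows; it assumes a hypothetical file prijtis_annotated.csv with one row per corpus example, the annotated grammatical factors as columns, and an aspect column as the outcome. It only illustrates the approach and is not the code distributed with the thesis.

```python
# Illustrative CART on annotated corpus examples (hypothetical file and columns).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.read_csv("prijtis_annotated.csv")
factors = pd.get_dummies(df.drop(columns=["aspect"]))   # one-hot encode categorical factors
tree = DecisionTreeClassifier(min_samples_leaf=20, random_state=0)
tree.fit(factors, df["aspect"])
print(export_text(tree, feature_names=list(factors.columns)))
```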
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The datasets contain the raw data and preprocessed data (following the steps in the Jupyter Notebook) of 9 DHT22 sensors in a cold storage room. Details on how the data was gathered can be found in the publication "Self-Adaptive Integration of Distributed Sensor Systems for Monitoring Cold Storage Environments" by Elia Henrichs, Florian Stoll, and Christian Krupitzer.
This dataset consists of the following files:
Offshore wind represents a potentially significant source of low-carbon energy for Canada, and ensuring that relevant, high-quality data and scientifically sound analyses are brought forward into decision-making processes will increase the chances of success for any future deployment of offshore wind in Canada. To support this objective, CanmetENERGY-Ottawa (CE-O), a federal laboratory within Natural Resources Canada (NRCan), completed a preliminary analysis of relevant considerations for offshore wind, with an initial focus on Atlantic Canada. To conduct the analysis, CE-O used geographic information system (GIS) software and methods and engaged with multiple federal government departments to acquire relevant data and obtain insights from subject matter experts on the appropriate use of these data in the context of the analysis. The purpose of this work is to support the identification of candidate regions within Atlantic Canada that could become designated offshore wind energy areas in the future.
The study area for the analysis included the Gulf of St. Lawrence, the western and southern coasts of the island of Newfoundland, and the coastal waters south of Nova Scotia. Twelve input data layers representing various geophysical, ecological, and ocean use considerations were incorporated as part of a multi-criteria analysis (MCA) approach to evaluate the effects of multiple inputs within a consistent framework. Six scenarios were developed which allow for visualization of a range of outcomes according to the influence weighting applied to the different input layers and the suitability scoring applied within each layer.
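For illustration only, the following sketch shows the weighted-overlay arithmetic that such an MCA scenario involves; the layer names, suitability scores and weights are invented placeholders rather than the actual CE-O inputs, which were prepared and combined in GIS software.

```python
# Toy weighted overlay: scenario weights applied to per-layer suitability scores.
# All names and numbers below are placeholders, not project data.
import numpy as np

rng = np.random.default_rng(0)
grid_shape = (100, 100)

# Suitability scores per input layer (0 = excluded, 1-5 = increasingly suitable).
layers = {
    "wind_resource": rng.integers(1, 6, grid_shape),
    "water_depth":   rng.integers(0, 6, grid_shape),
    "fishing_use":   rng.integers(1, 6, grid_shape),
}
# Scenario-specific influence weights, summing to 1.
weights = {"wind_resource": 0.5, "water_depth": 0.3, "fishing_use": 0.2}

suitability = sum(weights[name] * layers[name] for name in layers)
suitability[layers["water_depth"] == 0] = 0   # hard exclusions override the weighted sum
print("mean suitability:", suitability.mean())
```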
This preliminary assessment resulted in the identification of several areas which could be candidates for future designated offshore wind areas, including the areas of the Gulf of St. Lawrence north of Prince Edward Island and west of the island of Newfoundland, and areas surrounding Sable Island. This study is subject to several limitations, namely missing and incomplete data, lack of emphasis on temporal and cumulative effects, and the inherent subjectivity of the scoring scheme applied. Further work is necessary to address data gaps and take ecosystem wide impacts into account before deployment of offshore wind projects in Canada’s coastal waters. Despite these limitations, this study and the data compiled in its preparation can aid in identifying promising locations for further review.
A description of the methodology used to undertake this study is contained in the accompanying report, available at the following link: https://doi.org/10.4095/331855. This report provides in-depth detail on how the data layers were compiled and describes any analysis performed on the data to produce the final data layers in this package.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To inform efforts to improve the discoverability of and access to biomedical datasets by providing a preliminary estimate of the number and type of datasets generated annually by National Institutes of Health (NIH)-funded researchers. Of particular interest is characterizing those datasets that are not deposited in a known data repository or registry, e.g., those for which a related journal article does not indicate that the underlying data have been deposited in a known repository. Such “invisible” datasets comprise the “long tail” of biomedical data and pose significant practical challenges to ongoing efforts to improve discoverability of and access to biomedical research data. This study identified datasets used to support NIH-funded research reported in articles published in 2011, cited in PubMed® and deposited in PubMed Central® (PMC). After searching for all articles that acknowledged NIH support, we first identified articles that contained explicit mention of datasets being deposited in recognized repositories. Thirty members of the NIH staff then analyzed a random sample of the remaining articles to estimate how many and what types of datasets were used per article. Two reviewers independently examined each paper. Each dataset is titled Bigdata_randomsample_xxxx_xx, where xxxx refers to the set of articles the annotator looked at and xx identifies the annotator who did the analysis. Within each dataset, the annotator has listed the number of datasets they identified within the articles they examined. For every dataset that was found, the annotators were asked to insert a new row into the spreadsheet and then describe the dataset (e.g., type of data, subject of study, etc.). Each row in the spreadsheet is prefixed with the PubMed Identifier (PMID) of the article in which the dataset was found. Finally, the files 2013-08-07_Bigdatastudy_dataanalysis, Dataanalysis_ack_si_datasets, and Datasets additional random sample mention vs deposit 20150313 refer to the analysis performed based on each annotator's review of their assigned publications and the data deposits identified in that analysis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Outcome statistics (temper outbursts) for participants, for the preliminary data included in the analysis.
https://data.go.kr/ugs/selectPortalPolicyView.do
This dataset provides data that can be used to analyze the number of applicants and successful applicants for the medical preliminary examination (written position) by university. It provides the year, occupation, round, exam name, foreign university, number of applicants, number of successful applicants, and passing rate from 2013 to 2025 (as of July 22, 2025) in a form that does not identify individuals. This data has the following purposes and uses: 1. It provides basic statistical information on the status of qualification examinations. 2. It can be used to analyze the supply and demand of human resources and the state of education in the medical (preliminary examination) field. 3. It can be used as a basis for public data analysis and private-sector utilization.
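As a small, hypothetical example of the kind of analysis this dataset supports, the snippet below aggregates applicants and successful applicants by year and university and recomputes the passing rate; the file name and the English column names are assumptions.

```python
# Hypothetical aggregation: pass rates by year and university.
import pandas as pd

exams = pd.read_csv("medical_preliminary_exam_by_university.csv")
summary = (exams.groupby(["year", "foreign_university"])
                .agg(applicants=("applicants", "sum"),
                     passed=("successful_applicants", "sum")))
summary["pass_rate"] = summary["passed"] / summary["applicants"]
print(summary.sort_values("pass_rate", ascending=False).head())
```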
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This zip file contains the data file (viral titers from bee extracts) and R code (as a .rmd file) used to conduct the analysis for the manuscript "Preliminary analysis shows that feral and managed honey bees in southern California have similar levels of viral pathogens" by A Geffre, D Travis, J Kohn and J Nieh.
Data on marine benthonic foraminifera were obtained from samples collected in Tracadie Bay during September 1964. The bay is shallow and undergoes wide diurnal and seasonal temperature and pH changes. Samples were collected from the sediment in water depths from 1.82 m to 3.65 m. Temperature and pH of sediments were recorded as soon as they reached the surface. Sediments were mixed with alcohol and Rose Bengal for faunal analysis. The samples collected from 1.82 m were located in tidal marshes at low tide. The foraminifera results were tabulated as percent living as well as percent of total population. Only species presence is included in this OBIS view. Complete information on sediments in the bay and statistical analysis of samples can be found in the original data report (Bartlett, 1965).
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The 2018 TRI preliminary dataset consists of TRI data for 2018. Users should note that while these preliminary data have undergone the basic data quality checks included in the online TRI reporting software, they have not undergone the complete TRI data quality process. In addition, EPA does not aggregate or summarize these data, or offer any analysis or interpretation of them.
You can use the TRI preliminary dataset to:
Identify how many TRI facilities operate in a certain geographic area (for example, a ZIP code);
Identify which chemicals are being managed by TRI facilities and in what quantities; and
Find out if a particular facility initiated any pollution prevention activities in the most recent calendar year.
The agency will update the dataset several times in August and September based on information from facilities. EPA plans to publish the complete, quality-checked 2018 dataset in October 2019, followed by the 2018 TRI National Analysis in January 2020.
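A hedged example of the first two look-ups listed above, using pandas on a downloaded preliminary data file; the file name and the column names ("ZIP", "CHEMICAL", "FACILITY NAME") are assumptions rather than the official TRI schema.

```python
# Assumed file and column names; check the actual TRI preliminary file layout.
import pandas as pd

tri = pd.read_csv("tri_2018_preliminary.csv", dtype={"ZIP": str})

# How many facilities reported from a given ZIP code?
zip_code = "77001"
facilities = tri.loc[tri["ZIP"] == zip_code, "FACILITY NAME"].nunique()
print(f"{facilities} TRI facilities reported in ZIP {zip_code}")

# Which chemicals do they manage, and how many facilities report each?
chem_counts = (tri[tri["ZIP"] == zip_code]
               .groupby("CHEMICAL")["FACILITY NAME"].nunique()
               .sort_values(ascending=False))
print(chem_counts.head())
```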
The FDOT GIS Preliminary Context Classification feature class provides spatial information regarding preliminary context classification on selected Florida roadways. Context classification denotes the criteria for roadway design elements for safer streets that promote safety, economic development, and quality of life. All non-limited access state highways will be evaluated and assigned a current context classification. Limited access facilities are assigned only one code: LA (Limited Access). For growth development and design purposes, a future context classification will also be assigned. The District Complete Streets Coordinator will determine the current and future context classification designations, along with the dates, and coordinate with the District RCI staff to get this information into the RCI database. This information is required for all functionally classified roadways on the State Highway System (SHS). This dataset is maintained by the Transportation Data & Analytics office (TDA). The source spatial data for this hosted feature layer was created on 07/12/2025. For more details, please review the FDOT RCI Handbook. Download Data: enter Guest as the Username to download the source shapefile from here:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was derived by the Bioregional Assessment Programme. This dataset was derived from multiple datasets. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived.
The PAE is derived from the intersection of surface water hydrology features, groundwater management units, mining development leases and/or coal seam gas (CSG) tenements, and directional flows of surface water and groundwater (Barrett et al., 2013). The process of defining the PAE is undertaken via consideration of the vertical and horizontal proximity and scale of surface water and groundwater connectivity pathways and potential for depressurisation or dewatering.
The Preliminary Assessment Extent (PAE) has been derived from the maximum extents of the respective input surface water and groundwater PAEs for the Namoi subregion.
The purpose of the PAE is to provide a first step in the process of determining whether a water-related link is possible between coal resource development and the assets. It is intended to be a realistic yet inclusive estimate of the land surface area where potential impacts might occur. As the model-data analysis, impact analysis and risk analysis components of a BA are completed it will be possible to more closely characterise and quantify impacts in terms of their extent and their likelihood.
The PAE boundary was defined by performing a Union in ESRI ArcGIS of the respective versions of the groundwater and surface water PAEs.
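The original union was performed in ArcGIS; purely as an illustration, a rough open-source equivalent with GeoPandas might look like the sketch below, with placeholder file names for the groundwater and surface water PAE layers.

```python
# Not the original ArcGIS workflow: union the two PAE extents with GeoPandas.
import geopandas as gpd

gw_pae = gpd.read_file("groundwater_pae_namoi.shp")    # placeholder file name
sw_pae = gpd.read_file("surface_water_pae_namoi.shp")  # placeholder file name

# Geometric union of the two extents, dissolved into a single PAE boundary.
combined = gpd.overlay(gw_pae, sw_pae, how="union")
pae = combined.dissolve()
pae.to_file("pae_namoi_combined.shp")
```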
Bioregional Assessment Programme (2014) Preliminary Assessment Extent (PAE) for the Namoi subregion - v04. Bioregional Assessment Derived Dataset. Viewed 11 December 2018, http://data.bioregionalassessments.gov.au/dataset/6b8e415f-03f8-4b2d-a022-f6ef1e9e73a8.
Derived From Groundwater Zone of Impact for the Namoi subregion
Derived From Groundwater Preliminary Assessment Extent for the Namoi subregion
Derived From Natural Resource Management (NRM) Regions 2010
Derived From Surface water Preliminary Assessment Extent (PAE) for the Namoi (NAM) subregion - v03
Derived From GEODATA TOPO 250K Series 3
Derived From Bioregional Assessment areas v03
Derived From Bioregional Assessment areas v01
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From NSW Catchment Management Authority Boundaries 20130917
Derived From Bioregional Assessment areas v02
Derived From Geological Provinces - Full Extent
In spring 1998, studies were initiated to assess the ecological impacts of the invasive non-native species leafy spurge (Euphorbia esula) on rangeland and natural areas in the northern Great Plains, and to evaluate the effectiveness of biocontrol insects released to reduce spurge populations. The primary objectives of the study were: (1) to conduct a vegetation survey to describe the composition and relative abundance of plant species on study sites, determine species dominance within the community and construct initial diversity indices such as species richness, (2) to determine the density and stage distribution of leafy spurge on permanent plots, and (3) to determine the abundance and distribution of biocontrol insects. Overall diversity tended to be relatively low, with all sites averaging less than nine species per 1 square-meter plot. All sites were dominated by introduced species, with native forbs and grasses occupying less important roles in the community structure. In 1999, we will determine the extent to which the seed banks at the sites reflect the current abundance of native and exotic species.
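As a small illustration of the diversity indices mentioned above, the sketch below computes per-plot species richness and a Shannon index from a made-up cover table; the column names and values are hypothetical, not the study's data.

```python
# Toy example: richness and Shannon diversity per plot from abundance records.
import numpy as np
import pandas as pd

cover = pd.DataFrame({
    "plot":      ["A", "A", "A", "B", "B"],
    "species":   ["Euphorbia esula", "Bromus inermis", "Poa pratensis",
                  "Euphorbia esula", "Bromus inermis"],
    "abundance": [40, 30, 10, 70, 20],
})

def shannon(counts):
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

summary = cover.groupby("plot")["abundance"].agg(richness="count", shannon=shannon)
print(summary)
```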
https://spdx.org/licenses/CC0-1.0.html
Dance practitioners are a group in urgent need of attention. A review of the literature shows problems with the professional status of dance practitioners: low and unstable financial income, limited professional development, multiple pressures and role conflicts, issues of social class and social acceptance, and wider impacts on the discipline of dance. Although dance practitioners are the core group driving the development of the professional field of dance, their contributions and returns do not match, which largely frustrates their career expectations and sense of achievement and leads to job burnout. This not only leads to the loss of talent within the industry, but also affects the current and future development of the professional field of dance. Using an inductive approach, this study conducted semi-structured interviews with 10 qualified dance practitioners in Chinese universities, compared the cultural capital they accumulated early on with the actual economic and social capital gained later, and investigated the current state of cultural capital transformation among dance practitioners. Second, it asked what challenges or obstacles dance practitioners face in the transformation of cultural capital, and why they have been able to hold on to their dance careers in the face of difficulties. These stories show that transforming the cultural capital accumulated by dance practitioners into economic capital is difficult, while the transformation into social capital is more pronounced. Dance practitioners' love of dance and their spiritual values are the main motivators that help them overcome challenges and obstacles. They now face the dual challenges of physical and mental health and job burnout. This study considers implications for future research and practical applications, and hopes to draw attention to the health and safety of dance practitioners and to provide relevant supporting material.
Methods: Data Collection
The teacher interviews were concentrated in June and July 2024. The interviews were planned as semi-structured sessions in a library discussion room at the University of Edinburgh, UK, but because of geographic and time differences all interviews were conducted online through Teams, scheduled around the participants' availability. Each interview lasted 45 to 60 minutes. To make the interviews run more smoothly, each participant had access to the interview questions in advance. Each interview was video- and audio-recorded and transcribed into an interview script, and the key points the interviewer emphasized to the participant were marked and noted. The semi-structured interviews covered four themes: (1) cultural capital accumulation; (2) the transformation of cultural capital into economic and social capital; (3) challenges faced and coping strategies; (4) reasons for persistence and future prospects. The purpose of the interviews was to understand whether the cultural capital of dance practitioners (college dance teachers), such as professional dance knowledge, academic background, and accumulated professional certificates and honours, can be transformed into actual economic and social capital; how that transformation actually plays out and what challenges and obstacles arise in the process; and why, in the face of difficulties and pressure, they chose to stick to their dance careers.
During the interviews, I asked the participants 23 questions, including open, closed and guiding questions. After a participant described specific events and feelings, the interviewer would summarize what had been said in a general way, for example: "So what you are saying is...?" or "So you think...", to ensure the data were accurate. In addition, the interviewer focused primarily on open-ended questions. When an interviewee was unable to answer an in-depth question further, the interviewer would guide them to a certain extent while keeping deliberate steering and intervention to a minimum, for example: "You just shared... can you tell me more?" or "Based on your personal experience, what do you think is the cause of...?" While it is helpful to have a basic interview guide, it is also important for interviewers to "actively listen and move the interview forward as much as possible by building on what the participants have already begun to share" (Seidman, 2013). All participants' data are kept in a university OneDrive account and can only be shared between the author and the tutor. All data will be destroyed within 30 days of the completion of the paper.
Data Analysis
In my data analysis, I used a thematic analysis approach. Thematic analysis is a method of identifying and recording relevant patterns in qualitative data that, despite the variety of specific approaches, generally follows a process from coding the data to reporting and discussing the analytical themes. By extracting statements from large amounts of qualitative data, thematic analysis can make the analysis coherent and transparent to the reader, and thus strongly supports it. Thematic analysis consists of six steps: familiarization with the data, preliminary coding, searching for themes, reviewing themes, defining and naming themes, and finally writing the report (Braun & Clarke, 2006; Miles & Huberman). Each participant's interview was audio-recorded and transcribed. First, I read each participant's data and paid special attention to the distinctiveness of the data during the second level of coding. To facilitate preliminary coding, interview content irrelevant to the research questions was excluded, and single sentences were collated into complete paragraphs matched to each interview question. Merriam (2009) describes coding as the process of the researcher reading the data, noting interesting, potentially relevant or important parts, and holding a conversation with the data through questions and comments. In this study I adopted open coding, which implies maintaining an open mind while coding. In the first-level coding, I coded the participants' statements directly, marking the original paragraphs or sentences that fully fit the research questions; in the second-level coding, I read the complete data and summarized words and phrases next to the text (Merriam, 2009). Merriam (2009) notes that "data analysis is a complex process involving repeated switching between concrete data and abstract concepts, between inductive and deductive reasoning, and between description and interpretation" (p. 176). I therefore moved back and forth between data fragments, descriptions and interpretations, looking for common threads among these themes (Fraser, 2004). Because the data were coded openly, 53 secondary codes were generated, which posed challenges for the subsequent definition and naming of themes.
By analysing the common threads in this content, I then condensed the secondary codes into seven tertiary themes that directly address, define and name the research questions. The fourth-level codes correspond to the four questions of this study, each drawing on the tertiary themes to answer one of the study's questions.
It's been two years since the news that Canada has legalized weed hit us, so I thought: why don't we grab a dataset from Kaggle to practice a bit of data analysis? To my surprise, I couldn't find a weed dataset that reflects the economics behind legalized weed and how it has changed over time. So I went to the Canadian government's open data site, and voilà, they have CSV files on exactly what I wanted floating around on their website. All I did was download them straight up, and here I am sharing them with the community.
We have a series of CSV files, each containing data about things like supply, use case, production, etc. But before we go into the individual files, there are a few data columns that are common to all CSV files.
Understanding metadata files:
Cube Title: The title of the table. The output files are unilingual and thus will contain either the English or French title.
Product Id (PID): The unique 8 digit product identifier for the table.
CANSIM Id: The ID number which formally identified the table in CANSIM. (where applicable)
URL: The URL for the representative (default) view of a given data table.
Cube Notes: Each note is assigned a unique number. This field indicates which notes, if any, are applied to the entire table.
Archive Status: Describes the status of a table as either 'Current' or 'Archived'. Archived tables are those that are no longer updated.
Frequency: Frequency of the table. (i.e. annual)
Start Reference Period: The starting reference period for the table.
End Reference Period: The end reference period for the table.
Total Number of Dimensions: The total number of dimensions contained in the table.
Dimension Name: The name of a dimension in a table. There can be up to 10 dimensions in a table. (i.e. – Geography)
Dimension ID: The reference code assigned to a dimension in a table. A unique reference Dimension ID code is assigned to each dimension in a table.
Dimension Notes: Each note is assigned a unique number. This field indicates which notes are applied to a particular dimension.
Dimension Definitions: Reserved for future development.
Member Name: The textual description of the members in a dimension. (i.e. – Nova Scotia, Ontario (members of the Geography dimension))
Member ID: The code assigned to a member of a dimension. There is a unique ID for each member within a dimension. These IDs are used to create the coordinate field in the data file. (see the 'coordinate' field in the data record layout).
Classification (where applicable): Classification code for a member; see Definitions, data sources and methods.
Parent Member ID: The code used to display the hierarchical relationship between members in a dimension. (i.e. – The member Ontario (5) is a child of the member Canada (1) in the dimension 'Geography')
Terminated: Indicates whether a member has been terminated or not. Terminated members are those that are no longer updated.
Member Notes: Each note is assigned a unique number. This field indicates which notes are applied to each member.
Member definitions: Reserved for future development.
Symbol Legend: The symbol legend provides descriptions of the various symbols which can appear in a table. This field describes a comprehensive list of all possible symbols, regardless of whether a selected symbol appears in a particular table.
Survey Code: The unique code associated with a survey or program from which the data in the table is derived. Data displayed in one table may be derived ...
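To show how the fields described above fit together, the sketch below decodes a coordinate (a dot-separated list of Member IDs, one per dimension) back into member names using the Dimension ID, Member ID and Member Name fields; the tiny table is a made-up illustration, not the contents of any real Statistics Canada table.

```python
# Made-up metadata table used only to illustrate the coordinate/member relationship.
import pandas as pd

members = pd.DataFrame({
    "Dimension ID": [1, 1, 2, 2],
    "Member ID":    [1, 5, 1, 2],
    "Member Name":  ["Canada", "Ontario", "Dried cannabis", "Cannabis oil"],
})
lookup = {(d, m): name for d, m, name in
          members[["Dimension ID", "Member ID", "Member Name"]].itertuples(index=False)}

def decode(coordinate: str) -> list:
    """Translate a coordinate such as '5.2' into one member name per dimension."""
    ids = [int(x) for x in coordinate.split(".")]
    return [lookup[(dim, member_id)] for dim, member_id in enumerate(ids, start=1)]

print(decode("5.2"))   # ['Ontario', 'Cannabis oil']
```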
https://borealisdata.ca/api/datasets/:persistentId/versions/3.0/customlicense?persistentId=doi:10.5683/SP3/DX005L
One of the lines of research of Sustaining the Knowledge Commons (SKC) is a longitudinal study of the minority (about a third) of fully open access journals that use this business model. The original idea was to gather data during an annual two-week census period. The volume of data and growth in this area makes this an impractical goal. For this reason, we are posting this preliminary dataset in case it might be helpful to others working in this area. Future data gathering and analysis will be conducted on an ongoing basis.
Major sources of data for this dataset include:
• the Directory of Open Access Journals (DOAJ) downloadable metadata; the base set is from May 2014, with some additional data from the 2015 dataset
• data on publisher article processing charges and related information gathered from publisher websites by the SKC team in 2015, 2014 (Morrison, Salhab, Calvé-Genest & Horava, 2015) and a 2013 pilot
• DOAJ article content data screen scraped from DOAJ (caution: this data can be quite misleading due to limitations with article-level metadata)
• subject analysis based on DOAJ subject metadata in 2014 for selected journals
• data on APCs gathered in 2010 by Solomon and Björk (supplied by the authors). Note that Solomon and Björk use a different method of calculating APCs, so the numbers are not directly comparable.
• Note that this full dataset includes some working columns which are meaningful only by means of explaining very specific calculations which are not necessarily evident in the dataset per se. Details below.
Significant limitations:
• This dataset does not include new journals added to DOAJ in 2015. A recent publisher size analysis indicates some significant changes. For example, DeGruyter, not listed in the 2014 survey, is now the third largest DOAJ publisher with over 200 titles. Elsevier is now the 7th largest DOAJ publisher. In both cases, gathering data from the publisher websites will be time-consuming as it is necessary to conduct individual title look-ups.
• Some OA APC data for newly added journals was gathered in May 2015 but has not yet been added to this dataset. One of the reasons for gathering this data is a comparison of the DOAJ "one price listed" approach with potentially richer data on the publisher's own website.
For full details see the documentation.