Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GENERAL INFORMATION
Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation
Date of data collection: January to March 2022
Collection instrument: SurveyMonkey
Funding: Alfred P. Sloan Foundation
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license
Links to publications that cite or use the data:
Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437
Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266
DATA & FILE OVERVIEW
File List
Additional related data collected that was not included in the current data package: open-ended questions asked of respondents
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data:
The development of the questionnaire (Gregory et al., 2022) centered on creating two main branches of questions for the primary groups of interest in our study: researchers who reuse data (33 questions in total) and researchers who do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.
We received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final dataset contains 2,492 complete responses, an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails, and opt-outs (n = 5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
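A worked sketch of the response-rate arithmetic above may help readers reproduce the corrected and uncorrected figures. The number of invitations sent is not stated in this README, so it appears only as a placeholder here.

```python
# Response-rate arithmetic sketch; the invitation count is NOT given in this
# README and must be replaced with the actual number before use.
invited = 158_000               # placeholder -- not stated in this README
excluded_contacts = 5_201       # invalid emails, bounced emails, and opt-outs
retained = 2_492                # complete responses in the final dataset

uncorrected = retained / invited                          # README reports 1.57%
corrected = retained / (invited - excluded_contacts)      # README reports 1.62%
print(f"uncorrected: {uncorrected:.2%}, corrected: {corrected:.2%}")
```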
Methods for processing the data:
Results were downloaded from SurveyMonkey in CSV format and prepared for analysis in Excel and SPSS by recoding ordinal and multiple-choice questions and by removing missing values.
Instrument- or software-specific information needed to interpret the data:
The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded CSV format. The codebook is required to interpret the values.
DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata
Number of variables: 94
Number of cases/rows: 2,492
Missing data codes: 999 = Not asked
Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
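A minimal loading sketch for reuse, assuming the SPSS file carries the dataset name given above (the exact filename may differ) and that pandas with pyreadstat is installed:

```python
import pandas as pd

# Keep numeric codes so the 999 "Not asked" code stays visible.
df = pd.read_spss("MDCDataCitationReuse2021surveydata.sav", convert_categoricals=False)

# Per this README, 999 codes "Not asked"; treat it as missing before analysis.
df = df.replace(999, pd.NA)

print(df.shape)                                           # README: 2,492 rows, 94 variables
print(df.isna().sum().sort_values(ascending=False).head())
```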
This dataset originates from a series of experimental studies titled “Tough on People, Tolerant to AI? Differential Effects of Human vs. AI Unfairness on Trust”. The project investigates how individuals respond to unfair behavior (distributive, procedural, and interactional unfairness) enacted by artificial intelligence versus human agents, and how such behavior affects cognitive and affective trust.

1. Experiment 1a: The Impact of AI vs. Human Distributive Unfairness on Trust
Overview: This dataset comes from an experimental study aimed at examining how individuals respond in terms of cognitive and affective trust when distributive unfairness is enacted by either an artificial intelligence (AI) agent or a human decision-maker. Experiment 1a specifically focuses on the main effect of the “type of decision-maker” on trust.
Data Generation and Processing: The data were collected through Credamo, an online survey platform. Initially, 98 responses were gathered from students at a university in China. Additional student participants were recruited via Credamo to supplement the sample. Attention check items were embedded in the questionnaire, and participants who failed were automatically excluded in real time. Data collection continued until 202 valid responses were obtained. SPSS software was used for data cleaning and analysis.
Data Structure and Format: The data file is named “Experiment1a.sav” and is in SPSS format. It contains 28 columns and 202 rows, where each row corresponds to one participant. Columns represent measured variables, including: grouping and randomization variables, one manipulation check item, four items measuring distributive fairness perception, six items on cognitive trust, five items on affective trust, three items for honesty checks, and four demographic variables (gender, age, education, and grade level). The final three columns contain computed means for distributive fairness, cognitive trust, and affective trust.
Additional Information: No missing data are present. All variable names are labeled in English abbreviations to facilitate further analysis. The dataset can be opened directly in SPSS or exported to other formats.

2. Experiment 1b: The Mediating Role of Perceived Ability and Benevolence (Distributive Unfairness)
Overview: This dataset originates from an experimental study designed to replicate the findings of Experiment 1a and further examine the potential mediating roles of perceived ability and perceived benevolence.
Data Generation and Processing: Participants were recruited via the Credamo online platform. Attention check items were embedded in the survey to ensure data quality. Data were collected using a rolling recruitment method, with invalid responses removed in real time. A total of 228 valid responses were obtained.
Data Structure and Format: The dataset is stored in a file named Experiment1b.sav in SPSS format and can be opened directly in SPSS software. It consists of 228 rows and 40 columns. Each row represents one participant’s data record, and each column corresponds to a different measured variable. Specifically, the dataset includes: random assignment and grouping variables; one manipulation check item; four items measuring perceived distributive fairness; six items on perceived ability; five items on perceived benevolence; six items on cognitive trust; five items on affective trust; three items for attention checks; and three demographic variables (gender, age, and education).
The last five columns contain the computed mean scores for perceived distributive fairness, ability, benevolence, cognitive trust, and affective trust.
Additional Notes: There are no missing values in the dataset. All variables are labeled using standardized English abbreviations to facilitate reuse and secondary analysis. The file can be analyzed directly in SPSS or exported to other formats as needed.

3. Experiment 2a: Differential Effects of AI vs. Human Procedural Unfairness on Trust
Overview: This dataset originates from an experimental study aimed at examining whether individuals respond differently in terms of cognitive and affective trust when procedural unfairness is enacted by artificial intelligence versus human decision-makers. Experiment 2a focuses on the main effect of the decision agent on trust outcomes.
Data Generation and Processing: Participants were recruited via the Credamo online survey platform from two universities located in different regions of China. A total of 227 responses were collected. After excluding those who failed the attention check items, 204 valid responses were retained for analysis. Data were processed and analyzed using SPSS software.
Data Structure and Format: The dataset is stored in a file named Experiment2a.sav in SPSS format and can be opened directly in SPSS software. It contains 204 rows and 30 columns. Each row represents one participant’s response record, while each column corresponds to a specific variable. Variables include: random assignment and grouping; one manipulation check item; seven items measuring perceived procedural fairness; six items on cognitive trust; five items on affective trust; three attention check items; and three demographic variables (gender, age, and education). The final three columns contain computed average scores for procedural fairness, cognitive trust, and affective trust.
Additional Notes: The dataset contains no missing values. All variables are labeled using standardized English abbreviations to facilitate reuse and secondary analysis. The file can be analyzed directly in SPSS or exported to other formats as needed.

4. Experiment 2b: Mediating Role of Perceived Ability and Benevolence (Procedural Unfairness)
Overview: This dataset comes from an experimental study designed to replicate the findings of Experiment 2a and to further examine the potential mediating roles of perceived ability and perceived benevolence in shaping trust responses under procedural unfairness.
Data Generation and Processing: Participants were working adults recruited through the Credamo online platform. A rolling data collection strategy was used, and responses failing attention checks were excluded in real time. The final dataset includes 235 valid responses. All data were processed and analyzed using SPSS software.
Data Structure and Format: The dataset is stored in a file named Experiment2b.sav, which is in SPSS format and can be opened directly using SPSS software. It contains 235 rows and 43 columns. Each row corresponds to a single participant, and each column represents a specific measured variable. These include: random assignment and group labels; one manipulation check item; seven items measuring procedural fairness; six items for perceived ability; five items for perceived benevolence; six items for cognitive trust; five items for affective trust; three attention check items; and three demographic variables (gender, age, education).
The final five columns contain the computed average scores for procedural fairness, perceived ability, perceived benevolence, cognitive trust, and affective trust.
Additional Notes: There are no missing values in the dataset. All variables are labeled using standardized English abbreviations to support future reuse and secondary analysis. The dataset can be analyzed directly in SPSS and easily converted into other formats if needed.

5. Experiment 3a: Effects of AI vs. Human Interactional Unfairness on Trust
Overview: This dataset comes from an experimental study that investigates how interactional unfairness, when enacted by either artificial intelligence or human decision-makers, influences individuals’ cognitive and affective trust. Experiment 3a focuses on the main effect of the “decision-maker type” under interactional unfairness conditions.
Data Generation and Processing: Participants were college students recruited from two universities in different regions of China through the Credamo survey platform. After excluding responses that failed attention checks, a total of 203 valid cases were retained from an initial pool of 223 responses. All data were processed and analyzed using SPSS software.
Data Structure and Format: The dataset is stored in the file named Experiment3a.sav, in SPSS format and compatible with SPSS software. It contains 203 rows and 27 columns. Each row represents a single participant, while each column corresponds to a specific measured variable. These include: random assignment and condition labels; one manipulation check item; four items measuring interactional fairness perception; six items for cognitive trust; five items for affective trust; three attention check items; and three demographic variables (gender, age, education). The final three columns contain computed average scores for interactional fairness, cognitive trust, and affective trust.
Additional Notes: There are no missing values in the dataset. All variable names are provided using standardized English abbreviations to facilitate secondary analysis. The data can be analyzed directly using SPSS and exported to other formats as needed.

6. Experiment 3b: The Mediating Role of Perceived Ability and Benevolence (Interactional Unfairness)
Overview: This dataset comes from an experimental study designed to replicate the findings of Experiment 3a and further examine the potential mediating roles of perceived ability and perceived benevolence under conditions of interactional unfairness.
Data Generation and Processing: Participants were working adults recruited via the Credamo platform. Attention check questions were embedded in the survey, and responses that failed these checks were excluded in real time. Data collection proceeded in a rolling manner until a total of 227 valid responses were obtained. All data were processed and analyzed using SPSS software.
Data Structure and Format: The dataset is stored in the file named Experiment3b.sav, in SPSS format and compatible with SPSS software. It includes 227 rows and
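Since each experiment file ends with computed mean columns for its scales, a short sketch of how those means can be reproduced or checked may be useful. The item names below (e.g., "CT1"-"CT6", "AT1"-"AT5") are assumptions and should be checked against the variable labels in the .sav files.

```python
import pandas as pd

# Experiment 1a as an example; requires pyreadstat. Item names are assumed.
exp1a = pd.read_spss("Experiment1a.sav", convert_categoricals=False)

ct_items = [f"CT{i}" for i in range(1, 7)]   # six cognitive-trust items (assumed names)
at_items = [f"AT{i}" for i in range(1, 6)]   # five affective-trust items (assumed names)

# Recompute the scale means and compare with the precomputed columns.
exp1a["CT_mean_check"] = exp1a[ct_items].mean(axis=1)
exp1a["AT_mean_check"] = exp1a[at_items].mean(axis=1)
```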
Data from: Doctoral dissertation; Preprint article entitled: Managers' and physicians’ perception of palm vein technology adoption in the healthcare industry.
Formats of the files associated with the dataset: CSV; SAV.
SPSS setup files can be used to generate native SPSS file formats such as SPSS system files and SPSS portable files. SPSS setup files generally include the following SPSS sections:
DATA LIST: Assigns the name, type, decimal specification (if any), and specifies the beginning and ending column locations for each variable in the data file. Users must replace the "physical-filename" with host computer-specific input file specifications. For example, users on Windows platforms should replace "physical-filename" with "C:\06512-0001-Data.txt" for the data file named "06512-0001-Data.txt" located on the root directory "C:\".
VARIABLE LABELS: Assigns descriptive labels to all variables. Variable labels and variable names may be identical for some variables.
VALUE LABELS: Assigns descriptive labels to codes in the data file. Not all variables necessarily have assigned value labels.
MISSING VALUES: Declares user-defined missing values. Not all variables in the data file necessarily have user-defined missing values. These values can be treated specially in data transformations, statistical calculations, and case selection.
MISSING VALUE RECODE: Sets user-defined numeric missing values to missing as interpreted by the SPSS system. Only variables with user-defined missing values are included in the statements.
ABSTRACT: The purpose of the article is to examine the factors that influence the adoption of palm vein technology by considering healthcare managers’ and physicians’ perceptions, using the Unified Theory of Acceptance and Use of Technology as the theoretical foundation. A quantitative approach with an exploratory research design was used. A cross-sectional questionnaire was distributed to respondents who were managers and physicians in the healthcare industry and who had previous experience with palm vein technology. The perceived factors tested for correlation with adoption were perceived usefulness, complexity, security, peer influence, and relative advantage. A Pearson product-moment correlation coefficient was used to test the correlation between the perceived factors and palm vein technology adoption. The results showed that perceived usefulness, security, and peer influence are important factors for adoption. Study limitations included purposive sampling from a single industry (healthcare) and the limited literature available on managers’ and physicians’ perception of palm vein technology adoption in the healthcare industry. Future studies could examine the impact of mediating variables on palm vein technology adoption. The study offers managers insight into the important factors that need to be considered in adopting palm vein technology. With biometric technology becoming pervasive, the study seeks to provide managers with insight into managing the adoption of palm vein technology.
KEYWORDS: biometrics, human identification, image recognition, palm vein authentication, technology adoption, user acceptance, palm vein technology
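For readers working outside SPSS, the steps an SPSS setup file automates (DATA LIST, VALUE LABELS, MISSING VALUES) can be mirrored in pandas. The column positions, variable names, and codes below are hypothetical; the real ones come from the study's setup file sections.

```python
import pandas as pd

# DATA LIST equivalent: read fixed-width columns from the raw text file.
colspecs = [(0, 4), (4, 6), (6, 7)]          # hypothetical byte ranges
names = ["CASEID", "USEFULNESS", "ADOPT"]    # hypothetical variable names
df = pd.read_fwf("06512-0001-Data.txt", colspecs=colspecs, names=names)

# MISSING VALUES / MISSING VALUE RECODE equivalent: map user-defined missing
# codes (e.g., 9) to system-missing values.
df["USEFULNESS"] = df["USEFULNESS"].replace({9: pd.NA})

# VALUE LABELS equivalent: attach descriptive labels to numeric codes.
adopt_labels = {1: "Adopted", 0: "Not adopted"}   # hypothetical labels
df["ADOPT_label"] = df["ADOPT"].map(adopt_labels)
```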
As the UK went into the first lockdown of the COVID-19 pandemic, the team behind the biggest social survey in the UK, Understanding Society (UKHLS), developed a way to capture these experiences. From April 2020, participants from this Study were asked to take part in the Understanding Society COVID-19 survey, henceforth referred to as the COVID-19 survey or the COVID-19 study.
The COVID-19 survey regularly asked people about their situation and experiences. The resulting data gives a unique insight into the impact of the pandemic on individuals, families, and communities. The COVID-19 Teaching Dataset contains data from the main COVID-19 survey in a simplified form. It covers topics such as
The resource contains two data files:
Key features of the dataset
A full list of variables in both files can be found in the User Guide appendix.
Who is in the sample?
All adults (aged 16 and over as of April 2020) in households that had participated in at least one of the last two waves of the main Understanding Society study were invited to participate in this survey. From the September 2020 (Wave 5) survey onwards, only sample members who had completed at least one partial interview in any of the first four web surveys were invited to participate. From the November 2020 (Wave 6) survey onwards, those who had only completed the initial survey in April 2020 and none since were no longer invited to participate.
The User guide accompanying the data adds to the information here and includes a full variable list with details of measurement levels and links to the relevant questionnaire.
https://www.marketreportanalytics.com/privacy-policy
The Structural Equation Modeling (SEM) software market is experiencing robust growth, driven by increasing adoption across diverse sectors such as education, healthcare, and the social sciences. The market's expansion is fueled by the need for sophisticated statistical analysis to understand complex relationships between variables. Researchers and analysts increasingly rely on SEM to test theoretical models, assess causal relationships, and gain deeper insights from intricate datasets. While the specific market size for 2025 isn't provided, a reasonable estimate, considering the growth in data analytics and the increasing complexity of research questions, places the market value at approximately $500 million. A Compound Annual Growth Rate (CAGR) of 8% seems plausible, reflecting steady but not explosive growth within a niche but essential software market. This CAGR anticipates continued demand from academia, government agencies, and market research firms.
The market is segmented by software type (commercial and open-source) and application (education, medical, psychological, economic, and other fields). Commercial software currently dominates the market because of its advanced features and professional support; however, the open-source segment shows strong growth potential, particularly within academic settings and among researchers with limited budgets. The competitive landscape is relatively concentrated, with established players such as LISREL, IBM SPSS Amos, and Mplus offering comprehensive solutions. However, the emergence of open-source packages such as lavaan (R) and semopy (Python) demonstrates an ongoing shift toward flexible, programmable SEM software, potentially increasing market competition and innovation in the years to come. Geographically, North America and Europe currently hold the largest market share, with Asia-Pacific emerging as a key growth region due to increasing research funding and investment in data science capabilities.
Sustained growth of the SEM software market is expected throughout the forecast period (2025-2033), largely driven by the rising adoption of advanced analytical techniques in research and business. Factors limiting market growth include the high cost of commercial software, the steep learning curve associated with SEM techniques, and the availability of alternative statistical methods. However, more user-friendly software interfaces, alongside the growing availability of online training and resources, are expected to mitigate these restraints and expand the market's reach to a broader audience. Continued innovation in SEM software, focusing on improved usability and the incorporation of advanced features such as handling of missing data and multilevel modeling, will contribute significantly to the market's future trajectory. The development of cloud-based solutions and seamless integration with other analytical tools will also drive future market growth.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de456864
Abstract (en): The purpose of this data collection is to provide an official public record of the business of the federal courts. The data originate from 94 district and 12 appellate court offices throughout the United States. Information was obtained at two points in the life of a case: filing and termination. The termination data contain information on both filings and terminations, while the pending data contain only filing information. For the appellate and civil data, the unit of analysis is a single case. The unit of analysis for the criminal data is a single defendant. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: performed consistency checks; standardized missing values; checked for undocumented or out-of-range codes.
Universe: All federal court cases, 1970-2000.
2012-05-22 All parts are being moved to restricted access and will be available only using the restricted access procedures.
2005-04-29 The codebook files in Parts 57, 94, and 95 have undergone minor edits and been incorporated with their respective datasets. The SAS files in Parts 90, 91, 227, and 229-231 have undergone minor edits and been incorporated with their respective datasets. The SPSS files in Parts 92, 93, 226, and 228 have undergone minor edits and been incorporated with their respective datasets. Parts 15-28, 34-56, 61-66, 70-75, 82-89, 96-105, 107, 108, and 115-121 have had identifying information removed from the public use file, and restricted data files that still include that information have been created. These parts have had their SPSS, SAS, and PDF codebook files updated to reflect the change. The data, SPSS, and SAS files for Parts 34-37 have been updated from OSIRIS to LRECL format. The codebook files for Parts 109-113 have been updated. The case counts for Parts 61-66 and 71-75 have been corrected in the study description. The LRECL for Parts 82, 100-102, and 105 have been corrected in the study description.
2003-04-03 A codebook was created for Part 105, Civil Pending, 1997. Parts 232-233, SAS and SPSS setup files for Civil Data, 1996-1997, were removed from the collection since the civil data files for those years have corresponding SAS and SPSS setup files.
2002-04-25 Criminal data files for Parts 109-113 have all been replaced with updated files. The updated files contain Criminal Terminations and Criminal Pending data in one file for the years 1996-2000. Part 114, originally Criminal Pending 2000, has been removed from the study and the 2000 pending data are now included in Part 113.
2001-08-13 The following data files were revised to include plaintiff and defendant information: Appellate Terminations, 2000 (Part 107), Appellate Pending, 2000 (Part 108), Civil Terminations, 1996-2000 (Parts 103, 104, 115-117), and Civil Pending, 2000 (Part 118). The corresponding SAS and SPSS setup files and PDF codebooks have also been edited.
2001-04-12 Criminal Terminations (Parts 109-113) data for 1996-2000 and Criminal Pending (Part 114) data for 2000 have been added to the data collection, along with corresponding SAS and SPSS setup files and PDF codebooks.
2001-03-26 Appellate Terminations (Part 107) and Appellate Pending (Part 108) data for 2000 have been added to the data collection, along with corresponding SAS and SPSS setup files and PDF codebooks.
1997-07-16 The data for 18 of the Criminal Data files were matched to the wrong part numbers and names and have now been corrected.
Funding institution(s): United States Department of Justice. Office of Justice Programs. Bureau of Justice Statistics.
(1) Several, but not all, of these record counts include a final blank record. Researchers may want to detect this occurrence and eliminate this record before analysis. (2) In July 1984, a major change in the recording and disposition of an appeal occurred, and several data fields dealing with disposition were restructured or replaced. The new structure more clearly delineates mutually exclusive dispositions. Researchers must exercise care in using these fields for comparisons. (3) In 1992, the Administrative Office of the United States Courts changed the reporting period for statistical data. Up to 1992, the reporting period...
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de456254
Abstract (en): This round of Euro-Barometer surveys queried respondents on standard Euro-Barometer measures such as public awareness of and attitudes toward the Common Market and the European Union (EU), and focused on perceptions about and factors affecting blood and plasma donation. Questions solicited opinions about the way blood and plasma are collected and handled, reasons for donating, understanding of the differences between blood and plasma, the necessity of rewards for donating, and sources of information about blood or plasma donation. Respondents were also surveyed about their perceptions of product quality based on country of manufacture, cross-border purchases and customs experiences, a single European currency, women's opinions on EU matters, tobacco smoking habits, AIDS risks, and perceived cancer risks of food products. On EU matters, respondents were asked how well-informed they felt about the EU, what sources of information about the EU they used, whether their country had benefited from being an EU member, and the extent of their personal interest in EU matters. This survey also includes respondent opinions and party preferences for the June 1994 European elections. Demographic and other background information was gathered on number of people residing in the home, size of locality, home ownership, trade union membership, region of residence, and occupation of the head of household, as well as the respondent's age, sex, marital status, education, occupation, work sector, religion, religiosity, subjective social class, left-right political self-placement, and opinion leadership. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: performed consistency checks; standardized missing values; performed recodes and/or calculated derived variables; checked for undocumented or out-of-range codes.
Universe: Persons aged 15 and over residing in the 12 member nations of the European Union: Belgium, Denmark, France, Germany, Greece, Ireland, Italy, Luxembourg, the Netherlands, Portugal, Spain, and the United Kingdom, as well as in Norway and Finland. Multistage national probability samples.
2005-11-04 On 2005-03-14 new files were added to one or more datasets. These files included additional setup files as well as one or more of the following: SAS program, SAS transport, SPSS portable, and Stata system files. The metadata record was revised 2005-11-04 to reflect these additions.
1998-02-26 The data have been revised and additional standardization of missing data codes was performed. The SPSS data definition statements were also revised. In addition, SAS data definition statements were added to the collection, as well as a full machine-readable codebook with bivariate frequencies, and the data collection instrument is now available as a PDF file.
The data collection instrument is provided as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background and purpose: Hydrocephalus is a frequent complication following subarachnoid hemorrhage. Few studies have investigated the association between laboratory parameters and shunt-dependent hydrocephalus. This study aimed to investigate the variations of laboratory parameters after subarachnoid hemorrhage. We also attempted to identify laboratory parameters predictive of shunt-dependent hydrocephalus.
Methods: Multiple imputation was performed to fill the missing laboratory data using Bayesian methods in SPSS. We used univariate and multivariate Cox regression analyses to calculate hazard ratios for shunt-dependent hydrocephalus based on clinical and laboratory factors. The area under the receiver operating characteristic curve was used to determine the laboratory risk values predicting shunt-dependent hydrocephalus.
Results: We included 181 participants with a mean age of 54.4 years. Higher sodium (hazard ratio, 1.53; 95% confidence interval, 1.13–2.07; p = 0.005), lower potassium, and higher glucose levels were associated with higher rates of shunt-dependent hydrocephalus. The receiver operating characteristic curve analysis showed that the areas under the curve for sodium, potassium, and glucose were 0.649 (cutoff value, 142.75 mEq/L), 0.609 (cutoff value, 3.04 mmol/L), and 0.664 (cutoff value, 140.51 mg/dL), respectively.
Conclusions: Despite the exploratory nature of this study, we found that higher sodium, lower potassium, and higher glucose levels were predictive of shunt-dependent hydrocephalus from postoperative day (POD) 1 to POD 12–16 after subarachnoid hemorrhage. Strict correction of electrolyte imbalance seems necessary to reduce shunt-dependent hydrocephalus. Further large studies are warranted to confirm our findings.
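For readers who want to see how cutoff values like those reported above can be derived from an ROC curve, here is a minimal sketch. The use of Youden's J and the illustrative data are assumptions, not the authors' documented procedure.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def youden_cutoff(y, x):
    """Return the threshold maximizing Youden's J and the AUC for predictor x."""
    fpr, tpr, thresholds = roc_curve(y, x)
    j = tpr - fpr                      # Youden's J at each candidate threshold
    best = np.argmax(j)
    return thresholds[best], roc_auc_score(y, x)

# Illustrative, made-up data only: y = shunt dependence (0/1), x = sodium level.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 181)
x = 140 + rng.normal(0, 3, 181) + 1.5 * y
cutoff, auc = youden_cutoff(y, x)
print(f"cutoff = {cutoff:.2f}, AUC = {auc:.3f}")
```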
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de439683
Abstract (en): This survey was undertaken to assemble a broad range of family, household, employment, schooling, and welfare data on families living in urban poverty areas of Chicago. The researchers were seeking to test a variety of theories about urban poverty. Questions concerned respondents' current lives as well as their recall of life events from birth to age 21. Major areas of investigation included household composition, family background, education, time spent in detention or jail, childbirth, fertility, relationship history, current employment, employment history, military service, participation in the informal economy, child care, child support, child-rearing, neighborhood and housing characteristics, social networks, current health, current and past public aid use, current income, and major life events. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: performed consistency checks.
Universe: Non-Hispanic whites, non-Hispanic Blacks, and persons of Mexican or Puerto Rican ethnicity, aged 18-44, residing in 1986 in Chicago census tracts with 20 percent or more persons living under the poverty line. Multistage stratified probability sample design yielding 2,490 observations (1,183 Blacks, 364 whites, 489 Mexican-origin persons, and 454 Puerto Rican-origin persons). Though Black respondents include parents (N = 1,020) and non-parents (N = 163), only parents were selected within non-Black groups. Response rates ranged from 73.8 percent for non-Hispanic whites to 82.5 percent for Black parents.
1997-11-04 The documentation and frequencies are being released as PDF files, and an SPSS export file is now available. Also, the SAS data definition statements and SPSS data definition statements have been reissued with minor changes, and SPSS value labels are being released in Part 7 due to SPSS for Windows limitations.
Funding institution(s): Carnegie Corporation. Chicago Community Trust. Ford Foundation. Institute for Research on Poverty. Joyce Foundation. Lloyd A. Fry Foundation. John D. and Catherine T. MacArthur Foundation. Rockefeller Foundation. Spencer Foundation. United States Department of Health and Human Services. William T. Grant Foundation. Woods Charitable Fund.
Value labels for this study are being released in a separate file, Part 7, to assist users of SPSS Release 6.1 for Windows. The syntax window in this version of SPSS will read a maximum of 32,767 lines. If all value labels were included in the SPSS data definition file, the number of lines in the file would exceed 32,767 lines. All references to card-image data in the codebook are no longer applicable. During generation of the logical record length data file, ICPSR optimized variable widths to the width of the widest value appearing in the data collection for each variable. However, the principal investigator's user-missing data code definitions were retained even when a variable contained no missing data. As a result, when user-missing data values are defined (e.g., by uncommenting the MISSING VALUES section in the SPSS data definition statements) and exceed the optimized variable width, SPSS's display dictionary output will contain asterisks for the missing data codes.
Producer: University of Chicago, Center for the Study of Urban Inequality, and the National Opinion Research Center (NORC).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This bundle contains supplementary materials for an upcoming academic publication, "Do Agile Scaling Approaches Make A Difference? An Empirical Comparison of Team Effectiveness Across Popular Scaling Approaches", by Christiaan Verwijs and Daniel Russo. Included in the bundle are the dataset and SPSS syntaxes. This replication package is made available by C. Verwijs under a "Creative Commons Attribution Non-Commercial Share-Alike 4.0 International" license (CC-BY-NC-SA 4.0).
About the dataset
The dataset (SPSS) contains anonymized response data from 15,078 team members aggregated into 4,013 Agile teams that participated via scrumteamsurvey.org. Stakeholder evaluations from 1,841 stakeholders were also collected for 529 of those teams. Data was gathered between September 2021 and September 2023. We cleaned the individual response data of careless responses and removed all data that could potentially identify teams, individuals, or their parent organizations. Because we wanted to analyze our measures at the team level, we calculated a team-level mean for each item in the survey. Such aggregation is only justified when at least 10% of the variance exists at the team level (Hair, 2019), which was the case (ICC = 35-50%). No data was missing at the team level.
Question labels and option labels are provided separately in Questions.csv. To conform to the privacy statement of scrumteamsurvey.org, the bundle does not include response data from before the team-level aggregation.
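A minimal sketch of the aggregation check described above: compute ICC(1) for an item from a one-way ANOVA by team, then aggregate to team-level means. The column names ("team_id", "item_1") are assumptions, and the average team size is used as an approximation of the ANOVA group-size constant; the authors' exact SPSS procedure may differ.

```python
import pandas as pd

def icc1(df: pd.DataFrame, group: str, item: str) -> float:
    """One-way ANOVA ICC(1): share of variance in `item` at the `group` level."""
    groups = df.groupby(group)[item]
    k = groups.size().mean()                      # average team size (approximation)
    grand_mean = df[item].mean()
    ss_between = (groups.size() * (groups.mean() - grand_mean) ** 2).sum()
    ms_between = ss_between / (groups.ngroups - 1)
    ss_within = ((df[item] - groups.transform("mean")) ** 2).sum()
    ms_within = ss_within / (len(df) - groups.ngroups)
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Aggregation itself, once ICC justifies it:
# team_means = df.groupby("team_id").mean()
```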
About the SPSS syntaxes
The bundle includes the syntaxes we used to prepare the dataset from the raw import, as well as the syntax we used to generate descriptives. This is mostly there for other researchers to verify our procedure.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de444806
Abstract (en): This is a longitudinal study of three birth cohorts of youngsters who were considered at risk because of anti-social behavior or because of officially recorded delinquency at early ages. The study followed a sample of 245 boys in the fourth, seventh, and tenth grades in 1980 (Part 1) and again in 1985 (Part 2). Two screening devices, or "gatings," were used to predict future delinquency. The first procedure, triple gating, was based on teachers' ratings of school competence, mothers' reports of anti-social behavior in the home, and parental monitoring. The second procedure, double gating, used only the teachers' ratings and mothers' reports. Data were collected on the boys' family, school, and criminal backgrounds. Variables include measures of independence and achievement, family criminality, home conduct problems, school disruptiveness, school competence, parental authoritarianism, parental conflict, self-reported delinquency, peer delinquency, and drug and alcohol use. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: standardized missing values; checked for undocumented or out-of-range codes.
Universe: Males in the fourth, seventh, and tenth grades from 21 elementary and high schools in Oregon. Subjects were selected from a sample of 300 families who volunteered to participate in all phases of the study.
2006-03-30 File QU9312.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
2006-03-30 File CB9312.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
2005-11-04 On 2005-03-14 new files were added to one or more datasets. These files included additional setup files as well as one or more of the following: SAS program, SAS transport, SPSS portable, and Stata system files. The metadata record was revised 2005-11-04 to reflect these additions.
1998-10-15 Missing data codes have been standardized and SAS and SPSS data definition statements were created for this collection. ICPSR also created a PDF codebook and scanned the data collection instruments into a PDF file.
Funding institution(s): United States Department of Justice. Office of Justice Programs. National Institute of Justice (84-IJ-CX-0048).
Background and Objectives: Pharmacogenomics (PGx) leverages genomic information to tailor drug therapies, enhancing precision medicine. Despite global advancements, its implementation in Lebanon, Qatar, and Saudi Arabia faces unique challenges in clinical integration. This study aimed to investigate PGx attitudes, knowledge implementation, associated challenges, forecast future educational needs, and compare findings across the three countries.
Methods: This cross-sectional study utilized an anonymous, self-administered online survey distributed to healthcare professionals, academics, and clinicians in Lebanon, Qatar, and Saudi Arabia. The survey comprised 18 questions to assess participants' familiarity with PGx, current implementation practices, perceived obstacles, potential integration strategies, and future educational needs.
Results: The survey yielded 337 responses from healthcare professionals across the three countries. Data revealed significant variations in PGx familiarity an...
Ethical statement and informed consent: Ethical approval for this study was obtained from the institutional review boards of the participating universities: Beirut Arab University (2023-H-0153-HS-R-0545), Qatar University (QU-IRB 1995-E/23), and Alfaisal University (IRB-20270). Informed consent was obtained from all participants online, ensuring their confidentiality and the right to withdraw from the study without any consequences. Participants were informed that all collected data would be anonymous and confidential, with only the principal investigator having access to the data. Completing and submitting the survey was considered an agreement to participate.
Study design: This study utilized a quantitative cross-sectional research design, involving healthcare professionals (pharmacists, nurses, medical laboratory technologists), university academics, and clinicians from Lebanon, Qatar, and Saudi Arabia. Data was collected through a voluntary, anonymous, private survey to gather PGx per...
Integrating pharmacogenomics in three Middle Eastern countries’ healthcare (Lebanon, Qatar, and Saudi Arabia)
Description of the data set:
- 1 dataset is included; PGx_database: it includes the raw data of our paper.
- In the data set, each row represents one participant.
- All the variables can contain empty cells. When participants didn't answer, empty cells were added to show the missing data.
- The number in each cell has a specific value depending on the variable.
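Since missing answers are stored as empty cells, a short loading sketch may be helpful. The filename "PGx_database.csv" is assumed from the dataset name above and may differ in the actual deposit.

```python
import pandas as pd

pgx = pd.read_csv("PGx_database.csv")    # empty cells are read as NaN by default
print(len(pgx))                          # one row per participant
print(pgx.isna().mean().round(3))        # share of unanswered items per variable
```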
Listed variables:
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de441995
Abstract (en): This dataset includes selected variables and cases from the Federal Bureau of Investigation's Uniform Crime Reports, 1958-1969, and the County and City Data Books for 1962, 1967, and 1972. Data are reported for all United States cities with a population of 75,000 or more in 1960. Data from the Uniform Crime Reports include, for each year, the number of homicides, forcible rapes, robberies, aggravated assaults, burglaries, larcenies over 50 dollars, and auto thefts. Also included is the Total Crime Index, which is the simple sum of all the crimes listed above. Selected variables describing population characteristics and city finances were taken from the 1962, 1967, and 1972 County and City Data Books. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: standardized missing values; checked for undocumented or out-of-range codes.
Universe: All cities in the United States with a population of 75,000 or more in 1960.
2005-11-04 On 2005-03-14 new files were added to one or more datasets. These files included additional setup files as well as one or more of the following: SAS program, SAS transport, SPSS portable, and Stata system files. The metadata record was revised 2005-11-04 to reflect these additions.
1997-02-13 SAS and SPSS data definition statements are now available for this collection.
Funding institution(s): United States Department of Justice. Office of Justice Programs. Bureau of Justice Statistics.
These data were taken from a dataset originally created by Alvin L. Jacobson and were prepared for use in ICPSR's Workshop on Data Processing and Data Management in the Criminal Justice Field in the summer of 1978, with further processing by Colin Loftin.
https://qdr.syr.edu/policies/qdr-standard-access-conditions
This is an Annotation for Transparent Inquiry (ATI) data project. The annotated article can be viewed on the Publisher's Website.
Data Generation
The research project engages a story about perceptions of fairness in criminal justice decisions. The specific focus involves a debate between ProPublica, a news organization, and Northpointe, the owner of a popular risk tool called COMPAS. ProPublica wrote that COMPAS was racist against blacks, while Northpointe posted online a reply rejecting such a finding. These two documents were the obvious foci of the qualitative analysis because of the further media attention they attracted, the confusion their competing conclusions caused readers, and the power both companies wield in public circles. There were no barriers to retrieval, as both documents have been publicly available on their corporate websites. This public access was one of the motivators for choosing them, as it meant that they were also easily attainable by the general public, thus extending the documents’ reach and impact. Additional materials from ProPublica relating to the main debate were also freely downloadable from its website and a third-party, open source platform. Access to secondary source materials comprising additional writings from Northpointe representatives that could assist in understanding Northpointe’s main document, though, was more limited. Because of a claim of trade secrets on its tool and the underlying algorithm, it was more difficult to reach Northpointe’s other reports. Nonetheless, largely because its clients are governmental bodies with transparency and accountability obligations, some Northpointe-associated reports were retrievable from third parties who had obtained them, largely through Freedom of Information Act queries. Together, the primary and (retrievable) secondary sources allowed for a triangulation of themes, arguments, and conclusions. The quantitative component uses a dataset of over 7,000 individuals with information that was collected and compiled by ProPublica and made available to the public on GitHub. ProPublica’s gathering the data directly from criminal justice officials via Freedom of Information Act requests rendered the dataset in the public domain, and thus no confidentiality issues are present. The dataset was loaded into SPSS v. 25 for data analysis.
Data Analysis
The qualitative enquiry used critical discourse analysis, which investigates ways in which parties in their communications attempt to create, legitimate, rationalize, and control mutual understandings of important issues. Each of the two main discourse documents was parsed on its own merit. Yet the project was also intertextual in studying how the discourses correspond with each other and to other relevant writings by the same authors.
Several more specific types of discursive strategies were of interest in attracting further critical examination:
- Testing claims and rationalizations that appear to serve the speaker’s self-interest
- Examining conclusions and determining whether sufficient evidence supported them
- Revealing contradictions and/or inconsistencies within the same text and intertextually
- Assessing strategies underlying justifications and rationalizations used to promote a party’s assertions and arguments
- Noticing strategic deployment of lexical phrasings, syntax, and rhetoric
- Judging sincerity of voice and the objective consideration of alternative perspectives
Of equal importance in a critical discourse analysis is consideration of what is not addressed, that is, uncovering facts and/or topics missing from the communication. For this project, this included parsing issues that were either briefly mentioned and then neglected, asserted with their significance left unstated, or not suggested at all. This task required understanding common practices in the algorithmic data science literature. The paper could have been completed with just the critical discourse analysis. However, because one of the salient findings from it highlighted that the discourses overlooked numerous definitions of algorithmic fairness, the call to fill this gap seemed obvious. Then, the availability of the same dataset used by the parties in conflict made this opportunity more appealing: calculating additional algorithmic equity equations would not be troubled by irregularities due to diverse sample sets. New variables were created as relevant to calculate algorithmic fairness equations. In addition to using various SPSS Analyze functions (e.g., regression, crosstabs, means), online statistical calculators were useful to compute z-test comparisons of proportions and t-test comparisons of means.
Logic of Annotation
Annotations were employed to fulfil a variety of functions, including supplementing the main text with context, observations, counter-points, analysis, and source attributions. These fall under a few categories. Space considerations. Critical discourse analysis offers a rich method...
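A minimal sketch of the "z-test comparisons of proportions" mentioned above, using statsmodels rather than an online calculator. The counts are illustrative only and do not come from the COMPAS dataset.

```python
from statsmodels.stats.proportion import proportions_ztest

# e.g., compare the share of cases flagged high-risk in two groups
successes = [805, 349]     # hypothetical counts flagged high-risk
totals = [3175, 2103]      # hypothetical group sizes
z, p = proportions_ztest(successes, totals)
print(f"z = {z:.2f}, p = {p:.4f}")
```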
Systems thinking is a skill that is essential to understanding and taking effective action on complex challenges such as climate change. This research evaluated whether systems thinking could be increased with a brief intervention. Participants (N = 678) recruited from Amazon Mechanical Turk all completed the Systems Thinking Scale (Randle & Stroink, 2018), which was used as a covariate. Participants were then randomly assigned to one of four conditions. Some participants (n = 165) watched an entertaining 5-minute video describing systems thinking with a real-life example (Cats in Borneo, https://www.youtube.com/watch?v=17BP9n6g1F0). Others (n = 174) watched this video, read a definition of systems thinking, and were asked to engage in systems thinking while completing a survey. This was designed to be a "sledgehammer" condition, in which we made our manipulation as heavy-handed as possible. A third (control) condition (n = 167) watched a video about how to fold a fitted sheet....
Participants were recruited via Amazon Mechanical Turk. All participants were adults living in the United States. We gave 10 different measures that capture some aspect of systems thinking:
The Systems Thinking Scale (Randle & Stroink, 2018): This 15-item self-report scale measures someone's dispositional tendency to engage in systems thinking. Negatively worded items were recoded, and all items were averaged together. Higher scores = more systems thinking. This trait measure was given at the start of the study and was used as a covariate.
The Murder Scenario (Choi et al., 2007): Participants read a brief description of a murder case and indicated which of 96 possible facts were irrelevant to the case. We recoded the items such that 1 = relevant, 0 = irrelevant. The recoded items were summed together. Choosing more items indicates more holistic thinking about causality.
Ripple Effect Question: Driver Scenario: Based on measures developed by Maddux & Yuki (2006). Participan...
Using a "sledgehammer" approach to increase systems thinking with a brief manipulation
https://doi.org/10.5061/dryad.v9s4mw75b
The data appear in the file DataForIncreasing STWithBriefManipulation.csv. Variable descriptions and values appear in the file MetaDataForIncreasing STWithBriefManipulation.csv.
Description: Raw data that has not been published. The data file was generated in SPSS but exported to csv format for accessibility. Each row corresponds to a single participant. Missing data occurred when online participants failed to complete a question. Missing data is indicated with an empty field.
| Variable | Position | Label ...,
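A short sketch of the scale scoring described above for the Systems Thinking Scale: reverse-code negatively worded items, then average all 15 items. The item column names, which items are reverse-keyed, and the response-scale range are assumptions; check them against the metadata file.

```python
import pandas as pd

df = pd.read_csv("DataForIncreasing STWithBriefManipulation.csv")

items = [f"STS_{i}" for i in range(1, 16)]       # hypothetical item names
reverse_items = ["STS_3", "STS_7"]               # hypothetical reverse-keyed items
scale_max, scale_min = 5, 1                      # assumed response scale

for col in reverse_items:
    df[col] = (scale_max + scale_min) - df[col]  # recode so higher = more systems thinking

df["STS_mean"] = df[items].mean(axis=1)          # trait covariate: mean of all 15 items
```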
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For a comprehensive guide to this data and other UCR data, please see my book at ucrbook.com
Version 14 release notes: Adds .parquet file format.
Version 13 release notes: Adds 2023-2024 data.
Version 12 release notes: Adds 2022 data.
Version 11 release notes: Adds 2021 data.
Version 10 release notes: Adds 2020 data. Please note that the FBI has retired UCR data ending in 2020, so this will be the last arson data they release. Changes .rda file to .rds.
Version 9 release notes: Changes release notes description, does not change data.
Version 8 release notes: Adds 2019 data. Note that the number of months missing variable changes sharply starting in 2018. This is probably due to changes in UCR reporting of the column_2_type variable, which is used to generate the months missing count (the code I used does not change). So pre-2018 and 2018+ years may not be comparable for this variable.
Version 7 release notes: Adds a last_month_reported column which says which month was reported last. This is actually how the FBI defines number_of_months_reported, so it is a more accurate representation of that. Removes the number_of_months_reported variable as the name is misleading; you should use the last_month_reported or the number_of_months_missing (see below) variable instead. Adds a number_of_months_missing variable in the annual data, which is the sum of the number of times the agency reports "missing" data (i.e., did not report that month) in the card_2_type variable or reports NA in that variable. Please note that this variable is not perfect: sometimes an agency does not report data but this variable does not say it is missing. Therefore, this variable will not be perfectly accurate.
Version 6 release notes: Adds 2018 data.
Version 5 release notes: Adds data in the following formats: SPSS and Excel. Changes project name to avoid confusing this data with the data released by NACJD.
Version 4 release notes: Adds 1979-2000, 2006, and 2017 data. Adds agencies that reported 0 months. Adds monthly data. All data now from FBI, not NACJD. Changes some column names so all columns are <=32 characters to be usable in Stata.
Version 3 release notes: Adds data for 2016. Orders rows by year (descending) and ORI. Removed data from Chattahoochee Hills (ORI = "GA06059") from 2016 data; in 2016, that agency reported about 28 times as many vehicle arsons as their population (total mobile arsons = 77,762, population = 2,754).
Version 2 release notes: Fix bug where Philadelphia Police Department had an incorrect FIPS county code.
This Arson data set is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains information about arsons reported in the United States: the number of arsons reported, reported to have actually occurred, found not to have occurred ("unfounded"), cleared by arrest of at least one offender, cleared by arrest where all offenders are under the age of 18, and the cost of the arson. This is done for a number of different arson location categories such as community building, residence, vehicle, and industrial/manufacturing structure. The yearly data sets here combine data from the years 1979-2018 into a single file for each group of crimes. Each monthly file is only a single year as my laptop can't handle combining all the years together. These files are quite large and may take some time to load. I also added state, county, and place FIPS codes from the LEAIC (crosswalk).
A small number of agencies had some months with clearly incorrect data. I changed the incorrect columns to NA and left the other columns unchanged for that agency. The following are data problems that I fixed; there are still likely issues remaining in the data, so make sure to check yourself before running analyses. Oneida, New York (ORI = NY03200) had multiple years that reported single arsons costing over $700 million; I deleted this agency from all years of data. In January 1989 Union, North Carolina (ORI = NC09000) reported 30,000 arsons in uninhabited single occupancy buildings and none in any other month. In December 1991 Gadsden, Florida (ORI = FL02000) reported that a single arson at a community/public building caused $99,999,999 in damages (the maximum possible). In April 2017 St. Paul, Minnesota (ORI = MN06209) reported 73,400 arsons in uninhabited storage buildings and 10,000 arsons in uninhabited community
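A sketch of the cleaning pattern described above: set a clearly erroneous agency-month value to missing rather than altering the agency's other columns. The filename and column names are assumptions; the ORI, year, and month come from the text.

```python
import pandas as pd

arson = pd.read_parquet("ucr_arson_monthly_1989.parquet")   # assumed filename

# e.g., January 1989, Union, NC (ORI = NC09000): implausible 30,000 arsons
mask = (arson["ori"] == "NC09000") & (arson["year"] == 1989) & (arson["month"] == 1)
arson.loc[mask, "uninhab_single_occupancy_arsons"] = pd.NA   # column name assumed
```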
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This bundle contains supplementary materials for an upcoming academic publication, "A Theory of Scrum Team Effectiveness", by Christiaan Verwijs and Daniel Russo. Included in the bundle are the dataset, SPSS syntaxes, and model definitions (AMOS). This replication package is made available by C. Verwijs under a Creative Commons Attribution Non-Commercial Share-Alike 4.0 International license (CC BY-NC-SA 4.0).
About the dataset
The dataset (SPSS) contains anonymized response data from 4,940 respondents from 1,978 Scrum Teams that participated via https://scrumteamsurvey.org. Data were gathered between June 3, 2020, and October 13, 2021. We removed careless responses from the individual response data and removed all data that could potentially identify teams, individuals, or their parent organizations. Because we wanted to analyze our measures at the team level, we calculated a team-level mean for each item in the survey. Such aggregation is only justified when at least 10% of the variance exists at the team level (Hair, 2019), which was the case (ICC = 51%). Because the percentage of missing data was modest, and to prevent list-wise deletion of cases and the resulting loss of information, we performed EM maximum likelihood imputation in SPSS.
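The 10% team-level variance criterion corresponds to ICC(1) from a one-way ANOVA on the individual item scores. As a hedged illustration (a generic formulation, not necessarily the exact SPSS procedure used by the authors, and with assumed column names team_id and item_score), the check could look like this in Python:

```python
import pandas as pd


def icc1(df: pd.DataFrame, group: str, value: str) -> float:
    """ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW), with k the mean team size."""
    groups = df.groupby(group)[value]
    grand_mean = df[value].mean()
    n_groups = groups.ngroups
    k = groups.size().mean()  # average number of respondents per team

    # Between-team and within-team mean squares from a one-way ANOVA.
    ss_between = (groups.size() * (groups.mean() - grand_mean) ** 2).sum()
    ss_within = ((df[value] - groups.transform("mean")) ** 2).sum()
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (len(df) - n_groups)

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Aggregation to team means is commonly considered defensible when ICC(1) exceeds 0.10;
# the text above reports ICC = 51% for this dataset.
# icc = icc1(responses, group="team_id", value="item_score")
```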
The dataset contains question labels and answer option definitions. To conform to the privacy statement of scrumteamsurvey.org, the bundle does not include individual response data from before the team-level aggregation.
About the model definitions
The bundle includes definitions of the Structural Equation Models (SEM) for AMOS. We added the four iterations of the measurement model, the four models used to test for common method bias, the final path model, and the model used for mediation testing. Mediation testing followed the procedure outlined by Podsakoff (2003) and was performed with the "Indirect Effects" plugin for AMOS by James Gaskin.
About the SPSS syntaxes
The bundle includes the syntaxes we used to prepare the dataset from the raw import, as well as the syntax we used to generate descriptives. These are provided mainly so that other researchers can verify our procedure.
http://rdm.uva.nl/en/support/confidential-data.html
This database contains data from 480 respondents, collected to measure their intervention choice in community care with the AICN instrument (Assessment of Intervention Choice in Community Nursing). Data collection took place at the Faculty of Health of the Amsterdam University of Applied Sciences, the Netherlands. The respondents are all baccalaureate nursing students in the fourth year of study, close to graduation. Data were collected at three timepoints: around May 2016 (group 1215), May 2017 (group 1316) and May 2018 (group 1417). Student cohorts 1215 and 1316 form a historical control group; cohort 1417 is the intervention group. The intervention group underwent a new, four-year, more 'community-oriented' curriculum with five new curriculum themes related to caregiving in people's own homes: (1) fostering patient self-management, (2) shared decision-making, (3) collaboration with the patient's social system, (4) using healthcare technology, and (5) allocation of care.
The aim of this study is to investigate the effect of this redesigned baccalaureate nursing curriculum on students' intervention choice in community care. The AICN is a measuring instrument containing three vignettes, each describing a caregiving situation in the patient's home. Each vignette incorporates all five new curriculum themes. For each theme, a matching intervention is a realistic option, while more 'traditional' intervention choices are also possible. To avoid students responding in a way they think is correct, they are not made aware of the instrument's underlying purpose (i.e., determining the five themes). After reading each vignette, respondents briefly formulate the five interventions they consider most suitable for nursing caregiving. The fifteen interventions yield qualitative information. To allow for quantitative data analysis, the AICN includes a codebook describing the criteria used to recode each qualitative intervention description into a quantitative value. As the manuscript describing the AICN and codebook is still under review, a link to the instrument will be added after publication.
Filesets:
1: SPSS file - 3 cohorts AICN without student numbers
2: SPSS syntax file
Variables in SPSS file (used in analysis):
1: Cohort type
2: Curriculum type (old vs. new)
3-20: Dummy variables of demographics
21-35: CSINV refers to case/intervention; CS1INV2 means case 1, intervention 2
36-50: Dummy variables of 21-35, representing the main outcome, old vs. new intervention type
51: Sum of the dummy variables (range 1-15), representing the primary outcome AICN
52: Sum of dummies as in 51, but including respondents with missing variables; used in the regression analysis
53-58: Count of the number of chosen interventions per curriculum theme
59-60: Count of missings (old curriculum = 59, new = 60)
61-62: Count of "no intervention theme" (old curriculum = 61, new = 62)
Contact
Because of the sensitive nature of the data, the fileset is confidential and will be shared only under strict conditions. For more information contact opensciencesupport@hva.nl
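As a rough illustration of how the primary outcome described above is composed, the sketch below (Python/pandas) sums the dummy variables for old vs. new intervention type. The file name, the curriculum_type column name, and the assumption that the dummy columns share a common suffix are hypothetical, since the codebook is not yet published; the real variable names should be taken from the SPSS file.

```python
import pandas as pd

# pd.read_spss requires the pyreadstat package; the file name is hypothetical.
aicn = pd.read_spss("AICN_three_cohorts.sav")

# Variables 36-50 are 0/1 dummies recoding each intervention as old vs. new type.
# Here we assume (hypothetically) that they share a common "_D" suffix.
dummy_cols = [c for c in aicn.columns if c.endswith("_D")]

# Variable 51, the primary AICN outcome, is the sum of these fifteen dummies.
aicn["aicn_sum"] = aicn[dummy_cols].sum(axis=1)

# Compare the old and new curriculum groups on the primary outcome.
print(aicn.groupby("curriculum_type")["aicn_sum"].describe())
```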
The Sicily and Calabria Extortion Database was extracted from police and court documents by the Palermo team of the GLODERS (Global Dynamics of Extortion Racket Systems) project, which has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 315874 (http://www.gloders.eu, "Global dynamics of extortion racket systems"). The data are provided as an SPSS file with variable names, variable labels, value labels where appropriate, and missing value definitions where appropriate. Variable and value labels are given in English translation; string texts are quoted from the Italian originals, as we thought that a translation could bias the information and that users of the data for secondary analysis will usually be able to read Italian. Each row of the SPSS file describes one extortion case. The columns start with some technical information (unique case number, reference to the original source, region, and case number within the region, Sicily or Calabria). These are followed by information about when the case happened, the pseudonym of the extorter, his role in the organisation, and the name and territory of the mafia family or mandamento he belongs to. Information about the victims, their affiliations and the type of enterprise they represent follows; the type of enterprise is coded according to the official Italian coding scheme (ATECO, which can be downloaded from http://www.istat.it/it/archivio/17888). The next group of variables describes the place where the extortion happened. The value labels for the numerical pseudonyms of extorters and victims (both persons and firms) are not contained in this file, hence the pseudonyms can only be used to analyse how often the same person or firm was involved in extortion. After this more or less technical information, the extortion cases are described materially. Most variables come in two forms: the original textual description of what happened and how it happened, and a recoded variable which lends itself better to quantitative analyses. The features described in these variables encompass:
• whether the extortion was only attempted (and unsuccessful from the point of view of the extorter) or completed, i.e. the victim actually paid,
• whether the request was for a periodic or a one-off payment or both, and what the amount was (the amounts of periodic and one-off payments are not always comparable, as some were only defined in terms of percentages of victim income or in terms of obligations the victim accepted, e.g. to employ a relative of the extorter),
• whether there was an intimidation and whether it was directed at a person or at property,
• whether the extortion request was brought forward by direct personal contact or by some indirect communication,
• whether there was some negotiation between extorter and victim, and if so, what it was like, and whether a mediator interfered,
• how the victim reacted: acquiescent, conniving or refusing,
• how the law enforcement agencies got to know about the case (own observation, denunciation, etc.),
• whether the extorter was caught, brought into investigative custody or finally sentenced (these variables contain a high percentage of missing data, partly because some cases are still under prosecution or before court, and partly as a consequence of incomplete documents).
Compilation / Transcription: extortion cases in Sicily and Calabria. Reasoned sampling, aiming to represent the proportional distribution of the cases between East and West Sicily; for Calabria the focus was on the province capital.
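Because the SPSS file carries its documentation in variable labels, value labels, and user-defined missing value definitions, it helps to read it with a tool that preserves that metadata. A hedged sketch using pyreadstat (the file name is hypothetical):

```python
import pyreadstat

# apply_value_formats=True replaces numeric codes with their value labels;
# user-defined missing values are converted to NaN by default.
df, meta = pyreadstat.read_sav(
    "gloders_extortion_cases.sav",  # hypothetical file name
    apply_value_formats=True,
)

# English variable labels and the value-label dictionaries described above.
print(dict(zip(meta.column_names, meta.column_labels)))
print(meta.variable_value_labels)
```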
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Version 5 release notes: Removes support for SPSS and Excel data. Changes the crimes that are stored in each file; there are now more files with fewer crimes per file, and the files and their included crimes have been updated below. Adds in agencies that report 0 months of the year. Adds a column that indicates the number of months reported, generated by summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime; they may not necessarily report every crime every month. Agencies that did not report a crime will have a value of NA for every arrest column for that crime. Removes data on runaways.
Version 4 release notes: Changes column names from "poss_coke" and "sale_coke" to "poss_heroin_coke" and "sale_heroin_coke" to clearly indicate that these columns include heroin as well as similar opiates such as morphine, codeine, and opium. Also changes the names of the narcotic columns to indicate that they are only for synthetic narcotics.
Version 3 release notes: Adds data for 2016. Orders rows by year (descending) and ORI.
Version 2 release notes: Fixes a bug where the Philadelphia Police Department had an incorrect FIPS county code.
The Arrests by Age, Sex, and Race data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. It contains highly granular data on the number of people arrested for a variety of crimes (see below for a full list of included crimes). The data sets here combine data from the years 1980-2015 into a single file. These files are quite large and may take some time to load. All the data was downloaded from NACJD as ASCII+SPSS Setup files and read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see https://github.com/jacobkap/crime_data. If you have any questions, comments, or suggestions please contact me at jkkaplan6@gmail.com. I did not make any changes to the data other than the following. When an arrest column has a value of "None/not reported", I change that value to zero. This makes the (possibly incorrect) assumption that these values represent zero crimes reported. The original data does not have a value other than "None/not reported" when the agency reports zero arrests; in other words, this data does not differentiate between real zeros and missing values. Some agencies also incorrectly report the following numbers of arrests, which I change to NA: 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99999, 99998. To reduce file size and make the data more manageable, all of the data is aggregated yearly. All of the data is in agency-year units such that every row indicates an agency in a given year. Columns are crime-arrest category units. For example, if you choose the data set that includes murder, you would have rows for each agency-year and columns with the number of people arrested for murder. The ASR data breaks down arrests by age and gender (e.g. Male aged 15, Male aged 18). They also provide the number of adults or juveniles arrested by race. Because most agencies and years do not report the arrestee's ethnicity (Hispanic or not Hispanic) or juvenile outcomes (e.g. referred to adult court, referred to welfare agency), I do not include these columns. To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data.
The data from the LEAIC add FIPS codes (state, county, and place) and agency type/subtype. Please note that some of the FIPS codes have leading zeros; if you open the file in Excel, it will automatically delete those leading zeros. I created 9 arrest categories myself: Total Male Juvenile, Total Female Juvenile, Total Male Adult, Total Female Adult, Total Male, Total Female, Total Juvenile, Total Adult, and Total Arrests. All of these categories are based on the sums of the sex-age categories (e.g. Male under 10, Female aged 22) rather than the provided age-race categories (e.g. adult Black, juvenile Asian). As not all agencies report the race data, my method is more accurate. These categories also make up the data in the "simple" version of the data; the "simple" file only includes the above 9 columns as the arrest data (all other columns in the
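Two practical points from the description above are worth showing in code: FIPS codes must be read as strings so their leading zeros survive, and the sentinel arrest counts should be treated as missing. Here is a hedged Python/pandas sketch; the file and column names are assumptions, and the recoding simply mirrors the cleaning already described rather than reproducing the author's actual R code.

```python
import pandas as pd

SENTINELS = [10000, 20000, 30000, 40000, 50000, 60000,
             70000, 80000, 90000, 100000, 99999, 99998]

# Reading the FIPS columns as strings keeps the leading zeros that Excel would drop.
arrests = pd.read_csv(
    "ucr_arrests_yearly_murder_simple.csv",  # hypothetical file name
    dtype={"fips_state_code": str, "fips_county_code": str, "fips_place_code": str},
)

# Treat the implausible sentinel counts as missing in the arrest-count columns.
count_cols = [c for c in arrests.columns if c.startswith("tot_")]  # assumed naming pattern
arrests[count_cols] = arrests[count_cols].replace(SENTINELS, pd.NA)
```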