6 datasets found

Statistical Computing: SPSS
search.gesis.org
dataverse.unc.edu
Updated Oct 29, 2021
Cite
Zimmer, Catherine (2021). Statistical Computing: SPSS [Dataset]. https://search.gesis.org/research_data/datasearch-httpsdataverse-unc-eduoai--hdl1902-2911631
Dataset updated
Oct 29, 2021
Dataset provided by
GESIS search
UNC Dataverse
Authors
Zimmer, Catherine
License
https://search.gesis.org/research_data/datasearch-httpsdataverse-unc-eduoai--hdl1902-2911631
Description
Part 1 of the course will offer an introduction to SPSS and teach how to work with data saved in SPSS format. Part 2 will demonstrate how to work with SPSS syntax, how to create your own SPSS data files, and how to convert data in other formats to SPSS. Part 3 will teach how to append and merge SPSS files, demonstrate basic analytical procedures, and show how to work with SPSS graphics.
Current Population Survey (CPS)
search.dataone.org
dataverse.harvard.edu
Updated Nov 21, 2023
Cite
Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
Unique identifier
https://doi.org/10.7910/DVN/AK4FDD
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Damico, Anthony
Description
analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
  download the fixed-width file containing household, family, and person records
  import by separating this file into three tables, then merge 'em together at the person-level
  download the fixed-width file containing the person-level replicate weights
  merge the rectangular person-level file with the replicate weights, then store it in a sql database
  create a new variable - one - in the data table

2012 asec - analysis examples.R
  connect to the sql database created by the 'download all microdata' program
  create the complex sample survey object, using the replicate weights
  perform a boatload of analysis examples

replicate census estimates - 2011.R
  connect to the sql database created by the 'download all microdata' program
  create the complex sample survey object, using the replicate weights
  match the sas output shown in the png file below

2011 asec replicate weight sas output.png
  statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
  the census bureau's current population survey page
  the bureau of labor statistics' current population survey page
  the current population survey's wikipedia article

notes:
interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.
as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
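The replicate-weight standard errors mentioned in the script list above can be sketched in a few lines. This is a minimal Python illustration, not the R code from the repository: the function name, the toy numbers, and the default `factor=4.0` (which corresponds to Fay's method with a 0.5 perturbation) are assumptions, so check the Census replicate weights usage document for the constants that apply to a given file.

```python
import math


def replicate_se(full_estimate, replicate_estimates, factor=4.0):
    """Standard error from a full-sample estimate and its replicate estimates.

    variance = factor / R * sum((theta_r - theta_0) ** 2)

    factor=4.0 is an assumption (Fay's method with a 0.5 perturbation);
    confirm it against the survey documentation before relying on it.
    """
    r = len(replicate_estimates)
    if r == 0:
        raise ValueError("need at least one replicate estimate")
    squared_diffs = sum((t - full_estimate) ** 2 for t in replicate_estimates)
    return math.sqrt(factor / r * squared_diffs)


def weighted_total(values, weights):
    """Weighted total of a variable, e.g. a count of people in poverty."""
    return sum(v * w for v, w in zip(values, weights))


# Toy data: three people, one full-sample weight and two replicate weights each.
in_poverty = [1, 0, 1]
full_weights = [100.0, 200.0, 150.0]
replicate_weights = [[90.0, 210.0, 160.0], [110.0, 190.0, 140.0]]

theta_0 = weighted_total(in_poverty, full_weights)
theta_r = [weighted_total(in_poverty, w) for w in replicate_weights]
se = replicate_se(theta_0, theta_r)
```

The same estimator is recomputed once per replicate weight, and the spread of those estimates around the full-sample value gives the variance.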
Area Resource File (ARF)
dataverse.harvard.edu
Updated May 30, 2013
Cite
Anthony Damico (2013). Area Resource File (ARF) [Dataset]. http://doi.org/10.7910/DVN/8NMSFV
Unique identifier
https://doi.org/10.7910/DVN/8NMSFV
Dataset updated
May 30, 2013
Dataset provided by
Harvard Dataverse
Authors
Anthony Damico
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
analyze the area resource file (arf) with r

the arf is fun to say out loud. it's also a single county-level data table with about 6,000 variables, produced by the united states health resources and services administration (hrsa). the file contains health information and statistics for over 3,000 us counties. like many government agencies, hrsa provides only a sas importation script and an ascii file.

this new github repository contains two scripts:

2011-2012 arf - download.R
  download the zipped area resource file directly onto your local computer
  load the entire table into a temporary sql database
  save the condensed file as an R data file (.rda), comma-separated value file (.csv), and/or stata-readable file (.dta).

2011-2012 arf - analysis examples.R
  limit the arf to the variables necessary for your analysis
  sum up a few county-level statistics
  merge the arf onto other data sets, using both fips and ssa county codes
  create a sweet county-level map

click here to view these two scripts

for more detail about the area resource file (arf), visit:
  the arf home page
  the hrsa data warehouse

notes:
the arf may not be a survey data set itself, but it's particularly useful to merge onto other survey data.

confidential to sas, spss, stata, and sudaan users: time to put down the abacus. time to transition to r. :D
Health and Retirement Study (HRS)
dataverse.harvard.edu
search.dataone.org
Updated May 30, 2013
Cite
Anthony Damico (2013). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
Unique identifier
https://doi.org/10.7910/DVN/ELEKOY
Dataset updated
May 30, 2013
Dataset provided by
Harvard Dataverse
Authors
Anthony Damico
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
analyze the health and retirement study (hrs) with r

the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. the survey is paid for by the national institute on aging and administered by the university of michigan's institute for social research. if you apply for an interviewer job with them, i hope you like werther's original.

figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need them for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle.

but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.
the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

this new github repository contains five scripts:

1992 - 2010 download HRS microdata.R
  loop through every year and every file, download, then unzip everything in one big party

import longitudinal RAND contributed files.R
  create a SQLite database (.db) on the local disk
  load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

longitudinal RAND - analysis examples.R
  connect to the sql database created by the 'import longitudinal RAND contributed files' program
  create two database-backed complex sample survey objects, using a taylor-series linearization design
  perform a mountain of analysis examples with wave weights from two different points in the panel

import example HRS file.R
  load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
  parse through the IF block at the bottom of the sas importation script, blank out a number of variables
  save the file as an R data file (.rda) for fast loading later

replicate 2002 regression.R
  connect to the sql database created by the 'import longitudinal RAND contributed files' program
  create a database-backed complex sample survey object, using a taylor-series linearization design
  exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document.

click here to view these five scripts

for more detail about the health and retirement study (hrs), visit:
  michigan's hrs homepage
  rand's hrs homepage
  the hrs wikipedia page
  a running list of publications using hrs

notes:
exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
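The "load into the database in chunks" step described above is a general pattern for files too large to hold in memory. A minimal Python sketch using only the standard library follows; the table name, column names, and sample rows are hypothetical, and the repository's own scripts do this in R.

```python
import csv
import io
import sqlite3
from itertools import islice


def load_in_chunks(conn, table, rows, columns, chunk_size=1000):
    """Insert an iterable of rows into a SQLite table, chunk_size rows at a time.

    Only one chunk is held in memory at once, which is the point of the
    chunked load. All columns are stored as TEXT to keep the sketch short.
    """
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'
    total = 0
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        conn.executemany(insert_sql, chunk)
        conn.commit()
        total += len(chunk)
    return total


# Toy stand-in for a large file; the column names are hypothetical.
sample_csv = io.StringIO("person_id,wave,weight\n1010,1,4500\n1020,1,0\n1030,1,5200\n")
reader = csv.reader(sample_csv)
header = next(reader)

conn = sqlite3.connect(":memory:")  # a real run would use a .db file on disk
n_loaded = load_in_chunks(conn, "rand_hrs_toy", reader, header, chunk_size=2)
```

Because `csv.reader` is itself lazy, the file is never read in full; a smaller `chunk_size` trades speed for a lower memory ceiling.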
Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race,...
search.datacite.org
doi.org
Updated 2018
Cite
Jacob Kaplan (2018). Uniform Crime Reporting (UCR) Program Data: Arrests by Age, Sex, and Race, 1980-2016 [Dataset]. http://doi.org/10.3886/e102263v5-10021
Unique identifier
https://doi.org/10.3886/e102263v5-10021
Dataset updated
2018
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
DataCite (https://www.datacite.org/)
Authors
Jacob Kaplan
Description
Version 5 release notes:
Removes support for SPSS and Excel data.
Changes the crimes that are stored in each file. There are more files now with fewer crimes per file. The files and their included crimes have been updated below.
Adds in agencies that report 0 months of the year.
Adds a column that indicates the number of months reported. This is generated by summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime. They may not necessarily report every crime every month. Agencies that did not report a crime will have a value of NA for every arrest column for that crime.
Removes data on runaways.
Version 4 release notes:
Changes column names from "poss_coke" and "sale_coke" to "poss_heroin_coke" and "sale_heroin_coke" to clearly indicate that these columns include the sale of heroin as well as similar opiates such as morphine, codeine, and opium. Also changes column names for the narcotic columns to indicate that they are only for synthetic narcotics.
Version 3 release notes:
Add data for 2016.
Order rows by year (descending) and ORI.
Version 2 release notes:
Fix bug where Philadelphia Police Department had incorrect FIPS county code.
The Arrests by Age, Sex, and Race data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. It contains highly granular data on the number of people arrested for a variety of crimes (see below for a full list of included crimes). The data sets here combine data from the years 1980-2016 into a single file. These files are quite large and may take some time to load.
All the data was downloaded from NACJD as ASCII+SPSS Setup files and read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see https://github.com/jacobkap/crime_data. If you have any questions, comments, or suggestions please contact me at jkkaplan6@gmail.com.

I did not make any changes to the data other than the following. When an arrest column has a value of "None/not reported", I change that value to zero. This makes the (possibly incorrect) assumption that these values represent zero crimes reported. The original data does not have a value when the agency reports zero arrests other than "None/not reported." In other words, this data does not differentiate between real zeros and missing values. Some agencies also incorrectly report the following numbers of arrests which I change to NA: 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 99999, 99998.
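The two recoding rules above can be written as one small function. This is a Python sketch for illustration only (the author's cleaning code is in R, in the linked repository); the function name is hypothetical and `None` stands in for NA.

```python
# Values the description lists as incorrectly reported; they become NA.
IMPLAUSIBLE_COUNTS = {
    10000, 20000, 30000, 40000, 50000, 60000,
    70000, 80000, 90000, 100000, 99999, 99998,
}


def clean_arrest_value(raw):
    """Apply the two recoding rules to a single arrest cell.

    "None/not reported" becomes 0, a listed implausible count becomes
    None (standing in for NA), and any other value is returned as an int.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    if text == "None/not reported":
        return 0
    count = int(text)
    if count in IMPLAUSIBLE_COUNTS:
        return None
    return count


raw_column = ["None/not reported", "12", "99999", 20000, 3]
cleaned_column = [clean_arrest_value(v) for v in raw_column]
```

Note that after this step a zero can mean either "no arrests" or "not reported", which is the ambiguity the description warns about.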

To reduce file size and make the data more manageable, all of the data is aggregated yearly. All of the data is in agency-year units such that every row indicates an agency in a given year. Columns are crime-arrest category units. For example, if you choose the data set that includes murder, you would have rows for each agency-year and columns with the number of people arrested for murder. The ASR data breaks down arrests by age and gender (e.g. Male aged 15, Male aged 18). They also provide the number of adults or juveniles arrested by race. Because most agencies and years do not report the arrestee's ethnicity (Hispanic or not Hispanic) or juvenile outcomes (e.g. referred to adult court, referred to welfare agency), I do not include these columns.

To make it easier to merge with other data, I merged this data with the Law Enforcement Agency Identifiers Crosswalk (LEAIC) data. The data from the LEAIC add FIPS (state, county, and place) and agency type/subtype. Please note that some of the FIPS codes have leading zeros and if you open it in Excel it will automatically delete those leading zeros.
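One way to avoid the leading-zero problem is to read FIPS codes as text and re-pad any that have already been damaged. The Python sketch below assumes the standard widths of 2 digits for state and 3 for county; the column names and the sample row are hypothetical.

```python
import csv
import io


def normalize_fips(value, width):
    """Return a FIPS code as a zero-padded string of the given width."""
    text = str(value).strip()
    if not text.isdigit():
        raise ValueError(f"not a numeric FIPS code: {value!r}")
    return text.zfill(width)


def county_fips(state, county):
    """Build the 5-digit state + county code used for merging."""
    return normalize_fips(state, 2) + normalize_fips(county, 3)


# csv.DictReader keeps every field as a string, so "01" stays "01".
sample = io.StringIO("agency,fips_state_code,fips_county_code\nexample agency,01,073\n")
rows = list(csv.DictReader(sample))
merge_key = county_fips(rows[0]["fips_state_code"], rows[0]["fips_county_code"])

# Codes that were already stripped to integers can be repaired the same way.
repaired_key = county_fips(1, 73)
```

Keeping the codes as strings end to end means the merge key matches whether or not the file passed through a spreadsheet.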

I created 9 arrest categories myself. The categories are:
Total Male Juvenile
Total Female Juvenile
Total Male Adult
Total Female Adult
Total Male
Total Female
Total Juvenile
Total Adult
Total Arrests
All of these categories are based on the sums of the sex-age categories (e.g. Male under 10, Female aged 22) rather than using the provided age-race categories (e.g. adult Black, juvenile Asian). As not all agencies report the race data, my method is more accurate. These categories also make up the data in the "simple" version of the data. The "simple" file only includes the above 9 columns as the arrest data (all other columns in the data are just agency identifier columns). Because this "simple" data set needs fewer columns, I include all offenses.
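As a rough illustration of how the nine totals follow from the sex-age categories, here is a Python sketch. The record layout, the output key names, and the choice to skip missing (`None`) counts are all assumptions; the real column names and NA handling are in the author's R code.

```python
def total_categories(sex_age_counts):
    """Sum sex-age arrest counts into the nine total categories.

    sex_age_counts: iterable of (sex, age_label, is_adult, count) tuples,
    where sex is "male" or "female". Counts of None are skipped, which
    is an assumption about how missing values should be treated.
    """
    totals = {
        "total_male_juvenile": 0,
        "total_female_juvenile": 0,
        "total_male_adult": 0,
        "total_female_adult": 0,
    }
    for sex, _age_label, is_adult, count in sex_age_counts:
        if count is None:
            continue
        group = "adult" if is_adult else "juvenile"
        totals[f"total_{sex}_{group}"] += count
    # The remaining five categories are sums of the four base totals.
    totals["total_male"] = totals["total_male_juvenile"] + totals["total_male_adult"]
    totals["total_female"] = totals["total_female_juvenile"] + totals["total_female_adult"]
    totals["total_juvenile"] = totals["total_male_juvenile"] + totals["total_female_juvenile"]
    totals["total_adult"] = totals["total_male_adult"] + totals["total_female_adult"]
    totals["total_arrests"] = totals["total_male"] + totals["total_female"]
    return totals


# One agency-year for one crime, with made-up counts.
records = [
    ("male", "under 10", False, 2),
    ("male", "aged 22", True, 5),
    ("female", "aged 15", False, 1),
    ("female", "aged 22", True, 3),
]
agency_year_totals = total_categories(records)
```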

As the arrest data is very granular, and each category of arrest is its own column, there are dozens of columns per crime. To keep the data somewhat manageable, there are nine different files: eight that contain different crimes, plus the "simple" file. Each file contains the data for all years. The eight categories each have crimes belonging to a major crime category and do not overlap in crimes other than with the index offenses. Please note that the crime names provided below are not the same as the column names in the data. Due to Stata limiting column names to 32 characters maximum, I have abbreviated the crime names in the data. The files and their included crimes are:

Index Crimes
Murder, Rape, Robbery, Aggravated Assault, Burglary, Theft, Motor Vehicle Theft, Arson

Alcohol Crimes
DUI, Drunkenness, Liquor

Drug Crimes
Total Drug, Total Drug Sales, Total Drug Possession, Cannabis Possession, Cannabis Sales, Heroin or Cocaine Possession, Heroin or Cocaine Sales, Other Drug Possession, Other Drug Sales, Synthetic Narcotic Possession, Synthetic Narcotic Sales

Grey Collar and Property Crimes
Forgery, Fraud, Stolen Property

Financial Crimes
Embezzlement, Total Gambling, Other Gambling, Bookmaking, Numbers Lottery

Sex or Family Crimes
Offenses Against the Family and Children, Other Sex Offenses, Prostitution, Rape

Violent Crimes
Aggravated Assault, Murder, Negligent Manslaughter, Robbery, Weapon Offenses

Other Crimes
Curfew, Disorderly Conduct, Other Non-traffic, Suspicion, Vandalism, Vagrancy

Simple
This data set has every crime and only the arrest categories that I created (see above).
BCS
datacatalogue.ukdataservice.ac.uk
Updated Mar 17, 2022
Cite
UK Data Service (2022). BCS [Dataset]. http://doi.org/10.5255/UKDA-SN-6367-2
Unique identifier
https://doi.org/10.5255/UKDA-SN-6367-2
Dataset updated
Mar 17, 2022
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
Area covered
England and Wales
Description
The Crime Survey for England and Wales (CSEW) asks a sole adult in a random sample of households about their, or their household's, experience of crime victimisation in the previous 12 months. These are recorded in the victim form data file (VF). A wide range of questions are then asked, covering demographics and crime-related subjects such as attitudes to the police and the criminal justice system (CJS). These variables are contained within the non-victim form (NVF) data file. In 2009, the survey was extended to children aged 10-15 years old; one resident of that age range was also selected from the household and asked about their experience of crime and other related topics. The first set of children's data covered January-December 2009 and is held separately under SN 6601. From 2009-2010, the children's data cover the same period as the adult data and are included with the main study.
The Telephone-operated Crime Survey for England and Wales (TCSEW) became operational on 20 May 2020. It was a replacement for the face-to-face CSEW, which was suspended on 17 March 2020 because of the coronavirus (COVID-19) pandemic. It was set up with the intention of measuring the level of crime during the pandemic. As the pandemic continued throughout the 2020/21 survey year, questions have been raised as to whether the year ending March 2021 TCSEW is comparable with estimates produced in earlier years by the face-to-face CSEW. The ONS Comparability between the Telephone-operated Crime Survey for England and Wales and the face-to-face Crime Survey for England and Wales report explores those factors that may have a bearing on the comparability of estimates between the TCSEW and the former CSEW. These include survey design, sample design, questionnaire changes and modal changes.
More general information about the CSEW may be found on the ONS Crime Survey for England and Wales web page and for the previous BCS, from the GOV.UK BCS Methodology web page.
History - the British Crime Survey
The CSEW was formerly known as the British Crime Survey (BCS), and has been in existence since 1981. The 1982 and 1988 BCS waves were also conducted in Scotland (data held separately under SNs 4368 and 4599). Since 1993, separate Scottish Crime and Justice Surveys have been conducted. Up to 2001, the BCS was conducted biennially. From April 2001, the Office for National Statistics took over the survey and it became the CSEW. Interviewing was then carried out continually and reported on in financial year cycles. The crime reference period was altered to accommodate this.

Secure Access CSEW data
In addition to the main survey, a series of questions covering drinking behaviour, drug use, self-offending, gangs and personal security, and intimate personal violence (IPV) (including stalking and sexual victimisation) are asked of adults via a laptop-based self-completion module (questions may vary over the years). Children aged 10-15 years also complete a separate self-completion questionnaire. The questionnaires are included in the main documentation, but the data are only available under Secure Access conditions (see SN 7280), not with the main study. In addition, from 2011 onwards, lower-level geographic variables are also available under Secure Access conditions (see SN 7311).

New methodology for capping the number of incidents from 2017-18
The CSEW datasets available from 2017-18 onwards are based on a new methodology of capping the number of incidents at the 98th percentile. Incidence variable names remain consistent with previously supplied data, but because they are based on the new 98th percentile cap and older datasets are not, comparability with years prior to 2012-2013 has been lost. More information can be found in the 2017-18 User Guide (see SN 8464) and the article ‘Improving victimisation estimates derived from the Crime Survey for England and Wales’.
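The idea of capping incident counts at a percentile can be illustrated with a short Python sketch. It is generic rather than the ONS procedure: the nearest-rank percentile definition, the function names, and the toy data are assumptions, and the User Guide should be consulted for how the cap is actually derived and applied.

```python
def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of
    observations at or below it. p must be an integer from 0 to 100."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, kept at a minimum of 1.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def cap_incidents(counts, p=98):
    """Replace any count above the p-th percentile with the percentile value."""
    cap = percentile_nearest_rank(counts, p)
    return [min(c, cap) for c in counts], cap


# Toy data: 100 respondents reporting 1, 2, ..., 100 incidents.
incident_counts = list(range(1, 101))
capped_counts, cap_value = cap_incidents(incident_counts)
```

Capping limits the influence of a few respondents reporting very many incidents, at the cost of understating totals for those cases.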
Variable 'PFA' (Police Force Area):
From 2008-2009 onwards, the BCS variable 'PFA' (Police Force Area) is now only available within the associated dataset SN 6368, British Crime Survey, 2008-2009: Special Licence Access, Low-Level Geographic Data, which is subject to restrictive access conditions; see 'Access' section below.

2008-2009 self-completion modules:
From October 2016, the self-completion questionnaire modules covering drug use, drinking behaviour, and domestic violence, sexual victimisation and stalking are subject to Controlled data access conditions - see SN 7280.

CSEW Historic back series – dataset update (March 2022)
From January 2019, all releases of crime statistics using CSEW data adopted a new methodology for measuring repeat victimisation (moving from a cap of 5 in the number of repeat incidents to tracking the 98th percentile value for major crime types).
To maintain a consistent approach across historic data, all datasets back to 2001 have been revised to the new methodology. The change affects all incident data and related fields. A “bolt-on” version of the data has been created for the 2001/02 to 2011/12 datasets. This “bolt-on” dataset contains only the previously supplied variables affected by the change in methodology. These datasets can be merged onto the existing BCS NVF and VF datasets. A template ‘merge’ SPSS syntax file is provided, which will need to be adapted for other software formats.
For the second edition (March 2022), “bolt-on” datasets for the NVF and VF files, example merge syntax and additional documentation have been added to the study to accommodate the latest CSEW repeat victimisation measurement methodology. See the documentation for further details.
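For readers adapting the merge to other software, the general shape of joining a bolt-on file onto a main file is sketched below in Python. This is not a translation of the supplied SPSS syntax: the key name `case_id`, the variable names, and the rule that bolt-on values overwrite same-named main-file values are all assumptions to check against the study documentation.

```python
def merge_bolt_on(main_rows, bolt_on_rows, key="case_id"):
    """Left-join bolt-on variables onto the main file by a shared case key.

    Rows are dicts. Where both files carry a variable of the same name,
    the bolt-on value replaces the main-file value (an assumption).
    Main-file rows with no bolt-on match are returned unchanged.
    """
    bolt_by_key = {row[key]: row for row in bolt_on_rows}
    merged = []
    for row in main_rows:
        out = dict(row)
        update = bolt_by_key.get(row[key])
        if update is not None:
            out.update(update)
        merged.append(out)
    return merged


# Hypothetical variable names and values, for illustration only.
main_file = [
    {"case_id": 1, "region": "north", "n_incidents": 5},
    {"case_id": 2, "region": "south", "n_incidents": 2},
]
bolt_on_file = [
    {"case_id": 1, "n_incidents": 7},
]
merged_file = merge_bolt_on(main_file, bolt_on_file)
```

The merge copies each row before updating it, so the original main file is left untouched and the two versions can be compared.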

Statistical Computing: SPSS

65 scholarly articles cite this dataset (View in Google Scholar)
