CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides national-level data, from federal fiscal year 2006 onwards, on the accuracy of Social Security number (SSN) assignment, based on an end-of-line sample review of transactions that result in the release of SSN cards.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Social Security Administration Data for Enumeration Accuracy’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/e568c9ad-f5c4-4627-81fd-37229a259af2 on 11 February 2022.
--- Dataset description provided by original source is as follows ---
This dataset provides data at the national level from federal fiscal year 2006 onwards for the accuracy of the assignment of Social Security numbers (SSN) based on an end-of-line sample review of transactions that result in the release of SSN cards.
--- Original source retains full ownership of the source dataset ---
The digitalised social safety net programmes of Bangladesh are evaluated with a sample of 531 beneficiaries and non-beneficiaries.
This dataset includes airborne measurements obtained from the QueenAir, B-80 aircraft (Tail Number N306D) during the SSN project. It contains low-rate navigation, state parameter, and microphysics flight-level data in GENPRO-I format.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Prepared by Vetle Torvik 2018-04-15
The dataset comes as a single tab-delimited, ASCII-encoded file and should be about 717MB uncompressed.
• How was the dataset created?
First and last names of authors in the Author-ity 2009 dataset were processed through several tools to predict ethnicities and gender, including:
Ethnea+Genni, as described in:
Torvik VI, Agarwal S. Ethnea -- an instance-based ethnicity classifier based on geocoded author names in a large-scale bibliographic database. International Symposium on Science of Science March 22-23, 2016 - Library of Congress, Washington, DC, USA. http://hdl.handle.net/2142/88927
Smith, B., Singh, M., & Torvik, V. (2013). A search engine approach to estimating temporal changes in gender orientation of first names. Proceedings Of The ACM/IEEE Joint Conference On Digital Libraries, (JCDL 2013 - Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries), 199-208. doi:10.1145/2467696.2467720
EthnicSeer: http://singularity.ist.psu.edu/ethnicity
Treeratpituk P, Giles CL (2012). Name-Ethnicity Classification and Ethnicity-Sensitive Name Matching. Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (pp. 1141-1147). AAAI-12. Toronto, ON, Canada
SexMachine 0.1.1: https://pypi.org/project/SexMachine
First names, for some Author-ity records lacking them, were harvested from outside bibliographic databases.
• The code and back-end data are periodically updated and made available for query at Torvik Research Group
• What is the format of the dataset?
The dataset contains 9,300,182 rows and 10 columns:
1. auid: unique ID for Authors in Author-ity 2009 (PMID_authorposition)
2. name: full name (used as input to EthnicSeer)
3. EthnicSeer: predicted ethnicity; ARA, CHI, ENG, FRN, GER, IND, ITA, JAP, KOR, RUS, SPA, VIE, XXX
4. prop: decimal between 0 and 1 reflecting the confidence of the EthnicSeer prediction
5. lastname: used as input for Ethnea+Genni
6. firstname: used as input for Ethnea+Genni
7. Ethnea: predicted ethnicity; either one of 26 (AFRICAN, ARAB, BALTIC, CARIBBEAN, CHINESE, DUTCH, ENGLISH, FRENCH, GERMAN, GREEK, HISPANIC, HUNGARIAN, INDIAN, INDONESIAN, ISRAELI, ITALIAN, JAPANESE, KOREAN, MONGOLIAN, NORDIC, POLYNESIAN, ROMANIAN, SLAV, THAI, TURKISH, VIETNAMESE) or two ethnicities (e.g., SLAV-ENGLISH), or UNKNOWN (if no one or two dominant predictions), or TOOSHORT (if both first and last name are too short)
8. Genni: predicted gender; 'F', 'M', or '-'
9. SexMac: predicted gender based on third-party Python program (default settings except case_sensitive=False); female, mostly_female, andy, mostly_male, male
10. SSNgender: predicted gender based on US SSN data; 'F', 'M', or '-'
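The ten-column, tab-delimited layout described above could be read with a sketch like the following. This is only an illustration under stated assumptions: the description does not say whether the file has a header row (hence the hypothetical `has_header` flag), the helper name `read_rows` is made up here, and the sample row is invented purely to exercise the parser.

```python
import csv
import io

# Column names taken from the numbered list in the dataset description.
COLUMNS = [
    "auid", "name", "EthnicSeer", "prop", "lastname",
    "firstname", "Ethnea", "Genni", "SexMac", "SSNgender",
]


def read_rows(handle, has_header=False):
    """Yield one dict per well-formed row of the tab-delimited file.

    has_header is an assumption: the description does not state
    whether the first line carries column names.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # skip the header line if one is present
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip rows that do not have exactly 10 columns
        row = dict(zip(COLUMNS, fields))
        # prop is described as a decimal between 0 and 1.
        row["prop"] = float(row["prop"])
        # auid is described as PMID_authorposition; split on the first "_".
        row["pmid"], _, row["position"] = row["auid"].partition("_")
        yield row


# Invented sample row, for illustration only (not real data).
sample = io.StringIO(
    "12345_2\tJane Doe\tENG\t0.91\tDoe\tJane\tENGLISH\tF\tfemale\tF\n"
)
rows = list(read_rows(sample))
```

For the real file (about 717MB uncompressed), passing an open file handle and iterating over the generator avoids loading all 9,300,182 rows into memory at once.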