https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/RCHDXX
This dataset contains replication files for "A Practical Method to Reduce Privacy Loss when Disclosing Statistics Based on Small Samples" by Raj Chetty and John Friedman. For more information, see https://opportunityinsights.org/paper/differential-privacy/. A summary of the related publication follows. Releasing statistics based on small samples – such as estimates of social mobility by Census tract, as in the Opportunity Atlas – is very valuable for policy but can potentially create privacy risks by unintentionally disclosing information about specific individuals. To mitigate such risks, we worked with researchers at the Harvard Privacy Tools Project and Census Bureau staff to develop practical methods of reducing the risks of privacy loss when releasing such data. This paper describes the methods that we developed, which can be applied to disclose any statistic of interest that is estimated using a sample with a small number of observations. We focus on the case where the dataset can be broken into many groups (“cells”) and one is interested in releasing statistics for one or more of these cells. Building on ideas from the differential privacy literature, we add noise to the statistic of interest in proportion to the statistic’s maximum observed sensitivity, defined as the maximum change in the statistic from adding or removing a single observation across all the cells in the data. Intuitively, our approach permits the release of statistics in arbitrarily small samples by adding sufficient noise to the estimates to protect privacy. Although our method does not offer a formal privacy guarantee, it generally outperforms widely used methods of disclosure limitation such as count-based cell suppression both in terms of privacy loss and statistical bias. We illustrate how the method can be implemented by discussing how it was used to release estimates of social mobility by Census tract in the Opportunity Atlas. 
We also provide a step-by-step guide and illustrative Stata code to implement our approach.
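The noise-infusion idea summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' Stata code: it assumes the statistic is a cell mean, that values lie in a known range `[lo, hi]`, and that noise is Laplace with scale equal to the maximum observed sensitivity divided by a privacy-loss parameter `epsilon`. The function names, the bounds, and the example cells are all hypothetical, and the paper's exact scaling of the noise may differ.

```python
import random


def local_sensitivity(values, lo, hi):
    """Largest change in a cell's mean from adding or removing one observation.

    Assumption: an added observation takes a value at one of the bounds lo/hi.
    """
    n = len(values)
    total = sum(values)
    mean = total / n
    # Effect of adding one observation at either bound.
    changes = [abs((total + v) / (n + 1) - mean) for v in (lo, hi)]
    # Effect of removing each existing observation (needs at least two).
    if n > 1:
        changes += [abs((total - x) / (n - 1) - mean) for x in values]
    return max(changes)


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_cell_means(cells, lo, hi, epsilon, rng):
    """Release each cell mean plus noise scaled by the maximum observed sensitivity."""
    # Maximum observed sensitivity: the largest local sensitivity across all cells.
    max_sens = max(local_sensitivity(v, lo, hi) for v in cells.values())
    scale = max_sens / epsilon
    released = {
        name: sum(v) / len(v) + laplace_noise(scale, rng)
        for name, v in cells.items()
    }
    return released, max_sens


# Hypothetical cells (for example, outcomes in two tracts), values in [0, 1].
cells = {"a": [0.2, 0.4], "b": [0.5, 0.5, 0.5, 0.5]}
released, max_sens = noisy_cell_means(cells, 0.0, 1.0, epsilon=1.0, rng=random.Random(0))
print(max_sens, released)
```

Because the sensitivity is computed from the observed data rather than from a worst case over all possible datasets, a sketch like this inherits the caveat stated above: it does not offer a formal privacy guarantee.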
Historical Employment Statistics 1990 - current. The Current Employment Statistics (CES) program provides the most current estimates of nonfarm employment, hours, and earnings data by industry (place of work) for the nation as a whole, all states, and most major metropolitan areas. The CES survey is a federal-state cooperative endeavor in which states develop state and sub-state data using concepts, definitions, and technical procedures prescribed by the Bureau of Labor Statistics (BLS). Estimates produced by the CES program include both full- and part-time jobs. Excluded are self-employment, as well as agricultural and domestic positions. In Connecticut, more than 4,000 employers are surveyed each month to determine the number of jobs in the state. For more information, please visit http://www1.ctdol.state.ct.us/lmi/ces/default.asp.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Sheep statistics, supply and disposition of sheep and lambs, Canada and provinces (head x 1,000). Data are available on an annual basis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Red Oak by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Red Oak. The dataset can be used to understand the population distribution of Red Oak by gender and age. For example, we can use it to identify the largest age group for both men and women in Red Oak. It also shows how the male-to-female ratio changes across age groups, from birth to the oldest age group.
Key observations
Largest age group (population): Male # 70-74 years (273) | Female # 65-69 years (229). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are asked to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Red Oak Population by Gender.
Metropolitan Statistical Areas are CBSAs associated with at least one urbanized area that has a population of at least 50,000. The metropolitan statistical area comprises the central county or counties or equivalent entities containing the core, plus adjacent outlying counties having a high degree of social and economic integration with the central county or counties as measured through commuting.
Download: https://www2.census.gov/geo/tiger/TGRGDB24/tlgdb_2024_a_us_nationgeo.gdb.zip
Layer: Core_Based_Statistical_Area where [MEMI] = "1"
Metadata: https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/Current_19115/series_tl_2023_cbsa.shp.iso.xml
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Sanford by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Sanford. The dataset can be used to understand the population distribution of Sanford by gender and age. For example, we can use it to identify the largest age group for both men and women in Sanford. It also shows how the male-to-female ratio changes across age groups, from birth to the oldest age group.
Key observations
Largest age group (population): Male # 10-14 years (63) | Female # 65-69 years (89). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are asked to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Sanford Population by Gender.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Monthly statistics for pages viewed by visitors to the Queensland Government website—People with disability franchise. Source: Google Analytics
This statistic shows the consumption of frozen cakes / muffins / pastries / pies in the United States from 2012 to 2020 and a forecast thereof until 2024. The data has been calculated by Statista based on the U.S. Census data and Simmons National Consumer Survey (NHCS). According to this statistic, 112.58 million Americans consumed frozen cakes / muffins / pastries / pies in 2020. This figure is projected to increase to 113.43 million in 2024.
Hydrographic and Impairment Statistics (HIS) is a National Park Service (NPS) Water Resources Division (WRD) project established to track certain goals created in response to the Government Performance and Results Act of 1993 (GPRA). One water resources management goal established by the Department of the Interior (DOI) under GPRA requires NPS to track the percent of its managed surface waters that are meeting Clean Water Act (CWA) water quality standards. This goal requires an accurate inventory that spatially quantifies the surface water hydrography that each bureau manages and a procedure to determine and track which waterbodies are or are not meeting water quality standards as outlined by Section 303(d) of the CWA. This project helps meet this DOI GPRA goal by inventorying and monitoring in a geographic information system for the NPS: (1) CWA 303(d) quality impaired waters and causes; and (2) hydrographic statistics based on the United States Geological Survey (USGS) National Hydrography Dataset (NHD). Hydrographic and 303(d) impairment statistics were evaluated based on a combination of 1:24,000 (NHD) and finer scale data (frequently provided by state GIS layers).
This dataset contains license statistics by activity for 2020, 2021, and 2022.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Wellfleet by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Wellfleet. The dataset can be used to understand the population distribution of Wellfleet by gender and age. For example, we can use it to identify the largest age group for both men and women in Wellfleet. It also shows how the male-to-female ratio changes across age groups, from birth to the oldest age group.
Key observations
Largest age group (population): Male # 25-29 years (8) | Female # 30-34 years (5). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are asked to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Wellfleet Population by Gender.
As of January 2025, around 13.7 percent of paid iOS apps admitted collecting data from users engaging with their mobile products, while approximately 86 percent of paid apps had not declared whether they collect users' private data. In comparison, approximately 53 percent of free-to-download iOS apps reported that they collect private data from users worldwide.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Monthly railway industry carloading statistics for intermodal and non-intermodal traffic in metric tonnes, for the period from January to the most current month of the current year, Canada, Eastern Division and Western Division.
https://datacatalog1.worldbank.org/public-licenses?fragment=cc
National statistical systems are facing significant challenges. These challenges arise from increasing demands for high quality and trustworthy data to guide decision making, coupled with the rapidly changing landscape of the data revolution. To help create a mechanism for learning amongst national statistical systems, the World Bank has developed improved Statistical Performance Indicators (SPI) to monitor the statistical performance of countries. The SPI focuses on five key dimensions of a country’s statistical performance: (i) data use, (ii) data services, (iii) data products, (iv) data sources, and (v) data infrastructure. This will replace the Statistical Capacity Index (SCI) that the World Bank has regularly published since 2004.
The SPI are composed of more than 50 indicators and contain data for 186 countries, which together cover 99 percent of the world population. The data extend from 2016 to 2023, with some indicators going back to 2004.
For more information, consult the academic article published in the journal Scientific Data. https://www.nature.com/articles/s41597-023-01971-0.
Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
A direct internet link to Solomon Islands' agriculture statistics at a glance and other related information.
The median income indicates the income bracket separating the income earners into two halves of equal size.
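As a small worked illustration of that definition, the sketch below walks through ordered income brackets and returns the one in which the cumulative count of earners first reaches half of the total. The bracket labels and counts are hypothetical, and `median_bracket` is an illustrative helper rather than part of the dataset.

```python
def median_bracket(brackets):
    """Return the label of the bracket that contains the median earner.

    brackets: list of (label, earner_count), ordered from lowest to highest income.
    """
    total = sum(count for _, count in brackets)
    if total <= 0:
        raise ValueError("at least one earner is required")
    running = 0
    for label, count in brackets:
        running += count
        # The median bracket is the first one where half the earners are covered.
        if running * 2 >= total:
            return label


# Hypothetical distribution of 100 earners across four brackets.
example = [("under 20k", 30), ("20k-40k", 45), ("40k-60k", 15), ("60k and over", 10)]
print(median_bracket(example))
```

In this example 30 earners fall below the second bracket and 25 above it, so the second bracket is the one that splits the earners into two halves.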
This 2001 Population Census dataset contains statistics relevant to demographic, household, educational, economic, housing and internal migration characteristics of the Hong Kong population residing in the 139 Large Tertiary Planning Unit Groups in 2001. The dataset also contains the boundaries of individual Large Tertiary Planning Unit Groups. Since 1961, a population census has been conducted in Hong Kong every 10 years and a by-census in the middle of the intercensal period. The 2001 Population Census, which was conducted in March 2001, provides benchmark statistics on the socio-economic characteristics of the Hong Kong population vital to the planning and policy formulation of the government. This dataset will be incorporated into the Population Distribution Framework Spatial Data Theme.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Avg Weekly Earnings: OS: Dry Cleaning & Laundry ex Coin Operated data was reported at 528.540 USD in May 2018. This records a decrease from the previous number of 535.460 USD for Apr 2018. The data is updated monthly, with a median of 439.230 USD from Mar 2006 to May 2018 across 147 observations. The data reached an all-time high of 537.420 USD in Dec 2017 and a record low of 378.000 USD in Aug 2006. The series remains in active status in CEIC and is reported by the Bureau of Labor Statistics. The data is categorized under Global Database’s USA – Table US.G032: Current Employment Statistics Survey: Average Weekly and Hourly Earnings.
The National Prisoner Statistics (NPS) data collection began in 1926 in response to a congressional mandate to gather information on persons incarcerated in state and federal prisons. Originally under the auspices of the United States Census Bureau, the collection moved to the Bureau of Prisons in 1950, and then in 1971 to the National Criminal Justice Information and Statistics Service, the precursor to the Bureau of Justice Statistics (BJS) which was established in 1979. Since 1979, the Census Bureau has been the NPS data collection agent. The NPS is administered to 51 respondents. Before 2001, the District of Columbia was also a respondent, but responsibility for housing the District of Columbia's sentenced prisoners was transferred to the federal Bureau of Prisons, and by yearend 2001 the District of Columbia no longer operated a prison system. The NPS provides an enumeration of persons in state and federal prisons and collects data on key characteristics of the nation's prison population. NPS has been adapted over time to keep pace with the changing information needs of the public, researchers, and federal, state, and local governments.
This dataset includes New York State historical shoreline positions represented as digital vector polylines from 1880 to 2015. Shorelines were compiled from topographic survey sheets from the National Oceanic and Atmospheric Administration (NOAA). Historical shoreline positions can be used to assess the movement of shorelines through time. Rates of shoreline change were calculated in ArcMap 10.5.1 using the Digital Shoreline Analysis System (DSAS) version 5.0. DSAS uses a measurement baseline method to calculate rate of change statistics. Transects are cast from the reference baseline to intersect each shoreline, establishing measurement points used to calculate shoreline change rates. For wetland shorelines these rates can be interpreted as accretion or erosion.
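The rate calculation along a single transect can be sketched as follows. This is a simplified illustration rather than DSAS itself: it assumes each shoreline's intersection with the transect has already been reduced to a distance in meters from the baseline, and it computes two common rate-of-change statistics, an end point rate and a linear regression rate. The survey years, distances, and the sign convention (negative meaning movement toward the baseline, read here as erosion) are hypothetical.

```python
def end_point_rate(years, distances):
    """Net movement between the oldest and newest shoreline, divided by elapsed years."""
    pairs = sorted(zip(years, distances))
    (first_year, first_dist), (last_year, last_dist) = pairs[0], pairs[-1]
    return (last_dist - first_dist) / (last_year - first_year)


def linear_regression_rate(years, distances):
    """Slope of the least-squares line fitted to distance from baseline versus year."""
    n = len(years)
    mean_year = sum(years) / n
    mean_dist = sum(distances) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (d - mean_dist) for y, d in zip(years, distances))
    return sxy / sxx


# Hypothetical transect: distance (m) from the baseline to each dated shoreline.
years = [1880, 1930, 1980, 2015]
distances = [100.0, 90.0, 80.0, 73.0]

print(end_point_rate(years, distances))          # meters per year
print(linear_regression_rate(years, distances))  # meters per year
```

The end point rate uses only the oldest and newest shorelines, while the regression rate uses every dated position, so the two can differ when shoreline movement is not steady over time.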