https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/RCHDXX
This dataset contains replication files for "A Practical Method to Reduce Privacy Loss when Disclosing Statistics Based on Small Samples" by Raj Chetty and John Friedman. For more information, see https://opportunityinsights.org/paper/differential-privacy/. A summary of the related publication follows. Releasing statistics based on small samples – such as estimates of social mobility by Census tract, as in the Opportunity Atlas – is very valuable for policy but can potentially create privacy risks by unintentionally disclosing information about specific individuals. To mitigate such risks, we worked with researchers at the Harvard Privacy Tools Project and Census Bureau staff to develop practical methods of reducing the risks of privacy loss when releasing such data. This paper describes the methods that we developed, which can be applied to disclose any statistic of interest that is estimated using a sample with a small number of observations. We focus on the case where the dataset can be broken into many groups (“cells”) and one is interested in releasing statistics for one or more of these cells. Building on ideas from the differential privacy literature, we add noise to the statistic of interest in proportion to the statistic’s maximum observed sensitivity, defined as the maximum change in the statistic from adding or removing a single observation across all the cells in the data. Intuitively, our approach permits the release of statistics in arbitrarily small samples by adding sufficient noise to the estimates to protect privacy. Although our method does not offer a formal privacy guarantee, it generally outperforms widely used methods of disclosure limitation such as count-based cell suppression both in terms of privacy loss and statistical bias. We illustrate how the method can be implemented by discussing how it was used to release estimates of social mobility by Census tract in the Opportunity Atlas. 
We also provide a step-by-step guide and illustrative Stata code to implement our approach.
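The noise-addition idea summarised above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Stata code or their exact calibration: the statistic is taken to be a cell mean, sensitivity is measured by removing (not adding) one observation, the noise is Laplace, and `epsilon`, the cell names, and all function names are hypothetical.

```python
import random


def loo_sensitivity(values):
    """Largest change in the mean of one cell when a single observation is removed."""
    n = len(values)
    if n < 2:
        return 0.0  # assumption: a one-observation cell adds nothing to the maximum
    total = sum(values)
    full_mean = total / n
    return max(abs((total - v) / (n - 1) - full_mean) for v in values)


def max_observed_sensitivity(cells):
    """Maximum leave-one-out sensitivity across all cells in the data."""
    return max(loo_sensitivity(v) for v in cells.values())


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_release(cells, epsilon, rng):
    """Return each cell mean plus noise scaled to the maximum observed sensitivity.

    `epsilon` is a hypothetical privacy-loss parameter: smaller means more noise.
    """
    scale = max_observed_sensitivity(cells) / epsilon
    return {
        name: sum(v) / len(v) + laplace_noise(scale, rng)
        for name, v in cells.items()
    }


# Hypothetical cells: two small groups of observations.
cells = {"tract_a": [1.0, 2.0, 3.0], "tract_b": [10.0, 20.0]}
released = noisy_release(cells, epsilon=1.0, rng=random.Random(0))
```

Because the noise scale is driven by the most sensitive cell, every cell, however small, receives enough noise to mask the removal of any single observation seen in the data.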
This dataset is a polygon coverage of counties limited to the extent of the Pond Creek coal bed resource areas and attributed with statistics on the thickness of the Pond Creek coal zone, its elevation, and overburden thickness, in feet. The file has been generalized from detailed geologic coverages found elsewhere in Professional Paper 1625-C.
Historical Employment Statistics 1990 - current. The Current Employment Statistics (CES) program provides the most current estimates of nonfarm employment, hours, and earnings data by industry (place of work) for the nation as a whole, all states, and most major metropolitan areas. The CES survey is a federal-state cooperative endeavor in which states develop state and sub-state data using concepts, definitions, and technical procedures prescribed by the Bureau of Labor Statistics (BLS). Estimates produced by the CES program include both full- and part-time jobs. Excluded are self-employment, as well as agricultural and domestic positions. In Connecticut, more than 4,000 employers are surveyed each month to determine the number of jobs in the state. For more information please visit us at http://www1.ctdol.state.ct.us/lmi/ces/default.asp.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Households in sample (Number) by Region and Year
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Dataset contains counts and measures for individuals from the 2013, 2018, and 2023 Censuses. Data is available by statistical area 2.
The variables included in this dataset are for the census usually resident population count (unless otherwise stated). All data is for level 1 of the classification (unless otherwise stated).
The variables for part 1 of the dataset are:
Download lookup file for part 1 from Stats NZ ArcGIS Online or embedded attachment in Stats NZ geographic data service. Download data table (excluding the geometry column for CSV files) using the instructions in the Koordinates help guide.
Footnotes
Te Whata
Under the Mana Ōrite Relationship Agreement, Te Kāhui Raraunga (TKR) will be publishing Māori descent and iwi affiliation data from the 2023 Census in partnership with Stats NZ. This will be available on Te Whata, a TKR platform.
Geographical boundaries
Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.
Subnational census usually resident population
The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city.
Population counts
Stats NZ publishes a number of different population counts, each using a different definition and methodology. Population statistics – user guide has more information about different counts.
Caution using time series
Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).
Study participation time series
In the 2013 Census study participation was only collected for the census usually resident population count aged 15 years and over.
About the 2023 Census dataset
For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. Where we were confident that people who hadn’t completed a census form should be counted, we added real data about those real people to the dataset (this is known as admin enumeration). We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.
Data quality
The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.
Concept descriptions and quality ratings
Data quality ratings for 2023 Census variables has additional details about variables found within totals by topic, for example, definitions and data quality.
Disability indicator
This data should not be used as an official measure of disability prevalence. Disability prevalence estimates are only available from the 2023 Household Disability Survey. Household Disability Survey 2023: Final content has more information about the survey.
Activity limitations are measured using the Washington Group Short Set (WGSS). The WGSS asks about six basic activities that a person might have difficulty with: seeing, hearing, walking or climbing stairs, remembering or concentrating, washing all over or dressing, and communicating. A person was classified as disabled in the 2023 Census if there was at least one of these activities that they had a lot of difficulty with or could not do at all.
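The classification rule described above can be written as a short check. This is a minimal sketch, assuming each person's answers arrive as a mapping from activity to a response label; the label strings and function name are hypothetical and are not taken from the census forms or Stats NZ code.

```python
# The six WGSS activities named in the text above.
WGSS_ACTIVITIES = (
    "seeing",
    "hearing",
    "walking or climbing stairs",
    "remembering or concentrating",
    "washing all over or dressing",
    "communicating",
)

# Hypothetical response labels for the two most severe answers.
SEVERE_RESPONSES = {"a lot of difficulty", "cannot do at all"}


def is_disabled(responses):
    """Return True if at least one activity has a severe response.

    `responses` maps an activity name to a response label; missing
    activities are treated as not severe (an assumption of this sketch).
    """
    return any(responses.get(activity) in SEVERE_RESPONSES for activity in WGSS_ACTIVITIES)


person = {"seeing": "some difficulty", "hearing": "a lot of difficulty"}
print(is_disabled(person))
```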
Using data for good
Stats NZ expects that work with census data is done with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that “data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga”.
Confidentiality
The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.
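To make the rounding step concrete, here is a minimal sketch of random rounding to base 3 as it is commonly described: multiples of 3 stay unchanged, and any other count moves to the nearer multiple with probability 2/3 or to the farther one with probability 1/3. The "fixed" behaviour is approximated by deriving the random draw from a hash of a hypothetical `cell_key`, so the same cell always rounds the same way. This is an assumption-laden illustration, not Stats NZ's implementation, and it does not model the suppression rule.

```python
import hashlib


def frr3(count, cell_key):
    """Round `count` to a multiple of 3, deterministically for a given cell_key."""
    remainder = count % 3
    if remainder == 0:
        return count  # already a multiple of 3: unchanged

    lower = count - remainder
    upper = lower + 3
    # remainder 1 is closer to the lower multiple, remainder 2 to the upper one.
    nearer, farther = (lower, upper) if remainder == 1 else (upper, lower)

    # Hash the cell key and count into a reproducible number in [0, 1).
    digest = hashlib.sha256(f"{cell_key}:{count}".encode()).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64

    return nearer if u < 2 / 3 else farther


rounded = [frr3(c, "area-A|age-15-29") for c in (6, 7, 8)]
```

Because each component and each total is rounded independently, rounded figures need not add up to the rounded total, which is why individual figures may not sum to stated totals.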
Measures
Measures like averages, medians, and other quantiles are calculated from unrounded counts, with input noise added to or subtracted from each contributing value during measures calculations. Averages and medians based on less than six units (e.g. individuals, dwellings, households, families, or extended families) are suppressed. This suppression threshold changes for other quantiles. Where the cells have been suppressed, a placeholder value has been used.
Percentages
To calculate percentages, divide the figure for the category of interest by the figure for 'Total stated' where this applies.
Symbol
-997 Not available
-999 Confidential
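The percentage rule and the placeholder symbols above interact: a placeholder code must never be treated as a real count. The sketch below is one hedged way to handle that, assuming counts arrive as integers and that a missing result is represented by `None`; the function and constant names are hypothetical.

```python
NOT_AVAILABLE = -997  # symbol listed above: not available
CONFIDENTIAL = -999   # symbol listed above: confidential
PLACEHOLDERS = {NOT_AVAILABLE, CONFIDENTIAL}


def percentage(category_count, total_stated):
    """Return category_count as a percentage of total_stated, or None if undefined.

    None is returned when either input is a placeholder code or when the
    'Total stated' figure is zero (assumptions of this sketch).
    """
    if category_count in PLACEHOLDERS or total_stated in PLACEHOLDERS:
        return None
    if total_stated == 0:
        return None
    return 100.0 * category_count / total_stated


print(percentage(30, 120))            # 25.0
print(percentage(CONFIDENTIAL, 120))  # None
```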
Inconsistencies in definitions
Please note that there may be differences in definitions between census classifications and those used for other data collections.
http://reference.data.gov.uk/id/open-government-licence
Contains statistics on the UK's economy, industry, society and demography presented in easy to read tables and backed up with explanatory notes and definitions. It covers, among others, the following areas: area; parliamentary elections; defence; population and vital statistics; education; labour market; expenditure and wealth; health; crime and justice; lifestyles; environment, housing; transport and communications; government finance; agriculture, fisheries and food; production; banking and insurance and service industry.
Source agency: Office for National Statistics
Designation: National Statistics
Language: English
Alternative title: AA
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Mean and Median Hourly Earnings by Year, Nationality, Statistic, Sex and Employment Status
Dataset of all the data supplied by each local authority and imputed figures used for national estimates.
This file is no longer being updated to include any late revisions local authorities may have reported to the department. Please use instead the Local authority housing statistics open data file for the latest data.
MS Excel Spreadsheet, 1.26 MB
This folder contains 3 .csv files which contain all the observations for the suite of major ion and nutrient constituents for the Heart River Basin. These files contain the water-quality observations for the statistical summary tables in the report cited in this data release (Tatge and others, 2021). The allsiteinfo.table.csv file can be used to cross reference the sites with the main report (Tatge and others, 2021).
Tatge, W.S., Nustad, R.A., and Galloway, J.M., 2021, Evaluation of Salinity and Nutrient Conditions in the Heart River Basin, North Dakota, 1970-2020: U.S. Geological Survey Scientific Investigations Report 2021-XXXX, XX p.
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
This is the latest statistical publication of linked HES (Hospital Episode Statistics) and DID (Diagnostic Imaging Dataset) data held by the Health and Social Care Information Centre. The HES-DID linkage provides the ability to undertake national (within England) analysis along acute patient pathways to understand typical imaging requirements for given procedures, and/or the outcomes after particular imaging has been undertaken, thereby enabling a much deeper understanding of outcomes of imaging and to allow assessment of variation in practice. This publication aims to highlight to users the availability of this updated linkage and provide users of the data with some standard information to assess their analysis approach against. The two data sets have been linked using specific patient identifiers collected in HES and DID. The linkage allows the data sets to be linked from April 2012 when the DID data was first collected; however this report focuses on patients who were present in either data set for the period April 2015-February 2016 only. For DID this is provisional 2015/16 data. For HES this is provisional 2015/16 data. The linkage used for this publication was created on 06 June 2016 and released together with this publication on 07 July 2016.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
Sheep statistics, supply and disposition of sheep and lambs, Canada and provinces (head x 1,000). Data are available on an annual basis.
This publication gives previously published copies of the quarterly National Statistics publication on egg production, usage and prices that showed figures for 2023. Each publication gives the figures available at that time. The figures are subject to revision each quarter as new information becomes available.
The latest publication and accompanying data sets can be found here.
For further information please contact:
julie.rumsey@defra.gov.uk
Twitter: @DefraStats, https://twitter.com/@defrastats
https://www.usa.gov/government-works
List of footnotes, notes, and source information for NHIS Adult Summary Statistics. Each row of this dataset contains the accompanying text for a footnote found in the NHIS Adults Summary Statistics Dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The "Utah 64 Small Health Statistics Areas" feature layer was developed in 1997 by the Office of Public Health Assessment, Utah Department of Health, using small area analysis methodology. Each feature was generated by combining a sufficient number of adjacent ZIP code area features to form a geographic area of approximately 33,500 persons (range 15,000 to 62,500). Criteria used for determining which ZIP code areas to combine to form a Small Health Statistics Area included population size, local health district and county boundaries, similarity of the ZIP code populations' income levels, and community political boundaries. Input from local community representatives was used to refine area designations. The Utah 64 Small Health Statistics Areas provide a means of geographically analyzing and presenting health statistics at the community level. Producing information at the small-area level in Utah provides community planners and others with information that is specific to the populations living in their communities of concern. Small area analysis also allows an investigator to explore ecologic relationships between health status, lifestyle, the environment and the health system. In areas where a ZIP code crosses a county boundary, the 2008 and 2009 versions of Small Statistical Areas honor the ZIP code boundary, leading to cases where a Small Statistical Area can be in multiple counties. The 2012 and 2014 versions correct this issue by splitting ZIP code areas by county boundaries, resulting in Small Statistical Areas that can only be found in one county. In the 2017 version, area 57 Grand/San Juan Counties was split into 2 areas, area 57.1 Grand County and 57.2 San Juan County.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The most important skills for the development of the enterprise in the coming years by Firm Employment Size, Year and Statistic
The Home Office has changed the format of the published data tables for a number of areas (asylum and resettlement, entry clearance visas, extensions, citizenship, returns, detention, and sponsorship). These now include summary tables, and more detailed datasets (available on a separate page, link below). A list of all available datasets on a given topic can be found in the ‘Contents’ sheet in the ‘summary’ tables. Information on where to find historic data in the ‘old’ format is in the ‘Notes’ page of the ‘summary’ tables. The Home Office intends to make these changes in other areas in the coming publications. If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.
Immigration statistics, year ending March 2020
Immigration Statistics Quarterly Release
Immigration Statistics User Guide
Publishing detailed data tables in migration statistics
Policy and legislative changes affecting migration to the UK: timeline
Immigration statistics data archives
Asylum and resettlement summary tables, year ending March 2020 second edition (MS Excel Spreadsheet, 123 KB): https://assets.publishing.service.gov.uk/media/5f1e9c14e90e0745691135e9/asylum-summary-mar-2020-tables.xlsx
Detailed asylum and resettlement datasets
Sponsorship summary tables, year ending March 2020 (MS Excel Spreadsheet, 72.7 KB): https://assets.publishing.service.gov.uk/media/5ebe9d9786650c2791ec7166/sponsorship-summary-mar-2020-tables.xlsx
Entry clearance visas summary tables, year ending March 2020 (MS Excel Spreadsheet, 66.1 KB): https://assets.publishing.service.gov.uk/media/5ebe9d77d3bf7f5d37fa0d9f/visas-summary-mar-2020-tables.xlsx
Detailed entry clearance visas datasets
Passenger arrivals (admissions) summary tables, year ending March 2020 (MS Excel Spreadsheet, 76.1 KB): https://assets.publishing.service.gov.uk/media/5ebe9e4b86650c279626e5f2/passenger-arrivals-admissions-summary-mar-2020-tables.xlsx
Detailed Passengers initially refused entry at port datasets
Extensions summary tables, year ending March 2020 (MS Excel Spreadsheet, 41.8 KB): https://assets.publishing.service.gov.uk/media/5ebe9edb86650c2791ec7167/extentions-summary-mar-2020-tables.xlsx
Table of INEBase Coverage of the statistic by sectors and years. National. Statistics on R&D Activities in the Business Sector
The Bath and North East Somerset Council has one of the largest databases in the world on the production and trade of minerals. The dataset contains annual production statistics by mass for more than 70 mineral commodities covering the majority of economically important and internationally-traded minerals, metals and mineral-based materials. For each commodity the annual production statistics are recorded for individual countries, grouped by continent. Import and export statistics are also available for years up to 2002. Maintenance of the database is funded by the Science Budget and output is used by government, private industry and others in support of policy, economic analysis and commercial strategy. As far as possible the production data are compiled from primary, official sources. Quality assurance is maintained by participation in such groups as the International Consultative Group on Non-ferrous Metal Statistics. Individual commodity and country tables are available for sale on request.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Statistics on Capital Markets Services Licence holders by Core Activity
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Persons with a Disability as a Percentage of All Population by Age Group, Sex, Census Year and Statistic