100+ datasets found
  1. U

    Statistical Abstract of the United States, 2007

    • dataverse-staging.rdmc.unc.edu
    Updated Oct 27, 2011
    Cite
    UNC Dataverse (2011). Statistical Abstract of the United States, 2007 [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CD-0227
    Explore at:
    Dataset updated
    Oct 27, 2011
    Dataset provided by
    UNC Dataverse
    License

    https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=hdl:1902.29/CD-0227

    Description

    "The Statistical Abstract of the United States, published since 1878, is the standard summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference and as a guide to other statistical publications and sources. The latter function is served by the introductory text to each section, the source note appearing below each table, and Appendix I, which comprises the Guide to Sources of Statistics, the Guide to State Statistical Abstracts, and the Guide to Foreign Statistical Abstracts. The Statistical Abstract sections and tables are compiled into one Adobe PDF named StatAbstract2007.pdf. This PDF is bookmarked by section and by table and can be searched using the Acrobat Search feature. The Statistical Abstract on CD-ROM is best viewed using Adobe Acrobat 5, or any subsequent version of Acrobat or Acrobat Reader. The Statistical Abstract tables and the metropolitan areas tables from Appendix II are available as Excel (.xls or .xlw) spreadsheets. In most cases, these spreadsheet files offer the user direct access to more data than are shown either in the publication or Adobe Acrobat. These files usually contain more years of data, more geographic areas, and/or more categories of subjects than those shown in the Acrobat version. The extensive selection of statistics is provided for the United States, with selected data for regions, divisions, states, metropolitan areas, cities, and foreign countries from reports and records of government and private agencies. Software on the disc can be used to perform full-text searches, view official statistics, open tables as Lotus worksheets or Excel workbooks, and link directly to source agencies and organizations for supporting information. Except as indicated, figures are for the United States as presently constituted.
Although emphasis in the Statistical Abstract is primarily given to national data, many tables present data for regions and individual states and a smaller number for metropolitan areas and cities. Statistics for the Commonwealth of Puerto Rico and for island areas of the United States are included in many state tables and are supplemented by information in Section 29. Additional information for states, cities, counties, metropolitan areas, and other small units, as well as more historical data are available in various supplements to the Abstract. Statistics in this edition are generally for the most recent year or period available by summer 2006. Each year over 1,400 tables and charts are reviewed and evaluated; new tables and charts of current interest are added, continuing series are updated, and less timely data are condensed or eliminated. Text notes and appendices are revised as appropriate. This year we have introduced 72 new tables covering a wide range of subject areas. These cover a variety of topics including: learning disability for children, people impacted by the hurricanes in the Gulf Coast area, employees with alternative work arrangements, adult computer and Internet users by selected characteristics, North America cruise industry, women- and minority-owned businesses, and the percentage of the adult population considered to be obese.
Some of the annually surveyed topics are population; vital statistics; health and nutrition; education; law enforcement, courts and prison; geography and environment; elections; state and local government; federal government finances and employment; national defense and veterans affairs; social insurance and human services; labor force, employment, and earnings; income, expenditures, and wealth; prices; business enterprise; science and technology; agriculture; natural resources; energy; construction and housing; manufactures; domestic trade and services; transportation; information and communication; banking, finance, and insurance; arts, entertainment, and recreation; accommodation, food services, and other services; foreign commerce and aid; outlying areas; and comparative international statistics." Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check out the CDs on the honor system. Items can be checked out for a period of two weeks. Loan forms are located adjacent to the collection.

  2. m

    COVID-19 Combined Data-set with Improved Measurement Errors

    • data.mendeley.com
    Updated May 13, 2020
    Cite
    Afshin Ashofteh (2020). COVID-19 Combined Data-set with Improved Measurement Errors [Dataset]. http://doi.org/10.17632/nw5m4hs3jr.3
    Explore at:
    Dataset updated
    May 13, 2020
    Authors
    Afshin Ashofteh
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Public health-related decision-making on policies aimed at controlling the COVID-19 pandemic outbreak depends on complex epidemiological models that are compelled to be robust and use all relevant available data. This data article provides a new combined worldwide COVID-19 dataset obtained from official data sources, with corrections for systematic measurement errors and a dedicated dashboard for online data visualization and summary. The dataset adds new measures and attributes to the normal attributes of official data sources, such as daily mortality and fatality rates. We used comparative statistical analysis to evaluate the measurement errors of COVID-19 official data collections from the Chinese Center for Disease Control and Prevention (Chinese CDC), World Health Organization (WHO) and European Centre for Disease Prevention and Control (ECDC). The data is collected by using text mining techniques and reviewing PDF reports, metadata, and reference data. The combined dataset includes complete spatial data such as country area, international number of countries, Alpha-2 code, Alpha-3 code, latitude, longitude, and some additional attributes such as population. The improved dataset benefits from major corrections to the referenced data sets and official reports, such as adjustments to the reporting dates (which suffered from a one- to two-day lag), removal of negative values, detection of unreasonable changes to historical data in new reports, and corrections of systematic measurement errors, which have been increasing as the pandemic outbreak spreads and more countries contribute data to the official repositories. Additionally, the root mean square error of attributes in the paired comparison of datasets was used to identify the main data problems. The data for China is presented separately and in more detail, and it has been extracted from the attached reports available on the main page of the CCDC website.
This dataset is a comprehensive and reliable source of worldwide COVID-19 data that can be used in epidemiological models assessing the magnitude and timeline for confirmed cases, long-term predictions of deaths or hospital utilization, the effects of quarantine, stay-at-home orders and other social distancing measures, the pandemic’s turning point or in economic and social impact analysis, helping to inform national and local authorities on how to implement an adaptive response approach to re-opening the economy, re-open schools, alleviate business and social distancing restrictions, design economic programs or allow sports events to resume.
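
    The paired comparison mentioned in the description can be illustrated with a minimal sketch. It assumes two sources report the same daily attribute for the same dates; the series names and all numbers are invented for illustration, and replacing negative counts with a missing marker is only one possible reading of "removing negative values", not necessarily the author's procedure.

```python
import math

def rmse(series_a, series_b):
    """Root mean square error between two equally long paired series."""
    if len(series_a) != len(series_b) or not series_a:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(series_a, series_b)]
    return math.sqrt(sum(squared) / len(squared))

def drop_negative(values):
    """Replace negative daily counts with None (assumed handling)."""
    return [v if v >= 0 else None for v in values]

# Hypothetical daily new-case counts from two sources for the same four dates.
source_a = [10, 12, 15, 20]
source_b = [10, 14, 15, 16]

print(rmse(source_a, source_b))      # larger values flag attributes that disagree more
print(drop_negative([5, -3, 7]))
```

    A high RMSE for one attribute across a pair of sources points to where the sources diverge most, which is how the description says the main data problems were identified.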

  3. i

    Household Health Survey 2012-2013, Economic Research Forum (ERF)...

    • catalog.ihsn.org
    • datacatalog.ihsn.org
    Updated Jun 26, 2017
    Cite
    Central Statistical Organization (CSO) (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://catalog.ihsn.org/index.php/catalog/6937
    Explore at:
    Dataset updated
    Jun 26, 2017
    Dataset provided by
    Economic Research Forum
    Kurdistan Regional Statistics Office (KRSO)
    Central Statistical Organization (CSO)
    Time period covered
    2012 - 2013
    Area covered
    Iraq
    Description

    Abstract

    The harmonized data set on health, created and published by the ERF, is a subset of the Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules collected in the context of that survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

    ----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

    Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). As part of the cooperation between the CSO and the World Bank (WB), the Central Statistical Organization (CSO) and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year, covering all governorates including those in the Kurdistan Region.

    The survey has six main objectives. These objectives are:

    1. Provide data for poverty analysis and measurement, and monitor, evaluate and update the implementation of the Poverty Reduction National Strategy issued in 2009.
    2. Provide a comprehensive data system to assess household social and economic conditions and prepare the indicators related to human development.
    3. Provide data that meet the needs and requirements of national accounts.
    4. Provide detailed indicators on consumption expenditure that support decision-making related to production, consumption, export and import.
    5. Provide detailed indicators on the sources of household and individual income.
    6. Provide the data necessary for formulating a new consumer price index.

    The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum to create a version comparable with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf, provided in the external resources, for further information on the mapping of the original variables onto the harmonized ones, as well as details of the variables' availability in both survey years and relevant comments.

    Geographic coverage

    National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Design:

    The sample size was 25488 households for the whole of Iraq: 216 households in each of the 118 districts, drawn as 2832 clusters of 9 households each, distributed across districts and governorates in rural and urban areas.

    ----> Sample frame:

    The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all governorates, including the Kurdistan Region, as the frame for selecting households. The sample was selected in two stages. Stage 1: Primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size to reach 2832 units (clusters). Stage 2: 9 households from each primary sampling unit were selected to create a cluster. The total sample was thus 25488 households distributed across the governorates, 216 households in each district.

    ----> Sampling Stages:

    In each district, the sample was selected in two stages. Stage 1: Based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown by urban and rural and the geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames for each stage can be developed based on the 2010 building listing and numbering without updating household lists. In some small districts, random selection of primary sampling units may yield fewer than 24 distinct units, so a sampling unit may be selected more than once; when necessary, two or more clusters may be drawn from the same enumeration unit.
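
    The sample-size arithmetic and the two selection stages can be sketched as follows. This is a minimal illustration, not the survey's actual selection program: the block sizes are invented, and the function names are hypothetical. Note that in systematic selection with probability proportional to size, a large block can be hit more than once, which matches the remark about repeated units in small districts.

```python
import random

DISTRICTS = 118
CLUSTERS_PER_DISTRICT = 24
HOUSEHOLDS_PER_CLUSTER = 9

clusters = DISTRICTS * CLUSTERS_PER_DISTRICT                    # 2832
households = clusters * HOUSEHOLDS_PER_CLUSTER                  # 25488
per_district = CLUSTERS_PER_DISTRICT * HOUSEHOLDS_PER_CLUSTER   # 216

def systematic_pps(sizes, n, rng):
    """Stage 1: pick n units systematically with probability proportional to size.

    Returns indices into `sizes`; a large unit may appear more than once.
    """
    step = sum(sizes) / n
    start = rng.uniform(0, step)
    selected, cumulative, index = [], 0, 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative size range contains the point.
        while index < len(sizes) - 1 and cumulative + sizes[index] <= point:
            cumulative += sizes[index]
            index += 1
        selected.append(index)
    return selected

def systematic_equal(n_units, n, rng):
    """Stage 2: pick n of n_units positions with equal probability."""
    step = n_units / n
    start = rng.uniform(0, step)
    return [min(int(start + k * step), n_units - 1) for k in range(n)]

rng = random.Random(2012)
block_sizes = [50, 120, 30, 200, 80] * 10     # invented household counts per block
blocks = systematic_pps(block_sizes, CLUSTERS_PER_DISTRICT, rng)
chosen = systematic_equal(block_sizes[blocks[0]], HOUSEHOLDS_PER_CLUSTER, rng)
print(clusters, households, per_district, len(blocks), len(chosen))
```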

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    ----> Preparation:

    The questionnaire of the 2006 survey was adopted as the basis for designing the questionnaire of the 2012 survey, to which many revisions were made. Two rounds of pre-tests were carried out. Revisions were made based on the feedback of the fieldwork team, World Bank consultants and others, and further revisions were made before the final version was implemented in a pilot survey in September 2011. After the pilot survey, further revisions were made based on the challenges and feedback that emerged during its implementation, leading to the final version used in the actual survey.

    ----> Questionnaire Parts:

    The questionnaire consists of four parts, each with several sections:

    Part 1: Socio-Economic Data:

    • Section 1: Household Roster
    • Section 2: Emigration
    • Section 3: Food Rations
    • Section 4: Housing
    • Section 5: Education
    • Section 6: Health
    • Section 7: Physical measurements
    • Section 8: Job seeking and previous job

    Part 2: Monthly, Quarterly and Annual Expenditures:

    • Section 9: Expenditures on Non-Food Commodities and Services (past 30 days).
    • Section 10: Expenditures on Non-Food Commodities and Services (past 90 days).
    • Section 11: Expenditures on Non-Food Commodities and Services (past 12 months).
    • Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days).
    • Section 12, Table 1: Meals Had Within the Residential Unit.
    • Section 12, Table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.

    Part 3: Income and Other Data:

    • Section 13: Job
    • Section 14: Paid jobs
    • Section 15: Agriculture, forestry and fishing
    • Section 16: Household non-agricultural projects
    • Section 17: Income from ownership and transfers
    • Section 18: Durable goods
    • Section 19: Loans, advances and subsidies
    • Section 20: Shocks and strategy of dealing in the households
    • Section 21: Time use
    • Section 22: Justice
    • Section 23: Satisfaction in life
    • Section 24: Food consumption during past 7 days

    Part 4: Diary of Daily Expenditures: The expenditure diary is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording the expenditures of each day, so the diary consists of 14 pages.

    Cleaning operations

    ----> Raw Data:

    Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages: 1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct. 2. Local Supervisor: Checks to make sure that questions have been correctly completed. 3. Statistical analysis: After exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values, in addition to auditing some variables. 4. World Bank consultants in coordination with the CSO data management team: The World Bank technical consultants use additional programs in SPSS and Stata to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.

    ----> Harmonized Data:

    • The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.
    • The harmonization process starts with raw data files received from the Statistical Office.
    • A program is generated for each dataset to create harmonized variables.
    • Data is saved on the household and individual level, in SPSS and then converted to STATA, to be disseminated.

    Response rate

    The Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).

  4. U

    Statistical Abstract of the United States 1999

    • dataverse-staging.rdmc.unc.edu
    Updated Nov 30, 2007
    Cite
    UNC Dataverse (2007). Statistical Abstract of the United States 1999 [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CD-0014
    Explore at:
    Dataset updated
    Nov 30, 2007
    Dataset provided by
    UNC Dataverse
    License

    https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CD-0014

    Description

    The Statistical Abstract is the Nation's best known and most popular single source of statistics on the social, political, and economic organization of the country. The print version of this reference source has been published since 1878 while the compact disc version first appeared in 1993. This disc is designed to serve as a convenient, easy-to-use statistical reference source and guide to statistical publications and sources. The disc contains over 1,400 tables from over 250 different governmental, private, and international organizations. The 1999 CD reflects improved and enhanced data on the disc and the software used for accessing the information. The enrichments to the data and their access include a link from the table of contents page to a PDF of the Census web site. This enables the user to have direct links to the Statistical Abstract and its supplements and other features, such as Statistics in Brief and Frequently Requested Tables. A link to the table of contents from the first text page of each section facilitates quick movement between sections of the book. New PDFs provide more explanation of several major economic series including the Federal Budget, the National Income and Product Accounts (NIPA), the Consumer Price Index (CPI) and Producer Price Index (PPI), and the new North American Industry Classification System (NAICS). Another PDF provides information on the Federal court system. Links to these supplemental materials are provided from each appropriate table. A separate PDF presents a compilation of tables showing major economic indices, as selected by the Council of Economic Advisors. Maps of each state and their metro areas and component counties, maps outlining National Park sites throughout the country, a map of the United States with major transportation facilities and routes, a U.S. map locating coal mines and facilities, and one depicting the distribution of forest land have been added.
As usual, updates have been made to most of the more than 1,500 tables and charts that were on the previous disc with new or more recent data. The spreadsheet files, which are available in both Excel and Lotus formats, will usually have more information than the tables displayed in the book or Adobe Acrobat files. The 1999 edition introduced over 100 new tables covering a wide range of subject areas. Several sections have preliminary data from the 1997 Economic Census, which presents industry statistics for the first time based on the North American Industry Classification System (NAICS). Comparative data for 1992 and 1997, based on the Standard Industrial Classification (SIC), are also presented. Tables 872 and 873 in Section 17, Business, present summary data for industries. Other new tables cover such topics as the foreign-born population, health care expenditures, the Medicare trust fund, violence in schools, presale handgun checks, recycling programs, defense-related employment and spending, workplace violence, ownership of mutual funds, computer use, results of the 1997 Census of Agriculture, and mail order catalogue sales. In addition to the above new tables, a new section has been developed, the 20th Century Statistics. This section introduces data beginning in 1900 on a broad range of subjects, including population, vital statistics, health, education, income, labor force, communications, agriculture, defense, and other areas. The Industrial Outlook tables, previously in Section 31, have been deleted for lack of updates. For a complete list of new tables, see Appendix VI, p. 947. The Adobe Acrobat Reader and Search engine, Version 4.0, is on the disc. The Acrobat Reader allows users to view, navigate, search, and print on demand any of the pages from the book. Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill.
The collection is located in Room 10, Manning Hall. Users may check out the CDs on the honor system. Items can be checked out for a period of two weeks. Loan forms are located adjacent to the collection.

  5. C

    Statistical Data Catalog Cologne

    • ckan.mobidatalab.eu
    Updated Jul 26, 2023
    Cite
    Köln (2023). Statistical Data Catalog Cologne [Dataset]. https://ckan.mobidatalab.eu/dataset/statisticaldatacatalogue-coln
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    Köln
    License

    Data licence Germany – Attribution – Version 2.0 https://www.govdata.de/dl-de/by-2-0
    License information was derived automatically

    Description

    Data from various sources are updated in the Statistical Information System of the City of Cologne. The annual statistical yearbook publishes these in tabular, graphic and cartographic form at the level of the city districts and districts. Furthermore, definitions and calculation bases are explained. Small-scale statistics at the level of the 86 districts can be obtained from the Cologne district information. All levels of the local area structure are presented and explained in this publication.

    This statistical data catalogue supplements the range of small-scale data. Selected structural data can be called up here in compact tabular form at the level of the 570 statistical districts or the 86 districts. The two overviews provide information about which data is available and from which source it originates. The data itself is provided annually.

    Notes:

    • Data sources are indicated in the summary tables. When using the data, the data license Germany - attribution - version 2.0 must be observed.
    • Some values cannot be given, in order to protect statistical confidentiality. For the data sets of the Federal Employment Agency, these are values from 1 to < 10; for all other data records, values from 1 to < 5. Such values are marked in the data by a *.
    • The differentiation of population figures by gender is currently made according to female and male residents. The case numbers of those who define themselves as non-binary/diverse are so low at a small-scale level that they cannot be reported for reasons of statistical confidentiality.
    • Residents with a migration background are determined by combining various characteristics from the resident registration procedure. The data are to be interpreted as estimates. The statistical yearbook of the city of Cologne provides further details.
    • The information on households comes from the household generation process. This is a statistical procedure in which residents within an address are assigned to a household as far as possible by querying certain criteria. If the procedure does not identify any connections, the allocation to single-person households takes place. The statistical yearbook of the city of Cologne provides further details.
    • The data set pupils* at general schools (spatial location by place of residence) is available from 2013.
    • The number of the statistical quarter or district is a spatial location and can be linked to the geodata (see related resource below).
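
    When reading the catalog files, the * marker for confidential values needs to be handled before any arithmetic. The sketch below is a minimal illustration under stated assumptions: the column names, the semicolon delimiter, and the sample rows are all invented and may not match the actual files.

```python
import csv
import io

SUPPRESSED = "*"   # marker for values withheld for statistical confidentiality

def parse_cell(text):
    """Return an int, or None when the cell holds the suppression marker."""
    text = text.strip()
    if text == SUPPRESSED:
        return None
    return int(text)

# Invented sample in place of a downloaded file; real column names may differ.
sample = io.StringIO(
    "district_no;residents;unemployed\n"
    "101;4210;57\n"
    "102;388;*\n"
)

rows = []
for record in csv.DictReader(sample, delimiter=";"):
    rows.append({
        "district_no": record["district_no"],
        "residents": parse_cell(record["residents"]),
        "unemployed": parse_cell(record["unemployed"]),
    })

# Sum only published values; suppressed cells are unknown, not zero.
known = [r["unemployed"] for r in rows if r["unemployed"] is not None]
print(sum(known), "unemployed in districts with published values")
```

    Keeping suppressed cells as None rather than 0 avoids understating totals silently; the district number stays a string so it can be joined to the geodata key as-is.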

  6. Turnover by type of client and employment size class (2000)

    • ec.europa.eu
    Updated Oct 10, 2025
    Cite
    Eurostat (2025). Turnover by type of client and employment size class (2000) [Dataset]. http://doi.org/10.2908/BS_BS6_00
    Explore at:
    Available download formats: application/vnd.sdmx.genericdata+xml;version=2.1, json, application/vnd.sdmx.data+xml;version=3.0.0, application/vnd.sdmx.data+csv;version=1.0.0, tsv, application/vnd.sdmx.data+csv;version=2.0.0
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Eurostat https://ec.europa.eu/eurostat
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1999 - 2000
    Area covered
    France, Sweden, Finland, Portugal, Spain, Denmark, Luxembourg, United Kingdom
    Description

    Structural business statistics (SBS) describes the structure, conduct and performance of economic activities, down to the most detailed activity level (several hundred economic sectors).

    SBS are transmitted annually by the EU Member States on the basis of a legal obligation from 1995 onwards.

    SBS covers all activities of the business economy with the exception of agricultural activities and personal services. The data are provided by all EU Member States, Iceland, Norway and Switzerland, and some candidate and potential candidate countries. The data are collected by domain of activity (annex) and by dataset:

    • Annex I - Services,
    • Annex II - Industry,
    • Annex III - Trade, and
    • Annex IV - Constructions.

    Each annex contains several datasets as indicated in the SBS Regulation.

    The majority of the data is collected by National Statistical Institutes (NSIs) by means of statistical surveys, business registers or from various administrative sources. Regulatory or controlling national offices for financial institutions or central banks often provide the information required for the financial sector (NACE Rev 2 Section K / NACE Rev 1.1 Section J).

    Member States apply various statistical methods, according to the data source, such as grossing up, model-based estimation or different forms of imputation, to ensure the quality of the SBS produced.

    Main characteristics (variables) of the SBS data category:

    • Business Demographic variables (e.g. Number of enterprises),
    • "Output related" variables (e.g. Turnover, Value added),
    • "Input related" variables: labour input (e.g. Employment, Hours worked); goods and services input (e.g. Total of purchases); capital input (e.g. Material investments).

    All SBS characteristics are published on Eurostat’s website in tables; examples of the existing tables are presented below:

    • Annual enterprise statistics: Characteristics collected are published by country and detailed on NACE Rev 2 and NACE Rev 1.1 class level (4-digits). Some classes or groups in 'services' section have been aggregated.
    • Annual enterprise statistics broken down by size classes: Characteristics are published by country and detailed down to NACE Rev 2 and NACE Rev 1.1 group level (3-digits) and employment size class. For trade (NACE Rev 2 and NACE Rev 1.1 Section G) a supplementary breakdown by turnover size class is available.
    • Annual regional statistics: Four characteristics are published by NUTS-2 country region and detailed on NACE Rev 2 and NACE Rev 1.1 division level (2-digits) (but to group level (3-digits) for the trade section).

    More information on the contents of different tables: the detail level and breakdowns required starting with the reference year 2008 are defined in Commission Regulation N° 251/2009. For previous reference years they are included in Commission Regulation (EC) N° 2701/98, as amended by Commission Regulation N° 1614/2002 and Commission Regulation N° 1669/2003.

    Several important derived indicators are generated in the form of ratios of certain monetary characteristics or per head values. A list with the available derived indicators is available below in the Annex.
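
    As a rough illustration of what such derived indicators look like, the sketch below computes one per-head value and one ratio of monetary characteristics. The record keys and all figures are hypothetical and are not taken from the dataset or from Eurostat's list of indicators.

```python
def per_head(monetary_value, persons_employed):
    """Return a per-head ratio, or None when the denominator is missing or zero."""
    if not persons_employed:
        return None
    return monetary_value / persons_employed

# Hypothetical characteristics for one activity and size class.
record = {"turnover": 1_200_000.0, "value_added": 450_000.0, "persons_employed": 30}

turnover_per_head = per_head(record["turnover"], record["persons_employed"])
value_added_share = record["value_added"] / record["turnover"]

print(turnover_per_head, value_added_share)
```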

  7. Annual enterprise statistics by geographical breakdown (1995-2003)

    • ec.europa.eu
    Updated Mar 28, 2025
    Eurostat (2025). Annual enterprise statistics by geographical breakdown (1995-2003) [Dataset]. http://doi.org/10.2908/SBS_INS_5FCO96
    Available download formats
    application/vnd.sdmx.genericdata+xml;version=2.1, tsv, application/vnd.sdmx.data+xml;version=3.0.0, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.data+csv;version=2.0.0, json
    Dataset updated
    Mar 28, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1995 - 2003
    Area covered
    Czechia, Sweden, Italy, Denmark, Finland, Spain, Luxembourg, Norway, Netherlands, France
    Description

    Structural business statistics (SBS) describes the structure, conduct and performance of economic activities, down to the most detailed activity level (several hundred economic sectors).

    SBS are transmitted annually by the EU Member States on the basis of a legal obligation from 1995 onwards.

    SBS covers all activities of the business economy with the exception of agricultural activities and personal services. The data are provided by all EU Member States, Iceland, Norway and Switzerland, and by some candidate and potential candidate countries. The data are collected by domain of activity (annex) and by dataset:

    • Annex I - Services,
    • Annex II - Industry,
    • Annex III - Trade, and
    • Annex IV - Constructions.

    Each annex contains several datasets as indicated in the SBS Regulation.

    The majority of the data is collected by National Statistical Institutes (NSIs) by means of statistical surveys, business registers or from various administrative sources. Regulatory or controlling national offices for financial institutions or central banks often provide the information required for the financial sector (NACE Rev 2 Section K / NACE Rev 1.1 Section J).

    Member States apply various statistical methods, according to the data source, such as grossing up, model based estimation or different forms of imputation, to ensure the quality of SBSs produced.
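
    As a rough illustration of what "grossing up" means, the sketch below applies the textbook expansion estimator: each sampled enterprise is weighted by the number of enterprises in its stratum divided by the number sampled from it. The source does not say which estimator any Member State uses, so the function name gross_up, the strata and all figures here are assumptions made up for the example.

```python
from collections import defaultdict


def gross_up(sample, population_counts):
    """Estimate a population total from a stratified sample.

    sample: list of (stratum, value) pairs for the surveyed units.
    population_counts: number of units in the register for each stratum.
    Each unit gets the weight N_h / n_h of its stratum (expansion estimator).
    """
    sampled_per_stratum = defaultdict(int)
    for stratum, _ in sample:
        sampled_per_stratum[stratum] += 1

    total = 0.0
    for stratum, value in sample:
        weight = population_counts[stratum] / sampled_per_stratum[stratum]
        total += weight * value
    return total


# Hypothetical figures: two small enterprises sampled out of 50,
# and one large enterprise sampled out of 2.
sample = [("small", 100.0), ("small", 140.0), ("large", 2000.0)]
population_counts = {"small": 50, "large": 2}
estimate = gross_up(sample, population_counts)
print(estimate)  # 10000.0
```

    Model based estimation and imputation, also mentioned above, follow different logic and are not covered by this sketch.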

    Main characteristics (variables) of the SBS data category:

    • Business Demographic variables (e.g. Number of enterprises),
    • "Output related" variables (e.g. Turnover, Value added),
    • "Input related" variables: labour input (e.g. Employment, Hours worked); goods and services input (e.g. Total of purchases); capital input (e.g. Material investments).

    All SBS characteristics are published in tables on Eurostat’s website; examples of the existing tables are listed below:

    • Annual enterprise statistics: Characteristics collected are published by country and detailed on NACE Rev 2 and NACE Rev 1.1 class level (4-digits). Some classes or groups in 'services' section have been aggregated.
    • Annual enterprise statistics broken down by size classes: Characteristics are published by country and detailed down to NACE Rev 2 and NACE Rev 1.1 group level (3-digits) and employment size class. For trade (NACE Rev 2 and NACE Rev 1.1 Section G) a supplementary breakdown by turnover size class is available.
    • Annual regional statistics: Four characteristics are published by NUTS-2 country region and detailed on NACE Rev 2 and NACE Rev 1.1 division level (2-digits) (but to group level (3-digits) for the trade section).
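
    The three publication levels named in the list above (class, group, division) are prefixes of one another in the NACE hierarchy, so a class code can be rolled up to the coarser levels by truncation. A minimal sketch, assuming codes are written in the dotted form "47.11"; the function name nace_level and the sample code are illustrative and not part of any Eurostat tool.

```python
def nace_level(code: str, digits: int) -> str:
    """Truncate a NACE code to division (2), group (3) or class (4) level."""
    if digits not in (2, 3, 4):
        raise ValueError("digits must be 2 (division), 3 (group) or 4 (class)")
    plain = code.replace(".", "")
    if len(plain) < digits or not plain.isdigit():
        raise ValueError(f"cannot derive a {digits}-digit level from {code!r}")
    cut = plain[:digits]
    # Divisions have no dot; groups and classes put it after the division.
    return cut if digits == 2 else f"{cut[:2]}.{cut[2:]}"


print(nace_level("47.11", 4))  # 47.11 (class level)
print(nace_level("47.11", 3))  # 47.1  (group level)
print(nace_level("47.11", 2))  # 47    (division level)
```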

    The detail level and breakdowns required from reference year 2008 onwards are defined in Commission Regulation N° 251/2009. For earlier reference years they are set out in Commission Regulation (EC) N° 2701/98, as amended by Commission Regulations N° 1614/2002 and N° 1669/2003.

    Several important derived indicators are generated in the form of ratios of certain monetary characteristics or per head values. A list of the available derived indicators is given in the Annex.
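
    The two kinds of derived indicator described above can be sketched as follows. The record fields, the indicator names and the figures are hypothetical and chosen only for illustration; the indicators Eurostat actually publishes are those listed in the Annex.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when it cannot be computed."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


# Hypothetical record for one activity in one country (monetary values in
# the same currency unit, employment as a head count).
record = {"turnover": 500.0, "value_added": 120.0, "persons_employed": 40}

# Per head value: a monetary characteristic divided by a head count.
value_added_per_head = safe_ratio(record["value_added"], record["persons_employed"])

# Ratio of two monetary characteristics.
value_added_share = safe_ratio(record["value_added"], record["turnover"])

print(value_added_per_head)  # 3.0
print(value_added_share)     # 0.24
```

    Returning None instead of raising keeps suppressed or missing cells from breaking a batch calculation, which is one possible design choice among several.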

  8. Creative Industries Economic Estimates – December 2011

    • gov.uk
    Updated Dec 8, 2011
    Department for Digital, Culture, Media & Sport (2011). Creative Industries Economic Estimates – December 2011 [Dataset]. https://www.gov.uk/government/statistics/creative-industries-economic-estimates-december-2011
    Dataset updated
    Dec 8, 2011
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    Released:

    8 December 2011

    Period covered:

    2009 (Export of services)
    2008 - 2009 (GVA)
    2009 - 2010 (Employment)
    2009 - 2011 (Businesses)

    Geographic coverage:

    UK

    Next release date:

    Autumn 2012

    Summary

    This bulletin provides estimates of the contribution of Creative Industries to the economy, using the latest data available. The majority of this data is taken from National Statistics sources produced by the Office for National Statistics (ONS). Data sources include the Annual Business Survey (ABS, http://www.ons.gov.uk/ons/search/index.html?content-type=publicationContentTypes&nscl=Business+and+Energy&pubdateRangeType=last5yrs&pubdateRangeType=allDates&coverage=UK&newquery=annual+business+survey&pageSize=50&applyFilters=tr), the Inter-Departmental Business Register (IDBR, http://www.ons.gov.uk/ons/about-ons/who-we-are/services/unpublished-data/business-data/idbr/index.html) and the Labour Force Survey (LFS, http://www.ons.gov.uk/ons/search/index.html?content-type=Publication&nscl=Labour+Market&pubdateRangeType=last12months&pubdateRangeType=allDates&newquery=labour+force+survey&pageSize=25&applyFilters=true). Our definition of the Creative Industries is taken from the 2001 Creative Industries Mapping Document (http://webarchive.nationalarchives.gov.uk/+/http:/www.culture.gov.uk/reference_library/publications/4632.aspx). Further information on this can be found in the technical note.

    Experimental Statistics

    This is the second year that the Creative Industries have been estimated via the Standard Industrial Classifications (SIC07). Previously this statistical release was given the title of an ‘experimental statistic’ as the methodology was in its inaugural year and was still under development. This methodology is now in its second year and the core methodology has not changed (see page 9 for other changes) so the title ‘experimental statistics’ has been removed.

    However, the methodology for estimation used here is regularly reviewed and if you would like to contribute to this, please contact us at CIEEBulletin@culture.gsi.gov.uk.

    Time Series and Future Developments

    This set of Creative Industries Estimates represents a snapshot of the latest figures. Because of the modifications made to this release's estimates, the figures should not be directly compared to previous estimates. Recalculated estimates for previous years have been included in the release for time series analysis.

    Full Statistical Release

    This contains the headline findings, data tables and figures and a full technical note with definitions, methodology and a full list of the SIC codes used to produce these statistics.

    Tables and Headline Findings

    A summary of the key findings from these statistics, along with data tables.

    Revision Note:

    Updated 22/12/11 to correct the presentation and formatting - all estimates are unchanged from the earlier version.

    Previous Release

    The UK Statistics Authority

    This release is published in accordance with the Code of Practice for Official Statistics (2009), as produced by the UK Statistics Authority (UKSA). The UKSA has the overall objective of promoting and safeguarding the production and publication of official statistics that serve the public good. It monitors and reports on all official statistics, and promotes good practice in this area.

    Pre-releas

  9. National Energy Efficiency Data-Framework (NEED) report: summary of analysis...

    • s3.amazonaws.com
    • gov.uk
    Updated Aug 5, 2021
    Department for Business, Energy & Industrial Strategy (2021). National Energy Efficiency Data-Framework (NEED) report: summary of analysis 2021 [Dataset]. https://s3.amazonaws.com/thegovernmentsays-files/content/174/1744764.html
    Dataset updated
    Aug 5, 2021
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Business, Energy & Industrial Strategy
    Description

    The National Energy Efficiency Data-Framework (NEED) was set up to provide a better understanding of energy use and energy efficiency in domestic and non-domestic buildings in Great Britain. The data framework matches data about a property together - including energy consumption and energy efficiency measures installed - at household level.

    4 August 2021 Error notice: revisions to the June 2021 Domestic NEED annual report

    We identified 2 processing errors in this edition of the Domestic NEED Annual report and corrected them. The changes are small and do not affect the overall findings of the report, only the domestic energy consumption estimates. The impact of energy efficiency measures analysis remains unchanged. The revisions are summarised here:

    Error 1: Some properties incorrectly excluded from the 2019 gas consumption estimates

    Error 2: Processing of the EPC data

    August 2021: Survey on the future of Domestic NEED closed

    This survey (published June 2021) sought user feedback to inform BEIS’ development of Domestic NEED to better meet user requirements. It is now closed: thank you to those who responded.

    We are reviewing responses and will provide an update in due course. The responses will also inform BEIS’ decision on whether or not to pause the 2022 NEED publication to enable development work to take place.

  10. Number of enterprises by importance of barriers met in cross border trade...

    • ec.europa.eu
    Updated Oct 10, 2025
    Eurostat (2025). Number of enterprises by importance of barriers met in cross border trade and economic activity (2004) [Dataset]. http://doi.org/10.2908/BS_BS12_04
    Available download formats
    tsv, application/vnd.sdmx.data+xml;version=3.0.0, json, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.genericdata+xml;version=2.1, application/vnd.sdmx.data+csv;version=2.0.0
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2004
    Area covered
    Poland, Slovakia, Slovenia, Latvia, United Kingdom, Norway, Spain, Greece, Germany, Lithuania

  11. Manufacturing, subsections DA-DE and total (NACE Rev. 1.1, D) by employment...

    • ec.europa.eu
    Updated Oct 10, 2025
    Eurostat (2025). Manufacturing, subsections DA-DE and total (NACE Rev. 1.1, D) by employment size class (1995-2001) [Dataset]. http://doi.org/10.2908/SBS_SC_2D_DADE95
    Available download formats
    application/vnd.sdmx.data+xml;version=3.0.0, tsv, json, application/vnd.sdmx.data+csv;version=2.0.0, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.genericdata+xml;version=2.1
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1995 - 2001
    Area covered
    Czechia, Italy, Albania, Cyprus, United Kingdom, Belgium, Poland, Austria, Luxembourg, European Union

  12. Taking Part 2011/12 Quarter 3: Statistical Release

    • gov.uk
    Updated Mar 29, 2012
    Department for Digital, Culture, Media & Sport (2012). Taking Part 2011/12 Quarter 3: Statistical Release [Dataset]. https://www.gov.uk/government/statistics/taking-part-2011-12-quarter-3-statistical-release
    Dataset updated
    Mar 29, 2012
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    The Taking Part survey has run since 2005 and is the key evidence source for DCMS. It is a continuous face to face household survey of adults aged 16 and over in England and children aged 5-15 years old. This latest release presents rolling estimates incorporating data from the third quarter of year seven of the survey.

    Released:

    29 March 2012

    Period covered:

    January 2011 - December 2011

    Geographic coverage:

    National and Regional level data for England.

    Next release date:

    A release of rolling annual estimates for adults, including the fourth quarter of the 2011/12 survey year, is scheduled for the end of June 2012.

    Summary

    The latest data from the 2011/12 Taking Part survey provides reliable national estimates of adult and child engagement with sport, libraries, the arts, heritage and museums and galleries. This release builds on the 2010/2011 data and on the quarter 1 and quarter 2 releases from earlier in 2011/12 to look at a number of areas in depth and to present measures that begin to consider broader definitions of participation in our sectors. The report also looks at other measures in the survey that provide estimates of volunteering, charitable giving and civic engagement.

    The Taking Part survey is a continuous annual survey of adults and children living in private households in England, and carries the National Statistics badge, meaning that it meets the highest standards of statistical quality.

    Statistical Report

    Dashboard

    Statistical Worksheets

    These spreadsheets contain the data and sample sizes to support the material in this release:

    Previous release

    The previous Taking Part release was published on 21 December 2011 and can be found online. It also provides spreadsheets containing the data and sample sizes for each sector included in the survey.

    Pre-release access

    The document below contains a list of Ministers and Officials who have received privileged early access to this release of Taking Part data. In line with best practice, the list has been kept to a minimum and those given access for briefing purposes had a maximum of 24 hours.

    The UK Statistics Authority

    This release is published in accordance with the Code of Practice for Off

  13. Creative Industries Economic Estimates – December 2010 (Experimental...

    • gov.uk
    Updated Dec 9, 2010
    Department for Digital, Culture, Media & Sport (2010). Creative Industries Economic Estimates – December 2010 (Experimental Statistics) [Dataset]. https://www.gov.uk/government/statistics/creative-industries-economic-estimates-december-2010-experimental-statistics
    Dataset updated
    Dec 9, 2010
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    This bulletin contains experimental statistics on gross value added (GVA), employment and numbers of businesses within the creative industries.

    Please note that the methodology for these estimates was updated in 2011 and the figures in this 2010 report have been superseded by those in the 2011 report. Therefore, for up-to-date figures on the Creative Industries and a time series covering the figures in this 2010 report, please see the latest Creative Industries Economic Estimates (http://www.culture.gov.uk/publications/8682.aspx).

    Released:

    9 December 2010

    Period covered:

    2008 (GVA and Exports)
    2010 (Employment and businesses)

    Geographic coverage:

    UK (GVA, Businesses and Exports)
    Great Britain (Employment)

    Next release date:

    Autumn 2011

    Summary

    This bulletin provides estimates of the contribution of Creative Industries to the economy, using the latest data available. The majority of this data is taken from National Statistics sources produced by the Office for National Statistics (ONS). Data sources include the Annual Business Survey (ABS, http://www.ons.gov.uk/ons/search/index.html?content-type=publicationContentTypes&nscl=Business+and+Energy&pubdateRangeType=last5yrs&pubdateRangeType=allDates&coverage=UK&newquery=annual+business+survey&pageSize=50&applyFilters=tr), the Inter-Departmental Business Register (IDBR, http://www.ons.gov.uk/ons/about-ons/who-we-are/services/unpublished-data/business-data/idbr/index.html) and the Labour Force Survey (LFS, http://www.ons.gov.uk/ons/search/index.html?content-type=Publication&nscl=Labour+Market&pubdateRangeType=last12months&pubdateRangeType=allDates&newquery=labour+force+survey&pageSize=25&applyFilters=true). Our definition of the Creative Industries is taken from the 2001 Creative Industries Mapping Document (http://webarchive.nationalarchives.gov.uk/+/http:/www.culture.gov.uk/reference_library/publications/4632.aspx). Further information on this can be found in the technical note.

    Experimental Statistics

    As this is our first attempt to measure the Creative Industries using Standard Industrial Classifications (SIC 2007; http://www.ons.gov.uk/ons/guide-method/classifications/current-standard-classifications/standard-industrial-classification/index.html), this series of economic estimates is classed as experimental statistics. The statistics will be developed following further consultation with users. We are grateful for any feedback on the way in which we have used the SIC 2007 codes to measure the Creative Industries. If you would like to contribute to this process, please either use the feedback form below, or contact us at CIEEBulletin@culture.gsi.gov.uk.

    Time Series and Future Developments

    This set of Creative Industries Estimates represents a snapshot of the latest figures. Because of the change of Standard Industrial Classifications (http://www.ons.gov.uk/ons/guide-method/classifications/current-standard-classifications/standard-industrial-classification/index.html) used to produce these estimates, the figures should not be directly compared to previous estimates, which were produced using the old Standard Industrial Classifications (2003). In 2011 we will work with the Office for National Statistics (ONS) to investigate the possibility of establishing a consistent back time series, so that these estimates are more comparable with previous ones. This will be a complex task, and may not be possible for all sectors.

    We are also investigating the possibility of producing regional estimates for the Creative Industries, given the demand that we know exists for these.

    Full Statistical Release

    This contains the headline findings, data tables, and a full technical note with definitions, methodology and a full list of the SIC codes used to produce these statistics.

  14. Protected Areas Database of the United States (PAD-US) 3.0 Spatial Analysis...

    • catalog.data.gov
    • data.usgs.gov
    Updated Oct 22, 2025
    U.S. Geological Survey (2025). Protected Areas Database of the United States (PAD-US) 3.0 Spatial Analysis and Statistics [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-3-0-spatial-analysis-and-statistics
    Dataset updated
    Oct 22, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    United States
    Description

    Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and outdoor recreation access across the nation. This data release presents results from statistical summaries of the PAD-US 3.0 protection status (by GAP Status Code) and public access status for various land unit boundaries (Protected Areas Database of the United States 3.0 Vector Analysis and Summary Statistics). Summary statistics are also available to explore and download (Comma-separated Table [CSV], Microsoft Excel Workbook (.xlsx), Portable Document Format [.pdf] Report) from the PAD-US Lands and Inland Water Statistics Dashboard ( https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics ). The vector GIS analysis file, source data used to summarize statistics for areas of interest to stakeholders (National, State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative), and complete Summary Statistics Tabular Data (CSV) are included in this data release. Raster GIS analysis files are also available for combination with other raster data (Protected Areas Database of the United States (PAD-US) 3.0 Raster Analysis). The PAD-US 3.0 Combined Fee, Designation, Easement feature class in the full inventory, with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class (Protected Areas Database of the United States (PAD-US) 3.0, https://doi.org/10.5066/P9Q9LQ4B), was modified to prioritize and remove overlapping management designations, limiting overestimation in protection status or public access statistics and to support user needs for vector and raster analysis data. 
    Analysis files in this data release were clipped to the Census State boundary file to define the extent and fill in areas (largely private land) outside the PAD-US, providing a common denominator for statistical summaries.

  15. Ministry of Public Administration and Security_Statistical Yearbook_Local...

    • data.go.kr
    Updated Jun 4, 2025
    (2025). Ministry of Public Administration and Security_Statistical Yearbook_Local Government Human Resources Development Institute General Education [Dataset]. https://www.data.go.kr/en/data/15107448/openapi.do
    Available download formats
    xml
    Dataset updated
    Jun 4, 2025
    License

    https://data.go.kr/ugs/selectPortalPolicyView.do

    Description

    • The Ministry of the Interior and Safety publishes the 'Administrative Safety Statistical Yearbook' every year by compiling statistical data from the headquarters of the Ministry of the Interior and Safety and its affiliated organizations in accordance with the 'Ministry of the Interior and Safety Statistics Management Regulations'.
    • The statistical information by field included in the 'Administrative Safety Statistical Yearbook' is provided as an open API so that it can be used in various fields in both the public and private sectors.
    • This open API covers the 'Local Autonomy Human Resources Development Institute General Education' statistics, listed under 'Others' in the 'Administrative Safety Statistical Yearbook'. It provides general education statistics such as the number of courses, sessions, and participants for each of long-term education, basic education, specialized policy education, and other education.
    • In addition, the 'Administrative Safety Statistical Yearbook' can be downloaded in PDF format from the Ministry of the Interior and Safety website at Policy Data > Statistics > Statistical Yearbook/Statistics by Subject.

  16. Factors, likely limiting business growth between 2011 and 2013, by type of...

    • ec.europa.eu
    Updated Jun 23, 2025
    Eurostat (2025). Factors, likely limiting business growth between 2011 and 2013, by type of enterprise and NACE Rev. 2 (2010) [Dataset]. http://doi.org/10.2908/ACF_P_FA
    Available download formats
    application/vnd.sdmx.genericdata+xml;version=2.1, application/vnd.sdmx.data+csv;version=2.0.0, json, tsv, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.data+xml;version=3.0.0
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2010
    Area covered
    Bulgaria, Cyprus, Sweden, Slovakia, Belgium, France, Finland, United Kingdom, Spain, Lithuania
    Description

    Structural business statistics (SBS) describes the structure, conduct and performance of economic activities, down to the most detailed activity level (several hundred economic sectors).

    SBS are transmitted annually by the EU Member States on the basis of a legal obligation from 1995 onwards.

    SBS covers all activities of the business economy with the exception of agricultural activities and personal services. The data are provided by all EU Member States, Iceland, Norway and Switzerland, and some candidate and potential candidate countries. The data are collected by domain of activity (annex) and by dataset:

    • Annex I - Services,
    • Annex II - Industry,
    • Annex III - Trade, and
    • Annex IV - Construction.

    Each annex contains several datasets as indicated in the SBS Regulation.

    The majority of the data is collected by National Statistical Institutes (NSIs) by means of statistical surveys, business registers or from various administrative sources. Regulatory or controlling national offices for financial institutions or central banks often provide the information required for the financial sector (NACE Rev 2 Section K / NACE Rev 1.1 Section J).

    Member States apply various statistical methods, according to the data source, such as grossing up, model based estimation or different forms of imputation, to ensure the quality of SBSs produced.

    Main characteristics (variables) of the SBS data category:

    • Business Demographic variables (e.g. Number of enterprises),
    • "Output related" variables (e.g. Turnover, Value added),
    • "Input related" variables: labour input (e.g. Employment, Hours worked); goods and services input (e.g. Total of purchases); capital input (e.g. Material investments).

    All SBS characteristics are published on Eurostat's website in tables. Examples of the existing tables are presented below:

    • Annual enterprise statistics: Characteristics collected are published by country and detailed on NACE Rev 2 and NACE Rev 1.1 class level (4-digits). Some classes or groups in 'services' section have been aggregated.
    • Annual enterprise statistics broken down by size classes: Characteristics are published by country and detailed down to NACE Rev 2 and NACE Rev 1.1 group level (3-digits) and employment size class. For trade (NACE Rev 2 and NACE Rev 1.1 Section G) a supplementary breakdown by turnover size class is available.
    • Annual regional statistics: Four characteristics are published by NUTS-2 country region and detailed on NACE Rev 2 and NACE Rev 1.1 division level (2-digits) (but to group level (3-digits) for the trade section).

    More information on the contents of the different tables: the detail level and breakdowns required starting with reference year 2008 are defined in Commission Regulation N° 251/2009. For previous reference years they are defined in Commission Regulation (EC) N° 2701/98, as amended by Commission Regulation N° 1614/2002 and Commission Regulation N° 1669/2003.

    Several important derived indicators are generated in the form of ratios of certain monetary characteristics or per head values. A list with the available derived indicators is available below in the Annex.

  17. Data Use in Academia Dataset

    • datacatalog.worldbank.org
    csv, utf-8
    Updated Nov 27, 2023
    Cite
    Semantic Scholar Open Research Corpus (S2ORC) (2023). Data Use in Academia Dataset [Dataset]. https://datacatalog.worldbank.org/search/dataset/0065200/data_use_in_academia_dataset
    Explore at:
    Available download formats: utf-8, csv
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Semantic Scholar Open Research Corpus (S2ORC)
    Brian William Stacy
    License

    https://datacatalog.worldbank.org/public-licenses?fragment=cc

    Description

    This dataset contains metadata (title, abstract, date of publication, field, etc) for around 1 million academic articles. Each record contains additional information on the country of study and whether the article makes use of data. Machine learning tools were used to classify the country of study and data use.


    Our data source of academic articles is the Semantic Scholar Open Research Corpus (S2ORC) (Lo et al. 2020). The corpus contains more than 130 million English language academic papers across multiple disciplines. The papers included in the Semantic Scholar corpus are gathered directly from publishers, from open archives such as arXiv or PubMed, and crawled from the internet.


    We placed some restrictions on the articles to make them usable and relevant for our purposes. First, only articles with an abstract and a parsed PDF or LaTeX file are included in the analysis. The full text of the abstract is necessary to classify the country of study and whether the article uses data. The parsed PDF and LaTeX file are needed to extract information such as the date of publication and field of study. This restriction eliminated a large number of articles in the original corpus. Around 30 million articles remain after keeping only articles with a parsable (i.e., suitable for digital processing) PDF, and around 26% of those 30 million are eliminated when removing articles without an abstract. Second, only articles from the years 2000 to 2020 were considered. This restriction eliminated an additional 9% of the remaining articles. Finally, articles from the following fields of study were excluded, as we aim to focus on fields that are likely to use data produced by countries’ national statistical systems: Biology, Chemistry, Engineering, Physics, Materials Science, Environmental Science, Geology, History, Philosophy, Math, Computer Science, and Art. Fields that are included are: Economics, Political Science, Business, Sociology, Medicine, and Psychology. This third restriction eliminated around 34% of the remaining articles. From an initial corpus of 136 million articles, this resulted in a final corpus of around 10 million articles.


    Due to the intensive computer resources required, a set of 1,037,748 articles were randomly selected from the 10 million articles in our restricted corpus as a convenience sample.


    The empirical approach employed in this project utilizes text mining with Natural Language Processing (NLP). The goal of NLP is to extract structured information from raw, unstructured text. In this project, NLP is used to extract the country of study and whether the paper makes use of data. We will discuss each of these in turn.


    To determine the country or countries of study in each academic article, two approaches are employed based on information found in the title, abstract, or topic fields. The first approach uses regular expression searches based on the presence of ISO 3166 country names. A defined set of country names is compiled, and the presence of these names is checked in the relevant fields. This approach is transparent, widely used in social science research, and easily extended to other languages. However, there is a potential for exclusion errors if a country’s name is spelled in a non-standard way.
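
    The regular expression approach described above can be sketched as follows. This is a minimal illustration, not the authors' code: the short `COUNTRY_NAMES` list stands in for the full ISO 3166 name list, and the function names are hypothetical.

```python
import re

# Assumption: a tiny illustrative subset; a real run would load every ISO 3166 name.
COUNTRY_NAMES = ["Bangladesh", "Mozambique", "Zambia", "Republic of Korea", "Niger", "Nigeria"]


def build_country_pattern(names):
    """Compile one case-insensitive pattern that matches any listed name as a whole word."""
    # Try longer names first; the word boundaries keep "Niger" from matching inside "Nigeria".
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(" + alternation + r")\b", flags=re.IGNORECASE)


def find_countries(text, names=COUNTRY_NAMES):
    """Return the sorted canonical names of all listed countries mentioned in text."""
    pattern = build_country_pattern(names)
    canonical = {name.lower(): name for name in names}
    found = {canonical[match.group(1).lower()] for match in pattern.finditer(text)}
    return sorted(found)


title = "Maternal mortality estimates in Zambia, bangladesh and Mozambique"
print(find_countries(title))  # ['Bangladesh', 'Mozambique', 'Zambia']
```

    A non-standard spelling such as "Rep. of Korea" returns no match here, which is the kind of exclusion error the text warns about.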


    The second approach is based on Named Entity Recognition (NER), which uses machine learning to identify objects from text, utilizing the spaCy Python library. The Named Entity Recognition algorithm splits text into named entities, and NER is used in this project to identify countries of study in the academic articles. SpaCy supports multiple languages and has been trained on multiple spellings of countries, overcoming some of the limitations of the regular expression approach. If a country is identified by either the regular expression search or NER, it is linked to the article. Note that one article can be linked to more than one country.


    The second task is to classify whether the paper uses data. A supervised machine learning approach is employed, where 3,500 publications were first randomly selected and manually labeled by human raters using the Mechanical Turk service (Paszke et al. 2019).[1] To make sure the human raters had a similar and appropriate definition of data in mind, they were given the following instructions before seeing their first paper:


    Each of these documents is an academic article. The goal of this study is to measure whether a specific academic article is using data and from which country the data came.

    There are two classification tasks in this exercise:

    1. Identifying whether an academic article is using data from any country.

    2. Identifying from which country that data came.

    For task 1, we are looking specifically at the use of data. Data is any information that has been collected, observed, generated or created to produce research findings. As an example, a study that reports findings or analysis using survey data uses data. Some clues that a study does use data include whether a survey or census is described, a statistical model is estimated, or a table of means or summary statistics is reported.

    After an article is classified as using data, please note the type of data used. The options are population or business census, survey data, administrative data, geospatial data, private sector data, and other data. If no data is used, then mark "Not applicable". In cases where multiple data types are used, please click multiple options.[2]

    For task 2, we are looking at the country or countries that are studied in the article. In some cases, no country may be applicable. For instance, if the research is theoretical and has no specific country application. In some cases, the research article may involve multiple countries. In these cases, select all countries that are discussed in the paper.

    We expect between 10 and 35 percent of all articles to use data.


    The median time a worker spent on an article, measured from when the worker accepted the article for classification to when the classification was submitted, was 25.4 minutes. If human raters were used exclusively rather than machine learning tools, the corpus of 1,037,748 articles examined in this study would take around 50 years of human work time to review, at a cost of $3,113,244, assuming the $3 per article that was paid to MTurk workers.
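
    The figures above can be checked with a few lines of arithmetic. The sketch below assumes that "years of human work time" means continuous, round-the-clock work (24 hours a day, 365 days a year); the source does not state this, but it is the reading that reproduces roughly 50 years.

```python
ARTICLES = 1_037_748          # articles in the sample
MINUTES_PER_ARTICLE = 25.4    # median time per article reported above
COST_PER_ARTICLE_USD = 3      # payment per article to MTurk workers

total_minutes = ARTICLES * MINUTES_PER_ARTICLE
total_hours = total_minutes / 60
# Assumption: continuous work, 24 hours a day and 365 days a year.
years_continuous = total_hours / (24 * 365)
total_cost_usd = ARTICLES * COST_PER_ARTICLE_USD

print(f"{years_continuous:.1f} years of continuous work")
print(f"${total_cost_usd:,}")  # $3,113,244
```

    Under a 40-hour work week the same workload would correspond to a much larger number of calendar years for a single rater.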


    A model is next trained on the 3,500 labelled articles. We use a distilled version of the BERT (Bidirectional Encoder Representations from Transformers) model to encode raw text into a numeric format suitable for predictions (Devlin et al. 2018). BERT is pre-trained on a large corpus comprising the Toronto Book Corpus and Wikipedia. The distilled version (DistilBERT) is a compressed model that is 60% the size of BERT, retains 97% of its language understanding capabilities, and is 60% faster (Sanh, Debut, Chaumond, Wolf 2019). We use PyTorch to produce a model to classify articles based on the labeled data. Of the 3,500 articles that were hand coded by the MTurk workers, 900 are fed to the machine learning model. 900 articles were selected because of computational limitations in training the NLP model. A classification of “uses data” was assigned if the model predicted an article used data with at least 90% confidence.
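
    The 90% confidence rule can be illustrated with a small sketch. It assumes the classifier emits two raw scores (logits) per article, ordered as [no data, uses data], and that confidence is the softmax probability of the "uses data" class; the source does not say how confidence was computed, and the function names here are hypothetical.

```python
import math

CONFIDENCE_THRESHOLD = 0.90  # threshold stated in the text


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uses_data(logits, threshold=CONFIDENCE_THRESHOLD):
    """Label an article as using data only if P(uses data) reaches the threshold.

    Assumption: logits is ordered as [score_no_data, score_uses_data].
    """
    p_uses_data = softmax(logits)[1]
    return p_uses_data >= threshold


print(uses_data([0.0, 3.0]))  # True: probability is about 0.95
print(uses_data([0.0, 2.0]))  # False: probability is about 0.88
```

    A high threshold like this trades recall for precision: borderline articles are left unclassified as data users rather than risk a false positive.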


    The performance of the models classifying articles to countries and as using data or not can be compared to the classification by the human raters. We consider the human raters as giving us the ground truth. This may underestimate the model performance if the workers at times got the allocation wrong in a way that would not apply to the model. For instance, a human rater could mistake the Republic of Korea for the Democratic People’s Republic of Korea. If both the humans and the model make the same kinds of errors, then the performance reported here will be overestimated.


    The model was able to predict whether an article made use of data with 87% accuracy evaluated on the set of articles held out of the model training. The correlation between the number of articles written about each country using data estimated under the two approaches is given in the figure below. The number of articles represents an aggregate total of

  18. Comparison of Maternal Mortality Estimates: Zambia, Bangladesh, Mozambique.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Siân L. Curtis; Robert G. Mswia; Emily H. Weaver (2023). Comparison of Maternal Mortality Estimates: Zambia, Bangladesh, Mozambique. [Dataset]. http://doi.org/10.1371/journal.pone.0135062.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Siân L. Curtis; Robert G. Mswia; Emily H. Weaver
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bangladesh
    Description

    Comparison of Maternal Mortality Estimates: Zambia, Bangladesh, Mozambique.

    Sources:
    a. National Institute for Population Research and Training, MEASURE Evaluation, International Centre for Diarrhoeal Disease Research (2012) Bangladesh Maternal Mortality and Health Care Survey 2010. Available: http://www.cpc.unc.edu/measure/publications/tr-12-87. Accessed October 15, 2012.
    b. World Health Organization (ND) WHO Maternal Mortality Country Profiles. Available: www.who.int/gho/maternal_health/en/#M. Accessed 1 March 2015.
    c. Lozano R, Wang H, Foreman KJ, Rajaratnam JK, Naghavi M, Marcus JR, et al. (2011) Progress towards Millennium Development Goals 4 and 5 on maternal and child mortality: an updated systematic analysis. Lancet 378(9797): 1139–65. doi:10.1016/S0140-6736(11)61337-8
    d. UNFPA, UNICEF, WHO, World Bank (2012) Trends in maternal mortality: 1990–2010. Available: http://www.unfpa.org/public/home/publications/pid/10728. Accessed 7 October 2012.
    e. Bangladesh Bureau of Statistics, Statistics Informatics Division, Ministry of Planning (December 2012) Population and Housing Census 2011, Socio-economic and Demographic Report, National Series–Volume 4. Available at: http://203.112.218.66/WebTestApplication/userfiles/Image/BBS/Socio_Economic.pdf. Accessed 15 February, 2015.
    f. Mozambique National Institute of Statistics, U.S. Census Bureau, MEASURE Evaluation, U.S. Centers for Disease Control and Prevention (2012) Mortality in Mozambique: Results from a 2007–2008 Post-Census Mortality Survey. Available: http://www.cpc.unc.edu/measure/publications/tr-11-83. Accessed 6 October 2012.
    g. Ministerio da Saude (MISAU), Instituto Nacional de Estatística (INE) e ICF International (ICFI). Moçambique Inquérito Demográfico e de Saúde 2011. Calverton, Maryland, USA: MISAU, INE e ICFI.
    h. Mudenda SS, Kamocha S, Mswia R, Conkling M, Sikanyiti P, et al. (2011) Feasibility of using a World Health Organization-standard methodology for Sample Vital Registration with Verbal Autopsy (SAVVY) to report leading causes of death in Zambia: results of a pilot in four provinces, 2010. Popul Health Metr 9:40. doi:10.1186/1478-7954-9-40
    i. Central Statistical Office (CSO), Ministry of Health (MOH), Tropical Diseases Research Centre (TDRC), University Teaching Hospital Virology Laboratory, University of Zambia, and ICF International Inc. 2014. Zambia Demographic and Health Survey 2013–14: Preliminary Report. Rockville, Maryland, USA. Available: http://dhsprogram.com/pubs/pdf/PR53/PR53.pdf. Accessed February 26, 2015.
    j. Centers for Disease Control and Prevention (2014) Saving Mothers, Giving Life: Maternal Mortality. Phase 1 Monitoring and Evaluation Report. Atlanta, GA: Centers for Disease Control and Prevention, US Dept of Health and Human Services. Available at: http://www.savingmothersgivinglife.org/doc/Maternal%20Mortality%20(advance%20copy).pdf. Accessed 26 February 2015.
    k. Central Statistical Office (CSO), Ministry of Health (MOH), Tropical Diseases Research Centre (TDRC), University of Zambia, and Macro International Inc. 2009. Zambia Demographic and Health Survey 2007. Calverton, Maryland, USA: CSO and Macro International Inc.

  19. Data from: Replication package for the paper: "A Study on the Pythonic...

    • data.niaid.nih.gov
    • nde-dev.biothings.io
    Updated Jan 23, 2024
    Cite
    Zid, Cyrine; Zampetti, Fiorella; Antoniol, Giuliano; Di Penta, Massimiliano (2024). Replication package for the paper: "A Study on the Pythonic Functional Constructs' Understandability" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8191782
    Dataset updated
    Jan 23, 2024
    Dataset provided by
    Polytechnique Montréal
    University of Sannio
    Authors
    Zid, Cyrine; Zampetti, Fiorella; Antoniol, Giuliano; Di Penta, Massimiliano
    License

    https://www.gnu.org/licenses/gpl-3.0-standalone.html

    Description

    Replication Package for "A Study on the Pythonic Functional Constructs' Understandability" to appear at ICSE 2024

    Authors: Cyrine Zid, Fiorella Zampetti, Giuliano Antoniol, Massimiliano Di Penta

    Article Preprint: https://mdipenta.github.io/files/ICSE24_funcExperiment.pdf

    Artifacts: https://doi.org/10.5281/zenodo.8191782

    License: GPL V3.0

    This package contains folders and files with code and data used in the study described in the paper. In the following, we first provide all fields required for the submission, and then report a detailed description of all repository folders.

    Artifact Description

    Purpose

    The artifact is about a controlled experiment aimed at investigating the extent to which Pythonic functional constructs have an impact on source code understandability. The artifact archive contains:

    The material to allow replicating the study (see Section Experimental-Material)

    Raw quantitative results, working datasets, and scripts to replicate the statistical analyses reported in the paper. Specifically, the executable part of the replication package reproduces figures and tables of the quantitative analysis (RQ1 and RQ2) of the paper starting from the working datasets.

    Spreadsheets used for the qualitative analysis (RQ3).

    We apply for the following badges:

    Available and reusable: because we provide all the material that can be used to replicate the experiment, but also to perform the statistical analyses and the qualitative analyses (spreadsheets, in this case)

    Provenance

    Paper preprint link: https://mdipenta.github.io/files/ICSE24_funcExperiment.pdf

    Artifacts: https://doi.org/10.5281/zenodo.8191782

    Data

    Results have been obtained by conducting the controlled experiment involving Prolific workers as participants. Data collection and processing followed a protocol approved by the University ethical board. Note that all data enclosed in the artifact is completely anonymized and does not contain sensitive information.

    Further details about the provided dataset can be found in the Section Results' directory and files

    Setup and Usage (for executable artifacts):

    See the Section Scripts to reproduce the results, and instructions for running them

    Experiment-Material/

    Contains the material used for the experiment, and, specifically, the following subdirectories:

    Google-Forms/

    Contains (as PDF documents) the questionnaires submitted to the ten experimental groups.

    Task-Sources/

    Contains, for each experimental group (G-1...G-10), the sources used to produce the Google Forms, specifically:
    - The cover letter (Letter.docx).
    - A directory for each experimental task (Lambda 1, Lambda 2, Comp 1, Comp 2, MRF 1, MRF 2, Lambda Comparison, Comp Comparison, MRF Comparison). Each directory contains (i) the exercise text (in both Word and .txt format), (ii) the source code snippet, and (iii) its .png image to be used in the form.
    Note: the "Comparison" tasks do not have any exercise, as the purpose is always the same, i.e., to compare the (perceived) understandability of the snippets and return the results of the comparison.

    Code-Examples-Table1/

    Contains the source code snippets used as objects of the study (the same you can find under "Task-Sources/"), named as reported in Table 1.

    Results' directory and files

    raw-responses/

    Contains, as spreadsheets, the raw responses provided by the study participants through Google forms.

    raw-results-RQ1/

    Contains the raw results for RQ1. Specifically, the directory contains a subdirectory for each group (G1-G10). Each subdirectory contains:
    - For each user (named using their Prolific IDs), a directory containing, for each question (Q1-Q6), the produced Python code (Qn.py), its output (QnR.txt), and its StdErr output (QnErr.txt).
    - "expected-outputs/": a directory containing the expected outputs for each task (Qn.txt).

    working-results/RQ1-RQ2-files-for-statistical-analysis/

    Contains three .csv files used as input for conducting the statistical analysis and drawing the graphs for addressing the first two research questions of the study. Specifically:

    ConstructUsage.csv contains the declared frequency usage of the three functional constructs object of the study. This file is used to draw Figure 4. The file contains an entry for each participant, reporting the (text-coded) frequency of construct usage for Comprehension, Lambda, and MRF.

    RQ1.csv contains the collected data used for the mixed-effect logistic regression relating the use of functional constructs with the correctness of the change task, as well as the logistic regression relating the use of map/reduce/filter functions with the correctness of the change task. The csv file contains an entry for each answer provided by each subject, and features the following columns:

    Group: experimental group to which the participant is assigned

    User: user ID

    Time: task time in seconds

    Approvals: number of approvals on previous tasks performed on Prolific

    Student: whether the participant declared themselves as a student

    Section: section of the questionnaire (lambda, comp, or mrf)

    Construct: specific construct being presented (same as "Section" for lambda and comp, for mrf it says whether it is a map, reduce, or filter)

    Question: question ID, from Q1 to Q6, indicating the ordering of the question

    MainFactor: main factor treatment for the given question - "f" for functional, "p" for procedural counterpart

    Outcome: TRUE if the task was correctly performed, FALSE otherwise

    Complexity: cyclomatic complexity of the construct (empty for mrf)

    UsageFrequency: usage frequency of the given construct
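
    As a quick orientation to the RQ1.csv layout, the sketch below reads rows with the columns listed above and computes the share of correctly performed tasks per MainFactor treatment. The sample rows are invented for illustration and are not taken from the replication package; the summary is a simple descriptive check, not the mixed-effect logistic regression used in the paper.

```python
import csv
import io
from collections import defaultdict

# Assumption: made-up rows that only mimic the documented column layout.
SAMPLE_CSV = """Group,User,Time,Approvals,Student,Section,Construct,Question,MainFactor,Outcome,Complexity,UsageFrequency
G1,u01,120,50,TRUE,lambda,lambda,Q1,f,TRUE,2,often
G1,u01,95,50,TRUE,comp,comp,Q2,p,FALSE,3,rarely
G2,u02,143,210,FALSE,lambda,lambda,Q1,p,TRUE,2,never
G2,u02,88,210,FALSE,mrf,map,Q3,f,FALSE,,sometimes
"""


def correctness_by_factor(rows):
    """Return {MainFactor: fraction of rows whose Outcome is TRUE}."""
    counts = defaultdict(lambda: [0, 0])  # factor -> [correct, total]
    for row in rows:
        tally = counts[row["MainFactor"]]
        tally[0] += row["Outcome"].strip().upper() == "TRUE"
        tally[1] += 1
    return {factor: correct / total for factor, (correct, total) in counts.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates = correctness_by_factor(rows)
print(rates)  # {'f': 0.5, 'p': 0.5}
```

    To run this against the real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for RQ1.csv.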

    RQ1Paired-RQ2.csv contains the collected data used for the ordinal logistic regression of the relationship between the perceived ease of understanding of the functional constructs and (i) participants' usage frequency, and (ii) constructs' complexity (except for map/reduce/filter). The file features a row for each participant, and the columns are the following:

    Group: experimental group to which the participant is assigned

    User: user ID

    Time: task time in seconds

    Approvals: number of approvals on previous tasks performed on Prolific

    Student: whether the participant declared themselves as a student

    LambdaF: result for the change task related to a lambda construct

    LambdaP: result for the change task related to the procedural counterpart of a lambda construct

    CompF: result for the change task related to a comprehension construct

    CompP: result for the change task related to the procedural counterpart of a comprehension construct

    MrfF: result for the change task related to an MRF construct

    MrfP: result for the change task related to the procedural counterpart of an MRF construct

    LambdaComp: perceived understandability level for the comparison task (RQ2) between a lambda and its procedural counterpart

    CompComp: perceived understandability level for the comparison task (RQ2) between a comprehension and its procedural counterpart

    MrfComp: perceived understandability level for the comparison task (RQ2) between an MRF and its procedural counterpart

    LambdaCompCplx: cyclomatic complexity of the lambda construct involved in the comparison task (RQ2)

    CompCompCplx: cyclomatic complexity of the comprehension construct involved in the comparison task (RQ2)

    MrfCompType: type of MRF construct (map, reduce, or filter) used in the comparison task (RQ2)

    LambdaUsageFrequency: self-declared usage frequency on lambda constructs

    CompUsageFrequency: self-declared usage frequency on comprehension constructs

    MrfUsageFrequency: self-declared usage frequency on MRF constructs

    LambdaComparisonAssessment: outcome of the manual assessment of the answer to the "check question" required for the lambda comparison ("yes" means valid, "no" means wrong, "moderatechatgpt" and "extremechatgpt" are the results of GPTZero)

    CompComparisonAssessment: as above, but for comprehension

    MrfComparisonAssessment: as above, but for MRF

    working-results/inter-rater-RQ3-files/

    This directory contains four .csv files used as input for computing the inter-rater agreement for the manual labeling used for addressing RQ3. Specifically, you will find one file for each functional construct, i.e., comprehension.csv, lambda.csv, and mrf.csv, and a different file used for highlighting the reasons why participants prefer to use the procedural paradigm, i.e., procedural.csv.
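
    The description does not say which agreement statistic is computed from these files. Assuming two raters and Cohen's kappa (a common choice for this kind of manual labeling), the computation could look like the sketch below. The category labels are hypothetical and do not come from the package.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items in the same order."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Assumption: invented labels standing in for the open-coding categories.
labels_a = ["readability", "conciseness", "readability", "habit"]
labels_b = ["readability", "conciseness", "habit", "habit"]
print(round(cohens_kappa(labels_a, labels_b), 3))  # 0.636
```

    Kappa corrects raw agreement (0.75 in this toy example) for the agreement expected by chance, which is why the result is lower than the raw share.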

    working-results/RQ2ManualValidation.csv

    This file contains the results of the manual validation performed to sanitize the answers provided by our participants for addressing RQ2. Specifically, we coded the behaviour description using four different levels: (i) correct ("yes"), (ii) somewhat correct ("partial"), (iii) wrong ("no"), and (iv) automatically generated. The file features a row for each participant, and the columns are the following:

    ID: ID we used to refer to the participant in the paper's qualitative analysis

    Group: experimental group to which the participant is assigned

    ProlificID: user ID

    Comparison for lambda construct description: answer provided by the user for the lambda comparison task

    Final Classification: our assessment of the lambda comparison answer

    Comparison for comprehension description: answer provided by the user for the comprehension comparison task

    Final Classification: our assessment of the comprehension comparison answer

    Comparison for MRF description: answer provided by the user for the MRF comparison task

    Final Classification: our assessment of the MRF comparison answer

    working-results/RQ3ManualValidation.xlsx

    This file contains the results of the open coding applied to address our third research question. Specifically, you will find four sheets, one for each functional construct and one for the procedural paradigm. Each sheet reports the provided answers together with the categories assigned to them. Each sheet contains the following columns:

    ID: ID we used to refer the participant in the paper's qualitative

  20. Datasheet1_Impact of data synthesis strategies for the classification of...

    • frontiersin.figshare.com
    pdf
    Updated Dec 13, 2023
    Cite
    Matthias Schaufelberger; Reinald Peter Kühle; Andreas Wachter; Frederic Weichel; Niclas Hagen; Friedemann Ringwald; Urs Eisenmann; Jürgen Hoffmann; Michael Engel; Christian Freudlsperger; Werner Nahm (2023). Datasheet1_Impact of data synthesis strategies for the classification of craniosynostosis.pdf [Dataset]. http://doi.org/10.3389/fmedt.2023.1254690.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    Dec 13, 2023
    Dataset provided by
    Frontiers
    Authors
    Matthias Schaufelberger; Reinald Peter Kühle; Andreas Wachter; Frederic Weichel; Niclas Hagen; Friedemann Ringwald; Urs Eisenmann; Jürgen Hoffmann; Michael Engel; Christian Freudlsperger; Werner Nahm
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Photogrammetric surface scans provide a radiation-free option to assess and classify craniosynostosis. Due to the low prevalence of craniosynostosis and high patient restrictions, clinical data are rare. Synthetic data could support or even replace clinical data for the classification of craniosynostosis, but this has never been studied systematically.

    Methods: We tested the combinations of three different synthetic data sources: a statistical shape model (SSM), a generative adversarial network (GAN), and image-based principal component analysis for a convolutional neural network (CNN)–based classification of craniosynostosis. The CNN is trained only on synthetic data but is validated and tested on clinical data.

    Results: The combination of an SSM and a GAN achieved an accuracy of 0.960 and an F1 score of 0.928 on the unseen test set. The difference to training on clinical data was smaller than 0.01. Including a second image modality improved classification performance for all data sources.

    Conclusions: Without a single clinical training sample, a CNN was able to classify head deformities with similar accuracy as if it was trained on clinical data. Using multiple data sources was key for a good classification based on synthetic data alone. Synthetic data might play an important future role in the assessment of craniosynostosis.

Cite
UNC Dataverse (2011). Statistical Abstract of the United States, 2007 [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CD-0227

Statistical Abstract of the United States, 2007

Explore at:
301 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Oct 27, 2011
Dataset provided by
UNC Dataverse
License

https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=hdl:1902.29/CD-0227

Description

"The Statistical Abstract of the United States, published since 1878, is the standard summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference and as a guide to other statistical publications and sources. The latter function is served by the introductory text to each section, the source note appearing below each table, and Appendix I, which comprises the Guide to Sources of Statisti cs, the Guide to State Statistical Abstracts, and the Guide to Foreign Statistical Abstracts. The Statistical Abstract sections and tables are compiled into one Adobe PDF named StatAbstract2007.pdf. This PDF is bookmarked by section and by table and can be searched using the Acrobat Search feature. The Statistical Abstract on CD-ROM is best viewed using Adobe Acrobat 5, or any subsequent version of Acrobat or Acrobat Reader. The Statistical Abstract tables and the metropolitan areas tables from Appendix II are available as Excel(.xls or .xlw) spreadsheets. In most cases, these spreadsheet files offer the user direct access to more data than are shown either in the publication or Adobe Acrobat. These files usually contain more years of data, more geographic areas, and/or more categories of subjects than those shown in the Acrobat version. The extensive selection of statistics is provided for the United States, with selected data for regions, divisions, states, metropolitan areas, cities, and foreign countries from reports and records of government and private agencies. Software on the disc can be used to perform full-text searches, view official statistics, open tables as Lotus worksheets or Excel workbooks, and link directly to source agencies and organizations for su pporting information. Except as indicated, figures are for the United States as presently constituted. 
Although emphasis in the Statistical Abstract is primarily given to national data, many tables present data for regions and individual states and a smaller number for metropolitan areas and cities. Statistics for the Commonwealth of Puerto Rico and for island areas of the United States are included in many state tables and are supplemented by information in Section 29. Additional information for states, cities, counties, metropolitan areas, and other small units, as well as more historical data are available in various supplements to the Abstract. Statistics in this edition are generally for the most recent year or period available by summer 2006. Each year over 1,400 tables and charts are reviewed and evaluated; new tables and charts of current interest are added, continuing series are updated, and less timely data are condensed or eliminated. Text notes and appendices are revised as appropriate. This year we have introduced 72 new tables covering a wide range of subject areas. These cover a variety of topics including: learning disability for children, people impacted by the hurricanes in the Gulf Coast area, employees with alternative work arrangements, adult computer and Internet users by selected characteristics, North America cruise industry, women- and minority-owned businesses, and the percentage of the adult population considered to be obese.
Some of the annually surveyed topics are population; vital statistics; health and nutrition; education; law enforcement, courts and prison; geography and environment; elections; state and local government; federal government finances and employment; national defense and veterans affairs; social insurance and human services; labor force, employment, and earnings; income, expenditures, and wealth; prices; business enterprise; science and technology; agriculture; natural resources; energy; construction and housing; manufactures; domestic trade and services; transportation; information and communication; banking, finance, and insurance; arts, entertainment, and recreation; accommodation, food services, and other services; foreign commerce and aid; outlying areas; and comparative international statistics."

Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check out the CDs on the honor system, for a period of two weeks. Loan forms are located adjacent to the collection.
