https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=hdl:1902.29/CD-0227
"The Statistical Abstract of the United States, published since 1878, is the standard summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference and as a guide to other statistical publications and sources. The latter function is served by the introductory text to each section, the source note appearing below each table, and Appendix I, which comprises the Guide to Sources of Statistics, the Guide to State Statistical Abstracts, and the Guide to Foreign Statistical Abstracts. The Statistical Abstract sections and tables are compiled into one Adobe PDF named StatAbstract2007.pdf. This PDF is bookmarked by section and by table and can be searched using the Acrobat Search feature. The Statistical Abstract on CD-ROM is best viewed using Adobe Acrobat 5, or any subsequent version of Acrobat or Acrobat Reader. The Statistical Abstract tables and the metropolitan areas tables from Appendix II are available as Excel (.xls or .xlw) spreadsheets. In most cases, these spreadsheet files offer the user direct access to more data than are shown either in the publication or Adobe Acrobat. These files usually contain more years of data, more geographic areas, and/or more categories of subjects than those shown in the Acrobat version. The extensive selection of statistics is provided for the United States, with selected data for regions, divisions, states, metropolitan areas, cities, and foreign countries from reports and records of government and private agencies. Software on the disc can be used to perform full-text searches, view official statistics, open tables as Lotus worksheets or Excel workbooks, and link directly to source agencies and organizations for supporting information. Except as indicated, figures are for the United States as presently constituted.
Although emphasis in the Statistical Abstract is primarily given to national data, many tables present data for regions and individual states and a smaller number for metropolitan areas and cities. Statistics for the Commonwealth of Puerto Rico and for island areas of the United States are included in many state tables and are supplemented by information in Section 29. Additional information for states, cities, counties, metropolitan areas, and other small units, as well as more historical data, are available in various supplements to the Abstract. Statistics in this edition are generally for the most recent year or period available by summer 2006. Each year over 1,400 tables and charts are reviewed and evaluated; new tables and charts of current interest are added, continuing series are updated, and less timely data are condensed or eliminated. Text notes and appendices are revised as appropriate. This year we have introduced 72 new tables covering a wide range of subject areas. These cover a variety of topics including: learning disability for children, people impacted by the hurricanes in the Gulf Coast area, employees with alternative work arrangements, adult computer and Internet users by selected characteristics, North America cruise industry, women- and minority-owned businesses, and the percentage of the adult population considered to be obese.
Some of the annually surveyed topics are population; vital statistics; health and nutrition; education; law enforcement, courts and prison; geography and environment; elections; state and local government; federal government finances and employment; national defense and veterans affairs; social insurance and human services; labor force, employment, and earnings; income, expenditures, and wealth; prices; business enterprise; science and technology; agriculture; natural resources; energy; construction and housing; manufactures; domestic trade and services; transportation; information and communication; banking, finance, and insurance; arts, entertainment, and recreation; accommodation, food services, and other services; foreign commerce and aid; outlying areas; and comparative international statistics."
Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check out the CDs on the honor system. Items can be checked out for a period of two weeks. Loan forms are located adjacent to the collection.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Public health-related decision-making on policies aimed at controlling the COVID-19 pandemic outbreak depends on complex epidemiological models that must be robust and use all relevant available data. This data article provides a new combined worldwide COVID-19 dataset obtained from official data sources, with corrections for systematic measurement errors and a dedicated dashboard for online data visualization and summary. The dataset adds new measures and attributes, such as daily mortality and fatality rates, to the normal attributes of official data sources. We used comparative statistical analysis to evaluate the measurement errors of COVID-19 official data collections from the Chinese Center for Disease Control and Prevention (Chinese CDC), World Health Organization (WHO) and European Centre for Disease Prevention and Control (ECDC). The data were collected by using text mining techniques and reviewing PDF reports, metadata, and reference data. The combined dataset includes complete spatial data such as country area, international number of countries, Alpha-2 code, Alpha-3 code, latitude, longitude, and some additional attributes such as population. The improved dataset benefits from major corrections to the referenced data sets and official reports: adjustments to reporting dates, which suffered from a lag of one to two days; removal of negative values; detection of unreasonable changes to historical data in new reports; and corrections of systematic measurement errors, which have been increasing as the pandemic outbreak spreads and more countries contribute data to the official repositories. Additionally, the root mean square error of attributes in the paired comparison of datasets was used to identify the main data problems. The data for China is presented separately and in more detail, and it has been extracted from the attached reports available on the main page of the CCDC website.
This dataset is a comprehensive and reliable source of worldwide COVID-19 data that can be used in epidemiological models assessing the magnitude and timeline for confirmed cases, long-term predictions of deaths or hospital utilization, the effects of quarantine, stay-at-home orders and other social distancing measures, and the pandemic’s turning point. It can also be used in economic and social impact analysis, helping to inform national and local authorities on how to implement an adaptive response approach to re-opening the economy, re-opening schools, alleviating business and social distancing restrictions, designing economic programs or allowing sports events to resume.
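The paired comparison mentioned in the description can be illustrated with a minimal sketch. It assumes two aligned daily series for the same country from two official sources; the numbers, the function names, and the choice to clip negative counts to zero are illustrative assumptions, not the dataset authors' actual procedure.

```python
import math


def rmse(series_a, series_b):
    """Root mean square error between two aligned, equally long series."""
    if len(series_a) != len(series_b):
        raise ValueError("series must be aligned and of equal length")
    if not series_a:
        raise ValueError("series must not be empty")
    squared = [(a - b) ** 2 for a, b in zip(series_a, series_b)]
    return math.sqrt(sum(squared) / len(squared))


def clip_negative(daily_counts):
    """Replace negative daily counts with 0 (one assumed way to remove them)."""
    return [max(0, value) for value in daily_counts]


# Hypothetical daily new-case counts for one country from two sources.
source_a = [10, 12, -3, 20, 25]
source_b = [10, 14, 0, 20, 21]

cleaned_a = clip_negative(source_a)
error = rmse(cleaned_a, source_b)
print(cleaned_a, error)
```

A large RMSE for one attribute across a pair of sources would point to that attribute as a place to look for reporting lags or revisions.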
The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.
----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:
Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization (CSO), household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002 / 2007). As part of the cooperation between the CSO and the World Bank (WB), the CSO and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year, covering all governorates including those in the Kurdistan Region.
The survey has six main objectives. These objectives are:
The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.
National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.
1- Household/family. 2- Individual/person.
The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
Sample survey data [ssd]
----> Design:
The sample size was 25,488 households for the whole of Iraq: 216 households in each of the 118 districts, arranged in 2,832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.
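The figures in the design statement are mutually consistent, which a short arithmetic check makes explicit. The constant names are illustrative; the values are taken from the text.

```python
# Values stated in the sample design.
DISTRICTS = 118
HOUSEHOLDS_PER_DISTRICT = 216
CLUSTERS = 2832
HOUSEHOLDS_PER_CLUSTER = 9

# Total sample size computed two ways: by district and by cluster.
total_by_district = DISTRICTS * HOUSEHOLDS_PER_DISTRICT
total_by_cluster = CLUSTERS * HOUSEHOLDS_PER_CLUSTER

# Number of clusters allocated to each district.
clusters_per_district = CLUSTERS // DISTRICTS

print(total_by_district, total_by_cluster, clusters_per_district)
```

Both totals come to 25,488 households, and each district receives 24 clusters of 9 households.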
----> Sample frame:
The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all the governorates, including the Kurdistan Region, as the frame from which to select households. The sample was selected in two stages. Stage 1: primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size to reach 2,832 units (clusters). Stage 2: 9 households were selected from each primary sampling unit to create a cluster. The total sample across all clusters was therefore 25,488 households distributed across the governorates, with 216 households in each district.
----> Sampling Stages:
In each district, the sample was selected in two stages. Stage 1: based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown by urban and rural and the geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames for each stage can be developed from the 2010 building listing and numbering without updating household lists. In some small districts, the random selection of primary sampling units may yield fewer than 24 distinct units; in that case a sampling unit is selected more than once, so two or more clusters may be drawn from the same enumeration unit when necessary.
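The two stages can be sketched for a single district as follows. This is a minimal illustration under stated assumptions, not the survey's actual selection program: the block sizes are randomly generated placeholders, the function names are hypothetical, and the implicit urban/rural and geographic ordering of the frame is not modelled.

```python
import random


def systematic_pps(sizes, n_select, rng):
    """Stage 1: systematic selection with probability proportional to size.

    Returns the indices of the selected units. A unit larger than the
    sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n_select
    start = rng.random() * interval
    selected = []
    cumulative = 0.0
    index = 0
    for k in range(n_select):
        target = start + k * interval
        # Advance to the unit whose cumulative size range contains the target.
        while index < len(sizes) - 1 and cumulative + sizes[index] <= target:
            cumulative += sizes[index]
            index += 1
        selected.append(index)
    return selected


def systematic_equal(n_units, n_select, rng):
    """Stage 2: systematic equal-probability selection of unit positions."""
    interval = n_units / n_select
    start = rng.random() * interval
    return [min(int(start + k * interval), n_units - 1) for k in range(n_select)]


rng = random.Random(2012)
# Hypothetical frame: household counts for 200 blocks in one district.
block_sizes = [rng.randint(50, 400) for _ in range(200)]

sample_points = systematic_pps(block_sizes, 24, rng)
households = [systematic_equal(block_sizes[b], 9, rng) for b in sample_points]
print(len(sample_points), sum(len(h) for h in households))
```

With 24 sample points and 9 households per point, the sketch yields the 216 households per district stated in the design. When one block dominates a small frame, `systematic_pps` returns the same index several times, which mirrors the note about drawing two or more clusters from the same enumeration unit.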
Face-to-face [f2f]
----> Preparation:
The questionnaire of the 2006 survey was adopted as the basis for the 2012 questionnaire, to which many revisions were made. Two rounds of pre-testing were carried out. Revisions were made based on the feedback of the fieldwork team, World Bank consultants and others, and further revisions were made before the final version was implemented in a pilot survey in September 2011. After the pilot survey, further revisions were made based on the challenges and feedback that emerged during its implementation, before the final version was used in the actual survey.
----> Questionnaire Parts:
The questionnaire consists of four parts, each with several sections:

Part 1: Socio – Economic Data:
- Section 1: Household Roster
- Section 2: Emigration
- Section 3: Food Rations
- Section 4: Housing
- Section 5: Education
- Section 6: Health
- Section 7: Physical measurements
- Section 8: Job seeking and previous job

Part 2: Monthly, Quarterly and Annual Expenditures:
- Section 9: Expenditures on Non – Food Commodities and Services (past 30 days).
- Section 10: Expenditures on Non – Food Commodities and Services (past 90 days).
- Section 11: Expenditures on Non – Food Commodities and Services (past 12 months).
- Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days).
- Section 12, Table 1: Meals Had Within the Residential Unit.
- Section 12, Table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.

Part 3: Income and Other Data:
- Section 13: Job
- Section 14: Paid jobs
- Section 15: Agriculture, forestry and fishing
- Section 16: Household non – agricultural projects
- Section 17: Income from ownership and transfers
- Section 18: Durable goods
- Section 19: Loans, advances and subsidies
- Section 20: Shocks and strategy of dealing in the households
- Section 21: Time use
- Section 22: Justice
- Section 23: Satisfaction in life
- Section 24: Food consumption during past 7 days
Part 4: Diary of Daily Expenditures: The expenditure diary is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording the expenditures of each day, so the roster consists of 14 pages.
----> Raw Data:
Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:
1. Interviewer: checks all answers on the household questionnaire, confirming that they are clear and correct.
2. Local supervisor: checks that the questions have been completed correctly.
3. Statistical analysis: after the data files are exported from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or illogical values, in addition to auditing some variables.
4. World Bank consultants in coordination with the CSO data management team: the World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameters for each variable.
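The idea of checking each questionnaire item against an expected range can be sketched as below. The survey team worked in SPSS; this Python version is only an illustration, and the variable names and ranges are hypothetical rather than taken from the IHSES codebook.

```python
# Hypothetical expected ranges; not taken from the IHSES documentation.
EXPECTED_RANGES = {"age": (0, 120), "household_size": (1, 50)}


def find_out_of_range(records, expected_ranges):
    """Return (row number, variable, value) for missing or out-of-range values."""
    issues = []
    for row_number, record in enumerate(records):
        for variable, (low, high) in expected_ranges.items():
            value = record.get(variable)
            if value is None or not (low <= value <= high):
                issues.append((row_number, variable, value))
    return issues


# Hypothetical records, two of which contain an implausible value.
records = [
    {"age": 34, "household_size": 6},
    {"age": 240, "household_size": 4},
    {"age": 12, "household_size": 0},
]

flagged = find_out_of_range(records, EXPECTED_RANGES)
print(flagged)
```

Flagged rows would then go back for review against the paper questionnaire rather than being corrected automatically.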
----> Harmonized Data:
The Iraq Household Socio Economic Survey (IHSES) reached a total of 25,488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).
https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CD-0014
The Statistical Abstract is the Nation's best known and most popular single source of statistics on the social, political, and economic organization of the country. The print version of this reference source has been published since 1878, while the compact disc version first appeared in 1993. This disc is designed to serve as a convenient, easy-to-use statistical reference source and guide to statistical publications and sources. The disc contains over 1,400 tables from over 250 different governmental, private, and international organizations. The 1999 CD reflects improved and enhanced data on the disc and the software used for accessing the information. The enrichments to the data and their access include a link from the table of contents page to a PDF of the Census web site. This enables the user to have direct links to the Statistical Abstract and its supplements and other features, such as Statistics in Brief and Frequently Requested Tables. A link to the table of contents from the first text page of each section facilitates quick movement between sections of the book. New PDFs provide more explanation of several major economic series including the Federal Budget, the National Income and Product Accounts (NIPA), the Consumer Price Index (CPI) and Producer Price Index (PPI), and the new North American Industry Classification System (NAICS). Another PDF provides information on the Federal court system. Links to these supplemental materials are provided from each appropriate table. A separate PDF presents a compilation of tables showing major economic indices, as selected by the Council of Economic Advisors. Maps of each state and their metro areas and component counties, maps outlining National Park sites throughout the country, a map of the United States with major transportation facilities and routes, a U.S. map locating coal mines and facilities, and one depicting the distribution of forest land have been added.
As usual, most of the more than 1,500 tables and charts that were on the previous disc have been updated with new or more recent data. The spreadsheet files, which are available in both Excel and Lotus formats, will usually have more information than the tables displayed in the book or Adobe Acrobat files. The 1999 edition introduced over 100 new tables covering a wide range of subject areas. Several sections have preliminary data from the 1997 Economic Census, which presents industry statistics for the first time based on the North American Industry Classification System (NAICS). Comparative data for 1992 and 1997, based on the Standard Industrial Classification (SIC), are also presented. Tables 872 and 873 in Section 17, Business, present summary data for industries. Other new tables cover such topics as the foreign-born population, health care expenditures, the Medicare trust fund, violence in schools, presale handgun checks, recycling programs, defense-related employment and spending, workplace violence, ownership of mutual funds, computer use, results of the 1997 Census of Agriculture, and mail order catalogue sales. In addition to the above new tables, a new section has been developed, the 20th Century Statistics. This section introduces data beginning in 1900 on a broad range of subjects, including population, vital statistics, health, education, income, labor force, communications, agriculture, defense, and other areas. The Industrial Outlook tables, previously in Section 31, have been deleted for lack of updates. For a complete list of new tables, see Appendix VI, p. 947. The Adobe Acrobat Reader and Search engine, Version 4.0, is on the disc. The Acrobat Reader allows users to view, navigate, search, and print on demand any of the pages from the book.
Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check out the CDs on the honor system. Items can be checked out for a period of two weeks. Loan forms are located adjacent to the collection.
Data licence Germany – Attribution – Version 2.0: https://www.govdata.de/dl-de/by-2-0
License information was derived automatically
Data from various sources are updated in the Statistical Information System of the City of Cologne. The annual statistical yearbook publishes these in tabular, graphic and cartographic form at the level of the city districts and districts. Definitions and calculation bases are also explained. Small-scale statistics at the level of the 86 districts can be obtained from the Cologne district information. All levels of the local area structure are explained in this publication.
This statistical data catalogue supplements the range of small-scale data. Selected structural data can be called up here in compact tabular form at the level of the 570 statistical districts or the 86 districts. The two overviews provide information about which data is available and from which source it originates. The data itself is provided annually.
Notes:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Structural business statistics (SBS) describes the structure, conduct and performance of economic activities, down to the most detailed activity level (several hundred economic sectors).
SBS are transmitted annually by the EU Member States on the basis of a legal obligation from 1995 onwards.
SBS covers all activities of the business economy with the exception of agricultural activities and personal services. The data are provided by all EU Member States, Iceland, Norway and Switzerland, and some candidate and potential candidate countries. The data are collected by domain of activity (annex):
The majority of the data is collected by National Statistical Institutes (NSIs) by means of statistical surveys, business registers or from various administrative sources. Regulatory or controlling national offices for financial institutions or central banks often provide the information required for the financial sector (NACE Rev 2 Section K / NACE Rev 1.1 Section J).
Member States apply various statistical methods, according to the data source, such as grossing up, model-based estimation or different forms of imputation, to ensure the quality of the SBS produced.
Main characteristics (variables) of the SBS data category:
All SBS characteristics are published on Eurostat’s website in tables; an example of the existing tables is presented below:
More information on the contents of the different tables: the level of detail and the breakdowns required from reference year 2008 onwards are defined in Commission Regulation N° 251/2009. For previous reference years they are set out in Commission Regulation (EC) N° 2701/98, as amended by Commission Regulation N° 1614/2002 and Commission Regulation N° 1669/2003.
Several important derived indicators are generated in the form of ratios of certain monetary characteristics or per head values. A list of the available derived indicators is given below in the Annex.
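As a minimal sketch of what a derived indicator looks like, the snippet below computes one per head value and one ratio of monetary characteristics for a single sector. The field names and figures are invented for illustration and are not actual SBS variable codes or published data.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical figures for one economic sector (monetary values in one currency unit).
sector = {"value_added": 1200.0, "persons_employed": 40, "turnover": 4800.0}

# Per head value: value added per person employed.
value_added_per_person = ratio(sector["value_added"], sector["persons_employed"])

# Ratio of two monetary characteristics: value added as a share of turnover.
value_added_share = ratio(sector["value_added"], sector["turnover"])

print(value_added_per_person, value_added_share)
```

Returning `None` for a zero denominator keeps sectors with no employment or turnover from raising an error; how such cases are treated in the published tables is not stated in the description.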
8 December 2011
2009 (Export of services)
2008 - 2009 (GVA)
2009 - 2010 (Employment)
2009 - 2011 (Businesses)
UK
Autumn 2012
This bulletin provides estimates of the contribution of Creative Industries to the economy, using the latest data available. The majority of this data is taken from National Statistics sources produced by the Office for National Statistics (ONS). Data sources include the Annual Business Survey (ABS) (http://www.ons.gov.uk/ons/search/index.html?content-type=publicationContentTypes&nscl=Business+and+Energy&pubdateRangeType=last5yrs&pubdateRangeType=allDates&coverage=UK&newquery=annual+business+survey&pageSize=50&applyFilters=tr), the Inter-Departmental Business Register (IDBR) (http://www.ons.gov.uk/ons/about-ons/who-we-are/services/unpublished-data/business-data/idbr/index.html) and the Labour Force Survey (LFS) (http://www.ons.gov.uk/ons/search/index.html?content-type=Publication&nscl=Labour+Market&pubdateRangeType=last12months&pubdateRangeType=allDates&newquery=labour+force+survey&pageSize=25&applyFilters=true). Our definition of the Creative Industries is taken from the 2001 Creative Industries Mapping Document (http://webarchive.nationalarchives.gov.uk/+/http:/www.culture.gov.uk/reference_library/publications/4632.aspx). Further information on this can be found in the technical note.
This is the second year that the Creative Industries have been estimated via the Standard Industrial Classifications (SIC07). Previously this statistical release was given the title of an ‘experimental statistic’ as the methodology was in its inaugural year and was still under development. This methodology is now in its second year and the core methodology has not changed (see page 9 for other changes) so the title ‘experimental statistics’ has been removed.
However, the methodology for estimation used here is regularly reviewed and if you would like to contribute to this, please contact us at CIEEBulletin@culture.gsi.gov.uk.
This set of Creative Industries Estimates represents a snapshot of the latest figures. Because of the modifications made to this release's estimates, the figures should not be directly compared to previous estimates. Recalculated estimates for previous years have been included in the release for time series analysis.
This contains the headline findings, data tables and figures and a full technical note with definitions, methodology and a full list of the SIC codes used to produce these statistics.
A summary of the key findings from these statistics, along with data tables.
Updated 22/12/11 to correct the presentation and formatting - all estimates are unchanged from the earlier version.
This release is published in accordance with the Code of Practice for Official Statistics (2009), as produced by the UK Statistics Authority (UKSA). The UKSA has the overall objective of promoting and safeguarding the production and publication of official statistics that serve the public good. It monitors and reports on all official statistics, and promotes good practice in this area.
The National Energy Efficiency Data-Framework (NEED) was set up to provide a better understanding of energy use and energy efficiency in domestic and non-domestic buildings in Great Britain. The data framework matches data about a property together - including energy consumption and energy efficiency measures installed - at household level.
We identified 2 processing errors in this edition of the Domestic NEED Annual report and corrected them. The changes are small and do not affect the overall findings of the report, only the domestic energy consumption estimates. The impact of energy efficiency measures analysis remains unchanged. The revisions are summarised here:
This survey (published June 2021) sought user feedback to inform BEIS’ development of Domestic NEED to better meet user requirements. It is now closed: thank you to those who responded.
We are reviewing responses and will provide an update in due course. The responses will also inform BEIS’ decision on whether or not to pause the 2022 NEED publication to enable development work to take place.
The Taking Part survey has run since 2005 and is the key evidence source for DCMS. It is a continuous face-to-face household survey of adults aged 16 and over in England and children aged 5-15 years old. This latest release presents rolling estimates incorporating data from the third quarter of year seven of the survey.
29 March 2012
January 2011 - December 2011
National and Regional level data for England.
A release of rolling annual estimates for adults, including the fourth quarter of the 2011/12 survey year, is scheduled for the end of June 2012.
The latest data from the 2011/12 Taking Part survey provides reliable national estimates of adult and child engagement with sport, libraries, the arts, heritage and museums and galleries. This release builds on the data from 2010/2011 and data from quarter 1 and quarter 2 releases of data from earlier in 2011/12 to look at a number of areas in depth and present measures that begin to consider broader definitions of participation in our sectors. The report also looks at some of the other measures in the survey that provide estimates of volunteering and charitable giving and civic engagement.
The Taking Part survey is a continuous annual survey of adults and children living in private households in England, and carries the National Statistics badge, meaning that it meets the highest standards of statistical quality.
These spreadsheets contain the data and sample sizes to support the material in this release:
The previous Taking Part release was published on 21 December 2011 and can be found online. It also provides spreadsheets containing the data and sample sizes for each sector included in the survey.
The document below contains a list of Ministers and Officials who have received privileged early access to this release of Taking Part data. In line with best practice, the list has been kept to a minimum and those given access for briefing purposes had a maximum of 24 hours.
This release is published in accordance with the Code of Practice for Off
This bulletin contains experimental statistics on gross value added (GVA), employment and numbers of businesses within the creative industries.
Please note that the methodology for these estimates was updated in 2011 and the figures in this 2010 report have been superseded by those in the 2011 report. Therefore, for up-to-date figures on the Creative Industries and a time series covering the figures in this 2010 report, please see the latest Creative Industries Economic Estimates (http://www.culture.gov.uk/publications/8682.aspx).
9 December 2010
2008 (GVA and Exports)
2010 (Employment and businesses)
UK (GVA, Businesses and Exports)
Great Britain (Employment)
Autumn 2011
This bulletin provides estimates of the contribution of Creative Industries to the economy, using the latest data available. The majority of this data is taken from National Statistics sources produced by the Office for National Statistics (ONS). Data sources include the Annual Business Survey (ABS) (http://www.ons.gov.uk/ons/search/index.html?content-type=publicationContentTypes&nscl=Business+and+Energy&pubdateRangeType=last5yrs&pubdateRangeType=allDates&coverage=UK&newquery=annual+business+survey&pageSize=50&applyFilters=tr), the Inter-Departmental Business Register (IDBR) (http://www.ons.gov.uk/ons/about-ons/who-we-are/services/unpublished-data/business-data/idbr/index.html) and the Labour Force Survey (LFS) (http://www.ons.gov.uk/ons/search/index.html?content-type=Publication&nscl=Labour+Market&pubdateRangeType=last12months&pubdateRangeType=allDates&newquery=labour+force+survey&pageSize=25&applyFilters=true). Our definition of the Creative Industries is taken from the 2001 Creative Industries Mapping Document (http://webarchive.nationalarchives.gov.uk/+/http:/www.culture.gov.uk/reference_library/publications/4632.aspx). Further information on this can be found in the technical note.
As this is our first attempt to measure the Creative Industries using Standard Industrial Classifications (SIC 2007) (http://www.ons.gov.uk/ons/guide-method/classifications/current-standard-classifications/standard-industrial-classification/index.html), this series of economic estimates is classed as experimental statistics. The statistics will be developed following further consultation with users. We are grateful for any feedback on the way in which we have used the SIC 2007 codes to measure the Creative Industries. If you would like to contribute to this process, please either use the feedback form below, or contact us at CIEEBulletin@culture.gsi.gov.uk.
This set of Creative Industries Estimates represents a snapshot of the latest figures. Because of the change in the Standard Industrial Classifications (http://www.ons.gov.uk/ons/guide-method/classifications/current-standard-classifications/standard-industrial-classification/index.html) used to produce these estimates, the figures should not be directly compared to previous estimates, which were produced using the old Standard Industrial Classifications (2003). In 2011 we will work with the Office for National Statistics (ONS) to investigate the possibility of establishing a consistent back time series, so that these estimates are more comparable with previous ones. This will be a complex task, and may not be possible for all sectors.
We are also investigating the possibility of producing regional estimates for the Creative Industries, given the demand that we know exists for these.
This contains the headline findings, data tables, and a full technical note with definitions, methodology and a full list of the SIC codes used to produce these statistics.
Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and outdoor recreation access across the nation. This data release presents results from statistical summaries of the PAD-US 3.0 protection status (by GAP Status Code) and public access status for various land unit boundaries (Protected Areas Database of the United States 3.0 Vector Analysis and Summary Statistics). Summary statistics are also available to explore and download (Comma-separated Table [CSV], Microsoft Excel Workbook [.xlsx], Portable Document Format [.pdf] Report) from the PAD-US Lands and Inland Water Statistics Dashboard (https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics). The vector GIS analysis file, source data used to summarize statistics for areas of interest to stakeholders (National, State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative), and complete Summary Statistics Tabular Data (CSV) are included in this data release. Raster GIS analysis files are also available for combination with other raster data (Protected Areas Database of the United States (PAD-US) 3.0 Raster Analysis). The PAD-US 3.0 Combined Fee, Designation, Easement feature class in the full inventory, with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class (Protected Areas Database of the United States (PAD-US) 3.0, https://doi.org/10.5066/P9Q9LQ4B), was modified to prioritize and remove overlapping management designations, limiting overestimation in protection status or public access statistics and to support user needs for vector and raster analysis data.
Analysis files in this data release were clipped to the Census State boundary file to define the extent and fill in areas (largely private land) outside the PAD-US, providing a common denominator for statistical summaries.
https://datacatalog.worldbank.org/public-licenses?fragment=cc
This dataset contains metadata (title, abstract, date of publication, field, etc) for around 1 million academic articles. Each record contains additional information on the country of study and whether the article makes use of data. Machine learning tools were used to classify the country of study and data use.
Our data source of academic articles is the Semantic Scholar Open Research Corpus (S2ORC) (Lo et al. 2020). The corpus contains more than 130 million English language academic papers across multiple disciplines. The papers included in the Semantic Scholar corpus are gathered directly from publishers, from open archives such as arXiv or PubMed, and crawled from the internet.
We placed some restrictions on the articles to make them usable and relevant for our purposes. First, only articles with an abstract and a parsed PDF or LaTeX file are included in the analysis. The full text of the abstract is necessary to classify the country of study and whether the article uses data. The parsed PDF and LaTeX files are needed to extract information such as the date of publication and field of study. This restriction eliminated a large number of articles in the original corpus. Around 30 million articles remain after keeping only articles with a parsable (i.e., suitable for digital processing) PDF, and around 26% of those 30 million are eliminated when removing articles without an abstract. Second, only articles from the year 2000 to 2020 were considered. This restriction eliminated an additional 9% of the remaining articles. Finally, articles from the following fields of study were excluded, as we aim to focus on fields that are likely to use data produced by countries’ national statistical systems: Biology, Chemistry, Engineering, Physics, Materials Science, Environmental Science, Geology, History, Philosophy, Math, Computer Science, and Art. Fields that are included are: Economics, Political Science, Business, Sociology, Medicine, and Psychology. This third restriction eliminated around 34% of the remaining articles. From an initial corpus of 136 million articles, this resulted in a final corpus of around 10 million articles.
Due to the intensive computer resources required, a set of 1,037,748 articles were randomly selected from the 10 million articles in our restricted corpus as a convenience sample.
The empirical approach employed in this project utilizes text mining with Natural Language Processing (NLP). The goal of NLP is to extract structured information from raw, unstructured text. In this project, NLP is used to extract the country of study and whether the paper makes use of data. We will discuss each of these in turn.
To determine the country or countries of study in each academic article, two approaches are employed based on information found in the title, abstract, or topic fields. The first approach uses regular expression searches based on the presence of ISO 3166 country names. A defined set of country names is compiled, and the presence of these names is checked in the relevant fields. This approach is transparent, widely used in social science research, and easily extended to other languages. However, there is a potential for exclusion errors if a country’s name is spelled in a non-standard way.
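The regular expression approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the short `COUNTRY_NAMES` list is a hypothetical subset standing in for the full set of ISO 3166 names, and the function name `find_countries` is invented for the example.

```python
import re

# Hypothetical subset of ISO 3166 English short names (assumption: the
# real project compiles the complete list).
COUNTRY_NAMES = ["Bangladesh", "Mozambique", "Zambia", "Viet Nam"]

# Longest names first so that longer names win over any shorter name
# they might contain; word boundaries avoid matching inside other words.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(n) for n in sorted(COUNTRY_NAMES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)
_CANONICAL = {name.lower(): name for name in COUNTRY_NAMES}


def find_countries(*fields):
    """Return the sorted country names found in any of the text fields."""
    found = set()
    for text in fields:
        for match in _PATTERN.finditer(text or ""):
            found.add(_CANONICAL[match.group(1).lower()])
    return sorted(found)


title = "Maternal mortality in Zambia and bangladesh"
abstract = "We analyse survey data on health care use."
print(find_countries(title, abstract))
```

The exclusion error mentioned in the text shows up directly: a title that uses the common spelling "Vietnam" is not matched by the ISO short name "Viet Nam", so `find_countries("Rice markets in Vietnam")` returns an empty list.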
The second approach is based on Named Entity Recognition (NER), which uses machine learning to identify objects from text, utilizing the spaCy Python library. The Named Entity Recognition algorithm splits text into named entities, and NER is used in this project to identify countries of study in the academic articles. SpaCy supports multiple languages and has been trained on multiple spellings of countries, overcoming some of the limitations of the regular expression approach. If a country is identified by either the regular expression search or NER, it is linked to the article. Note that one article can be linked to more than one country.
The second task is to classify whether the paper uses data. A supervised machine learning approach is employed, where 3500 publications were first randomly selected and manually labeled by human raters using the Mechanical Turk service (Paszke et al. 2019).[1] To make sure the human raters had a similar and appropriate definition of data in mind, they were given the following instructions before seeing their first paper:
Each of these documents is an academic article. The goal of this study is to measure whether a specific academic article is using data and from which country the data came.
There are two classification tasks in this exercise:
1. Identifying whether an academic article is using data from any country.
2. Identifying from which country that data came.
For task 1, we are looking specifically at the use of data. Data is any information that has been collected, observed, generated or created to produce research findings. As an example, a study that reports findings or analysis using survey data uses data. Some clues that a study does use data include whether a survey or census is described, a statistical model is estimated, or a table of means or summary statistics is reported.
After an article is classified as using data, please note the type of data used. The options are population or business census, survey data, administrative data, geospatial data, private sector data, and other data. If no data is used, then mark "Not applicable". In cases where multiple data types are used, please click multiple options.[2]
For task 2, we are looking at the country or countries that are studied in the article. In some cases, no country may be applicable. For instance, if the research is theoretical and has no specific country application. In some cases, the research article may involve multiple countries. In these cases, select all countries that are discussed in the paper.
We expect between 10 and 35 percent of all articles to use data.
The median amount of time that a worker spent on an article, measured as the time between when the article was accepted to be classified by the worker and when the classification was submitted, was 25.4 minutes. If human raters were used exclusively rather than machine learning tools, the corpus of 1,037,748 articles examined in this study would take around 50 years of human work time to review, at a cost of $3,113,244, assuming the $3 per article that was paid to MTurk workers.
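The time and cost figures can be reproduced with a few lines of arithmetic. The sketch below makes two assumptions that the text does not state explicitly: the median time per article is applied to every article, and "years of human work time" means continuous, round-the-clock time (365-day years). The function name is hypothetical.

```python
ARTICLES = 1_037_748                 # size of the sampled corpus
MEDIAN_MINUTES_PER_ARTICLE = 25.4    # median MTurk time per article
COST_PER_ARTICLE_USD = 3             # payment per article


def manual_review_estimate(articles=ARTICLES,
                           minutes=MEDIAN_MINUTES_PER_ARTICLE,
                           cost=COST_PER_ARTICLE_USD):
    """Return (years of continuous work, total cost in USD)."""
    total_minutes = articles * minutes
    # Assumption: continuous work, 24 hours a day, 365 days a year.
    years = total_minutes / (60 * 24 * 365)
    return years, articles * cost


years, total_cost = manual_review_estimate()
print(f"{years:.1f} years, ${total_cost:,}")
```

Under these assumptions the result is a little over 50 years and exactly $3,113,244, matching the figures quoted above.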
A model is next trained on the 3,500 labelled articles. We use a distilled version of the BERT (Bidirectional Encoder Representations from Transformers) model to encode raw text into a numeric format suitable for predictions (Devlin et al. 2018). BERT is pre-trained on a large corpus comprising the Toronto Book Corpus and Wikipedia. The distilled version (DistilBERT) is a compressed model that is 60% the size of BERT, retains 97% of the language understanding capabilities, and is 60% faster (Sanh, Debut, Chaumond, Wolf 2019). We use PyTorch to produce a model to classify articles based on the labeled data. Of the 3,500 articles that were hand coded by the MTurk workers, 900 are fed to the machine learning model. 900 articles were selected because of computational limitations in training the NLP model. A classification of “uses data” was assigned if the model predicted an article used data with at least 90% confidence.
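The 90% confidence rule can be illustrated independently of the model itself. The sketch below assumes, hypothetically, that the classifier emits two raw scores (logits) per article in the order [does not use data, uses data] and that "confidence" is the softmax probability of the second class; the actual output format of the project's PyTorch model is not described here.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uses_data(logits, threshold=0.90):
    """Label an article as 'uses data' only at or above the threshold.

    Assumption: logits = [score_no_data, score_uses_data].
    """
    probability_uses_data = softmax(logits)[1]
    return probability_uses_data >= threshold


print(uses_data([0.0, 3.0]))   # probability about 0.95
print(uses_data([0.0, 2.0]))   # probability about 0.88
```

A high threshold of this kind trades recall for precision: articles the model is only moderately sure about are not counted as using data.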
The performance of the models classifying articles to countries and as using data or not can be compared to the classification by the human raters. We consider the human raters as giving us the ground truth. This may underestimate the model performance if the workers at times got the allocation wrong in a way that would not apply to the model. For instance, a human rater could mistake the Republic of Korea for the Democratic People’s Republic of Korea. If both humans and the model perform the same kind of errors, then the performance reported here will be overestimated.
The model was able to predict whether an article made use of data with 87% accuracy evaluated on the set of articles held out of the model training. The correlation between the number of articles written about each country using data estimated under the two approaches is given in the figure below. The number of articles represents an aggregate total of
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sources:
a. National Institute for Population Research and Training, MEASURE Evaluation, International Centre for Diarrhoeal Disease Research (2012) Bangladesh Maternal Mortality and Health Care Survey 2010. Available: http://www.cpc.unc.edu/measure/publications/tr-12-87. Accessed October 15, 2012.
b. World Health Organization (ND) WHO Maternal Mortality Country Profiles. Available: www.who.int/gho/maternal_health/en/#M. Accessed 1 March 2015.
c. Lozano R, Wang H, Foreman KJ, Rajaratnam JK, Naghavi M, Marcus JR, et al. (2011) Progress towards Millennium Development Goals 4 and 5 on maternal and child mortality: an updated systematic analysis. Lancet 378(9797): 1139–65. 10.1016/S0140-6736(11)61337-8
d. UNFPA, UNICEF, WHO, World Bank (2012) Trends in maternal mortality: 1990–2010. Available: http://www.unfpa.org/public/home/publications/pid/10728. Accessed 7 October 2012.
e. Bangladesh Bureau of Statistics, Statistics Informatics Division, Ministry of Planning (December 2012) Population and Housing Census 2011, Socio-economic and Demographic Report, National Series–Volume 4. Available at: http://203.112.218.66/WebTestApplication/userfiles/Image/BBS/Socio_Economic.pdf. Accessed 15 February, 2015.
f. Mozambique National Institute of Statistics, U.S. Census Bureau, MEASURE Evaluation, U.S. Centers for Disease Control and Prevention (2012) Mortality in Mozambique: Results from a 2007–2008 Post-Census Mortality Survey. Available: http://www.cpc.unc.edu/measure/publications/tr-11-83. Accessed 6 October 2012.
g. Ministerio da Saude (MISAU), Instituto Nacional de Estatística (INE) e ICF International (ICFI). Moçambique Inquérito Demográfico e de Saúde 2011. Calverton, Maryland, USA: MISAU, INE e ICFI.
h. Mudenda SS, Kamocha S, Mswia R, Conkling M, Sikanyiti P, et al. (2011) Feasibility of using a World Health Organization-standard methodology for Sample Vital Registration with Verbal Autopsy (SAVVY) to report leading causes of death in Zambia: results of a pilot in four provinces, 2010. Popul Health Metr 9:40. 10.1186/1478-7954-9-40
i. Central Statistical Office (CSO), Ministry of Health (MOH), Tropical Diseases Research Centre (TDRC), University Teaching Hospital Virology Laboratory, University of Zambia, and ICF International Inc. 2014. Zambia Demographic and Health Survey 2013–14: Preliminary Report. Rockville, Maryland, USA. Available: http://dhsprogram.com/pubs/pdf/PR53/PR53.pdf. Accessed February 26, 2015.
j. Centers for Disease Control and Prevention (2014) Saving Mothers, Giving Life: Maternal Mortality. Phase 1 Monitoring and Evaluation Report. Atlanta, GA: Centers for Disease Control and Prevention, US Dept of Health and Human Services. Available at: http://www.savingmothersgivinglife.org/doc/Maternal%20Mortality%20(advance%20copy).pdf. Accessed 26 February 2015.
k. Central Statistical Office (CSO), Ministry of Health (MOH), Tropical Diseases Research Centre (TDRC), University of Zambia, and Macro International Inc. 2009. Zambia Demographic and Health Survey 2007. Calverton, Maryland, USA: CSO and Macro International Inc.
Comparison of Maternal Mortality Estimates: Zambia, Bangladesh, Mozambique.
https://www.gnu.org/licenses/gpl-3.0-standalone.html
Replication Package for "A Study on the Pythonic Functional Constructs' Understandability" to appear at ICSE 2024
Authors: Cyrine Zid, Fiorella Zampetti, Giuliano Antoniol, Massimiliano Di Penta
Article Preprint: https://mdipenta.github.io/files/ICSE24_funcExperiment.pdf
Artifacts: https://doi.org/10.5281/zenodo.8191782
License: GPL V3.0
This package contains folders and files with code and data used in the study described in the paper. In the following, we first provide all fields required for the submission, and then report a detailed description of all repository folders.
Artifact Description
Purpose
The artifact is about a controlled experiment aimed at investigating the extent to which Pythonic functional constructs have an impact on source code understandability. The artifact archive contains:
The material to allow replicating the study (see Section Experimental-Material)
Raw quantitative results, working datasets, and scripts to replicate the statistical analyses reported in the paper. Specifically, the executable part of the replication package reproduces figures and tables of the quantitative analysis (RQ1 and RQ2) of the paper starting from the working datasets.
Spreadsheets used for the qualitative analysis (RQ3).
We apply for the following badges:
Available and reusable: because we provide all the material that can be used to replicate the experiment, but also to perform the statistical analyses and the qualitative analyses (spreadsheets, in this case)
Provenance
Paper preprint link: https://mdipenta.github.io/files/ICSE24_funcExperiment.pdf
Artifacts: https://doi.org/10.5281/zenodo.8191782
Data
Results have been obtained by conducting the controlled experiment involving Prolific workers as participants. Data collection and processing followed a protocol approved by the University ethical board. Note that all data enclosed in the artifact is completely anonymized and does not contain sensitive information.
Further details about the provided dataset can be found in the Section Results' directory and files
Setup and Usage (for executable artifacts):
See the Section Scripts to reproduce the results, and instructions for running them
Experiment-Material/
Contains the material used for the experiment, and, specifically, the following subdirectories:
Google-Forms/
Contains (as PDF documents) the questionnaires submitted to the ten experimental groups.
Task-Sources/
Contains, for each experimental group (G-1...G-10), the sources used to produce the Google Forms, specifically:
- The cover letter (Letter.docx).
- A directory for each experimental task (Lambda 1, Lambda 2, Comp 1, Comp 2, MRF 1, MRF 2, Lambda Comparison, Comp Comparison, MRF Comparison). Each directory contains (i) the exercise text (in both Word and .txt format), (ii) the source code snippet, and (iii) its .png image to be used in the form.
Note: the "Comparison" tasks do not have any exercise, as the purpose is always the same, i.e., to compare the (perceived) understandability of the snippets and return the results of the comparison.
Code-Examples-Table1/
Contains the source code snippets used as objects of the study (the same you can find under "Task-Sources/"), named as reported in Table 1.
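For readers unfamiliar with the three construct families named in the task directories (Lambda, Comp, MRF), the following sketch pairs each with a procedural counterpart. These snippets are hypothetical illustrations written for this description; they are not taken from the study material or from Table 1.

```python
from functools import reduce

numbers = [3, 8, 5, 12]  # hypothetical input data

# Lambda: an anonymous function versus a named function definition.
square = lambda x: x * x

def square_proc(x):
    return x * x

# Comprehension: a list comprehension versus an explicit loop.
evens_f = [n for n in numbers if n % 2 == 0]

evens_p = []
for n in numbers:
    if n % 2 == 0:
        evens_p.append(n)

# MRF: map/reduce/filter versus an accumulating loop.
total_f = reduce(
    lambda acc, n: acc + n,
    map(lambda n: n * 2, filter(lambda n: n > 4, numbers)),
)

total_p = 0
for n in numbers:
    if n > 4:
        total_p += n * 2

print(square(4), evens_f, total_f)
```

In each pair both versions compute the same result; only the style of expression differs, which is the kind of contrast the experiment's "f" (functional) and "p" (procedural) treatments refer to.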
Results' directory and files
raw-responses/
Contains, as spreadsheets, the raw responses provided by the study participants through Google forms.
raw-results-RQ1/
Contains the raw results for RQ1. Specifically, the directory contains a subdirectory for each group (G1-G10). Each subdirectory contains:
- For each user (named using their Prolific IDs), a directory containing, for each question (Q1-Q6), the produced Python code (Qn.py), its output (QnR.txt), and its StdErr output (QnErr.txt).
- "expected-outputs/": a directory containing the expected outputs for each task (Qn.txt).
working-results/RQ1-RQ2-files-for-statistical-analysis/
Contains three .csv files used as input for conducting the statistical analysis and drawing the graphs for addressing the first two research questions of the study. Specifically:
ConstructUsage.csv contains the declared usage frequency of the three functional constructs that are the object of the study. This file is used to draw Figure 4. The file contains an entry for each participant, reporting the (text-coded) frequency of construct usage for Comprehension, Lambda, and MRF.
RQ1.csv contains the collected data used for the mixed-effect logistic regression relating the use of functional constructs with the correctness of the change task, as well as the logistic regression relating the use of map/reduce/filter functions with the correctness of the change task. The csv file contains an entry for each answer provided by each subject, and features the following columns:
Group: experimental group to which the participant is assigned
User: user ID
Time: task time in seconds
Approvals: number of approvals on previous tasks performed on Prolific
Student: whether the participant declared themselves as a student
Section: section of the questionnaire (lambda, comp, or mrf)
Construct: specific construct being presented (same as "Section" for lambda and comp, for mrf it says whether it is a map, reduce, or filter)
Question: question id, from Q1 to Q6, indicating the ordering of the question
MainFactor: main factor treatment for the given question - "f" for functional, "p" for procedural counterpart
Outcome: TRUE if the task was correctly performed, FALSE otherwise
Complexity: cyclomatic complexity of the construct (empty for mrf)
UsageFrequency: usage frequency of the given construct
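As a rough sketch of how the RQ1.csv layout described above could be explored before running the package's own analysis scripts, the snippet below computes the share of correct answers per treatment. It assumes a comma-delimited file whose header row uses exactly the column names listed above; the sample rows and the function name are hypothetical, and this simple proportion is not the mixed-effect logistic regression used in the paper.

```python
import csv
import io
from collections import defaultdict


def correctness_by_factor(csv_file):
    """Return {MainFactor: share of rows with Outcome == TRUE}."""
    counts = defaultdict(lambda: [0, 0])  # factor -> [correct, total]
    for row in csv.DictReader(csv_file):
        factor = row["MainFactor"]
        counts[factor][1] += 1
        if row["Outcome"].strip().upper() == "TRUE":
            counts[factor][0] += 1
    return {f: correct / total for f, (correct, total) in counts.items()}


# Hypothetical rows in the documented column order (not real study data).
SAMPLE = """Group,User,Time,Approvals,Student,Section,Construct,Question,MainFactor,Outcome,Complexity,UsageFrequency
G1,u1,120,50,yes,lambda,lambda,Q1,f,TRUE,2,often
G1,u1,95,50,yes,comp,comp,Q2,p,TRUE,3,sometimes
G2,u2,200,10,no,mrf,map,Q1,f,FALSE,,never
G2,u2,150,10,no,lambda,lambda,Q2,p,TRUE,1,often
"""

print(correctness_by_factor(io.StringIO(SAMPLE)))
```

To run it on the real file, the `io.StringIO(SAMPLE)` argument would be replaced by a file handle, for example one obtained with `open("RQ1.csv", newline="")`.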
RQ1Paired-RQ2.csv contains the collected data used for the ordinal logistic regression of the relationship between the perceived ease of understanding of the functional constructs and (i) participants' usage frequency, and (ii) constructs' complexity (except for map/reduce/filter). The file features a row for each participant, and the columns are the following:
Group: experimental group to which the participant is assigned
User: user ID
Time: task time in seconds
Approvals: number of approvals on previous tasks performed on Prolific
Student: whether the participant declared themselves as a student
LambdaF: result for the change task related to a lambda construct
LambdaP: result for the change task related to the procedural counterpart of a lambda construct
CompF: result for the change task related to a comprehension construct
CompP: result for the change task related to the procedural counterpart of a comprehension construct
MrfF: result for the change task related to an MRF construct
MrfP: result for the change task related to the procedural counterpart of an MRF construct
LambdaComp: perceived understandability level for the comparison task (RQ2) between a lambda and its procedural counterpart
CompComp: perceived understandability level for the comparison task (RQ2) between a comprehension and its procedural counterpart
MrfComp: perceived understandability level for the comparison task (RQ2) between an MRF and its procedural counterpart
LambdaCompCplx: cyclomatic complexity of the lambda construct involved in the comparison task (RQ2)
CompCompCplx: cyclomatic complexity of the comprehension construct involved in the comparison task (RQ2)
MrfCompType: type of MRF construct (map, reduce, or filter) used in the comparison task (RQ2)
LambdaUsageFrequency: self-declared usage frequency on lambda constructs
CompUsageFrequency: self-declared usage frequency on comprehension constructs
MrfUsageFrequency: self-declared usage frequency on MRF constructs
LambdaComparisonAssessment: outcome of the manual assessment of the answer to the "check question" required for the lambda comparison ("yes" means valid, "no" means wrong, "moderatechatgpt" and "extremechatgpt" are the results of GPTZero)
CompComparisonAssessment: as above, but for comprehension
MrfComparisonAssessment: as above, but for MRF
working-results/inter-rater-RQ3-files/
This directory contains four .csv files used as input for computing the inter-rater agreement for the manual labeling used for addressing RQ3. Specifically, you will find one file for each functional construct, i.e., comprehension.csv, lambda.csv, and mrf.csv, and a different file used for highlighting the reasons why participants prefer to use the procedural paradigm, i.e., procedural.csv.
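The description above does not say which agreement statistic is computed from these files. As one common choice, the sketch below computes Cohen's kappa for two raters; the use of kappa, the function name, and the example labels are all assumptions made for illustration, not details taken from the replication package.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical category labels assigned by two raters to four answers.
rater_1 = ["readability", "conciseness", "readability", "other"]
rater_2 = ["readability", "conciseness", "other", "other"]
print(round(cohen_kappa(rater_1, rater_2), 3))
```

With these example labels the raters agree on three of four items, and the chance-corrected agreement is 7/11, roughly 0.636.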
working-results/RQ2ManualValidation.csv
This file contains the results of the manual validation performed to sanitize the answers provided by our participants and used for addressing RQ2. Specifically, we coded the behaviour description using four different levels: (i) correct ("yes"), (ii) somewhat correct ("partial"), (iii) wrong ("no"), and (iv) automatically generated. The file features a row for each participant, and the columns are the following:
ID: ID we used to refer to the participant in the paper's qualitative analysis
Group: experimental group to which the participant is assigned
ProlificID: user ID
Comparison for lambda construct description: answer provided by the user for the lambda comparison task
Final Classification: our assessment of the lambda comparison answer
Comparison for comprehension description: answer provided by the user for the comprehension comparison task
Final Classification: our assessment of the comprehension comparison answer
Comparison for MRF description: answer provided by the user for the MRF comparison task
Final Classification: our assessment of the MRF comparison answer
working-results/RQ3ManualValidation.xlsx
This file contains the results of the open coding applied to address our third research question. Specifically, you will find four sheets, one for each functional construct and one for the procedural paradigm. Each sheet reports the provided answers together with the categories assigned to them. Each sheet contains the following columns:
ID: ID we used to refer to the participant in the paper's qualitative
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: Photogrammetric surface scans provide a radiation-free option to assess and classify craniosynostosis. Due to the low prevalence of craniosynostosis and high patient restrictions, clinical data are rare. Synthetic data could support or even replace clinical data for the classification of craniosynostosis, but this has never been studied systematically.
Methods: We tested the combinations of three different synthetic data sources: a statistical shape model (SSM), a generative adversarial network (GAN), and image-based principal component analysis for a convolutional neural network (CNN)–based classification of craniosynostosis. The CNN is trained only on synthetic data but is validated and tested on clinical data.
Results: The combination of an SSM and a GAN achieved an accuracy of 0.960 and an F1 score of 0.928 on the unseen test set. The difference to training on clinical data was smaller than 0.01. Including a second image modality improved classification performance for all data sources.
Conclusions: Without a single clinical training sample, a CNN was able to classify head deformities with similar accuracy as if it was trained on clinical data. Using multiple data sources was key for a good classification based on synthetic data alone. Synthetic data might play an important future role in the assessment of craniosynostosis.
https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=hdl:1902.29/CD-0227
"The Statistical Abstract of the United States, published since 1878, is the standard summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference and as a guide to other statistical publications and sources. The latter function is served by the introductory text to each section, the source note appearing below each table, and Appendix I, which comprises the Guide to Sources of Statistics, the Guide to State Statistical Abstracts, and the Guide to Foreign Statistical Abstracts. The Statistical Abstract sections and tables are compiled into one Adobe PDF named StatAbstract2007.pdf. This PDF is bookmarked by section and by table and can be searched using the Acrobat Search feature. The Statistical Abstract on CD-ROM is best viewed using Adobe Acrobat 5, or any subsequent version of Acrobat or Acrobat Reader. The Statistical Abstract tables and the metropolitan areas tables from Appendix II are available as Excel (.xls or .xlw) spreadsheets. In most cases, these spreadsheet files offer the user direct access to more data than are shown either in the publication or Adobe Acrobat. These files usually contain more years of data, more geographic areas, and/or more categories of subjects than those shown in the Acrobat version. The extensive selection of statistics is provided for the United States, with selected data for regions, divisions, states, metropolitan areas, cities, and foreign countries from reports and records of government and private agencies. Software on the disc can be used to perform full-text searches, view official statistics, open tables as Lotus worksheets or Excel workbooks, and link directly to source agencies and organizations for supporting information. Except as indicated, figures are for the United States as presently constituted.
Although emphasis in the Statistical Abstract is primarily given to national data, many tables present data for regions and individual states and a smaller number for metropolitan areas and cities. Statistics for the Commonwealth of Puerto Rico and for island areas of the United States are included in many state tables and are supplemented by information in Section 29. Additional information for states, cities, counties, metropolitan areas, and other small units, as well as more historical data are available in various supplements to the Abstract. Statistics in this edition are generally for the most recent year or period available by summer 2006. Each year over 1,400 tables and charts are reviewed and evaluated; new tables and charts of current interest are added, continuing series are updated, and less timely data are condensed or eliminated. Text notes and appendices are revised as appropriate. This year we have introduced 72 new tables covering a wide range of subject areas. These cover a variety of topics including: learning disability for children, people impacted by the hurricanes in the Gulf Coast area, employees with alternative work arrangements, adult computer and Internet users by selected characteristics, North America cruise industry, women- and minority-owned businesses, and the percentage of the adult population considered to be obese.
Some of the annually surveyed topics are population; vital statistics; health and nutrition; education; law enforcement, courts and prison; geography and environment; elections; state and local government; federal government finances and employment; national defense and veterans affairs; social insurance and human services; labor force, employment, and earnings; income, expenditures, and wealth; prices; business enterprise; science and technology; agriculture; natural resources; energy; construction and housing; manufactures; domestic trade and services; transportation; information and communication; banking, finance, and insurance; arts, entertainment, and recreation; accommodation, food services, and other services; foreign commerce and aid; outlying areas; and comparative international statistics." Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check out the CDs on the honor system. Items can be checked out for a period of two weeks. Loan forms are located adjacent to the collection.