The purpose of this project is to improve the accuracy of statistical software by providing reference datasets with certified computational results that enable the objective evaluation of statistical software. Currently, datasets and certified values are provided for assessing the accuracy of software for univariate statistics, linear regression, nonlinear regression, and analysis of variance. The collection includes both generated and 'real-world' data of varying levels of difficulty. Generated datasets are designed to challenge specific computations. These include the classic Wampler datasets for testing linear regression algorithms and the Simon & Lesage datasets for testing analysis of variance algorithms. Real-world data include challenging datasets such as the Longley data for linear regression, and more benign datasets such as the Daniel & Wood data for nonlinear regression. Certified values are 'best-available' solutions. The certification procedure is described in the web pages for each statistical method. Datasets are ordered by level of difficulty (lower, average, and higher). Strictly speaking, the level of difficulty of a dataset depends on the algorithm. These levels are merely provided as rough guidance for the user. Producing correct results on all datasets of higher difficulty does not imply that your software will pass all datasets of average or even lower difficulty. Similarly, producing correct results for all datasets in this collection does not imply that your software will do the same for your particular dataset. It will, however, provide some degree of assurance, in the sense that your package provides correct results for datasets known to yield incorrect results for some software. The Statistical Reference Datasets project is also supported by the Standard Reference Data Program.
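As a rough illustration of how a certified value can be used to evaluate software, the sketch below scores two ways of computing a sample variance against a known exact answer. The three-point dataset, the function names, and the log relative error (LRE) score are assumptions chosen for illustration; none of them is taken from the collection itself, and the exact value 1 merely plays the role of a certified value.

```python
import math


def naive_variance(data):
    """Textbook shortcut formula; prone to catastrophic cancellation."""
    n = len(data)
    mean = sum(data) / n
    sum_sq = sum(x * x for x in data)
    return (sum_sq - n * mean * mean) / (n - 1)


def two_pass_variance(data):
    """Two-pass formula: compute the mean first, then squared deviations."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def log_relative_error(computed, certified, cap=15.0):
    """Approximate count of correct significant digits, clamped to [0, cap]."""
    if computed == certified:
        return cap
    rel = abs(computed - certified) / abs(certified)
    return max(0.0, min(cap, -math.log10(rel)))


# Hypothetical dataset: a large offset with a small spread; exact variance is 1.
data = [1000000001.0, 1000000002.0, 1000000003.0]
certified = 1.0

for name, fn in [("naive", naive_variance), ("two-pass", two_pass_variance)]:
    value = fn(data)
    score = log_relative_error(value, certified)
    print(f"{name:9s} variance={value!r} LRE={score:.1f}")
```

On this input the two-pass formula reproduces the exact value, while the shortcut formula loses every significant digit because the squared terms exceed what double precision can represent exactly. This is the kind of algorithm-dependent difficulty the description refers to.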
Common Crawl Statistics
Number of pages, distribution of top-level domains, crawl overlaps, etc.: basic metrics about the Common Crawl Monthly Crawl Archives. For more detailed information and graphs, please visit our official statistics page. Here you can find the following statistics files:
Charsets
The character set or encoding is identified by Tika's AutoDetectReader, for HTML pages only. The table shows the percentage of HTML pages encoded with each character set… See the full description on the dataset page: https://huggingface.co/datasets/commoncrawl/statistics.
The ckanext-datavic-stats extension, a fork of CKAN's built-in statistics plugin, provides enhanced statistical analysis capabilities specifically tailored for data portals like data.gov.au. It modifies the core statistics functionality to provide more relevant and accurate insights. The extension focuses on presenting a refined view of data usage and engagement, optimizing the presentation of key metrics for informed decision-making.
Key Features:
Exclusion of Private Datasets: Removes private datasets from almost all statistical calculations (except for the top users section), ensuring that public statistics accurately reflect openly accessible data. This prevents skewing of results due to internal or non-public data.
Summary Page Enhancement: Introduces or enhances a summary stats page, allowing administrators and users to quickly grasp key metrics related to dataset usage and engagement. This includes important data regarding the number of available public datasets.
Activity Summary Page: Adds a dedicated activity summary page that aggregates public data activity metrics, providing a consolidated view of user interactions, dataset updates, and other relevant portal events. This enhances transparency and allows for better tracking of data portal usage.
Organization-Level Public/Private Dataset Counts: Provides a dedicated page for organizations outlining the count of both public and private datasets they manage. This enhances organizational transparency and allows for easy auditing of data visibility settings.
Use Cases:
Data Portals: Enhances the statistical reporting capabilities of open data portals by focusing statistics on public datasets, providing portal administrators and users with more actionable insights into the use of openly accessible data and ensuring key statistics reflect the portal's publicly available data offerings.
Data Governance: Helps organisations monitor and maintain their data portfolios, with clear indicators of how much data is public.
Technical Integration: This extension modifies the existing CKAN statistics plugin, seamlessly integrating with CKAN's user interface and backend. It likely overrides or extends the default statistical calculations and templates to implement its specific features without requiring substantial alterations to the core CKAN system.
Benefits & Impact: By focusing statistics on public datasets, ckanext-datavic-stats provides portal administrators a clearer picture of which data is used and how. Added summary pages and organization data counts provide better ways for end users to understand their catalog's data, helping increase its value and utility. The exclusion of private datasets makes metrics more relevant for external users interested in open data usage trends and more targeted insights that can inform promotion and improvement strategies.
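The private-dataset exclusion and the organization-level counts described above could be sketched roughly as follows. This is not the extension's actual code: the helper names and the sample records are hypothetical, and the only assumption about CKAN is that each dataset dictionary carries a boolean `private` flag and an `organization` entry with a `name`.

```python
from collections import Counter


def count_by_visibility(datasets):
    """Return {org_name: Counter({'public': n, 'private': m})}."""
    counts = {}
    for ds in datasets:
        org = (ds.get("organization") or {}).get("name", "(no organization)")
        key = "private" if ds.get("private", False) else "public"
        counts.setdefault(org, Counter())[key] += 1
    return counts


def public_only(datasets):
    """Drop private datasets before computing public-facing statistics."""
    return [ds for ds in datasets if not ds.get("private", False)]


# Hypothetical sample records, not real portal data.
sample = [
    {"name": "a", "private": False, "organization": {"name": "org-1"}},
    {"name": "b", "private": True, "organization": {"name": "org-1"}},
    {"name": "c", "private": False, "organization": {"name": "org-2"}},
]

for org, c in sorted(count_by_visibility(sample).items()):
    print(org, "public:", c["public"], "private:", c["private"])
print("datasets counted in public statistics:", len(public_only(sample)))
```

Filtering once with `public_only` and reusing the result for every public-facing metric keeps the exclusion rule in one place, which is one plausible way to keep it consistent across pages.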
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains statistics (key metrics) related to the Unleashed Facebook page (https://www.facebook.com/UnleashedADL/). Unleashed is an open data competition, an initiative of the Office for Digital Government, Department of the Premier and Cabinet. This data is used to monitor the level of audience engagement and to make communication about the event more effective.
The dataset collection in question is a compilation of related data tables sourced from the website of Tilastokeskus (Statistics Finland) in Finland. The data in the collection is organized in a tabular format comprising rows and columns, each holding related data. The collection includes several tables, each of which represents different years, providing a temporal view of the data. The description provided by the data source, Tilastokeskuksen palvelurajapinta (Statistics Finland's service interface), suggests that the data is likely to be statistical in nature and could be related to regional statistics, given the nature of the source. This dataset is licensed under CC BY 4.0 (Creative Commons Attribution 4.0, https://creativecommons.org/licenses/by/4.0/deed.fi).
The statistic above shows the ratio of editorial and advertising pages in U.S. magazines from 2000 to 2013. In 2004, 51.9 percent of all magazine pages were filled with editorial content. Here you can find an overview of the editorial content topics.
The dga-stats extension for CKAN enhances the platform's built-in statistics functionality. Adapted from CKAN's original statistics plugin, it provides data.gov.au-specific modifications focused on dataset visibility and summary reporting. By excluding private datasets from the majority of statistics and introducing dedicated summary pages, the extension aims to provide a clearer and more relevant overview of the CKAN instance's data landscape, helping users and administrators to better understand data usage and trends.
Key Features:
Exclusion of Private Datasets: Removes private datasets from most statistics calculations, providing a public-facing view of dataset popularity and usage. This ensures that only publicly accessible data influences key performance indicators reported through the dashboard. The exception is the top users section, which includes interactions with private datasets.
Summary Page: Adds a dedicated summary page providing a high-level overview of key metrics on the CKAN instance. This offers a one-stop shop for quick access to essential information about the portal's data holdings.
Activity Summary Page: Introduces a page specifically designed to aggregate and display activity-related statistics. It offers insights into user engagement and data interaction patterns.
Organization Public/Private Dataset Count Page: Provides a breakdown of public and private datasets within each organization. This reporting feature provides information about how organizations are managing their data within the platform.
Technical Integration: The dga-stats extension modifies the existing CKAN statistics plugin. It achieves this by integrating custom code and configurations within the CKAN framework.
Benefits & Impact: The dga-stats extension provides enhanced visibility into data usage on CKAN instances, optimized for platforms like data.gov.au. These enhancements offer improved insights for site administrators and increased transparency for users by emphasizing public dataset statistics and providing comprehensive summary reporting.
https://www.sci-tech-today.com/privacy-policy
Landing Page Statistics: Landing pages are dedicated web pages designed to convert visitors into leads or customers by focusing on a single, clear call to action. In 2024, the median landing page conversion rate across industries is 6.6%, with top-performing pages exceeding 20%. Email-driven traffic achieves the highest average conversion rate at 19.3%, outperforming paid search (10.9%) and paid social (12%).
Mobile devices account for 82.9% of landing page traffic, yet desktop users exhibit a higher average conversion rate of 12.1% compared to 11.2% for mobile users. Speed is crucial; a one-second delay in page load time can reduce conversions by 7%. Incorporating videos can boost conversions by 86%, and personalized landing pages can convert 202% better than generic ones.
Design elements significantly impact performance. Landing pages with five or fewer form fields convert 120% better than those with more fields. Pages with a single, clear call to action achieve a 13.5% conversion rate, compared to 11.9% for pages with multiple CTAs. Additionally, 38.6% of marketers report that videos enhance landing page conversion rates more than any other element.
Let us look at some landing page statistics on performance and on what makes a landing page succeed.
Population is the sum of births plus in-migration, and it signifies the total market size possible in the area. This is an important metric for economic developers to measure their economic health and investment attraction. Businesses also use this as a metric for market size when evaluating startup, expansion or relocation decisions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Page by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Page across both sexes and to determine which sex constitutes the majority.
Key observations
There is a slight majority of male population, with 50.57% of total population being male. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported from the Census Bureau.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Page Population by Race & Ethnicity. You can refer to it here
As of August 2024, among the top ten classified websites worldwide, craigslist.org experienced the highest number of pages per visit. During a session, online consumers browsed an average of 28.47 pages on craigslist. The website finn.no followed with around 19.68 pages per visit.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Information on the most visited pages of each website, limited to pages with more than 100 views per day. Updated daily.
Google.com was the website with the most page views per day in Bolivia in February 2022, according to ranking by Alexa. The website had more than 18.49 daily page views and was followed by Unitel.bo, with 11 page views per day that month. Within Latin America, Mexico was the country where Amazon Alexa contained the largest number of skills.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Page by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Page across both sexes and to determine which sex constitutes the majority.
Key observations
There is a majority of male population, with 57.26% of total population being male. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported from the Census Bureau.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Page Population by Race & Ethnicity. You can refer to it here
This dataset collection is a compilation of related data tables sourced from 'Tilastokeskus' (Statistics Finland), a website based in Finland. It offers a service interface (WFS) that provides statistical data. The data in this collection is arranged in table format, with each table comprising columns and rows. The tables included in this collection offer comprehensive and related data, making this dataset an invaluable resource for those seeking statistical information. This dataset is licensed under CC BY 4.0 (Creative Commons Attribution 4.0, https://creativecommons.org/licenses/by/4.0/deed.fi).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Page by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Page. The dataset can be utilized to understand the population distribution of Page by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Page. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups for Page.
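A minimal sketch of the two uses mentioned above, finding the largest age group per sex and computing a gender ratio per age group, might look like this. The counts are hypothetical and are not taken from the ACS data, the function names are illustrative, and the ratio is assumed to mean males per 100 females, which the description does not state.

```python
def gender_ratio(male, female):
    """Males per 100 females; None when the group has no females."""
    if female == 0:
        return None
    return 100.0 * male / female


def largest_group(rows, column):
    """Label of the age group with the largest count in the given column."""
    return max(rows, key=lambda r: r[column])["age_group"]


# Hypothetical counts for illustration only; not taken from the ACS data.
rows = [
    {"age_group": "0-4 years", "male": 10, "female": 8},
    {"age_group": "5-9 years", "male": 12, "female": 0},
    {"age_group": "10-14 years", "male": 6, "female": 12},
]

for r in rows:
    print(r["age_group"], gender_ratio(r["male"], r["female"]))
print("largest male group:", largest_group(rows, "male"))
print("largest female group:", largest_group(rows, "female"))
```

With populations this small, empty groups are plausible, so the ratio returns `None` instead of dividing by zero.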
Key observations
Largest age group (population): Male # 5-9 years (14) | Female # 30-34 years (17). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Page Population by Gender. You can refer to it here
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Statistics on the Open Data site’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from http://data.europa.eu/88u/dataset/https-mon-saint-quentin-hub-arcgis-com-datasets-5426305826594a33a561acfd02d25808_0 on 12 January 2022.
--- Dataset description provided by original source is as follows ---
Statistics on official and obsolete data batches, broken down by actor in the portal.
Definition of obsolete: A batch of data is considered obsolete when obvious defects have been detected by a quality check, or when the business department responsible for maintaining the batch no longer has an update strategy for it.
Definition of official: The batch is usable and suitable.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wave statistics computed using output from the NOAA WWIII hindcast simulations, spanning thirty years from 1980 to 2009. The statistics are computed based on frequency-directional variance density spectra every three hours for 1951 locations in US waters.
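The description does not list which statistics are included, so the sketch below uses significant wave height only as a familiar example of a quantity derived from a frequency-directional variance density spectrum. It integrates the spectrum to get the zeroth moment m0 and applies the standard spectral estimate Hm0 = 4·sqrt(m0). The flat synthetic spectrum, the grid, and the function names are assumptions for illustration, not values or methods from the hindcast.

```python
import math


def trapezoid(y, x):
    """Trapezoidal rule for samples y at increasing points x."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )


def zeroth_moment(spectrum, freqs, dirs):
    """Integrate E(f, theta) over direction, then over frequency."""
    per_freq = [trapezoid(row, dirs) for row in spectrum]
    return trapezoid(per_freq, freqs)


def significant_wave_height(spectrum, freqs, dirs):
    """Spectral estimate Hm0 = 4 * sqrt(m0), in metres for E in m^2/Hz/rad."""
    return 4.0 * math.sqrt(zeroth_moment(spectrum, freqs, dirs))


# Synthetic flat spectrum (hypothetical values): E = 0.5 m^2/Hz/rad everywhere.
freqs = [0.05 + 0.01 * i for i in range(11)]        # Hz, from 0.05 to 0.15
dirs = [2.0 * math.pi * j / 36 for j in range(37)]  # rad, from 0 to 2*pi
spectrum = [[0.5 for _ in dirs] for _ in freqs]

print(round(significant_wave_height(spectrum, freqs, dirs), 3))
```

For a flat spectrum the trapezoidal rule is exact up to rounding, so m0 here equals 0.5 × 0.1 Hz × 2π rad, which makes the example easy to check by hand.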
The dataset collection in question comprises related data tables sourced from the website of 'Tilastokeskus' (Statistics Finland), based in Finland. These tables contain valuable information provided through Statistics Finland's service interface (WFS). The collection is organized into an accessible table format, making it easy to navigate through the columns and rows of data. This dataset collection is an invaluable resource for those looking for comprehensive statistical data from Finland. This dataset is licensed under CC BY 4.0 (Creative Commons Attribution 4.0, https://creativecommons.org/licenses/by/4.0/deed.fi).
Attribution-NoDerivs 4.0 (CC BY-ND 4.0) https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Vatican Data Series {title at top of page}
Data Developers: Burhans, Molly A., Cheney, David M., Emege, Thomas, Gerlt, R. “Vatican Data Series {title at top of page}”. Scale not given. Version 1.0. MO and CT, USA: GoodLands Inc., Catholic Hierarchy, Environmental Systems Research Institute, Inc., 2019.
Web map developer: Molly Burhans, October 2019
Web app developer: Molly Burhans, October 2019
GoodLands’ polygon data layers, version 2.0 for global ecclesiastical boundaries of the Roman Catholic Church:
Although care has been taken to ensure the accuracy, completeness and reliability of the information provided, this is the first developed dataset of global ecclesiastical boundaries curated from many sources, so it may have a higher margin of error than established geopolitical administrative boundary maps. Boundaries need to be verified with appropriate Ecclesiastical Leadership. The current information is subject to change without notice. No parties involved with the creation of this data are liable for indirect, special or incidental damage resulting from, arising out of or in connection with the use of the information. We referenced 1960 sources to build our global datasets of ecclesiastical jurisdictions. Often, they were isolated images of dioceses, historical documents and information about parishes that were cross checked. These sources can be viewed here: https://docs.google.com/spreadsheets/d/11ANlH1S_aYJOyz4TtG0HHgz0OLxnOvXLHMt4FVOS85Q/edit#gid=0
To learn more or contact us please visit: https://good-lands.org/
The Catholic Leadership global maps information is derived from the Annuario Pontificio, which is curated and published by the Vatican Statistics Office annually, and digitized by David Cheney at Catholic-Hierarchy.org. Updates are supplemented with diocesan and news announcements. GoodLands maps this into global ecclesiastical boundaries.
Admin 3 Ecclesiastical Territories:
Burhans, Molly A., Cheney, David M., Gerlt, R. “Admin 3 Ecclesiastical Territories For Web”. Scale not given. Version 1.2. MO and CT, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2019.
Derived from:
Global Diocesan Boundaries:
Burhans, M., Bell, J., Burhans, D., Carmichael, R., Cheney, D., Deaton, M., Emge, T., Gerlt, B., Grayson, J., Herries, J., Keegan, H., Skinner, A., Smith, M., Sousa, C., Trubetskoy, S. “Diocesean Boundaries of the Catholic Church” [Feature Layer]. Scale not given. Version 1.2. Redlands, CA, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2016.
Using: ArcGIS. 10.4. Version 10.0. Redlands, CA: Environmental Systems Research Institute, Inc., 2016.
Boundary Provenance
Statistics and Leadership Data
Cheney, D.M. “Catholic Hierarchy of the World” [Database]. Date Updated: August 2019. Catholic Hierarchy. Using: Paradox. Retrieved from Original Source.
Catholic Hierarchy
Annuario Pontificio per l’Anno .. Città del Vaticano: Tipografia Poliglotta Vaticana, Multiple Years.
The data for these maps was extracted from the gold standard of Church data, the Annuario Pontificio, published yearly by the Vatican. The collection and data development of the Vatican Statistics Office are unknown. GoodLands is not responsible for errors within this data. We encourage people to document and report errant information to us at data@good-lands.org or directly to the Vatican.
Additional information about regular changes in bishops and sees comes from a variety of public diocesan and news announcements.