Tablets can be used to facilitate systematic testing of academic skills. Yet, when using validated paper tests on tablet, comparability between the mediums must be established. In this dataset, comparability between a tablet and a paper version of a basic math skills test (HRT: Heidelberger Rechen Test 1–4) was investigated.
Four of the five samples included in the current study covered a broad spectrum of schools regarding student achievement in mathematics, proportion of non-native students, parental educational levels, and diversity of ethnic background. The fifth sample, the intervention sample in the Apps-project, presented with similar characteristics except for mathematical achievement, where it showed lower results.
To examine the test-retest reliability of the tablet versions of HRT and the Math Battery, several samples were tested twice on each measure in various contexts. To test the correlation between the paper and tablet versions of HRT, the participants were tested on both paper and tablet versions of HRT using a counterbalanced design to avoid potential order effects. This sample is referred to as the Different formats sample. Finally, norms were collected for HRT, the Math Battery and the mathematical word problem-solving measure. This sample (called the Normative sample) was also used to investigate the correlation, or convergent validity, between HRT and the Math Battery (third hypothesis).
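As a minimal sketch (not the authors' analysis code) of how such format and test-retest correlations can be computed, with hypothetical file and column names standing in for the actual variables:

import pandas as pd
from scipy.stats import pearsonr

# Hypothetical file and column names for the Different formats sample.
scores = pd.read_csv("different_formats_sample.csv")

# Paper-tablet comparability (correlation between formats).
r_format, p_format = pearsonr(scores["hrt_paper"], scores["hrt_tablet"])

# Test-retest reliability (first vs. second tablet administration).
r_retest, p_retest = pearsonr(scores["hrt_tablet_t1"], scores["hrt_tablet_t2"])

print(f"paper vs tablet: r = {r_format:.2f} (p = {p_format:.3f})")
print(f"test-retest:     r = {r_retest:.2f} (p = {p_retest:.3f})")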
See article "Tablets instead of paper-based tests for young children? Comparability between paper and tablet versions of the mathematical Heidelberger Rechen Test 1-4" by Hassler Hallstedt (2018) for further information.
The dataset was originally published in DiVA and moved to SND in 2024.
This dataset contains hourly pedestrian counts since 2009 from pedestrian sensor devices located across the city. The data is updated on a monthly basis and can be used to determine variations in pedestrian activity throughout the day.
The sensor_id column can be used to merge the data with the Pedestrian Counting System - Sensor Locations dataset which details the location, status and directional readings of sensors. Any changes to sensor locations are important to consider when analysing and interpreting pedestrian counts over time.
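As a hedged illustration of that merge (file names, and all column names other than sensor_id, are assumptions rather than the published schema):

import pandas as pd

# Hourly counts and sensor metadata; file names are placeholders.
counts = pd.read_csv("pedestrian_counts_hourly.csv")
sensors = pd.read_csv("pedestrian_sensor_locations.csv")

# Attach location and status attributes to each hourly count via sensor_id.
merged = counts.merge(sensors, on="sensor_id", how="left")

# Example: mean count per sensor and hour of day (column names assumed).
profile = (
    merged.groupby(["sensor_id", "time_of_day"])["hourly_count"]
          .mean()
          .reset_index()
)
print(profile.head())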
Important notes about this dataset:
• Where no pedestrians have passed underneath a sensor during an hour, a count of zero will be shown for the sensor for that hour.
• Directional readings are not included, though we hope to make this available later in the year. Directional readings are provided in the Pedestrian Counting System – Past Hour (counts per minute) dataset.
The Pedestrian Counting System helps to understand how people use different city locations at different times of day to better inform decision-making and plan for the future. A representation of pedestrian volume which compares each location on any given day and time can be found in our Online Visualisation.
Related datasets:
Pedestrian Counting System – Past Hour (counts per minute)
Pedestrian Counting System - Sensor Locations
There are many parameters that make a song popular, and even more features that make guitar players go and learn to play that song. Can we use the basic information gathered from the https://www.ultimate-guitar.com/ list of 5-star-rated guitar songs to understand the underlying patterns that drive guitar players?
Description of a guitar tab instance from the list of the top rated guitar tabs.
All rights to the data belong to the team at https://www.ultimate-guitar.com/, who work really hard on their amazing website.
Let's try to understand what affects a tab's popularity: what drives more guitar players to learn a certain song?
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Here we present data used in the publication "A Deep Dive into the Delta Wave: Forecasting SARS-CoV-2 Epidemic Dynamic in Poland with the pDyn Agent-Based Model". We used data from various sources: Polish Ministry of Health (1), Eurostat (2), GISAID (ECDC used as a proxy for accessing the data) (3), and the Polish National Institute of Public Health (4). Additionally, the data set contains results from agent-based simulation.
sim_zgony_woj.tab - Simulated number of deaths in voivodeships. Figure no. 7.
sim_zgony_pl.tab - Number of simulated deaths in Poland. Figure no. 5.
sim_wykr_woj.tab - Simulated registered cases in voivodeships. Figure no. 7.
sim_wykr_pl.tab - Simulated registered cases in Poland. Figure no. 5.
sim_variants.tab - Simulated number of cases of variants. Figure no. 2a.
sim_oiom_woj.tab - Simulated number of people in ICU in voivodeships. Figure no. 7.
sim_oiom_pl.tab - Simulated number of people in ICU. Figure no. 5.
sim_hosp_woj.tab - Simulated number of hospitalized people in voivodeships. Figure no. 7.
sim_hosp_pl.tab - Simulated number of hospitalized people. Figure no. 5.
real_zgony_woj.tab - Number of deaths in voivodeships. Figure no. 7. Source 1.
real_zgony_pl.tab - Number of registered deaths in Poland averaged over a 7-day window. Figure no. 5. Source 1.
real_zgony_excess_pl.tab - Estimated excess deaths from Eurostat. Figure no. 5. Source 2.
real_wykr_woj.tab - Number of registered cases in voivodeships averaged over a 7-day window. Figure no. 7. Source 1.
real_wykr_pl.tab - Number of registered cases in Poland averaged over a 7-day window. Figure no. 5. Source 1.
real_oiom_woj.tab - Number of people in ICU in voivodeships. Figure no. 7. Source 1.
real_oiom_pl.tab - Number of people in ICU. Figure no. 5. Source 1.
real_hosp_woj.tab - Number of hospitalized people in voivodeships. Figure no. 7. Source 1.
real_hosp_pl.tab - Number of hospitalized people in Poland. Figure no. 5. Source 1.
obserco.tab - OBSER-CO study results. Figure no. 3. Source 4.
immun.tab - Simulated number of recovered, vaccinated, 'recov. or vaccin.'. Figure no. 3.
gisaid_samples.tab - Number of samples of variants. Figure no. 2c. Source 3.
real_nowe_zaszcz.tab - Fraction of vaccinated cases among all registered cases averaged over a 31-day window. Figure no. 4. Source 1.
sim_nowe_zaszcz.tab - Fraction of vaccinated cases among all simulated cases averaged over a 31-day window. Figure no. 4.
sim_fraction_variants.tab - Simulated prevalence of variants (in %). Figure no. 2b.
gisaid.tab - Data on prevalence of variants; shown as dashed "ECDC data" lines in Figure no. 2b.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ACAPS' Humanitarian Access Dataset brings together information on humanitarian access constraints during the first half of 2021. The report resulting from this dataset is the Humanitarian Access Overview, which outlines how access to humanitarian assistance continues to be restricted for crisis-affected populations in more than 60 countries.
Data from the dataset is coded according to access dimensions, indicators and sub-indicators, then scored on a 0 to 5 severity scale.
According to this data, we identified the following levels of access constraints:
Extreme access constraints (Level 5) for: Eritrea, Libya, Syria, Yemen.
Very high access constraints (Level 4) for: Afghanistan, Bangladesh, Cameroon, Democratic Republic of Congo (DRC), Ethiopia, Iraq, Mali, Myanmar, Nigeria, Palestine, Somalia, South Sudan, Venezuela.
High access constraints (Level 3) for: Azerbaijan, Burkina Faso, Central African Republic (CAR), Chad, Colombia, Democratic People’s Republic of Korea (DPRK), Honduras, India, Iran, Lebanon, Mozambique, Nicaragua, Niger, Pakistan, Sudan, Turkey, Ukraine.
For more information, download the dataset. To access ACAPS' Humanitarian Access Overview, please visit: https://www.acaps.org/en/thematics/all-topics/humanitarian-access
This dataset was created as part of the World's Largest Game of Rock, Paper, Scissors talk and challenge introduced by Joseph Nelson and Salo Levy @ SXSW 2023. * https://roboflow.com/rps
* the above image is linked to the entry page
(Image: Version 11, Deploy Tab inference)
* the demo video was prepared from the Deploy Tab, and utilizes v11 (YOLOv8n, 100 epochs)
The dataset includes an aggregation of images cloned from the following datasets:
1. https://universe.roboflow.com/brad-dwyer/egohands-public/ - null images
2. https://universe.roboflow.com/presentations/rock-paper-scissors-presentation/
3. https://universe.roboflow.com/team-roboflow/rock-paper-scissors-detection
4. universe.roboflow.com/popular-benchmarks/mit-indoor-scene-recognition/
New images were added to the dataset and labeled to supplement the examples from the cloned datasets. Members of Team Roboflow and close friends of the team are included in the dataset to help create a more robust, generalized model.
(Image: example labeled image from the dataset, showing two people playing Rock, Paper, Scissors)
* the above image is linked to the FAQ and contest entry page
The number of tablet users in Malaysia was forecast to increase continuously between 2024 and 2029 by a total of 1.6 million users (+20.13 percent). After the fifteenth consecutive year of growth, the number is estimated to reach 9.54 million users, a new peak, in 2029. Notably, the number of tablet users has been increasing continuously over the past years. Depicted is the number of tablet users in the country or region at hand. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find further information concerning Thailand and the Philippines.
PyPSA-Eur is an open model dataset of the European power system at the transmission network level that covers the full ENTSO-E area. It can be built using the code provided at https://github.com/PyPSA/PyPSA-eur.
It contains alternating current lines at and above 220 kV voltage level and all high voltage direct current lines, substations, an open database of conventional power plants, time series for electrical demand and variable renewable generator availability, and geographic potentials for the expansion of wind and solar power.
Not all data dependencies are shipped with the code repository, since git is not suited for handling large changing files. Instead we provide separate data bundles to be downloaded and extracted as noted in the documentation.
This is the full data bundle to be used for rigorous research. It includes large bathymetry and natural protection area datasets.
While the code in PyPSA-Eur is released as free software under the MIT License, different licenses and terms of use apply to the various input data, which are summarised below:
corine/*
CORINE Land Cover (CLC) database
Source: https://land.copernicus.eu/pan-european/corine-land-cover/clc-2012/
Terms of Use: https://land.copernicus.eu/pan-european/corine-land-cover/clc-2012?tab=metadata
natura/*
Natura 2000 natural protection areas
Source: https://www.eea.europa.eu/data-and-maps/data/natura-10
Terms of Use: https://www.eea.europa.eu/data-and-maps/data/natura-10#tab-metadata
gebco/GEBCO_2014_2D.nc
GEBCO bathymetric dataset
Source: https://www.gebco.net/data_and_products/gridded_bathymetry_data/version_20141103/
Terms of Use: https://www.gebco.net/data_and_products/gridded_bathymetry_data/documents/gebco_2014_historic.pdf
je-e-21.03.02.xls
Population and GDP data for Swiss Cantons
Source: https://www.bfs.admin.ch/bfs/en/home/news/whats-new.assetdetail.7786557.html
Terms of Use:
https://www.bfs.admin.ch/bfs/en/home/fso/swiss-federal-statistical-office/terms-of-use.html
https://www.bfs.admin.ch/bfs/de/home/bfs/oeffentliche-statistik/copyright.html
nama_10r_3popgdp.tsv.gz
Population by NUTS3 region
Source: http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=nama_10r_3popgdp&lang=en
Terms of Use:
https://ec.europa.eu/eurostat/about/policies/copyright
GDP_per_capita_PPP_1990_2015_v2.nc
Gross Domestic Product per capita (PPP) for the years 1990 to 2015
Rectangular cutout for European countries in PyPSA-Eur, including a 10 km buffer
Kummu et al. "Data from: Gridded global datasets for Gross Domestic Product and Human Development Index over 1990-2015"
Source: https://doi.org/10.1038/sdata.2018.4
ppp_2019_1km_Aggregated.tif
The spatial distribution of population in 2020: Estimated total number of people per grid-cell. The dataset is available to download in Geotiff format at a resolution of 30 arc-seconds (approximately 1 km at the equator). The projection is Geographic Coordinate System, WGS84. The units are number of people per pixel. The mapping approach is Random Forest-based dasymetric redistribution.
Rectangular cutout for non-NUTS3 countries in PyPSA-Eur, i.e. MD and UA, including a 10 km buffer
WorldPop (www.worldpop.org - School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University (2018). Global High Resolution Population Denominators Project - Funded by The Bill and Melinda Gates Foundation (OPP1134076). https://dx.doi.org/10.5258/SOTON/WP00647
Source: https://data.humdata.org/dataset/worldpop-population-counts-for-world and https://hub.worldpop.org/geodata/summary?id=24777
License: Creative Commons Attribution 4.0 International License
data/bundle/era5-HDD-per-country.csv
data/bundle/era5-runoff-per-country.csv
shipdensity_global.zip
Global Shipping Traffic Density
Creative Commons Attribution 4.0
https://datacatalog.worldbank.org/search/dataset/0037580/Global-Shipping-Traffic-Density
seawater_temperature.nc
Global Ocean Physics Reanalysis
Seawater temperature at 5m depth
Link: https://data.marine.copernicus.eu/product/GLOBAL_MULTIYEAR_PHY_001_030/services
License: https://marine.copernicus.eu/user-corner/service-commitments-and-licence
MeteoNet is a meteorological dataset developed and made available by METEO FRANCE, the French national meteorological service.
Our goal is to provide an easy and ready-to-use dataset for data scientists who want to try their hand at weather data. A complete and clean dataset is a rare and precious thing for a data scientist, and this is what we wanted to provide to the community.
As we are always willing to improve our knowledge of meteorology and weather forecasting, we hope that by opening this dataset to the research community, people around the world will find new ways of bringing value to meteorology with data science.
Let's start playing with the dataset! You can find some challenge proposals in the Tasks tab. Here are some interesting challenge ideas:
- Data visualization and exploration
- Time series prediction
- Observation data correction, e.g. detecting aberrant values
- Rainfall and cloud cover nowcasting
- Improving weather forecasts by crossing weather predictions with other data, such as observations
- Creating new forecasting methods
- ...
If you have any challenge ideas, do not hesitate to propose them via the Tasks tab!
Several notebooks are available to quickly visualize and explore the dataset (notebooks with names starting with Open...). The notebook Superimpose data gives some ways to superimpose the data from different sources (observations, forecasts, point data, grid data...).
For example, you can also browse a first notebook about time series prediction (called Univariate Time Series Prediction Tmax).
The dataset contains full time series of the following data types:
- Ground observations: over 500 ground stations measuring pressure, temperature, humidity, wind direction and speed, dew point and precipitation, recorded every 6 min.
- Precipitation radar: radar reflectivity and total rainfall measured every 5 min.
- Satellite data: Cloud Type (CT) every 15 min; channels (visible, infrared) every 1 hour.
- Weather models: forecasts from 2 weather models with 2D parameters, generated once a day (3D parameters will be added in a next version).
- Land-sea and relief masks
To keep this Kaggle dataset at a reasonable size, the data covers only one geographic area of 550 km x 550 km on the Brittany coast and spans up to 3 years, 2016 to 2018 (the exact coverage depends on the data type, to limit the size on Kaggle).
You can find the full MeteoNet dataset at MeteoNet's Website. Documentation about this dataset is also available online. You can post any question or suggestion about MeteoNet in our Slack workspace. The GitHub project contains a toolbox written in Python, which includes data samples from MeteoNet, and our tutorials/documentation, which help you explore and cross-check all data types. You can also find the slides of the first training session on the GitHub project.
We hope that data science will give us a new approach to meteorology and help us anticipate dangerous weather events such as floods or heat waves.
The Dataset is licenced by METEO FRANCE under Etalab Open Licence 2.0.
When using this dataset, please cite: Gwennaëlle Larvor, Léa Berthomier, Vincent Chabot, Brice Le Pape, Bruno Pradel, Lior Perez. MeteoNet, an open reference weather dataset by METEO FRANCE, 2020
https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/cc-by/cc-by_f24dc630aa52ab8c52a0ac85c03bc35e0abc850b4d7453bdc083535b41d5a5c3.pdf
ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis.

Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations, and when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product.

ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. Ensemble mean and spread have been pre-computed for convenience. Such uncertainty estimates are closely related to the information content of the available observing system, which has evolved considerably over time. They also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated too, though monthly means are not available for the ensemble mean and spread.

ERA5 is updated daily with a latency of about 5 days (monthly means are available around the 6th of each month). If serious flaws are detected in this early release (called ERA5T), the data could differ from the final release 2 to 3 months later. If this occurs, users are notified.

The data set presented here is a regridded subset of the full ERA5 data set on native resolution. It is online on spinning disk, which should ensure fast and easy access. It should satisfy the requirements for most common applications. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data on native resolution is provided in these guidelines. Data has been regridded to a regular lat-lon grid of 0.25 degrees for the reanalysis and 0.5 degrees for the uncertainty estimate (0.5 and 1 degree respectively for ocean waves). There are four main subsets: hourly and monthly products, both on pressure levels (upper air fields) and single levels (atmospheric, ocean-wave and land surface quantities). The present entry is "ERA5 monthly mean data on single levels from 1940 to present".
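As an illustration of programmatic access (not part of the catalogue entry itself), the sketch below retrieves a small slice of this product with the cdsapi client. The dataset identifier and request keys are assumptions based on the usual CDS naming scheme; check the entry's download form for the authoritative names. A configured ~/.cdsapirc with CDS credentials is required.

import cdsapi

client = cdsapi.Client()
client.retrieve(
    "reanalysis-era5-single-levels-monthly-means",   # assumed CDS identifier
    {
        "product_type": "monthly_averaged_reanalysis",
        "variable": "2m_temperature",
        "year": "2020",
        "month": ["01", "02", "03"],
        "time": "00:00",
        "format": "netcdf",
    },
    "era5_monthly_t2m_2020q1.nc",
)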
The dataset holds data on the cinema-concert program of the Belgian movie theatre Cinema Zoologie. The cinema opened in October 1915, in then German-occupied Antwerp, and remained open continuously until May 1936. It was owned and operated by the Royal Zoological Society of Antwerp and located at the premises of the Antwerp Zoological Garden (Statieplein 1, Antwerp). The venue seated 2100 to 2400 people. The mixed program consisted of films and classical musical pieces.

The dataset consists of 5469 individual records (films or music pieces), forming 898 cinema-concert programs. Each record (= each line) in the dataset corresponds to one program element from a Cinema Zoologie program. This can be a film showing, a musical performance, or a theatrical number. (Many of the dataset's fields are specific to film showings, and hence are left empty in the case of musical or theatrical performances.) Each record has the following fields. When a field is left empty, this means that the information was not available to the researchers.

Number: Fixed record number
Type: The type of program element (F for film, M for musical performance, T for theatrical performance)
Original title: The title of the film (or musical composition, or theatre show) in its original language
Dutch title: The Dutch title which was used for presenting the film in the Cinema Zoologie program
French title: The French title which was used for presenting the film in the Cinema Zoologie program
IMDb identifier: The URL that identifies this particular film at the Internet Movie Database (IMDb)
Director: The director(s) of the film. When more than one director was involved, the names are separated by semicolons (e.g. Max Obal; Rudolf Walther-Fein).
Year: The year when the film was originally released
Country: The country (or countries) where the film was produced. (A list of countries may be found below under "Abbreviations".) In case of co-productions, multiple countries are separated by a slash (e.g. D / F for Germany / France).
Producer: The company which produced the film
Program identifier: The archival record number of the program booklet where the information for this record was found
Start date: The date of the first screening (or performance), using the format dd-mm-yyyy
End date: The date of the last screening (or performance), using the format dd-mm-yyyy
Description: Additional information (in French or Dutch) about the film, given in the program booklet

Abbreviations: Belgium (B), Switzerland (CH), Denmark, Spain (E), Italy (I), Morocco (MA), United States (USA), India (IND), Japan (J), The Netherlands (NL), Poland (PL), Russia (RUS), Sweden (S), South Africa (ZA), Great Britain (UK). In case of co-productions, multiple countries are separated by a slash (e.g. D / F for Germany / France).

The dataset is stored in the tab-separated values (TSV) format. This was chosen as an alternative to the common comma-separated values (CSV) format, which often causes difficulties because of the need to escape commas. The text encoding of the TSV file is UTF-8.

DANS added a .csv version of the data file and has added a copy of the above remarks to the dataset in a file titled 'remarks.txt'.
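As a short, hedged loading sketch (the file name is a placeholder; the column names follow the field list above, but the actual header row should be checked):

import pandas as pd

programs = pd.read_csv(
    "cinema_zoologie.tsv",   # placeholder file name
    sep="\t",
    encoding="utf-8",
    dtype=str,               # keep identifiers and dates as text initially
)

# Keep only film records and parse the screening dates (dd-mm-yyyy).
films = programs[programs["Type"] == "F"].copy()
films["Start date"] = pd.to_datetime(films["Start date"], format="%d-%m-%Y", errors="coerce")
films["End date"] = pd.to_datetime(films["End date"], format="%d-%m-%Y", errors="coerce")
print(films[["Original title", "Year", "Country", "Start date"]].head())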
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The DSS Payment Demographic data set is made up of:
Selected DSS payment data by
Geography: state/territory, electorate, postcode, LGA and SA2 (for 2015 onwards)
Demographic: age, sex and Indigenous/non-Indigenous
Duration on Payment (Working Age & Pensions)
Duration on Income Support (Working Age, Carer payment & Disability Support Pension)
Rate (Working Age & Pensions)
Earnings (Working Age & Pensions)
Age Pension assets data
JobSeeker Payment and Youth Allowance (other) Principal Carers
Activity Tested Recipients by Partial Capacity to Work (NSA, PPS & YAO)
Exits within 3, 6 and 12 months (Newstart Allowance/JobSeeker Payment, Parenting Payment, Sickness Allowance & Youth Allowance)
Disability Support Pension by medical condition
Care Receiver by medical conditions
Commonwealth Rent Assistance by Payment type and Income Unit type have been added from March 2017. For further information about Commonwealth Rent Assistance and Income Units see the Data Descriptions and Glossary included in the dataset.
From December 2022, the "DSS Expanded Benefit and Payment Recipient Demographics – quarterly data" publication has introduced expanded reporting populations for income support recipients. As a result, the reporting population for Jobseeker Payment and Special Benefit has changed to include recipients who are current but on zero rate of payment and those who are suspended from payment. The reporting population for ABSTUDY, Austudy, Parenting Payment and Youth Allowance has changed to include those who are suspended from payment. The expanded report will replace the standard report after June 2023.
Additional data for DSS Expanded Benefit and Payment Recipient Demographics – quarterly data includes:
• A new contents page to assist users in locating the information within the spreadsheet
• Additional data for the ‘Suspended’ population in the ‘Payment by Rate’ tab, to enable users to calculate figures under the old reporting rules.
• Additional information on the Employment Earning by ‘Income Free Area’ tab.
From December 2022, Services Australia have implemented a change in the Centrelink payment system to recognise gender other than the sex assigned at birth or during infancy, or as a gender which is not exclusively male or female. To protect the privacy of individuals and comply with confidentialisation policy, persons identifying as ‘non-binary’ will initially be grouped with ‘females’ in the period immediately following implementation of this change. The Department will monitor the implications of this change and will publish the ‘non-binary’ gender category as soon as privacy and confidentialisation considerations allow.
Local Government Area has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2022 boundaries from June 2023.
Commonwealth Electorate Division has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2021 boundaries from June 2023.
SA2 has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2021 boundaries from June 2023.
From December 2021, the following are included in the report:
selected payments by work capacity, by various demographic breakdowns
rental type and homeownership
Family Tax Benefit recipients and children by payment type
Commonwealth Rent Assistance by proportion eligible for the maximum rate
an age breakdown for Age Pension recipients
For further information, please see the Glossary.
From June 2021, data on the Paid Parental Leave Scheme is included yearly in June releases. This includes both Parental Leave Pay and Dad and Partner Pay, across multiple breakdowns. Please see Glossary for further information.
From March 2017, the DSS demographic dataset will include the top 25 countries of birth. For further information, see the glossary.
From March 2016, machine-readable files containing the three geographic breakdowns have also been published for use in National Map; links to these datasets are below:
Pre June 2014 Quarter Data contains:
Selected DSS payment data by
Geography: state/territory; electorate; postcode and LGA
Demographic: age, sex and Indigenous/non-Indigenous
Note: JobSeeker Payment replaced Newstart Allowance and other working age payments from 20 March 2020, for further details see: https://www.dss.gov.au/benefits-payments/jobseeker-payment
For data on DSS payment demographics as at June 2013 or earlier, the department has published data which was produced annually. Data is provided by payment type, containing time series, state, gender, age range, and various other demographics. Links to these publications are below:
Concession card data in the March and June 2020 quarters have been re-stated to address an over-count in reported cardholder numbers.
28/06/2024 – The March 2024 and December 2023 reports were republished with updated data in the ‘Carer Receivers by Med Condition’ section, updates are exclusive to the ‘Care Receivers of Carer Payment recipients’ table, under ‘Intellectual / Learning’ and ‘Circulatory System’ conditions only.
The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting, and the Norwegian Food Safety Authority has created a separate form service for such reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years.

There are data sets for the following supervisory objects:
1. Water supply system (also includes analysis of drinking water)
2. Transport system
3. Treatment facility
4. Entry point (also includes analysis of the water source)

Below you will find the data sets for supervisory object 4, Entry point (the _reporting file). In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly.

Furthermore, for the data sets water supply system, transport system and entry point, it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the parent ("moder") file to get names and other static information. These files have the _reporting ending in the file name. Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files, available in PDF format.

If you double-click the CSV file and it opens directly in Excel, the Norwegian characters æøå will not display correctly. To see the character set correctly in Excel, you must:
- start Excel and open a new spreadsheet
- select Data and then From Text, and press Import
- select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers" and press Next
- remove Tab as separator, select Semicolon as separator, and press Next
- otherwise, complete the wizard as usual

The data sets can be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. The waterworks are responsible for the quality of the datasets.

Purpose: Make data for drinking water supply available to the public.
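As an alternative to the Excel import steps above, a minimal pandas sketch (the file name is a placeholder; the UTF-8 encoding and semicolon separator follow the description):

import pandas as pd

# Placeholder file name; the exports are semicolon-separated UTF-8 CSV files.
entry_points = pd.read_csv("entry_point_reporting.csv", sep=";", encoding="utf-8")
print(entry_points.head())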
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The original Palmer's Penguins dataset is an invaluable resource in the world of data science, often used for statistical analysis, data visualization, and introductory machine learning tasks. Collected in the Palmer Archipelago near Antarctica, the dataset provides information on three species of penguins, including Adélie, Gentoo, and Chinstrap, and covers essential biological metrics such as bill dimensions and body mass.
Our extended dataset aims to build upon this foundational work by incorporating new, realistic features. We have included additional variables like diet, year of observation, life stage, and health metrics. These extra features allow for a more nuanced understanding of penguin biology and ecology, making it ideal for more complex analyses, including but not limited to educational, ecological, and advanced machine learning applications.
The dataset consists of the following columns:
The inclusion of yearly data from 2021 to 2025 allows for longitudinal studies, providing a temporal dimension that can help track the impact of climate change, dietary shifts, or other ecological factors on penguin populations over time.
We introduce the 'Health Metrics' column, which takes into account the body mass, life stage, and species to categorize each penguin's health status. This provides a multi-faceted view of individual well-being and can be crucial for conservation studies.
Our data structure enables the mapping of the diet to specific life stages, offering a granular understanding of penguin ecology. This added detail can be crucial for studying nutritional needs at different life stages.
Recognizing the importance of gender-based variations in penguin biology, our dataset incorporates attributes that allow for the study of sexual dimorphism, such as differing body sizes and potential diet variations between males and females.
This enriched dataset is particularly suitable for: - Advanced ecological models that require multiple layers of data. - Educational case studies focusing on biology, ecology, or data science. - Data-driven conservation efforts aimed at penguin species. - Machine learning algorithms that benefit from diverse and multi-dimensional data.
We wish to express our deepest respect and acknowledgment to the original research team behind the Palmer's Penguins dataset. This Extended Palmer's Penguins dataset is designed to build upon the solid foundation laid by the original work. It is created to serve as a complementary resource that adds additional dimensions for research and educational purposes. In no way is this artificial dataset intended to discredit or disrespect the invaluable contributions made through the original dataset.
All illustrations in this dataset are AI-generated.
The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting, and the Norwegian Food Safety Authority has created a separate form service for such reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years.

There are data sets for the following supervisory objects:
1. Water supply system (also includes analysis of drinking water)
2. Transport system
3. Treatment facility
4. Entry point (also includes analysis of the water source)

Below you will find the data sets for supervisory object 1, Water supply system (the _analysis file). In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly.

Furthermore, for the data sets water supply system, transport system and entry point, it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the parent ("moder") file to get names and other static information, as shown in the sketch below. These files have the _reporting ending in the file name. Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files, available in PDF format.

If you double-click the CSV file and it opens directly in Excel, the Norwegian characters æøå will not display correctly. To see the character set correctly in Excel, you must:
- start Excel and open a new spreadsheet
- select Data and then From Text, and press Import
- select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers" and press Next
- remove Tab as separator, select Semicolon as separator, and press Next
- otherwise, complete the wizard as usual

The data sets can be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. The waterworks are responsible for the quality of the datasets.

Purpose: Make information on the supply of drinking water available to the public.
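For the linking step described above, a hedged pandas sketch (the file names and the link-key column name are placeholders; use the keys documented in the metadata PDFs):

import pandas as pd

# Parent ("moder") file with names and static information, plus a yearly
# _reporting file; both are semicolon-separated UTF-8 CSV exports.
parent = pd.read_csv("water_supply_system.csv", sep=";", encoding="utf-8")
reporting = pd.read_csv("water_supply_system_reporting.csv", sep=";", encoding="utf-8")

linked = reporting.merge(
    parent,
    on="link_key",                 # placeholder for the documented link key
    how="left",
    suffixes=("_report", "_static"),
)
print(linked.head())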
The aim of the Young People and Gambling Survey is to explore young people's attitudes towards gambling and their participation in different types of gambling activities, and it is designed to provide a means of tracking these perceptions and behaviours over time. The survey looks at those forms of gambling and gambling-style games that children and young people legally take part in, along with gambling on age-restricted products.
The 2024 research was conducted by Ipsos on behalf of the Gambling Commission. The study collected data from 3,869 pupils aged 11 to 17 years old across curriculum years 7 to 12 (S1 to S6 in Scotland) using the Ipsos Young People Omnibus. Pupils completed an online self-report survey in class.
Data have been weighted to the known profile of the population in order to provide a representative sample.
Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Indicators from the Office for National Statistics (ONS) Opinions and Lifestyle Survey to understand the impacts of the coronavirus (COVID-19) pandemic on disabled people in Great Britain.
This dataset contains all data resources, either directly downloadable via this platform or as links to external databases, needed to execute the generic modeling tool as described in D5.4.
Description 👋🛳️ Ahoy, welcome to Kaggle! You’re in the right place. This is the legendary Titanic ML competition – the best, first challenge for you to dive into ML competitions and familiarize yourself with how the Kaggle platform works.
If you want to talk with other users about this competition, come join our Discord! We've got channels for competitions, job postings and career discussions, resources, and socializing with your fellow data scientists. Follow the link here: https://discord.gg/kaggle
The competition is simple: use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.
Read on or watch the video below to explore more details. Once you’re ready to start competing, click on the "Join Competition" button to create an account and gain access to the competition data. Then check out Alexis Cook’s Titanic Tutorial that walks you through step by step how to make your first submission!
The Challenge The sinking of the Titanic is one of the most infamous shipwrecks in history.
On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.
In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender, socio-economic class, etc.).
Recommended Tutorial We highly recommend Alexis Cook’s Titanic Tutorial that walks you through making your very first submission step by step and this starter notebook to get started.
How Kaggle’s Competitions Work
Join the Competition: Read the challenge description, accept the Competition Rules and gain access to the competition dataset.
Get to Work: Download the data, build models on it locally or on Kaggle Notebooks (our no-setup, customizable Jupyter Notebooks environment with free GPUs) and generate a prediction file.
Make a Submission: Upload your prediction as a submission on Kaggle and receive an accuracy score.
Check the Leaderboard: See how your model ranks against other Kagglers on our leaderboard.
Improve Your Score: Check out the discussion forum to find lots of tutorials and insights from other competitors.
Kaggle Lingo Video: You may run into unfamiliar lingo as you dig into the Kaggle discussion forums and public notebooks. Check out Dr. Rachael Tatman’s video on Kaggle Lingo to get up to speed!
What Data Will I Use in This Competition? In this competition, you’ll gain access to two similar datasets that include passenger information like name, age, gender, socio-economic class, etc. One dataset is titled train.csv and the other is titled test.csv.
Train.csv will contain the details of a subset of the passengers on board (891 to be exact) and importantly, will reveal whether they survived or not, also known as the “ground truth”.
The test.csv dataset contains similar information but does not disclose the “ground truth” for each passenger. It’s your job to predict these outcomes.
Using the patterns you find in the train.csv data, predict whether the other 418 passengers on board (found in test.csv) survived.
Check out the “Data” tab to explore the datasets even further. Once you feel you’ve created a competitive model, submit it to Kaggle to see where your model stands on our leaderboard against other Kagglers.
How to Submit your Prediction to Kaggle Once you’re ready to make a submission and get on the leaderboard:
Click on the “Submit Predictions” button
Upload a CSV file in the submission file format. You’re able to submit 10 submissions a day.
Submission File Format: You should submit a csv file with exactly 418 entries plus a header row. Your submission will show an error if you have extra columns (beyond PassengerId and Survived) or rows.
The file should have exactly 2 columns:
PassengerId (sorted in any order)
Survived (contains your binary predictions: 1 for survived, 0 for deceased)

Got it! I’m ready to get started. Where do I get help if I need it?
For Competition Help: Titanic Discussion Forum. Kaggle doesn’t have a dedicated team to help troubleshoot your code, so you’ll typically find that you receive a response more quickly by asking your question in the appropriate forum. The forums are full of useful information on the data, metric, and different approaches. We encourage you to use the forums often. If you share your knowledge, you'll find that others will share a lot in turn!
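As a minimal illustration of the required submission format (the all-zeros prediction is only a placeholder for a real model's output):

import pandas as pd

test = pd.read_csv("test.csv")

submission = pd.DataFrame({
    "PassengerId": test["PassengerId"],
    "Survived": 0,                 # replace with your model's 0/1 predictions
})
submission.to_csv("submission.csv", index=False)   # 418 rows plus a header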
A Last Word on Kaggle Notebooks As we mentioned before, Kaggle Notebooks is our no-setup, customizable, Jupyter Notebooks environment with free GPUs and a huge repository ...
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the code and resources for the project titled "Detection of Areas with Human Vulnerability Using Public Satellite Images and Deep Learning". The goal of this project is to identify regions where individuals are living under precarious conditions and facing neglected basic needs, a situation often seen in Brazil. This concept is referred to as "human vulnerability" and is exemplified by families living in inadequate shelters or on the streets in both urban and rural areas.
Focusing on the Federal District of Brazil as the research area, this project aims to develop two novel public datasets consisting of satellite images. The datasets contain imagery captured at 50m and 100m scales, covering regions of human vulnerability, traditional areas, and improperly disposed waste sites.
The project also leverages these datasets for training deep learning models, including YOLOv7 and other state-of-the-art models, to perform image segmentation. A comparative analysis is conducted between the models using two training strategies: training from scratch with random weight initialization and fine-tuning using pre-trained weights through transfer learning.
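As an illustrative sketch of the two training strategies (not the project's actual pipeline; a torchvision segmentation model stands in for YOLOv7 purely for brevity, and the class count is assumed):

import torch
from torchvision.models.segmentation import deeplabv3_resnet50

NUM_CLASSES = 3  # assumed: vulnerable area, traditional area, waste site

# Strategy 1: training from scratch, i.e. random weight initialization.
scratch_model = deeplabv3_resnet50(weights=None, num_classes=NUM_CLASSES)

# Strategy 2: transfer learning, i.e. load pre-trained weights and replace
# the classification head before fine-tuning on the new classes.
pretrained_model = deeplabv3_resnet50(weights="DEFAULT")
pretrained_model.classifier[4] = torch.nn.Conv2d(256, NUM_CLASSES, kernel_size=1)

optimizer = torch.optim.SGD(pretrained_model.parameters(), lr=1e-3, momentum=0.9)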
This repository provides the code, models, and data pipelines used for training, evaluation, and performance comparison of these deep learning models.
@TECHREPORT {TechReport-Julia-Laura-HumanVulnerability-2024,
author = "Julia Passos Pontes and Laura Maciel Neves Franco and Flavio De Barros Vidal",
title = "Detecção de Áreas com Atividades de Vulnerabilidade Humana utilizando Imagens Públicas de Satélites e Aprendizagem Profunda",
institution = "University of Brasilia",
year = "2024",
type = "Undergraduate Thesis",
address = "Computer Science Department - University of Brasilia - Asa Norte - Brasilia - DF, Brazil",
month = "aug",
note = "People living in precarious conditions and with their basic needs neglected is an unfortunate reality in Brazil. This scenario will be approached in this work according to the concept of 'human vulnerability' and can be exemplified through families who live in inadequate shelters, without basic structures and on the streets of urban or rural centers. Therefore, assuming the Federal District as the research scope, this project proposes to develop two new databases to be made available publicly, considering the map scales of 50m and 100m, and composed by satellite images of human vulnerability areas,
regions treated as traditional and waste disposed inadequately. Furthermore, using these image bases, trainings were done with the YOLOv7 model and other deep learning models for image segmentation. By adopting an exploratory approach, this work compares the results of different image segmentation models and training strategies, using random weight initialization
(from scratch) and pre-trained weights (transfer learning). Thus, the present work was able to reach maximum F1
score values of 0.55 for YOLOv7 and 0.64 for other segmentation models."
}
This project is licensed under the MIT License - see the LICENSE file for details.