51 datasets found

B
Data Cleaning Sample
borealisdata.ca
dataone.org
Updated Jul 13, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP3/ZCN177
Dataset updated
Jul 13, 2023
Dataset provided by
Borealis
Authors
Rong Luo
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Sample data for exercises in Further Adventures in Data Cleaning.
N
Big Lake, TX Age Group Population Dataset: A Complete Breakdown of Big Lake...
neilsberg.com
csv, json
Updated Feb 22, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2025). Big Lake, TX Age Group Population Dataset: A Complete Breakdown of Big Lake Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/451118a9-f122-11ef-8c1b-3860777c1fe6/
Explore at:
csv, jsonAvailable download formats
Dataset updated
Feb 22, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Big Lake, Texas
Variables measured
Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age groups. For age groups we divided it into roughly a 5 year bucket for ages between 0 and 85. For over 85, we aggregated data into a single group for all ages. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Big Lake population distribution across 18 age groups. It lists the population in each age group along with the percentage population relative of the total population for Big Lake. The dataset can be utilized to understand the population distribution of Big Lake by age. For example, using this dataset, we can identify the largest age group in Big Lake.

Key observations

The largest age group in Big Lake, TX was for the group of age Under 5 years years with a population of 346 (11.19%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in Big Lake, TX was the 75 to 79 years years with a population of 17 (0.55%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Variables / Data Columns

Age Group: This column displays the age group in consideration

Population: The population for the specific age group in the Big Lake is shown in this column.

% of Total Population: This column displays the population of each age group as a proportion of Big Lake total population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Big Lake Population by Age. You can refer the same here
N
Big Stone City, SD Age Group Population Dataset: A Complete Breakdown of Big...
neilsberg.com
csv, json
Updated Feb 22, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2025). Big Stone City, SD Age Group Population Dataset: A Complete Breakdown of Big Stone City Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/45111e75-f122-11ef-8c1b-3860777c1fe6/
Explore at:
json, csvAvailable download formats
Dataset updated
Feb 22, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Big Stone City, South Dakota
Variables measured
Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age groups. For age groups we divided it into roughly a 5 year bucket for ages between 0 and 85. For over 85, we aggregated data into a single group for all ages. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Big Stone City population distribution across 18 age groups. It lists the population in each age group along with the percentage population relative of the total population for Big Stone City. The dataset can be utilized to understand the population distribution of Big Stone City by age. For example, using this dataset, we can identify the largest age group in Big Stone City.

Key observations

The largest age group in Big Stone City, SD was for the group of age 75 to 79 years years with a population of 115 (18.88%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in Big Stone City, SD was the 5 to 9 years years with a population of 3 (0.49%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Variables / Data Columns

Age Group: This column displays the age group in consideration

Population: The population for the specific age group in the Big Stone City is shown in this column.

% of Total Population: This column displays the population of each age group as a proportion of Big Stone City total population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Big Stone City Population by Age. You can refer the same here
u
Data from: CLIVAR LE project
rda.ucar.edu
data.ucar.edu
+1more
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CLIVAR LE project [Dataset]. https://rda.ucar.edu/lookfordata/datasets/?nb=y&b=topic&v=Atmosphere
Explore at:
Description
The CLIVAR Large Ensemble repository was built at NCAR and supported by the US CLIVAR WG on Large Ensembles. It features a set of CMORized variables from the following CMIP5 ... class Large Ensembles: CANESM2, CESM, CSIRO MK36, EC Earth, GFDL CM3, GFDL ESM2M, MPI, and OLENS McKinnon.
Z
Data from: Large Landing Trajectory Data Set for Go-Around Analysis
data.niaid.nih.gov
explore.openaire.eu
+1more
Updated Dec 16, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Timothé Krauth (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7148116
Explore at:
Dataset updated
Dec 16, 2022
Dataset provided by
Raphael Monstein
Marcel Dettling
Timothé Krauth
Benoit Figuet
Manuel Waltert
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Large go-around, also referred to as missed approach, data set. The data set is in support of the paper presented at the OpenSky Symposium on November the 10th.

If you use this data for a scientific publication, please consider citing our paper.

The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33000 GAs. The data was collected from OpenSky Network's historical data base for the year 2019. The published data set contains multiple files:

go_arounds_minimal.csv.gz

Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:

Column name Type Description time date time UTC time of landing or first GA attempt icao24 string Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned callsign string Aircraft identifier in air-ground communications airport string ICAO airport code where the aircraft is landing runway string Runway designator on which the aircraft landed has_ga string "True" if at least one GA was performed, otherwise "False" n_approaches integer Number of approaches identified for this flight n_rwy_approached integer Number of unique runways approached by this flight

The last two columns, n_approaches and n_rwy_approached, are useful to filter out training and calibration flight. These have usually a large number of n_approaches, so an easy way to exclude them is to filter by n_approaches > 2.

go_arounds_augmented.csv.gz

Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:

Column name Type Description time date time UTC time of landing or first GA attempt icao24 string Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned callsign string Aircraft identifier in air-ground communications airport string ICAO airport code where the aircraft is landing runway string Runway designator on which the aircraft landed has_ga string "True" if at least one GA was performed, otherwise "False" n_approaches integer Number of approaches identified for this flight n_rwy_approached integer Number of unique runways approached by this flight registration string Aircraft registration typecode string Aircraft ICAO typecode icaoaircrafttype string ICAO aircraft type wtc string ICAO wake turbulence category glide_slope_angle float Angle of the ILS glide slope in degrees has_intersection

string

Boolean that is true if the runway has an other runway intersecting it, otherwise false rwy_length float Length of the runway in kilometre airport_country string ISO Alpha-3 country code of the airport airport_region string Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania) operator_country string ISO Alpha-3 country code of the operator operator_region string Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania) wind_speed_knts integer METAR, surface wind speed in knots wind_dir_deg integer METAR, surface wind direction in degrees wind_gust_knts integer METAR, surface wind gust speed in knots visibility_m float METAR, visibility in m temperature_deg integer METAR, temperature in degrees Celsius press_sea_level_p float METAR, sea level pressure in hPa press_p float METAR, QNH in hPA weather_intensity list METAR, list of present weather codes: qualifier - intensity weather_precipitation list METAR, list of present weather codes: weather phenomena - precipitation weather_desc list METAR, list of present weather codes: qualifier - descriptor weather_obscuration list METAR, list of present weather codes: weather phenomena - obscuration weather_other list METAR, list of present weather codes: weather phenomena - other

This data set is augmented with data from various public data sources. Aircraft related data is mostly from the OpenSky Network's aircraft data base, the METAR information is from the Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrom Reports and Forecasts handbook.

go_arounds_agg.csv.gz

Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

Column name Type Description airport string ICAO airport code where the aircraft is landing runway string Runway designator on which the aircraft landed n_landings integer Total number of landings observed on this runway in 2019 ga_rate float Go-around rate, per 1000 landings glide_slope_angle float Angle of the ILS glide slope in degrees has_intersection string Boolean that is true if the runway has an other runway intersecting it, otherwise false rwy_length float Length of the runway in kilometres airport_country string ISO Alpha-3 country code of the airport airport_region string Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)

This aggregated data set is used in the paper for the generalized linear regression model.

Downloading the trajectories

Users of this data set with access to OpenSky Network's Impala shell can download the historical trajectories from the historical data base with a few lines of Python code. For example, you want to get all the go-arounds of the 4th of January 2019 at London City Airport (EGLC). You can use the Traffic library for easy access to the database:

import datetime from tqdm.auto import tqdm import pandas as pd from traffic.data import opensky from traffic.core import Traffic

load minimum data set

df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False) df["time"] = pd.to_datetime(df["time"])

select London City Airport, go-arounds, and 2019-01-04

airport = "EGLC" start = datetime.datetime(year=2019, month=1, day=4).replace( tzinfo=datetime.timezone.utc ) stop = datetime.datetime(year=2019, month=1, day=5).replace( tzinfo=datetime.timezone.utc )

df_selection = df.query("airport==@airport & has_ga & (@start <= time <= @stop)")

iterate over flights and pull the data from OpenSky Network

flights = [] delta_time = pd.Timedelta(minutes=10) for _, row in tqdm(df_selection.iterrows(), total=df_selection.shape[0]): # take at most 10 minutes before and 10 minutes after the landing or go-around start_time = row["time"] - delta_time stop_time = row["time"] + delta_time

# fetch the data from OpenSky Network flights.append( opensky.history( start=start_time.strftime("%Y-%m-%d %H:%M:%S"), stop=stop_time.strftime("%Y-%m-%d %H:%M:%S"), callsign=row["callsign"], return_flight=True, ) )

The flights can be converted into a Traffic object

Traffic.from_flights(flights)

Additional files

Additional files are available to check the quality of the classification into GA/not GA and the selection of the landing runway. These are:

validation_table.xlsx: This Excel sheet was manually completed during the review of the samples for each runway in the data set. It provides an estimate of the false positive and false negative rate of the go-around classification. It also provides an estimate of the runway misclassification rate when the airport has two or more parallel runways. The columns with the headers highlighted in red were filled in manually, the rest is generated automatically.

validation_sample.zip: For each runway, 8 batches of 500 randomly selected trajectories (or as many as available, if fewer than 4000) classified as not having a GA and up to 8 batches of 10 random landings, classified as GA, are plotted. This allows the interested user to visually inspect a random sample of the landings and go-arounds easily.
N
Big Sandy, MT Population Breakdown by Race
neilsberg.com
csv, json
Updated Aug 18, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2023). Big Sandy, MT Population Breakdown by Race [Dataset]. https://www.neilsberg.com/research/datasets/688e191b-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
json, csvAvailable download formats
Dataset updated
Aug 18, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Montana, Big Sandy
Variables measured
Asian Population, Black Population, White Population, Some other race Population, Two or more races Population, American Indian and Alaska Native Population, Asian Population as Percent of Total Population, Black Population as Percent of Total Population, White Population as Percent of Total Population, Native Hawaiian and Other Pacific Islander Population, and 4 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the racial categories idetified by the US Census Bureau. It is ensured that the population estimates used in this dataset pertain exclusively to the identified racial categories, and do not rely on any ethnicity classification. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Big Sandy by race. It includes the population of Big Sandy across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be utilized to understand the population distribution of Big Sandy across relevant racial categories.

Key observations

The percent distribution of Big Sandy population by race (across all racial categories recognized by the U.S. Census Bureau): 77.82% are white, 1.13% are Black or African American, 5.65% are American Indian and Alaska Native, 0.71% are Asian and 14.69% are multiracial.

https://i.neilsberg.com/ch/big-sandy-mt-population-by-race.jpeg" alt="Big Sandy population by race">

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race: This column displays the racial categories (excluding ethnicity) for the Big Sandy

Population: The population of the racial category (excluding ethnicity) in the Big Sandy is shown in this column.

% of Total Population: This column displays the percentage distribution of each race as a proportion of Big Sandy total population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Big Sandy Population by Race & Ethnicity. You can refer the same here
N
Excel, AL Age Group Population Dataset: A Complete Breakdown of Excel Age...
neilsberg.com
csv, json
Updated Jul 24, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2024). Excel, AL Age Group Population Dataset: A Complete Breakdown of Excel Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/aa8c95e0-4983-11ef-ae5d-3860777c1fe6/
Explore at:
csv, jsonAvailable download formats
Dataset updated
Jul 24, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Excel
Variables measured
Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age groups. For age groups we divided it into roughly a 5 year bucket for ages between 0 and 85. For over 85, we aggregated data into a single group for all ages. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Excel population distribution across 18 age groups. It lists the population in each age group along with the percentage population relative of the total population for Excel. The dataset can be utilized to understand the population distribution of Excel by age. For example, using this dataset, we can identify the largest age group in Excel.

Key observations

The largest age group in Excel, AL was for the group of age 45 to 49 years years with a population of 74 (15.64%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in Excel, AL was the 85 years and over years with a population of 2 (0.42%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Variables / Data Columns

Age Group: This column displays the age group in consideration

Population: The population for the specific age group in the Excel is shown in this column.

% of Total Population: This column displays the population of each age group as a proportion of Excel total population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Excel Population by Age. You can refer the same here
Aluminum alloy industrial materials defect
figshare.com
zip
Updated Dec 3, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ying Han; Yugang Wang (2024). Aluminum alloy industrial materials defect [Dataset]. http://doi.org/10.6084/m9.figshare.27922929.v3
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.27922929.v3
Dataset updated
Dec 3, 2024
Dataset provided by
figshare
Authors
Ying Han; Yugang Wang
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset used in this study experiment was from the preliminary competition dataset of the 2018 Guangdong Industrial Intelligent Manufacturing Big Data Intelligent Algorithm Competition organized by Tianchi Feiyue Cloud (https://tianchi.aliyun.com/competition/entrance/231682/introduction). We have selected the dataset, removing images that do not meet the requirements of our experiment. All datasets have been classified for training and testing. The image pixels are all 2560×1960. Before training, all defects need to be labeled using labelimg and saved as json files. Then, all json files are converted to txt files. Finally, the organized defect dataset is detected and classified.Description of the data and file structureThis is a project based on the YOLOv8 enhanced algorithm for aluminum defect classification and detection tasks.All code has been tested on Windows computers with Anaconda and CUDA-enabled GPUs. The following instructions allow users to run the code in this repository based on a Windows+CUDA GPU system already in use.Files and variablesFile: defeat_dataset.zipDescription:SetupPlease follow the steps below to set up the project:Download Project RepositoryDownload the project repository defeat_dataset.zip from the following location.Unzip and navigate to the project folder; it should contain a subfolder: quexian_datasetDownload data1.Download data .defeat_dataset.zip2.Unzip the downloaded data and move the 'defeat_dataset' folder into the project's main folder.3. Make sure that your defeat_dataset folder now contains a subfolder: quexian_dataset.4. Within the folder you should find various subfolders such as addquexian-13, quexian_dataset, new_dataset-13, etc.softwareSet up the Python environment1.Download and install the Anaconda.2.Once Anaconda is installed, activate the Anaconda Prompt. For Windows, click Start, search for Anaconda Prompt, and open it.3.Create a new conda environment with Python 3.8. You can name it whatever you like; for example. Enter the following command: conda create -n yolov8 python=3.84.Activate the created environment. If the name is , enter: conda activate yolov8Download and install the Visual Studio Code.Install PyTorch based on your system:For Windows/Linux users with a CUDA GPU: bash conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forgeInstall some necessary libraries:Install scikit-learn with the command: conda install anaconda scikit-learn=0.24.1Install astropy with: conda install astropy=4.2.1Install pandas using: conda install anaconda pandas=1.2.4Install Matplotlib with: conda install conda-forge matplotlib=3.5.3Install scipy by entering: conda install scipy=1.10.1RepeatabilityFor PyTorch, it's a well-known fact:There is no guarantee of fully reproducible results between PyTorch versions, individual commits, or different platforms. In addition, results may not be reproducible between CPU and GPU executions, even if the same seed is used.All results in the Analysis Notebook that involve only model evaluation are fully reproducible. However, when it comes to updating the model on the GPU, the results of model training on different machines vary.Access informationOther publicly accessible locations of the data:https://tianchi.aliyun.com/dataset/public/Data was derived from the following sources:https://tianchi.aliyun.com/dataset/140666Data availability statementThe ten datasets used in this study come from Guangdong Industrial Wisdom Big Data Innovation Competition - Intelligent Algorithm Competition Rematch. and the dataset download link is https://tianchi.aliyun.com/competition/entrance/231682/information?lang=en-us. Officially, there are 4,356 images, including single blemish images, multiple blemish images and no blemish images. The official website provides 4,356 images, including single defect images, multiple defect images and no defect images. We have selected only single defect images and multiple defect images, which are 3,233 images in total. The ten defects are non-conductive, effacement, miss bottom corner, orange, peel, varicolored, jet, lacquer bubble, jump into a pit, divulge the bottom and blotch. Each image contains one or more defects, and the resolution of the defect images are all 2560×1920.By investigating the literature, we found that most of the experiments were done with 10 types of defects, so we chose three more types of defects that are more different from these ten types and more in number, which are suitable for the experiments. The three newly added datasets come from the preliminary dataset of Guangdong Industrial Wisdom Big Data Intelligent Algorithm Competition. The dataset can be downloaded from https://tianchi.aliyun.com/dataset/140666. There are 3,000 images in total, among which 109, 73 and 43 images are for the defects of bruise, camouflage and coating cracking respectively. Finally, the 10 types of defects in the rematch and the 3 types of defects selected in the preliminary round are fused into a new dataset, which is examined in this dataset.In the processing of the dataset, we tried different division ratios, such as 8:2, 7:3, 7:2:1, etc. After testing, we found that the experimental results did not differ much for different division ratios. Therefore, we divide the dataset according to the ratio of 7:2:1, the training set accounts for 70%, the validation set accounts for 20%, and the testing set accounts for 10%. At the same time, the random number seed is set to 0 to ensure that the results obtained are consistent every time the model is trained.Finally, the mean Average Precision (mAP) metric obtained from the experiment was tested on the dataset a total of three times. Each time the results differed very little, but for the accuracy of the experimental results, we took the average value derived from the highest and lowest results. The highest was 71.5% and the lowest was 71.1%, resulting in an average detection accuracy of 71.3% for the final experiment.All data and images utilized in this research are from publicly available sources, and the original creators have given their consent for these materials to be published in open-access formats.The settings for other parameters are as follows. epochs: 200，patience: 50，batch: 16，imgsz: 640，pretrained: true，optimizer: SGD，close_mosaic: 10，iou: 0.7，momentum: 0.937，weight_decay: 0.0005，box: 7.5，cls: 0.5，dfl: 1.5，pose: 12.0，kobj: 1.0，save_dir: runs/trainThe defeat_dataset.(ZIP)is mentioned in the Supporting information section of our manuscript. The underlying data are held at Figshare. DOI: 10.6084/m9.figshare.27922929.The results_images.zipin the system contains the experimental results graphs.The images_1.zipand images_2.zipin the system contain all the images needed to generate the manuscript.tex manuscript.
N
Big Bear Lake, CA Population Breakdown by Race
neilsberg.com
csv, json
Updated Aug 18, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2023). Big Bear Lake, CA Population Breakdown by Race [Dataset]. https://www.neilsberg.com/research/datasets/688dbf57-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
csv, jsonAvailable download formats
Dataset updated
Aug 18, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Big Bear Lake, California
Variables measured
Asian Population, Black Population, White Population, Some other race Population, Two or more races Population, American Indian and Alaska Native Population, Asian Population as Percent of Total Population, Black Population as Percent of Total Population, White Population as Percent of Total Population, Native Hawaiian and Other Pacific Islander Population, and 4 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the racial categories idetified by the US Census Bureau. It is ensured that the population estimates used in this dataset pertain exclusively to the identified racial categories, and do not rely on any ethnicity classification. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Big Bear Lake by race. It includes the population of Big Bear Lake across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be utilized to understand the population distribution of Big Bear Lake across relevant racial categories.

Key observations

The percent distribution of Big Bear Lake population by race (across all racial categories recognized by the U.S. Census Bureau): 80.05% are white, 0.24% are Black or African American, 1.58% are American Indian and Alaska Native, 2.15% are Asian, 3.83% are some other race and 12.15% are multiracial.

https://i.neilsberg.com/ch/big-bear-lake-ca-population-by-race.jpeg" alt="Big Bear Lake population by race">

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race: This column displays the racial categories (excluding ethnicity) for the Big Bear Lake

Population: The population of the racial category (excluding ethnicity) in the Big Bear Lake is shown in this column.

% of Total Population: This column displays the percentage distribution of each race as a proportion of Big Bear Lake total population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Big Bear Lake Population by Race & Ethnicity. You can refer the same here
g
INSPIRE Download Service (predefined ATOM) for dataset special area large...
gimi9.com
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
INSPIRE Download Service (predefined ATOM) for dataset special area large advertising system A3 | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_bc9fe5a2-9b29-0002-fd31-4a197570046e
Explore at:
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Description of the INSPIRE Download Service (predefined Atom): Development plan "Sondergebiet Grosswerbeanlage A3" - The link(s) for downloading the data sets is/are dynamically generated from Get Map calls to a WMS interface
Large Scale International Boundaries
geodata.state.gov
catalog.data.gov
Updated Feb 24, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Department of State (2025). Large Scale International Boundaries [Dataset]. https://geodata.state.gov/geonetwork/srv/api/records/3bdb81a0-c1b9-439a-a0b1-85dac30c59b2
Explore at:
www:link-1.0-http--link, www:link-1.0-http--related, ogc:wms-1.3.0-http-get-capabilities, www:download-1.0-http--downloadAvailable download formats
Dataset updated
Feb 24, 2025
Dataset provided by
United States Department of Statehttp://state.gov/
Authors
U.S. Department of State
Area covered
Pacific Ocean, North Pacific Ocean
Description
Overview

The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. The current edition is version 11.4 (published 24 February 2025). The 11.4 release contains updated boundary lines and data refinements designed to extend the functionality of the dataset. These data and generalized derivatives are the only international boundary lines approved for U.S. Government use. The contents of this dataset reflect U.S. Government policy on international boundary alignment, political recognition, and dispute status. They do not necessarily reflect de facto limits of control.

National Geospatial Data Asset

This dataset is a National Geospatial Data Asset (NGDAID 194) managed by the Department of State. It is a part of the International Boundaries Theme created by the Federal Geographic Data Committee.

Dataset Source Details

Sources for these data include treaties, relevant maps, and data from boundary commissions, as well as national mapping agencies. Where available and applicable, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery process includes analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground.

Cartographic Visualization

The LSIB is a geospatial dataset that, when used for cartographic purposes, requires additional styling. The LSIB download package contains example style files for commonly used software applications. The attribute table also contains embedded information to guide the cartographic representation. Additional discussion of these considerations can be found in the Use of Core Attributes in Cartographic Visualization section below.

Additional cartographic information pertaining to the depiction and description of international boundaries or areas of special sovereignty can be found in Guidance Bulletins published by the Office of the Geographer and Global Issues: https://hiu.state.gov/data/cartographic_guidance_bulletins/

Contact

Direct inquiries to internationalboundaries@state.gov.

Direct download: https://data.geodata.state.gov/LSIB.zip

Attribute Structure

The dataset uses the following attributes divided into two categories: ATTRIBUTE NAME | ATTRIBUTE STATUS CC1 | Core CC1_GENC3 | Extension CC1_WPID | Extension COUNTRY1 | Core CC2 | Core CC2_GENC3 | Extension CC2_WPID | Extension COUNTRY2 | Core RANK | Core LABEL | Core STATUS | Core NOTES | Core LSIB_ID | Extension ANTECIDS | Extension PREVIDS | Extension PARENTID | Extension PARENTSEG | Extension

These attributes have external data sources that update separately from the LSIB: ATTRIBUTE NAME | ATTRIBUTE STATUS CC1 | GENC CC1_GENC3 | GENC CC1_WPID | World Polygons COUNTRY1 | DoS Lists CC2 | GENC CC2_GENC3 | GENC CC2_WPID | World Polygons COUNTRY2 | DoS Lists LSIB_ID | BASE ANTECIDS | BASE PREVIDS | BASE PARENTID | BASE PARENTSEG | BASE

The core attributes listed above describe the boundary lines contained within the LSIB dataset. Removal of core attributes from the dataset will change the meaning of the lines. An attribute status of “Extension” represents a field containing data interoperability information. Other attributes not listed above include “FID”, “Shape_length” and “Shape.” These are components of the shapefile format and do not form an intrinsic part of the LSIB.

Core Attributes

The eight core attributes listed above contain unique information which, when combined with the line geometry, comprise the LSIB dataset. These Core Attributes are further divided into Country Code and Name Fields and Descriptive Fields.

County Code and Country Name Fields

“CC1” and “CC2” fields are machine readable fields that contain political entity codes. These are two-character codes derived from the Geopolitical Entities, Names, and Codes Standard (GENC), Edition 3 Update 18. “CC1_GENC3” and “CC2_GENC3” fields contain the corresponding three-character GENC codes and are extension attributes discussed below. The codes “Q2” or “QX2” denote a line in the LSIB representing a boundary associated with areas not contained within the GENC standard.

The “COUNTRY1” and “COUNTRY2” fields contain the names of corresponding political entities. These fields contain names approved by the U.S. Board on Geographic Names (BGN) as incorporated in the ‘"Independent States in the World" and "Dependencies and Areas of Special Sovereignty" lists maintained by the Department of State. To ensure maximum compatibility, names are presented without diacritics and certain names are rendered using common cartographic abbreviations. Names for lines associated with the code "Q2" are descriptive and not necessarily BGN-approved. Names rendered in all CAPITAL LETTERS denote independent states. Names rendered in normal text represent dependencies, areas of special sovereignty, or are otherwise presented for the convenience of the user.

Descriptive Fields

The following text fields are a part of the core attributes of the LSIB dataset and do not update from external sources. They provide additional information about each of the lines and are as follows: ATTRIBUTE NAME | CONTAINS NULLS RANK | No STATUS | No LABEL | Yes NOTES | Yes

Neither the "RANK" nor "STATUS" fields contain null values; the "LABEL" and "NOTES" fields do. The "RANK" field is a numeric expression of the "STATUS" field. Combined with the line geometry, these fields encode the views of the United States Government on the political status of the boundary line.

A value of “1” in the “RANK” field corresponds to an "International Boundary" value in the “STATUS” field. Values of ”2” and “3” correspond to “Other Line of International Separation” and “Special Line,” respectively.

The “LABEL” field contains required text to describe the line segment on all finished cartographic products, including but not limited to print and interactive maps.

The “NOTES” field contains an explanation of special circumstances modifying the lines. This information can pertain to the origins of the boundary lines, limitations regarding the purpose of the lines, or the original source of the line.

Use of Core Attributes in Cartographic Visualization

Several of the Core Attributes provide information required for the proper cartographic representation of the LSIB dataset. The cartographic usage of the LSIB requires a visual differentiation between the three categories of boundary lines. Specifically, this differentiation must be between: - International Boundaries (Rank 1); - Other Lines of International Separation (Rank 2); and - Special Lines (Rank 3).

Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the “Label” field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Please consult the style files in the download package for examples of this depiction.

The requirement to incorporate the contents of the "LABEL" field on cartographic products is scale dependent. If a label is legible at the scale of a given static product, a proper use of this dataset would encourage the application of that label. Using the contents of the "COUNTRY1" and "COUNTRY2" fields in the generation of a line segment label is not required. The "STATUS" field contains the preferred description for the three LSIB line types when they are incorporated into a map legend but is otherwise not to be used for labeling.

Use of the “CC1,” “CC1_GENC3,” “CC2,” “CC2_GENC3,” “RANK,” or “NOTES” fields for cartographic
d
USGS 10-m Digital Elevation Model (DEM): Hawaii: Big Island: Hillshade
catalog.data.gov
data.ioos.us
+1more
Updated Jan 26, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Geological Survey (USGS) (Point of Contact) (2025). USGS 10-m Digital Elevation Model (DEM): Hawaii: Big Island: Hillshade [Dataset]. https://catalog.data.gov/dataset/usgs-10-m-digital-elevation-model-dem-hawaii-big-island-hillshade
Explore at:
Dataset updated
Jan 26, 2025
Dataset provided by
United States Geological Surveyhttp://www.usgs.gov/
Area covered
Island of Hawai'i, Hawaii
Description
A 10-meter resolution land surface digital elevation model (DEM) grayscale hillshade for Big Island in Hawaii derived from United States Geological Survey (USGS) 1/3 arc-second DEM quadrangles. For the related dataset containing numeric elevation values for this image layer, see http://pacioos.org/metadata/usgs_dem_10m_bigisland.html
G
High Resolution Digital Elevation Model Mosaic (HRDEM Mosaic) - CanElevation...
open.canada.ca
ouvert.canada.ca
fgdb/gdb, html, json +3
Updated Mar 12, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Natural Resources Canada (2025). High Resolution Digital Elevation Model Mosaic (HRDEM Mosaic) - CanElevation Series [Dataset]. https://open.canada.ca/data/en/dataset/0fe65119-e96e-4a57-8bfe-9d9245fba06b
Explore at:
json, pdf, html, fgdb/gdb, wms, wcsAvailable download formats
Dataset updated
Mar 12, 2025
Dataset provided by
Natural Resources Canada
License
Open Government Licence - Canada 2.0https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Description
The High Resolution Digital Elevation Model Mosaic provides a unique and continuous representation of the high resolution elevation data available across the country. The High Resolution Digital Elevation Model (HRDEM) product used is derived from airborne LiDAR data (mainly in the south) and satellite images in the north. The mosaic is available for both the Digital Terrain Model (DTM) and the Digital Surface Model (DSM) from web mapping services. It is part of the CanElevation Series created to support the National Elevation Data Strategy implemented by NRCan. This strategy aims to increase Canada's coverage of high-resolution elevation data and increase the accessibility of the products. Unlike the HRDEM product in the same series, which is distributed by acquisition project without integration between projects, the mosaic is created to provide a single, continuous representation of strategy data. The most recent datasets for a given territory are used to generate the mosaic. This mosaic is disseminated through the Data Cube Platform, implemented by NRCan using geospatial big data management technologies. These technologies enable the rapid and efficient visualization of high-resolution geospatial data and allow for the rapid generation of dynamically derived products. The mosaic is available from Web Map Services (WMS), Web Coverage Services (WCS) and SpatioTemporal Asset Catalog (STAC) collections. Accessible data includes the Digital Terrain Model (DTM), the Digital Surface Model (DSM) and derived products such as shaded relief and slope. The mosaic is referenced to the Canadian Height Reference System 2013 (CGVD2013) which is the reference standard for orthometric heights across Canada. Source data for HRDEM datasets used to create the mosaic is acquired through multiple projects with different partners. Collaboration is a key factor to the success of the National Elevation Strategy. Refer to the “Supporting Document” section to access the list of the different partners including links to their respective data.
d
High-Resolution Infrared Enhanced Satellite Cloud Imagery - Global
data.gov.tw
json, xml
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Central Weather Administration Ministry of Transportation and Communications, High-Resolution Infrared Enhanced Satellite Cloud Imagery - Global [Dataset]. https://data.gov.tw/en/datasets/33681
Explore at:
json, xmlAvailable download formats
Dataset authored and provided by
Central Weather Administration Ministry of Transportation and Communications
License
https://data.gov.tw/licensehttps://data.gov.tw/license
Description
The resolution of the infrared satellite cloud map is 800x800. *Changes to the download link will be made from September 15, 112, and should be updated before December 31, 112. The old links will expire after this date. For large-scale data downloads, please apply for membership at the Meteorological Data Open Platform. https://opendata.cwa.gov.tw/index
h
webui-7k
huggingface.co
Updated Nov 1, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Big Lab (2024). webui-7k [Dataset]. https://huggingface.co/datasets/biglab/webui-7k
Explore at:
Dataset updated
Nov 1, 2024
Dataset authored and provided by
Big Lab
License
https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/
Description
This data accompanies the WebUI project (https://dl.acm.org/doi/abs/10.1145/3544548.3581158) For more information, check out the project website: https://uimodeling.github.io/ To download this dataset, you need to install the huggingface-hub package pip install huggingface-hub

Use snapshot_download from huggingface_hub import snapshot_download snapshot_download(repo_id="biglab/webui-7k", repo_type="dataset")

IMPORTANT

Before downloading and using, please review the copyright info here:… See the full description on the dataset page: https://huggingface.co/datasets/biglab/webui-7k.
e
INSPIRE Download Service (predefined ATOM) for dataset Large Gardens
data.europa.eu
atom feed
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
LVermGeo im Auftrag von Böhl-Iggelheim, INSPIRE Download Service (predefined ATOM) for dataset Large Gardens [Dataset]. https://data.europa.eu/data/datasets/723061d2-ea3f-0002-2b64-31e9ac312e5e?locale=en
Explore at:
atom feedAvailable download formats
Dataset authored and provided by
LVermGeo im Auftrag von Böhl-Iggelheim
Description
Description of the INSPIRE Download Service (predefined Atom): Development Plan Große Garten Böhl-Iggelheim - The link(s) for downloading the datasets is/are dynamically generated from Get Map calls to a WMS interface
T
criteo
tensorflow.org
Updated Dec 22, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2022). criteo [Dataset]. https://www.tensorflow.org/datasets/catalog/criteo
Explore at:
Dataset updated
Dec 22, 2022
Description
Criteo Uplift Modeling Dataset

This dataset is released along with the paper: “A Large Scale Benchmark for Uplift Modeling” Eustache Diemert, Artem Betlei, Christophe Renaudin; (Criteo AI Lab), Massih-Reza Amini (LIG, Grenoble INP)

This work was published in: AdKDD 2018 Workshop, in conjunction with KDD 2018.

Data description

This dataset is constructed by assembling data resulting from several incrementality tests, a particular randomized trial procedure where a random part of the population is prevented from being targeted by advertising. it consists of 25M rows, each one representing a user with 11 features, a treatment indicator and 2 labels (visits and conversions).

Fields

Here is a detailed description of the fields (they are comma-separated in the file):

f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11: feature values (dense, float)

treatment: treatment group (1 = treated, 0 = control)

conversion: whether a conversion occured for this user (binary, label)

visit: whether a visit occured for this user (binary, label)

exposure: treatment effect, whether the user has been effectively exposed (binary)

Key figures

Format: CSV

Size: 459MB (compressed)

Rows: 25,309,483

Average Visit Rate: .04132

Average Conversion Rate: .00229

Treatment Ratio: .846

Tasks

The dataset was collected and prepared with uplift prediction in mind as the main task. Additionally we can foresee related usages such as but not limited to:

benchmark for causal inference

uplift modeling

interactions between features and treatment

heterogeneity of treatment

benchmark for observational causality methods

To use this dataset:

import tensorflow_datasets as tfds ds = tfds.load('criteo', split='train') for ex in ds.take(4): print(ex)

See the guide for more informations on tensorflow_datasets.
N
Big Springs, NE Hispanic or Latino Population Distribution by Their...
neilsberg.com
csv, json
Updated Aug 18, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2023). Big Springs, NE Hispanic or Latino Population Distribution by Their Ancestries [Dataset]. https://www.neilsberg.com/research/datasets/6c5dc95b-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
csv, jsonAvailable download formats
Dataset updated
Aug 18, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Nebraska, Big Springs
Variables measured
Hispanic or Latino population with Cuban ancestry, Hispanic or Latino population with Mexican ancestry, Hispanic or Latino population with Puerto Rican ancestry, Hispanic or Latino population with Other Hispanic or Latino ancestry, Hispanic or Latino population with Cuban ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Mexican ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Puerto Rican ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Other Hispanic or Latino ancestry as Percent of Total Hispanic Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) Origin / Ancestry for Hispanic population and (b) respective population as a percentage of the total Hispanic population, we initially analyzed and categorized the data for each of the ancestries across the Hispanic or Latino population. It is ensured that the population estimates used in this dataset pertain exclusively to ancestries for the Hispanic or Latino population. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Big Springs Hispanic or Latino population. It includes the distribution of the Hispanic or Latino population, of Big Springs, by their ancestries, as identified by the Census Bureau. The dataset can be utilized to understand the origin of the Hispanic or Latino population of Big Springs.

Key observations

Among the Hispanic population in Big Springs, regardless of the race, the largest group is of Mexican origin, with a population of 49 (100% of the total Hispanic population).

https://i.neilsberg.com/ch/big-springs-ne-population-by-race-and-ethnicity.jpeg" alt="Big Springs Non-Hispanic population by race">

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Origin for Hispanic or Latino population include:

Mexican

Black or African American

Puerto Rican

Cuban

Other Hispanic or Latino

Variables / Data Columns

Origin: This column displays the origin for Hispanic or Latino population for the Big Springs

Population: The population of the specific origin for Hispanic or Latino population in the Big Springs is shown in this column.

% of Total Hispanic Population: This column displays the percentage distribution of each Hispanic origin as a proportion of Big Springs total Hispanic or Latino population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Big Springs Population by Race & Ethnicity. You can refer the same here
T
Development Engineering - Large Lots
open.piercecountywa.gov
internal.open.piercecountywa.gov
+2more
Updated Jan 13, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2025). Development Engineering - Large Lots [Dataset]. https://open.piercecountywa.gov/Maps-and-Geospatial/Development-Engineering-Large-Lots/qdcv-6g9t
Explore at:
application/rssxml, application/rdfxml, csv, kmz, application/geo+json, xml, tsv, kmlAvailable download formats
Dataset updated
Jan 13, 2025
Description
Polygons of active and historic large lot development in unincorporated Pierce County. Please read metadata (https://matterhorn.piercecountywa.gov/GISmetadata/pdbplandev_large_lots.html) for additional information. Any use or data download constitutes acceptance of the Terms of Use (https://matterhorn.piercecountywa.gov/Disclaimer/PierceCountyGISDataTermsofUse.pdf).
T
imdb_reviews
tensorflow.org
Updated Sep 20, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2024). imdb_reviews [Dataset]. https://www.tensorflow.org/datasets/catalog/imdb_reviews
Explore at:
Dataset updated
Sep 20, 2024
Description
Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

To use this dataset:

import tensorflow_datasets as tfds ds = tfds.load('imdb_reviews', split='train') for ex in ds.take(4): print(ex)

See the guide for more informations on tensorflow_datasets.

Facebook

Twitter

Click to copy link

Link copied

Cite

Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177

Data Cleaning Sample

Explore at:

141 scholarly articles cite this dataset (View in Google Scholar)

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Unique identifier

https://doi.org/10.5683/SP3/ZCN177

Dataset updated

Jul 13, 2023

Dataset provided by

Borealis

Authors

Rong Luo

License

CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

Sample data for exercises in Further Adventures in Data Cleaning.

Clear search

Close search

Google apps

Main menu

Data Cleaning Sample

Big Lake, TX Age Group Population Dataset: A Complete Breakdown of Big Lake...

About this dataset

Content

Inspiration

Recommended for further research

Big Stone City, SD Age Group Population Dataset: A Complete Breakdown of Big...

About this dataset

Content

Inspiration

Recommended for further research

Data from: CLIVAR LE project

Data from: Large Landing Trajectory Data Set for Go-Around Analysis

load minimum data set

select London City Airport, go-arounds, and 2019-01-04

iterate over flights and pull the data from OpenSky Network

The flights can be converted into a Traffic object

Big Sandy, MT Population Breakdown by Race

About this dataset

Content

Inspiration

Recommended for further research

Excel, AL Age Group Population Dataset: A Complete Breakdown of Excel Age...

About this dataset

Content

Inspiration

Recommended for further research

Aluminum alloy industrial materials defect

Big Bear Lake, CA Population Breakdown by Race

About this dataset

Content

Inspiration

Recommended for further research

INSPIRE Download Service (predefined ATOM) for dataset special area large...

Large Scale International Boundaries

USGS 10-m Digital Elevation Model (DEM): Hawaii: Big Island: Hillshade

High Resolution Digital Elevation Model Mosaic (HRDEM Mosaic) - CanElevation...

High-Resolution Infrared Enhanced Satellite Cloud Imagery - Global

webui-7k

INSPIRE Download Service (predefined ATOM) for dataset Large Gardens

criteo

Criteo Uplift Modeling Dataset

Data description

Fields

Key figures

Tasks

Big Springs, NE Hispanic or Latino Population Distribution by Their...

About this dataset

Content

Inspiration

Recommended for further research

Development Engineering - Large Lots

imdb_reviews

Data Cleaning Sample