Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created using LeRobot.
Dataset Structure
meta/info.json: { "codebase_version": "v2.1", "robot_type": "franka", "total_episodes": 51, "total_frames": 28867, "total_tasks": 1, "total_videos": 0, "total_chunks": 1, "chunks_size": 1000, "fps": 20, "splits": { "train": "0:51" }, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet", "video_path": null, "features": {… See the full description on the dataset page: https://huggingface.co/datasets/danielsanjosepro/clean-up-table.
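The `data_path` template in `meta/info.json` can be expanded into a concrete file path per episode. The sketch below is a minimal illustration, not LeRobot's own code: it assumes that the chunk number is the episode index integer-divided by `chunks_size`, and the helper name `episode_data_path` is hypothetical. Only the fields visible in the excerpt above are used.

```python
# Fields copied from the meta/info.json excerpt above (the truncated part is omitted).
info = {
    "total_episodes": 51,
    "chunks_size": 1000,
    "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet",
}


def episode_data_path(info, episode_index):
    """Expand the data_path template for one episode (hypothetical helper)."""
    if not 0 <= episode_index < info["total_episodes"]:
        raise IndexError(f"episode_index {episode_index} is out of range")
    # Assumption: episodes are grouped into chunks of `chunks_size` consecutive indices.
    episode_chunk = episode_index // info["chunks_size"]
    return info["data_path"].format(
        episode_chunk=episode_chunk, episode_index=episode_index
    )


first_path = episode_data_path(info, 0)
last_path = episode_data_path(info, 50)
print(first_path)
print(last_path)
```

With 51 episodes and a chunk size of 1000, every episode falls into chunk 0 under this assumption, which is consistent with `total_chunks` being 1.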
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Clean Table is a dataset for object detection tasks. It contains Tables annotations for 857 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Explore our public data on competitions, datasets, kernels (code / notebooks) and more.
Meta Kaggle may not be the Rosetta Stone of data science, but we do think there's a lot to learn (and plenty of fun to be had) from this collection of rich data about Kaggle’s community and activity.
Strategizing to become a Competitions Grandmaster? Wondering who, where, and what goes into a winning team? Choosing evaluation metrics for your next data science project? The kernels published using this data can help. We also hope they'll spark some lively Kaggler conversations and be a useful resource for the larger data science community.
This dataset is made available as CSV files through Kaggle Kernels. It contains tables on public activity from Competitions, Datasets, Kernels, Discussions, and more. The tables are updated daily.
Please note: This data is not a complete dump of our database. Rows, columns, and tables have been filtered out and transformed.
In August 2023, we released Meta Kaggle for Code, a companion to Meta Kaggle containing public, Apache 2.0 licensed notebook data. View the dataset and instructions for how to join it with Meta Kaggle here
We also updated the license on Meta Kaggle from CC-BY-NC-SA to Apache 2.0.
… UserId column in the ForumMessages table has values that do not exist in the Users table.
… True or False.
… Total columns.
For example, the DatasetCount is not the total number of datasets with the Tag according to the DatasetTags table.
… db_abd_create_tables.sql script.
… clean_data.py script.
The script does the following steps for each table:
… NULL.
… add_foreign_keys.sql script.
… Total columns in the database tables. I do that by running the update_totals.sql script.
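One of the fragments above notes that ForumMessages contains UserId values with no matching row in Users. A minimal sketch of how such orphaned references could be detected before adding foreign keys is shown below. It is not the author's clean_data.py: the function name, the `Id` key column in Users, and the toy rows are all assumptions made for illustration.

```python
def orphaned_values(child_rows, parent_rows, foreign_key, primary_key="Id"):
    """Return foreign-key values in child_rows that have no parent row.

    Rows are plain dicts, e.g. as produced by csv.DictReader.
    """
    parent_ids = {row[primary_key] for row in parent_rows}
    return sorted(
        {
            row[foreign_key]
            for row in child_rows
            # Empty or missing keys are not references, so they are skipped.
            if row[foreign_key] not in (None, "") and row[foreign_key] not in parent_ids
        }
    )


# Toy rows, made up for this example (not real Meta Kaggle data).
users = [{"Id": "1"}, {"Id": "2"}]
forum_messages = [
    {"Id": "10", "UserId": "1"},
    {"Id": "11", "UserId": "7"},  # references a user that does not exist
    {"Id": "12", "UserId": ""},   # no user recorded
]

missing = orphaned_values(forum_messages, users, foreign_key="UserId")
print(missing)
```

Rows flagged this way would have to be dropped or have the reference set to NULL before a foreign-key constraint could be added.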
This statistic shows the results of a survey conducted in the United States in 2017 on the frequency of tidying up all rooms. Some ** percent of respondents stated in their household tidying up all rooms happens several times per week. The Survey Data Table for the Statista survey Cleaning Products in the United States 2018 contains the complete tables for the survey including various column headings.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains the classic Wine Recognition Dataset from the UCI Machine Learning Repository — now presented in two formats:
A clean CSV (wine_clean.csv) for fast ML workflows
The raw UCI archive (wine_original.zip) for purists and explorers
Perfect for learning K-Nearest Neighbors (KNN), exploring distance metrics like Euclidean, Manhattan, and Cosine, and building visual and interactive ML notebooks.
| File | Description |
|---|---|
| wine_clean.csv | Clean version with column names, no missing data, and ready-to-use |
| wine.zip | Raw UCI files: wine.data, wine.names, etc. for reference or manual parsing |
| Feature | Description |
|---|---|
| Class | Target: Cultivar of wine (1, 2, or 3) |
| Alcohol | Alcohol content |
| Malic_Acid | Malic acid amount |
| Ash | Ash content |
| Alcalinity_of_Ash | Alkalinity of ash |
| Magnesium | Magnesium content |
| Total_Phenols | Total phenol compounds |
| Flavanoids | Flavonoid concentration |
| Nonflavanoid_Phenols | Non-flavonoid phenols |
| Proanthocyanins | Amount of proanthocyanins |
| Color_Intensity | Intensity of wine color |
| Hue | Hue of wine |
| OD280_OD315 | Optical density ratio |
| Proline | Proline levels |
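The description mentions KNN with Euclidean, Manhattan, and Cosine distances. The following is a minimal sketch of those three metrics and a majority-vote KNN classifier using only the standard library. The two-feature training points are made up for illustration and are not values from wine_clean.csv; in practice the features listed above would usually be standardized first, since they are on very different scales.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    # 1 - cosine similarity; undefined for all-zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, query, k=3, distance=euclidean):
    """train is a list of (feature_tuple, label) pairs; returns the majority label."""
    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D points standing in for (scaled) feature vectors and cultivar labels.
toy_train = [
    ((1.0, 1.0), 1),
    ((1.2, 0.9), 1),
    ((5.0, 5.0), 2),
    ((5.1, 4.8), 2),
    ((9.0, 1.0), 3),
]

predicted = knn_predict(toy_train, (1.1, 1.0), k=3, distance=manhattan)
print(predicted)
```

Passing a different `distance` function is enough to compare how the choice of metric changes the neighbors that get a vote.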
Public Domain (CC0) — free to use, remix, and share 🌍
If you're an ML student or early-career data scientist, this dataset is your 🍷 playpen. Dive in!
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data and audio included here were collected for the Soundscape Attributes Translation Project (SATP). First introduced in Aletta et al. (2020), the SATP is an attempt to provide validated translations of soundscape attributes in languages other than English. The recordings were used for headphone-based listening experiments.
The data are provided to accompany publications resulting from this project and to provide a unique dataset of 1000s of perceptual responses to a standardised set of urban soundscape recordings. This dataset is the result of efforts from hundreds of researchers, students, assistants, PIs, and participants from institutions around the world. We have made an attempt to list every contributor to this Zenodo repo; if you feel you should be included, please get in touch.
Citation: If you use the SATP dataset or part of it, please cite our paper describing the data collection and this dataset itself.
Overview: The SATP dataset consists of 27 30-sec binaural audio recordings made in urban public spaces in London and one 60 sec stereo calibration signal.
The recordings were made at locations as reported in Table 1 of the README.md (Recording locations), at various times of day by an operator wearing a binaural kit consisting of BHS II microphones and a SQobold (HEAD acoustics) device. Recordings were then exported to WAV via the ArtemiS SUITE software, using the original dynamic range from HDF. The listening experiment and the calibration procedure were intended for a headphone playback system (Sennheiser HD650 or similar open-back headphones recommended).
The recordings were selected from an initial set of 80 recordings through a pilot study to ensure the test set had an even coverage of the soundscape circumplex space. These recordings were sent to the partner institutions (see Table 2 of the README.md) and assessed by approximately 30 participants in the institution's target language. The questionnaire used in each assessment is a translation of Method A Questionnaire, ISO 12913-2:2018. Each institution carried out its own lab experiment to collect data, then submitted its data to the team at UCL to compile into a single dataset. Some institutions included additional questions or translation options; the combined dataset (SATP Dataset v1.x.xlsx) includes only the base set of questions, while the extended set of questions from each institution is included in the Institution Datasets folder.
In all, SATP Dataset v1.4 contains 19,089 samples, including 707 participants, for 27 recordings, in 18 languages with contributions from 29 institutions.
Descriptions of the recordings, including GPS coordinates and sound sources, can be found in the README.md file.
Format: The audio recordings are provided as 24 bit, 48 kHz, stereo WAV files. The combined dataset and Institutional datasets are provided as long tidy data tables in .xlsx files.
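Before running a listening experiment, it can be useful to confirm that each downloaded file really matches the stated format. The sketch below uses Python's standard `wave` module to read the header fields; the helper names are hypothetical, and a short silent file is generated in memory so the example runs without the SATP recordings. The `wave` module only handles PCM WAV data, so files with other header variants may raise `wave.Error`.

```python
import io
import wave

# Format stated in the dataset description: 24 bit, 48 kHz, stereo.
EXPECTED_FORMAT = {"channels": 2, "sample_width_bytes": 3, "sample_rate_hz": 48000}


def wav_format(source):
    """Read channel count, sample width, and sample rate from a WAV file or file object."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def make_silent_wav(n_frames=480):
    """Build a tiny silent 24-bit, 48 kHz stereo WAV in memory (stand-in for a real file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(3)
        wav.setframerate(48000)
        wav.writeframes(b"\x00" * (2 * 3 * n_frames))
    return buffer.getvalue()


found = wav_format(io.BytesIO(make_silent_wav()))
print(found == EXPECTED_FORMAT)
```

For a real recording, `wav_format` can be given the file path instead of a `BytesIO` object.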
Calibration: The recommended calibration approach was based on the open-circuit voltage (OCV) procedure, which was considered most accessible, but other calibration procedures are also possible (Lam et al. (2022)). The provided calibration file is a computer-generated sine wave at 1 kHz, matching a sine wave recorded using the exact same setup at an SPL of 94 dB. If the calibration signal playback level is set to match an SPL of 94 dB at the eardrum, all 27 samples should be reproduced at realistic loudness. More details on the OCV calibration procedure and other options can be found in Lam et al. (2022) and the attached documentation. PLEASE DO NOT EXPOSE YOURSELF NOR THE PARTICIPANTS TO THE CALIBRATION SIGNAL SET AT THE REALISTIC LEVEL AS IT CAN CAUSE HARM.
License and reuse: All SATP recordings are provided under the Creative Commons Attribution 4.0 International (CC BY 4.0) License and are free to use. We encourage other researchers to replicate the SATP protocol and contribute new languages to the dataset. We also encourage the use of these recordings and the perceptual data for further soundscape research purposes. Please provide the proper attribution and get in touch with the authors if you would like to contribute a new translation or for any other collaborations.
Annual total supply, output margins, international and interprovincial imports, total use, intermediate input, domestic demand, inventories, and international and interprovincial exports of the environmental and clean technology product sector, per goods and services category, for Canada, provinces and territories.
This statistic shows the results of a survey conducted in the United States in 2017 on attitudes towards cleaning. Some ** percent of respondents stated they are happy when everything is clean and tidy for their family. The Survey Data Table for the Statista survey Cleaning Products in the United States 2018 contains the complete tables for the survey including various column headings.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Blockchain data query: Stake DAO - Strategies Clean Table
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 25 series, with data for years 2007 - 2016 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (1 item: Canada); Economic variable (9 items: Total supply; Output; Margins; Imports; ...); Product and service (9 items: Total, products and services; Total, electricity; From nuclear; From renewable sources; ...).
The data comes from Wikipedia.
The included dataset was mined for all 50 states; column names were tidied, and the results were bound and aggregated.
state_stations.csv

| variable | class | description |
|---|---|---|
| call_sign | character | Call Sign |
| frequency | character | frequency |
| city | character | city |
| licensee | character | licensee |
| format | character | format |
| state | character | state |
station_info.csv
Can be joined:
state_stations |> dplyr::right_join(station_info, by = c("call_sign"))
| variable | class | description |
|---|---|---|
| call_sign | character | Call sign |
| facility_id | double | Facility id |
| service | character | Service |
| licensee | character | Licensee |
| status | character | Status |
| details | character | Details |
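The R one-liner above is a right join on call_sign: every row of station_info is kept, and matching state_stations columns are attached where a call sign exists in both files. For readers working outside R, here is a rough standard-library sketch of the same operation on lists of dicts. The `right_join` helper and the toy rows are made up for illustration; columns that appear in both tables (such as licensee) get `.x` and `.y` suffixes, mirroring dplyr's default.

```python
from collections import defaultdict


def right_join(left, right, key, suffixes=(".x", ".y")):
    """Keep every row of `right`; attach matching `left` rows by `key`."""
    left_cols = [c for c in (left[0] if left else {}) if c != key]
    right_cols = [c for c in (right[0] if right else {}) if c != key]
    shared = set(left_cols) & set(right_cols)

    by_key = defaultdict(list)
    for row in left:
        by_key[row[key]].append(row)

    joined = []
    for r_row in right:
        # An unmatched right row still appears once, with empty left columns.
        matches = by_key.get(r_row[key]) or [dict.fromkeys(left_cols)]
        for l_row in matches:
            out = {key: r_row[key]}
            for col in left_cols:
                out[col + suffixes[0] if col in shared else col] = l_row[col]
            for col in right_cols:
                out[col + suffixes[1] if col in shared else col] = r_row[col]
            joined.append(out)
    return joined


# Toy rows with a subset of the documented columns (values are invented).
state_stations = [
    {"call_sign": "WAAA", "city": "Springfield", "licensee": "Example Media"},
]
station_info = [
    {"call_sign": "WAAA", "facility_id": 1.0, "licensee": "Example Media LLC"},
    {"call_sign": "WBBB", "facility_id": 2.0, "licensee": "Sample Radio"},
]

rows = right_join(state_stations, station_info, key="call_sign")
for row in rows:
    print(row)
```

With pandas available, `DataFrame.merge(..., how="right", on="call_sign")` would be the more usual way to express this.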
citation("tidytuesdayR")
https://www.futuremarketinsights.com/privacy-policy
The global clean steam separator market is expected to reach a value of USD 2.6 billion by the end of 2025 and is projected to expand steadily to USD 3.6 billion by 2035, growing at a CAGR of 3.1%.
| Metric | Value |
|---|---|
| Industry Size (2025E) | USD 2.6 billion |
| Industry Value (2035F) | USD 3.6 billion |
| CAGR (2025 to 2035) | 3.1% |
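The three figures in the table are linked by the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below applies it to the rounded endpoint values from the table; the function names are illustrative. With USD 2.6 billion and USD 3.6 billion over 10 years the formula gives roughly 3.3%, slightly above the stated 3.1%, a gap that rounding of the endpoints to one decimal place could account for.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1.0 + rate) ** years


# Rounded endpoints from the table above (USD billion, 2025 to 2035).
implied_rate = cagr(2.6, 3.6, 10)
projected_at_stated_rate = project(2.6, 0.031, 10)

print(f"Implied CAGR from rounded endpoints: {implied_rate:.2%}")
print(f"USD 2.6 billion compounded at 3.1% for 10 years: {projected_at_stated_rate:.2f} billion")
```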
Clean Steam Separator Market Analyzed by Top Investment Segments
| Type | CAGR (2025 to 2035) |
|---|---|
| Stainless Steel | 3.8% |
| Structure Type | CAGR (2025 to 2035) |
|---|---|
| Fabricated | 3.9% |
| End Use | CAGR (2025 to 2035) |
|---|---|
| Pharmaceuticals | 4.1% |
Country-Wise Insights
| Country | CAGR (2025 to 2035) |
|---|---|
| United States | 2.9% |
| United Kingdom | 2.8% |
| European Union | 3.0% |
| China | 3.4% |
| India | 3.6% |
| Japan | 2.7% |
| South Korea | 3.2% |
High-frequency sensor data from a YSI 600 OMS Optical Monitoring System (every 15 minutes) and Sontek IQ (every 10 minutes) in a meadow reach at White Clay Creek from January 2018 through December 2018. Funded by NSF and DEB as part of the LTREB grant to study the recovery of stream ecosystem structure and function during reforestation, Stroud Water Research Center. The parameters in this data package are water temperature, depth, turbidity, conductivity, specific conductance, water pressure, discharge, river section area, and velocity. Data are presented in four tables which likely have significant overlap. The raw data table presents the data exactly as it was downloaded from the Aquarius Database. It is formatted as a "wide" human-readable table. IQ_stream and YSI_stream present only the data from the respective sensors. These tables are gapfilled so that there are no time gaps. They are formatted as "wide" human-readable tables. The full_stream table is all of the data, raw and cleaned, from both sensors. It is organized as a long, tidy table and is optimal for machine readability. All of the parameters and tables are further explained in the metadata.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Russia Avg Consumer Price: Paper & Clean Articles: Paper Table Napkins data was reported at 52.720 RUB/100 Unit in Feb 2025. This records an increase from the previous number of 52.630 RUB/100 Unit for Jan 2025. Russia Avg Consumer Price: Paper & Clean Articles: Paper Table Napkins data is updated monthly, averaging 47.530 RUB/100 Unit from Jan 2021 (Median) to Feb 2025, with 50 observations. The data reached an all-time high of 52.720 RUB/100 Unit in Feb 2025 and a record low of 34.310 RUB/100 Unit in May 2021. Russia Avg Consumer Price: Paper & Clean Articles: Paper Table Napkins data remains active status in CEIC and is reported by Federal State Statistics Service. The data is categorized under Russia Premium Database’s Prices – Table RU.PA010: Average Consumer Price: Paper and Clean Articles, Stationeries, Publishing.
https://creativecommons.org/publicdomain/zero/1.0/
There are two files in this dataset. One file contains line items from 10-K and 10-Q forms filed between 2009-04-15 and 2023-09-06. The other file, "line_item_counts.csv", contains how often each line item occurs, along with a description of the line item.
I was originally looking for a dataset with up to date company information but couldn't find anything that was current and beginner friendly to use. So I decided to pull data directly from SEC Edgar to create a tidy table from their dataset. I have yet to use it but figured I would share what I have so far in case anyone was in my position.
I'll release more info about my process in the near future, but for now I hope that you find some use from this dataset.
I have also released a sample notebook to show how you can load the large dataset into Kaggle without exceeding memory limits. Hopefully this can help you get started if you want to try it in Kaggle. Other options are to download the dataset and work with it locally in your preferred IDE, where operations are limited by the memory available on your computer, or to use a cloud computing platform like AWS EC2 or GCP.
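The sample notebook itself is not reproduced here. One common way to stay within memory limits, sketched below under stated assumptions, is to stream the CSV row by row and keep only a running aggregate instead of loading the whole file. The column name `line_item` and the in-memory sample text are hypothetical stand-ins for the real file and its schema.

```python
import csv
import io
from collections import Counter


def count_column_values(lines, column):
    """Stream CSV rows and count how often each value of `column` occurs.

    `lines` can be any iterable of text lines, such as an open file, so only
    one row is held in memory at a time.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


# Tiny in-memory stand-in for the large file; the column names are assumptions.
sample_csv = io.StringIO(
    "company,line_item,value\n"
    "A,Revenues,100\n"
    "A,Assets,500\n"
    "B,Revenues,80\n"
)

line_item_counts = count_column_values(sample_csv, "line_item")
print(line_item_counts.most_common())
```

For the real dataset, the `StringIO` object would be replaced by `open(path, newline="")` on the downloaded CSV file.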
This week's dataset is a dataset all about meteorites, where they fell and when they fell! Data comes from the Meteoritical Society by way of NASA. H/t to #TidyTuesday community member Malin Axelsson for sharing this data as an issue on GitHub!
If you want to find out more about meteorite classifications, Malin was kind enough to share a Wikipedia article as well!
meteorites.csv

| variable | class | description |
|---|---|---|
| name | character | Meteorite name |
| id | double | Meteorite numerical ID |
| name_type | character | Name type either valid or relict, where relict = a meteorite that cannot be assigned easily to a class |
| class | character | Class of the meteorite, please see Wikipedia for full context |
| mass | double | Mass in grams |
| fall | character | Fell or Found meteorite |
| year | integer | Year found |
| lat | double | Latitude |
| long | double | Longitude |
| geolocation | character | Geolocation |
@misc{tidytuesday, title = {Tidy Tuesday: A weekly social data project}, author = {R4DS Online Learning Community}, url = {https://github.com/rfordatascience/tidytuesday}, year = {2023} }
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
A comprehensive collection of US Census state population totals by year - from 1790 to present. Includes all 50 states plus DC.
Data from 1790 to 1900 is represented once per decade based on historic US Census data. Populations between 1900 and 1946 are backfilled estimates provided by the US census based on decennial Census data combined with external data including birth rates and death rates. Populations from 1947 onwards are based on population estimate surveys conducted by the US Census.
Population data is published in a tidy / long format as well as a wide / columnar format:
tidy format:
Each row represents a total population for a particular year and state.
This format is ideally suited for computation and for converting to other formats as needed.
wide format:
A pivot table of populations by year and state, with states as columns and years as rows. Each row represents populations for all states in a year.
This format is more compact and human-readable than the tidy format.
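The relationship between the two layouts can be illustrated with a short sketch that converts tidy rows into the wide pivot and back. The column names `year`, `state`, and `population`, the placeholder state codes, and the population numbers are all invented for this example and are not taken from the published files.

```python
def tidy_to_wide(tidy_rows):
    """Pivot tidy rows into one row per year with one column per state."""
    states = sorted({row["state"] for row in tidy_rows})
    by_year = {}
    for row in tidy_rows:
        by_year.setdefault(row["year"], {})[row["state"]] = row["population"]
    return [
        {"year": year, **{state: by_year[year].get(state) for state in states}}
        for year in sorted(by_year)
    ]


def wide_to_tidy(wide_rows):
    """Unpivot wide rows back into one row per (year, state) pair."""
    return [
        {"year": row["year"], "state": state, "population": population}
        for row in wide_rows
        for state, population in row.items()
        if state != "year" and population is not None
    ]


# Invented numbers and placeholder state codes, for illustration only.
tidy = [
    {"year": 1990, "state": "AA", "population": 100},
    {"year": 1990, "state": "BB", "population": 200},
    {"year": 2000, "state": "AA", "population": 110},
    {"year": 2000, "state": "BB", "population": 220},
]

wide = tidy_to_wide(tidy)
for row in wide:
    print(row)
```

A state with no value for a given year shows up as `None` in the wide layout and is dropped again when converting back, which is one reason the tidy layout is the easier starting point for computation.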
https://creativecommons.org/publicdomain/zero/1.0/
PROJECT OBJECTIVE
We are part of XYZ Co Pvt Ltd, a company in the business of organizing sports events at the international level. Countries nominate sportsmen from different departments, and our team has been given the responsibility to systematize the membership roster and generate different reports as per business requirements.
Questions (KPIs)
TASK 1: STANDARDIZING THE DATASET
TASK 2: DATA FORMATTING
TASK 3: SUMMARIZE DATA - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1) • Create a PIVOT table in the worksheet ANALYSIS, starting at cell B3, with the following details:
TASK 4: SUMMARIZE DATA - EXCEL FUNCTIONS (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a SUMMARY table in the worksheet ANALYSIS, starting at cell G4, with the following details:
TASK 5: GENERATE REPORT - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a PIVOT table report in the worksheet REPORT, starting at cell A3, with the following information:
Process
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Revised supplementary table II for JAAD manuscript "Association of cutaneous leiomyosarcoma with subsequent primary malignancies: a population-based analysis"