16 datasets found

Titanic Dataset for EDA
kaggle.com
zip
Updated Nov 30, 2021
+ more versions
Cite
Ayush Sarraf0731 (2021). Titanic Dataset for EDA [Dataset]. https://www.kaggle.com/datasets/ayushsarraf0731/titanic-dataset-for-eda/code
Explore at:
Available download formats: zip (22548 bytes)
Dataset updated
Nov 30, 2021
Authors
Ayush Sarraf0731
Description
Dataset

This dataset was created by Ayush Sarraf0731

Contents
The Global EDA Market size was USD 14.9 billion in 2023!
cognitivemarketresearch.com
pdf,excel,csv,ppt
Cite
Cognitive Market Research, The Global EDA Market size was USD 14.9 billion in 2023! [Dataset]. https://www.cognitivemarketresearch.com/eda-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global EDA market size was USD 14.9 billion in 2023 and will grow at a compound annual growth rate (CAGR) of 10.50% from 2023 to 2030.
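As a rough illustration of what the stated growth rate implies, the sketch below compounds the 2023 figure forward. It assumes annual compounding over the seven years from 2023 to 2030 and treats USD 14.9 billion as the base-year value. The function name `project_market_size` is hypothetical, and the result is an arithmetic consequence of those assumptions, not a figure taken from the report.

```python
def project_market_size(base_usd_bn: float, cagr: float, years: int) -> float:
    """Compound a base value forward: base * (1 + cagr) ** years."""
    return base_usd_bn * (1.0 + cagr) ** years


# Assumption: 2023 is the base year and growth compounds once per year.
size_2030 = project_market_size(14.9, 0.1050, 2030 - 2023)
print(f"Implied 2030 market size: USD {size_2030:.1f} billion")
```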

The demand for the EDA Market is rising due to the rise in outdoor and adventure activities. Changing consumer lifestyle trends are higher in the EDA market. The cat segment held the highest EDA Market revenue share in 2023. North American EDA will continue to lead, whereas the European EDA Market will experience the most substantial growth until 2030.

Supply Chain and Risk Analysis to Provide Viable Market Output

The industry is facing supply chain and logistics disruptions. EDA tools have been instrumental in analyzing supply chain data, identifying vulnerabilities, predicting risks, and developing disruption mitigation strategies. Consumer behavior has undergone drastic changes due to blockages and restrictions. EDA helps companies analyze changing trends in buying behavior, online shopping preferences, and demand patterns, enabling organizations to adjust their marketing and sales strategies accordingly.

Health and Pharmaceutical Research to Propel Market Growth

EDA tools have played a key role in analyzing large amounts of data related to vaccine development, drug trials, patient records and epidemiological studies. These tools have helped researchers process and interpret complex medical data, leading to advances in the development of treatments and vaccines. The pandemic has created challenges in data collection, especially in sectors affected by lockdowns or blackouts. Rapidly changing conditions and incomplete data sets make effective EDA difficult due to data quality issues. The economic uncertainty caused by the pandemic has led to budget cuts in some sectors, impacting investment in new technologies. Some organizations have limited budgets that limit their ability to adopt or update EDA tools.

Market Dynamics of the EDA

Privacy and Data Security Issues to Restrict Market Growth

With the focus on data privacy regulations such as GDPR, CCPA, etc., organizations need to ensure compliance when handling sensitive data. These compliance requirements may limit the scope of the EDA by limiting the availability and use of certain data sets for information analysis. EDA often requires data analysts or data scientists who are skilled in statistical analysis and data visualization tools. A lack of professionals with these specialized skills can hinder an organization's ability to use EDA tools effectively, limiting adoption. Advanced EDA techniques can involve complex algorithms and statistical techniques that are difficult for non-technical users to understand. Interpreting results and deriving actionable insights from EDA results pose challenges that affect applicability to a wider audience.

Impact of COVID-19 on the EDA Market

The COVID-19 pandemic has had a nuanced impact on the EDA market. The pandemic has accelerated the need for data-driven decision-making as businesses face unprecedented challenges. Organizations have sought effective tools like EDA to analyze customer behavior, supply chain disruptions, and operational changes caused by the pandemic. As remote work becomes the norm, there is a huge demand for collaborative data analytics tools. EDA solutions that enable remote access, data sharing, and collaborative analysis are increasingly being used to support remote teams working on big data projects. The pandemic has led to a significant focus on health data analysis, including tracking infection rates, analyzing clinical trial data, modeling the spread of the virus, and estimating health resource needs. EDA tools have played an important role in processing and interpreting these massive amounts of health data.

Introduction of the EDA Market

As data volumes grow exponentially across industries, effective tools and technologies are urgently needed to make sense of this vast and complex information. EDA tools play a critical role in understanding data patterns, correlations, and outliers and supporting decision-making. Businesses are increasingly relying on data-driven insights to make informed decisions. EDA tools help organizations gain a competitive advantage by providing a framework for extracting meaningful insights from data by visualizing patterns, trends, and relationships.

For instance, in January 2023, Siemens Digital Indu...
eda_newsdata
kaggle.com
zip
Updated Apr 8, 2021
+ more versions
Cite
GodRabbit23333 (2021). eda_newsdata [Dataset]. https://www.kaggle.com/user2333/eda-newsdata
Explore at:
Available download formats: zip (130847439 bytes)
Dataset updated
Apr 8, 2021
Authors
GodRabbit23333
Description
Dataset

This dataset was created by GodRabbit23333

Contents
Eda Export Data of HS Code 29212100 India – Seair.co.in
seair.co.in
Updated Apr 20, 2016
+ more versions
Cite
Seair Exim (2016). Eda Export Data of HS Code 29212100 India – Seair.co.in [Dataset]. https://www.seair.co.in
Explore at:
Available download formats: .bin, .xml, .csv, .xls
Dataset updated
Apr 20, 2016
Dataset provided by
Seair Info Solutions PVT LTD
Authors
Seair Exim
Area covered
India
Description
Subscribers can find out export and import data of 23 countries by HS code or product’s name. This demo is helpful for market analysis.
i16 Census Place EconomicallyDistressedAreas 2016
data.ca.gov
data.cnra.ca.gov
+4more
Updated Feb 16, 2022
+ more versions
Cite
California Department of Water Resources (2022). i16 Census Place EconomicallyDistressedAreas 2016 [Dataset]. https://data.ca.gov/dataset/i16-census-place-economicallydistressedareas-2016
Explore at:
Available download formats: arcgis geoservices rest api, html, geojson, kml, zip, csv
Dataset updated
Feb 16, 2022
Dataset authored and provided by
California Department of Water Resources (http://www.water.ca.gov/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The TIGER/Line Files are shapefiles and related database files (.dbf) that are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line File is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. Primary roads are generally divided, limited-access highways within the interstate highway system or under State management, and are distinguished by the presence of interchanges. These highways are accessible by ramps and may include some toll highways. The MAF/TIGER Feature Classification Code (MTFCC) is S1100 for primary roads. Secondary roads are main arteries, usually in the U.S. Highway, State Highway, and/or County Highway system. These roads have one or more lanes of traffic in each direction, may or may not be divided, and usually have at-grade intersections with many other roads and driveways. They usually have both a local name and a route number. The MAF/TIGER Feature Classification Code (MTFCC) is S1200 for secondary roads.
Com Cast EDA
kaggle.com
zip
Updated Nov 29, 2021
Cite
Phumlani (2021). Com Cast EDA [Dataset]. https://www.kaggle.com/datasets/phumlaninoah/com-cast-eda/code
Explore at:
Available download formats: zip (70702 bytes)
Dataset updated
Nov 29, 2021
Authors
Phumlani
Description
Dataset

This dataset was created by Phumlani

Contents
i16 Census Place EconomicallyDistressedAreas 2018
data.ca.gov
data.cnra.ca.gov
+4more
Updated Feb 16, 2022
+ more versions
Cite
i16 Census Place EconomicallyDistressedAreas 2018 [Dataset]. https://data.ca.gov/dataset/i16-census-place-economicallydistressedareas-2018
Explore at:
Available download formats: kml, zip, arcgis geoservices rest api, geojson, csv, html
Dataset updated
Feb 16, 2022
Dataset authored and provided by
California Department of Water Resources (http://www.water.ca.gov/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The TIGER/Line Files are shapefiles and related database files (.dbf) that are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line File is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. Primary roads are generally divided, limited-access highways within the interstate highway system or under State management, and are distinguished by the presence of interchanges. These highways are accessible by ramps and may include some toll highways. The MAF/TIGER Feature Classification Code (MTFCC) is S1100 for primary roads. Secondary roads are main arteries, usually in the U.S. Highway, State Highway, and/or County Highway system. These roads have one or more lanes of traffic in each direction, may or may not be divided, and usually have at-grade intersections with many other roads and driveways. They usually have both a local name and a route number. The MAF/TIGER Feature Classification Code (MTFCC) is S1200 for secondary roads.
Titanic EDA
kaggle.com
zip
Updated Aug 3, 2021
+ more versions
Cite
Gourav Rohra (2021). Titanic EDA [Dataset]. https://www.kaggle.com/gouravrohra/titanic-eda
Explore at:
Available download formats: zip (58919 bytes)
Dataset updated
Aug 3, 2021
Authors
Gourav Rohra
Description
Dataset

This dataset was created by Gourav Rohra

Contents
Electronic Design Automation (EDA) Market Analysis APAC, North America,...
technavio.com
Updated May 15, 2024
Cite
Electronic Design Automation (EDA) Market Analysis APAC, North America, Europe, South America, Middle East and Africa - US, China, Germany, Japan, South Korea - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/electronic-design-automation-market-industry-analysis
Explore at:
Dataset updated
May 15, 2024
Dataset provided by
TechNavio
Authors
Technavio
Time period covered
2021 - 2025
Area covered
Germany, Japan, South Korea, China, United States, Global
Description

Electronic Design Automation Market Size 2024-2028

The electronic design automation market size is projected to increase by USD 8.69 billion at a CAGR of 10.26% between 2023 and 2028. Market expansion hinges on multiple factors, prominently the escalating importance of EDA in the electronic design sphere. As technological advancements drive complexity in electronic design, EDA tools play an increasingly crucial role in streamlining and optimizing the design process. Moreover, the growing relevance of EDA extends beyond traditional applications, encompassing emerging domains such as system-level design and hardware-software co-design. This expanding significance underscores the indispensable role of EDA solutions in enabling innovation and accelerating time-to-market for electronic products. Additionally, the proliferation of IoT managed services and IoT devices, the surge in demand for high-performance computing, and the advent of artificial intelligence further underscore the growing relevance of EDA across diverse industries. These factors collectively contribute to the robust growth trajectory of the EDA market, propelling it towards continued expansion and innovation.
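The increment and CAGR quoted above are linked by the compound-growth identity increment = base × ((1 + r)^n − 1). The sketch below backs out the base-year size that this identity implies. It assumes 2023 is the base year and that growth compounds annually over five years; those assumptions and the helper name `implied_base` are illustrative and do not come from the report.

```python
def implied_base(increment_usd_bn: float, cagr: float, years: int) -> float:
    """Solve increment = base * ((1 + cagr) ** years - 1) for base."""
    growth_factor = (1.0 + cagr) ** years
    return increment_usd_bn / (growth_factor - 1.0)


# Assumption: five annual compounding periods between 2023 and 2028.
base_2023 = implied_base(8.69, 0.1026, 2028 - 2023)
print(f"Implied 2023 base: USD {base_2023:.2f} billion")
print(f"Implied 2028 size: USD {base_2023 + 8.69:.2f} billion")
```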

What will be the Size of the Electronic Design Automation Market During the Forecast Period?


Electronic Design Automation Market Segmentation

The market research report provides comprehensive data (region wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018 - 2022 for the following segments.

Product Type Outlook
- Semiconductor IP
- CAE
- IC physical design and verification
- PCB and multi-chip module
- Services

Deployment Outlook
- On-premises
- Cloud-based

Region Outlook
- North America: The U.S., Canada
- Europe: The U.K., Germany, France, Rest of Europe
- APAC: China, India
- Middle East & Africa: Saudi Arabia, South Africa, Rest of the Middle East & Africa
- South America: Chile, Brazil, Argentina

Electronic design automation (EDA) is at the forefront of innovation, especially in the cloud based solutions era. As the IoT industry and AI industry expand, demand for miniaturized chips/ICs surges, emphasizing the need for precision and accuracy in designing circuits. EDA tools provide robust support for hardware development, leveraging advanced computer aided design techniques. With a focus on efficiency and reliability, these solutions empower engineers to navigate the complexities of modern hardware design with ease.

This market research report extensively covers market segmentation by product type (semiconductor IP, CAE, IC physical design and verification, PCB and multi-chip module, and services), deployment (on-premises and cloud-based), and geography (APAC, North America, Europe, South America, and Middle East and Africa). It also includes an in-depth analysis of drivers, trends, and challenges. Furthermore, the market forecasting report includes historic market data from 2018 to 2022.

By Product Type

The market share by the semiconductor IP segment will be significant during the forecast period. The semiconductor IP market segment held the largest share of the market in 2022. New products have emerged due to the increasing complexity of semiconductor designs and production techniques as well as their integration with advanced technologies. This is increasing the number of IPs being registered in the semiconductor industry.


The semiconductor IP segment showed a gradual increase in the market share of USD 3.26 billion in 2017. The semiconductor industry has been recording increased demand for many devices, including sensors, chips, radio frequency (RF) components, and memory devices. The increasing use of semiconductors in a range of sectors, e.g. automotive, energy, medical care, and engineering, has led to this demand. Such factors will increase segment growth during the forecast period.

By Deployment

The on-premises deployment segment refers to the traditional approach of deploying software or applications on local servers or computing infrastructure that is owned and managed by an organization. On-premises infrastructure gives the organization complete control over the resources, services, and data. On-premises systems also offer performance advantages, such as lower latency. Storing data locally allows greater control and keeps sensitive data from leaving the company. These advantages make on-premises software highly preferable to chip designers. EDA workflows include front-end design, backend workloads, as well as performance simulation and verif
hm-customer-metadata
kaggle.com
Updated Mar 13, 2022
Cite
Nguyentuananh (2022). hm-customer-metadata [Dataset]. https://www.kaggle.com/datasets/astrung/hm-customer-metadata/suggestions
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 13, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nguyentuananh
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

This data contains inactive/active and cold-start/non-cold-start information about each customer id in the test data of this competition: https://www.kaggle.com/c/h-and-m-personalized-fashion-recommendations

If you want to find more ideas about the hybrid approach, please check the comments in my thread: https://www.kaggle.com/c/h-and-m-personalized-fashion-recommendations/discussion/312653

You can use this information for a hybrid approach. Example: you can use a deep learning model for active and non-cold-start users, and use trending items for the others.
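The routing rule in the example above can be sketched as a small function. The status strings match the categories listed for `active_status` and `cold_start_status` in this dataset; the function name `choose_recommender` and the returned labels `"deep_model"` and `"trending"` are hypothetical placeholders for whatever models are actually used.

```python
def choose_recommender(active_status: str, cold_start_status: str) -> str:
    """Pick a (hypothetical) recommender for one customer.

    Active, non-cold-start users go to the deep learning model;
    everyone else falls back to trending items.
    """
    if active_status == "active" and cold_start_status == "non_cold_start":
        return "deep_model"
    return "trending"


print(choose_recommender("active", "non_cold_start"))
print(choose_recommender("inactive_in_2_months", "non_cold_start"))
```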

Content

If you need more detail about my data, please check my notebook: https://www.kaggle.com/astrung/eda-extract-user-metadata-to-apply-deep-model/notebook

This data has the following columns:
* Customer_id: all customer ids in the test set
* num_missing_months: number of months in 2020 in which the user has no transactions
* lastest_inactive_months: number of consecutive months in which the user is inactive before reappearing in the test set
* active_status: has the following categories:
  * active: the user has a transaction in Sep 2020 (appears in training data)
  * inactive_in_year: the user has no transactions in 2020
  * inactive_in_3_months_or_more: the user has no transactions in July, Aug, Sep 2020
  * inactive_in_2_months: the user has no transactions in Aug, Sep 2020
  * inactive_in_1_months: the user has no transactions in Sep 2020
* num_transactions: number of transactions in 2020
* cold_start_status: has the following categories:
  * cold_start: number of transactions <= 10 in 2020
  * non_cold_start: number of transactions > 10 in 2020
* mean_transactions_in_active_month: average number of transactions in months in which the user is active
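To make the category definitions concrete, here is a minimal sketch that derives `active_status`, `num_transactions`, and `cold_start_status` from per-month transaction counts. The input shape (a dict from 2020 month number to count) and the order in which the inactive categories are checked are assumptions; the author's notebook may compute them differently.

```python
def classify_customer(monthly_counts: dict) -> dict:
    """Derive status fields from {month number: transaction count} for 2020."""

    def inactive(months) -> bool:
        # True when the customer has no transactions in any of the months.
        return all(monthly_counts.get(m, 0) == 0 for m in months)

    num_transactions = sum(monthly_counts.values())

    # Assumed precedence: check the longest inactivity window first.
    if not inactive([9]):
        active_status = "active"
    elif num_transactions == 0:
        active_status = "inactive_in_year"
    elif inactive([7, 8, 9]):
        active_status = "inactive_in_3_months_or_more"
    elif inactive([8, 9]):
        active_status = "inactive_in_2_months"
    else:
        active_status = "inactive_in_1_months"

    cold_start_status = "cold_start" if num_transactions <= 10 else "non_cold_start"
    return {
        "active_status": active_status,
        "num_transactions": num_transactions,
        "cold_start_status": cold_start_status,
    }


print(classify_customer({7: 4, 9: 8}))
```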
AndroidAppReviews
huggingface.co
Cite
Harshitha, AndroidAppReviews [Dataset]. https://huggingface.co/datasets/NovaNightshade/AndroidAppReviews
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Authors
Harshitha
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The function to read the data is in the first cell of the Python notebook: eda.ipynb

Data Structure

The JSON data is organized as follows:

App Names: Top-level keys represent the names of the apps (e.g., "DoorDash", "McDonald's").

-> Score Categories: Under each app, reviews are grouped by score categories (e.g., "1", "2", "3", "4", "5").

-> Review Lists: Each score category contains a list of reviews.

-> Review Details: Each review includes: - content: The… See the full description on the dataset page: https://huggingface.co/datasets/NovaNightshade/AndroidAppReviews.
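The nesting described above (app name, then score category, then a list of reviews) can be traversed as in the following sketch. The sample JSON is invented placeholder data that only mirrors the described shape, `count_reviews` is a hypothetical helper, and this is not the reading function from eda.ipynb. Only the `content` field is used because it is the only review field named in the description.

```python
import json

# Placeholder data shaped like the description: app -> score -> list of reviews.
sample_json = """
{
  "DoorDash": {
    "1": [{"content": "example review text"}],
    "5": [{"content": "another example"}, {"content": "a third example"}]
  },
  "McDonald's": {
    "3": [{"content": "placeholder review"}]
  }
}
"""


def count_reviews(data: dict) -> dict:
    """Return {app name: {score category: number of reviews}}."""
    return {
        app: {score: len(reviews) for score, reviews in by_score.items()}
        for app, by_score in data.items()
    }


data = json.loads(sample_json)
counts = count_reviews(data)
for app, by_score in counts.items():
    print(app, by_score)
```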
‘COVID-19 dataset in Japan’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘COVID-19 dataset in Japan’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-covid-19-dataset-in-japan-2665/beaf3665/?iid=011-326&v=presentation
Explore at:
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Japan
Description
Analysis of ‘COVID-19 dataset in Japan’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/lisphilar/covid19-dataset-in-japan on 28 January 2022.

--- Dataset description provided by original source is as follows ---

1. Context

This is a COVID-19 dataset in Japan. This does not include the cases in Diamond Princess cruise ship (Yokohama city, Kanagawa prefecture) and Costa Atlantica cruise ship (Nagasaki city, Nagasaki prefecture).
- Total number of cases in Japan
- The number of vaccinated people (New/experimental)
- The number of cases at prefecture level
- Metadata of each prefecture

Note: Lisphilar (author) uploads the same files to https://github.com/lisphilar/covid19-sir/tree/master/data

This dataset can be retrieved with CovsirPhy (Python library).

pip install covsirphy --upgrade

import covsirphy as cs
data_loader = cs.DataLoader()
japan_data = data_loader.japan()
# The number of cases (Total/each province)
clean_df = japan_data.cleaned()
# Metadata
meta_df = japan_data.meta()

Please refer to CovsirPhy Documentation: Japan-specific dataset.

Note: Before analysing the data, please refer to Kaggle notebook: EDA of Japan dataset and COVID-19: Government/JHU data in Japan. The detailed explanation of the build process is discussed in Steps to build the dataset in Japan. If you find errors or have any questions, feel free to create a discussion topic.

1.1 Total number of cases in Japan

covid_jpn_total.csv

Cumulative number of cases:
- PCR-tested / PCR-tested and positive
- with symptoms (to 08May2020) / without symptoms (to 08May2020) / unknown (to 08May2020)
- discharged
- fatal

The number of cases:
- requiring hospitalization (from 09May2020)
- hospitalized with mild symptoms (to 08May2020) / severe symptoms / unknown (to 08May2020)
- requiring hospitalization, but waiting in hotels or at home (to 08May2020)

In primary source, some variables were removed on 09May2020. Values are NA in this dataset from 09May2020.

Manually collected the data from Ministry of Health, Labour and Welfare HP:
厚生労働省 HP (in Japanese)
Ministry of Health, Labour and Welfare HP (in English)

The number of vaccinated people:
- Vaccinated_1st: the number of vaccinated persons for the first time on the date
- Vaccinated_2nd: the number of vaccinated persons with the second dose on the date
- Vaccinated_3rd: the number of vaccinated persons with the third dose on the date

Data sources for vaccination:
- To 09Apr2021: 厚生労働省 HP 新型コロナワクチンの接種実績 (in Japanese)
- 首相官邸新型コロナワクチンについて
- From 10Apr2021: Twitter: 首相官邸（新型コロナワクチン情報）

1.2 The number of cases at prefecture level

covid_jpn_prefecture.csv

Cumulative number of cases:
- PCR-tested / PCR-tested and positive
- discharged
- fatal

The number of cases:
- requiring hospitalization (from 09May2020)
- hospitalized with severe symptoms (from 09May2020)

Using pdf-excel converter, manually collected the data from Ministry of Health, Labour and Welfare HP:
厚生労働省 HP (in Japanese)
Ministry of Health, Labour and Welfare HP (in English)

Note: covid_jpn_prefecture.groupby("Date").sum() does not match covid_jpn_total. When you analyse total data in Japan, please use covid_jpn_total data.
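The check behind this note can be sketched without loading the real files. The example below uses plain Python instead of the pandas `groupby` call mentioned above, and every row in it is invented toy data. `Date` is the column named in the note; `Prefecture`, `Positive`, and the helper names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy rows, NOT real values from the CSV files.
prefecture_rows = [
    {"Date": "2020-06-01", "Prefecture": "A", "Positive": 10},
    {"Date": "2020-06-01", "Prefecture": "B", "Positive": 5},
    {"Date": "2020-06-02", "Prefecture": "A", "Positive": 12},
    {"Date": "2020-06-02", "Prefecture": "B", "Positive": 6},
]
total_by_date = {"2020-06-01": 16, "2020-06-02": 18}


def sum_by_date(rows, column):
    """Sum one numeric column over all prefectures for each date."""
    sums = defaultdict(int)
    for row in rows:
        sums[row["Date"]] += row[column]
    return dict(sums)


def find_mismatches(prefecture_sums, totals):
    """Return {date: (prefecture sum, national total)} where the two differ."""
    return {
        date: (prefecture_sums.get(date), total)
        for date, total in totals.items()
        if prefecture_sums.get(date) != total
    }


prefecture_sums = sum_by_date(prefecture_rows, "Positive")
print(find_mismatches(prefecture_sums, total_by_date))
```

Dates that appear in the result are the ones where summing prefectures would give a different national figure, which is why the total file is the one to use for country-level analysis.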

1.3 Metadata of each prefecture

covid_jpn_metadata.csv
- Population (Total, Male, Female): 厚生労働省厚生統計要覧（2017年度）第１－５表
- Area (Total, Habitable): Wikipedia 都道府県の面積一覧 (2015)

Hospital_bed: With the primary data of 厚生労働省感染症指定医療機関の指定状況（平成31年4月1日現在）, 厚生労働省第二種感染症指定医療機関の指定状況（平成31年4月1日現在）, 厚生労働省医療施設動態調査（令和２年１月末概数）, 厚生労働省感染症指定医療機関について and secondary data of COVID-19 Japan 都道府県別感染症病床数,

Specific: Hospital beds of medical institutions designated for specific infectious diseases

Type-I: Hospital beds of medical institutions designated for type I infectious diseases

Type-II: Hospital beds of medical institutions designated for type II infectious diseases

Tuberculosis: Hospital beds of medical institutions designated for tuberculosis (outpatient care)

Care: long term care bed of hospitals

Total: Beds of all hospitals

Clinic_bed: With the primary data of 医療施設動態調査（令和２年１月末概数） ,

Care: long term care beds of clinics

Total: Beds of all clinics

Location: Data is from LinkData 都道府県庁所在地 (Public Domain) (secondary data).

Latitude

Longitude

Admin

Capital: Prefectural capital city. Data is from LinkData 都道府県庁所在地 (Public Domain) (secondary data).

Region: Region name. Data is from Wikipedia (secondary data). "Kyushu-Okinawa region" was separated into "Kyushu" and "Okinawa" by this dataset's author.

Num: Prefecture code (JIS X 0401: Hokkaido=1,...Okinawa=47). Data is from 国土交通省 GIS HP Pref code. cf. (not source) Japan Visitor: Japan Prefectures Map.

2. Acknowledgements

To create this dataset, edited and transformed data of the following sites was used.

厚生労働省 Ministry of Health, Labour and Welfare, Japan:
厚生労働省 HP (in Japanese)
Ministry of Health, Labour and Welfare HP (in English)
厚生労働省 HP 利用規約・リンク・著作権等 CC BY 4.0 (in Japanese)

国土交通省 Ministry of Land, Infrastructure, Transport and Tourism, Japan:
国土交通省 HP (in Japanese)
国土交通省 HP (in English)
国土交通省 HP 利用規約・リンク・著作権等 CC BY 4.0 (in Japanese)

Code for Japan / COVID-19 Japan:
Code for Japan COVID-19 Japan Dashboard (CC BY 4.0)
COVID-19 Japan 都道府県別感染症病床数 (CC BY)

Wikipedia: Wikipedia

LinkData: LinkData (Public Domain)

Inspiration

Changes in number of cases over time

Percentage of patients without symptoms / mild or severe symptoms

What to do next to prevent outbreak

License and how to cite

Kindly cite this dataset under CC BY-4.0 license as follows.
- Hirokazu Takaya (2020-2022), COVID-19 dataset in Japan, GitHub repository, https://github.com/lisphilar/covid19-sir/data/japan, or
- Hirokazu Takaya (2020-2022), COVID-19 dataset in Japan, Kaggle Dataset, https://www.kaggle.com/lisphilar/covid19-dataset-in-japan

--- Original source retains full ownership of the source dataset ---
i16 Census Tract EconomicallyDistressedAreas 2018
data.ca.gov
data.cnra.ca.gov
+3more
Updated Feb 16, 2022
Cite
California Department of Water Resources (2022). i16 Census Tract EconomicallyDistressedAreas 2018 [Dataset]. https://data.ca.gov/dataset/i16-census-tract-economicallydistressedareas-2018
Explore at:
Available download formats: html, arcgis geoservices rest api, geojson, kml, csv, zip
Dataset updated
Feb 16, 2022
Dataset authored and provided by
California Department of Water Resources (http://www.water.ca.gov/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a copy of the statewide Census Tract GIS Tiger file. It is used to determine if a census tract (CT) is DAC or not by adding ACS (American Community Survey) Median Household Income (MHI) data at the CT level. The IRWM web based DAC mapping tool uses this GIS layer. Every year this table gets updated after ACS publishes their updated MHI estimates. Created by joining 2016 DAC table to 2010 Census Tracts feature class. The TIGER/Line Files are shapefiles and related database files (.dbf) that are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line File is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2010 Census Participant Statistical Areas Program. The Census Bureau delineated the census tracts in situations where no local participant existed or where all the potential participants declined to participate. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. When first delineated, census tracts were designed to be homogeneous with respect to population characteristics, economic status, and living conditions. The spatial size of census tracts varies widely depending on the density of settlement. Physical changes in street patterns caused by highway construction, new development, and so forth, may require boundary revisions. 
In addition, census tracts occasionally are split due to population growth, or combined as a result of substantial population decline. Census tract boundaries generally follow visible and identifiable features. They may follow legal boundaries such as minor civil division (MCD) or incorporated place boundaries in some States and situations to allow for census tract-to-governmental unit relationships where the governmental boundaries tend to remain unchanged between censuses. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous. For the 2010 Census, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area.
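The reserved code ranges described above lend themselves to a simple lookup. The sketch below takes the four-digit basic tract number as an integer; the function name `tract_category` and the category labels it returns are hypothetical, and any handling of tract suffixes is left out.

```python
def tract_category(basic_code: int) -> str:
    """Map a 4-digit basic census tract number to the 2010 Census reserved range."""
    if 9400 <= basic_code <= 9499:
        # Majority American Indian population or reservation/trust land.
        return "american_indian_area"
    if 9800 <= basic_code <= 9899:
        # Little or no population, large special land use area.
        return "special_land_use"
    if 9900 <= basic_code <= 9998:
        # Water area only, no land area.
        return "water_only"
    return "standard"


for code in (101, 9401, 9850, 9900):
    print(code, tract_category(code))
```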
Virtual Reality Adaptation using Electrodermal Activity to Support User...
b2find.dkrz.de
Updated Apr 27, 2022
+ more versions
Cite
(2022). Virtual Reality Adaptation using Electrodermal Activity to Support User Experience - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/ac4de0b6-8b49-59c4-a868-e126c6f66520
Explore at:
Dataset updated
Apr 27, 2022
Description
We report an experiment (N=18) where participants were engaged in a dual-task setting in a Social VR (Virtual Reality) scenario. We present a physiologically-adaptive system that optimizes the virtual environment based on physiological arousal, i.e., electrodermal activity. We investigated the usability of the adaptive system in a simulated social virtual reality scenario. Participants completed an n-back task (primary) and a visual detection (secondary) task. Here, we adapted the visual complexity of the secondary task in the form of the number of non-playable characters of the secondary task to accomplish the primary task. We show that an adaptive virtual reality can improve users’ comfort by adapting the task complexity to physiological arousal. Specifically, we make available physiological (Electrodermal Activity - EDA; Electroencephalography - EEG; Electrocardiography - ECG), behavioral, and questionnaire data and, lastly, the analysis code. Users interested in reproducing the results can follow the methodology as reported in the paper and the analysis code as reported in the Python script (" Step_02.") in the Files section.
Electric Vehicle Population Data | Messy Data
kaggle.com
Updated Feb 2, 2025
M. Zohaib Zeeshan (2025). Electric Vehicle Population Data | Messy Data [Dataset]. https://www.kaggle.com/datasets/mzohaibzeeshan/electric-vehicle-population-data-messy-data/code
Dataset updated
Feb 2, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
M. Zohaib Zeeshan
Description
This dataset shows the Battery Electric Vehicles (BEVs) and Plug-in Hybrid Electric Vehicles (PHEVs) that are currently registered through the Washington State Department of Licensing (DOL).

Number of Rows: 223,995
Number of Columns: 17
Contains Missing Values

Column Descriptions:

VIN (1-10): First 10 characters of the Vehicle Identification Number.
County: The county where the vehicle is registered.
City: The city where the vehicle is registered.
State: The state where the vehicle is registered.
Postal Code: The ZIP code of the vehicle's registration location.
Model Year: The manufacturing year of the vehicle.
Make: The brand/manufacturer of the vehicle (e.g., Tesla, Nissan).
Model: The specific model of the vehicle.
Electric Vehicle Type: The type of EV (Battery Electric Vehicle or Plug-in Hybrid).
Clean Alternative Fuel Vehicle (CAFV) Eligibility: Indicates if the vehicle qualifies for CAFV benefits.
Electric Range: The maximum range the vehicle can travel on a single charge.
Base MSRP: The Manufacturer's Suggested Retail Price of the vehicle.
Legislative District: The legislative district where the vehicle is registered.
DOL Vehicle ID: A unique identifier assigned by the Department of Licensing.
Vehicle Location: A general reference to the vehicle's location.
Electric Utility: The electric utility company serving the vehicle's area.
2020 Census Tract: The census tract based on 2020 data for demographic analysis.

What you can do and learn with this data:

Properly engineer and clean this dataset for better analysis.

Perform EDA.

Perform detailed analysis.

Fit a machine learning model and improve its accuracy.
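Since the dataset is described as containing missing values, a natural first cleaning step is to count empty cells per column. The sketch below uses only the Python standard library; the helper name `count_missing` is hypothetical, and the three-column sample is made up for illustration (it borrows column names from the description above but is not taken from the real file).

```python
import csv
import io


def count_missing(rows, columns):
    """Count empty or absent values per column in an iterable of dict rows."""
    missing = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            # csv.DictReader yields None for cells missing from a short row
            if value is None or value.strip() == "":
                missing[col] += 1
    return missing


# Illustrative sample only; the real dataset has 17 columns.
sample = io.StringIO(
    "Make,Model,Electric Range\n"
    "Tesla,Model 3,\n"
    "Nissan,,84\n"
)
reader = csv.DictReader(sample)
result = count_missing(reader, reader.fieldnames)
print(result)
```

To run this against the downloaded CSV, the `io.StringIO` sample would be replaced with an opened file handle; the file name depends on how the archive is extracted.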
UNSW-NB15
kaggle.com
Updated Sep 9, 2024
StrGenIx | Laurens D'hooge (2024). UNSW-NB15 [Dataset]. http://doi.org/10.34740/kaggle/dsv/9350725
Unique identifier
https://doi.org/10.34740/kaggle/dsv/9350725
Dataset updated
Sep 9, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
StrGenIx | Laurens D'hooge
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This is an academic intrusion detection dataset. All the credit goes to the original authors: Dr. Nour Moustafa and Dr. Jill Slay.

Please cite their original paper and all other appropriate articles listed on the UNSW-NB15 page.

The full dataset also offers the pcap, BRO and Argus files along with additional documentation.

V1: Original CSVs obtained from here
V2: Cleaning -> parquet
V3: Reorganize to save storage, only keep original CSVs in V1/V2
V4: Update to remove contaminating features [presentation] & [conference article]

My modifications to the predesignated train-test sets are minimal and designed to decrease disk storage and increase performance & reliability.

In its current iteration, the dataset can be loaded trivially with pd.read_parquet(). All data types are already set correctly and there are 0 records with missing information. Reading parquet files does require fastparquet and/or pyarrow.
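As a minimal sketch of the loading step, the snippet below wraps `pd.read_parquet()` and reports the row count, the number of missing cells, and the column dtypes, so the claims above can be checked locally. The helper names `summarize_frame` and `load_and_summarize` are hypothetical, and the parquet file path is left to the caller because the file names inside the dataset are not given here.

```python
import pandas as pd


def summarize_frame(df: pd.DataFrame) -> dict:
    """Return basic integrity facts about a loaded DataFrame."""
    return {
        "rows": len(df),
        # Total number of NaN/None cells across all columns
        "missing_cells": int(df.isna().sum().sum()),
        "dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
    }


def load_and_summarize(path: str) -> dict:
    """Load one parquet file and summarize it.

    pd.read_parquet needs pyarrow or fastparquet to be installed.
    """
    return summarize_frame(pd.read_parquet(path))
```

Calling `load_and_summarize` on one of the downloaded parquet files should report `missing_cells` equal to 0 if the description holds for that file.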

Exploratory Data Analysis (EDA) through classification with very simple models to .877 AUROC.