100+ datasets found
  1. Iris dataset

    • kaggle.com
    zip
    Updated Jan 16, 2024
    Cite
    Ehsan Zafari (2024). Iris dataset [Dataset]. https://www.kaggle.com/datasets/ehsanzafari/iris-dataset
    Explore at:
    zip(955 bytes)
    Available download formats
    Dataset updated
    Jan 16, 2024
    Authors
    Ehsan Zafari
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    The Iris dataset is a classic dataset in the field of machine learning and statistics. It's often used for demonstrating various data analysis, machine learning, and statistical techniques. Here are some key details about it:

    Background - Origin: The dataset was introduced by the British statistician and biologist Ronald Fisher in his 1936 paper titled "The use of multiple measurements in taxonomic problems." - Purpose: Fisher developed the dataset as an example of linear discriminant analysis.

    Data Composition - Data Points: The dataset consists of 150 samples from three species of Iris flowers: Iris Setosa, Iris Versicolour, and Iris Virginica. - Features: There are four features measured in centimeters for each sample: 1. Sepal Length 2. Sepal Width 3. Petal Length 4. Petal Width - Classes: The dataset contains three classes, corresponding to the three species of Iris. Each class has 50 samples.

    Usage - Classification: The Iris dataset is widely used for classification tasks, especially to illustrate the principles of supervised machine learning algorithms. - Testing Algorithms: It's often used to test out algorithms for linear regression, classification, and clustering due to its simplicity and small size. - Educational Purpose: Because of its clarity and simplicity, it's frequently used in teaching data science and machine learning.

    Characteristics - Simple and Clean: The dataset is straightforward, with minimal preprocessing required, making it ideal for beginners. - Well-Behaved Classes: The species are relatively well separated, though there's some overlap between Versicolor and Virginica. - Multivariate Data: It involves understanding the relationship between multiple variables (the four features).

    Applications - Benchmarking: The Iris dataset serves as a benchmark for evaluating the performance of different algorithms. - Visualization: It's great for practicing data visualization, especially for exploring techniques like scatter plots, box plots, and pair plots to understand feature relationships.

    Despite its simplicity, the Iris dataset remains one of the most famous datasets in the world of data science and machine learning. It serves as an excellent starting point for anyone new to the field and remains a baseline for testing algorithms and teaching concepts.
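
    Because the same 150 samples ship with common libraries, the classification exercise described above takes only a few lines. The sketch below uses scikit-learn's bundled copy of Iris rather than the Kaggle zip linked here; any simple classifier behaves similarly on this data.

        # Minimal Iris classification sketch using scikit-learn's bundled copy of the data.
        from sklearn.datasets import load_iris
        from sklearn.model_selection import train_test_split
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import accuracy_score

        X, y = load_iris(return_X_y=True)        # 150 samples, 4 features, 3 classes
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.3, random_state=0, stratify=y)

        clf = LogisticRegression(max_iter=200)   # any simple classifier works well here
        clf.fit(X_train, y_train)
        print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))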

  2. 1999 RoxAnn Data Points from Apalachicola Bay, Florida

    • catalog.data.gov
    • datasets.ai
    • +2more
    Updated Oct 31, 2024
    + more versions
    Cite
    NOAA Office for Coastal Management (Point of Contact, Custodian) (2024). 1999 RoxAnn Data Points from Apalachicola Bay, Florida [Dataset]. https://catalog.data.gov/dataset/1999-roxann-data-points-from-apalachicola-bay-florida1
    Explore at:
    Dataset updated
    Oct 31, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration, http://www.noaa.gov/
    Area covered
    Florida, Apalachicola Bay
    Description

    The Apalachicola Bay National Estuarine Research Reserve and the NOAA Office for Coastal Management worked together to map benthic habitats within Apalachicola Bay, Florida. The bay and the lower portions of four distributaries were surveyed on 11-22 October 1999 using three benthic sampling techniques. This data set represents the information gathered from a RoxAnn acoustic sensor. The instrument was used to characterize bottom type by extracting data on bottom roughness and bottom hardness from the primary and secondary sounder echoes. The data is classified on-the-fly, using the Sediment Profile Images and grab samples collected for field validation, and subject to a post-processing classification. The RoxAnn data points were exported into a geographic information system (GIS) and post-processed to remove unreliable data points and re-classified. This data set is comprised of the cleaned, attributed point data. The attributes include location, date, time, depth, field derived classification, and the classification derived from post-processing the data. Original contact information: Contact Org: NOAA Office for Coastal Management Phone: 843-740-1202 Email: coastal.info@noaa.gov

  3. Replication Data for: A Three-Year Mixed Methods Study of Undergraduates’...

    • dataverse.no
    • dataverse.azure.uit.no
    • +2more
    Updated Oct 8, 2024
    Cite
    Ellen Nierenberg; Ellen Nierenberg (2024). Replication Data for: A Three-Year Mixed Methods Study of Undergraduates’ Information Literacy Development: Knowing, Doing, and Feeling [Dataset]. http://doi.org/10.18710/SK0R1N
    Explore at:
    txt(21865), txt(19475), csv(55030), txt(14751), txt(26578), txt(16861), txt(28211), pdf(107685), pdf(657212), txt(12082), txt(16243), text/x-fixed-field(55030), pdf(65240), txt(8172), pdf(634629), txt(31896), application/x-spss-sav(51476), txt(4141), pdf(91121), application/x-spss-sav(31612), txt(35011), txt(23981), text/x-fixed-field(15653), txt(25369), txt(17935), csv(15653)
    Available download formats
    Dataset updated
    Oct 8, 2024
    Dataset provided by
    DataverseNO
    Authors
    Ellen Nierenberg; Ellen Nierenberg
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Aug 8, 2019 - Jun 10, 2022
    Area covered
    Norway
    Description

    This data set contains the replication data and supplements for the article "Knowing, Doing, and Feeling: A three-year, mixed-methods study of undergraduates’ information literacy development." The survey data is from two samples:
    - cross-sectional sample (different students at the same point in time)
    - longitudinal sample (the same students at different points in time)
    Surveys were distributed via Qualtrics during the students' first and sixth semesters. Quantitative and qualitative data were collected and used to describe students' IL development over 3 years. Statistics from the quantitative data were analyzed in SPSS. The qualitative data was coded and analyzed thematically in NVivo. The qualitative, textual data is from semi-structured interviews with sixth-semester students in psychology at UiT, both focus groups and individual interviews. All data were collected as part of the contact author's PhD research on information literacy (IL) at UiT.
    The following files are included in this data set:
    1. A README file which explains the quantitative data files. (2 file formats: .txt, .pdf)
    2. The consent form for participants (in Norwegian). (2 file formats: .txt, .pdf)
    3. Six data files with survey results from UiT psychology undergraduate students for the cross-sectional (n=209) and longitudinal (n=56) samples, in 3 formats (.dat, .csv, .sav). The data was collected in Qualtrics from fall 2019 to fall 2022.
    4. Interview guide for 3 focus group interviews. (File format: .txt)
    5. Interview guides for 7 individual interviews - first round (n=4) and second round (n=3). (File format: .txt)
    6. The 21-item IL test (Tromsø Information Literacy Test = TILT), in English and Norwegian. TILT is used for assessing students' knowledge of three aspects of IL: evaluating sources, using sources, and seeking information. The test is multiple choice, with four alternative answers for each item. This test is a "KNOW-measure," intended to measure what students know about information literacy. (2 file formats: .txt, .pdf)
    7. Survey questions related to interest - specifically students' interest in being or becoming information literate - in 3 parts (all in English and Norwegian): a) information and questions about the 4 phases of interest; b) interest questionnaire with 26 items in 7 subscales (Tromsø Interest Questionnaire - TRIQ); c) survey questions about IL and interest, need, and intent. (2 file formats: .txt, .pdf)
    8. Information about the assignment-based measures used to measure what students do in practice when evaluating and using sources. Students were evaluated with these measures in their first and sixth semesters. (2 file formats: .txt, .pdf)
    9. The Norwegian Centre for Research Data's (NSD) 2019 assessment of the notification form for personal data for the PhD research project. In Norwegian. (Format: .pdf)

  4. Point-of-Interest (POI) Data | Global Coverage | 250M Business Listings Data...

    • datarade.ai
    .json, .csv, .xls
    Updated Jan 30, 2022
    Cite
    Quadrant (2022). Point-of-Interest (POI) Data | Global Coverage | 250M Business Listings Data with Custom On-Demand Attributes [Dataset]. https://datarade.ai/data-products/quadrant-point-of-interest-poi-data-business-listings-dat-quadrant
    Explore at:
    .json, .csv, .xls
    Available download formats
    Dataset updated
    Jan 30, 2022
    Dataset authored and provided by
    Quadrant
    Area covered
    France
    Description

    We seek to mitigate the challenges with web-scraped and off-the-shelf POI data, and provide tailored, complete, and manually verified datasets with Geolancer. Our goal is to help represent the physical world accurately for applications and services dependent on precise POI data, and offer a reliable basis for geospatial analysis and intelligence.

    Our POI database is powered by our proprietary POI collection and verification platform, Geolancer, which provides manually verified, authentic, accurate, and up-to-date POI datasets.

    Enrich your geospatial applications with a contextual layer of comprehensive and actionable information on landmarks, key features, business areas, and many more granular, on-demand attributes. We offer on-demand data collection and verification services that fit unique use cases and business requirements. Using our advanced data acquisition techniques, we build and offer tailor-made POI datasets. Combined with our expertise in location data solutions, we can be a holistic data partner for our customers.

    KEY FEATURES - Our proprietary, industry-leading manual verification platform Geolancer delivers up-to-date, authentic data points

    • POI-as-a-Service with on-demand verification and collection in 170+ countries leveraging our network of 1M+ contributors

    • Customise your feed by specific refresh rate, location, country, category, and brand based on your specific needs

    • Data Noise Filtering Algorithms normalise and de-dupe POI data that is ready for analysis with minimal preparation

    DATA QUALITY

    Quadrant’s POI data are manually collected and verified by Geolancers. Our network of freelancers maps cities and neighborhoods, adding and updating POIs via our proprietary app, Geolancer, on their smartphones. Compared to other methods, this process guarantees accuracy and promises a healthy stream of POI data. This method of data collection also steers clear of infringement on users’ privacy and sale of their location data. These purpose-built apps do not store, collect, or share any data other than the physical location (without tying context back to an actual human being and their mobile device).

    USE CASES

    The main goal of POI data is to identify a place of interest, establish its accurate location, and help businesses understand the happenings around that place to make better, well-informed decisions. POI can be essential in assessing competition, improving operational efficiency, planning the expansion of your business, and more.

    It can be used by businesses to power their apps and platforms for last-mile delivery, navigation, mapping, logistics, and more. Combined with mobility data, POI data can be employed by retail outlets to monitor traffic to one of their sites or of their competitors. Logistics businesses can save costs and improve customer experience with accurate address data. Real estate companies use POI data for site selection and project planning based on market potential. Governments can use POI data to enforce regulations, monitor public health and well-being, plan public infrastructure and services, and more. A few common and widespread use cases of POI data are:

    • Navigation and mapping for digital marketplaces and apps.
    • Logistics for online shopping, food delivery, last-mile delivery, and more.
    • Improving operational efficiency for rideshare and transportation platforms.
    • Demographic and human mobility studies for market consumption and competitive analysis.
    • Market assessment, site selection, and business expansion.
    • Disaster management and urban mapping for public welfare.
    • Advertising and marketing deployment and ROI assessment.
    • Real-estate mapping for online sales and renting platforms.

    ABOUT GEOLANCER

    Quadrant's POI-as-a-Service is powered by Geolancer, our industry-leading manual verification project. Geolancers, equipped with a smartphone running our proprietary app, manually add and verify POI data points, ensuring accuracy and authenticity. Geolancer helps data buyers acquire data with the update frequency suited for their specific use case.

  5. Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories,...

    • datarade.ai
    .json
    Updated Sep 7, 2024
    + more versions
    Cite
    Xverum (2024). Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories, Geographic & Location Intelligence, Regular Updates [Dataset]. https://datarade.ai/data-products/global-point-of-interest-poi-data-230m-locations-5000-c-xverum
    Explore at:
    .json
    Available download formats
    Dataset updated
    Sep 7, 2024
    Dataset provided by
    Xverum LLC
    Authors
    Xverum
    Area covered
    French Polynesia, Mauritania, Andorra, Costa Rica, Kyrgyzstan, Vietnam, Antarctica, Guatemala, Northern Mariana Islands, Bahamas
    Description

    Xverum’s Point of Interest (POI) Data is a comprehensive dataset containing 230M+ verified locations across 5000 business categories. Our dataset delivers structured geographic data, business attributes, location intelligence, and mapping insights, making it an essential tool for GIS applications, market research, urban planning, and competitive analysis.

    With regular updates and continuous POI discovery, Xverum ensures accurate, up-to-date information on businesses, landmarks, retail stores, and more. Delivered in bulk to S3 Bucket and cloud storage, our dataset integrates seamlessly into mapping, geographic information systems, and analytics platforms.

    🔥 Key Features:

    Extensive POI Coverage: ✅ 230M+ Points of Interest worldwide, covering 5000 business categories. ✅ Includes retail stores, restaurants, corporate offices, landmarks, and service providers.

    Geographic & Location Intelligence Data: ✅ Latitude & longitude coordinates for mapping and navigation applications. ✅ Geographic classification, including country, state, city, and postal code. ✅ Business status tracking – Open, temporarily closed, or permanently closed.

    Continuous Discovery & Regular Updates: ✅ New POIs continuously added through discovery processes. ✅ Regular updates ensure data accuracy, reflecting new openings and closures.

    Rich Business Insights: ✅ Detailed business attributes, including company name, category, and subcategories. ✅ Contact details, including phone number and website (if available). ✅ Consumer review insights, including rating distribution and total number of reviews (additional feature). ✅ Operating hours where available.

    Ideal for Mapping & Location Analytics: ✅ Supports geospatial analysis & GIS applications. ✅ Enhances mapping & navigation solutions with structured POI data. ✅ Provides location intelligence for site selection & business expansion strategies.

    Bulk Data Delivery (NO API): ✅ Delivered in bulk via S3 Bucket or cloud storage. ✅ Available in structured format (.json) for seamless integration.
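
    As a rough illustration of how a bulk .json delivery like this might be consumed, the sketch below filters records by category and business status. The field names (name, category, latitude, longitude, business_status) are hypothetical placeholders, not Xverum's documented schema, which ships with the delivery.

        # Hypothetical example of filtering a bulk POI export delivered as JSON Lines.
        # Field names are assumed for illustration; check the schema shipped with the data.
        import json

        open_restaurants = []
        with open("poi_export.jsonl", encoding="utf-8") as fh:
            for line in fh:
                poi = json.loads(line)
                if poi.get("category") == "restaurant" and poi.get("business_status") == "open":
                    open_restaurants.append((poi["name"], poi["latitude"], poi["longitude"]))

        print(len(open_restaurants), "open restaurants found")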

    🏆Primary Use Cases:

    Mapping & Geographic Analysis: 🔹 Power GIS platforms & navigation systems with precise POI data. 🔹 Enhance digital maps with accurate business locations & categories.

    Retail Expansion & Market Research: 🔹 Identify key business locations & competitors for market analysis. 🔹 Assess brand presence across different industries & geographies.

    Business Intelligence & Competitive Analysis: 🔹 Benchmark competitor locations & regional business density. 🔹 Analyze market trends through POI growth & closure tracking.

    Smart City & Urban Planning: 🔹 Support public infrastructure projects with accurate POI data. 🔹 Improve accessibility & zoning decisions for government & businesses.

    💡 Why Choose Xverum’s POI Data?

    • 230M+ Verified POI Records – One of the largest & most detailed location datasets available.
    • Global Coverage – POI data from 249+ countries, covering all major business sectors.
    • Regular Updates – Ensuring accurate tracking of business openings & closures.
    • Comprehensive Geographic & Business Data – Coordinates, addresses, categories, and more.
    • Bulk Dataset Delivery – S3 Bucket & cloud storage delivery for full dataset access.
    • 100% Compliant – Ethically sourced, privacy-compliant data.

    Access Xverum’s 230M+ POI dataset for mapping, geographic analysis, and location intelligence. Request a free sample or contact us to customize your dataset today!

  6. Point data

    • ptvlogistics.com
    Cite
    Point data [Dataset]. https://www.ptvlogistics.com/en/products/data/points-of-interest
    Explore at:
    License

    https://www.myptv.com/en/data/points-interest

    Description

    Points of sale (PoS) and other points of interest (POI) are now the most frequently used point data: hardly any map display on the Internet can do without them. PTV GmbH also offers a large number of other point datasets, for example location files, kindergartens and schools, house coordinates, and more.

  7. Point of interest Data India Techsalerator

    • kaggle.com
    zip
    Updated Aug 1, 2023
    Cite
    Techsalerator (2023). Point of interest Data India Techsalerator [Dataset]. https://www.kaggle.com/datasets/techsalerator/point-of-interest-data-india-techsalerator
    Explore at:
    zip(44578 bytes)
    Available download formats
    Dataset updated
    Aug 1, 2023
    Authors
    Techsalerator
    Area covered
    India
    Description

    Techsalerator covers all points of interest of companies and entities in India (business/company data) in its India B2B POI Database.

    For info contact us at info@techsalerator.com or via https://www.techsalerator.com/request-a-quote

    We can select the perfect set based on location, revenue, number of employees, years in business, as well as 40 other fields.

    200 more fields are included in this database, including the following:

    UniqueID UniversalPublicationId CompanyName TradeName DirectoryName Address1 Address2 PostCode City CityCode Province ProvinceCode Region RegionCode Country CountryCode Language PhoneOrMobile Phone DNCMPhone Fax Mobile DNCMMobile Email Website WebDomain WebSocialMedialinksFacebook WebSocialMedialinksTwitter GenericLinlkedInLink WebsiteIpAddress NationalID NationalIdentificationTypeCode NationalIdentificationTypeCodeDescription NationalIDIsVat PrimaryLocalActivityCode LocalActivityTypeCode MarketabilityIndicator YearStarted NumberOfFamilyMembers CEOName CEOTitle CEOFirstName CEOLastName CEOGender CEOLanguage EmployeesHereReliabilityCode EmployeesHereReliabilityCodeDescription EmployeesTotalReliabilityCode EmployeesTotalReliabilityCodeDescription EmployeesHere EmployeesTotal Latitude Longitude LegalStatusCode LegalStatusCodeDescription StatusCode StatusCodeDescription SalesVolume Currency SalesVolumeDollars SalesVolumeEuros SalesVolumeReliabilityCode

  8. Data from: Data Fission: Splitting a Single Data Point

    • tandf.figshare.com
    txt
    Updated Dec 14, 2023
    Cite
    James Leiner; Boyan Duan; Larry Wasserman; Aaditya Ramdas (2023). Data Fission: Splitting a Single Data Point [Dataset]. http://doi.org/10.6084/m9.figshare.24328745.v2
    Explore at:
    txt
    Available download formats
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    Taylor & Francis, https://taylorandfrancis.com/
    Authors
    James Leiner; Boyan Duan; Larry Wasserman; Aaditya Ramdas
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Suppose we observe a random vector X from some distribution in a known family with unknown parameters. We ask the following question: when is it possible to split X into two pieces f(X) and g(X) such that neither part is sufficient to reconstruct X by itself, but both together can recover X fully, and their joint distribution is tractable? One common solution to this problem when multiple samples of X are observed is data splitting, but Rasines and Young offers an alternative approach that uses additive Gaussian noise—this enables post-selection inference in finite samples for Gaussian distributed data and asymptotically when errors are non-Gaussian. In this article, we offer a more general methodology for achieving such a split in finite samples by borrowing ideas from Bayesian inference to yield a (frequentist) solution that can be viewed as a continuous analog of data splitting. We call our method data fission, as an alternative to data splitting, data carving and p-value masking. We exemplify the method on several prototypical applications, such as post-selection inference for trend filtering and other regression problems, and effect size estimation after interactive multiple testing. Supplementary materials for this article are available online.
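
    For the Gaussian case the construction is explicit: with known variance, draw external noise Z ~ N(0, sigma^2) independent of X and set f(X) = X + tau*Z and g(X) = X - Z/tau. The two pieces are then independent, and X = (f(X) + tau^2 * g(X)) / (1 + tau^2) recovers the original observation. The short simulation below, a minimal sketch of that Gaussian recipe, checks both properties empirically.

        # Data fission sketch for the Gaussian case: split X into f(X) = X + tau*Z and
        # g(X) = X - Z/tau using external noise Z ~ N(0, sigma^2), with sigma assumed known.
        import numpy as np

        rng = np.random.default_rng(0)
        mu, sigma, tau, n = 2.0, 1.0, 1.0, 100_000

        X = rng.normal(mu, sigma, n)
        Z = rng.normal(0.0, sigma, n)        # external randomization, independent of X

        f = X + tau * Z                      # piece typically used for selection
        g = X - Z / tau                      # piece typically used for inference

        print("corr(f, g):", np.corrcoef(f, g)[0, 1])                    # approx. 0
        print("max reconstruction error:",
              np.abs((f + tau**2 * g) / (1 + tau**2) - X).max())         # approx. 0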

  9. Summary statistics for the study sample (raw data, not log transformed).

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 27, 2014
    Cite
    Pomeroy, Emma; Stock, Jay T.; Wells, Jonathan C. K.; O'Callaghan, Michael; Cole, Tim J. (2014). Summary statistics for the study sample (raw data, not log transformed). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001202647
    Explore at:
    Dataset updated
    Aug 27, 2014
    Authors
    Pomeroy, Emma; Stock, Jay T.; Wells, Jonathan C. K.; O'Callaghan, Michael; Cole, Tim J.
    Description

    a = 1 missing data point. b = 2 missing data points. c = 3 missing data points. Summary statistics for the study sample (raw data, not log transformed).

  10. Data from: A change-point–based control chart for detecting sparse mean...

    • tandf.figshare.com
    txt
    Updated Jan 17, 2024
    Cite
    Zezhong Wang; Inez Maria Zwetsloot (2024). A change-point–based control chart for detecting sparse mean changes in high-dimensional heteroscedastic data [Dataset]. http://doi.org/10.6084/m9.figshare.24441804.v1
    Explore at:
    txt
    Available download formats
    Dataset updated
    Jan 17, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Zezhong Wang; Inez Maria Zwetsloot
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Because of the “curse of dimensionality,” high-dimensional processes present challenges to traditional multivariate statistical process monitoring (SPM) techniques. In addition, the unknown underlying distribution of and complicated dependency among variables such as heteroscedasticity increase the uncertainty of estimated parameters and decrease the effectiveness of control charts. In addition, the requirement of sufficient reference samples limits the application of traditional charts in high-dimension, low-sample-size scenarios (small n, large p). More difficulties appear when detecting and diagnosing abnormal behaviors caused by a small set of variables (i.e., sparse changes). In this article, we propose two change-point–based control charts to detect sparse shifts in the mean vector of high-dimensional heteroscedastic processes. Our proposed methods can start monitoring when the number of observations is a lot smaller than the dimensionality. The simulation results show that the proposed methods are robust to nonnormality and heteroscedasticity. Two real data examples are used to illustrate the effectiveness of the proposed control charts in high-dimensional applications. The R codes are provided online.

  11. Geolytica POIData.xyz Points of Interest (POI) Geo Data - Australia

    • datarade.ai
    .csv
    Updated Jul 5, 2021
    + more versions
    Cite
    Geolytica (2021). Geolytica POIData.xyz Points of Interest (POI) Geo Data - Australia [Dataset]. https://datarade.ai/data-products/geolytica-poidata-xyz-points-of-interest-poi-geo-data-aus-geolytica
    Explore at:
    .csv
    Available download formats
    Dataset updated
    Jul 5, 2021
    Dataset authored and provided by
    Geolytica
    Area covered
    Australia
    Description

    Point-of-interest (POI) is defined as a physical entity (such as a business) in a geo location (point) which may be (of interest).

    We strive to provide the most accurate, complete and up to date point of interest datasets for all countries of the world. The Australian POI Dataset is one of our worldwide POI datasets with over 98% coverage.

    This is our process flow:

    Our machine learning systems continuously crawl for new POI data
    Our geoparsing and geocoding calculates their geo locations
    Our categorization systems cleanup and standardize the datasets
    Our data pipeline API publishes the datasets on our data store
    

    POI Data is in a constant flux - especially so during times of drastic change such as the Covid-19 pandemic.

    Every minute worldwide on an average day over 200 businesses will move, over 600 new businesses will open their doors and over 400 businesses will cease to exist.

    In today's interconnected world, over 94% of the approximately 200 million POIs worldwide have a public online presence. As a new POI comes into existence, its information appears very quickly in location-based social networks (LBSNs), other social media, pictures, websites, blogs, and press releases. Soon after that, our state-of-the-art POI information retrieval system will pick it up.

    We offer our customers perpetual data licenses for any dataset representing this ever changing information, downloaded at any given point in time. This makes our company's licensing model unique in the current Data as a Service - DaaS Industry. Our customers don't have to delete our data after the expiration of a certain "Term", regardless of whether the data was purchased as a one time snapshot, or via a recurring payment plan on our data update pipeline.

    The main differentiators between us and the competition are our flexible licensing terms and our data freshness.

    The core attribute coverage for Australia is as follows:

    POI field / data coverage (%):
    poi_name: 100
    brand: 13
    poi_tel: 49
    formatted_address: 100
    main_category: 94
    latitude: 100
    longitude: 100
    neighborhood: 3
    source_url: 55
    email: 10
    opening_hours: 41
    building_footprint: 60

    The dataset may be viewed online at https://store.poidata.xyz/au and a data sample may be downloaded at https://store.poidata.xyz/datafiles/au_sample.csv
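
    The linked sample file can be inspected directly with pandas; a minimal sketch follows. It assumes the CSV header uses the field names from the coverage table above (poi_name, main_category, latitude, longitude, and so on), which should be verified against the downloaded file.

        # Load the Australian POI sample and take a quick look at its contents.
        # Column names are assumed to match the coverage table above; verify the header.
        import pandas as pd

        url = "https://store.poidata.xyz/datafiles/au_sample.csv"
        poi = pd.read_csv(url)

        print(poi.shape)                 # rows x columns
        print(poi.columns.tolist())      # compare against the field list above
        if "main_category" in poi.columns:
            print(poi["main_category"].value_counts().head())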

  12. Data from: Point of sales

    • kaggle.com
    zip
    Updated Jan 12, 2024
    Cite
    smmmmmmmmmmmm (2024). Point of sales [Dataset]. https://www.kaggle.com/datasets/smmmmmmmmmmmm/point-of-sales
    Explore at:
    zip(34427 bytes)
    Available download formats
    Dataset updated
    Jan 12, 2024
    Authors
    smmmmmmmmmmmm
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Each record corresponds to a unique transaction identified by Transaction_ID and includes details such as Date, Product_ID, Product_Name, Quantity, Unit_Price, Total_Price, Customer_ID, Payment_Method, and Store_Location. The synthetic data simulates diverse transactions with random product information, quantities, prices, customer IDs, payment methods, and store locations. This dataset provides a foundation for analyzing and understanding patterns within a point-of-sale environment, facilitating research or development in related fields such as retail analytics, inventory management, and customer behavior analysis.
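
    A comparable synthetic table can be produced in a few lines. The sketch below mimics the schema listed above; the value ranges and category lists are invented purely for illustration.

        # Generate a small synthetic point-of-sale table with the fields described above.
        # All value ranges, product names, and categories are made up for illustration.
        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(42)
        n = 200
        product_id = rng.integers(100, 200, n)
        quantity = rng.integers(1, 10, n)
        unit_price = rng.uniform(1.0, 50.0, n).round(2)

        pos = pd.DataFrame({
            "Transaction_ID": np.arange(1, n + 1),
            "Date": pd.to_datetime("2024-01-01") + pd.to_timedelta(rng.integers(0, 30, n), unit="D"),
            "Product_ID": product_id,
            "Product_Name": [f"product_{i}" for i in product_id],
            "Quantity": quantity,
            "Unit_Price": unit_price,
            "Total_Price": (quantity * unit_price).round(2),
            "Customer_ID": rng.integers(1000, 2000, n),
            "Payment_Method": rng.choice(["cash", "card", "wallet"], n),
            "Store_Location": rng.choice(["north", "south", "east", "west"], n),
        })
        print(pos.groupby("Store_Location")["Total_Price"].sum())   # a simple retail-analytics cut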

  13. Gulf of Maine - Control Points Used to Validate the Accuracies of the...

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated May 22, 2025
    + more versions
    Cite
    (Point of Contact, Custodian) (2025). Gulf of Maine - Control Points Used to Validate the Accuracies of the Interpolated Water Density Rasters [Dataset]. https://catalog.data.gov/dataset/gulf-of-maine-control-points-used-to-validate-the-accuracies-of-the-interpolated-water-density-1
    Explore at:
    Dataset updated
    May 22, 2025
    Dataset provided by
    (Point of Contact, Custodian)
    Area covered
    Gulf of Maine
    Description

    This feature dataset contains the control points used to validate the accuracies of the interpolated water density rasters for the Gulf of Maine. These control points were selected randomly from the water density data points, using Hawth's Create Random Selection Tool. Twenty-five percent of each seasonal bin (for each year and at each depth) were randomly selected and set aside for validation. For example, if there were 1,000 water density data points for the fall (September, October, November) of 2003 at 0 meters, then 250 of those points were randomly selected, removed, and set aside to assess the accuracy of the interpolated surface. The naming convention of the validation point feature class includes the year (or years), the season, and the depth (in meters) it was selected from. So, for example, the name ValidationPoints_1997_2004_Fall_0m would indicate that this point feature class was randomly selected from water density points that were at 0 meters in the fall between 1997-2004. The seasons were defined using the same months as the remote sensing data, namely: Fall = September, October, November; Winter = December, January, February; Spring = March, April, May; and Summer = June, July, August.
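
    The hold-out procedure described above (25% of the points within each year/season/depth bin set aside for validation) maps directly onto a grouped random sample. The sketch below reproduces the idea with pandas on a hypothetical table of water density points; the original selection was done with Hawth's tool in a GIS.

        # Randomly set aside 25% of the water density points within each
        # (year, season, depth) bin for validation, as described above.
        # The input file and column names are hypothetical.
        import pandas as pd

        points = pd.read_csv("water_density_points.csv")    # columns: year, season, depth_m, density, ...

        validation = (points
                      .groupby(["year", "season", "depth_m"], group_keys=False)
                      .sample(frac=0.25, random_state=1))
        training = points.drop(validation.index)            # used to build the interpolated surface

        print(len(points), len(training), len(validation))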

  14. Point of Interest (POI) Data | 180 Countries Coverage | CCPA, GDPR Compliant...

    • datarade.ai
    .json, .xls
    Updated Apr 17, 2025
    Cite
    Silencio Network (2025). Point of Interest (POI) Data | 180 Countries Coverage | CCPA, GDPR Compliant | 100% Opted-In Users | 10M + Data Points | 100% Traceable Consent [Dataset]. https://datarade.ai/data-products/point-of-interest-poi-data-236-countries-coverage-ccpa-silencio-network
    Explore at:
    .json, .xls
    Available download formats
    Dataset updated
    Apr 17, 2025
    Dataset provided by
    Quickkonnect UG
    Authors
    Silencio Network
    Area covered
    Hungary, Mauritania, Macedonia (the former Yugoslav Republic of), Rwanda, Honduras, Kosovo, Maldives, Armenia, Bosnia and Herzegovina, Mongolia
    Description

    Silencio’s Business-Type Segmented POI Dataset provides sector-specific footfall insights across industries such as fashion, hospitality, fitness, healthcare, and more. Built on 10M+ POI check-ins from an active base of 1M+ opted-in users, this dataset helps analysts and strategists understand consumer behavior and competitor performance in the physical world.

    Use this dataset to: • Benchmark footfall across different business categories • Conduct competitor analysis and market research • Identify trends in consumer engagement

    Strongest coverage: • Europe • Brazil • India • Nigeria • Philippines • Bangladesh • Pakistan • United States

    Delivered via CSV or S3. AI-powered segmentation is under development.

  15. Example Stata syntax and data construction for negative binomial time series...

    • data.mendeley.com
    Updated Nov 2, 2022
    + more versions
    Cite
    Sarah Price (2022). Example Stata syntax and data construction for negative binomial time series regression [Dataset]. http://doi.org/10.17632/3mj526hgzx.2
    Explore at:
    Dataset updated
    Nov 2, 2022
    Authors
    Sarah Price
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).

    The variables contained therein are defined as follows:

    case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).

    patid: a unique patient identifier.

    time_period: A count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer.

    ncons: number of consultations per month.

    period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.

    burden: binary variable denoting membership of one of two multimorbidity burden groups.

    We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the Maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).

    Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
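
    As an informal cross-check of the panel structure described above (one row per patient per time period, with case/control status, consultation count, the ten period indicators, and the burden group), a rough Python analogue is sketched below. It is illustrative only and is not a translation of dummy_dataset_create.do; in particular, how the period indicators are coded here is a guess.

        # Rough, illustrative analogue of the described panel: one row per patient per month,
        # with case/control flag, a binary burden group, monthly consultation counts, and
        # ten candidate inflection-point indicators (period0..period9).
        import numpy as np
        import pandas as pd

        rng = np.random.default_rng(0)
        n_patients, n_periods = 50, 10

        rows = []
        for patid in range(1, n_patients + 1):
            case = int(patid <= n_patients // 2)       # half cases, half controls
            burden = int(rng.random() < 0.5)           # multimorbidity burden group
            for t in range(n_periods):                 # 0 = 10 months pre-diagnosis, 9 = diagnosis month
                rows.append({"patid": patid, "case": case, "burden": burden, "time_period": t,
                             "ncons": rng.poisson(1 + 0.3 * t * case)})   # cases consult more near diagnosis

        panel = pd.DataFrame(rows)
        for p in range(n_periods):                     # indicator coding is a guess, for illustration only
            panel[f"period{p}"] = (panel["time_period"] >= p).astype(int)

        print(panel.head())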

  16. Point of interest Data Denmark Techsalerator

    • kaggle.com
    zip
    Updated Aug 1, 2023
    Cite
    Techsalerator (2023). Point of interest Data Denmark Techsalerator [Dataset]. https://www.kaggle.com/datasets/techsalerator/point-of-interest-data-denmark-techsalerator
    Explore at:
    zip(47217 bytes)
    Available download formats
    Dataset updated
    Aug 1, 2023
    Authors
    Techsalerator
    Area covered
    Denmark
    Description

    Techsalerator covers all points of interest of companies and entities in Denmark (business/company data) in its Denmark B2B POI Database.

    For info contact us at info@techsalerator.com or via https://www.techsalerator.com/request-a-quote

    We can select the perfect set based on location, revenue, number of employees, years in business, as well as 40 other fields.

    200 more fields are included in this database, including the following:

    UniqueID UniversalPublicationId CompanyName TradeName DirectoryName Address1 Address2 PostCode City CityCode Province ProvinceCode Region RegionCode Country CountryCode Language PhoneOrMobile Phone DNCMPhone Fax Mobile DNCMMobile Email Website WebDomain WebSocialMedialinksFacebook WebSocialMedialinksTwitter GenericLinlkedInLink WebsiteIpAddress NationalID NationalIdentificationTypeCode NationalIdentificationTypeCodeDescription NationalIDIsVat PrimaryLocalActivityCode LocalActivityTypeCode MarketabilityIndicator YearStarted NumberOfFamilyMembers CEOName CEOTitle CEOFirstName CEOLastName CEOGender CEOLanguage EmployeesHereReliabilityCode EmployeesHereReliabilityCodeDescription EmployeesTotalReliabilityCode EmployeesTotalReliabilityCodeDescription EmployeesHere EmployeesTotal Latitude Longitude LegalStatusCode LegalStatusCodeDescription StatusCode StatusCodeDescription SalesVolume Currency SalesVolumeDollars SalesVolumeEuros SalesVolumeReliabilityCode

  17. sample data for "A new statistical method for analyzing point collocations"

    • figshare.com
    txt
    Updated Apr 26, 2024
    + more versions
    Cite
    Anonymous (2024). sample data for "A new statistical method for analyzing point collocations" [Dataset]. http://doi.org/10.6084/m9.figshare.25699152.v1
    Explore at:
    txt
    Available download formats
    Dataset updated
    Apr 26, 2024
    Dataset provided by
    Figshare, http://figshare.com/
    Authors
    Anonymous
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample data for "A new statistical method for analyzing point collocations"

  18. RTEM Hackaton API and Data Science Tutorials

    • kaggle.com
    zip
    Updated Apr 14, 2022
    Cite
    Pony Biam (2022). RTEM Hackaton API and Data Science Tutorials [Dataset]. https://www.kaggle.com/datasets/ponybiam/onboard-api-intro
    Explore at:
    zip(14011904 bytes)
    Available download formats
    Dataset updated
    Apr 14, 2022
    Authors
    Pony Biam
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    RTEM Hackathon Tutorials

    This data set and associated notebooks are meant to give you a head start in accessing the RTEM Hackathon by showing some examples of data extraction, processing, cleaning, and visualisation. The data available on this Kaggle page is only a selected part of the whole data set extracted for the tutorials. A series of video tutorials is associated with this dataset and these notebooks and can be found on the Onboard YouTube channel.

    Part 1 - Onboard API and Onboard API Wrapper Introduction

    An introduction to the API usage and how to retrieve data from it. This notebook is outlined in several YouTube videos that discuss: - how to get started with your account and get oriented to the Kaggle environment, - get acquainted with the Onboard API, - and start using the Onboard API wrapper to extract and explore data.

    Part 2 - Meta-data and Point Exploration Demo

    How to query data points meta-data, process them and visually explore them. This notebook is outlined in several YouTube videos that discuss: - how to get started exploring building metadata/points, - select/merge point lists and export as CSV - and visualize and explore the point lists

    Part 3 - Time-series Data Extraction and Exploration Demo

    How to query time-series from data points, process and visually explore them. This notebook is outlined in several YouTube videos that discuss: - how to load and filter time-series data from sensors - resample and transform time-series data - and create heat maps and boxplots of data for exploration
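
    For orientation, the kind of resampling and visual exploration covered in Part 3 looks roughly like the sketch below. The file and column names are placeholders; the tutorial notebooks pull their data through the Onboard API wrapper instead.

        # Rough illustration of the Part 3 workflow: load a sensor time series,
        # resample it, and draw a simple distribution plot. Placeholder data only.
        import pandas as pd
        import matplotlib.pyplot as plt

        ts = pd.read_csv("zone_temp.csv", parse_dates=["timestamp"], index_col="timestamp")

        hourly = ts["value"].resample("1h").mean()           # downsample raw readings to hourly means
        (hourly.to_frame("value")
               .assign(hour=lambda d: d.index.hour)
               .boxplot(column="value", by="hour"))          # distribution of values by hour of day
        plt.show()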

    Part 4 - Example of starting point for analysis for RTEM and possible directions of analysis

    A quick example of a starting point for analysing the data towards a possible solution, along with a reference to a paper that can help give an overview of the directions your team could take. This notebook is outlined in several YouTube videos that discuss: - overview of use cases and judging criteria - an example of a real-world hypothesis - further development of that simple example

    More information about the data and competition can be found on the RTEM Hackathon website.

  19. BLM OR Water Quality and Quantity Cross Section Sample Publication Point Hub...

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Nov 11, 2025
    + more versions
    Cite
    Bureau of Land Management (2025). BLM OR Water Quality and Quantity Cross Section Sample Publication Point Hub [Dataset]. https://catalog.data.gov/dataset/blm-or-water-quality-and-quantity-cross-section-sample-publication-point-hub-d1d6d
    Explore at:
    Dataset updated
    Nov 11, 2025
    Dataset provided by
    Bureau of Land Management, http://www.blm.gov/
    Description

    CROSS_SECT_SAMPLE_PUB_PT: Cross-sectional surveys capture the shape of the stream channel at a specific location by measuring elevations at intervals across the channel. Cross-sections are used to determine bankfull width, mean bankfull depth, and entrenchment of a channel at a specific point. Cross-sections are usually installed and monitored to track geomorphic change in a stream before and after a physical alteration to the channel; these surveys can detect erosion and deposition of stream sediment as well as changes to the shape (profile) of stream bed and banks. The cross-section table defined in this data standard stores the summary measurements. Raw data can be stored in a spreadsheet or document and related to the record.
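
    As a small numeric illustration of the summary measurements mentioned above, the hedged sketch below derives a bankfull width and mean bankfull depth from station/elevation pairs. The numbers and the simple wetted-extent calculation are made up for illustration; the data standard itself stores only the computed summaries.

        # Illustrative (simplified) calculation of bankfull width and mean bankfull depth
        # from a surveyed cross-section; station = distance across the channel (m),
        # elevation = bed height (m). All values below are invented.
        import numpy as np

        station   = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
        elevation = np.array([2.0, 1.2, 0.6, 0.4, 0.7, 1.3, 2.1])
        bankfull_elev = 1.5                                         # field-identified bankfull stage

        wet = elevation < bankfull_elev                             # points below bankfull stage
        bankfull_width = station[wet].max() - station[wet].min()    # crude width between wetted extremes
        mean_depth = (bankfull_elev - elevation[wet]).mean()        # mean depth over wetted points

        print(f"bankfull width ~ {bankfull_width:.1f} m, mean bankfull depth ~ {mean_depth:.2f} m")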

  20. Integrated DInSAR + GNSS example data sets

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Oct 27, 2024
    Cite
    Corsa, Brianna (2024). Integrated DInSAR + GNSS example data sets [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_13999128
    Explore at:
    Dataset updated
    Oct 27, 2024
    Authors
    Corsa, Brianna
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data repository contains sample datasets of raw DInSAR time series (NSBAS_PARAMS.h5), raw, interpolated GNSS time series maps (GPS_East/North/Up.h5), errors associated with the GNSS data (GPS_East/North/Up_sigma.h5), and integrated DInSAR + GNSS time series (fused.h5). Details about the data are described in the following publication: [Corsa, B. "Integration of DInSAR Time Series and GNSS data for Continuous Volcanic Deformation Monitoring and Eruption Early Warning Applications" Remote Sens. 2022, 14(3), 784; https://doi.org/10.3390/rs14030784]. The raw DInSAR time series spans 245 dates between 2015-11-11 and 2021-04-13 over the Big Island of Hawaii. The current raw GPS data and fused time series used 22 data points between those same dates.
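
    A quick way to inspect the listed HDF5 files is with h5py, as sketched below. The internal group and dataset names are not documented in this listing, so the sketch simply walks each file and prints whatever it contains; the file names follow the naming given above.

        # Walk the listed HDF5 files (e.g. NSBAS_PARAMS.h5, GPS_East.h5, fused.h5) and
        # print their groups and datasets; internal dataset names are not documented here.
        import h5py

        def show(name, obj):
            shape = getattr(obj, "shape", "")    # datasets have a shape, groups do not
            print(" ", name, shape)

        for path in ["NSBAS_PARAMS.h5", "GPS_East.h5", "GPS_East_sigma.h5", "fused.h5"]:
            with h5py.File(path, "r") as f:
                print(path)
                f.visititems(show)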
