44 datasets found

Small Business Contact Data | North American Small Business Owners |...
datarade.ai
Updated Oct 27, 2021
Cite
Success.ai (2021). Small Business Contact Data | North American Small Business Owners | Verified Contact Details from 170M Profiles | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/small-business-contact-data-north-american-small-business-o-success-ai
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Oct 27, 2021
Area covered
Greenland, Guatemala, Saint Pierre and Miquelon, Panama, United States of America, Honduras, Mexico, Belize, Bermuda, Costa Rica
Description
Access B2B Contact Data for North American Small Business Owners with Success.ai—your go-to provider for verified, high-quality business datasets. This dataset is tailored for businesses, agencies, and professionals seeking direct access to decision-makers within the small business ecosystem across North America. With over 170 million professional profiles, it’s an unparalleled resource for powering your marketing, sales, and lead generation efforts.

Key Features of the Dataset:

Verified Contact Details

Includes accurate and up-to-date email addresses and phone numbers to ensure you reach your targets reliably.

AI-validated for 99% accuracy, eliminating errors and reducing wasted efforts.

Detailed Professional Insights

Comprehensive data points include job titles, skills, work experience, and education to enable precise segmentation and targeting.

Enriched with insights into decision-making roles, helping you connect directly with small business owners, CEOs, and other key stakeholders.

Business-Specific Information

Covers essential details such as industry, company size, location, and more, enabling you to tailor your campaigns effectively. Ideal for profiling and understanding the unique needs of small businesses.

Continuously Updated Data

Our dataset is maintained and updated regularly to ensure relevance and accuracy in fast-changing market conditions. New business contacts are added frequently, helping you stay ahead of the competition.

Why Choose Success.ai?

At Success.ai, we understand the critical importance of high-quality data for your business success. Here’s why our dataset stands out:

Tailored for Small Business Engagement: Focused specifically on North American small business owners, this dataset is an invaluable resource for building relationships with SMEs (Small and Medium Enterprises). Whether you’re targeting startups, local businesses, or established small enterprises, our dataset has you covered.

Comprehensive Coverage Across North America: Spanning the United States, Canada, and Mexico, our dataset ensures wide-reaching access to verified small business contacts in the region.

Categories Tailored to Your Needs: Includes highly relevant categories such as Small Business Contact Data, CEO Contact Data, B2B Contact Data, and Email Address Data to match your marketing and sales strategies.

Customizable and Flexible: Choose from a wide range of filtering options to create datasets that meet your exact specifications, including filtering by industry, company size, geographic location, and more.

Best Price Guaranteed: We pride ourselves on offering the most competitive rates without compromising on quality. When you partner with Success.ai, you receive superior data at the best value.

Seamless Integration: Delivered in formats that integrate effortlessly with your CRM, marketing automation, or sales platforms, so you can start acting on the data immediately.

Use Cases: This dataset empowers you to:

Drive Sales Growth: Build and refine your sales pipeline by connecting directly with decision-makers in small businesses.

Optimize Marketing Campaigns: Launch highly targeted email and phone outreach campaigns with verified contact data.

Expand Your Network: Leverage the dataset to build relationships with small business owners and other key figures within the B2B landscape.

Improve Data Accuracy: Enhance your existing databases with verified, enriched contact information, reducing bounce rates and increasing ROI.

Industries Served: Whether you're in B2B SaaS, digital marketing, consulting, or any field requiring accurate and targeted contact data, this dataset serves industries of all kinds. It is especially useful for professionals focused on:

Lead Generation

Business Development

Market Research

Sales Outreach

Customer Acquisition

What’s Included in the Dataset: Each profile provides:

Full Name

Verified Email Address

Phone Number (where available)

Job Title

Company Name

Industry

Company Size

Location

Skills and Professional Experience

Education Background

With over 170 million profiles, you can tap into a wealth of opportunities to expand your reach and grow your business.

Why High-Quality Contact Data Matters: Accurate, verified contact data is the foundation of any successful B2B strategy. Reaching small business owners and decision-makers directly ensures your message lands where it matters most, reducing costs and improving the effectiveness of your campaigns. By choosing Success.ai, you ensure that every contact in your pipeline is a genuine opportunity.

Partner with Success.ai for Better Data, Better Results: Success.ai is committed to delivering premium-quality B2B data solutions at scale. With our small business owner dataset, you can unlock the potential of North America's dynamic small business market.

Get Started Today Request a sample or customize your dataset to fit your unique...
Coresignal | Private Company Data | Company Data | AI-Enriched Datasets |...
datarade.ai
.json, .csv
Cite
Coresignal, Coresignal | Private Company Data | Company Data | AI-Enriched Datasets | Global / 35M+ Records / Updated Weekly [Dataset]. https://datarade.ai/data-products/coresignal-private-company-data-company-data-ai-enriche-coresignal
Explore at:
Available download formats: .json, .csv
Dataset authored and provided by
Coresignal
Area covered
Benin, Jamaica, Grenada, Kyrgyzstan, Pitcairn, Togo, Senegal, Argentina, Bhutan, Kiribati
Description
This Private Company Data dataset is a refined version of our company datasets, consisting of 35M+ data records.

It’s an excellent data solution for companies with limited data engineering capabilities and those who want to reduce their time to value. You get filtered, cleaned, unified, and standardized B2B private company data. This data is also enriched by leveraging a carefully instructed large language model (LLM).

AI-powered data enrichment offers more accurate information in key data fields, such as company descriptions. It also produces over 20 additional data points that are very valuable to B2B businesses. Enhancing and highlighting the most important information in web data contributes to quicker time to value, making data processing much faster and easier.

For your convenience, you can choose from multiple data formats (Parquet, JSON, JSONL, or CSV) and select suitable delivery frequency (quarterly, monthly, or weekly).

Coresignal is a leading private company data provider in the web data sphere with an extensive focus on firmographic data and public employee profiles. More than 3B data records in different categories enable companies to build data-driven products and generate actionable insights. Coresignal is exceptional in terms of data freshness, with 890M+ records updated monthly for unprecedented accuracy and relevance.
AI-Powered Resume Screening Dataset (2025)
kaggle.com
Updated Feb 15, 2025
Cite
Mohammed Talha (2025). AI-Powered Resume Screening Dataset (2025) [Dataset]. https://www.kaggle.com/datasets/mdtalhask/ai-powered-resume-screening-dataset-2025
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 15, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mohammed Talha
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
🔹 Overview: This dataset contains 1,000+ synthetic resumes with key details such as skills, experience, education, job roles, certifications, AI screening scores, and recruiter decisions.

🔹 Features:

Resume_ID: Unique identifier

Name: Candidate's name

Skills: List of relevant technical skills

Experience (Years): Total work experience

Education: Highest qualification

Certifications: Relevant industry certifications

Job Role: Target job position

Recruiter Decision: Hire or Reject

Salary Expectation ($): Expected salary

Projects Count: Number of projects completed

AI Score (0-100): AI-based resume ranking score

🔹 Use Cases:

Resume screening automation

HR analytics & hiring trends

Salary prediction models

AI-powered hiring research

🚀 Use this dataset to build AI models that can predict hiring decisions, analyze job market trends, or optimize HR processes!
A living catalogue of artificial intelligence datasets and benchmarks for...
zenodo.org
bin, tsv
Updated Dec 29, 2021
Cite
Blagec Kathrin; Kraiger Jakob; Samwald Matthias (2021). A living catalogue of artificial intelligence datasets and benchmarks for medical decision making [Dataset]. http://doi.org/10.5281/zenodo.4668570
Explore at:
Available download formats: tsv, bin
Unique identifier
https://doi.org/10.5281/zenodo.4668570
Dataset updated
Dec 29, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Blagec Kathrin; Kraiger Jakob; Samwald Matthias
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
We provide a comprehensive curated catalogue of artificial intelligence datasets and benchmarks for medical decision making. At the time of first release (April 2021), the dataset contains more than 400 biomedical and clinical datasets of which 252 are publicly available or available upon request.

The dataset was compiled based on a systematic literature review covering both biomedical and computer science literature and grey literature data sources. All datasets were manually systematized and annotated for meta-information, such as:

Availability and licensing information

Type of source data

Links to source publications, main references or dataset repositories

Benchmark datasets were additionally annotated for the following information:

Associated task

Performance metrics commonly used for evaluation

Clinical relevance

The availability of data splits

In addition to the versioned TSV file on Zenodo, the dataset can also be explored live via this Google Spreadsheet. The dataset is intended as a living, extendable resource. Edit suggestions and additions are encouraged and can be submitted via the comment function of the Google sheet.

File descriptions

annotated-datasets.tsv -- contains the annotated datasets

arXiv-literature-export.tsv -- contains the original literature record export from arXiv

pubmed-literature-export.tsv -- contains the original literature record export from PubMed

README.md -- contains a detailed description of all annotation fields
Low-Temperature Geothermal Geospatial Datasets: An Example from Alaska
gdr.openei.org
data.openei.org
+3more
Updated Feb 6, 2023
Cite
Estefanny Davalos Elizondo; Amanda Kolker; Ian Warren (2023). Low-Temperature Geothermal Geospatial Datasets: An Example from Alaska [Dataset]. http://doi.org/10.15121/1997233
Explore at:
Unique identifier
https://doi.org/10.15121/1997233
Dataset updated
Feb 6, 2023
Dataset provided by
Office of Energy Efficiency and Renewable Energy (http://energy.gov/eere)
National Renewable Energy Laboratory
Geothermal Data Repository
Authors
Estefanny Davalos Elizondo; Amanda Kolker; Ian Warren
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Alaska
Description
This project is a component of a broader effort focused on geothermal heating and cooling (GHC) with the aim of illustrating the numerous benefits of incorporating GHC and geothermal heat exchange (GHX) into community energy planning and national decarbonization strategies. To better assist private sector investment, it is currently necessary to define and assess the potential of low-temperature geothermal resources. For shallow GHC/GHX fields, there is no formal compilation of subsurface characteristics shared among industry practitioners that can improve system design and operations. Alaska is specifically noted in this work, because heretofore, it has not received a similar focus in geothermal potential evaluations as the contiguous United States. The methodology consists of leveraging relevant data to generate a baseline geospatial dataset of low-temperature resources (less than 150 degrees C) to compare and analyze information accessible to anyone trying to understand the potential of GHC/GHX and small-scale low-temperature geothermal power in Alaska (e.g., energy modelers, communities, planners, and policymakers). Importantly, this project identifies data related to (1) the evaluation of GHC/GHX in the shallow subsurface, and (2) the evaluation of low-temperature geothermal resource availability. Additionally, data is being compiled to assess repurposing of oil and gas wells to contribute co-produced fluids toward the geothermal direct use and heating and cooling resource potential. In this work we identified new data from three different datasets of isolated geothermal systems in Alaska and bottom-hole temperature data from oil and gas wells that can be leveraged for evaluation of low-temperature geothermal resource potential. The goal of this project is to facilitate future deployment of GHC/GHX analysis and community-led programs and update the low-temperature geothermal resources assessment of Alaska. 
A better understanding of shallow potential for GHX will improve design and operations of highly efficient GHC systems. The deployment and impact that can be achieved for low-temperature geothermal resources will contribute to decarbonization goals and facilitate widespread electrification by shaving and shifting grid loads.

Most of the data uses the WGS84 coordinate system. However, the datasets come from different sources, and each has a metadata file recording its original coordinate system.
Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles &...
datarade.ai
Updated Jan 1, 2022
Cite
Success.ai (2022). Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles & 70M Companies – Best Price and Quality Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-linkedin-full-dataset-enrichment-api-700m-pu-success-ai
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Jan 1, 2022
Area covered
Jordan, Guatemala, Svalbard and Jan Mayen, Equatorial Guinea, United Republic of, Tunisia, Saint Barthélemy, Qatar, Greenland, Nicaragua
Description
Success.ai’s LinkedIn Data Solutions offer unparalleled access to a vast dataset of 700 million public LinkedIn profiles and 70 million LinkedIn company records, making it one of the most comprehensive and reliable LinkedIn datasets available on the market today. Our employee data and LinkedIn data are ideal for businesses looking to streamline recruitment efforts, build highly targeted lead lists, or develop personalized B2B marketing campaigns.

Whether you’re looking for recruiting data, conducting investment research, or seeking to enrich your CRM systems with accurate and up-to-date LinkedIn profile data, Success.ai provides everything you need with pinpoint precision. By tapping into LinkedIn company data, you’ll have access to over 40 critical data points per profile, including education, professional history, and skills.

Key Benefits of Success.ai’s LinkedIn Data: Our LinkedIn data solution offers more than just a dataset. With GDPR-compliant data, AI-enhanced accuracy, and a price match guarantee, Success.ai ensures you receive the highest-quality data at the best price in the market. Our datasets are delivered in Parquet format for easy integration into your systems, and with millions of profiles updated daily, you can trust that you’re always working with fresh, relevant data.

API Integration: Our datasets are easily accessible via API, allowing for seamless integration into your existing systems. This ensures that you can automate data retrieval and update processes, maintaining the flow of fresh, accurate information directly into your applications.

Global Reach and Industry Coverage: Our LinkedIn data covers professionals across all industries and sectors, providing you with detailed insights into businesses around the world. Our geographic coverage spans 259M profiles in the United States, 22M in the United Kingdom, 27M in India, and thousands of profiles in regions such as Europe, Latin America, and Asia Pacific. With LinkedIn company data, you can access profiles of top companies from the United States (6M+), United Kingdom (2M+), and beyond, helping you scale your outreach globally.

Why Choose Success.ai’s LinkedIn Data: Success.ai stands out for its tailored approach and white-glove service, making it easy for businesses to receive exactly the data they need without managing complex data platforms. Our dedicated Success Managers will curate and deliver your dataset based on your specific requirements, so you can focus on what matters most—reaching the right audience. Whether you’re sourcing employee data, LinkedIn profile data, or recruiting data, our service ensures a seamless experience with 99% data accuracy.

Best Price Guarantee: We offer unbeatable pricing on LinkedIn data, and we’ll match any competitor.

Global Scale: Access 700 million LinkedIn profiles and 70 million company records globally.

AI-Verified Accuracy: Enjoy 99% data accuracy through our advanced AI and manual validation processes.

Real-Time Data: Profiles are updated daily, ensuring you always have the most relevant insights.

Tailored Solutions: Get custom-curated LinkedIn data delivered directly, without managing platforms.

Ethically Sourced Data: Compliant with global privacy laws, ensuring responsible data usage.

Comprehensive Profiles: Over 40 data points per profile, including job titles, skills, and company details.

Wide Industry Coverage: Covering sectors from tech to finance across regions like the US, UK, Europe, and Asia.

Key Use Cases:

Sales Prospecting and Lead Generation: Build targeted lead lists using LinkedIn company data and professional profiles, helping sales teams engage decision-makers at high-value accounts.

Recruitment and Talent Sourcing: Use LinkedIn profile data to identify and reach top candidates globally. Our employee data includes work history, skills, and education, providing all the details you need for successful recruitment.

Account-Based Marketing (ABM): Use our LinkedIn company data to tailor marketing campaigns to key accounts, making your outreach efforts more personalized and effective.

Investment Research & Due Diligence: Identify companies with strong growth potential using LinkedIn company data. Access key data points such as funding history, employee count, and company trends to fuel investment decisions.

Competitor Analysis: Stay ahead of your competition by tracking hiring trends, employee movement, and company growth through LinkedIn data. Use these insights to adjust your market strategy and improve your competitive positioning.

CRM Data Enrichment: Enhance your CRM systems with real-time updates from Success.ai’s LinkedIn data, ensuring that your sales and marketing teams are always working with accurate and up-to-date information.

Comprehensive Data Points for LinkedIn Profiles: Our LinkedIn profile data includes over 40 key data points for every individual and company, ensuring a complete understandin...
Company Financial Data | Private & Public Companies | Verified Profiles &...
datarade.ai
Cite
Success.ai, Company Financial Data | Private & Public Companies | Verified Profiles & Contact Data | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/b2b-contact-data-premium-us-contact-data-us-b2b-contact-d-success-ai
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Area covered
Suriname, Montserrat, Togo, Antigua and Barbuda, Guam, Iceland, Georgia, Korea (Democratic People's Republic of), United Kingdom, Dominican Republic
Description
Success.ai offers a cutting-edge solution for businesses and organizations seeking Company Financial Data on private and public companies. Our comprehensive database is meticulously crafted to provide verified profiles, including contact details for financial decision-makers such as CFOs, financial analysts, corporate treasurers, and other key stakeholders. This robust dataset is continuously updated and validated using AI technology to ensure accuracy and relevance, empowering businesses to make informed decisions and optimize their financial strategies.

Key Features of Success.ai's Company Financial Data:

Global Coverage: Access data from over 70 million businesses worldwide, including public and private companies across all major industries and regions. Our datasets span 250+ countries, offering extensive reach for your financial analysis and market research.

Detailed Financial Profiles: Gain insights into company financials, including revenue, profit margins, funding rounds, and operational costs. Profiles are enriched with key contact details, including work emails, phone numbers, and physical addresses, ensuring direct access to decision-makers.

Industry-Specific Data: Tailored datasets for sectors such as financial services, manufacturing, technology, healthcare, and energy, among others. Each dataset is customized to meet the unique needs of industry professionals and analysts.

Real-Time Accuracy: With continuous updates powered by AI-driven validation, our financial data maintains a 99% accuracy rate, ensuring you have access to the most reliable and up-to-date information available.

Compliance and Security: All data is collected and processed in strict adherence to global compliance standards, including GDPR, ensuring ethical and lawful usage.

Why Choose Success.ai for Company Financial Data?

Best Price Guarantee: We pride ourselves on offering the most competitive pricing in the industry, ensuring you receive unparalleled value for comprehensive financial data.

AI-Validated Accuracy: Our advanced AI algorithms meticulously verify every data point to ensure precision and reliability, helping you avoid costly errors in your financial decision-making.

Customized Data Solutions: Whether you need data for a specific region, industry, or type of business, we tailor our datasets to align perfectly with your requirements.

Scalable Data Access: From small startups to global enterprises, our platform caters to businesses of all sizes, delivering scalable solutions to suit your operational needs.

Comprehensive Use Cases for Financial Data:

Strategic Financial Planning:

Leverage our detailed financial profiles to create accurate budgets, forecasts, and strategic plans. Gain insights into competitors’ financial health and market positions to make data-driven decisions.

Mergers and Acquisitions (M&A):

Access key financial details and contact information to streamline your M&A processes. Identify potential acquisition targets or partners with verified profiles and financial data.

Investment Analysis:

Evaluate the financial performance of public and private companies for informed investment decisions. Use our data to identify growth opportunities and assess risk factors.

Lead Generation and Sales:

Enhance your sales outreach by targeting CFOs, financial analysts, and other decision-makers with verified contact details. Utilize accurate email and phone data to increase conversion rates.

Market Research:

Understand market trends and financial benchmarks with our industry-specific datasets. Use the data for competitive analysis, benchmarking, and identifying market gaps.

APIs to Power Your Financial Strategies:

Enrichment API: Integrate real-time updates into your systems with our Enrichment API. Keep your financial data accurate and current to drive dynamic decision-making and maintain a competitive edge.

Lead Generation API: Supercharge your lead generation efforts with access to verified contact details for key financial decision-makers. Perfect for personalized outreach and targeted campaigns.

Tailored Solutions for Industry Professionals:

Financial Services Firms: Gain detailed insights into revenue streams, funding rounds, and operational costs for competitor analysis and client acquisition.

Corporate Finance Teams: Enhance decision-making with precise data on industry trends and benchmarks.

Consulting Firms: Deliver informed recommendations to clients with access to detailed financial datasets and key stakeholder profiles.

Investment Firms: Identify potential investment opportunities with verified data on financial performance and market positioning.

What Sets Success.ai Apart?

Extensive Database: Access detailed financial data for 70M+ companies worldwide, including small businesses, startups, and large corporations.

Ethical Practices: Our data collection and processing methods are fully comp...
2010 Monarch Relevant Land Cover Data Set for Canada
s.cnmilf.com
data.usgs.gov
+2more
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). 2010 Monarch Relevant Land Cover Data Set for Canada [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/2010-monarch-relevant-land-cover-data-set-for-canada
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Canada
Description
The USGS Upper Midwest Environmental Sciences Center developed a Monarch Relevant Land Cover data set covering the area of Canada. We used the 2010 land cover data set produced by the tri-national North American Land Change Monitoring System (NALCMS) and supported by the Commission for Environmental Cooperation (CEC) that depicts year 2010 land cover across North America at 30-meter spatial resolution, and incorporated additional spatially-explicit information to develop this land cover map. Additional sources of information included 2004 railroad data provided by The Atlas of Canada and the CEC, 2017 roads data provided by Statistics Canada, 2017 protected areas data provided by the CEC, and 2016 Canada provincial/territory boundary file data provided by Statistics Canada.
Data Set for A Call for an Aloft Air Quality Monitoring Network: Need and...
gimi9.com
datasets.ai
+2more
Cite
Data Set for A Call for an Aloft Air Quality Monitoring Network: Need and Feasibility [Dataset]. https://gimi9.com/dataset/data-gov_data-set-for-a-call-for-an-aloft-air-quality-monitoring-network-need-and-feasibility
Explore at:
Description
This data set contains all relevant data used in the creation of the 4 illustrations in the manuscript. In all cases the data have been processed (averaged/aggregated over space and/or time) from the original data which was at finer spatial or temporal resolution. The observational data sets are publicly available from the CASTNET site. Raw model outputs can be made available by contacting the corresponding author. This dataset is associated with the following publication: Mathur, R., C. Hogrefe, A. Hakami, S. Zhao, J. Szykman, and G. Hagler. A Call for an Aloft Air Quality Monitoring Network: Need, Feasibility, and Potential Value. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 52(19): 10903–10908, (2018).
Meta Kaggle Code
kaggle.com
zip
Updated Jun 5, 2025
Cite
Kaggle (2025). Meta Kaggle Code [Dataset]. https://www.kaggle.com/datasets/kaggle/meta-kaggle-code/code
Explore at:
Available download formats: zip (143722388562 bytes)
Dataset updated
Jun 5, 2025
Dataset authored and provided by
Kaggle (http://kaggle.com/)
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Explore our public notebook content!

Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.

Why we’re releasing this dataset

By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.

Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.

The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!

Sensitive data

While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.

Joining with Meta Kaggle

The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
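As a rough illustration of the join described above, the sketch below matches code files to KernelVersions rows by treating each file's base name as the version id. The column names `Id` and `TotalVotes`, the sample rows, and the file extensions are assumptions made for this example, not details confirmed by the dataset description.

```python
import csv
import io
from pathlib import PurePosixPath


def index_versions(csv_text, id_column="Id"):
    """Index KernelVersions rows by integer id (the column name is an assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row[id_column]): row for row in reader}


def match_files(paths, versions):
    """Pair each code file with the KernelVersions row whose id equals its file stem."""
    matched = {}
    for path in paths:
        stem = PurePosixPath(path).stem  # "123456789.ipynb" -> "123456789"
        if stem.isdigit() and int(stem) in versions:
            matched[path] = versions[int(stem)]
    return matched


# Hypothetical sample data standing in for the real KernelVersions csv.
sample_csv = "Id,TotalVotes\n123456789,4\n123456790,0\n"
versions = index_versions(sample_csv)

# Hypothetical file paths; the second has no matching row and is skipped.
files = ["123/456/123456789.ipynb", "123/456/123456791.py"]
matched = match_files(files, versions)
```

Files without a matching row are simply left out, which mirrors the statement that the code files are a subset of KernelVersions rather than a one-to-one copy.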

File organization

The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files; for example, folder 123 contains all versions from 123,000,000 to 123,999,999. Each subfolder contains up to 1 thousand files; for example, 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have far fewer than 1 thousand files because private and interactive sessions are excluded.
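The folder layout can be expressed as simple integer arithmetic on the version id. The following is a minimal sketch, assuming folder names are the plain integers shown in the examples; whether subfolder names are zero-padded (for instance `004` versus `4`) is not stated, so padding is left as a hypothetical option, and `version_folder` is an illustrative helper rather than part of any Kaggle tooling.

```python
from pathlib import PurePosixPath


def version_folder(version_id: int, pad_sub: bool = False) -> PurePosixPath:
    """Return the two-level folder expected to hold a given kernel version id."""
    if version_id < 0:
        raise ValueError("version_id must be non-negative")
    top = version_id // 1_000_000        # millions bucket, e.g. 123
    sub = (version_id // 1_000) % 1_000  # thousands bucket within it, e.g. 456
    # Zero-padding of the subfolder name is an assumption, off by default.
    sub_name = f"{sub:03d}" if pad_sub else str(sub)
    return PurePosixPath(str(top)) / sub_name


folder = version_folder(123_456_789)  # PurePosixPath("123/456")
```

Every id from 123,456,000 to 123,456,999 maps to the same folder, which matches the ranges given in the description.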

The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays

Questions / Comments

We love feedback! Let us know in the Discussion tab.

Happy Kaggling!
Lending Club Loan Data Analysis - Deep Learning
kaggle.com
Updated Aug 9, 2023
Cite
Deependra Verma (2023). Lending Club Loan Data Analysis - Deep Learning [Dataset]. https://www.kaggle.com/datasets/deependraverma13/lending-club-loan-data-analysis-deep-learning
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 9, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Deependra Verma
Description
DESCRIPTION

Create a model that predicts whether or not a loan will default, using the historical data.

Problem Statement:

For companies like Lending Club, correctly predicting whether or not a loan will default is very important. In this project, using the historical data from 2007 to 2015, you have to build a deep learning model to predict the chance of default for future loans. As you will see later, this dataset is highly imbalanced and includes a lot of features, which makes this problem more challenging.

Domain: Finance

Analysis to be done: Perform data preprocessing and build a deep learning prediction model.

Content:

Dataset columns and definition:

credit.policy: 1 if the customer meets the credit underwriting criteria of LendingClub.com, and 0 otherwise.

purpose: The purpose of the loan (takes values "credit_card", "debt_consolidation", "educational", "major_purchase", "small_business", and "all_other").

int.rate: The interest rate of the loan, as a proportion (a rate of 11% would be stored as 0.11). Borrowers judged by LendingClub.com to be more risky are assigned higher interest rates.

installment: The monthly installments owed by the borrower if the loan is funded.

log.annual.inc: The natural log of the self-reported annual income of the borrower.

dti: The debt-to-income ratio of the borrower (amount of debt divided by annual income).

fico: The FICO credit score of the borrower.

days.with.cr.line: The number of days the borrower has had a credit line.

revol.bal: The borrower's revolving balance (amount unpaid at the end of the credit card billing cycle).

revol.util: The borrower's revolving line utilization rate (the amount of the credit line used relative to total credit available).

inq.last.6mths: The borrower's number of inquiries by creditors in the last 6 months.

delinq.2yrs: The number of times the borrower had been 30+ days past due on a payment in the past 2 years.

pub.rec: The borrower's number of derogatory public records (bankruptcy filings, tax liens, or judgments).

Steps to perform:

Perform exploratory data analysis and feature engineering. Follow up with a deep learning model to predict whether or not the loan will default, using the historical data.

Tasks:

Feature Transformation

Transform categorical values into numerical values (discrete)

Exploratory data analysis of different factors of the dataset.

Additional Feature Engineering

You will check the correlation between features and drop those that are strongly correlated with one another

This will help reduce the number of features and will leave you with the most relevant features

Modeling

After applying EDA and feature engineering, you are now ready to build the predictive models

In this part, you will create a deep learning model using Keras with a TensorFlow backend
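The feature transformation and correlation-based pruning tasks above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's prescribed method: one-hot encoding is one of several ways to make `purpose` numeric, the 0.9 threshold is arbitrary, the helper names are hypothetical, and the toy column values are made up.

```python
from math import sqrt

# Category values as listed in the column definitions for "purpose".
PURPOSES = ["credit_card", "debt_consolidation", "educational",
            "major_purchase", "small_business", "all_other"]


def one_hot(purpose: str) -> list:
    """Encode a purpose string as a 0/1 indicator vector."""
    return [1 if purpose == p else 0 for p in PURPOSES]


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def drop_correlated(columns: dict, threshold: float = 0.9) -> list:
    """Keep a column only if it is not strongly correlated with one already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up toy values; installment is constructed to track int.rate exactly.
toy = {
    "int.rate": [0.11, 0.13, 0.09, 0.15],
    "installment": [220.0, 260.0, 180.0, 300.0],
    "fico": [720, 690, 700, 705],
}
kept = drop_correlated(toy)
```

Here `installment` is dropped because it is almost perfectly correlated with `int.rate` in the toy data, while `fico` is kept. Which member of a correlated pair to drop is a design choice; this sketch simply keeps whichever column appears first.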
Audio Visual Speech Dataset: American English
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Audio Visual Speech Dataset: American English [Dataset]. https://www.futurebeeai.com/dataset/multi-modal-dataset/american-english-visual-speech-dataset
Explore at:
wav (available download formats)
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
Area covered
United States
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the US English Language Visual Speech Dataset! This dataset is a collection of diverse, single-person unscripted spoken videos supporting research in visual speech recognition, emotion detection, and multimodal communication.
Dataset Content
This visual speech dataset contains 1,000 videos in US English, each paired with a corresponding high-fidelity audio track. In each video, a participant answers a specific question in an unscripted, spontaneous manner.
Video Data
Extensive guidelines are followed while recording each video to maintain quality and diversity.
Metadata
The dataset provides comprehensive metadata for each video recording and participant:
This metadata is a powerful tool for understanding and characterising the data, enabling informed decision-making in the development of US English language visual speech models.
Usage and Applications
The US English Language Visual Speech Dataset serves various applications across different domains:
Secure and Ethical Collection
Updates and Customization
We understand the importance of evolving datasets to meet diverse research needs. Therefore, our dataset is regularly updated with new videos in various real-world conditions.
Licence
This US English Language Visual Speech Dataset, created by FutureBeeAI, is available for commercial use.
AI-Generated Prompts Dataset
kaggle.com
Updated Feb 2, 2024
Cite
AnthonyTherrien (2024). AI-Generated Prompts Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/7535708
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.34740/kaggle/dsv/7535708
Dataset updated
Feb 2, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
AnthonyTherrien
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Overview

This dataset contains a collection of prompts generated by the model teknium/OpenHermes-2p5-Mistral-7B. Each line in the dataset represents a unique prompt, crafted to stimulate creative and insightful responses.

Dataset Content

Number of Prompts:

Model Used: teknium/OpenHermes-2p5-Mistral-7B

Format: Each line in the file is a JSON object with a single key-value pair. The key is response, and the value is the generated prompt.
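Given the stated format, the prompts can be read line by line. This is a minimal sketch: the function name and the sample lines are made up for illustration, and only the `response` key comes from the description.

```python
import json


def read_prompts(lines):
    """Collect the 'response' value from each non-empty JSON line."""
    prompts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        prompts.append(json.loads(line)["response"])
    return prompts


# Made-up sample lines in the described one-object-per-line format.
sample = [
    '{"response": "Describe a city built on clouds."}',
    "",
    '{"response": "Explain tides to a child."}',
]
prompts = read_prompts(sample)
```

The same function accepts an open file handle, since iterating over a text file yields its lines.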

Potential Applications

This dataset can be used for a variety of applications, including but not limited to:

Training and fine-tuning language models.

Analyzing trends in AI-generated content.

Generating creative writing prompts for educational or entertainment purposes.

Model Information

The dataset was generated using the teknium/OpenHermes-2p5-Mistral-7B model. This model is known for its ability to generate high-quality, contextually relevant text based on given prompts. It has been widely used in natural language processing tasks such as text completion, summarization, and question answering.

Usage Guidelines

This dataset is made available for academic and research purposes. Users are encouraged to abide by the terms of use and licensing agreements of the source model and data.

Acknowledgements

We would like to acknowledge the creators of the teknium/OpenHermes-2p5-Mistral-7B model for providing the tools necessary to generate this dataset.
Data from: Data on Crime, Supervision, and Economic Change in the Greater...
datasets.ai
icpsr.umich.edu
Cite
Department of Justice, Data on Crime, Supervision, and Economic Change in the Greater Washington, DC Area, 2000 - 2014 [Dataset]. https://datasets.ai/datasets/data-on-crime-supervision-and-economic-change-in-the-greater-washington-dc-area-2000-2014-f67d6
Explore at:
Dataset authored and provided by
Department of Justice
Area covered
Washington Metropolitan Area
Description
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. The study includes data collected with the purpose of creating an integrated dataset that would allow researchers to address significant, policy-relevant gaps in the literature--those that are best answered with cross-jurisdictional data representing a wide array of economic and social factors. The research addressed five research questions:

What is the impact of gentrification and suburban diversification on crime within and across jurisdictional boundaries?

How does crime cluster along and around transportation networks and hubs in relation to other characteristics of the social and physical environment?

What is the distribution of criminal justice-supervised populations in relation to services they must access to fulfill their conditions of supervision?

What are the relationships among offenders, victims, and crimes across jurisdictional boundaries?

What is the increased predictive power of simulation models that employ cross-jurisdictional data?
Spatial data set of mapped water-level changes in the High Plains aquifer,...
data.wu.ac.at
datadiscoverystudio.org
zip
Updated Jun 8, 2018
Cite
Department of the Interior (2018). Spatial data set of mapped water-level changes in the High Plains aquifer, 2013 to 2015 [Dataset]. https://data.wu.ac.at/schema/data_gov/MTdhNzZiZWUtYTk1Ni00M2JiLThiNzUtMDM0NjBmNjE5OWZl
Explore at:
zip (available download formats)
Dataset updated
Jun 8, 2018
Dataset provided by
Department of the Interior
Description
The High Plains aquifer extends from south of about 32 degrees to almost 44 degrees north latitude and from about 96 degrees 30 minutes to 106 degrees west longitude. The aquifer underlies about 175,000 square miles in parts of Colorado, Kansas, Nebraska, New Mexico, Oklahoma, South Dakota, Texas, and Wyoming. This dataset consists of a raster of water-level changes for the High Plains aquifer, 2013 to 2015. This digital dataset was created using water-level measurements from 7,529 wells measured in both 2013 and 2015. The map was reviewed for consistency with the relevant data at a scale of 1:1,000,000.
Beauty & Cosmetics Data | Cosmetics, Beauty & Wellness Professionals...
datarade.ai
Updated Jan 1, 2018
Cite
Success.ai (2018). Beauty & Cosmetics Data | Cosmetics, Beauty & Wellness Professionals Worldwide | Verified Global Profiles from 700M+ Dataset | Best Price Guarantee [Dataset]. https://datarade.ai/data-products/beauty-cosmetics-data-cosmetics-beauty-wellness-profes-success-ai
Explore at:
.bin, .json, .xml, .csv, .xls, .sql, .txt (available download formats)
Dataset updated
Jan 1, 2018
Dataset provided by
Area covered
Slovenia, Estonia, Saint Vincent and the Grenadines, Pitcairn, Bahamas, Tunisia, Kosovo, Vanuatu, Angola, Kazakhstan
Description
Success.ai’s Beauty & Cosmetics Data for Cosmetics, Beauty & Wellness Professionals Worldwide delivers a powerful dataset tailored to connect businesses with key stakeholders in the global beauty and wellness industries. Covering professionals such as product developers, brand managers, wellness coaches, and salon owners, this dataset provides verified work emails, phone numbers, and actionable professional insights.

With access to over 700 million verified global profiles and detailed insights from 170 million professional datasets, Success.ai ensures your outreach, marketing, and strategic initiatives are powered by accurate, continuously updated, and AI-validated data. Supported by our Best Price Guarantee, this solution is ideal for businesses aiming to lead in the competitive beauty and wellness market.

Why Choose Success.ai’s Beauty & Cosmetics Data?

Verified Contact Data for Effective Outreach

Access verified work emails, phone numbers, and LinkedIn profiles of professionals in cosmetics, skincare, beauty services, and wellness industries.

AI-driven validation ensures 99% accuracy, reducing bounce rates and improving communication efficiency.

Comprehensive Global Coverage

Includes profiles of beauty and wellness professionals from regions such as North America, Europe, Asia-Pacific, and emerging markets.

Gain insights into global trends in cosmetics innovation, wellness services, and beauty product demand.

Continuously Updated Datasets

Real-time updates reflect changes in leadership, professional roles, and market developments.

Stay aligned with the fast-paced nature of the beauty and wellness industry to identify opportunities and maintain relevance.

Ethical and Compliant

Fully adheres to GDPR, CCPA, and other global privacy regulations, ensuring responsible and lawful use of data for all business initiatives.

Data Highlights:

700M+ Verified Global Profiles: Connect with professionals across the beauty, cosmetics, and wellness industries worldwide.

170M+ Professional Datasets: Access verified contact information and detailed insights into industry leaders and innovators.

Business Insights: Understand market trends, product innovations, and consumer preferences driving the beauty industry.

Decision-Maker Contacts: Engage with CEOs, brand managers, product developers, and wellness leaders driving growth and innovation.

Key Features of the Dataset:

Comprehensive Professional Profiles

Identify and connect with key players, including beauty brand executives, salon owners, skincare experts, and wellness influencers.

Access data on career histories, certifications, and industry expertise to target the right professionals effectively.

Advanced Filters for Precision Targeting

Filter professionals by industry focus (cosmetics, wellness, skincare), geographic location, or job function.

Tailor campaigns to align with specific market segments, such as luxury cosmetics, wellness services, or mass-market beauty products.

Global Trend Insights and Market Data

Leverage data on emerging beauty trends, wellness innovations, and skincare demands across regions.

Refine product development, marketing campaigns, and customer engagement strategies based on actionable insights.

AI-Driven Enrichment

Profiles enriched with actionable data allow for personalized messaging, highlight unique value propositions, and improve engagement outcomes with beauty and wellness professionals.

Strategic Use Cases:

Marketing and Brand Outreach

Design targeted campaigns to promote beauty products, wellness services, or skincare innovations to industry professionals.

Leverage verified contact data for multi-channel outreach, including email, social media, and direct engagement.

Product Development and Innovation

Utilize market insights to guide product development and align offerings with consumer demands in cosmetics, beauty, and wellness sectors.

Collaborate with product developers and brand managers to refine product lines or launch new offerings.

Sales and Partnership Development

Build relationships with wellness professionals, salon owners, and beauty distributors seeking innovative tools or products.

Present co-branding opportunities, supply chain partnerships, or new market expansion strategies to key decision-makers.

Market Research and Competitive Analysis

Analyze beauty and wellness trends, consumer preferences, and emerging niches to refine business strategies.

Benchmark against competitors to identify gaps, growth opportunities, and high-demand product categories.

Why Choose Success.ai?

Best Price Guarantee

Access premium-quality beauty and wellness data at competitive prices, ensuring strong ROI for your marketing, sales, and produc...
National Eelgrass Dataset For Canada (NETForce)
open.canada.ca
datasets.ai
csv, esri rest +3
Updated Feb 11, 2025
Fisheries and Oceans Canada (2025). National Eelgrass Dataset For Canada (NETForce) [Dataset]. https://open.canada.ca/data/dataset/a733fb88-ddaf-47f8-95bb-e107630e8e62
Explore at:
html, fgdb/gdb, pdf, esri rest, csv (available download formats)
Dataset updated
Feb 11, 2025
Dataset provided by
Fisheries and Oceans Canada (http://www.dfo-mpo.gc.ca/)
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Area covered
Canada
Description
This collection of eelgrass data has been collated to produce a national map of the location and distribution of eelgrass beds across Canada. The data providers collaborating in this initiative include Federal, Provincial and Municipal government departments and agencies, academia, non-governmental organizations, community groups, private sector, Indigenous groups and independent science organizations. The National Eelgrass Task Force (NETForce) is a collaborative, diverse and inclusive partnership of scientists, managers, and stakeholders working towards a concrete vision: to create a national map of eelgrass distribution in Canada that is publicly accessible, dynamic, and useful for monitoring and collective decision-making. The eelgrass data were collected using various mapping techniques including species distribution models, benthic sonar, field measurements of habitat presence or absence, video transects, aerial photography, field validation, literature review, satellite imagery, LiDAR, airborne spectrographic imaging, and Unoccupied Aerial Vehicle (UAV). The partners provided metadata relevant to their own projects, and the field names were made consistent for the compiled dataset. We also created additional fields that differentiate the datasets; these include data provider, institution code, water body, mapping techniques, province, biogeographic region, eelgrass observation... Other fields are included depending on the original metadata provided by the data provider (e.g. eelgrass percentage cover, eelgrass density, map reference, image classification technique). The data span from 1987 to present, with some eelgrass beds being surveyed only once while others were sampled across several years. Uncertainty information associated with a dataset is included in the metadata when available. This map is intended to be evergreen and more eelgrass data will be added when available.
This compiled dataset has been collected by many organizations for different purposes, using different survey techniques and different methodologies and, therefore, considerable care must be taken when using these data. For further information concerning specific datasets contact the data provider/institution and/or see the associated technical report (if available) included in the Report folder under the ‘Data and Resources’ section. This group of eelgrass data has been divided using the geographic boundaries of the Federal Marine Bioregions (https://open.canada.ca/data/en/dataset/23eb8b56-dac8-4efc-be7c-b8fa11ba62e9). The title of each geodatabase (FGDB/GDB) contains the name of the bioregion. The Data Dictionary guide provides the fields description (English and French) from each layer included in the geodatabases.
For additional information please see:
Gomez C., Guijarro-Sabaniel J., Wong M. 2021. National Eelgrass Task (NET) Force: engagement in support of a dynamic map of eelgrass distribution in Canada to support monitoring, research and decision making. Can. Tech. Rep. Aquat. Sci. 3437: vi + 48 p. https://waves-vagues.dfo-mpo.gc.ca/library-bibliotheque/4098218x.pdf
Guijarro-Sabaniel, J., Thomson, J. A., Vercaemer, B. and Wong, M. C. 2024. National Eelgrass Task Force (NETForce): Building a dynamic, open eelgrass map for Canada. Can. Tech. Rep. Fish. Aquat. Sci. 3583: v + 31 p. https://waves-vagues.dfo-mpo.gc.ca/library-bibliotheque/41223147.pdf
QSARs for Plasma Protein Binding: Source Data and Predictions
catalog.data.gov
datasets.ai
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). QSARs for Plasma Protein Binding: Source Data and Predictions [Dataset]. https://catalog.data.gov/dataset/qsars-for-plasma-protein-binding-source-data-and-predictions
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
The dataset has all of the information used to create and evaluate 3 independent QSAR models for the fraction of a chemical unbound by plasma protein (Fub) for environmentally relevant chemicals. In vitro plasma protein values for 1245 pharmaceuticals and 406 ToxCast chemicals were collected from the literature (Obach 2008, Zhu 2013, Wetmore 2012, Wetmore 2015). The 21 descriptors calculated by MOE that were used in the models are included, as is an acid/base/neutral/zwitterions classification based on ionization percentages calculated in ADMET Predictor. Finally, the dataset includes the in silico Fub predictions for each chemical from the constructed k-nearest neighbor, support vector machine, and random forest QSAR models, as well as a consensus (average) prediction. This dataset is associated with the following publication: Ingle, B., R. Tornero-Velez, J. Nichols, and B. Veber. Informing the Human Plasma Protein Binding of Environmental Chemicals by Machine Learning in the Pharmaceutical Space: Applicability Domain and Limits of Predictability. Journal of Chemical Information and Modeling. American Chemical Society, Washington, DC, USA, 56(11): 2243-2252, (2016).
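The consensus value mentioned in the description is an average of the three model predictions. As a minimal sketch, assuming an unweighted arithmetic mean (the description says only "average") and using a hypothetical function name and made-up inputs:

```python
from statistics import mean


def consensus_fub(knn: float, svm: float, rf: float) -> float:
    """Average the Fub predictions of the three QSAR models."""
    # Assumption: unweighted arithmetic mean of the three predictions.
    return mean([knn, svm, rf])


# Made-up predictions for one chemical, as fractions unbound.
example = consensus_fub(0.25, 0.5, 0.75)
```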
Our415 Events and Activities
catalog.data.gov
data.sfgov.org
Updated May 24, 2025
Cite
data.sfgov.org (2025). Our415 Events and Activities [Dataset]. https://catalog.data.gov/dataset/our415-events-and-activities
Explore at:
Dataset updated
May 24, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY: San Francisco offers numerous events and activities tailored for children, youth, and families. However, finding and navigating the disparate sources of information can be a major challenge. Our415.org seeks to simplify this by consolidating all relevant details, ensuring that families can easily find what they need, when they need it. It also encourages discovery of new interests and things to do. This dataset compiles current and upcoming events and activities in San Francisco for children, youth, and their families.
B. HOW THE DATASET IS CREATED: This dataset is a consolidation of multiple datasets from contributing City agencies and departments as well as Community Based Organizations. Currently, the information in the dataset is sourced from Rec Park’s activities catalog, SF Public Library’s events calendar, Department of Early Childhood’s family events calendar, and Support for Families' family events calendar. Rec Park activities include any “Open” activities appropriate for ages 0-24, and SF Public Library, Department of Early Childhood, and Support for Families events include events going into the next month.
C. UPDATE PROCESS: The dataset will be updated on a daily basis, reflecting changes to the source data.
D. HOW TO USE THIS DATASET: Taxonomy related fields and eligibility fields are either AI-determined or assigned through a DCYF-created crosswalk. These values are determined for the purposes of categorization and search functionality on Our415.org. Use with caution - errors may exist.
750K+ Car Images | AI Training Data | Object Detection Data | Annotated...
datarade.ai
Updated Nov 2, 2018
Cite
Data Seeds (2018). 750K+ Car Images | AI Training Data | Object Detection Data | Annotated imagery data | Global Coverage [Dataset]. https://datarade.ai/data-products/750k-car-images-ai-training-data-object-detection-data-data-seeds
Explore at:
.bin, .json, .xml, .csv, .xls, .sql, .txt (available download formats)
Dataset updated
Nov 2, 2018
Dataset authored and provided by
Data Seeds
Area covered
Indonesia, Åland Islands, Libya, Zambia, Bonaire, Poland, Tajikistan, Azerbaijan, Pitcairn, Palau
Description
This dataset features over 750,000 high-quality images of cars sourced from photographers worldwide. Designed to support AI and machine learning applications, it provides a diverse and richly annotated collection of car imagery.

Key Features: 1. Comprehensive Metadata: the dataset includes full EXIF data, detailing camera settings such as aperture, ISO, shutter speed, and focal length. Additionally, each image is pre-annotated with object and scene detection metadata, making it ideal for tasks like classification, detection, and segmentation. Popularity metrics, derived from engagement on our proprietary platform, are also included.

Unique Sourcing Capabilities: the images are collected through a proprietary gamified platform for photographers. Competitions focused on flower photography ensure fresh, relevant, and high-quality submissions. Custom datasets can be sourced on-demand within 72 hours, allowing for specific requirements such as particular flower species or geographic regions to be met efficiently.

Global Diversity: photographs have been sourced from contributors in over 100 countries, ensuring a vast array of flower species, colors, and environmental settings. The images feature varied contexts, including natural habitats, gardens, bouquets, and urban landscapes, providing an unparalleled level of diversity.

High-Quality Imagery: the dataset includes images with resolutions ranging from standard to high-definition to meet the needs of various projects. Both professional and amateur photography styles are represented, offering a mix of artistic and practical perspectives suitable for a variety of applications.

Popularity Scores: each image is assigned a popularity score based on its performance in GuruShots competitions. This unique metric reflects how well the image resonates with a global audience, offering an additional layer of insight for AI models focused on user preferences or engagement trends.

AI-Ready Design: this dataset is optimized for AI applications, making it ideal for training models in tasks such as image recognition, classification, and segmentation. It is compatible with a wide range of machine learning frameworks and workflows, ensuring seamless integration into your projects.

Licensing & Compliance: the dataset complies fully with data privacy regulations and offers transparent licensing for both commercial and academic use.

Use Cases:
1. Training AI systems for plant recognition and classification.
2. Enhancing agricultural AI models for plant health assessment and species identification.
3. Building datasets for educational tools and augmented reality applications.
4. Supporting biodiversity and conservation research through AI-powered analysis.

This dataset offers a comprehensive, diverse, and high-quality resource for training AI and ML models, tailored to deliver exceptional performance for your projects. Customizations are available to suit specific project needs. Contact us to learn more!

Cite

Success.ai (2021). Small Business Contact Data | North American Small Business Owners | Verified Contact Details from 170M Profiles | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/small-business-contact-data-north-american-small-business-o-success-ai

Small Business Contact Data | North American Small Business Owners | Verified Contact Details from 170M Profiles | Best Price Guaranteed

Explore at:

.bin, .json, .xml, .csv, .xls, .sql, .txt (available download formats)

Dataset updated

Oct 27, 2021

Dataset provided by

Area covered

Greenland, Guatemala, Saint Pierre and Miquelon, Panama, United States of America, Honduras, Mexico, Belize, Bermuda, Costa Rica

Description

Access B2B Contact Data for North American Small Business Owners with Success.ai—your go-to provider for verified, high-quality business datasets. This dataset is tailored for businesses, agencies, and professionals seeking direct access to decision-makers within the small business ecosystem across North America. With over 170 million professional profiles, it’s an unparalleled resource for powering your marketing, sales, and lead generation efforts.

Key Features of the Dataset:

Verified Contact Details

Includes accurate and up-to-date email addresses and phone numbers to ensure you reach your targets reliably.

AI-validated for 99% accuracy, eliminating errors and reducing wasted efforts.

Detailed Professional Insights

Comprehensive data points include job titles, skills, work experience, and education to enable precise segmentation and targeting.

Enriched with insights into decision-making roles, helping you connect directly with small business owners, CEOs, and other key stakeholders.

Business-Specific Information

Covers essential details such as industry, company size, location, and more, enabling you to tailor your campaigns effectively. Ideal for profiling and understanding the unique needs of small businesses.

Continuously Updated Data

Our dataset is maintained and updated regularly to ensure relevance and accuracy in fast-changing market conditions. New business contacts are added frequently, helping you stay ahead of the competition.

Why Choose Success.ai?

At Success.ai, we understand the critical importance of high-quality data for your business success. Here’s why our dataset stands out:

Tailored for Small Business Engagement: Focused specifically on North American small business owners, this dataset is an invaluable resource for building relationships with SMEs (Small and Medium Enterprises). Whether you’re targeting startups, local businesses, or established small enterprises, our dataset has you covered.

Comprehensive Coverage Across North America: Spanning the United States, Canada, and Mexico, our dataset ensures wide-reaching access to verified small business contacts in the region.

Categories Tailored to Your Needs: Includes highly relevant categories such as Small Business Contact Data, CEO Contact Data, B2B Contact Data, and Email Address Data to match your marketing and sales strategies.

Customizable and Flexible: Choose from a wide range of filtering options to create datasets that meet your exact specifications, including filtering by industry, company size, geographic location, and more.

Best Price Guaranteed: We pride ourselves on offering the most competitive rates without compromising on quality. When you partner with Success.ai, you receive superior data at the best value.

Seamless Integration: Delivered in formats that integrate effortlessly with your CRM, marketing automation, or sales platforms, so you can start acting on the data immediately.

Use Cases: This dataset empowers you to:

Drive Sales Growth: Build and refine your sales pipeline by connecting directly with decision-makers in small businesses.
Optimize Marketing Campaigns: Launch highly targeted email and phone outreach campaigns with verified contact data.
Expand Your Network: Leverage the dataset to build relationships with small business owners and other key figures within the B2B landscape.
Improve Data Accuracy: Enhance your existing databases with verified, enriched contact information, reducing bounce rates and increasing ROI.

Industries Served: Whether you're in B2B SaaS, digital marketing, consulting, or any field requiring accurate and targeted contact data, this dataset serves industries of all kinds. It is especially useful for professionals focused on:

Lead Generation
Business Development
Market Research
Sales Outreach
Customer Acquisition

What’s Included in the Dataset: Each profile provides:

Full Name
Verified Email Address
Phone Number (where available)
Job Title
Company Name
Industry
Company Size
Location
Skills and Professional Experience
Education Background

With over 170 million profiles, you can tap into a wealth of opportunities to expand your reach and grow your business.

Why High-Quality Contact Data Matters: Accurate, verified contact data is the foundation of any successful B2B strategy. Reaching small business owners and decision-makers directly ensures your message lands where it matters most, reducing costs and improving the effectiveness of your campaigns. By choosing Success.ai, you ensure that every contact in your pipeline is a genuine opportunity.

Partner with Success.ai for Better Data, Better Results: Success.ai is committed to delivering premium-quality B2B data solutions at scale. With our small business owner dataset, you can unlock the potential of North America's dynamic small business market.

Get Started Today

Request a sample or customize your dataset to fit your unique...

