29 datasets found

Number of LinkedIn users worldwide 2019-2028
statista.com
Updated Mar 3, 2025
Cite
Statista (2025). Number of LinkedIn users worldwide 2019-2028 [Dataset]. https://www.statista.com/forecasts/1147197/linkedin-users-in-the-world
Dataset updated
Mar 3, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
The global number of LinkedIn users was forecast to increase continuously between 2024 and 2028 by a total of 171.9 million users (+22.3 percent). After the sixth consecutive year of growth, the LinkedIn user base is estimated to reach 942.84 million users, and therefore a new peak, in 2028. User figures, shown here for the platform LinkedIn, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of LinkedIn users in regions like Asia and South America.
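The forecast figures in this description can be cross-checked with a little arithmetic. The sketch below is a reader-side check, not Statista's method. It assumes the +22.3 percent is measured against the 2024 value, and that value is not quoted in the text, so it is derived here by subtraction.

```python
# Figures quoted in the description (millions of users).
users_2028 = 942.84
increase_2024_to_2028 = 171.9

# Assumption: the 2024 baseline is the 2028 value minus the total increase.
users_2024 = users_2028 - increase_2024_to_2028

# Relative growth over the period, as a percentage of the 2024 baseline.
growth_pct = increase_2024_to_2028 / users_2024 * 100

print(f"Implied 2024 base: {users_2024:.2f} million")
print(f"Growth 2024-2028: {growth_pct:.1f} percent")
```

Under that assumption the implied 2024 base is about 770.94 million users, and the derived growth rounds to the quoted 22.3 percent.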
LinkedIn Datasets
brightdata.com
.json, .csv, .xlsx
Updated Dec 17, 2021
Cite
Bright Data (2021). LinkedIn Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin
Available download formats: .json, .csv, .xlsx
Dataset updated
Dec 17, 2021
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Unlock the full potential of LinkedIn data with our extensive dataset that combines profiles, company information, and job listings into one powerful resource for business decision-making, strategic hiring, competitive analysis, and market trend insights. This all-encompassing dataset is ideal for professionals, recruiters, analysts, and marketers aiming to enhance their strategies and operations across various business functions.

Dataset Features

Profiles: Dive into detailed public profiles featuring names, titles, positions, experience, education, skills, and more. Utilize this data for talent sourcing, lead generation, and investment signaling, with a refresh rate ensuring up to 30 million records per month.

Companies: Access comprehensive company data including ID, country, industry, size, number of followers, website details, subsidiaries, and posts. Tailored subsets by industry or region provide invaluable insights for CRM enrichment, competitive intelligence, and understanding the startup ecosystem, updated monthly with up to 40 million records.

Job Listings: Explore current job opportunities detailed with job titles, company names, locations, and employment specifics such as seniority levels and employment functions. This dataset includes direct application links and real-time application numbers, serving as a crucial tool for job seekers and analysts looking to understand industry trends and the job market dynamics.

Customizable Subsets for Specific Needs

Our LinkedIn dataset offers the flexibility to tailor the dataset according to your specific business requirements. Whether you need comprehensive insights across all data points or are focused on specific segments like job listings, company profiles, or individual professional details, we can customize the dataset to match your needs. This modular approach ensures that you get only the data that is most relevant to your objectives, maximizing efficiency and relevance in your strategic applications.

Popular Use Cases

Strategic Hiring and Recruiting: Track talent movement, identify growth opportunities, and enhance your recruiting efforts with targeted data.

Market Analysis and Competitive Intelligence: Gain a competitive edge by analyzing company growth, industry trends, and strategic opportunities.

Lead Generation and CRM Enrichment: Enrich your database with up-to-date company and professional data for targeted marketing and sales strategies.

Job Market Insights and Trends: Leverage detailed job listings for a nuanced understanding of employment trends and opportunities, facilitating effective job matching and market analysis.

AI-Driven Predictive Analytics: Utilize AI algorithms to analyze large datasets for predicting industry shifts, optimizing business operations, and enhancing decision-making processes based on actionable data insights.

Whether you are mapping out competitive landscapes, sourcing new talent, or analyzing job market trends, our LinkedIn dataset provides the tools you need to succeed. Customize your access to fit specific needs, ensuring that you have the most relevant and timely data at your fingertips.
Countries with the most LinkedIn users 2025
statista.com
Updated Jun 25, 2025
Cite
Statista (2025). Countries with the most LinkedIn users 2025 [Dataset]. https://www.statista.com/statistics/272783/linkedins-membership-worldwide-by-country/
Dataset updated
Jun 25, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2025
Area covered
Worldwide
Description
As of early 2025, LinkedIn had an audience reach of *** million users in the *************. The country was by far the leading market of the professional job networking service, with runner-up India having an audience of *** million.

LinkedIn: the company

Launched in 2003, LinkedIn is a professional networking service where jobseekers can post their CVs, and employers or recruiters can post job ads and search for prospective candidates. In December 2016, Microsoft acquired LinkedIn, making it a wholly owned subsidiary. In 2020, the platform generated over ***** billion U.S. dollars in revenue. Despite its great success, the company has not always seen positive numbers only, and in 2018, LinkedIn reported an operating loss of *** million U.S. dollars.

LinkedIn marketing

Greater exposure, lead generation and increased thought leadership are all key benefits of social media marketing, and LinkedIn is a popular marketing tool in the B2B segment. Whereas the company predominantly generates revenue by selling access to member information to professional parties, LinkedIn is the second-most popular social media platform used by B2B marketers, ranking only behind Facebook.
LinkedIn Jobs Datasets
brightdata.com
.json, .csv, .xlsx
Updated Apr 4, 2024
Cite
Bright Data (2024). LinkedIn Jobs Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin/jobs
Available download formats: .json, .csv, .xlsx
Dataset updated
Apr 4, 2024
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
The LinkedIn Jobs Listing dataset emerges as a comprehensive resource for individuals navigating the contemporary job market. With a focus on critical employment details, the dataset encapsulates key facets of job listings, including titles, company names, locations, and employment specifics such as seniority levels and functions. This wealth of information is instrumental for job seekers looking to align their skills and aspirations with the right opportunities. The inclusion of direct application links and real-time application numbers enhances the dataset's utility, offering users a streamlined approach to engaging with potential employers. Beyond aiding job seekers, the dataset serves as a valuable tool for analysts and researchers, providing nuanced insights into industry trends and the evolving demands of the job market. The temporal aspect, captured through job posting timestamps, allows for the observation of job trends over time. Moreover, the dataset's integration of company details, including unique identifiers and LinkedIn profile links, enables a deeper exploration of hiring organizations. Whether for job seekers or analysts, the LinkedIn Jobs Listing dataset emerges as a versatile and informative repository, empowering users with the knowledge to make informed decisions in their professional pursuits.
LinkedIn company information
opendatabay.com
Updated May 23, 2025
Cite
Bright Data (2025). LinkedIn company information [Dataset]. https://www.opendatabay.com/data/premium/bd1786ac-7b2e-45e3-957b-f98ebd46181c
Dataset updated
May 23, 2025
Dataset authored and provided by
Bright Data
Area covered
Social Media and Networking
Description
LinkedIn Companies datasets give access to public company data for machine learning, ecosystem mapping, and strategic decisions. Popular use cases include competitive analysis, CRM enrichment, and lead generation.

Use our LinkedIn Companies Information dataset to access comprehensive data on companies worldwide, including business size, industry, employee profiles, and corporate activity. This dataset provides key company insights, organizational structure, and competitive landscape, tailored for market researchers, HR professionals, business analysts, and recruiters.

Leverage the LinkedIn Companies dataset to track company growth, analyze industry trends, and refine your recruitment strategies. By understanding company dynamics and employee movements, you can optimize sourcing efforts, enhance business development opportunities, and gain a strategic edge in your market. Stay informed and make data-backed decisions with this essential resource for understanding global company ecosystems.

Dataset Features

timestamp: Represents the date and time when the company data was collected.

id: Unique identifier for each company in the dataset.

company_id: Identifier linking the company to an external database or internal system.

url: Website or URL for more information about the company.

name: The name of the company.

about: Brief description of the company.

description: More detailed information about the company's operations and offerings.

organization_type: Type of the organization (e.g., private, public).

industries: List of industries the company operates in.

followers: Number of followers on the company's platform.

headquarters: Location of the company's headquarters.

country_code: Code for the country where the company is located.

country_codes_array: List of country codes associated with the company (may represent various locations or markets).

locations: Locations where the company operates.

get_directions_url: URL to get directions to the company's location(s).

formatted_locations: Human-readable format of the company's locations.

website: The official website of the company.

website_simplified: A simplified version of the company's website URL.

company_size: Number of employees or company size.

employees_in_linkedin: Number of employees listed on LinkedIn.

employees: URL of employees.

specialties: List of the company’s specializations or services.

updates: Recent updates or news related to the company.

crunchbase_url: Link to the company’s profile on Crunchbase.

founded: Year when the company was founded.

funding: Information on funding rounds or financial data.

investors: Investors who have funded the company.

alumni: Notable alumni from the company.

alumni_information: Details about the alumni, their roles, or achievements.

stock_info: Stock market information for publicly traded companies.

affiliated: Companies or organizations affiliated with the company.

image: Image representing the company.

logo: URL of the official logo of the company.

slogan: Company’s slogan or tagline.

similar: URL of companies similar to this one.

Distribution

Data Volume: 56.51M rows and 35 columns.

Structure: Tabular format (CSV, Excel).
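Since the listing states 35 columns and a CSV delivery, a quick header check can confirm that an export matches the documented fields. The following is a minimal sketch, assuming a real export uses the field names from the Dataset Features list as its CSV header; the `missing_columns` helper and the two-column sample are hypothetical.

```python
import csv
import io

# The 35 field names listed under "Dataset Features", in listing order.
COMPANY_COLUMNS = (
    "timestamp", "id", "company_id", "url", "name", "about", "description",
    "organization_type", "industries", "followers", "headquarters",
    "country_code", "country_codes_array", "locations", "get_directions_url",
    "formatted_locations", "website", "website_simplified", "company_size",
    "employees_in_linkedin", "employees", "specialties", "updates",
    "crunchbase_url", "founded", "funding", "investors", "alumni",
    "alumni_information", "stock_info", "affiliated", "image", "logo",
    "slogan", "similar",
)


def missing_columns(csv_text: str) -> list[str]:
    """Return the documented columns that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [column for column in COMPANY_COLUMNS if column not in header]


# Hypothetical two-column export, used only to exercise the check.
sample = "id,name\nacme-1,Acme Corp\n"
missing = missing_columns(sample)
print(len(COMPANY_COLUMNS), len(missing))
```

The tuple length matches the 35 columns stated under Data Volume; for the two-column sample, the other 33 documented fields are reported as missing.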

Usage

This dataset is ideal for:
- Market Research: Identifying key trends and patterns across different industries and geographies.
- Business Development: Analyzing potential partners, competitors, or customers.
- Investment Analysis: Assessing investment potential based on company size, funding, and industries.
- Recruitment & Talent Analytics: Understanding the workforce size and specialties of various companies.

Coverage

Geographic Coverage: Global, with company locations and headquarters spanning multiple countries.

Time Range: Data likely covers both current and historical information about companies.

Demographics: Focuses on company attributes rather than demographics, but may contain information about the company's workforce.

License

CUSTOM

Please review the respective licenses below:

Data Provider's License

Bright Data Master Service Agreement

Who Can Use It

Data Scientists: For building models, conducting research, or enhancing machine learning algorithms with business data.

Researchers: For academic analysis in fields like economics, business, or technology.

Businesses: For analysis, competitive benchmarking, and strategic development.

Investors: For identifying and evaluating potential investment opportunities.

Dataset Name Ideas

Global Company Profile Database

**Business Intellige
Success.ai | User Profiles Data | Comprehensive 700M Dataset of LinkedIn...
data.success.ai
Updated Jan 1, 2022
Cite
Success.ai (2022). Success.ai | User Profiles Data | Comprehensive 700M Dataset of LinkedIn Profiles for B2B Strategy [Dataset]. https://data.success.ai/products/success-ai-user-profiles-data-comprehensive-700m-dataset-success-ai
Dataset updated
Jan 1, 2022
Dataset provided by
Success.ai
Area covered
Tanzania, Timor-Leste, Belize, Bulgaria, Burkina Faso, Saint Lucia, Zambia, Croatia, Congo, Myanmar
Description
Harness Success.ai's robust LinkedIn and User Profiles Data, featuring extensive insights from 700M+ profiles and 70M+ companies for ideal customer profiling and competitive intelligence. Ensure data-driven decisions with our GDPR-compliant, accurately validated datasets at unbeatable prices.
LinkedIn Reviews Sentiment Dataset
opendatabay.com
Updated Jul 6, 2025
Cite
Datasimple (2025). LinkedIn Reviews Sentiment Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/273a9cbc-e56f-41a6-8a82-a8c327bf1fe6
Dataset updated
Jul 6, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Reviews & Ratings
Description
This dataset contains user reviews and ratings for the LinkedIn mobile application, extracted from its Google Store page. It provides valuable insights into the public's perception of the app over an extended period. The collection of reviews offers a basis for understanding user sentiment, identifying trends, and pinpointing common pain points experienced by users of the LinkedIn app. The dataset is particularly useful for product development teams, market analysts, and researchers interested in user feedback and app performance analysis.

Columns

index: A numerical index for each review record.

review_id: A unique identifier for each review.

pseudo_author_id: A pseudonymised identifier for the author of the review.

author_name: The name of the author who submitted the review.

review_text: The textual content of the user's review.

review_rating: The star rating given by the user, ranging from 1 to 5. Note that some very old reviews may have a zero score.

review_likes: The number of likes or upvotes a particular review received.

author_app_version: The version of the LinkedIn app installed when the review was made.

review_timestamp: The date and time (in UTC) when the review was submitted.

Distribution

This dataset is typically provided as a data file, commonly in CSV format. It comprises approximately 320,000 individual review records. The review_id column alone contains 322,641 unique values. The data structure is tabular, with each row representing a single review and columns providing specific details about that review. Specific numbers for rows/records are available and consistent with the total count.
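The column notes and record counts suggest a few sanity checks worth running before analysis. The sketch below is illustrative only: the sample rows are made up, the `check_reviews` helper is hypothetical, and the timestamp layout is an assumption, since the listing only says timestamps are in UTC.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows using the column names listed under "Columns".
SAMPLE = """index,review_id,pseudo_author_id,author_name,review_text,review_rating,review_likes,author_app_version,review_timestamp
0,r1,a1,Ann,Works well,5,3,4.1.0,2023-11-18 10:00:00
1,r2,a2,Bob,Crashes on login,1,0,4.1.0,2023-11-17 09:30:00
2,r3,a3,Cy,Old review,0,0,,2011-04-07 12:00:00
"""


def check_reviews(csv_text):
    """Return a list of problems found; an empty list means the rows pass."""
    problems = []
    seen_ids = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        review_id = row["review_id"]
        # review_id is described as unique, so repeats are flagged.
        if review_id in seen_ids:
            problems.append(f"duplicate review_id: {review_id}")
        seen_ids.add(review_id)
        rating = int(row["review_rating"])
        # Ratings run 1 to 5; very old reviews may carry a zero score.
        if not 0 <= rating <= 5:
            problems.append(f"rating out of range for {review_id}: {rating}")
        # Assumed layout; strptime raises ValueError if a row does not match.
        datetime.strptime(row["review_timestamp"], "%Y-%m-%d %H:%M:%S")
    return problems


print(check_reviews(SAMPLE))
```

On the real file, the number of distinct `review_id` values could additionally be compared with the 322,641 figure quoted in the listing.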

Usage

This dataset is ideal for a variety of analytical applications and use cases, including:
- Sentiment Analysis: Extracting sentiments and trends from user feedback to gauge overall satisfaction and identify shifts in public opinion.
- Version Performance Tracking: Identifying which versions of the LinkedIn app received the most positive or negative feedback.
- Topic Modelling: Utilising natural language processing (NLP) techniques like topic modelling to uncover specific pain points, frequently requested features, or common praise for the application.
- Product Improvement: Informing product development and user experience (UX) design by directly addressing user feedback.
- Market Research: Understanding user perceptions of a leading professional networking platform.
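Version performance tracking, one of the use cases named here, can be approximated by averaging `review_rating` per `author_app_version`. This is a minimal sketch on made-up pairs; skipping zero scores is an assumption based on the note that very old reviews may have a zero rating, and `mean_rating_by_version` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (author_app_version, review_rating) pairs.
reviews = [("4.1.0", 5), ("4.1.0", 1), ("4.2.0", 4), ("4.2.0", 4), ("4.0.0", 0)]


def mean_rating_by_version(pairs):
    """Average star rating per app version, skipping zero scores."""
    totals = defaultdict(lambda: [0, 0])  # version -> [rating sum, count]
    for version, rating in pairs:
        if rating == 0:
            # Assumption: zero marks a legacy review without a usable score.
            continue
        totals[version][0] += rating
        totals[version][1] += 1
    return {version: s / n for version, (s, n) in totals.items()}


print(mean_rating_by_version(reviews))
```

A plain mean is the simplest choice; with real data, versions that have only a handful of reviews would likely need a minimum-count filter before being compared.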

Coverage

The dataset covers reviews for the LinkedIn app, which has a global user base with over 970 million registered members from more than 200 countries and territories. The reviews themselves were extracted from its Google Store page. The time range for the reviews spans from 7th April 2011 to 18th November 2023. There are specific notes on data availability for certain groups/years visible in the timestamp distribution.

License

CC0

Who Can Use It

This dataset is intended for:
- Data Scientists & Analysts: For performing sentiment analysis, natural language processing, and trend analysis on app reviews.
- App Developers & Product Managers: To gain direct user feedback for product iteration, bug identification, and feature prioritisation.
- Market Researchers: To understand user behaviour, competitive landscape, and public perception within the social media and professional networking domain.
- Academic Researchers: For studies on user feedback, app development cycles, and the evolution of digital platform perception.

Dataset Name Suggestions

LinkedIn App User Reviews

Google Play LinkedIn Feedback

LinkedIn Application Ratings

LinkedIn Reviews Sentiment Dataset

Professional Networking App Reviews

Attributes

Original Data Source: 📝 320K LinkedIn App Google Store Reviews
Social Media Data | Linkedin, Youtube, TwitterX | Global Coverage | 120M+...
datarade.ai
Updated Feb 2, 2024
Cite
Exellius Systems (2024). Social Media Data | Linkedin, Youtube, TwitterX | Global Coverage | 120M+ Contacts | (Verified E-mail, Direct Dails) | Live Profile Links [Dataset]. https://datarade.ai/data-products/social-media-data-linkedin-facebook-youtube-twitterx-g-exellius-systems
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Feb 2, 2024
Dataset authored and provided by
Exellius Systems
Area covered
Martinique, United Republic of, Åland Islands, South Georgia and the South Sandwich Islands, South Africa, Poland, Western Sahara, Guam, Greece, Antigua and Barbuda
Description
Unlock the full potential of your social media outreach with our comprehensive Global Social Media Database, meticulously designed to meet your strategic needs. Covering major regions across the globe—APAC, Europe, Africa, North America, South America, and LATAM—this dynamic resource spans 16 diverse industries, making it a powerful catalyst for your marketing and social engagement strategies.

Global Geographical Coverage: Our database is designed to offer extensive coverage, enabling you to engage audiences across:

APAC (Asia-Pacific): China, India, Japan, South Korea, Australia, and more.

Europe: United Kingdom, Germany, France, Italy, Spain, and others.

Africa: South Africa, Nigeria, Kenya, Egypt, and more.

North America: United States, Canada, Mexico.

South America: Brazil, Argentina, Chile, Colombia, and more.

LATAM (Latin America): Brazil, Argentina, Peru, Venezuela, and other nations.

This widespread reach ensures that your campaigns resonate across both developed and emerging markets.

Industries Covered: Our data spans the following key industries:

Technology

Healthcare

Finance

Manufacturing

Education

Hospitality

Real Estate

Energy

Agriculture

Transportation

Media

Telecommunications

Automotive

Pharmaceutical

Aerospace

Retail

Employee Size & Revenue: We recognize the importance of targeted outreach, which is why our database also includes employee size and revenue data for each company, ensuring that you can filter and approach organizations based on their scale and financial capacity. Whether you're targeting small businesses or multinational corporations, we’ve got you covered with customized insights to optimize your campaigns.

Key Database Attributes: Our comprehensive social media data offers the following key attributes:

Total Contacts: 120M+

Social Platforms: LinkedIn, Facebook, YouTube, TwitterX

Direct Dials: Verified

Email Addresses: Verified

Live Profile Links: Provided on request

What Sets Us Apart:

Verified Direct Dials & Emails:
Accuracy is our priority. Each contact in our database comes with verified direct dials and email addresses, ensuring you reach the right people, reducing wasted outreach efforts, and maximizing engagement.

Free Lead Replacement:
Understanding that social media data is ever-changing, we offer cost-free lead replacements, maintaining the quality and relevance of your contacts over time, with no added costs.

Sourcing Excellence:
Our data isn't merely aggregated. We use precise sourcing strategies, leveraging both publication sites and a dedicated contact discovery team, guaranteeing the authenticity and relevance of our database.

Live Profile Links:
Want to explore a profile in real-time? We provide live links to social media profiles across LinkedIn, Facebook, YouTube, and TwitterX, allowing seamless interaction and profile verification.

Global Reach:
Our reach extends across continents, from the vibrant tech hubs of APAC to thriving business sectors in North America, making our database indispensable for engaging diverse and global audiences.

Industry-Specific Targeting:
Our data is structured for industry-specific targeting. Whether you are engaging with healthcare, finance, or manufacturing professionals, we provide nuanced insights tailored to your needs, ensuring strategic precision.

Strategic Asset:
Our database isn't just a collection of contacts—it's a strategic asset. With 250M+ verified contacts, direct dials, and email addresses, it empowers you to align your social media strategy with your overall sales and marketing goals, enabling meaningful interactions that can translate into successful business engagements.

Amplify Your Social Media Presence: Use our Global Social Media Data to drive meaningful connections and enhance your social media presence. With our data, you’ll have the ability to reach the right people, at the right time, on the right platforms. Whether you're exploring new territories, scaling your operations, or targeting niche industries, our data will empower you to make an impactful difference in your social media outreach.

Let us be your trusted partner as you navigate the intricate world of global social engagement strategies!
US Employee Data | Accurate Contact Information, Job Experience, LinkedIn...
datarade.ai
.json, .csv, .xls
Updated Aug 22, 2023
Cite
Salutary Data (2023). US Employee Data | Accurate Contact Information, Job Experience, LinkedIn URLs + More | Recruiting / HR [Dataset]. https://datarade.ai/data-products/salutary-data-us-employee-data-accurate-contact-informati-salutary-data
Available download formats: .json, .csv, .xls
Dataset updated
Aug 22, 2023
Dataset authored and provided by
Salutary Data
Area covered
United States of America
Description
Salutary Data is a boutique B2B contact and company data provider committed to delivering high-quality data for sales intelligence, lead generation, marketing, recruiting, employee data / HR, identity resolution, and ML / AI. Our database currently consists of 148MM+ highly curated B2B contacts (US only), along with 4M+ companies, and is updated regularly to ensure we have the most up-to-date information.

We can enrich your in-house data (CRM enrichment, lead enrichment, etc.) and provide you with a custom dataset (such as a lead list) tailored to your target audience specifications and data use case. We also support large-scale data licensing to software providers and agencies that intend to redistribute our data to their customers and end users.

What makes Salutary unique?
- We offer our clients a truly unique, one-stop aggregation of best-of-breed data sources. Our supplier network consists of numerous established, high-quality suppliers that are rigorously vetted.
- We leverage third-party verification vendors to ensure phone numbers and emails are accurate and connect to the right person. Additionally, we deploy automated and manual verification techniques to ensure we have the latest job information for contacts.
- We're reasonably priced and easy to work with.

Products:
- API Suite
- Web UI
- Full and Custom Data Feeds

Services:
- Data Enrichment: We assess the fill rate gaps and profile your customer file for the purpose of appending fields, updating information, and/or rendering net new “look alike” prospects for your campaigns.
- ABM Match & Append: Send us your domain or other company-related files, and we’ll match your Account Based Marketing targets and provide you with B2B contacts to campaign. Optionally throw in your suppression file to avoid any redundant records.
- Verification (“Cleaning/Hygiene”) Services: Address the 2% per month aging issue on contact records! We will identify duplicate records and contacts no longer at the company, remove your email hard bounces, and update or replace titles or phones. This is right up our alley and leverages our existing internal and external processes and systems.
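To put the "2% per month aging" figure in perspective, the sketch below estimates how much of a contact list would be out of date after a given number of months. It assumes the rate compounds, meaning that each month 2% of the still-valid records go stale; that reading is an assumption, not something the vendor states, and `stale_fraction` is a hypothetical helper.

```python
def stale_fraction(monthly_rate: float, months: int) -> float:
    """Fraction of records expected to be out of date after `months`."""
    # Assumption: each month the same share of still-valid records goes stale.
    return 1 - (1 - monthly_rate) ** months


after_one_year = stale_fraction(0.02, 12)
print(f"{after_one_year:.1%}")
```

Under that assumption, roughly a fifth of an unmaintained list (about 21.5%) would be stale after twelve months.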
Data Licensing - ABM Data- 152+ Million Contacts | 13+ Million Companies -...
datarade.ai
.xml, .csv, .xls
Updated Oct 25, 2024
Cite
Thomson Data (2024). Data Licensing - ABM Data- 152+ Million Contacts | 13+ Million Companies - Updated Monthly Basis [Dataset]. https://datarade.ai/data-products/thomson-data-data-licensing-abm-data-154-million-contacts-thomson-data
Available download formats: .xml, .csv, .xls
Dataset updated
Oct 25, 2024
Dataset authored and provided by
Thomson Data
Area covered
Morocco, Papua New Guinea, Paraguay, Nauru, Saint Helena, Greenland, Niger, Bangladesh, Brazil, Slovakia
Description
Empower Your Business With Professional Data Licensing Services

Discover a 360-Degree View of Worldwide Solution Buyers and Their Needs Leverage over 70 insights that will help you make better decisions to manage your sales pipeline, target key accounts with customized messaging, and focus your sales and marketing efforts:

Here are some of the types of insights our data licensing services can provide:

Technology Insights: Discover companies’ technology preferences, including their tech stack for essential investments such as CRM systems, marketing and sales automation, email security and hosting, data analytics, and cloud security and providers.

Departmental Roles and Openings: Access real-time data on the number of roles and job openings across various departments, including IT, Development, Security, Marketing, Sales, and Customer Success. This information helps you gauge the company’s growth trajectory and possible needs.

Funding Insights: Stay up to date on the latest funding, including dates, types, and lead investors, for a clear understanding of a company’s potential for growth investments.

Mobile Application Insights: Find out if the company has a mobile app or web app, enabling you to tailor your pitch effectively.

Website traffic and advertising spend metrics: Customers can leverage website traffic and advertising data to gain insights into competitor performance, allowing them to refine their marketing strategies and optimize ad spending.

Access unlimited data and improve conversion by 3X

Leverage the data for your Account-Based Marketing (ABM) strategy

Leverage ICP (industry, company size, location, etc.) to identify high-potential accounts.

Utilize GTM strategies to deliver personalized marketing experiences through multi-channel outreach (email, cell, social media) that resonate with the target audience.

Who can leverage our Data:

B2B marketing Teams- Increase marketing leads and enhance conversions.

B2B sales teams- Build a stronger pipeline and increase your deal wins.

Talent sourcing/Staffing companies- Leverage our data to identify and engage top talent, streamlining your recruitment process and finding the best candidates faster.

Research companies/Investors- Insights into the financial investments received by a company, including funding rounds, amounts, and investor details.

Technology companies: Leverage our Technographic data to reveal the technology stack and tools used by companies, helping tailor marketing and sales efforts.

Data Source:

The Database, sourced through multiple sources and validated using proprietary methods on an ongoing basis, is highly customizable. It contains parameters such as employee size, job title, domain, industry, Technography, Ad spends, Funding data, and more, which can be tailored to create segments that perfectly align with your targeting needs. That is exactly why our Database is perfect for licensing!

FAQs

Can licensed data be resold or redistributed? Answer: No. The Customer shall not, directly or indirectly, sell, distribute, license, or otherwise make available the licensed data to any third party that intends to resell, sublicense, or redistribute the data. The Customer must take reasonable steps to ensure that any recipient of the licensed data is using it for internal purposes only and not for resale or redistribution. Any breach of this provision shall be considered a material breach of this Order Form and may result in the immediate termination of the Customer's rights under this agreement, as well as any applicable remedies available under law.

What is the duration of the data license and usage terms? Answer: The data license is valid for 12 months (1 year) for unlimited usage. Customers also have the option to license the data for multiple years. At the end of the first year, Customers can renew the license to maintain continued access.

What happens if the customer misuses the data? Answer: The data can be used without limits for a period of one year or multiple years (depending on the contract tenure); however, Thomson Data actively monitors its usage. If any unusual activity is detected, Thomson Data reserves the right to terminate the account.

How frequently is the data updated? Answer: The data is updated on a quarterly basis, with fresh records added on a monthly basis.

What is the accuracy rate of the data? Answer: Customers can expect 90% accuracy for all data points, with email accuracy ranging between 85% and 90%. Cell phone data accuracy is around 80%.

What types of information are included in the data? Answer: Thomson Data provides 70+ data points, including contact details (name, job title, LinkedIn profile, cell number, email address, education, certifications, work experience, etc.), company information, department/team sizes, SIC and NAICS codes, industry classification, technographic detai...
Social Media Profile Links by Name
openwebninja.com
json
Updated Feb 2, 2025
Cite
OpenWeb Ninja (2025). Social Media Profile Links by Name [Dataset]. https://www.openwebninja.com/api/social-links-search
Explore at:
Available download formats: json
Dataset updated
Feb 2, 2025
Dataset authored and provided by
OpenWeb Ninja
Area covered
Worldwide
Description
This dataset provides comprehensive social media profile links discovered through real-time web search. It includes profiles from major social networks like Facebook, TikTok, Instagram, Twitter, LinkedIn, YouTube, Pinterest, GitHub and more. The data is gathered through intelligent search algorithms and pattern matching. Users can leverage this dataset for social media research, influencer discovery, social presence analysis, and social media marketing. The API enables efficient discovery of social profiles across multiple platforms. The dataset is delivered in JSON format via a REST API.
B2B Data - Job Posting Data - LinkedIn Data - Emails and Phone Data.
datarade.ai
.csv, .xls
Updated Feb 26, 2024
Cite
Sourcemate US (2024). B2B Data - Job Posting Data - LinkedIn Data - Emails and Phone Data. [Dataset]. https://datarade.ai/data-products/daily-basis-job-seekers-data-uk-usa-and-canada-sourcemate-us
Explore at:
Available download formats: .csv, .xls
Dataset updated
Feb 26, 2024
Dataset authored and provided by
Sourcemate US
Area covered
United Kingdom
Description
At Sourcemate, we understand the value of accurate and up-to-date data in today's competitive landscape. Our CVs and B2B LinkedIn data are meticulously collected, verified, and updated, ensuring their integrity and relevance.

We gather information from various trusted sources, such as our websites, job boards, professional networks, and career websites, to create a comprehensive database of potential candidates actively seeking employment opportunities.

Here's why our job seeker data sets are unparalleled in the industry:

Comprehensive and Targeted: Our data sets cover a wide range of industries, job titles, locations, and experience levels. Whether you're looking for entry-level professionals, mid-level managers, or specialized experts, we have the data to meet your specific requirements. Our data is highly segmented and customizable, enabling you to target your ideal candidates with precision.

Fresh and Updated: We understand the importance of timely information. Our dedicated team ensures that our job seeker data is regularly updated and refreshed to maintain its accuracy and relevance. This means you'll have access to the latest contact details, job preferences, skills, and qualifications of potential candidates, enabling you to engage with them at the right time and with personalized messaging.

GDPR Compliant: Privacy and data protection are paramount to us. We strictly adhere to the General Data Protection Regulation (GDPR) guidelines, ensuring that all data we provide is collected and processed lawfully and ethically. We respect the privacy rights of individuals and maintain the highest standards of data security and confidentiality.

Easy Integration: Our data sets are provided in a format that is easily integrable with your existing systems and platforms. Whether you want to import the data into your CRM, applicant tracking system, or any other software, our user-friendly formats facilitate seamless integration, saving you time and effort.

Reliable Customer Support: We pride ourselves on delivering exceptional customer service. Our dedicated support team is available to assist you at every step of the process, from helping you select the right data sets to answering any queries or concerns you may have. We strive to ensure your experience with Sourcemate is smooth, efficient, and successful.
Draughts Board Dataset
universe.roboflow.com
zip
Updated Sep 26, 2021
Cite
Harry Field (2021). Draughts Board Dataset [Dataset]. https://universe.roboflow.com/harry-field-qemqy/draughts-board-fm9sx/model/7
Explore at:
Available download formats: zip
Dataset updated
Sep 26, 2021
Dataset authored and provided by
Harry Field
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Variables measured
Draughts Pieces Bounding Boxes
Description
This dataset was created by Harry Field and contains the labelled images for capturing the game state of a draughts/checkers 8x8 board.

This was a fun project to develop a mobile draughts application enabling users to interact with draughts-based software via their mobile device's camera.

The data captured consists of:
* White Pieces
* White Kings
* Black Pieces
* Black Kings
* Bottom left corner square
* Top left corner square
* Top right corner square
* Bottom right corner square

Corner squares are captured so the board locations of the detected pieces can be estimated.

Image: Results of YOLOv5 model after training with this dataset (https://github.com/ShippingTycoon/roboflow-draughts/blob/main/PXL_20210603_093949805_jpg.rf.30e2a64a0a646e8ea8e121727cf0f1ee.jpg?raw=true)

From this data, the locations of other squares can be estimated and game state can be captured. The image below shows the data of a different board configuration being captured. Blue circles refer to squares, numbers refer to square index and the coloured circles refer to pieces.

Image: https://github.com/ShippingTycoon/roboflow-draughts/blob/main/pieces.png?raw=true
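As a rough illustration of how the remaining squares might be estimated from the four detected corner squares, the sketch below interpolates linearly between the corner centres. This is an assumption for illustration only: the description does not say how the author did it, the function names are hypothetical, and plain bilinear interpolation ignores perspective distortion from the camera.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def lerp(a: Point, b: Point, t: float) -> Point:
    """Linear interpolation between two image points."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def estimate_square_centres(
    bottom_left: Point,
    top_left: Point,
    top_right: Point,
    bottom_right: Point,
    size: int = 8,
) -> Dict[Tuple[int, int], Point]:
    """Estimate the centre of every square from the four corner-square centres.

    Row 0 is the bottom rank and column 0 is the left file (a hypothetical
    indexing chosen for this sketch).
    """
    centres = {}
    for row in range(size):
        v = row / (size - 1)
        left_edge = lerp(bottom_left, top_left, v)
        right_edge = lerp(bottom_right, top_right, v)
        for col in range(size):
            u = col / (size - 1)
            centres[(row, col)] = lerp(left_edge, right_edge, u)
    return centres


# Hypothetical corner centres in pixel coordinates (y grows downwards).
centres = estimate_square_centres(
    bottom_left=(0.0, 70.0),
    top_left=(0.0, 0.0),
    top_right=(70.0, 0.0),
    bottom_right=(70.0, 70.0),
)
print(centres[(3, 4)])
```

A detected piece could then be assigned to the square whose estimated centre is nearest to the centre of the piece's bounding box.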

Once game state is captured, integration with other software becomes possible. In this example, I created a simple move suggestion mobile application seen working here.

The developed application is a proof of concept and is not available to the public. Further development is required in training the model across multiple draughts boards and implementing features to add value to the physical draughts game.

The dataset consists of 759 images; a YOLOv5 model was trained on it with a 70/20/10 split.

The output of YOLOv5 was parsed and filtered to correct for duplicated/overlapping detections before game state could be determined.
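The exact filtering used is not described. One common way to remove duplicated or overlapping detections is a greedy, confidence-ordered filter on intersection-over-union (IoU), sketched below. The detection format (a dict with `box` and `confidence`) and the 0.5 threshold are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_overlaps(detections: List[Dict], iou_threshold: float = 0.5) -> List[Dict]:
    """Keep the most confident detection among any group of overlapping boxes."""
    kept: List[Dict] = []
    for det in sorted(detections, key=lambda d: d["confidence"], reverse=True):
        if all(iou(det["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


# Hypothetical detections: the first two overlap heavily, the third is separate.
detections = [
    {"label": "white_piece", "box": (0, 0, 10, 10), "confidence": 0.9},
    {"label": "white_king", "box": (1, 1, 11, 11), "confidence": 0.6},
    {"label": "black_piece", "box": (20, 20, 30, 30), "confidence": 0.8},
]
filtered = filter_overlaps(detections)
print([d["label"] for d in filtered])
```

Filtering across classes, as here, means a square can end up with only one piece label, which suits a board where each square holds at most one piece.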

I hope you find this dataset useful and if you have any questions feel free to drop me a message on LinkedIn as per the link above.
B2B Contact Data | 148MM+ High Quality US B2B Contacts | LinkedIn URL,...
datarade.ai
.json, .csv, .xls
Updated Jun 10, 2023
Cite
Salutary Data (2023). B2B Contact Data | 148MM+ High Quality US B2B Contacts | LinkedIn URL, Mobile Phone, Email Address, Current Job Title + More [Dataset]. https://datarade.ai/data-products/salutary-data-b2b-contact-data-62m-high-quality-us-b2b-c-salutary-data
Explore at:
Available download formats: .json, .csv, .xls
Dataset updated
Jun 10, 2023
Dataset authored and provided by
Salutary Data
Area covered
United States
Description
Salutary Data is a boutique B2B contact and company data provider that's committed to delivering high quality data for sales intelligence, lead generation, marketing, recruiting / HR, identity resolution, and ML / AI. Our database currently consists of 148MM+ highly curated B2B contacts (US only), along with 4M+ companies, and is updated regularly to ensure we have the most up-to-date information.

We can enrich your in-house data (CRM Enrichment, Lead Enrichment, etc.) and provide you with a custom dataset (such as a lead list) tailored to your target audience specifications and data use-case. We also support large-scale data licensing to software providers and agencies that intend to redistribute our data to their customers and end-users.

What makes Salutary unique?
- We offer our clients a truly unique, one-stop aggregation of the best-of-breed quality data sources. Our supplier network consists of numerous, established high quality suppliers that are rigorously vetted.
- We leverage third party verification vendors to ensure phone numbers and emails are accurate and connect to the right person. Additionally, we deploy automated and manual verification techniques to ensure we have the latest job information for contacts.
- We're reasonably priced and easy to work with.

Products:
- API Suite
- Web UI
- Full and Custom Data Feeds

Services:
- Data Enrichment: We assess the fill rate gaps and profile your customer file for the purpose of appending fields, updating information, and/or rendering net new “look alike” prospects for your campaigns.
- ABM Match & Append: Send us your domain or other company related files, and we’ll match your Account Based Marketing targets and provide you with B2B contacts for your campaign. Optionally throw in your suppression file to avoid any redundant records.
- Verification (“Cleaning/Hygiene”) Services: Address the 2% per month aging issue on contact records! We will identify duplicate records and contacts no longer at the company, remove your email hard bounces, and update/replace titles or phones. This is right up our alley and leverages our existing internal and external processes and systems.
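To put the quoted "2% per month" aging figure in perspective, the short calculation below shows how stale records could accumulate over time. It assumes, purely for illustration, that the 2% applies each month to the records that are still valid (compounding); the provider does not state how the figure is defined.

```python
def stale_fraction(monthly_decay: float, months: int) -> float:
    """Fraction of records that have gone stale after `months`,
    assuming the decay compounds on the still-valid records."""
    return 1.0 - (1.0 - monthly_decay) ** months


for months in (6, 12, 24):
    share = stale_fraction(0.02, months)
    print(f"after {months:2d} months: {share:.1%} of records stale")
```

Under that assumption, roughly a fifth of an unmaintained contact file would be out of date after one year.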
Place Review Dataset - Niche (USA)
kaggle.com
Updated Jan 13, 2021
Cite
Enam Biswas (2021). Place Review Dataset - Niche (USA) [Dataset]. http://doi.org/10.34740/kaggle/dsv/1842046
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/1842046
Dataset updated
Jan 13, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Enam Biswas
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
Context

Reviews are a way to gain insight into a product/service. In machine learning tasks, text reviews play an important role in predicting/gaining insights. User-generated place reviews are extremely handy when it comes to choosing a neighborhood to live in. Niche has a huge number of reviews and ratings for American neighborhoods, which is perfect for several NLP tasks.

Content

The dataset is collected from Niche, and each individual record is publicly available. Below are the overall dataset stats:
# total records = 712,107
# total places = 56,800

Some insight about the data:
# guid: Generated by Niche and unique to place/entity.
# body: Actual review data.
# rating: Rating on a scale of 0 to 5.
# author: Provider of the review/rating (aka Niche user).
# created: Timestamp.
# categories: Experience type (about the entity).

Acknowledgements

All rights reserved to Niche and the users who spent valuable time providing reviews and ratings.

Inspiration

Can you predict a rating for reviews that have no rating?

Use transfer learning in predicting review sentiment on another domain.

Predict review usefulness/quality, or identify impactful reviews.
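For the first question above, a natural starting point is to separate reviews that carry a rating (training material) from those that do not (prediction targets). The sketch below uses the field names listed in the dataset description; the `Review` class, the sample values, and the use of `None` for a missing rating are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Review:
    guid: str
    body: str
    rating: Optional[float]  # None stands in for "no rating" (assumed encoding)
    author: str
    created: str
    categories: str


def split_by_rating(reviews: List[Review]) -> Tuple[List[Review], List[Review]]:
    """Return (rated, unrated) reviews."""
    rated = [r for r in reviews if r.rating is not None]
    unrated = [r for r in reviews if r.rating is None]
    return rated, unrated


# Hypothetical records, not taken from the dataset.
reviews = [
    Review("g-1", "Quiet streets and good schools.", 4.0, "user_a", "2020-01-01", "Overall Experience"),
    Review("g-2", "Too far from the city centre.", None, "user_b", "2020-02-01", "Commute"),
    Review("g-3", "Friendly neighbours.", 5.0, "user_c", "2020-03-01", "Overall Experience"),
]
rated, unrated = split_by_rating(reviews)
print(len(rated), "rated;", len(unrated), "to predict")
```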

Citation

If you intend to use this dataset, please cite the following:

@misc{enam biswas_2021,
  title={Place Review Dataset - Niche (USA)},
  url={https://www.kaggle.com/dsv/1842046},
  DOI={10.34740/KAGGLE/DSV/1842046},
  publisher={Kaggle},
  author={Enam Biswas},
  year={2021}
}

Please feel free to contact Enam Biswas if you have any questions.

Other datasets by me

IMDb Largest Review Dataset - Over 5.5M reviews/ 1.2M spoilers.

Bangla Largest Newspaper Dataset - Almost 1.7M Bangla news articles.
Ecommerce Data | Store Location Data | Global Coverage | 61M+ Contacts |...
datarade.ai
Updated Sep 7, 2024
Cite
Exellius Systems (2024). Ecommerce Data | Store Location Data | Global Coverage | 61M+ Contacts | (Verified E-mail, Direct Dails)| Decision Makers Contacts| 20+ Attributes [Dataset]. https://datarade.ai/data-products/ecommerce-data-ecommerce-store-data-global-coverage-200-exellius-systems
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Sep 7, 2024
Dataset authored and provided by
Exellius Systems
Area covered
Spain, Gabon, Saint Vincent and the Grenadines, Heard Island and McDonald Islands, Seychelles, Jersey, Namibia, Lithuania, Iran (Islamic Republic of), Congo (Democratic Republic of the)
Description
Revolutionize Customer Engagement with Our Comprehensive Ecommerce Data

Our Ecommerce Data is designed to elevate your customer engagement strategies, providing you with unparalleled insights and precision targeting capabilities. With over 61 million global contacts, this dataset goes beyond conventional data, offering a unique blend of shopping cart links, business emails, phone numbers, and LinkedIn profiles. This comprehensive approach ensures that your marketing strategies are not just effective but also highly personalized, enabling you to connect with your audience on a deeper level.

What Makes Our Ecommerce Data Stand Out?

Unique Features for Enhanced Targeting
Our Ecommerce Data is distinguished by its depth and precision. Unlike many other datasets, it includes shopping cart links—a rare and valuable feature that provides you with direct insights into consumer behavior and purchasing intent. This information allows you to tailor your marketing efforts with unprecedented accuracy. Additionally, the integration of business emails, phone numbers, and LinkedIn profiles adds multiple layers to traditional contact data, enriching your understanding of clients and enabling more personalized engagement.

Robust and Reliable Data Sourcing
We pride ourselves on our dual-sourcing strategy that ensures the highest levels of data accuracy and relevance:

Real-Time Information from 10 Active Publication Sites: Our databases are continuously updated with the latest information, sourced from ten active publication sites that provide real-time data.

Dedicated Contact Discovery Team: Complementing our automated sources, our dedicated Contact Discovery Team conducts thorough research and investigations, ensuring that every piece of data is accurate and reliable. This two-pronged approach guarantees that our Ecommerce Data is both up-to-date and relevant, providing you with a solid foundation for your business strategies.

Primary Use Cases Across Industries

Our Ecommerce Data is versatile and can be leveraged across various industries for multiple applications:
- Precision Targeting in Marketing: Create personalized marketing campaigns based on detailed shopping cart activities, ensuring that your outreach resonates with individual customer preferences.
- Sales Enrichment: Sales teams can benefit from enriched client profiles that include comprehensive contact information, enabling them to connect with key decision-makers more effectively.
- Market Research and Analytics: Research and analytics departments can use this data for in-depth market studies and trend analyses, gaining valuable insights into consumer behavior and market dynamics.

Global Coverage for Comprehensive Engagement

Our Ecommerce Data spans across the globe, providing you with extensive reach and the ability to engage with customers in diverse regions:
- North America: United States, Canada, Mexico
- Europe: United Kingdom, Germany, France, Italy, Spain, Netherlands, Sweden, and more
- Asia: China, Japan, India, South Korea, Singapore, Malaysia, and more
- South America: Brazil, Argentina, Chile, Colombia, and more
- Africa: South Africa, Nigeria, Kenya, Egypt, and more
- Australia and Oceania: Australia, New Zealand
- Middle East: United Arab Emirates, Saudi Arabia, Israel, Qatar, and more

Comprehensive Employee and Revenue Size Information

Our dataset also includes detailed information on:
- Employee Size: Whether you’re targeting small businesses or large corporations, our data covers all employee sizes, from startups to global enterprises.
- Revenue Size: Gain insights into companies across various revenue brackets, enabling you to segment the market more effectively and target your efforts where they will have the most impact.

Seamless Integration into Broader Data Offerings

Our Ecommerce Data is not just a standalone product; it is a critical piece of our broader data ecosystem. It seamlessly integrates with our comprehensive suite of business and consumer datasets, offering you a holistic approach to data-driven decision-making:
- Tailored Packages: Choose customized data packages that meet your specific business needs, combining Ecommerce Data with other relevant datasets for a complete view of your market.
- Holistic Insights: Whether you are looking for industry-specific details or a broader market overview, our integrated data solutions provide you with the insights necessary to stay ahead of the competition and make informed business decisions.

Elevate Your Business Decisions with Our Ecommerce Data

In essence, our Ecommerce Data is more than just a collection of contacts—it’s a strategic tool designed to give you a competitive edge in understanding and engaging your target audience. By leveraging the power of this comprehensive dataset, you can elevate your business decisions, enhance customer interactions, and navigate the digital landscape with confi...
oyo-reviews-dataset
kaggle.com
zip
Updated Jun 24, 2023
Cite
Deepkumar patel (2023). oyo-reviews-dataset [Dataset]. https://www.kaggle.com/datasets/deeppatel9095/oyo-reviews-dataset
Explore at:
Available download formats: zip (32300432 bytes)
Dataset updated
Jun 24, 2023
Authors
Deepkumar patel
Description
The inspiration behind creating the OYO Review Dataset for sentiment analysis was to explore the sentiment and opinions expressed in hotel reviews on the OYO Hotels platform. Analyzing the sentiment of customer reviews can provide valuable insights into the overall satisfaction of guests, identify areas for improvement, and assist in making data-driven decisions to enhance the hotel experience.

By collecting and curating this dataset, Deep Patel, Nikki Patel, and Nimil aimed to contribute to the field of sentiment analysis in the context of the hospitality industry. Sentiment analysis allows us to classify the sentiment expressed in textual data, such as reviews, into positive, negative, or neutral categories. This analysis can help hotel management and stakeholders understand customer sentiments, identify common patterns, and address concerns or issues that may affect the reputation and customer satisfaction of OYO Hotels.

The dataset provides a valuable resource for training and evaluating sentiment analysis models specifically tailored to the hospitality domain. Researchers, data scientists, and practitioners can utilize this dataset to develop and test various machine learning and natural language processing techniques for sentiment analysis, such as classification algorithms, sentiment lexicons, or deep learning models.

Overall, the goal of creating the OYO Review Dataset for sentiment analysis was to facilitate research and analysis in the area of customer sentiments and opinions in the hotel industry. By understanding the sentiment of hotel reviews, businesses can strive to improve their services, enhance customer satisfaction, and make data-driven decisions to elevate the overall guest experience.
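As a minimal sketch of the lexicon-based approach mentioned above, the snippet below labels a review as positive, negative, or neutral by counting words from two small word lists. The word lists and the function name are hypothetical and far too small for real use; they only illustrate the three-way classification the description refers to.

```python
import re

# Tiny illustrative lexicons (assumed, not part of the dataset).
POSITIVE = {"clean", "friendly", "comfortable", "great", "helpful"}
NEGATIVE = {"dirty", "rude", "noisy", "broken", "terrible"}


def classify_sentiment(review: str) -> str:
    """Classify a review by the balance of positive and negative words."""
    words = re.findall(r"[a-z']+", review.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for text in (
    "The room was clean and the staff were friendly.",
    "Dirty bathroom and rude reception.",
    "We stayed two nights.",
):
    print(classify_sentiment(text), "<-", text)
```

A learned classifier would normally replace the hand-written lexicon once labelled reviews are available.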

Deep Patel: https://www.linkedin.com/in/deep-patel-55ab48199/
Nikki Patel: https://www.linkedin.com/in/nikipatel9/
Nimil Lathiya: https://www.linkedin.com/in/nimil-lathiya-059a281b1/
‘Deep-NLP’ analyzed by Analyst-2
analyst-2.ai
Updated Mar 31, 2019
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Deep-NLP’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-deep-nlp-9bf3/latest
Explore at:
Dataset updated
Mar 31, 2019
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Deep-NLP’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/samdeeplearning/deepnlp on 28 January 2022.

--- Dataset description provided by original source is as follows ---

What's In The Deep-NLP Dataset?

Sheet_1.csv contains 80 user responses, in the response_text column, to a therapy chatbot. Bot said: 'Describe a time when you have acted as a resource for someone else'. User responded. If a response is 'not flagged', the user can continue talking to the bot. If it is 'flagged', the user is referred to help.

Sheet_2.csv contains 125 resumes, in the resume_text column. Resumes were queried from Indeed.com with keyword 'data scientist', location 'Vermont'. If a resume is 'not flagged', the applicant can submit a modified resume version at a later date. If it is 'flagged', the applicant is invited to interview.

What Do I Do With This?

Classify new resumes/responses as flagged or not flagged.

There are two sets of data here - resumes and responses. Split the data into a train set and a test set to test the accuracy of your classifier. Bonus points for using the same classifier for both problems.

Good luck.
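The train/test procedure described above can be sketched as follows. The rows are hypothetical stand-ins for (text, label) pairs from Sheet_1.csv or Sheet_2.csv, and the majority-class baseline is only a placeholder for a real classifier; it gives a reference accuracy that any trained model should beat.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def majority_baseline(train_rows):
    """Return a classifier that always predicts the most common training label."""
    most_common = Counter(label for _, label in train_rows).most_common(1)[0][0]
    return lambda text: most_common


def accuracy(classifier, rows):
    """Share of rows whose predicted label matches the true label."""
    correct = sum(1 for text, label in rows if classifier(text) == label)
    return correct / len(rows)


# Hypothetical stand-in data: 80 responses, one in four flagged.
rows = [(f"response {i}", "flagged" if i % 4 == 0 else "not flagged") for i in range(80)]
train_rows, test_rows = train_test_split(rows)
baseline = majority_baseline(train_rows)
print(f"baseline accuracy on held-out rows: {accuracy(baseline, test_rows):.2f}")
```

The same three functions can be reused unchanged for the resume sheet, which is in the spirit of the "same classifier for both problems" bonus.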

Acknowledgements

Thank you to Parsa Ghaffari (Aylien), without whom these visuals (cover photo is in Parsa Ghaffari's excellent LinkedIn article on English, Spanish and German positive v. negative sentiment analysis) would not exist.

There is a 'deep natural language processing' kernel. I will update it. I hope you find it useful.

You can use any of the code in that kernel anywhere, on or off Kaggle. Ping me at @_samputnam for questions.

--- Original source retains full ownership of the source dataset ---
Global News Popularity Insights Datset
opendatabay.com
Updated Jul 4, 2025
Cite
Datasimple (2025). Global News Popularity Insights Datset [Dataset]. https://www.opendatabay.com/data/ai-ml/b036c2ea-2b40-4afe-8dc2-1c56302ffdbc
Explore at:
Dataset updated
Jul 4, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Social Media and Networking
Description
This dataset captures the popularity of news articles across various social media platforms, providing valuable insights into how news content performs online [1, 2]. It is a subset of a larger dataset, specifically designed for analysing engagement and reach of news items [1, 2]. The data includes key details about news articles and their final popularity scores on Facebook, Google+, and LinkedIn [1-3]. It serves as an excellent resource for understanding social media trends and the dissemination of news [2].

Columns

The dataset features the following columns:
* IDLink: A unique identifier for each news item [1, 2].
* Title: The title of the news item as it appeared from the official media sources [1, 2].
* Headline: The headline of the news item, also from official media sources [1, 2].
* Source: The original news outlet that published the news item [1, 2].
* Topic: The query topic used to obtain the news items from official media sources [1, 2].
* PublishDate: The date and time when the news item was published [1, 2].
* Facebook: The final popularity score of the news item on Facebook [2, 3].
* GooglePlus: The final popularity score of the news item on Google+ [2, 3].
* LinkedIn: The final popularity score of the news item on LinkedIn [2, 3].

This subset of the data is specifically noted to be missing the 'SentimentTitle' and 'SentimentHeadline' columns that are present in the full dataset [1].
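To show how these columns might be used together, the sketch below averages the three popularity scores per topic. The sample rows, the PublishDate format, and the function name are assumptions for illustration; only the column names come from the description above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows using the documented column names.
SAMPLE_CSV = """IDLink,Title,Headline,Source,Topic,PublishDate,Facebook,GooglePlus,LinkedIn
1,Title A,Headline A,Outlet X,economy,2016-03-30 10:00:00,12,3,5
2,Title B,Headline B,Outlet Y,obama,2016-04-01 08:30:00,40,1,0
3,Title C,Headline C,Outlet X,economy,2016-04-02 17:45:00,8,0,15
"""

PLATFORMS = ("Facebook", "GooglePlus", "LinkedIn")


def mean_popularity_by_topic(rows):
    """Average each platform's final popularity score per Topic."""
    totals = defaultdict(lambda: dict.fromkeys(PLATFORMS, 0.0))
    counts = defaultdict(int)
    for row in rows:
        topic = row["Topic"]
        counts[topic] += 1
        for platform in PLATFORMS:
            totals[topic][platform] += float(row[platform])
    return {
        topic: {p: totals[topic][p] / counts[topic] for p in PLATFORMS}
        for topic in totals
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
averages = mean_popularity_by_topic(rows)
for topic, scores in averages.items():
    print(topic, scores)
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened CSV file handle.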

Distribution

This dataset comprises approximately 37,000 news articles [1]. While exact row counts for files are not specified beyond this total, the dataset format is typically CSV [4].
* Unique Values:
  * IDLink: 37,288 unique values [3].
  * Title: 32,366 unique values [3].
  * Headline: 34,634 unique values [3].
* Source Distribution:
  * Bloomberg: 2% [3].
  * Reuters: 1% [3].
  * Other: 97% (from 35,990 sources) [3].
* Topic Distribution:
  * Economy: 36% [3].
  * Obama: 31% [3].
  * Other: 33% (from 12,165 topics) [3].
* Time Range Sample (2016):
  * 03/29 - 04/03: 2,239 items [5].
  * 04/03 - 04/08: 2,020 items [5].
  * 06/17 - 06/22: 1,650 items [5].
  * 06/27 - 07/02: 2,024 items [5].

The data spans from 2016-03-29 to 2016-07-07 [6].

Usage

This dataset is ideal for:
* Analysing news popularity trends across different social media platforms [2].
* Studying the impact of news content on online engagement [2].
* Exploratory data analysis of news consumption patterns [7].
* Understanding the spread of information in digital environments.
* Developing models to predict social media reach for news articles.
* Insights into media outlets' influence and topic relevance [1, 3].

Coverage

The dataset covers an approximate 8-month period, between November 2015 and July 2016 [2]. The specific subset provided covers 29 March 2016 to 07 July 2016 [6]. It includes news items on four primary topics: economy, Microsoft, Obama, and Palestine [2], with distribution details for 'economy' and 'obama' [3]. The region of coverage is global [8].

License

CC0

Who Can Use It

Data Scientists and Analysts: For exploratory data analysis, feature engineering, and model building related to news popularity and social media engagement [7].

Researchers: Studying media studies, social network analysis, and public opinion.

Marketing Professionals: To understand content virality and optimise news dissemination strategies.

Journalists and Media Organisations: For insights into their content performance and audience engagement on social platforms.

Dataset Name Suggestions

Social Media News Popularity

Online News Engagement Metrics

Digital News Dissemination Data

News Virality on Social Platforms

Global News Popularity Insights

Attributes

Original Data Source: News Popularity in Multiple Social Media Platforms
EDGE-IIOTSET Dataset
paperswithcode.com
Updated Oct 16, 2023
Cite
(2023). EDGE-IIOTSET Dataset [Dataset]. https://paperswithcode.com/dataset/edge-iiotset
Explore at:
Dataset updated
Oct 16, 2023
Description
ABSTRACT In this project, we propose a new comprehensive realistic cyber security dataset of IoT and IIoT applications, called Edge-IIoTset, which can be used by machine learning-based intrusion detection systems in two different modes, namely, centralized and federated learning. Specifically, the proposed testbed is organized into seven layers, including Cloud Computing Layer, Network Functions Virtualization Layer, Blockchain Network Layer, Fog Computing Layer, Software-Defined Networking Layer, Edge Computing Layer, and IoT and IIoT Perception Layer. In each layer, we propose new emerging technologies that satisfy the key requirements of IoT and IIoT applications, such as ThingsBoard IoT platform, OPNFV platform, Hyperledger Sawtooth, Digital twin, ONOS SDN controller, Mosquitto MQTT brokers, Modbus TCP/IP, etc. The IoT data are generated from various IoT devices (more than 10 types), such as low-cost digital sensors for sensing temperature and humidity, Ultrasonic sensor, Water level detection sensor, pH Sensor Meter, Soil Moisture sensor, Heart Rate Sensor, Flame Sensor, etc. We also identify and analyze fourteen attacks related to IoT and IIoT connectivity protocols, which are categorized into five threats, including DoS/DDoS attacks, Information gathering, Man in the middle attacks, Injection attacks, and Malware attacks. In addition, we extract features obtained from different sources, including alerts, system resources, logs, and network traffic, and propose 61 new features with high correlations from the 1176 features found. After processing and analyzing the proposed realistic cyber security dataset, we provide a primary exploratory data analysis and evaluate the performance of machine learning approaches (i.e., traditional machine learning as well as deep learning) in both centralized and federated learning modes.

Instructions:

Great news! The Edge-IIoT dataset has been featured as a "Document in the top 1% of Web of Science." This indicates that it is ranked within the top 1% of all publications indexed by the Web of Science (WoS) in terms of citations and impact.

Please kindly visit kaggle link for the updates: https://www.kaggle.com/datasets/mohamedamineferrag/edgeiiotset-cyber-sec...

Free use of the Edge-IIoTset dataset for academic research purposes is hereby granted in perpetuity. Use for commercial purposes is allowable after asking the lead author, Dr Mohamed Amine Ferrag, who has asserted his right under the Copyright.

The details of the Edge-IIoT dataset were published in the following paper. For academic/public use of these datasets, authors must cite the following paper:

Mohamed Amine Ferrag, Othmane Friha, Djallel Hamouda, Leandros Maglaras, Helge Janicke, "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning", IEEE Access, April 2022 (IF: 3.37), DOI: 10.1109/ACCESS.2022.3165809

Link to paper : https://ieeexplore.ieee.org/document/9751703

The directories of the Edge-IIoTset dataset include the following:

•File 1 (Normal traffic)

-File 1.1 (Distance): This file includes two documents, namely, Distance.csv and Distance.pcap. The IoT sensor (Ultrasonic sensor) is used to capture the IoT data.

-File 1.2 (Flame_Sensor): This file includes two documents, namely, Flame_Sensor.csv and Flame_Sensor.pcap. The IoT sensor (Flame Sensor) is used to capture the IoT data.

-File 1.3 (Heart_Rate): This file includes two documents, namely, Heart_Rate.csv and Heart_Rate.pcap. The IoT sensor (Heart Rate Sensor) is used to capture the IoT data.

-File 1.4 (IR_Receiver): This file includes two documents, namely, IR_Receiver.csv and IR_Receiver.pcap. The IoT sensor (IR (Infrared) Receiver Sensor) is used to capture the IoT data.

-File 1.5 (Modbus): This file includes two documents, namely, Modbus.csv and Modbus.pcap. The IoT sensor (Modbus Sensor) is used to capture the IoT data.

-File 1.6 (phValue): This file includes two documents, namely, phValue.csv and phValue.pcap. The IoT sensor (pH-sensor PH-4502C) is used to capture the IoT data.

-File 1.7 (Soil_Moisture): This file includes two documents, namely, Soil_Moisture.csv and Soil_Moisture.pcap. The IoT sensor (Soil Moisture Sensor v1.2) is used to capture the IoT data.

-File 1.8 (Sound_Sensor): This file includes two documents, namely, Sound_Sensor.csv and Sound_Sensor.pcap. The IoT sensor (LM393 Sound Detection Sensor) is used to capture the IoT data.

-File 1.9 (Temperature_and_Humidity): This file includes two documents, namely, Temperature_and_Humidity.csv and Temperature_and_Humidity.pcap. The IoT sensor (DHT11 Sensor) is used to capture the IoT data.

-File 1.10 (Water_Level): This file includes two documents, namely, Water_Level.csv and Water_Level.pcap. The IoT sensor (Water sensor) is used to capture the IoT data.

•File 2 (Attack traffic):

-File 2.1 (Attack traffic (CSV files)): This file includes 13 documents, namely, Backdoor_attack.csv, DDoS_HTTP_Flood_attack.csv, DDoS_ICMP_Flood_attack.csv, DDoS_TCP_SYN_Flood_attack.csv, DDoS_UDP_Flood_attack.csv, MITM_attack.csv, OS_Fingerprinting_attack.csv, Password_attack.csv, Port_Scanning_attack.csv, Ransomware_attack.csv, SQL_injection_attack.csv, Uploading_attack.csv, Vulnerability_scanner_attack.csv, XSS_attack.csv. Each document is specific for each attack.

-File 2.2 (Attack traffic (PCAP files)): This file includes 14 documents, namely, Backdoor_attack.pcap, DDoS_HTTP_Flood_attack.pcap, DDoS_ICMP_Flood_attack.pcap, DDoS_TCP_SYN_Flood_attack.pcap, DDoS_UDP_Flood_attack.pcap, MITM_attack.pcap, OS_Fingerprinting_attack.pcap, Password_attack.pcap, Port_Scanning_attack.pcap, Ransomware_attack.pcap, SQL_injection_attack.pcap, Uploading_attack.pcap, Vulnerability_scanner_attack.pcap, XSS_attack.pcap. Each document corresponds to one attack.
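
The attack-traffic file names follow a regular pattern, so they can be enumerated in code. The following is a minimal sketch and not part of the dataset's own tooling: the `ATTACKS` list is copied from the file names listed above, while the helper name `attack_files` is hypothetical and introduced only for illustration.

```python
# Attack names copied from the File 2.1 / File 2.2 listings above.
ATTACKS = [
    "Backdoor", "DDoS_HTTP_Flood", "DDoS_ICMP_Flood", "DDoS_TCP_SYN_Flood",
    "DDoS_UDP_Flood", "MITM", "OS_Fingerprinting", "Password",
    "Port_Scanning", "Ransomware", "SQL_injection", "Uploading",
    "Vulnerability_scanner", "XSS",
]


def attack_files(extension):
    """Return the expected file name of every attack (hypothetical helper)."""
    return [f"{name}_attack.{extension}" for name in ATTACKS]


csv_files = attack_files("csv")
pcap_files = attack_files("pcap")
print(len(csv_files), csv_files[0], pcap_files[-1])
```

Counting the entries this way gives 14 files per format, one per attack.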

•File 3 (Selected dataset for ML and DL):

-File 3.1 (DNN-EdgeIIoT-dataset): This file contains a selected dataset for the use of evaluating deep learning-based intrusion detection systems.

-File 3.2 (ML-EdgeIIoT-dataset): This file contains a selected dataset for the use of evaluating traditional machine learning-based intrusion detection systems.

Step 1: Downloading the Edge-IIoTset dataset from the Kaggle platform

from google.colab import files

!pip install -q kaggle

# Upload your kaggle.json API token when prompted
files.upload()

# Place the token where the Kaggle CLI expects it
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

!kaggle datasets download -d mohamedamineferrag/edgeiiotset-cyber-security-dataset-of-iot-iiot -f "Edge-IIoTset dataset/Selected dataset for ML and DL/DNN-EdgeIIoT-dataset.csv"
!unzip DNN-EdgeIIoT-dataset.csv.zip
!rm DNN-EdgeIIoT-dataset.csv.zip

Step 2: Reading the dataset's CSV file into a Pandas DataFrame:

import pandas as pd
import numpy as np

df = pd.read_csv('DNN-EdgeIIoT-dataset.csv', low_memory=False)

Step 3: Exploring some of the DataFrame's contents:

df.head(5)

print(df['Attack_type'].value_counts())

Step 4: Dropping data (columns, duplicated rows, NaN, null values):

from sklearn.utils import shuffle

# Columns removed before training
drop_columns = ["frame.time", "ip.src_host", "ip.dst_host", "arp.src.proto_ipv4", "arp.dst.proto_ipv4",
                "http.file_data", "http.request.full_uri", "icmp.transmit_timestamp", "http.request.uri.query",
                "tcp.options", "tcp.payload", "tcp.srcport", "tcp.dstport", "udp.port", "mqtt.msg"]

df.drop(drop_columns, axis=1, inplace=True)

# Remove rows with missing values, then exact duplicate rows
df.dropna(axis=0, how='any', inplace=True)
df.drop_duplicates(subset=None, keep="first", inplace=True)

df = shuffle(df)

# Check that no missing values remain
df.isna().sum()

print(df['Attack_type'].value_counts())
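
The effect of the Step 4 cleaning sequence is easier to see on a small table. The sketch below is illustrative only: the toy DataFrame, the column `feature_a`, and the label values are made up, and `DataFrame.sample(frac=1)` is used as a pandas-only substitute for `sklearn.utils.shuffle`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset; 'feature_a' and the rows are invented.
toy = pd.DataFrame({
    "frame.time": ["t0", "t1", "t2", "t3", "t4"],
    "feature_a": [1.0, 1.0, np.nan, 2.0, 3.0],
    "Attack_type": ["Normal", "Normal", "MITM", "MITM", "Normal"],
})

toy = toy.drop(columns=["frame.time"])    # drop a column that is not used as a feature
toy = toy.dropna(axis=0, how="any")       # drop rows containing any missing value
toy = toy.drop_duplicates(keep="first")   # drop exact duplicate rows
toy = toy.sample(frac=1, random_state=0)  # shuffle rows (substitute for sklearn's shuffle)

print(toy["Attack_type"].value_counts())
```

Note the order of operations: once `frame.time` is removed, the first two rows become identical, so `drop_duplicates` removes one of them. Dropping columns can therefore increase the number of duplicates found.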

Step 5: Categorical data encoding (dummy encoding):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import preprocessing

def encode_text_dummy(df, name):
    # Replace column `name` with one indicator column per distinct value
    dummies = pd.get_dummies(df[name])
    for x in dummies.columns:
        dummy_name = f"{name}-{x}"
        df[dummy_name] = dummies[x]
    df.drop(name, axis=1, inplace=True)

encode_text_dummy(df, 'http.request.method')
encode_text_dummy(df, 'http.referer')
encode_text_dummy(df, "http.request.version")
encode_text_dummy(df, "dns.qry.name.len")
encode_text_dummy(df, "mqtt.conack.flags")
encode_text_dummy(df, "mqtt.protoname")
encode_text_dummy(df, "mqtt.topic")
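
To show what the dummy encoding in Step 5 produces, here is a minimal sketch on a toy DataFrame. The function body follows the Step 5 definition; the toy data and the `value` column are assumptions made for illustration.

```python
import pandas as pd


def encode_text_dummy(df, name):
    """Replace column `name` in place with one indicator column per value."""
    dummies = pd.get_dummies(df[name])
    for x in dummies.columns:
        df[f"{name}-{x}"] = dummies[x]
    df.drop(name, axis=1, inplace=True)


# Toy data: only the column name 'http.request.method' comes from the dataset.
toy = pd.DataFrame({
    "http.request.method": ["GET", "POST", "GET"],
    "value": [1, 2, 3],
})

encode_text_dummy(toy, "http.request.method")
print(list(toy.columns))
# ['value', 'http.request.method-GET', 'http.request.method-POST']
```

The original column is removed and each distinct value becomes its own indicator column, so a column with many distinct values widens the DataFrame considerably.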

Step 6: Creation of the preprocessed dataset

df.to_csv('preprocessed_DNN.csv', encoding='utf-8')
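
Step 5 imports `train_test_split` and `StandardScaler`, which suggests a split-and-scale stage after preprocessing. The following is a sketch of one possible follow-up, not the authors' documented pipeline: the toy DataFrame stands in for `preprocessed_DNN.csv`, the feature names are hypothetical, and the 80/20 stratified split is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy stand-in for the preprocessed dataset; feature names are hypothetical.
toy = pd.DataFrame({
    "feature_a": np.arange(10, dtype=float),
    "feature_b": np.arange(10, dtype=float) * 2.0,
    "Attack_type": ["Normal", "MITM"] * 5,
})

X = toy.drop(columns=["Attack_type"])
y = toy["Attack_type"]

# Stratify so that each class keeps its proportion in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training split only, then reuse it for the test split.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

When reading the real file back, note that `to_csv` as called in Step 6 also writes the DataFrame index, so `pd.read_csv('preprocessed_DNN.csv', index_col=0)` avoids an extra unnamed column.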

For more information about the dataset, please contact the lead author of this project, Dr. Mohamed Amine Ferrag, by email at mohamed.amine.ferrag@gmail.com

More information about Dr. Mohamed Amine Ferrag is available at:

https://www.linkedin.com/in/Mohamed-Amine-Ferrag

https://dblp.uni-trier.de/pid/142/9937.html

https://www.researchgate.net/profile/Mohamed_Amine_Ferrag

https://scholar.google.fr/citations?user=IkPeqxMAAAAJ&hl=fr&oi=ao

https://www.scopus.com/authid/detail.uri?authorId=56115001200

https://publons.com/researcher/1322865/mohamed-amine-ferrag/

https://orcid.org/0000-0002-0632-3172

Last Updated: 27 Mar. 2023

