Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Scraped LinkedIn job postings for data-related roles (Data Analyst, Data Engineer, Data Scientist, etc.)
This dataset contains job postings scraped from LinkedIn, including job titles, companies, locations, descriptions, and job types (remote/hybrid/onsite). The data can be used for data cleaning, NLP analysis, skill extraction, and building AI-powered job application tools.

## Dataset Features

| Column Name | Description |
| --- | --- |
| Title | Job title (e.g., "Data Analyst," "Product Analyst") |
| Company | Hiring company name |
| Location | Job location (city/country) |
| Description | Full job description (may include company info) |
| Job Type | Remote, Hybrid, or Onsite (if available) |
- Data Cleaning & Normalization: Standardize job titles, locations, and descriptions.
- NLP & Skill Extraction: Find the most in-demand skills (Python, SQL, ML, etc.).
- Job Type Analysis: Compare remote vs. onsite job trends.
- AI-Powered Job Tools: Build a Streamlit app to generate:
  - "About Me" sections tailored to job descriptions.
  - Auto-generated cover letters based on job requirements.
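As a rough illustration of the skill-extraction use case, the sketch below counts how many job descriptions mention each skill. The `SKILLS` vocabulary and the sample descriptions are assumptions made for the example; the dataset itself does not ship a skill list, and simple keyword matching is only one possible approach.

```python
import re
from collections import Counter

# Hypothetical skill vocabulary; adjust it to the skills you care about.
SKILLS = ["python", "sql", "excel", "tableau", "machine learning"]

def count_skills(descriptions, skills=SKILLS):
    """Count how many descriptions mention each skill at least once."""
    counts = Counter()
    for text in descriptions:
        lowered = text.lower()
        for skill in skills:
            # Word boundaries keep "sql" from matching inside "mysql".
            if re.search(r"\b" + re.escape(skill) + r"\b", lowered):
                counts[skill] += 1
    return counts

# Made-up descriptions standing in for the Description column.
sample = [
    "Data Analyst with strong SQL and Python skills.",
    "Looking for Python, machine learning, and MySQL experience.",
]
print(count_skills(sample).most_common())
```

Counting each skill once per posting, rather than once per occurrence, keeps a single long description from dominating the ranking.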
GitHub Collaboration: Want to contribute? Join the project here: https://github.com/JoyKimaiyo/Web-scraping-data-jobs-and-automating-about-me-section
Acknowledgments: Data scraped from LinkedIn for educational/non-commercial use.
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
LinkedIn is a widely used professional networking platform that hosts millions of job postings. This dataset contains 1.3 million job listings scraped from LinkedIn in the year 2024.
This dataset can be used for various research tasks such as job market analysis, skills mapping, job recommendation systems, and more.
If you find this dataset valuable, please upvote.
This is the same master dataset that powers SkillExplorer
Photo by Clem Onojeghuo on Unsplash
https://cdla.io/permissive-1-0/
This dataset covers 30,000+ job postings collected from LinkedIn through 2023, with concise information on the job title, company, location, and other key attributes of each posting. The data can be used to gain insights into employment trends and dynamics, identify skills and experience that are in high demand, and optimize job postings to attract the right candidates.
[Figure: Taxonomy of the Dataset]
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The data comprises job-related information from LinkedIn job postings scraped over a 2-day period. Key features include company details and job-specific information such as title, description, and salary. The dataset supports exploring the factors that influence job posting characteristics and has been reformatted from its original source to improve compatibility with a range of machine learning algorithms.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains time-stamped AI & ML job postings scraped from LinkedIn and Indeed over multiple days, covering companies, roles, and locations. It includes:
- link: URL to the job posting
- title: Job title (e.g., Data Scientist, ML Engineer)
- company: Company name
- location: City, state, or country
- date & time: Job posting timestamp
- scrape_date & scrape_time: When the data was collected

Dataset Highlights:
- ~1,550 unique postings, clean and deduplicated
- Ready for EDA, visualization, and ML experiments
- Includes scrape metadata for temporal analysis

Potential Use Cases:
- Trend analysis of AI/ML hiring over time
- Skill extraction and NLP on job titles
- Job classification or predictive modeling projects
- Company hiring insights and labor market research
- Geospatial analysis of AI/ML demand
Included Notebook: EDA_Job_Postings.ipynb
- Exploratory data analysis with top companies, job titles, locations, and word clouds
- Time-series analysis of job postings
License: CC BY 4.0, free for research, educational, and analysis purposes with attribution.
Note: Data was collected via public job postings; no personal candidate information is included. Users can further enrich the dataset using the job links if legally permissible.
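The scrape metadata makes a simple temporal analysis possible. The sketch below records the earliest `scrape_date` on which each `link` appears and then counts newly seen postings per day. It assumes `scrape_date` is an ISO-style `YYYY-MM-DD` string, so that plain string comparison orders dates correctly; the sample rows and example.com links are made up.

```python
from collections import Counter

def first_seen(rows):
    """Map each posting link to the earliest scrape_date it appears with."""
    seen = {}
    for row in rows:
        link, day = row["link"], row["scrape_date"]
        # String comparison is only valid for YYYY-MM-DD dates (assumed format).
        if link not in seen or day < seen[link]:
            seen[link] = day
    return seen

def new_postings_per_day(rows):
    """Count how many postings were first observed on each scrape date."""
    return Counter(first_seen(rows).values())

# Illustrative rows; real rows would come from the dataset's CSV file.
rows = [
    {"link": "https://example.com/jobs/1", "scrape_date": "2025-01-02"},
    {"link": "https://example.com/jobs/1", "scrape_date": "2025-01-01"},
    {"link": "https://example.com/jobs/2", "scrape_date": "2025-01-02"},
]
print(sorted(new_postings_per_day(rows).items()))
```

If the dates turn out to use another format, parse them with `datetime.strptime` before comparing.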
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Data science is a rapidly growing field in the tech industry, and LinkedIn is a popular platform for finding job opportunities in this domain.
This dataset provides valuable insights into data science job postings, including the required skills and software proficiency sought by employers.
If you find this dataset useful, don't forget to hit the upvote button!
Photo by Shahadat Rahman on Unsplash
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Scraper Code - https://github.com/ArshKA/LinkedIn-Job-Scraper
Every day, thousands of companies and individuals turn to LinkedIn in search of talent. This dataset contains a nearly comprehensive record of 124,000+ job postings listed in 2023 and 2024. Each posting carries dozens of attributes covering both the posting and the company, including the title, job description, salary, location, application URL, and work types (remote, contract, etc.). Separate files contain the benefits, skills, and industries associated with each posting. Most jobs are also linked to a company; the companies are listed in another CSV file with attributes such as the company description, headquarters location, number of employees, and follower count.
With so many datapoints, the potential for exploration of this dataset is vast and includes exploring the highest compensated titles, companies, and locations; predicting salaries/benefits through NLP; and examining how industries and companies vary through their internship offerings and benefits. Future updates will permit further exploration into time-based trends, including company growth, prevalence of remote jobs, and demand of individual job titles over time.
Thank you to @zoeyyuzou for scraping an additional 100,000 jobs.
job_postings.csv
- job_id: The job ID as defined by LinkedIn (https://www.linkedin.com/jobs/view/job_id)
- company_id: Identifier for the company associated with the job posting (maps to companies.csv)
- title: Job title.
- description: Job description.
- max_salary: Maximum salary
- med_salary: Median salary
- min_salary: Minimum salary
- pay_period: Pay period for salary (Hourly, Monthly, Yearly)
- formatted_work_type: Type of work (Fulltime, Parttime, Contract)
- location: Job location
- applies: Number of applications that have been submitted
- original_listed_time: Original time the job was listed
- remote_allowed: Whether job permits remote work
- views: Number of times the job posting has been viewed
- job_posting_url: URL to the job posting on a platform
- application_url: URL where applications can be submitted
- application_type: Type of application process (offsite, complex/simple onsite)
- expiry: Expiration date or time for the job listing
- closed_time: Time to close job listing
- formatted_experience_level: Job experience level (entry, associate, executive, etc)
- skills_desc: Description detailing required skills for job
- listed_time: Time when the job was listed
- posting_domain: Domain of the website with application
- sponsored: Whether the job listing is sponsored or promoted.
- work_type: Type of work associated with the job
- currency: Currency in which the salary is provided.
- compensation_type: Type of compensation for the job.
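Because salaries are reported per `pay_period`, comparing postings usually requires putting them on a common basis first. The sketch below converts an amount to an approximate yearly figure. The conversion factors (2,080 working hours and 12 months per year) are assumptions, as is the casing of the `pay_period` values, which the code normalizes to upper case; the sample rows are made up.

```python
# Assumed conversion factors: 40 hours x 52 weeks, and 12 months per year.
PERIODS_PER_YEAR = {"HOURLY": 2080, "MONTHLY": 12, "YEARLY": 1}

def annualize(amount, pay_period):
    """Return an approximate yearly amount, or None when it cannot be derived."""
    if amount is None or str(amount).strip() == "":
        return None
    factor = PERIODS_PER_YEAR.get(str(pay_period).strip().upper())
    if factor is None:
        # Unknown or missing pay period: leave the value undefined.
        return None
    return float(amount) * factor

# Illustrative rows shaped like job_postings.csv; the values are made up.
rows = [
    {"job_id": "1", "min_salary": "25", "max_salary": "35", "pay_period": "Hourly"},
    {"job_id": "2", "min_salary": "5000", "max_salary": "", "pay_period": "Monthly"},
]
for row in rows:
    low = annualize(row["min_salary"], row["pay_period"])
    high = annualize(row["max_salary"], row["pay_period"])
    print(row["job_id"], low, high)
```

Returning `None` for missing or unrecognized values keeps incomplete rows from being silently treated as zero.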
job_details/benefits.csv
- job_id: The job ID
- type: Type of benefit provided (401K, Medical Insurance, etc)
- inferred: Whether the benefit was explicitly tagged or inferred through text by LinkedIn
company_details/companies.csv
- company_id: The company ID as defined by LinkedIn
- name: Company name
- description: Company description
- company_size: Company grouping based on number of employees (0 Smallest - 7 Largest)
- country: Country of company headquarters.
- state: State of company headquarters.
- city: City of company headquarters.
- zip_code: ZIP code of company's headquarters.
- address: Address of company's headquarters
- url: Link to company's LinkedIn page
company_details/employee_counts.csv
- company_id: The company ID
- employee_count: Number of employees at company
- follower_count: Number of company followers on LinkedIn
- time_recorded: Unix time of data collection
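Since `employee_counts.csv` can hold several snapshots per company, a common first step is to keep only the most recent one. The sketch below does that and converts `time_recorded` to a UTC datetime. It assumes the Unix time is expressed in seconds; if the values are in milliseconds, divide by 1000 first. The sample rows are made up.

```python
from datetime import datetime, timezone

def latest_counts(rows):
    """Keep the most recent employee_counts row for each company_id."""
    latest = {}
    for row in rows:
        cid = row["company_id"]
        if cid not in latest or float(row["time_recorded"]) > float(latest[cid]["time_recorded"]):
            latest[cid] = row
    return latest

def to_utc(unix_seconds):
    """Convert Unix time (assumed to be in seconds) to a timezone-aware datetime."""
    return datetime.fromtimestamp(float(unix_seconds), tz=timezone.utc)

# Illustrative rows shaped like employee_counts.csv; the values are made up.
rows = [
    {"company_id": "1", "employee_count": "50", "follower_count": "900", "time_recorded": "1700000000"},
    {"company_id": "1", "employee_count": "60", "follower_count": "950", "time_recorded": "1710000000"},
    {"company_id": "2", "employee_count": "12", "follower_count": "40", "time_recorded": "1705000000"},
]
for cid, row in latest_counts(rows).items():
    print(cid, row["employee_count"], to_utc(row["time_recorded"]).date())
```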
If you find this dataset helpful, your upvote would convince me I didn't waste my summer break.
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
LinkedIn is a popular professional networking platform with millions of job postings across various industries.
This dataset provides a raw dump of data science-related job postings collected from LinkedIn. It includes information about job titles, companies, locations, search parameters, and other relevant details.
The main objective of this dataset is not only to provide insights into the data science job market and the skills required by professionals in this field but also to offer users an opportunity to practice their data cleaning skills.
By working with this dataset, users can gain hands-on experience in cleaning and preprocessing raw data, a critical skill for aspiring data scientists.
If you find this dataset useful or interesting, please upvote it!
Photo by Luke Chesser on Unsplash
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by RATNESH SATYARTHI
Released under Apache 2.0
This dataset was extracted using an API and contains LinkedIn job data for 10 common job roles in India.
The data can be used for a comprehensive analysis on application trends, peak posting times, popular job titles, company dynamics, geographical patterns, sector-specific insights, job freshness, company-specific behaviors, and predictive modeling. Predictive modeling can be used to anticipate job market dynamics. In summary, this dataset facilitates a thorough exploration of the job market, providing actionable insights for both job seekers and employers.
id: Unique identifier for each job posting (Integer).
publishedAt: Date when the job was published (String, formatted as 'YYYY-MM-DD').
title: Job title (String).
companyName: Name of the hiring company (String).
postedTime: Time since the job was posted (String).
applicationsCount: Number of job applications received (Float).
description: Job description, including required skills (String).
contractType: Type of employment contract (String).
experienceLevel: Level of experience required for the job (String).
workType: Type of work arrangement (String).
sector: Industry sector of the job (String).
companyId: Unique identifier for the hiring company (Integer).
city: City where the job is located (String).
state: State where the job is located (String).
recently_posted_jobs: Indicates whether the job is recently posted (String, 'Yes' or 'No').
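One of the analyses mentioned above, peak posting times, can be approximated from `publishedAt` alone. The sketch below counts postings per weekday, relying on the documented 'YYYY-MM-DD' format and skipping values that do not parse. The sample dates are made up.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def postings_by_weekday(published_dates):
    """Count postings per weekday from 'YYYY-MM-DD' strings."""
    counts = Counter()
    for value in published_dates:
        try:
            day = datetime.strptime(value, "%Y-%m-%d")
        except (TypeError, ValueError):
            # Skip missing or malformed dates instead of failing.
            continue
        counts[WEEKDAYS[day.weekday()]] += 1
    return counts

# Made-up values standing in for the publishedAt column.
dates = ["2024-01-01", "2024-01-08", "2024-01-03", "", None]
print(postings_by_weekday(dates).most_common())
```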
This dataset contains a curated collection of job listings sourced from LinkedIn, featuring a variety of positions across multiple industries and locations. Each entry includes essential details such as job title, company name, job location, employment type, and base pay range, alongside a comprehensive job summary and required qualifications.
This dataset is ideal for researchers, data scientists, and job seekers looking to analyze job market trends, understand salary expectations, or develop predictive models for career growth. Use this resource to gain insights into the evolving job landscape and make informed career decisions.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Muhammad Sufyan
Released under CC0: Public Domain
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains 1,258 job postings collected from LinkedIn between 2019 and 2025. The dataset was compiled manually from jobs the author applied to and is used to study the transparency and structural characteristics of online job postings. The various attributes in the dataset are:
If you use this dataset in your research, please cite: Zagabathuni, Y. (2025). LinkedIn Job Posting Transparency Dataset (2019-2025). Kaggle.
https://creativecommons.org/publicdomain/zero/1.0/
This comprehensive dataset contains a curated collection of 100 job postings for Data Analyst positions, sourced from LinkedIn. As the demand for skilled data analysts continues to surge, this dataset serves as a valuable resource for data enthusiasts, aspiring data analysts, and researchers alike.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains information about job postings on LinkedIn. The data is divided into several files, each containing different aspects of the job postings:
- job_postings.csv: This file contains detailed information about each job posting, including the job title, description, salary, work type, location, and more.
- companies.csv: This file contains detailed information about each company that posted a job, including the company name, website, description, size, location, and more.
- company_industries.csv: This file contains the industries associated with each company.
- company_specialities.csv: This file contains the specialties associated with each company.
- employee_counts.csv: This file contains the employee and follower counts for each company.
- benefits.csv: This file contains the benefits associated with each job.
- job_industries.csv: This file contains the industries associated with each job.
- job_skills.csv: This file contains the skills associated with each job.

This dataset can be used for various purposes such as:
- Analyzing the job market
- Analyzing company trends
- Analyzing salary trends
- Building a job recommendation system
- Natural Language Processing (NLP) tasks such as keyword extraction, topic modeling, etc.
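Because the data is split across files, most analyses start by joining them. The sketch below attaches the skills from job_skills.csv to each posting from job_postings.csv. The column names are assumptions: it presumes both files share a `job_id` column and that the skill column is called `skill_abr`, which the description above does not confirm. The sample rows are made up.

```python
from collections import defaultdict

def skills_by_job(skill_rows):
    """Group skill values by job_id (column names are assumed, not documented)."""
    grouped = defaultdict(list)
    for row in skill_rows:
        grouped[row["job_id"]].append(row["skill_abr"])
    return grouped

def attach_skills(posting_rows, skill_rows):
    """Return postings with a 'skills' list; postings without skills get []."""
    grouped = skills_by_job(skill_rows)
    return [{**posting, "skills": grouped.get(posting["job_id"], [])} for posting in posting_rows]

# Illustrative rows; real rows would be read with csv.DictReader.
postings = [
    {"job_id": "10", "title": "Data Analyst"},
    {"job_id": "11", "title": "ML Engineer"},
]
skills = [
    {"job_id": "10", "skill_abr": "ANLS"},
    {"job_id": "10", "skill_abr": "IT"},
]
for posting in attach_skills(postings, skills):
    print(posting["title"], posting["skills"])
```

This behaves like a left join: every posting is kept, including those with no matching skill rows.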
This dataset was collected from LinkedIn. Please note that the data may be subject to LinkedIn's terms of use.
This dataset is released under the Open Database License (ODbL).
http://opendatacommons.org/licenses/dbcl/1.0/
The jobs_linkedin.csv file contains the results scraped from the LinkedIn website. It includes the following columns:
1. title: Signifies the job title associated with each entry.
2. location: Provides information about the job's location.
3. time: Indicates the timestamp when the job post was uploaded.
4. link: Contains a unique identifier (UUID) and a direct link to the respective job post.
5. desc: Contains the comprehensive description of each job opportunity.
For a more detailed exploration of my NLP work, please refer to: - LinkedIn-NLP-Notebook - LinkedIn-NLP&DL-Notebook
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Overview
This dataset contains detailed information on job postings sourced from LinkedIn, collected manually or via web scraping tools. It captures a variety of fields that offer insights into job market trends, in-demand skills, company hiring behavior, salary patterns, and geographical distributions.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Collated from 8 different data sources. Filtered to Data and ML jobs only, keeping titles and descriptions. Applied text data cleaning and preprocessing, documented here: https://tianyimasf.github.io/ai/data-cleaning/.
LinkedIn-Tech-Job-Data: A compilation of job posts and metadata scraped from various tech categories on LinkedIn
Data Analyst Jobs: This dataset was created by picklesueat and contains more than 2000 job listings for data analyst positions
US Job Postings from 2023-05-05: This dataset is an excerpt of our web scraping activities at Techmap.io and contains a sample of 33k Job Postings from the USA on May 5th 2023.
LinkedIn Job Postings Dataset: This dataset contains information about job postings on LinkedIn.
LinkedIn Job Postings - Machine Learning Data Set: The data comprises job-related information from LinkedIn job postings scraped over a 2-day period.
Linkedin Canada: Data Science Jobs 2024: The "LinkedIn Canada: Data Science Jobs 2024" dataset presents an insightful overview of the data science job market in Canada as sourced from LinkedIn.
Data Scientist - Linkedin Job Postings: This dataset provides valuable insights into data science job postings, including the required skills and software proficiency sought by employers.
LinkedIn Job Postings Dataset: This dataset contains information about job postings on LinkedIn.
Initially used for my project analyzing the data job market, including titles, skills, and company functions. Could be used for other purposes such as job posting generation.
This dataset was created by davideev9