https://choosealicense.com/licenses/cc0-1.0/
Dataset Card for Resume Dataset
Dataset Summary
Context
A collection of Resume Examples taken from livecareer.com for categorizing a given resume into any of the labels defined in the dataset.
Content
Contains 2,400+ resumes in both string and PDF format. The PDFs are stored in the data folder, grouped into one folder per label; each resume is a PDF file whose filename is the ID defined in the CSV. Inside the… See the full description on the dataset page: https://huggingface.co/datasets/opensporks/resumes.
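The folder layout described above can be sketched as a small path helper. This is a minimal illustration, assuming the layout is `<data folder>/<label>/<id>.pdf`; the function name and the example label and ID are hypothetical and do not come from the dataset.

```python
from pathlib import Path


def pdf_path_for(data_dir: str, label: str, resume_id: str) -> Path:
    """Build the expected PDF location: <data_dir>/<label>/<resume_id>.pdf."""
    return Path(data_dir) / label / f"{resume_id}.pdf"


# Hypothetical label and ID, for illustration only.
example = pdf_path_for("data", "HR", "12345")
print(example.as_posix())
```

In practice the label and ID would be read from the CSV rows rather than hard-coded.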
sankar12345/Resume-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
🔹 Overview: This dataset contains 1,000+ synthetic resumes with key details such as skills, experience, education, job roles, certifications, AI screening scores, and recruiter decisions.
🔹 Features:
Resume_ID: Unique identifier
Name: Candidate's name
Skills: List of relevant technical skills
Experience (Years): Total work experience
Education: Highest qualification
Certifications: Relevant industry certifications
Job Role: Target job position
Recruiter Decision: Hire or Reject
Salary Expectation ($): Expected salary
Projects Count: Number of projects completed
AI Score (0-100): AI-based resume ranking score
🔹 Use Cases:
Resume screening automation
HR analytics & hiring trends
Salary prediction models
AI-powered hiring research
🚀 Use this dataset to build AI models that can predict hiring decisions, analyze job market trends, or optimize HR processes!
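As a rough illustration of the HR analytics use case, the sketch below computes the share of "Hire" decisions per job role. It assumes the data is available as CSV text with the column names listed above; the sample rows are made up and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up rows using the column names listed above; not real dataset values.
SAMPLE_CSV = """Resume_ID,Job Role,Recruiter Decision,AI Score (0-100)
1,Data Scientist,Hire,85
2,Data Scientist,Reject,40
3,Software Engineer,Hire,90
"""


def hire_rate_by_role(csv_text: str) -> dict[str, float]:
    """Return the fraction of 'Hire' decisions for each job role."""
    hires = defaultdict(int)
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        role = row["Job Role"]
        totals[role] += 1
        if row["Recruiter Decision"] == "Hire":
            hires[role] += 1
    return {role: hires[role] / totals[role] for role in totals}


rates = hire_rate_by_role(SAMPLE_CSV)
print(rates)
```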
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset, curated and processed by Neuralframe AI, serves as a comprehensive resource for resume parsing, candidate profiling, and job matching applications. It includes structured information on career objectives, skills, education, work experience, certifications, and other relevant details. The data has been collected from both open-source platforms and Neuralframe AI's proprietary sources, ensuring all data is obtained with explicit consent.
The dataset was first utilized in the Datathon Competition at Bitfest 2025, providing participants with a practical dataset to develop and refine resume parsing algorithms and candidate evaluation systems.
Feel free to explore it, and if you find it helpful or interesting, an upvote would be appreciated!
Thank you.
Original Data Source: Resume Dataset
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Saba06huggingface/resume_dataset
A collection of Resume Examples taken from livecareer.com for categorizing a given resume into any of the labels defined in the dataset.
Dataset Details
Dataset Description
About Dataset
Context
A collection of Resume Examples taken from livecareer.com for categorizing a given resume into any of… See the full description on the dataset page: https://huggingface.co/datasets/Saba06huggingface/resume_dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
RESUME is a dataset for object detection tasks - it contains Heading Paragraph annotations for 338 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a collection of resumes posted on Indeed from February 2019, providing valuable insights into job seekers' skills, experiences, and education across various industries in the United States. It includes key details such as job titles, skills, education levels, and locations, making it an essential resource for anyone looking to analyze hiring trends, skills demand, or job market patterns during this period.
You can download the full dataset here.
Whether you're conducting research on employment trends, developing machine learning models for resume parsing, or performing skills gap analysis, this dataset offers a comprehensive starting point.
For more specific or up-to-date data, or if you need tailored datasets from other platforms, consider leveraging custom web scraping services. PromptCloud offers flexible and scalable data extraction solutions to meet your unique needs, allowing you to focus on analysis and decision-making without worrying about data collection. https://www.promptcloud.com/web-scraping-services/
This file contains the following data fields:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Description:
The dataset consists of 6,570 grayscale images, handpicked and curated for instance segmentation tasks. Each image is annotated to delineate individual object instances, so the dataset can be used to train and evaluate instance segmentation models.
Data Collection Process:
The images within the dataset were collected through a rigorous process involving multiple sources and datasets. Leveraging the capabilities of Roboflow Universe, the team behind the project meticulously handpicked images from various publicly available sources and datasets relevant to the domain of interest. These sources may include online repositories, research datasets, and proprietary collections, ensuring a diverse and representative sample of data.
Preprocessing and Data Integration:
To ensure uniformity and consistency across the dataset, several preprocessing techniques were applied. First, the images were automatically oriented to correct any orientation discrepancies. Next, they were resized to a standardized resolution of 640x640 pixels, facilitating efficient training and inference. Moreover, to simplify the data and focus on the essential features, the images were converted to grayscale.
Furthermore, to augment the dataset and enhance its diversity, multiple datasets were combined and integrated into a single cohesive collection. This involved harmonizing annotation formats, resolving potential conflicts, and ensuring compatibility across different datasets. Through meticulous preprocessing and integration efforts, disparate datasets were seamlessly merged into a unified dataset, enriching its variability and ensuring comprehensive coverage of object instances and scenarios.
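The resize and grayscale steps can be sketched in plain Python on nested lists of pixels. This is only an illustration: the source does not say which resampling method or grayscale weights were used, so nearest-neighbor sampling and the common 0.299/0.587/0.114 luma weights are assumptions, and the function names are hypothetical. Auto-orientation is omitted because it depends on image metadata.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luma values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def resize_nearest(image, width, height):
    """Resize a 2D list of pixels with nearest-neighbor sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


# A tiny made-up 2x2 RGB image standing in for a real resume scan.
tiny = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
processed = resize_nearest(to_grayscale(tiny), 640, 640)
print(len(processed), len(processed[0]))
```

A real pipeline would normally use an imaging library for these steps; the plain-list version only makes the operations explicit.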
Model Details:
The instance segmentation model trained on this dataset uses the Roboflow 3.0 architecture in its Fast variant for efficient inference. Training started from a COCO instance segmentation checkpoint, and the resulting model delineates object boundaries and classifies instances within the images.
Performance Metrics:
The model achieves impressive performance metrics, including a mAP of 76.5%, precision of 76.7%, and recall of 73.5%. These metrics underscore the model's effectiveness in accurately localizing and classifying object instances, demonstrating its suitability for various computer vision tasks.
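The reported precision and recall can be combined into a single F1 score, the harmonic mean of the two. The source does not report F1; the value below is derived only from the two figures above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported above (76.7% and 73.5%).
f1 = f1_score(0.767, 0.735)
print(f"F1 = {f1:.3f}")
```

With these inputs the score comes out at roughly 0.751, which sits between the reported precision and recall as expected for a harmonic mean.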
Conclusion:
In summary, the dataset represents a culmination of meticulous data collection, preprocessing, and integration efforts, resulting in a comprehensive resource for instance segmentation tasks. By combining multiple datasets and leveraging advanced preprocessing techniques, the dataset offers diverse and representative imagery, enabling robust model training and evaluation. With the high-performance instance segmentation model and impressive performance metrics, the dataset serves as a valuable asset for researchers, developers, and practitioners in the field of computer vision.
For further information and access to the dataset, please visit Roboflow Universe.
syedroshanzameer/resume-dataset-classification dataset hosted on Hugging Face and contributed by the HF Datasets community
The resume files have the extension .txt and the corresponding labels are in a file with the extension .lab.
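A minimal sketch of loading such a layout follows. It assumes each `.lab` file shares its base name with the matching `.txt` file, which the description does not state explicitly; the function name is hypothetical.

```python
from pathlib import Path


def pair_resumes_with_labels(folder: str) -> list[tuple[Path, Path]]:
    """Pair each .txt resume with the .lab file that shares its base name."""
    pairs = []
    for txt_path in sorted(Path(folder).glob("*.txt")):
        lab_path = txt_path.with_suffix(".lab")
        # Skip resumes that have no label file next to them.
        if lab_path.exists():
            pairs.append((txt_path, lab_path))
    return pairs
```

Calling `pair_resumes_with_labels("some_folder")` returns a list of `(resume, label)` path pairs, skipping any resume without a label file.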
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Resume Parser2 is a dataset for object detection tasks - it contains Resume Columns 3 annotations for 1,483 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
sam-2577/resume-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Resume Analyser is a dataset for classification tasks - it contains Words annotations for 1,151 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
This is the EVENT data captured from the New York City CV Pilot project that was processed by the independent evaluators at Volpe. Additional data collected and data dictionary are in the attachments.
Each EVENT record documents the details of one application warning that occurred on an Aftermarket Safety Device (ASD) in an equipped host vehicle and includes CV messages from a defined recording time both before and after the warning was generated by the host ASD. Messages in the recording time window include the Basic Safety Messages (BSM) of the host vehicle that received the warning, as well as other BSMs received from the warning target equipped vehicle (for V2V applications) or other nearby equipped vehicles. Depending on the application warning type, MAP messages, Signal Phase and Timing (SPaT) messages, and Traveler Information Messages (TIM) that were heard by the host vehicle may also be included in the event record.
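The recording window around a warning can be sketched as a simple time filter. The `Message` class, its field names, the time unit, and the window lengths below are all hypothetical; the actual schema is defined in the data dictionary attached to the dataset.

```python
from dataclasses import dataclass


@dataclass
class Message:
    kind: str         # e.g. "BSM", "MAP", "SPaT", "TIM"
    timestamp: float  # seconds; hypothetical field name and unit


def messages_in_window(messages, warning_time, before, after):
    """Keep messages recorded within [warning_time - before, warning_time + after]."""
    start, end = warning_time - before, warning_time + after
    return [m for m in messages if start <= m.timestamp <= end]


# Made-up messages around a warning at t = 100 s.
log = [
    Message("BSM", 90.0),
    Message("BSM", 98.5),
    Message("SPaT", 101.0),
    Message("BSM", 120.0),
]
event = messages_in_window(log, warning_time=100.0, before=5.0, after=5.0)
print([(m.kind, m.timestamp) for m in event])
```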
https://www.koncile.ai/en/termsandconditions
AI scanner to extract data from resumes. Reliable and customizable OCR with API & SDK to convert and validate key candidate information for HR workflows.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Resume Dataset
Dataset Description
This dataset contains resume data for different job categories with skills, education, and experience information that can be used for resume classification or career prediction applications.
Data Structure
This dataset is stored in CSV format with the following columns:
id: Unique identifier for each resume category: Job category or field (e.g., HR, IT, Marketing) skills: Comma-separated list of skills mentioned in the… See the full description on the dataset page: https://huggingface.co/datasets/C0ldSmi1e/resume-dataset.
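Because the skills column is described as a comma-separated list, a small helper can turn it into a clean list of strings. This is a sketch only; the function name and the example value are hypothetical, and the truncated part of the column description is not assumed here.

```python
def parse_skills(skills_field: str) -> list[str]:
    """Split a comma-separated skills string and drop empty entries."""
    return [skill.strip() for skill in skills_field.split(",") if skill.strip()]


# Made-up value for illustration.
print(parse_skills("Python, SQL , ,Excel"))
```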
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 3 rows and is filtered where the book is Resumes for dummies. It features 7 columns including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Cv Project is a dataset for object detection tasks - it contains Helmet Person Helmet Glove Vest annotations for 3,131 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Problem Statement
A multinational corporation faced inefficiencies in its recruitment process, including lengthy hiring cycles and difficulty matching candidates to roles effectively. Manual candidate screening was time-consuming and prone to biases, leading to suboptimal hires and increased operational costs. The company needed an automated solution to streamline recruitment and improve the quality of hires.
Challenge
Automating the recruitment process posed several challenges:
Processing large volumes of resumes and job applications efficiently.
Identifying the best candidates based on qualifications, skills, and cultural fit.
Reducing bias in the hiring process while maintaining compliance with employment regulations.
Solution Provided
An AI-driven applicant tracking system (ATS) was developed, leveraging machine learning algorithms to automate candidate screening and matching. The solution was designed to:
Parse resumes and extract relevant candidate information efficiently.
Rank candidates based on their suitability for specific job roles using predictive analytics.
Provide actionable insights for recruiters to make data-driven hiring decisions.
Development Steps
Data Collection
Collected historical hiring data, including resumes, job descriptions, and hiring outcomes, to train machine learning models.
Preprocessing
Standardized and structured resume data, ensuring consistency and compatibility with the applicant tracking system.
Model Training
Built machine learning models to rank candidates based on skills, experience, and job requirements. Integrated natural language processing (NLP) algorithms to analyze resumes and match keywords to job descriptions.
Validation
Tested the system with past hiring data to ensure accuracy in candidate ranking and matching.
Deployment
Implemented the applicant tracking system across the company’s recruitment platforms, enabling seamless automation of the hiring process.
Monitoring & Improvement
Established a feedback loop to refine models based on recruiter input and hiring outcomes, ensuring continuous improvement.
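The keyword-matching idea from the Model Training step can be sketched as a simple overlap score. This is not the case study's actual model, which is not described in detail; it is a minimal stand-in in which the tokenizer, the scoring rule, and all names are assumptions.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def keyword_overlap(resume: str, job_description: str) -> float:
    """Fraction of job-description keywords that also appear in the resume."""
    job_terms = tokenize(job_description)
    if not job_terms:
        return 0.0
    return len(job_terms & tokenize(resume)) / len(job_terms)


def rank_candidates(resumes: dict[str, str], job_description: str) -> list[str]:
    """Return candidate names ordered from best to worst keyword overlap."""
    return sorted(
        resumes,
        key=lambda name: keyword_overlap(resumes[name], job_description),
        reverse=True,
    )


# Made-up resumes and job description.
candidates = {"a": "java", "b": "python and sql", "c": "python"}
print(rank_candidates(candidates, "python sql"))
```

A production system would typically weight terms and handle synonyms; plain set overlap is used here only because it is easy to verify by hand.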
Results
Reduced Hiring Time
The automated system decreased hiring cycles by 40%, accelerating the recruitment process.
Improved Candidate-Job Fit
Advanced matching algorithms ensured that candidates better aligned with job requirements and organizational culture.
Enhanced Recruitment Efficiency
Automation reduced manual workload for recruiters, allowing them to focus on strategic aspects of hiring.
Minimized Bias in Screening
AI-driven algorithms provided unbiased candidate assessments, promoting diversity and inclusion in the hiring process.
Scalable Solution
The system scaled effortlessly to handle recruitment across multiple regions and job levels, supporting the company’s global operations.
The Tampa CV Pilot generates data from vehicle-to-vehicle and vehicle-to-infrastructure interactions. This dataset consists of Basic Safety Messages (BSMs) generated by the onboard units (OBUs) of participant and public transportation vehicles and transmitted to road-side units (RSUs) located throughout the Tampa CV Pilot Study area. The full set of raw BSM data from the Tampa CV Pilot can be found in the ITS Sandbox. The data fields follow the SAE J2735 and J2945/1 standards and their adopted units of measure. This dataset holds a flattened sample of the BSM data from the Tampa CV Pilot. An extra geo column (coreData_position) was added to this dataset to allow for mapping of the geocoded BSM data within Socrata, and a column of random numbers (randomNum) was added to allow for random sampling of data points within Socrata.
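One way a random-number column supports sampling is by thresholding: keep only the rows whose value falls below the desired fraction. The sketch below assumes randomNum is uniformly distributed in [0, 1), which the description does not state; the function name and rows are made up.

```python
def sample_by_random_column(rows, fraction):
    """Keep rows whose randomNum falls below the requested fraction."""
    return [row for row in rows if row["randomNum"] < fraction]


# Made-up rows; real records carry the full set of BSM fields.
rows = [{"randomNum": 0.05}, {"randomNum": 0.5}, {"randomNum": 0.09}]
sample = sample_by_random_column(rows, 0.1)
print(len(sample))
```

If the column uses a different range, the threshold would need to be scaled accordingly.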