25 datasets found
  1. LinkedIn US Retail

    • kaggle.com
    zip
    Updated Apr 22, 2024
    Cite
    Ilya Novoselskiy (2024). LinkedIn US Retail [Dataset]. https://www.kaggle.com/datasets/ilya9711nov/linkedin-us-retail
    Explore at:
    zip (0 bytes)
    Dataset updated
    Apr 22, 2024
    Authors
    Ilya Novoselskiy
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    United States
    Description

    The dataset contains US retail companies with 200-500 employees. For each company, all of its employees were scraped as well.

    For more details about the scraping code, see my article or GitHub code.

  2. LinkedIn Dataset - Israel People Profiles

    • kaggle.com
    Updated May 16, 2023
    Cite
    Joseph from Proxycurl (2023). LinkedIn Dataset - Israel People Profiles [Dataset]. https://www.kaggle.com/datasets/proxycurl/10000-israel-people-profiles
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 16, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Joseph from Proxycurl
    Area covered
    Israel
    Description

    Full profiles of 10,000 people in Israel (download here, data schema here), with more than 40 data points, including:

    • Full Name
    • Education
    • Location
    • Work Experience History

    and many more!

    Millions more Israel people profiles are available; visit the LinkDB product page here.

    Our LinkDB database is an exhaustive database of publicly accessible LinkedIn people and company profiles. It contains close to 500 million people and company profiles globally.

  3. Breast-Cancer-Cell-Dataset

    • huggingface.co
    Updated Jun 7, 2024
    Cite
    Mahadi Hassan (2024). Breast-Cancer-Cell-Dataset [Dataset]. https://huggingface.co/datasets/Mahadih534/Breast-Cancer-Cell-Dataset
    Explore at:
    Croissant
    Dataset updated
    Jun 7, 2024
    Authors
    Mahadi Hassan
    License

    https://choosealicense.com/licenses/cc/

    Description

    Data Source

    https://www.kaggle.com/datasets/andrewmvd/breast-cancer-cell-segmentation

    Dataset Card Authors

    Mahadi Hassan

    Dataset Card Contact

    mahadise01@gmail.com

    LinkedIn: https://www.linkedin.com/in/mahadise01

    GitHub: https://github.com/Mahadih534

  4. brain-tumor-MRI-dataset

    • huggingface.co
    Updated Jun 7, 2024
    Cite
    Mahadi Hassan (2024). brain-tumor-MRI-dataset [Dataset]. https://huggingface.co/datasets/Mahadih534/brain-tumor-MRI-dataset
    Explore at:
    Croissant
    Dataset updated
    Jun 7, 2024
    Authors
    Mahadi Hassan
    License

    https://choosealicense.com/licenses/cc/

    Description

    Data Source

    https://www.kaggle.com/datasets/navoneel/brain-mri-images-for-brain-tumor-detection

    Dataset Card Authors

    Mahadi Hassan

    Dataset Card Contact

    mahadise01@gmail.com

    LinkedIn: https://www.linkedin.com/in/mahadise01

    GitHub: https://github.com/Mahadih534

  5. linkedin-jobs

    • huggingface.co
    Cite
    Haiwei He, linkedin-jobs [Dataset]. https://huggingface.co/datasets/HaiweiHe/linkedin-jobs
    Explore at:
    Authors
    Haiwei He
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description
  6. LinkedIn - Job Posts Insights Dataset

    • kaggle.com
    Updated Feb 27, 2024
    Cite
    Sindhu Madhuri (2024). LinkedIn - Job Posts Insights Dataset [Dataset]. https://www.kaggle.com/datasets/sindhumadhurii/linkedin-job-posts-insights-dataset/discussion
    Explore at:
    Croissant
    Dataset updated
    Feb 27, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sindhu Madhuri
    License

    https://cdla.io/permissive-1-0/

    Description

    The dataset contains information on 30,000+ job postings collected from LinkedIn through 2023. It offers concise information on the job title, company, location, and other key attributes of each posting. This data can be used to gain insights into employment trends and dynamics, identify key skills and experiences that are in high demand, and optimize job postings to attract the right candidates.

    Taxonomy of the Dataset: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13623947%2F85fde0e9bcd9e6532b63e65ca1e5b58a%2FWhatsApp%20Image%202024-02-27%20at%2012.12.59.jpeg?generation=1709016197299811&alt=media

  7. Chest_X-Ray_Images-Dataset

    • huggingface.co
    Updated Jun 7, 2024
    Cite
    Mahadi Hassan (2024). Chest_X-Ray_Images-Dataset [Dataset]. https://huggingface.co/datasets/Mahadih534/Chest_X-Ray_Images-Dataset
    Explore at:
    Croissant
    Dataset updated
    Jun 7, 2024
    Authors
    Mahadi Hassan
    License

    https://choosealicense.com/licenses/cc/

    Description

    Data Source

    https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia

    Dataset Card Authors

    Mahadi Hassan

    Dataset Card Contact

    mahadise01@gmail.com

    LinkedIn: https://www.linkedin.com/in/mahadise01

    GitHub: https://github.com/Mahadih534

  8. Chest_CT-Scan_images-Dataset

    • huggingface.co
    Updated Jun 7, 2024
    Cite
    Mahadi Hassan (2024). Chest_CT-Scan_images-Dataset [Dataset]. https://huggingface.co/datasets/Mahadih534/Chest_CT-Scan_images-Dataset
    Explore at:
    Croissant
    Dataset updated
    Jun 7, 2024
    Authors
    Mahadi Hassan
    License

    https://choosealicense.com/licenses/cc/

    Description

    Data Source

    https://www.kaggle.com/datasets/mohamedhanyyy/chest-ctscan-images

    Dataset Card Authors

    Mahadi Hassan

    Dataset Card Contact

    mahadise01@gmail.com

    LinkedIn: https://www.linkedin.com/in/mahadise01

    GitHub: https://github.com/Mahadih534

  9. ‘Precipitation Prediction in LA’ analyzed by Analyst-2

    • analyst-2.ai
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Precipitation Prediction in LA’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-precipitation-prediction-in-la-8cce/f3c83692/?iid=002-283&v=presentation
    Explore at:
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Precipitation Prediction in LA’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/varunnagpalspyz/precipitation-prediction-in-la on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This Dataset is part of a basic DIY Machine Learning project offered by my college, Indian Institute of Technology, Guwahati (IIT G). The main aim of this project was to get familiar with the workflow and various techniques involved in a Machine Learning project.

    Content

    The dataset is fairly simple and contains various features regarding precipitation:

    • PRCP = Precipitation (tenths of mm)
    • TMAX = Maximum temperature (tenths of degrees C)
    • TMIN = Minimum temperature (tenths of degrees C)
    • PGTM = Peak gust time (hours and minutes, i.e., HHMM)
    • AWND = Average daily wind speed (tenths of meters per second)
    • TAVG = Average temperature (tenths of degrees C)
    • WDFx = Direction of fastest x-minute wind (degrees)
    • WSFx = Fastest x-minute wind speed (tenths of meters per second)
    • WT = Weather Type

    Acknowledgements

    All Credits go to the Coding Club of Indian Institute of Technology, Guwahati (IIT Guwahati). Instagram: https://www.instagram.com/codingclubiitg/ LinkedIn : https://www.linkedin.com/company/coding-club-iitg/

    Inspiration

    Hope that this dataset + my notebook (https://www.kaggle.com/varunnagpalspyz/precipitation-prediction/notebook) helps all beginners like me.

    --- Original source retains full ownership of the source dataset ---

  10. LinkedIn Tech Jobs

    • kaggle.com
    Updated Aug 29, 2023
    Cite
    Joakim Arvidsson (2023). LinkedIn Tech Jobs [Dataset]. https://www.kaggle.com/datasets/joebeachcapital/linkedin-jobs/discussion
    Explore at:
    Croissant
    Dataset updated
    Aug 29, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Joakim Arvidsson
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Over 500 jobs scraped from the job section of LinkedIn.

    | Attribute | Feature's Meaning |
    | ------------- | ------------- |
    | location | The location of the job |
    | designation | The designation of the job |
    | name | Name of the company |
    | industry | Industry in which the company operates |
    | employees_count | Count of employees |
    | linkedin_followers | Number of followers on linkedin |
    | involvement | the nature of involvement in the job, for instance: Full-time, part-time |
    | level | The seniority level like Mid-Senior level |
    | total_applicants | total number of applicants |
    | Skills | Skills required for the job |

  11. Novel COVID-19 Chestxray Repository Dataset

    • paperswithcode.com
    Updated Sep 8, 2021
    Cite
    Pratik Bhowal; Subhankar Sen; Jin Hee Yoon; Zong Woo Geem; Ram Sarkar (2021). Novel COVID-19 Chestxray Repository Dataset [Dataset]. https://paperswithcode.com/dataset/novel-covid-19-chestxray-repository
    Explore at:
    Dataset updated
    Sep 8, 2021
    Authors
    Pratik Bhowal; Subhankar Sen; Jin Hee Yoon; Zong Woo Geem; Ram Sarkar
    Description

    Authors of the Dataset:

    • Pratik Bhowal (B.E., Dept. of Electronics and Instrumentation Engineering, Jadavpur University, Kolkata, India) [LinkedIn], [Github]
    • Subhankar Sen (B.Tech, Dept. of Computer Science Engineering, Manipal University Jaipur, India) [LinkedIn], [Github], [Google Scholar]
    • Jin Hee Yoon (faculty of the Dept. of Mathematics and Statistics at Sejong University, Seoul, South Korea) [LinkedIn], [Google Scholar]
    • Zong Woo Geem (faculty of the College of IT Convergence at Gachon University, South Korea) [LinkedIn], [Google Scholar]
    • Ram Sarkar (Professor at the Dept. of Computer Science Engineering, Jadavpur University, Kolkata, India) [LinkedIn], [Google Scholar]

    Overview

    The authors have created a new dataset, the Novel COVID-19 Chestxray Repository, by fusing publicly available chest X-ray image repositories. Three datasets obtained from GitHub and Kaggle, created by the authors of other research studies in this field, were combined. In our study, frontal and lateral chest X-ray images are used, since these views are widely used by radiologists in clinical diagnosis. The following sections summarize how the dataset was created.

    COVID-19 Radiography Database: The first release of this dataset reports 219 COVID-19, 1345 viral pneumonia, and 1341 normal radiographic chest X-ray images. This dataset was created by a team of researchers from Qatar University, Doha, Qatar, and the University of Dhaka, Bangladesh, in collaboration with medical doctors and specialists from Pakistan and Malaysia. This database is regularly updated with the emergence of new cases of COVID-19 patients worldwide. Related paper: https://arxiv.org/abs/2003.13145

    COVID-Chestxray set: Joseph Paul Cohen, Paul Morrison, and Lan Dao have created a public image repository on GitHub which consists of both CT scans and digital chest X-rays. The data was collected mainly from retrospective cohorts of pediatric patients from Guangzhou Women and Children’s Medical Center. With the aid of the metadata provided along with the dataset, we were able to extract 521 COVID-19 positive images; 239 viral and bacterial pneumonia images, which fall into three broad categories: Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome (SARS), and Acute Respiratory Distress Syndrome (ARDS); and 218 normal radiographic chest X-ray images of varying image resolutions. Related paper: https://arxiv.org/abs/2006.11988

    Actualmed COVID chestxray dataset: The Actualmed-COVID-chestxray-dataset comprises 12 COVID-19 positive and 80 normal radiographic chest X-ray images.

    The combined dataset includes chest X-ray images of the COVID-19, Pneumonia, and Normal (healthy) classes, with a total of 752, 1584, and 1639 images respectively. Information about the Novel COVID-19 Chestxray Database and its parent image repositories is provided in Table 1.

    Table 1: Dataset Description

    | Dataset | COVID-19 | Pneumonia | Normal |
    | ------------- | ------------- | ------------- | ------------- |
    | COVID Chestxray set | 521 | 239 | 218 |
    | COVID-19 Radiography Database (first release) | 219 | 1345 | 1341 |
    | Actualmed COVID chestxray dataset | 12 | 0 | 80 |
    | Total | 752 | 1584 | 1639 |
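
    As a quick sanity check on Table 1, the per-source counts can be summed column by column and compared with the reported totals. The sketch below is only an illustration: the counts are copied from the table, while the names `SOURCES`, `REPORTED_TOTAL`, and `column_totals` are made up for this example.

```python
# Per-source image counts copied from Table 1: (COVID-19, Pneumonia, Normal).
SOURCES = {
    "COVID Chestxray set": (521, 239, 218),
    "COVID-19 Radiography Database (first release)": (219, 1345, 1341),
    "Actualmed COVID chestxray dataset": (12, 0, 80),
}

# Totals as reported in the last row of Table 1.
REPORTED_TOTAL = (752, 1584, 1639)


def column_totals(sources):
    """Sum each class column across all source datasets."""
    return tuple(sum(column) for column in zip(*sources.values()))


totals = column_totals(SOURCES)
print(totals)                    # (752, 1584, 1639)
print(totals == REPORTED_TOTAL)  # True
print(sum(totals))               # 3975 images overall
```

    The reported totals agree with the three parent repositories, giving 3975 images across the three classes.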

    DATA ACCESS AND USE: Academic/Non-Commercial Use

    Dataset License: Database: Open Database, Contents: Database Contents

  12. ‘Deep-NLP’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Mar 31, 2019
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Deep-NLP’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-deep-nlp-9bf3/latest
    Explore at:
    Dataset updated
    Mar 31, 2019
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Deep-NLP’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/samdeeplearning/deepnlp on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    What's In The Deep-NLP Dataset?

    Sheet_1.csv contains 80 user responses, in the response_text column, to a therapy chatbot. Bot said: 'Describe a time when you have acted as a resource for someone else'. User responded. If a response is 'not flagged', the user can continue talking to the bot. If it is 'flagged', the user is referred to help.

    Sheet_2.csv contains 125 resumes, in the resume_text column. Resumes were queried from Indeed.com with keyword 'data scientist', location 'Vermont'. If a resume is 'not flagged', the applicant can submit a modified resume version at a later date. If it is 'flagged', the applicant is invited to interview.

    What Do I Do With This?

    Classify new resumes/responses as flagged or not flagged.

    There are two sets of data here - resumes and responses. Split the data into a train set and a test set to test the accuracy of your classifier. Bonus points for using the same classifier for both problems.
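
    The train/test split suggested above could look roughly like the following sketch. It uses only the Python standard library and placeholder rows, since the real contents of Sheet_1.csv are not reproduced here; the function names, the 20% test fraction, and the majority-label baseline are assumptions for illustration, not part of the original dataset or kernel.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Placeholder (text, label) rows standing in for the 80 chatbot responses.
rows = [(f"response {i}", "flagged" if i % 4 == 0 else "not flagged")
        for i in range(80)]
train, test = train_test_split(rows)

# Trivial baseline, not a real classifier: always predict the most common
# label seen in the training set.
majority_label = Counter(label for _, label in train).most_common(1)[0][0]
baseline = accuracy([majority_label] * len(test), [label for _, label in test])

print(len(train), len(test))  # 64 16
print(majority_label, baseline)
```

    Any real classifier trained on the `train` rows should beat this baseline accuracy on the `test` rows to be considered useful.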

    Good luck.

    Acknowledgements

    Thank you to Parsa Ghaffari (Aylien), without whom these visuals (cover photo is in Parsa Ghaffari's excellent LinkedIn article on English, Spanish and German positive v. negative sentiment analysis) would not exist.

    There Is A 'deep natural language processing' Kernel. I will update it. I Hope You Find It Useful.

    You can use any of the code in that kernel anywhere, on or off Kaggle. Ping me at @_samputnam for questions.

    --- Original source retains full ownership of the source dataset ---

  13. Precipitation Prediction in LA

    • kaggle.com
    Updated Jan 22, 2022
    Cite
    Varun Nagpal Spyz (2022). Precipitation Prediction in LA [Dataset]. https://www.kaggle.com/datasets/varunnagpalspyz/precipitation-prediction-in-la/code
    Explore at:
    Croissant
    Dataset updated
    Jan 22, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Varun Nagpal Spyz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Los Angeles
    Description

    Context

    This Dataset is part of a basic DIY Machine Learning project offered by my college, Indian Institute of Technology, Guwahati (IIT G). The main aim of this project was to get familiar with the workflow and various techniques involved in a Machine Learning project.

    Content

    The dataset is fairly simple and contains various features regarding precipitation:

    • PRCP = Precipitation (tenths of mm)
    • TMAX = Maximum temperature (tenths of degrees C)
    • TMIN = Minimum temperature (tenths of degrees C)
    • PGTM = Peak gust time (hours and minutes, i.e., HHMM)
    • AWND = Average daily wind speed (tenths of meters per second)
    • TAVG = Average temperature (tenths of degrees C)
    • WDFx = Direction of fastest x-minute wind (degrees)
    • WSFx = Fastest x-minute wind speed (tenths of meters per second)
    • WT = Weather Type
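
    Because several fields are described as being stored in tenths of a unit, a small conversion step may help before analysis. The following is a minimal sketch, assuming the values really follow the units listed above; the sample record is invented for illustration, and the helper names `to_whole_units` and `parse_pgtm` are hypothetical.

```python
# Columns described above as being stored in tenths of a unit.
TENTHS_COLUMNS = ("PRCP", "TMAX", "TMIN", "TAVG", "AWND")


def to_whole_units(record):
    """Return a copy of record with tenths-based fields divided by 10."""
    converted = dict(record)
    for name in TENTHS_COLUMNS:
        if converted.get(name) is not None:
            converted[name] = converted[name] / 10
    return converted


def parse_pgtm(hhmm):
    """Split a peak gust time such as 1345 (HHMM) into (hour, minute)."""
    return divmod(int(hhmm), 100)


# Invented sample values, not taken from the dataset.
sample = {"PRCP": 25, "TMAX": 211, "TMIN": 106, "PGTM": 1345}
converted = to_whole_units(sample)

print(converted["PRCP"])                # 2.5  (mm)
print(converted["TMAX"])                # 21.1 (degrees C)
print(parse_pgtm(converted["PGTM"]))    # (13, 45)
```

    The WSFx columns are also listed in tenths of meters per second; they are left out here because their exact column names depend on the value of x.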

    Acknowledgements

    All Credits go to the Coding Club of Indian Institute of Technology, Guwahati (IIT Guwahati). Instagram: https://www.instagram.com/codingclubiitg/ LinkedIn : https://www.linkedin.com/company/coding-club-iitg/

    Inspiration

    Hope that this dataset + my notebook (https://www.kaggle.com/varunnagpalspyz/precipitation-prediction/notebook) helps all beginners like me.

  14. LinkedIn Digital Data

    • kaggle.com
    Updated Sep 8, 2020
    Cite
    Chowdhury Saleh Ahmed Rony (2020). LinkedIn Digital Data [Dataset]. https://www.kaggle.com/salehahmedrony/linkedin-digital-data/discussion
    Explore at:
    Croissant
    Dataset updated
    Sep 8, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Chowdhury Saleh Ahmed Rony
    License

    https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets

    Description

    The LinkedIn and World Bank Group collaboration is a prime example of how technology companies can work with development institutions to bring new data and insights to developing countries to address pressing development challenges. The opportunities and challenges presented by the global economy require the public and private sectors to join forces, share information, share resources, and work towards a common vision to make a meaningful, positive and scalable impact.

    The datasets presented here are ones that underlie the visuals at linkedindata.worldbank.org. The datasets cover four categories of metrics: 1) Industry Employment Shifts, 2) Talent Migration, 3) Industry Skills Needs, and 4) Skill Penetration. LinkedIn and the World Bank Group plan to refresh the data annually at a minimum. The datasets are annual time series and go back to 2015.

    Each category of the metrics is provided in a separate file with a cover sheet listing the variables names, definitions, and caveats. Country coverage varies slightly between metrics because of different data extraction and quality control rules. Countries with at least 100,000 LinkedIn members are included in the datasets. If more countries cross this threshold in the future, new countries can be added during the annual refresh.

  15. Human Resources Data Set

    • kaggle.com
    Updated Oct 19, 2020
    Cite
    Dr. Rich (2020). Human Resources Data Set [Dataset]. http://doi.org/10.34740/kaggle/dsv/1572001
    Explore at:
    Croissant
    Dataset updated
    Oct 19, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dr. Rich
    Description

    Updated 30 January 2023

    Version 14 of Dataset

    License Update:

    There has been some confusion around licensing for this data set. Dr. Carla Patalano and Dr. Rich Huebner are the original authors of this dataset.

    We provide a license to anyone who wishes to use this dataset for learning or teaching. For the purposes of sharing, please follow this license:

    CC-BY-NC-ND This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

    Codebook

    https://rpubs.com/rhuebner/hrd_cb_v14

    PLEASE NOTE -- I recently updated the codebook - please use the above link. A few minor discrepancies were identified between the codebook and the dataset. Please feel free to contact me through LinkedIn (www.linkedin.com/in/RichHuebner) to report discrepancies and make requests.

    Context

    HR data can be hard to come by, and HR professionals generally lag behind with respect to analytics and data visualization competency. Thus, Dr. Carla Patalano and I set out to create our own HR-related dataset, which is used in one of our graduate MSHRM courses called HR Metrics and Analytics, at New England College of Business. We created this data set ourselves. We use the data set to teach HR students how to use and analyze the data in Tableau Desktop - a data visualization tool that's easy to learn.

    This version provides a variety of features that are useful for both data visualization AND creating machine learning / predictive analytics models. We are working on expanding the data set even further by generating even more records and a few additional features. We will be keeping this as one file/one data set for now. There is a possibility of creating a second file perhaps down the road where you can join the files together to practice SQL/joins, etc.

    Note that this dataset isn't perfect. By design, there are some issues that are present. It is primarily designed as a teaching data set - to teach human resources professionals how to work with data and analytics.

    Content

    We have reduced the complexity of the dataset down to a single data file (v14). The CSV revolves around a fictitious company and the core data set contains names, DOBs, age, gender, marital status, date of hire, reasons for termination, department, whether they are active or terminated, position title, pay rate, manager name, and performance score.

    Recent additions to the data include:

    • Absences
    • Most Recent Performance Review Date
    • Employee Engagement Score

    Acknowledgements

    Dr. Carla Patalano provided the baseline idea for creating this synthetic data set, which has been used now by over 200 Human Resource Management students at the college. Students in the course learn data visualization techniques with Tableau Desktop and use this data set to complete a series of assignments.

    Inspiration

    We've included some open-ended questions that you can explore and try to address through creating Tableau visualizations, or R or Python analyses. Good luck and enjoy the learning!

    • Is there any relationship between who a person works for and their performance score?
    • What is the overall diversity profile of the organization?
    • What are our best recruiting sources if we want to ensure a diverse organization?
    • Can we predict who is going to terminate and who isn't? What level of accuracy can we achieve on this?
    • Are there areas of the company where pay is not equitable?

    There are so many other interesting questions that could be addressed through this interesting data set. Dr. Patalano and I look forward to seeing what we can come up with.

    If you have any questions or comments about the dataset, please do not hesitate to reach out to me on LinkedIn: http://www.linkedin.com/in/RichHuebner

    You can also reach me via email at: Richard.Huebner@go.cambridgecollege.edu

  16. Age and Sex Prediction by Artificial Intelligence

    • kaggle.com
    Updated Jul 5, 2025
    Cite
    EMİRHAN BULUT (2025). Age and Sex Prediction by Artificial Intelligence [Dataset]. https://www.kaggle.com/datasets/emirhanai/age-and-sex-prediction-by-artificial-intelligence
    Explore at:
    Croissant
    Dataset updated
    Jul 5, 2025
    Dataset provided by
    Kaggle
    Authors
    EMİRHAN BULUT
    License

    http://www.gnu.org/licenses/agpl-3.0.html

    Description

    Age and Sex Prediction from Image - Convolutional Neural Network with Artificial Intelligence

    I developed artificial intelligence software that predicts your age and gender. It has a 93% accuracy rate. I'm 21 years old, and it predicted my age exactly! I adjusted the algorithm and prepared the code. The system is built on neural networks within a deep learning framework, using convolutional layers from Convolutional Neural Networks. I am pleased to present this software for humanity. Doctoral students can use it in their theses, and companies can use it as well. Upload your photo and let it guess your age and gender!

    Kind regards,

    Emirhan BULUT

    Head of AI & AI Inventor

    The coding language used:

    Python 3.9.8

    Libraries Used:

    TensorFlow

    Keras

    OpenCV

    Matplotlib

    NumPy

    Pandas

    Scikit-learn - (SKLEARN)

    https://raw.githubusercontent.com/emirhanai/Age-and-Sex-Prediction-from-Image---Convolutional-Neural-Network-with-Artificial-Intelligence/main/Age%20and%20Sex%20Prediction%20from%20Image%20-%20Convolutional%20Neural%20Network%20with%20Artificial%20Intelligence.png

    Developer Information:

    Name-Surname: Emirhan BULUT

    Contact (Email) : emirhan@isap.solutions

    LinkedIn : https://www.linkedin.com/in/artificialintelligencebulut/

    Kaggle: https://www.kaggle.com/emirhanai

    Official Website: https://www.emirhanbulut.com.tr

  17. Face Mask Mask Dataset

    • kaggle.com
    Updated Mar 21, 2021
    Cite
    Nikola (2021). Face Mask Mask Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/2045433
    Explore at:
    Croissant
    Dataset updated
    Mar 21, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nikola
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Face mask segmentation mask dataset for more efficient detection and localization.

    • 222 images, 222 masks.
    • Original images are in "images" folder.
    • Segmentation masks with the same names as the images are in the "masks" folder.
    • If more than one mask is present in an image, each has a unique color to distinguish the instances (max 3 masks per image):
      • Mask 1: #ffffff
      • Mask 2: #fdeded
      • Mask 3: #fcdbdb
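
    The color coding above can be turned into instance labels with a small lookup. This is a sketch under assumptions: the three hex codes come from the list above, but treating every other color as background (label 0) and the helper names `hex_to_rgb` and `instance_id` are choices made for this example only.

```python
# Hex colors for each mask instance, as listed in the description.
MASK_COLORS = {1: "#ffffff", 2: "#fdeded", 3: "#fcdbdb"}


def hex_to_rgb(code):
    """Convert a '#rrggbb' string into an (r, g, b) tuple of ints."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


# Reverse lookup from RGB triple to instance number.
RGB_TO_INSTANCE = {hex_to_rgb(color): idx for idx, color in MASK_COLORS.items()}


def instance_id(pixel):
    """Return 1-3 for a mask pixel, or 0 for any other color (assumed background)."""
    return RGB_TO_INSTANCE.get(tuple(pixel[:3]), 0)


print(hex_to_rgb("#fdeded"))           # (253, 237, 237)
print(instance_id((252, 219, 219)))    # 3
print(instance_id((0, 0, 0)))          # 0
```

    Slicing with `pixel[:3]` lets the same lookup work for RGBA pixels, should the mask images carry an alpha channel.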

    Contact: https://www.linkedin.com/in/pericnikola/

    • Big thanks to all users on Pexels and Unsplash - find their user names in the names of the images.

    • Why I made this? I was bored.

    • No animals were hurt during the creation of this dataset (dataset was presented to them and they had absolutely no idea what to do with it).

  18. Advanced: Saudi Arabian Aramco Stocks Dataset 🐪

    • kaggle.com
    Updated May 3, 2024
    Cite
    Azhar Saleem (2024). Advanced: Saudi Arabian Aramco Stocks Dataset 🐪 [Dataset]. https://www.kaggle.com/datasets/azharsaleem/advanced-saudi-arabian-aramco-stocks-dataset/data
    Explore at:
    Croissant
    Dataset updated
    May 3, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Azhar Saleem
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Area covered
    Saudi Arabia
    Description

    Saudi Arabian Oil Company Aramco, Stocks

    👨‍💻 Author: Azhar Saleem

    • GitHub: https://github.com/azharsaleem18
    • Kaggle: https://www.kaggle.com/azharsaleem
    • LinkedIn: https://www.linkedin.com/in/azhar-saleem/
    • YouTube: https://www.youtube.com/@AzharSaleem19
    • Facebook: https://www.facebook.com/azhar.saleem1472/
    • TikTok: https://www.tiktok.com/@azhar_saleem18
    • Twitter: https://twitter.com/azhar_saleem18
    • Instagram: https://www.instagram.com/azhar_saleem18/
    • Email: azharsaleem6@gmail.com

    Dataset Description

    Welcome to the Enhanced Saudi Arabian Oil Company (Aramco) Stock Dataset! This dataset has been meticulously prepared from Yahoo Finance and further enriched with several engineered features to elevate your data analysis, machine learning, and financial forecasting projects. It captures the daily trading figures of Aramco stocks, presented in Saudi Riyal (SAR), providing a robust foundation for comprehensive market analysis.

    Columns in the Dataset

    • Date: The trading day for the data recorded (ISO 8601 format).
    • Open: The price at which the stock first traded upon the opening of an exchange on a given trading day.
    • High: The highest price at which the stock traded during the trading day.
    • Low: The lowest price at which the stock traded during the trading day.
    • Close: The price at which the stock last traded upon the close of an exchange on a given trading day.
    • Volume: The total number of shares traded during the trading day.
    • Dividends: The dividend value paid out per share on the trading day.
    • Stock Splits: The number of stock splits occurring on the trading day.
    • Lag Features (Lag_Close, Lag_High, Lag_Low): Previous day's closing, highest, and lowest prices.
    • Rolling Window Statistics (e.g., Rolling_Mean_7, Rolling_Std_7): 7-day and 30-day moving averages and standard deviations of the Close price.
    • Technical Indicators (RSI, MACD, Bollinger Bands): Key metrics used in trading to analyze short-term price movements.
    • Change Features (Change_Close, Change_Volume): Day-over-day changes in Close price and trading volume.
    • Date-Time Features (Weekday, Month, Year, Quarter): Extracted components of the trading day.
    • Volume_Normalized: The standardized trading volume using z-score normalization to adjust for scale differences.
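
The engineered columns listed above can be illustrated with a minimal sketch in plain Python. The prices and volumes below are made up, the window is shortened to 3 rows for readability, and the helper names are hypothetical. The dataset description does not say whether it uses population or sample standard deviation, or whether rolling windows include the current day, so both choices here are assumptions.

```python
from statistics import mean, pstdev

def lag(values, k=1):
    """Shift values forward by k rows (k >= 1); the first k rows have no predecessor."""
    return [None] * k + list(values[:-k])

def rolling_mean(values, window):
    """Trailing mean over the last `window` rows (current row included).

    Rows before the window is full are None. Including the current row
    is an assumption; the dataset description does not specify it.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out

def day_over_day_change(values):
    """Difference between each row and the previous one; the first row is None."""
    return [None] + [b - a for a, b in zip(values, values[1:])]

def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Made-up sample data, not taken from the dataset.
close = [10.0, 11.0, 12.0, 11.0, 13.0]
volume = [100, 200, 300, 400, 500]

lag_close = lag(close)                      # analogous to Lag_Close
rolling_mean_3 = rolling_mean(close, 3)     # analogous to Rolling_Mean_7, shorter window
change_close = day_over_day_change(close)   # analogous to Change_Close
volume_normalized = zscore(volume)          # analogous to Volume_Normalized
```

With pandas, the same kinds of features are usually built with `Series.shift`, `Series.rolling(...).mean()`, and `Series.diff`.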

    Potential Uses

    This dataset is tailored for a wide array of applications:

    • Financial Analysis: Explore historical performance, volatility, and market trends.
    • Forecasting Models: Utilize features like lagged prices and rolling statistics to predict future stock prices.
    • Machine Learning: Develop regression models or classification frameworks to predict market movements.
    • Deep Learning: Leverage LSTM networks for more sophisticated time-series forecasting.
    • Time-Series Analysis: Dive deep into trend analysis, seasonality, and cyclical behavior of stock prices.

    Whether you are a data scientist, a financial analyst, or a hobbyist interested in the stock market, this dataset provides a rich playground for analysis and model building. Its comprehensive feature set allows for the development of robust predictive models and offers unique insights into one of the world’s most significant oil companies. Unlock the potential of financial data with this carefully crafted dataset.

  19. BCG Data Science Simulation

    • kaggle.com
    Updated Feb 12, 2025
    Cite
    PAVITR KUMAR SWAIN (2025). BCG Data Science Simulation [Dataset]. https://www.kaggle.com/datasets/pavitrkumar/bcg-data-science-simulation
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 12, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    PAVITR KUMAR SWAIN
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Feature Engineering for Churn Prediction

    🚀 BCG Data Science Job Simulation | Forage

    This notebook focuses on feature engineering techniques to enhance a dataset for churn prediction modeling. As part of the BCG Data Science Job Simulation, I transformed raw customer data into valuable features to improve predictive performance.

    📊 What’s Inside?

    • Data Cleaning: Removing irrelevant columns to reduce noise
    • Date-Based Feature Extraction: Converting raw dates into useful insights like activation year, contract length, and renewal month
    • New Predictive Features:
      consumption_trend → Measures if a customer’s last-month usage is increasing or decreasing
      total_gas_and_elec → Aggregates total energy consumption
    • Final Processed Dataset: Ready for churn prediction modeling
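
As a rough illustration of the date-based extraction and the total_gas_and_elec aggregate described above, here is a minimal sketch. The function names, argument names, and the sample row are hypothetical and not taken from the notebook. consumption_trend is left out because the description does not define its formula.

```python
from datetime import date

def date_features(activation, end, renewal):
    """Derive simple features from three contract dates (hypothetical inputs)."""
    return {
        "activation_year": activation.year,
        # Contract length measured in days between activation and end date.
        "contract_length_days": (end - activation).days,
        "renewal_month": renewal.month,
    }

def total_gas_and_elec(gas_consumption, elec_consumption):
    """Aggregate total energy consumption as the sum of gas and electricity."""
    return gas_consumption + elec_consumption

# Made-up customer row for illustration only.
features = date_features(
    activation=date(2013, 6, 15),
    end=date(2016, 6, 15),
    renewal=date(2015, 6, 23),
)
features["total_gas_and_elec"] = total_gas_and_elec(54946, 1200)
```

Measuring contract length in days is one choice among several; months or years may suit a churn model better depending on how the contracts are structured.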

    📂 Dataset Used:

    • clean_data_after_eda.csv → Original dataset after Exploratory Data Analysis (EDA)
    • clean_data_with_new_features.csv → Final dataset after feature engineering

    🛠 Technologies Used:

    • Python (Pandas, NumPy)
    • Data Preprocessing & Feature Engineering

    🌟 Why Feature Engineering? Feature engineering is one of the most critical steps in machine learning. Well-engineered features improve model accuracy and uncover deeper insights into customer behavior.

    🚀 This notebook is a great reference for anyone learning data preprocessing, feature selection, and predictive modeling in Data Science!

    📩 Connect with Me:

    • GitHub Repo: https://github.com/Pavitr-Swain/BCG-Data-Science-Job-Simulation
    • LinkedIn: https://www.linkedin.com/in/pavitr-kumar-swain-ab708b227/

    🔍 Let’s explore churn prediction insights together! 🎯

  20. Tourism in Romania 2004-2022

    • kaggle.com
    Updated Nov 27, 2022
    Cite
    The Devastator (2022). Tourism in Romania 2004-2022 [Dataset]. https://www.kaggle.com/datasets/thedevastator/a-study-of-tourism-in-romania-2004-2022
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Nov 27, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Area covered
    Romania
    Description

    Tourism in Romania 2004-2022

    The Evolution of Transport and Foreign Visitors

    By RomanianDATA Tribe [source]

    About this dataset

    Tourism is a vital component of many economies around the world: it boosts revenue and the job market, drives improvements to a country's infrastructure, and encourages cultural exchange between foreigners and citizens.

    And as the summer holiday season starts, we decided to put together some data regarding tourism in Romania.

    We gathered the data with ease from the CEIC Data platform, which we highly recommend. It offers a plethora of very well-structured datasets covering more than 213 economies, 23 industries, and 18 macroeconomic sectors, compiled from 2,200 sources worldwide. These include up to 8 million macro and industry time series and are available via either web or API.

    Create a visualization that shows how Romania's tourism has changed over the years.

    After you finish the challenge, make sure to fill in ✍️ the participation tracker, then share your makeover data visualization on LinkedIn using #RomanianDATA and #RomanianDATATribe and tagging RomanianDATA Tribe.

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    This dataset contains data on tourism in Romania from 2004 to 2022. It includes the number of foreign visitors and the type of transport they used: air, land, or sea.

    Research Ideas

    • The data could be used to study trends in tourism in Romania over time.
    • The data could be used to study the effects of different types of transport on tourism in Romania.
    • The data could be used to study the effects of different types of visitors on tourism in Romania.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit: provide a link to the license, and indicate if changes were made.
    • ShareAlike: distribute your contributions under the same license as the original.
    • Keep intact: all notices that refer to this license, including copyright notices.

    Columns

    File: Romania - Tourism data RDT June 2022.csv

    | Column name       | Description                                                  |
    |:------------------|:-------------------------------------------------------------|
    | Date              | The date of the observation. (Date)                          |
    | Type of transport | The type of transport used by the foreign visitors. (String) |
    | Foreign Visitors  | The number of foreign visitors. (Integer)                    |
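
A minimal sketch of how the three columns above could be aggregated into yearly totals per transport type, using only the Python standard library. The sample rows are made up, the function name is hypothetical, and the code assumes the Date column begins with a four-digit year (for example `2004-01-01`), which the listing does not confirm.

```python
import csv
import io
from collections import defaultdict

# Made-up rows that follow the documented column names; not real data.
sample = """Date,Type of transport,Foreign Visitors
2004-01-01,Air,100
2004-02-01,Air,150
2004-01-01,Land,400
2005-01-01,Sea,20
"""

def visitors_by_year_and_transport(csv_file):
    """Sum foreign visitors per (year, transport type) pair."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Assumption: the date string starts with the year.
        year = int(row["Date"][:4])
        totals[(year, row["Type of transport"])] += int(row["Foreign Visitors"])
    return dict(totals)

totals = visitors_by_year_and_transport(io.StringIO(sample))
```

To run this against the real file, pass an open file handle instead of the `io.StringIO` wrapper and adjust the year parsing if the dates turn out to use a different format.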

    Acknowledgements

    If you use this dataset in your research, please credit RomanianDATA Tribe.


LinkedIn US Retail

Medium size US Retail companies and people working there
