3 datasets found
  1. The Backbencher Dataset

    • kaggle.com
    zip
    Updated Feb 8, 2025
    Cite
    Abhishek sahoo (2025). The Backbencher Dataset [Dataset]. https://www.kaggle.com/datasets/abhisheksahoo15/the-backbencher-dataset
    Available download formats: zip (3798 bytes)
    Dataset updated
    Feb 8, 2025
    Authors
    Abhishek sahoo
    Description

    The Backbencher Dataset is a unique and fun dataset designed to analyze and explore student behaviors and trends in classrooms. This dataset focuses on the attendance patterns, assignment completion rates, and other factors that influence a student’s academic performance, with a quirky twist: it includes a column identifying whether the student wears glasses!

    This dataset is ideal for Machine Learning practitioners and Data Science enthusiasts who want to work on real-world datasets with an engaging context. It can be used for various ML problems such as:

    Predictive Analytics: Predicting student performance based on attendance and assignments.
    Clustering Analysis: Grouping students based on shared characteristics.
    Classification Tasks: Classifying students as "active" or "inactive" based on participation metrics.

    Key Features:
      USN: Unique Student Number for identification.
      Name: Student names (for reference).
      Attendance (%): Percentage of classes attended.
      Assignments Completed: Number of assignments completed.
      Exam Scores: Performance in exams.
      Participation in Activities: Measures involvement in extracurricular activities.
      Glasses (Yes/No): Whether the student wears glasses (interesting feature for pattern recognition).

    Use Cases:
      Educational data analysis and predictive modeling.
      Creating engaging ML projects for students and beginners.
      Developing dashboards for visualizing student performance trends.
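
    The prediction and classification tasks listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the column names are taken from the Key Features list, but the sample rows are invented, "Participation in Activities" is assumed to be numeric, and the cutoffs in label_activity are hypothetical choices rather than anything defined by the dataset.

```python
import csv
import io

# Hypothetical rows: the column names follow the description above, but the
# values are invented for illustration and are NOT taken from the dataset.
SAMPLE_CSV = """USN,Name,Attendance (%),Assignments Completed,Exam Scores,Participation in Activities,Glasses (Yes/No)
S001,Student A,90,10,80,3,Yes
S002,Student B,60,4,50,0,No
S003,Student C,75,7,65,1,No
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def label_activity(row, min_attendance=75.0, min_activities=1):
    """Label a student "active" or "inactive" using assumed cutoffs."""
    attended = float(row["Attendance (%)"]) >= min_attendance
    involved = float(row["Participation in Activities"]) >= min_activities
    return "active" if attended and involved else "inactive"


rows = load_rows(SAMPLE_CSV)
attendance = [float(r["Attendance (%)"]) for r in rows]
scores = [float(r["Exam Scores"]) for r in rows]

# Predictive analytics: exam score as a linear function of attendance.
slope, intercept = fit_line(attendance, scores)

# Classification: one label per student from the participation metrics.
labels = [label_activity(r) for r in rows]
```

    With the real file, SAMPLE_CSV would be replaced by the contents of the downloaded CSV, and the actual header names should be checked first since they may differ from the ones assumed here.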

  2. Predict Human Character from a Facial Image

    • kaggle.com
    zip
    Updated Mar 11, 2022
    Cite
    Gerry (2022). Predict Human Character from a Facial Image [Dataset]. https://www.kaggle.com/datasets/gpiosenka/good-guysbad-guys-image-data-set/code
    Available download formats: zip (614433423 bytes)
    Dataset updated
    Mar 11, 2022
    Authors
    Gerry
    Description

    All data in this dataset was gathered from PUBLICLY accessible web sites or databases. This dataset consists of 2 classes, savory and unsavory. The unsavory class is populated with facial images of convicted felons. The savory class is populated with facial images of "ordinary" people. Granted, some "ordinary" people may be convicted felons, but I expect the percentage is very low. All downloaded images were processed by a custom duplicate image detector before being split into a train set, a validation set and a test set. This is meant to prevent the same image from appearing in more than one of these sets. All images were cropped from the original downloaded image to just the face using the MTCNN crop module. The crop is such that very little extraneous background is included in the cropped image. This is to prevent the CNN classifier from extracting background features not relevant to the task of classification from a facial image. The train set has 5610 images in the savory class and 5610 images in the unsavory class. The test set has 300 images in the savory class and 300 images in the unsavory class, as does the validation set.
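
    The deduplicate-then-split step described above can be sketched as follows. The author's custom duplicate image detector is not described, so this sketch swaps in a much simpler technique, exact byte-level matching via SHA-256, which will not catch resized or re-encoded copies. The function names, file names and split sizes are hypothetical.

```python
import hashlib
import random


def dedupe(items):
    """Keep the first (name, data) pair for each distinct SHA-256 digest."""
    seen = set()
    unique = []
    for name, data in items:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, data))
    return unique


def split(items, n_valid, n_test, seed=0):
    """Shuffle once, then carve out disjoint test, validation and train sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]
    return train, valid, test


# Stand-in "images": short byte strings instead of real image files.
images = [
    ("a.jpg", b"AAA"),
    ("b.jpg", b"BBB"),
    ("a_copy.jpg", b"AAA"),  # exact duplicate of a.jpg
    ("c.jpg", b"CCC"),
    ("d.jpg", b"DDD"),
    ("e.jpg", b"EEE"),
]

unique_images = dedupe(images)
train, valid, test = split(unique_images, n_valid=1, n_test=1)
```

    Running the deduplication before the split is what guarantees that no image ends up in more than one set; to keep the classes balanced, the same two steps would be applied to each class separately.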

    BIASES IN THE DATASET
    Early runs of a CNN classifier led to the discovery of unwanted biases in the dataset which incorrectly influenced classifications. In fact, the CNN was acting as a smile classifier rather than a personality classifier.

    EMOTIONAL BIAS
    Images of ordinary citizens had a preponderance of "smiling" faces. People naturally smile when a photo is taken. Conversely, booking images of felons have few if any that are smiling. Consequently the data set biases the classifier to predict images with smiling faces as savory and non-smiling images as unsavory. To fix that I went to the savory dataset and tried to replace the majority of smiling images with emotionally neutral faces.

    GLASSES BIAS
    I noticed that a much higher percentage of people in the savory class images wore glasses than was the case for the unsavory images. Some people in the unsavory class images wore glasses. I went to the savory class images and tried to replace images with glasses with emotionally neutral images, but did leave some with glasses.

    RACIAL BIAS
    From the outset I was aware of the potential of embedding racial bias in the dataset. While not counting each case, I tried, for example, to include as many African Americans in the savory class as I did in the unsavory class. The last thing I want to build is a racist classifier!

    OTHER POTENTIAL BIASES
    I am aware there may well be additional biases built into the dataset. For example, I notice that there is a higher percentage of long-haired images in the unsavory class than in the savory class. Also, I believe more of the images in the unsavory class sport beards than in the savory class. I did not deal with these potential problems. There may well be other, less apparent biases in the dataset. If you observe any, please advise of same.

    INTERPRETATIONS
    I believe this dataset provides the basis for a CNN classifier to predict with reasonably high probability the genetic predisposition of an individual from a facial image. Results of runs on the test set support this conclusion. However, please note the term genetic predisposition. That does not imply that this is the individual's actual persona, but rather a predisposition toward one class extreme or another. Of course persona is NOT a binary situation but rather a spectrum with a wide range of characteristics. However, one is forced into a binary situation because there is no way to do a search for half-savory facial images. As the old argument goes: Nature vs. Nurture. Individuals with a predisposition toward the unsavory class can of course be highly savory individuals if a sufficient level of positive life experiences is applied to overcome the disposition. Conversely, individuals with a savory predisposition can become unsavory due to negative influences within their lifetime environment. I personally know several individuals where this has taken place. Politicians come to mind: they started out as good guys and ended up as bad guys. So I strongly urge that if you use this data set and develop a strong classifier, you DO NOT make the mistake of treating the classification as that individual's true personality.

    LICENSE
    The dataset is to be limited specifically to personal use only. No use by governments at any level or commercial entities is allowed. If you derive work from this data you are required to post this license as part of the work. You may not limit or extend the terms of use. I have sufficient resources and will use same to take legal action if I discover violation of the terms of the license.

    FACIAL MORPHOLOGY AND BRAIN FUNCTION
    It is a well established fact that a correlation exists between a facial image and certain features of brain function. The clearest exa...

  3. LinkedIn Profile Data

    • kaggle.com
    zip
    Updated Mar 21, 2020
    Cite
    Om Ashish Mishra (2020). LinkedIn Profile Data [Dataset]. https://www.kaggle.com/omashish/linkedin-profile-data
    Available download formats: zip (2415431 bytes)
    Dataset updated
    Mar 21, 2020
    Authors
    Om Ashish Mishra
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    LinkedIn is a place for growing your connections and showing your skills and achievements. This data was collected in order to understand features such as promotions, regional analysis and facial characteristics.

    Content

    The data consists of around 15000 profiles. The data set covers many features, such as region, the way the images are uploaded, the emotions shown in them and the growth of the users over time.

    The attributes are described below:

    The user id is private and should not be disclosed, although the users' characteristics can be given in order to understand the various behavior patterns of people on LinkedIn.
      c id: Name for each data record; it forms the primary key.

    Profession Columns
      avg time in previous position: The amount of time, in years, spent in the previous position
      avg current position length: The average amount of time the user has been in the current position
      avg previous position length: The average amount of time the user was in the previous position
      m urn: The user id for each profile
      m urn id: This is reduced to a distinct code
      no of promotions: Total number of times the user was promoted
      no of previous positions: The number of previous positions the user has held
      current position length: The number of months the person has been in the current position
      age: The age of the person
      gender: Male or Female
      ethnicity: The percentage of ethnicity
      n followers: Number of followers

    Image Clarity
      beauty: The index used for the analysis of beauty
      beauty female: Predicts whether the user image is more likely to be female or not
      beauty male: Predicts whether the user image is more likely to be male or not
      blur: The degree of shadiness of the image

    Emotion Captured
      emo anger: The percentage of anger found
      emo disgust: The percentage of disgust found
      emo fear: The percentage of fear found
      emo happiness: The percentage of happiness found
      emo neutral: The percentage of neutral
      emo sadness: The percentage of sadness
      emo surprise: The percentage of surprise

    Orientation & Facial Accessories
      glass: Whether the person is wearing glasses, sunglasses or neither
      head pitch: The orientation of the head (up or down)
      head roll: The orientation of the head (sideways rolling; horizontal or vertical)
      head yaw: The orientation of the head (side facing; left or right)
      mouth close: The percentage of closed mouth
      mouth mask: The percentage of masked mouth
      mouth open: The percentage of open mouth
      mouth other: The percentage of other mouth things
      skin acne: The percentage of skin tone
      skin dark_circle: The percentage of dark circle on skin
      skin health: The growth of the skin percentage
      skin stain: The stain percentage on skin
      smile: The smile percentage

    Region Columns
      nationality: The nationality the person belongs to, followed by the percentage of each of:
        african
        celtic english
        east asian
        european
        greek
        hispanic
        jewish
        muslim
        nordic
        south asian

    face_quality: The quality of the face recognized.
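
    As a small illustration of how the emotion columns above might be read, the sketch below picks the highest-scoring emotion for one profile. The underscore-separated column names (emo_anger and so on) are an assumption about how the file spells them, and the sample row is invented, not taken from the dataset.

```python
# Assumed spellings of the emotion columns; check the CSV header before use.
EMOTION_COLUMNS = [
    "emo_anger",
    "emo_disgust",
    "emo_fear",
    "emo_happiness",
    "emo_neutral",
    "emo_sadness",
    "emo_surprise",
]


def dominant_emotion(row):
    """Return the name of the emotion column with the largest value."""
    scores = {col: float(row[col]) for col in EMOTION_COLUMNS}
    return max(scores, key=scores.get)


# Hypothetical profile row, with values as strings as csv.DictReader yields.
sample_row = {
    "emo_anger": "0.02",
    "emo_disgust": "0.01",
    "emo_fear": "0.01",
    "emo_happiness": "0.70",
    "emo_neutral": "0.20",
    "emo_sadness": "0.03",
    "emo_surprise": "0.03",
}

top_emotion = dominant_emotion(sample_row)
```

    The same pattern applies to the mouth and nationality percentage columns, given the matching list of column names.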

    Acknowledgements

    We wouldn't be here without the help of Kagglers. If you owe any attributions or thanks, include them here along with any citations of past research.

    Inspiration

    Always wanted to contribute to the data science community and open up to questions.

The Backbencher Dataset

"Unlock the Secrets of the Backbenchers – A Fun Dataset for Machine Learning Ent
