5 datasets found

Worldwide Male & Female Height Factors
kaggle.com
zip
Updated Dec 19, 2023
Cite
Rafsun Ahmad (2023). Worldwide Male & Female Height Factors [Dataset]. https://www.kaggle.com/datasets/rafsunahmad/worldwide-male-and-female-height-factors
Available download formats: zip (711360 bytes)
Dataset updated
Dec 19, 2023
Authors
Rafsun Ahmad
License
https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
Description
This dataset covers male and female height worldwide and the impact of protein on height. It also covers the year-on-year rate of change in male and female height. It is a good dataset for data analysis or exploratory data analysis.
A stakeholder-centered determination of High-Value Data sets: the use-case...
data-staging.niaid.nih.gov
Updated Oct 27, 2021
Cite
Anastasija Nikiforova (2021). A stakeholder-centered determination of High-Value Data sets: the use-case of Latvia [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_5142816
Dataset updated
Oct 27, 2021
Dataset provided by
University of Latvia
Authors
Anastasija Nikiforova
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Latvia
Description
The data in this dataset were collected through a survey of Latvian society (2021) aimed at identifying high-value data sets for Latvia, i.e. data sets that, in the view of Latvian society, could create value for the Latvian economy and society. The survey was designed for both individuals and businesses. It is being made public both to act as supplementary data for the paper "Towards enrichment of the open government data: a stakeholder-centered determination of High-Value Data sets for Latvia" (author: Anastasija Nikiforova, University of Latvia) and to allow other researchers to use these data in their own work.

The survey was distributed among Latvian citizens and organisations. The structure of the survey is available in the supplementary file (see Survey_HighValueDataSets.odt).

Description of the data in this data set: structure of the survey and pre-defined answers (if any)

1. Have you ever used open (government) data? - {(1) yes, once; (2) yes, there has been a little experience; (3) yes, continuously; (4) no, it wasn’t needed for me; (5) no, have tried but has failed}
2. How would you assess the value of open government data that are currently available for your personal use or your business? - 5-point Likert scale, where 1 – any to 5 – very high
3. If you ever used the open (government) data, what was the purpose of using them? - {(1) Have not had to use; (2) to identify the situation for an object or an event (e.g. Covid-19 current state); (3) data-driven decision-making; (4) for the enrichment of my data, i.e. by supplementing them; (5) for better understanding of decisions of the government; (6) awareness of governments’ actions (increasing transparency); (7) forecasting (e.g. trends etc.); (8) for developing data-driven solutions that use only the open data; (9) for developing data-driven solutions, using open data as a supplement to existing data; (10) for training and education purposes; (11) for entertainment; (12) other (open-ended question)}
4. What category(ies) of “high value datasets” is, in your opinion, able to create added value for society or the economy? - {(1) Geospatial data; (2) Earth observation and environment; (3) Meteorological; (4) Statistics; (5) Companies and company ownership; (6) Mobility}
5. To what extent do you think the current data catalogue of Latvia’s Open data portal corresponds to the needs of data users/consumers? - 10-point Likert scale, where 1 – no data are useful, but 10 – fully correspond, i.e. all potentially valuable datasets are available
6. Which of the current data categories in Latvia’s open data portals, in your opinion, most corresponds to the “high value dataset”? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transports; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
7. Which of them form your TOP-3? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transports; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
8. How would you assess the value of the following data categories?
8.1. sensor data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
8.2. real-time data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
8.3. geospatial data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
9. What would be these datasets? I.e. what (sub)topic could these data be associated with? - open-ended question
10. Which of the data sets currently available could be valuable and useful for society and businesses? - open-ended question
11. Which of the data sets currently NOT available in Latvia’s open data portal could, in your opinion, be valuable and useful for society and businesses? - open-ended question
12. How did you define them? - {(1) Subjective opinion; (2) experience with data; (3) filtering out the most popular datasets, i.e. basing them on public opinion; (4) other (open-ended question)}
13. How high could the value of these data sets be for you or your business? - 5-point Likert scale, where 1 – not valuable, 5 – highly valuable
14. Do you represent any company/organization (are you working anywhere)? (if “yes”, please fill out the survey twice, i.e. as an individual user AND a company representative) - {yes; no; I am an individual data user; other (open-ended)}
15. What industry/sector does your company/organization belong to? (if you do not work at the moment, please choose the last option) - {Information and communication services; Financial and insurance activities; Accommodation and catering services; Education; Real estate operations; Wholesale and retail trade; repair of motor vehicles and motorcycles; transport and storage; construction; water supply; waste water; waste management and recovery; electricity, gas supply, heating and air conditioning; manufacturing industry; mining and quarrying; agriculture, forestry and fisheries; professional, scientific and technical services; operation of administrative and service services; public administration and defence; compulsory social insurance; health and social care; art, entertainment and recreation; activities of households as employers; CSO/NGO; I am not a representative of any company}
16. To which category does your company/organization belong in terms of its size? - {small; medium; large; self-employed; I am not a representative of any company}
17. What is the age group that you belong to? (if you are an individual user, not a company representative) - {11..15, 16..20, 21..25, 26..30, 31..35, 36..40, 41..45, 46+, “do not want to reveal”}
18. Please indicate the education or scientific degree that corresponds most to you (if you are an individual user, not a company representative) - {master’s degree; bachelor’s degree; Dr. and/or PhD; student (bachelor level); student (master level); doctoral candidate; pupil; do not want to reveal these data}

Format of the file: .xls, .csv (for the first spreadsheet only), .odt

Licenses or restrictions: CC-BY
HR Dataset (Multinational Company)
kaggle.com
zip
Updated Aug 23, 2025
Cite
Data Science Lovers (2025). HR Dataset (Multinational Company) [Dataset]. https://www.kaggle.com/datasets/rohitgrewal/hr-data-mnc/code
Available download formats: zip (69930946 bytes)
Dataset updated
Aug 23, 2025
Authors
Data Science Lovers
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
📹Project Video available on YouTube - https://youtu.be/fykrwQD3HR4

🖇️Connect with me on LinkedIn - https://www.linkedin.com/in/rohit-grewal

Human Resource (HR) Data of a Multi-national Corporation (MNC) - 2 Million Records

This dataset contains HR information for employees of a multinational corporation (MNC). It includes 2 Million (20 Lakhs) employee records with details about personal identifiers, job-related attributes, performance, employment status, and salary information. The dataset can be used for HR analytics, including workforce distribution, attrition analysis, salary trends, and performance evaluation.

The data is available as a CSV file. We analyse this dataset using pandas. The analysis will be helpful for those working in the HR domain.

Using this dataset, we answered multiple questions with Python in our Project.

Q.1) What is the distribution of Employee Status (Active, Resigned, Retired, Terminated)?

Q.2) What is the distribution of work modes (On-site, Remote)?

Q.3) How many employees are there in each department?

Q.4) What is the average salary by Department?

Q.5) Which job title has the highest average salary?

Q.6) What is the average salary in different Departments based on Job Title?

Q.7) How many employees Resigned & Terminated in each department?

Q.8) How does salary vary with years of experience?

Q.9) What is the average performance rating by department?

Q.10) Which country has the highest concentration of employees?

Q.11) Is there a correlation between performance rating and salary?

Q.12) How has the number of hires changed over time (per year)?

Q.13) Compare salaries of Remote vs. On-site employees: is there a significant difference?

Q.14) Find the top 10 employees with the highest salary in each department.

Q.15) Identify departments with the highest attrition rate (Resigned %).
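
A few of these questions can be sketched with pandas group-by operations. The snippet below is only an illustration, not the project's own code: it uses a small hand-made DataFrame with invented values, and it assumes the column names Department, Status and Salary_INR given in the dataset description. It covers Q.1, Q.4 and Q.15.

```python
import pandas as pd

# Tiny hand-made sample; the values are invented for illustration only.
# Column names are assumed to match the dataset description.
df = pd.DataFrame({
    "Department": ["IT", "IT", "HR", "HR", "IT", "Marketing"],
    "Status": ["Active", "Resigned", "Active", "Terminated", "Active", "Resigned"],
    "Salary_INR": [900000, 700000, 500000, 450000, 1100000, 600000],
})

# Q.1: distribution of employee status
status_counts = df["Status"].value_counts()

# Q.4: average salary by department
avg_salary = df.groupby("Department")["Salary_INR"].mean()

# Q.15: attrition rate, taken here as the percentage of rows per
# department whose Status is "Resigned"
attrition_pct = (
    df["Status"].eq("Resigned")
    .groupby(df["Department"])
    .mean()
    .mul(100)
    .sort_values(ascending=False)
)

print(status_counts, avg_salary, attrition_pct, sep="\n\n")
```

The same pattern (a boolean mask or a numeric column, grouped by Department) should extend to Q.7 and Q.9; on the full 2 million rows, memory use may need attention.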

Enrol in our Udemy courses:
1. Python Data Analytics Projects - https://www.udemy.com/course/bigdata-analysis-python/?referralCode=F75B5F25D61BD4E5F161
2. Python For Data Science - https://www.udemy.com/course/python-for-data-science-real-time-exercises/?referralCode=9C91F0B8A3F0EB67FE67
3. Numpy For Data Science - https://www.udemy.com/course/python-numpy-exercises/?referralCode=FF9EDB87794FED46CBDF

These are the main Features/Columns available in the dataset:

1) Unnamed: 0 – Index column (auto-generated, not useful for analysis, will be deleted).

2) Employee_ID – Unique identifier assigned to each employee (e.g., EMP0000001).

3) Full_Name – Full name of the employee.

4) Department – Department in which the employee works (e.g., IT, HR, Marketing, Operations).

5) Job_Title – Designation or role of the employee (e.g., Software Engineer, HR Manager).

6) Hire_Date – The date when the employee was hired by the company.

7) Location – Geographical location of the employee (city, country).

8) Performance_Rating – Performance evaluation score (numeric scale, higher is better).

9) Experience_Years – Number of years of professional experience the employee has.

10) Status – Current employment status (e.g., Active, Resigned).

11) Work_Mode – Mode of working (e.g., On-site, Hybrid, Remote).

12) Salary_INR – Annual salary of the employee in Indian Rupees.
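
As a rough sketch of how these columns might be prepared for analysis, the snippet below reads a few invented rows in the documented layout, drops the auto-generated index column, parses Hire_Date, and splits Location into city and country. The inline CSV text stands in for the real file, whose name and exact date format are not stated here; the "city, country" separator and the new City and Country column names are assumptions.

```python
import io
import pandas as pd

# Invented rows in the documented column layout (a stand-in for the real CSV).
csv_text = """Unnamed: 0,Employee_ID,Hire_Date,Location,Salary_INR
0,EMP0000001,2019-03-14,"Pune, India",800000
1,EMP0000002,2019-11-02,"Berlin, Germany",950000
2,EMP0000003,2021-06-30,"Mumbai, India",700000
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Hire_Date"])

# Drop the auto-generated index column described in item 1.
df = df.drop(columns=["Unnamed: 0"])

# Split "city, country" into two hypothetical helper columns.
# Assumes a single ", " separates city from country.
df[["City", "Country"]] = df["Location"].str.split(", ", n=1, expand=True)

# Hires per year and employees per country, computed from the cleaned columns.
hires_per_year = df["Hire_Date"].dt.year.value_counts().sort_index()
employees_per_country = df["Country"].value_counts()

print(hires_per_year)
print(employees_per_country)
```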
ICARUS Chamber Experiment: UCR-CECERT-APL-Cocker...
gdex.ucar.edu
data.ucar.edu
Updated Aug 23, 2023
Cite
David Cocker (2023). ICARUS Chamber Experiment: UCR-CECERT-APL-Cocker Group_20161009_NOx/Diesel(7-11)/SOA Surrogate/H2O2_Hydroxyl radical_No Seed_EPA2260A - Weihua [Dataset]. http://doi.org/10.5065/n4qp-7k11
Explore at:
Unique identifier
https://doi.org/10.5065/n4qp-7k11
Dataset updated
Aug 23, 2023
Dataset provided by
National Science Foundation http://www.nsf.gov/
Authors
David Cocker
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Oct 9, 2016
Description
Goals: Diesel photooxidation under NOx and SOA surrogate

Summary: 1.1 ppmC surrogate + 25 ppb NOx + 150 uL (7-11) diesel + 1 ppmC H2O2
Instruments: AMS, SMPS, GC1

Organization: UCR-CECERT-APL-Cocker Group
Lab Affiliation: University of California, Riverside, CE-CERT
Chamber: EPA chamber

Experiment Category: Photolysis, any phase
Oxidant: Hydroxyl radical
Reactants: NOx, Diesel (7-11), SOA Surrogate, H2O2
Reaction Type: Photooxidation
Relative Humidity: 0.01
Temperature: 28
Pressure: 0.015'' H2O
Retail Transactions Dataset
kaggle.com
zip
Updated May 18, 2024
Cite
Prasad Patil (2024). Retail Transactions Dataset [Dataset]. https://www.kaggle.com/datasets/prasad22/retail-transactions-dataset/code
Available download formats: zip (37330179 bytes)
Dataset updated
May 18, 2024
Authors
Prasad Patil
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:

Context:

Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.

Inspiration:

The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.

Dataset Information:

The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:

Transaction_ID: A unique identifier for each transaction, represented as a 10-digit number. This column is used to uniquely identify each purchase.

Date: The date and time when the transaction occurred. It records the timestamp of each purchase.

Customer_Name: The name of the customer who made the purchase. It provides information about the customer's identity.

Product: A list of products purchased in the transaction. It includes the names of the products bought.

Total_Items: The total number of items purchased in the transaction. It represents the quantity of products bought.

Total_Cost: The total cost of the purchase, in currency. It represents the financial value of the transaction.

Payment_Method: The method used for payment in the transaction, such as credit card, debit card, cash, or mobile payment.

City: The city where the purchase took place. It indicates the location of the transaction.

Store_Type: The type of store where the purchase was made, such as a supermarket, convenience store, department store, etc.

Discount_Applied: A binary indicator (True/False) representing whether a discount was applied to the transaction.

Customer_Category: A category representing the customer's background or age group.

Season: The season in which the purchase occurred, such as spring, summer, fall, or winter.

Promotion: The type of promotion applied to the transaction, such as "None," "BOGO (Buy One Get One)," or "Discount on Selected Items."
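
A minimal loading sketch for these columns follows. It uses two invented rows rather than the real file, and it assumes that Product is stored as the text of a Python list (for example "['Milk', 'Bread']"), which the description does not confirm. Cost_Per_Item is a hypothetical helper column added only for illustration.

```python
import ast
import pandas as pd

# Invented rows using the documented column names.
# Assumption: Product arrives as a stringified Python list.
df = pd.DataFrame({
    "Transaction_ID": [1000000001, 1000000002],
    "Date": ["2023-01-05 14:32:10", "2023-07-19 09:05:44"],
    "Product": ["['Milk', 'Bread']", "['Soap']"],
    "Total_Items": [3, 1],
    "Total_Cost": [12.50, 2.99],
    "Discount_Applied": [True, False],
})

# Parse timestamps and turn the Product text into real Python lists.
df["Date"] = pd.to_datetime(df["Date"])
df["Product"] = df["Product"].apply(ast.literal_eval)

# One row per (transaction, product), a convenient shape for basket analysis.
items = df.explode("Product")[["Transaction_ID", "Product"]]

# Hypothetical helper column: average cost per purchased item.
df["Cost_Per_Item"] = df["Total_Cost"] / df["Total_Items"]

print(items)
```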

Use Cases:

Market Basket Analysis: Discover associations between products and uncover buying patterns.

Customer Segmentation: Group customers based on purchasing behavior.

Pricing Optimization: Optimize pricing strategies and identify opportunities for discounts and promotions.

Retail Analytics: Analyze store performance and customer trends.
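
To make the market basket use case concrete, here is a small sketch that counts how often pairs of products appear together and derives support and confidence from those counts. The baskets are invented, and this plain-Python counting is just one simple way to approach the task; it is not a method prescribed by the dataset author.

```python
from collections import Counter
from itertools import combinations

# Invented baskets; in practice these would come from the Product column.
baskets = [
    ["Milk", "Bread", "Eggs"],
    ["Milk", "Bread"],
    ["Bread", "Soap"],
    ["Milk", "Eggs"],
]

# Count each unordered product pair once per basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(set(basket)), 2):
        pair_counts[pair] += 1

# Support: fraction of all baskets that contain the pair.
n_baskets = len(baskets)
support = {pair: count / n_baskets for pair, count in pair_counts.items()}


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = sum(antecedent in b for b in baskets)
    both = sum(antecedent in b and consequent in b for b in baskets)
    return both / with_antecedent if with_antecedent else 0.0


print(pair_counts.most_common(2))
print(confidence("Milk", "Bread"))
```

Sorting each basket before pairing keeps ("Bread", "Milk") and ("Milk", "Bread") under a single key. For large transaction volumes, a dedicated frequent-itemset implementation would likely scale better than this pairwise loop.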

Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.