License: https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
This dataset covers male and female height around the world and the impact of protein on height. It also covers the year-on-year rate of change in male and female height. It is a good dataset for data analysis or exploratory data analysis.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data in this dataset were collected through a 2021 survey of Latvian society aimed at identifying high-value data sets for Latvia, i.e. data sets that, in the view of Latvian society, could create value for the Latvian economy and society. The survey was designed for both individuals and businesses. It is being made public both to act as supplementary data for the paper "Towards enrichment of the open government data: a stakeholder-centered determination of High-Value Data sets for Latvia" (author: Anastasija Nikiforova, University of Latvia) and so that other researchers can use these data in their own work.
The survey was distributed among Latvian citizens and organisations. The structure of the survey is available in the supplementary file (see Survey_HighValueDataSets.odt).
Description of the data in this data set: structure of the survey and pre-defined answers (if any)
1. Have you ever used open (government) data? - {(1) yes, once; (2) yes, there has been a little experience; (3) yes, continuously; (4) no, it wasn’t needed for me; (5) no, I tried but failed}
2. How would you assess the value of open government data that are currently available for your personal use or your business? - 5-point Likert scale, where 1 – any to 5 – very high
3. If you ever used the open (government) data, what was the purpose of using them? - {(1) Have not had to use; (2) to identify the situation for an object or an event (e.g. Covid-19 current state); (3) data-driven decision-making; (4) for the enrichment of my data, i.e. by supplementing them; (5) for better understanding of decisions of the government; (6) awareness of governments’ actions (increasing transparency); (7) forecasting (e.g. trends etc.); (8) for developing data-driven solutions that use only the open data; (9) for developing data-driven solutions, using open data as a supplement to existing data; (10) for training and education purposes; (11) for entertainment; (12) other (open-ended question)}
4. What category(ies) of “high value datasets” is, in your opinion, able to create added value for society or the economy? - {(1) Geospatial data; (2) Earth observation and environment; (3) Meteorological; (4) Statistics; (5) Companies and company ownership; (6) Mobility}
5. To what extent do you think the current data catalogue of Latvia’s Open data portal corresponds to the needs of data users/ consumers? - 10-point Likert scale, where 1 – no data are useful and 10 – fully corresponds, i.e. all potentially valuable datasets are available
6. Which of the current data categories in Latvia’s open data portals, in your opinion, most corresponds to the “high value dataset”? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transports; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
7. Which of them form your TOP-3? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transports; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
8. How would you assess the value of the following data categories?
8.1. sensor data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
8.2. real-time data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
8.3. geospatial data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
9. What would be these datasets? I.e. what (sub)topic could these data be associated with? - open-ended question
10. Which of the data sets currently available could be valuable and useful for society and businesses? - open-ended question
11. Which of the data sets currently NOT available in Latvia’s open data portal could, in your opinion, be valuable and useful for society and businesses? - open-ended question
12. How did you define them? - {(1) Subjective opinion; (2) experience with data; (3) filtering out the most popular datasets, i.e. basing them on public opinion; (4) other (open-ended question)}
13. How high could the value of these data sets be for you or your business? - 5-point Likert scale, where 1 – not valuable, 5 – highly valuable
14. Do you represent any company/ organization (are you working anywhere)? (if “yes”, please, fill out the survey twice, i.e. as an individual user AND a company representative) - {yes; no; I am an individual data user; other (open-ended)}
15. What industry/ sector does your company/ organization belong to? (if you do not work at the moment, please, choose the last option) - {Information and communication services; Financial and insurance activities; Accommodation and catering services; Education; Real estate operations; Wholesale and retail trade; repair of motor vehicles and motorcycles; transport and storage; construction; water supply; waste water; waste management and recovery; electricity, gas supply, heating and air conditioning; manufacturing industry; mining and quarrying; agriculture, forestry and fisheries; professional, scientific and technical services; operation of administrative and service services; public administration and defence; compulsory social insurance; health and social care; art, entertainment and recreation; activities of households as employers; CSO/NGO; I am not a representative of any company}
16. To which category does your company/ organization belong in terms of its size? - {small; medium; large; self-employed; I am not a representative of any company}
17. What is the age group that you belong to? (if you are an individual user, not a company representative) - {11..15, 16..20, 21..25, 26..30, 31..35, 36..40, 41..45, 46+, “do not want to reveal”}
18. Please indicate the education or scientific degree that corresponds most to you (if you are an individual user, not a company representative) - {master degree; bachelor’s degree; Dr. and/ or PhD; student (bachelor level); student (master level); doctoral candidate; pupil; do not want to reveal these data}
Format of the file: .xls, .csv (for the first spreadsheet only), .odt
Licenses or restrictions: CC-BY
License: http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains HR information for employees of a multinational corporation (MNC). It includes 2 Million (20 Lakhs) employee records with details about personal identifiers, job-related attributes, performance, employment status, and salary information. The dataset can be used for HR analytics, including workforce distribution, attrition analysis, salary trends, and performance evaluation.
This data is available as a CSV file. We are going to analyse this data set using Pandas. This analysis will be helpful for those working in the HR domain.
Q.1) What is the distribution of Employee Status (Active, Resigned, Retired, Terminated) ?
Q.2) What is the distribution of work modes (On-site, Remote) ?
Q.3) How many employees are there in each department ?
Q.4) What is the average salary by Department ?
Q.5) Which job title has the highest average salary ?
Q.6) What is the average salary in different Departments based on Job Title ?
Q.7) How many employees Resigned & Terminated in each department ?
Q.8) How does salary vary with years of experience ?
Q.9) What is the average performance rating by department ?
Q.10) Which country has the highest concentration of employees ?
Q.11) Is there a correlation between performance rating and salary ?
Q.12) How has the number of hires changed over time (per year) ?
Q.13) Compare salaries of Remote vs. On-site employees — is there a significant difference ?
Q.14) Find the top 10 employees with the highest salary in each department.
Q.15) Identify departments with the highest attrition rate (Resigned %).
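A few of the questions above (Q.11, Q.13, Q.14 and Q.15) can be sketched with Pandas group-by operations. The snippet below is only an illustration: it runs on six made-up rows rather than the real file, and it assumes the column names given in the dataset's column description (Department, Status, Work_Mode, Performance_Rating, Salary_INR). The helper name top_paid is hypothetical.

```python
import pandas as pd

# Six made-up rows for illustration only; they are NOT taken from the real dataset.
df = pd.DataFrame({
    "Employee_ID": ["EMP1", "EMP2", "EMP3", "EMP4", "EMP5", "EMP6"],
    "Department": ["IT", "IT", "IT", "HR", "HR", "HR"],
    "Status": ["Active", "Resigned", "Active", "Resigned", "Resigned", "Active"],
    "Work_Mode": ["Remote", "On-site", "Remote", "On-site", "Remote", "On-site"],
    "Performance_Rating": [4, 3, 5, 2, 3, 4],
    "Salary_INR": [1200000, 900000, 1500000, 600000, 700000, 800000],
})

# Q.15: share of employees per department whose Status is "Resigned", in percent.
attrition_pct = (
    df["Status"].eq("Resigned")
    .groupby(df["Department"])
    .mean()
    .mul(100)
    .sort_values(ascending=False)
)

# Q.14: the n highest-paid employees in each department (hypothetical helper).
def top_paid(frame, n=10):
    ordered = frame.sort_values("Salary_INR", ascending=False)
    return ordered.groupby("Department").head(n)

# Q.11: Pearson correlation between performance rating and salary.
rating_salary_corr = df["Performance_Rating"].corr(df["Salary_INR"])

# Q.13: descriptive comparison of salaries by work mode. Judging whether a
# difference is statistically significant would need a separate test.
salary_by_mode = df.groupby("Work_Mode")["Salary_INR"].agg(["mean", "median", "count"])

print(attrition_pct)
print(top_paid(df, n=1))
print(rating_salary_corr)
print(salary_by_mode)
```

On the real file the same expressions should apply unchanged once the CSV has been loaded into df, provided the column names match.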
1) Unnamed: 0 – Index column (auto-generated, not useful for analysis, will be deleted).
2) Employee_ID – Unique identifier assigned to each employee (e.g., EMP0000001).
3) Full_Name – Full name of the employee.
4) Department – Department in which the employee works (e.g., IT, HR, Marketing, Operations).
5) Job_Title – Designation or role of the employee (e.g., Software Engineer, HR Manager).
6) Hire_Date – The date when the employee was hired by the company.
7) Location – Geographical location of the employee (city, country).
8) Performance_Rating – Performance evaluation score (numeric scale, higher is better).
9) Experience_Years – Number of years of professional experience the employee has.
10) Status – Current employment status (e.g., Active, Resigned).
11) Work_Mode – Mode of working (e.g., On-site, Hybrid, Remote).
12) Salary_INR – Annual salary of the employee in Indian Rupees.
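Based on the column descriptions above, a first loading and cleaning step might look like the following sketch. It reads a tiny made-up CSV from memory instead of the real file (whose name is not given here), drops the auto-generated index column, parses Hire_Date, and derives a Country column. The "city, country" separator and the helper name load_employees are assumptions for illustration.

```python
import io
import pandas as pd

# Made-up sample mimicking the documented columns; values are illustrative only.
SAMPLE_CSV = """,Employee_ID,Full_Name,Department,Job_Title,Hire_Date,Location,Performance_Rating,Experience_Years,Status,Work_Mode,Salary_INR
0,EMP0000001,A One,IT,Software Engineer,2019-03-01,"Pune, India",4,5,Active,Remote,1200000
1,EMP0000002,B Two,HR,HR Manager,2015-07-15,"Austin, USA",3,9,Resigned,On-site,900000
2,EMP0000003,C Three,IT,Software Engineer,2021-11-20,"Pune, India",5,2,Active,On-site,1000000
"""

def load_employees(source):
    """Load the HR data and apply basic cleaning (hypothetical helper)."""
    df = pd.read_csv(source, parse_dates=["Hire_Date"])
    # The unnamed index column is read as "Unnamed: 0" and is not needed.
    df = df.drop(columns=["Unnamed: 0"], errors="ignore")
    # Assumption: Location is formatted as "city, country"; keep the last part.
    df["Country"] = df["Location"].str.rsplit(",", n=1).str[-1].str.strip()
    return df

df = load_employees(io.StringIO(SAMPLE_CSV))

status_counts = df["Status"].value_counts()                              # Q.1
avg_salary_by_dept = df.groupby("Department")["Salary_INR"].mean()       # Q.4
employees_per_country = df["Country"].value_counts()                     # Q.10
hires_per_year = df["Hire_Date"].dt.year.value_counts().sort_index()     # Q.12

print(status_counts, avg_salary_by_dept, employees_per_country, hires_per_year, sep="\n\n")
```

For the real data, a file path would be passed to load_employees in place of the in-memory sample.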
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Goals: Diesel photooxidation under NOx and SOA surrogate
Summary: 1.1 ppmC surrogate + 25 ppb NOx + 150 uL (7-11) diesel + 1 ppmC H2O2
Instruments: AMS, SMPS, GC1
Organization: UCR-CECERT-APL-Cocker Group
Lab Affiliation: University of California, Riverside, CE-CERT
Chamber: EPA chamber
Experiment Category: Photolysis, any phase
Oxidant: Hydroxyl radical
Reactants: NOx, Diesel (7-11), SOA Surrogate, H2O2
Reaction Type: Photooxidation
Relative Humidity: 0.01
Temperature: 28
Pressure: 0.015'' H2O
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:
Context: Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.
Inspiration: The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.
Dataset Information: The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:
Use Cases:
Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.
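To make the idea of market basket analysis concrete, here is a minimal sketch of the two basic measures it relies on: the support of a product pair (the fraction of baskets containing both products) and the confidence of a rule "A implies B" (the fraction of baskets with A that also contain B). The baskets, product names, and function names below are made up for illustration and do not come from this dataset, whose column layout is not listed here.

```python
from collections import Counter
from itertools import combinations

# Made-up baskets for illustration; each set is one transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def pair_support(baskets):
    """Fraction of baskets that contain each unordered product pair."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives every pair one canonical (alphabetical) ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Among baskets containing `antecedent`, the share also containing `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(transactions)
print(support[("bread", "milk")])
print(confidence(transactions, "bread", "milk"))
```

For larger data, dedicated algorithms such as Apriori or FP-Growth avoid enumerating every pair, but the measures they report are the same.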