Description: This dataset is created solely for the purpose of practice and learning. It contains entirely fake and fabricated information, including names, phone numbers, emails, cities, ages, and other attributes. None of the information in this dataset corresponds to real individuals or entities. It serves as a resource for those who are learning data manipulation, analysis, and machine learning techniques. Please note that the data is completely fictional and should not be treated as representing any real-world scenarios or individuals.
Attributes:
- phone_number: Fake phone numbers in various formats.
- name: Fictitious names generated for practice purposes.
- email: Imaginary email addresses created for the dataset.
- city: Made-up city names to simulate geographical diversity.
- age: Randomly generated ages for practice analysis.
- sex: Simulated gender values (Male, Female).
- married_status: Synthetic marital status information.
- job: Fictional job titles for practicing data analysis.
- income: Fake income values for learning data manipulation.
- religion: Pretend religious affiliations for practice.
- nationality: Simulated nationalities for practice purposes.
Please be aware that this dataset is not based on real data and should be used exclusively for educational purposes.
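A dataset with this schema can be generated in a few lines with the Python Faker library. The sketch below is illustrative only: the column names match the attribute list above, but the value pools for sex, married_status, and religion, and the age/income ranges, are assumptions.

import random
import pandas as pd
from faker import Faker

fake = Faker()

def fake_person():
    # One record per call; each key matches an attribute listed above.
    return {
        "phone_number": fake.phone_number(),
        "name": fake.name(),
        "email": fake.email(),
        "city": fake.city(),
        "age": random.randint(18, 80),  # assumed range
        "sex": random.choice(["Male", "Female"]),
        "married_status": random.choice(["Single", "Married", "Divorced"]),  # assumed values
        "job": fake.job(),
        "income": round(random.uniform(20_000, 200_000), 2),  # assumed range
        "religion": random.choice(["Religion A", "Religion B", "None"]),  # placeholder labels
        "nationality": fake.country(),
    }

df = pd.DataFrame([fake_person() for _ in range(1000)])
df.to_csv("fake_people.csv", index=False)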
Creating a robust employee dataset for data analysis and visualization involves several key fields that capture different aspects of an employee's information. Here's a list of fields you might consider including:
- Employee ID: A unique identifier for each employee.
- Name: First name and last name of the employee.
- Gender: Male, female, non-binary, etc.
- Date of Birth: Birthdate of the employee.
- Email Address: Contact email of the employee.
- Phone Number: Contact number of the employee.
- Address: Home or work address of the employee.
- Department: The department the employee belongs to (e.g., HR, Marketing, Engineering).
- Job Title: The specific job title of the employee.
- Manager ID: ID of the employee's manager.
- Hire Date: Date when the employee was hired.
- Salary: Employee's salary or compensation.
- Employment Status: Full-time, part-time, contractor, etc.
- Employee Type: Regular, temporary, contract, etc.
- Education Level: Highest level of education attained by the employee.
- Certifications: Any relevant certifications the employee holds.
- Skills: Specific skills or expertise possessed by the employee.
- Performance Ratings: Ratings or evaluations of employee performance.
- Work Experience: Previous work experience of the employee.
- Benefits Enrollment: Information on benefits chosen by the employee (e.g., healthcare plan, retirement plan).
- Work Location: Physical location where the employee works.
- Work Hours: Regular working hours or shifts of the employee.
- Employee Status: Active, on leave, terminated, etc.
- Emergency Contact: Contact information of the employee's emergency contact person.
- Employee Satisfaction Survey Responses: Data from employee satisfaction surveys, if applicable.
Code Url: https://github.com/intellisenseCodez/faker-data-generator
Description: This dataset contains simulated employee records for a fictional company. The dataset was generated using the Python Faker library to create realistic but fake data. The dataset includes the following fields for each employee:
- Employee ID: A unique identifier for each employee (integer).
- Name: A randomly generated full name (string).
- Job title: A randomly generated job title (string).
- Department: A randomly selected department from a predefined list (HR, Marketing, Sales, IT, or Finance) (string).
- Email: A randomly generated email address (string).
- Phone number: A randomly generated phone number (string).
- Date of hiring: A randomly generated hiring date within the last 10 years (date).
- Salary: A randomly generated salary value between 30,000 and 150,000 (decimal).

Please note that this dataset is for demonstration and testing purposes only. The data is entirely fictional and should not be used for any decision-making or analysis.
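A quick way to sanity-check the documented constraints after loading the file; the filename and snake_case column names below are assumptions.

import pandas as pd

df = pd.read_csv("employees.csv", parse_dates=["date_of_hiring"])  # hypothetical filename/columns

# Verify the constraints stated above actually hold.
assert df["employee_id"].is_unique
assert df["department"].isin(["HR", "Marketing", "Sales", "IT", "Finance"]).all()
assert df["salary"].between(30_000, 150_000).all()
assert df["date_of_hiring"].min() >= pd.Timestamp.now() - pd.DateOffset(years=10)
print(df.describe(include="all"))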
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset, titled "Realistic Email Categorization Dataset for BERT (Synthetic)," contains 20,000 entries of diverse and realistic email addresses generated using a Python script. The dataset is meticulously crafted to mimic real-world email categorization scenarios, making it an excellent resource for training and evaluating machine learning models, particularly transformer-based models like BERT.
Entries follow standard email syntax (e.g., john.doe@example.com), with a local part and a domain separated by the @ symbol.
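For a BERT-style classifier, the address strings can be tokenized directly with the Hugging Face transformers library. A minimal sketch, assuming the emails sit in a plain list of strings (the sample addresses are made up):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

emails = ["john.doe@example.com", "support@shop-online.net"]  # hypothetical samples
encoded = tokenizer(emails, padding=True, truncation=True, return_tensors="pt")  # needs torch
print(encoded["input_ids"].shape)  # (batch_size, sequence_length)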
Database Contents License (DbCL) v1.0 (http://opendatacommons.org/licenses/dbcl/1.0/)
This is a sample dataset of eazydinner listings in Ahmedabad. Each record contains details such as restaurant name, address, URL, service type, phone number, and email.
For more details, check: Crawlmagic
This dataset was collected by an edtech startup that teaches entrepreneurial life skills to kids aged 6-14 through an animated, gamified video series. Through its learning management system, the company tracks the progress made by all subscribers on the platform. The company records platform content usage activity and follows up with parents if their child becomes inactive on the platform. Here's more information about the dataset.
There is some missing data as well. It should be a good dataset for beginners practicing their NLP skills.
By Amresh [source]
This All India Saree Retailers Database is a comprehensive collection of up-to-date information on 10,000 saree retailers located all over India. The database was last updated in April 2021 and offers an overall accuracy rate of around 90%.
For business owners, marketers, data analysts, and researchers, this dataset is an invaluable resource. It contains contact details such as store name, contact person name, phone number, and email address, along with store location information such as city, state, and PIN code, to help you target the right audience precisely.
The database is provided in Microsoft Excel (.xlsx) format, which makes it easy to read or manipulate the file according to your needs. A wide range of payment options (credit/debit card, online transfer, NEFT, cash deposit, Paytm, PhonePe, Google Pay, or PayPal) allows quick download access within 2-3 business hours.
So if you are looking for reliable business intelligence data related to Indian saree retailers that can help you unlock incredible opportunities for your business then make sure to download our All India Saree Retailers Database at the earliest!
This dataset provides a comprehensive list of saree retailers in India, including store name, contact person, email address, mobile number, phone number, and address details such as city, state, and PIN code. It contains 10,000 records updated in April 2021, with an overall accuracy rate of around 90%. This data can be used to understand customer behaviour as well as to analyse geographical customer patterns.
Using this dataset you can:
- Target specific states or cities where potential customers are located for your saree business.
- Get in touch with local saree retailers for possible collaborations and partnerships.
- Learn more about industry trends from actual store owners, who can offer insights into the latest trends and help identify new opportunities to grow your business.
- Analyse existing competitors' market share by studying the cities/states where they operate and their contact information, such as mobile numbers and email IDs.
- Identify potential new customers for better sales conversion rates by understanding who already operates in similar products nearby or shares your target audience, so your company can reach out to them quickly and effectively using direct marketing techniques such as email and SMS.
- Creating targeted email campaigns to increase saree sales: The dataset can be used to create targeted email campaigns reaching the 10,000 saree retailers in India, allowing businesses to increase sales by sending messages about promotions and discounts directly to potential customers.
- Customizing online product recommendations for each retailer: The dataset can be used to identify the specific products each individual retailer is interested in selling, so product recommendations on an e-commerce website could be tailored accordingly. This would optimize the customer experience by giving shoppers more accurate and relevant results when searching for a particular item online.
- Using GPS technology to generate location-based marketing campaigns: By creating geo-fenced areas around each store using the PIN code database, it would be possible to send out marketing messages based on people's physical location, rather than broadcasting across neighbourhoods or cities without regard for store locations. This could reach specific customers with relevant messages about products or promotions that may interest them, more effectively than a campaign with no location targeting.
See the dataset description for more information.
File: 301-Saree-Garment-Retailer-Database-Sample.csv
If you use this dataset in your research, please credit Amresh.
Database Contents License (DbCL) v1.0 (http://opendatacommons.org/licenses/dbcl/1.0/)
The invoice dataset provided is a mock dataset generated using the Python Faker library. It has been designed to mimic the format of data collected from an online store. The dataset contains various fields, including first name, last name, email, product ID, quantity, amount, invoice date, address, city, and stock code. All of the data in the dataset is randomly generated and does not represent actual individuals or products. The dataset can be used for various purposes, including testing algorithms or models related to invoice management, e-commerce, or customer behavior analysis. The data in this dataset can be used to identify trends, patterns, or anomalies in online shopping behavior, which can help businesses to optimize their online sales strategies.
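As a first pass at the trend analysis mentioned above, monthly revenue can be aggregated with pandas. The filename and column names below are assumptions based on the field list:

import pandas as pd

df = pd.read_csv("invoices.csv", parse_dates=["invoice_date"])  # hypothetical filename/columns

# Total invoiced amount per calendar month: a simple first look at trends.
monthly = df.groupby(df["invoice_date"].dt.to_period("M"))["amount"].sum()
print(monthly)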
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This dataset is aggregated from sources that are entirely available in the public domain.
Resumes are usually in PDF format. OCR was used to convert the PDFs to text, and LLMs were used to convert the text into a structured format.
This dataset contains structured information extracted from professional resumes, normalized into multiple related tables. The data includes personal information, educational background, work experience, professional skills, and abilities.
people: Primary table containing core information about each individual.
| Column Name | Data Type | Description | Constraints | Example |
|---|---|---|---|---|
| person_id | INTEGER | Unique identifier for each person | Primary Key, Not Null | 1 |
| name | VARCHAR(255) | Full name of the person | May be Null | "John Doe" |
| email | VARCHAR(255) | Email address | May be Null | "john.doe@email.com" |
| phone | VARCHAR(50) | Contact number | May be Null | "+1-555-0123" |
| linkedin | VARCHAR(255) | LinkedIn profile URL | May be Null | "linkedin.com/in/johndoe" |
abilities: Detailed abilities and competencies listed by individuals.
| Column Name | Data Type | Description | Constraints | Example |
|---|---|---|---|---|
| person_id | INTEGER | Reference to people table | Foreign Key, Not Null | 1 |
| ability | TEXT | Description of ability | Not Null | "Installation and Building Server" |
education: Contains educational history for each person.
| Column Name | Data Type | Description | Constraints | Example |
|---|---|---|---|---|
| person_id | INTEGER | Reference to people table | Foreign Key, Not Null | 1 |
| institution | VARCHAR(255) | Name of educational institution | May be Null | "Lead City University" |
| program | VARCHAR(255) | Degree or program name | May be Null | "Bachelor of Science" |
| start_date | VARCHAR(7) | Start date of education | May be Null | "07/2013" |
| location | VARCHAR(255) | Location of institution | May be Null | "Atlanta, GA" |
experience: Details of work experience entries.
| Column Name | Data Type | Description | Constraints | Example |
|---|---|---|---|---|
| person_id | INTEGER | Reference to people table | Foreign Key, Not Null | 1 |
| title | VARCHAR(255) | Job title | May be Null | "Database Administrator" |
| firm | VARCHAR(255) | Company name | May be Null | "Family Private Care LLC" |
| start_date | VARCHAR(7) | Employment start date | May be Null | "04/2017" |
| end_date | VARCHAR(7) | Employment end date | May be Null | "Present" |
| location | VARCHAR(255) | Job location | May be Null | "Roswell, GA" |
person_skills: Mapping table connecting people to their skills.
| Column Name | Data Type | Description | Constraints | Example |
|---|---|---|---|---|
| person_id | INTEGER | Reference to people table | Foreign Key, Not Null | 1 |
| skill | VARCHAR(255) | Reference to skills table | Foreign Key, Not Null | "SQL Server" |
skills: Master list of unique skills mentioned across all resumes.
| Column Name | Data Type | Description | Constraints | Example |
|---|---|---|---|---|
| skill | VARCHAR(255) | Unique skill name | Primary Key, Not Null | "SQL Server" |
-- Get all skills for a person
SELECT s.skill
FROM person_skills ps
JOIN skills s ON ps.skill = s.skill
WHERE ps.person_id = 1;
-- Get complete work history
SELECT *
FROM experience
WHERE person_id = 1
ORDER BY start_date DESC;
-- Most common skills
SELECT s.skill, COUNT(*) AS frequency
FROM person_skills ps
JOIN skills s ON ps.skill = s.skill
GROUP BY s.skill
ORDER BY frequency DESC;
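These queries can also be run from Python once the tables are loaded into SQLite; a minimal sketch, with the CSV filenames assumed to match the table names documented above:

import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# Hypothetical per-table CSV files named after the documented tables.
for table in ["people", "abilities", "education", "experience", "person_skills", "skills"]:
    pd.read_csv(f"{table}.csv").to_sql(table, conn, index=False)

top_skills = pd.read_sql_query(
    """
    SELECT s.skill, COUNT(*) AS frequency
    FROM person_skills ps
    JOIN skills s ON ps.skill = s.skill
    GROUP BY s.skill
    ORDER BY frequency DESC
    LIMIT 10
    """,
    conn,
)
print(top_skills)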
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Patients Table:
This table stores information about individual patients, including their names and contact details.
Doctors Table:
This table contains details about healthcare providers, including their names, specializations, and contact information.
Appointments Table:
This table records scheduled appointments, linking patients to doctors.
MedicalProcedure Table:
This table stores details about medical procedures associated with specific appointments.
Billing Table:
This table maintains records of billing transactions, associating them with specific patients.
demo Table:
This table appears to be a demonstration or testing table, possibly unrelated to the healthcare management system.
This dataset schema is designed to capture comprehensive information about patients, doctors, appointments, medical procedures, and billing transactions in a healthcare management system. Adjustments can be made based on specific requirements, and additional attributes can be included as needed.
CSV version of Looker Ecommerce Dataset.
Overview: TheLook is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events, and digital marketing campaigns. The contents of this dataset are synthetic and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1 TB/mo of free tier processing: each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
distribution_centers.csv
- id: Unique identifier for each distribution center.
- name: Name of the distribution center.
- latitude: Latitude coordinate of the distribution center.
- longitude: Longitude coordinate of the distribution center.

events.csv
- id: Unique identifier for each event.
- user_id: Identifier for the user associated with the event.
- sequence_number: Sequence number of the event.
- session_id: Identifier for the session during which the event occurred.
- created_at: Timestamp indicating when the event took place.
- ip_address: IP address from which the event originated.
- city: City where the event occurred.
- state: State where the event occurred.
- postal_code: Postal code of the event location.
- browser: Web browser used during the event.
- traffic_source: Source of the traffic leading to the event.
- uri: Uniform Resource Identifier associated with the event.
- event_type: Type of event recorded.

inventory_items.csv
- id: Unique identifier for each inventory item.
- product_id: Identifier for the associated product.
- created_at: Timestamp indicating when the inventory item was created.
- sold_at: Timestamp indicating when the item was sold.
- cost: Cost of the inventory item.
- product_category: Category of the associated product.
- product_name: Name of the associated product.
- product_brand: Brand of the associated product.
- product_retail_price: Retail price of the associated product.
- product_department: Department to which the product belongs.
- product_sku: Stock Keeping Unit (SKU) of the product.
- product_distribution_center_id: Identifier for the distribution center associated with the product.

order_items.csv
- id: Unique identifier for each order item.
- order_id: Identifier for the associated order.
- user_id: Identifier for the user who placed the order.
- product_id: Identifier for the associated product.
- inventory_item_id: Identifier for the associated inventory item.
- status: Status of the order item.
- created_at: Timestamp indicating when the order item was created.
- shipped_at: Timestamp indicating when the order item was shipped.
- delivered_at: Timestamp indicating when the order item was delivered.
- returned_at: Timestamp indicating when the order item was returned.

orders.csv
- order_id: Unique identifier for each order.
- user_id: Identifier for the user who placed the order.
- status: Status of the order.
- gender: Gender information of the user.
- created_at: Timestamp indicating when the order was created.
- returned_at: Timestamp indicating when the order was returned.
- shipped_at: Timestamp indicating when the order was shipped.
- delivered_at: Timestamp indicating when the order was delivered.
- num_of_item: Number of items in the order.

products.csv
- id: Unique identifier for each product.
- cost: Cost of the product.
- category: Category to which the product belongs.
- name: Name of the product.
- brand: Brand of the product.
- retail_price: Retail price of the product.
- department: Department to which the product belongs.
- sku: Stock Keeping Unit (SKU) of the product.
- distribution_center_id: Identifier for the distribution center associated with the product.

users.csv
- id: Unique identifier for each user.
- first_name: First name of the user.
- last_name: Last name of the user.
- email: Email address of the user.
- age: Age of the user.
- gender: Gender of the user.
- state: State where t...
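A typical first query joins order_items to products for revenue by category; a minimal pandas sketch over the files above (retail_price stands in for the line-item price, which is not among the listed fields):

import pandas as pd

order_items = pd.read_csv("order_items.csv")
products = pd.read_csv("products.csv")

# Attach product attributes to each line item, then total by category.
joined = order_items.merge(
    products, left_on="product_id", right_on="id", suffixes=("", "_product")
)
revenue_by_category = joined.groupby("category")["retail_price"].sum()
print(revenue_by_category.sort_values(ascending=False))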
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset contains a curated collection of quotes from various renowned individuals. Each entry in the dataset includes the text of the quote and the name of the author. It is designed for use in text analysis, natural language processing (NLP), and sentiment analysis tasks.
- quote: The text of the quote.
- author: The name of the author of the quote.

| Quote | Author |
|---|---|
| “Whatever you do, you need courage. ...” | Ralph Waldo Emerson |
| “To be yourself in a world that is constantly trying to make you something else is the greatest accomplishment.” | Ralph Waldo Emerson |
The quotes have been collected from various sources including books, websites, and public domain materials. The data has been verified for accuracy to the best extent possible.
This dataset is suitable for: - Sentiment Analysis - Text Classification - NLP Models - Data Visualization
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Dataset ID: cloud-training-demos.SAP_REPLICATED_DATA
Overview:
The SAP_REPLICATED_DATA dataset in BigQuery provides a comprehensive replication of SAP (Systems, Applications, and Products in Data Processing) business data. This dataset is designed to support data analytics and machine learning tasks by offering a rich set of structured data that mimics real-world enterprise scenarios. It includes data from various SAP modules and processes, enabling users to perform in-depth analysis, build predictive models, and explore business insights.
Content: - Tables and Schemas: The dataset consists of multiple tables representing different aspects of SAP business operations, including but not limited to sales, inventory, finance, and procurement data. - Data Types: It contains structured data with fields such as transaction IDs, timestamps, customer details, product information, sales figures, and financial metrics. - Data Volume: The dataset is designed to simulate large-scale enterprise data, making it suitable for performance testing, data processing, and analysis.
Usage: - Business Analytics: Users can analyze business trends, sales performance, and financial metrics. - Machine Learning: Ideal for developing and testing machine learning models related to business forecasting, anomaly detection, and customer segmentation. - Data Processing: Suitable for practicing SQL queries, data transformation, and integration tasks.
Example Use Cases: - Sales Analysis: Track and analyze sales performance across different regions and time periods. - Inventory Management: Monitor inventory levels and identify trends in stock movements. - Financial Reporting: Generate financial reports and analyze expense patterns.
For more information and to access the dataset, visit the BigQuery public datasets page or refer to the dataset documentation in the BigQuery console.
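Queries can be issued from Python with the google-cloud-bigquery client. A minimal sketch; bkpf appears in the file list below, and the column name gjahr (fiscal year) follows standard SAP naming but is an assumption for this replicated dataset:

from google.cloud import bigquery

client = bigquery.Client()  # requires authenticated Google Cloud credentials

sql = """
    SELECT gjahr AS fiscal_year, COUNT(*) AS documents
    FROM `cloud-training-demos.SAP_REPLICATED_DATA.bkpf`
    GROUP BY gjahr
    ORDER BY gjahr
"""
df = client.query(sql).to_dataframe()  # needs pandas and db-dtypes installed
print(df)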
The dataset includes the following files:
| File Name | Description |
|---|---|
| adr6.csv | Addresses with organizational units. Contains address details related to organizational units like departments or branches. |
| adrc.csv | General Address Data. Provides information about addresses, including details such as street, city, and postal codes. |
| adrct.csv | Address Contact Information. Contains contact information linked to addresses, including phone numbers and email addresses. |
| adrt.csv | Address Details. Includes detailed address data such as street addresses, city, and country codes. |
| ankt.csv | Accounting Document Segment. Provides details on segments within accounting documents, including account numbers and amounts. |
| anla.csv | Asset Master Data. Contains information about fixed assets, including asset identification and classification. |
| bkpf.csv | Accounting Document Header. Contains headers of accounting documents, such as document numbers and fiscal year. |
| bseg.csv | Accounting Document Segment. Details line items within accounting documents, including account details and amounts. |
| but000.csv | Business Partners. Contains basic information about business partners, including IDs and names. |
| but020.csv | Business Partner Addresses. Provides address details associated with business partners. |
| cepc.csv | Customer Master Data - Central. Contains centralized data for customer master records. |
| cepct.csv | Customer Master Data - Contact. Provides contact details associated with customer records. |
| csks.csv | Cost Center Master Data. Contains data about cost centers within the organization. |
| cskt.csv | Cost Center Texts. Provides text descriptions and labels for cost centers. |
| dd03l.csv | Data Element Field Labels. Contains labels and descriptions for data fields in the SAP system. |
| ekbe.csv | Purchase Order History. Details history of purchase orders, including quantities and values. |
| ekes.csv | Purchasing Document History. Contains history of purchasing documents including changes and statuses. |
| eket.csv | Purchase Order Item History. Details changes and statuses for individual purchase order items. |
| ekkn.csv | Purchase Order Account Assignment. Provides account assignment details for purchas... |
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset is designed for cutting-edge NLP research in resume parsing, job classification, and ATS system development. Extensive details and several diagrams are included below.
| Field | Description | Example/Data Type |
|---|---|---|
| ResumeID | Unique, anonymized string | "DIS4JE91Z..." (string) |
| Category | Tech job category/label | "DevOps Engineer" |
| Name | Anonymized (Faker-generated) name | "Jordan Patel" |
| Email | Anonymized email address | "jpatel@example.com" |
| Phone | Anonymized phone number | "+1-555-343-2123" |
| Location | City, country or region (anonymized) | "Austin, TX, USA" |
| Summary | Professional summary/intro | String (3-6 sentences) |
| Skills | List or comma-separated tech/soft skills | "Python, Kubernetes..." |
| Experience | Work chronology, organizations, bullet-point details | String (multiline) |
| Education | Universities, degrees, certs | String (multiline) |
| Source | "real", "template", "llm", "faker" | String |
![Dataset Schema Overview with Field Descriptions and Data Types](https://ppl-ai-code-interpreter-files.s3.amazonaws.com/web/direct-files/626086319755b5c5810ff838ca0c0c3b/a5b5a057-7265-4428-9827-0a4c92f88d19/0e26c38c.png)
Dates use the MMM-YYYY format.

Composition by Data Source:
![Composition of Tech Resume Dataset by Data Source](https://ppl-ai-code-interpreter-files.s3.amazonaws.com/web/direct-files/626086319755b5c5810ff838ca0c0c3b/a5aafe90-c5b6-4d07-ad9c-cf5244266561/5723c094.png)
Role Cluster Diversity:
![Distribution of Major Tech Role Clusters in the 3,500 Resumes Dataset](https://ppl-ai-code-interpreter-files.s3.amazonaws.com/web/direct-files/626086319755b5c5810ff838ca0c0c3b/8c6ba5d6-f676-4213-b4f7-16a133081e00/e9cc61b6.png)
Alternative: Dataset by Source Type (Pie Chart):
![Resume Dataset Composition by Source Type](https://ppl-ai-code-interpreter-files.s3.amazonaws.com/web/direct-files/626086319755b5c5810ff838ca0c0c3b/2325f133-7fe5-4294-9a9d-4db19be3584f/b85a47bd.png)
Each line in tech_resumes_dataset.jsonl is a single, fully structured resume object:
import json
with open('tech_resumes_dataset.jsonl', 'r', encoding='utf-8') as f:
resumes = [json.loads(line) for line in f]
# Each record is now a Python dictionary
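From there the records flatten cleanly into a table for analysis; for example (using the field names from the schema table above):

import pandas as pd

df = pd.json_normalize(resumes)       # one row per resume
print(df["Category"].value_counts())  # distribution of job categories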
If you use this dataset, credit it as “[your Kaggle dataset URL]” and mention original sources (ResumeAtlas, Resume_Classification, Kaggle Resume Dataset, and synthetic methodology as described).
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Overview: This dataset contains synthetic customer data for a CRM system in Parquet format. It includes customer demographic information, transaction details, and behavioral attributes.
Data Fields: customer_id: Unique identifier for each customer (UUID).
name: Full name of the customer.
email: Email address of the customer.
join_date: The date when the customer joined the platform.
total_spent: Total money spent by the customer.
purchase_count: Number of purchases made by the customer.
last_purchase: Date of the last purchase made by the customer.
File Format: Parquet: The dataset is stored in Parquet format. It provides better performance and compression compared to CSV.
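Reading the file from Python requires a Parquet engine such as pyarrow or fastparquet; a minimal sketch, with the filename assumed:

import pandas as pd

df = pd.read_parquet("crm_customers.parquet")  # hypothetical filename

# Quick look at the numeric behavioral attributes.
print(df[["total_spent", "purchase_count"]].describe())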
Use Cases: Customer segmentation
Transaction analysis
Predictive modeling
Notes: This dataset was generated synthetically and does not represent real customers.
The data was generated using the Faker library and random values.
Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
The CSV data was sourced from the existing Kaggle dataset titled "Adventure Works 2022" by Algorismus. This data was normalized and consisted of seven individual CSV files. The Sales table served as a fact table that connected to other dimensions. To consolidate all the data into a single table, it was loaded into a SQLite database and transformed accordingly. The final denormalized table was then exported as a single CSV file (delimited by | ), and the column names were updated to follow snake_case style.
doi.org/10.6084/m9.figshare.27899706
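The consolidation step described above can be reproduced with pandas and SQLite. The sketch below is illustrative: the source CSV names and join keys are assumptions, not the exact ones used for the published file (the keys are taken from the column list below).

import sqlite3
import pandas as pd

conn = sqlite3.connect("adventure_works.db")

# Load the normalized CSVs into SQLite (file names are hypothetical).
for name in ["sales", "products", "resellers", "employees", "territories"]:
    pd.read_csv(f"{name}.csv").to_sql(name, conn, index=False, if_exists="replace")

# Denormalize: join the Sales fact table to its dimensions.
flat = pd.read_sql_query(
    """
    SELECT s.*, p.product_name, r.reseller_name, t.sales_territory_region
    FROM sales s
    LEFT JOIN products p ON s.product_key = p.product_key
    LEFT JOIN resellers r ON s.reseller_key = r.reseller_key
    LEFT JOIN territories t ON s.sales_territory_key = t.sales_territory_key
    """,
    conn,
)

# Export as a single pipe-delimited CSV, matching the published format.
flat.to_csv("adventure_works_denormalized.csv", sep="|", index=False)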
| Column Name | Description |
|---|---|
| sales_order_number | Unique identifier for each sales order. |
| sales_order_date | The date and time when the sales order was placed. (e.g., Friday, August 25, 2017) |
| sales_order_date_day_of_week | The day of the week when the sales order was placed (e.g., Monday, Tuesday). |
| sales_order_date_month | The month when the sales order was placed (e.g., January, February). |
| sales_order_date_day | The day of the month when the sales order was placed (1-31). |
| sales_order_date_year | The year when the sales order was placed (e.g., 2022). |
| quantity | The number of units sold in the sales order. |
| unit_price | The price per unit of the product sold. |
| total_sales | The total sales amount for the sales order (quantity * unit price). |
| cost | The total cost associated with the products sold in the sales order. |
| product_key | Unique identifier for the product sold. |
| product_name | The name of the product sold. |
| reseller_key | Unique identifier for the reseller. |
| reseller_name | The name of the reseller. |
| reseller_business_type | The type of business of the reseller (e.g., Warehouse, Value Reseller, Specialty Bike Shop). |
| reseller_city | The city where the reseller is located. |
| reseller_state | The state where the reseller is located. |
| reseller_country | The country where the reseller is located. |
| employee_key | Unique identifier for the employee associated with the sales order. |
| employee_id | The ID of the employee who processed the sales order. |
| salesperson_fullname | The full name of the salesperson associated with the sales order. |
| salesperson_title | The title of the salesperson (e.g., North American Sales Manager, Sales Representative). |
| email_address | The email address of the salesperson. |
| sales_territory_key | Unique identifier for the sales territory for the actual sale. (e.g. 3) |
| assigned_sales_territory | List of sales_territory_key separated by comma assigned to the salesperson. (e.g., 3,4) |
| sales_territory_region | The region of the sales territory. US territory broken down in regions. International regions listed as country name (e.g., Northeast, France). |
| sales_territory_country | The country associated with the sales territory. |
| sales_territory_group | The group classification of the sales territory. (e.g., Europe, North America, Pacific) |
| target | The ... |
In situations where data is not readily available but needed, you'll have to build up the data yourself. There are many methods you can use to acquire this data, from web scraping to APIs. But sometimes you'll end up needing to create fake or “dummy” data. Dummy data can be useful when you know the exact features you'll be using and the data types involved, but you just don't have the data itself.
Features Description
Reference - https://towardsdatascience.com/build-a-your-own-custom-dataset-using-python-9296540a0178
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
PROJECT OBJECTIVE
We are part of XYZ Co Pvt Ltd, a company in the business of organizing sports events at the international level. Countries nominate sportsmen from different departments, and our team has been given the responsibility of systematizing the membership roster and generating different reports as per business requirements.
Questions (KPIs)
TASK 1: STANDARDIZING THE DATASET
TASK 2: DATA FORMATTING
TASK 3: SUMMARIZE DATA - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a PIVOT table in the worksheet ANALYSIS, starting at cell B3, with the following details:
TASK 4: SUMMARIZE DATA - EXCEL FUNCTIONS (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a SUMMARY table in the worksheet ANALYSIS, starting at cell G4, with the following details:
TASK 5: GENERATE REPORT - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a PIVOT table report in the worksheet REPORT, starting at cell A3, with the following information:
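The pivot-table tasks above can also be prototyped outside Excel with pandas; a sketch, with the workbook filename and column names assumed since the worksheet layout isn't reproduced here:

import pandas as pd

# Hypothetical file and columns; the SPORTSMEN worksheet defines the real ones.
df = pd.read_excel("membership_roster.xlsx", sheet_name="SPORTSMEN")

pivot = pd.pivot_table(
    df,
    index="country",       # assumed column
    columns="department",  # assumed column
    values="member_id",    # assumed column
    aggfunc="count",
    fill_value=0,
)
print(pivot)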
Process
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
The Hyderabad City House Prices dataset is a detailed collection of real estate data for residential properties across various localities in Hyderabad. This dataset is aimed at real estate analysts, data scientists, urban planners, and researchers who are interested in studying the housing market, price trends, and neighborhood dynamics within Hyderabad, one of India's rapidly growing metropolitan cities.
The dataset includes the following features:
This dataset can be utilized for various purposes, including: - Market Analysis: Understanding pricing trends, supply and demand, and market conditions in different localities of Hyderabad. - Price Prediction Models: Developing machine learning models to predict property prices based on the given features. - Investment Analysis: Identifying potential investment opportunities by analyzing location, property type, and price data. - Urban Planning: Assisting urban planners in understanding housing distribution and development trends across the city.
The data has been scraped from popular real estate websites such as Magicbricks, 99acres, and Housing.com using the Scrapy framework. The data was collected in [insert month/year] and represents a snapshot of the real estate market in Hyderabad at that time.
| Title | Location | Price (₹ lakh) | Rate per Sqft (₹) | Area (sqft) | Building Status |
|---|---|---|---|---|---|
| Luxurious 3 BHK Apartment | Jubilee Hills | 300 | 15,000 | 2000 | Ready to Move |
| Spacious 4 BHK Villa | Gachibowli | 450 | 10,000 | 4500 | Under Construction |
| Affordable 2 BHK Flat | Madhapur | 80 | 8,000 | 1000 | Ready to Move |
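The numeric columns are internally consistent: price in lakh × 100,000 ≈ rate per sqft × area (e.g., 15,000 × 2,000 sqft = ₹30,000,000 = 300 lakh for the first row). A quick pandas check, with the filename and column names assumed:

import pandas as pd

df = pd.read_csv("hyderabad_house_prices.csv")  # hypothetical filename/columns

# Price is quoted in lakh (1 lakh = 100,000 rupees).
implied = df["rate_per_sqft"] * df["area_sqft"]
quoted = df["price_lakh"] * 100_000
print((implied - quoted).abs().describe())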
For more information or to access the dataset, please contact [Your Name] at [Your Email Address].
This dataset provides valuable insights into Hyderabad's diverse real estate market, helping stakeholders make informed decisions based on accurate and up-to-date data.