100+ datasets found
  1. Pandas Practice Dataset

    • kaggle.com
    zip
    Updated Jan 27, 2023
    Cite
    Mrityunjay Pathak (2023). Pandas Practice Dataset [Dataset]. https://www.kaggle.com/datasets/themrityunjaypathak/pandas-practice-dataset/discussion
    Explore at:
    zip (493 bytes)
    Dataset updated
    Jan 27, 2023
    Authors
    Mrityunjay Pathak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    What is Pandas?

    Pandas is a Python library used for working with data sets.

    It has functions for analyzing, cleaning, exploring, and manipulating data.

    The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.

    Why Use Pandas?

    Pandas allows us to analyze large data sets and draw conclusions based on statistical theory.

    Pandas can clean messy data sets and make them readable and relevant.

    Relevant data is very important in data science.

    What Can Pandas Do?

    Pandas gives you answers about the data, such as:

    Is there a correlation between two or more columns?

    What is the average value?

    What is the maximum value?

    What is the minimum value?
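
    Questions like these map onto short pandas calls. The following is a minimal sketch on a small made-up table; the column names hours_studied and exam_score are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 71, 80],
})

# Pearson correlation between two columns.
correlation = df["hours_studied"].corr(df["exam_score"])

# Average, maximum, and minimum of a single column.
average_score = df["exam_score"].mean()
max_score = df["exam_score"].max()
min_score = df["exam_score"].min()

print(f"correlation={correlation:.3f}")
print(f"mean={average_score}, max={max_score}, min={min_score}")
```

    For every numeric column at once, df.corr() returns the full correlation matrix and df.describe() summarizes count, mean, min, max, and quartiles.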

  2. Learn Pandas

    • kaggle.com
    zip
    Updated Oct 5, 2023
    Cite
    Vaidik Patel (2023). Learn Pandas [Dataset]. https://www.kaggle.com/datasets/js1js2js3js4js5/learn-pandas
    Explore at:
    zip (1209861 bytes)
    Dataset updated
    Oct 5, 2023
    Authors
    Vaidik Patel
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset takes a notebook-style approach to learning. Download the whole package and you will find everything you need to learn pandas from the basics to advanced topics, which is exactly what you will need in machine learning and data science. 😄

    It gives you an overview of the data analysis tools in pandas that are most often required for data manipulation and for extracting important data.

    Use this notebook as your notes for pandas. Whenever you forget the code or syntax, open it, scroll through it, and you will find the solution. 🥳

  3. Panda Image Dataset

    • kaggle.com
    zip
    Updated Jan 13, 2022
    Cite
    Ho Loong (2022). Panda Image Dataset [Dataset]. https://www.kaggle.com/datasets/holoong9291/pandaimagedataset
    Explore at:
    zip (23965708 bytes)
    Dataset updated
    Jan 13, 2022
    Authors
    Ho Loong
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Data Source

    Chengdu Research Base of Giant Panda Breeding

    PNG File Name

    • 0_: newborn panda body
    • 1_: superstar
    • 2_: overseas
    • 3_: growth diary
    • 4_: sleepy
    • 5_: mother and child
    • 6_: cute
    • 7_: play
  4. Investigating the dataset using pandas

    • kaggle.com
    zip
    Updated May 14, 2024
    Cite
    Nimra hafeez (2024). Investigating the dataset using pandas [Dataset]. https://www.kaggle.com/datasets/nimrahafeez942/investigating-the-dataset-using-pandas/data
    Explore at:
    zip (4595 bytes)
    Dataset updated
    May 14, 2024
    Authors
    Nimra hafeez
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Nimra hafeez

    Released under Database: Open Database, Contents: Database Contents

    Contents

  5. Pandas Practice Files

    • kaggle.com
    zip
    Updated Aug 19, 2024
    Cite
    Sahil (2024). Pandas Practice Files [Dataset]. https://www.kaggle.com/datasets/sahil23009/pandas-practice-files/data
    Explore at:
    zip (30039742 bytes)
    Dataset updated
    Aug 19, 2024
    Authors
    Sahil
    Description

    Dataset

    This dataset was created by Sahil

    Contents

  6. Dataset for pandas data-frame 1.1

    • kaggle.com
    zip
    Updated Jun 16, 2024
    Cite
    _anxious (2024). Dataset for pandas data-frame 1.1 [Dataset]. https://www.kaggle.com/datasets/par7h0/dataset-for-pandas-data-frame-1-1/code
    Explore at:
    zip (763342 bytes)
    Dataset updated
    Jun 16, 2024
    Authors
    _anxious
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by _anxious

    Released under CC0: Public Domain

    Contents

  7. Convert Text to Pandas

    • kaggle.com
    zip
    Updated Sep 22, 2024
    Cite
    Zeyad Usf (2024). Convert Text to Pandas [Dataset]. https://www.kaggle.com/datasets/zeyadusf/convert-text-to-pandas
    Explore at:
    zip (4333134 bytes)
    Dataset updated
    Sep 22, 2024
    Authors
    Zeyad Usf
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    kaggle notebook
    Github Repo

    I found two datasets on Hugging Face about converting text with context to pandas code, but the challenge is in the context. The context is structured differently in the two datasets, which degrades the model's results. First, I describe the data I found, then show examples, my solution, and some other problems.

    • Rahima411/text-to-pandas:

      • The data is divided into Train with 57.5k and Test with 19.2k.

      • The data has two columns as you can see in the example:

        • "Input": Contains the context and the question together; the context shows metadata about the data frame.
        • "Pandas Query": Pandas code.
          Input                                                    | Pandas Query
    -----------------------------------------------------------|-------------------------------------------
    Table Name: head (age (object), head_id (object))          | result = management['head.age'].unique()
    Table Name: management (head_id (object),                  |
    temporary_acting (object))                                 |
    What are the distinct ages of the heads who are acting?    |
    • hiltch/pandas-create-context:

      • It contains 17k rows with three columns:
        • question: Text.
        • context: Code to create a data frame with column names, unlike the first dataset, which contains the name of the data frame, the column names, and the data types.
        • answer: Pandas code.
          question           |            context             |       answer 
    ----------------------------------------|--------------------------------------------------------|---------------------------------------
    What was the lowest # of total votes?  | df = pd.DataFrame(columns=['_number_of_total_votes']) | df['_number_of_total_votes'].min()   
    

    As you can see, the problem with this data is that the inputs are not similar and the structure of the context differs. My solution to this problem was:

    • Convert the context of the first dataset to match the second. I chose this direction because it is difficult to recover the column data types for the second dataset. Converting the context from the form Table Name: head (age (object), head_id (object)) to the form head = pd.DataFrame(columns=['age','head_id']) was easy with the code below.
    • Then separate the question from the context. This was easy because, in this data, the context always ends with ")" followed by a blank and then the question.
    • A context can contain more than one table definition, and so yield more than one line of code; the code below handles this.

    ```py
    import re


    def extract_table_creation(text: str) -> tuple[str, str]:
        """
        Extracts DataFrame creation statements and questions from the given text.

        Args:
            text (str): The input text containing table definitions and questions.

        Returns:
            tuple: A tuple containing a concatenated DataFrame creation string and a question.
        """
        # Define patterns
        table_pattern = r'Table Name: (\w+) \(([\w\s,()]+)\)'
        column_pattern = r'(\w+)\s*\((object|int64|float64)\)'

        # Find all table names and column definitions
        matches = re.findall(table_pattern, text)

        # Initialize a list to hold DataFrame creation statements
        df_creations = []

        for table_name, columns_str in matches:
            # Extract column names
            columns = re.findall(column_pattern, columns_str)
            column_names = [col[0] for col in columns]

            # Format DataFrame creation statement
            df_creation = f"{table_name} = pd.DataFrame(columns={column_names})"
            df_creations.append(df_creation)

        # Concatenate all DataFrame creation statements, one per line
        df_creation_concat = '\n'.join(df_creations)

        # Extract and clean the question (everything after the last ")")
        question = text[text.rindex(')') + 1:].strip()

        return df_creation_concat, question
    ```
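
    As a quick check of the approach, the sketch below applies the same two regular expressions to the sample row from the Rahima411/text-to-pandas example. It is a condensed, standalone restatement rather than the author's code: the variable names are illustrative, and the column pattern uses a non-capturing group so that re.findall returns the column names directly.

```python
import re

# Sample input row: two table definitions followed by the question.
sample = (
    "Table Name: head (age (object), head_id (object)) "
    "Table Name: management (head_id (object), temporary_acting (object)) "
    "What are the distinct ages of the heads who are acting?"
)

table_pattern = r'Table Name: (\w+) \(([\w\s,()]+)\)'
column_pattern = r'(\w+)\s*\((?:object|int64|float64)\)'

# Build one DataFrame-creation statement per table definition.
contexts = []
for table_name, columns_str in re.findall(table_pattern, sample):
    column_names = re.findall(column_pattern, columns_str)
    contexts.append(f"{table_name} = pd.DataFrame(columns={column_names})")

# The question is whatever follows the last closing parenthesis.
question = sample[sample.rindex(")") + 1:].strip()

print("\n".join(contexts))
print(question)
```

    Note that splitting on the last ")" assumes the question itself contains no parentheses; a question such as "What is the average (mean) age?" would be cut short.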
    
    After both datasets were similar in structure, they were merged into one set and divided into 72.8K train and 18.6K test. We analyzed this dataset and you can see it all in the notebook (https://www.kaggle.com/code/zeyadusf/text-2-pandas-t5#Exploratory-Data-Analysis(EDA)), but we found some problems in the dataset as well, such as:

    • Answer: df['Id'].count() is repeated, but this is plausible, so we do not need to drop these rows.
    • Context: 147 rows contain no text. We will see through the experiment whether this affects the results negatively or positively.
    • Question: It is ...
    
  8. Panda Data

    • kaggle.com
    zip
    Updated Jul 17, 2023
    Cite
    NG NM WT (2023). Panda Data [Dataset]. https://www.kaggle.com/datasets/ngnmwt/panda-data
    Explore at:
    zip (10440111 bytes)
    Dataset updated
    Jul 17, 2023
    Authors
    NG NM WT
    Description

    Dataset

    This dataset was created by NG NM WT

    Contents

    Plotting

  9. Pandas data fram implementation

    • kaggle.com
    zip
    Updated Nov 25, 2024
    Cite
    jedike (2024). Pandas data fram implementation [Dataset]. https://www.kaggle.com/datasets/jedike/pandas-data-fram-implementation
    Explore at:
    zip (36011 bytes)
    Dataset updated
    Nov 25, 2024
    Authors
    jedike
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by jedike

    Released under MIT

    Contents

  10. EDA with Pandas

    • kaggle.com
    zip
    Updated Feb 15, 2023
    Cite
    Amir Raja (2023). EDA with Pandas [Dataset]. https://www.kaggle.com/datasets/amirraja/eda-with-pandas
    Explore at:
    zip (231014 bytes)
    Dataset updated
    Feb 15, 2023
    Authors
    Amir Raja
    Description

    Dataset

    This dataset was created by Amir Raja

    Contents

  11. Dataset to practice Pandas Series

    • kaggle.com
    zip
    Updated Oct 5, 2023
    Cite
    Mayuri Awati (2023). Dataset to practice Pandas Series [Dataset]. https://www.kaggle.com/datasets/mayuriawati/dataset-to-practice-pandas-series/data
    Explore at:
    zip (20527 bytes)
    Dataset updated
    Oct 5, 2023
    Authors
    Mayuri Awati
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset consists of three one-dimensional datasets, each containing a series of values. One-dimensional data is characterized by its simplicity, making it an ideal starting point for those new to data analysis and manipulation. With the power of Pandas Series, you can perform a wide range of operations and functions to gain insights and derive valuable information from these datasets.
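
    To give a sense of the operations a Pandas Series supports, here is a minimal sketch on made-up values. The series name, labels, and numbers are hypothetical and do not come from the dataset.

```python
import pandas as pd

# Hypothetical one-dimensional data with a labeled index.
temperatures = pd.Series(
    [21.5, 23.0, 19.5, 24.0],
    index=["Mon", "Tue", "Wed", "Thu"],
    name="temperature_c",
)

# Aggregate the whole series.
mean_temp = temperatures.mean()

# Boolean filtering keeps only the matching labels.
warm_days = temperatures[temperatures > 22]

# Arithmetic is applied element by element.
in_fahrenheit = temperatures * 9 / 5 + 32

# Label of the largest value.
hottest_day = temperatures.idxmax()

print(mean_temp, list(warm_days.index), hottest_day)
```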

  12. Pokkemon Dataset_csv

    • kaggle.com
    zip
    Updated Aug 18, 2021
    Cite
    BHARGAV NATH (2021). Pokkemon Dataset_csv [Dataset]. https://www.kaggle.com/datasets/bhargavnath/new-dataset-fr-pandas
    Explore at:
    zip (13955 bytes)
    Dataset updated
    Aug 18, 2021
    Authors
    BHARGAV NATH
    Description

    Dataset

    This dataset was created by BHARGAV NATH

    Contents

  13. Retail Data Customer Summary (Learn Pandas Basics)

    • kaggle.com
    zip
    Updated Jul 26, 2020
    Cite
    Kunaal Naik (2020). Retail Data Customer Summary (Learn Pandas Basics) [Dataset]. https://www.kaggle.com/funxexcel/retail-data-customer-summary-learn-pandas-basics
    Explore at:
    zip (162094 bytes)
    Dataset updated
    Jul 26, 2020
    Authors
    Kunaal Naik
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    I have taught many students to use Pandas. Often, many lacked context to apply their newly acquired skills. This dataset will help new learners work on their Pandas skills.

    Content

    This dataset contains 13 columns and 6889 rows. The data is at a unique customer level. Each customer's transaction amount and number of transactions are stored in separate columns (that is, unpivoted). The data also contains each customer's first and last transaction dates.

    Acknowledgements

    To be added.

    Inspiration

    I was inspired to create contextual questions that will help students learn Pandas faster.

  14. Foodpanda Analysis Dataset 2025

    • kaggle.com
    Updated Sep 17, 2025
    Cite
    nabiha zahid (2025). Foodpanda Analysis Dataset 2025 [Dataset]. https://www.kaggle.com/datasets/nabihazahid/foodpanda-analysis-dataset-2025
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 17, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    nabiha zahid
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    A clean and comprehensive Foodpanda dataset with 6000 records, featuring customer demographics, orders, payments, ratings, and delivery details. It is well suited to analyzing customer behavior, sales trends, and churn.

  15. Practice Panda and Dictionary

    • kaggle.com
    zip
    Updated Sep 4, 2020
    Cite
    Brenda N (2020). Practice Panda and Dictionary [Dataset]. https://www.kaggle.com/brendan45774/dictionary-and-pandas-csv
    Explore at:
    zip (316 bytes)
    Dataset updated
    Sep 4, 2020
    Authors
    Brenda N
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    If you need a practice dataset to improve your skills, use this dataset.

    I created this dataset for my notebook, Getting started with Dictionary and Pandas, to help people improve their dictionary and pandas skills: https://www.kaggle.com/brendan45774/getting-started-with-dictionary-and-pandas
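
    The link between dictionaries and pandas can be sketched in a few lines: a dictionary of equal-length lists becomes a DataFrame whose columns are the dictionary keys. The data below is made up for illustration and is not the content of this dataset.

```python
import pandas as pd

# Hypothetical records; each key becomes a column name.
pets = {
    "name": ["Milo", "Luna", "Otis"],
    "species": ["cat", "dog", "cat"],
    "age": [3, 5, 2],
}

# Build a DataFrame from the dictionary.
df = pd.DataFrame(pets)

# Filter rows with a boolean condition on one column.
cats = df[df["species"] == "cat"]

# Convert back to a list of per-row dictionaries.
records = df.to_dict(orient="records")

print(df)
print(records[1])
```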

  16. Data for Pandas Tutorial for Beginners

    • kaggle.com
    zip
    Updated Feb 28, 2022
    Cite
    Le Liem (2022). Data for Pandas Tutorial for Beginners [Dataset]. https://www.kaggle.com/datasets/thanhlimlk/data-for-pandas-tutorial-for-beginners
    Explore at:
    zip (1212 bytes)
    Dataset updated
    Feb 28, 2022
    Authors
    Le Liem
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset is for beginners to practice Pandas.

    Content

    This dataset contains some errors that need to be fixed. You can use it to practice cleaning NaN values with basic Pandas techniques.
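
    The basic NaN-cleaning techniques usually mean counting, dropping, or filling missing values. The following is a minimal sketch on made-up data; the column names duration and calories are assumptions for illustration and may not match the files in this dataset.

```python
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "duration": [60, 45, float("nan"), 30],
    "calories": [409.1, float("nan"), 300.0, 195.1],
})

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# Option 1: drop every row that contains at least one NaN.
dropped = df.dropna()

# Option 2: fill NaN in one column with that column's mean,
# leaving the other columns untouched.
filled = df.fillna({"calories": df["calories"].mean()})

print(missing_per_column)
print(dropped)
print(filled)
```

    Dropping rows is simplest but discards data; filling keeps every row at the cost of introducing estimated values.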

    Acknowledgements

    I obtained this dataset from W3Schools.

  17. PD-Practice_Data

    • kaggle.com
    zip
    Updated Oct 4, 2024
    Cite
    AyanStark (2024). PD-Practice_Data [Dataset]. https://www.kaggle.com/datasets/ayanstark/pd-practice-data
    Explore at:
    zip (15089 bytes)
    Dataset updated
    Oct 4, 2024
    Authors
    AyanStark
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by AyanStark

    Released under MIT

    Contents

  18. Data analysis with pandas and python

    • kaggle.com
    zip
    Updated Apr 16, 2023
    Cite
    乡TOBY乡 (2023). Data analysis with pandas and python [Dataset]. https://www.kaggle.com/datasets/toby000/data-analysis-with-pandas-and-python
    Explore at:
    zip (701073 bytes)
    Dataset updated
    Apr 16, 2023
    Authors
    乡TOBY乡
    Description

    This dataset includes data that is provided in the Udemy course "Data Analysis with Pandas and Python" by Boris Paskhaver.

  19. Visualization of Data Via Pandas

    • kaggle.com
    zip
    Updated Jan 17, 2021
    Cite
    Aditya Bhushan (2021). Visualization of Data Via Pandas [Dataset]. https://www.kaggle.com/bhushanaditya/visualization-of-data-via-pandas
    Explore at:
    zip (39173 bytes)
    Dataset updated
    Jan 17, 2021
    Authors
    Aditya Bhushan
    Description

    Dataset

    This dataset was created by Aditya Bhushan

    Contents

  20. data ex pandas

    • kaggle.com
    zip
    Updated Nov 24, 2021
    Cite
    Bertille Pagès (2021). data ex pandas [Dataset]. https://www.kaggle.com/datasets/bertillepags/data-ex-pandas
    Explore at:
    zip (1867147 bytes)
    Dataset updated
    Nov 24, 2021
    Authors
    Bertille Pagès
    Description

    Dataset

    This dataset was created by Bertille Pagès

    Contents


Pandas Practice Dataset

Dataset to Practice Your Pandas Skill's

Explore at:
4 scholarly articles cite this dataset (View in Google Scholar)
zip (493 bytes)
Dataset updated
Jan 27, 2023
Authors
Mrityunjay Pathak
License

https://creativecommons.org/publicdomain/zero/1.0/
