100+ datasets found
  1. Pandas Practice Dataset

    • kaggle.com
    zip
    Updated Jan 27, 2023
    Cite
    Mrityunjay Pathak (2023). Pandas Practice Dataset [Dataset]. https://www.kaggle.com/datasets/themrityunjaypathak/pandas-practice-dataset/discussion
    Explore at:
    zip (493 bytes)
    Dataset updated
    Jan 27, 2023
    Authors
    Mrityunjay Pathak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    What is Pandas?

    Pandas is a Python library used for working with data sets.

    It has functions for analyzing, cleaning, exploring, and manipulating data.

    The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.

    Why Use Pandas?

    Pandas allows us to analyze big data and draw conclusions based on statistical theories.

    Pandas can clean messy data sets, and make them readable and relevant.

    Relevant data is very important in data science.

    What Can Pandas Do?

    Pandas gives you answers about the data, such as:

    Is there a correlation between two or more columns?

    What is the average value?

    Max value?

    Min value?
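
    The questions above map onto a few one-line pandas calls. A minimal sketch, assuming a small made-up DataFrame; the column names hours and score are illustrative and are not taken from the dataset:

```python
import pandas as pd

# Hypothetical data; the columns are illustrative, not from the dataset.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 58, 65, 71, 80],
})

corr = df["hours"].corr(df["score"])  # Pearson correlation between two columns
avg_score = df["score"].mean()        # average value
max_score = df["score"].max()         # max value
min_score = df["score"].min()         # min value

print(corr, avg_score, max_score, min_score)
```

    For more than two columns, df.corr() returns the full pairwise correlation matrix.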

  2. EDA with Pandas

    • kaggle.com
    zip
    Updated Feb 15, 2023
    Cite
    Amir Raja (2023). EDA with Pandas [Dataset]. https://www.kaggle.com/datasets/amirraja/eda-with-pandas
    Explore at:
    zip (231014 bytes)
    Dataset updated
    Feb 15, 2023
    Authors
    Amir Raja
    Description

    Dataset

    This dataset was created by Amir Raja

    Contents

  3. PandasPlotBench

    • huggingface.co
    Updated Nov 25, 2024
    Cite
    JetBrains Research (2024). PandasPlotBench [Dataset]. https://huggingface.co/datasets/JetBrains-Research/PandasPlotBench
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 25, 2024
    Dataset provided by
    JetBrains (http://jetbrains.com/)
    Authors
    JetBrains Research
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    PandasPlotBench

    PandasPlotBench is a benchmark to assess the capability of models in writing the code for visualizations given the description of the Pandas DataFrame. Task: given the plotting task and the description of a Pandas DataFrame, write the code to build a plot. The dataset is based on the Matplotlib gallery. The paper can be found on arXiv: https://arxiv.org/abs/2412.02764v1. To score your model on this dataset, you can use our GitHub repository. If you have… See the full description on the dataset page: https://huggingface.co/datasets/JetBrains-Research/PandasPlotBench.

  4. wikipedia-summary-dataset-128k

    • huggingface.co
    Updated Apr 4, 2015
    Cite
    Martin Bukowski (2015). wikipedia-summary-dataset-128k [Dataset]. https://huggingface.co/datasets/mbukowski/wikipedia-summary-dataset-128k
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 4, 2015
    Authors
    Martin Bukowski
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Wikipedia Summary Dataset 128k

    This is a random subsample of 128k entries from the wikipedia summary dataset, processed with the following code:

        import pandas as pd

        df = pd.read_parquet('wikipedia-summary.parquet')
        df['l'] = df['summary'].str.len()
        rdf = df[(df['l'] > 300) & (df['l'] < 600)]

        # Filter out any rows whose 'topic' has non-alphanumeric characters
        mask = rdf['topic'].str.contains(r'^[a-zA-Z0-9 ]+$') == True
        rdf = rdf[mask == True].sample(128000)[['topic'…

    See the full description on the dataset page: https://huggingface.co/datasets/mbukowski/wikipedia-summary-dataset-128k.
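
    The processing described above can be tried on a small in-memory frame. This is a sketch under assumptions: the rows are made up, the sample size is shrunk from 128000 to 2 so that it runs on toy data, and the final column selection (truncated in the description) is left out.

```python
import pandas as pd

# Made-up rows standing in for wikipedia-summary.parquet.
df = pd.DataFrame({
    "topic": ["Alan Turing", "C++", "Python 3", "R&B"],
    "summary": ["a" * 400, "b" * 450, "c" * 500, "d" * 100],
})

# Keep summaries whose length is strictly between 300 and 600 characters.
df["l"] = df["summary"].str.len()
rdf = df[(df["l"] > 300) & (df["l"] < 600)]

# Keep only topics made of ASCII letters, digits and spaces.
mask = rdf["topic"].str.contains(r"^[a-zA-Z0-9 ]+$")
rdf = rdf[mask].sample(2, random_state=0)

print(sorted(rdf["topic"]))
```

    Passing random_state makes the subsample repeatable; the original snippet does not set it, so its output differs from run to run.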

  5. Capstone Project TikTok - EDA

    • kaggle.com
    zip
    Updated Nov 15, 2023
    Cite
    Sohail K. Nikouzad (2023). Capstone Project TikTok - EDA [Dataset]. https://www.kaggle.com/datasets/sohailnikouzad/capstone-pr0ject-tiktok-eda
    Explore at:
    zip (52324 bytes)
    Dataset updated
    Nov 15, 2023
    Authors
    Sohail K. Nikouzad
    Description

    Dataset

    This dataset was created by Sohail K. Nikouzad

    Contents

  6. Retail Data Customer Summary (Learn Pandas Basics)

    • kaggle.com
    zip
    Updated Jul 26, 2020
    Cite
    Kunaal Naik (2020). Retail Data Customer Summary (Learn Pandas Basics) [Dataset]. https://www.kaggle.com/funxexcel/retail-data-customer-summary-learn-pandas-basics
    Explore at:
    zip (162094 bytes)
    Dataset updated
    Jul 26, 2020
    Authors
    Kunaal Naik
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    I have taught many students to use Pandas. Often, many lacked context to apply their newly acquired skills. This dataset will help new learners work on their Pandas skills.

    Content

    This dataset contains 13 columns and 6889 rows. The data is at a unique customer level. Each customer's transaction amount and number of transactions are present in separate columns (or unpivoted). The data also contains each customer's first and last transaction dates.

    Acknowledgements

    To be added.

    Inspiration

    I was inspired by creating contextual questions that will help students learn Pandas faster.

  7. Dataset for class comment analysis

    • data.niaid.nih.gov
    Updated Feb 22, 2022
    Cite
    Pooja Rani (2022). Dataset for class comment analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4311838
    Explore at:
    Dataset updated
    Feb 22, 2022
    Dataset provided by
    University of Bern
    Authors
    Pooja Rani
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A list of different projects selected to analyze class comments (available in the source code) of various languages such as Java, Python, and Pharo. The projects vary in terms of size, contributors, and domain.

    Structure

    Projects/
      Java_projects/
        eclipse.zip
        guava.zip
        guice.zip
        hadoop.zip
        spark.zip
        vaadin.zip
    
      Pharo_projects/
        images/
          GToolkit.zip
          Moose.zip
          PetitParser.zip
          Pillar.zip
          PolyMath.zip
          Roassal2.zip
          Seaside.zip
    
        vm/
          70-x64/Pharo
    
        Scripts/
          ClassCommentExtraction.st
          SampleSelectionScript.st    
    
      Python_projects/
        django.zip
        ipython.zip
        Mailpile.zip
        pandas.zip
        pipenv.zip
        pytorch.zip   
        requests.zip 
      
    

    Contents of the Replication Package

    Projects/ contains the raw projects of each language that are used to analyze class comments.

    • Java_projects/

      • eclipse.zip - Eclipse project downloaded from GitHub. More detail about the project is available on GitHub (Eclipse).
      • guava.zip - Guava project downloaded from GitHub. More detail about the project is available on GitHub (Guava).
      • guice.zip - Guice project downloaded from GitHub. More detail about the project is available on GitHub (Guice).
      • hadoop.zip - Apache Hadoop project downloaded from GitHub. More detail about the project is available on GitHub (Apache Hadoop).
      • spark.zip - Apache Spark project downloaded from GitHub. More detail about the project is available on GitHub (Apache Spark).
      • vaadin.zip - Vaadin project downloaded from GitHub. More detail about the project is available on GitHub (Vaadin).

    • Pharo_projects/

      • images/ -

        • GToolkit.zip - Gtoolkit project is imported into the Pharo image. We can run this image with the virtual machine given in the vm/ folder. The script to extract the comments is already provided in the image.
        • Moose.zip - Moose project is imported into the Pharo image. We can run this image with the virtual machine given in the vm/ folder. The script to extract the comments is already provided in the image.
        • PetitParser.zip - Petit Parser project is imported into the Pharo image. We can run this image with the virtual machine given in the vm/ folder. The script to extract the comments is already provided in the image.
        • Pillar.zip - Pillar project is imported into the Pharo image. We can run this image with the virtual machine given in the vm/ folder. The script to extract the comments is already provided in the image.
        • PolyMath.zip - PolyMath project is imported into the Pharo image. We can run this image with the virtual machine given in the vm/ folder. The script to extract the comments is already provided in the image.
        • Roassal2.zip - Roassal2 project is imported into the Pharo image. We can run this image with the virtual machine given in the vm/ folder. The script to extract the comments is already provided in the image.
        • Seaside.zip - Seaside project is imported into the Pharo image. We can run this image with the virtual machine given in the vm/ folder. The script to extract the comments is already provided in the image.
      • vm/ -

        • 70-x64/Pharo - Pharo7 (version 7 of Pharo) virtual machine to instantiate the Pharo images given in the images/ folder. The user can run the vm on macOS and select any of the Pharo images.

      • Scripts/ - It contains the sample Smalltalk scripts to extract class comments from various projects.

        • ClassCommentExtraction.st - A Smalltalk script to show how class comments are extracted from various Pharo projects. This script is already provided in the respective project image.

        • SampleSelectionScript.st - A Smalltalk script to show how sample class comments of Pharo projects are selected. This script can be run in any of the Pharo images given in the images/ folder.

    • Python_projects/

      • django.zip - Django project downloaded from GitHub. More detail about the project is available on GitHub (Django).
      • ipython.zip - IPython project downloaded from GitHub. More detail about the project is available on GitHub (IPython).
      • Mailpile.zip - Mailpile project downloaded from GitHub. More detail about the project is available on GitHub (Mailpile).
      • pandas.zip - pandas project downloaded from GitHub. More detail about the project is available on GitHub (pandas).
      • pipenv.zip - Pipenv project downloaded from GitHub. More detail about the project is available on GitHub (Pipenv).
      • pytorch.zip - PyTorch project downloaded from GitHub. More detail about the project is available on GitHub (PyTorch).
      • requests.zip - Requests project downloaded from GitHub. More detail about the project is available on GitHub (Requests).
  8. Summary of study pandas and GPS collar performance over the one year period...

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Vanessa Hull; Jindong Zhang; Jinyan Huang; Shiqiang Zhou; Andrés Viña; Ashton Shortridge; Rengui Li; Dian Liu; Weihua Xu; Zhiyun Ouyang; Hemin Zhang; Jianguo Liu (2023). Summary of study pandas and GPS collar performance over the one year period included in this study. [Dataset]. http://doi.org/10.1371/journal.pone.0162266.t001
    Explore at:
    xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Vanessa Hull; Jindong Zhang; Jinyan Huang; Shiqiang Zhou; Andrés Viña; Ashton Shortridge; Rengui Li; Dian Liu; Weihua Xu; Zhiyun Ouyang; Hemin Zhang; Jianguo Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of study pandas and GPS collar performance over the one year period included in this study.

  9. Summary of miRNAs sequencing.

    • plos.figshare.com
    xls
    Updated Jun 4, 2023
    + more versions
    Cite
    Mingyu Yang; Lianming Du; Wujiao Li; Fujun Shen; Zhenxin Fan; Zuoyi Jian; Rong Hou; Yongmei Shen; Bisong Yue; Xiuyue Zhang (2023). Summary of miRNAs sequencing. [Dataset]. http://doi.org/10.1371/journal.pone.0143242.t002
    Explore at:
    xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mingyu Yang; Lianming Du; Wujiao Li; Fujun Shen; Zhenxin Fan; Zuoyi Jian; Rong Hou; Yongmei Shen; Bisong Yue; Xiuyue Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of miRNAs sequencing.

  10. S&P 500 Companies Analysis Project

    • kaggle.com
    zip
    Updated Apr 6, 2025
    Cite
    anshadkaggle (2025). S&P 500 Companies Analysis Project [Dataset]. https://www.kaggle.com/datasets/anshadkaggle/s-and-p-500-companies-analysis-project
    Explore at:
    zip (9721576 bytes)
    Dataset updated
    Apr 6, 2025
    Authors
    anshadkaggle
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This project focuses on analyzing the S&P 500 companies using data analysis tools like Python (Pandas), SQL, and Power BI. The goal is to extract insights related to sectors, industries, locations, and more, and visualize them using dashboards.

    Included Files:

    sp500_cleaned.csv – Cleaned dataset used for analysis

    sp500_analysis.ipynb – Jupyter Notebook (Python + SQL code)

    dashboard_screenshot.png – Screenshot of Power BI dashboard

    README.md – Summary of the project and key takeaways

    This project demonstrates practical data cleaning, querying, and visualization skills.

  11. Summary the gender and age for all samples.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 7, 2023
    Cite
    Mingyu Yang; Lianming Du; Wujiao Li; Fujun Shen; Zhenxin Fan; Zuoyi Jian; Rong Hou; Yongmei Shen; Bisong Yue; Xiuyue Zhang (2023). Summary the gender and age for all samples. [Dataset]. http://doi.org/10.1371/journal.pone.0143242.t001
    Explore at:
    xls
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mingyu Yang; Lianming Du; Wujiao Li; Fujun Shen; Zhenxin Fan; Zuoyi Jian; Rong Hou; Yongmei Shen; Bisong Yue; Xiuyue Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary the gender and age for all samples.

  12. A Replication Dataset for Fundamental Frequency Estimation

    • live.european-language-grid.eu
    • data.niaid.nih.gov
    json
    Updated Oct 19, 2023
    Cite
    (2023). A Replication Dataset for Fundamental Frequency Estimation [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7808
    Explore at:
    json
    Dataset updated
    Oct 19, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods. © 2020, Bastian Bechtold. All rights reserved. Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech recognition, speaker identification, and speech compression. A vast number of algorithms for estimating this quantity have been proposed over the years, and a number of speech and noise corpora have been developed for evaluating their performance. The present dataset contains estimated fundamental frequency tracks of 25 algorithms, six speech corpora, two noise corpora, at nine signal-to-noise ratios between -20 and 20 dB SNR, as well as an additional evaluation of synthetic harmonic tone complexes in white noise. The dataset also contains pre-calculated performance measures both novel and traditional, in reference to each speech corpus’ ground truth, the algorithms’ own clean-speech estimate, and our own consensus truth. It can thus serve as the basis for a comparison study, or to replicate existing studies from a larger dataset, or as a reference for developing new fundamental frequency estimation algorithms. All source code and data are available to download, and entirely reproducible, albeit requiring about one year of processor-time.

    Included Code and Data

    ground truth data.zip is a JBOF dataset of fundamental frequency estimates and ground truths of all speech files in the following corpora:

    • CMU-ARCTIC (consensus truth) [1]
    • FDA (corpus truth and consensus truth) [2]
    • KEELE (corpus truth and consensus truth) [3]
    • MOCHA-TIMIT (consensus truth) [4]
    • PTDB-TUG (corpus truth and consensus truth) [5]
    • TIMIT (consensus truth) [6]

    noisy speech data.zip is a JBOF dataset of fundamental frequency estimates of speech files mixed with noise from the following corpora:

    • NOISEX [7]
    • QUT-NOISE [8]

    synthetic speech data.zip is a JBOF dataset of fundamental frequency estimates of synthetic harmonic tone complexes in white noise.

    noisy_speech.pkl and synthetic_speech.pkl are pickled Pandas dataframes of performance metrics derived from the above data for the following list of fundamental frequency estimation algorithms: AUTOC [9], AMDF [10], BANA [11], CEP [12], CREPE [13], DIO [14], DNN [15], KALDI [16], MAPSMBSC [17], NLS [18], PEFAC [19], PRAAT [20], RAPT [21], SACC [22], SAFE [23], SHR [24], SIFT [25], SRH [26], STRAIGHT [27], SWIPE [28], YAAPT [29], YIN [30]

    noisy speech evaluation.py and synthetic speech evaluation.py are Python programs to calculate the above Pandas dataframes from the above JBOF datasets. They calculate the following performance measures:

    • Gross Pitch Error (GPE), the percentage of pitches where the estimated pitch deviates from the true pitch by more than 20%.
    • Fine Pitch Error (FPE), the mean error of grossly correct estimates.
    • High/Low Octave Pitch Error (OPE), the percentage of pitches that are GPEs and happen to be at an integer multiple of the true pitch.
    • Gross Remaining Error (GRE), the percentage of pitches that are GPEs but not OPEs.
    • Fine Remaining Bias (FRB), the median error of GREs.
    • True Positive Rate (TPR), the percentage of true positive voicing estimates.
    • False Positive Rate (FPR), the percentage of false positive voicing estimates.
    • False Negative Rate (FNR), the percentage of false negative voicing estimates.
    • F₁, the harmonic mean of precision and recall of the voicing decision.
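
    Two of the measures listed above, GPE and F₁, are simple enough to sketch directly. This is a minimal illustration, assuming pitch tracks are given as equal-length lists of values in Hz for voiced frames and voicing decisions as booleans. The function names and sample numbers are made up, and this is not the code from the evaluation scripts.

```python
def gross_pitch_error(true_f0, est_f0, threshold=0.2):
    """Percentage of frames whose estimate deviates from the true pitch by more than 20%."""
    gross = [abs(e - t) / t > threshold for t, e in zip(true_f0, est_f0)]
    return 100.0 * sum(gross) / len(gross)


def voicing_f1(true_voiced, est_voiced):
    """Harmonic mean of precision and recall of the voicing decision.

    Assumes at least one true positive, so no division by zero occurs.
    """
    pairs = list(zip(true_voiced, est_voiced))
    tp = sum(1 for t, e in pairs if t and e)
    fp = sum(1 for t, e in pairs if not t and e)
    fn = sum(1 for t, e in pairs if t and not e)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example: four voiced frames, one estimate an octave too high.
true_f0 = [100.0, 110.0, 120.0, 130.0]
est_f0 = [101.0, 220.0, 118.0, 131.0]
gpe = gross_pitch_error(true_f0, est_f0)

# Made-up voicing decisions: one voiced frame is missed by the estimator.
f1 = voicing_f1([True, True, False, True], [True, False, False, True])
print(gpe, f1)
```

    In this toy input one of four frames is a gross error, and the voicing decision has perfect precision with two of three voiced frames recalled.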

    Pipfile is a pipenv-compatible pipfile for installing all prerequisites necessary for running the above Python programs.

    The Python programs take about an hour to compute on a fast 2019 computer, and require at least 32 GB of memory.

    References:

    [1] John Kominek and Alan W Black. CMU ARCTIC database for speech synthesis, 2003.
    [2] Paul C Bagshaw, Steven Hiller, and Mervyn A Jack. Enhanced Pitch Tracking and the Processing of F0 Contours for Computer Aided Intonation Teaching. In EUROSPEECH, 1993.
    [3] F Plante, Georg F Meyer, and William A Ainsworth. A Pitch Extraction Reference Database. In Fourth European Conference on Speech Communication and Technology, pages 837–840, Madrid, Spain, 1995.
    [4] Alan Wrench. MOCHA MultiCHannel Articulatory database: English, November 1999.
    [5] Gregor Pirker, Michael Wohlmayr, Stefan Petrik, and Franz Pernkopf. A Pitch Tracking Corpus with Evaluation on Multipitch Tracking Scenario. page 4, 2011.
    [6] John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, and Victor Zue. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 1993.
    [7] Andrew Varga and Herman J.M. Steeneken. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems. Speech Communication, 12(3):247–251, July 1993.
    [8] David B. Dean, Sridha Sridharan, Robert J. Vogt, and Michael W. Mason. The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. Proceedings of Interspeech 2010, 2010.
    [9] Man Mohan Sondhi. New methods of pitch extraction. Audio and Electroacoustics, IEEE Transactions on, 16(2):262–266, 1968.
    [10] Myron J. Ross, Harry L. Shaffer, Asaf Cohen, Richard Freudberg, and Harold J. Manley. Average magnitude difference function pitch extractor. Acoustics, Speech and Signal Processing, IEEE Transactions on, 22(5):353–362, 1974.
    [11] Na Yang, He Ba, Weiyang Cai, Ilker Demirkol, and Wendi Heinzelman. BaNa: A Noise Resilient Fundamental Frequency Detection Algorithm for Speech and Music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):1833–1848, December 2014.
    [12] Michael Noll. Cepstrum Pitch Determination. The Journal of the Acoustical Society of America, 41(2):293–309, 1967.
    [13] Jong Wook Kim, Justin Salamon, Peter Li, and Juan Pablo Bello. CREPE: A Convolutional Representation for Pitch Estimation. arXiv:1802.06182 [cs, eess, stat], February 2018.
    [14] Masanori Morise, Fumiya Yokomori, and Kenji Ozawa. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications. IEICE Transactions on Information and Systems, E99.D(7):1877–1884, 2016.
    [15] Kun Han and DeLiang Wang. Neural Network Based Pitch Tracking in Very Noisy Speech. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):2158–2168, December 2014.
    [16] Pegah Ghahremani, Bagher BabaAli, Daniel Povey, Korbinian Riedhammer, Jan Trmal, and Sanjeev Khudanpur. A pitch extraction algorithm tuned for automatic speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pages 2494–2498. IEEE, 2014.
    [17] Lee Ngee Tan and Abeer Alwan. Multi-band summary correlogram-based pitch detection for noisy speech. Speech Communication, 55(7-8):841–856, September 2013.
    [18] Jesper Kjær Nielsen, Tobias Lindstrøm Jensen, Jesper Rindom Jensen, Mads Græsbøll Christensen, and Søren Holdt Jensen. Fast fundamental frequency estimation: Making a statistically efficient estimator computationally efficient. Signal Processing, 135:188–197, June 2017.
    [19] Sira Gonzalez and Mike Brookes. PEFAC - A Pitch Estimation Algorithm Robust to High Levels of Noise. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(2):518–530, February 2014.
    [20] Paul Boersma. Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. In Proceedings of the institute of phonetic sciences, volume 17, page 97–110. Amsterdam, 1993.
    [21] David Talkin. A robust algorithm for pitch tracking (RAPT). Speech coding and synthesis, 495:518, 1995.
    [22] Byung Suk Lee and Daniel PW Ellis. Noise robust pitch tracking by subband autocorrelation classification. In Interspeech, pages 707–710, 2012.
    [23] Wei Chu and Abeer Alwan. SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech. In INTERSPEECH, pages 2590–2593, 2010.
    [24] Xuejing Sun. Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio. In Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on, volume 1, page I–333. IEEE, 2002.
    [25] Markel. The SIFT algorithm for fundamental frequency estimation. IEEE Transactions on Audio and Electroacoustics, 20(5):367–377, December 1972.
    [26] Thomas Drugman and Abeer Alwan. Joint Robust Voicing Detection and Pitch Estimation Based on Residual Harmonics. In Interspeech, page 1973–1976, 2011.
    [27] Hideki Kawahara, Masanori Morise, Toru Takahashi, Ryuichi Nisimura, Toshio Irino, and Hideki Banno. TANDEM-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation. In Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on, pages 3933–3936. IEEE, 2008.
    [28] Arturo Camacho. SWIPE: A sawtooth waveform inspired pitch estimator for speech and music. PhD thesis, University of Florida, 2007.
    [29] Kavita Kasi and Stephen A. Zahorian. Yet Another Algorithm for Pitch Tracking. In IEEE International Conference on Acoustics Speech and Signal Processing, pages I–361–I–364, Orlando, FL, USA, May 2002. IEEE.
    [30] Alain de Cheveigné and Hideki Kawahara. YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111(4):1917, 2002.

  13. Reproduction of PANDA: analysis for simulations and applications

    • zenodo.org
    zip
    Updated Aug 19, 2024
    + more versions
    Cite
    Meng-Guo Wang (2024). Reproduction of PANDA: analysis for simulations and applications [Dataset]. http://doi.org/10.5281/zenodo.13324624
    Explore at:
    zip
    Dataset updated
    Aug 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Meng-Guo Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data are derived from analyses based on PANDA results and are consistent with those presented in the paper "Dual decoding of cell types and gene expression in spatial transcriptomics with PANDA".

    To ensure that the file paths match those used in the code, please place the files in the following directories within your working directory before extracting them:

    "Analysis/simulations/paired_scenario.zip"

    "Analysis/simulations/unpaired_scenario.zip"

    "Analysis/simulations/merfish.zip"

    "Analysis/simulations/reference_choice.zip"

    "Analysis/simulations/parameter_sensitivity.zip"

    "Analysis/simulations/time_memory.zip"

    "Analysis/applications/melanoma.zip"

    "Analysis/applications/mouse_brain.zip"

    "Analysis/applications/human_heart.zip"

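    One way to follow the placement instruction above is to create the two directories first and then extract every archive where it sits. This is a sketch, assuming the archives have already been downloaded into those directories and that extracting them in place gives the layout the code expects. The helper name extract_all is hypothetical and not part of the PANDA code.

```python
import zipfile
from pathlib import Path


def extract_all(root="."):
    """Extract each zip under Analysis/simulations and Analysis/applications in place.

    Hypothetical helper: returns the names of the archives it extracted.
    """
    extracted = []
    for sub in ("Analysis/simulations", "Analysis/applications"):
        folder = Path(root) / sub
        # Create the directory if it is missing so the layout always exists.
        folder.mkdir(parents=True, exist_ok=True)
        for archive in sorted(folder.glob("*.zip")):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(folder)
            extracted.append(archive.name)
    return extracted
```

    Calling extract_all() from the working directory processes whichever of the listed archives are present and skips the rest.
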
  14. Multivariable effects analysis by GLM for Environmental Factors and Behavior...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jan 21, 2017
    Cite
    Liu, He; Duan, Hejun; Wang, Cheng (2017). Multivariable effects analysis by GLM for Environmental Factors and Behavior in Zoo-housed Giant Pandas. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001765677
    Explore at:
    Dataset updated
    Jan 21, 2017
    Authors
    Liu, He; Duan, Hejun; Wang, Cheng
    Description

    Multivariable effects analysis by GLM for Environmental Factors and Behavior in Zoo-housed Giant Pandas.

  15. Additional file 1 of A tail of two pandas— whole genome k-mer signature...

    • springernature.figshare.com
    xlsx
    Updated Jun 11, 2023
    Cite
    Matyas Cserhati (2023). Additional file 1 of A tail of two pandas— whole genome k-mer signature analysis of the red panda (Ailurus fulgens) and the Giant panda (Ailuropoda melanoleuca) [Dataset]. http://doi.org/10.6084/m9.figshare.14361492.v1
    Explore at:
    xlsx
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Matyas Cserhati
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 1. Results of whole genome analysis of 28 species. The file includes a list of species, and the genome sequence files downloaded from NCBI, the PCC matrix which is a result of the WGKS algorithm, as well as the species clusters and the cluster statistics.

  16. Pandu Pandas Price Prediction for 2025-12-10

    • coinunited.io
    Updated Nov 24, 2025
    + more versions
    Cite
    CoinUnited.io (2025). Pandu Pandas Price Prediction for 2025-12-10 [Dataset]. https://coinunited.io/en/data/prices/crypto/pandu-pandas-pandas/price-prediction
    Explore at:
    Dataset updated
    Nov 24, 2025
    Dataset provided by
    CoinUnited.io
    Description

    Based on professional technical analysis and AI models, deliver precise price‑prediction data for Pandu Pandas on 2025-12-10. Includes multi‑scenario analysis (bullish, baseline, bearish), risk assessment, technical‑indicator insights and market‑trend forecasts to help investors make informed trading decisions and craft sound investment strategies.

  17. urdu-speech-tagging

    • huggingface.co
    Updated May 8, 2025
    Cite
    Muhammad Sajjad Rasool (2025). urdu-speech-tagging [Dataset]. https://huggingface.co/datasets/ReySajju742/urdu-speech-tagging
    Explore at:
    Dataset updated
    May 8, 2025
    Authors
    Muhammad Sajjad Rasool
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Summary Dataset

    This is a summary dataset. You can train an abstractive summarization model using this dataset. It contains 3 files, i.e. train, test and val. Data is in jsonl format. Every line has these keys: id, url, title, summary, text.

    You can easily read the data with pandas:

        import pandas as pd
        test = pd.read_json("summary/urdu_test.jsonl", lines=True)
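
    The same call works on any JSON Lines source. A self-contained sketch that reads two made-up records from an in-memory buffer instead of summary/urdu_test.jsonl; the record values are placeholders, and only the key names come from the description:

```python
import io

import pandas as pd

# Two placeholder records using the keys named in the description.
jsonl = io.StringIO(
    '{"id": 1, "url": "https://example.org/a", "title": "A", "summary": "s1", "text": "t1"}\n'
    '{"id": 2, "url": "https://example.org/b", "title": "B", "summary": "s2", "text": "t2"}\n'
)

# lines=True tells pandas to parse one JSON object per line.
test = pd.read_json(jsonl, lines=True)
print(test.shape)
```

    Each line becomes one row and each key becomes one column.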

    POS dataset

    Urdu dataset for POS training. This is a small dataset and can be used for training parts of speech tagging… See the full description on the dataset page: https://huggingface.co/datasets/ReySajju742/urdu-speech-tagging.

  18. Changes in the Milk Metabolome of the Giant Panda (Ailuropoda melanoleuca)...

    • figshare.com
    • data.niaid.nih.gov
    • +1 more
    pdf
    Updated May 31, 2023
    Cite
    Tong Zhang; Rong Zhang; Liang Zhang; Zhihe Zhang; Rong Hou; Hairui Wang; I. Kati Loeffler; David G. Watson; Malcolm W. Kennedy (2023). Changes in the Milk Metabolome of the Giant Panda (Ailuropoda melanoleuca) with Time after Birth – Three Phases in Early Lactation and Progressive Individual Differences [Dataset]. http://doi.org/10.1371/journal.pone.0143417
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Tong Zhang; Rong Zhang; Liang Zhang; Zhihe Zhang; Rong Hou; Hairui Wang; I. Kati Loeffler; David G. Watson; Malcolm W. Kennedy
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Ursids (bears) in general, and giant pandas in particular, are highly altricial at birth. The components of bear milks and their changes with time may be uniquely adapted to nourish relatively immature neonates, protect them from pathogens, and support the maturation of neonatal digestive physiology. Serial milk samples collected from three giant pandas in early lactation were subjected to untargeted metabolite profiling and multivariate analysis. Changes in milk metabolites with time after birth were analysed by Principal Component Analysis, Hierarchical Cluster Analysis and further supported by Orthogonal Partial Least Square-Discriminant Analysis, revealing three phases of milk maturation: days 1–6 (Phase 1), days 7–20 (Phase 2), and beyond day 20 (Phase 3). While the compositions of Phase 1 milks were essentially indistinguishable among individuals, divergences emerged during the second week of lactation. OPLS regression analysis positioned against the growth rate of one cub tentatively inferred a correlation with changes in the abundance of a trisaccharide, isoglobotriose, previously observed to be a major oligosaccharide in ursid milks. Three artificial milk formulae used to feed giant panda cubs were also analysed, and were found to differ markedly in component content from natural panda milk. These findings have implications for the dependence of the ontogeny of all species of bears, and potentially other members of the Carnivora and beyond, on the complexity and sequential changes in maternal provision of micrometabolites in the immediate period after birth.

  19. San Francisco Road Safety Analysis

    • hub.arcgis.com
    Updated Feb 18, 2021
    Cite
    University of California San Diego (2021). San Francisco Road Safety Analysis [Dataset]. https://hub.arcgis.com/documents/UCSDOnline::san-francisco-road-safety-analysis/about
    Dataset updated
    Feb 18, 2021
    Dataset authored and provided by
    University of California San Diego
    Area covered
    San Francisco
    Description

    Our main question is what San Francisco's road safety problems are and what the city is doing to fix them. Our first approach is to check for correlation between specific populations, by census tract, and collision rates. If that approach fails, the alternative is to look at how collision rates correlate with public safety projects. By looking at how the projects have affected road safety, we can assess whether the city is on the right track or whether the projects are a waste of time and money. Our original proposal was to analyze traffic in San Francisco, on the assumption that we could use data from Uber Movement. Due to constraints described in the Data Sources section, we were unable to perform that analysis, so we switched to analyzing road safety instead.

    Notable Modules Used: Python: pandas, geopandas, shapely, matplotlib, scipy; ArcGIS: aggregate_points
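    The first approach described above, correlating a population measure with collision rates by census tract, might look roughly like the sketch below. The table, the column names (`pct_over_65`, `collisions_per_1000`) and every number are hypothetical and are not taken from the project; pandas' `Series.corr` is used here as a stand-in for whatever method the authors actually applied.

```python
import pandas as pd

# Hypothetical per-tract table: all names and numbers are invented for
# illustration and do not come from the project data.
tracts = pd.DataFrame(
    {
        "tract_id": ["A", "B", "C", "D", "E"],
        "pct_over_65": [10.0, 12.0, 15.0, 20.0, 25.0],
        "collisions_per_1000": [2.0, 2.5, 3.1, 3.9, 5.2],
    }
)

# Pearson correlation between the population measure and the collision rate.
r = tracts["pct_over_65"].corr(tracts["collisions_per_1000"])
print(f"Pearson r = {r:.3f}")
```

    A coefficient near zero would suggest falling back to the alternative approach of comparing collision rates against the public safety projects.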

  20. Table_6_Metagenomic Analysis of Bacteria, Fungi, Bacteriophages, and...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    docx
    Updated May 31, 2023
    + more versions
    Cite
    Shengzhi Yang; Xin Gao; Jianghong Meng; Anyun Zhang; Yingmin Zhou; Mei Long; Bei Li; Wenwen Deng; Lei Jin; Siyue Zhao; Daifu Wu; Yongguo He; Caiwu Li; Shuliang Liu; Yan Huang; Hemin Zhang; Likou Zou (2023). Table_6_Metagenomic Analysis of Bacteria, Fungi, Bacteriophages, and Helminths in the Gut of Giant Pandas.DOCX [Dataset]. http://doi.org/10.3389/fmicb.2018.01717.s021
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Shengzhi Yang; Xin Gao; Jianghong Meng; Anyun Zhang; Yingmin Zhou; Mei Long; Bei Li; Wenwen Deng; Lei Jin; Siyue Zhao; Daifu Wu; Yongguo He; Caiwu Li; Shuliang Liu; Yan Huang; Hemin Zhang; Likou Zou
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    To obtain full details of the gut microbiota, including bacteria, fungi, bacteriophages, and helminths, in giant pandas (GPs), we created a comprehensive microbial genome database and aligned metagenomic sequences against it. We delineated detailed and different gut microbiota structures of GPs. A total of 680 species of bacteria, 198 fungi, 185 bacteriophages, and 45 helminths were found. Compared with 16S rRNA sequencing, the dominant bacterial phyla included not only Proteobacteria, Firmicutes, Bacteroidetes, and Actinobacteria but also Cyanobacteria and eight other phyla. Aside from Ascomycota, Basidiomycota, and Glomeromycota, Mucoromycota, and Microsporidia were the dominant fungi phyla. The bacteriophages were predominantly dsDNA Myoviridae, Siphoviridae, Podoviridae, ssDNA Inoviridae, and Microviridae. For helminths, phylum Nematoda was dominant. In addition to previously described parasites, another 44 species of helminths were found in GPs. Differences in microbiota abundance were also found between captive, semiwild, and wild GPs. A total of 1,739 genes encoding cellulase, β-glucosidase, and cellulose β-1,4-cellobiosidase were responsible for the metabolism of cellulose, and 128,707 putative glycoside hydrolase genes were found in bacteria/fungi. Taken together, the results indicated that not only bacteria but also fungi, bacteriophages, and helminths were diverse in the gut of giant pandas, which provides a basis for further identification of the role of the gut microbiota. In addition, metagenomics revealed that the bacteria/fungi in the gut of GPs harbor the ability to degrade cellulose and hemicellulose.

Cite
Mrityunjay Pathak (2023). Pandas Practice Dataset [Dataset]. https://www.kaggle.com/datasets/themrityunjaypathak/pandas-practice-dataset/discussion

Pandas Practice Dataset

Dataset to Practice Your Pandas Skills

4 scholarly articles cite this dataset
Available download formats: zip (493 bytes)
Dataset updated
Jan 27, 2023
Authors
Mrityunjay Pathak
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

What is Pandas?

Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis"; the library was created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze big data and draw conclusions based on statistical theories.

Pandas can clean messy data sets, and make them readable and relevant.

Relevant data is very important in data science.

What Can Pandas Do?

Pandas gives you answers about the data, such as:

Is there a correlation between two or more columns?

What is the average value?

What is the maximum value?

What is the minimum value?
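
The questions above map onto a handful of standard pandas calls. The sketch below is illustrative only: the table and its column names (`duration`, `pulse`) are made up and are not taken from the practice dataset itself.

```python
import pandas as pd

# Small made-up table; the column names and values are illustrative only.
df = pd.DataFrame(
    {
        "duration": [60, 45, 30, 60],
        "pulse": [110, 100, 90, 120],
    }
)

print(df.corr())           # pairwise correlation between numeric columns
print(df["pulse"].mean())  # average value
print(df["pulse"].max())   # maximum value
print(df["pulse"].min())   # minimum value
```

`df.describe()` reports the mean, minimum and maximum of every numeric column in one call, which can be a convenient first look at a new table.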
