100+ datasets found
  1. Folha - News of the Brazilian Newspaper - 2024

    • kaggle.com
    Updated Feb 24, 2024
    Cite
    luisfcaldeira (2024). Folha - News of the Brazilian Newspaper - 2024 [Dataset]. https://www.kaggle.com/datasets/luisfcaldeira/folha-news-of-the-brazilian-newspaper-2024
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 24, 2024
    Dataset provided by
    Kaggle
    Authors
    luisfcaldeira
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Data collected from the Folha website, with dates prior to February 2024. The data encoding is UTF-8. You may still need to clean the data, although I have already done some work in this regard. The columns are: Title, Content, URL, Published and Category. I developed the application used to collect the data in C#, and you can find the repository at the link below.

    https://github.com/luisfcaldeira/WebScrapper

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F19155838%2Fb33a84bd4c5c02defaf5bc4574afd042%2Fcloud%20-%20Copia.png?generation=1708745865454053&alt=media
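    As a rough illustration of working with the columns listed in the description, the sketch below counts articles per category using only the Python standard library. The sample rows, the exact header spelling, and the `count_by_category` helper are assumptions made for illustration; the real file name and layout are not given here.

```python
import csv
import io

# Hypothetical two-row sample. The header follows the column list in the
# description; the exact spelling in the real file is an assumption.
SAMPLE = (
    "Title,Content,URL,Published,Category\n"
    "Exemplo de título,Texto da notícia,https://example.com/a,2024-01-15,mercado\n"
    "Outro título,Mais texto,https://example.com/b,2024-01-16,esporte\n"
)


def count_by_category(handle):
    """Count articles per Category from a CSV file-like object."""
    counts = {}
    for row in csv.DictReader(handle):
        category = row["Category"].strip()
        counts[category] = counts.get(category, 0) + 1
    return counts


counts = count_by_category(io.StringIO(SAMPLE))
print(counts)
```

    For a file on disk, the same function could be called on `open(path, encoding="utf-8", newline="")`, since the description states the data is UTF-8.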

  2. Kaggle-LLM-Science-Exam

    • huggingface.co
    Updated Aug 8, 2023
    Cite
    Sangeetha Venkatesan (2023). Kaggle-LLM-Science-Exam [Dataset]. https://huggingface.co/datasets/Sangeetha/Kaggle-LLM-Science-Exam
    Dataset updated
    Aug 8, 2023
    Authors
    Sangeetha Venkatesan
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for [LLM Science Exam Kaggle Competition]

      Dataset Summary
    

    https://www.kaggle.com/competitions/kaggle-llm-science-exam/data

      Languages
    

    [en, de, tl, it, es, fr, pt, id, pl, ro, so, ca, da, sw, hu, no, nl, et, af, hr, lv, sl]

      Dataset Structure
    

    Columns:

    • prompt - the text of the question being asked
    • A - option A; if this option is correct, then answer will be A
    • B - option B; if this option is correct, then answer will be B
    • C - option C; if this…

    See the full description on the dataset page: https://huggingface.co/datasets/Sangeetha/Kaggle-LLM-Science-Exam.
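    To make the column layout concrete, here is a minimal sketch that renders one row as a multiple-choice question. The example row, the `format_question` helper, and the `answer` key name are assumptions for illustration; only the options visible in the truncated description (A to C) are used.

```python
def format_question(row, option_keys=("A", "B", "C")):
    """Render a row as the question text followed by lettered options."""
    lines = [row["prompt"]]
    for key in option_keys:
        lines.append(f"{key}. {row[key]}")
    return "\n".join(lines)


# Hypothetical row; the "answer" key name is an assumption based on the
# wording "then answer will be A" in the description.
row = {
    "prompt": "Which gas do plants absorb during photosynthesis?",
    "A": "Oxygen",
    "B": "Carbon dioxide",
    "C": "Nitrogen",
    "answer": "B",
}

text = format_question(row)
print(text)
print("Correct option:", row[row["answer"]])
```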

  3. imagenes

    • kaggle.com
    Updated Feb 16, 2023
    Cite
    Maria Virginia Forcone (2023). imagenes [Dataset]. https://www.kaggle.com/datasets/mariavirginiaforcone/imagenes
    Dataset updated
    Feb 16, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Maria Virginia Forcone
    Description

    Data for the final project of the Diploma in Machine Learning with Python, from the Instituto Data Science, entitled "Análisis de la deforestación en la región aledaña a la localidad de Joaquín V. González, Salta" (Analysis of deforestation in the region surrounding the town of Joaquín V. González, Salta): https://www.kaggle.com/code/mariavirginiaforcone/deforestation-analysis-main-notebook

    It contains a report that summarizes the work and a folder that includes:

    • Satellite images in a false-color composite (nir, swir1, red) of the region under analysis, in .tif format. They cover the period from 1986 to 2021, one image per year.
    • Training dataset (train_df.csv) for the supervised classification model.
    • Mean metrics dataset (mean_data.csv) of the supervised classification model.
    • Time series dataset of deforestation (decrease in native vegetation) (area_0.csv).
    • Dataset with the error of each point in the time series (error_0.csv).
  4. Breast Tissue Impedance Measurements

    • kaggle.com
    Updated Jul 1, 2024
    Cite
    Tarık Tuna Taşaltı (2024). Breast Tissue Impedance Measurements [Dataset]. https://www.kaggle.com/datasets/tarktunataalt/breast-tissue-impedance-measurements
    Dataset updated
    Jul 1, 2024
    Dataset provided by
    Kaggle
    Authors
    Tarık Tuna Taşaltı
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Breast Tissue Dataset

    This dataset contains electrical impedance measurements of freshly excised tissue samples from the breast. The data is sourced from the UCI Machine Learning Repository.

    Dataset Characteristics

    • Type: Multivariate
    • Subject Area: Health and Medicine
    • Associated Tasks: Classification
    • Feature Type: Real
    • Instances: 106
    • Features: Various impedance measurements

    Dataset Information

    Impedance measurements were taken at the following frequencies: 15.625, 31.25, 62.5, 125, 250, 500, and 1000 kHz. These measurements, when plotted in the (real, -imaginary) plane, constitute the impedance spectrum from which the breast tissue features are computed. The dataset can be used to predict either the original 6 classes or 4 classes obtained by merging the fibro-adenoma, mastopathy, and glandular classes, which are hard to discriminate.

    Features

    • I0: Impedivity (ohm) at zero frequency
    • PA500: Phase angle at 500 kHz
    • HFS: High-frequency slope of phase angle
    • DA: Impedance distance between spectral ends
    • AREA: Area under spectrum
    • A/DA: Area normalized by DA
    • MAX IP: Maximum of the spectrum
    • DR: Distance between I0 and real part of the maximum frequency point
    • P: Length of the spectral curve
    • Class: Tissue type (carcinoma, fibro-adenoma, mastopathy, glandular, connective, adipose)

    Classes

    • car: Carcinoma
    • fad: Fibro-adenoma
    • mas: Mastopathy
    • gla: Glandular
    • con: Connective
    • adi: Adipose
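
    The 4-class variant mentioned under Dataset Information can be sketched as a simple label mapping. The merged label name `fad+mas+gla` and the `to_four_classes` helper are hypothetical; only the six class codes come from the list above.

```python
# Hypothetical name for the merged class; the source only says that
# fibro-adenoma, mastopathy, and glandular are merged.
MERGED_LABEL = "fad+mas+gla"
MERGE_MAP = {"fad": MERGED_LABEL, "mas": MERGED_LABEL, "gla": MERGED_LABEL}


def to_four_classes(label):
    """Map one of the six class codes onto the 4-class scheme."""
    return MERGE_MAP.get(label, label)


six_classes = ["car", "fad", "mas", "gla", "con", "adi"]
four_classes = sorted({to_four_classes(label) for label in six_classes})
print(four_classes)
```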

    Usage

    This dataset is suitable for classification tasks. The impedance measurements can be used to predict the type of breast tissue.

    If you use this dataset, please cite it as follows:

    S, JP and Jossinet, J. (2010). Breast Tissue. UCI Machine Learning Repository. https://doi.org/10.24432/C5P31H.

    @misc{misc_breast_tissue_192,
    author = "S, JP and Jossinet, J",
    title = "Breast Tissue",
    year = 2010,
    howpublished = "UCI Machine Learning Repository",
    note = "DOI: https://doi.org/10.24432/C5P31H"
    }

  5. Iris Species

    • kaggle.com
    zip
    Updated Sep 27, 2016
    Cite
    UCI Machine Learning (2016). Iris Species [Dataset]. https://www.kaggle.com/datasets/uciml/iris
    Available download formats: zip (3687 bytes)
    Dataset updated
    Sep 27, 2016
    Dataset authored and provided by
    UCI Machine Learning
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Iris dataset was used in R.A. Fisher's classic 1936 paper, The Use of Multiple Measurements in Taxonomic Problems, and can also be found on the UCI Machine Learning Repository.

    It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other.

    The columns in this dataset are:

    • Id
    • SepalLengthCm
    • SepalWidthCm
    • PetalLengthCm
    • PetalWidthCm
    • Species
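
    As a small illustration of the column layout and the equal class sizes described above, the sketch below counts rows per species and checks that the classes are balanced. The four sample rows and the species label strings are assumptions for illustration, not taken from the file itself.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows using the column names listed above.
SAMPLE = (
    "Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species\n"
    "1,5.1,3.5,1.4,0.2,Iris-setosa\n"
    "2,4.9,3.0,1.4,0.2,Iris-setosa\n"
    "3,7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "4,6.4,3.2,4.5,1.5,Iris-versicolor\n"
)


def species_counts(handle):
    """Count rows per Species in a CSV file-like object."""
    return Counter(row["Species"] for row in csv.DictReader(handle))


def is_balanced(counts):
    """True when every species has the same number of rows."""
    return len(set(counts.values())) == 1


counts = species_counts(io.StringIO(SAMPLE))
balanced = is_balanced(counts)
print(dict(counts), balanced)
```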

    [Figure: Sepal Width vs. Sepal Length]

  6. ‘Campeonato Brasileiro de futebol’ analyzed by Analyst-2

    • analyst-2.ai
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Campeonato Brasileiro de futebol’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-campeonato-brasileiro-de-futebol-76c7/884f5307/?iid=019-463&v=presentation
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Campeonato Brasileiro de futebol’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/adaoduque/campeonato-brasileiro-de-futebol on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Campeonato Brasileiro de Futebol

    18 years of the Brazilian football championship

    Content

    A total of 7645 matches from 2003 to 2021

    Project GitHub

    https://github.com/adaoduque/Brasileirao_Dataset

    --- Original source retains full ownership of the source dataset ---

  7. D.A.Project_1

    • kaggle.com
    Updated Jul 15, 2024
    Cite
    Shail_2604 (2024). D.A.Project_1 [Dataset]. https://www.kaggle.com/shail2604/d-a-project-1/discussion
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shail_2604
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Shail_2604

    Released under Apache 2.0

    Contents

  8. Data from: College Completion Dataset

    • kaggle.com
    Updated Dec 6, 2022
    Cite
    The Devastator (2022). College Completion Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/boost-student-success-with-college-completion-da
    Dataset updated
    Dec 6, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    College Completion Dataset

    Graduation Rates, Race, Efficiency Measures and More

    By Jonathan Ortiz [source]

    About this dataset

    This College Completion dataset provides insight into the success and progress of college students in the United States. It contains graduation rates, race and other data to offer a comprehensive view of college completion in America. The data comes from two primary sources: the National Center for Education Statistics (NCES) Integrated Postsecondary Education Data System (IPEDS) and the Voluntary System of Accountability's Student Success and Progress rate.

    The graduation figures come from IPEDS for first-time, full-time degree-seeking students at the undergraduate level who entered college six years earlier at four-year institutions or three years earlier at two-year institutions. Furthermore, colleges report how many students completed their program within 100 percent and 150 percent of normal time, which at four-year institutions corresponds to graduation within four years or six years, respectively. Students reported as being of two or more races are included in totals but not shown separately.
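
    The arithmetic behind "100 percent and 150 percent of normal time" can be written out as a short sketch. The `completion_windows` helper is hypothetical and simply multiplies the normal program length; it is not part of the dataset.

```python
def completion_windows(normal_years):
    """Return the 100% and 150% of-normal-time windows, in years."""
    return {"100%": normal_years * 1.0, "150%": normal_years * 1.5}


four_year = completion_windows(4)  # four-year institutions
two_year = completion_windows(2)   # two-year institutions
print(four_year, two_year)
```

    The 150 percent windows (six years and three years) line up with the entry cohorts described above.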

    For race and ethnicity data, NCES has classified student demographics since 2009 into seven categories: White non-Hispanic; Black non-Hispanic; American Indian/Alaskan Native; Asian/Pacific Islander; unknown race or ethnicity; and nonresident, along with two new categories: Native Hawaiian or Other Pacific Islander (combined with Asian) and students belonging to several races. Differences in the classification of graduate data from 2008 may be due to variations in the time frame examined and the groupings used by particular colleges. Students who cannot be identified from National Student Clearinghouse records are not counted as a penalty against these institutions.

    Among the efficiency measures, "Awards per 100 Full-Time Undergraduate Students" includes all undergraduate completions reported by an institution, including associate degrees and certificates from programs of less than four years. The dataset also covers expenditure categories, Pell grant percentage, endowment values, average student aid amounts, and full-time faculty members contributing to instruction, research and public service.

    For outcomes, the median estimated SAT score is derived from the 25th and 75th percentile scores, subject to a 90% threshold among incoming students. Finally, average student aid relates the amount granted by the institution to the total sum received in a given year.

    All this analysis gives an opportunity to get a holistic overview of performance, potential deficits &

    How to use the dataset

    This dataset contains data on student success, graduation rates, race and gender demographics, an efficiency measure to compare colleges across states and more. It is a great source of information to help you better understand college completion and student success in the United States.

    In this guide we’ll explain how to use the data so that you can find out the best colleges for students with certain characteristics or focus on your target completion rate. We’ll also provide some useful tips for getting the most out of this dataset when seeking guidance on which institutions offer the highest graduation rates or have a good reputation for success in terms of completing programs within normal timeframes.

    Before getting into specifics about interpreting this dataset, note that each row represents a particular institution, with fields such as its state, level (two-year vs. four-year), control (public vs. private), name and website. The columns contain information such as the rate of awarding degrees compared with other institutions in its sector, race/ethnicity makeup, full-time faculty percentage, median SAT score among first-time students, and awards and grants compared with national and state averages, depending on the institution's location.

    When using this dataset, our suggestion is that you begin by forming a hypothesis or research question concerning student completion at a given school based upon observable characteristics like financ...

  9. public_submit

    • kaggle.com
    Updated Mar 25, 2021
    Cite
    DA (2021). public_submit [Dataset]. https://www.kaggle.com/graafffff/public-submit/code
    Dataset updated
    Mar 25, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    DA
    Description

    Dataset

    This dataset was created by DA

    Contents

  10. The Dresden Surgical Anatomy Dataset

    • kaggle.com
    Updated Apr 17, 2025
    + more versions
    Cite
    Anindya Majumder (2025). The Dresden Surgical Anatomy Dataset [Dataset]. https://www.kaggle.com/datasets/anindyamajumder/the-dresden-surgical-anatomy-dataset
    Dataset updated
    Apr 17, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anindya Majumder
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The Dresden Surgical Anatomy Dataset includes semantic segmentations for eight abdominal organs (colon, liver, pancreas, small intestine, spleen, stomach, ureter, vesicular glands), the abdominal wall, and two vascular structures (inferior mesenteric artery, intestinal veins) as seen in laparoscopic views. The dataset was collected from 32 surgeries, with the majority of patients (26/32) being male, an average age of 63 years, and a mean BMI of 26.75 kg/m². All patients had clinical reasons for the procedures. The surgeries were conducted using a Da Vinci® Xi/X Endoscope with an 8mm diameter, 30° angled camera (Intuitive Surgical, Item code 470057), and recorded in MPEG-4 format at 1920 × 1080 pixel resolution, with each surgery lasting between two and ten hours. A medical student with two years of experience in robot-assisted rectal surgery (MC, FMR) used the Surgery Workflow Toolbox [Annotate] version 2.2.0 (b<>com, Cesson-Sévigné, France) to annotate the surgical processes. To ensure diversity, videos from at least 20 surgeries were selected for each anatomical structure, with up to 100 equidistant frames randomly chosen per organ. Consequently, the dataset contains at least 1,000 annotated images for each organ or structure, covering at least 20 patients. The pixel-wise segmentation was performed using 3D Slicer 4.11.20200930 (with the SlicerRT extension), an open-source medical imaging software, and was done manually with a stylus on a tablet computer. Additionally, weak labels were used to indicate the visibility of anatomical structures in each image, annotated by a medical student with experience in minimally invasive surgery and reviewed by a second annotator.

    Paper Link: The Dresden Surgical Anatomy Dataset
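
    The description mentions selecting up to 100 equidistant frames per organ from each video. One possible reading of that step is sketched below; the `equidistant_indices` helper and its rounding rule are assumptions, and the procedure actually used by the dataset authors is not specified here.

```python
def equidistant_indices(total_frames, max_samples=100):
    """Pick up to max_samples evenly spaced frame indices from a video."""
    if total_frames <= 0:
        return []
    n = min(max_samples, total_frames)
    step = total_frames / n  # spacing between consecutive samples
    return [int(i * step) for i in range(n)]


indices = equidistant_indices(1000)
print(indices[:5], len(indices))
```

    When a video has fewer frames than the cap, this sketch simply returns every frame index.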

  11. Portfolio de ativos da B3 Bovespa

    • kaggle.com
    Updated Sep 14, 2018
    Cite
    Roberto Barberá (2018). Portfolio de ativos da B3 Bovespa [Dataset]. https://www.kaggle.com/rbarbera/portfolio-de-ativos-da-b3-bovespa/discussion
    Dataset updated
    Sep 14, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Roberto Barberá
    Description

    Dataset

    This dataset was created by Roberto Barberá

    Released under Data files © Original Authors

    Contents

  12. NYC_Jobs

    • kaggle.com
    zip
    Updated Sep 2, 2021
    Cite
    Sheila Dias da Silva (2021). NYC_Jobs [Dataset]. https://www.kaggle.com/sheiladiasdasilva/nyc-jobs
    Available download formats: zip (3104402 bytes)
    Dataset updated
    Sep 2, 2021
    Authors
    Sheila Dias da Silva
    Area covered
    New York
    Description

    Dataset

    This dataset was created by Sheila Dias da Silva

    Contents

  13. Tarea 2 visualización de datos

    • kaggle.com
    Updated Nov 8, 2023
    Cite
    Francisco Alessandri (2023). Tarea 2 visualización de datos [Dataset]. https://www.kaggle.com/datasets/franciscoalessandri/tarea-2-visualizacin-de-datos/code
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Francisco Alessandri
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Francisco Alessandri

    Released under CC0: Public Domain

    Contents

  14. Audible Dataset

    • kaggle.com
    Updated Apr 11, 2022
    Cite
    Snehangsu De (2022). Audible Dataset [Dataset]. https://www.kaggle.com/datasets/snehangsude/audible-dataset
    Dataset updated
    Apr 11, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Snehangsu De
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Introduction

    With the trend toward audiobooks growing, I gathered this data to understand how the audiobook market has been growing over the years. From authors of audiobooks to release dates, the data represents the important details of audiobooks from 1998 till 2025 (pre-planned releases).

    I have yet to find a great audiobooks dataset and hence the urge to make a dataset that provides us with information on the basics and the history of audiobooks. I look to improve the dataset with more details in the near future.

    File Information

    The uncleaned data (audible_uncleaned.csv) is exactly the raw data I derived from Audible.in. The cleaned data (audible_cleaned.csv) has had a few basic data cleaning steps applied.

    Libraries used

    The data was collected using web scraping, with the following libraries:

    • re
    • Beautiful Soup
    • Selenium

    Beautiful Soup and Selenium were used together to gather most of the data. The code can be reused, and you can find it here: https://github.com/snehangsude/audible_scraper

    Column Breakdown

    • name: Name of the audiobook
    • author: Author of the audiobook
    • narrator: Narrator of the audiobook
    • time: Length of the audiobook
    • releasedate: Release date of the audiobook
    • language: Language of the audiobook
    • stars: No. of stars the audiobook received
    • price: Price of the audiobook in INR
    • ratings: No. of reviews received by the audiobook
  15. nr12atualizada

    • kaggle.com
    Updated Nov 8, 2023
    Cite
    Éderson de Almeida Pedro (2023). nr12atualizada [Dataset]. https://www.kaggle.com/dersondealmeidapedro/nr12atualizada/discussion
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Éderson de Almeida Pedro
    Description

    Dataset

    This dataset was created by Éderson de Almeida Pedro

    Contents

  16. Dataset de Crédito

    • kaggle.com
    Updated Sep 1, 2022
    Cite
    Narciso Nascimento (2022). Dataset de Crédito [Dataset]. https://www.kaggle.com/datasets/narcisonascimento/ebac-python-projeto-final-dataset/suggestions
    Dataset updated
    Sep 1, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Narciso Nascimento
    Description

    Ebac Python - Supporting material for the final project. Credit analysis.

  17. Microdados da Rede Municipal Matrículas

    • kaggle.com
    Updated Apr 11, 2018
    Cite
    yaso (2018). Microdados da Rede Municipal Matrículas [Dataset]. https://www.kaggle.com/datasets/yasodara/microdados-da-rede-municipal-matrculas/suggestions
    Dataset updated
    Apr 11, 2018
    Dataset provided by
    Kaggle
    Authors
    yaso
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by yaso

    Released under CC0: Public Domain

    Contents

  18. deteccao de objetos roboflow

    • kaggle.com
    Updated Jan 30, 2023
    Cite
    Sarah da Silva Silveira (2023). deteccao de objetos roboflow [Dataset]. https://www.kaggle.com/datasets/sarahsilveira/deteccao-de-objetos-roboflow/code
    Dataset updated
    Jan 30, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sarah da Silva Silveira
    Description

    Dataset

    This dataset was created by Sarah da Silva Silveira

    Contents

  19. pesosvaegan

    • kaggle.com
    zip
    Updated Feb 12, 2021
    + more versions
    Cite
    Guilherme H da Silva (2021). pesosvaegan [Dataset]. https://www.kaggle.com/guilhermehdasilva/pesosvaegan
    Available download formats: zip (292526110 bytes)
    Dataset updated
    Feb 12, 2021
    Authors
    Guilherme H da Silva
    Description

    Dataset

    This dataset was created by Guilherme H da Silva

    Contents

  20. DataPaises

    • kaggle.com
    zip
    Updated Oct 24, 2023
    Cite
    Zoen de Loi (2023). DataPaises [Dataset]. https://www.kaggle.com/datasets/zoendeloi/datapaises
    Available download formats: zip (566127 bytes)
    Dataset updated
    Oct 24, 2023
    Authors
    Zoen de Loi
    Description

    Dataset

    This dataset was created by Zoen de Loi

    Contents


Folha - News of the Brazilian Newspaper - 2024

150k news articles from the Folha de São Paulo site
