8 datasets found

Dataset for class comment analysis

zenodo.org
data.niaid.nih.gov

zip

Updated Feb 22, 2022

Cite

Pooja Rani (2022). Dataset for class comment analysis [Dataset]. http://doi.org/10.5281/zenodo.4311839

Available download formats: zip

Unique identifier

https://doi.org/10.5281/zenodo.4311839

Dataset updated

Feb 22, 2022

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Pooja Rani

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

A collection of projects selected for analyzing class comments (available in the source code) in several languages: Java, Python, and Pharo. The projects vary in size, number of contributors, and domain.

## Structure
```
Projects/
  Java_projects/
    eclipse.zip
    guava.zip
    guice.zip
    hadoop.zip
    spark.zip
    vaadin.zip

  Pharo_projects/
    images/
      GToolkit.zip
      Moose.zip
      PetitParser.zip
      Pillar.zip
      PolyMath.zip
      Roassal2.zip
      Seaside.zip

    vm/
      70-x64/Pharo

    Scripts/
      ClassCommentExtraction.st
      SampleSelectionScript.st    

  Python_projects/
    django.zip
    ipython.zip
    Mailpile.zip
    pandas.zip
    pipenv.zip
    pytorch.zip   
    requests.zip 
  
```

## Contents of the Replication Package
---

**Projects/** contains the raw projects of each language that are used to analyze class comments.
- **Java_projects/**
  - `eclipse.zip` - Eclipse project downloaded from GitHub. More detail about the project is available on GitHub: [Eclipse](https://github.com/eclipse).
  - `guava.zip` - Guava project downloaded from GitHub. More detail about the project is available on GitHub: [Guava](https://github.com/google/guava).
  - `guice.zip` - Guice project downloaded from GitHub. More detail about the project is available on GitHub: [Guice](https://github.com/google/guice).
  - `hadoop.zip` - Apache Hadoop project downloaded from GitHub. More detail about the project is available on GitHub: [Apache Hadoop](https://github.com/apache/hadoop).
  - `spark.zip` - Apache Spark project downloaded from GitHub. More detail about the project is available on GitHub: [Apache Spark](https://github.com/apache/spark).
  - `vaadin.zip` - Vaadin project downloaded from GitHub. More detail about the project is available on GitHub: [Vaadin](https://github.com/vaadin/framework).

- **Pharo_projects/**
  - **images/** - Each archive contains a Pharo image into which the named project has been imported. Each image can be run with the virtual machine given in the `vm/` folder, and the script to extract the comments is already provided in the image.
    - `GToolkit.zip` - GToolkit project.
    - `Moose.zip` - Moose project.
    - `PetitParser.zip` - PetitParser project.
    - `Pillar.zip` - Pillar project.
    - `PolyMath.zip` - PolyMath project.
    - `Roassal2.zip` - Roassal2 project.
    - `Seaside.zip` - Seaside project.

  - **vm/**
    - **70-x64/Pharo** - Pharo 7 virtual machine used to instantiate the Pharo images given in the `images/` folder. Users can run the VM on macOS and select any of the Pharo images.

  - **Scripts/** - Contains the sample Smalltalk scripts used to extract class comments from the various projects.
    - `ClassCommentExtraction.st` - A Smalltalk script showing how class comments are extracted from the various Pharo projects. This script is already provided in the respective project image.
    - `SampleSelectionScript.st` - A Smalltalk script showing how sample class comments of the Pharo projects are selected. This script can be run in any of the Pharo images given in the `images/` folder.


- **Python_projects/**
  - `django.zip` - Django project downloaded from GitHub. More detail about the project is available on GitHub: [Django](https://github.com/django).
  - `ipython.zip` - IPython project downloaded from GitHub. More detail about the project is available on GitHub: [IPython](https://github.com/ipython/ipython).
  - `Mailpile.zip` - Mailpile project downloaded from GitHub. More detail about the project is available on GitHub: [Mailpile](https://github.com/mailpile/Mailpile).
  - `pandas.zip` - pandas project downloaded from GitHub. More detail about the project is available on GitHub: [pandas](https://github.com/pandas-dev/pandas).
  - `pipenv.zip` - Pipenv project downloaded from GitHub. More detail about the project is available on GitHub: [Pipenv](https://github.com/pypa/pipenv).
  - `pytorch.zip` - PyTorch project downloaded from GitHub. More detail about the project is available on GitHub: [PyTorch](https://github.com/pytorch/pytorch).
  - `requests.zip` - Requests project downloaded from GitHub. More detail about the project is available on GitHub: [Requests](https://github.com/psf/requests/).

What you see is what you get: Delineating the urban jobs-housing spatial...
figshare.com
zip
Updated Feb 12, 2021
Cite
Yao Yao; Jiaqi Zhang; Chen Qian; Yu Wang; Shuliang Ren; Zehao Yuan; Qingfeng Guan (2021). What you see is what you get: Delineating the urban jobs-housing spatial distribution at a parcel scale by using street view imagery [Dataset]. http://doi.org/10.6084/m9.figshare.12960212.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.12960212.v1
Dataset updated
Feb 12, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Yao Yao; Jiaqi Zhang; Chen Qian; Yu Wang; Shuliang Ren; Zehao Yuan; Qingfeng Guan
License
https://www.gnu.org/copyleft/gpl.html
Description
The dataset consists of four compressed packages:

- Study_code.zip contains the code files for a paper under review ("What you see is what you get: Delineating urban jobs-housing spatial distribution at a parcel scale by using street view imagery based on deep learning technique").
- input_land_parcel_with_attributes.zip is the sampled mixed "jobs-housing" attribute data of the study area, with multiple probability attributes (only working, only living, working and living) at the land parcel scale.
- input_street_view_images.zip is the street view data surrounding the sampled land parcels (input_land_parcel_with_attributes.zip), with a pixel size of 240*160, obtained from Tencent Maps (https://map.qq.com/).
- output_results.zip contains the result vector files (jobs-housing pattern distribution and error distribution) and a file description (Readme.txt).

This project uses several open source Python libraries (NumPy, pandas, Selenium, GDAL, PyTorch, and scikit-learn). The project itself complies with the GPL license.

- NumPy (https://numpy.org/) is an open source numerical computing library originally developed by Travis Oliphant. Used in this project for matrix operations. This library is distributed under the BSD license.
- pandas (https://pandas.pydata.org/) is an open source library providing high-performance, easy-to-use data structures and data analysis tools. This library is distributed under the BSD license.
- Selenium (https://www.selenium.dev/) is a suite of tools for automating web browsers. Used in this project for getting street view images. This library is distributed under the Apache 2.0 license.
- GDAL (https://gdal.org/) is a translator library for raster and vector geospatial data formats. Used in this project for processing geospatial data. This library is distributed under an MIT-style license.
- PyTorch (https://pytorch.org/) is an open source machine learning framework that accelerates the path from research prototyping to production deployment. Used in this project for deep learning. This library is distributed under the BSD license.
- scikit-learn (https://scikit-learn.org/) is an open source machine learning library for Python. Used in this project for comparing precision metrics. This library is distributed under the BSD license.
Bank Data Analysis
kaggle.com
Updated Mar 19, 2022
Cite
Steve Gallegos (2022). Bank Data Analysis [Dataset]. https://www.kaggle.com/stevegallegos/bank-marketing-data-set/code
Dataset updated
Mar 19, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Steve Gallegos
Description
Data Set Information

The bank.csv dataset describes phone calls between customers and the customer care staff of a Portuguese banking institution. It records whether the customer took up a scheme or product such as a bank term deposit. Most of the data consists of 'yes' or 'no' values.

The source file, bank-additional-full.csv, contains all examples (41188) and 20 inputs, ordered by date (from May 2008 to November 2010).

The file was renamed to bank.csv after delimiting.

Goal

The main goal is to predict if clients will subscribe to a term deposit or not.

Attribute Information

Input Variables

Bank Client Data:
1 - age: (numeric)
2 - job: type of job (categorical: admin., blue-collar, entrepreneur, housemaid, management, retired, self-employed, services, student, technician, unemployed, unknown)
3 - marital: marital status (categorical: divorced, married, single, unknown; note: divorced means either divorced or widowed)
4 - education: (categorical: basic.4y, basic.6y, basic.9y, high.school, illiterate, professional.course, university.degree, unknown)
5 - default: has credit in default? (categorical: no, yes, unknown)
6 - housing: has housing loan? (categorical: no, yes, unknown)
7 - loan: has personal loan? (categorical: no, yes, unknown)

Related with the Last Contact of the Current Campaign:
8 - contact: contact communication type (categorical: cellular, telephone)
9 - month: last contact month of year (categorical: jan, feb, mar, ..., nov, dec)
10 - day_of_week: last contact day of the week (categorical: mon, tue, wed, thu, fri)
11 - duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y='no'). Yet, the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model.

Other Attributes:
12 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
13 - pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
14 - previous: number of contacts performed before this campaign and for this client (numeric)
15 - poutcome: outcome of the previous marketing campaign (categorical: failure, nonexistent, success)

Social and Economic Context Attributes:
16 - emp.var.rate: employment variation rate - quarterly indicator (numeric)
17 - cons.price.idx: consumer price index - monthly indicator (numeric)
18 - cons.conf.idx: consumer confidence index - monthly indicator (numeric)
19 - euribor3m: euribor 3 month rate - daily indicator (numeric)
20 - nr.employed: number of employees - quarterly indicator (numeric)

Output Variable (Desired Target):
21 - y (deposit): has the client subscribed to a term deposit? (binary: yes, no). The column title was changed from 'y' to 'deposit'.
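
The notes on `duration` and `pdays` translate into a small preprocessing step. The sketch below is one way to apply them with the standard library only. The inline sample rows are made up for illustration, the helper name `prepare` and the derived `previously_contacted` field are assumptions, and only a few of the documented columns are shown.

```python
import csv
import io

# Tiny inline sample using documented column names (values are illustrative only).
SAMPLE = """age,duration,pdays,deposit
41,261,999,no
35,0,999,no
52,730,6,yes
"""


def prepare(row: dict) -> dict:
    """Return model-ready features for one CSV row."""
    features = dict(row)
    # duration is only known after the call, so drop it for a realistic model.
    features.pop("duration", None)
    # pdays == 999 is a sentinel meaning "not previously contacted".
    pdays = int(features.pop("pdays"))
    features["previously_contacted"] = pdays != 999
    features["pdays"] = None if pdays == 999 else pdays
    # Encode the target as 1 (yes) or 0 (no).
    features["deposit"] = 1 if features["deposit"] == "yes" else 0
    return features


rows = [prepare(r) for r in csv.DictReader(io.StringIO(SAMPLE))]

if __name__ == "__main__":
    for row in rows:
        print(row)
```

Keeping `duration` is still reasonable when the goal is only to reproduce benchmark numbers, as the attribute note says.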

Source

[Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014
Olympics game data analysis
kaggle.com
Updated Mar 2, 2025
Cite
sarita (2025). Olympics game data analysis [Dataset]. https://www.kaggle.com/datasets/saritas95/olympics-game-data-analysis/suggestions
Dataset updated
Mar 2, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
sarita
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The Olympics Data Analysis project explores historical Olympic data using Exploratory Data Analysis (EDA) techniques. By leveraging Python libraries such as pandas, seaborn, and matplotlib, the project uncovers patterns in medal distribution, athlete demographics, and country-wise performance.

Key findings reveal that most medalists are aged between 20-30 years, with USA, China, and Russia leading in total medals. Over time, female participation has increased significantly, reflecting improved gender equality in sports. Additionally, athlete characteristics like height and weight play a crucial role in certain sports, such as basketball (favoring taller players) and gymnastics (favoring younger athletes).

The project includes interactive visualizations such as heatmaps, medal trends, and gender-wise participation charts to provide a comprehensive understanding of Olympic history and trends. The insights can help sports analysts, researchers, and enthusiasts better understand performance patterns in the Olympics.
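
The kind of aggregation behind findings such as medal totals per country or the age range of medalists can be sketched as follows. The records and the field names `team`, `age`, and `medal` are hypothetical and are not taken from the dataset's schema; the functions only illustrate the counting step.

```python
from collections import Counter

# Hypothetical records; field names and values are assumptions for illustration.
RECORDS = [
    {"team": "USA", "age": 24, "medal": "Gold"},
    {"team": "USA", "age": 31, "medal": ""},
    {"team": "China", "age": 22, "medal": "Silver"},
    {"team": "Russia", "age": 27, "medal": "Bronze"},
    {"team": "USA", "age": 29, "medal": "Bronze"},
]


def medals_by_team(records):
    """Count medal-winning entries per team."""
    return Counter(r["team"] for r in records if r["medal"])


def share_aged(records, low, high):
    """Fraction of medalists whose age lies in [low, high]."""
    medalists = [r for r in records if r["medal"]]
    if not medalists:
        return 0.0
    in_range = sum(low <= r["age"] <= high for r in medalists)
    return in_range / len(medalists)


if __name__ == "__main__":
    print(medals_by_team(RECORDS).most_common())
    print(share_aged(RECORDS, 20, 30))
```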
Dating App Sentiment Analysis Dataset
opendatabay.com
Updated Jul 3, 2025
Cite
Datasimple (2025). Dating App Sentiment Analysis Dataset [Dataset]. https://www.opendatabay.com/data/consumer/77355978-301e-414e-8094-a205b7a505b6
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Reviews & Ratings
Description
This dataset provides a collection of user reviews and ratings for dating applications, primarily sourced from the Google Play Store for the Indian region between 2017 and 2022. It offers valuable insights into user sentiment, evolving trends, and common feedback regarding dating apps. The data is particularly useful for practising Natural Language Processing (NLP) tasks such as sentiment analysis, topic modelling, and identifying user concerns.

Columns

Index: A unique identifier for each review entry.

Name: The name of the user who left the review.

Username: The username of the reviewer.

Review: The textual content of the review left by the user.

Rating: The numerical rating given by the user to the app, indicating their satisfaction level.

#ThumbsUp: A measure of how useful the review was perceived to be by other users.

Date&Time: The specific date and time when the review was posted.

App: The name of the dating application being reviewed.

Label Count: A numerical label, the specific purpose of which is not detailed in the provided information, but it appears to relate to ranges of index or other numerical values within the dataset.

Distribution

The dataset is typically provided in a CSV file format. It contains a substantial number of records, estimated to be around 527,000 individual reviews. This makes it suitable for large-scale data analysis and machine learning projects. The dataset structure is tabular, with clearly defined columns for review content, metadata, and user feedback. Specific row/record counts are not exact but are indicated by the extensive range of index labels.

Usage

This dataset is ideally suited for a variety of analytical and machine learning applications:

* Analysing trends in dating app usage and perception over the years.
* Determining which dating applications receive more favourable responses and if this consistency has changed over time.
* Identifying common issues reported by users who give low ratings (below 3/5).
* Investigating the correlation between user enthusiasm and their app ratings.
* Performing sentiment analysis on review texts to gauge overall user sentiment.
* Developing Natural Language Processing (NLP) models for text classification, entity recognition, or summarisation.
* Examining the perceived usefulness of top-rated reviews.
* Understanding user behaviour and preferences across different dating apps.
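
As an illustration of the low-rating use case, the sketch below counts the most frequent terms in reviews rated below 3. It relies on the `Review`, `Rating`, and `App` columns listed above; the sample rows, the tiny stopword list, and the helper name `low_rating_terms` are assumptions made for the example.

```python
import csv
import io
import re
from collections import Counter

# Made-up sample rows using the documented Review, Rating and App columns.
SAMPLE = """Review,Rating,App
"Too many fake profiles, app keeps crashing",1,AppA
"Great app, met nice people",5,AppA
"Fake profiles everywhere and constant crashing",2,AppB
"""

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"and", "the", "too", "many", "app", "keeps"}


def low_rating_terms(rows, threshold=3):
    """Count words in reviews whose rating is below the threshold."""
    counts = Counter()
    for row in rows:
        if int(row["Rating"]) < threshold:
            words = re.findall(r"[a-z]+", row["Review"].lower())
            counts.update(w for w in words if w not in STOPWORDS)
    return counts


terms = low_rating_terms(csv.DictReader(io.StringIO(SAMPLE)))

if __name__ == "__main__":
    print(terms.most_common(3))
```

For the full file of roughly 527,000 reviews, a proper stopword list and tokenizer would be needed; the counting logic stays the same.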

Coverage

The dataset primarily covers user reviews from the Google Play Store, specifically for the Indian country region ('in'), despite being titled as "all regions" in some contexts. The data spans a time range from 2017 to 2022, offering a multi-year perspective on dating app trends and user feedback. There are no specific demographic details for the reviewers themselves beyond their reviews and ratings.

License

CC0

Who Can Use It

This dataset is suitable for:

* Data Scientists and Analysts: For conducting deep dives into user sentiment, trend analysis, and predictive modelling.
* NLP Practitioners and Researchers: As a practical dataset for training and evaluating natural language processing models, especially for text classification and sentiment analysis tasks.
* App Developers and Product Managers: To understand user feedback, identify areas for improvement in their own or competing dating applications, and inform product development strategies.
* Market Researchers: To gain insights into the consumer behaviour and preferences within the online dating market.
* Students and Beginners: It is tagged as 'Beginner' friendly, making it a good resource for those new to data analysis or NLP projects.

Dataset Name Suggestions

Google Play Dating App Reviews (India, 2017-2022)

Indian Dating App User Reviews

Mobile Dating App Reviews & Ratings

Dating App Sentiment Analysis Dataset

Google Play Dating App Feedback

Attributes

Original Data Source: Dating Apps Reviews 2017-2022 (all regions)
Analysis Bay Area Bike Share Udacity
kaggle.com
Updated Nov 10, 2017
Cite
Luiz Henrique Amorim (2017). Analysis Bay Area Bike Share Udacity [Dataset]. https://www.kaggle.com/luizoamorim/analysis-bay-area-bike-share-udacity/metadata
Dataset updated
Nov 10, 2017
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Luiz Henrique Amorim
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
San Francisco Bay Area
Description
This dataset is a Udacity data science course project. The project analyzes open data from Bay Area Bike Share. Ford GoBike is the Bay Area's new bike share system, with thousands of public bikes for use across San Francisco, East Bay and San Jose. Their bike share is designed with convenience in mind; it’s a fun and affordable way to get around town.

In this project I carried out the analyses the course asked for, and at the end I added two simple analyses of my own. The first looks at the influence of rainy days on trips, and shows that rainy days reduce the number of trips. The second shows the number of trips per weekday in San Francisco, the number of trips per subscriber type, and finally the number of trips for each subscriber type per weekday. Trips are lower on weekends and higher on weekdays. The data also shows that on weekdays most trips are made by annual subscribers, while on weekends most are made by customers. I also wrote some functions that may help others, and the project contains many examples of Python and pandas usage.
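
The weekday versus weekend comparison by subscriber type can be sketched in a few lines. The trips below are hypothetical, and the field names `start_date` and `subscriber_type` are assumptions rather than the project's actual column names or functions.

```python
from collections import Counter
from datetime import date

# Hypothetical trips; field names and values are assumptions for illustration.
TRIPS = [
    {"start_date": date(2017, 6, 5), "subscriber_type": "Subscriber"},  # Monday
    {"start_date": date(2017, 6, 5), "subscriber_type": "Subscriber"},  # Monday
    {"start_date": date(2017, 6, 6), "subscriber_type": "Customer"},    # Tuesday
    {"start_date": date(2017, 6, 10), "subscriber_type": "Customer"},   # Saturday
]


def trips_by_day_kind(trips):
    """Count trips per (weekday or weekend, subscriber type) pair."""
    counts = Counter()
    for trip in trips:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        kind = "weekend" if trip["start_date"].weekday() >= 5 else "weekday"
        counts[(kind, trip["subscriber_type"])] += 1
    return counts


if __name__ == "__main__":
    for key, n in sorted(trips_by_day_kind(TRIPS).items()):
        print(key, n)
```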

This was my first project in data science. I continue to study and learn and hope to keep improving.

Thanks.
San Francisco Road Safety Analysis
hub.arcgis.com
Updated Feb 18, 2021
Cite
University of California San Diego (2021). San Francisco Road Safety Analysis [Dataset]. https://hub.arcgis.com/documents/UCSDOnline::san-francisco-road-safety-analysis
Dataset updated
Feb 18, 2021
Dataset authored and provided by
University of California San Diego
Description
Our main question is to find out what San Francisco's road safety problems are and what the city is doing to fix them. Our first approach is to see if there is any correlation between specific populations by census tract and the collision rates. If the approach fails, the alternative is to look at how the collision rates are correlated with the public safety projects. By looking at how the projects have impacted road safety, we can assess whether the city is on the right track with the projects, or if the projects are a waste of time and money. Our original proposal was to analyze traffic in San Francisco. That was when we assumed we were able to use the data from Uber Movement. Due to certain constraints that will be mentioned in the Data Sources section, we were unable to perform such analysis. Hence, we switched to analyzing road safety instead.

Notable modules used:

- Python: pandas, geopandas, shapely, matplotlib, scipy
- ArcGIS: aggregate_points
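
The correlation check described above comes down to a correlation coefficient between two per-tract series. A minimal sketch of the Pearson coefficient in plain Python follows. It is not the project's code (which lists scipy among its modules), and the function name `pearson` and any input values are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy numbers only: a population share and a collision rate per tract.
    print(pearson([0.10, 0.25, 0.40], [1.2, 2.0, 3.1]))
```

A value near 1 or -1 indicates a strong linear relationship between the two series, while a value near 0 indicates none; it says nothing about causation.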
Enhanced Pizza Sales Data (2024–2025)
kaggle.com
Updated May 12, 2025
Cite
akshay gaikwad (2025). Enhanced Pizza Sales Data (2024–2025) [Dataset]. https://www.kaggle.com/datasets/akshaygaikwad448/pizza-delivery-data-with-enhanced-features
Dataset updated
May 12, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
akshay gaikwad
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This is a realistic and structured pizza sales dataset covering the time span from **2024 to 2025**. Whether you're a beginner in data science, a student working on a machine learning project, or an experienced analyst looking to test out time series forecasting and dashboard building, this dataset is for you.

📁 **What’s Inside?** The dataset contains rich details from a pizza business including:

✅ Order Dates & Times
✅ Pizza Names & Categories (Veg, Non-Veg, Classic, Gourmet, etc.)
✅ Sizes (Small, Medium, Large, XL)
✅ Prices
✅ Order Quantities
✅ Customer Preferences & Trends

It is neatly organized in Excel format and easy to use with tools like Python (Pandas), Power BI, Excel, or Tableau.

💡 **Why Use This Dataset?** This dataset is ideal for:

📈 Sales Analysis & Reporting
🧠 Machine Learning Models (demand forecasting, recommendations)
📅 Time Series Forecasting
📊 Data Visualization Projects
🍽️ Customer Behavior Analysis
🛒 Market Basket Analysis
📦 Inventory Management Simulations

🧠 **Perfect For:**

Data Science Beginners & Learners
BI Developers & Dashboard Designers
MBA Students (Marketing, Retail, Operations)
Hackathons & Case Study Competitions

pizza, sales data, excel dataset, retail analysis, data visualization, business intelligence, forecasting, time series, customer insights, machine learning, pandas, beginner friendly