71 datasets found

One Hundred Cities
kaggle.com
Updated Apr 29, 2021
Cite
Chiticariu Cristian (2021). One Hundred Cities [Dataset]. https://www.kaggle.com/datasets/chiticariucristian/one-hundred-cities/discussion
Dataset updated
Apr 29, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Chiticariu Cristian
Description
Context

100 cities

Content

The dataset consists of one hundred cities around the world, with a short description of each city and its population.

Acknowledgements

The data was extracted from https://www.bestcities.org/rankings/worlds-best-cities/
cifar100
tensorflow.org
universe.roboflow.com
+3more
Updated Jun 1, 2024
+ more versions
Cite
(2024). cifar100 [Dataset]. https://www.tensorflow.org/datasets/catalog/cifar100
Dataset updated
Jun 1, 2024
Description
This dataset is just like CIFAR-10, except that it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs).

To use this dataset:

    import tensorflow_datasets as tfds

    ds = tfds.load('cifar100', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar100-3.0.2.png
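
The counts given in the cifar100 description can be cross-checked with a few lines of arithmetic. This is only a sketch and not part of the catalog entry; the constant names are illustrative, and the number of classes per superclass assumes that the 100 classes are spread evenly over the 20 superclasses.

```python
# Split sizes implied by the CIFAR-100 description.
NUM_CLASSES = 100
TRAIN_PER_CLASS = 500
TEST_PER_CLASS = 100
NUM_SUPERCLASSES = 20

train_size = NUM_CLASSES * TRAIN_PER_CLASS
test_size = NUM_CLASSES * TEST_PER_CLASS
images_per_class = TRAIN_PER_CLASS + TEST_PER_CLASS

# Assumption: classes are distributed evenly over the superclasses.
classes_per_superclass = NUM_CLASSES // NUM_SUPERCLASSES

print(train_size, test_size, images_per_class, classes_per_superclass)
```

The per-class total of 600 images matches the figure stated in the description.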
SOS-Training-Data-Visualization
huggingface.co
Updated May 23, 2025
Cite
Weikai Huang (2025). SOS-Training-Data-Visualization [Dataset]. https://huggingface.co/datasets/weikaih/SOS-Training-Data-Visualization
Dataset updated
May 23, 2025
Authors
Weikai Huang
Description
weikaih/SOS-Training-Data-Visualization dataset hosted on Hugging Face and contributed by the HF Datasets community
Demonstrations of witness visualization using the Witness Visualizer Tool
data.niaid.nih.gov
Updated Aug 16, 2024
Cite
Mordan, Vitalii (2024). Demonstrations of witness visualization using the Witness Visualizer Tool [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10817852
Dataset updated
Aug 16, 2024
Dataset authored and provided by
Mordan, Vitalii
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We have three datasets that display visualized SVCOMP witnesses generated with the help of the Witness Visualizer tool. Each dataset comprises two directories: witnesses, which contains the original witnesses provided by SVCOMP tools, and visualization, which contains our visual representations of the respective witnesses in HTML format. The visualization file name contains the prefix error_trace-, for example, error_trace-witness.2ls.html corresponds to a witness named witness.2ls.graphml.
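
The naming convention described above can be sketched as a small helper. The function name `visualization_name` is hypothetical and not part of the Witness Visualizer tool; it only encodes the `error_trace-` prefix and the `.graphml` to `.html` substitution stated in the description.

```python
def visualization_name(witness_file: str) -> str:
    """Map a witness file name to the name of its HTML visualization."""
    suffix = ".graphml"
    if not witness_file.endswith(suffix):
        raise ValueError(f"expected a {suffix} file, got {witness_file!r}")
    stem = witness_file[: -len(suffix)]
    return f"error_trace-{stem}.html"


print(visualization_name("witness.2ls.graphml"))  # error_trace-witness.2ls.html
```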

Overall thoroughness for all SVCOMP tools (dataset_1.zip)

This dataset includes a single random witness for each SVCOMP tool, accompanied by its corresponding visualization. The visualizations showcase the various witness elements such as function calls, conditions, assumptions, thread specifics, and other operations. Cells marked with +/- indicate that some elements were present in the error trace, but not all of them. All witnesses are presented in the table below:

Witness SV-COMP Tool Function calls Threads Assumptions Conditions Link to sources

witness.2ls.graphml

2LS

-

+

+

witness.aprove.graphml

AProVE (2022)

-

-

+ +

witness.brick.graphml

BRICK

-

+

+

witness.bubaak.graphml

Bubaak

-

+

+

witness.cbmc.graphml

CBMC

+

+

+

witness.cpa-bam-bnb.graphml CPA-BAM-BnB

+

+ + +

witness.cpa-bam-smg.graphml CPA-BAM-SMG

+

+ + +

witness.cpalockator.graphml CPALockator + + + + +

witness.cpachecker.graphml CPAChecker + + + + +

witness.crux.graphml

Crux

-

+

+

witness.cseq.graphml Cseq + +

+

+

witness.dartagnan.graphml

Dartagnan

+

-

+

witness.deagle.graphml

Deagle

+

+

+

-

DIVINE (until 2022) empty

-

EBF empty

witness.esbmc-incr.graphml

ESBMC-incr

+

+

+

witness.esbmc-kind.graphml

ESBMC-kind

+

+

+

-

Frama-C-SV empty

witness.gazer-theta.graphml Gazer-Theta

+

+

wrong path

witness.gdart.graphml

Gdart-LLVM

-

+

+

-

Goblint empty

witness.graves_cpa.graphml Graves-CPA + + + + +

witness.graves_par.graphml Graves-Par + + + + +

-

Infer empty

witness.korn.graphml

Korn

-

+

+

witness.lart.graphml

LART (2022)

-

+

+

witness.lazy-cseq.graphml Lazy-CSeq + + + + +

witness.lfchecker.graphml

LF-checker

+

+

+

-

Locksmith empty

-

Mopsa empty

witness.pesco_cpa.graphml PeSCo-CPA + + + + +

witness.pichecker.graphml PIChecker

+

+ + +

witness.pinaka.graphml

Pinaka

-

+

+

witness.predator.graphml

PredatorHP

-

-

-

+

-

SESL (2022) empty

witness.smack.graphml

SMACK (until 2022)

-

+

+

witness.symbiotic.graphml

Symbiotic

+

+

+

witness.theta.graphml Theta different format

witness.uatomozer.graphml UAutomizer +/- + + + +

witness.ucutter.graphml UgemCutter +/- + + + +

witness.ukojak.graphml UKojak

+/-

+ + +

witness.utaipan.graphml UTaipan +/- + + + +

witness.veriabs.graphml

VeriAbs

-

+

wrong path

witness.veriabsl.graphml VeriAbsL

+

+ + wrong path

witness.verifuzz.graphml

VeriFuzz

-

+

+

witness.verioover.graphml

VeriOover

-

+

+

Thoroughness by property (dataset_2.zip)

This dataset comprises a selected witness for each SVCOMP property (ReachSafety, MemSafety, Termination, NoOverflow, ConcurrencySafety). The witnesses are presented in the following table:

Witness | SV-COMP Tool | Property | Mandatory elements | Description
witness.smg_memory.graphml | CPA-BAM-SMG | MemSafety | Assumptions / conditions, function calls | There is a double free operation. Including the function calls to append helps in comprehending the structure of the list, while the assumptions reveal which branch was chosen.
witness.graves_overflow.graphml | Graves-CPA | NoOverflow | Assumptions / conditions | The witness shows an explicit value (-2147483648, the minimal value for the int type), which has the potential to cause an overflow in the specific program.
witness.cpachecker_termination.graphml | CPAChecker | NoTermination | Assumptions / conditions | There is a condition leading to an infinite loop.
witness.cpachecker_unreach.graphml | CPAChecker | ReachSafety | Function calls | The error trace indicates a potential scenario where a mutex was unlocked without the corresponding mutex_unlock operation.
witness.cpachecker_conc.graphml | CPAChecker | ConcurrencySafety | Function calls, thread operations | The error trace illustrates the creation of threads and highlights the assignments made within each thread that ultimately resulted in the violation of the property.

Known bug (dataset_3.zip)

This dataset contains witnesses for a known bug from SVCOMP (linux-3.14--drivers--usb--misc--adutux.ko.cil.i) involving a data race on dev->udev, where concurrent writes occur without corresponding locks. Only two tools were able to solve the corresponding verification task: ESBMC-kind and CPALockator. The ESBMC error trace (witness.esbmc_2020.graphml) includes only thread specifics and assumptions, while the CPALockator witness (witness.lockator.graphml) comprises all witness elements and is presented in a human-readable format.

Comparison with the validation rate

This section presents a comparison between witness thoroughness and the actual validation rate for each property. We considered all tools that participated in the respective category and generated at least 10 error traces, then calculated the validation rate. This comparison demonstrates how effectively thoroughness can approximate the validation rate. The following tables provide details for each property, with the relevant elements used to calculate thoroughness highlighted:

MemSafety property:

SV-COMP Tool | Function calls | Threads | Assumptions | Conditions | Thoroughness | Error traces | Validation rate
Bubaak | 0 | 0 | 1 | 0 | 33.33 | 64 | 67.19
CBMC | 0 | 1 | 1 | 0 | 33.33 | 27 | 11.11
CPA-BAM-SMG | 1 | 0 | 1 | 1 | 100 | 46 | 78.26
CPAChecker | 1 | 1 | 1 | 1 | 100 | 37 | 67.57
ESBMC-kind | 0 | 1 | 1 | 0 | 33.33 | 25 | 20
Graves-CPA | 1 | 1 | 1 | 1 | 100 | 44 | 56.82
Graves-Par | 1 | 1 | 1 | 1 | 100 | 18 | 77.78
PeSCo-CPA | 1 | 1 | 1 | 1 | 100 | 37 | 67.57

NoOverflow property:

SV-COMP Tool | Function calls | Threads | Assumptions | Conditions | Thoroughness | Error traces | Validation rate
2LS | 0 | 0 | 1 | 0 | 100 | 2071 | 95.7
Bubaak | 0 | 0 | 1 | 0 | 100 | 2233 | 94.67
CBMC | 0 | 1 | 1 | 0 | 100 | 3296 | 62.14
CPAChecker | 1 | 1 | 1 | 1 | 100 | 196 | 100
Crux | 0 | 0 | 1 | 0 | 100 | 222 | 95.05
ESBMC-kind | 0 | 1 | 1 | 0 | 100 | 3296 | 66.69
Frama-C-SV | 0 | 0 | 0 | 0 | 0 | 676 | 0
Graves-Par | 1 | 1 | 1 | 1 | 100 | 750 | 2
Infer | 0 | 0 | 0 | 0 | 0 | 583 | 0
Pinaka | 0 | 0 | 1 | 0 | 100 | 2232 | 100
Symbiotic | 0 | 1 | 1 | 0 | 100 | 1418 | 100
UAutomizer | 0.5 | 1 | 1 | 1 | 100 | 2222 | 100
UKojak | 0.5 | 0 | 1 | 1 | 100 | 168 | 100
UTaipan | 0.5 | 1 | 1 | 1 | 100 | 0 | 100
VeriFuzz | 0 | 0 | 1 | 0 | 100 | 185 | 90.81

NoTermination property:

SV-COMP Tool | Function calls | Threads | Assumptions | Conditions | Thoroughness | Error traces | Validation rate
2LS | 0 | 0 | 1 | 0 | 50 | 663 | 69.08
Bubaak | 0 | 0 | 1 | 0 | 50 | 578 | 34.78
CPAChecker | 1 | 1 | 1 | 1 | 100 | 501 | 97.01
Symbiotic | 0 | 1 | 1 | 0 | 50 | 591 | 52.96
UAutomizer | 0.5 | 1 | 1 | 1 | 100 | 512 | 98.24
VeriFuzz | 0 | 0 | 1 | 0 | 50 | 492 | 71.34

ReachSafety property:

SV-COMP Tool | Function calls | Threads | Assumptions | Conditions | Thoroughness | Error traces | Validation rate
Bubaak | 0 | 0 | 1 | 0 | 0 | 24 | 54.17
CBMC | 0 | 1 | 1 | 0 | 0 | 392 | 1.28
CPA-BAM-BnB | 1 | 0 | 1 | 1 | 100 | 69 | 85.51
CPA-BAM-SMG | 1 | 0 | 1 | 1 | 100 | 67 | 85.07
CPAChecker | 1 | 1 | 1 | 1 | 100 | 45 | 88.89
Crux | 0 | 0 | 1 | 0 | 0 | 1572 | 0.13
ESBMC-kind | 0 | 1 | 1 | 0 | 0 | 64 | 21.88
Graves-CPA | 1 | 1 | 1 | 1 | 100 | 66 | 87.88
Graves-Par | 1 | 1 | 1 | 1 | 100 | 24 | 58.33
PeSCo-CPA | 1 | 1 | 1 | 1 | 100 | 63 | 85.71

ConcurrencySafety property:

SV-COMP Tool | Function calls | Threads | Assumptions | Conditions | Thoroughness | Error traces | Validation rate
CBMC | 0 | 1 | 1 | 0 | 50 | 277 | 87
CPA-Lockator | 1 | 1 | 1 | 1 | 100 | 83 | 26.51
CPAChecker | 1 | 1 | 1 | 1 | 100 | 257 | 100
Cseq | 1 | 1 | 1 | 0 | 100 | 277 | 94.58
Dartagnan | 0 | 1 | 0 | 0 | 50 | 281 | 92.17
Deagle | 0 | 1 | 1 | 0 | 50 | 280 | 96.07
DIVINE | 0 | 0 | 0 | 0 | 0 | 230 | 80.87
EBF | 0 | 0 | 0 | 0 | 0 | 282 | 89.01
ESBMC-incr | 0 | 1 | 1 | 0 | 50 | 68 | 79.41
ESBMC-kind | 0 | 1 | 1 | 0 | 50 | 263 | 89.73
Graves-CPA | 1 | 1 | 1 | 1 | 100 | 261 | 99.23
Graves-Par | 1 | 1 | 1 | 1 | 100 | 28 | 100
Infer | 0 | 0 | 0 | 0 | 0 | 634 | 0
Lazy-CSeq | 1 | 1 | 1 | 1 | 100 | 274 | 94.89
LF-checker | 0 | 1 | 1 | 0 | 50 | 286 | 85.31
PeSCo-CPA | 1 | 1 | 1 | 1 | 100 | 256 | 100
PIChecker | 1 | 0 | 1 | 1 | 50 | 269 | 98.14
Symbiotic | 0 | 1 | 1 | 0 | 50 | 110 | 92.73
UAutomizer | 0.5 | 1 | 1 | 1 | 75 | 297 | 94.95
UgemCutter | 0.5 | 1 | 1 | 1 | 75 | 283 | 96.47
UTaipan | 0.5 | 1 | 1 | 1 | 75 | 293 | 96.25
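
The thoroughness values in the tables above appear to be the mean of the relevant element scores, expressed as a percentage. The sketch below reproduces several table entries under that reading. The mapping `RELEVANT` is an assumption inferred from the numbers themselves, because the highlighting that marked the relevant elements did not survive in this listing; the function and key names are illustrative.

```python
# Element scores: 1 = present, 0 = absent, 0.5 = partly present ("+/-").
# Assumption: the elements that count for each property, inferred from the tables.
RELEVANT = {
    "MemSafety": ("function_calls", "assumptions", "conditions"),
    "NoOverflow": ("assumptions",),
    "NoTermination": ("assumptions", "conditions"),
    "ReachSafety": ("function_calls",),
    "ConcurrencySafety": ("function_calls", "threads"),
}


def thoroughness(scores, prop):
    """Mean score of the relevant elements, as a percentage."""
    names = RELEVANT[prop]
    return round(100 * sum(scores[name] for name in names) / len(names), 2)


bubaak = {"function_calls": 0, "threads": 0, "assumptions": 1, "conditions": 0}
uautomizer = {"function_calls": 0.5, "threads": 1, "assumptions": 1, "conditions": 1}

print(thoroughness(bubaak, "MemSafety"))              # 33.33
print(thoroughness(bubaak, "ReachSafety"))            # 0.0
print(thoroughness(uautomizer, "ConcurrencySafety"))  # 75.0
```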

Overall distance for all possible combinations of elements for thoroughness

This section presents the overall difference (i.e., the sum of differences between witness thoroughness and validation rates for each tool) when thoroughness is calculated based on all possible combinations of witness elements (assumptions, conditions, thread specifics, and function calls). The set of witnesses is the same as in the previous section. The following tables provide details for each property, with the minimum difference highlighted:

MemSafety property:

Combination | Overall difference
Function calls | 250.3
Thread specifics | 444.6
Assumptions | 353.7
Conditions | 250.3
Function calls, Thread specifics | 294.6
Assumptions, Function calls | 238.08
Conditions, Function calls | 250.3
Assumptions, Thread specifics | 344.6
Conditions, Thread specifics | 294.6
Assumptions, Conditions | 238.08
Assumptions, Function calls, Thread specifics | 277.94
Conditions, Function calls, Thread specifics | 244.59
Assumptions, Conditions, Function calls | 221.41
Assumptions, Conditions, Thread specifics | 277.94
Assumptions, Conditions, Function calls, Thread specifics | 244.6
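
The MemSafety figures can be reproduced from the MemSafety validation-rate table, assuming that "difference" means the absolute difference between thoroughness and validation rate, summed over tools. That reading is an assumption, although it matches the published values for the combinations checked here. The names `MEMSAFETY`, `COLUMNS`, and `overall_difference` are illustrative.

```python
# (tool, function calls, threads, assumptions, conditions, validation rate)
MEMSAFETY = [
    ("Bubaak", 0, 0, 1, 0, 67.19),
    ("CBMC", 0, 1, 1, 0, 11.11),
    ("CPA-BAM-SMG", 1, 0, 1, 1, 78.26),
    ("CPAChecker", 1, 1, 1, 1, 67.57),
    ("ESBMC-kind", 0, 1, 1, 0, 20.0),
    ("Graves-CPA", 1, 1, 1, 1, 56.82),
    ("Graves-Par", 1, 1, 1, 1, 77.78),
    ("PeSCo-CPA", 1, 1, 1, 1, 67.57),
]
COLUMNS = {"function_calls": 1, "threads": 2, "assumptions": 3, "conditions": 4}


def overall_difference(rows, elements):
    """Sum over tools of |thoroughness - validation rate| (assumed definition)."""
    total = 0.0
    for row in rows:
        score = 100 * sum(row[COLUMNS[e]] for e in elements) / len(elements)
        total += abs(score - row[5])
    return round(total, 2)


print(overall_difference(MEMSAFETY, ["assumptions", "conditions", "function_calls"]))  # 221.41
print(overall_difference(MEMSAFETY, ["function_calls"]))                               # 250.3
```

The three-element combination gives the smallest value in the MemSafety table, which is consistent with the thoroughness column reported for that property.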

NoOverflow property:

Combination | Overall difference
Function calls | 953.06
Thread specifics | 745.4
Assumptions | 192.94
Conditions | 803.06
Function calls, Thread specifics | 778.06
Assumptions, Function calls | 478.06
Conditions, Function calls | 878.06
Assumptions, Thread specifics | 445.4
Conditions, Thread specifics | 703.06
Assumptions, Conditions | 403.06
Assumptions, Function calls, Thread specifics | 528.8
Conditions, Function calls, Thread specifics | 786.41
Assumptions, Conditions, Function calls | 586.43
Assumptions, Conditions, Thread specifics | 478.79
Assumptions, Conditions, Function calls, Thread specifics | 590.56

NoTermination property:

Combination Overall
Top 100 popular movies from 2003 to 2022 (iMDB)
kaggle.com
Updated Mar 15, 2023
Cite
George Scutelnicu (2023). Top 100 popular movies from 2003 to 2022 (iMDB) [Dataset]. https://www.kaggle.com/datasets/georgescutelnicu/top-100-popular-movies-from-2003-to-2022-imdb/data
Dataset updated
Mar 15, 2023
Dataset provided by
Kaggle
Authors
George Scutelnicu
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The dataset contains the 100 most popular movies for each year from 2003 to 2022. The data is well suited to exploratory data analysis. All of the information was collected by web scraping and can be found on IMDb.

The dataset contains:

- Title
- Rating
- Year
- Month
- Certificate
- Runtime
- Director/s
- Stars
- Genre/s
- Filming Location
- Budget
- Income
- Country of Origin
Parameters of the studied CNN models.
plos.figshare.com
xls
Updated Mar 22, 2024
Cite
Jiasheng Yang; Guanfang Wang; Xu Xiao; Meihua Bao; Geng Tian (2024). Parameters of the studied CNN models. [Dataset]. http://doi.org/10.1371/journal.pone.0296175.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0296175.t002
Dataset updated
Mar 22, 2024
Dataset provided by
PLOS ONE
Authors
Jiasheng Yang; Guanfang Wang; Xu Xiao; Meihua Bao; Geng Tian
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The accuracy and interpretability of artificial intelligence (AI) are crucial for the advancement of optical coherence tomography (OCT) image detection, as AI can greatly reduce the manual labor required of clinicians. By prioritizing these aspects during development and application, we can make significant progress towards streamlining the clinical workflow. In this paper, we propose an explainable ensemble approach that uses transfer learning to detect fundus lesion diseases through OCT imaging. Our study used a publicly available OCT dataset consisting of normal subjects, patients with dry age-related macular degeneration (AMD), and patients with diabetic macular edema (DME), each with 15 samples. The impact of pre-trained weights on the performance of individual networks was compared first, and these networks were then combined into an ensemble using majority soft polling. Finally, the features learned by the networks were visualized using Grad-CAM and CAM. The use of pre-trained ImageNet weights improved the performance from 68.17% to 92.89%. The ensemble model consisting of the three CNN models with pre-trained parameters loaded performed best, correctly distinguishing between AMD patients, DME patients, and normal subjects 100% of the time. Visualization results showed that Grad-CAM could display the lesion area more accurately. The results demonstrate that the proposed approach can perform well in both accuracy and interpretability in retinal OCT image detection.
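
The abstract does not define "majority soft polling". One plausible reading is soft voting, in which the class probabilities predicted by each network are averaged and the class with the highest average is chosen. The sketch below illustrates that reading only; it is not the authors' implementation, and the probabilities and the name `soft_vote` are made up for illustration.

```python
CLASSES = ("normal", "AMD", "DME")


def soft_vote(probabilities):
    """Average per-class probabilities over models and pick the top class."""
    n_models = len(probabilities)
    averaged = [
        sum(model[i] for model in probabilities) / n_models
        for i in range(len(CLASSES))
    ]
    best = max(range(len(CLASSES)), key=averaged.__getitem__)
    return CLASSES[best], averaged


# Hypothetical outputs of three networks for a single OCT image.
model_outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.6, 0.1],
]
label, averaged = soft_vote(model_outputs)
print(label)
```

In this toy case the first network on its own would predict "normal", while the averaged probabilities favor "AMD", which shows how an ensemble can overrule a single model.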
Data Visualization in Social Work Research
search.dataone.org
dataverse.harvard.edu
+1more
Updated Nov 21, 2023
Cite
Rothwell, David; Esposito, Tonino; Wegner-Lohin (2023). Data Visualization in Social Work Research [Dataset]. http://doi.org/10.7910/DVN/I6IIXL
Unique identifier
https://doi.org/10.7910/DVN/I6IIXL
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Rothwell, David; Esposito, Tonino; Wegner-Lohin
Time period covered
Jan 1, 2009 - Jan 1, 2012
Description
Research dissemination and knowledge translation are imperative in social work. Methodological developments in data visualization techniques have improved the ability to convey meaning and reduce erroneous conclusions. The purpose of this project is to examine: (1) How are empirical results presented visually in social work research? (2) To what extent do top social work journals vary in the publication of data visualization techniques? (3) What is the predominant type of analysis presented in tables and graphs? (4) How can current data visualization methods be improved to increase understanding of social work research?

Method: A database was built from a systematic literature review of the four most recent issues of Social Work Research and 6 other highly ranked journals in social work based on the 2009 5-year impact factor (Thomson Reuters ISI Web of Knowledge). Overall, 294 articles were reviewed. Articles without any form of data visualization were not included in the final database. The number of articles reviewed by journal is: Child Abuse & Neglect (38), Child Maltreatment (30), American Journal of Community Psychology (31), Family Relations (36), Social Work (29), Children and Youth Services Review (112), and Social Work Research (18). Articles with any type of data visualization (table, graph, other) were included in the database and coded sequentially by two reviewers based on the type of visualization method and the type of analyses presented (descriptive, bivariate, measurement, estimate, predicted value, other). Additional review by the entire research team was required for 68 articles. Codes were discussed until 100% agreement was reached. The final database includes 824 data visualization entries.
cifar100_n
tensorflow.org
Updated Aug 11, 2023
+ more versions
Cite
(2023). cifar100_n [Dataset]. https://www.tensorflow.org/datasets/catalog/cifar100_n
Dataset updated
Aug 11, 2023
Description
A re-labeled version of CIFAR-100 with real human annotation errors. For every pair (image, label) in the original CIFAR-100 train set, it provides an additional label given by a real human annotator.

To use this dataset:

    import tensorflow_datasets as tfds

    ds = tfds.load('cifar100_n', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar100_n-1.0.1.png
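
Because cifar100_n pairs each original label with a human-provided one, a natural first statistic is the fraction of examples on which the two disagree. The sketch below computes that rate on toy label lists; the values are made up, and the function name `label_noise_rate` is illustrative and not part of tensorflow_datasets.

```python
def label_noise_rate(original_labels, human_labels):
    """Fraction of examples whose human label differs from the original label."""
    if len(original_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not original_labels:
        raise ValueError("label lists must not be empty")
    disagreements = sum(o != h for o, h in zip(original_labels, human_labels))
    return disagreements / len(original_labels)


# Toy data: five images, two of which were annotated differently by a human.
original = [3, 7, 7, 42, 99]
human = [3, 7, 8, 42, 98]
print(label_noise_rate(original, human))  # 0.4
```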
Founders in e-commerce and fintech dominated the RoW100 2022 list - Chart
restofworld.org
Updated May 11, 2022
Cite
Rest of World (2022). Founders in e-commerce and fintech dominated the RoW100 2022 list - Chart [Dataset]. https://restofworld.org/charts/2022/Zb1SZ-founders-ecommerce-fintech-dominated-row100-2022
Dataset updated
May 11, 2022
Dataset authored and provided by
Rest of World
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A data visualization representing Founders in e-commerce and fintech dominated the RoW100 2022 list
Performance of three CNNs models and ensemble model with pretraining to each...
plos.figshare.com
xls
Updated Mar 22, 2024
+ more versions
Cite
Jiasheng Yang; Guanfang Wang; Xu Xiao; Meihua Bao; Geng Tian (2024). Performance of three CNNs models and ensemble model with pretraining to each class (Mean±SD). [Dataset]. http://doi.org/10.1371/journal.pone.0296175.t006
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0296175.t006
Dataset updated
Mar 22, 2024
Dataset provided by
PLOS ONE
Authors
Jiasheng Yang; Guanfang Wang; Xu Xiao; Meihua Bao; Geng Tian
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Performance of three CNNs models and ensemble model with pretraining to each class (Mean±SD).
Percentage of population with mobile internet subscriptions - Chart
restofworld.org
Updated Oct 31, 2024
Cite
Rest of World (2024). Percentage of population with mobile internet subscriptions - Chart [Dataset]. https://restofworld.org/charts/2024/scTzv-percentage-population-mobile-internet-subscriptions
Dataset updated
Oct 31, 2024
Dataset authored and provided by
Rest of World
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Nearly 100% of Singapore’s population has a mobile internet subscription, while in Bangladesh, Nigeria, and Pakistan it is below 50%.
Materials for 2d representation of the HathiTrust Library
zenodo.org
application/gzip, bin
Updated Jan 24, 2020
Cite
Benjamin M Schmidt; Benjamin M Schmidt (2020). Materials for 2d representation of the HathiTrust Library [Dataset]. http://doi.org/10.5281/zenodo.1477018
Available download formats: application/gzip, bin
Unique identifier
https://doi.org/10.5281/zenodo.1477018
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Benjamin M Schmidt; Benjamin M Schmidt
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Materials to create the LargeVis visualization online at http://creatingdata.us/datasets/hathi-features/, and described in Benjamin Schmidt, "Stable random projection: lightweight, general-purpose dimensionality reduction for digitized libraries," Journal of Cultural Analytics. October 3, 2018.

Two items. First, `hathi_pca.bin`: a binary file with 100-dimensional representations of the complete Hathi Trust Extended Features set. These began as 1280-dimensional SRP features and were reduced to 100 dimensions using a PCA transformation matrix derived from a random sample of the full 13 million book set. Vectors were reduced to unit length before PCA, but not afterwards; this means that, in general, their length gives some sense of how much information was lost in the PCA representation. The file can be read using the code at https://github.com/bmschmidt/pySRP, or anything that reads word2vec-formatted vectors. Includes HathiTrust identifiers.

Second, `hathi.tsv.gz`: a row-oriented set containing a variety of metadata fields for each set, including (as 'x' and 'y') the coordinates of a 2-d LargeVis visualization. This is the immediate input to the visualization at http://creatingdata.us/datasets/hathi-features/. Columns should be relatively straightforward; they are derived from the HathiTrust MARC records, which can be accessed through Hathi's public API. Classification codes ('lc1') use the Library of Congress classification; they represent the subclass (generally two characters, though it can be one or three). The first character alone represents the LC class and can be useful for coloring high-level overviews.

These two files can be merged through the Hathi Trust identifier present in both.
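
A minimal sketch of that merge, using small in-memory stand-ins rather than the real files. The identifier key `htid`, the sample identifiers, and all values are hypothetical; only the 'x', 'y', and 'lc1' column names come from the description. The sketch also derives the LC class from the first character of 'lc1', as the description suggests.

```python
# Toy stand-ins for hathi_pca.bin (vectors by identifier) and hathi.tsv.gz (metadata rows).
vectors = {"vol.001": [0.12, -0.40], "vol.002": [0.05, 0.33]}
metadata = [
    {"htid": "vol.001", "x": 1.5, "y": -2.0, "lc1": "QA"},
    {"htid": "vol.003", "x": 0.1, "y": 0.7, "lc1": "P"},
]


def merge_on_identifier(vectors, metadata, key="htid"):
    """Inner join: keep metadata rows whose identifier also has a vector."""
    merged = []
    for row in metadata:
        vector = vectors.get(row[key])
        if vector is not None:
            # The first character of the subclass code is the LC class.
            merged.append({**row, "vector": vector, "lc_class": row["lc1"][0]})
    return merged


merged = merge_on_identifier(vectors, metadata)
print(merged)
```

Only `vol.001` appears in both inputs, so the result holds a single row, with LC class "Q" derived from the subclass "QA".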
coil100
tensorflow.org
Updated Nov 23, 2022
Cite
(2022). coil100 [Dataset]. https://www.tensorflow.org/datasets/catalog/coil100
Dataset updated
Nov 23, 2022
Description
The dataset contains 7200 color images of 100 objects (72 images per object). The objects have a wide variety of complex geometric and reflectance characteristics. The objects were placed on a motorized turntable against a black background. The turntable was rotated through 360 degrees to vary object pose with respect to a fixed color camera. Images of the objects were taken at pose intervals of 5 degrees. This corresponds to 72 poses per object.

To use this dataset:

    import tensorflow_datasets as tfds

    ds = tfds.load('coil100', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/coil100-2.0.0.png
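
The image counts in the coil100 description follow from the rotation step. The sketch below checks the arithmetic; the constant names are illustrative, and the assumption that the first pose is at 0 degrees is ours, not stated in the description.

```python
FULL_ROTATION_DEG = 360
POSE_INTERVAL_DEG = 5
NUM_OBJECTS = 100

poses_per_object = FULL_ROTATION_DEG // POSE_INTERVAL_DEG
total_images = poses_per_object * NUM_OBJECTS

# Assumption: poses start at 0 degrees and advance in 5-degree steps.
pose_angles = list(range(0, FULL_ROTATION_DEG, POSE_INTERVAL_DEG))

print(poses_per_object, total_images, pose_angles[-1])
```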
Student Grades
kaggle.com
Updated Mar 6, 2025
Cite
clemence travers (2025). Student Grades [Dataset]. https://www.kaggle.com/datasets/clemencetravers/student-grades/discussion
Dataset updated
Mar 6, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
clemence travers
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This Dataset comes from Mahmoud Elhemaly.

I've modified this dataset because the original had no correlation between the variables. I've used it for data visualization in Tableau. Many columns contain inaccurate data.

Description

Student Performance & Behavior Dataset

This dataset contains 5,000 real records collected from a private learning provider. It includes key attributes necessary for exploring patterns, correlations, and insights related to academic performance.

Columns:

Student_ID: Unique identifier for each student.

First_Name: Student’s first name.

Last_Name: Student’s last name.

Email: Contact email (can be anonymized).

Gender: Male, Female, Other.

Age: The age of the student.

Department: Student's department (e.g., CS, Engineering, Business).

Attendance (%): Attendance percentage (0-100%).

Participation_Score: Score based on class participation (0-10).

Projects_Score: Project evaluation score (out of 100).

Total_Score: Weighted sum of all grades.

Grade: Letter grade (A, B, C, D, F).

Study_Hours_per_Week: Average study hours per week.

Extracurricular_Activities: Whether the student participates in extracurriculars (Yes/No).

Internet_Access_at_Home: Does the student have access to the internet at home? (Yes/No).

Parent_Education_Level: Highest education level of parents (None, High School, Bachelor's, Master's, PhD).

Family_Income_Level: Low, Medium, High.

Stress_Level (1-10): Self-reported stress level (1: Low, 10: High).

Sleep_Hours_per_Night: Average hours of sleep per night.

Sleep_Hours_per_Night_Entier: Hours of sleep per night, as an integer only.

Country: Country of origin

Dataset contains:

Missing values (nulls): in some records (e.g., Attendance, Assignments, or Parent Education Level).

Bias in some data (e.g., in grading: students with high attendance get slightly better grades).

Imbalanced distributions (e.g., some departments having more students).
Teasing Out the True Milky Way
dataverse.harvard.edu
Updated Jan 23, 2020
Cite
Alyssa Goodman (2020). Teasing Out the True Milky Way [Dataset]. http://doi.org/10.7910/DVN/UPJJBV
Unique identifier
https://doi.org/10.7910/DVN/UPJJBV
Dataset updated
Jan 23, 2020
Dataset provided by
Harvard Dataverse
Authors
Alyssa Goodman
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Presentation Date: Tuesday, September 10, 2019. Location: Institute for Advanced Study, Princeton, NJ. Abstract: It has been nearly 100 years since the "Great Debate," where Heber Curtis argued that Thomas Wright's 1750 ideas about our Milky Way being one of many "galaxies," each a flattish disk of a multitude of stars, were correct. Since then, astronomers have made sharper and sharper images of galaxies beyond our own, often revealing intricate spiral structure. But, for the most part, our potentially super-close-up view of our own Galaxy's structure has been ruined by our unfortunate vantage point within its disk. Work over the past century indicates that the Milky Way is a barred spiral, but even the Galaxy's number of arms is still at issue. In this talk, I will discuss how four techniques are being combined to tease out the true structure of the Milky Way. In particular, our collaboration* is combining 3D-dust mapping, searches for extraordinarily long galactic filaments called "Bones," position-position-velocity observations of gas, and numerical simulations to create a new, and sometimes very surprising, view of our Galaxy. Unexpected results to be presented include: several-hundred-pc long, ~1-pc wide, gaseous "Bones" lying in, and likely defining, the gravitational mid-plane of the Milky Way; a 2.5 kpc-long damped sine wave with 200-pc amplitude that seems to be the Local Arm (and the undoing of "Gould's Belt"); and simulations that suggest the need for feedback and/or magnetic fields, and/or stranger physics (dark matter in the disk?) in order to explain the Bones and/or the Local Arm's Wave.
Unveiling Insights from 100K Bike Sales
kaggle.com
Updated Oct 1, 2024
Cite
Hari Goshika (2024). Unveiling Insights from 100K Bike Sales [Dataset]. https://www.kaggle.com/datasets/harigoshika/unveiling-insights-from-100k-bike-sales/discussion
Dataset updated
Oct 1, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Hari Goshika
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
I'm excited to share my latest project—an interactive Power BI dashboard that provides a comprehensive analysis of bike sales data from 2019 to 2024!

Key Highlights of the Dashboard:

📈 Sales Trend Analysis: Understand how bike sales have fluctuated over the years, with peaks in specific months that give us clues about seasonal demand.

🏢 Sales by Store Location: See how different cities like New York and Phoenix lead in terms of total sales revenue.

🚴‍♀️ Customer Demographics: Almost equal contributions from male and female customers, showing the broad appeal of our products.

💳 Payment Method Preferences: Breakdown of the most used payment methods, with insights that can help improve our customer experience.

📊 Revenue by Bike Model: A detailed look at which bike models drive the most revenue, helping guide product focus and inventory management.

This dashboard was built to provide actionable insights into the sales performance and customer behavior of a large dataset of 100K records. It highlights the power of data visualization in turning numbers into strategic insights!

Why Power BI? Power BI's flexibility and interactive capabilities made it the ideal tool for visualizing the data, allowing users to drill down into specific details using slicers for bike models and time periods. 💡

Would love to hear your thoughts or any feedback on this project! If you’re interested in how this dashboard was built or want to discuss data visualization, feel free to reach out. Let’s transform data into stories that drive success! 🌟
Data from: California State Waters Map Series--Santa Barbara Channel Web...
data.amerigeoss.org
data.usgs.gov
+3more
xml
Updated Aug 23, 2022
+ more versions
Cite
United States (2022). California State Waters Map Series--Santa Barbara Channel Web Services [Dataset]. https://data.amerigeoss.org/dataset/california-state-waters-map-series-santa-barbara-channel-web-services-b23aa
Explore at:
Available download formats: xml
Dataset updated
Aug 23, 2022
Dataset provided by
United States
Area covered
Santa Barbara Channel
Description
In 2007, the California Ocean Protection Council initiated the California Seafloor Mapping Program (CSMP), designed to create a comprehensive seafloor map of high-resolution bathymetry, marine benthic habitats, and geology within California’s State Waters. The program supports a large number of coastal-zone- and ocean-management issues, including the California Marine Life Protection Act (MLPA) (California Department of Fish and Wildlife, 2008), which requires information about the distribution of ecosystems as part of the design and proposal process for the establishment of Marine Protected Areas. A focus of CSMP is to map California’s State Waters with consistent methods at a consistent scale. The CSMP approach is to create highly detailed seafloor maps through collection, integration, interpretation, and visualization of swath sonar data (the undersea equivalent of satellite remote-sensing data in terrestrial mapping), acoustic backscatter, seafloor video, seafloor photography, high-resolution seismic-reflection profiles, and bottom-sediment sampling data. The map products display seafloor morphology and character, identify potential marine benthic habitats, and illustrate both the surficial seafloor geology and shallow (to about 100 m) subsurface geology. It is emphasized that the more interpretive habitat and geology data rely on the integration of multiple, new high-resolution datasets and that mapping at small scales would not be possible without such data. This approach and CSMP planning is based in part on recommendations of the Marine Mapping Planning Workshop (Kvitek and others, 2006), attended by coastal and marine managers and scientists from around the state. That workshop established geographic priorities for a coastal mapping project and identified the need for coverage of “lands” from the shore strand line (defined as Mean Higher High Water; MHHW) out to the 3-nautical-mile (5.6-km) limit of California’s State Waters. 
Unfortunately, surveying the zone from MHHW out to 10-m water depth is not consistently possible using ship-based surveying methods, owing to sea state (for example, waves, wind, or currents), kelp coverage, and shallow rock outcrops. Accordingly, some of the data presented in this series commonly do not cover the zone from the shore out to 10-m depth. This data is part of a series of online U.S. Geological Survey (USGS) publications, each of which includes several map sheets, some explanatory text, and a descriptive pamphlet. Each map sheet is published as a PDF file. Geographic information system (GIS) files that contain both ESRI ArcGIS raster grids (for example, bathymetry, seafloor character) and geotiffs (for example, shaded relief) are also included for each publication. For those who do not own the full suite of ESRI GIS and mapping software, the data can be read using ESRI ArcReader, a free viewer that is available at http://www.esri.com/software/arcgis/arcreader/index.html (last accessed September 20, 2013). The California Seafloor Mapping Program is a collaborative venture between numerous different federal and state agencies, academia, and the private sector. CSMP partners include the California Coastal Conservancy, the California Ocean Protection Council, the California Department of Fish and Wildlife, the California Geological Survey, California State University at Monterey Bay’s Seafloor Mapping Lab, Moss Landing Marine Laboratories Center for Habitat Studies, Fugro Pelagos, Pacific Gas and Electric Company, National Oceanic and Atmospheric Administration (NOAA, including National Ocean Service–Office of Coast Surveys, National Marine Sanctuaries, and National Marine Fisheries Service), U.S. Army Corps of Engineers, the Bureau of Ocean Energy Management, the National Park Service, and the U.S. Geological Survey. 
These web services for the Santa Barbara Channel map area include data layers that are associated with GIS and map sheets available from the USGS CSMP web page at https://walrus.wr.usgs.gov/mapping/csmp/index.html. Each published CSMP map area includes a data catalog of geographic information system (GIS) files; map sheets that contain explanatory text; and an associated descriptive pamphlet. This web service represents the available data layers for this map area. Data were combined from different sonar surveys to generate a comprehensive high-resolution bathymetry and acoustic-backscatter coverage of the map area. These data reveal a range of physiographic features, including exposed bedrock outcrops, large fields of sand waves, and many human impacts on the seafloor. To validate geological and biological interpretations of the sonar data, the U.S. Geological Survey towed a camera sled over specific offshore locations, collecting both video and photographic imagery; these “ground-truth” surveying data are available from the CSMP Video and Photograph Portal at https://doi.org/10.5066/F7J1015K. The “seafloor character” data layer shows classifications of the seafloor on the basis of depth, slope, rugosity (ruggedness), and backscatter intensity, and is further informed by the ground-truth-survey imagery. The “potential habitats” polygons are delineated on the basis of substrate type, geomorphology, seafloor process, or other attributes that may provide a habitat for a specific species or assemblage of organisms. Representative seismic-reflection profile data from the map area are also included and provide information on the subsurface stratigraphy and structure of the map area. The distribution and thickness of young sediment (deposited over roughly the past 21,000 years, during the most recent sea-level rise) are interpreted on the basis of the seismic-reflection data.
The geologic polygons merge onshore geologic mapping (compiled from existing maps by the California Geological Survey) and new offshore geologic mapping that is based on integration of high-resolution bathymetry and backscatter imagery, seafloor-sediment and rock samples, digital camera and video imagery, and high-resolution seismic-reflection profiles. The information provided by the map sheets, pamphlet, and data catalog has a broad range of applications. High-resolution bathymetry, acoustic backscatter, ground-truth-surveying imagery, and habitat mapping all contribute to habitat characterization and ecosystem-based management by providing essential data for delineation of marine protected areas and ecosystem restoration. Many of the maps provide high-resolution baselines that will be critical for monitoring environmental change associated with climate change, coastal development, or other forcings. High-resolution bathymetry is a critical component for modeling coastal flooding caused by storms and tsunamis, as well as inundation associated with longer term sea-level rise. Seismic-reflection and bathymetric data help characterize earthquake and tsunami sources, critical for natural-hazard assessments of coastal zones. Information on sediment distribution and thickness is essential to the understanding of local and regional sediment transport, as well as the development of regional sediment-management plans. In addition, siting of any new offshore infrastructure (for example, pipelines, cables, or renewable-energy facilities) will depend on high-resolution mapping. Finally, this mapping will both stimulate and enable new scientific research and also raise public awareness of, and education about, coastal environments and issues. Web services were created using an ArcGIS service definition file. The ArcGIS REST service and OGC WMS service include all Santa Barbara Channel map area data layers. Data layers are symbolized as shown on the associated map sheets.
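Since the description mentions an OGC WMS service, the following is a minimal sketch of how a client might build a standard WMS GetCapabilities request to list the available data layers. The base URL is a hypothetical placeholder, because the listing does not state the actual service endpoint; only the query parameters (SERVICE, REQUEST, VERSION) come from the WMS standard.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint for illustration only: the listing above does not
# give the real service URL for the Santa Barbara Channel map area.
BASE_URL = "https://example.org/services/SantaBarbaraChannel/WMSServer"


def wms_capabilities_url(base_url, version="1.3.0"):
    """Return a WMS GetCapabilities request URL for the given endpoint."""
    query = urlencode({
        "SERVICE": "WMS",               # service type defined by the WMS standard
        "REQUEST": "GetCapabilities",   # asks the server to describe its layers
        "VERSION": version,             # protocol version the client prefers
    })
    return f"{base_url}?{query}"


url = wms_capabilities_url(BASE_URL)
params = parse_qs(urlsplit(url).query)  # parsed back for a quick sanity check
print(url)
```

The server's response to such a request is an XML document; a GIS client would read the layer names from it before requesting map images.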
Apps with Chinese parents that are still popular in India - Chart
restofworld.org
Updated Oct 4, 2022
Cite
Rest of World (2022). Apps with Chinese parents that are still popular in India - Chart [Dataset]. https://restofworld.org/charts/2022/tF7jU-apps-chinese-parents-popular-india
Explore at:
Dataset updated
Oct 4, 2022
Dataset authored and provided by
Rest of World
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
India
Description
A Rest of World audit of the Google Play store indicated that at least eight of the 100 most-downloaded free apps in India may be owned by large Chinese parent companies.
Visualizations of rotational curves within a Standardized Gait Cycle
data.mendeley.com
Updated May 4, 2022
Cite
Jürgen Konradi (2022). Visualizations of rotational curves within a Standardized Gait Cycle [Dataset]. http://doi.org/10.17632/m7tbn7vhpf.1
Explore at:
Unique identifier
https://doi.org/10.17632/m7tbn7vhpf.1
Dataset updated
May 4, 2022
Authors
Jürgen Konradi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains graphs and a movie. Both show visualizations of rotational curves in the transversal plane within a Standardized Gait Cycle from Vertebra prominens downwards, ending at the pelvis. They display 201 anonymous healthy people aged 18-70 years walking at 2, 3, 4, and 5 km/h on a treadmill. They are based on an SPSS (v23) syntax file and a related graph template that can also be found among our datasets. Files are numbered consecutively across all speeds and can be linked by number to their non-standardized counterparts in a further dataset. Positive values show vertebral body rotation to the left, negative values show rotation to the right. Percent of the Standardized Gait Cycle (0-100%) is displayed on the abscissa, always starting with Initial Contact of the right foot. Within a Standardized Gait Cycle the duration of the right stance phase is expected to be 60% (Perry, 1992). As can be seen in the graphs, interpolating spline functions work for average walking speed measurements, leading to a more precise determination of relevant and characteristic points (e.g. maxima, phase shifts, lumbar and thoracic movement behavior), thereby aiding in the clarification of individual features.
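To illustrate what placing a curve on a 0-100% gait-cycle axis involves, here is a minimal sketch that resamples one cycle onto 101 evenly spaced percentage points. It is not the authors' SPSS procedure: it uses plain linear interpolation in place of the interpolating splines the description mentions, and the function name, input layout, and sample values are assumptions made up for this example.

```python
from bisect import bisect_right


def normalize_gait_cycle(times, angles, n_points=101):
    """Resample one gait cycle onto an evenly spaced 0-100 % axis.

    times  -- strictly increasing sample times covering one cycle, from one
              initial contact of the right foot to the next (assumed layout)
    angles -- rotation at each time (positive = left, negative = right)
    Uses linear interpolation, a simplification of spline interpolation.
    """
    if len(times) != len(angles) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    t0, t1 = times[0], times[-1]
    percents, resampled = [], []
    for i in range(n_points):
        pct = 100.0 * i / (n_points - 1)
        t = t0 + (t1 - t0) * pct / 100.0
        # Find the segment [times[j], times[j + 1]] that contains t.
        j = min(max(bisect_right(times, t) - 1, 0), len(times) - 2)
        frac = (t - times[j]) / (times[j + 1] - times[j])
        resampled.append(angles[j] + frac * (angles[j + 1] - angles[j]))
        percents.append(pct)
    return percents, resampled


# Made-up samples for illustration only, not values from the dataset.
times = [0.0, 0.3, 0.6, 0.9, 1.2]
angles = [0.0, 2.0, 0.0, -2.0, 0.0]
percents, curve = normalize_gait_cycle(times, angles)
```

Once every recording shares the same percentage axis, curves from different walking speeds can be compared point by point, which is the purpose of standardizing the cycle.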
Scalable ParaView for Extreme Scale Visualization, Phase I
data.nasa.gov
application/rdfxml +5
Updated Jun 26, 2018
Cite
(2018). Scalable ParaView for Extreme Scale Visualization, Phase I [Dataset]. https://data.nasa.gov/dataset/Scalable-ParaView-for-Extreme-Scale-Visualization-/up7h-hkky
Explore at:
Available download formats: csv, tsv, xml, application/rssxml, application/rdfxml, json
Dataset updated
Jun 26, 2018
License
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Description
Petascale computing is leading to significant breakthroughs in a number of fields and is revolutionizing the way science is conducted. Data is not knowledge, however, and the challenge has been how to analyze and gain insight from the massive quantities of data that are generated. To address petascale visualization challenges, we propose to develop scientific visualization software that would enable real-time visualization of extremely large data sets. We plan to accomplish this by extending the ParaView visualization architecture to extreme scales. ParaView is open source software installed on all HPC sites, including NASA's Pleiades, and has a large user base in diverse areas of science and engineering. Our proposed solution will significantly enhance the scientific return from NASA HPC investments by providing the next generation of open source data analysis and visualization tools for very large datasets. To test our solution on real-world data with a complex pipeline, we have partnered with SciberQuest, who have recently performed the largest kinetic simulations of the magnetosphere using 25K cores on Pleiades and 100K cores on Kraken. Given that I/O is the main bottleneck for scientific visualization at large scales, we propose to work closely with the Pleiades systems team and provide an efficient, prepackaged, general-purpose I/O component for ParaView for structured and unstructured data across a spectrum of scales and access patterns, with a focus on the Lustre file system used by Pleiades.
