https://www.gnu.org/licenses/agpl-3.0.html
A generator for synthetic data sets for use in cluster analysis, classification, and outlier detection.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data we used to evaluate the Louvain method in the study Benchmarking Graph Databases on the Problem of Community Detection. These data were synthetically generated using the LFR-Benchmark (3rd link). There are two types of files, networkX.dat and communityX.dat. The networkX.dat file contains the list of edges (nodes are labelled from 1 to the number of nodes; the edges are ordered and each appears twice, i.e. source-target and target-source). The first four lines of the networkX.dat file list the parameters we used to generate the data. The communityX.dat file contains a list of the nodes and their membership (memberships are labelled by integers >= 1). Note that X corresponds to the number of nodes each dataset contains.
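The file layout described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' own loader. It assumes whitespace-separated integer columns, one edge or one node-membership pair per line, and a four-line parameter header whose exact format is not specified in the description, so it is kept as raw text. The function names `parse_network` and `parse_communities` are hypothetical.

```python
from collections import defaultdict


def parse_network(lines, header_lines=4):
    """Return (raw parameter lines, set of undirected edges).

    Assumption: the first `header_lines` non-empty lines hold the generator
    parameters; every later line is "source target" separated by whitespace.
    """
    lines = [ln.strip() for ln in lines if ln.strip()]
    params = lines[:header_lines]
    edges = set()
    for ln in lines[header_lines:]:
        source, target = (int(tok) for tok in ln.split()[:2])
        # Each edge is listed in both directions; store it only once.
        edges.add((min(source, target), max(source, target)))
    return params, edges


def parse_communities(lines):
    """Map each membership label to the set of nodes assigned to it.

    Assumption: every non-empty line is "node membership".
    """
    communities = defaultdict(set)
    for ln in lines:
        if not ln.strip():
            continue
        node, membership = (int(tok) for tok in ln.split()[:2])
        communities[membership].add(node)
    return dict(communities)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for networkX.dat and communityX.dat
    # (the header text here is made up for illustration).
    network = ["p1", "p2", "p3", "p4", "1 2", "2 1", "2 3", "3 2"]
    community = ["1 1", "2 1", "3 2"]
    params, edges = parse_network(network)
    print(len(params), sorted(edges))
    print(parse_communities(community))
```

With real files, the same functions can be fed an open file handle, for example `parse_network(open(path))`, where `path` points to one of the networkX.dat files.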
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
DrCyZ: Techniques for analyzing and extracting useful information from CyZ.
Samples from NASA Perseverance and set of GAN generated synthetic images from Neural Mars.
Repository: https://github.com/decurtoidiaz/drcyz
Subset of samples from (includes tools to visualize and analyse the dataset):
CyZ: MARS Space Exploration Dataset. [https://doi.org/10.5281/zenodo.5655473]
Images from NASA missions of the celestial body.
Repository: https://github.com/decurtoidiaz/cyz
Authors:
J. de Curtò c@decurto.be
I. de Zarzà z@dezarza.be
• Subset of samples from Perseverance (drcyz/c).
∙ png (drcyz/c/png).
PNG files (5025) selected from NASA Perseverance (CyZ-1.1) after t-SNE and K-means Clustering.
∙ csv (drcyz/c/csv).
CSV file.
• Resized samples from Perseverance (drcyz/c+).
∙ png 64x64; 128x128; 256x256; 512x512; 1024x1024 (drcyz/c+/drcyz_64-1024).
PNG files resized at the corresponding size.
∙ TFRecords 64x64; 128x128; 256x256; 512x512; 1024x1024 (drcyz/c+/tfr_drcyz_64-1024).
TFRecord files resized to the corresponding size for import into TensorFlow.
• Synthetic images from Neural Mars generated using Stylegan2-ada (drcyz/drcyz+).
∙ png 100; 1000; 10000 (drcyz/drcyz+/drcyz_256_100-10000)
PNG file subsets of 100, 1000, and 10000 images at size 256x256.
• Network Checkpoint from Stylegan2-ada trained at size 256x256 (drcyz/model_drcyz).
∙ network-snapshot-000798-drcyz.pkl
• Notebooks in Python to analyse the original dataset and reproduce the experiments: K-means Clustering, t-SNE, PCA, synthetic generation using Stylegan2-ada and instance segmentation using Deeplab (https://github.com/decurtoidiaz/drcyz/tree/main/dr_cyz+).
∙ clustering_curiosity_de_curto_and_de_zarza.ipynb
K-means Clustering and PCA(2) with images from Curiosity.
∙ clustering_perseverance_de_curto_and_de_zarza.ipynb
K-means Clustering and PCA(2) with images from Perseverance.
∙ tsne_curiosity_de_curto_and_de_zarza.ipynb
t-SNE and PCA (components selected to explain 99% of variance) with images from Curiosity.
∙ tsne_perseverance_de_curto_and_de_zarza.ipynb
t-SNE and PCA (components selected to explain 99% of variance) with images from Perseverance.
∙ Stylegan2-ada_de_curto_and_de_zarza.ipynb
Stylegan2-ada trained on a subset of images from NASA Perseverance (DrCyZ).
∙ statistics_perseverance_de_curto_and_de_zarza.ipynb
Compute statistics from synthetic samples generated by Stylegan2-ada (DrCyZ) and images from NASA Perseverance (CyZ).
∙ DeepLab_TFLite_ADE20k_de_curto_and_de_zarza.ipynb
Example of instance segmentation using Deeplab with a sample from NASA Perseverance (DrCyZ).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hyperparameters and performance metrics of BGWO with K-means clustering.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Diabetic retinopathy (DR) is a leading cause of blindness globally and is diagnostically challenging owing to the intricate process of its development and the complexity of the human eye, which consists of nearly forty connected components such as the retina, iris, and optic nerve. This study proposes a novel approach to identifying DR that combines synthetic data generation, a K-Means Clustering-Based Binary Grey Wolf Optimizer (KCBGWO), and Fully Convolutional Encoder-Decoder Networks (FCEDN). Generative Adversarial Networks (GANs) are used to generate high-quality synthetic data, and transfer learning is used for accurate feature extraction and classification, integrated with Extreme Learning Machines (ELM). In an extensive evaluation on the IDRiD dataset, the proposed model achieves 99.87% accuracy, 99.33% sensitivity, and 99.78% specificity. These outcomes are promising for the further development of the proposed approach to DR diagnosis, for establishing a new reference point in medical image analysis, and for providing more effective and timely treatments.
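Accuracy, sensitivity, and specificity, the three figures reported above, are conventionally derived from confusion-matrix counts. The sketch below uses the standard binary definitions. The abstract does not state how the study computed its figures, and the counts in the example are made up for illustration, not taken from the IDRiD evaluation. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: diseased cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all correct decisions
        "sensitivity": tp / (tp + fn),      # true positive rate (recall)
        "specificity": tn / (tn + fp),      # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in binary_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Reporting all three together matters for screening tasks: a model can reach high accuracy on an imbalanced dataset while still missing many positive cases, which only the sensitivity would reveal.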
We present a method of generating Mercer Kernels from an ensemble of probabilistic mixture models, where each mixture model is generated from a Bayesian mixture density estimate. We show how to convert the ensemble estimates into a Mercer Kernel, describe the properties of this new kernel function, and give examples of the performance of this kernel on unsupervised clustering of synthetic data and also in the domain of unsupervised multispectral image understanding.
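The abstract does not give the kernel formula. One construction consistent with the description, shown here purely as an assumption, averages over the ensemble the inner product of the cluster-posterior vectors of two points: K(x, y) = (1/M) Σ_m Σ_c P_m(c | x) P_m(c | y). Because this is an inner product of concatenated posterior vectors, it is symmetric and positive semidefinite. The sketch uses hand-specified 1-D Gaussian mixtures in place of Bayesian mixture density estimates, and the names `posteriors` and `ensemble_kernel` are hypothetical.

```python
import math


def posteriors(x, mixture):
    """P(c | x) for a 1-D Gaussian mixture given as (weight, mean, std) triples."""
    dens = [
        w * math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        for w, mu, sd in mixture
    ]
    total = sum(dens)
    return [d / total for d in dens]


def ensemble_kernel(x, y, ensemble):
    """Average over mixture models of the dot product of posterior vectors."""
    total = 0.0
    for mixture in ensemble:
        px, py = posteriors(x, mixture), posteriors(y, mixture)
        total += sum(a * b for a, b in zip(px, py))
    return total / len(ensemble)


if __name__ == "__main__":
    # Two hand-picked mixtures standing in for an ensemble of fitted models.
    ensemble = [
        [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)],
        [(0.4, -0.5, 1.2), (0.6, 5.5, 0.8)],
    ]
    print(ensemble_kernel(0.0, 0.2, ensemble))  # points near the same component
    print(ensemble_kernel(0.0, 5.0, ensemble))  # points near different components
```

Under this construction, two points that every model assigns to the same component get a kernel value near 1, while points assigned to different components get a value near 0, which is what makes the kernel usable for unsupervised clustering.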
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mathematical representation of performance evaluation matrices.
http://www.apache.org/licenses/LICENSE-2.0
Accompanying data and analyses of the article "Generating brain-wide connectome using synthetic axonal morphologies". The code to reproduce the figures is available at this repository.
Main contents:
Additional files:
*These software packages might not be open source at the time of publication of this data, but a public link will be provided as soon as they are.
This paper presents a method of generating Mercer Kernels from an ensemble of probabilistic mixture models, where each mixture model is generated from a Bayesian mixture density estimate. We show how to convert the ensemble estimates into a Mercer Kernel, describe the properties of this new kernel function, and give examples of the performance of this kernel on unsupervised clustering of synthetic data and also in the domain of unsupervised multispectral image understanding.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Swiss-PDGs: Synthetic low- and medium-voltage grids for Switzerland
For details on the model used to generate this dataset, please refer to the article "Large-scale generation of geo-referenced power distribution grids from open data with load clustering" (2025), by A. Oneto, B. Gjorgiev, F. Tettamanti, and G. Sansavini, published in Sustainable Energy, Grids and Networks.
https://doi.org/10.1016/j.segan.2025.101678
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
To realize the molecular design of new functional silver(I) clusters, a new synthetic approach has been proposed, by which the weakly coordinating ligands NO3– in a Ag20 thiolate cluster precursor can be substituted by carboxylic ligands while keeping its inner core intact. By rational design, novel atom-precise carboxylic or amino acid protected 20-core Ag(I)-thiolate clusters have been demonstrated for the first time. The fluorescence and electrochemical activity of the postmodified Ag20 clusters can be modulated by alrestatin or ferrocenecarboxylic acid substitution. More strikingly, when chiral amino acids were used as postmodified ligands, CD-activity was observed for the Ag20 clusters, unveiling an efficient way to obtain atom-precise chiral silver(I) clusters.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Generative image models have revolutionized artificial intelligence by enabling the synthesis of high-quality, realistic images. These models utilize deep learning techniques to learn complex data distributions and generate novel images that closely resemble the training dataset. Recent advancements, particularly in diffusion models, have led to remarkable improvements in image fidelity, diversity, and controllability. In this work, we investigate the application of a conditional latent diffusion model in the healthcare domain. Specifically, we trained a latent diffusion model using unlabeled histopathology images. Initially, these images were embedded into a lower-dimensional latent space using a Vector Quantized Generative Adversarial Network (VQ-GAN). Subsequently, a diffusion process was applied within this latent space, and clustering was performed on the resulting latent features. The clustering results were then used as a conditioning mechanism for the diffusion model, enabling conditional image generation. Finally, we determined the optimal number of clusters using cluster validation metrics and assessed the quality of the synthetic images through quantitative methods. To enhance the interpretability of the synthetic image generation process, expert input was incorporated into the cluster assignments.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Although biological iron–sulfur (Fe–S) clusters perform some of the most difficult redox reactions in nature, they are thought to be composed exclusively of Fe2+ and Fe3+ ions, as well as mixed-valent pairs with average oxidation states of Fe2.5+. We herein show that Fe–S clusters formally composed of these valences can access a wider range of electronic configurations, in particular those featuring low-valent Fe1+ centers. We demonstrate that CO binding to a synthetic [Fe4S4]0 cluster supported by N-heterocyclic carbene ligands induces the generation of Fe1+ centers via intracluster electron transfer, wherein a neighboring pair of Fe2+ sites reduces the CO-bound site to a low-valent Fe1+ state. Similarly, CO binding to an [Fe4S4]+ cluster induces electron delocalization with a neighboring Fe site to form a mixed-valent Fe1.5+Fe2.5+ pair in which the CO-bound site adopts partial low-valent character. These low-valent configurations engender remarkable C–O bond activation without having to traverse highly negative and physiologically inaccessible [Fe4S4]0/[Fe4S4]− redox couples.