89 datasets found

Weather and Housing in North America
kaggle.com
zip
Updated Feb 13, 2023
Cite
The Devastator (2023). Weather and Housing in North America [Dataset]. https://www.kaggle.com/datasets/thedevastator/weather-and-housing-in-north-america
Explore at:
zip (512280 bytes)
Dataset updated
Feb 13, 2023
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
North America
Description
Weather and Housing in North America

Exploring the Relationship between Weather and Housing Conditions in 2012

By [source]

About this dataset

This dataset explores the relationship between housing and weather conditions across North America in 2012. Climate variables such as temperature, wind speed, humidity, pressure and visibility describe the weather of numerous regions, while housing parameters such as longitude, latitude, median income, median house value and ocean proximity describe the local real estate market. Analyzing the two data sets together can help show which factors influence the value and comfort of residential areas throughout North America.

How to use the dataset

This dataset offers plenty of insights into the effects of weather and housing on North American regions. To explore these relationships, you can perform data analysis on the variables provided.

First, start by examining descriptive statistics (i.e., mean, median, mode). This can help show you the general trend and distribution of each variable in this dataset. For example, what is the most common temperature in a given region? What is the average wind speed? How does this vary across different regions? By looking at descriptive statistics, you can get an initial idea of how various weather conditions and housing attributes interact with one another.

Next, explore correlations between variables. Are certain weather variables correlated with specific housing attributes? Is there a link between wind speeds and median house value? Or between humidity and ocean proximity? Analyzing correlations allows for deeper insights into how different aspects may influence one another for a given region or area. These correlations may also inform broader patterns that are present across multiple North American regions or countries.

Finally, use visualizations to further investigate the relationship between climate and housing attributes in North America in 2012. Graphs make trends such as seasonal variations or changes over time easier to see, so they are useful for interpreting large amounts of data quickly and for providing context beyond what the numbers alone can show.
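
The first two steps above (descriptive statistics, then correlations) can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset author's workflow: the two lists hold invented toy values standing in for the Temp_C and Wind Speed_km/h columns, and the `pearson` helper is a hypothetical name introduced here.

```python
import statistics

# Toy observations, invented for illustration only (not read from Weather.csv).
temp_c = [-1.8, -1.5, 0.2, 3.4, 3.4, 7.9]
wind_speed_kmh = [4.0, 7.0, 9.0, 15.0, 17.0, 24.0]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Step 1: descriptive statistics for one variable.
summary = {
    "mean": statistics.fmean(temp_c),
    "median": statistics.median(temp_c),
    "mode": statistics.mode(temp_c),
}

# Step 2: correlation between two variables.
r = pearson(temp_c, wind_speed_kmh)
print(summary, round(r, 3))
```

With real data, the same two steps would be applied to each column after loading the CSV files; a correlation close to zero suggests no linear relationship, while values near 1 or -1 suggest a strong one.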

Research Ideas

Analyzing the effect of climate change on housing markets across North America. By looking at temperature and weather trends in combination with housing values, researchers can better understand how climate change may be impacting certain regions differently than others.

Investigating the relationship between median income, house values and ocean proximity in coastal areas. Understanding how ocean proximity plays into housing prices may help inform real estate investment decisions and urban planning initiatives related to coastal development.

Utilizing differences in weather patterns across different climates to determine optimal seasonal rental prices for property owners. By analyzing changes in temperature, wind speed, humidity, pressure and visibility from season to season, an investor could gain insights into seasonal market trends and maximize profits from rentals or Airbnb listings over time.

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: Weather.csv

| Column name | Description |
|:---------------------|:-----------------------------------------------|
| Date/Time | Date and time of the observation. (Date/Time) |
| Temp_C | Temperature in Celsius. (Numeric) |
| Dew Point Temp_C | Dew point temperature in Celsius. (Numeric) |
| Rel Hum_% | Relative humidity in percent. (Numeric) |
| Wind Speed_km/h | Wind speed in kilometers per hour. (Numeric) |
| Visibility_km | Visibilit...
Synthetic Data for an Imaginary Country, Sample, 2023 - World
microdata.worldbank.org
nada-demo.ihsn.org
Updated Jul 7, 2023
+ more versions
Cite
Development Data Group, Data Analytics Unit (2023). Synthetic Data for an Imaginary Country, Sample, 2023 - World [Dataset]. https://microdata.worldbank.org/index.php/catalog/5906
Explore at:
Dataset updated
Jul 7, 2023
Dataset authored and provided by
Development Data Group, Data Analytics Unit
Time period covered
2023
Area covered
World
Description
Abstract

The dataset is a relational dataset of 8,000 households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other one with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, assets ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.

The full-population dataset (with about 10 million individuals) is also distributed as open data.

Geographic coverage

The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.

Analysis unit

Household, Individual

Universe

The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.

Kind of data

ssd

Sampling procedure

The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In a first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). Then 25 households were randomly selected within each enumeration area. The R script used to draw the sample is provided as an external resource.
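
The two-stage procedure described above can be sketched as follows. This is an illustrative Python sketch, not the R script mentioned in the text: the stratum sizes are invented, the function names are hypothetical, and largest-remainder rounding is an assumption made here so that the per-stratum counts add up exactly.

```python
import random

SAMPLE_SIZE = 8000        # households in the sample (from the text above)
HOUSEHOLDS_PER_EA = 25    # fixed take per enumeration area (from the text above)


def allocate_eas(stratum_sizes, total_eas):
    """Split total_eas across strata in proportion to stratum size.

    Largest-remainder rounding is an assumption made here so that the
    integer counts add up to total_eas.
    """
    total = sum(stratum_sizes.values())
    quotas = {s: total_eas * n / total for s, n in stratum_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = total_eas - sum(alloc.values())
    # Hand the remaining areas to the strata with the largest fractional parts.
    ranked = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in ranked[:leftover]:
        alloc[s] += 1
    return alloc


def draw_households(household_ids, rng, k=HOUSEHOLDS_PER_EA):
    """Second stage: simple random sample of k households in one area."""
    return rng.sample(household_ids, k)


total_eas = SAMPLE_SIZE // HOUSEHOLDS_PER_EA  # 8000 / 25 = 320 areas

# Hypothetical stratum sizes (geo_1 x urban/rural), for illustration only.
strata = {
    ("province_1", "urban"): 410_000,
    ("province_1", "rural"): 330_000,
    ("province_2", "urban"): 170_000,
    ("province_2", "rural"): 90_000,
}
alloc = allocate_eas(strata, total_eas)

rng = random.Random(0)
picked = draw_households(list(range(400)), rng)
```

Fixing the take at 25 households per enumeration area means the number of areas, not the number of households per area, carries the proportional allocation.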

Mode of data collection

other

Research instrument

The dataset is a synthetic dataset. Although the variables it contains are variables typically collected from sample surveys or population censuses, no questionnaire is available for this dataset. A "fake" questionnaire was however created for the sample dataset extracted from this dataset, to be used as training material.

Cleaning operations

The synthetic data generation process included a set of "validators" (consistency checks, based on which synthetic observations were assessed and rejected/replaced when needed). Some post-processing was also applied to produce the distributed data files.

Response rate

This is a synthetic dataset; the "response rate" is 100%.
Monthly Modal Time Series
catalog.data.gov
data.virginia.gov
+1more
Updated Nov 7, 2025
+ more versions
Cite
Federal Transit Administration (2025). Monthly Modal Time Series [Dataset]. https://catalog.data.gov/dataset/monthly-modal-time-series
Explore at:
Dataset updated
Nov 7, 2025
Dataset provided by
Federal Transit Administration
Description
Modal Service data and Safety & Security (S&S) public transit time series data delineated by transit/agency/mode/year/month. Includes all Full Reporters--transit agencies operating modes with more than 30 vehicles in maximum service--to the National Transit Database (NTD). This dataset will be updated monthly. The monthly ridership data is released one month after the month in which the service is provided. Records with null monthly service data reflect late reporting. The S&S statistics provided include both Major and Non-Major Events where applicable. Events occurring in the past three months are excluded from the corresponding monthly ridership rows in this dataset while they undergo validation. This dataset is the only NTD publication in which all Major and Non-Major S&S data are presented without any adjustment for historical continuity.
Used Car Listings in Indonesia
kaggle.com
zip
Updated Oct 23, 2023
Cite
Indra (2023). Used Car Listings in Indonesia [Dataset]. https://www.kaggle.com/datasets/indraputra21/used-car-listings-in-indonesia/code
Explore at:
zip (26021 bytes)
Dataset updated
Oct 23, 2023
Authors
Indra
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Indonesia
Description
Dataset Description:

This dataset contains information about various listings of used cars, their attributes, and features, including their brand, year of manufacture, price, installment amount, mileage, transmission type, location, license plate type, and various features such as rear camera, sunroof, auto retract mirror, and more.

Column Descriptions:

car name: The name or model of the car.

brand: The brand or manufacturer of the car.

year: The year the car was manufactured.

mileage (km): The mileage or distance traveled by the car in kilometers (km).

location: The location where the car is listed for sale.

transmission: The transmission type, such as "Manual" or "Automatic."

plate type: The type of license plate, which can be an even plate or an odd plate.

rear camera: Indicates whether the car has a rear camera (0 for no, 1 for yes).

sun roof: Indicates whether the car has a sunroof (0 for no, 1 for yes).

auto retract mirror: Indicates whether the car has auto-retracting mirrors (0 for no, 1 for yes).

electric parking brake: Indicates whether the car has an electric parking brake (0 for no, 1 for yes).

map navigator: Indicates whether the car has a built-in map navigator (0 for no, 1 for yes).

vehicle stability control: Indicates whether the car has vehicle stability control (0 for no, 1 for yes).

keyless push start: Indicates whether the car has a keyless push start (0 for no, 1 for yes).

sports mode: Indicates whether the car has a sports mode (0 for no, 1 for yes).

360 camera view: Indicates whether the car has a 360-degree camera view (0 for no, 1 for yes).

power sliding door: Indicates whether the car has a power sliding door (0 for no, 1 for yes).

auto cruise control: Indicates whether the car has auto cruise control (0 for no, 1 for yes).

price (Rp): The price of the car in Indonesian Rupiah (Rp).

instalment (Rp|Monthly): The monthly installment amount for the car, in Indonesian Rupiah (Rp).

Potential Usages:

This dataset can be used for used-car market analysis, price prediction, and similar tasks.

Other:

Raw data is provided for anyone who wants it (in Bahasa Indonesia).

data source: scraped from https://www.carsome.id/

image: generated using DALL·E 3
MISR Level 1B1 Local Mode Radiance Data V002
data.nasa.gov
cmr.earthdata.nasa.gov
+2more
Updated Apr 1, 2025
+ more versions
Cite
nasa.gov (2025). MISR Level 1B1 Local Mode Radiance Data V002 [Dataset]. https://data.nasa.gov/dataset/misr-level-1b1-local-mode-radiance-data-v002-7db9a
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
MIB1LM_002 is the Multi-angle Imaging SpectroRadiometer (MISR) Level 1B1 Local Mode Radiance Data version 2. It contains the data numbers (DNs) radiometrically scaled to radiances with no geometric resampling. The MISR Level 1B1 Radiance data product contains spectral radiances for all MISR channels. Each value represents the incident radiance averaged over the sensor's total band response. Processing includes both radiance scaling and conditioning steps. Radiance scaling converts the Level 1A data from digital counts to radiances, using coefficients derived from the onboard calibrator (OBC) and vicarious calibrations. The OBC contains Spectralon calibration panels, which are deployed monthly and reflect sunlight into the cameras. The OBC detector standards then measure this reflected light to provide the calibration. No out-of-band correction is done for this product, nor are the data geometrically corrected or resampled. Data collection for this product is ongoing.

The MISR instrument consists of nine push-broom cameras that measure radiance in four spectral bands. Global coverage is achieved in nine days. The cameras are arranged with one camera pointing toward the nadir, four forward, and four aftward. It takes seven minutes for all nine cameras to view the same surface location. The view angles relative to the surface reference ellipsoid are 0, 26.1, 45.6, 60.0, and 70.5 degrees. The spectral band shapes are nominally Gaussian, centered at 443, 555, 670, and 865 nm.

MISR is designed to view Earth with cameras pointed in nine different directions. As the instrument flies overhead, each piece of Earth's surface below is successively imaged by all nine cameras in four wavelengths (blue, green, red, and near-infrared). The goal of MISR is to improve our understanding of the effects of sunlight on Earth and distinguish different types of clouds, particles, and surfaces.
Specifically, MISR monitors the monthly, seasonal, and long-term trends in three areas: 1) amount and type of atmospheric particles (aerosols), including those formed by natural sources and by human activities; 2) amounts, types, and heights of clouds, and 3) distribution of land surface cover, including vegetation canopy structure.
Uniform Sentinel 1-2 Dataset
kaggle.com
huggingface.co
zip
Updated Jun 9, 2025
Cite
Shamba Chowdhury (2025). Uniform Sentinel 1-2 Dataset [Dataset]. https://www.kaggle.com/datasets/shambac/uniform-sentinel-1-2-dataset
Explore at:
zip (25713003558 bytes)
Dataset updated
Jun 9, 2025
Authors
Shamba Chowdhury
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset name

UNIFORM SEN1-2

Year of publication

2025

Author

Shamba Chowdhury, Ankana Ghosh, Shreyashi Ghosh

License

CC-BY-SA-4.0 The dataset contains Copernicus data (2024). Terms and conditions apply: https://scihub.copernicus.eu/twiki/pub/SciHubWebPortal/TermsConditions/TC_Sentinel_Data_31072014.pdf

Associated publication

TBA

Links

Dataset: https://www.kaggle.com/datasets/shambac/uniform-sentinel-1-2-dataset Paper: TBA

Dataset structure

Folders named in the format of 'r_XXX' and CSV files named in the format of 'data_r_XXX.csv'.

Each folder contains two sub folders named 's1_XXX' and 's2_XXX'.

s1 folder contains 256x256 grayscale Sentinel 1 images from a particular region and s2 folder contains 256x256 color Sentinel 2 images from the same region.

Each region folder has an accompanying data csv.
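
The naming scheme above can be turned into paths with a small helper. This is a sketch under stated assumptions: the helper name `region_paths` is hypothetical, the same `XXX` identifier is assumed to be shared by the region folder, its two subfolders and the CSV file, and the CSV is assumed to sit next to the region folder (the description does not say where it is stored).

```python
from pathlib import Path


def region_paths(root, region):
    """Return the expected locations for one region, e.g. region='001'.

    Assumes the CSV lives directly under root; adjust if the archive
    places it elsewhere.
    """
    root = Path(root)
    folder = root / f"r_{region}"
    return {
        "s1_dir": folder / f"s1_{region}",      # grayscale Sentinel 1 images
        "s2_dir": folder / f"s2_{region}",      # color Sentinel 2 images
        "metadata_csv": root / f"data_r_{region}.csv",
    }


paths = region_paths("uniform_sen1_2", "001")
print(paths["s1_dir"], paths["metadata_csv"])
```

The zero-padded identifier `001` is only an example value; the actual identifiers should be read from the extracted archive.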

Dataset size

No. of files: 616,148 Storage: 53,699 MB

Description

The dataset has images spread uniformly across the world, covering 165 regions with 129,438 pairs of images, for a total of 258,876 image files. The figure below shows the selected regions on a world map.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F20405330%2F5a6633a532d3b6e3b03587781f50e1b6%2Funknown.png?generation=1757140433954325&alt=media

The CSV files contain metadata for all the images:

- Coordinates: Geo-coordinates of the top-left point of the image.
- Country: Name of the country where the image was captured.
- Date-Time: Date and time when the image was captured.
- Resolution Scale: Geospatial resolution of the image.
- Temperature Region: Temperature zone of the region in the image.
- Season: Season in the specific region at the time the image was captured.

Sentinel 1 images have two more attributes:

- Operational Mode: The operational/acquisition mode the satellite used to capture the given image.
- Polarisation: The polarisation with which the image was captured.

Sentinel 2 images have one unique attribute:

- Bands: Sentinel 2 images come with multiple information channels called bands; this attribute contains a list of the bands in the image.

A grid of sample images from the dataset is given below:

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F20405330%2Fbb696ebf7c3317e0624ce84ced0b3731%2Funknown.png?generation=1757140607662415&alt=media
Data from: FISBe: A real-world benchmark dataset for instance segmentation...
data.niaid.nih.gov
data-staging.niaid.nih.gov
+1more
Updated Apr 2, 2024
Cite
Mais, Lisa; Hirsch, Peter; Managan, Claire; Kandarpa, Ramya; Rumberger, Josef Lorenz; Reinke, Annika; Maier-Hein, Lena; Ihrke, Gudrun; Kainmueller, Dagmar (2024). FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10875062
Explore at:
Dataset updated
Apr 2, 2024
Dataset provided by
German Cancer Research Center
Max Delbrück Center
Howard Hughes Medical Institute - Janelia Research Campus
Max Delbrück Center for Molecular Medicine
Authors
Mais, Lisa; Hirsch, Peter; Managan, Claire; Kandarpa, Ramya; Rumberger, Josef Lorenz; Reinke, Annika; Maier-Hein, Lena; Ihrke, Gudrun; Kainmueller, Dagmar
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General

For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.

Summary

A new dataset for neuron instance segmentation in 3d multicolor light microscopy data of fruit fly brains

30 completely labeled (segmented) images

71 partly labeled images

altogether comprising ∼600 expert-labeled neuron instances (labeling a single neuron takes between 30-60 min on average, yet a difficult one can take up to 4 hours)

To the best of our knowledge, the first real-world benchmark dataset for instance segmentation of long thin filamentous objects

A set of metrics and a novel ranking score for respective meaningful method benchmarking

An evaluation of three baseline methods in terms of the above metrics and score

Abstract

Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.

Dataset documentation:

We provide a detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:

FISBe Datasheet

Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.

Files

fisbe_v1.0_{completely,partly}.zip

contains the image and ground truth segmentation data; there is one zarr file per sample, see below for more information on how to access zarr files.

fisbe_v1.0_mips.zip

maximum intensity projections of all samples, for convenience.

sample_list_per_split.txt

a simple list of all samples and the subset they are in, for convenience.

view_data.py

a simple python script to visualize samples, see below for more information on how to use it.

dim_neurons_val_and_test_sets.json

a list of instance ids per sample that are considered to be of low intensity/dim; can be used for extended evaluation.

Readme.md

general information

How to work with the image files

Each sample consists of a single 3d MCFO image of neurons of the fruit fly. For each image, we provide a pixel-wise instance segmentation for all separable neurons. Each sample is stored as a separate zarr file (zarr is "a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification"). The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file. The segmentation mask for each neuron is stored in a separate channel. The order of dimensions is CZYX.

We recommend working in a virtual environment, e.g., by using conda:

conda create -y -n flylight-env -c conda-forge python=3.9
conda activate flylight-env

How to open zarr files

Install the python zarr package:

pip install zarr

Open a zarr file with:

import zarr
raw = zarr.open(<path_to_zarr>, mode='r', path="volumes/raw")
seg = zarr.open(<path_to_zarr>, mode='r', path="volumes/gt_instances")

# optional:
import numpy as np
raw_np = np.array(raw)

Zarr arrays are read lazily on-demand. Many functions that expect numpy arrays also work with zarr arrays. Optionally, the arrays can also explicitly be converted to numpy arrays.

How to view zarr image files

We recommend using napari to view the image data.

Install napari:

pip install "napari[all]"

Save the following Python script:

import zarr, sys, napari

raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

viewer = napari.Viewer(ndisplay=3)
for idx, gt in enumerate(gts):
    viewer.add_labels(
        gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
napari.run()

Execute:

python view_data.py /R9F03-20181030_62_B5.zarr

Metrics

S: Average of avF1 and C

avF1: Average F1 Score

C: Average ground truth coverage

clDice_TP: Average true positives clDice

FS: Number of false splits

FM: Number of false merges

tp: Relative number of true positives

For more information on our selected metrics and formal definitions please see our paper.
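
As a small worked example of the ranking score listed above, S can be computed from avF1 and C as their arithmetic mean. This is only a sketch of that one line of the list: the function name is hypothetical, the input values are made up, and the paper remains the reference for how avF1 and C themselves are computed.

```python
def ranking_score(av_f1, coverage):
    """S: the average of avF1 and C, as described in the metrics list."""
    return (av_f1 + coverage) / 2


# Made-up inputs for illustration only (not results from the paper).
s = ranking_score(av_f1=0.4, coverage=0.6)
print(s)
```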

Baseline

To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN) and a non-learnt application-specific color clustering from Duan et al. For detailed information on the methods and the quantitative results please see our paper.

License

The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Citation

If you use FISBe in your research, please use the following BibTeX entry:

@misc{mais2024fisbe,
  title = {FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures},
  author = {Lisa Mais and Peter Hirsch and Claire Managan and Ramya Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
  year = 2024,
  eprint = {2404.00130},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}

Acknowledgments

We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable discussions. P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program. This work was co-funded by Helmholtz Imaging.

Changelog

There have been no changes to the dataset so far. All future changes will be listed on the changelog page.

Contributing

If you would like to contribute, have encountered any issues or have any suggestions, please open an issue for the FISBe dataset in the accompanying github repository.

All contributions are welcome!
Graphite//LFP synthetic training prognosis dataset
data.mendeley.com
Updated May 6, 2020
+ more versions
Cite
Matthieu Dubarry (2020). Graphite//LFP synthetic training prognosis dataset [Dataset]. http://doi.org/10.17632/6s6ph9n8zg.1
Explore at:
Unique identifier
https://doi.org/10.17632/6s6ph9n8zg.1
Dataset updated
May 6, 2020
Authors
Matthieu Dubarry
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This training dataset was calculated using the mechanistic modeling approach. See the "Benchmark Synthetic Training Data for Artificial Intelligence-based Li-ion Diagnosis and Prognosis" publication for more details. More details will be added when published. The prognosis dataset was harder to define as there are no limits on how the three degradation modes can evolve. For this proof-of-concept work, we considered eight parameters to scan. For each degradation mode, degradation was chosen to follow equation (1).

%degradation = a × cycle + (exp(b × cycle) − 1) (1)

Considering the three degradation modes, this accounts for six parameters to scan. In addition, two other parameters were added, a delay for the exponential factor for LLI, and a parameter for the reversibility of lithium plating. The delay was introduced to reflect degradation paths where plating cannot be explained by an increase of LAMs or resistance [55]. The chosen parameters and their values are summarized in Table S1 and their evolution is represented in Figure S1. Figure S1(a,b) presents the evolution of parameters p1 to p7. At the worst, the cells endured 100% of one of the degradation modes in around 1,500 cycles. Minimal LLI was chosen to be 20% after 3,000 cycles. This is to guarantee at least 20% capacity loss for all the simulations. For the LAMs, conditions were less restrictive, and, after 3,000 cycles, the lowest degradation is of 3%. The reversibility factor p8 was calculated with equation (2) when LAMNE > PT.

%LLI=%LLI+p8 (LAM_PE-PT) (2)

Where PT was calculated with equation (3) from [60].

PT=100-((100-LAMPE)/(100×LRini-LAMPE ))×(100-OFSini-LLI) (3)
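
To make equation (1) concrete, the sketch below evaluates it for one degradation mode, sampled every 100 cycles to mirror the spacing of the voltage curves. The function name and the values of `a` and `b` are hypothetical and chosen only for illustration; they are not the parameter values from Table S1.

```python
import math


def percent_degradation(cycle, a, b):
    """Equation (1): a linear term plus an exponential term that is 0 at cycle 0."""
    return a * cycle + (math.exp(b * cycle) - 1.0)


# Hypothetical parameters, for illustration only.
a, b = 0.005, 0.001

# One value per 100 cycles, up to 3,000 cycles.
cycles = list(range(0, 3001, 100))
curve = [percent_degradation(c, a, b) for c in cycles]
print(curve[0], round(curve[-1], 2))
```

With `b = 0` the exponential term vanishes and the degradation is purely linear in the cycle number, which is one way to see how the two parameters per mode separate the linear and accelerating parts of the path.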

Varying all those parameters accounted for more than 130,000 individual duty cycles, with one voltage curve for every 100 cycles. Six MATLAB© .mat files are included. The GIC-LFP_duty_other.mat file contains 12 variables:

Qnorm: normalized capacity scale for all voltage curves

p1 to p8: values used to generate the duty cycles

Key: index of which values were used for each degradation path (1 - p1, … 8 - p8)

QL: capacity loss, one line per path, one column per 100 cycles.

The GIC-LFP_duty_LLI-LAMsvalues.mat file contains the values for LLI, LAMPE and LAMNE for all cycles (one line per 100 cycles) and duty cycles (columns).

The GIC-LFP_duty_1 to _4 files contain the voltage data split into 1 GB chunks (40,000 simulations). Each cell corresponds to one line in the Key variable. Inside each cell, there is one column per 100 cycles.
MMS 4 Electron Drift Instrument (EDI) Electric Field, Level 2 (L2), Survey...
data.nasa.gov
Updated Aug 21, 2025
+ more versions
Cite
nasa.gov (2025). MMS 4 Electron Drift Instrument (EDI) Electric Field, Level 2 (L2), Survey Mode, 5 s Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/mms-4-electron-drift-instrument-edi-electric-field-level-2-l2-survey-mode-5-s-data
Explore at:
Dataset updated
Aug 21, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
Electron Drift Instrument (EDI) Electric Field Survey, Level 2, 5 s Data. EDI has two scientific data acquisition modes, called electric field mode and ambient mode. In electric field mode, two coded electron beams are emitted such that they return to the detectors after one or more gyrations in the ambient magnetic and electric field. The firing directions and times-of-flight allow the derivation of the drift velocity and electric field. In ambient mode, the electron beams are not used. The detectors with their large geometric factors and their ability to adjust the field of view quickly allow continuous sampling of ambient electrons at a selected pitch angle and fixed but selectable energy. To find the beam directions that will hit the detector, EDI sweeps each beam in the plane perpendicular to B at a fixed angular rate of 0.22 °/ms until a signal has been acquired by the detector. Once signal has been acquired, the beams are swept back and forth to stay on target. Beam detection is not determined from the changes in the count-rates directly, but from the square of the beam counts divided by the background counts from ambient electrons, i.e., from the square of the instantaneous signal-to-noise ratio (SNR). This quantity is computed from data provided by the correlator in the Gun-Detector Electronics that also generates the coding pattern imposed on the outgoing beams. If the squared SNR ratio exceeds a threshold, this is taken as evidence that the beam is returning to the detector. The thresholds for SNR are chosen dependent on background fluxes. They represent a compromise between getting false hits (induced by strong variations in background electron fluxes) and missing true beam hits. The basic software loop that controls EDI operations is executed every 2 ms. As the times when the beams hit their detectors are neither synchronized with the telemetry nor equidistant, EDI data have no fixed time-resolution. Data are reported in telemetry slots. 
In Survey, using the standard packing mode 0, there are eight telemetry slots per second per Gun Detector Unit (GDU). The last beam detected during the previous slot will be reported in the current slot. If no beam has been detected, the data quality will be set to zero. In Burst telemetry there are 128 slots per second per GDU. The data in each slot consists of information regarding the beam firing directions (stored in the form of analytic gun deflection voltages), times-of-flight (if successfully measured), quality indicators, time stamps of the beam hits, and some auxiliary correlator-related information. Whenever EDI is not in electron drift mode, it uses its ambient electron mode. The mode has the capability to sample at either 90 degrees pitch angle or at 0/180 degrees (field aligned), or to alternate between 90 degrees and field aligned with selectable dwell times. While all options have been demonstrated during the commissioning phase, only the field aligned mode has been used in the routine operations phase. The choices for energy are 250 eV, 500 eV, and 1 keV. The two detectors, which are facing opposite hemispheres, are looking strictly into opposite directions, so while one detector is looking along B the other is looking antiparallel to B (corresponding to pitch angles of 180 and 0 degrees, respectively). The two detectors switch roles every half spin of the spacecraft as the tip of the magnetic field vector spins outside the field of view of one detector and into the field of view of the other detector. This is the primary data product generated from data collected in electric field mode. The science data generated are drift velocity and electric field data in various coordinate systems. They are derived from triangulation and/or time-of-flight analysis. Where both methods are applicable, their results will be combined using a weighting approach based on their relative errors.
The EDI instrument paper can be found at: http://link.springer.com/article/10.1007%2Fs11214-015-0182-7. The EDI instrument data products guide can be found at https://lasp.colorado.edu/mms/sdc/public/datasets/fields/.
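The beam-detection rule and the slot rates described above can be sketched in a few lines of Python. This is only an illustration of the stated formula (squared beam counts divided by background counts, compared against a threshold); the function name, the threshold values, and the handling of a non-positive background are assumptions, not part of the EDI flight software.

```python
def beam_detected(beam_counts: float, background_counts: float, threshold: float) -> bool:
    """Return True when the squared SNR, beam_counts**2 / background_counts,
    exceeds the threshold. The threshold value is hypothetical; the description
    only says it is chosen depending on background fluxes."""
    if background_counts <= 0:
        # Assumption: without a usable background estimate, report no detection.
        return False
    return (beam_counts ** 2) / background_counts > threshold


# Telemetry slot durations implied by the slot rates quoted in the description.
SURVEY_SLOT_S = 1 / 8    # eight slots per second per GDU
BURST_SLOT_S = 1 / 128   # 128 slots per second per GDU
```

For example, 10 beam counts over 4 background counts give a squared SNR of 25, which would count as a hit for a threshold of 20 but not for a threshold of 30.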
MMS 2 Electron Drift Instrument (EDI) Electric Field, Level 2 (L2), Survey...
data.nasa.gov
Updated Aug 21, 2025
Cite
nasa.gov (2025). MMS 2 Electron Drift Instrument (EDI) Electric Field, Level 2 (L2), Survey Mode, 5 s Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/mms-2-electron-drift-instrument-edi-electric-field-level-2-l2-survey-mode-5-s-data
Explore at:
Dataset updated
Aug 21, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
Electron Drift Instrument (EDI) Electric Field Survey, Level 2, 5 s Data. EDI has two scientific data acquisition modes, called electric field mode and ambient mode. In electric field mode, two coded electron beams are emitted such that they return to the detectors after one or more gyrations in the ambient magnetic and electric field. The firing directions and times-of-flight allow the derivation of the drift velocity and electric field. In ambient mode, the electron beams are not used. The detectors with their large geometric factors and their ability to adjust the field of view quickly allow continuous sampling of ambient electrons at a selected pitch angle and fixed but selectable energy. To find the beam directions that will hit the detector, EDI sweeps each beam in the plane perpendicular to B at a fixed angular rate of 0.22 °/ms until a signal has been acquired by the detector. Once signal has been acquired, the beams are swept back and forth to stay on target. Beam detection is not determined from the changes in the count-rates directly, but from the square of the beam counts divided by the background counts from ambient electrons, i.e., from the square of the instantaneous signal-to-noise ratio (SNR). This quantity is computed from data provided by the correlator in the Gun-Detector Electronics that also generates the coding pattern imposed on the outgoing beams. If the squared SNR ratio exceeds a threshold, this is taken as evidence that the beam is returning to the detector. The thresholds for SNR are chosen dependent on background fluxes. They represent a compromise between getting false hits (induced by strong variations in background electron fluxes) and missing true beam hits. The basic software loop that controls EDI operations is executed every 2 ms. As the times when the beams hit their detectors are neither synchronized with the telemetry nor equidistant, EDI data have no fixed time-resolution. Data are reported in telemetry slots. 
In Survey, using the standard packing mode 0, there are eight telemetry slots per second per Gun Detector Unit (GDU). The last beam detected during the previous slot will be reported in the current slot. If no beam has been detected, the data quality will be set to zero. In Burst telemetry there are 128 slots per second per GDU. The data in each slot consist of information regarding the beam firing directions (stored in the form of analytic gun deflection voltages), times-of-flight (if successfully measured), quality indicators, time stamps of the beam hits, and some auxiliary correlator-related information. Whenever EDI is not in electron drift mode, it uses its ambient electron mode. The mode has the capability to sample at either 90 degrees pitch angle or at 0/180 degrees (field aligned), or to alternate between 90 degrees and field aligned with selectable dwell times. While all options have been demonstrated during the commissioning phase, only the field aligned mode has been used in the routine operations phase. The choices for energy are 250 eV, 500 eV, and 1 keV. The two detectors, which are facing opposite hemispheres, are looking strictly into opposite directions, so while one detector is looking along B the other is looking antiparallel to B (corresponding to pitch angles of 180 and 0 degrees, respectively). The two detectors switch roles every half spin of the spacecraft as the tip of the magnetic field vector spins outside the field of view of one detector and into the field of view of the other detector. This is the primary data product generated from data collected in electric field mode. The science data generated are drift velocity and electric field data in various coordinate systems. They are derived from triangulation and/or time-of-flight analysis. Where both methods are applicable, their results will be combined using a weighting approach based on their relative errors. 
The EDI instrument paper can be found at: http://link.springer.com/article/10.1007%2Fs11214-015-0182-7. The EDI instrument data products guide can be found at https://lasp.colorado.edu/mms/sdc/public/datasets/fields/.
MMS 3 Electron Drift Instrument (EDI) Electric Field, Level 2 (L2), Survey...
data.nasa.gov
Updated Aug 21, 2025
Cite
nasa.gov (2025). MMS 3 Electron Drift Instrument (EDI) Electric Field, Level 2 (L2), Survey Mode, 5 s Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/mms-3-electron-drift-instrument-edi-electric-field-level-2-l2-survey-mode-5-s-data
Explore at:
Dataset updated
Aug 21, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
Electron Drift Instrument (EDI) Electric Field Survey, Level 2, 5 s Data. EDI has two scientific data acquisition modes, called electric field mode and ambient mode. In electric field mode, two coded electron beams are emitted such that they return to the detectors after one or more gyrations in the ambient magnetic and electric field. The firing directions and times-of-flight allow the derivation of the drift velocity and electric field. In ambient mode, the electron beams are not used. The detectors with their large geometric factors and their ability to adjust the field of view quickly allow continuous sampling of ambient electrons at a selected pitch angle and fixed but selectable energy. To find the beam directions that will hit the detector, EDI sweeps each beam in the plane perpendicular to B at a fixed angular rate of 0.22 °/ms until a signal has been acquired by the detector. Once signal has been acquired, the beams are swept back and forth to stay on target. Beam detection is not determined from the changes in the count-rates directly, but from the square of the beam counts divided by the background counts from ambient electrons, i.e., from the square of the instantaneous signal-to-noise ratio (SNR). This quantity is computed from data provided by the correlator in the Gun-Detector Electronics that also generates the coding pattern imposed on the outgoing beams. If the squared SNR ratio exceeds a threshold, this is taken as evidence that the beam is returning to the detector. The thresholds for SNR are chosen dependent on background fluxes. They represent a compromise between getting false hits (induced by strong variations in background electron fluxes) and missing true beam hits. The basic software loop that controls EDI operations is executed every 2 ms. As the times when the beams hit their detectors are neither synchronized with the telemetry nor equidistant, EDI data have no fixed time-resolution. Data are reported in telemetry slots. 
In Survey, using the standard packing mode 0, there are eight telemetry slots per second per Gun Detector Unit (GDU). The last beam detected during the previous slot will be reported in the current slot. If no beam has been detected, the data quality will be set to zero. In Burst telemetry there are 128 slots per second per GDU. The data in each slot consist of information regarding the beam firing directions (stored in the form of analytic gun deflection voltages), times-of-flight (if successfully measured), quality indicators, time stamps of the beam hits, and some auxiliary correlator-related information. Whenever EDI is not in electron drift mode, it uses its ambient electron mode. The mode has the capability to sample at either 90 degrees pitch angle or at 0/180 degrees (field aligned), or to alternate between 90 degrees and field aligned with selectable dwell times. While all options have been demonstrated during the commissioning phase, only the field aligned mode has been used in the routine operations phase. The choices for energy are 250 eV, 500 eV, and 1 keV. The two detectors, which are facing opposite hemispheres, are looking strictly into opposite directions, so while one detector is looking along B the other is looking antiparallel to B (corresponding to pitch angles of 180 and 0 degrees, respectively). The two detectors switch roles every half spin of the spacecraft as the tip of the magnetic field vector spins outside the field of view of one detector and into the field of view of the other detector. This is the primary data product generated from data collected in electric field mode. The science data generated are drift velocity and electric field data in various coordinate systems. They are derived from triangulation and/or time-of-flight analysis. Where both methods are applicable, their results will be combined using a weighting approach based on their relative errors. 
The EDI instrument paper can be found at: http://link.springer.com/article/10.1007%2Fs11214-015-0182-7. The EDI instrument data products guide can be found at https://lasp.colorado.edu/mms/sdc/public/datasets/fields/.
Estimated stand-off distance between ADS-B equipped aircraft and obstacles
zenodo.org
data.niaid.nih.gov
+1 more
jpeg, zip
Updated Jul 12, 2024
Cite
Andrew Weinert (2024). Estimated stand-off distance between ADS-B equipped aircraft and obstacles [Dataset]. http://doi.org/10.5281/zenodo.7741273
Explore at:
zip, jpeg (available download formats)
Unique identifier
https://doi.org/10.5281/zenodo.7741273
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Andrew Weinert
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Summary:

Estimated stand-off distance between ADS-B equipped aircraft and obstacles. Obstacle information was sourced from the FAA Digital Obstacle File and the FHWA National Bridge Inventory. Aircraft tracks were sourced from processed data curated from the OpenSky Network. Results are presented as histograms organized by aircraft type and distance away from runways.

Description:

For many aviation safety studies, aircraft behavior is represented using encounter models, which are statistical models of how aircraft behave during close encounters. They are used to provide a realistic representation of the range of encounter flight dynamics where an aircraft collision avoidance system would be likely to alert. These models currently are, and historically have been, limited to interactions between aircraft; they have not represented the specific interactions between obstacles and transponder-equipped aircraft. In response, we calculated the standoff distance between obstacles and ADS-B equipped manned aircraft.

For robustness, MIT LL calculated the standoff distance using two different datasets of manned aircraft tracks and two datasets of obstacles. This approach aligned with the foundational research used to support the ASTM F3442/F3442M-20 well clear criteria of 2000 feet laterally and 250 feet AGL vertically.

The two track datasets consisted of processed tracks of ADS-B equipped aircraft curated from the OpenSky Network. It is likely that rotorcraft were underrepresented in these datasets. There were also no considerations for aircraft equipped only with Mode C or not equipped with any transponders. The first dataset was used to train the v1.3 uncorrelated encounter models and is referred to as the “Monday” dataset. The second dataset is referred to as the “aerodrome” dataset and was used to train the v2.0 and v3.x terminal encounter models. The Monday dataset consisted of 104 Mondays across North America. The other dataset was based on observations at least 8 nautical miles within Class B, C, D aerodromes in the United States for the first 14 days of each month from January 2019 through February 2020. Prior to any processing, the datasets required 714 and 847 Gigabytes of storage. For more details on these datasets, please refer to “Correlated Bayesian Model of Aircraft Encounters in the Terminal Area Given a Straight Takeoff or Landing” and “Benchmarking the Processing of Aircraft Tracks with Triples Mode and Self-Scheduling.”

Two different datasets of obstacles were also considered. The first comprised point obstacles defined by the FAA Digital Obstacle File (DOF), covering structures of type antenna, lighthouse, meteorological tower (met), monument, sign, silo, spire (steeple), stack (chimney; industrial smokestack), transmission line tower (t-l tower), tank (water; fuel), tramway, utility pole (telephone pole, or pole of similar height, supporting wires), windmill (wind turbine), and windsock. Each obstacle was represented by a cylinder with the height reported by the DOF and a radius based on the reported horizontal accuracy. We did not consider the actual width and height of the structure itself. Additionally, we only considered obstacles at least 50 feet tall and marked as verified in the DOF.

The other obstacle dataset, termed “bridges,” was based on the identified bridges in the FAA DOF and additional information provided by the National Bridge Inventory (NBI). Due to the potential size and extent of bridges, it would not be appropriate to model them as point obstacles; however, the FAA DOF only provides a point location and no information about the size of the bridge. In response, we correlated the FAA DOF with the NBI, which provides information about the length of many bridges. Instead of sizing the simulated bridge based on horizontal accuracy, as with the point obstacles, the bridges were represented as circles with a radius equal to the length of the longest, nearest bridge from the NBI. A circle representation was required because neither the FAA DOF nor the NBI provided sufficient information about orientation to represent bridges as rectangular cuboids. Similar to the point obstacles, the height of the obstacle was based on the height reported by the FAA DOF. Accordingly, the analysis using the bridge dataset should be viewed as risk averse and conservative. It is possible that a manned aircraft was hundreds of feet away from an obstacle in actuality but the estimated standoff distance could be significantly less. Additionally, because all obstacles are represented with a fixed height, the potentially flat and low-level entrances of a bridge are assumed to have the same height as the tall bridge towers. The attached figure illustrates an example simulated bridge.

It would have been extremely computationally inefficient to calculate the standoff distance for all possible track points. Instead, we defined an encounter between an aircraft and an obstacle as occurring when an aircraft flying at 3069 feet AGL or less comes within 3000 feet laterally of any obstacle in a 60 second time interval. If the criteria were satisfied, then for that 60 second track segment we calculated the standoff distance to all nearby obstacles. Vertical separation was based on the MSL altitude of the track and the maximum MSL height of an obstacle.
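As a rough illustration of the encounter criteria above, the following Python sketch checks a 60 second track segment against cylindrical obstacles. It assumes a flat local coordinate frame in feet and measures lateral distance to the edge of the obstacle cylinder; both choices, and all function and variable names, are assumptions made for this example rather than the authors' implementation.

```python
import math

MAX_ALT_AGL_FT = 3069.0   # altitude ceiling from the encounter definition
MAX_LATERAL_FT = 3000.0   # lateral range from the encounter definition


def lateral_standoff_ft(ac_x, ac_y, ob_x, ob_y, ob_radius_ft):
    """Horizontal distance from the aircraft to the obstacle cylinder edge (never negative)."""
    return max(0.0, math.hypot(ac_x - ob_x, ac_y - ob_y) - ob_radius_ft)


def vertical_separation_ft(ac_alt_msl_ft, ob_top_msl_ft):
    """Aircraft MSL altitude minus the maximum MSL height of the obstacle."""
    return ac_alt_msl_ft - ob_top_msl_ft


def is_encounter(segment, obstacles):
    """segment: (x_ft, y_ft, alt_agl_ft) points from one 60 s interval.
    obstacles: (x_ft, y_ft, radius_ft) cylinders.
    True if any point satisfies both the altitude and lateral criteria."""
    return any(
        alt_agl <= MAX_ALT_AGL_FT
        and lateral_standoff_ft(x, y, ox, oy, r) <= MAX_LATERAL_FT
        for (x, y, alt_agl) in segment
        for (ox, oy, r) in obstacles
    )
```

Only segments for which `is_encounter` returns True would then need the full standoff calculation against nearby obstacles, which is the computational saving the description refers to.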

For each combination of aircraft track and obstacle datasets, the results were organized seven different ways. Filtering criteria were based on aircraft type and distance away from runways. Runway data was sourced from the FAA runways of the United States, Puerto Rico, and Virgin Islands open dataset. Aircraft type was identified as part of the em-processing-opensky workflow.

All: No filter, all observations that satisfied encounter conditions

nearRunway: Aircraft within or at 2 nautical miles of a runway

awayRunway: Observations more than 2 nautical miles from a runway

glider: Observations when aircraft type is a glider

fwme: Observations when aircraft type is a fixed-wing multi-engine

fwse: Observations when aircraft type is a fixed-wing single engine

rotorcraft: Observations when aircraft type is a rotorcraft
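The seven groupings listed above can be expressed as a small labeling function. This is a minimal sketch: the function name and the aircraft type strings passed in are hypothetical, and the 2 nautical mile boundary is treated as inclusive for nearRunway, following the "within or at" wording.

```python
NEAR_RUNWAY_NM = 2.0
TYPE_FILTERS = {"glider", "fwme", "fwse", "rotorcraft"}


def filters_for(aircraft_type, runway_distance_nm):
    """Return the result groupings that one observation would fall into."""
    labels = ["All"]  # every observation that met the encounter conditions
    if runway_distance_nm <= NEAR_RUNWAY_NM:
        labels.append("nearRunway")
    else:
        labels.append("awayRunway")
    if aircraft_type in TYPE_FILTERS:
        labels.append(aircraft_type)
    return labels
```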

License

This dataset is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).

This license requires that reusers give credit to the creator. It allows reusers to copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only. Only noncommercial use of your work is permitted. Noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation. Exceptions are given for the not for profit standards organizations of ASTM International and RTCA.

MIT is releasing this dataset in good faith to promote open and transparent research of the low altitude airspace. Given the limitations of the dataset and a need for more research, a more restrictive license was warranted. Namely, it is based only on observations of ADS-B equipped aircraft, which not all aircraft in the airspace are required to employ; and observations were sourced from a crowdsourced network whose surveillance coverage has not been robustly characterized.

As more research is conducted and the low altitude airspace is further characterized or regulated, it is expected that a future version of this dataset may have a more permissive license.

Distribution Statement

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.

© 2021 Massachusetts Institute of Technology.

Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

This material is based upon work supported by the Federal Aviation Administration under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration.

This document is derived from work done for the FAA (and possibly others); it is not the direct product of work done for the FAA. The information provided herein may include content supplied by third parties. Although the data and information contained herein has been produced or processed from sources believed to be reliable, the Federal Aviation Administration makes no warranty, expressed or implied, regarding the accuracy, adequacy, completeness, legality, reliability or usefulness of any information, conclusions or recommendations provided herein. Distribution of the information contained herein does not constitute an endorsement or warranty of the data or information provided herein by the Federal Aviation Administration or the U.S. Department of Transportation. Neither the Federal Aviation Administration nor the U.S. Department of
RapidScat Level 2B Climate Ocean Wind Vectors in 12.5km Footprints
podaac.jpl.nasa.gov
data.nasa.gov
+3 more
html
Updated May 6, 2016
Cite
PO.DAAC (2016). RapidScat Level 2B Climate Ocean Wind Vectors in 12.5km Footprints [Dataset]. http://doi.org/10.5067/RSX12-L2C11
Explore at:
html (available download formats)
Unique identifier
https://doi.org/10.5067/RSX12-L2C11
Dataset updated
May 6, 2016
Dataset provided by
PO.DAAC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
SURFACE WINDS
Description
This dataset contains the RapidScat Level 2B 12.5km Version 1.0 Climate quality ocean surface wind vectors. The Level 2B wind vectors are binned on a 12.5 km Wind Vector Cell (WVC) grid and processed using the "full aperture" normalized radar cross-section (NRCS, a.k.a. Sigma-0) from the L1B dataset. RapidScat is a Ku-band dual beam circular rotating scatterometer retaining much of the same hardware and functionality of QuikSCAT, with the exception of the antenna sub-system and digital interface to the International Space Station (ISS) Columbus module, which is where RapidScat is mounted. The NASA mission is officially referred to as ISS-RapidScat. Unlike QuikSCAT, ISS-RapidScat is not in sun-synchronous orbit, and flies at roughly half the altitude with a low inclination angle that restricts data coverage to the tropics and mid-latitude regions; the extent of latitudinal coverage stretches from approximately 61 degrees North to 61 degrees South. Furthermore, there is no consistent local time of day retrieval. This dataset is provided in a netCDF-3 file format that follows the netCDF-4 classic model (i.e., generated by the netCDF-4 API) and made available via Direct Download and OPeNDAP. For data access, please click on the "Data Access" tab above. This climate quality data set differs from the nominal "slice" L2B dataset as follows: 1) it uses full antenna footprint measurements (~20-km) without subdividing by range (~7-km) and 2) the absolute calibration has been modified for the two different low signal-to-noise ratio (SNR) mode data sets: LowSNR1 14 August 2015 to 18 September 2015; LowSNR2 6 October 2015 to 7 February 2016. The above enhancements allow this dataset to provide consistent calibration across all SNR states. Low SNR periods and other key quality control (QC) issues are tracked and kept up-to-date in PO.DAAC Drive at https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-docs/rapidscat/open/L1B/docs/revtime.csv. 
If you have any questions, please visit our user forums: https://podaac.jpl.nasa.gov/forum/.
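The two low-SNR periods quoted in the description can be encoded as a small lookup, which may be handy when grouping observations by calibration state. This is a minimal sketch: the function name is hypothetical and the end dates are assumed to be inclusive, which the description does not state. The revtime.csv file referenced above remains the authoritative record.

```python
from datetime import date

# Low-SNR periods as listed in the dataset description (end dates assumed inclusive).
LOW_SNR_PERIODS = {
    "LowSNR1": (date(2015, 8, 14), date(2015, 9, 18)),
    "LowSNR2": (date(2015, 10, 6), date(2016, 2, 7)),
}


def low_snr_state(day):
    """Return the low-SNR period name containing `day`, or None if outside both."""
    for name, (start, end) in LOW_SNR_PERIODS.items():
        if start <= day <= end:
            return name
    return None
```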
A geometric shape regularity effect in the human brain: fMRI dataset
openneuro.org
Updated Mar 14, 2025
Cite
Mathias Sablé-Meyer; Lucas Benjamin; Cassandra Potier Watkins; Chenxi He; Maxence Pajot; Théo Morfoisse; Fosca Al Roumi; Stanislas Dehaene (2025). A geometric shape regularity effect in the human brain: fMRI dataset [Dataset]. http://doi.org/10.18112/openneuro.ds006010.v1.0.1
Explore at:
Unique identifier
https://doi.org/10.18112/openneuro.ds006010.v1.0.1
Dataset updated
Mar 14, 2025
Dataset provided by
OpenNeuro (https://openneuro.org/)
Authors
Mathias Sablé-Meyer; Lucas Benjamin; Cassandra Potier Watkins; Chenxi He; Maxence Pajot; Théo Morfoisse; Fosca Al Roumi; Stanislas Dehaene
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
A geometric shape regularity effect in the human brain: fMRI dataset

Authors:

Mathias Sablé-Meyer*

Lucas Benjamin

Cassandra Potier Watkins

Chenxi He

Maxence Pajot

Théo Morfoisse

Fosca Al Roumi

Stanislas Dehaene

*Corresponding author: mathias.sable-meyer@ucl.ac.uk

Abstract

The perception and production of regular geometric shapes is a characteristic trait of human cultures since prehistory, whose neural mechanisms are unknown. Behavioral studies suggest that humans are attuned to discrete regularities such as symmetries and parallelism, and rely on their combinations to encode regular geometric shapes in a compressed form. To identify the relevant brain systems and their dynamics, we collected functional MRI and magnetoencephalography data in both adults and six-year-olds during the perception of simple shapes such as hexagons, triangles and quadrilaterals. The results revealed that geometric shapes, relative to other visual categories, induce a hypoactivation of ventral visual areas and an overactivation of the intraparietal and inferior temporal regions also involved in mathematical processing, whose activation is modulated by geometric regularity. While convolutional neural networks captured the early visual activity evoked by geometric shapes, they failed to account for subsequent dorsal parietal and prefrontal signals, which could only be captured by discrete geometric features or by more advanced transformer models of vision. We propose that the perception of abstract geometric regularities engages an additional symbolic mode of visual perception.

Notes about this dataset

We separately share the MEG dataset at https://openneuro.org/datasets/ds006012. Below are some notes about the fMRI dataset of N=20 adult participants (sub-2xx, numbers between 204 and 223), and N=22 children (sub-3xx, numbers between 301 and 325).

The code for the analyses is provided at https://github.com/mathias-sm/AGeometricShapeRegularityEffectHumanBrain
However, the analyses work from already preprocessed data. Since there is no custom code per se for the preprocessing, I have not included it in the repository. To preprocess the data as was done in the published article, here is the command and software information:

fMRIPrep version: 20.0.5

fMRIPrep command: /usr/local/miniconda/bin/fmriprep /data /out participant --participant-label <label> --output-spaces MNI152NLin6Asym:res-2 MNI152NLin2009cAsym:res-2

Defacing has been performed with bidsonym running the pydeface masking, and nobrainer brain registration pipeline.
The published analyses have been performed on the non-defaced data. I have checked for data quality on all participants after defacing. In specific cases, I may be able to request the permission to share the original, non-defaced dataset.

sub-325 was acquired by a different experimenter and defaced before being shared with the rest of the research team, hence the slightly different defacing mask. That participant was also preprocessed separately, using a more recent fMRIPrep version: 20.2.6.

The data associated with the children has a few missing files. Notably:

sub-313 and sub-316 are missing one run of the localizer each

sub-316 has no data at all for the geometry

sub-308 has no usable data for the intruder task

Since all of these still have some data to contribute to either task, all available files were kept in this dataset. The analysis code reflects these inconsistencies where required with specific exceptions.
Replication Data for: Integrating online data collection in a household...
demo-b2find.dkrz.de
Updated May 28, 2021
Cite
(2021). Replication Data for: Integrating online data collection in a household panel study: effects on second-wave participation - Dataset - B2FIND [Dataset]. http://demo-b2find.dkrz.de/dataset/f3984387-c555-5198-86f1-daa3cd0a0cc4
Explore at:
Dataset updated
May 28, 2021
Description
Received wisdom in survey practice suggests that using web mode in the first wave of a panel study is not as effective as using interviewers. Based on data from a two-wave mode experiment for the Swiss Household Panel (SHP), this study examines how the use of online data collection in the first wave affects participation in the second wave, and if so, who is affected. The experiment compared the traditional SHP design of telephone interviewing to a mixed-mode design combining a household questionnaire by telephone with individual questionnaires by web and to a web-only design for the household and individual questionnaires. We looked at both participation of the household reference person (HRP) and of all household members in multi-person households. We find no support for a higher dropout at wave 2 of HRPs who followed the mixed-mode protocol or who participated online. Neither do we find much evidence that the association between mode and dropout varies by socio-demographic characteristics. The only exception was that of higher dropout rates among HRPs of larger households in the telephone group, compared to the web-only group. Moreover, the mixed-mode and web-only designs were more successful than the telephone design in enrolling and keeping all eligible household members in multi-person households in the study. In conclusion, the results suggest that using web mode (whether alone or combined with telephone) when starting a new panel shows no clear disadvantage with respect to second wave participation compared with telephone interviews.
Physicians Actively Working by Specialty and Activity Hours
catalog.data.gov
data.chhs.ca.gov
+2 more
Updated Aug 23, 2025
Cite
Department of Health Care Access and Information (2025). Physicians Actively Working by Specialty and Activity Hours [Dataset]. https://catalog.data.gov/dataset/physicians-actively-working-by-specialty-and-activity-hours
Explore at:
Dataset updated
Aug 23, 2025
Dataset provided by
Department of Health Care Access and Information
Description
These datasets contain aggregated responses from the HCAI Health Workforce License Renewal Survey for Physician and Surgeon and Osteopathic Physician and Surgeon licensees. The data are limited to physicians whose license was in an Active Status and who indicated they were actively working in a position that required their license as of April 3, 2025. The license renewal survey utilizes a cell-based weighting methodology to estimate the total count of licensed individuals and accounts for individuals that decline to provide a response or who have not yet taken the survey. The presented estimated counts were calculated by multiplying county level weighted percentages for each metric by the total count of active licenses within each county. If no physicians within a given specialty and county provided a response to the activity hours question, the statewide mode for that specialty was used. Note: Previous versions of this dataset utilized raw counts rather than a cell-based weighting approach and did not take into account whether physicians were actively working in their field. In addition, the previous version of this dataset (as of April 3, 2024) contained errors regarding the total number of “Unsurveyed” individuals within each county. For more information regarding this issue, please contact the Workforce Data team using the email address below.
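The estimation step described above (county-level weighted percentage multiplied by the county's total active licenses, with a statewide fallback when no one in a county answered the activity hours question) can be sketched as follows. The function names and example figures are hypothetical; the actual cell-based weighting that produces the percentages is not reproduced here.

```python
def estimated_count(weighted_pct, active_licenses):
    """Estimated count = county-level weighted percentage x total active licenses in the county."""
    return weighted_pct / 100.0 * active_licenses


def activity_hours_or_fallback(county_value, statewide_mode):
    """Use the statewide mode for the specialty when the county has no responses (None)."""
    return statewide_mode if county_value is None else county_value
```

For instance, a weighted share of 25% in a county with 1200 active licenses would yield an estimated count of 300.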
wikipedia-small-3000-embedded
huggingface.co
Updated Apr 6, 2024
Cite
Not Lain (2024). wikipedia-small-3000-embedded [Dataset]. https://huggingface.co/datasets/not-lain/wikipedia-small-3000-embedded
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 6, 2024
Authors
Not Lain
License
https://choosealicense.com/licenses/gfdl/
Description
This is a subset of the wikimedia/wikipedia dataset. Code for creating this dataset:

from datasets import load_dataset, Dataset
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# load dataset in streaming mode (no download and it's fast)
dataset = load_dataset("wikimedia/wikipedia", "20231101.en", split="train", streaming=True)

# select 3000 samples
from tqdm import tqdm
data = Dataset.from_dict({})
for i, entry in…

See the full description on the dataset page: https://huggingface.co/datasets/not-lain/wikipedia-small-3000-embedded.
MMS 4 Electron Drift Instrument (EDI) Quality 0 Counts, Level 2 (L2), Burst...
data.nasa.gov
Updated Aug 21, 2025
Cite
nasa.gov (2025). MMS 4 Electron Drift Instrument (EDI) Quality 0 Counts, Level 2 (L2), Burst Mode, 7.8125 ms Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/mms-4-electron-drift-instrument-edi-quality-0-counts-level-2-l2-burst-mode-7-8125-ms-data
Explore at:
Dataset updated
Aug 21, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
Electron Drift Instrument (EDI) Q0 Burst Survey, Level 2, 0.0078125 s Data (128 samples/s). EDI has two scientific data acquisition modes, called electric field mode and ambient mode. In electric field mode, two coded electron beams are emitted such that they return to the detectors after one or more gyrations in the ambient magnetic and electric field. The firing directions and times-of-flight allow the derivation of the drift velocity and electric field. In ambient mode, the electron beams are not used. The detectors with their large geometric factors and their ability to adjust the field of view quickly allow continuous sampling of ambient electrons at a selected pitch angle and fixed but selectable energy. To find the beam directions that will hit the detector, EDI sweeps each beam in the plane perpendicular to B at a fixed angular rate of 0.22 °/ms until a signal has been acquired by the detector. Once signal has been acquired, the beams are swept back and forth to stay on target. Beam detection is not determined from the changes in the count-rates directly, but from the square of the beam counts divided by the background counts from ambient electrons, i.e., from the square of the instantaneous signal-to-noise ratio (SNR). This quantity is computed from data provided by the correlator in the Gun-Detector Electronics that also generates the coding pattern imposed on the outgoing beams. If the squared SNR ratio exceeds a threshold, this is taken as evidence that the beam is returning to the detector. The thresholds for SNR are chosen dependent on background fluxes. They represent a compromise between getting false hits (induced by strong variations in background electron fluxes) and missing true beam hits. The basic software loop that controls EDI operations is executed every 2 ms. As the times when the beams hit their detectors are neither synchronized with the telemetry nor equidistant, EDI data have no fixed time-resolution. 
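The beam-detection criterion described above (squared beam counts divided by background counts, compared against a threshold) can be sketched as follows. This is an illustration of the stated formula only, not the flight software; the function names, the count values, and the threshold are hypothetical.

```python
# Sketch of the squared-SNR test, assuming SNR^2 = beam_counts^2 / background_counts
# as stated in the description. Threshold and counts are made-up examples.

def squared_snr(beam_counts: float, background_counts: float) -> float:
    """Square of the instantaneous signal-to-noise ratio."""
    if background_counts <= 0:
        raise ValueError("background_counts must be positive")
    return beam_counts ** 2 / background_counts

def beam_detected(beam_counts: float, background_counts: float, threshold: float) -> bool:
    """True when the squared SNR exceeds the chosen threshold."""
    return squared_snr(beam_counts, background_counts) > threshold

# Hypothetical readings against the same ambient background and threshold.
print(beam_detected(30, 100, threshold=5.0))
print(beam_detected(10, 100, threshold=5.0))
```

Raising the threshold trades false hits for missed true hits, which is the compromise the description mentions when background fluxes vary.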
Data are reported in telemetry slots. In Survey, using the standard packing mode 0, there are eight telemetry slots per second per Gun Detector Unit (GDU). The last beam detected during the previous slot will be reported in the current slot. If no beam has been detected, the data quality will be set to zero. In Burst telemetry there are 128 slots per second per GDU. The data in each slot consist of information regarding the beam firing directions (stored in the form of analytic gun deflection voltages), times-of-flight (if successfully measured), quality indicators, time stamps of the beam hits, and some auxiliary correlator-related information. Whenever EDI is not in electron drift mode, it uses its ambient electron mode. The mode has the capability to sample at either 90 degrees pitch angle or at 0/180 degrees (field aligned), or to alternate between 90 degrees and field aligned with selectable dwell times. While all options have been demonstrated during the commissioning phase, only the field aligned mode has been used in the routine operations phase. The choices for energy are 250 eV, 500 eV, and 1 keV. The two detectors, which face opposite hemispheres, look in strictly opposite directions, so while one detector is looking along B the other is looking antiparallel to B (corresponding to pitch angles of 180 and 0 degrees, respectively). The two detectors switch roles every half spin of the spacecraft as the tip of the magnetic field vector spins outside the field of view of one detector and into the field of view of the other detector. These data are a by-product generated from data collected in electric field mode. Whenever the flight software finds no return beam in a particular time slot, the data reported for that slot are flagged with the lowest quality level (quality zero). The ground processing generates a separate data product from these counts data.
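The slot rates quoted above translate directly into the time resolution in the dataset title. A minimal arithmetic sketch, where the helper name is hypothetical:

```python
def slot_duration_ms(slots_per_second: int) -> float:
    """Length of one telemetry slot in milliseconds."""
    return 1000.0 / slots_per_second

burst_ms = slot_duration_ms(128)  # Burst telemetry: 128 slots per second
survey_ms = slot_duration_ms(8)   # Survey, packing mode 0: 8 slots per second

print(burst_ms, survey_ms)
```

The Burst value matches the 7.8125 ms (0.0078125 s) cadence stated for this product.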
The EDI instrument paper can be found at: http://link.springer.com/article/10.1007%2Fs11214-015-0182-7. The EDI instrument data products guide can be found at https://lasp.colorado.edu/mms/sdc/public/datasets/fields/.
Data from: Preclinical PET data
zenodo.org
data.niaid.nih.gov
zip
Updated Apr 22, 2021
Cite
Ville-Veikko Wettenhovi; Kimmo Jokivarsi (2021). Preclinical PET data [Dataset]. http://doi.org/10.5281/zenodo.3528056
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3528056
Dataset updated
Apr 22, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ville-Veikko Wettenhovi; Kimmo Jokivarsi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
An open preclinical PET dataset. This dataset has been measured with the preclinical Siemens Inveon PET machine. The measured target is a (naive) rat with an injected dose of 21.4 MBq of FDG. The injection was done intravenously (IV) to the tail vein. No specific organ was investigated, but rather the glucose metabolism as a whole. The examination is a 60 minute dynamic acquisition. The measurement was conducted according to the ethical standards set by the University of Eastern Finland.

The dataset contains the original list-mode data, the (dynamic) sinogram created by the Siemens Inveon Acquisition Workplace (IAW) software (28 frames), the (dynamic) scatter sinogram created by the IAW software (28 frames), the attenuation sinogram created by the IAW software and the normalization coefficients created by the IAW software. Header files are included for all the different data files.

For documentation on reading the list-mode binary data, please ask Siemens.

This dataset can be used in the OMEGA software, including the list-mode data, to import the data to MATLAB/Octave, create sinograms from the list-mode data and reconstruct the imported data. For help on using the dataset with OMEGA, see the wiki.
The Dutch Virtual Census of 2001 - IPUMS Subset - Netherlands
microdata.worldbank.org
catalog.ihsn.org
Updated Aug 1, 2025
Cite
Central Bureau of Statistics (Statistics Netherlands) (2025). The Dutch Virtual Census of 2001 - IPUMS Subset - Netherlands [Dataset]. https://microdata.worldbank.org/index.php/catalog/2102
Explore at:
Dataset updated
Aug 1, 2025
Dataset provided by
IPUMS
Central Bureau of Statistics (Statistics Netherlands)
Time period covered
2001
Area covered
Netherlands
Description
Analysis unit

Persons. Persons are not organized into households; age is grouped into categories; virtual census.

UNITS IDENTIFIED:
- Dwellings: no
- Vacant units: no
- Households: no
- Individuals: yes
- Group quarters: no

UNIT DESCRIPTIONS:
- Dwellings: no
- Households: Individuals living in the same dwelling and sharing at least one meal.
- Group quarters: Group of persons who share a common roof and food because of work, health, religion, etc.

Universe

The entire population of the country: 15,985,538 persons. Microdata are available for 1.19 % of the population, but exclude the institutional population.

Sampling procedure

MICRODATA SOURCE: Central Bureau of Statistics (Statistics Netherlands)

SAMPLE SIZE (person records): 189725.

SAMPLE DESIGN: 1% sample of the total population, consisting of records of persons prevailing in most sources
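The sample fraction follows from the two figures given in this record (population under Universe, person records under Sampling procedure). A quick arithmetic check, using only those two numbers:

```python
population = 15_985_538   # total population stated under "Universe"
sample_records = 189_725  # person records stated under "Sampling procedure"

fraction_pct = sample_records / population * 100
print(f"{fraction_pct:.2f} % of the population")
```

The result rounds to the 1.19 % coverage quoted in the record, slightly above the nominal 1% sample design.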

Mode of data collection

Face-to-face [f2f]

Research instrument

Dependent on source: register or survey
