MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides insights into the Indian developer community on GitHub, one of the world's largest platforms for developers to collaborate, share, and contribute to open-source projects. Whether you're interested in analyzing trends, understanding community growth, or identifying popular programming languages, this dataset offers a comprehensive look at the profiles of GitHub users from India.
The dataset includes anonymized profile information for a diverse range of GitHub users based in India. Key features include:
- Username: Unique identifier for each user (anonymized)
- Location: City or region within India
- Programming Languages: Most commonly used languages per user
- Repositories: Public repositories owned and contributed to
- Followers and Following: Social network connections within the platform
- GitHub Join Date: Date the user joined GitHub
- Organizations: Affiliated organizations (if publicly available)
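As a quick illustration of how these fields might be explored with pandas, the sketch below counts the most commonly reported languages. The file name indian_github_users.csv and the column layout (a comma-separated "Programming Languages" column) are assumptions made for this example and are not confirmed by the dataset itself.

import pandas as pd

# Hypothetical file and column names, inferred from the feature list above
profiles = pd.read_csv('indian_github_users.csv')

# Count the most commonly reported programming languages
top_languages = profiles['Programming Languages'].str.split(',').explode().str.strip().value_counts().head(10)
print(top_languages)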
This dataset is curated from publicly available GitHub profiles with a specific focus on Indian users. It is inspired by the need to understand the growth of the tech ecosystem in India, including the languages, tools, and topics that are currently popular among Indian developers. This dataset aims to provide valuable insights for recruiters, data scientists, and anyone interested in the open-source contributions of Indian developers.
This dataset is perfect for:
- Data scientists looking to explore and visualize developer trends
- Recruiters interested in talent scouting within the Indian tech ecosystem
- Tech enthusiasts who want to explore the dynamics of India's open-source community
- Students and educators looking for real-world data to practice analysis and modeling
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
With the ongoing energy transition, power grids are evolving fast. They operate more and more often close to their technical limits, under increasingly volatile conditions. Fast, essentially real-time computational approaches to evaluate their operational safety, stability, and reliability are therefore highly desirable. Machine Learning methods have been advocated to solve this challenge; however, they are heavy consumers of training and testing data, while historical operational data for real-world power grids are hard, if not impossible, to access.
This dataset contains long time series for production, consumption, and line flows, amounting to 20 years of data with a time resolution of one hour, for several thousand loads and several hundred generators of various types representing the ultra-high-voltage transmission grid of continental Europe. The synthetic time series have been statistically validated against real-world data.
The algorithm is described in a Nature Scientific Data paper. It relies on the PanTaGruEl model of the European transmission network -- the admittance of its lines as well as the location, type and capacity of its power generators -- and aggregated data gathered from the ENTSO-E transparency platform, such as power consumption aggregated at the national level.
The network information is encoded in the file europe_network.json. It is given in PowerModels format, which is itself derived from MatPower and compatible with PandaPower. The network features 7822 power lines and 553 transformers connecting 4097 buses, to which are attached 815 generators of various types.
The time series forming the core of this dataset are given in CSV format. Each CSV file is a table with 8736 rows, one for each hourly time step of a 364-day year. All years are truncated to exactly 52 weeks of 7 days, and start on a Monday (the load profiles are typically different during weekdays and weekends). The number of columns depends on the type of table: there are 4097 columns in load files, 815 for generators, and 8375 for lines (including transformers). Each column is described by a header corresponding to the element identifier in the network file. All values are given in per-unit, both in the model file and in the tables, i.e. they are multiples of a base unit taken to be 100 MW.
There are 20 tables of each type, labeled with a reference year (2016 to 2020) and an index (1 to 4), zipped into archive files arranged by year. This amounts to a total of 20 years of synthetic data. When using loads, generators, and lines profiles together, it is important to use the same label: for instance, the files loads_2020_1.csv, gens_2020_1.csv, and lines_2020_1.csv represent the same year of the dataset, whereas gens_2020_2.csv is unrelated (it actually shares some features, such as nuclear profiles, but it is based on a dispatch with distinct loads).
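As a minimal sketch of this convention, the snippet below loads the load, generator, and line tables sharing the label 2020_1 and converts the per-unit values to MW using the 100 MW base stated above; it assumes the CSV files have already been extracted from the yearly archives into the working directory.

import pandas as pd

# Tables sharing the same label describe the same synthetic year
loads_pu = pd.read_csv('loads_2020_1.csv')
gens_pu = pd.read_csv('gens_2020_1.csv')
lines_pu = pd.read_csv('lines_2020_1.csv')

# All values are per-unit with a 100 MW base
BASE_MW = 100
gens_mw = gens_pu * BASE_MW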
The time series can be used without a reference to the network file, simply using all or a selection of columns of the CSV files, depending on the needs. We show below how to select series from a particular country, or how to aggregate hourly time steps into days or weeks. These examples use Python and the data analysis library pandas, but other frameworks can be used as well (Matlab, Julia). Since all the yearly time series are periodic, it is always possible to define a coherent time window modulo the length of the series.
This example illustrates how to select generation data for Switzerland in Python. This can be done without parsing the network file, by instead using gens_by_country.csv, which contains the list of generators for each country in the network. We start by importing the pandas library, and read the column of the file corresponding to Switzerland (country code CH):
import pandas as pd
CH_gens = pd.read_csv('gens_by_country.csv', usecols=['CH'], dtype=str)
The object created in this way is a DataFrame with some null values (not all countries have the same number of generators). It can be turned into a list with:
CH_gens_list = CH_gens.dropna().squeeze().to_list()
Finally, we can import all the time series of Swiss generators from a given data table with
pd.read_csv('gens_2016_1.csv', usecols=CH_gens_list)
The same procedure can be applied to loads using the list contained in the file loads_by_country.csv.
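For instance, the Swiss loads can be selected in exactly the same way (again using country code CH, and a load table following the naming convention described above):

CH_loads_list = pd.read_csv('loads_by_country.csv', usecols=['CH'], dtype=str).dropna().squeeze().to_list()
CH_loads = pd.read_csv('loads_2016_1.csv', usecols=CH_loads_list)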
This second example shows how to change the time resolution of the series. Suppose that we are interested in all the loads from a given table, which are given by default with a one-hour resolution:
hourly_loads = pd.read_csv('loads_2018_3.csv')
To get a daily average of the loads, we can use:
daily_loads = hourly_loads.groupby([t // 24 for t in range(24 * 364)]).mean()
This results in series of length 364. To average further over entire weeks and get series of length 52, we use:
weekly_loads = hourly_loads.groupby([t // (24 * 7) for t in range(24 * 364)]).mean()
The code used to generate the dataset is freely available at https://github.com/GeeeHesso/PowerData. It consists of two packages and several documentation notebooks. The first package, written in Python, provides functions to handle the data and to generate synthetic series based on historical data. The second package, written in Julia, is used to perform the optimal power flow. The documentation in the form of Jupyter notebooks contains numerous examples on how to use both packages. The entire workflow used to create this dataset is also provided, starting from raw ENTSO-E data files and ending with the synthetic dataset given in the repository.
This work was supported by the Cyber-Defence Campus of armasuisse and by an internal research grant of the Engineering and Architecture domain of HES-SO.
2019 Novel Coronavirus COVID-19 (2019-nCoV) Visual Dashboard and Map:
https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Downloadable data:
https://github.com/CSSEGISandData/COVID-19
Additional Information about the Visual Dashboard:
https://systems.jhu.edu/research/public-health/ncov
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents a set of large-scale Dial-a-Ride Problem (DARP) instances. The instances were created as a standardized set of ridesharing DARP problems for the purpose of benchmarking and comparing different solution methods.
The instances are based on real demand and realistic travel time data from 3 different US cities: Chicago, New York City, and Washington, DC. The instances consist of real travel requests from the selected period, positions of vehicles with their capacities, and realistic shortest travel times between all pairs of locations in each city.
The instances and results of two solution methods, the Insertion Heuristic and the optimal Vehicle-group Assignment method, can be found in the dataset.
Paper: arXiv:2305.18859 | Data: DOI:10.5281/zenodo.7986103 | Code: https://github.com/aicenter/Ridesharing_DARP_instances
The dataset was presented at the IEEE International Conference on Intelligent Transportation Systems (ITSC 2023) in Bilbao, Bizkaia, Spain, 24-28 September 2023 (Session CON03)
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Covid19Kerala.info-Data is a consolidated multi-source open dataset of metadata from the COVID-19 outbreak in the Indian state of Kerala. It is created and maintained by volunteers of the 'Collective for Open Data Distribution-Keralam' (CODD-K), a nonprofit consortium of individuals formed for the distribution and longevity of open datasets. Covid19Kerala.info-Data covers a set of correlated temporal and spatial metadata of SARS-CoV-2 infections and prevention measures in Kerala. Static snapshot releases of this dataset are manually produced from a live database maintained as a set of publicly accessible Google sheets. This dataset is made available under the Open Data Commons Attribution License v1.0 (ODC-BY 1.0).
Schema and Data Package
A datapackage with the schema definition is accessible at https://codd-k.github.io/covid19kerala.info-data/datapackage.json. The provided datapackage and schema are based on the Frictionless Data Package specification.
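As a brief sketch, the published descriptor can be inspected in Python with the frictionless library (assuming it has been installed, e.g. with pip install frictionless); the resource names and fields printed below depend entirely on the published datapackage:

from frictionless import Package

# Load the published datapackage descriptor
pkg = Package('https://codd-k.github.io/covid19kerala.info-data/datapackage.json')

# List the resources (data facets) and their schema fields
for resource in pkg.resources:
    print(resource.name, [field.name for field in resource.schema.fields])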
Temporal and Spatial Coverage
This dataset covers the COVID-19 outbreak and related data from the state of Kerala, India, from January 31, 2020 until the date of publication of this snapshot. The dataset shall be maintained throughout the entirety of the COVID-19 outbreak.
The spatial coverage of the data lies within the geographical boundaries of the Kerala state, which includes its 14 administrative subdivisions. The state is further divided into Local Self Governing (LSG) Bodies. References to this spatial information are included on the appropriate data facets. Spatial information on regions outside Kerala is mentioned where available, but only as a reference to the possible origins of infection clusters or the movement of individuals.
Longevity and Provenance
The dataset snapshot releases are published and maintained in a designated GitHub repository maintained by the CODD-K team. Periodic snapshots from the live database will be released at regular intervals. The GitHub commit logs for the repository will be maintained as a record of provenance, and an archived repository will be maintained at the end of the project lifecycle for the longevity of the dataset.
Data Stewardship
CODD-K expects all administrators, managers, and users of its datasets to manage, access, and utilize them in a manner that is consistent with the consortium's need for security and confidentiality and with the relevant legal frameworks in all geographies, especially Kerala and India. As a responsible steward that maintains and makes this dataset accessible, CODD-K absolves itself of all liability for damages, if any, caused by inaccuracies in the dataset.
License
This dataset is made available by the CODD-K consortium under the ODC-BY 1.0 license. The Open Data Commons Attribution License (ODC-By) v1.0 ensures that users of this dataset are free to copy, distribute, and use the dataset to produce works, and even to modify, transform, and build upon the database, as long as they attribute any public use of the database, or of works produced from it, as mentioned in the citation below.
Disclaimer
Covid19Kerala.info-Data is provided under the ODC-BY 1.0 license as-is. Though every attempt is made to ensure that the data is error-free and up to date, the CODD-K consortium does not bear any responsibility for inaccuracies in the dataset or any losses, monetary or otherwise, that users of this dataset may incur.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset accompanies the submission "Generating representative, live network traffic out of millions of code repositories" at HotNets'22: The 21st ACM Workshop on Hot Topics in Networks. Please see the following files:
- list_of_github_repositories.txt for a list of GitHub repositories that we found containing a docker-compose*.yml file
- list_of_executed_repositories.csv for more detailed information on the success of capturing traffic with the specific orchestration files found in ~67% of the repositories

If you use our dataset, please cite our work as follows: Tobias Bühler, Roland Schmid, Sandro Lutz, and Laurent Vanbever. 2022. Generating representative, live network traffic out of millions of code repositories. In The 21st ACM Workshop on Hot Topics in Networks (HotNets '22), November 14-15, 2022, Austin, TX, USA. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3563766.3564084
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Real-time data from the Vienna public transport lines (Wiener Linien) is processed in Node-RED and sent via MQTT to an ESP32. The content is displayed on a 0.91" OLED display. Instructions for the evaluation in Node-RED and the implementation of the ESP32 code: Part 1: Part 2: Wiring and code can be found at github.com/pixeledi. Have fun replicating!
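As a rough sketch of the MQTT leg of this setup, the Python snippet below subscribes to the messages published by the Node-RED flow so they can be inspected on a PC; the broker address and topic are placeholders (the actual values depend on the flow), and the snippet assumes the paho-mqtt 1.x client API.

import paho.mqtt.client as mqtt

# Placeholder broker and topic; adjust to match the Node-RED flow
BROKER = "192.168.1.10"
TOPIC = "wienerlinien/departures"

def on_message(client, userdata, msg):
    # The ESP32 subscribes to the same topic and renders the payload on the OLED
    print(msg.topic, msg.payload.decode())

client = mqtt.Client()  # assumes paho-mqtt 1.x; newer versions require a CallbackAPIVersion argument
client.on_message = on_message
client.connect(BROKER)
client.subscribe(TOPIC)
client.loop_forever()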
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains various co-eruptive VT datasets collected during the 2021 eruption at La Palma, Spain, by an underwater High-fidelity Distributed Acoustic Sensing (HDAS) array. These datasets have been used to train and test the RNN-DAS model, a deep learning framework designed for volcano-seismic event detection using DAS data.
This data was collected as part of the DigiVolCan project, which is a collaboration between the University of Granada, the Canary Islands Volcanological Institute (INVOLCAN), the Institute of Technological and Renewable Energies (ITER), the University of La Laguna, and Aragón Photonics. It is funded by the Ministry of Science, Innovation, and Universities / State Research Agency (MICIU/AEI) of Spain and the European Union through the Recovery, Transformation, and Resilience Plan, Next Generation EU Funds. The project reference is PLEC2022-009271, funded by MICIU/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.
The shared dataset contains HDAS data recorded over several periods, with one file per minute. Each file is in .h5 format and follows the structure:
file_path
└── "data" (dataset)
    ├── data (2D matrix of strain rate, [channels x time_samples])
    └── attrs
        ├── "dt_s" (temporal sampling in seconds)
        ├── "dx_m" (spatial sampling in meters)
        └── "begin_time" (start date in 'YYYY-MM-DDTHH:MM:SS.SSS' format)
Five datasets are provided as separate compressed .zip archives due to their size. Each archive contains DAS waveform data in the HDF5 (.h5) format described above, organized in one-minute files. These datasets correspond to figures presented in the RNN-DAS model article and are intended to facilitate reproducibility and further analysis.
Dataset 1 - Main event and aftershocks (Figure 5)
This dataset contains a one-hour DAS recording from November 30, between 07:00 and 08:00 UTC, featuring a main seismic event with magnitude Ml = 3.22 along with several aftershocks.
Dataset 2 - Continuous Test Segment (Figure 6)
This dataset contains one hour of continuous DAS recordings from October 29, between 04:00 and 05:00 UTC.
Dataset 3 - Events with Varying SNR and Magnitude (Figure 4)
This dataset includes three separate 3-minute DAS recordings, each corresponding to a different seismic event with distinct characteristics. The selected events represent a range of conditions, including attenuated signals, low signal-to-noise ratio (SNR), and nearby high-SNR events.
Dataset 4 - High-Magnitude Event Example (Figure 3)
This dataset contains a 3-minute DAS recording corresponding to a seismic event with magnitude Ml = 4.23. This example demonstrates the model's response to a clear, high-magnitude event.
Dataset 5 - Moderate Events and Noise-Only Sample (Figure 7)
This dataset includes four separate DAS recordings: three corresponding to moderate seismic events and another containing only seismic noise.
All .zip archives can be easily decompressed and used directly.
Note: The full HDAS dataset from La Palma used for model training and evaluation is not included due to its large size. It is available upon request from the corresponding author.
The RNN-DAS model is an innovative Deep Learning model based on Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) cells, developed for real-time Volcano-seismic Signal Recognition (VSR) using Distributed Acoustic Sensing (DAS) measurements. The model was trained on a comprehensive dataset of Volcano-Tectonic (VT) events from the 2021 La Palma eruption, recorded by a High-fidelity submarine Distributed Acoustic Sensing array (HDAS) located near the eruption site.
RNN-DAS can detect VT events, track their temporal evolution, and classify their waveforms with approximately 97% accuracy when tested on a database of over 2 million unique strain waveforms, enabling real-time continuous data predictions. The model has demonstrated excellent generalization capabilities for different time intervals and volcanoes, facilitating continuous, real-time seismic monitoring with minimal computational resources and retraining requirements.
The model is available in the RNN-DAS GitHub repository:
https://github.com/Javier-FernandezCarabantes/RNN-DAS
Fernández-Carabantes, J., Titos, M., D'Auria, L., García, J., García, L., & Benítez, C. (2025). Javier-FernandezCarabantes/RNN-DAS: RNN-DAS v1.1.1 (v1.1.1). Zenodo. https://doi.org/10.5281/zenodo.15858492
A copy of the repository is also provided here as the RNN-DAS_main.zip file. This archive mirrors the contents of the GitHub repository at the time of submission (v1.0.0). For correct usage, it is recommended to read the included README file. Users are encouraged to refer to the GitHub repository for future updates or changes.
This dataset is provided as sample data for the RNN-DAS model. It can be used to test and validate our model, as well as for the development of other machine learning approaches.
If you use this dataset in your research, or if you use the RNN-DAS model, proper citation of the related article and this dataset is required (Fernández-Carabantes et al., 2025):
Fernández-Carabantes, J., Titos, M., D'Auria, L., García, J., García, L., & Benítez, C. (2025). RNN-DAS: A new deep learning approach for detection and real-time monitoring of volcano-tectonic events using distributed acoustic sensing. Journal of Geophysical Research: Solid Earth, 130, e2025JB031756. https://doi.org/10.1029/2025JB031756
For further details, please refer to the project documentation or contact the research team (corresponding author email: javierfyc@ugr.es).
From the New York Times GitHub source (CSV, US counties): "The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
United States Data
Data on cumulative coronavirus cases and deaths can be found in two files for states and counties.
Each row of data reports cumulative counts based on our best reporting up to the moment we publish an update. We do our best to revise earlier entries in the data when we receive new information."
The specific data provided here is the data per US county.
The CSV link for counties is: https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv
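As a quick sketch, the county-level file can be loaded directly from that URL with pandas; the column names used below (date, cases) reflect the usual layout of the NYT file but should be checked against the current CSV.

import pandas as pd

url = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv'
counties = pd.read_csv(url, parse_dates=['date'])  # assumes a 'date' column, as in the published file

# Cumulative cases reported on the latest available date
latest = counties[counties['date'] == counties['date'].max()]
print(latest['cases'].sum())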
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of Jupyter Notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourages poor coding practices and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we analyzed 1.4 million notebooks from GitHub. Based on the results, we proposed and evaluated Julynter, a linting tool for Jupyter Notebooks.
Papers:
This repository contains three files:
Reproducing the Notebook Study
The db2020-09-22.dump.gz file contains a PostgreSQL dump of the database, with all the data we extracted from notebooks. To load it, run:
gunzip -c db2020-09-22.dump.gz | psql jupyter
Note that this file contains only the database with the extracted data. The actual repositories are available in a Google Drive folder, which also contains the Docker images we used in the reproducibility study. The repositories are stored as content/{hash_dir1}/{hash_dir2}.tar.bz2, where hash_dir1 and hash_dir2 are columns of the repositories table in the database.
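As a minimal sketch of how those archive paths could be reconstructed from the restored database (assuming it was loaded into a local database named jupyter as in the command above, psycopg2 is installed, and the connection parameters below are placeholders):

import psycopg2

# Placeholder connection settings for the locally restored database
conn = psycopg2.connect(dbname='jupyter', user='postgres', host='localhost')
cur = conn.cursor()

# hash_dir1 and hash_dir2 are columns of the repositories table
cur.execute('SELECT hash_dir1, hash_dir2 FROM repositories LIMIT 5;')
for hash_dir1, hash_dir2 in cur.fetchall():
    print(f'content/{hash_dir1}/{hash_dir2}.tar.bz2')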
For scripts, notebooks, and detailed instructions on how to analyze or reproduce the data collection, please check the instructions on the Jupyter Archaeology repository (tag 1.0.0)
The sample.tar.gz file contains the repositories obtained during the manual sampling.
Reproducing the Julynter Experiment
The julynter_reproducility.tar.gz file contains all the data collected in the Julynter experiment and the analysis notebooks. Reproducing the analysis is straightforward:
The collected data is stored in the julynter/data folder.
Changelog
2019/01/14 - Version 1 - Initial version
2019/01/22 - Version 2 - Update N8.Execution.ipynb to calculate the rate of failure for each reason
2019/03/13 - Version 3 - Update package for camera ready. Add columns to db to detect duplicates, change notebooks to consider them, and add N1.Skip.Notebook.ipynb and N11.Repository.With.Notebook.Restriction.ipynb.
2021/03/15 - Version 4 - Add Julynter experiment; Update database dump to include new data collected for the second paper; remove scripts and analysis notebooks from this package (moved to GitHub), add a link to Google Drive with collected repository files
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recently, GitHub introduced a new social feature, named reactions, which are pictorial characters similar to the emoji symbols widely used nowadays in text-based communications. Particularly, GitHub users can use a set of such symbols to react to issues and pull requests. However, little is known about the real usage and benefits of GitHub reactions. In this paper, we analyze the reactions provided by developers to more than 2.5 million issues and 9.7 million issue comments, in order to answer an extensive list of ten research questions about the usage and adoption of reactions. We show that reactions are being increasingly used by open-source developers. Moreover, we also found that issues with reactions usually take more time to be closed and have longer discussions.
This dataset contains the data used in the paper "Beyond Textual Issues: Understanding the Usage and Impact of GitHub Reactions", accepted for SBES 2019.
In the ACT, Bluetooth detectors are placed on certain roads to monitor traffic flow, providing network-wide performance indicators in real time. Details about congestion and travel time can be accessed via the APIs provided in this dataset.
Austin Transportation & Public Works maintains road condition sensors across the city which monitor the temperature and surface condition of roadways. These sensors enable our Mobility Management Center to stay apprised of potential roadway freezing events and intervene when necessary.
This data is updated continuously every 5 minutes.
See also the data descriptions from the sensor's instruction manual:
https://github.com/cityofaustin/atd-road-conditions/blob/production/5433-3X-manual.pdf
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains function identifiers extracted from the GitHub Java Corpus (http://groups.inf.ed.ac.uk/cup/javaGithub/).
Each line corresponds to a method declaration. A line contains the name of the method declaration followed by the function identifiers (i.e., function calls) contained within the method body.
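As a small sketch of how such a line might be parsed in Python (assuming whitespace-separated tokens, which should be verified against the actual files; the example line is made up):

def parse_line(line):
    # First token: method declaration name; remaining tokens: called function identifiers
    tokens = line.split()
    return tokens[0], tokens[1:]

method_name, called_functions = parse_line('readConfig open readLine close parseInt')
print(method_name, called_functions)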
The file embeddings_train.json can be used to train a word/sentence embedding model using the code in the GitHub repository (link below).
The corpus was used for the experiments in the paper Combining Code Embedding with Static Analysis for Function-Call Completion.
GitHub repository to replicate the experiments: https://github.com/mweyssow/cse-saner
This dataset provides comprehensive social media profile links discovered through real-time web search. It includes profiles from major social networks such as Facebook, TikTok, Instagram, Twitter, LinkedIn, YouTube, Pinterest, GitHub, and more. The data is gathered through intelligent search algorithms and pattern matching. Users can leverage this dataset for social media research, influencer discovery, social presence analysis, and social media marketing. The API enables efficient discovery of social profiles across multiple platforms. The dataset is delivered in JSON format via a REST API.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
This dataset contains two hyperspectral anomaly detection images and one multispectral anomaly detection image, together with their corresponding binary pixel masks. They were initially used for real-time anomaly detection in line-scanning, but they can be used for any anomaly detection task.
They are in .npy file format (TIFF or GeoTIFF variants will be added in the future), with the image datasets being in the order (height, width, channels). The SNP dataset was collected using sentinelhub, and the Synthetic dataset was collected from AVIRIS. The Python code used to analyse these datasets can be found at: https://github.com/WiseGamgee/HyperAD
How to Get Started
All that is needed to load these datasets is Python (preferably 3.8+) and the NumPy package. Example code for loading the beach dataset, if you put it in a folder called "data" alongside the Python script, is:
import numpy as np
hsi_array = np.load("data/beach_hsi.npy")
n_pixels, n_lines, n_bands = hsi_array.shape
print(f"This dataset has {n_pixels} pixels, {n_lines} lines, and {n_bands} bands.")
mask_array = np.load("data/beach_mask.npy")
m_pixels, m_lines = mask_array.shape
print(f"The corresponding anomaly mask is {m_pixels} pixels by {m_lines} lines.")
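Since the masks are binary (see the summary above), the fraction of anomalous pixels can be computed directly; a minimal sketch, assuming anomalies are labelled with 1:

# Fraction of pixels flagged as anomalous in the binary mask
anomaly_fraction = mask_array.sum() / mask_array.size
print(f"{100 * anomaly_fraction:.2f}% of pixels are labelled anomalous.")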
Citing the Datasets
If you use any of these datasets, please cite the following paper:
@article{garske2024erx,
  title={ERX - a Fast Real-Time Anomaly Detection Algorithm for Hyperspectral Line-Scanning},
  author={Garske, Samuel and Evans, Bradley and Artlett, Christopher and Wong, KC},
  journal={arXiv preprint arXiv:2408.14947},
  year={2024},
}
If you use the beach dataset please cite the following paper as well (original source):
@article{mao2022openhsi,
  title={OpenHSI: A complete open-source hyperspectral imaging solution for everyone},
  author={Mao, Yiwei and Betters, Christopher H and Evans, Bradley and Artlett, Christopher P and Leon-Saval, Sergio G and Garske, Samuel and Cairns, Iver H and Cocks, Terry and Winter, Robert and Dell, Timothy},
  journal={Remote Sensing},
  volume={14},
  number={9},
  pages={2244},
  year={2022},
  publisher={MDPI}
}
This dataset tracks the updates made on the dataset "Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes" as a repository for previous versions of the data and metadata.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset is not being updated currently due to data migration work at IP Australia. We are sorry for the inconvenience and we will update this page once the migration is complete.
The Intellectual Property Government Open Live Data (IPGOLD) includes over 100 years of Intellectual Property (IP) rights administered by IP Australia, comprising patents, trade marks, designs and plant breeder's rights. The data is highly detailed, including information on each aspect of the application process, from application through to granting of IP rights. We have published a paper to accompany IPGOLD which describes the data and illustrates its use, as well as a technical paper on the firm matching.
IPGOLD is inherently the same data as the IPGOD data set, with a weekly update instead of the annual snapshot available in IPGOD. Many of the scripts of IPGOLD are still being developed and tested. As such, IPGOLD should be considered a Beta release.
rt-me-fMRI is a multi-echo functional magnetic resonance imaging dataset (N=28 healthy volunteers) with four task-based and two resting state runs. Its main purpose is to advance the development of methods for real-time multi-echo fMRI analysis with applications in neurofeedback, real-time quality control, and adaptive paradigms, although the variety of experimental task paradigms can support multiple use cases. Tasks include finger tapping, emotional face and shape matching, imagined finger tapping, and imagined emotion processing. Further information is available at https://github.com/jsheunis/rt-me-fMRI
IMPORTANT FOR DATASET DOWNLOAD: Due to an issue with the current installation of Dataverse, it is not currently possible to download the full rt-me-fMRI dataset in bulk. This issue is scheduled to be resolved in early 2021. Individual downloads or downloading small sets of files is currently possible, although cumbersome. In order to download the full dataset in bulk, please request access to the dataset on this page. You will then be required to complete and sign the Data Use Agreement, after which you will be provided with a secure download link for the full dataset.