https://www.lseg.com/en/policies/website-disclaimer
LSEG's Entity and Reference Data offers both static and dynamic data to help classify and describe financial instrument characteristics. Browse the datasets.
This is a searchable historical collection of standards referenced in regulations - Voluntary consensus standards, government-unique standards, industry standards, and international standards referenced in the Code of Federal Regulations (CFR).
The National Software Reference Library (NSRL) collects software from various sources and incorporates file profiles computed from this software into a Reference Data Set (RDS) of information. The RDS can be used by law enforcement, government, and industry organizations to review files on a computer by matching file profiles in the RDS. This alleviates much of the effort involved in determining which files are important as evidence on computers or file systems that have been seized as part of criminal investigations. The RDS is a collection of digital signatures of known, traceable software applications. There are application hash values in the hash set which may be considered malicious, i.e. steganography tools and hacking scripts. There are no hash values of illicit data, i.e. child abuse images.
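The matching step described above can be sketched as follows. This is a minimal illustration, not the NSRL's own tooling: it assumes the known file profiles have already been loaded into a Python set of uppercase SHA-1 hex strings (the RDS distribution format is not described here), and the function names `sha1_of_file` and `split_known_unknown` are hypothetical.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the uppercase SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_known_unknown(paths, known_hashes):
    """Partition file paths by whether their hash appears in known_hashes.

    known_hashes is assumed to be a set of uppercase SHA-1 hex strings
    taken from a reference hash set (an assumption of this sketch).
    """
    known, unknown = [], []
    for path in paths:
        if sha1_of_file(path) in known_hashes:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

Files that land in the `known` list match traceable software and can usually be set aside, so a reviewer only needs to examine the `unknown` list. A known hash is not necessarily benign, since the hash set also includes tools that may be considered malicious.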
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains all the citation data (in N-Triples format) included in the OpenCitations Index, released on March 24, 2025. Each citation in the dataset, defined as an individual of the class cito:Citation, includes the following information:
- [citation IRI] the Open Citation Identifier (OCI) for the citation, defined in the final part of the URL identifying the citation (https://w3id.org/oc/index/ci/[OCI]);
- [property "cito:hasCitingEntity"] the citing entity, identified by its OMID URL (https://opencitations.net/meta/[OMID]);
- [property "cito:hasCitedEntity"] the cited entity, identified by its OMID URL (https://opencitations.net/meta/[OMID]);
- [property "cito:hasCitationCreationDate"] the creation date of the citation (i.e. the publication date of the citing entity);
- [property "cito:hasCitationTimeSpan"] the time span of the citation (i.e. the interval between the publication date of the cited entity and the publication date of the citing entity);
- [type "cito:JournalSelfCitation"] records whether the citation is a journal self-citation (i.e. the citing and the cited entities are published in the same journal);
- [type "cito:AuthorSelfCitation"] records whether the citation is an author self-citation (i.e. the citing and the cited entities have at least one author in common).
Note: the information for each citation is sourced from OpenCitations Meta (https://opencitations.net/meta), a database that stores and delivers bibliographic metadata for all bibliographic resources included in the OpenCitations Indexes. The data provided in this dump is therefore based on the state of OpenCitations Meta at the time this collection was generated.
This version of the dataset contains 2,155,497,918 citations. The size of the zipped archive is 80.6 GB, while the size of the unzipped N-Triples files is 1.9 TB.
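As a rough illustration of how such a dump could be read, the sketch below parses N-Triples lines with a simple regular expression and pairs each citation IRI with its citing and cited entities. It assumes one triple per line with IRI subjects and predicates (blank nodes and comments are not handled), and it matches predicates by their local name only. The helper names `parse_triple` and `citing_cited_pairs` are hypothetical, and at 1.9 TB the real files would need to be streamed rather than held in memory.

```python
import re

# Subject and predicate as IRIs; object as an IRI or a quoted literal
# with an optional datatype or language tag (simplified N-Triples).
TRIPLE_RE = re.compile(
    r'^\s*<([^>]*)>\s+<([^>]*)>\s+'
    r'(<[^>]*>|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
    r'\s*\.\s*$'
)


def parse_triple(line):
    """Return (subject, predicate, object) or None if the line does not match."""
    match = TRIPLE_RE.match(line)
    if match is None:
        return None
    subject, predicate, obj = match.groups()
    if obj.startswith("<"):
        obj = obj[1:-1]  # strip angle brackets from IRI objects
    return subject, predicate, obj


def citing_cited_pairs(lines):
    """Map each citation IRI to a (citing entity, cited entity) tuple."""
    citing, cited = {}, {}
    for line in lines:
        triple = parse_triple(line)
        if triple is None:
            continue
        subject, predicate, obj = triple
        if predicate.endswith("hasCitingEntity"):
            citing[subject] = obj
        elif predicate.endswith("hasCitedEntity"):
            cited[subject] = obj
    return {c: (citing[c], cited[c]) for c in citing if c in cited}
```

Because `citing_cited_pairs` accepts any iterable of lines, it can be fed an open file handle directly; for the full dump, the two dictionaries would have to be replaced by an on-disk store.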
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Static input files to support alignment, quality control, variant calling, and ancestry estimation using the GRCh38 and CHM13 references in the CoLoRSdb workflow. https://github.com/juniper-lake/CoLoRSdb https://colorsdb.org/
colorsdb_resources/
├── CHM13
│   ├── human_chm13v2.0_maskedY_rCRS.fasta
│   ├── human_chm13v2.0_maskedY_rCRS.fasta.fai
│   ├── human_chm13v2.0_maskedY_rCRS.trf.bed
│   ├── somalier.sites.chm13v2.T2T.vcf.gz
│   └── vcfparser.CHM13.ploidy.txt
└── GRCh38
    ├── hificnv.cnv.excluded_regions.hg38.bed.gz
    ├── hificnv.cnv.excluded_regions.hg38.bed.gz.tbi
    ├── hificnv.female_expected_cn.hg38.bed
    ├── hificnv.male_expected_cn.hg38.bed
    ├── human_GRCh38_no_alt_analysis_set.fasta
    ├── human_GRCh38_no_alt_analysis_set.fasta.fai
    ├── human_GRCh38_no_alt_analysis_set.trf.bed
    ├── peddy.GRCH38.sites
    ├── peddy.GRCH38.sites.bin.gz
    ├── somalier.sites.hg38.vcf.gz
    ├── trgt.adotto_repeats.hg38.bed
    ├── trgt.pathogenic_repeats.hg38.bed
    ├── trgt.repeat_catalog.hg38.bed
    └── vcfparser.GRCh38.ploidy.txt

2 directories, 19 files
https://www.gnu.org/licenses/gpl-3.0.en.html
The reference data contains the 65 literature sources referenced and cited during the writing process, including the title, author(s), source, and other bibliographic information for each reference. The image data contains two images embedded within the body text of the paper.
https://spdx.org/licenses/CC0-1.0.html
Background: Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the "citation benefit". Furthermore, little is known about patterns in data reuse over time and across datasets. Method and Results: Here, we look at citation rates while controlling for many known citation predictors, and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. 
The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties. Conclusion: After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Reference data for MITOS. Contains
fasta databases for the BLAST searches (RefSeq release 39)
automatically covariance models for ncRNA searches (for tRNAs the models are taken from MiTFi http://dx.doi.org/10.1093/nar/gkv746)
As described in http://dx.doi.org/10.1016/j.ympev.2012.08.023
This product contains plot location data in .shp format, as well as annual land cover, land use, and change process variables for each reference data plot in a separate .csv table. The same information available in the .csv file is also provided in .xlsx format. The LCMAP Reference Data Product was used to evaluate and validate the Land Change Monitoring, Assessment, and Projection (LCMAP) land cover and land cover change products. The LCMAP Reference Data Product includes an independent dataset of 25,000 randomly distributed 30-meter by 30-meter plots across the conterminous United States (CONUS). This dataset was collected via manual image interpretation to aid in validation of the land cover and land cover change products as well as area estimates. The LCMAP Reference Data Product collected variables related to primary and secondary land use, primary and secondary land cover(s), change processes, and other ancillary variables annually across CONUS from 1984 to 2018. First posted - May 1, 2020 (available from author) Revised - September 21, 2021 (version 1.1) Revised - November 17, 2021 (version 1.2)
eFIRDS offers you data, information and insights to help you cope with regulatory data requirements for compliance needs, such as LEI, ISIN, CFI and MIC codes. In addition, you are able to verify if an instrument is listed on a recognized trading venue in Europe, and which instruments are listed by a certain company and its subsidiaries.
Our database is updated daily and sources data from a variety of sources, first and foremost key databases in Europe such as ESMA FIRDS, FCA FIRDS, and GLEIF. The data is first scrubbed before we map it together to form our enhanced eFIRDS dataset.
By using the web portal https://eFIRDS.eu, you can monitor new instruments based on your personal filters, including market, type, and currency. After tracking down an instrument, you can save it as a favorite to stay updated on daily changes in the instrument’s reference data.
A short summary of eFIRDS key features:
-> SEARCH issuers, LEI codes and ISINs listed on any recognized trading venue in the EU and UK
-> OVERVIEW legal entities’ listed equity, debt and derivative instruments
-> ACCESS reference, liquidity and regulatory data to comply with trading regulations such as MiFID/MiFIR, EMIR, SFTR and others
-> VERIFY recognized trading venues on a financial instrument level, including regulated markets, MTFs, OTFs and SIs
-> FIND liquidity flags, thresholds and the most liquid trading venue on a financial instrument level
-> OBTAIN and verify company data and LEI codes, including new issuance and renewal of LEIs
-> NAVIGATE a company’s corporate group structure and legal entity information (LEI), including parent relationships linked to outstanding financial instruments
-> TRACK new listings and financial instruments by their market and currency, and filter and save your own personal favorite instruments
Transparency and liquidity data is crucial information in today’s financial markets, not only for research and analysis but perhaps more importantly for regulatory reporting purposes.
Let our service provide you with the pre- and post-trade transparency data you need. By using eFIRDS you will gain insights and indications of trading activity in financial instruments, including output such as:
For more information, please contact our sales team at sales@efirds.eu or visit efirds.eu.
Excel files have percentage impervious cover estimates for the Chesapeake Bay region from 30 m 1 m data for six assessment units - 12-digit hydrologic units (watersheds), the riparian zones for the same watersheds, and four square lattices with cell sizes of 40, 2756, 5625, and 22500 ha. There is an Excel file for each assessment unit. These data were used to produce the results in Table 2 of the associated publication (https://doi.org/10.1016/j.isprsjprs.2018.09.010). This dataset is associated with the following publication: Wickham, J., N. Herold, S.V. Stehman, C.G. Homer, G. Xian, and P. Clagget. Accuracy assessment of NLCD 2011 impervious cover data for the Chesapeake Bay region, USA. ISPRS Journal of Photogrammetry and Remote Sensing. Elsevier BV, AMSTERDAM, NETHERLANDS, 146: 151-160, (2018).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This web map contains reference data points with specific site information on vegetation dominance type and tree size for the Tongass National Forest to provide up-to-date and more complete information about vegetative communities, structure, and patterns across the project area. Reference data for this project came from numerous sources including: 1) Forest Service field crews collecting vegetation information specific to this project; 2) GO field crews collecting vegetation information for this project; 3) helicopter survey data; 4) Young-Growth Inventory data; 5) legacy data from previous Forest Service survey plots and the Forest Inventory and Analysis (FIA) program (FIA data are not included in this database); 6) legacy data from the prior Yakutat vegetation mapping project; and 7) image interpretation. This database contains reference data collected by GO staff for the Central Tongass Existing Vegetation Type Map. Tongass National Forest personnel collected most of the ground data that was targeted for this mapping effort using a variety of means—primarily by foot using existing trail and road infrastructure, or by boat—to collect samples that capture the diversity of vegetation across the project area. Helicopter survey data were collected over the course of three weeks in July 2024 for the Northern Tongass, with the goal of reaching difficult to access areas. The Young-Growth Inventory information was leveraged as reference data from actively managed forest stands. Legacy data was cross-referenced with the classification key to label each plot with a vegetation type. All sites were reviewed within the context of their corresponding segment using high-resolution imagery. For more detailed information on reference data methodology please see the Central and Northern Tongass Existing Vegetation Project Report.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset is about: A global dataset of crowdsourced land cover and land use reference data (2011-2012). Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.869682 for more information.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘CORDIS reference data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from http://data.europa.eu/88u/dataset/cordisref-data on 08 January 2022.
--- Dataset description provided by original source is as follows ---
Dataset containing the different reference lists to which CORDIS data links
--- Original source retains full ownership of the source dataset ---
The ARPA-E funded TERRA-REF project is generating open-access reference datasets for the study of plant sensing, genomics, and phenomics. Sensor data were generated by a field scanner sensing platform that captures color, thermal, hyperspectral, and active fluorescence imagery as well as three-dimensional structure and associated environmental measurements. This dataset is provided alongside data collected using traditional field methods in order to support calibration and validation of algorithms used to extract plot-level phenotypes from these datasets.
Data were collected at the University of Arizona Maricopa Agricultural Center in Maricopa, Arizona. This site hosts a large field scanner with fifteen sensors, many of which are capable of capturing mm-scale images and point clouds at daily to weekly intervals.
These data are intended to be re-used, and are accessible as a combination of files and databases linked by spatial, temporal, and genomic information. In addition to providing open access data, the entire computational pipeline is open source, and we enable users to access high-performance computing environments.
The study has evaluated a sorghum diversity panel, biparental cross populations, and elite lines and hybrids from structured sorghum breeding populations. In addition, a durum wheat diversity panel was grown and evaluated over three winter seasons. The initial release includes derived data from two seasons in which the sorghum diversity panel was evaluated. Future releases will include data from additional seasons and locations.
The TERRA-REF reference dataset can be used to characterize phenotype-to-genotype associations, on a genomic scale, that will enable knowledge-driven breeding and the development of higher-yielding cultivars of sorghum and wheat. The data is also being used to develop new algorithms for machine learning, image analysis, genomics, and optical sensor engineering.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Within the ESA funded WorldCereal project we have built an open harmonized reference data repository at global extent for model training or product validation in support of land cover and crop type mapping. Data from 2017 onwards were collected from many different sources and then harmonized, annotated and evaluated. These steps are explained in the harmonization protocol (10.5281/zenodo.7584463). This protocol also clarifies the naming convention of the shape files and the WorldCereal attributes (LC, CT, IRR, valtime and sampleID) that were added to the original data sets.
This publication includes those harmonized data sets of which the original data set was published under the CC-BY license or a license similar to CC-BY. See document "_In-situ-data-World-Cereal - license - CC-BY.pdf" for an overview of the original data sets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Reference data
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides a standardized list of Academic Colleges published by the Ministry of Education on the National Reference Data Management Platform. It reflects any changes made by the certified source.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains metabarcoding reference data with 12S sequences from fish and other vertebrates for Teleo and Riaz primers suited for use with the OBITools package.
Files
- all_seqs_INBO_riaz_amplified.fasta: reference data for Riaz marker
- all_seqs_INBO_Valentini_teleo_amplified.fasta: reference data for Teleo marker
- species_INBO_riaz.csv: species for which reference data is included in Riaz dataset
- species_INBO_teleo.csv: species for which reference data is included in Teleo dataset
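A quick way to inspect one of the FASTA reference files before using it is sketched below. This is a minimal standard-library reader, not part of OBITools. It assumes plain-text FASTA with ">" header lines, and it makes no assumption about how the headers in these particular files are structured. The function name `read_fasta` is hypothetical.

```python
def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file.

    The header is the text after ">" and the sequence is the
    concatenation of all lines up to the next header.
    """
    records = []
    header = None
    chunks = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header = line[1:]
                chunks = []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For example, `len(read_fasta("all_seqs_INBO_riaz_amplified.fasta"))` would give the number of reference sequences for the Riaz marker, which can then be compared against the species listed in the matching .csv file.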