Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
John Ioannidis and co-authors [1] created a publicly available database of the top-cited scientists in the world. This database, intended to address the misuse of citation metrics, has generated a lot of interest among the scientific community, institutions, and media. Many institutions have used it as a yardstick to assess the quality of researchers. At the same time, some people view the list with skepticism, citing problems with the methodology used. Two separate databases are created, one based on career-long impact and one on single recent-year impact. Both are built using Scopus data from Elsevier [1-3]. The scientists included are classified into 22 scientific fields and 174 sub-fields. The parameters considered for this analysis are total citations from 1996 to 2022 (nc9622), h-index in 2022 (h22), c-score, and world rank based on c-score (Rank ns). Citations excluding self-citations are used in all cases (indicated as ns). For the single-year database, citations during 2022 (nc2222) are considered instead of nc9622.
To evaluate the robustness of c-score-based ranking, I analyzed in detail the metric parameters of the Nobel laureates in Physics, Chemistry, and Medicine from the last 25 years (1998-2022) and compared them with the top 100 rank holders in the list. The latest career-long and single-year databases (2022) were used for this analysis. The details of the analysis are presented below:
Though the article says the selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field, the actual career-based ranking list has 204644 names [1]. The single-year database contains 210199 names. So, the published list contains approximately the top 4% of scientists. In the career-based rank list, for the person with the lowest rank (4809825), the nc9622, h22, and c-score were 41, 3, and 1.3632, respectively, whereas for the person ranked No. 1 in the list, the nc9622, h22, and c-score were 345061, 264, and 5.5927, respectively. Three people on the list had fewer than 100 citations during 1996-2022, 1155 people had an h22 below 10, and 6 people had a c-score below 2.
In the single-year rank list, for the person with the lowest rank (6547764), the nc2222, h22, and c-score were 1, 1, and 0.6, respectively, whereas for the person ranked No. 1, the nc2222, h22, and c-score were 34582, 68, and 5.3368, respectively. 4463 people on the list had fewer than 100 citations in 2022, 71512 people had an h22 below 10, and 313 people had a c-score below 2. The inclusion of many authors with a single-digit h-index and a very small total number of citations indicates serious shortcomings in the c-score-based ranking methodology.
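The threshold counts reported above (entries with fewer than 100 citations, an h22 below 10, or a c-score below 2) can be reproduced with a simple tally. The following is a minimal sketch, not the author's procedure: the key names mirror the parameters named in the text, but the actual column headers in the published spreadsheets are an assumption, and the two example rows are just the top-ranked and lowest-ranked career values quoted above.

```python
from typing import Dict, Iterable, Mapping


def count_below(rows: Iterable[Mapping[str, float]],
                thresholds: Mapping[str, float]) -> Dict[str, int]:
    """Count, per metric, how many rows fall strictly below the threshold."""
    counts = {key: 0 for key in thresholds}
    for row in rows:
        for key, limit in thresholds.items():
            if row[key] < limit:
                counts[key] += 1
    return counts


# Two illustrative rows taken from the figures quoted in the text
# (No. 1 and lowest-ranked entries of the career-long list).
# The key names are assumed, not the real spreadsheet headers.
rows = [
    {"nc9622": 345061, "h22": 264, "c_score": 5.5927},
    {"nc9622": 41, "h22": 3, "c_score": 1.3632},
]

thresholds = {"nc9622": 100, "h22": 10, "c_score": 2.0}
summary = count_below(rows, thresholds)
print(summary)
```

Run against the full list, the same function would yield the three counts reported for each database, provided the rows are loaded with matching key names.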
The Data Science Ontology is a research project of IBM Research AI and Stanford University Statistics. Its long-term objective is to improve the efficiency and transparency of collaborative, data-driven science.
Data includes metadata from over 12,500 journals from around the world in Sciences, Social Science and Humanities disciplines. Data are available from 1900 and currently include over 73 million article records and 1.4 billion cited references.
Timeseries data from 'SAN FRANCISQUITO C A STANFORD UNIVERSITY CA (USGS 11164500)' (gov_usgs_nwis_11164500) cdm_data_type=TimeSeries cdm_timeseries_variables=station,longitude,latitude contributor_email=feedback@axiomdatascience.com contributor_name=Axiom Data Science contributor_role=processor contributor_role_vocabulary=NERC contributor_url=https://www.axiomdatascience.com Conventions=IOOS-1.2, CF-1.6, ACDD-1.3, NCCSV-1.2 defaultDataQuery=water_surface_height_above_reference_datum_above_localstationdatum_qc_agg,river_discharge,water_surface_height_above_reference_datum_above_localstationdatum,water_surface_height_above_reference_datum_above_navd88_qc_agg,z,time,water_surface_height_above_reference_datum_above_navd88,river_discharge_qc_agg&time>=max(time)-3days Easternmost_Easting=-122.189409 featureType=TimeSeries geospatial_lat_max=37.423273 geospatial_lat_min=37.423273 geospatial_lat_units=degrees_north geospatial_lon_max=-122.189409 geospatial_lon_min=-122.189409 geospatial_lon_units=degrees_east geospatial_vertical_max=0.0 geospatial_vertical_min=0.0 geospatial_vertical_positive=up geospatial_vertical_units=m history=Downloaded from USGS National Water Information System (NWIS) at id=132103 infoUrl=https://sensors.ioos.us/#metadata/132103/station institution=USGS National Water Information System (NWIS) naming_authority=com.axiomdatascience Northernmost_Northing=37.423273 platform=fixed platform_name=SAN FRANCISQUITO C A STANFORD UNIVERSITY CA (USGS 11164500) platform_vocabulary=http://mmisw.org/ont/ioos/platform processing_level=Level 2 references=https://waterdata.usgs.gov/monitoring-location/11164500,, sourceUrl=https://waterdata.usgs.gov/monitoring-location/11164500 Southernmost_Northing=37.423273 standard_name_vocabulary=CF Standard Name Table v72 station_id=132103 time_coverage_end=2025-06-29T10:00:00Z time_coverage_start=2023-12-15T03:00:00Z Westernmost_Easting=-122.189409
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Experiments with faster dissemination of research began in the 1960s, and in the 1990s the first preprint servers emerged and became widely used in the Physical Sciences and Economics. Since 2010, more than 30 new preprint servers have emerged and the number of deposited preprints has grown exponentially, with numerous journals now supporting the posting of preprints and accepting preprints as submissions for journal peer review and publication. Research on preprints is, however, still scarce.
The goals of this project are:
1) Study preprint policies, submission requirements, and the handling of transparency in reporting and research integrity topics at all known preprint servers that allow researchers to deposit preprints regardless of their institutional affiliation or funding.
2) Study comments deposited on preprint servers’ platforms and social media and their relation to peer review and information exchange.
3) Study differences between preprint version(s) and version of record.
4) Living review of manuscript changes
Team Members (in alphabetical order by first name):
Ana Jerončić,1 Gerben ter Riet,2,3 IJsbrand Jan Aalbersberg,4 John P.A. Ioannidis,5-9 Joseph Costello,10 Juan Pablo Alperin,11,12 Lauren A. Maggio,10 Lex Bouter,13,14 Mario Malički,5 Steve Goodman,5-7
1 Department of Research in Biomedicine and Health, University of Split School of Medicine, Split, Croatia
2 Urban Vitality Centre of Expertise, Amsterdam University of Applied Sciences, Amsterdam, The Netherlands
3 Amsterdam UMC, University of Amsterdam, Department of Cardiology, Amsterdam, The Netherlands
4 Elsevier, Amsterdam, The Netherlands
5 Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, USA
6 Department of Medicine, Stanford University School of Medicine, Stanford, California, USA
7 Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, California, USA
8 Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, California, USA
9 Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, California, USA
10 Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
11 Scholarly Communications Lab, Simon Fraser University, Vancouver, British Columbia, Canada
12 School of Publishing, Simon Fraser University, Vancouver, British Columbia, Canada
13 Department of Philosophy, Faculty of Humanities, Vrije Universiteit, Amsterdam, The Netherlands
14 Amsterdam UMC, Vrije Universiteit, Department of Epidemiology and Statistics, Amsterdam, The Netherlands
The CoreLogic Loan-Level Market Analytics (LLMA) for primary mortgages dataset contains detailed loan data, including origination, events, performance, forbearance and inferred modification data.
CoreLogic sources the Loan-Level Market Analytics data directly from loan servicers. CoreLogic cleans and augments the contributed records with modeled data. The Data Dictionary indicates which fields are contributed and which are inferred.
The Loan-Level Market Analytics data is aimed at providing lenders, servicers, investors, and advisory firms with the insights they need to make trustworthy assessments and accurate decisions. Stanford Libraries has purchased the Loan-Level Market Analytics data for researchers interested in housing, economics, finance and other topics related to prime and subprime first lien data.
CoreLogic provided the data to Stanford Libraries as pipe-delimited text files, which we have uploaded to Data Farm (Redivis) for preview, extraction and analysis.
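For readers who have not worked with pipe-delimited text before, the sketch below shows one way such a file could be parsed with Python's standard `csv` module. It is an illustration only: the column names (`loan_id`, `state`, `orig_amount`) and the sample rows are hypothetical placeholders, not fields from the LLMA Data Dictionary, and access to the actual data goes through Redivis under the license terms described here.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_pipe_delimited(handle: TextIO) -> List[Dict[str, str]]:
    """Parse pipe-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="|")
    return list(reader)


# Hypothetical sample standing in for one of the delivered text files;
# the column names are placeholders, not real LLMA field names.
sample = io.StringIO(
    "loan_id|state|orig_amount\n"
    "A1|CA|250000\n"
    "B2|TX|180000\n"
)

records = read_pipe_delimited(sample)
for record in records:
    print(record["loan_id"], record["state"], record["orig_amount"])
```

All values are read as strings; numeric fields would need explicit conversion according to the types given in the Data Dictionary.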
For more information about how the data was prepared for Redivis, please see CoreLogic 2024 GitLab.
Per the End User License Agreement, the LLMA Data cannot be commingled (i.e. merged, mixed or combined) with Tax and Deed Data that Stanford University has licensed from CoreLogic, or other data which includes the same or similar data elements or that can otherwise be used to identify individual persons or loan servicers.
The 2015 major release of CoreLogic Loan-Level Market Analytics (for primary mortgages) was intended to enhance the CoreLogic servicing consortium through data quality improvements and integrated analytics. See **CL_LLMA_ReleaseNotes.pdf** for more information about these changes.
For more information about included variables, please see CL_LLMA_Data_Dictionary.pdf.
For more information about how the database was set up, please see LLMA_Download_Guide.pdf.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
SPEED+ is the next-generation dataset for spacecraft pose estimation, with specific emphasis on the robustness of Machine Learning (ML) models across the domain gap. Like its predecessor, SPEED+ consists of images of the Tango spacecraft from the PRISMA mission. SPEED+ comprises three image domains from two distinct sources. The first source is the OpenGL-based Optical Stimulator camera emulator software of Stanford’s Space Rendezvous Laboratory (SLAB), which is used to create the synthetic domain of 59,960 synthetic images. The labeled synthetic domain is split 80:20 into train/validation sets and is intended to be the main source of training data for an ML model. The second source is the Testbed for Rendezvous and Optical Navigation (TRON) facility at SLAB, which is used to generate two simulated Hardware-In-the-Loop (HIL) domains with different sources of illumination: lightbox and sunlamp. Specifically, these two domains are constructed under realistic illumination conditions, using lightboxes with diffuser plates for albedo simulation and a sun lamp to mimic direct, high-intensity, homogeneous light from the Sun. Compared to synthetic imagery, they capture corner cases, stray light, shadowing, and other visual effects that are not easy to obtain through computer graphics. The lightbox and sunlamp domains are unlabeled and thus intended mainly for testing, representing a typical scenario in developing a spaceborne ML model in which labeled images from the target space domain are not available prior to deployment. SPEED+ is made publicly available to the aerospace community and beyond as part of the second international Satellite Pose Estimation Competition (SPEC2021), co-hosted by SLAB and the Advanced Concepts Team (ACT) of the European Space Agency.
The construction of the TRON testbed was partly funded by the U.S. Air Force Office of Scientific Research (AFOSR) through the Defense University Research Instrumentation Program (DURIP) contract FA9550-18-1-0492, titled High-Fidelity Verification and Validation of Spaceborne Vision-Based Navigation. The SPEED+ dataset is created using the TRON testbed by SLAB at Stanford University. The post-processing of the raw images is reviewed by ACT to meet the quality requirement of SPEC2021.
The Stanford VLF Group collects data from ground stations located across the globe. There are two principal types of data collected: broadband and narrowband. Broadband data are full waveform data sampled at 100 kHz (frequency range of 300 Hz to 40 kHz). Narrowband data refers to the demodulated amplitude and phase of narrowband VLF transmitters. Both broadband and narrowband data are typically collected on two orthogonal antennas oriented in the North/South and East/West directions. Data availability charts, data summary charts, raw data, and tools for reading and plotting the data are available.
Scientific Transparency (SciTran) is a software project that has grown out of the Project on Scientific Transparency at Stanford University. At the heart of SciTran is a scientific data management system – SDM – designed to enable and foster reproducible research. SciTran SDM delivers efficient and robust archiving, organization, and sharing of scientific data. We have developed the system around neuroimaging data, but our goal is to build a system that is flexible enough to accommodate all types of scientific data – from paper-and-pencil tests to genomics data. SDM will also allow for the sharing of data and computations between remote sites. SciTran is open-source software, released under the MIT license. Our code is hosted on GitHub. Feel free to try it out or to contribute. Commercial support for SciTran SDM is available through our partners at Flywheel. Check out their demo if you'd like to give SDM a quick try.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains methylation quantitative trait loci (meQTL) results for the following study:
"regionalpcs improve discovery of DNA methylation associations with complex traits"
Tiffany Eulalio*1, Min Woo Sun1, Olivier Gevaert1, Michael D. Greicius2, Thomas J. Montine3, Daniel Nachun*‡3, Stephen B. Montgomery*‡1,3
‡ These authors contributed equally as senior authors
* Corresponding authors: Tiffany Eulalio (eulalio@alumn.stanford.edu), Daniel Nachun (dnachun@stanford.edu), Stephen B. Montgomery (smontgom@stanford.edu)
Author affiliations:
1. Department of Biomedical Data Science, Stanford University, Stanford, CA
2. Department of Neurology & Neurological Sciences, Stanford University, Stanford, CA
3. Department of Pathology, Stanford University, Stanford, CA
Dataset description:
This dataset contains QTL results generated from FastQTL, organized by region type (full gene, gene body, preTSS, and promoters) and summary types (averages and regional principal components).
Contents:
parquet1 includes chromosomes 1-10, and parquet2 includes chromosomes 11-22.
This dataset is intended to support replication and further exploration of QTL associations across different genomic regions and summary methods.
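Because the results are split across two parts by chromosome, a small lookup helper can make it explicit which part to open for a given autosome. The sketch below encodes only the split stated above; the function name is hypothetical, and the internal file layout of each part is not described here, so no file paths are assumed.

```python
def parquet_part_for(chromosome: int) -> str:
    """Return which part of the dataset holds results for an autosome.

    Encodes the split described above: parquet1 covers chromosomes 1-10
    and parquet2 covers chromosomes 11-22.
    """
    if 1 <= chromosome <= 10:
        return "parquet1"
    if 11 <= chromosome <= 22:
        return "parquet2"
    raise ValueError(f"chromosome {chromosome} is outside the range 1-22")


for chrom in (1, 10, 11, 22):
    print(chrom, parquet_part_for(chrom))
```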
https://www.marketreportanalytics.com/privacy-policy
The global market for integrated medicine and engineering education is experiencing robust growth, projected at a compound annual growth rate (CAGR) of 8% from 2025 to 2033. This expansion is driven by several key factors. Firstly, the increasing demand for healthcare professionals skilled in both medical science and engineering principles is fueling the need for specialized programs. Advances in medical technology, such as robotics, AI, and bioprinting, require professionals with a holistic understanding of both engineering design and its biological applications. Furthermore, the integration of data science and informatics into healthcare necessitates professionals capable of managing and interpreting vast datasets for improved patient care and medical research. The growing aging population, coupled with a rising prevalence of chronic diseases, further intensifies the demand for such skilled individuals. The market is segmented by subject areas including biomedical engineering, health informatics, clinical engineering, and robotics in healthcare; and by course levels, encompassing undergraduate, graduate, and certificate programs. Major players are established universities globally, each with its unique strengths and competitive strategies, focusing on program innovation, industry partnerships, and attracting top faculty. Geographic distribution reveals strong market presence in North America and Europe, owing to established research infrastructure and well-funded educational institutions. However, the APAC region, particularly India and China, shows significant growth potential driven by rapid economic development and increased investment in healthcare infrastructure and education. The market's future growth hinges on sustained investment in research and development, industry-academia collaborations, and government initiatives promoting STEM education. 
Challenges include the high cost of specialized equipment and training, along with the need for standardized curriculum and accreditation across different regions and institutions. Ultimately, the market's trajectory reflects a critical need for a multidisciplinary approach to address evolving healthcare challenges, presenting substantial opportunities for educational institutions and technology providers alike.
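To make the projected growth rate concrete, the sketch below works out what an 8% CAGR implies over the stated 2025 to 2033 window. It assumes the rate compounds once per year over eight annual periods; the report gives no base-year market size here, so only the relative growth multiple is computed.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor implied by a compound annual growth rate."""
    return (1.0 + cagr) ** years


# 8% per year from 2025 to 2033, treated as eight annual compounding periods
# (an assumption about how the report counts the window).
multiple = growth_multiple(0.08, 2033 - 2025)
print(f"Implied growth multiple: {multiple:.2f}x")
```

Under that assumption, the market would be roughly 1.85 times its 2025 size by 2033.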
This data set contains archival results from radio science investigations conducted during the Mars Global Surveyor (MGS) mission. Radio measurements were made using the MGS spacecraft and Earth-based stations of the NASA Deep Space Network (DSN). The data set includes high-resolution spherical harmonic models of Mars' gravity field generated by groups at the Jet Propulsion Laboratory and Goddard Space Flight Center, covariance matrices for some models, and maps for some models; these results were derived from raw radio tracking data. Also included are profiles of atmospheric temperature and pressure and ionospheric electron density, derived from phase measurements collected during radio occultations. The data set also includes analyses of transient surface echoes observed close to occultations during the first few years of MGS operations and a single set of power spectra acquired during a quasi-specular bistatic radar experiment in 2000. The atmospheric and surface investigations were conducted by Radio Science Team members at Stanford University. The data set also includes 93 line-of-sight acceleration profiles derived at JPL from radio tracking data collected near periapsis while Mars Global Surveyor was in its Science Phasing Orbit and below its nominal Mapping altitude of 400 km. The data were delivered to PDS in approximately chronological order at the rate of one CD-WO volume (typically 100 MB) every three months.
This data set contains archival results from radio science investigations conducted during the Mars Global Surveyor (MGS) mission. Radio measurements were made using the MGS spacecraft and Earth-based stations of the NASA Deep Space Network (DSN). The data set includes high-resolution spherical harmonic models of Mars' gravity field generated by groups at the Jet Propulsion Laboratory and Goddard Space Flight Center, covariance matrices for some models, and maps for some models; these results were derived from raw radio tracking data. Also included are profiles of atmospheric temperature and pressure and ionospheric electron density, derived from phase measurements collected during radio occultations. The data set also includes analyses of transient surface echoes observed close to occultations during the first few years of MGS operations. The atmospheric and surface investigations were conducted by Radio Science Team members at Stanford University. The data set also includes 93 line-of-sight acceleration profiles derived at JPL from radio tracking data collected near periapsis while Mars Global Surveyor was in its Science Phasing Orbit and below its nominal Mapping altitude of 400 km. The data were delivered to PDS in approximately chronological order at the rate of one CD-WO volume (typically 100 MB) every three months.
Timeseries data from '158 - Cabrillo Point Nearshore, CA (46240)' (edu_ucsd_cdip_158) cdm_data_type=TimeSeries cdm_timeseries_variables=station,longitude,latitude contributor_email=webmaster.ndbc@noaa.gov,None,,mailto:hmsinformation@lists.stanford.edu,mailto:hmsinformation@lists.stanford.edu,feedback@axiomdatascience.com contributor_name=NOAA National Data Buoy Center (NDBC),U.S. Army Corps of Engineers (USACE),World Meteorological Organization (WMO),Hopkins Marine Station, Stanford University,Hopkins Marine Station, Stanford University,Axiom Data Science contributor_role=contributor,sponsor,contributor,collaborator,sponsor,processor contributor_role_vocabulary=NERC contributor_url=https://www.ndbc.noaa.gov/,http://www.usace.army.mil/,https://wmo.int/,https://hopkinsmarinestation.stanford.edu/,https://hopkinsmarinestation.stanford.edu/,https://www.axiomdatascience.com Conventions=IOOS-1.2, CF-1.6, ACDD-1.3, NCCSV-1.2 defaultDataQuery=sea_surface_wave_from_direction,sea_surface_wave_mean_period_qc_agg,sea_water_temperature,sea_water_temperature_qc_agg,sea_surface_wave_period_at_variance_spectral_density_maximum,z,time,sea_surface_wave_period_at_variance_spectral_density_maximum_qc_agg,sea_surface_wave_significant_height,sea_surface_wave_from_direction_qc_agg,sea_surface_wave_mean_period,sea_surface_wave_significant_height_qc_agg&time>=max(time)-3days Easternmost_Easting=-121.9071 featureType=TimeSeries geospatial_lat_max=36.6263 geospatial_lat_min=36.6263 geospatial_lat_units=degrees_north geospatial_lon_max=-121.9071 geospatial_lon_min=-121.9071 geospatial_lon_units=degrees_east geospatial_vertical_max=0.0 geospatial_vertical_min=0.0 geospatial_vertical_positive=up geospatial_vertical_units=m history=Downloaded from Coastal Data Information Program (CDIP) at https://cdip.ucsd.edu/themes/cdip?pb=1&u2=s:158:st:1&d2=p9 id=130294 infoUrl=https://sensors.ioos.us/#metadata/130294/station institution=Coastal Data Information Program (CDIP) 
naming_authority=com.axiomdatascience Northernmost_Northing=36.6263 platform=buoy platform_name=158 - Cabrillo Point Nearshore, CA (46240) platform_vocabulary=http://mmisw.org/ont/ioos/platform processing_level=Level 2 references=https://cdip.ucsd.edu/themes/cdip?pb=1&u2=s:158:st:1&d2=p9,https://cdip.ucsd.edu/themes/cdip?pb=1&u2=s:158:st:1&d2=p9,https://www.ndbc.noaa.gov/station_page.php?station=46240,https://cdip.ucsd.edu/m/documents/data_processing.html#quality-control sourceUrl=https://cdip.ucsd.edu/themes/cdip?pb=1&u2=s:158:st:1&d2=p9 Southernmost_Northing=36.6263 standard_name_vocabulary=CF Standard Name Table v72 station_id=130294 time_coverage_end=2025-06-26T23:28:20Z time_coverage_start=2008-12-02T23:14:27Z Westernmost_Easting=-121.9071 wmo_platform_code=46240
Available registries
All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit: https://phsdocs.developerhub.io/need-help/citing-phs-data-core
Denmark has hundreds of linkable population-based registries and databases available for research. These include administrative, health, and clinical quality databases that are linkable using civil registration numbers, part of the Danish Civil Registration System established in 1968. Taken together, they offer nearly complete population coverage and follow-up, universal healthcare access, and deterministic linkage. An understanding of the state, regional, and municipal levels of the healthcare system is critical to successful research using these data, but the opportunities for working with a dynamic cohort with long-term follow-up are extensive across disciplines.
To date, the Stanford-Aarhus partnership has resulted in deep investigations of associations related to dementia, stress disorders and mental health, healthcare utilization and cost blooms, neighborhood level indicators of later outcomes, among others. Several cohorts have been identified as priorities for future research collaboration, including the Danish Labor Force Survey (LFS), the aging population, and pediatric data.
In collaboration with PHS, researchers can present a proposal to determine whether a project of interest is aligned with the research agenda and mission of the Stanford-Aarhus partnership. Once a project is initiated, collaborators work hand-in-hand with an Aarhus data scientist who manages the data, conducts analyses, and supports the drafting of manuscripts and presentations.
If you are interested in working with this collaborator, please contact the Stanford Population Health Sciences Data Core team at phsdatacore@stanford.edu.
Master Beneficiary Summary Files (MBSF)
This dataset page includes some of the tables from the Medicare Data in PHS's possession. Other Medicare tables are included on other dataset pages on the PHS Data Portal. Depending upon your research question and your DUA with CMS, you may only need tables from a subset of the Medicare dataset pages, or you may need tables from all of them.
The location of each of the Medicare tables (i.e. a chart of which tables are included in each Medicare dataset page) is shown here.
This data set consists of several tables and supporting documentation from the final analysis of the Voyager 2 radio occultation by Triton. The data set is based on a Ph.D. dissertation by Eric M. Gurrola of Stanford University [GURROLA1995]. The tabulated data were derived from raw radio science observations, which are being archived separately. General principles for conducting these types of experiments have been described by [TYLER1987]; results of the Triton analysis were published by [TYLERETAL1989].
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
This data set contains tables containing uranium-lead (U-Pb) isotopic data and the crystallization age of zircon from a metamorphic rock from the Manzano Mountains, New Mexico, collected in 2005. The bulk sample was processed into concentrated mineral separates of zircon at the University of Texas at Austin and analyzed by U.S. Geological Survey (USGS) research scientists at the Stanford-USGS Sensitive High Resolution Ion Microprobe with Reverse-Geometry (SHRIMP-RG) at Stanford University. The data table (geology_SHRIMPRGData_NM_Jones.csv) accompanying this data release reports the isotopic composition of uranium (U) and thorium (Th) measured in each grain, ratios of two isotopes of lead (207Pb and 206Pb) and two isotopes of uranium (235U and 238U), the age of each grain, and concentrations of selected trace elements measured in each grain. A second table (geology_sampleSummary_NM_Jones.csv) reports the sample location, rock characteristics, and interpreted age. Additionally, a th ...
The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of artificial intelligence.
The AI Index 2022 Report is supplemented by raw data and an interactive tool.
• Raw data and charts: The public data and high-resolution images of all the charts in the report.
• Global AI Vibrancy Tool: This year's Global AI Vibrancy Tool was designed with a visualization to compare up to 29 countries across 23 indicators.
Daniel Zhang, Nestor Maslej, Erik Brynjolfsson, John Etchemendy, Terah Lyons, James Manyika, Helen Ngo, Juan Carlos Niebles, Michael Sellitto, Ellie Sakhaee, Yoav Shoham, Jack Clark, and Raymond Perrault, “The AI Index 2022 Annual Report,” AI Index Steering Committee, Stanford Institute for Human-Centered AI, Stanford University, March 2022. The AI Index 2022 Annual Report by Stanford University is licensed under Attribution-NoDerivatives 4.0 International. For a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/