100+ datasets found
  1. National Cancer Institute 3D Structure Database

    • dknet.org
    • blog.neuinfo.org
    Updated Jun 23, 2025
    Cite
    (2025). National Cancer Institute 3D Structure Database [Dataset]. http://identifiers.org/RRID:SCR_008211
    Dataset updated
    Jun 23, 2025
    Description

    The NCI DIS 3D database is a collection of 3D structures for over 400,000 drugs. The database is an extension of the NCI Drug Information System (DIS). The structural information stored in the DIS is only the connection table for each drug, that is, a list of which atoms are connected and how they are connected. A searchable database of three-dimensional structures has been developed from the chemistry database of the DIS, a file of about 450,000 primarily organic compounds which have been tested by NCI for anticancer activity. The DIS database is very similar in size and content to the proprietary databases used in the pharmaceutical industry; its development began in the 1950s, and this history led to a number of problems in the generation of 3D structures.

    The connection tables can be searched to find drugs that share similar patterns of connections, which can correlate with similar biological activity. But the cellular targets for drug action, as well as the drugs themselves, are three-dimensional objects, and advances in computer hardware and software have reached the point where they can be represented as such. In many cases the important points of interaction between a drug and its target can be represented by a 3D arrangement of a small number of atoms. Such a group of atoms is called a pharmacophore. The pharmacophore can be used to search 3D databases, and drugs that match the pharmacophore could have similar biological activity but very different patterns of atomic connections. Having a diverse set of lead compounds increases the chances of finding an active compound with acceptable properties for clinical development.

    Sponsor: The ICBG are supported by the Cooperative Agreement mechanism, with funds from nine components of the NIH, the National Science Foundation, and the Foreign Agricultural Service of the USDA.
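
    The idea of searching by pharmacophore rather than by connection table can be illustrated with a small sketch. It is not the NCI search software: the feature labels, coordinates, tolerance, and the function name matches_pharmacophore are all hypothetical, and the match is reduced to comparing pairwise distances between a few labeled points.

```python
from itertools import permutations
from math import dist


def matches_pharmacophore(atoms, pharmacophore, tolerance=0.5):
    """Return True if some ordered choice of atoms reproduces the pharmacophore.

    atoms: list of (feature_type, (x, y, z)) for one molecule (hypothetical data).
    pharmacophore: list of (feature_type, (x, y, z)) reference points.
    Only pairwise distances are compared, so rotation and translation of the
    molecule do not matter (mirror images also match under this simplification).
    """
    k = len(pharmacophore)
    for candidate in permutations(atoms, k):
        # Feature types must agree position by position.
        if any(a[0] != p[0] for a, p in zip(candidate, pharmacophore)):
            continue
        # Every pairwise distance must agree within the tolerance.
        if all(
            abs(dist(candidate[i][1], candidate[j][1])
                - dist(pharmacophore[i][1], pharmacophore[j][1])) <= tolerance
            for i in range(k)
            for j in range(i + 1, k)
        ):
            return True
    return False


# Hypothetical three-point pharmacophore with distances 3, 4 and 5.
pharmacophore = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("aromatic", (0.0, 4.0, 0.0)),
]

# Same arrangement, shifted in space and listed in a different order.
molecule_a = [
    ("aromatic", (1.0, 5.0, 1.0)),
    ("donor", (1.0, 1.0, 1.0)),
    ("acceptor", (4.0, 1.0, 1.0)),
    ("hydrophobe", (2.0, 2.0, 2.0)),
]

# Same feature types, but the acceptor sits too far from the donor.
molecule_b = [
    ("aromatic", (1.0, 5.0, 1.0)),
    ("donor", (1.0, 1.0, 1.0)),
    ("acceptor", (9.0, 1.0, 1.0)),
]

print(matches_pharmacophore(molecule_a, pharmacophore))
print(matches_pharmacophore(molecule_b, pharmacophore))
```

    Nothing in the check looks at which atoms are bonded to which, which is why two molecules with very different connection tables can both match.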

  2. National Cancer Institute Imaging Data Commons (IDC) Collections

    • registry.opendata.aws
    Updated May 10, 2023
    + more versions
    Cite
    Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov) team (2023). National Cancer Institute Imaging Data Commons (IDC) Collections [Dataset]. https://registry.opendata.aws/nci-imaging-data-commons/
    Dataset updated
    May 10, 2023
    Dataset provided by
    Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov) team
    Description

    Imaging Data Commons (IDC) is a repository within the Cancer Research Data Commons (CRDC) that manages imaging data and enables its integration with the other components of CRDC. IDC hosts a growing number of imaging collections that are contributed either by funded US National Cancer Institute (NCI) data collection activities or by individual researchers. Image data hosted by IDC is stored in DICOM format.

  3. Lung Cancer Research Data

    • kaggle.com
    Updated Apr 3, 2025
    Cite
    William Reichard (2025). Lung Cancer Research Data [Dataset]. https://www.kaggle.com/datasets/williamreichard/lung-cancer-research-data/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 3, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    William Reichard
    Description

    This data was collected by the National Cancer Institute in 2021. This dataset provides detailed information on lung cancer patients, covering demographic attributes, medical history, treatment specifics, and survival outcomes.

  4. Cancer Incidence - Surveillance, Epidemiology, and End Results (SEER)...

    • healthdata.gov
    • data.virginia.gov
    • +2 more
    application/rdfxml +5
    Updated Feb 13, 2021
    Cite
    (2021). Cancer Incidence - Surveillance, Epidemiology, and End Results (SEER) Registries Limited-Use [Dataset]. https://healthdata.gov/Health/Cancer-Incidence-Surveillance-Epidemiology-and-End/i3ww-np2h
    Explore at:
    Available download formats: application/rdfxml, csv, xml, application/rssxml, tsv, json
    Dataset updated
    Feb 13, 2021
    Description

    SEER Limited-Use cancer incidence data with associated population data. Geographic areas available are county and SEER registry. The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute collects and distributes high quality, comprehensive cancer data from a number of population-based cancer registries. Data include patient demographics, primary tumor site, morphology, stage at diagnosis, first course of treatment, and follow-up for vital status. The SEER Program is the only comprehensive source of population-based information in the United States that includes stage of cancer at the time of diagnosis and survival rates within each stage.

  5. NCI State Late Stage Breast Cancer Incidence Rates

    • hub.arcgis.com
    Updated Jan 21, 2020
    + more versions
    Cite
    National Cancer Institute (2020). NCI State Late Stage Breast Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/datasets/9dd0d923f8034cc8806173fdc224777d
    Dataset updated
    Jan 21, 2020
    Dataset authored and provided by
    National Cancer Institute
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains Cancer Incidence data for Breast Cancer (Late Stage^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.

    Data are for females segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.

    For more information, visit statecancerprofiles.cancer.gov

    Data Notations

    State Cancer Registries may provide more current or more local data.

    Trend
    • Rising when 95% confidence interval of average annual percent change is above 0.
    • Stable when 95% confidence interval of average annual percent change includes 0.
    • Falling when 95% confidence interval of average annual percent change is below 0.

    † Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.

    ‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.

    Rates and trends are computed using different standards for malignancy. For more information see malignant.

    ^ Late Stage is defined as cases determined to be regional or distant. Due to changes in stage coding, Combined Summary Stage (2004+) is used for data from Surveillance, Epidemiology, and End Results (SEER) databases and Merged Summary Stage is used for data from National Program of Cancer Registries databases. Due to the increased complexity with staging, other staging variables may be used if necessary.

    Data Source Field Key
    (1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
    (7) Source: SEER November 2022 submission.
    (8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.

    Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
    Data for the United States does not include data from Nevada.
    Data for the United States does not include Puerto Rico.
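
    The Rising / Stable / Falling rule in the data notations above can be written as a short function. This is a minimal sketch: the function name and the example interval values are hypothetical, and an interval that touches 0 exactly is assumed to count as "includes 0".

```python
def classify_trend(ci_lower, ci_upper):
    """Label a trend from the 95% confidence interval of the
    average annual percent change (AAPC)."""
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        return "Rising"   # whole interval is above 0
    if ci_upper < 0:
        return "Falling"  # whole interval is below 0
    return "Stable"       # interval includes 0


# Hypothetical intervals, in percent per year.
for interval in [(0.4, 1.9), (-1.2, 0.3), (-2.5, -0.6)]:
    print(interval, classify_trend(*interval))
```

    The point estimate of the AAPC is never consulted; only the position of the interval relative to 0 decides the label.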

  6. SEER Cancer Statistics Database

    • data.niaid.nih.gov
    Updated Jul 11, 2011
    Cite
    (2011). SEER Cancer Statistics Database [Dataset]. http://doi.org/10.7910/DVN/C9KBBC
    Dataset updated
    Jul 11, 2011
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Users can access data about cancer statistics in the United States including but not limited to searches by type of cancer and race, sex, ethnicity, age at diagnosis, and age at death.

    Background: The mission of the Surveillance, Epidemiology, and End Results (SEER) database is to provide information on cancer statistics to help reduce the burden of disease in the U.S. population. The SEER database is a project of the National Cancer Institute. It collects information on incidence, prevalence, and survival from specific geographic areas representing 28 percent of the United States population.

    User functionality: Users can access a variety of resources. Cancer Stat Fact Sheets allow users to look at summaries of statistics by major cancer type. Cancer Statistic Reviews are available from 1975-2008 in table format. Users are also able to build their own tables and graphs using Fast Stats. The Cancer Query system provides more flexibility and a larger set of cancer statistics than Fast Stats but requires more input from the user. State Cancer Profiles include dynamic maps and graphs enabling the investigation of cancer trends at the county, state, and national levels. SEER research data files and SEER*Stat software are available to download through your Internet connection (SEER*Stat’s client-server mode) or via discs shipped directly to you. A signed data agreement form is required to access the SEER data.

    Data Notes: Data is available in different formats depending on which type of data is accessed. Some data is available in table, PDF, and HTML formats. Detailed information about the data is available under “Data Documentation and Variable Recodes”.

  7. NCI State Breast Cancer Incidence Rates

    • hub.arcgis.com
    Updated Jan 2, 2020
    Cite
    National Cancer Institute (2020). NCI State Breast Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/maps/NCI::nci-state-breast-cancer-incidence-rates
    Dataset updated
    Jan 2, 2020
    Dataset authored and provided by
    National Cancer Institute (http://www.cancer.gov/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains Cancer Incidence data for Breast Cancer (All Stages^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.

    Data are for females segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.

    For more information, visit statecancerprofiles.cancer.gov

    Data Notations

    State Cancer Registries may provide more current or more local data.

    Trend
    • Rising when 95% confidence interval of average annual percent change is above 0.
    • Stable when 95% confidence interval of average annual percent change includes 0.
    • Falling when 95% confidence interval of average annual percent change is below 0.

    † Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.

    ‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.

    Rates and trends are computed using different standards for malignancy. For more information see malignant.

    ^ All Stages refers to any stage in the Surveillance, Epidemiology, and End Results (SEER) summary stage.

    Data Source Field Key
    (1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
    (7) Source: SEER November 2022 submission.
    (8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.

    Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
    Data for the United States does not include data from Nevada.
    Data for the United States does not include Puerto Rico.
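
    The † note says rates are age-adjusted to a standard population. A common way to do this is direct standardization, sketched below. The three age groups, case counts, populations, and standard weights are invented for illustration; they are not the 19 groups of the 2000 US standard population, and this is not the SEER*Stat implementation.

```python
def age_adjusted_rate(cases, population, standard_population):
    """Directly standardized rate per 100,000.

    Each age group's own rate (cases / population) is weighted by that
    group's share of the standard population, and the weighted rates are summed.
    """
    if not (len(cases) == len(population) == len(standard_population)):
        raise ValueError("all inputs need one value per age group")
    total_standard = sum(standard_population)
    rate = 0.0
    for c, n, s in zip(cases, population, standard_population):
        rate += (c / n) * (s / total_standard)
    return rate * 100_000


# Hypothetical data for three age groups (young, middle, old).
cases = [10, 50, 200]
population = [100_000, 80_000, 50_000]
standard_population = [400, 350, 250]  # hypothetical weights, not the 2000 US standard

adjusted = age_adjusted_rate(cases, population, standard_population)
crude = sum(cases) / sum(population) * 100_000
print(round(adjusted, 1), round(crude, 1))
```

    In this made-up example the adjusted rate comes out higher than the crude rate because the standard population gives more weight to the oldest group than the observed population does.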

  8. NCI State Lung Cancer Incidence Rates

    • hub.arcgis.com
    • arc-gis-hub-home-arcgishub.hub.arcgis.com
    Updated Jan 2, 2020
    Cite
    National Cancer Institute (2020). NCI State Lung Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/maps/NCI::nci-state-lung-cancer-incidence-rates
    Dataset updated
    Jan 2, 2020
    Dataset authored and provided by
    National Cancer Institute
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains Cancer Incidence data for Lung Cancer (All Stages^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.

    Data are segmented by sex (Both Sexes, Male, and Female) and age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.

    For more information, visit statecancerprofiles.cancer.gov

    Data Notations

    State Cancer Registries may provide more current or more local data.

    Trend
    • Rising when 95% confidence interval of average annual percent change is above 0.
    • Stable when 95% confidence interval of average annual percent change includes 0.
    • Falling when 95% confidence interval of average annual percent change is below 0.

    † Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.

    ‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.

    Rates and trends are computed using different standards for malignancy. For more information see malignant.

    ^ All Stages refers to any stage in the Surveillance, Epidemiology, and End Results (SEER) summary stage.

    Data Source Field Key
    (1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
    (7) Source: SEER November 2022 submission.
    (8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.

    Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
    Data for the United States does not include data from Nevada.
    Data for the United States does not include Puerto Rico.

  9. List of all reprocessed vs. reprocessed differentially expressed genes...

    • plos.figshare.com
    csv
    Updated Mar 4, 2025
    Cite
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). List of all reprocessed vs. reprocessed differentially expressed genes (DEGs) comparing tumor data from the GDC and normal data from the GTEx. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s004
    Explore at:
    Available download formats: csv
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Reprocessed counts were generated using our GDC RNA-seq workflow implementation. NA rank changes indicate the DEG cannot be found in the other DEG list. (CSV)

  10. Current Active Clinical Trials - Roswell Park Cancer Institute

    • catalog.data.gov
    • datadiscoverystudio.org
    • +3 more
    Updated Jun 21, 2025
    Cite
    data.ny.gov (2025). Current Active Clinical Trials - Roswell Park Cancer Institute [Dataset]. https://catalog.data.gov/dataset/current-active-clinical-trials-roswell-park-cancer-institute
    Dataset updated
    Jun 21, 2025
    Dataset provided by
    data.ny.gov
    Description

    List of active studies submitted annually by Roswell Park Cancer Institute (RPCI) to the National Cancer Institute (NCI) as part of the Cancer Center Report Grant reporting. It includes the primary site, protocol, principal investigator, date opened, phase, and study name.

  11. Data from: Identifying Compound-Target Associations by Combining Bioactivity...

    • acs.figshare.com
    application/cdfv2
    Updated Jun 2, 2023
    Cite
    Tiejun Cheng; Qingliang Li; Yanli Wang; Stephen H. Bryant (2023). Identifying Compound-Target Associations by Combining Bioactivity Profile Similarity Search and Public Databases Mining [Dataset]. http://doi.org/10.1021/ci200192v.s001
    Explore at:
    Available download formats: application/cdfv2
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    ACS Publications
    Authors
    Tiejun Cheng; Qingliang Li; Yanli Wang; Stephen H. Bryant
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were commonly tested in the US National Cancer Institute 60 human tumor cell line anticancer drug screen (NCI-60). Each compound was used as a query to search against the entire bioactivity profile database, and reference compounds with similar bioactivity profiles above a threshold of 0.75 were considered as neighbor compounds of the query. Potential targets were subsequently linked to the identified neighbor compounds by using the known targets of the query compound. About 45% of the predicted compound-target associations were successfully verified retrospectively, suggesting the possible application of BASS in identifying the targets of uncharacterized compounds and thus providing insight into the study of promiscuity and polypharmacology. Furthermore, BASS identified a significant fraction of structurally diverse compounds with similar bioactivities, indicating its feasibility of “scaffold hopping” in searching novel molecules against the target of interest.
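
    The neighbor search described above can be sketched in a few lines. This is an illustration, not the authors' BASS code: the description does not say which similarity measure was used, so Pearson correlation is an assumption here, and the compound IDs, profile values, and target name are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed similarity measure)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / (sd_x * sd_y)


def find_neighbors(query_id, profiles, threshold=0.75):
    """Compounds whose profile similarity to the query is above the threshold."""
    query = profiles[query_id]
    return sorted(
        cid for cid, profile in profiles.items()
        if cid != query_id and pearson(query, profile) > threshold
    )


def transfer_targets(query_id, profiles, known_targets, threshold=0.75):
    """Link the query compound's known targets to each of its neighbors."""
    targets = known_targets.get(query_id, set())
    return {cid: set(targets) for cid in find_neighbors(query_id, profiles, threshold)}


# Hypothetical activity values across five cell lines.
profiles = {
    "A": [5.1, 6.0, 4.2, 7.3, 5.5],
    "B": [5.0, 6.2, 4.0, 7.0, 5.6],  # tracks A closely
    "C": [7.0, 4.1, 6.5, 4.0, 6.0],  # roughly the opposite pattern
}
known_targets = {"A": {"TARGET_X"}}  # hypothetical annotation

print(find_neighbors("A", profiles))
print(transfer_targets("A", profiles, known_targets))
```

    Because only activity profiles are compared, a neighbor can be structurally unrelated to the query, which is the basis of the "scaffold hopping" use mentioned in the description.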

  12. Tumor characteristics and prognosis.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Jeremy B. Katzen; Kirtee Raparia; Rishi Agrawal; Jyoti D. Patel; Alfred Rademaker; John Varga; Jane E. Dematte (2023). Tumor characteristics and prognosis. [Dataset]. http://doi.org/10.1371/journal.pone.0117829.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jeremy B. Katzen; Kirtee Raparia; Rishi Agrawal; Jyoti D. Patel; Alfred Rademaker; John Varga; Jane E. Dematte
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Tumor characteristics and prognosis. NA: not assessed/available. WT: wild type.

  13. Connecticut State Cancer Profile

    • catalog.data.gov
    • data.ct.gov
    Updated Jun 21, 2025
    + more versions
    Cite
    data.ct.gov (2025). Connecticut State Cancer Profile [Dataset]. https://catalog.data.gov/dataset/ct-state-cancer-profile
    Dataset updated
    Jun 21, 2025
    Dataset provided by
    data.ct.gov
    Area covered
    Connecticut
    Description

    Dynamic views of cancer statistics from the National Cancer Institute

  14. Data from: County-level cumulative environmental quality associated with...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). County-level cumulative environmental quality associated with cancer incidence. [Dataset]. https://catalog.data.gov/dataset/county-level-cumulative-environmental-quality-associated-with-cancer-incidence
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Population-based cancer incidence rates were abstracted from National Cancer Institute State Cancer Profiles for all counties in the United States for which data were available. This is a national county-level database of cancer data that are collected by state public health surveillance systems. All-site cancer is defined as any type of cancer that is captured in the state registry data, though non-melanoma skin cancer is not included. All-site age-adjusted cancer incidence rates were abstracted separately for males and females.

    County-level annual age-adjusted all-site cancer incidence rates for years 2006–2010 were available for 2687 of 3142 (85.5%) counties in the U.S. Counties for which there are fewer than 16 reported cases in a specific area-sex-race category are suppressed to ensure confidentiality and stability of rate estimates; this accounted for 14 counties in our study. Two states, Kansas and Virginia, do not provide data because of state legislation and regulations which prohibit the release of county-level data to outside entities. Data from Michigan does not include cases diagnosed in other states because data exchange agreements prohibit the release of data to third parties. Finally, state data is not available for three states: Minnesota, Ohio, and Washington. The age-adjusted average annual incidence rate for all counties was 453.7 per 100,000 persons. We selected 2006–2010 as it is subsequent in time to the EQI exposure data, which was constructed to represent the years 2000–2005. We also gathered data for the three leading causes of cancer for males (lung, prostate, and colorectal) and females (lung, breast, and colorectal).

    The EQI was used as an exposure metric, as an indicator of cumulative environmental exposures at the county level representing the period 2000 to 2005. A complete description of the datasets used in the EQI is provided in Lobdell et al., and methods used for index construction are described by Messer et al. The EQI was developed for the period 2000–2005 because it was the time period for which the most recent data were available when index construction was initiated. The EQI includes variables representing each of the environmental domains. The air domain includes 87 variables representing criteria and hazardous air pollutants. The water domain includes 80 variables representing overall water quality, general water contamination, recreational water quality, drinking water quality, atmospheric deposition, drought, and chemical contamination. The land domain includes 26 variables representing agriculture, pesticides, contaminants, facilities, and radon. The built domain includes 14 variables representing roads, highway/road safety, public transit behavior, business environment, and subsidized housing environment. The sociodemographic environment includes 12 variables representing socioeconomics and crime.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Human health data are not available publicly. EQI data are available at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: Data are stored as csv files.

    This dataset is associated with the following publication: Jagai, J., L. Messer, K. Rappazzo, C. Gray, S. Grabich, and D. Lobdell. County-level environmental quality and associations with cancer incidence. Cancer. John Wiley & Sons Incorporated, New York, NY, USA, 123(15): 2901-2908, (2017).
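
    The small-count suppression rule and the coverage figure quoted above can be checked with a short sketch. The record layout, county names, and function names are hypothetical; only the threshold of 16 cases and the counts 2687 and 3142 come from the description.

```python
SUPPRESSION_THRESHOLD = 16  # categories with fewer reported cases are suppressed


def suppress_small_counts(records, threshold=SUPPRESSION_THRESHOLD):
    """Return copies of the records with small case counts replaced by None."""
    return [
        {**record, "cases": record["cases"] if record["cases"] >= threshold else None}
        for record in records
    ]


def coverage_percent(available, total):
    """Share of counties with data, as a percentage rounded to one decimal place."""
    return round(100 * available / total, 1)


# Hypothetical county records for one area-sex-race category.
records = [
    {"county": "County A", "cases": 240},
    {"county": "County B", "cases": 12},  # below the threshold
    {"county": "County C", "cases": 16},  # exactly 16 is not "fewer than 16"
]

for row in suppress_small_counts(records):
    print(row)
print(coverage_percent(2687, 3142))
```

    The original list is left unmodified, so suppressed and unsuppressed versions can be kept side by side when access rules allow it.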

  15. CDC WONDER: Cancer Statistics

    • healthdata.gov
    • data.virginia.gov
    • +5 more
    application/rdfxml +5
    Updated Feb 13, 2021
    + more versions
    Cite
    (2021). CDC WONDER: Cancer Statistics [Dataset]. https://healthdata.gov/dataset/CDC-WONDER-Cancer-Statistics/mv5s-m59f
    Explore at:
    Available download formats: xml, tsv, application/rssxml, csv, application/rdfxml, json
    Dataset updated
    Feb 13, 2021
    Description

    The United States Cancer Statistics (USCS) online databases in WONDER provide cancer incidence and mortality data for the United States for the years since 1999, by year, state and metropolitan area (MSA), age group, race, ethnicity, sex, childhood cancer classification, and cancer site. Reports include case counts, deaths, crude and age-adjusted incidence and death rates, and 95% confidence intervals for rates. The USCS data are the official federal statistics on cancer incidence from registries having high-quality data and cancer mortality statistics for 50 states and the District of Columbia. USCS are produced by the Centers for Disease Control and Prevention (CDC) and the National Cancer Institute (NCI), in collaboration with the North American Association of Central Cancer Registries (NAACCR). Mortality data are provided by the Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), National Vital Statistics System (NVSS).

  16. Genomic Data Commons Data Portal (GDC Data Portal)

    • rrid.site
    • scicrunch.org
    • +2more
    Updated May 24, 2025
    Cite
    (2025). Genomic Data Commons Data Portal (GDC Data Portal) [Dataset]. http://identifiers.org/RRID:SCR_014514
    Explore at:
    Dataset updated
    May 24, 2025
    Description

    A unified data repository of the National Cancer Institute (NCI)'s Genomic Data Commons (GDC) that enables data sharing across cancer genomic studies in support of precision medicine. The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Cancer Genome Characterization Initiative (CGCI). The GDC Data Portal provides a platform for efficiently querying and downloading high quality and complete data. The GDC also provides a GDC Data Transfer Tool and a GDC API for programmatic access.
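
    As a rough illustration of the programmatic access mentioned above, the sketch below builds (but does not send) a query URL for the public GDC API. The `cases` endpoint, the `filters`/`fields`/`size` parameters, and the project ID `TCGA-LUAD` are assumptions chosen for illustration, and `build_cases_query` is a hypothetical helper name; consult the GDC API documentation before relying on any of them.

```python
import json
from urllib.parse import urlencode

# Base URL of the public GDC API (assumed here; check the GDC documentation).
GDC_API_BASE = "https://api.gdc.cancer.gov"


def build_cases_query(project_id, size=5):
    """Return a URL that asks the cases endpoint for a few cases in one project."""
    # The filter is a small JSON document: field "in" a list of allowed values.
    filters = {
        "op": "in",
        "content": {"field": "project.project_id", "value": [project_id]},
    }
    params = {
        "filters": json.dumps(filters),
        "fields": "submitter_id,primary_site",
        "format": "JSON",
        "size": str(size),
    }
    return f"{GDC_API_BASE}/cases?{urlencode(params)}"


# Illustrative project ID; nothing is sent over the network here.
url = build_cases_query("TCGA-LUAD")
print(url)
```

    The URL can then be fetched with any HTTP client; keeping URL construction separate from the request makes the query easy to inspect before any data is downloaded.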

  17. SEER-Medicare Linked Database

    • datacatalog.hshsl.umaryland.edu
    Updated Oct 27, 2023
    Cite
    National Cancer Institute-Division of Cancer Control & Population Sciences (2023). SEER-Medicare Linked Database [Dataset]. https://datacatalog.hshsl.umaryland.edu/dataset/48
    Explore at:
    Dataset updated
    Oct 27, 2023
    Dataset provided by
    National Cancer Institute (http://www.cancer.gov/)
    Authors
    National Cancer Institute-Division of Cancer Control & Population Sciences
    Area covered
    United States
    Description

    This series of files links two large population-based sources providing detailed data about Medicare beneficiaries with cancer. The SEER (Surveillance, Epidemiology, and End Results) program consists of clinical, demographic, and cause of death information collected from tumor registries beginning January 1, 1973. The Medicare contribution includes all claims for covered health care services from beneficiaries’ time of eligibility until death. Linkage is processed biennially by SEER and Centers for Medicare and Medicaid Services (CMS) staff. Ninety-five percent of individuals age 65 and older are included in the SEER files. Due to privacy concerns, access to this database requires an application, a SEER-Medicare Data Use Agreement (DUA), and documentation of institutional review board approval. Additionally, the National Cancer Institute’s information technology contractor assesses a processing fee, the amount of which depends on the type and number of files requested.

  18. National Lung Screening Trial

    • cancerimagingarchive.net
    • dev.cancerimagingarchive.net
    dicom, docx, n/a +2
    Updated Sep 24, 2021
    Cite
    The Cancer Imaging Archive (2021). National Lung Screening Trial [Dataset]. http://doi.org/10.7937/TCIA.HMQ8-J677
    Explore at:
    Available download formats: docx, svs, dicom, n/a, sas, zip, and doc
    Dataset updated
    Sep 24, 2021
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Sep 24, 2021
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    Demographic Summary of Available Imaging

    Characteristic   Value (N = 26254)
    Age (years)      Mean ± SD: 61.4 ± 5
                     Median (IQR): 60 (57-65)
                     Range: 43-75
    Sex              Male: 15512 (59%)
                     Female: 10742 (41%)
    Race             White: 23969 (91.3%)
                     Black: 1135 (4.3%)
                     Asian: 547 (2.1%)
                     American Indian/Alaska Native: 88 (0.3%)
                     Native Hawaiian/Other Pacific Islander: 87 (0.3%)
                     Unknown: 428 (1.6%)
    Ethnicity        Not Available

    Background: The aggressive and heterogeneous nature of lung cancer has thwarted efforts to reduce mortality from this cancer through the use of screening. The advent of low-dose helical computed tomography (CT) altered the landscape of lung-cancer screening, with studies indicating that low-dose CT detects many tumors at early stages. The National Lung Screening Trial (NLST) was conducted to determine whether screening with low-dose CT could reduce mortality from lung cancer.

    Methods: From August 2002 through April 2004, we enrolled 53,454 persons at high risk for lung cancer at 33 U.S. medical centers. Participants were randomly assigned to undergo three annual screenings with either low-dose CT (26,722 participants) or single-view posteroanterior chest radiography (26,732). Data were collected on cases of lung cancer and deaths from lung cancer that occurred through December 31, 2009. This dataset includes the low-dose CT scans from 26,254 of these subjects, as well as digitized histopathology images from 451 subjects.

    Results: The rate of adherence to screening was more than 90%. The rate of positive screening tests was 24.2% with low-dose CT and 6.9% with radiography over all three rounds. A total of 96.4% of the positive screening results in the low-dose CT group and 94.5% in the radiography group were false positive results. The incidence of lung cancer was 645 cases per 100,000 person-years (1060 cancers) in the low-dose CT group, as compared with 572 cases per 100,000 person-years (941 cancers) in the radiography group (rate ratio, 1.13; 95% confidence interval [CI], 1.03 to 1.23). There were 247 deaths from lung cancer per 100,000 person-years in the low-dose CT group and 309 deaths per 100,000 person-years in the radiography group, representing a relative reduction in mortality from lung cancer with low-dose CT screening of 20.0% (95% CI, 6.8 to 26.7; P=0.004). The rate of death from any cause was reduced in the low-dose CT group, as compared with the radiography group, by 6.7% (95% CI, 1.2 to 13.6; P=0.02).
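
    The rate ratio and relative mortality reduction quoted above can be approximately reproduced from the rounded per-100,000 person-year rates. The short sketch below is illustrative only: the function names are hypothetical, and because it starts from rounded rates the mortality reduction comes out near 20.1% rather than the reported 20.0%.

```python
def rate_ratio(rate_a, rate_b):
    """Ratio of two event rates expressed in the same units."""
    return rate_a / rate_b


def relative_reduction(rate_intervention, rate_comparison):
    """Fractional reduction of the intervention rate relative to the comparison rate."""
    return (rate_comparison - rate_intervention) / rate_comparison


# Rounded rates per 100,000 person-years, taken from the Results paragraph.
ct_incidence, radiography_incidence = 645, 572
ct_deaths, radiography_deaths = 247, 309

incidence_rr = rate_ratio(ct_incidence, radiography_incidence)
mortality_reduction = relative_reduction(ct_deaths, radiography_deaths)

print(f"Incidence rate ratio: {incidence_rr:.2f}")            # 1.13
print(f"Relative mortality reduction: {mortality_reduction:.1%}")  # 20.1%
```

    The confidence intervals and P values reported in the trial cannot be recovered from these rounded rates alone; they depend on the underlying event counts and person-time.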

    Conclusions: Screening with the use of low-dose CT reduces mortality from lung cancer. (Funded by the National Cancer Institute; National Lung Screening Trial ClinicalTrials.gov number, NCT00047385).

    Data Availability: A summary of the National Lung Screening Trial and its available datasets is provided on the Cancer Data Access System (CDAS). CDAS is maintained by Information Management System (IMS), which is contracted by the National Cancer Institute (NCI) to hold and statistically analyze the NLST data. The full clinical data set from NLST is available through CDAS. Users of TCIA can download without restriction a publicly distributable subset of that clinical data, along with the CT and histopathology images collected during the trial. (These were previously restricted.)

  19. Lithuanian Cancer registry data

    • healthinformationportal.eu
    html
    Updated Sep 7, 2022
    Cite
    National Cancer Institute, Lithuania (2022). Lithuanian Cancer registry data [Dataset]. https://www.healthinformationportal.eu/health-information-sources/lithuanian-cancer-registry-data
    Explore at:
    Available download formats: html
    Dataset updated
    Sep 7, 2022
    Dataset provided by
    National Cancer Institute (http://www.cancer.gov/)
    Authors
    National Cancer Institute, Lithuania
    Area covered
    Lithuania
    Variables measured
    sex, title, topics, country, language, data_owners, description, geo_coverage, contact_email, free_keywords, and 7 more
    Measurement technique
    Registry data
    Description

    The National Cancer Institute’s Cancer Registry is a nationwide, population-based cancer registry that covers the entire territory of Lithuania and collects information about all new cancer cases (ICD-10-AM codes: C00-C96, D00-D09, D32-D33, D39.1, D42-D43, D45-D47).

    The main task of the Cancer Registry is to ensure that the registration of incident cancer cases is as complete and reliable as possible.

    In 1984 the Lithuanian Cancer Registry was established at the National Cancer Institute by the Order of the Minister of Health. The population-based Cancer Registry was set up in 1990.

  20. Cancer Moonshot Biobank - Acute Myeloid Leukemia Cancer Collection

    • dev.cancerimagingarchive.net
    • cancerimagingarchive.net
    dicom, json and svs +1
    Updated Dec 19, 2024
    + more versions
    Cite
    The Cancer Imaging Archive (2024). Cancer Moonshot Biobank - Acute Myeloid Leukemia Cancer Collection [Dataset]. http://doi.org/10.7937/PCTE-6M66
    Explore at:
    Available download formats: n/a, json and svs, dicom
    Dataset updated
    Dec 19, 2024
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Dec 19, 2024
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Moonshot Biobank is a National Cancer Institute initiative to support current and future investigations into drug resistance and sensitivity and other NCI-sponsored cancer research initiatives, with the aim of improving researchers' understanding of cancer and how to intervene in cancer initiation and progression. During the course of this study, biospecimens (blood and tissue removed during medical procedures) and associated data will be collected longitudinally from at least 1000 patients across at least 10 cancer types, who represent the demographic diversity of the U.S. and are receiving standard-of-care cancer treatment at multiple NCI Community Oncology Research Program (NCORP) sites.

    This collection contains de-identified radiology and histopathology imaging procured from subjects in NCI’s Cancer Moonshot Biobank - Acute Myeloid Leukemia Cancer (CMB-AML) cohort. Associated genomic, phenotypic and clinical data will be hosted by The Database of Genotypes and Phenotypes (dbGaP) and other NCI databases. A summary of Cancer Moonshot Biobank imaging efforts can be found on the Cancer Moonshot Biobank Imaging page.


National Cancer Institute 3D Structure Database

RRID:SCR_008211, nif-0000-21279, National Cancer Institute 3D Structure Database (RRID:SCR_008211), NCI DIS 3D Database

