100+ datasets found

National Cancer Institute 3D Structure Database
dknet.org
blog.neuinfo.org
Updated Jun 23, 2025
Cite
(2025). National Cancer Institute 3D Structure Database [Dataset]. http://identifiers.org/RRID:SCR_008211
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008211
Dataset updated
Jun 23, 2025
Description
The NCI DIS 3D database is a collection of 3D structures for over 400,000 drugs. It is a searchable database of three-dimensional structures developed from the chemistry database of the NCI Drug Information System (DIS), a file of about 450,000 primarily organic compounds that have been tested by NCI for anticancer activity. The DIS database is very similar in size and content to the proprietary databases used in the pharmaceutical industry. Its development began in the 1950s, and this history led to a number of problems in the generation of 3D structures. The structural information stored in the DIS is only the connection table for each drug, which is a list of which atoms are connected and how they are connected. This information can be searched to find drugs that share similar patterns of connections, which can correlate with similar biological activity. However, the cellular targets for drug action, as well as the drugs themselves, are three-dimensional objects, and advances in computer hardware and software have reached the point where they can be represented as such. In many cases the important points of interaction between a drug and its target can be represented by a 3D arrangement of a small number of atoms. Such a group of atoms is called a pharmacophore. The pharmacophore can be used to search 3D databases, and drugs that match the pharmacophore could have similar biological activity while having very different patterns of atomic connections. Having a diverse set of lead compounds increases the chances of finding an active compound with acceptable properties for clinical development. Sponsor: The ICBG are supported by the Cooperative Agreement mechanism, with funds from nine components of the NIH, the National Science Foundation, and the Foreign Agricultural Service of the USDA.
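To make the two ideas in this description concrete, here is a minimal sketch of a connection table and of a distance-based pharmacophore match. The data layout, the function names (`neighbors`, `matches_pharmacophore`), the coordinates, and the tolerance are all illustrative assumptions; this is not the DIS file format or the search method the database actually uses.

```python
from itertools import permutations
from math import dist

# Connection table for ethanol (heavy atoms only): which atoms exist
# and which pairs are bonded. Illustrative layout, not the DIS format.
atoms = ["C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1)]  # (atom index, atom index, bond order)


def neighbors(atom_index, bonds):
    """Return the sorted indices of atoms bonded to atom_index."""
    found = []
    for a, b, _order in bonds:
        if a == atom_index:
            found.append(b)
        elif b == atom_index:
            found.append(a)
    return sorted(found)


def matches_pharmacophore(elements, coords, pharm_elements, pharm_dists, tol=0.5):
    """Return True if some ordered selection of atoms has the required
    elements and pairwise distances (in angstroms) within tol.

    pharm_dists maps (i, j) positions in pharm_elements to a distance.
    """
    size = len(pharm_elements)
    for combo in permutations(range(len(elements)), size):
        if any(elements[idx] != want for idx, want in zip(combo, pharm_elements)):
            continue
        if all(
            abs(dist(coords[combo[i]], coords[combo[j]]) - d) <= tol
            for (i, j), d in pharm_dists.items()
        ):
            return True
    return False


# Hypothetical 3D molecule: element symbols plus made-up coordinates.
mol_elements = ["N", "O", "C"]
mol_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
```

With these toy values, a pharmacophore asking for an N and an O about 3.0 angstroms apart matches the molecule, while one asking for 5.0 angstroms does not. Note that the match uses only 3D positions, so two molecules with very different connection tables can both satisfy it.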
National Cancer Institute Imaging Data Commons (IDC) Collections
registry.opendata.aws
Updated May 10, 2023
+ more versions
Cite
Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov) team (2023). National Cancer Institute Imaging Data Commons (IDC) Collections [Dataset]. https://registry.opendata.aws/nci-imaging-data-commons/
Explore at:
Dataset updated
May 10, 2023
Dataset provided by
Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov) team
Description
Imaging Data Commons (IDC) is a repository within the Cancer Research Data Commons (CRDC) that manages imaging data and enables its integration with the other components of CRDC. IDC hosts a growing number of imaging collections contributed either by funded US National Cancer Institute (NCI) data collection activities or by individual researchers. Image data hosted by IDC is stored in DICOM format.
Lung Cancer Research Data
kaggle.com
Updated Apr 3, 2025
Cite
William Reichard (2025). Lung Cancer Research Data [Dataset]. https://www.kaggle.com/datasets/williamreichard/lung-cancer-research-data/discussion
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 3, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
William Reichard
Description
This data was collected by the National Cancer Institute in 2021. This dataset provides detailed information on lung cancer patients, covering demographic attributes, medical history, treatment specifics, and survival outcomes.
Cancer Incidence - Surveillance, Epidemiology, and End Results (SEER)...
healthdata.gov
data.virginia.gov
+2 more
application/rdfxml +5
Updated Feb 13, 2021
Cite
(2021). Cancer Incidence - Surveillance, Epidemiology, and End Results (SEER) Registries Limited-Use [Dataset]. https://healthdata.gov/Health/Cancer-Incidence-Surveillance-Epidemiology-and-End/i3ww-np2h
Explore at:
Available download formats: application/rdfxml, csv, xml, application/rssxml, tsv, json
Dataset updated
Feb 13, 2021
Description
SEER Limited-Use cancer incidence data with associated population data. Geographic areas available are county and SEER registry. The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute collects and distributes high quality, comprehensive cancer data from a number of population-based cancer registries. Data include patient demographics, primary tumor site, morphology, stage at diagnosis, first course of treatment, and follow-up for vital status. The SEER Program is the only comprehensive source of population-based information in the United States that includes stage of cancer at the time of diagnosis and survival rates within each stage.
NCI State Late Stage Breast Cancer Incidence Rates
hub.arcgis.com
Updated Jan 21, 2020
+ more versions
Cite
National Cancer Institute (2020). NCI State Late Stage Breast Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/datasets/9dd0d923f8034cc8806173fdc224777d
Explore at:
Dataset updated
Jan 21, 2020
Dataset authored and provided by
National Cancer Institute
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset contains Cancer Incidence data for Breast Cancer (Late Stage^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.
Data are for females segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.
For more information, visit statecancerprofiles.cancer.gov
Data Notations
State Cancer Registries may provide more current or more local data.
Trend
Rising when 95% confidence interval of average annual percent change is above 0.
Stable when 95% confidence interval of average annual percent change includes 0.
Falling when 95% confidence interval of average annual percent change is below 0.
† Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.
‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.
Rates and trends are computed using different standards for malignancy. For more information see malignant.
^ Late Stage is defined as cases determined to be regional or distant. Due to changes in stage coding, Combined Summary Stage (2004+) is used for data from Surveillance, Epidemiology, and End Results (SEER) databases and Merged Summary Stage is used for data from National Program of Cancer Registries databases. Due to the increased complexity with staging, other staging variables may be used if necessary.
Data Source Field Key
(1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
(7) Source: SEER November 2022 submission.
(8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.
Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
Data for the United States does not include data from Nevada.
Data for the United States does not include Puerto Rico.
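The Trend rule in the data notations above (Rising, Stable, Falling, based on where the 95% confidence interval of the average annual percent change sits relative to 0) can be sketched as a small function. The function name and argument names are hypothetical and do not correspond to field names in the dataset.

```python
def classify_trend(ci_lower, ci_upper):
    """Classify a trend from the 95% confidence interval of the
    average annual percent change (AAPC).

    Rising  : the whole interval is above 0
    Falling : the whole interval is below 0
    Stable  : the interval includes 0
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        return "Rising"
    if ci_upper < 0:
        return "Falling"
    return "Stable"
```

For example, an interval of (0.4, 1.9) is classified as Rising, (-1.2, 0.3) as Stable, and (-2.5, -0.1) as Falling. An interval with an endpoint exactly at 0 is treated here as including 0, which is an assumption about boundary handling that the notes do not spell out.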
SEER Cancer Statistics Database
data.niaid.nih.gov
Updated Jul 11, 2011
Cite
(2011). SEER Cancer Statistics Database [Dataset]. http://doi.org/10.7910/DVN/C9KBBC
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/C9KBBC
Dataset updated
Jul 11, 2011
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Users can access data about cancer statistics in the United States, including but not limited to searches by type of cancer and by race, sex, ethnicity, age at diagnosis, and age at death.
Background: The mission of the Surveillance, Epidemiology, and End Results (SEER) database is to provide information on cancer statistics to help reduce the burden of disease in the U.S. population. The SEER database is a project of the National Cancer Institute. It collects information on incidence, prevalence, and survival from specific geographic areas representing 28 percent of the United States population.
User functionality: Users can access a variety of resources. Cancer Stat Fact Sheets allow users to look at summaries of statistics by major cancer type. Cancer Statistic Reviews are available from 1975-2008 in table format. Users are also able to build their own tables and graphs using Fast Stats. The Cancer Query system provides more flexibility and a larger set of cancer statistics than Fast Stats but requires more input from the user. State Cancer Profiles include dynamic maps and graphs enabling the investigation of cancer trends at the county, state, and national levels. SEER research data files and SEER*Stat software are available to download through your Internet connection (SEER*Stat’s client-server mode) or via discs shipped directly to you. A signed data agreement form is required to access the SEER data.
Data Notes: Data is available in different formats depending on which type of data is accessed. Some data is available in table, PDF, and html formats. Detailed information about the data is available under “Data Documentation and Variable Recodes”.
NCI State Breast Cancer Incidence Rates
hub.arcgis.com
Updated Jan 2, 2020
Cite
National Cancer Institute (2020). NCI State Breast Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/maps/NCI::nci-state-breast-cancer-incidence-rates
Explore at:
Dataset updated
Jan 2, 2020
Dataset authored and provided by
National Cancer Institute (http://www.cancer.gov/)
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset contains Cancer Incidence data for Breast Cancer (All Stages^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.
Data are for females segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.
For more information, visit statecancerprofiles.cancer.gov
Data Notations
State Cancer Registries may provide more current or more local data.
Trend
Rising when 95% confidence interval of average annual percent change is above 0.
Stable when 95% confidence interval of average annual percent change includes 0.
Falling when 95% confidence interval of average annual percent change is below 0.
† Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.
‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.
Rates and trends are computed using different standards for malignancy. For more information see malignant.
^ All Stages refers to any stage in the Surveillance, Epidemiology, and End Results (SEER) summary stage.
Data Source Field Key
(1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
(7) Source: SEER November 2022 submission.
(8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.
Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
Data for the United States does not include data from Nevada.
Data for the United States does not include Puerto Rico.
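The † note above says incidence rates are cases per 100,000 population per year, age-adjusted to a standard population. A minimal sketch of direct age standardization follows. The three age groups, case counts, populations, and standard weights are made-up toy values, not the 19-group 2000 US standard population, and the function name is hypothetical. SEER*Stat's actual computation may differ in detail.

```python
def age_adjusted_rate(cases, populations, standard_pop, per=100_000):
    """Direct age standardization.

    Each age group's crude rate (cases / population) is weighted by that
    group's share of the standard population, then the weighted rates
    are summed and scaled to a rate per `per` people.
    """
    if not (len(cases) == len(populations) == len(standard_pop)):
        raise ValueError("all inputs must have one entry per age group")
    total_standard = sum(standard_pop)
    rate = 0.0
    for n_cases, n_pop, n_std in zip(cases, populations, standard_pop):
        weight = n_std / total_standard      # share of the standard population
        rate += weight * (n_cases / n_pop)   # weighted age-specific rate
    return rate * per


# Toy inputs for three hypothetical age groups (young, middle, older).
toy_cases = [10, 50, 200]
toy_populations = [100_000, 100_000, 50_000]
toy_standard = [5, 3, 2]  # relative sizes; normalized inside the function

adjusted = age_adjusted_rate(toy_cases, toy_populations, toy_standard)
```

With these toy values the age-specific rates are 10, 50, and 400 per 100,000, and the weights are 0.5, 0.3, and 0.2, so the adjusted rate is approximately 100 per 100,000. Using a fixed standard population is what makes rates comparable across states with different age structures.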
NCI State Lung Cancer Incidence Rates
hub.arcgis.com
arc-gis-hub-home-arcgishub.hub.arcgis.com
Updated Jan 2, 2020
Cite
National Cancer Institute (2020). NCI State Lung Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/maps/NCI::nci-state-lung-cancer-incidence-rates
Explore at:
Dataset updated
Jan 2, 2020
Dataset authored and provided by
National Cancer Institute
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset contains Cancer Incidence data for Lung Cancer (All Stages^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.
Data are segmented by sex (Both Sexes, Male, and Female) and age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.
For more information, visit statecancerprofiles.cancer.gov
Data Notations
State Cancer Registries may provide more current or more local data.
Trend
Rising when 95% confidence interval of average annual percent change is above 0.
Stable when 95% confidence interval of average annual percent change includes 0.
Falling when 95% confidence interval of average annual percent change is below 0.
† Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.
‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.
Rates and trends are computed using different standards for malignancy. For more information see malignant.
^ All Stages refers to any stage in the Surveillance, Epidemiology, and End Results (SEER) summary stage.
Data Source Field Key
(1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
(7) Source: SEER November 2022 submission.
(8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.
Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
Data for the United States does not include data from Nevada.
Data for the United States does not include Puerto Rico.
List of all reprocessed vs. reprocessed differentially expressed genes...
plos.figshare.com
csv
Updated Mar 4, 2025
Cite
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). List of all reprocessed vs. reprocessed differentially expressed genes (DEGs) comparing tumor data from the GDC and normal data from the GTEx. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s004
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.1371/journal.pone.0318676.s004
Dataset updated
Mar 4, 2025
Dataset provided by
PLOS (http://plos.org/)
Authors
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Reprocessed counts were generated using our GDC RNA-seq workflow implementation. NA rank changes indicate the DEG cannot be found in the other DEG list. (CSV)
Current Active Clinical Trials - Roswell Park Cancer Institute
catalog.data.gov
datadiscoverystudio.org
+3 more
Updated Jun 21, 2025
Cite
data.ny.gov (2025). Current Active Clinical Trials - Roswell Park Cancer Institute [Dataset]. https://catalog.data.gov/dataset/current-active-clinical-trials-roswell-park-cancer-institute
Explore at:
Dataset updated
Jun 21, 2025
Dataset provided by
data.ny.gov
Description
List of active studies submitted by Roswell Park Cancer Institute (RPCI) to National Cancer Institute (NCI) annually as part of the Cancer Center Report Grant reporting. It includes the primary site, protocol, principal investigator, date opened, phase and study name.
Data from: Identifying Compound-Target Associations by Combining Bioactivity...
acs.figshare.com
application/cdfv2
Updated Jun 2, 2023
Cite
Tiejun Cheng; Qingliang Li; Yanli Wang; Stephen H. Bryant (2023). Identifying Compound-Target Associations by Combining Bioactivity Profile Similarity Search and Public Databases Mining [Dataset]. http://doi.org/10.1021/ci200192v.s001
Explore at:
Available download formats: application/cdfv2
Unique identifier
https://doi.org/10.1021/ci200192v.s001
Dataset updated
Jun 2, 2023
Dataset provided by
ACS Publications
Authors
Tiejun Cheng; Qingliang Li; Yanli Wang; Stephen H. Bryant
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were commonly tested in the US National Cancer Institute 60 human tumor cell line anticancer drug screen (NCI-60). Each compound was used as a query to search against the entire bioactivity profile database, and reference compounds with similar bioactivity profiles above a threshold of 0.75 were considered as neighbor compounds of the query. Potential targets were subsequently linked to the identified neighbor compounds by using the known targets of the query compound. About 45% of the predicted compound-target associations were successfully verified retrospectively, suggesting the possible application of BASS in identifying the targets of uncharacterized compounds and thus providing insight into the study of promiscuity and polypharmacology. Furthermore, BASS identified a significant fraction of structurally diverse compounds with similar bioactivities, indicating its feasibility of “scaffold hopping” in searching novel molecules against the target of interest.
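The neighbor step described above can be sketched roughly as follows: compare the query's bioactivity profile with every other profile, keep those above the 0.75 threshold, and link the query's known targets to those neighbors. The description does not say which similarity measure BASS uses, so Pearson correlation here is an assumption, and the function names, compound names, profiles, and target labels are all hypothetical toy values rather than NCI-60 data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def predict_associations(query, profiles, known_targets, threshold=0.75):
    """Link the query compound's known targets to each neighbor compound.

    A neighbor is any other compound whose profile similarity to the
    query exceeds the threshold.
    """
    query_targets = set(known_targets.get(query, ()))
    predictions = {}
    for name, profile in profiles.items():
        if name == query:
            continue
        if pearson(profiles[query], profile) > threshold:
            predictions[name] = set(query_targets)
    return predictions


# Toy profiles: one activity value per (hypothetical) cell line.
toy_profiles = {
    "cmpd_A": [1.0, 2.0, 3.0, 4.0],
    "cmpd_B": [2.0, 4.0, 6.0, 8.5],   # tracks cmpd_A closely
    "cmpd_C": [4.0, 3.0, 2.0, 1.0],   # opposite pattern
}
toy_targets = {"cmpd_A": {"target_1"}}
```

Calling `predict_associations("cmpd_A", toy_profiles, toy_targets)` links `target_1` to `cmpd_B` only, since `cmpd_C` is anti-correlated with the query. Because the comparison uses activity patterns rather than structure, neighbors found this way may be structurally unrelated, which is the "scaffold hopping" point made in the description.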
Tumor characteristics and prognosis.
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Jeremy B. Katzen; Kirtee Raparia; Rishi Agrawal; Jyoti D. Patel; Alfred Rademaker; John Varga; Jane E. Dematte (2023). Tumor characteristics and prognosis. [Dataset]. http://doi.org/10.1371/journal.pone.0117829.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0117829.t003
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Jeremy B. Katzen; Kirtee Raparia; Rishi Agrawal; Jyoti D. Patel; Alfred Rademaker; John Varga; Jane E. Dematte
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
NA: not assessed/available
WT: wild type
Tumor characteristics and prognosis.
Connecticut State Cancer Profile
catalog.data.gov
data.ct.gov
Updated Jun 21, 2025
+ more versions
Cite
data.ct.gov (2025). Connecticut State Cancer Profile [Dataset]. https://catalog.data.gov/dataset/ct-state-cancer-profile
Explore at:
Dataset updated
Jun 21, 2025
Dataset provided by
data.ct.gov
Area covered
Connecticut
Description
Dynamic views of cancer statistics from the National Cancer Institute
Data from: County-level cumulative environmental quality associated with...
catalog.data.gov
s.cnmilf.com
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). County-level cumulative environmental quality associated with cancer incidence. [Dataset]. https://catalog.data.gov/dataset/county-level-cumulative-environmental-quality-associated-with-cancer-incidence
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
Population based cancer incidence rates were abstracted from National Cancer Institute, State Cancer Profiles for all available counties in the United States for which data were available. This is a national county-level database of cancer data that are collected by state public health surveillance systems. All-site cancer is defined as any type of cancer that is captured in the state registry data, though non-melanoma skin cancer is not included. All-site age-adjusted cancer incidence rates were abstracted separately for males and females. County-level annual age-adjusted all-site cancer incidence rates for years 2006–2010 were available for 2687 of 3142 (85.5%) counties in the U.S. Counties for which there are fewer than 16 reported cases in a specific area-sex-race category are suppressed to ensure confidentiality and stability of rate estimates; this accounted for 14 counties in our study. Two states, Kansas and Virginia, do not provide data because of state legislation and regulations which prohibit the release of county level data to outside entities. Data from Michigan does not include cases diagnosed in other states because data exchange agreements prohibit the release of data to third parties. Finally, state data is not available for three states, Minnesota, Ohio, and Washington. The age-adjusted average annual incidence rate for all counties was 453.7 per 100,000 persons. We selected 2006–2010 as it is subsequent in time to the EQI exposure data which was constructed to represent the years 2000–2005. We also gathered data for the three leading causes of cancer for males (lung, prostate, and colorectal) and females (lung, breast, and colorectal). The EQI was used as an exposure metric as an indicator of cumulative environmental exposures at the county-level representing the period 2000 to 2005. A complete description of the datasets used in the EQI are provided in Lobdell et al. and methods used for index construction are described by Messer et al. 
The EQI was developed for the period 2000–2005 because it was the time period for which the most recent data were available when index construction was initiated. The EQI includes variables representing each of the environmental domains. The air domain includes 87 variables representing criteria and hazardous air pollutants. The water domain includes 80 variables representing overall water quality, general water contamination, recreational water quality, drinking water quality, atmospheric deposition, drought, and chemical contamination. The land domain includes 26 variables representing agriculture, pesticides, contaminants, facilities, and radon. The built domain includes 14 variables representing roads, highway/road safety, public transit behavior, business environment, and subsidized housing environment. The sociodemographic environment includes 12 variables representing socioeconomics and crime.
This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Human health data are not available publicly. EQI data are available at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: Data are stored as csv files.
This dataset is associated with the following publication: Jagai, J., L. Messer, K. Rappazzo, C. Gray, S. Grabich, and D. Lobdell. County-level environmental quality and associations with cancer incidence. Cancer. John Wiley & Sons Incorporated, New York, NY, USA, 123(15): 2901-2908, (2017).
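Two of the data-handling details in this description, the suppression of categories with fewer than 16 reported cases and the county coverage figure of 2687 of 3142 (85.5%), can be sketched as below. The function names and the example county counts are hypothetical, and using `None` to mark a suppressed value is an assumption about representation, not how State Cancer Profiles encodes it.

```python
SUPPRESSION_THRESHOLD = 16  # categories with fewer cases than this are suppressed


def suppress_small_counts(case_counts, threshold=SUPPRESSION_THRESHOLD):
    """Replace counts below the threshold with None (suppressed)."""
    return {
        area: (count if count >= threshold else None)
        for area, count in case_counts.items()
    }


def coverage_percent(available, total):
    """Share of counties with available rates, as a percentage rounded to one decimal."""
    return round(100 * available / total, 1)


# Hypothetical counts for three made-up counties.
toy_counts = {"county_a": 240, "county_b": 15, "county_c": 16}
toy_suppressed = suppress_small_counts(toy_counts)
```

Here `county_b` (15 cases) is suppressed while `county_c` (exactly 16) is kept, and `coverage_percent(2687, 3142)` reproduces the 85.5% figure quoted in the description.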
CDC WONDER: Cancer Statistics
healthdata.gov
data.virginia.gov
+5 more
application/rdfxml +5
Updated Feb 13, 2021
+ more versions
Cite
(2021). CDC WONDER: Cancer Statistics [Dataset]. https://healthdata.gov/dataset/CDC-WONDER-Cancer-Statistics/mv5s-m59f
Explore at:
Available download formats: xml, tsv, application/rssxml, csv, application/rdfxml, json
Dataset updated
Feb 13, 2021
Description
The United States Cancer Statistics (USCS) online databases in WONDER provide cancer incidence and mortality data for the United States for the years since 1999, by year, state and metropolitan areas (MSA), age group, race, ethnicity, sex, childhood cancer classifications and cancer site. Report case counts, deaths, crude and age-adjusted incidence and death rates, and 95% confidence intervals for rates. The USCS data are the official federal statistics on cancer incidence from registries having high-quality data and cancer mortality statistics for 50 states and the District of Columbia. USCS are produced by the Centers for Disease Control and Prevention (CDC) and the National Cancer Institute (NCI), in collaboration with the North American Association of Central Cancer Registries (NAACCR). Mortality data are provided by the Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), National Vital Statistics System (NVSS).
Genomic Data Commons Data Portal (GDC Data Portal)
rrid.site
scicrunch.org
+2 more
Updated May 24, 2025
Cite
(2025). Genomic Data Commons Data Portal (GDC Data Portal) [Dataset]. http://identifiers.org/RRID:SCR_014514
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_014514
Dataset updated
May 24, 2025
Description
A unified data repository of the National Cancer Institute (NCI)'s Genomic Data Commons (GDC) that enables data sharing across cancer genomic studies in support of precision medicine. The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Cancer Genome Characterization Initiative (CGCI). The GDC Data Portal provides a platform for efficiently querying and downloading high quality and complete data. The GDC also provides a GDC Data Transfer Tool and a GDC API for programmatic access.
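As a rough illustration of the programmatic access mentioned above, the sketch below builds a query URL for the GDC API's `projects` endpoint and, separately, fetches it. The base URL `https://api.gdc.cancer.gov`, the `filters`/`fields`/`format`/`size` parameters, and the `program.name`, `project_id`, `name`, and `primary_site` field names reflect my understanding of the public GDC API and should be checked against the current GDC documentation. The helper names are hypothetical, and the response shape assumed in `fetch_projects` is not verified here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GDC_API = "https://api.gdc.cancer.gov"  # assumed public base URL


def build_projects_url(program, size=10):
    """Build a URL listing projects that belong to one program (e.g. "TCGA")."""
    filters = {
        "op": "in",
        "content": {"field": "program.name", "value": [program]},
    }
    params = {
        "filters": json.dumps(filters),
        "fields": "project_id,name,primary_site",
        "format": "JSON",
        "size": str(size),
    }
    return f"{GDC_API}/projects?{urlencode(params)}"


def fetch_projects(program, size=10):
    """Fetch matching projects. Requires network access; not called on import."""
    with urlopen(build_projects_url(program, size), timeout=30) as response:
        payload = json.load(response)
    # Assumed response layout: {"data": {"hits": [...]}}
    return payload["data"]["hits"]
```

Keeping URL construction separate from the network call makes the query easy to inspect or log before sending it. For large downloads, the description points to the GDC Data Transfer Tool rather than the API.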
SEER-Medicare Linked Database
datacatalog.hshsl.umaryland.edu
Updated Oct 27, 2023
Cite
National Cancer Institute-Division of Cancer Control & Population Sciences (2023). SEER-Medicare Linked Database [Dataset]. https://datacatalog.hshsl.umaryland.edu/dataset/48
Explore at:
Dataset updated
Oct 27, 2023
Dataset provided by
National Cancer Institute (http://www.cancer.gov/)
Authors
National Cancer Institute-Division of Cancer Control & Population Sciences
Area covered
United States
Description
This series of files links two large population-based sources providing detailed data about Medicare beneficiaries with cancer. The SEER (Surveillance, Epidemiology, and End Results) program consists of clinical, demographic, and cause-of-death information collected from tumor registries beginning on January 1, 1973. The Medicare contribution includes all claims for covered health care services from beneficiaries’ time of eligibility until death. Linkage is processed biennially by SEER and Centers for Medicare and Medicaid Services (CMS) staff. 95% of individuals age 65 and older are included in the SEER files. Because of privacy concerns, access to this database requires an application, a SEER-Medicare Data Use Agreement (DUA), and documentation of institutional review board approval. Additionally, the National Cancer Institute’s information technology contractor assesses a processing fee, the amount of which depends on the type and number of files requested.

National Lung Screening Trial

cancerimagingarchive.net
dev.cancerimagingarchive.net

dicom, docx, n/a +2

Updated Sep 24, 2021

Cite

The Cancer Imaging Archive (2021). National Lung Screening Trial [Dataset]. http://doi.org/10.7937/TCIA.HMQ8-J677

Explore at:

Available download formats: docx, svs, dicom, n/a, sas, zip, and doc

Unique identifier

https://doi.org/10.7937/TCIA.HMQ8-J677

Dataset updated

Sep 24, 2021

Dataset authored and provided by

The Cancer Imaging Archive

License

https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

Time period covered

Sep 24, 2021

Dataset funded by

National Cancer Institute (http://www.cancer.gov/)

Description

Demographic Summary of Available Imaging

Characteristic	Value (N = 26254)
Age (years)	Mean ± SD: 61.4 ± 5; Median (IQR): 60 (57-65); Range: 43-75
Sex	Male: 15512 (59%); Female: 10742 (41%)
Race	White: 23969 (91.3%); Black: 1135 (4.3%); Asian: 547 (2.1%); American Indian/Alaska Native: 88 (0.3%); Native Hawaiian/Other Pacific Islander: 87 (0.3%); Unknown: 428 (1.6%)
Ethnicity	Not Available

Background: The aggressive and heterogeneous nature of lung cancer has thwarted efforts to reduce mortality from this cancer through the use of screening. The advent of low-dose helical computed tomography (CT) altered the landscape of lung-cancer screening, with studies indicating that low-dose CT detects many tumors at early stages. The National Lung Screening Trial (NLST) was conducted to determine whether screening with low-dose CT could reduce mortality from lung cancer.

Methods: From August 2002 through April 2004, we enrolled 53,454 persons at high risk for lung cancer at 33 U.S. medical centers. Participants were randomly assigned to undergo three annual screenings with either low-dose CT (26,722 participants) or single-view posteroanterior chest radiography (26,732). Data were collected on cases of lung cancer and deaths from lung cancer that occurred through December 31, 2009. This dataset includes the low-dose CT scans from 26,254 of these subjects, as well as digitized histopathology images from 451 subjects.

Results: The rate of adherence to screening was more than 90%. The rate of positive screening tests was 24.2% with low-dose CT and 6.9% with radiography over all three rounds. A total of 96.4% of the positive screening results in the low-dose CT group and 94.5% in the radiography group were false positive results. The incidence of lung cancer was 645 cases per 100,000 person-years (1060 cancers) in the low-dose CT group, as compared with 572 cases per 100,000 person-years (941 cancers) in the radiography group (rate ratio, 1.13; 95% confidence interval [CI], 1.03 to 1.23). There were 247 deaths from lung cancer per 100,000 person-years in the low-dose CT group and 309 deaths per 100,000 person-years in the radiography group, representing a relative reduction in mortality from lung cancer with low-dose CT screening of 20.0% (95% CI, 6.8 to 26.7; P=0.004). The rate of death from any cause was reduced in the low-dose CT group, as compared with the radiography group, by 6.7% (95% CI, 1.2 to 13.6; P=0.02).

Conclusions: Screening with the use of low-dose CT reduces mortality from lung cancer. (Funded by the National Cancer Institute; National Lung Screening Trial ClinicalTrials.gov number, NCT00047385).

Data Availability: A summary of the National Lung Screening Trial and its available datasets is provided on the Cancer Data Access System (CDAS). CDAS is maintained by Information Management System (IMS), contracted by the National Cancer Institute (NCI) as keepers and statistical analyzers of the NLST trial data. The full clinical data set from NLST is available through CDAS. Users of TCIA can download without restriction a publicly distributable subset of that clinical data, along with the CT and histopathology images collected during the trial. (These were previously restricted.)
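
The rate ratio and the relative mortality reduction quoted in the Results paragraph can be checked approximately from the per-100,000 person-year rates given there. The sketch below uses only those rounded rates; the function names are illustrative, and because the inputs are rounded the results agree with the published figures only up to rounding. It does not attempt to reproduce the confidence intervals, which require the underlying counts.

```python
# Rates per 100,000 person-years, as quoted in the Results paragraph.
CT_INCIDENCE, XRAY_INCIDENCE = 645, 572   # lung cancer cases
CT_DEATHS, XRAY_DEATHS = 247, 309         # lung cancer deaths


def rate_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of two event rates expressed in the same units."""
    return rate_a / rate_b


def relative_reduction(treated: float, control: float) -> float:
    """Fractional reduction of the treated rate relative to the control rate."""
    return (control - treated) / control


incidence_rr = rate_ratio(CT_INCIDENCE, XRAY_INCIDENCE)
mortality_rrr = relative_reduction(CT_DEATHS, XRAY_DEATHS)

print(f"Incidence rate ratio (CT vs. radiography): {incidence_rr:.2f}")
print(f"Relative reduction in lung-cancer mortality: {mortality_rrr:.1%}")
```

The first value rounds to the reported rate ratio of 1.13. The second comes out at roughly 20%, in line with the reported 20.0% given that the input rates are themselves rounded.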

Lithuanian Cancer registry data
healthinformationportal.eu
html
Updated Sep 7, 2022
Cite
National Cancer Institute, Lithuania (2022). Lithuanian Cancer registry data [Dataset]. https://www.healthinformationportal.eu/health-information-sources/lithuanian-cancer-registry-data
Explore at:
Available download formats: html
Dataset updated
Sep 7, 2022
Dataset provided by
National Cancer Institute (http://www.cancer.gov/)
Authors
National Cancer Institute, Lithuania
Area covered
Lithuania
Variables measured
sex, title, topics, country, language, data_owners, description, geo_coverage, contact_email, free_keywords, and 7 more
Measurement technique
Registry data
Description
The National Cancer Institute’s Cancer Registry is a nationwide, population-based cancer registry that covers the entire territory of Lithuania and collects information about all new cancer cases (ICD-10-AM codes: C00-C96, D00-D09, D32-D33, D39.1, D42-D43, D45-D47).

The main task of the Cancer Registry is to ensure that the registration of incident cancer cases is as complete and reliable as possible.

In 1984 the Lithuanian Cancer Registry was established at the National Cancer Institute by the Order of the Minister of Health. The population-based Cancer Registry was set up in 1990.
Cancer Moonshot Biobank - Acute Myeloid Leukemia Cancer Collection
dev.cancerimagingarchive.net
cancerimagingarchive.net
dicom, json and svs +1
Updated Dec 19, 2024
Cite
The Cancer Imaging Archive (2024). Cancer Moonshot Biobank - Acute Myeloid Leukemia Cancer Collection [Dataset]. http://doi.org/10.7937/PCTE-6M66
Explore at:
Available download formats: n/a, json and svs, dicom
Unique identifier
https://doi.org/10.7937/PCTE-6M66
Dataset updated
Dec 19, 2024
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
Dec 19, 2024
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Moonshot Biobank is a National Cancer Institute initiative to support current and future investigations into drug resistance and sensitivity and other NCI-sponsored cancer research initiatives, with the aim of improving researchers' understanding of cancer and how to intervene in cancer initiation and progression. During the course of this study, biospecimens (blood and tissue removed during medical procedures) and associated data will be collected longitudinally from at least 1000 patients across at least 10 cancer types, who represent the demographic diversity of the U.S. and are receiving standard-of-care cancer treatment at multiple NCI Community Oncology Research Program (NCORP) sites.
This collection contains de-identified radiology and histopathology imaging procured from subjects in NCI’s Cancer Moonshot Biobank - Acute Myeloid Leukemia Cancer (CMB-AML) cohort. Associated genomic, phenotypic and clinical data will be hosted by The Database of Genotypes and Phenotypes (dbGaP) and other NCI databases. A summary of Cancer Moonshot Biobank imaging efforts can be found on the Cancer Moonshot Biobank Imaging page.

National Cancer Institute 3D Structure Database

RRID:SCR_008211, nif-0000-21279, National Cancer Institute 3D Structure Database (RRID:SCR_008211), NCI DIS 3D Database

