U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
A list of databases in cancer research.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Users can access data about cancer statistics in the United States, including but not limited to searches by type of cancer and by race, sex, ethnicity, age at diagnosis, and age at death.
Background
The mission of the Surveillance, Epidemiology, and End Results (SEER) database is to provide information on cancer statistics to help reduce the burden of disease in the U.S. population. The SEER database is a project of the National Cancer Institute. It collects information on incidence, prevalence, and survival from specific geographic areas representing 28 percent of the United States population.
User functionality
Users can access a variety of resources. Cancer Stat Fact Sheets allow users to look at summaries of statistics by major cancer type. Cancer Statistics Reviews are available for 1975-2008 in table format. Users can also build their own tables and graphs using Fast Stats. The Cancer Query system provides more flexibility and a larger set of cancer statistics than Fast Stats but requires more input from the user. State Cancer Profiles include dynamic maps and graphs enabling the investigation of cancer trends at the county, state, and national levels. SEER research data files and SEER*Stat software are available to download through your Internet connection (SEER*Stat’s client-server mode) or via discs shipped directly to you. A signed data agreement form is required to access the SEER data.
Data Notes
Data is available in different formats depending on which type of data is accessed. Some data is available in table, PDF, and HTML formats. Detailed information about the data is available under “Data Documentation and Variable Recodes”.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The dataset contains information on 213 cancer patients undergoing clinical or surgical treatment, characterized by sociodemographic and clinical data as well as data from the Care Transition Measure (CTM 15-Brazil). Data were collected 7 to 30 days after hospital discharge, between June and August 2019. Understanding these data can contribute to improving the quality of care transitions and avoiding hospital readmissions. To this end, this dataset contains a broad array of variables:
*gender
*age group
*place of residence
*race
*marital status
*schooling
*paid work activity
*type of treatment
*cancer staging
*metastasis
*comorbidities
*main complaint
*continuous-use medication
*diagnosis
*cancer type
*diagnostic year
*oncology treatment
*first hospitalization
*readmission in the last 30 days
*number of hospitalizations in the last 30 days
*readmission in the last 6 months
*number of hospitalizations in the last 6 months
*readmission in the last year
*number of hospitalizations in the last year
*questions 1-15 from CTM 15-Brazil
The data are presented as a single Excel XLSX file: cancer patient´s care transitions dataset.xlsx.
Analyses of this dataset could inform hospital readmission prevention strategies to be implemented by the hospital team. Researchers interested in the care transitions (CTs) of cancer patients can explore the variables described here extensively.
The project from which these data were extracted was approved by the institution’s research ethics committee (approval n. 3.266.259/2019) at Associação Hospital de Caridade Ijuí, Rio Grande do Sul, Brazil.
A database of oncogenes and tumor suppressor genes. Users can search by genes, chromosomes, and keywords. The completion of the human genome sequence allows researchers to rapidly identify and analyze genes of interest using computational approaches. The available annotations, including physical characterization and functional domains of known tumor-related genes, can thus be used to study the role of genes involved in carcinogenesis. The tumor-associated gene (TAG) database was designed to use information from well-characterized oncogenes and tumor suppressor genes to facilitate cancer research. All target genes were identified through a text-mining approach from the PubMed database. A semi-automatic information retrieval engine was built to collect specific information on these target genes from various resources and store it in the TAG database. At the current stage, 519 TAGs have been collected, including 198 oncogenes, 170 tumor suppressor genes, and 151 genes related to oncogenesis. Information in the TAG database can be browsed through web interfaces that support searching genes by chromosome or by keyword. The “consensus domain analysis” tool identifies conserved protein domains and GO terms among selected TAG genes. In addition, the “oncogenic domain analysis” tool can analyze the oncogenic potential of any user-provided protein based on a weighted term frequency table calculated from the TAG proteins. This study was supported by a grant from the National Research Program for Genomic Medicine (NRPGM) and by personnel from the Bioinformatics Center of the Center for Biotechnology and Biosciences at National Cheng Kung University, Taiwan.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Although all cancers are molecularly distinct, many share common driver mutations. Pan-cancer analysis uses next-generation sequencing (NGS), pan-cancer model systems, and pan-cancer projects such as The Cancer Genome Atlas (TCGA) to assess frequently mutated genes and other genomic abnormalities that are common among many cancer types, regardless of the tumor origin, providing new directions for tumor biology research. However, no study has yet objectively analyzed the results of pan-cancer studies on cancer biology. For this study, 999 articles on pan-cancer published from 2006 to 2020 were obtained from the Scopus database, and bibliometric methods were used to analyze citations, international cooperation, co-authorship, and keyword co-occurrence clusters. We also summarized the application of pan-cancer analysis in breast cancer. Our results show that pan-cancer studies were first published in 2006 and entered a period of rapid development after 2013. So far, 86 countries have carried out international cooperation in sharing research. Researchers from the United States and Canada have published the most articles and have made the most extensive contribution to this field, respectively. Author keyword analysis of the 999 articles shows that TCGA, biomarkers, NGS, immunotherapy, DNA methylation, prognosis, and several other keywords appear frequently; these terms are hot spots in pan-cancer studies. According to pan-cancer analysis, breast cancer has four subtypes (luminal A, luminal B, HER2, and basal-like). Breast cancer was also found to be genetically similar to pan-gynecological cancers, such as ovarian cancer, which indicates related etiology and possibly similar treatments. Collectively, with the emergence of new detection methods, new cancer databases, and the involvement of more researchers, pan-cancer analyses will play a greater role in cancer biology research.
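Keyword co-occurrence, one of the bibliometric measures named in this description, can be illustrated with a minimal sketch. The keyword lists below are invented placeholders rather than records from the 999 Scopus articles, and the function name is hypothetical; the description does not state which tool the authors used.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(articles):
    """Count how often each unordered pair of author keywords appears together."""
    pair_counts = Counter()
    for keywords in articles:
        # Normalise case and drop duplicates within a single article
        unique = sorted({k.strip().lower() for k in keywords})
        for pair in combinations(unique, 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical keyword lists, for illustration only
articles = [
    ["TCGA", "biomarkers", "prognosis"],
    ["TCGA", "immunotherapy", "prognosis"],
    ["NGS", "biomarkers", "TCGA"],
]
counts = keyword_cooccurrence(articles)
```

Pairs are stored in alphabetical order so that ("prognosis", "tcga") and ("tcga", "prognosis") are counted as the same edge; the resulting counts could serve as edge weights when clustering keywords.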
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Framing the investigation of diverse cancers as a machine learning problem has recently shown significant potential in multi-omics analysis and cancer research. These successful machine learning models are powered by high-quality training datasets with sufficient data volume and adequate preprocessing. However, while several public data portals exist, including The Cancer Genome Atlas (TCGA) multi-omics initiative and open databases such as LinkedOmics, these databases are not off-the-shelf for existing machine learning models. We propose MLOmics, an open cancer multi-omics database aiming to better serve the development and evaluation of bioinformatics and machine learning models. MLOmics contains 8,314 patient samples covering all 32 cancer types with four omics types, stratified features, and extensive baselines. Complementary support for downstream analysis and bio-knowledge linking is also included to support interdisciplinary analysis.
This database of cancer-related citations for publications authored by staff of CDC’s Division of Cancer Prevention and Control (DCPC) fosters collaboration among scientists throughout the world. It allows users to search for links to scientific articles authored or co-authored by DCPC researchers since 2000.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Cancer Moonshot Biobank is a National Cancer Institute initiative to support current and future investigations into drug resistance and sensitivity and other NCI-sponsored cancer research initiatives, with an aim of improving researchers' understanding of cancer and how to intervene in cancer initiation and progression. During the course of this study, biospecimens (blood and tissue removed during medical procedures) and associated data will be collected longitudinally from at least 1000 patients across at least 10 cancer types, who are receiving standard of care cancer treatment at multiple NCI Community Oncology Research Program (NCORP) sites.
This collection contains de-identified radiology and histopathology imaging procured from subjects in NCI’s Cancer Moonshot Biobank - Prostate Cancer (CMB-PCA) cohort. Associated genomic, phenotypic and clinical data will be hosted by The Database of Genotypes and Phenotypes (dbGaP) and other NCI databases. A summary of Cancer Moonshot Biobank imaging efforts can be found on the Cancer Moonshot Biobank Imaging page.
Comprehensive dataset of 8,955 Cancer treatment centers in United States as of June, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
We searched the NCBI BioProject database and downloaded 1,012 experiments with original sequences from 14 projects, covering 7 major cancer types: head and neck, lung, breast, prostate, gastric, colon, and liver cancer. For the sequence reads, we performed preprocessing and variant calling, followed by a series of filtering steps to remove non-functional variants and minimize false positives, which gave us a refined list of 6,981 variants. All raw data were downloaded from the NCBI BioProject database at https://www.ncbi.nlm.nih.gov/bioproject/. The BioProject IDs are: PRJNA485408, PRJNA448888, PRJEB15399, PRJNA281253, PRJEB4979, PRJNA343124, PRJNA603789, PRJNA603782, PRJNA575243, PRJNA475218, PRJNA281419, PRJEB32931, PRJNA307236, PRJNA407354.
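To locate the listed projects, the accessions can be turned into project page links. The sketch below assumes that appending an accession to the BioProject base URL given above resolves to that project's page; the helper name is hypothetical, and no download or variant-calling step is shown because the description does not specify the tools used.

```python
BIOPROJECT_BASE = "https://www.ncbi.nlm.nih.gov/bioproject/"

# The 14 accessions listed in the dataset description
BIOPROJECT_IDS = [
    "PRJNA485408", "PRJNA448888", "PRJEB15399", "PRJNA281253",
    "PRJEB4979", "PRJNA343124", "PRJNA603789", "PRJNA603782",
    "PRJNA575243", "PRJNA475218", "PRJNA281419", "PRJEB32931",
    "PRJNA307236", "PRJNA407354",
]


def bioproject_url(accession):
    """Build a project page URL (assumed pattern: base URL + accession)."""
    return BIOPROJECT_BASE + accession


urls = [bioproject_url(acc) for acc in BIOPROJECT_IDS]

# Group accessions by registering archive: PRJNA = NCBI, PRJEB = EBI/ENA
by_prefix = {}
for acc in BIOPROJECT_IDS:
    by_prefix.setdefault(acc[:5], []).append(acc)

for url in urls:
    print(url)
```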
Comprehensive dataset of 478 Cancer treatment centers in Germany as of June, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
https://www.zionmarketresearch.com/privacy-policy
The global cancer registry software market was valued at US$ 85.14 million in 2023 and is projected to reach US$ 204.07 million by 2032, at a CAGR of about 10.2% from 2024 to 2032.
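The stated growth rate can be checked against the two market-size figures. This is a minimal sketch that assumes 2023 is the base year and that growth compounds annually over the nine years from 2024 to 2032; the function name is illustrative.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate over a number of yearly periods."""
    return (end_value / start_value) ** (1 / periods) - 1


# Figures from the description: US$ 85.14 million (2023), US$ 204.07 million (2032)
rate = cagr(85.14, 204.07, 9)
print(f"Implied CAGR: {rate:.1%}")
```

Under these assumptions the implied rate agrees with the quoted figure of about 10.2%.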
Cancer Registry Software Market Size 2025-2029
The cancer registry software market size is forecast to increase by USD 121.9 million, at a CAGR of 14% between 2024 and 2029.
The market is witnessing significant growth, driven by the increasing prevalence of cancer cases and the rising demand for accurate and comprehensive data for clinical research in oncology. The growing number of cancer diagnoses worldwide necessitates advanced solutions for managing and analyzing patient data, fueling market expansion. Furthermore, the importance of data privacy and security in the healthcare sector poses a challenge for market participants. Ensuring the confidentiality and protection of sensitive patient information is crucial to maintain trust and regulatory compliance.
Companies in this market must navigate these challenges while continuing to innovate and deliver solutions that address the evolving needs of healthcare providers and researchers. By focusing on data security and privacy, as well as integrating advanced analytics capabilities, market participants can capitalize on the opportunities presented by the growing demand for cancer registry software.
What will be the Size of the Cancer Registry Software Market during the forecast period?
Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.
The market continues to evolve, driven by advancements in technology and the increasing demand for efficient and accurate cancer data management. Cloud computing plays a significant role in the market's dynamics, enabling remote access to data and reducing the need for on-premise infrastructure. Technical support, audit trails, and cancer surveillance are integral components of these solutions, ensuring data security and regulatory compliance. Data warehousing and business intelligence capabilities enable data cleansing, data validation, and data analysis, leading to improved data quality and clinical insights. User experience and customizable reports cater to diverse user needs, while machine learning and artificial intelligence facilitate predictive modeling and statistical analysis.
Healthcare regulations mandate stringent data governance and access control, making data security a top priority. Case management and healthcare IT integration streamline workflows and facilitate data exchange between various stakeholders. Database management and reporting features provide real-time data visualization and decision support, enhancing operational efficiency. Data migration and software updates ensure seamless integration with existing systems, while data validation and data entry tools maintain data accuracy. Tumor registry solutions enable comprehensive cancer surveillance and population health management, contributing to public health initiatives. The market's continuous dynamism reflects the ongoing integration of various technologies and the evolving needs of healthcare providers and regulatory bodies.
How is this Cancer Registry Software Industry segmented?
The cancer registry software industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
End-user
  Government and third party
  Pharma biotech and medical device companies
  Hospitals and medical practice
  Private payers
  Research institutes
Type
  Stand-alone software
  Integrated software
Deployment
  On-premises
  Cloud-based
Geography
  North America
    US
    Canada
    Mexico
  Europe
    France
    Germany
    Italy
    Spain
    UK
  APAC
    China
    Japan
  Rest of World (ROW)
By End-user Insights
The government and third party segment is estimated to witness significant growth during the forecast period.
Cancer registry software plays a vital role in assisting government and third-party agencies in managing and analyzing data related to cancer cases. These solutions enable the collection, storage, and processing of patient data, clinical information, and statistical analysis. The integration of business intelligence and data warehousing facilitates data mining, trend analysis, and pattern recognition, which is essential for public health planning and resource allocation. Machine learning and artificial intelligence technologies enhance the capabilities of cancer registry software by automating data entry, improving data accuracy, and enabling predictive modeling. User-friendly interfaces, customizable reports, and decision support systems cater to the needs of healthcare IT professionals, medical informatics specialists, and other stakeholders.
Database management, workflow management, and access control ensure data security and privacy, while data governance and da
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases (IQ-OTH/NCCD) lung cancer dataset was collected in the above-mentioned specialist hospitals over a period of three months in fall 2019. It includes CT scans of patients diagnosed with lung cancer in different stages, as well as healthy subjects. IQ-OTH/NCCD slides were marked by oncologists and radiologists in these two centers. The dataset contains a total of 1190 images representing CT scan slices of 110 cases (see Figure 1). These cases are grouped into three classes: normal, benign, and malignant. Of these, 40 cases are diagnosed as malignant, 15 as benign, and 55 are classified as normal. The CT scans were originally collected in DICOM format. The scanner used is a SOMATOM from Siemens. The CT protocol includes 120 kV and a slice thickness of 1 mm, with breath hold at full inspiration; a window width ranging from 350 to 1200 HU and a window center from 50 to 600 were used for reading. All images were de-identified before analysis. Written consent was waived by the oversight review board. The study was approved by the institutional review board of the participating medical centers. Each scan contains several slices. The number of slices ranges from 80 to 200, each representing an image of the human chest from different sides and angles. The 110 cases vary in gender, age, educational attainment, area of residence and living status. Some of them are employees of the Iraqi ministries of Transport and Oil; others are farmers and gainers. Most of them come from places in the middle region of Iraq, particularly the provinces of Baghdad, Wasit, Diyala, Salahuddin, and Babylon.
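The class imbalance in this dataset can be quantified from the case counts given above. The sketch below also derives inverse-frequency class weights, a common heuristic for imbalanced classification that is not part of the dataset description; the counts are per case, not per image, since per-class image counts are not stated here.

```python
# Case counts per class, as stated in the dataset description
case_counts = {"normal": 55, "benign": 15, "malignant": 40}

total_cases = sum(case_counts.values())

# Share of cases in each class
class_share = {label: n / total_cases for label, n in case_counts.items()}

# Inverse-frequency weights: total / (number of classes * class count).
# This weighting scheme is an assumption for illustration, not the authors' method.
class_weights = {
    label: total_cases / (len(case_counts) * n)
    for label, n in case_counts.items()
}

for label in case_counts:
    print(f"{label}: share={class_share[label]:.3f}, weight={class_weights[label]:.3f}")
```

With these weights the benign class, which has the fewest cases, contributes most per case to a weighted loss.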
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Data come from two different sources. Population-based lung cancer incidence rates for the period 2010-2014 (the most recent data) were abstracted from the National Cancer Institute State Cancer Profiles (Schwartz et al. 1996). This national county-level database of cancer data is collected by state public health surveillance systems. The domain-specific county-level environmental quality index (EQI) data for the period 2000-2005 were abstracted from the United States Environmental Protection Agency (USEPA) profile. Complete descriptions of the datasets used in the EQI are provided in Lobdell’s paper (Lobdell 2011). Data were merged based on the Federal Information Processing Standards (FIPS) code. Out of 3144 counties in the United States, this study has information available for 2602 counties, for the following reasons:
*data were not available for four states, namely Kansas, Michigan, Minnesota and Nevada, due to state legislation and regulations which prohibit the release of county-level data to outside entities
*counties whose lung cancer mortality information is missing were omitted from the data set
*Union County, Florida, is an outlier in terms of mortality information and was deleted from the data set
*in the process of local control analysis, this study encountered two non-informative clusters (clusters 28 and 29); a non-informative cluster is one for which either treatment or control group information is missing, and the information for these clusters was deleted from the data set
Three types of variables are used in this study: (i) lung cancer mortality as the outcome variable; (ii) a binary treatment indicator, PM2.5 high (greater than 10.59 µg/m³) vs. low (less than 10.59 µg/m³); and (iii) three potential X confounders for clustering, namely land EQI, sociodemographic EQI and built EQI. For each index, higher values correspond to poorer environmental quality (Jagai et al. 2017).
Because PM2.5 is one of the indicators used to measure air EQI, we do not consider the air EQI, to avoid confounding effects.
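The merge on FIPS code and the binary PM2.5 treatment indicator described above can be sketched as follows. All field names and sample rows are hypothetical, since the description does not give the file layout; FIPS codes are kept as zero-padded strings so that leading zeros are preserved.

```python
PM25_THRESHOLD = 10.59  # cut-off between "high" and "low" PM2.5 from the description


def merge_on_fips(cancer_rows, eqi_rows):
    """Inner-join cancer and EQI records on FIPS and add the treatment indicator."""
    eqi_by_fips = {row["fips"]: row for row in eqi_rows}
    merged = []
    for row in cancer_rows:
        eqi = eqi_by_fips.get(row["fips"])
        # Drop counties without EQI data or with a missing outcome
        if eqi is None or row.get("lung_mortality") is None:
            continue
        record = {**row, **eqi}
        # 1 = high PM2.5, 0 = low; a value exactly at the threshold is
        # not defined in the description and is treated as low here
        record["high_pm25"] = int(eqi["pm25"] > PM25_THRESHOLD)
        merged.append(record)
    return merged


# Hypothetical rows, for illustration only
cancer_rows = [
    {"fips": "01001", "lung_mortality": 55.2},
    {"fips": "01003", "lung_mortality": None},   # missing outcome, dropped
    {"fips": "01005", "lung_mortality": 48.0},
    {"fips": "99999", "lung_mortality": 40.0},   # no EQI record, dropped
]
eqi_rows = [
    {"fips": "01001", "pm25": 12.1, "land_eqi": 0.3, "sociod_eqi": -0.1, "built_eqi": 0.5},
    {"fips": "01003", "pm25": 9.8, "land_eqi": 0.1, "sociod_eqi": 0.2, "built_eqi": -0.4},
    {"fips": "01005", "pm25": 9.0, "land_eqi": -0.2, "sociod_eqi": 0.6, "built_eqi": 0.0},
]
merged = merge_on_fips(cancer_rows, eqi_rows)
```

The three EQI columns carried through the merge correspond to the land, sociodemographic, and built indices used as clustering confounders; the air EQI is left out, as in the description.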
Established in 2019, the Ovarian Cancer Database builds on 40 years of data collection across the region of the South East Scotland Cancer Network. The database holds data on diagnosis, treatment and outcomes of patients undergoing care within the region.
Population-based cancer incidence rates were abstracted from the National Cancer Institute State Cancer Profiles for all counties in the United States for which data were available. This is a national county-level database of cancer data that are collected by state public health surveillance systems. All-site cancer is defined as any type of cancer that is captured in the state registry data, though non-melanoma skin cancer is not included. All-site age-adjusted cancer incidence rates were abstracted separately for males and females. County-level annual age-adjusted all-site cancer incidence rates for years 2006–2010 were available for 2687 of 3142 (85.5%) counties in the U.S. Counties for which there are fewer than 16 reported cases in a specific area-sex-race category are suppressed to ensure confidentiality and stability of rate estimates; this accounted for 14 counties in our study. Two states, Kansas and Virginia, do not provide data because of state legislation and regulations which prohibit the release of county-level data to outside entities. Data from Michigan do not include cases diagnosed in other states because data exchange agreements prohibit the release of data to third parties. Finally, state data are not available for three states: Minnesota, Ohio, and Washington. The age-adjusted average annual incidence rate for all counties was 453.7 per 100,000 persons. We selected 2006–2010 as it is subsequent in time to the EQI exposure data, which were constructed to represent the years 2000–2005. We also gathered data for the three leading causes of cancer for males (lung, prostate, and colorectal) and females (lung, breast, and colorectal). The EQI was used as an exposure metric, as an indicator of cumulative environmental exposures at the county level representing the period 2000 to 2005. A complete description of the datasets used in the EQI is provided in Lobdell et al., and the methods used for index construction are described by Messer et al.
The EQI was developed for the period 2000–2005 because it was the time period for which the most recent data were available when index construction was initiated. The EQI includes variables representing each of the environmental domains. The air domain includes 87 variables representing criteria and hazardous air pollutants. The water domain includes 80 variables representing overall water quality, general water contamination, recreational water quality, drinking water quality, atmospheric deposition, drought, and chemical contamination. The land domain includes 26 variables representing agriculture, pesticides, contaminants, facilities, and radon. The built domain includes 14 variables representing roads, highway/road safety, public transit behavior, business environment, and subsidized housing environment. The sociodemographic environment includes 12 variables representing socioeconomics and crime. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Human health data are not available publicly. EQI data are available at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: Data are stored as csv files. This dataset is associated with the following publication: Jagai, J., L. Messer, K. Rappazzo, C. Gray, S. Grabich, and D. Lobdell. County-level environmental quality and associations with cancer incidence. Cancer. John Wiley & Sons Incorporated, New York, NY, USA, 123(15): 2901-2908, (2017).
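The suppression rule described for the incidence data (categories with fewer than 16 reported cases are suppressed) can be sketched as a simple filter. The record fields and sample values are hypothetical; only the threshold of 16 and the county totals come from the description.

```python
SUPPRESSION_THRESHOLD = 16  # categories with fewer reported cases are suppressed


def suppress_small_counts(records, threshold=SUPPRESSION_THRESHOLD):
    """Return copies of the records with rates blanked out for small case counts."""
    result = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are left unchanged
        rec["suppressed"] = rec["cases"] < threshold
        if rec["suppressed"]:
            rec["rate"] = None
        result.append(rec)
    return result


# Hypothetical county records, for illustration only
records = [
    {"fips": "01001", "cases": 240, "rate": 460.1},
    {"fips": "01003", "cases": 15, "rate": 512.9},
    {"fips": "01005", "cases": 16, "rate": 430.0},
]
cleaned = suppress_small_counts(records)

# County coverage reported in the description: 2687 of 3142 counties
coverage = 2687 / 3142
print(f"Coverage: {coverage:.1%}")
```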
Comprehensive dataset of 388 Cancer treatment centers in Russia as of June, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Ovarian cancer is the most common cause of death among gynaecological malignancies. Due to its asymptomatic development, the disease is often diagnosed at an advanced, incurable stage. Although ovarian cancer generally responds well to platinum- and taxane-based first-line chemotherapy, most patients develop relapse and chemotherapy resistance. Despite years of research, reliable diagnostic indicators and other diagnostic methods that enable early detection and are suitable for screening are still lacking. Therefore, many current studies aim to find new biomarkers with diagnostic, prognostic, and predictive potential, as well as new therapeutic targets. Early diagnosis is the most important factor in determining ovarian cancer survival. Current clinically applied diagnostic tools have had very limited success in early detection. The discovery of new diagnostic biomarkers for the early diagnosis of ovarian cancer is one of the major challenges of modern medicine. With advances in genomics and proteomics technology, many molecular biomarkers have been discovered and shown promise for ovarian cancer diagnosis, but further validation is still needed. To integrate and evaluate the numerous research resources and results, a literature review and information extraction analysis were carried out to construct an ovarian cancer biomarker library. The library is intended to assist in the discovery of new ovarian cancer biomarkers for the diagnosis and treatment of ovarian cancer related diseases, and to supplement the current inclusion of biomarkers, including data, classification, targets, etc. It covers the name/combination of biomarkers, molecular type, function, sensitivity, specificity, area under the curve, technique, algorithm, sample type, number of cases, target/signaling pathway, reference year, region, reference title, and corresponding URL source. The interface supports querying the above biomarker content.
This biomarker database will therefore help identify biomarkers conducive to early ovarian cancer diagnosis across molecular types, mechanisms of action, methods of action, and targets, support ovarian cancer treatment, and improve ovarian cancer survival rates.
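The record fields listed above suggest a simple structure for working with exported entries. The sketch below is a hypothetical in-memory model with a basic filter; the class, the function, and the placeholder records are assumptions for illustration and do not reflect the database's actual schema, interface, or contents.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BiomarkerRecord:
    """A subset of the fields named in the description (hypothetical layout)."""
    name: str
    molecular_type: str
    sample_type: str
    sensitivity: Optional[float] = None
    specificity: Optional[float] = None
    auc: Optional[float] = None


def query(records, molecular_type=None, min_auc=None):
    """Filter records by molecular type and/or a minimum area under the curve."""
    results = []
    for rec in records:
        if molecular_type is not None and rec.molecular_type != molecular_type:
            continue
        if min_auc is not None and (rec.auc is None or rec.auc < min_auc):
            continue
        results.append(rec)
    return results


# Placeholder records with made-up values, not real performance figures
records = [
    BiomarkerRecord("MARKER_A", "protein", "serum", 0.80, 0.75, 0.85),
    BiomarkerRecord("MARKER_B", "miRNA", "plasma", 0.70, 0.90, 0.78),
    BiomarkerRecord("MARKER_C", "protein", "serum"),
]
hits = query(records, molecular_type="protein", min_auc=0.8)
```

Records without a reported area under the curve are excluded whenever a minimum is requested, so missing values are never treated as passing the filter.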
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Recent studies have revealed that neural functions are involved in possibly every aspect of cancer development, serving as bridges connecting microenvironmental stressors, activities of intracellular subsystems, and cell survival. Elucidation of the functional roles played by the neural system could provide the missing links in developing a systems-level understanding of cancer biology. However, the existing information is highly fragmented and scattered across the literature and internet databases, making it difficult for cancer researchers to use. We have conducted computational analyses of transcriptomic data of cancer tissues in TCGA and tissues of healthy organs in GTEx, aiming to demonstrate how the functional roles played by the neural genes could be derived and what non-neural functions they are associated with, across different stages of 26 cancer types. Several novel discoveries are made, including: i) the expression of certain neural genes can predict the prognosis of a cancer patient; ii) cancer metastasis tends to involve specific neural functions; iii) cancers with low survival rates involve more neural interactions than those with high survival rates; iv) more malignant cancers involve more complex neural functions; and v) neural functions are probably induced to alleviate stresses and help the associated cancer cells to survive. A database, called NGC, has been developed to organize such derived neural functions and associations, along with gene expression data and functional annotations collected from public databases. It aims to provide an integrated and publicly available information resource that enables cancer researchers to take full advantage of the relevant information in their research, facilitated by tools provided by NGC.