Death rate has been age-adjusted to the 2000 U.S. standard population. Single-year data are only available for Los Angeles County overall, Service Planning Areas, Supervisorial Districts, City of Los Angeles overall, and City of Los Angeles Council Districts. Obesity can increase an individual’s lifetime risk of breast cancer. Promoting healthy food retail and physical activity and improving access to preventive care services are important measures that cities and communities can take to prevent breast cancer. For more information about the Community Health Profiles Data Initiative, please see the initiative homepage.
Explore the field of breast cancer diagnosis with the Wisconsin Breast Cancer dataset (Original). This dataset provides detailed attributes representing tumor characteristics observed in breast tissue samples. By analyzing these attributes, researchers and medical professionals can gain insights into tumor behavior and develop predictive models for cancer detection and prognosis.
| Feature | Description |
|---|---|
| 1. Sample code number | Unique identifier for each tissue sample. |
| 2. Clump Thickness | Assessment of the thickness of tumor cell clusters (1 - 10). |
| 3. Uniformity of Cell Size | Uniformity in the size of tumor cells (1 - 10). |
| 4. Uniformity of Cell Shape | Uniformity in the shape of tumor cells (1 - 10). |
| 5. Marginal Adhesion | Degree of adhesion of tumor cells to surrounding tissue (1 - 10). |
| 6. Single Epithelial Cell Size | Size of individual tumor cells (1 - 10). |
| 7. Bare Nuclei | Presence of nuclei without surrounding cytoplasm (1 - 10). |
| 8. Bland Chromatin | Assessment of chromatin structure in tumor cells (1 - 10). |
| 9. Normal Nucleoli | Presence of normal-looking nucleoli in tumor cells (1 - 10). |
| 10. Mitoses | Frequency of mitotic cell divisions (1 - 10). |
| 11. Class | Classification of tumor type (2 for benign, 4 for malignant). |
The Breast Cancer Wisconsin dataset is sourced from tissue samples collected for diagnostic purposes, with attributes derived from microscopic examination. The dataset is anonymized and made available for research purposes, contributing to advancements in cancer diagnosis and treatment.
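As a rough illustration of how the eleven attributes listed above could be read into named fields, here is a minimal Python sketch. The snake_case column names, the comma-separated layout, the `?` marker for missing values, and the sample rows are all assumptions made for the example, not details confirmed by this description.

```python
# Column order follows the feature list above; the snake_case names are our own.
COLUMNS = [
    "sample_code_number", "clump_thickness", "uniformity_of_cell_size",
    "uniformity_of_cell_shape", "marginal_adhesion", "single_epithelial_cell_size",
    "bare_nuclei", "bland_chromatin", "normal_nucleoli", "mitoses", "class",
]
# Class coding as stated in the feature list: 2 = benign, 4 = malignant.
CLASS_LABELS = {2: "benign", 4: "malignant"}


def parse_row(line):
    """Parse one comma-separated record into a dict keyed by COLUMNS."""
    fields = line.strip().split(",")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    record = {}
    for name, raw in zip(COLUMNS, fields):
        # Assumption: a missing value is written as "?"; store it as None.
        record[name] = None if raw == "?" else int(raw)
    record["label"] = CLASS_LABELS[record["class"]]
    return record


# Illustrative rows only, not taken from a verified copy of the file.
row = parse_row("1000025,5,1,1,1,2,1,3,1,1,2")
print(row["clump_thickness"], row["label"])
```

Keeping the numeric class code alongside a readable label makes it easy to check the mapping before training a model.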
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset of breast cancer patients was obtained from the November 2017 update of the SEER Program of the NCI, which provides information on population-based cancer statistics. The dataset involved female patients with infiltrating duct and lobular carcinoma breast cancer (SEER primary sites recode NOS histology codes 8522/3) diagnosed in 2006-2010. Patients with unknown tumour size, unknown examined regional LNs, or unknown positive regional LNs, and patients whose survival was less than 1 month, were excluded; thus, 4024 patients were ultimately included.
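The exclusion criteria above can be sketched as a simple record filter. The field names below (`tumour_size`, `regional_nodes_examined`, `regional_nodes_positive`, `survival_months`) are hypothetical, unknown values are represented as `None`, and treating an unknown survival time as ineligible is an added assumption.

```python
def is_eligible(patient):
    """Apply the exclusion criteria described above to one patient record."""
    required = ("tumour_size", "regional_nodes_examined", "regional_nodes_positive")
    # Exclude patients with any of the three measurements unknown (None here).
    if any(patient.get(key) is None for key in required):
        return False
    # Exclude patients whose survival was shorter than one month.
    # Assumption: an unknown survival time is also treated as ineligible.
    months = patient.get("survival_months")
    return months is not None and months >= 1


# Hypothetical records for illustration only.
patients = [
    {"tumour_size": 22, "regional_nodes_examined": 10,
     "regional_nodes_positive": 1, "survival_months": 60},
    {"tumour_size": None, "regional_nodes_examined": 4,
     "regional_nodes_positive": 0, "survival_months": 48},
    {"tumour_size": 15, "regional_nodes_examined": 3,
     "regional_nodes_positive": 0, "survival_months": 0},
]
cohort = [p for p in patients if is_eligible(p)]
print(len(cohort))
```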
This dataset, available on Kaggle, is a classic machine learning benchmark for binary classification problems. It contains data collected from fine needle aspirates (FNAs) of breast masses. The goal is to classify each mass as either benign or malignant.
Rate: Number of new cases of breast cancer (per 100,000) diagnosed at the regional or distant stage among females.
Definition: Age-adjusted incidence rate of invasive breast cancer per 100,000 female population.
Data Sources:
(1) NJ State Cancer Registry, Dec 31, 2015 Analytic File, using NCI SEER*Stat ver 8.2.1 (www.seer.cancer.gov/seerstat)
(2) NJ population estimates as calculated by the NCI's SEER Program, released January 2015, http://www.seer.cancer.gov/popdata/download.html.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Deaths from breast cancer - Directly age-Standardised Rates (DSR) per 100,000 population
Source: Office for National Statistics (ONS)
Publisher: Information Centre (IC) - Clinical and Health Outcomes Knowledge Base
Geographies: Local Authority District (LAD), Government Office Region (GOR), National, Primary Care Trust (PCT), Strategic Health Authority (SHA)
Geographic coverage: England
Time coverage: 2005-07, 2007
Type of data: Administrative data
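For readers unfamiliar with direct age standardisation, the following Python sketch shows the usual calculation: each age-specific rate is weighted by the share of that age group in a standard population. The age bands, counts, and standard-population weights are invented for illustration and are not the ONS or any official standard.

```python
def directly_standardised_rate(deaths, population, standard_population, per=100_000):
    """Weight each age-specific rate by the standard population and rescale."""
    total_weight = sum(standard_population.values())
    weighted = 0.0
    for group, weight in standard_population.items():
        # Age-specific rate in the study population for this age band.
        age_specific_rate = deaths[group] / population[group]
        weighted += weight * age_specific_rate
    return per * weighted / total_weight


# Illustrative inputs only (three coarse age bands).
deaths = {"<50": 10, "50-69": 60, "70+": 90}
population = {"<50": 100_000, "50-69": 50_000, "70+": 30_000}
standard = {"<50": 60_000, "50-69": 30_000, "70+": 10_000}

dsr = directly_standardised_rate(deaths, population, standard)
print(round(dsr, 1))
```

Because the weights come from a fixed standard population, rates computed this way can be compared across areas with different age structures.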
Number and rate of new cancer cases diagnosed annually from 1992 to the most recent diagnosis year available. Included are all invasive cancers and in situ bladder cancer with cases defined using the Surveillance, Epidemiology and End Results (SEER) Groups for Primary Site based on the World Health Organization International Classification of Diseases for Oncology, Third Edition (ICD-O-3). Random rounding of case counts to the nearest multiple of 5 is used to prevent inappropriate disclosure of health-related information.
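The description does not spell out the rounding scheme. One common form, unbiased random rounding, rounds a count up or down to an adjacent multiple of 5 with probabilities chosen so that the expected value equals the original count. The sketch below assumes that scheme; the function name and parameters are our own.

```python
import random


def random_round(count, base=5, rng=random):
    """Randomly round a non-negative count to an adjacent multiple of base."""
    remainder = count % base
    if remainder == 0:
        return count  # Already a multiple of base; leave unchanged.
    lower = count - remainder
    # Round up with probability remainder/base, so the expected value is count.
    if rng.random() < remainder / base:
        return lower + base
    return lower


rng = random.Random(0)  # Seeded generator so the illustration is repeatable.
rounded = [random_round(c, rng=rng) for c in (3, 12, 20, 47)]
print(rounded)
```

A count of 12, for example, becomes 10 with probability 3/5 and 15 with probability 2/5 under this scheme.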
https://dataful.in/terms-and-conditions
The dataset consists of the state-wise estimated incidence of breast cancer and cervical cancer in India as per the National Cancer Registry Programme. The estimates are computed using the age-specific incidence rates of 28 Population Based Cancer Registries (PBCRs) for 2012-2016 and the projected population (person-years). NB: Incidence estimates for breast cancer are available from 2016, while those for cervical cancer are available from 2015.
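The estimation approach described above, applying age-specific incidence rates to a projected population, amounts to a weighted sum. The Python sketch below shows the arithmetic with invented age bands, rates, and populations; none of the figures come from the National Cancer Registry Programme.

```python
def estimated_cases(rates_per_100k, projected_population):
    """Sum age-specific rate x person-years over all age groups."""
    return sum(
        rates_per_100k[group] * projected_population[group] / 100_000
        for group in rates_per_100k
    )


# Illustrative values only.
rates = {"30-39": 20.0, "40-49": 60.0, "50-59": 90.0}          # per 100,000
population = {"30-39": 500_000, "40-49": 400_000, "50-59": 300_000}

cases = estimated_cases(rates, population)
print(cases)
```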
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates. Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body. This breast cancer dataset was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg.
By UCI [source]
This dataset contains data on breast cancer diagnosis, a devastating medical condition that affects thousands of people around the world each year. The data comprise a patient ID, a diagnosis (Malignant or Benign), and 30 computed features extracted from a digitized image of a fine needle aspirate (FNA) of a breast mass. Features include radius, texture, perimeter, area, smoothness, compactness, concavity, and concave points, as well as symmetry and fractal dimension.
Created by researchers in General Surgery and Computer Science at the University of Wisconsin-Madison, led by Dr. William H. Wolberg with contributions from Professor W. Nick Street and Olvi L. Mangasarian, this dataset was used in research to predict breast cancer prognosis using linear programming methods. More recently, methods such as support vector machines have been employed to classify tumour types from this dataset, as well as for other tasks such as identifying hidden patterns with pattern recognition techniques like Artificial Neural Networks (ANN).
It has also been used for studies exploring unsupervised classification tools like Ant Colony Optimization for discovering meaningful relationships among variables, which can help physicians better understand the progression of certain types of tumors over time. For example, type cardinality analysis allowed researchers to determine a tumor’s heterogeneity before deciding on appropriate treatments, potentially leading to better prognoses overall. This Wisconsin Breast Cancer Diagnostic dataset is a valuable resource for scientists working on preventing or curing this disease.
- Developing a classifier that can accurately predict breast cancer diagnoses based on the provided features.
- Clustering patient data with similar diagnosis to discover trends or connections between certain symptoms and diagnoses.
- Optimizing feature selection algorithms to identify the most relevant predictors of breast cancer diagnosis from a given set of cell nuclei features.
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: unformatted-data.csv
File: wpbc.data.csv
| Column name | Description |
|:--------------|:--------------------------------|
| 119513 | ID number (Integer) |
| N | Diagnosis (Binary) |
| 31 | Radius (Real-valued) |
| 18.02 | Texture (Real-valued) |
| 27.6 | Perimeter (Real-valued) |
| 117.5 | Area (Real-valued) |
| 1013 | Smoothness (Real-valued) |
| 0.09489 | Compactness (Real-valued) |
| 0.1036 | Concavity (Real-valued) |
| 0.1086 | Symmetry (Real-valued) |
| 0.07055 | Fractal Dimension (Real-valued) |
| 0.1865 | Mean Intensity (Real-valued) |
| 0.06333 | Standard Error (Real-valued) |
| 0.6249 | Worst Radius (Real-valued) |
| 1.89 | Worst Texture (Real-valued) |
| 3.972 | Worst Perimeter (Real-valued) |
| 71.55 | Worst Area (Real-valued) |
| 0.004433 | Worst Smoothness (Real-valued) |
| 0.01421 | Worst Compactness (Real-valued) |
| 0.03233 | Worst Concavity (Real-valued) |
File: breast-cancer-wisconsin.data.csv
| Column name | Description |
|:--------------|:--------------------------------------|
| 119513 | ID number (Integer) |
| 1000025 | ID number (Integer) |
| 1.1 | Uniformity of Cell Size (Integer) |
| 1.2 | Uniformity of Cell Shape (Integer) |
| 1.3 | Single Epithelial Cell Size (Integer) |
| 1.4 | Bland Chromatin (Integer) |
| 1.5 | Normal Nucleoli (Integer) |
| 2.1 | Mitoses (Integer) |
File: wdbc.data.csv
| Column name | Description |
|:--------------|:----------------------------------------|
| 842302 | Patient ID number (Integer Type) |
| M | Diagnosis (Binary Type) |
| **...
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains Cancer Incidence data for Breast Cancer (All Stages^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.
Data are for females segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.
For more information, visit statecancerprofiles.cancer.gov
Data Notations
State Cancer Registries may provide more current or more local data.
Trend
- Rising when 95% confidence interval of average annual percent change is above 0.
- Stable when 95% confidence interval of average annual percent change includes 0.
- Falling when 95% confidence interval of average annual percent change is below 0.
† Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.
‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information. Rates and trends are computed using different standards for malignancy. For more information see malignant.
^ All Stages refers to any stage in the Surveillance, Epidemiology, and End Results (SEER) summary stage.
Data Source Field Key
(1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
(6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
(7) Source: SEER November 2022 submission.
(8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.
Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
Data for the United States does not include data from Nevada.
Data for the United States does not include Puerto Rico.
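The Rising / Stable / Falling rule in the data notations for this entry maps directly onto a comparison of the 95% confidence interval bounds with zero. A minimal Python sketch follows; the function and argument names are our own, and the inputs are illustrative.

```python
def classify_trend(ci_lower, ci_upper):
    """Label a trend from the 95% CI of the average annual percent change."""
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        return "Rising"    # Whole interval lies above 0.
    if ci_upper < 0:
        return "Falling"   # Whole interval lies below 0.
    return "Stable"        # Interval includes 0.


# Illustrative intervals only.
for bounds in [(0.2, 1.1), (-0.4, 0.6), (-1.5, -0.3)]:
    print(bounds, classify_trend(*bounds))
```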
The population of Middle Tennessee was assessed using publicly available data collected in 2009 describing demographic and breast cancer-related characteristics of the population.
* The value for each breast cancer risk factor was determined for each Middle Tennessee county, and counties were then ranked in numerical order from lowest to highest. The numerically ranked counties were then subdivided into quartiles, such that the three counties with the lowest risk factor values were placed in Quartile 1, and those with the highest were placed in Quartile 4. The range of risk factor values encompassed by each quartile is shown.
1 The percentage of the total female population in the county that is over the age of 50 years (a surrogate for menopause).
2 The breast cancer incidence per 100,000 women.
3 Breast cancer mortality per 100,000 women.
4 The percentage of all breast cancers that were diagnosed at Stage IV.
5 The percentage of all breast cancers that were diagnosed without a prior mammographic screening.
6 The percentage of the female population lacking any form of health insurance.
7 The median household income.
8 The percentage of the population possessing higher than a high school level education.
9 The percentage of the population that is not Caucasian.
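The ranking-and-quartile procedure described for the counties can be sketched in a few lines of Python. The county names and risk factor values below are hypothetical; twelve counties are used so that each quartile holds three, matching the description. Ties, if any, are broken by input order in this sketch.

```python
def assign_quartiles(values):
    """Map each county to a quartile (1-4) by ascending risk factor value."""
    ranked = sorted(values, key=values.get)  # Lowest value first.
    n = len(ranked)
    # Integer arithmetic splits the ranked list into four equal-sized groups.
    return {county: (index * 4) // n + 1 for index, county in enumerate(ranked)}


# Hypothetical counties and values for illustration only.
risk = {
    f"County {chr(65 + i)}": value
    for i, value in enumerate(
        [31.2, 28.4, 35.9, 30.1, 27.5, 33.3, 29.8, 36.4, 32.0, 26.9, 34.7, 30.6]
    )
}
quartiles = assign_quartiles(risk)
print(sorted(c for c, q in quartiles.items() if q == 1))
```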
Background: Increased BRCA1 and BRCA2 germline mutation rates have been reported in Ashkenazi Jewish women in North America, Europe and Israel, and have been mentioned as possibly related to a higher incidence of breast and ovarian cancer among these communities. The present study was carried out with the aim of obtaining evidence on the magnitude of breast cancer as a cause of death among Ashkenazi women in Brazil.
Methods: We reviewed all death certificates archived in the Jewish Burial Societies of São Paulo (1971-1997) and Porto Alegre (1948-1997), two of the main and oldest Jewish communities in Brazil. Observed breast cancer deaths were compared with expected deaths according to breast cancer mortality in the general population.
Results: The observed ratios were quite close to unity, suggesting a similar breast cancer mortality pattern among the Ashkenazi population and the general population in both cities. These results were similar whether analyzed before or after the mid-1980s, when mammography came to be increasingly performed in Brazil. Cancer proportional mortality ratios were 1.04 (0.83-1.29) in São Paulo and 1.16 (0.84-1.57) in Porto Alegre before 1985, and 1.17 (1.00-1.44) and 1.21 (0.81-1.79), respectively, between 1985 and 1997. Some evidence of the maintenance of protective factors such as high parity has been observed among Ashkenazi women in São Paulo.
Conclusion: A quite similar breast cancer mortality pattern was observed between Ashkenazi Jewish women and the general population in São Paulo and Porto Alegre, Brazil. These results may suggest an environmental role on germ mutation expression reported in this ethnic group.
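To make the observed-versus-expected comparison concrete, here is a simplified Python sketch of a proportional mortality ratio. It ignores age stratification and confidence intervals, which the study itself may have handled differently, and all counts are invented for illustration.

```python
def proportional_mortality_ratio(observed_cause, observed_total,
                                 reference_cause, reference_total):
    """Observed deaths from a cause divided by the number expected if the
    study group had the reference population's share of deaths from it."""
    expected = observed_total * reference_cause / reference_total
    return observed_cause / expected


# Illustrative counts only: 60 breast cancer deaths among 1,000 deaths in the
# study group, against 5,000 among 100,000 deaths in the reference population.
pmr = proportional_mortality_ratio(60, 1_000, 5_000, 100_000)
print(round(pmr, 2))
```

A ratio near 1 indicates that the cause accounts for a similar share of deaths in both groups.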
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: This nationwide study examined breast cancer (BC) incidence and mortality rates in Hungary between 2011–2019, and the impact of the Covid-19 pandemic on the incidence and mortality rates in 2020, using the databases of the National Health Insurance Fund (NHIF) and Central Statistical Office (CSO) of Hungary.
Methods: Our nationwide, retrospective study included patients who were newly diagnosed with breast cancer (International Classification of Diseases [ICD]-10 code C50) between Jan 1, 2011 and Dec 31, 2020. Age-standardized incidence and mortality rates (ASRs) were calculated using European Standard Populations (ESP).
Results: Annually, 7,729 to 8,233 new breast cancer cases were recorded in the NHIF database, and 3,550 to 4,909 all-cause deaths occurred within the BC population per year during the 2011-2019 period, while 2,096 to 2,223 breast cancer cause-specific deaths were recorded (CSO). Age-standardized incidence rates varied between 116.73 and 106.16/100,000 PYs, showing a mean annual change of -0.7% (95% CI: -1.21%–0.16%) and a total change of -5.41% (95% CI: -9.24 to -1.32). Age-standardized mortality rates varied between 26.65–24.97/100,000 PYs (mean annual change: -0.58%; 95% CI: -1.31–0.27%; p=0.101; total change: -5.98%; 95% CI: -13.36–2.66). Age-specific incidence rates significantly decreased between 2011 and 2019 in women aged 50–59, 60–69, 80–89, and ≥90 years (-8.22%, -14.28%, -9.14%, and -36.22%, respectively), while they increased in young females by 30.02% (95% CI: 17.01%–51.97%) during the same period. From 2019 to 2020 (the first COVID-19 pandemic year), breast cancer incidence nominally decreased by 12% (incidence rate ratio [RR]: 0.88; 95% CI: 0.69–1.13; 2020 vs. 2019), all-cause mortality nominally increased by 6% (RR: 1.06; 95% CI: 0.79–1.43) among breast cancer patients, and cause-specific mortality did not change (RR: 1.00; 95% CI: 0.86–1.15).
Conclusion: The incidence of breast cancer significantly decreased in older age groups (≥50 years) but increased among young females between 2011 and 2019, while cause-specific mortality in breast cancer patients showed a non-significant decrease. In 2020, the Covid-19 pandemic resulted in a nominal, but not statistically significant, 12% decrease in breast cancer incidence, with no significant increase in cause-specific breast cancer mortality observed during 2020.
Abstract
Introduction: Despite preventive actions, breast cancer (BC) in Brazil has high mortality, probably due to the identification of the tumor in advanced stages.
Objective: To analyze mortality from BC in the health micro-regions of Minas Gerais (MG), 2013-2017, and its possible association with social inequality.
Method: Ecological study, whose unit of analysis was the health micro-regions of MG. Mortality, sociodemographic and health data were extracted from SIM, IBGE, PROADESS, and DATASUS. Specific and age-standardized mortality rates were calculated, thematic maps were constructed, and statistical analyses were performed using the Moran Index and simple and multiple regression.
Results: From 2013-2017 there were 7,571 deaths from BC in MG. The micro-regions with the highest mortality are in the Center and East; those with the lowest are in the North and Northeast. Most variables had a high coefficient of variation and were significant in the simple linear regression model. In the multiple distal and proximal models, only the degree of urbanization was significant. All variables showed significant spatial autocorrelation and spatial dependence.
Conclusion: High mortality rates in the most urbanized micro-regions can be explained by reproductive and behavioral factors and by the distribution of health resources, which are concentrated in large urban centers.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Cancer diagnoses and age-standardised incidence rates for all types of cancer by age and sex including breast, prostate, lung and colorectal cancer.
https://dataful.in/terms-and-conditions
This dataset provides estimated mortality figures for cervical and breast cancer in India, which affect women nationwide, based on the National Cancer Registry Programme's report.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains mammography x-ray images from breast cancer screening at the Karolinska University Hospital, Stockholm, Sweden, collected by principal investigator Fredrik Strand at Karolinska Institutet. The dataset was compiled for AI research aimed at improving screening, diagnostics and prognostics of breast cancer.
The dataset is based on a selection of cases with and without a breast cancer diagnosis, taken from a more comprehensive source dataset.
1,103 cases of first-time breast cancer for women in the screening age range (40-74 years) during the included time period (November 2008 to December 2015) were included. Of these, a random selection of 873 cases have been included in the published dataset.
A random selection of 10,000 healthy controls during the same time period were included. Of these, a random selection of 7,850 cases have been included in the published dataset.
For each individual all screening mammograms, also repeated over time, were included; as well as the date of screening and the age. In addition, there are pixel-level annotations of the tumors created by a breast radiologist (small lesions such as micro-calcifications have been annotated as an area). Annotations were also drawn in mammograms prior to diagnosis; if these contain a single pixel it means no cancer was seen but the estimated location of the center of the future cancer was shown by a single pixel annotation.
In addition to images, the dataset also contains cancer data created at the Karolinska University Hospital and extracted through the Regional Cancer Center Stockholm-Gotland. This data contains information about the time of diagnosis and cancer characteristics including tumor size, histology and lymph node metastasis.
The precision of non-image data was decreased, through categorisation and jittering, to ensure that no single individual can be identified.
The following types of files are available:
- CSV: The following data is included (if applicable): cancer/no cancer (meaning breast cancer during 2008 to 2015), age group at screening, days from image to diagnosis (if any), cancer histology, cancer size group, ipsilateral axillary lymph node metastasis. There is one csv file for the entire dataset, with one row per image. Any information about cancer diagnosis is repeated for all rows for an individual who was diagnosed (i.e., it is also included in rows before diagnosis). For each exam date there is the assessment by radiologist 1, radiologist 2 and the consensus decision.
- DICOM: Mammograms. For each screening, four images for the standard views were acquired: left and right, mediolateral oblique and craniocaudal. There should be four files per examination date.
- PNG: Cancer annotations. For each DICOM image containing a visible tumor.
Access: The dataset is available upon request due to the size of the material. The image files in DICOM and PNG format comprise approximately 2.5 TB. Access to the CSV file including parametric data is possible via download as associated documentation.
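Since the CSV has one row per image and each examination date is expected to have four standard views, a quick completeness check might look like the Python sketch below. The column names (`individual_id`, `exam_date`, `laterality`, `view`) and the sample rows are hypothetical; the actual CSV header is not given in this description.

```python
import csv
import io
from collections import defaultdict

# The four standard views: left/right x mediolateral oblique (MLO)/craniocaudal (CC).
EXPECTED_VIEWS = {("left", "MLO"), ("left", "CC"), ("right", "MLO"), ("right", "CC")}


def incomplete_exams(csv_text):
    """Return (individual, exam date) pairs that lack the four standard views."""
    views = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["individual_id"], row["exam_date"])
        views[key].add((row["laterality"], row["view"]))
    return sorted(key for key, seen in views.items() if seen != EXPECTED_VIEWS)


# Hypothetical rows for illustration only.
sample = "\n".join([
    "individual_id,exam_date,laterality,view",
    "A,2010-01-05,left,MLO",
    "A,2010-01-05,left,CC",
    "A,2010-01-05,right,MLO",
    "A,2010-01-05,right,CC",
    "B,2011-03-09,left,MLO",
    "B,2011-03-09,right,MLO",
])
missing = incomplete_exams(sample)
print(missing)
```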
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of 1 .xlsx file, 2 .png files, 1 .json file and 1 .zip file:
- annotation_details.xlsx: The distribution of annotations in the previously mentioned six classes (mitosis, apoptosis, tumor nuclei, non-tumor nuclei, tubule, and non-tubule) is presented in an Excel spreadsheet.
- original.png: The input image.
- annotated.png: An example from the dataset. In the annotated image, blue circles indicate the tumor nuclei; pink circles show non-tumor nuclei such as blood cells, stroma nuclei, and lymphocytes; orange and green circles are mitosis and apoptosis, respectively; light blue circles are true lumen for tubules; and yellow circles represent white regions (non-lumen) such as fat, blood vessel, and broken tissues.
- data.json: The annotations for the BreCaHAD dataset are provided in JSON (JavaScript Object Notation) format. In the given example, the JSON file (ground truth) contains two mitosis and only one tumor nuclei annotations. Here, x and y are the coordinates of the centroid of the annotated object, and the values are between 0 and 1.
- BreCaHAD.zip: An archive file containing the dataset. Three folders are included: images (original images), groundTruth (json files), and groundTruth_display (groundTruth applied on original images).
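Because the centroid coordinates are stored as fractions between 0 and 1, they need to be scaled by the image dimensions before they can be drawn on an image. The Python sketch below assumes a JSON layout of class name mapped to a list of `{"x": ..., "y": ...}` objects and an image size of 1360 x 1024 pixels; both are assumptions for illustration and should be checked against the actual files.

```python
import json


def to_pixels(annotations, width, height):
    """Convert normalised centroid coordinates to integer pixel positions."""
    return {
        label: [(round(p["x"] * width), round(p["y"] * height)) for p in points]
        for label, points in annotations.items()
    }


# Hypothetical ground-truth content: two mitosis annotations, one tumor nucleus.
example = json.loads(
    '{"mitosis": [{"x": 0.25, "y": 0.5}, {"x": 0.75, "y": 0.125}],'
    ' "tumor": [{"x": 0.5, "y": 0.5}]}'
)
# Assumed image size in pixels.
pixels = to_pixels(example, width=1360, height=1024)
print(pixels["mitosis"], pixels["tumor"])
```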