41 datasets found
  1. Table 1_Personalized three-year survival prediction and prognosis forecast...

    • frontiersin.figshare.com
    xlsx
    Updated Oct 21, 2024
    Cite
    Buwei Teng; Xiaofeng Zhang; Mingshu Ge; Miao Miao; Wei Li; Jun Ma (2024). Table 1_Personalized three-year survival prediction and prognosis forecast by interpretable machine learning for pancreatic cancer patients: a population-based study and an external validation.xlsx [Dataset]. http://doi.org/10.3389/fonc.2024.1488118.s001
    Available download formats: xlsx
    Dataset updated
    Oct 21, 2024
    Dataset provided by
    Frontiers Media: http://www.frontiersin.org/
    Authors
    Buwei Teng; Xiaofeng Zhang; Mingshu Ge; Miao Miao; Wei Li; Jun Ma
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Purpose: The overall survival of patients with pancreatic cancer is extremely low. We aimed to establish a machine learning (ML)-based model to accurately predict the three-year survival and prognosis of pancreatic cancer patients.

    Methods: We analyzed pancreatic cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database between 2000 and 2021. Univariate and multivariate logistic analyses were employed to select variables. The Recursive Feature Elimination (RFE) method based on 6 ML algorithms was utilized for feature selection. To construct the predictive model, 13 ML algorithms were evaluated by area under the curve (AUC), area under the precision-recall curve (PRAUC), accuracy, sensitivity, specificity, precision, cross-entropy, Brier score, balanced accuracy (bacc) and F-beta score (fbeta). An optimal ML model was constructed to predict three-year survival, and the predictive results were explained by the SHapley Additive exPlanations (SHAP) framework. Meanwhile, 101 ML algorithm combinations were developed to select the model with the highest C-index for predicting the prognosis of pancreatic cancer patients.

    Results: A total of 20,064 pancreatic cancer patients from the SEER database were consecutively enrolled. We utilized eight clinical variables to establish the prediction model for three-year survival. The CatBoost model was selected as the best prediction model, with AUCs of 0.932 [0.924, 0.939], 0.899 [0.873, 0.934] and 0.826 [0.735, 0.919] in the training, internal test and external test sets, and with 0.839 [0.831, 0.847] accuracy, 0.872 [0.858, 0.887] sensitivity, 0.803 [0.784, 0.825] specificity and 0.832 [0.821, 0.853] precision. Surgery type had the greatest effect on three-year survival according to the SHAP results. For prognosis prediction, the “RSF+GBM” algorithm was the best prognostic model, with C-indexes of 0.774, 0.722 and 0.674 in the training, internal test and external test sets.

    Conclusions: Our ML models demonstrate excellent accuracy and reliability, offering more precise personalized prognostic prediction to pancreatic cancer patients.
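
    Several of the evaluation metrics named in this description (accuracy, sensitivity, specificity, precision, balanced accuracy and the Brier score) can be computed from predicted probabilities as in the minimal sketch below. It is an illustration only, not the authors' code: the function name `binary_metrics`, the 0.5 threshold, and the coding of the outcome of interest as 1 are all assumptions.

```python
import math


def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Threshold-based metrics plus the Brier score for binary labels.

    y_true: iterable of 0/1 labels (1 = outcome of interest, an assumption).
    y_prob: iterable of predicted probabilities for class 1.
    """
    y_true, y_prob = list(y_true), list(y_prob)
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    n = tp + fp + tn + fn
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return {
        "accuracy": _ratio(tp + tn, n),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": _ratio(tp, tp + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        # Brier score: mean squared gap between probability and outcome.
        "brier": _ratio(sum((p - y) ** 2 for y, p in zip(y_true, y_prob)), n),
    }


metrics = binary_metrics([1, 1, 0, 0], [0.9, 0.4, 0.2, 0.6])
```

    AUC, PRAUC and confidence intervals such as those reported above need ranking-based or resampling calculations that this sketch does not cover.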

  2. Data filtered from the SEER database

    • kaggle.com
    zip
    Updated Mar 21, 2023
    Cite
    Caorun980418 (2023). Data filtered from the SEER database [Dataset]. https://www.kaggle.com/datasets/caorun980418/data-filtered-from-the-seer-database
    Available download formats: zip (150857 bytes)
    Dataset updated
    Mar 21, 2023
    Authors
    Caorun980418
    Description

    Selection of study population: Adenocarcinoma (8140-3) and carcinosarcoma (8980-3) were selected using International Classification of Diseases for Oncology, 3rd edition (ICD-O-3) histology codes. Patients between 19 and 100 years of age diagnosed with carcinosarcoma or adenocarcinoma of the gallbladder from 2004 to 2015 were included in the study population. The exclusion criteria were: 1. no follow-up or vital status information; 2. non-first primary tumor or no clear pathological diagnosis; 3. marital status unknown; 4. race information unknown; 5. T stage unknown or T0 stage; 6. M stage unknown; 7. SEER stage information unknown; 8. incomplete surgical information.

    Selection of variables: The following clinical information was obtained for each patient: age, gender, race, marital status, SEER stage, T stage, M stage, AJCC stage, surgery, radiation therapy, chemotherapy, months of survival, and vital status. Patients were staged using the American Joint Committee on Cancer (AJCC) 6th edition staging system. The type of surgery was divided into three groups: no surgery (0), simple cholecystectomy (including local tumor resection or destruction (10-11, 20-27), simple resection of the primary site (30), complete resection of the primary site (40), and tumor reduction (50)), and radical cholecystectomy (60). Overall survival (OS) and cancer-specific survival (CSS) were the main outcomes. OS was defined as the time from the patient's diagnosis to death from any cause. CSS was defined as the interval between the patient's diagnosis of this tumor and death.
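
    The three-group surgery recoding described above could be expressed as follows. This is a sketch under the assumption that the surgery field holds the integer codes listed in the description; the function name `recode_surgery` and the group labels are hypothetical, and codes the description does not mention are rejected rather than guessed.

```python
# Codes grouped as "simple cholecystectomy" in the description:
# 10-11, 20-27, 30, 40 and 50.
SIMPLE_CODES = {10, 11, *range(20, 28), 30, 40, 50}


def recode_surgery(code: int) -> str:
    """Map a surgery code to one of the three groups in the description."""
    if code == 0:
        return "no surgery"
    if code in SIMPLE_CODES:
        return "simple cholecystectomy"
    if code == 60:
        return "radical cholecystectomy"
    # Anything else is not covered by the description, so fail loudly.
    raise ValueError(f"surgery code {code} is not covered by the description")


groups = [recode_surgery(c) for c in (0, 25, 60)]
```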

  3. NCI State Late Stage Breast Cancer Incidence Rates

    • hub.arcgis.com
    Updated Jan 21, 2020
    Cite
    National Cancer Institute (2020). NCI State Late Stage Breast Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/datasets/NCI::nci-state-late-stage-breast-cancer-incidence-rates/geoservice
    Dataset updated
    Jan 21, 2020
    Dataset authored and provided by
    National Cancer Institute: http://www.cancer.gov/
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains Cancer Incidence data for Breast Cancer (Late Stage^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.

    Data are for females segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.

    For more information, visit statecancerprofiles.cancer.gov

    Data Notations

    State Cancer Registries may provide more current or more local data.

    Trend
    Rising when 95% confidence interval of average annual percent change is above 0.
    Stable when 95% confidence interval of average annual percent change includes 0.
    Falling when 95% confidence interval of average annual percent change is below 0.

    † Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.

    ‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.

    Rates and trends are computed using different standards for malignancy. For more information see malignant.

    ^ Late Stage is defined as cases determined to be regional or distant. Due to changes in stage coding, Combined Summary Stage (2004+) is used for data from Surveillance, Epidemiology, and End Results (SEER) databases and Merged Summary Stage is used for data from National Program of Cancer Registries databases. Due to the increased complexity with staging, other staging variables may be used if necessary.

    Data Source Field Key
    (1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.
    (6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).
    (7) Source: SEER November 2022 submission.
    (8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.

    Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.
    Data for the United States does not include data from Nevada.
    Data for the United States does not include Puerto Rico.
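
    The Trend rule in the data notations (Rising, Stable or Falling depending on where the 95% confidence interval of the average annual percent change sits relative to 0) can be written as a small function. This is a sketch: the name `classify_trend` and its argument names are hypothetical, and treating an interval bound exactly equal to 0 as including 0 is an assumption.

```python
def classify_trend(ci_low: float, ci_high: float) -> str:
    """Label a trend from the 95% CI of the average annual percent change."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if ci_low > 0:
        return "Rising"   # whole interval above 0
    if ci_high < 0:
        return "Falling"  # whole interval below 0
    return "Stable"       # interval includes 0


labels = [classify_trend(0.4, 1.2), classify_trend(-0.3, 0.5), classify_trend(-2.1, -0.6)]
```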

  4. Table_1_A SEER database retrospective cohort of 547 patients with penile...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Oct 31, 2023
    Cite
    Bhatt, Arjun; Pasli, Melisa; Ju, Andrew; Ashley, Lucas W.; Sutton, Kent F.; Edwards, George (2023). Table_1_A SEER database retrospective cohort of 547 patients with penile non-squamous cell carcinoma: demographics, clinical characteristics, and outcomes.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001064703
    Dataset updated
    Oct 31, 2023
    Authors
    Bhatt, Arjun; Pasli, Melisa; Ju, Andrew; Ashley, Lucas W.; Sutton, Kent F.; Edwards, George
    Description

    Introduction: Little research has investigated the prevalence and distribution of the diverse pathologies of non-squamous cell carcinoma (non-SCC) of the penis. Although rare in clinical practice, these cancers have become a focus of greater importance among patients, clinicians, and researchers, particularly in developing countries. The principal objective of this study was to analyze the major types of penile non-SCC, elucidate common treatment pathways, and highlight outcomes including 5-year survival.

    Materials/methods: The Surveillance, Epidemiology, and End Results (SEER) database was queried between 2000 and 2018 to identify a retrospective cohort of patients with penile non-SCC. Demographic information, cancer characteristics, diagnostic methods, treatments administered, and survival were investigated.

    Results: A total of 547 cases of penile non-SCC were included in the analysis. The most prevalent non-SCC cancers included epithelial neoplasms, not otherwise specified (NOS) (15.4%), unspecified neoplasms (15.2%), basal cell neoplasms (13.9%), blood vessel tumors (13.0%), nevi and melanomas (11.7%), and ductal and lobular neoplasms (9.9%). Over half (56.7%) of patients elected to undergo surgical intervention. Patients rarely received systemic therapy (3.8%) or radiation (4.0%). Five-year survival was 35.5%. Patients who underwent surgery had greater annual survival for 0–10 years compared to those who did not have surgery. Significant differences in survival were found between patients who had regional, localized, and distant metastases (p < 0.05). A significant difference in survival was found for patients married at diagnosis versus those who were unmarried at diagnosis (p < 0.05). Lower survival rates were observed for patients older than 70 years.

    Discussion: Although less prevalent than SCC, penile non-SCC encompasses a diverse set of neoplasms. Patients in this cohort had a high utilization of surgical management leading to superior outcomes compared to those not receiving surgery. Radiation is an uncommonly pursued treatment pathway. Patient demographics and socioeconomic variables such as marital status may be valuable when investigating cancer outcomes. This updated database analysis can help inform diagnosis, management, and clinical outcomes for this rare group of malignancies.

  5. Baseline characteristics of the training cohort

    • figshare.com
    xlsx
    Updated Aug 29, 2025
    Cite
    Qiao Zhang (2025). Baseline characteristics of the training cohort [Dataset]. http://doi.org/10.6084/m9.figshare.26510554.v3
    Available download formats: xlsx
    Dataset updated
    Aug 29, 2025
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Qiao Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Clinical and pathological data of CRCLM patients who underwent primary tumor resection and chemotherapy from 2010 to 2015 were extracted from the SEER 8 registries database (SEER*Stat 8.4.3). A total of 3252 cases met the inclusion and exclusion criteria. Continuous variables were converted to categorical variables using X-tile software, which also determined optimal cutoff values for grouping. Patients were randomly divided into a training cohort (n=2276) and a validation cohort (n=976) in a 7:3 ratio.
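
    A random 7:3 split that reproduces the reported cohort sizes (2276 and 976 out of 3252) might look like the sketch below. The description does not say how the split was performed, so the shuffling approach, the seed, the rounding down of the training size and the name `split_cohort` are all assumptions.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs and split them into training and validation lists."""
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    # Round the training size down (assumption): 3252 * 0.7 = 2276.4 -> 2276.
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


training, validation = split_cohort(range(3252))
print(len(training), len(validation))
```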

  6. Supplementary file 2_Competing risk nomogram for predicting cancer-specific...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated May 12, 2025
    Cite
    Xiang, Sichun; Qian, Lili; Shen, Rongbin; Zhang, Yu; Ma, Chenyang; Shen, Jianping; Chen, Shana; Guo, Qing; Gu, Jianyou; Xiang, Jingjing (2025). Supplementary file 2_Competing risk nomogram for predicting cancer-specific survival in patients with primary bone diffuse large B-cell lymphoma: a SEER-based retrospective study.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002040001
    Dataset updated
    May 12, 2025
    Authors
    Xiang, Sichun; Qian, Lili; Shen, Rongbin; Zhang, Yu; Ma, Chenyang; Shen, Jianping; Chen, Shana; Guo, Qing; Gu, Jianyou; Xiang, Jingjing
    Description

    Background: Cardiovascular death (CVD) represents a significant determinant affecting the long-term survival outcomes of cancer patients, independent of primary tumor effects. Consequently, this study aims to identify prognostic factors in patients with primary bone diffuse large B-cell lymphoma (PB-DLBCL) using CVD as a competing risk and to develop a competing risk nomogram.

    Methods: Data for patients diagnosed with PB-DLBCL from 2000 to 2015 were sourced from the Surveillance, Epidemiology, and End Results (SEER) database, and a total of 1,224 PB-DLBCL patients were eventually included in this study. Multiple imputation was used to address missing data. Univariate Cox regression analysis and the best subset selection method were used for variable screening, from which overlapping independent risk factors were identified for subsequent multivariate Cox analysis and multivariate competing risk analysis. The Fine-Gray test was applied for univariate competing risk analysis. Significant variables from the multivariate competing risk analysis were selected as independent prognostic factors to construct a competing risk nomogram for predicting 1-, 5-, and 10-year cancer-specific survival (CSS). The model's performance was evaluated by the Harrell concordance index (C-index), time-dependent receiver operating characteristic (ROC) curves, and calibration curves.

    Results: Compared with the competing risk model, the conventional Cox regression model overestimates the impact of variables on the incidence of cancer-specific death (CSD). Age, income, B symptoms, Ann Arbor stage, primary site, laterality, chemotherapy, and systemic therapy were identified as independent risk factors for CSD. A competing risk nomogram was developed incorporating these variables to predict CSS. In the training set, the areas under the curve (AUC) for 1-, 5-, and 10-year CSS were 0.879, 0.848, and 0.839, respectively, while in the testing set, the AUC values were 0.794, 0.781, and 0.790, respectively. The C-index of the model was 0.853, 0.823, and 0.819 for 1-, 5-, and 10-year survival in the training set, and 0.777, 0.757, and 0.754 in the testing set. The calibration curve indicated favorable consistency for the competing risk nomogram.

    Conclusions: The competing risk nomogram was effectively utilized to predict CSS in patients with PB-DLBCL. It exhibited robust predictive performance and holds potential for enhancing treatment decision-making in clinical practice.
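
    For readers unfamiliar with the Harrell concordance index mentioned above, a plain version for right-censored data is sketched below. It is a simplification and not the authors' method: it ignores competing risks and the time-specific evaluation reported in the abstract, and the name `harrell_c_index` and its inputs are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data (no competing risks).

    times: follow-up times; events: 1 if the event was observed, 0 if censored;
    risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is only usable if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


c_index = harrell_c_index([1, 2, 3, 4], [1, 1, 0, 1], [4, 3, 2, 1])
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against observed event times.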

  7. NCI State Colorectal Cancer Incidence Rates

    • hub.arcgis.com
    Updated Jan 2, 2020
    Cite
    National Cancer Institute (2020). NCI State Colorectal Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/datasets/eb26abf367914e259d618d7ce03cc360
    Dataset updated
    Jan 2, 2020
    Dataset authored and provided by
    National Cancer Institute: http://www.cancer.gov/
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains Cancer Incidence data for Colorectal Cancer (All Stages^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2018 to 2022.

    Data are segmented by sex (Both Sexes, Male, and Female) and age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.

    For more information, visit statecancerprofiles.cancer.gov

    Data Notations

    State Cancer Registries may provide more current or more local data.

    Trend
    Rising when 95% confidence interval of average annual percent change is above 0.
    Stable when 95% confidence interval of average annual percent change includes 0.
    Falling when 95% confidence interval of average annual percent change is below 0.

    † Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (SEER areas use 20 age groups and NPCR areas use 19 age groups). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.

    ‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.

    Rates and trends are computed using different standards for malignancy. For more information see malignant.

    ^ All Stages refers to any stage. Due to changes in stage coding, Combined Summary Stage with Expanded Regional Codes (2004+) is used for data from Surveillance, Epidemiology, and End Results (SEER) databases and Merged Summary Stage is used for data from National Program of Cancer Registries databases. Due to the increased complexity with staging, other staging variables may be used if necessary.

    Data Source Field Key
    (2) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2024 submission).
    (7) Source: SEER November 2024 submission.
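
    The age adjustment mentioned in the notes is a direct standardization: each age group's crude rate is weighted by that group's share of a standard population. The sketch below shows the arithmetic only. The function name and the two-group example figures are made up for illustration and are not the 2000 US standard population weights or real counts.

```python
def age_adjusted_rate(cases, populations, standard_weights, per=100_000):
    """Directly standardized rate per `per` population.

    cases, populations: counts per age group in the area of interest.
    standard_weights: standard-population size (or share) per age group.
    """
    if not (len(cases) == len(populations) == len(standard_weights)):
        raise ValueError("all inputs must have one value per age group")
    total_weight = sum(standard_weights)
    rate = 0.0
    for count, population, weight in zip(cases, populations, standard_weights):
        # Age-specific crude rate, weighted by the standard population share.
        rate += (weight / total_weight) * (count / population)
    return rate * per


# Illustrative numbers only: two age groups with equal standard weights.
example_rate = age_adjusted_rate([10, 30], [10_000, 10_000], [1, 1])
```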

  8. NCI State Prostate Cancer Incidence Rates

    • hub.arcgis.com
    Updated Jan 2, 2020
    Cite
    National Cancer Institute (2020). NCI State Prostate Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/maps/NCI::nci-state-prostate-cancer-incidence-rates
    Dataset updated
    Jan 2, 2020
    Dataset authored and provided by
    National Cancer Institute: http://www.cancer.gov/
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains Cancer Incidence data for Prostate Cancer (All Stages^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2018 to 2022.

    Data are for males segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.

    For more information, visit statecancerprofiles.cancer.gov

    Data Notations

    State Cancer Registries may provide more current or more local data.

    Trend
    Rising when 95% confidence interval of average annual percent change is above 0.
    Stable when 95% confidence interval of average annual percent change includes 0.
    Falling when 95% confidence interval of average annual percent change is below 0.

    † Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (SEER areas use 20 age groups and NPCR areas use 19 age groups). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.

    ‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.

    Rates and trends are computed using different standards for malignancy. For more information see malignant.

    ^ All Stages refers to any stage. Due to changes in stage coding, Combined Summary Stage with Expanded Regional Codes (2004+) is used for data from Surveillance, Epidemiology, and End Results (SEER) databases and Merged Summary Stage is used for data from National Program of Cancer Registries databases. Due to the increased complexity with staging, other staging variables may be used if necessary.

    Data Source Field Key
    (2) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2024 submission).
    (7) Source: SEER November 2024 submission.

  9. Data from: Survival Outcome of Local versus Radical Resection for...

    • figshare.com
    txt
    Updated Oct 19, 2023
    Cite
    Shangcheng Yan (2023). Survival Outcome of Local versus Radical Resection for Jejunoileal Gastrointestinal Stromal Tumors: A Propensity Score-Matched Population-Based Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.22360540.v2
    Available download formats: txt
    Dataset updated
    Oct 19, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Shangcheng Yan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data on patients with jejunoileal gastrointestinal stromal tumors (JI GISTs) were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Predictor variables (age, sex, race, marital status, year of diagnosis, income, area, tumor size, T stage, N stage, M stage, grade, mitotic rate, tumor site, chemotherapy, lymphadenectomy and mode of surgery), survival time and outcome variables (all-cause death and disease-specific death) used in this study were recoded. Propensity score matching (PSM) was performed to reduce bias between the local resection (LR) and radical resection (RR) groups. The article using these datasets is published in the International Journal of Colorectal Disease.
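
    Propensity score matching can be done in several ways, and the description does not say which was used. The sketch below shows one common variant, greedy 1:1 nearest-neighbor matching within a caliper, applied to propensity scores that are assumed to have been estimated already. The function name, the 0.05 caliper and the example IDs are hypothetical.

```python
def greedy_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping a patient ID to a propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


pairs = greedy_match({"t1": 0.30, "t2": 0.70}, {"c1": 0.31, "c2": 0.68, "c3": 0.10})
```

    Treated patients with no control inside the caliper are left unmatched, which is why matched cohorts are usually smaller than the original groups.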

  10. Baseline characteristics of the validation cohort

    • figshare.com
    xlsx
    Updated Aug 29, 2025
    Cite
    Qiao Zhang (2025). Baseline characteristics of the validation cohort [Dataset]. http://doi.org/10.6084/m9.figshare.26520409.v2
    Available download formats: xlsx
    Dataset updated
    Aug 29, 2025
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Qiao Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Clinical and pathological data of CRCLM patients who underwent primary tumor resection and chemotherapy from 2010 to 2015 were extracted from the SEER 8 registries database (SEER*Stat 8.4.3). A total of 3252 cases met the inclusion and exclusion criteria. Continuous variables were converted to categorical variables using X-tile software, which also determined optimal cutoff values for grouping. Patients were randomly divided into a training cohort (n=2276) and a validation cohort (n=976) in a 7:3 ratio.

  11. Table_1_Clinicopathological Characteristics and Survival Outcomes of Primary...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Oct 21, 2021
    Cite
    Jiang, Xinjie; Chen, Cheng; Chen, Xudong; Xia, Fei; Wang, Weiguo (2021). Table_1_Clinicopathological Characteristics and Survival Outcomes of Primary Renal Leiomyosarcoma.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000856504
    Dataset updated
    Oct 21, 2021
    Authors
    Jiang, Xinjie; Chen, Cheng; Chen, Xudong; Xia, Fei; Wang, Weiguo
    Description

    Background: Primary renal leiomyosarcoma (LMS) is an exceedingly rare entity with a poor prognosis. We summarized the clinicopathological characteristics, treatment choice, and survival outcomes of LMS from the Surveillance, Epidemiology, and End Results (SEER) database.

    Methods: Renal LMS and kidney renal clear cell carcinoma (KIRC) data from 1998 to 2016 were collected from the SEER database. The continuous variables were analyzed using t-tests, while the categorical variables were analyzed using Pearson's chi-squared or Fisher's exact tests. Propensity score matching (PSM) was also performed. The cancer-specific survival (CSS) and overall survival (OS) curves were estimated using Kaplan-Meier analyses and compared by log-rank tests. The risk factors for CSS and OS were estimated using univariable and multivariable Cox proportional hazard regression models.

    Results: A total of 140 patients with renal LMS and 75,401 patients with KIRC were enrolled. These groups differed significantly in sex, race, tumor size, grade, SEER stage, surgery, radiation, and chemotherapy. Renal LMS exhibited poorer CSS and OS compared with KIRC before and after PSM. For renal LMS, the univariable Cox proportional hazard regression model indicated that larger tumor size, higher tumor grade, higher SEER stage, and chemotherapy were risk factors for CSS and OS, while surgery appeared to be a protective factor. However, only tumor grade, SEER stage, and receiving surgery remained independent prognostic factors in the multivariable Cox proportional hazard regression model. In addition, subgroup analyses indicated that surgery remained a protective factor for advanced renal LMS. However, there was no survival benefit for patients receiving chemotherapy.

    Conclusions: Primary renal LMS is an exceedingly rare entity with distinct clinicopathological features and a poor prognosis. A higher tumor grade and late stage may indicate a poor prognosis. Complete tumor resection remains the first treatment choice, while chemotherapy may be a palliative treatment for patients with advanced disease.
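
    The Kaplan-Meier estimate referred to in the methods is a running product over event times of one minus the fraction of at-risk patients who die at that time. A minimal sketch follows, with made-up follow-up data and a hypothetical function name; it does not implement the log-rank comparison or the Cox models.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times: follow-up times; events: 1 if death was observed, 0 if censored.
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at this time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```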

  12. Data_Sheet_1_Age-Related Sex Disparities in Esophageal Cancer Survival: A...

    • frontiersin.figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Zhen-Fei Xiang; Hua-Cai Xiong; Dan-Fei Hu; Ming-Yao Li; Zhan-Chun Zhang; Zheng-Chun Mao; Er-Dong Shen (2023). Data_Sheet_1_Age-Related Sex Disparities in Esophageal Cancer Survival: A Population-Based Study in the United States.PDF [Dataset]. http://doi.org/10.3389/fpubh.2022.836914.s001
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Frontiers Media: http://www.frontiersin.org/
    Authors
    Zhen-Fei Xiang; Hua-Cai Xiong; Dan-Fei Hu; Ming-Yao Li; Zhan-Chun Zhang; Zheng-Chun Mao; Er-Dong Shen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The association between sex and the survival of patients with esophageal cancer (EC) remains controversial. We sought to systematically investigate sex-based disparities in EC survival using the Surveillance, Epidemiology, and End Results (SEER) registry data from the United States.

    Methods: Patients with EC diagnosed from 2004 to 2015 registered in the SEER database were selected. The association between sex and cancer-specific survival (CSS) was evaluated using survival analysis. The Inverse Probability Weighting (IPW) approach was applied to reduce the observed bias between males and females. Subgroup analyses were used to investigate the robustness of the sex-based disparity and to explore potential interaction effects with other variables.

    Results: Overall, 29,312 eligible EC patients were analyzed, of whom 5,781 were females and 23,531 were males. Females had higher crude CSS compared to males (10-year CSS: 24.5 vs. 21.3%; P < 0.001). Similar results were obtained after adjusting for selection bias using the IPW approach and multivariate regression. Subgroup analyses confirmed the relative robustness of sex as a prognostic factor. However, significant interactions were observed between sex and other variables, such as age, race, tumor grade, histology, and treatment modality. In particular, there was no survival advantage for premenopausal females compared to their male counterparts, but the association between sex and EC survival was prominent in 46–55-year-old patients.

    Conclusions: Female EC patients had better long-term survival than males. The association between sex and EC survival varies according to age, race, tumor grade, histology, and treatment modality. Sex-based disparity in EC-specific survival was age-related in the United States population.
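
    Inverse probability weighting gives each patient a weight equal to the inverse of the estimated probability of belonging to the group they are actually in, so that the weighted groups resemble each other on measured covariates. The sketch below shows the weighting step only and assumes the probabilities were estimated elsewhere, for example by logistic regression. The function names and numbers are hypothetical, and the abstract does not state whether stabilized or truncated weights were used.

```python
def ipw_weights(group, propensity):
    """Unstabilized inverse probability weights.

    group: 1 or 0 per patient (for example, 1 = female, 0 = male).
    propensity: estimated probability of group 1 given covariates.
    """
    weights = []
    for g, p in zip(group, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity must lie strictly between 0 and 1")
        weights.append(1.0 / p if g == 1 else 1.0 / (1.0 - p))
    return weights


def weighted_mean(values, weights):
    """Weighted average, e.g. of an outcome indicator within one group."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


weights = ipw_weights([1, 0], [0.25, 0.5])
```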

  13. Supplementary Material for: Local Surgery Improves Survival in Patients with...

    • datasetcatalog.nlm.nih.gov
    • karger.figshare.com
    Updated Nov 21, 2019
    Cite
    H.-F. Sun; M.-T. Chen; Y. Zhao; Y.-Y. Zhao; X.-L. Yang; W. Jin (2019). Supplementary Material for: Local Surgery Improves Survival in Patients with Primary Metastatic Breast Cancer: A Population-Based Study [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000107355
    Explore at:
    Dataset updated
    Nov 21, 2019
    Authors
    H.-F. Sun; M.-T. Chen; Y. Zhao; Y.-Y. Zhao; X.-L. Yang; W. Jin
    Description

    The clinical value of local surgery in breast cancer patients with distant metastasis is still unclear. A total of 8,922 primary metastatic breast cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database were analyzed in the current study. Primary outcome variables included breast cancer-specific survival (BCSS) and overall survival (OS). Among the patients, 1,724 (19.3%) who underwent surgical treatment (ST) of the primary breast tumor had increased OS (p < 0.001) and BCSS (p < 0.001) compared with those in the nonsurgical treatment (NST) group. Multivariate analysis revealed that surgery improved survival and was an independent prognostic factor for OS (hazard ratio [HR] = 0.617; 95% confidence interval [CI], 0.562–0.676, p < 0.001) and BCSS (HR = 0.623; 95% CI, 0.565–0.686, p < 0.001). Further results showed that ST tended to prolong the survival of patients with 1 or 2 distant metastatic sites (p < 0.05 for OS, p < 0.05 for BCSS). However, no differences were found in prognostic outcomes between different surgical procedure groups (p = 0.886 for OS, p = 0.943 for BCSS). In conclusion, our study suggested that local surgery appeared to confer a survival benefit, which may provide new understanding of treatment for these patients.

  14. Table_1_Development and Validation of a Clinical Prognostic Nomogram for...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    • +1 more
    Updated Sep 2, 2021
    + more versions
    Cite
    Yu, Yue; Yi, Jun; Li, Qi-fan; Song, Hai-zhu; Shao, Chen-ye; Liu, Xiao-long; Shen, Yi (2021). Table_1_Development and Validation of a Clinical Prognostic Nomogram for Esophageal Adenocarcinoma Patients.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000890991
    Explore at:
    Dataset updated
    Sep 2, 2021
    Authors
    Yu, Yue; Yi, Jun; Li, Qi-fan; Song, Hai-zhu; Shao, Chen-ye; Liu, Xiao-long; Shen, Yi
    Description

    Background: Clinical staging is essential for clinical decisions but remains imprecise. We aimed to construct a novel survival prediction model to improve the clinical staging system (cTNM) for patients with esophageal adenocarcinoma (EAC). Methods: A total of 4180 patients diagnosed with EAC were extracted from the Surveillance, Epidemiology, and End Results (SEER) database and included as the training cohort. Significant prognostic variables were identified for nomogram model development using multivariable Cox regression. The model was validated internally by bootstrap resampling, and then subjected to external validation with a separate cohort of 886 patients from 2 institutions in China. The prognostic performance was measured by concordance index (C-index), Akaike information criterion (AIC) and calibration plots. Different risk groups were stratified by the nomogram scores. Results: A total of six variables were found to be related to survival and entered into the nomogram construction. The calibration curves showed satisfactory agreement between nomogram-predicted survival and actual observed survival for 1-, 3-, and 5-year overall survival. Based on the AIC and C-index values, our nomogram showed better discriminative and risk-stratifying ability than the current TNM staging system. Significant distinctions in survival curves were observed between different risk subgroups stratified by nomogram scores. Conclusion: The established and validated nomogram presented better risk-stratifying ability than the current clinical staging system, and could provide a convenient and reliable tool for individual survival prediction and treatment strategy making.

  15. Patient characteristics at the time of NSCLC diagnosis.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Ling Lin; Daquan Wang; Haizhu Chen (2023). Patient characteristics at the time of NSCLC diagnosis. [Dataset]. http://doi.org/10.1371/journal.pone.0285766.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ling Lin; Daquan Wang; Haizhu Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Patient characteristics at the time of NSCLC diagnosis.

  16. Table_1_Incidence, Clinical Characteristics, and Survival of Collecting Duct...

    • frontiersin.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Chaopeng Tang; Yulin Zhou; Silun Ge; Xiaoming Yi; Huichen Lv; Wenquan Zhou (2023). Table_1_Incidence, Clinical Characteristics, and Survival of Collecting Duct Carcinoma of the Kidney: A Population-Based Study.docx [Dataset]. http://doi.org/10.3389/fonc.2021.727222.s002
    Explore at:
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Chaopeng Tang; Yulin Zhou; Silun Ge; Xiaoming Yi; Huichen Lv; Wenquan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objective: To investigate the exact age‐adjusted incidence (AAI), clinical characteristics, and survival data of collecting duct carcinoma of the kidney (CDCK) recorded in the Surveillance, Epidemiology, and End Results (SEER) database of the National Cancer Institute. Methods: Patients with CDCK confirmed by microscopic examination from 2004 to 2018 were selected from the SEER database. AAI rates were calculated using SEER*Stat software (version 8.3.9). The Kaplan‐Meier method was used to evaluate cancer-specific survival (CSS) rates according to tumor size, tumor stage, and treatment methods, and differences among these variables were assessed by the log‐rank test. Cox regression analysis was employed to identify variables independently related to CSS. Results: A total of 286 patients with CDCK were identified from the database. The majority of the patients were white (69.2%), male (67.5%), and married (60.5%), and the median age was 59 years. Most patients with CDCK (74.4%) presented with stage III or IV disease. The diameter of most (59.4%) tumors was less than 7 cm, and the tumors were more commonly found on the left than on the right (55.2% vs. 44.8%). The incidence of CDCK decreased over time. The median CSS time was 17 months. In terms of the treatment modalities used, 83.9% of the patients underwent surgery, 32.9% underwent chemotherapy, and 13.6% underwent radiotherapy. The CSS rates at 1, 2, and 5 years were 57.3%, 43.2%, and 30.7%, respectively. In patients with stage IV CDCK treated with surgery alone, chemotherapy alone, and surgery plus chemotherapy, the median survival time was 5 months, 9 months, and 14 months, respectively (P = 0.024). Multivariate Cox regression analysis showed surgery, chemotherapy, stage, regional lymph node metastasis, and distant metastasis were independent prognostic factors for patients with CDCK. Conclusions: CDCK is an uncommon malignant renal carcinoma, and its incidence is decreasing based on the analysis of current data. CDCK is a high-stage, regional lymph node-positive, and metastatic disease. Compared with surgery alone or chemotherapy alone, patients with stage IV disease could gain a survival benefit from surgery combined with chemotherapy.

  17. Table_2_Development and validation of nomograms for predicting overall...

    • frontiersin.figshare.com
    bin
    Updated Jun 16, 2023
    + more versions
    Cite
    Fangxu Yin; Song Wang; Chong Hou; Yiyuan Zhang; Zhenlin Yang; Xiaohong Wang (2023). Table_2_Development and validation of nomograms for predicting overall survival and cancer specific survival in locally advanced breast cancer patients: A SEER population-based study.XLSX [Dataset]. http://doi.org/10.3389/fpubh.2022.969030.s005
    Explore at:
    Available download formats: bin
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    Frontiers
    Authors
    Fangxu Yin; Song Wang; Chong Hou; Yiyuan Zhang; Zhenlin Yang; Xiaohong Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: For patients with locally advanced breast cancer (LABC), conventional TNM staging is not accurate in predicting survival outcomes. The aim of this study was to develop two accurate survival prediction models to guide clinical decision making. Methods: A retrospective analysis of 22,842 LABC patients was performed from 2010 to 2015 using the Surveillance, Epidemiology and End Results (SEER) database. An additional cohort of 200 patients from the Binzhou Medical University Hospital (BMUH) was analyzed. The least absolute shrinkage and selection operator (LASSO) regression was used to screen for variables. The identified variables were used to build a survival prediction model. The performance of the nomogram models was assessed based on the concordance index (C-index), calibration plot, receiver operating characteristic (ROC) curve, and decision curve analysis (DCA). Results: The LASSO analysis identified 9 variables in patients with LABC, including age, marital status, grade, histological type, T-stage, N-stage, surgery, radiotherapy, and chemotherapy. In the training cohort, the C-index of the nomogram in predicting overall survival (OS) was 0.767 [95% confidence interval (95% CI): 0.751–0.775], and that for cancer-specific survival (CSS) was 0.765 (95% CI: 0.756–0.774). In the external validation cohort, the C-index of the nomogram in predicting OS was 0.858 (95% CI: 0.812–0.904), and that for CSS was 0.866 (95% CI: 0.817–0.915). In the training cohort, the area under the receiver operating characteristic curve (AUC) values of the nomogram in prediction of the 1-, 3-, and 5-year OS were 0.836 (95% CI: 0.821–0.851), 0.769 (95% CI: 0.759–0.780), and 0.750 (95% CI: 0.738–0.762), respectively. The AUC values for prediction of the 1-, 3-, and 5-year CSS were 0.829 (95% CI: 0.811–0.847), 0.769 (95% CI: 0.757–0.780), and 0.745 (95% CI: 0.732–0.758), respectively. Results of the C-index, ROC curve, and DCA demonstrated that the nomogram was more accurate in predicting the OS and CSS of patients compared with conventional TNM staging. Conclusion: Two prediction models were developed and validated in this study, which provided more accurate prediction of the OS and CSS in LABC patients than TNM staging. The constructed models can be used for predicting survival outcomes and guiding treatment plans for LABC patients.

  18. DataSheet_1_Artificial Intelligence Combined With Big Data to Predict Lymph...

    • figshare.com
    xlsx
    Updated Jun 9, 2023
    + more versions
    Cite
    Liwei Wei; Yongdi Huang; Zheng Chen; Hongyu Lei; Xiaoping Qin; Lihong Cui; Yumin Zhuo (2023). DataSheet_1_Artificial Intelligence Combined With Big Data to Predict Lymph Node Involvement in Prostate Cancer: A Population-Based Study.xlsx [Dataset]. http://doi.org/10.3389/fonc.2021.763381.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    Frontiers
    Authors
    Liwei Wei; Yongdi Huang; Zheng Chen; Hongyu Lei; Xiaoping Qin; Lihong Cui; Yumin Zhuo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: A more accurate preoperative prediction of lymph node involvement (LNI) in prostate cancer (PCa) would improve clinical treatment and follow-up strategies of this disease. We developed a predictive model based on machine learning (ML) combined with big data to achieve this. Methods: Clinicopathological characteristics of 2,884 PCa patients who underwent extended pelvic lymph node dissection (ePLND) were collected from the U.S. National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) database from 2010 to 2015. Eight variables were included to establish an ML model. Model performance was evaluated by the receiver operating characteristic (ROC) curves and calibration plots for predictive accuracy. Decision curve analysis (DCA) and cutoff values were obtained to estimate its clinical utility. Results: Three hundred and forty-four (11.9%) patients were identified with LNI. The five most important factors were the Gleason score, T stage of disease, percentage of positive cores, tumor size, and prostate-specific antigen levels with 158, 137, 128, 113, and 88 points, respectively. The XGBoost (XGB) model showed the best predictive performance and had the highest net benefit when compared with the other algorithms, achieving an area under the curve of 0.883. With a 5%–20% cutoff value, the XGB model performed best in reducing omissions and avoiding overtreatment of patients when dealing with LNI. This model also had a lower false-negative rate and a higher percentage of ePLND was avoided. In addition, DCA showed it has the highest net benefit across the whole range of threshold probabilities. Conclusions: We established an ML model based on big data for predicting LNI in PCa, and it could lead to a reduction of approximately 50% of ePLND cases. In addition, only ≤3% of patients were misdiagnosed with a cutoff value ranging from 5% to 20%. This promising study warrants further validation by using a larger prospective dataset.

  19. Comparison of overall survival between HL-LC and LC-1.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Ling Lin; Daquan Wang; Haizhu Chen (2023). Comparison of overall survival between HL-LC and LC-1. [Dataset]. http://doi.org/10.1371/journal.pone.0285766.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ling Lin; Daquan Wang; Haizhu Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of overall survival between HL-LC and LC-1.

  20. Univariate and multivariate analysis of OS in patients with HL-NSCLC.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Ling Lin; Daquan Wang; Haizhu Chen (2023). Univariate and multivariate analysis of OS in patients with HL-NSCLC. [Dataset]. http://doi.org/10.1371/journal.pone.0285766.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ling Lin; Daquan Wang; Haizhu Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Univariate and multivariate analysis of OS in patients with HL-NSCLC.

Cite
Buwei Teng; Xiaofeng Zhang; Mingshu Ge; Miao Miao; Wei Li; Jun Ma (2024). Table 1_Personalized three-year survival prediction and prognosis forecast by interpretable machine learning for pancreatic cancer patients: a population-based study and an external validation.xlsx [Dataset]. http://doi.org/10.3389/fonc.2024.1488118.s001

Table 1_Personalized three-year survival prediction and prognosis forecast by interpretable machine learning for pancreatic cancer patients: a population-based study and an external validation.xlsx

Explore at:
Available download formats: xlsx
Dataset updated
Oct 21, 2024
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Buwei Teng; Xiaofeng Zhang; Mingshu Ge; Miao Miao; Wei Li; Jun Ma
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Purpose: The overall survival of patients with pancreatic cancer is extremely low. We aimed to establish a machine learning (ML)-based model to accurately predict three-year survival and prognosis of pancreatic cancer patients. Methods: We analyzed pancreatic cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database between 2000 and 2021. Univariate and multivariate logistic analyses were employed to select variables. The Recursive Feature Elimination (RFE) method based on 6 ML algorithms was utilized in feature selection. To construct the predictive model, 13 ML algorithms were evaluated by area under the curve (AUC), area under precision-recall curve (PRAUC), accuracy, sensitivity, specificity, precision, cross-entropy, Brier scores, Balanced Accuracy (bacc) and F Beta Score (fbeta). An optimal ML model was constructed to predict three-year survival, and the predictive results were explained by the SHapley Additive exPlanations (SHAP) framework. Meanwhile, 101 ML algorithm combinations were developed to select the best model with the highest C-index to predict prognosis of pancreatic cancer patients. Results: A total of 20,064 pancreatic cancer patients from the SEER database were consecutively enrolled. We utilized eight clinical variables to establish the prediction model for three-year survival. The CatBoost model was selected as the best prediction model, and AUC was 0.932 [0.924, 0.939], 0.899 [0.873, 0.934] and 0.826 [0.735, 0.919] in training, internal test and external test sets, with 0.839 [0.831, 0.847] accuracy, 0.872 [0.858, 0.887] sensitivity, 0.803 [0.784, 0.825] specificity and 0.832 [0.821, 0.853] precision. Surgery type had the greatest effect on three-year survival according to SHAP results. For prognosis prediction, the “RSF+GBM” algorithm was the best prognostic model, with a C-index of 0.774, 0.722 and 0.674 in training, internal test and external test sets. Conclusions: Our ML models demonstrate excellent accuracy and reliability, offering more precise personalized prognostic prediction to pancreatic cancer patients.
