Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cervical Cancer Risk Factors for Biopsy: this dataset was obtained from the UCI Repository, which is gratefully acknowledged. This file contains a list of risk factors for cervical cancer leading to a biopsy examination.
About 11,000 new cases of invasive cervical cancer are diagnosed each year in the U.S. However, the number of new cervical cancer cases has been declining steadily over the past decades. Although it is the most preventable type of cancer, each year cervical cancer kills about 4,000 women in the U.S. and about 300,000 women worldwide. In the United States, cervical cancer mortality rates plunged by 74% from 1955 to 1992 thanks to increased screening and early detection with the Pap test.
AGE
Fifty percent of cervical cancer diagnoses occur in women ages 35 to 54, and about 20% occur in women over 65 years of age. The median age of diagnosis is 48 years. About 15% of women develop cervical cancer between the ages of 20 and 30. Cervical cancer is extremely rare in women younger than age 20. However, many young women become infected with multiple types of human papilloma virus, which can then increase their risk of getting cervical cancer in the future. Young women with early abnormal changes who do not have regular examinations are at high risk for localized cancer by the time they are age 40, and for invasive cancer by age 50.
SOCIOECONOMIC AND ETHNIC FACTORS
Although the rate of cervical cancer has declined among both Caucasian and African-American women over the past decades, it remains much more prevalent in African-Americans, whose death rates are twice as high as those of Caucasian women. Hispanic American women have more than twice the risk of invasive cervical cancer as Caucasian women, also due to a lower rate of screening. These differences, however, are almost certainly due to social and economic differences. Numerous studies report that high poverty levels are linked with low screening rates. In addition, lack of health insurance, limited transportation, and language difficulties hinder a poor woman’s access to screening services.
HIGH SEXUAL ACTIVITY
Human papilloma virus (HPV) is the main risk factor for cervical cancer. In adults, the most important risk factor for HPV is sexual activity with an infected person. Women most at risk for cervical cancer are those with a history of multiple sexual partners, sexual intercourse at age 17 years or younger, or both. A woman who has never been sexually active has a very low risk for developing cervical cancer. Sexual activity with multiple partners increases the likelihood of many other sexually transmitted infections (chlamydia, gonorrhea, syphilis). Studies have found an association between chlamydia and cervical cancer risk, including the possibility that chlamydia may prolong HPV infection.
FAMILY HISTORY
Women have a higher risk of cervical cancer if they have a first-degree relative (mother, sister) who has had cervical cancer.
USE OF ORAL CONTRACEPTIVES
Studies have reported a strong association between cervical cancer and long-term use of oral contraception (OC). Women who take birth control pills for more than 5 to 10 years appear to have a much higher risk of HPV infection (up to four times higher) than those who do not use OCs. (Women taking OCs for fewer than 5 years do not have a significantly higher risk.) The reasons for this risk from OC use are not entirely clear. Women who use OCs may be less likely to use a diaphragm, condoms, or other methods that offer some protection against sexually transmitted diseases, including HPV. Some research also suggests that the hormones in OCs might help the virus enter the genetic material of cervical cells.
HAVING MANY CHILDREN
Studies indicate that having many children increases the risk for developing cervical cancer, particularly in women infected with HPV.
SMOKING
Smoking is associated with a higher risk for precancerous changes (dysplasia) in the cervix and for progression to invasive cervical cancer, especially for women infected with HPV.
IMMUNOSUPPRESSION
Women with weak immune systems (such as those with HIV/AIDS) are more susceptible to acquiring HPV. Immunocompromised patients are also at higher risk for having cervical precancer develop rapidly into invasive cancer.
DIETHYLSTILBESTROL (DES)
From 1938 to 1971, diethylstilbestrol (DES), an estrogen-related drug, was widely prescribed to pregnant women to help prevent miscarriages. The daughters of these women face a higher risk for cervical cancer. DES is no longer prescribed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains medical and lifestyle information for 1500 patients, designed to predict the presence of cancer based on various features. The dataset is structured to provide a realistic challenge for predictive modeling in the medical domain.
Age: Integer values representing the patient's age, ranging from 20 to 80.
Gender: Binary values representing gender, where 0 indicates Male and 1 indicates Female.
BMI: Continuous values representing Body Mass Index, ranging from 15 to 40.
Smoking: Binary values indicating smoking status, where 0 means No and 1 means Yes.
GeneticRisk: Categorical values representing genetic risk levels for cancer, with 0 indicating Low, 1 indicating Medium, and 2 indicating High.
PhysicalActivity: Continuous values representing the number of hours per week spent on physical activities, ranging from 0 to 10.
AlcoholIntake: Continuous values representing the number of alcohol units consumed per week, ranging from 0 to 5.
CancerHistory: Binary values indicating whether the patient has a personal history of cancer, where 0 means No and 1 means Yes.
Diagnosis: Binary values indicating the cancer diagnosis status, where 0 indicates No Cancer and 1 indicates Cancer.
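The column ranges listed above can be turned into a quick sanity check before modeling. The following is a minimal sketch, not part of the dataset itself: the bounds are copied from the description, while the `RULES` table, the `INTEGER_COLUMNS` set, and the `validate_record` helper are hypothetical names introduced here for illustration.

```python
# Bounds transcribed from the column descriptions above.
# RULES, INTEGER_COLUMNS and validate_record are illustrative names, not part of the dataset.
RULES = {
    "Age": (20, 80),
    "Gender": (0, 1),
    "BMI": (15, 40),
    "Smoking": (0, 1),
    "GeneticRisk": (0, 2),
    "PhysicalActivity": (0, 10),
    "AlcoholIntake": (0, 5),
    "CancerHistory": (0, 1),
    "Diagnosis": (0, 1),
}

# Columns described as integer, binary, or categorical codes.
INTEGER_COLUMNS = {"Age", "Gender", "Smoking", "GeneticRisk", "CancerHistory", "Diagnosis"}


def validate_record(record):
    """Return a list of problems; an empty list means the row matches the description."""
    problems = []
    for column, (low, high) in RULES.items():
        if column not in record:
            problems.append(f"{column}: missing")
            continue
        value = record[column]
        if column in INTEGER_COLUMNS and value != int(value):
            problems.append(f"{column}: expected a whole number, got {value}")
        elif not low <= value <= high:
            problems.append(f"{column}: {value} outside [{low}, {high}]")
    return problems


# Invented example row, used only to show the call.
example = {
    "Age": 45, "Gender": 1, "BMI": 27.3, "Smoking": 0, "GeneticRisk": 2,
    "PhysicalActivity": 3.5, "AlcoholIntake": 1.2, "CancerHistory": 0, "Diagnosis": 1,
}
print(validate_record(example))
```

Running the check over every row before training is a cheap way to confirm that a downloaded copy matches the documented ranges.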
This dataset is intended for training and testing machine learning models for cancer prediction. It can be used for:
This dataset has been preprocessed and cleaned to ensure that users can focus on the most critical aspects of their analysis. The preprocessing steps were designed to eliminate noise and irrelevant information, allowing you to concentrate on developing and fine-tuning your predictive models.
This dataset, shared by Rabie El Kharoua, is original and has never been shared before. It is made available under the CC BY 4.0 license, allowing anyone to use the dataset in any form as long as proper citation is given to the author. A DOI is provided for proper referencing. Please note that duplication of this work within Kaggle is not permitted.
This dataset is synthetic and was generated for educational purposes, making it ideal for data science and machine learning projects. It is an original dataset, owned by Mr. Rabie El Kharoua, and has not been previously shared. You are free to use it under the license outlined on the data card. The dataset is offered without any guarantees. Details about the data provider will be shared soon.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Adaptation of http://archive.ics.uci.edu/ml/datasets/Cervical+cancer+(Risk+Factors)
Ready for use with ehrapy.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Synthetic Oral Cancer Prediction Dataset is designed for educational and research purposes to analyse factors associated with oral cancer risk, progression, and treatment outcomes. The dataset includes anonymised, synthetic data on various clinical, lifestyle, and demographic factors for individuals diagnosed with oral cancer.
This dataset can be used for the following applications:
This synthetic dataset is fully anonymized and complies with data privacy standards. It includes a wide array of factors that support diverse research and analysis in the oncology and public health domains.
CC0 (Public Domain)
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
📄 Dataset Description: This dataset contains global cancer patient data reported from 2015 to 2024, designed to simulate the key factors influencing cancer diagnosis, treatment, and survival. It includes a variety of features that are commonly studied in the medical field, such as age, gender, cancer type, environmental factors, and lifestyle behaviors. The dataset is perfect for:
Exploratory Data Analysis (EDA)
Multiple Linear Regression and other modeling tasks
Feature Selection and Correlation Analysis
Predictive Modeling for cancer severity, treatment cost, and survival prediction
Data Visualization and creating insightful graphs
Key Features:
Age: Patient's age (20-90 years)
Gender: Male, Female, or Other
Country/Region: Country or region of the patient
Cancer Type: Various types of cancer (e.g., Breast, Lung, Colon)
Cancer Stage: Stage 0 to Stage IV
Risk Factors: Includes genetic risk, air pollution, alcohol use, smoking, obesity, etc.
Treatment Cost: Estimated cost of cancer treatment (in USD)
Survival Years: Years survived since diagnosis
Severity Score: A composite score representing cancer severity
This dataset provides a broad view of global cancer trends, making it an ideal resource for those learning data science, machine learning, and statistical analysis in healthcare.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Purpose: The dataset is designed to explore the potential relationship between lifestyle habits and the probability of developing cancer.
Variables:
Sr No.: A unique identifier for each observation.
Smoking Habit: Categorizes individuals based on their smoking frequency (e.g., Heavy, Moderate, Occasional, None).
Drinking Habit: Categorizes individuals based on their alcohol consumption frequency (e.g., Frequent, Occasional, None).
Biking Habit: Measures the frequency of biking activity (e.g., High, Medium, Low).
Walking Habit: Measures the frequency of walking activity (e.g., High, Medium, Low).
Jogging Habit: Measures the frequency of jogging activity (e.g., High, Medium, Low).
Probability of Cancer: A numerical value representing the estimated likelihood of developing cancer, ranging from 0 to 1.
Assumptions: The dataset assumes a causal relationship between lifestyle habits and cancer risk. However, correlation does not necessarily imply causation, and other factors may influence cancer development. The probability of cancer is a simplified representation and may vary based on individual factors, genetics, and environmental influences.
Potential Use Cases:
Exploratory Analysis: To identify potential correlations between lifestyle habits and cancer risk.
Predictive Modeling: To build models that predict the probability of cancer based on lifestyle factors.
Public Health Initiatives: To inform public health campaigns and interventions aimed at promoting healthy lifestyles and reducing cancer risk.
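Because the habit columns are ordered categories, a common first step for the predictive modeling use case is to map each level to an integer rank. The sketch below assumes the example levels quoted in the variable descriptions are the complete set and that they order from least to most frequent as shown; both points are assumptions, and `encode_row` is a hypothetical helper.

```python
# Assumed orderings, lowest to highest; the source only lists these levels as examples.
SMOKING_LEVELS = ["None", "Occasional", "Moderate", "Heavy"]
DRINKING_LEVELS = ["None", "Occasional", "Frequent"]
ACTIVITY_LEVELS = ["Low", "Medium", "High"]

LEVELS_BY_COLUMN = {
    "Smoking Habit": SMOKING_LEVELS,
    "Drinking Habit": DRINKING_LEVELS,
    "Biking Habit": ACTIVITY_LEVELS,
    "Walking Habit": ACTIVITY_LEVELS,
    "Jogging Habit": ACTIVITY_LEVELS,
}


def encode_row(row):
    """Replace each habit label with its rank; other columns pass through unchanged."""
    encoded = dict(row)
    for column, levels in LEVELS_BY_COLUMN.items():
        # list.index raises ValueError on an unexpected label, which surfaces bad data early.
        encoded[column] = levels.index(row[column])
    return encoded


# Invented example row, used only to show the call.
example = {
    "Sr No.": 1,
    "Smoking Habit": "Heavy",
    "Drinking Habit": "Occasional",
    "Biking Habit": "Low",
    "Walking Habit": "High",
    "Jogging Habit": "Medium",
    "Probability of Cancer": 0.8,
}
print(encode_row(example))
```

An integer rank keeps the ordering information that one-hot encoding would discard, at the cost of implying equal spacing between levels.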
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset was created by S.M. Toufek Hasan
Released under CC BY-SA 4.0
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cervical cancer is one of the leading causes of cancer-related deaths among women worldwide. Early detection and accurate prediction of cervical cancer can significantly improve the chances of successful treatment and save lives. This dataset helps in developing a predictive model, using machine learning techniques, that identifies individuals at high risk of cervical cancer and so allows timely intervention and medical care.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Cervical Cancer vs Demographic, Habits, MedHistory’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/cervical-cancer-risk-factorse on 13 February 2022.
--- Dataset description provided by original source is as follows ---
The dataset was collected at 'Hospital Universitario de Caracas' in Caracas, Venezuela. The dataset comprises demographic information, habits, and historic medical records of 858 patients. Several patients decided not to answer some of the questions because of privacy concerns (missing values).
(int) Age
(int) Number of sexual partners
(int) First sexual intercourse (age)
(int) Num of pregnancies
(bool) Smokes
(bool) Smokes (years)
(bool) Smokes (packs/year)
(bool) Hormonal Contraceptives
(int) Hormonal Contraceptives (years)
(bool) IUD
(int) IUD (years)
(bool) STDs
(int) STDs (number)
(bool) STDs:condylomatosis
(bool) STDs:cervical condylomatosis
(bool) STDs:vaginal condylomatosis
(bool) STDs:vulvo-perineal condylomatosis
(bool) STDs:syphilis
(bool) STDs:pelvic inflammatory disease
(bool) STDs:genital herpes
(bool) STDs:molluscum contagiosum
(bool) STDs:AIDS
(bool) STDs:HIV
(bool) STDs:Hepatitis B
(bool) STDs:HPV
(int) STDs: Number of diagnosis
(int) STDs: Time since first diagnosis
(int) STDs: Time since last diagnosis
(bool) Dx:Cancer
(bool) Dx:CIN
(bool) Dx:HPV
(bool) Dx
(bool) Hinselmann: target variable
(bool) Schiller: target variable
(bool) Cytology: target variable
(bool) Biopsy: target variable
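Since several patients declined to answer some questions, any loader for these columns has to treat missing cells explicitly. The sketch below assumes missing values are written as "?" in the CSV file, which should be checked against the actual download; the sample rows are invented purely to exercise the code, and `load_rows` and `missing_fraction` are hypothetical helpers.

```python
import csv
import io

MISSING_MARKER = "?"  # assumption: verify against the file you actually download


def load_rows(text):
    """Parse CSV text into dicts, mapping the missing marker to None and numbers to float."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            name: None if cell.strip() == MISSING_MARKER else float(cell)
            for name, cell in raw.items()
        })
    return rows


def missing_fraction(rows, column):
    """Share of rows in which the given column was left unanswered."""
    return sum(1 for row in rows if row[column] is None) / len(rows)


# Invented toy rows using a few of the column names above; not real patient data.
SAMPLE = """Age,Number of sexual partners,Smokes,Biopsy
18,4,0,0
15,?,0,0
34,1,?,0
52,5,1,1
"""

rows = load_rows(SAMPLE)
for column in ("Number of sexual partners", "Smokes", "Biopsy"):
    print(column, missing_fraction(rows, column))
```

Reporting the missing fraction per column before imputing or dropping rows makes it clear which features the privacy-related gaps affect most.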
Kelwin Fernandes, Jaime S. Cardoso, and Jessica Fernandes. 'Transfer Learning with Partial Observability Applied to Cervical Cancer Screening.' Iberian Conference on Pattern Recognition and Image Analysis. Springer International Publishing, 2017.
Source: http://archive.ics.uci.edu/ml/datasets/Cervical+cancer+(Risk+Factors)
This dataset was created by UCI and contains around 900 samples along with STDs:vulvo-perineal condylomatosis, STDs:pelvic inflammatory disease, technical information, and other features such as STDs (number), Smokes, and more.
- Analyze Cytology in relation to Biopsy
- Study the influence of STDs: Time since first diagnosis on Age
If you use this dataset in your research, please credit UCI
--- Original source retains full ownership of the source dataset ---
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for Lung Cancer
Dataset Summary
An effective cancer prediction system helps people learn their cancer risk at low cost and make appropriate decisions based on their risk status. The data was collected from the website of an online lung cancer prediction system.
Supported Tasks and Leaderboards
[More Information Needed]
Languages
[More Information Needed]
Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/virtual10/lungs_cancer.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This synthetic Lung Cancer Risk Prediction Dataset is designed for educational and research purposes in the fields of data science, public health, and cancer research. It contains essential health and lifestyle indicators such as smoking habits, chronic diseases, and respiratory symptoms, which can be used to analyze and predict the risk of lung cancer. The dataset is ideal for building predictive models, conducting risk assessments, and exploring the relationships between lifestyle factors and lung health.
This dataset is ideal for various lung cancer-related applications:
CC0 (Public Domain)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Lung Cancer Dataset includes a diverse array of symptoms essential for comprehensive analysis and model development. The primary categories of data are as follows:
Patient Demographics
Age: Provides the age at diagnosis, enabling analysis of age-related incidence and outcomes.
Gender: Includes information on patient gender, facilitating gender-based studies.
Smoking Status: Categorized as current smoker, former smoker, or non-smoker, this data is critical for evaluating the impact of smoking on lung cancer risk and progression.
Medical History
Comorbidities: Details additional health issues such as chronic obstructive pulmonary disease (COPD), which are relevant for treatment planning and prognosis.
Clinical Data
Vital Signs: Records of blood pressure, heart rate, respiratory rate, and other vital signs at diagnosis and during treatment.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains information on lung cancer risk factors across various countries, focusing on demographic details, smoking behaviors, and family history. Researchers and public health professionals can use this data to study patterns of lung cancer incidence, identify trends related to smoking and passive smoking exposure, and assess the impact of family history on lung cancer risk.
Risk Factor Analysis: Analyze how smoking habits, exposure to secondhand smoke, and family history correlate with lung cancer risk.
Comparative Study: Compare lung cancer risk factors across different countries and regions.
Demographic Insights: Explore how age and gender impact the prevalence of lung cancer risk factors.
Statistical Modeling: Build models to predict lung cancer risk based on various factors such as smoking history, exposure to passive smoke, and genetic predisposition.
Public Health Research: Identify populations with high-risk behaviors and suggest interventions or preventive measures.
An examination of national cancer risk based on monitored hazardous ambient air pollutants. This dataset is associated with the following publication: Weitekamp, C., M. Lein, M. Strum, M. Morris, T. Palma, D. Smith, L. Kerr, and M. Stewart. An Examination of National Cancer Risk Based on Monitored Hazardous Air Pollutants. ENVIRONMENTAL HEALTH PERSPECTIVES. National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, NC, USA, 129(3): 1-12, (2021).
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides a detailed and structured overview of oral cancer cases worldwide. It includes key risk factors, symptoms, cancer staging, survival rates, treatment approaches, and economic burden to facilitate research and prediction modeling. The dataset is based on real-world oral cancer statistics, aligning with global health reports and studies.
Key Highlights:
Covers high-incidence regions (India, Pakistan, Sri Lanka, Taiwan) and emerging trends in Western nations.
Includes tobacco, alcohol, HPV infection, betel quid use, and dietary factors as primary risk factors.
Captures economic burden (treatment costs, workdays lost) to assess the financial impact of oral cancer.
Provides cancer staging, survival rates, and early diagnosis indicators for better treatment predictions.
This dataset is valuable for medical professionals, researchers, data scientists, and policymakers aiming to develop early detection models, assess regional disparities, and improve cancer prevention strategies.
Columns Overview
ID – Unique identifier
Country – Country name
Age – Age of the individual
Gender – Male/Female
Tobacco Use – Yes/No
Alcohol Consumption – Yes/No
HPV Infection – Yes/No
Betel Quid Use – Yes/No
Chronic Sun Exposure – Yes/No
Poor Oral Hygiene – Yes/No
Diet (Fruits & Vegetables Intake) – Low/Moderate/High
Family History of Cancer – Yes/No
Compromised Immune System – Yes/No
Oral Lesions – Yes/No
Unexplained Bleeding – Yes/No
Difficulty Swallowing – Yes/No
White or Red Patches in Mouth – Yes/No
Tumor Size (cm) – Numerical value
Cancer Stage – 0 (No Cancer), 1, 2, 3, 4
Treatment Type – Surgery/Radiation/Chemotherapy/Targeted Therapy/No Treatment
Survival Rate (5-Year, %)
Cost of Treatment (USD)
Economic Burden (Lost Workdays per Year)
Early Diagnosis – Yes/No
Oral Cancer (Diagnosis) – Yes/No (Target Variable)
These synthetic patient datasets were created for machine learning (ML) studies of lung cancer risk prediction in a simulation of ML-enabled learning health systems (LHS). Five populations of 30K patients were generated by the Synthea patient generator. They were combined sequentially to form 5 populations of different sizes, from 30K to 150K patients. Patients with and without lung cancer were selected at a ratio of roughly 1:3, and their electronic health records (EHR) were processed into data table files ready for machine learning. The ML-ready table files also have the continuous numeric values converted to categorical values. Because Synthea patients closely resemble real patients, these ML-ready datasets can be used to develop and test ML algorithms and to train researchers. Unlike real patient data, these Synthea datasets can be shared with collaborators anywhere without privacy concerns. The first use of these datasets was in an LHS simulation study, which was published in Nature Scientific Reports (see https://www.nature.com/articles/s41598-022-23011-4).
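Two of the preparation steps described here, selecting patients at roughly 1:3 and converting continuous values to categories, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' pipeline: the `lung_cancer` field name, the bin edges, and the labels are all hypothetical, and the controls are drawn by simple random sampling.

```python
import random
from bisect import bisect_right


def select_cohort(patients, controls_per_case=3, seed=0):
    """Keep every case and a random sample of controls at the requested ratio."""
    cases = [p for p in patients if p["lung_cancer"]]
    controls = [p for p in patients if not p["lung_cancer"]]
    wanted = min(len(controls), controls_per_case * len(cases))
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return cases + rng.sample(controls, wanted)


def to_category(value, edges, labels):
    """Map a number to a label; edges are ascending and len(labels) == len(edges) + 1."""
    return labels[bisect_right(edges, value)]


# Invented population: 10 cases and 100 controls with made-up ages.
population = [{"id": i, "lung_cancer": i < 10, "age": 30 + i % 50} for i in range(110)]
cohort = select_cohort(population)
print(len(cohort), sum(p["lung_cancer"] for p in cohort))

# Hypothetical age bins, chosen only for the example.
AGE_EDGES = [40, 65]
AGE_LABELS = ["under_40", "40_to_64", "65_plus"]
print(to_category(52, AGE_EDGES, AGE_LABELS))
```

A value equal to an edge falls into the upper bin, so with the edges above a 40-year-old is labeled `40_to_64`.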
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This paper demonstrates the flexibility of a general approach for the analysis of discrete time competing risks data that can accommodate complex data structures, different time scales for different causes, and nonstandard sampling schemes. The data may involve a single data source where all individuals contribute to analyses of both cause-specific hazard functions, overlapping datasets where some individuals contribute to the analysis of the cause-specific hazard function of only one cause while other individuals contribute to analyses of both cause-specific hazard functions, or separate data sources where each individual contributes to the analysis of the cause-specific hazard function of only a single cause. The approach is modularized into estimation and prediction. For the estimation step, the parameters and the variance-covariance matrix can be estimated using widely available software. The prediction step utilizes a generic program with plug-in estimates from the estimation step. The approach is illustrated with three prognostic models for stage IV male oral cancer using different data structures. The first model uses only men with stage IV oral cancer from population-based registry data. The second model strategically extends the cohort to improve the efficiency of the estimates. The third model improves the accuracy for those with a lower risk of other causes of death, by bringing in an independent data source collected under a complex sampling design with additional other-cause covariates. These analyses represent novel extensions of existing methodology, broadly applicable for the development of prognostic models capturing both the cancer and non-cancer aspects of a patient's health.
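The prediction step can be illustrated with the textbook discrete-time relationship between cause-specific hazards and cumulative incidence: the probability of failing from cause k in interval t is the hazard for k in that interval times the probability of having survived all causes through the previous intervals. The sketch below is not the paper's generic program and does not handle different time scales per cause; it assumes all hazards are given on one shared grid of intervals, and the hazard values are invented.

```python
def cumulative_incidence(hazards):
    """Turn per-interval cause-specific hazards into cumulative incidence curves.

    hazards: list with one dict per time interval, mapping cause -> probability of
    failing from that cause in the interval, given survival up to its start.
    Returns a list of dicts (one per interval) with cumulative incidence per cause.
    """
    surviving = 1.0  # probability of no event from any cause so far
    totals = {}
    curve = []
    for interval in hazards:
        for cause, hazard in interval.items():
            totals[cause] = totals.get(cause, 0.0) + surviving * hazard
        surviving *= 1.0 - sum(interval.values())
        curve.append(dict(totals))
    return curve


# Invented plug-in estimates for two intervals and two causes.
estimates = [
    {"cancer": 0.2, "other": 0.1},
    {"cancer": 0.2, "other": 0.1},
]
for step, incidence in enumerate(cumulative_incidence(estimates), start=1):
    print(step, incidence)
```

With these inputs the cumulative incidence after two intervals is 0.34 for cancer and 0.17 for other causes, leaving 0.49 still event-free, so the three quantities sum to one.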
The table National Cancer Risk Summaries by Pollutant is part of the dataset 2014 National Air Toxics Assessment (NATA), available at https://redivis.com/datasets/akx6-00kb9c93h. It contains 76727 rows across 78 variables.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Purpose: Screening the general population for ovarian cancer is not recommended by any major medical or public health organization because the harms from screening outweigh the benefit it provides. To improve ovarian cancer detection and survival, many are looking at high-risk populations who would benefit from screening.
Methods: We train a neural network on readily available personal health data to predict and stratify ovarian cancer risk. We use two different datasets to train our network: the National Health Interview Survey and the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.
Results: Our model has an area under the receiver operating characteristic curve of 0.71. We further demonstrate how the model could be used to stratify patients into different risk categories. A simple 3-tier scheme classifies 23.8% of those with cancer and 1.0% of those without as high risk, similar to genetic testing, and 1.1% of those with cancer and 24.4% of those without as low risk.
Conclusion: The developed neural network offers a cost-effective and non-invasive way to identify those who could benefit from targeted screening.
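The two quantities reported in the results, the area under the ROC curve and the share of each group falling into each risk tier, can both be computed directly from model scores. The sketch below is illustrative only: the scores, labels, and cutoffs are invented, the function names are hypothetical, and the study's actual thresholds are not given in this abstract.

```python
def roc_auc(scores, labels):
    """Probability that a random positive scores above a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def assign_tier(score, low_cutoff, high_cutoff):
    """Three-tier rule: high at or above high_cutoff, low below low_cutoff, else medium."""
    if score >= high_cutoff:
        return "high"
    if score < low_cutoff:
        return "low"
    return "medium"


def tier_shares(scores, labels, low_cutoff, high_cutoff):
    """Fraction of each outcome group in each tier; assumes both outcomes are present."""
    shares = {}
    for outcome in (0, 1):
        group = [s for s, y in zip(scores, labels) if y == outcome]
        counts = {"low": 0, "medium": 0, "high": 0}
        for score in group:
            counts[assign_tier(score, low_cutoff, high_cutoff)] += 1
        shares[outcome] = {tier: count / len(group) for tier, count in counts.items()}
    return shares


# Invented scores and labels (1 = cancer), with hypothetical cutoffs.
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))
print(tier_shares(scores, labels, low_cutoff=0.25, high_cutoff=0.75))
```

The pairwise AUC loop is quadratic in the number of patients, which is fine for a small illustration; a rank-based formula would be the usual choice for survey-sized data.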
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset focuses on thyroid cancer recurrence after Radioactive Iodine (RAI) therapy. It contains 383 patient records with 13 key attributes, including age, gender, cancer staging, pathology type, risk classification, treatment response, and recurrence status. The data is valuable for predicting cancer recurrence, understanding risk factors, and evaluating treatment outcomes.
📌 Total Rows: 383
📌 Total Columns: 13
📌 No Missing Values
1️⃣ Are thyroid cancer recurrences more common in men or women?
2️⃣ How does age affect recurrence risk?
3️⃣ Can we predict recurrence based on tumor staging and pathology?
4️⃣ What is the relationship between treatment response and recurrence?
This dataset is ideal for:
✅ Machine Learning Models for recurrence prediction
✅ Statistical Analysis of cancer progression
✅ Medical Research on thyroid cancer
This dataset is a modified version of the original dataset:
Differentiated Thyroid Cancer Recurrence by Joe Beach Capital.
https://www.kaggle.com/datasets/joebeachcapital/differentiated-thyroid-cancer-recurrence
I have removed unnecessary columns to focus on thyroid cancer recurrence analysis.
🔹 Attribution 4.0 International (CC BY 4.0) – You are free to use, share, and modify with proper credit.