The Cancer Mapping data consists of counts of newly diagnosed cancer among New York State residents and is in response to legislation regarding "Cancer incidence and environmental facility maps" signed into law in 2010 (Public Health Law §2401-B). The law specifies the publication of maps showing cancer counts for small geographic areas along with certain facilities regulated by the State Department of Environmental Conservation. The official web site is called Environmental Facilities and Cancer Mapping.
The dataset is ONLY for the cancer-related data fields on the Environmental Facilities and Cancer Mapping web site. This dataset includes observed counts for 23 separate anatomical sites at the level of census block group. Block groups are small geographic areas typically averaging 1,000 to 1,500 people. To protect confidentiality, each area contains a minimum of 6 total cancers among males and 6 total cancers among females.
For more information, check out http://www.health.ny.gov/statistics/cancer/registry/about.htm .
Source: New York State Cancer Registry, 2016-2020 https://www.health.ny.gov/statistics/cancer/registry/ratebyCounty.htm
New York City zip codes by borough and neighborhood, according to the Department of Health.
Pulled from https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm. This seems to be an orphaned page, so some context may be missing.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
The ZIP Code lists show the number of people who developed the specific type of cancer while living in the ZIP Code area between 2005 and 2009. The lists also show the number of people who might have been expected to get cancer in that time period, based on the size of the population of the ZIP Code. For more info, see http://www.health.ny.gov/statistics/cancer/registry/zipcode/faq.htm
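One common way to read the two columns together is the ratio of observed to expected cases. The sketch below is a minimal illustration of that comparison, not the registry's own method; the `ZipRow` type, the placeholder ZIP Codes, and the counts are all hypothetical and are not taken from the published lists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ZipRow:
    """One row of a ZIP Code list (field names are assumptions)."""
    zip_code: str
    observed: int      # people who developed the cancer while living in the area
    expected: float    # cases expected given the size of the population


def observed_to_expected(row: ZipRow) -> Optional[float]:
    """Return observed / expected, or None when expected is not positive."""
    if row.expected <= 0:
        return None
    return row.observed / row.expected


# Made-up rows for illustration only.
rows = [ZipRow("00001", 30, 24.0), ZipRow("00002", 12, 16.0)]
ratios = {r.zip_code: observed_to_expected(r) for r in rows}
```

A ratio above 1 means more cases were observed than expected, and a ratio below 1 means fewer. With small counts the ratio can swing widely by chance, so it is best read alongside the FAQ linked above.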
Colorectal cancer rates in Bronx ZIP Codes for the years 2005-2009. The ZIP Code lists show the number of people who developed the specific type of cancer while living in the ZIP Code area between 2005 and 2009. The lists also show the number of people who might have been expected to get cancer in that time period, based on the size of the population of the ZIP Code. See http://www.health.ny.gov/statistics/cancer/registry/zipcode/faq.htm for more info.
Financial overview and grant giving statistics of New York Cancer Registrars Association Inc.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
The ZIP Code lists show the number of people who developed the specific type of cancer while living in the ZIP Code area between 2005 and 2009. The lists also show the number of people who might have been expected to get cancer in that time period, based on the size of the population of the ZIP Code. See http://www.health.ny.gov/statistics/cancer/registry/zipcode/faq.htm for more info
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Association between alcohol consumption and overall, breast cancer-specific, and non-breast cancer mortality.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average alcohol intake by demographic characteristics and lifestyle variables in a cohort of breast cancer survivors from the New York site of the Breast Cancer Family Registry.
Kvasir: A Multi-Class Image Dataset for Computer Aided Gastrointestinal Disease Detection.
https://datasets.simula.no/kvasir/
Automatic detection of diseases by use of computers is an important, but still unexplored field of research. Such innovations may improve medical practice and refine health care systems all over the world. However, datasets containing medical images are hardly available, making reproducibility and comparison of approaches almost impossible. Here, we present Kvasir, a dataset containing images from inside the gastrointestinal (GI) tract. The collection of images is classified into three important anatomical landmarks and three clinically significant findings. In addition, it contains two categories of images related to endoscopic polyp removal. Sorting and annotation of the dataset is performed by medical doctors (experienced endoscopists). In this respect, Kvasir is important for research on both single- and multi-disease computer aided detection. By providing it, we invite and enable multimedia researchers into the medical domain of detection and retrieval.
Data Collection
The data is collected using endoscopic equipment at Vestre Viken Health Trust (VV) in Norway. The VV consists of 4 hospitals and provides health care to 470,000 people. One of these hospitals (the Bærum Hospital) has a large gastroenterology department from where training data have been collected and will be provided, making the dataset larger in the future. Furthermore, the images are carefully annotated by one or more medical experts from VV and the Cancer Registry of Norway (CRN). The CRN provides new knowledge about cancer through research on cancer. It is part of South-Eastern Norway Regional Health Authority and is organized as an independent institution under Oslo University Hospital Trust. CRN is responsible for the national cancer screening programmes with the goal of preventing cancer death by discovering cancers or pre-cancerous lesions as early as possible.
Dataset Details
The Kvasir dataset consists of images, annotated and verified by medical doctors (experienced endoscopists), including several classes showing anatomical landmarks, pathological findings or endoscopic procedures in the GI tract, i.e., hundreds of images for each class. The number of images is sufficient to be used for different tasks, e.g., image retrieval, machine learning, deep learning and transfer learning, etc. The anatomical landmarks include Z-line, pylorus, cecum, etc., while the pathological findings include esophagitis, polyps, ulcerative colitis, etc. In addition, we provide several sets of images related to removal of lesions, e.g., "dyed and lifted polyp", the "dyed resection margins", etc. The dataset consists of images with different resolutions, from 720x576 up to 1920x1072 pixels, organized so that they are sorted in separate folders named according to the content. Some of the included classes of images have a green picture-in-picture illustrating the position and configuration of the endoscope inside the bowel, by use of an electromagnetic imaging system (ScopeGuide, Olympus Europe) that may support the interpretation of the image. This type of information may be important for later investigations (thus included), but must be handled with care for the detection of the endoscopic findings.
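Because the images are sorted into one folder per class, a first sanity check is to count the images in each folder. The following is a minimal sketch under stated assumptions: the function name `count_images_per_class` is hypothetical, and the set of file extensions is a guess that should be adjusted to whatever the downloaded archive actually contains.

```python
from pathlib import Path

# Assumed image extensions; not confirmed by the dataset description.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Map each class folder name under `root` to its number of image files."""
    root = Path(root)
    counts = {}
    # Each immediate subfolder is treated as one class.
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_class` on the extracted dataset directory returns a dictionary keyed by folder name, which makes class imbalance easy to spot before training.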
Terms of use
The use of the Kvasir dataset is restricted to research and educational purposes only. The use of the Kvasir dataset for other purposes, including commercial purposes, is forbidden without prior written permission. In all documents and papers that use or refer to the Kvasir dataset or report experimental results based on the Kvasir dataset, a reference to the dataset paper has to be included.
Contact
Email michael/paalh (at) simula (dot) no if you have any questions about the dataset and our research activities. We always welcome collaboration and joint research!
Cite
@inproceedings{Pogorelov:2017:KMI:3083187.3083212,
title = {KVASIR: A Multi-Class Image Dataset for Computer Aided Gastrointestinal Disease Detection},
author = {Pogorelov, Konstantin and Randel, Kristin Ranheim and Griwodz, Carsten and Eskeland, Sigrun Losada and de Lange, Thomas and Johansen, Dag and Spampinato, Concetto and Dang-Nguyen, Duc-Tien and Lux, Mathias and Schmidt, Peter Thelin and Riegler, Michael and Halvorsen, P{\aa}l},
booktitle = {Proceedings of the 8th ACM on Multimedia Systems Conference},
series = {MMSys'17},
year = {2017},
isbn = {978-1-4503-5002-0},
location = {Taipei, Taiwan},
pages = {164--169},
numpages = {6},
doi = {10.1145/3083187.3083212},
acmid = {3083212},
publisher = {ACM},
address = {New York, NY, USA},
}
Population-based cancer incidence rates were abstracted from the National Cancer Institute State Cancer Profiles for all counties in the United States for which data were available. This is a national county-level database of cancer data that are collected by state public health surveillance systems. All-site cancer is defined as any type of cancer that is captured in the state registry data, though non-melanoma skin cancer is not included. All-site age-adjusted cancer incidence rates were abstracted separately for males and females. County-level annual age-adjusted all-site cancer incidence rates for years 2006–2010 were available for 2687 of 3142 (85.5%) counties in the U.S. Counties for which there are fewer than 16 reported cases in a specific area-sex-race category are suppressed to ensure confidentiality and stability of rate estimates; this accounted for 14 counties in our study. Two states, Kansas and Virginia, do not provide data because of state legislation and regulations which prohibit the release of county-level data to outside entities. Data from Michigan do not include cases diagnosed in other states because data exchange agreements prohibit the release of data to third parties. Finally, state data are not available for three states: Minnesota, Ohio, and Washington. The age-adjusted average annual incidence rate for all counties was 453.7 per 100,000 persons. We selected 2006–2010 as it is subsequent in time to the EQI exposure data, which were constructed to represent the years 2000–2005. We also gathered data for the three leading causes of cancer for males (lung, prostate, and colorectal) and females (lung, breast, and colorectal). The EQI was used as an exposure metric, an indicator of cumulative environmental exposures at the county level representing the period 2000 to 2005. A complete description of the datasets used in the EQI is provided in Lobdell et al., and the methods used for index construction are described by Messer et al.
The EQI was developed for the period 2000–2005 because it was the time period for which the most recent data were available when index construction was initiated. The EQI includes variables representing each of the environmental domains. The air domain includes 87 variables representing criteria and hazardous air pollutants. The water domain includes 80 variables representing overall water quality, general water contamination, recreational water quality, drinking water quality, atmospheric deposition, drought, and chemical contamination. The land domain includes 26 variables representing agriculture, pesticides, contaminants, facilities, and radon. The built domain includes 14 variables representing roads, highway/road safety, public transit behavior, business environment, and subsidized housing environment. The sociodemographic environment includes 12 variables representing socioeconomics and crime. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Human health data are not available publicly. EQI data are available at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: Data are stored as csv files. This dataset is associated with the following publication: Jagai, J., L. Messer, K. Rappazzo, C. Gray, S. Grabich, and D. Lobdell. County-level environmental quality and associations with cancer incidence. Cancer. John Wiley & Sons Incorporated, New York, NY, USA, 123(15): 2901-2908, (2017).
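The suppression rule and the coverage figure quoted in the description above can be illustrated in a few lines. This is only a sketch of the stated rule (fewer than 16 reported cases are suppressed), not the code used by State Cancer Profiles or the study authors; the function name and the dictionary layout are hypothetical.

```python
SUPPRESSION_THRESHOLD = 16  # counts below this are suppressed, per the description


def suppress_small_counts(case_counts, threshold=SUPPRESSION_THRESHOLD):
    """Replace any count below `threshold` with None (suppressed)."""
    return {
        key: (n if n >= threshold else None)
        for key, n in case_counts.items()
    }


# Share of U.S. counties with available rates, from the figures in the text.
coverage_percent = round(2687 / 3142 * 100, 1)
```

Here `coverage_percent` reproduces the 85.5% stated in the description, and a count of exactly 16 is kept because the rule applies only to counts strictly below 16.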
This Nerthus dataset brings you 21 videos from inside the gastrointestinal tract, showcasing bowel cleansing quality across 5,525 annotated frames. Verified by expert endoscopists, it’s a fantastic resource for multimedia researchers aiming to revolutionize medical automation in colonoscopy assessments! 🩺✨
Good bowel prep is key to successful colonoscopies, affecting disease detection and follow-up plans. The Nerthus dataset offers 21 videos rated by the Boston Bowel Preparation Scale (BBPS), aiming to reduce subjective grading with objective, automated tools. Let’s optimize healthcare and enhance endoscopy reporting together! 🔍
Colonoscopy is the gold standard for spotting colorectal cancer—third most common worldwide. Clear visuals depend on bowel cleanliness, and Nerthus helps build systems to assess it consistently, cutting variability among doctors and boosting procedure quality. 🩻
Videos focus on the left bowel section, with 1-10 videos and 500-2,700 frames per class. Annotated by pros, it’s ready for action! 🎬
Collected at Bærum Hospital, Vestre Viken Hospital Trust, Norway, using endoscopic gear. Annotations come from the Cancer Registry of Norway, with a multi-expert "gold standard" subset (Norway, Sweden, UK, US, Canada) coming soon as an update! 🌍
Fun Fact: Many frames feature a green overlay from ScopeGuide (Olympus Europe), showing endoscope position—useful, but tread carefully for quality scoring!
Videos are split into folders by BBPS score, packed with potential for research. Expect raw endoscopic footage ready to power your next big idea! 📂
Compare your work with these:
- TP, TN, FP, FN
- Recall (REC), Precision (PREC), Specificity (SPEC)
- Accuracy (ACC), MCC, F1 Score
- Frame-Rate (FPS) for real-time systems
Share your stats (total images, per-class counts, positives) for max impact! 📊
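The metrics listed above all follow from the four confusion counts. Below is a minimal sketch for a single class treated as positive versus the rest; the function name is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something the dataset authors specify.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute REC, PREC, SPEC, ACC, F1 and MCC from confusion counts."""

    def safe_div(num, den):
        # Convention (an assumption): undefined ratios are reported as 0.0.
        return num / den if den else 0.0

    recall = safe_div(tp, tp + fn)
    precision = safe_div(tp, tp + fp)
    specificity = safe_div(tn, tn + fp)
    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "REC": recall,
        "PREC": precision,
        "SPEC": specificity,
        "ACC": accuracy,
        "F1": f1,
        "MCC": mcc,
    }


# Made-up counts for illustration only.
example = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

For a multi-class task such as BBPS scoring, one option is to compute these per class (one class against all others) and report them alongside the per-class frame counts.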
Cite this:
@inproceedings{Pogorelov:2017:NBP:3083187.3083216,
title = {Nerthus: A Bowel Preparation Quality Video Dataset},
author = {Pogorelov, Konstantin and Randel, Kristin Ranheim and de Lange, Thomas and Eskeland, Sigrun Losada and Griwodz, Carsten and Johansen, Dag and Spampinato, Concetto and Taschwer, Mario and Lux, Mathias and Schmidt, Peter Thelin and Riegler, Michael and Halvorsen, P{\aa}l},
booktitle = {Proceedings of the 8th ACM on Multimedia Systems Conference},
series = {MMSys'17},
year = {2017},
location = {Taipei, Taiwan},
pages = {170--174},
doi = {10.1145/3083187.3083216},
publisher = {ACM},
address = {New York, NY, USA},
}
Colonoscopy · Bowel Prep · BBPS · Endoscopy · Medical Video · Multimedia Research
We hope Nerthus inspires game-changing tools for colonoscopy and beyond. Happy researching, and please upvote this dataset if it fuels your work—let’s keep pushing healthcare innovation forward! 🙌
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Helicobacter pylori (H. pylori) is closely related to the carcinogenesis of gastric cancer (GC) and gastric non-Hodgkin lymphoma (NHL). However, the systemic trend analysis in H. pylori-related malignancy is limited. We aimed to determine the national incidence trend in non-cardia GC, cardia GC, and gastric NHL in the US during 2000–2019. Method: In this population-based study, we included 186,769 patients with a newly diagnosed H. pylori-related malignancy, including non-cardia GC, cardia GC, and gastric NHL from the Surveillance, Epidemiology, and End Results (SEER) Registry from January 1, 2000 to December 31, 2019. We determined the age-adjusted incidence of three H. pylori-related malignancies respectively. Average annual percentage change (AAPC) in 2000–2019 was calculated to describe the incidence trends. Analyses were stratified by sex, age, race and ethnicity, geographic location and SEER registries. We also determined the 5-year incidence (during 2015–2019) by SEER registries to examine the geographic variance. Results: The incidence in non-cardia GC and gastric NHL significantly decreased during 2000–2019, while the rate plateaued for cardia GC (AAPCs, −1.0% [95% CI, −1.1%−0.9%], −2.6% [95% CI, −2.9%−2.3%], and −0.2% [95% CI, −0.7%−0.3%], respectively). For non-cardia GC, the incidence significantly increased among individuals aged 20–64 years (AAPC, 0.8% [95% CI, 0.6–1.0%]). A relatively slower decline in incidence was also observed for women (AAPC, −0.4% [95% CI, −0.6%−0.2%], P for interaction < 0.05). The incidence of cardia GC reduced dramatically among Hispanics (AAPC, −0.8% [95% CI, −1.4%−0.3%]); however, it increased significantly among nonmetropolitan residents (AAPC, 0.8% [95% CI, 0.4–1.3%]). For gastric NHL, the decrease in incidence was significantly slower for those aged 20–64 years (AAPC, −1.5% [95% CI, −1.9–1.1%]) and Black individuals (AAPC, −1.3% [95% CI, −1.9–1.1%]).
Additionally, the highest incidence was observed among Asian and Black individuals for non-cardia GC, while Whites had the highest incidence of cardia GC and Hispanics had the highest incidence of gastric NHL (incidence rate, 8.0, 8.0, 3.1, and 1.2, respectively) in 2019. Geographic variance in incidence rates and trends was observed for all three H. pylori-related malignancies. The geographic disparities were more pronounced for non-cardia GC, with the most rapid decline occurring in Hawaii (AAPC, −4.5% [95% CI, −5.5–3.6%]) and a constant trend in New York (AAPC 0.0% [95% CI, −0.4–0.4%]), the highest incidence in Alaska Natives, and the lowest incidence among Iowans (14.3 and 2.3, respectively). Conclusion: The incidence of H. pylori-related cancer declined dramatically in the US between 2000 and 2019, with the exception of cardia GC. For young people, a rising trend in non-cardia GC was noted. Racial/ethnic differences and geographic diversity persist. More cost-effective strategies of detection and management for H. pylori are still in demand.
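The abstract does not say how the AAPC was computed. As a rough illustration of the idea only, the sketch below fits a single log-linear trend, ln(rate) = a + b·year, by least squares and reports (e^b − 1) × 100; with one trend segment this annual percent change coincides with the AAPC. The function name and the synthetic rates are hypothetical and are not SEER data.

```python
import math


def annual_percent_change(years, rates):
    """Percent change per year from a least-squares fit of ln(rate) on year."""
    n = len(years)
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return (math.exp(slope) - 1) * 100


# Synthetic series: a rate that falls by exactly 1% every year, 2000-2019.
years = list(range(2000, 2020))
rates = [10.0 * 0.99 ** i for i in range(20)]
apc = annual_percent_change(years, rates)
```

When the trend changes direction partway through the period, a single fit hides that, which is why segmented approaches that average slopes across segments are often preferred for long series.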