Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset provides detailed information on website traffic, including page views, session duration, bounce rate, traffic source, time spent on page, previous visits, and conversion rate.
This dataset can be used for various analyses, such as exploring traffic patterns, comparing traffic sources, and modeling conversion rates.
This dataset was generated for educational purposes and is not from a real website. It serves as a tool for learning data analysis and machine learning techniques.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides detailed insights into website traffic metrics and user engagement statistics, collected from SimilarWeb. The data includes information on various websites, such as rank, category, average visit duration, pages per visit, and bounce rate. This data aims to facilitate an understanding of online behavior and performance trends across different sectors, making it a valuable resource for researchers, marketers, and data analysts. The dataset is ideal for exploring patterns in web traffic and user interaction and conducting comparative analyses across various website categories.
Important Warning: Running this code within Kaggle may result in a ban, as scraping activities are prohibited on the platform. There is no guarantee that any ban will be lifted, as Kaggle staff may interpret scraping as a denial-of-service attack. Although I have implemented measures to reduce server load, such as adding sleep intervals, it is advisable to run this code locally to ensure compliance with Kaggle's policies.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Website Traffic Analysis
Website traffic analysis is the process of monitoring and evaluating the visitors to a website. It provides insights into how users are interacting with the site, where they are coming from, which pages they visit most often, and how long they stay. By analyzing this data, businesses can understand user behavior, improve site performance, and optimize content to increase engagement and conversions.
Key metrics include the number of visitors, page views, bounce rate, traffic sources (organic, referral, direct), and geographic location. Website traffic analysis is essential for enhancing SEO, refining marketing strategies, and boosting overall user experience.
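As a toy illustration of the metrics above (the session records here are made up, not drawn from any real dataset), bounce rate can be computed as the share of single-page sessions:

```python
# Illustrative session log; a "bounce" is a session with exactly one page view.
sessions = [
    {"source": "organic",  "pages": 1},
    {"source": "referral", "pages": 4},
    {"source": "direct",   "pages": 1},
    {"source": "organic",  "pages": 3},
]

page_views = sum(s["pages"] for s in sessions)                        # total page views
bounce_rate = sum(s["pages"] == 1 for s in sessions) / len(sessions)  # share of bounces

by_source = {}
for s in sessions:  # visits broken down by traffic source
    by_source[s["source"]] = by_source.get(s["source"], 0) + 1
```

The same per-session aggregation extends naturally to session duration and time on page.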
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This Website Statistics dataset has four resources showing usage of the Lincolnshire Open Data website. Web analytics terms used in each resource are defined in their accompanying Metadata file.
Website Usage Statistics: This document shows a statistical summary of usage of the Lincolnshire Open Data site for the latest calendar year.
Website Statistics Summary: This dataset shows a website statistics summary for the Lincolnshire Open Data site for the latest calendar year.
Webpage Statistics: This dataset shows statistics for individual Webpages on the Lincolnshire Open Data site by calendar year.
Dataset Statistics: This dataset shows cumulative totals for Datasets on the Lincolnshire Open Data site that have also been published on the national Open Data site Data.Gov.UK - see the Source link.
Note: Website and Webpage statistics (the first three resources above) show only UK users, and exclude API calls (automated requests for datasets). The Dataset Statistics are confined to users with JavaScript enabled, which excludes web crawlers and API calls.
These Website Statistics resources are updated annually in January by the Lincolnshire County Council Business Intelligence team. For any enquiries about the information contact opendata@lincolnshire.gov.uk.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was obtained from website visit data; these are real data. It contains monthly visit information for the tr-metaverse.com website, hosted on Linux, covering 30 days of visits from the beginning to the end of the month. It consists of the following fields (values with a % sign next to them are percentages):
Day: day index, i.e. which day of the month
Hit: total number of hits
Hit%: hits as a percentage
Files: number of file requests
Files%: file requests as a percentage
Pages, Pages%: page counts and their percentage
Visit: number of unique visitors
Visit%: unique visitor rate
Sites, Sites%: site counts and their percentage
Kbytes: amount of data downloaded
Kbytes%: data downloaded as a percentage
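A minimal sketch of how the percentage columns relate to the raw counts, assuming each X% field is that day's X as a share of the monthly total (this interpretation is inferred from the field descriptions, and the sample numbers are illustrative):

```python
# Three illustrative days of hit counts from a one-month log.
daily_hits = [120, 80, 200]
total_hits = sum(daily_hits)

# The Hit% column: each day's hits as a percentage of the monthly total.
hit_pct = [round(100 * h / total_hits, 2) for h in daily_hits]
```

The Files%, Visit%, Sites%, and Kbytes% columns would follow the same pattern over their respective counts.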
Per the Federal Digital Government Strategy, the Department of Homeland Security Metrics Plan, and the Open FEMA Initiative, FEMA is providing the following web performance metrics with regard to FEMA.gov.

Information in this dataset includes total visits, average visit duration, pageviews, unique visitors, average pages/visit, average time/page, bounce rate, visits by source, visits by social media platform, and metrics on new vs. returning visitors.

External Affairs strives to make all communications accessible. If you have any challenges accessing this information, please contact FEMAWebTeam@fema.dhs.gov.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This anonymized data set consists of one month's (October 2018) web tracking data for 2,148 German users. For each user, the data contains the anonymized URL of each webpage visited, the domain of the webpage, and the category of the domain (41 distinct categories). In total, these 2,148 users made 9,151,243 URL visits, spanning 49,918 unique domains. For each user in our data set, we have self-reported information (collected via a survey) about their gender and age.
We acknowledge the support of Respondi AG, which provided the web tracking and survey data free of charge for research purposes, with special thanks to François Erner and Luc Kalaora at Respondi for their insights and help with data extraction.
The data set is analyzed in the following paper:
The code used to analyze the data is also available at https://github.com/gesiscss/web_tracking.
If you use data or code from this repository, please cite the paper above and the Zenodo link.
Users are advised that some domains in this data set may link to potentially questionable or inappropriate content. The domains have not been individually reviewed, as content verification was not the primary objective of this data set. Therefore, user discretion is strongly recommended when accessing or scraping any content from these domains.
The CalFish Abundance Database contains a comprehensive collection of anadromous fisheries abundance information. Beginning in 1998, the Pacific States Marine Fisheries Commission, the California Department of Fish and Game, and the National Marine Fisheries Service began a cooperative project aimed at collecting, archiving, and entering into standardized electronic formats the wealth of information generated by fisheries resource management agencies and tribes throughout California.

Extensive data are currently available for chinook, coho, and steelhead. Major data categories include adult abundance population estimates, actual fish and/or carcass counts, counts of fish collected at dams, weirs, or traps, and redd counts. Harvest data has been compiled for many streams, and hatchery return data has been compiled for the state's mitigation facilities. A draft format has been developed for juvenile abundance and awaits final approval.

This CalFish Abundance Database shapefile was generated from fully routed 1:100,000 hydrography. In a few cases streams had to be added to the hydrography dataset in order to provide a means to create shapefiles to represent abundance data associated with them. Streams added were digitized at no more than 1:24,000 scale based on stream line images portrayed in 1:24,000 Digital Raster Graphics (DRG).

These features generally represent abundance counts resulting from stream surveys. The linear features in this layer typically represent the location to which abundance data records apply: the reach or length of stream surveyed, or the stream sections for which a given population estimate applies. In some cases the actual stream section surveyed was not specified and linear features represent the entire stream. In many cases there are multiple datasets associated with the same length of stream, and so linear features overlap. Please view the associated datasets for detail regarding specific features.
In CalFish these are accessed through the "link" field that is visible when performing an identify or query operation. A URL string is provided with each feature in the downloadable data which can also be used to access the underlying datasets. The coho data available via the CalFish website is actually linked directly to the StreamNet website, where the database's tabular data is currently stored. Additional information about StreamNet may be found at http://www.streamnet.org. Complete documentation for the StreamNet database may be accessed at http://www.streamnet.org/def.html
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
LinkedIn Company Page Data - The Data Analytics Academy

Dataset Overview
This dataset contains detailed insights from The Data Analytics Academy's LinkedIn Company Page, including information on content performance, followers, and visitors. The data is sourced directly from our LinkedIn analytics and has been organized into CSV files for ease of use.
Files Included:
Content Data: Performance metrics for posts and updates shared on our LinkedIn page.
Followers Data: Demographics and growth metrics of our LinkedIn page followers.
Visitors Data: Insights on page visitors, including demographics and engagement levels.

Use Cases:
Social Media Analytics: Analyze the performance of content and its reach among different demographics.
Market Research: Understand audience demographics and how they engage with our page.
Data Science Projects: Apply machine learning algorithms to predict content performance or audience growth.

Acknowledgments
This data is free to use for any purpose, including commercial use. However, if you use this dataset, please give credit to The Data Analytics Academy by mentioning us or linking to our LinkedIn page: The Data Analytics Academy.
Inspiration This dataset can be used to explore various aspects of LinkedIn analytics, such as identifying trends in audience engagement, understanding content performance, and predicting follower growth.
In Copyright: http://rightsstatements.org/vocab/InC/1.0/
Brief Description of Dataset Files
images.zip -- contains all the ROADWork images that have been manually annotated.
sem_seg_labels.zip -- contains semantic segmentation labels for images in images.zip in the Cityscapes format.
annotations.zip -- contains instance segmentations, sign information, scene descriptions and other labels for images in images.zip in a COCO-like format. It contains multiple splits, suited for different tasks. Please see Usage for more information.
discovered_images.zip -- contains discovered images with roadwork scenes from the BDD100K and Mapillary datasets (fewer than 1000 images in total). These images are provided for ease of access ONLY. See below for specific license information for these external datasets.
traj_images.zip -- contains images associated with pathways. These images were manually filtered to contain ground truth pathways obtained from COLMAP. The split is described in Usage, to avoid data contamination from models trained on images.zip.
traj_annotations.zip -- contains pathway annotations corresponding to images in traj_images.zip.
traj_images_dense.zip -- contains the dense set of images with associated pathways. These are similar to traj_images.zip, but are not subsampled.
traj_annotations_dense.zip -- contains pathway annotations corresponding to images in traj_images_dense.zip.
videos_compressed.zip -- contains video snippets from the Roadbotics Open Dataset that we used to compute pathways using COLMAP.

This repository contains all the data from the ROADWork dataset. Please visit our project webpage for more information on the dataset: www.cs.cmu.edu/~ILIM/roadwork_dataset/

Usage
Please go to our GitHub repository: https://github.com/anuragxel/roadwork-dataset/
License
ROADWork dataset images collected by us and all the annotations are licensed under the Open Data Commons Attribution License v1.0. All images from the Roadbotics Dataset are provided for ease of access, and they are licensed under the Open Data Commons Attribution License v1.0. Any other data from other datasets (e.g. data in discovered_images.zip) is distributed with its own licenses and terms.

License of Discovered Images
A small sample of Mapillary Vistas dataset images (in the mappilary/ subdirectory of discovered_images.zip) is provided for ease of use. These images are licensed under the CC BY-NC-SA license and the Mapillary Terms of Use. You agree to these terms if you use these images in any form. Please visit the following link for more information about the Mapillary Vistas dataset: https://www.mapillary.com/dataset/vistas

A small sample of BDD100K dataset images (in the bdd100k/ subdirectory of discovered_images.zip) is provided for ease of use. These images are licensed according to https://doc.bdd100k.com/license.html, which allows us to distribute these images with attribution. You agree to their license agreement if you use these images in any form. Please visit the following link for more information about the BDD100K dataset: http://bdd-data.berkeley.edu/
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistics of trips taken on HireDevice in Hamilton City. To get data for this dataset, please call the API directly against the HCC Data Warehouse: https://api.hcc.govt.nz/OpenData/get_hiredevice_trip?Page=1&Start_Date=2020-10-01&End_Date=2020-10-02. This API has three mandatory parameters: Page, Start_Date, and End_Date; sample values for these parameters are in the link above. When calling the API for the first time, please always start with Page 1. From the returned JSON you can then see more information, such as the total page count and page size. For help on using the API in your preferred data analysis software, please contact dale.townsend@hcc.govt.nz. NOTE: Anomalies and missing data may be present in the dataset.

Column_Info
Trip_Id, varchar : Unique identifier of the trip
Trip_Duration, int : Duration of the trip in seconds
Trip_Distance, int : Distance of the trip in metres
Device_Id, varchar : Unique identifier of the GPS device on the scooter
Vehicle_Id, varchar : Unique identifier of the scooter
Start_Time, datetime : Date and time that the trip started
End_Time, datetime : Date and time that the trip ended

Relationship
This table is referenced by HireDevice_Route.

Analytics
For convenience, Hamilton City Council has also built a Quick Analytics Dashboard over this dataset that you can access here.

Disclaimer
Hamilton City Council does not make any representation or give any warranty as to the accuracy or exhaustiveness of the data released for public download. Levels, locations and dimensions of works depicted in the data may not be accurate due to circumstances not notified to Council. A physical check should be made on all levels, locations and dimensions before starting design or works. Hamilton City Council shall not be liable for any loss, damage, cost or expense (whether direct or indirect) arising from reliance upon or use of any data provided, or Council's failure to provide this data.

While you are free to crop, export and re-purpose the data, we ask that you attribute the Hamilton City Council and clearly state that your work is a derivative and not the authoritative data source. Please include the following statement when distributing any work derived from this data: 'This work is derived entirely or in part from Hamilton City Council data; the provided information may be updated at any time, and may at times be out of date, inaccurate, and/or incomplete.'
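A minimal sketch of calling the API with its three mandatory parameters. The JSON response is said to expose a total page count, but the exact field name is not documented here, so `TotalPages` below is an assumption to be checked against a real Page-1 response:

```python
from urllib.parse import urlencode

BASE = "https://api.hcc.govt.nz/OpenData/get_hiredevice_trip"

def build_url(page, start_date, end_date):
    # All three parameters (Page, Start_Date, End_Date) are mandatory;
    # always begin with Page=1 on the first call.
    query = urlencode({"Page": page, "Start_Date": start_date, "End_Date": end_date})
    return f"{BASE}?{query}"

first_url = build_url(1, "2020-10-01", "2020-10-02")

# Paging sketch (requires the `requests` package; "TotalPages" is a guessed
# field name -- inspect the Page-1 response for the real one):
# data = requests.get(first_url).json()
# for page in range(2, data["TotalPages"] + 1):
#     page_data = requests.get(build_url(page, "2020-10-01", "2020-10-02")).json()
```

The same pattern applies to the get_traffic_link_stats endpoint described further below.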
The data represent web scraping of hyperlinks from a selection of environmental stewardship organizations that were identified in the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017). There are two data sets: 1) the original scrape containing all hyperlinks within the websites and associated attribute values (see "README" file); 2) a cleaned and reduced dataset formatted for network analysis.

For dataset 1: Organizations were selected from the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017), a publicly available spatial data set about environmental stewardship organizations working in New York City, USA (N = 719). To create a smaller and more manageable sample to analyze, all organizations that intersected (i.e., worked entirely within or overlapped) the NYC borough of Staten Island were selected for a geographically bounded sample. Only organizations with working websites that the web scraper could access were retained for the study (n = 78). The websites were scraped between 09 and 17 June 2020 to a maximum search depth of ten using the snaWeb package (version 1.0.1, Stockton 2020) in the R computational language environment (R Core Team 2020).

For dataset 2: The complete scrape results were cleaned, reduced, and formatted as a standard edge array (node1, node2, edge attribute) for network analysis. See the "README" file for further details.

References:
R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. Version 4.0.3.
Stockton, T. (2020). snaWeb Package: An R package for finding and building social networks for a website, version 1.0.1.
USDA Forest Service. (2017). Stewardship Mapping and Assessment Project (STEW-MAP). New York City Data Set. Available online at https://www.nrs.fs.fed.us/STEW-MAP/data/.

This dataset is associated with the following publication: Sayles, J., R. Furey, and M. Ten Brink. How deep to dig: effects of web-scraping search depth on hyperlink network analysis of environmental stewardship organizations. Applied Network Science. Springer Nature, New York, NY, 7: 36, (2022).
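The edge-array format described above can be consumed directly for basic network measures. A small sketch with made-up node names (the original analysis used R; this is only an illustrative Python equivalent):

```python
from collections import Counter

# (node1, node2, edge attribute) rows, as in dataset 2's standard edge array.
edges = [
    ("org_a", "org_b", "hyperlink"),
    ("org_a", "org_c", "hyperlink"),
    ("org_b", "org_c", "hyperlink"),
]

out_degree = Counter(src for src, dst, attr in edges)  # hyperlinks each site sends
in_degree = Counter(dst for src, dst, attr in edges)   # hyperlinks each site receives
```

Richer measures (centrality, components) follow by loading the same edge array into a graph library such as networkx or igraph.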
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Vehicle travel time and delay data on sections of road in Hamilton City, based on Bluetooth sensor records. To get data for this dataset, please call the API directly against the HCC Data Warehouse: https://api.hcc.govt.nz/OpenData/get_traffic_link_stats?Page=1&Start_Date=2021-06-02&End_Date=2021-06-03. This API has three mandatory parameters: Page, Start_Date, and End_Date; sample values for these parameters are in the link above. When calling the API for the first time, please always start with Page 1. From the returned JSON you can then see more information, such as the total page count and page size. For help on using the API in your preferred data analysis software, please contact dale.townsend@hcc.govt.nz. NOTE: Anomalies and missing data may be present in the dataset.

Column_Info
Link_Id, int : Unique link identifier
Travel_Time, int : Average travel time in seconds to travel along the link
Average_Delay, int : Average travel delay in seconds, calculated as the difference between the free flow travel time and observed travel time
Date, varchar : Starting date and time for the recorded delay and travel time, in 15-minute periods

Relationship
This table references the table Traffic_Link.

Analytics
For convenience, Hamilton City Council has also built a Quick Analytics Dashboard over this dataset that you can access here.

Disclaimer
Hamilton City Council does not make any representation or give any warranty as to the accuracy or exhaustiveness of the data released for public download. Levels, locations and dimensions of works depicted in the data may not be accurate due to circumstances not notified to Council. A physical check should be made on all levels, locations and dimensions before starting design or works. Hamilton City Council shall not be liable for any loss, damage, cost or expense (whether direct or indirect) arising from reliance upon or use of any data provided, or Council's failure to provide this data.

While you are free to crop, export and re-purpose the data, we ask that you attribute the Hamilton City Council and clearly state that your work is a derivative and not the authoritative data source. Please include the following statement when distributing any work derived from this data: 'This work is derived entirely or in part from Hamilton City Council data; the provided information may be updated at any time, and may at times be out of date, inaccurate, and/or incomplete.'
Privacy policy: https://crawlfeeds.com/privacy_policy
Unlock valuable insights with our comprehensive Home Depot product dataset. This dataset is meticulously curated, offering detailed information on a wide range of products available at Home Depot.
Available Home Depot datasets:
We offer a wide range of categories, including furniture, home décor, painting, plumbing, and many more. Explore all available options here.
Whether you're conducting market research, enhancing your e-commerce platform, or analyzing retail trends, this dataset is an invaluable resource. It includes product names, descriptions, prices, categories, and more. Optimize your projects with high-quality, structured data from one of the largest home improvement retailers in the world.
Stay ahead in the competitive market with accurate and up-to-date product information.
The latest Home Depot products dataset contains around 2 million records. Get in touch with Crawl Feeds to request any updates to the dataset.
For a closer look at the product-level data we’ve extracted from Home Depot, including pricing, stock status, and detailed specifications, visit the Home Depot dataset page. You can explore sample records and submit a request for tailored extracts directly from there.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Variable Message Signs (VMS) in York. For further information about traffic management please visit the City of York Council website. *Please note that the data published within this dataset is a live API link to CYC's GIS server. Any changes made to the master copy of the data will be immediately reflected in the resources of this dataset. The date shown in the "Last Updated" field of each GIS resource reflects when the data was first published.
Author: Rami Mustafa A Mohammad (University of Huddersfield, rami.mohammad '@' hud.ac.uk, rami.mustafa.a '@' gmail.com), Lee McCluskey (University of Huddersfield, t.l.mccluskey '@' hud.ac.uk), Fadi Thabtah (Canadian University of Dubai, fadi '@' cud.ac.ae)
Source: UCI
Please cite: Please refer to the Machine Learning Repository's citation policy
Data Set Information:
One of the challenges faced by our research was the unavailability of reliable training datasets; in fact, this challenge faces any researcher in the field. Although plenty of articles about predicting phishing websites have been disseminated, no reliable training dataset has been published publicly, perhaps because there is no agreement in the literature on the definitive features that characterize phishing webpages, which makes it difficult to shape a dataset that covers all possible features. In this dataset, we shed light on the important features that have proved to be sound and effective in predicting phishing websites. In addition, we propose some new features.
Attribute Information:
For further information about the features, see the features file in the data folder of UCI.
Relevant Papers:
Mohammad, Rami, McCluskey, T.L. and Thabtah, Fadi (2012) An Assessment of Features Related to Phishing Websites using an Automated Technique. In: International Conference For Internet Technology And Secured Transactions. ICITST 2012. IEEE, London, UK, pp. 492-497. ISBN 978-1-4673-5325-0
Mohammad, Rami, Thabtah, Fadi Abdeljaber and McCluskey, T.L. (2014) Predicting phishing websites based on self-structuring neural network. Neural Computing and Applications, 25 (2). pp. 443-458. ISSN 0941-0643
Mohammad, Rami, McCluskey, T.L. and Thabtah, Fadi Abdeljaber (2014) Intelligent Rule based Phishing Websites Classification. IET Information Security, 8 (3). pp. 153-160. ISSN 1751-8709
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In medical imaging, DL models are often tasked with delineating structures or abnormalities within complex anatomical structures, such as tumors, blood vessels, or organs. Uncertainty arises from the inherent complexity and variability of these structures, leading to challenges in precisely defining their boundaries. This uncertainty is further compounded by interrater variability, as different medical experts may have varying opinions on where the true boundaries lie. DL models must grapple with these discrepancies, leading to inconsistencies in segmentation results across different annotators and potentially impacting diagnosis and treatment decisions. Addressing interrater variability in DL for medical segmentation involves the development of robust algorithms capable of capturing and quantifying uncertainty, as well as standardizing annotation practices and promoting collaboration among medical experts to reduce variability and improve the reliability of DL-based medical image analysis. Interrater variability poses significant challenges in the field of DL for medical image segmentation.
Furthermore, achieving model calibration, a fundamental aspect of reliable predictions, becomes notably challenging when dealing with multiple classes and raters. Calibration is pivotal for ensuring that predicted probabilities align with the true likelihood of events, enhancing the model's reliability. It must also be considered that, even if not immediately apparent, having multiple classes introduces uncertainties arising from the interactions between them. Moreover, incorporating annotations from multiple raters adds another layer of complexity, as differing expert opinions contribute to a broader spectrum of variability and computational complexity.
Due to all the previously stated reasons, we have created a challenge that considers all of the above. In this challenge, we will work with abdominal CT scans. Each of them will have three different annotations obtained from different experts and each of the annotations will have three classes: pancreas, kidney and liver.
The challenge cohort consists of 90 CT images prospectively gathered at the University Hospital Erlangen between August 2023 and October 2023. Each CT will have multiple classes: background (0), pancreas (1), kidney (2) and liver (3). In addition, each of the CTs will have three different annotations from three different experts, each containing the four classes specified previously.
20 CT scans belonging to group A, with their respective annotations, will be given. Participants are encouraged to leverage publicly available external data annotated by multiple raters. The idea of providing a small training set while allowing the use of public datasets is to make the challenge more inclusive, giving participants the option to develop a method using data that is in anyone's hands. Furthermore, training on this data and evaluating on other data makes methods more robust to shifts and other sources of variability between datasets.
5 CT scans belonging to group A will be used for the validation phase.

65 CT scans will be used for evaluation: 20 CTs belonging to group A, 22 CTs belonging to group B and 23 CTs belonging to group C.

Neither the validation nor the testing CT scan cohorts will be published until the end of the challenge. Furthermore, which group each CT scan belongs to will not be revealed until after the challenge.
Inclusion criteria were a maximum of 10 cysts, each with a diameter of less than 2.0 cm. Furthermore, CT scans with major artifacts (e.g. breathing artifacts) or incomplete registrations were excluded.
Participants were required to be over 18 years old and provide both verbal and written consent for the use of their CT images in the challenge. Both study-specific and broad consent were obtained. Among the 90 patients, there were 51 males and 39 females, aged between 37 and 94 years, with an average age of 65.7 years. All patients received treatment at the University Hospital Erlangen in Bavaria, Germany. No additional selection criteria were set, to ensure a representative sample of a typical patient cohort.
Our overall data consist of 90 CTs split into three different groups:
Group A: cases with 2 cysts or less with no contour altering pathologies - 45 CTs
Group B: cases with 3-5 cysts with no contour altering pathologies - 22 CTs
Group C: cases with 6-10 cysts with some pathologies included (liver metastases, hydronephrosis, adrenal gland metastases, missing kidney) - 23 CTs
However, in any case, the participants will not know which case belongs to which group. This information will be released after the challenge, together with the whole dataset.
The first step for obtaining the labels was using TotalSegmentator [1] [2] to get rough annotations. Then the labels were sent to three radiologists (R1, R2, R3) to both correct the automatic annotations and add possible missing organs. One of the three labeling radiologists, an MD PhD candidate, had previously defined both the dataset cohort and the criteria for what belongs to the parenchyma and what does not; these criteria were given to the other two labeling radiologists so that all three would be coherent with each other [3]. Separately, two other clinicians (C1, C2) supervised the criteria of the cohort defined by the MD PhD candidate but had no involvement in the labeling itself; hence, there is no bias between the annotations of the different radiologists.
Each labeled class for this challenge has specific instructions. Below are listed per organ.
The CTs needed to be contrast-enhanced CT scans in a portal venous phase, acquired with thin slices ranging from 0.6 to 1 mm. Thoracic-abdominal CT images were taken during the patients' hospital stay, motivated by various medical needs. Given the focus on abdominal organs, the Br40 soft kernel was employed. CT examinations were conducted using SIEMENS CT scanners at the University Hospital Erlangen, with rotation speeds of 0.25 or 0.5 sec. Detector collimation varied from 128x0.6 mm single source to 98x0.6x2 and 144x0.4x2 dual source configurations. Spiral pitch factors ranged from 0.3 to 1.3. The mean reference tube current was set at 200 mAs, adjustable to 120 mAs. Automated tube voltage adaptation and tube current modulation were implemented in all instances. Contrast agent administration was standard practice, with an injection rate of 3-4 mL/s and a body-weight-adjusted dosage of 400 mg(iodine)/kg (equivalent to 1.14 mL/kg Iomeprol 350 mg/mL). All images underwent reconstruction using soft convolution kernels and iterative techniques.
Ethical Approval and Data Usage
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The 2018 TRI preliminary dataset consists of TRI data for 2018. Users should note that while these preliminary data have undergone the basic data quality checks included in the online TRI reporting software, they have not undergone the complete TRI data quality process. In addition, EPA does not aggregate or summarize these data, or offer any analysis or interpretation of them.
You can use the TRI preliminary dataset to:
- Identify how many TRI facilities operate in a certain geographic area (for example, a ZIP code);
- Identify which chemicals are being managed by TRI facilities and in what quantities; and
- Find out if a particular facility initiated any pollution prevention activities in the most recent calendar year.
The agency will update the dataset several times in August and September based on information from facilities. EPA plans to publish the complete, quality-checked 2018 dataset in October 2019, followed by the 2018 TRI National Analysis in January 2020.
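The ZIP-code lookup described above amounts to a simple filter over facility records. The sketch below uses made-up records and field names (`zip`, `chemical`, `quantity_lb`), which are illustrative assumptions and not the actual TRI schema:

```python
# Hedged sketch: filtering TRI-style facility records by ZIP code.
# The records and field names below are illustrative, not the real TRI schema.

facilities = [
    {"name": "Plant A", "zip": "30301", "chemical": "Toluene", "quantity_lb": 1200},
    {"name": "Plant B", "zip": "30301", "chemical": "Lead", "quantity_lb": 45},
    {"name": "Plant C", "zip": "60601", "chemical": "Benzene", "quantity_lb": 300},
]

def facilities_in_zip(records, zip_code):
    """Return all facility records reporting from the given ZIP code."""
    return [r for r in records if r["zip"] == zip_code]

matches = facilities_in_zip(facilities, "30301")
print(len(matches))                        # 2
print(sorted(r["name"] for r in matches))  # ['Plant A', 'Plant B']
```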
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
This dataset was curated for TabArena by the TabArena team as part of the TabArena Tabular ML IID Study. For more details on the study, see our paper.
Dataset Focus: This dataset is intended for evaluating predictive machine learning models on independent and identically distributed (IID) tabular data. The intended task is classification.
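As a minimal sketch of the kind of evaluation such a classification task involves, held-out accuracy can be computed as follows. The label arrays are made up for illustration; TabArena provides its own benchmarking harness:

```python
# Hedged sketch: a minimal classification-accuracy check, the simplest metric
# used when evaluating predictive models on IID tabular data.
# The labels below are illustrative only.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label arrays must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print(accuracy(y_true, y_pred))  # 0.8
```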
The CalFish Abundance Database contains a comprehensive collection of anadromous fisheries abundance information. Beginning in 1998, the Pacific States Marine Fisheries Commission, the California Department of Fish and Game, and the National Marine Fisheries Service began a cooperative project aimed at collecting, archiving, and entering into standardized electronic formats the wealth of information generated by fisheries resource management agencies and tribes throughout California. Extensive data are currently available for chinook, coho, and steelhead. Major data categories include adult abundance population estimates, actual fish and/or carcass counts, counts of fish collected at dams, weirs, or traps, and redd counts. Harvest data have also been compiled for many streams. This CalFish Abundance Database shapefile was generated from fully routed 1:100,000 hydrography. In a few cases streams had to be added to the hydrography dataset in order to create shapefiles representing the abundance data associated with them. Streams added were digitized at no more than 1:24,000 scale, based on stream line images portrayed in 1:24,000 Digital Raster Graphics (DRG). These features represent abundance information resulting from counts at weirs, fish ladders, or other point-type monitoring protocols such as beach seining. The point features in this layer typically represent the location to which abundance data records apply. In many cases multiple datasets are associated with the same point location, so point features overlap. Please view the associated datasets for detail regarding specific features. In CalFish these are accessed through the "link" field that is visible when performing an identify or query operation.
A URL string is provided with each feature in the downloadable data, which can also be used to access the underlying datasets. The coho data available via the CalFish website is linked directly to the StreamNet website, where the database's tabular data is currently stored. Additional information about StreamNet may be downloaded at http://www.streamnet.org. Complete documentation for the StreamNet database may be accessed at http://www.streamnet.org/def.html