100+ datasets found
  1. VALUE Dataset

    • paperswithcode.com
    • library.toponeai.link
    Updated Apr 21, 2024
    + more versions
    Cite
    Linjie Li; Jie Lei; Zhe Gan; Licheng Yu; Yen-Chun Chen; Rohit Pillai; Yu Cheng; Luowei Zhou; Xin Eric Wang; William Yang Wang; Tamara Lee Berg; Mohit Bansal; Jingjing Liu; Lijuan Wang; Zicheng Liu (2024). VALUE Dataset [Dataset]. https://paperswithcode.com/dataset/value
    Explore at:
    Dataset updated
    Apr 21, 2024
    Authors
    Linjie Li; Jie Lei; Zhe Gan; Licheng Yu; Yen-Chun Chen; Rohit Pillai; Yu Cheng; Luowei Zhou; Xin Eric Wang; William Yang Wang; Tamara Lee Berg; Mohit Bansal; Jingjing Liu; Lijuan Wang; Zicheng Liu
    Description

    VALUE is a Video-And-Language Understanding Evaluation benchmark to test models that are generalizable to diverse tasks, domains, and datasets. It is an assemblage of 11 VidL (video-and-language) datasets over 3 popular tasks: (i) text-to-video retrieval; (ii) video question answering; and (iii) video captioning. The VALUE benchmark aims to cover a broad range of video genres, video lengths, data volumes, and task difficulty levels. Rather than focusing on single-channel videos with visual information only, VALUE promotes models that leverage information from both video frames and their associated subtitles, as well as models that share knowledge across multiple tasks.

    The datasets used for the VALUE benchmark are: TVQA, TVR, TVC, How2R, How2QA, VIOLIN, VLEP, YouCook2 (YC2C, YC2R), VATEX

  2. Savings Bonds Value Files

    • datasets.ai
    • catalog.data.gov
    21, 24
    Updated Aug 6, 2024
    + more versions
    Cite
    Department of the Treasury (2024). Savings Bonds Value Files [Dataset]. https://datasets.ai/datasets/savings-bonds-value-files
    Explore at:
    Available download formats: 24, 21
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    United States Department of the Treasury (https://treasury.gov/)
    Authors
    Department of the Treasury
    Description

    The Savings Bond Value Files dataset is used by developers of bond pricing programs to update their systems with new redemption values for accrual savings bonds (Series E, EE, I & Savings Notes). The core data is the same as the Redemption Tables but there are differences in format, amount of data, and date range. The Savings Bonds Value Files dataset is meant for programmers and developers to read in redemption values without having to first convert PDFs.

  3. Film Circulation dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, png
    Updated Jul 12, 2024
    Cite
    Skadi Loist; Skadi Loist; Evgenia (Zhenya) Samoilova; Evgenia (Zhenya) Samoilova (2024). Film Circulation dataset [Dataset]. http://doi.org/10.5281/zenodo.7887672
    Explore at:
    Available download formats: csv, png, bin
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Skadi Loist; Skadi Loist; Evgenia (Zhenya) Samoilova; Evgenia (Zhenya) Samoilova
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”

    A peer-reviewed data paper for this dataset is under review for publication in NECSUS_European Journal of Media Studies, an open-access journal that aims to enhance data transparency and reusability; it will be available from https://necsus-ejms.org/ and https://mediarep.org

    Please cite this when using the dataset.


    Detailed description of the dataset:

    1 Film Dataset: Festival Programs

    The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.

    The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.

    The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.
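The long-to-wide relationship described above can be sketched in a few lines of Python. This is a minimal illustration of the deduplication rule (keep the first sample festival per unique film ID); the field names here are invented for the example, not the dataset's actual variable names.

```python
# Toy long-format rows, one per (film, festival) appearance; field names are
# illustrative, not the dataset's actual variable names.
long_rows = [
    {"film_id": 1, "title": "A", "fest": "Berlinale"},
    {"film_id": 1, "title": "A", "fest": "Frameline"},
    {"film_id": 2, "title": "B", "fest": "Frameline"},
]

# Wide format: one row per unique film, keeping the first sample festival,
# as the wide file does for films shown at more than one sample festival.
wide_rows, seen = [], set()
for row in long_rows:
    if row["film_id"] not in seen:
        seen.add(row["film_id"])
        wide_rows.append(row)

assert [r["film_id"] for r in wide_rows] == [1, 2]
assert wide_rows[0]["fest"] == "Berlinale"  # first appearance wins
```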


    2 Survey Dataset

    The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.

    The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.

    The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.


    3 IMDb & Scripts

    The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.

    The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.

    The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.

    The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.

    The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.

    The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to a crew member. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.

    The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.

    The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.

    The csv file “3_imdb-dataset_release-info_long” contains data about non-festival releases (e.g., theatrical, digital, TV, DVD/Blu-ray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.

    The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.

    The dataset includes eight text files containing the scripts for web scraping. They were written in R 3.6.3 for Windows.

    The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.

    The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then, if no matches are found, falling back to a basic search using an alternative title. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of the suggested records on the IMDb website. The script then defines a loop that matches (including matching scores) each film in the core dataset with suggested films on the IMDb search page. Matching was done on directors, production year (+/- one year), and title, using a fuzzy matching approach with two methods, “cosine” and “osa”: cosine similarity is used to match titles with a high degree of similarity, and the OSA (optimal string alignment) algorithm is used to match titles that may have typos or minor variations.
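The matching script itself is in R (the stringdist package's “cosine” and “osa” methods). As a rough illustration of the cosine half only, here is a minimal Python sketch of cosine similarity over character-bigram counts; the function names and the bigram choice are my own, not the authors'.

```python
import math
from collections import Counter

def bigrams(s):
    """Character-bigram counts of a lowercased string."""
    s = s.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))

def cosine_sim(a, b):
    """Cosine similarity over character-bigram counts (the idea behind
    stringdist's "cosine" method; this re-implementation is illustrative)."""
    va, vb = bigrams(a), bigrams(b)
    dot = sum(va[g] * vb[g] for g in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

# Near-duplicate titles score high; unrelated titles score low.
assert cosine_sim("The Matrix", "The Matrix Reloaded") > cosine_sim("The Matrix", "Roma")
```

In the actual pipeline such a score would be thresholded, with OSA edit distance catching typo-level variants that bigram cosine can miss.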

    The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was categorized into five categories: a) 100% match (perfect match on title, year, and director); b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible doubles in the dataset and identifies them for a manual check.

    The script “r_4_scraping_functions” creates a function for scraping the data from the identified matches (based on the scripts described above and manually checked). These functions are used for scraping the data in the next script.

    The script “r_5a_extracting_info_sample” uses the functions defined in “r_4_scraping_functions” to scrape the IMDb data for the identified matches. It does so for the first 100 films only, to check that everything works. Since scraping the entire dataset took a few hours, a test run on a subsample of 100 films is advisable.

    The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.

    The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time, to make sure that the errors were not caused by disruptions in the internet connection or other technical issues.

    The script “r_check_logs” is used for troubleshooting and tracking the progress of all of the R scripts used. It gives information on the amount of missing values and errors.


    4 Festival Library Dataset

    The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.

    The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definition of variables, such as location and festival name, and festival categories,

  4. Mobil Price range

    • kaggle.com
    Updated Jun 30, 2023
    Cite
    faezeh bagherizadeh (2023). Mobil Price range [Dataset]. https://www.kaggle.com/datasets/faezehbagheri/mobil-price-range
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 30, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    faezeh bagherizadeh
    Description

    Dataset

    This dataset was created by faezeh bagherizadeh

    Contents

  5. 30-m Topographic Wetness Index

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Jun 5, 2024
    + more versions
    Cite
    National Park Service (2024). 30-m Topographic Wetness Index [Dataset]. https://catalog.data.gov/dataset/30-m-topographic-wetness-index
    Explore at:
    Dataset updated
    Jun 5, 2024
    Dataset provided by
    National Park Service (http://www.nps.gov/)
    Description

    The lidar Topographic Wetness Index (TWI) is the TWI data product produced and distributed by the National Park Service, Great Smoky Mountains National Park. Concave, low-gradient areas will gather water (high TWI values), whereas steep, convex areas will shed water (low TWI values). Values range from less than 1 (dry cells) to greater than 20 (wet cells).
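For reference, the standard formula is TWI = ln(a / tan β), where a is the specific catchment (flow-accumulation) area and β is the local slope. A minimal Python sketch of the per-cell computation (an illustration of the formula, not the Park Service's actual lidar workflow):

```python
import math

def twi(specific_catchment_area, slope_radians):
    """Topographic Wetness Index, TWI = ln(a / tan(beta)): high values mark
    flat, convergent (wet) cells; low values mark steep, divergent (dry) cells."""
    return math.log(specific_catchment_area / math.tan(slope_radians))

# A near-flat cell accumulating a lot of flow scores high (wet) ...
wet = twi(5000.0, math.radians(1.0))
# ... a steep cell with little contributing area scores low (dry).
dry = twi(5.0, math.radians(30.0))
assert wet > dry
```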

  6. Car Prices Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Mar 20, 2023
    Cite
    Bright Data (2023). Car Prices Dataset [Dataset]. https://brightdata.com/products/datasets/car-prices
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Mar 20, 2023
    Dataset authored and provided by
    Bright Data
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Gain valuable insights into the automotive market with our comprehensive Car Prices Dataset. Designed for businesses, analysts, and researchers, this dataset provides real-time and historical car pricing data to support market analysis, pricing strategies, and trend forecasting.

    Dataset Features

    Vehicle Listings: Access detailed car listings, including make, model, year, trim, and specifications. Ideal for tracking market trends and pricing fluctuations.
    Pricing Data: Get real-time and historical car prices from multiple sources, including dealerships, marketplaces, and private sellers.
    Market Trends & Valuations: Analyze price changes over time, compare vehicle depreciation rates, and identify emerging pricing trends.
    Dealer & Seller Information: Extract seller details, including dealership names, locations, and contact information for lead generation and competitive analysis.

    Customizable Subsets for Specific Needs

    Our Car Prices Dataset is fully customizable, allowing you to filter data based on vehicle type, location, price range, and other key attributes. Whether you need a broad dataset for market research or a focused subset for competitive analysis, we tailor the dataset to your needs.

    Popular Use Cases

    Market Analysis & Pricing Strategy: Track vehicle price trends, compare competitor pricing, and optimize pricing strategies for dealerships and resellers.
    Automotive Valuation & Depreciation Studies: Analyze historical pricing data to assess vehicle depreciation rates and predict future values.
    Competitive Intelligence: Monitor competitor pricing, dealership inventory, and promotional offers to stay ahead in the market.
    Lead Generation & Sales Optimization: Identify potential buyers and sellers, track demand for specific vehicle models, and enhance sales strategies.
    AI & Predictive Analytics: Leverage structured car pricing data for AI-driven forecasting, automated pricing models, and trend prediction.

    Whether you're tracking car prices, analyzing market trends, or optimizing sales strategies, our Car Prices Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  7. Data from: Count-Based Morgan Fingerprint: A More Efficient and...

    • acs.figshare.com
    xlsx
    Updated Jul 5, 2023
    Cite
    Shifa Zhong; Xiaohong Guan (2023). Count-Based Morgan Fingerprint: A More Efficient and Interpretable Molecular Representation in Developing Machine Learning-Based Predictive Regression Models for Water Contaminants’ Activities and Properties [Dataset]. http://doi.org/10.1021/acs.est.3c02198.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jul 5, 2023
    Dataset provided by
    ACS Publications
    Authors
    Shifa Zhong; Xiaohong Guan
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    In this study, we introduce the count-based Morgan fingerprint (C-MF) to represent chemical structures of contaminants and develop machine learning (ML)-based predictive models for their activities and properties. Compared with the binary Morgan fingerprint (B-MF), C-MF not only qualifies the presence or absence of an atom group but also quantifies its counts in a molecule. We employ six different ML algorithms (ridge regression, SVM, KNN, RF, XGBoost, and CatBoost) to develop models on 10 contaminant-related data sets based on C-MF and B-MF to compare them in terms of the model’s predictive performance, interpretation, and applicability domain (AD). Our results show that C-MF outperforms B-MF in nine of 10 data sets in terms of model predictive performance. The advantage of C-MF over B-MF is dependent on the ML algorithm, and the performance enhancements are proportional to the difference in the chemical diversity of data sets calculated by B-MF and C-MF. Model interpretation results show that the C-MF-based model can elucidate the effect of atom group counts on the target and have a wider range of SHAP values. AD analysis shows that C-MF-based models have an AD similar to that of B-MF-based ones. Finally, we developed a “ContaminaNET” platform to deploy these C-MF-based models for free use.
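The C-MF/B-MF distinction described above can be illustrated in a few lines. This is a toy sketch only: the tokens and the hash below are invented for the example and are not RDKit's Morgan algorithm. The point is that the count vector keeps atom-group multiplicity while the binary vector records only presence.

```python
# Toy sketch of count-based (C-MF) vs binary (B-MF) fingerprints. The tokens
# and the hash below are invented for illustration; they are NOT RDKit's
# Morgan algorithm.
def fingerprints(atom_envs, n_bits=16):
    c_mf = [0] * n_bits
    for env in atom_envs:
        bit = sum(ord(ch) for ch in env) % n_bits  # deterministic toy hash
        c_mf[bit] += 1                             # C-MF keeps the count
    b_mf = [1 if v else 0 for v in c_mf]           # B-MF keeps only presence
    return c_mf, b_mf

# Two identical carbon environments plus one oxygen environment:
c_mf, b_mf = fingerprints(["C", "C", "O"])
assert sum(c_mf) == 3 and sum(b_mf) == 2  # counts keep multiplicity; bits do not
```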

  8. TIGER/Line Shapefile, 2023, County, Worth County, MO, Address Range-Feature

    • gimi9.com
    • catalog.data.gov
    Cite
    TIGER/Line Shapefile, 2023, County, Worth County, MO, Address Range-Feature [Dataset]. https://gimi9.com/dataset/data-gov_tiger-line-shapefile-2023-county-worth-county-mo-address-range-feature
    Explore at:
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Worth County, Missouri
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or the shapefiles can be combined to cover the entire nation. The Address Ranges Feature Shapefile (ADDRFEAT.dbf) contains the geospatial edge geometry and attributes of all unsuppressed address ranges for a county or county equivalent area. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side, relative to the direction in which the edge is coded. Single-address address ranges have been suppressed to maintain the confidentiality of the addresses they describe. Multiple coincident address range feature edge records are represented in the shapefile if more than one left or right address range is associated with the edge. The ADDRFEAT shapefile contains a record for each address range to street name combination. Address ranges associated with more than one street name are also represented by multiple coincident address range feature edge records. Note that the ADDRFEAT shapefile includes all unsuppressed address ranges, in contrast to the All Lines Shapefile (EDGES.shp), which only includes the most inclusive address range associated with each side of a street edge. The TIGER/Line shapefiles contain potential address ranges, not individual addresses. The address ranges in the TIGER/Line files are potential ranges that include the full range of possible structure numbers, even though the actual structures may not exist.
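The "address range" definition above (all potential structure numbers of a given parity between a first and last number along one edge side) can be sketched as follows. The helper is hypothetical, written only to illustrate the definition, and is not Census Bureau code.

```python
def address_range(first, last, parity):
    """All potential structure numbers from `first` to `last` with the given
    parity ("even" or "odd"), per the address-range definition above."""
    start = first if (first % 2 == 0) == (parity == "even") else first + 1
    return list(range(start, last + 1, 2))

# One side of an edge carrying the odd numbers between 100 and 110:
assert address_range(100, 110, "odd") == [101, 103, 105, 107, 109]
```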

  9. Dataset for the study on Interactively building a representation of actions...

    • data.4tu.nl
    zip
    Updated Sep 24, 2018
    Cite
    J. (Jan) Balata; Myrthe Tielman; J. (Jakub) Berka; X. (Xueliang) Li; C.M. (Catholijn) Jonker; Z. (Zdenek) Mikovec; M.B. (Birna) van Riemsdijk (2018). Dataset for the study on Interactively building a representation of actions and values [Dataset]. http://doi.org/10.4121/uuid:95600630-8fb2-45b2-bf49-a6a14ffc4fde
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 24, 2018
    Dataset provided by
    4TU (https://www.4tu.nl/)
    Authors
    J. (Jan) Balata; Myrthe Tielman; J. (Jakub) Berka; X. (Xueliang) Li; C.M. (Catholijn) Jonker; Z. (Zdenek) Mikovec; M.B. (Birna) van Riemsdijk
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Personal electronic partners can play an important role in people's daily lives, especially for vulnerable user groups. However, to achieve an e-partner that can support a wide range of users, it should personalize in a manner that is closely related to the users' actual behavior and respects their values. We therefore propose to use Action Identification Hierarchies (AIHs) to represent the user's actions and values. In a qualitative study, we investigated how users themselves can build their AIH in conversation with an agent. Both visually impaired people (n=7) and university workers (n=9) participated, talking about traveling and stressful behavior respectively. Our goals were to see how understandable the AIH was to different users, whether the AIH content was subjectively correct, and how usable the system was. The results from this study are an important step in applying AIHs to different user groups, and highlight the importance of understandability, usability and flexibility.

  10. Data from: HomeRange: A global database of mammalian home ranges

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Sep 13, 2023
    Cite
    Maarten Broekman; Selwyn Hoeks; Rosa Freriks; Merel Langendoen; Katharina Runge; Ecaterina Savenco; Ruben ter Harmsel; Mark Huijbregts; Marlee Tucker (2023). HomeRange: A global database of mammalian home ranges [Dataset]. http://doi.org/10.5061/dryad.d2547d85x
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 13, 2023
    Dataset provided by
    Radboud University Nijmegen
    Authors
    Maarten Broekman; Selwyn Hoeks; Rosa Freriks; Merel Langendoen; Katharina Runge; Ecaterina Savenco; Ruben ter Harmsel; Mark Huijbregts; Marlee Tucker
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Motivation: Home range is a common measure of animal space use, as it provides ecological information that is useful for conservation applications. In macroecological studies, values are typically aggregated to species means to examine general patterns of animal space use. However, this ignores the environmental context in which the home range was estimated and does not account for intraspecific variation in home range size. In addition, the focus of macroecological studies on home ranges has been historically biased toward terrestrial mammals. The use of aggregated numbers and the terrestrial focus limit our ability to examine home range patterns across different environments, variation in time and between different levels of organisation. Here we introduce HomeRange, a global database with 75,611 home-range values across 960 different mammal species, including terrestrial as well as aquatic and aerial species.

    Main types of variable contained: The dataset contains mammal home-range estimates, species names, methodological information on data collection, home-range estimation method, period of data collection, study coordinates and name of location, as well as species traits derived from the studies, such as body mass, life stage, reproductive status and locomotor habit.

    Spatial location and grain: The collected data are distributed globally. Across studies, the spatial accuracy varies, with the coarsest resolution being 1 degree.

    Time period and grain: The data represent information published between 1939 and 2022. Across studies, the temporal accuracy varies; some studies report start and end dates specific to the day, while for others only the month or year is reported.

    Major taxa and level of measurement: Mammal species from 24 of the 27 different taxonomic orders. Home-range estimates range from individual-level values to population-level averages.

    Methods: Mammalian home range papers were compiled via an extensive literature search. All home range values were extracted from the literature, including individual, group and population-level home range values. Associated values were also compiled, including species names, methodological information on data collection, home-range estimation method, period of data collection, study coordinates and name of location, as well as species traits derived from the studies, such as body mass, life stage, reproductive status and locomotor habit. Here we include the database, associated metadata and a reference list of all sources from which home range data were extracted. We also provide an R package, which can be installed from https://github.com/SHoeks/HomeRange. The HomeRange R package provides functions for downloading the latest version of the HomeRange database and loading it as a standard dataframe into R, plotting several statistics of the database, and attaching species traits (e.g. species average body mass, trophic level) from the COMBINE dataset (Soria et al. 2021) for statistical analysis.
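The species-mean aggregation that the motivation above argues against can be sketched in a few lines; averaging discards the intraspecific spread the database is built to preserve. Column names and values here are illustrative, not HomeRange's actual variables.

```python
from collections import defaultdict

# Toy individual-level records; field names and values are illustrative,
# not HomeRange's actual columns.
records = [
    {"species": "Vulpes vulpes", "home_range_km2": 2.5},
    {"species": "Vulpes vulpes", "home_range_km2": 7.5},
    {"species": "Lynx lynx",     "home_range_km2": 250.0},
]

by_species = defaultdict(list)
for rec in records:
    by_species[rec["species"]].append(rec["home_range_km2"])

species_means = {sp: sum(v) / len(v) for sp, v in by_species.items()}
assert species_means["Vulpes vulpes"] == 5.0  # intraspecific spread is lost
```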

  11. daily-historical-stock-price-data-for-deep-value-driller-as-20212025

    • huggingface.co
    Cite
    Khaled Ben Ali, daily-historical-stock-price-data-for-deep-value-driller-as-20212025 [Dataset]. https://huggingface.co/datasets/khaledxbenali/daily-historical-stock-price-data-for-deep-value-driller-as-20212025
    Explore at:
    Authors
    Khaled Ben Ali
    Description

    📈 Daily Historical Stock Price Data for Deep Value Driller AS (2021–2025)

    A clean, ready-to-use dataset containing daily stock prices for Deep Value Driller AS from 2021-05-31 to 2025-05-28. This dataset is ideal for use in financial analysis, algorithmic trading, machine learning, and academic research.

      🗂️ Dataset Overview

    Company: Deep Value Driller AS
    Ticker Symbol: DVD.OL
    Date Range: 2021-05-31 to 2025-05-28
    Frequency: Daily
    Total Records: 1008 rows (one per… See the full description on the dataset page: https://huggingface.co/datasets/khaledxbenali/daily-historical-stock-price-data-for-deep-value-driller-as-20212025.

  12. Number of residential properties, by property type, assessment value range...

    • beta.data.urbandatacentre.ca
    • datasets.ai
    • +4more
    Updated Sep 13, 2024
    + more versions
    Cite
    (2024). Number of residential properties, by property type, assessment value range and residency status, census metropolitan area of Toronto and Vancouver and their census subdivisions [Dataset]. https://beta.data.urbandatacentre.ca/dataset/gov-canada-fde28f54-9253-498e-84e7-489da69c73b6
    Explore at:
    Dataset updated
    Sep 13, 2024
    License

    Open Government Licence - Canada 2.0, https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Area covered
    Vancouver, Toronto
    Description

    This table contains data on the number of residential properties, by property type, assessment value range and residency type for the census metropolitan areas of Toronto and Vancouver and their census subdivisions.

  13. Aquifer framework datasets used to represent the Arbuckle-Simpson aquifer,...

    • s.cnmilf.com
    • data.usgs.gov
    • +1more
    Updated Sep 26, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Aquifer framework datasets used to represent the Arbuckle-Simpson aquifer, Oklahoma [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/aquifer-framework-datasets-used-to-represent-the-arbuckle-simpson-aquifer-oklahoma
    Explore at:
    Dataset updated
    Sep 26, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Oklahoma
    Description

    The Arbuckle-Simpson aquifer covers an area of about 800 square miles in the Arbuckle Mountains and Arbuckle Plains of South-Central Oklahoma. The aquifer is in the Central Lowland Physiographic Province and is composed of the Simpson and Arbuckle Groups of Ordovician and Cambrian age. The aquifer is as thick as 9,000 feet in some areas. The aquifer provides relatively small, but important, amounts of water depended on for public supply, agricultural, and industrial use (HA 730-E). This product provides source data for the Arbuckle-Simpson aquifer framework, including: Georeferenced images: 1. i_46ARBSMP_bot.tif: Digitized figure of depth contour lines below land surface representing the base of fresh water in the Arbuckle-Simpson aquifer. The base of fresh water is considered to be the bottom of the Arbuckle-Simpson aquifer. The original figure is from the "Reconnaissance of the water resources of the Ardmore and Sherman Quadrangles, southern Oklahoma" report, map HA-3, page 2, prepared by the Oklahoma Geological Survey in cooperation with the U.S. Geological Survey (HA3_P2). Extent shapefiles: 1. p_46ABKSMP.shp: Polygon shapefile containing the areal extent of the Arbuckle-Simpson aquifer (Arbuckle-Simpson_AqExtent). The extent file contains no aquifer subunits. Contour line shapefiles: 1. c_46ABKSMP_bot.shp: Contour line dataset containing depth values, in feet below land surface, across the bottom of the Arbuckle-Simpson aquifer. This dataset is a digitized version of the map published in HA3_P2. This dataset was used to create the rd_46ABKSMP_bot.tif raster dataset. This map generalized depth values into zoned areas with associated ranges of depth. The edge of each zone was treated as the minimum value of the assigned range, thus creating the depth contour lines. This interpretation was favorable as it allowed for the creation of the resulting raster. This map was used because more detailed point or contour data for the area is unavailable. 
    Altitude raster files:

    1. ra_46ABKSMP_top.tif: Altitude raster dataset of the top of the Arbuckle-Simpson aquifer. The altitude values are in meters referenced to the North American Vertical Datum of 1988 (NAVD88). The top of the aquifer is assumed to be at land surface (NED, 100-meter) based on available data. This raster was interpolated from the Digital Elevation Model (DEM) dataset (NED, 100-meter).
    2. ra_46ABKSMP_bot.tif: Altitude raster dataset of the bottom of the Arbuckle-Simpson aquifer. The altitude values are in meters referenced to NAVD88.

    Depth raster files:

    1. rd_46ABKSMP_top.tif: Depth raster dataset of the top of the Arbuckle-Simpson aquifer. The depth values are in meters below land surface (NED, 100-meter). The top of the aquifer is assumed to be at land surface (NED, 100-meter) based on available data.
    2. rd_46ABKSMP_bot.tif: Depth raster dataset of the bottom of the Arbuckle-Simpson aquifer. The depth values are in meters below land surface (NED, 100-meter). This raster was interpolated from the contour line dataset c_46ABKSMP_bot.shp.
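    The altitude and depth rasters are related through the land-surface DEM: because the aquifer top is taken to be at land surface, an altitude grid is simply the land-surface elevation minus the depth grid. A minimal sketch of that per-cell conversion (the function name and the use of plain nested lists are our own illustration, not part of the product):

```python
def depth_to_altitude(dem, depth):
    """Convert depth-below-land-surface values to altitudes.

    dem: land-surface elevations in meters (e.g. from the NED 100-meter DEM).
    depth: depths below land surface in meters.
    Both are equally shaped row-major 2-D grids. Returns altitudes referenced
    to the DEM's datum (NAVD88 for these products):
    altitude = land-surface elevation - depth.
    """
    return [
        [surface - d for surface, d in zip(dem_row, depth_row)]
        for dem_row, depth_row in zip(dem, depth)
    ]
```

    For the real GeoTIFFs this arithmetic would be applied per pixel with a raster library, after masking any nodata cells.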

  14. NLCD 2016 Tree Canopy Cover Puerto Rico Virgin Islands (Image Service)

    • agdatacommons.nal.usda.gov
    • cloud.csiss.gmu.edu
    • +1more
    bin
    Updated Oct 1, 2024
    + more versions
    Cite
    U.S. Forest Service (2024). NLCD 2016 Tree Canopy Cover Puerto Rico Virgin Islands (Image Service) [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/NLCD_2016_Tree_Canopy_Cover_Puerto_Rico_Virgin_Islands_Image_Service_/25973254
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    U.S. Department of Agriculture Forest Service (http://fs.fed.us/)
    Authors
    U.S. Forest Service
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Puerto Rico
    Description

    The USDA Forest Service (USFS) builds multiple versions of percent tree canopy cover (TCC) data to serve the needs of multiple user communities. These datasets encompass CONUS, coastal Alaska, Hawaii, the U.S. Virgin Islands, and Puerto Rico. There are three versions of data within the 2016 TCC Product Suite: the initial model outputs, referred to as the Analytical data; a masked version of the initial outputs, referred to as the Cartographic data; and a modified version built for the National Land Cover Database, referred to as the NLCD data, which includes a canopy cover change dataset derived by subtracting the datasets for the nominal years of 2011 and 2016. The Analytical data are the initial model outputs generated in the production workflow. These data are best suited for users who will carry out their own detailed statistical and uncertainty analyses on the dataset and place lower priority on the visual appearance of the dataset for cartographic purposes. Datasets for the nominal years of 2011 and 2016 are available.

    The Cartographic products mask the initial model outputs to improve the visual appearance of the datasets. These data are best suited for users who prioritize visual appearance of the data for cartographic and illustrative purposes. Datasets for the nominal years of 2011 and 2016 are available.

    The NLCD data are the result of further processing of the masked data. The goal was to generate three coordinated components: (1) a dataset for the nominal year of 2011, (2) a dataset for the nominal year of 2016, and (3) a dataset that captures the change in canopy cover between the two nominal years. For the NLCD data, the three components meet the criterion 2011 TCC + change in TCC = 2016 TCC. These NLCD data are best suited for users who require a coordinated three-component data stack where each pixel's values meet that criterion. Datasets for the nominal years of 2011 and 2016 are available, as well as a dataset that captures the change (loss or gain) in canopy cover between those two nominal years, in areas where change was identified.
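    The coordination criterion can be verified per pixel. A minimal sketch (function name ours), assuming the three layers are aligned sequences of percent values:

```python
def is_coordinated(tcc_2011, change, tcc_2016):
    """Check the NLCD criterion 2011 TCC + change in TCC = 2016 TCC
    over aligned sequences of per-pixel percent values."""
    return all(
        v2011 + delta == v2016
        for v2011, delta, v2016 in zip(tcc_2011, change, tcc_2016)
    )
```

    With real rasters the same check would be run over the full arrays, skipping background and data-gap pixels.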

    These tree canopy cover data are accessible for multiple user communities, through multiple channels and platforms, as listed below:

    Analytical: USFS Tree Canopy Cover Datasets (Download); USFS Enterprise Data Warehouse (Image Service)
    Cartographic: USFS Tree Canopy Cover Datasets (Download); USFS Enterprise Data Warehouse (Map Service)
    NLCD: Multi-Resolution Land Characteristics (MRLC) Consortium (Download); USFS Enterprise Data Warehouse (Image Service)

    The Puerto Rico and U.S. Virgin Islands TCC NLCD change dataset comprises a single layer. The pixel values range from -97 to 98 percent, where negative values represent canopy loss and positive values represent canopy gain. The background is represented by the value 127 and data gaps are represented by the value 110, since this is a signed 8-bit image. This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog. Data for this record includes the following resources: ISO-19139 metadata; ArcGIS Hub Dataset; ArcGIS GeoService. For complete information, please visit https://data.gov.
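    The sentinel values described above (127 background, 110 data gap, -97 to 98 percent change) can be decoded per pixel. A small sketch (names ours) of that classification:

```python
BACKGROUND = 127  # outside the mapped area (max positive int8 value)
DATA_GAP = 110    # no data available for this pixel

def decode_tcc_change(value):
    """Classify a pixel from the PR/USVI TCC change layer (signed 8-bit).

    Returns a (kind, percent) pair; percent is None for non-data pixels.
    Negative percentages represent canopy loss, positive ones canopy gain.
    """
    if value == BACKGROUND:
        return ("background", None)
    if value == DATA_GAP:
        return ("no_data", None)
    if -97 <= value <= 98:
        return ("change", value)
    raise ValueError(f"unexpected TCC change value: {value}")
```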

  15. Dataset of Deep Learning from Landsat-8 Satellite Images for Estimating...

    • data.mendeley.com
    Updated Jun 6, 2022
    + more versions
    Cite
    Yudhi Prabowo (2022). Dataset of Deep Learning from Landsat-8 Satellite Images for Estimating Burned Areas in Indonesia [Dataset]. http://doi.org/10.17632/fs7mtkg2wk.5
    Explore at:
    Dataset updated
    Jun 6, 2022
    Authors
    Yudhi Prabowo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Indonesia
    Description

    The dataset consists of three categories: image subsets, burned area masks, and quicklooks. The image subsets are derived from Landsat-8 scenes taken during the years 2019 and 2021. Each image has a size of 512x512 pixels and consists of 8 multispectral bands. The sequence of band names from band 1 to band 7 of the image subset is the same as in the original Landsat-8 scene, except for band 8 of the image subset, which is band 9 (the cirrus band) in the original Landsat-8 scene. The image subsets are saved in GeoTIFF file format with the latitude-longitude coordinate system and WGS 1984 as the datum. The spatial resolution of the image subsets is 0.00025 degree, and the pixel values are stored as 16-bit unsigned integers with values ranging from 0 to 65535. The dataset totals 227 images containing burned areas surrounded by various ecological backgrounds such as forest, shrub, grassland, waterbody, bare land, settlement, cloud, and cloud shadow. In some cases, the burned areas are covered by smoke because the fire was still active. Some image subsets also overlap each other to cover burned scars too large for a single subset. The burned area mask is a binary annotation image which consists of two classes: burned area as the foreground and non-burned area as the background. These binary images are saved as 8-bit unsigned integers, where burned area is indicated by the pixel value 1 and non-burned area by 0. The burned area masks in this dataset contain only burned scars and are not contaminated with thick clouds, shadows, or vegetation. Among the 227 images, 206 contain burned areas whereas 21 contain only background. Images with burned-area coverage between 0 and 10 percent dominate the dataset. The dataset also provides a quicklook image as a quick preview of each image subset.
    It offers a fast, full-size preview of the image subset without opening the file in GIS software. The quicklook images can also be used for training and evaluating models as a substitute for the image subsets. The image size is 512x512 pixels, the same as the image subset and annotation image. Each quicklook consists of three bands as a false-color composite, combining band 7 (SWIR-2), band 5 (NIR), and band 4 (red). These RGB composites have been contrast-stretched to enhance visualization. The quicklook images are stored in GeoTIFF file format as 8-bit unsigned integers.
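    Given the binary masks (1 = burned, 0 = background), the burned-area coverage percentage referred to above can be computed directly. A minimal sketch (function name ours), with the mask as a 2-D list of 0/1 values:

```python
def burned_coverage_percent(mask):
    """Percentage of pixels labeled burned (value 1) in a binary mask."""
    pixels = [value for row in mask for value in row]
    return 100.0 * sum(pixels) / len(pixels)
```

    For the actual 512x512 GeoTIFF masks the same ratio would be taken over the full raster array.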

    This work was financed by Riset Inovatif Produktif (RISPRO) fund through Prioritas Riset Nasional (PRN) project, grant no. 255/E1/PRN/2020 for 2020 - 2021 contract period.

  16. Game of Life Benchmark Dataset

    • data.niaid.nih.gov
    Updated Oct 17, 2023
    Cite
    Girgin, Serkan (2023). Game of Life Benchmark Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011941
    Explore at:
    Dataset updated
    Oct 17, 2023
    Dataset authored and provided by
    Girgin, Serkan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes grids of various sizes and percentages of living cells to benchmark Conway's Game of Life implementations. An example implementation in Python is available at https://github.com/girgink/game-of-life. Grid files can be converted into Run Length Encoded (RLE) format by using the convert_to_rle.py tool available in the code repository.

    Grid file format

    Grids are provided as ASCII text files. The first line contains the width and height of the grid as unsigned integer values separated by a space. The remaining lines give the initial locations of the living cells:

    Each line has two unsigned integer values separated by a space, indicating the vertical and horizontal coordinates of a living cell, respectively. The top-left cell has the coordinates (0, 0). Valid vertical coordinate values range from 0 to height-1, increasing from top to bottom. Valid horizontal coordinate values range from 0 to width-1, increasing from left to right.

    Grids

    1,000 x 1,000 grids:

    10% living cells: 1000x1000_0.1.txt.zip, 0.76MB uncompressed
    20% living cells: 1000x1000_0.2.txt.zip, 1.48MB uncompressed
    50% living cells: 1000x1000_0.5.txt.zip, 3.70MB uncompressed

    10,000 x 10,000 grids:

    10% living cells: 10000x10000_0.1.txt.zip, 93MB uncompressed
    20% living cells: 10000x10000_0.2.txt.zip, 186MB uncompressed
    50% living cells: 10000x10000_0.5.txt.zip, 466MB uncompressed

    100,000 x 100,000 grids:

    0.1% living cells: 100000x100000_0.001.txt.zip, 112MB uncompressed
    1.0% living cells: 100000x100000_0.01.txt.zip, 1.09GB uncompressed
    2.5% living cells: 100000x100000_0.025.txt.zip, 2.74GB uncompressed

    1,000,000 x 1,000,000 grids:

    0.001% living cells: 1000000x1000000_0.00001.txt.zip, 131MB uncompressed
    0.010% living cells: 1000000x1000000_0.0001.txt.zip, 1.28GB uncompressed
    0.025% living cells: 1000000x1000000_0.00025.txt.zip, 3.20GB uncompressed
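    A grid file in the format described above can be read with a short routine like this (a minimal sketch; the function name is ours, not part of the benchmark tooling):

```python
def read_grid(path):
    """Parse a benchmark grid file: 'width height' on the first line,
    then one 'row col' pair per living cell (top-left cell is (0, 0))."""
    with open(path) as f:
        width, height = map(int, f.readline().split())
        cells = set()
        for line in f:
            if line.strip():
                row, col = map(int, line.split())
                cells.add((row, col))
    return width, height, cells
```

    For the largest grids (gigabytes uncompressed), streaming the lines rather than materializing a set may be preferable; the set representation shown here suits the sparse low-density grids.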

  17. Data Sheet 1_Visual analysis of multi-omics data.csv

    • frontiersin.figshare.com
    csv
    Updated Sep 10, 2024
    + more versions
    Cite
    Austin Swart; Ron Caspi; Suzanne Paley; Peter D. Karp (2024). Data Sheet 1_Visual analysis of multi-omics data.csv [Dataset]. http://doi.org/10.3389/fbinf.2024.1395981.s001
    Explore at:
    Available download formats: csv
    Dataset updated
    Sep 10, 2024
    Dataset provided by
    Frontiers
    Authors
    Austin Swart; Ron Caspi; Suzanne Paley; Peter D. Karp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool’s interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different “visual channel” of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.
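    The adjustable mapping from data value ranges to display colors can be pictured as a simple threshold lookup. A hypothetical sketch (the names and color stops are our illustration, not the tool's API):

```python
def value_to_color(value, stops, fallback="gray"):
    """Map a data value to a display color via ascending
    (upper_bound, color) stops; values above every bound get fallback.

    For example, for log2 fold changes: <= -1 blue, <= 1 white, else red.
    """
    for upper_bound, color in stops:
        if value <= upper_bound:
            return color
    return fallback
```

    Interactively adjusting the mapping then amounts to editing the stops list and repainting the affected reaction arrows or metabolite nodes.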

  18. Swedish High Value Data Collection: Companies, Geospatial, Meteorological,...

    • store.smartdatahub.io
    Updated Aug 26, 2024
    Cite
    (2024). Swedish High Value Data Collection: Companies, Geospatial, Meteorological, Statistics, and Earth Observation & Environment - Datasets - This service has been deprecated - please visit https://www.smartdatahub.io/ to access data. See the About page for details. // [Dataset]. https://store.smartdatahub.io/dataset/se_lantmateriet_bilaga_1_sweden_proposal_on_high_value_data_20200430_xlsx
    Explore at:
    Dataset updated
    Aug 26, 2024
    Area covered
    Sweden, Earth
    Description

    This dataset collection comprises an assortment of tables, each carrying a distinct set of data, sourced from the website of Lantmäteriet (the Swedish Mapping, Cadastral and Land Registration Authority). It provides a wide range of data, including information about companies, geospatial data, meteorological data, statistical data, and earth observation & environmental data. The tables present the data in an organized manner, arranged systematically in columns and rows, which makes it convenient to analyze and draw insights from the dataset. Overall, it is a comprehensive collection offering a diverse and substantial range of information.

  19. Little's Range and FIA Importance Value Distribution Maps (A Spatial...

    • search.dataone.org
    • dataone.org
    Updated Nov 17, 2014
    Cite
    Prasad, Anantha M.; Iverson, Louis R. (2014). Little's Range and FIA Importance Value Distribution Maps (A Spatial Database for 135 Eastern U.S. Tree Species) [Dataset]. https://search.dataone.org/view/Little%27s_Range_and_FIA_Importance_Value_Distribution_Maps_%28A_Spatial_Database_for_135_Eastern_U.S._Tree_Species%29.xml
    Explore at:
    Dataset updated
    Nov 17, 2014
    Dataset provided by
    Regional and Global Biogeochemical Dynamics Data (RGD)
    Authors
    Prasad, Anantha M.; Iverson, Louis R.
    Time period covered
    Jan 1, 1971
    Area covered
    Description

    This database contains distribution maps of 135 eastern U.S. tree species based on Importance Values (IV) derived from Forest Inventory Analysis (FIA) data and a geographical information system (GIS) database of Elbert L. Little, Jr.'s published ranges. Between 1971 and 1977, Elbert L. Little, Jr., Chief Dendrologist with the U.S. Forest Service, published a series of maps of tree species ranges based on botanical lists, forest surveys, field notes, and herbarium specimens. These published maps have become the standard reference for most U.S. and Canadian tree species ranges.

    The USDA Forest Service's FIA units are charged with periodically assessing the extent, timber potential, and health of the trees in the United States. The investigators have created a spatial database of individual species IV based on the number of stems and basal area of understory and overstory trees using FIA data from more than 100,000 plots in the eastern United States. The IV ranges from 0 to 100 and gives a measure of the abundance of the species. (See the investigator's Climate Change Atlas for 80 Forest Tree Species of the Eastern United States at [http://www.fs.fed.us/ne/delaware/atlas/web_atlas.html] for details). The investigators have aggregated the plot-level IV to 20km cells.

    Both sets of maps (Little's ranges and IV based on FIA data) are available for download. The Little's range maps (little.shp) are vector based and are provided as shapefiles. These maps can span the United States, or the United States and Canada, in extent depending on the species. The Importance Value (IV) maps are raster maps in ASCII grid format: an ASCII file with header information that can be used to import the data into ArcInfo GRID or ArcView's Spatial Analyst GIS software. The spatial resolution is 20 km. These raster maps span the eastern U.S. (east of the 100th meridian) in extent.
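    An ASCII grid raster like the IV maps can be read with the standard library alone. A hedged sketch (function name ours), assuming the common six-line ESRI-style header (ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value):

```python
def read_asciigrid(path):
    """Read an ASCII grid raster: a six-line header of 'key value' pairs
    followed by whitespace-separated rows of cell values."""
    header = {}
    with open(path) as f:
        for _ in range(6):
            key, value = f.readline().split()
            header[key.lower()] = float(value)
        rows = [[float(v) for v in line.split()] for line in f if line.strip()]
    return header, rows
```

    Cells equal to header["nodata_value"] should be masked before computing statistics, and since some ASCII grid files omit the NODATA line, a robust reader would sniff the header rather than assume six lines.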

  20. mobile price range prediction

    • kaggle.com
    Updated Aug 19, 2020
    Cite
    Sowmya Jonnalagadda (2020). mobile price range prediction [Dataset]. https://www.kaggle.com/datasets/sowmya1120/mobile-price-range-prediction
    Explore at:
    Available download formats: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 19, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sowmya Jonnalagadda
    Description

    Dataset

    This dataset was created by Sowmya Jonnalagadda

    Contents
