Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Useful information and links for navigating this site, understanding and utilizing Open Data
In 2013, an inventory of priority invasive plant species in priority areas was conducted at San Pablo Bay National Wildlife Refuge. Results from this effort will inform the development of invasive plant management objectives and strategies, and will serve as a baseline for assessing change in the distribution or abundance of invasive plants over time.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository holds the Clarity Dataset, a companion to the SANER'22 paper entitled "An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation". The dataset consists of 45,998 captions, 10,204 GUI screenshots, and XML metadata files (akin to the "html" for stipulating GUIs) of Android applications. The natural language captions were obtained from human labelers, underwent several quality control mechanisms, and contain both high-level (screen) and low-level (component) descriptions of screen functionality. This dataset is meant as a new source of data to augment techniques for software documentation that can take advantage of the rich pixel-based information contained within screenshots.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
This bundle contains documentation about data products that are collected using radio science and supporting equipment. With one exception, each member collection contains one or more versions of a single Software Interface Specification (SIS) or an equivalent document. A SIS describes the format and content of a data file at a granularity sufficient for use -- typically byte-level, but sometimes bit-level. Examples of products and descriptions of their use may also be included in a collection, as appropriate. The exception is the DOCUMENT collection, which contains supporting material -- usually journal publications, technical reports, or other documents that describe investigations, analysis methods, and/or data, but not at the level of a SIS. Members of the DOCUMENT collection were usually released once, whereas a SIS often evolves over many years.
In 2019, an inventory of priority invasive plant species in priority areas was conducted at Ruby Lake National Wildlife Refuge. Results from this effort will inform the development of invasive plant management objectives and strategies, and will serve as a baseline for assessing change in the distribution or abundance of invasive plants over time. This report holds the data documenting this effort.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data Documentation and Metadata session from the 2015 Virginia Data Management Bootcamp. Introduces unstructured (data dictionaries, README files, codebooks) and structured (XML schemas) ways to document research data.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains a collection of over 2,000 company documents, categorized into four main types: invoices, inventory reports, purchase orders, and shipping orders. Each document is provided in PDF format, accompanied by a CSV file that includes the text extracted from these documents, their respective labels, and the word count of each document. This dataset is ideal for various natural language processing (NLP) tasks, including text classification, information extraction, and document clustering.
PDF Documents: The dataset includes 2,677 PDF files, each representing a unique company document. These documents are derived from the Northwind dataset, which is commonly used for demonstrating database functionalities.
The document types are: invoices, inventory reports, purchase orders, and shipping orders.
Here are a few example entries from the CSV file:
This dataset can be used for: text classification, information extraction, and document clustering.
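As a rough illustration of working with the CSV described in this entry, the sketch below counts documents per label and checks the stored word count against the extracted text. The column names `text`, `label`, and `word_count`, the label values, and the sample rows are all assumptions made for this example; the real file may use different names and contents.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the dataset's CSV file.
# Column names, label values, and rows are assumptions, not the real data.
SAMPLE_CSV = """text,label,word_count
"Invoice for order 10248",invoice,4
"Purchase order 10249 for customer",purchase_order,5
"Stock report for category Beverages",inventory_report,5
"Shipping order 10250",shipping_order,3
"Invoice for order 10251",invoice,4
"""


def summarize(csv_file):
    """Count documents per label and list rows whose word count disagrees with the text."""
    label_counts = Counter()
    mismatches = []
    for row_number, row in enumerate(csv.DictReader(csv_file), start=1):
        label_counts[row["label"]] += 1
        # Recompute the word count by splitting the text on whitespace.
        if len(row["text"].split()) != int(row["word_count"]):
            mismatches.append(row_number)
    return label_counts, mismatches


label_counts, mismatches = summarize(io.StringIO(SAMPLE_CSV))
print(dict(label_counts))
print("rows with inconsistent word counts:", mismatches)
```

To run this against the actual file, pass an open file handle instead of the in-memory sample and adjust the column names to match the real header.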
Certificates and Documents Documentation
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This documentation offers an overview of data employed in energy system optimization models built with the ESTRAM framework by a research group of the Leibniz University Hannover and the Institute for Solar Energy Research Hamelin (ISFH). It is important to note that specific models may utilize distinct data as indicated in their respective studies.
Documentation describing the Race and Identity-Based Data Collection Strategy data tables released as open data, including table descriptions, metadata, and glossary of terms.
A collection of data serving as documentation of Stillwater National Wildlife Refuge Complex priority resources of concern.
Please find attached the data documentation. The Atlas is based on 2010 census tract polygons. To use the underlying Atlas data in a GIS, the data from this spreadsheet needs to be joined to a census tract boundary file. With ESRI software, users should have access to the tract layer on ESRI's "Data and Maps" data distribution. For users of other software, tract boundaries can be downloaded directly from the Census Bureau's Cartographic Boundary Files. The underlying map services used in the Food Access Research Atlas are also available for both developers and GIS users. See the Geospatial API documentation for more information.
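The join between the spreadsheet and a tract boundary file can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the field names `tract_id` and `low_access`, the geometry placeholders, and the sample values are hypothetical, and a real workflow would use the actual column names in a GIS or a geospatial library. The one firm detail is that census tract identifiers are 11 digits (2 for state, 3 for county, 6 for tract), so leading zeros dropped by a spreadsheet have to be restored before matching.

```python
def normalize_tract_id(value):
    """Return an 11-character tract ID, restoring leading zeros a spreadsheet may have dropped."""
    return str(value).strip().zfill(11)


def join_atlas_to_tracts(atlas_rows, tract_features):
    """Attach Atlas attributes to tract features that share the same tract ID."""
    atlas_by_id = {normalize_tract_id(row["tract_id"]): row for row in atlas_rows}
    joined, unmatched = [], []
    for feature in tract_features:
        tract_id = normalize_tract_id(feature["tract_id"])
        atlas = atlas_by_id.get(tract_id)
        if atlas is None:
            # Keep track of boundaries with no Atlas record instead of dropping them silently.
            unmatched.append(tract_id)
            continue
        attributes = {key: value for key, value in atlas.items() if key != "tract_id"}
        joined.append({**feature, "tract_id": tract_id, **attributes})
    return joined, unmatched


# Hypothetical inputs: the first Atlas ID was read as a number and lost its leading zero.
atlas_rows = [
    {"tract_id": 1001020100, "low_access": 1},
    {"tract_id": "06075010100", "low_access": 0},
]
tract_features = [
    {"tract_id": "01001020100", "geometry": "polygon-A"},
    {"tract_id": "06075010100", "geometry": "polygon-B"},
    {"tract_id": "36061000100", "geometry": "polygon-C"},
]

joined, unmatched = join_atlas_to_tracts(atlas_rows, tract_features)
print(joined)
print("tracts without Atlas data:", unmatched)
```

Reporting unmatched tracts separately makes it easier to spot identifier formatting problems before mapping the result.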
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
For the complete dataset or more information, please email commercialproduct@appen.com
The dataset product can be used in many AI pilot projects and can supplement production models alongside other data. It can improve model performance cost-effectively, and it is an excellent solution when time and budget are limited. The Appen database team can provide a large number of database products, such as ASR, TTS, video, text, and image data. At the same time, we are constantly building new datasets to expand our resources. The database team always strives to deliver as soon as possible to meet the needs of global customers.
This OCR database consists of image data in Korean, Vietnamese, Spanish, French, Thai, Japanese, Indonesian, Tamil, and Burmese, as well as handwritten images in both Chinese and English (including annotations). On average, each image contains 30 to 40 frames, including text in various languages, special characters, and numbers. The required accuracy rate is over 99% (both position and content are correct). The images include the following categories:
- RECEIPT
- IDCARD
- TRADE
- TABLE
- WHITEBOARD
- NEWSPAPER
- THESIS
- CARD
- NOTE
- CONTRACT
- BOOKCONTENT
- HANDWRITING
Quantity by category (one row per database):
RECEIPT  IDCARD  TRADE  TABLE  WHITEBOARD  NEWSPAPER  THESIS  CARD  NOTE  CONTRACT  BOOKCONTENT  TOTAL
1500     500     1012   512    500         500        500     500   499   501       500          7024
337      100     227    100    111         100        100     100   100   105       700          2080
1500     500     1000   500    500         500        500     500   500   500       500          7000
300      100     200    100    100         100        103     100   100   100       700          2003
1500     500     1000   537    500         500        500     500   500   500       500          7037
1586     500     1000   552    500         500        509     500   500   500       500          7147
1500     500     1003   500    501         502        500     500   500   500       500          7006
356      98      475    532    501         500        500     500   501   500       500          4963
300      100     200    117    110         108        102     100   120   100       761          2118
Database Name                  Category     Quantity
English Handwritten Datasets   HANDWRITING  2278
Chinese Handwritten Datasets   HANDWRITING  11118
In 2017, invasive plant species and area priorities for baseline inventory and early detection were identified for Marin Islands National Wildlife Refuge. Results from this effort will inform a future inventory, and guide development of invasive plant management objectives and strategies. This record holds the data documenting this effort.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Educational materials used in the course 'Research and data documentation', University of Twente, Enschede, the Netherlands
This data set contains version 3.0 of the updated collection of documentation for the raw and calibrated science data sets for the Deep Impact and EPOXI missions. This data set supersedes version 2.0.
This is an auto-generated index table corresponding to a folder of files in this dataset with the same name. This table can be used to extract a subset of files based on their metadata, which can then be used for further analysis. You can view the contents of specific files by navigating to the "cells" tab and clicking on an individual file_id.
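Selecting a subset of files from such an index table might look like the sketch below. Only `file_id` is named in the description above; the `extension` and `size_bytes` columns, the sample rows, and the `select_file_ids` helper are hypothetical and stand in for whatever metadata the real table exposes.

```python
# Hypothetical index rows; apart from file_id, the column names are assumptions.
index_table = [
    {"file_id": "f1", "extension": "csv", "size_bytes": 1200},
    {"file_id": "f2", "extension": "pdf", "size_bytes": 54000},
    {"file_id": "f3", "extension": "csv", "size_bytes": 980000},
]


def select_file_ids(index_rows, extension=None, max_size_bytes=None):
    """Return the file_id of every row that passes the optional metadata filters."""
    selected = []
    for row in index_rows:
        if extension is not None and row["extension"] != extension:
            continue
        if max_size_bytes is not None and row["size_bytes"] > max_size_bytes:
            continue
        selected.append(row["file_id"])
    return selected


csv_ids = select_file_ids(index_table, extension="csv")
small_csv_ids = select_file_ids(index_table, extension="csv", max_size_bytes=10_000)
print(csv_ids)
print(small_csv_ids)
```

The returned identifiers can then be used to fetch only the matching files for further analysis.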
The data and interpretations presented are based on firsthand experience, being compiled by the Department of Conservation and Land Management’s regional nature conservation staff between July 2001 and January 2002. Note: to access the data, select the data source link located on the right-hand side.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Master Address Repository (MAR) 2.0 is the successor to the Master Address Repository. The MAR is a complex, widely used database that a growing number of DC Government applications access. High-quality, readily accessible documentation is important for such a widely used database. This document contains the column (field) definitions for the most important views, tables, and feature classes within the MAR 2.0.