51 datasets found
  1. Data from: Domestic and International Common Language Database (DICL)

    • catalog.data.gov
    Updated Apr 6, 2021
    Cite
    Office of Economics (2021). Domestic and International Common Language Database (DICL) [Dataset]. https://catalog.data.gov/dataset/domestic-and-international-common-language-database-dicl
    Dataset updated
    Apr 6, 2021
    Dataset provided by
    Office of Economics
    Description

    The database contains index measures of linguistic similarity both domestically and internationally. The domestic measures capture linguistic similarities present among populations within a single country while the international indexes capture language similarities between two different countries. The indexes reflect three aspects of language: common official languages, common native languages, and linguistic proximity across languages.

  2. University of Cape Town Student Admissions Data 2006-2014 - South Africa

    • datafirst.uct.ac.za
    Updated Jul 28, 2020
    Cite
    UCT Student Administration (2020). University of Cape Town Student Admissions Data 2006-2014 - South Africa [Dataset]. https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/556
    Dataset updated
    Jul 28, 2020
    Dataset authored and provided by
    UCT Student Administration
    Time period covered
    2006 - 2014
    Area covered
    South Africa
    Description

    Abstract

    This dataset was generated from a set of Excel spreadsheets from an Information and Communication Technology Services (ICTS) administrative database on student applications to the University of Cape Town (UCT). The database contains information on applications to UCT between January 2006 and December 2014. In the original form received by DataFirst, the data were ill-suited to research purposes. This dataset represents an attempt at cleaning and organizing these data into a more tractable format. To ensure data confidentiality, direct identifiers have been removed from the data, and the data are only made available to accredited researchers through DataFirst's Secure Data Service.

    The dataset was separated into the following data files:

    1. Application level information: the "finest" unit of analysis. Individuals may have multiple applications. Uniquely identified by an application ID variable. There are a total of 1,714,669 applications on record.
    2. Individual level information: individuals may have multiple applications. Each individual is uniquely identified by an individual ID variable. Each individual is associated with information on "key subjects" from a separate data file also contained in the database. These key subjects are all separate variables in the individual level data file. There are a total of 285,005 individuals on record.
    3. Secondary Education Information: individuals can also be associated with row entries for each subject. This data file does not have a unique identifier. Instead, each row entry represents a specific secondary school subject for a specific individual. These subjects are quite specific, and the data allow the user to distinguish between, for example, higher grade accounting and standard grade accounting. They also allow the user to identify the educational authority issuing the qualification, e.g. Cambridge International Examinations (CIE) versus National Senior Certificate (NSC).
    4. Tertiary Education Information: the smallest of the four data files. There are multiple entries for each individual in this dataset. Each row entry contains information on the year, institution and transcript information and can be associated with individuals.
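
    The relationship between the application-level and individual-level files described above can be sketched as follows. This is a minimal illustration only: the listing does not give the real variable names, so `application_id`, `individual_id`, `key_subject_maths`, and all rows are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical toy rows; real variable names and values are not given in the listing.
applications = [
    {"application_id": "A1", "individual_id": "P1", "year": 2006},
    {"application_id": "A2", "individual_id": "P1", "year": 2007},
    {"application_id": "A3", "individual_id": "P2", "year": 2014},
]

# Individual-level file: one record per individual, keyed by the individual ID.
individuals = {
    "P1": {"key_subject_maths": 78},
    "P2": {"key_subject_maths": 64},
}


def applications_per_individual(apps):
    """Count applications per individual (one individual may have several)."""
    counts = defaultdict(int)
    for row in apps:
        counts[row["individual_id"]] += 1
    return dict(counts)


def attach_individual_info(apps, people):
    """Left-join individual-level fields onto each application row."""
    return [{**row, **people.get(row["individual_id"], {})} for row in apps]


counts = applications_per_individual(applications)
merged = attach_individual_info(applications, individuals)
```

    The application file stays the finest unit of analysis; the join only copies individual-level fields onto each application row.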

    Analysis unit

    Applications, individuals

    Kind of data

    Administrative records [adm]

    Mode of data collection

    Other [oth]

    Cleaning operations

    The data files were made available to DataFirst as a group of Excel spreadsheet documents from an SQL database managed by the University of Cape Town's Information and Communication Technology Services. The process of combining these original data files to create a research-ready dataset is summarised in a document entitled "Notes on preparing the UCT Student Application Data 2006-2014" accompanying the data.

  3. Common basic data set (CBDS): requests for change (RFC) 2019

    • gov.uk
    • sasastunts.com
    Updated Aug 16, 2019
    Cite
    Department for Education (2019). Common basic data set (CBDS): requests for change (RFC) 2019 [Dataset]. https://www.gov.uk/government/publications/common-basic-data-set-cbds-requests-for-change-rfc-2019
    Dataset updated
    Aug 16, 2019
    Dataset provided by
    GOV.UK: http://gov.uk/
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  4. Common basic data set (CBDS): requests for change (RFC) 2018

    • gov.uk
    • sasastunts.com
    Updated Jan 9, 2019
    Cite
    Department for Education (2019). Common basic data set (CBDS): requests for change (RFC) 2018 [Dataset]. https://www.gov.uk/government/publications/common-basic-data-set-cbds-requests-for-change-rfc-2018
    Dataset updated
    Jan 9, 2019
    Dataset provided by
    GOV.UK: http://gov.uk/
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  5. Common basic data set (CBDS): requests for change (RFC) 2020

    • sasastunts.com
    Updated Nov 24, 2020
    Cite
    Department for Education (2020). Common basic data set (CBDS): requests for change (RFC) 2020 [Dataset]. https://sasastunts.com/government/publications/common-basic-data-set-cbds-requests-for-change-rfc-2020
    Dataset updated
    Nov 24, 2020
    Dataset provided by
    188体育
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  6. US Public Schools

    • public.opendatasoft.com
    • data.smartidf.services
    • +1 more
    csv, excel, geojson +1
    Updated Jan 6, 2023
    Cite
    (2023). US Public Schools [Dataset]. https://public.opendatasoft.com/explore/dataset/us-public-schools/
    Available download formats: csv, json, excel, geojson
    Dataset updated
    Jan 6, 2023
    License

    https://en.wikipedia.org/wiki/Public_domain

    Area covered
    United States
    Description

    This Public Schools feature dataset is composed of all public elementary and secondary education facilities in the United States as defined by the Common Core of Data (CCD, https://nces.ed.gov/ccd/), National Center for Education Statistics (NCES, https://nces.ed.gov), US Department of Education for the 2017-2018 school year. This includes all Kindergarten through 12th grade schools as tracked by the Common Core of Data. Included in this dataset are military schools in US territories, referenced in the city field with an APO or FPO address. DOD schools represented in the NCES data that are outside of the United States or US territories have been omitted. This feature class contains all MEDS/MEDS+ as approved by NGA. Complete field and attribute information is available in the "Entities and Attributes" metadata section. Geographical coverage is depicted in the thumbnail above and detailed in the Place Keyword section of the metadata. This release includes the addition of 3,065 new records, modifications to the spatial location and/or attribution of 99,287 records, and removal of 2,996 records not present in the NCES CCD data.

  7. Common basic data set (CBDS): requests for change (RFC) 2022

    • sasastunts.com
    • gov.uk
    • +1 more
    Updated Apr 13, 2023
    Cite
    Department for Education (2023). Common basic data set (CBDS): requests for change (RFC) 2022 [Dataset]. https://sasastunts.com/government/publications/common-basic-data-set-cbds-requests-for-change-rfc-2022
    Dataset updated
    Apr 13, 2023
    Dataset provided by
    188体育
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  8. Common basic data set (CBDS): requests for change (RFC) 2021

    • sasastunts.com
    • gov.uk
    • +1 more
    Updated Nov 24, 2021
    Cite
    Department for Education (2021). Common basic data set (CBDS): requests for change (RFC) 2021 [Dataset]. https://sasastunts.com/government/publications/common-basic-data-set-cbds-requests-for-change-rfc-2021
    Dataset updated
    Nov 24, 2021
    Dataset provided by
    188体育
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  9. Forest Inventory and Analysis Database

    • anrgeodata.vermont.gov
    • datadiscoverystudio.org
    • +9 more
    Updated Apr 14, 2017
    Cite
    U.S. Forest Service (2017). Forest Inventory and Analysis Database [Dataset]. https://anrgeodata.vermont.gov/documents/bc09d4e07dbb4d539a8e46dd3639b5fe
    Dataset updated
    Apr 14, 2017
    Dataset authored and provided by
    U.S. Forest Service
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Description

    The Forest Inventory and Analysis (FIA) research program has been in existence since mandated by Congress in 1928. FIA's primary objective is to determine the extent, condition, volume, growth, and depletion of timber on the Nation's forest land. Before 1999, all inventories were conducted on a periodic basis. The passage of the 1998 Farm Bill requires FIA to collect data annually on plots within each State. This kind of up-to-date information is essential to frame realistic forest policies and programs. Summary reports for individual States are published but the Forest Service also provides data collected in each inventory to those interested in further analysis. Data is distributed via the FIA DataMart in a standard format. This standard format, referred to as the Forest Inventory and Analysis Database (FIADB) structure, was developed to provide users with as much data as possible in a consistent manner among States. A number of inventories conducted prior to the implementation of the annual inventory are available in the FIADB. However, various data attributes may be empty or the items may have been collected or computed differently. Annual inventories use a common plot design and common data collection procedures nationwide, resulting in greater consistency among FIA work units than earlier inventories. Links to field collection manuals and the FIADB user's manual are provided in the FIA DataMart.

  10. Voice Conversion Challenge 2020 database v1.0

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    zip
    Updated Dec 23, 2020
    Cite
    Zhao Yi; Wen-Chin Huang; Xiaohai Tian; Junichi Yamagishi; Rohan Kumar Das; Tomi Kinnunen; Zhenhua Ling; Tomoki Toda (2020). Voice Conversion Challenge 2020 database v1.0 [Dataset]. http://doi.org/10.5281/zenodo.4345689
    Available download formats: zip
    Dataset updated
    Dec 23, 2020
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Zhao Yi; Wen-Chin Huang; Xiaohai Tian; Junichi Yamagishi; Rohan Kumar Das; Tomi Kinnunen; Zhenhua Ling; Tomoki Toda
    Description
    Voice conversion (VC) is a technique to transform a speaker identity included in a source speech waveform into a different one while preserving linguistic information of the source speech waveform.
    
    In 2016, we launched the Voice Conversion Challenge (VCC) 2016 [1][2] at Interspeech 2016. The objective of the 2016 challenge was to better understand different VC techniques built on a freely available common dataset with a common goal, and to share views about unsolved problems and challenges faced by current VC techniques. The VCC 2016 focused on the most basic VC task, that is, the construction of VC models that automatically transform the voice identity of a source speaker into that of a target speaker using a parallel clean training database, where source and target speakers read out the same set of utterances in a professional recording studio. Seventeen research groups participated in the 2016 challenge. The challenge was successful and established a new standard evaluation methodology and protocols for benchmarking the performance of VC systems.
    
    In 2018, we launched the second edition of VCC, the VCC 2018 [3]. In the second edition, we revised three aspects of the challenge. First, we halved the amount of speech data used to construct participants' VC systems. This was based on feedback from participants in the previous challenge and is also essential for practical applications. Second, we introduced a more challenging task, referred to as the Spoke task, in addition to a task similar to that of the first edition, which we call the Hub task. In the Spoke task, participants needed to build their VC systems using a non-parallel database in which source and target speakers read out different sets of utterances. We then evaluated both parallel and non-parallel voice conversion systems via the same large-scale crowdsourcing listening test. Third, we also attempted to bridge the gap between the ASV and VC communities. Since new VC systems developed for the VCC 2018 may be strong candidates for enhancing the ASVspoof 2015 database, we also assessed the spoofing performance of the VC systems based on anti-spoofing scores.
    
    In 2020, we launched the third edition of VCC, the VCC 2020 [4][5]. In this third edition, we constructed and distributed a new database for two tasks, intra-lingual semi-parallel and cross-lingual VC. The dataset for intra-lingual VC consists of a smaller parallel corpus and a larger nonparallel corpus, both in the same language. The dataset for cross-lingual VC consists of a corpus of the source speakers speaking in the source language and another corpus of the target speakers speaking in the target language. As a more challenging task than the previous ones, we focused on cross-lingual VC, in which the speaker identity is transformed between two speakers uttering different languages, which requires handling completely nonparallel training over different languages.
    
    This repository contains the training and evaluation data released to participants, target speaker’s speech data in English for reference purposes, and the transcriptions for the evaluation data. For more details about the challenge and the listening test results, please refer to [4] and the README file.
    
    [1] Tomoki Toda, Ling-Hui Chen, Daisuke Saito, Fernando Villavicencio, Mirjam Wester, Zhizheng Wu, Junichi Yamagishi, "The Voice Conversion Challenge 2016", in Proc. of Interspeech 2016, San Francisco.
    
    [2] Mirjam Wester, Zhizheng Wu, Junichi Yamagishi "Analysis of the Voice Conversion Challenge 2016 Evaluation Results" in Proc. of Interspeech 2016.
    
    [3] Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, Zhenhua Ling, "The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods", Proc Speaker Odyssey 2018, June 2018.
    
    [4] Yi Zhao, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling, and Tomoki Toda. "Voice conversion challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion" Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 80-98, DOI: 10.21437/VCC_BC.2020-14.

  11. Common Database on Designated Areas (CDDA)

    • testsdi.gov.mt
    • msdi.data.gov.mt
    Updated May 31, 2020
    Cite
    (2020). Common Database on Designated Areas (CDDA) [Dataset]. https://testsdi.gov.mt/geonetwork/srv/search?keyword=INSPIRE%20Priority%20List%20No.%2050
    Dataset updated
    May 31, 2020
    Description

    The Common Database on Designated Areas (CDDA) is more commonly known as Nationally designated areas, and is one of the agreed Eionet priority data flows maintained by EEA with support from the European Topic Centre on Biological Diversity. It is a result of an annual data flow through Eionet countries. In fact, Malta, being a member of the EEA, submits this report on an annual basis to fulfill this requirement. The EEA publishes the data set and makes it available to the World Database of Protected Areas (WDPA). The CDDA data can also be queried online in the European Nature Information System (EUNIS).

  12. Common basic data set (CBDS): requests for change (RFC) 2023

    • gov.uk
    • sasastunts.com
    • +1 more
    Updated Dec 28, 2023
    Cite
    Department for Education (2023). Common basic data set (CBDS): requests for change (RFC) 2023 [Dataset]. https://www.gov.uk/government/publications/common-basic-data-set-cbds-requests-for-change-rfc-2023
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    GOV.UK: http://gov.uk/
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  13. NOAA/WDS Paleoclimatology - CoralHydro2k Database (Common Era coral d18O and...

    • catalog.data.gov
    • datasets.ai
    Updated Mar 1, 2024
    Cite
    NOAA National Centers for Environmental Information (Point of Contact); NOAA World Data Service for Paleoclimatology (Point of Contact) (2024). NOAA/WDS Paleoclimatology - CoralHydro2k Database (Common Era coral d18O and Sr/Ca data compilation) [Dataset]. https://catalog.data.gov/dataset/noaa-wds-paleoclimatology-coralhydro2k-database-common-era-coral-d18o-and-sr-ca-data-compilatio1
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration: http://www.noaa.gov/
    National Centers for Environmental Information: https://www.ncei.noaa.gov/
    Description

    This archived Paleoclimatology Study is available from the NOAA National Centers for Environmental Information (NCEI), under the World Data Service (WDS) for Paleoclimatology. The associated NCEI study type is Coral. The data include parameters of corals and sclerosponges (trace metals) with a geographic location of Global Ocean. The time period coverage is from 1783 to -66 in calendar years before present (BP). See metadata information for parameter and study location details. Please cite this study when using the data.
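
    The stated coverage of 1783 to -66 years BP can be read as calendar years with a one-line conversion. The sketch below assumes the common convention that "present" in BP dating means 1950 CE; the listing itself does not state which reference year it uses, so treat the constant as an assumption.

```python
# Assumption: "before present" (BP) is counted back from 1950 CE,
# the usual convention; the listing does not confirm this.
BP_REFERENCE_YEAR = 1950


def bp_to_ce(years_bp: int) -> int:
    """Convert years BP to a calendar year CE (negative BP means after 1950)."""
    return BP_REFERENCE_YEAR - years_bp


oldest_year = bp_to_ce(1783)   # start of the stated coverage
newest_year = bp_to_ce(-66)    # end of the stated coverage
print(oldest_year, newest_year)
```

    Under that assumption, the coverage runs from 167 CE to 2016 CE, which is consistent with the "Common Era" wording in the title.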

  14. Common basic data set (CBDS): requests for change (RFC) 2020

    • s3.amazonaws.com
    • gov.uk
    Updated Feb 19, 2020
    Cite
    Department for Education (2020). Common basic data set (CBDS): requests for change (RFC) 2020 [Dataset]. https://s3.amazonaws.com/thegovernmentsays-files/content/161/1611069.html
    Dataset updated
    Feb 19, 2020
    Dataset provided by
    GOV.UK: http://gov.uk/
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  15. Common basic data set (CBDS): requests for change (RFC) 2015

    • sasastunts.com
    • gov.uk
    Updated Dec 29, 2015
    Cite
    Department for Education (2015). Common basic data set (CBDS): requests for change (RFC) 2015 [Dataset]. https://sasastunts.com/government/publications/common-basic-data-set-cbds-requests-for-change-rfc-2015
    Dataset updated
    Dec 29, 2015
    Dataset provided by
    188体育
    Authors
    Department for Education
    Description

    These files contain information for suppliers developing software and management information systems (MIS) for local authorities and schools.

    The CBDS database is also available.

  16. CVEfixes Dataset: Automatically Collected Vulnerabilities and Their Fixes...

    • data.niaid.nih.gov
    Updated Jul 28, 2024
    Cite
    CVEfixes Dataset: Automatically Collected Vulnerabilities and Their Fixes from Open-Source Software [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4476563
    Dataset updated
    Jul 28, 2024
    Dataset provided by
    Moonen, Leon
    Vidziunas, Linas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CVEfixes is a comprehensive vulnerability dataset that is automatically collected and curated from Common Vulnerabilities and Exposures (CVE) records in the public U.S. National Vulnerability Database (NVD). The goal is to support data-driven security research based on source code and source code metrics related to fixes for CVEs in the NVD by providing detailed information at different interlinked levels of abstraction, such as the commit-, file-, and method level, as well as the repository- and CVE level.

    This release, v1.0.8, covers all published CVEs up to 23 July 2024. All open-source projects that were reported in CVE records in the NVD in this time frame and had publicly available git repositories were fetched and considered for the construction of this vulnerability dataset. The dataset is organized as a relational database and covers 12107 vulnerability fixing commits in 4249 open source projects for a total of 11873 CVEs in 272 different Common Weakness Enumeration (CWE) types. The dataset includes the source code before and after changing 51342 files and 138974 functions. The collection took 48 hours with 4 workers (AMD EPYC Genoa-X 9684X).

    This repository includes the SQL dump of the dataset, as well as the JSON for the CVEs and XML of the CWEs at the time of collection. The complete process has been documented in the paper "CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software", which is published in the Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE '21). You will find a copy of the paper in the Doc folder.
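
    Because the dataset is organized as a relational database with interlinked CVE, commit, and file levels, a typical query walks from a CVE to its fixing commits and on to the changed files. The sketch below illustrates that idea on a toy in-memory SQLite database; the table names, column names, and rows are simplified assumptions for illustration and are not taken from the actual CVEfixes schema or data.

```python
import sqlite3

# Toy in-memory database. Table and column names are simplified assumptions,
# and the rows are made-up placeholders, not records from CVEfixes.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cve (cve_id TEXT PRIMARY KEY, cwe_id TEXT);
    CREATE TABLE fix_commit (cve_id TEXT, commit_hash TEXT);
    CREATE TABLE file_change (commit_hash TEXT, filename TEXT);

    INSERT INTO cve VALUES ('CVE-0000-0001', 'CWE-000');
    INSERT INTO fix_commit VALUES ('CVE-0000-0001', 'abc123');
    INSERT INTO file_change VALUES ('abc123', 'src/b.c');
    INSERT INTO file_change VALUES ('abc123', 'src/a.c');
    """
)


def files_changed_for_cve(connection, cve_id):
    """Return the files touched by all fixing commits of one CVE, sorted by name."""
    rows = connection.execute(
        """
        SELECT fc.filename
        FROM fix_commit AS f
        JOIN file_change AS fc ON fc.commit_hash = f.commit_hash
        WHERE f.cve_id = ?
        ORDER BY fc.filename
        """,
        (cve_id,),
    ).fetchall()
    return [row[0] for row in rows]


changed = files_changed_for_cve(conn, "CVE-0000-0001")
```

    Against the real SQL dump, the same join pattern would apply once the actual table and column names are read from the schema.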

    Citation and Zenodo links

    Please cite this work by referring to the published paper:

    Guru Bhandari, Amara Naseer, and Leon Moonen. 2021. CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software. In Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE '21). ACM, 10 pages. https://doi.org/10.1145/3475960.3475985

    @inproceedings{bhandari2021:cvefixes,
      title = {{CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software}},
      booktitle = {{Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE '21)}},
      author = {Bhandari, Guru and Naseer, Amara and Moonen, Leon},
      year = {2021},
      pages = {10},
      publisher = {{ACM}},
      doi = {10.1145/3475960.3475985},
      copyright = {Open Access},
      isbn = {978-1-4503-8680-7},
      language = {en}
    }

    The dataset has been released on Zenodo with DOI:10.5281/zenodo.4476563. The GitHub repository containing the code to automatically collect the dataset can be found at https://github.com/secureIT-project/CVEfixes, released with DOI:10.5281/zenodo.5111494.

  17. NCEI Standard Product: Global Ocean Currents Database (GOCD)

    • data.cnra.ca.gov
    • accession.nodc.noaa.gov
    • +2 more
    Updated May 9, 2019
    Cite
    Ocean Data Partners (2019). NCEI Standard Product: Global Ocean Currents Database (GOCD) [Dataset]. https://data.cnra.ca.gov/dataset/ncei-standard-product-global-ocean-currents-database-gocd
    Dataset updated
    May 9, 2019
    Dataset authored and provided by
    Ocean Data Partners
    Description

    This collection contains the Global Ocean Currents Database (GOCD). The GOCD is an NCEI Standard Product, and is derived from datasets archived at NCEI that contain in situ ocean current data from a diverse range of instruments, collection protocols, processing methods, and data storage formats. For acceptance into the GOCD, the data meet quality control requirements and have thorough documentation. The GOCD merges the variety of original formats into an NCEI standard network common data form (netCDF) format. From the shipboard acoustic Doppler current profiler sets, the GOCD creates files that hold single vertical ocean currents profiles. The GOCD includes data collected from 1962-09-30 to 2013-12-23.

  18. CIFOR's Poverty and Environment Network (PEN) global dataset

    • data.cifor.org
    pdf, png, tsv
    Updated Jul 3, 2019
    Cite
    Center for International Forestry Research (CIFOR) (2019). CIFOR's Poverty and Environment Network (PEN) global dataset [Dataset]. http://doi.org/10.17528/CIFOR/DATA.00021
    Available download formats: png (415407), pdf (21330), tsv (2443)
    Dataset updated
    Jul 3, 2019
    Dataset provided by
    Center for International Forestry Research: http://www.cifor.org/
    License
    Time period covered
    2013 - 2015
    Area covered
    Uganda, Viet Nam, Nepal, Mozambique, Bangladesh, Malawi, Pakistan, Congo, the Democratic Republic of the, India, Niger
    Dataset funded by
    Department for International Development (DFID)
    Description

    The PEN network was launched in September 2004 by the Center for International Forestry Research (CIFOR) with the aim of collecting uniform socio-economic and environmental data at household and village levels in rural areas of developing countries. The data presented here were collected by 33 PEN partners (mainly PhD students) and comprise 8,301 households in 334 villages located in 24 countries in Asia, Africa and Latin America.

    Three types of quantitative surveys were conducted:

    1. Village surveys (V1, V2)
    2. Annual household surveys (A1, A2)
    3. Quarterly household surveys (Q1, Q2, Q3, Q4)

    The village surveys (V1, V2) collected data that were common to all households or showed little variation among them. The first village survey, V1, was conducted at the beginning of the fieldwork to get background information on the villages, while the second survey, V2, was conducted at the end of the fieldwork period to get information for the 12-month period covered by the surveys. The household surveys were grouped into two categories: quarterly surveys (Q1, Q2, Q3, Q4) to collect income information, and annual household surveys (A1, A2) to collect all other household information.

    A critical feature of the PEN research project was to collect detailed, high-quality data on forest use. This was done through quarterly income household surveys, for two reasons: first, short recall periods increase accuracy and reliability and, second, quarterly data would allow us to document seasonal variation in (forest) income and thus, inter alia, help us understand to what extent forests act as seasonal gap fillers. There are three partners (10101, 10203, and 10301) who, because of various particular circumstances, only conducted three of the four income surveys. In addition, 598 of the households missed out on one of the quarterly surveys, e.g., due to temporary absence or sickness, or insecurity in the area. These are still included in the database, while households missing more than one quarter were excluded.

    Two other household surveys were conducted. The first annual household survey (A1) collected basic household information (demographics, assets, forest-related information) and was done at the beginning of the survey period, while the second (A2) collected information for the 12-month period covered by the surveys (e.g., on risk management) and was done at the end of the survey period. Note, however, that we did not collect any systematic data on the time allocation of households: while highly relevant for many analyses, we believed that it would be too time-consuming a component to add to our standard survey questions. The project is further described and discussed in two edited volumes by Angelsen et al. (2011) (describes in particular the methods used) and Wunder et al. (2014) (includes six articles based on the PEN project).
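
    The inclusion rule stated in this description (households missing one quarterly survey are kept, those missing more than one are excluded) can be expressed as a small filter. This is only a sketch of the rule as described; the function name, the `max_missing` parameter, and the household records are hypothetical and do not reflect the actual PEN file layout.

```python
QUARTERS = ("Q1", "Q2", "Q3", "Q4")


def keep_household(completed_quarters, max_missing=1):
    """Apply the stated rule: keep a household missing at most one quarterly survey."""
    missing = len(set(QUARTERS) - set(completed_quarters))
    return missing <= max_missing


# Hypothetical households mapped to the quarterly surveys they completed.
households = {
    "H1": {"Q1", "Q2", "Q3", "Q4"},  # complete
    "H2": {"Q1", "Q2", "Q4"},        # one quarter missing: kept
    "H3": {"Q1", "Q4"},              # two quarters missing: excluded
}

included = sorted(h for h, done in households.items() if keep_household(done))
```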

  19. MS COCO Dataset

    • paperswithcode.com
    Updated Apr 15, 2024
    Cite
    Tsung-Yi Lin; Michael Maire; Serge Belongie; Lubomir Bourdev; Ross Girshick; James Hays; Pietro Perona; Deva Ramanan; C. Lawrence Zitnick; Piotr Dollár, MS COCO Dataset [Dataset]. https://paperswithcode.com/dataset/coco
    Dataset updated
    Apr 15, 2024
    Authors
    Tsung-Yi Lin; Michael Maire; Serge Belongie; Lubomir Bourdev; Ross Girshick; James Hays; Pietro Perona; Deva Ramanan; C. Lawrence Zitnick; Piotr Dollár
    Description

    The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

    Splits: The first version of the MS COCO dataset was released in 2014. It contains 164K images split into training (83K), validation (41K) and test (41K) sets. In 2015 an additional test set of 81K images was released, including all the previous test images and 40K new images.

    Based on community feedback, in 2017 the training/validation split was changed from 83K/41K to 118K/5K. The new split uses the same images and annotations. The 2017 test set is a subset of 41K images of the 2015 test set. Additionally, the 2017 release contains a new unannotated dataset of 123K images.

    Annotations: The dataset has annotations for

    • object detection: bounding boxes and per-instance segmentation masks with 80 object categories
    • captioning: natural language descriptions of the images (see MS COCO Captions)
    • keypoint detection: more than 200,000 images and 250,000 person instances labeled with keypoints (17 possible keypoints, such as left eye, nose, right hip, right ankle)
    • stuff image segmentation: per-pixel segmentation masks with 91 stuff categories, such as grass, wall, sky (see MS COCO Stuff)
    • panoptic: full scene segmentation, with 80 thing categories (such as person, bicycle, elephant) and a subset of 91 stuff categories (grass, sky, road)
    • dense pose: more than 39,000 images and 56,000 person instances labeled with DensePose annotations; each labeled person is annotated with an instance id and a mapping between the image pixels that belong to that person's body and a template 3D model

    The annotations are publicly available only for training and validation images.
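As a rough illustration of the keypoint annotations mentioned above: in the COCO annotation files, each person instance stores its 17 keypoints as one flat list of 51 numbers, an (x, y, v) triple per keypoint, where v is 0 for not labeled, 1 for labeled but not visible, and 2 for labeled and visible. The sketch below decodes such a list. The function name and the sample values in the usage note are made up for illustration; only the keypoint order and the triple layout follow the published format.

```python
from typing import Dict, List, Tuple

# The 17 COCO person keypoints, in the order used by the annotation files.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def decode_keypoints(flat: List[float]) -> Dict[str, Tuple[float, float, int]]:
    """Turn a flat [x1, y1, v1, x2, y2, v2, ...] list into a name -> (x, y, v) map.

    Keypoints with v == 0 (not labeled) are skipped.
    """
    if len(flat) != 3 * len(COCO_KEYPOINT_NAMES):
        raise ValueError("expected 51 values (17 keypoints x 3)")
    labeled = {}
    for i, name in enumerate(COCO_KEYPOINT_NAMES):
        x, y, v = flat[3 * i : 3 * i + 3]
        if v > 0:
            labeled[name] = (x, y, int(v))
    return labeled
```

For example, a list that is all zeros except for the first triple `[100, 50, 2]` decodes to a single visible `nose` keypoint at (100, 50).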

  20. HCUP State Emergency Department Databases (SEDD) - Restricted Access File

    • catalog.data.gov
    • healthdata.gov
    • +1 more
    Updated Feb 22, 2025
    + more versions
    Cite
    Agency for Healthcare Research and Quality, Department of Health & Human Services (2025). HCUP State Emergency Department Databases (SEDD) - Restricted Access File [Dataset]. https://catalog.data.gov/dataset/hcup-state-emergency-department-databases-sedd-restricted-access-file
    Dataset updated
    Feb 22, 2025
    Description

    The Healthcare Cost and Utilization Project (HCUP) State Emergency Department Databases (SEDD) contain the universe of emergency department visits in participating States. The data are translated into a uniform format to facilitate multi-State comparisons and analyses. The SEDD consist of data from hospital-based emergency department visits that do not result in an admission. The SEDD include all patients, regardless of the expected payer including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality (AHRQ), HCUP data inform decision making at the national, State, and community levels. The SEDD contain clinical and resource use information included in a typical discharge abstract, with safeguards to protect the privacy of individual patients, physicians, and facilities (as required by data sources). Data elements include but are not limited to: diagnoses, procedures, admission and discharge status, patient demographics (e.g., sex, age, race), total charges, length of stay, and expected payment source, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. In addition to the core set of uniform data elements common to all SEDD, some include State-specific data elements. The SEDD exclude data elements that could directly or indirectly identify individuals. For some States, hospital and county identifiers are included that permit linkage to the American Hospital Association Annual Survey File and the Bureau of Health Professions' Area Resource File except in States that do not allow the release of hospital identifiers. Restricted access data files are available with a data use agreement and brief online security training.
