This dataset contains the metadata of the datasets published in 77 Dataverse installations, information about each installation's metadata blocks, and the list of standard licenses that dataset depositors can apply to the datasets they publish in the 36 installations running more recent versions of the Dataverse software. The data is useful for reporting on the quality of dataset- and file-level metadata within and across Dataverse installations. Curators and other researchers can use this dataset to explore how well Dataverse software and the repositories using the software help depositors describe data.

How the metadata was downloaded
The dataset metadata and metadata block JSON files were downloaded from each installation on October 2 and October 3, 2022 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. In order to get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one column named "hostname" listing each installation URL in which I was able to create an account and another named "apikey" listing my accounts' API tokens. The Python script expects and uses the API tokens in this CSV file to get metadata and other information from installations that require API tokens.

How the files are organized
├── csv_files_with_metadata_from_most_known_dataverse_installations
│   ├── author(citation).csv
│   ├── basic.csv
│   ├── contributor(citation).csv
│   ├── ...
│   └── topic_classification(citation).csv
├── dataverse_json_metadata_from_each_known_dataverse_installation
│   ├── Abacus_2022.10.02_17.11.19.zip
│   ├── dataset_pids_Abacus_2022.10.02_17.11.19.csv
│   ├── Dataverse_JSON_metadata_2022.10.02_17.11.19
│   ├── hdl_11272.1_AB2_0AQZNT_v1.0.json
│   ├── ...
│   ├── metadatablocks_v5.6
│   ├── astrophysics_v5.6.json
│   ├── biomedical_v5.6.json
│   ├── citation_v5.6.json
│   ├── ...
│   ├── socialscience_v5.6.json
│   ├── ACSS_Dataverse_2022.10.02_17.26.19.zip
│   ├── ADA_Dataverse_2022.10.02_17.26.57.zip
│   ├── Arca_Dados_2022.10.02_17.44.35.zip
│   ├── ...
│   └── World_Agroforestry_-_Research_Data_Repository_2022.10.02_22.59.36.zip
├── dataset_pids_from_most_known_dataverse_installations.csv
├── licenses_used_by_dataverse_installations.csv
└── metadatablocks_from_most_known_dataverse_installations.csv

This dataset contains two directories and three CSV files not in a directory.

One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 18 CSV files that contain the values from common metadata fields of all 77 Dataverse installations. For example, author(citation)_2022.10.02-2022.10.03.csv contains the "Author" metadata for all published, non-deaccessioned versions of all datasets in the 77 installations, where there's a row for each author name, affiliation, identifier type and identifier.

The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 77 zipped files, one for each of the 77 Dataverse installations whose dataset metadata I was able to download using Dataverse APIs. Each zip file contains a CSV file and two sub-directories:
The CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation, as well as a column to indicate whether or not the Python script was able to download the Dataverse JSON metadata for each dataset. For Dataverse installations using Dataverse software versions whose Search APIs include each dataset's owning Dataverse collection name and alias, the CSV files also include which Dataverse collection (within the installation) that dataset was published in.
One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions. The JSON files contain the metadata in the "Dataverse JSON" metadata schema.
The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded. I saved them so that they can be used when extracting metadata from the Dataverse JSON files.

The dataset_pids_from_most_known_dataverse_installations.csv file contains the dataset PIDs of all published datasets in the 77 Dataverse installations, with a column to indicate if the Python script was able to download the dataset's metadata. It's a union of all of the "dataset_pids_..." files in each of the 77 zip files.

The licenses_used_by_dataverse_installations.csv file contains information about the licenses that a number of the installations let depositors choose when creating datasets. When I collected ...

Visit https://dataone.org/datasets/sha256%3Ad27d528dae8cf01e3ea915f450426c38fd6320e8c11d3e901c43580f997a3146 for complete metadata about this dataset.
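For readers who want to reproduce a similar harvest, the sketch below (not the author's script) shows one way to page through an installation's Search API and export each dataset's "Dataverse JSON" metadata, driven by a CSV of hostnames and API tokens like the one described above. The CSV filename and the assumption that the "hostname" column holds bare hostnames are illustrative.

import csv
import requests

def harvest(hostname, api_key=None):
    """Yield (persistent ID, Dataverse JSON metadata or None) for every dataset in one installation."""
    headers = {"X-Dataverse-key": api_key} if api_key else {}
    start, per_page = 0, 100
    while True:
        search = requests.get(
            f"https://{hostname}/api/search",
            params={"q": "*", "type": "dataset", "start": start, "per_page": per_page},
            headers=headers,
        ).json()["data"]
        for item in search["items"]:
            pid = item["global_id"]
            export = requests.get(
                f"https://{hostname}/api/datasets/export",
                params={"exporter": "dataverse_json", "persistentId": pid},
                headers=headers,
            )
            yield pid, (export.json() if export.ok else None)
        start += per_page
        if start >= search["total_count"]:
            break

# "installation_api_keys.csv" is a hypothetical name for the hostname/apikey CSV described above.
with open("installation_api_keys.csv", newline="") as f:
    for row in csv.DictReader(f):
        for pid, metadata in harvest(row["hostname"], row.get("apikey")):
            print(row["hostname"], pid, "ok" if metadata else "failed")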
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset shows SIRIM Industry Standards by SIRIM Training Services Sdn. Bhd (STS).
Provides a listing of the States and privately owned entities designated and/or delegated by the Grain Inspection, Packers and Stockyards Administration (GIPSA), Federal Grain Inspection Service (FGIS) to provide official inspection and/or weighing services under the authority of the United States Grain Standards Act (USGSA). Only entities listed in this Directory are recognized as Official Agencies (OAs) by FGIS.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This list contains the government API cases collected, cleaned and analysed in the APIs4DGov study "Web API landscape: relevant general purpose ICT standards, technical specifications and terms".
The list does not represent a complete list of all government cases in Europe, as it is built to support the goals of the study and is limited to the analysis and data gathered from the following sources:
The EU open data portal
The European data portal
The INSPIRE catalogue
JoinUp: The API cases collected from the European Commission JoinUp platform
Literature-document review: the API cases gathered from the research activities of the study performed till the end of 2019
ProgrammableWeb: the ProgrammableWeb API directory
Smart 2015/0041: the database of 395 cases created by the study ‘The project Towards faster implementation and uptake of open government’ (SMART 2015/0041).
Workshops/meetings/interviews: a list of API cases collected in the workshops, surveys and interviews organised within the APIs4DGov
Each API case is classified according to the following scheme:
Unique id: a unique key for each case, obtained by concatenating the following fields: (Country Code) + (Governmental level) + (Name Id) + (Type of API) (see the sketch after this list)
API Country or type of provider: the country in which the API case has been published
API provider: the specific provider that published and maintains the API case
Name Id: an acronym of the name of the API case (it may not be unique)
Short description
Type of API: (i) API registry, a set, catalogue, registry or directory of APIs; (ii) API platform, a platform that supports the use of APIs; (iii) API tool, a tool used to manage APIs; (iv) API standard, a set of standards related to government APIs; (v) Data catalogue, an API published to access metadata of datasets, normally published by a data catalogue; (vi) Specific API, a single API (which can have many endpoints) built for a specific purpose
Number of APIs: normally one; in the case of an API registry, the number of APIs published by the registry as of 31/12/2019
Theme: list of domains related to the API case (controlled vocabulary)
Governmental level: the geographical scope of the API (city, regional, national or international)
Country code: the country's two-letter internal code
Source: the source (among those listed above) from which the API case was gathered
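As a hypothetical illustration of how the "Unique id" described above could be derived by concatenating the four fields (the separator and field formatting are assumptions, not specified by the study):

def make_unique_id(country_code, governmental_level, name_id, api_type, sep="-"):
    """Concatenate the classification fields into a single key.

    Example (hypothetical values):
    make_unique_id("IT", "national", "ANPR", "Specific API")
    -> "IT-national-ANPR-Specific API"
    """
    return sep.join([country_code, governmental_level, name_id, api_type])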
These data are collected to inform families applying to Pre-K for All of the programs available. This is a spreadsheet version of the exact data points printed in the borough-level directories and the online PDFs. Each record represents a school participating in Pre-K for All. The information for each school is collected by the Department of Early Childhood Education (Department of Education). The "Who Got Offers" section of the spreadsheet is calculated by the Office of Student Enrollment (Department of Education) based on results from Round 1 of the Fall 2017 admissions process. This spreadsheet is simply a different representation of the same material produced in the printed and widely distributed Pre-K directories. This spreadsheet should not be used to identify current programs, as the directory was printed in December 2017 and schools are subject to change. For the most updated list of Pre-K for All schools, use the UPK Sites Directory compiled by the Department of Early Childhood Education.
Disclaimer: The following columns were added to this directory to meet the Geo-spatial Standards of Local Law 108 of 2015:
• Postcode / Zip code
• Latitude
• Longitude
• Community Board
• Council District
• Census tract
• BIN
• BBL
• NTA
The National Health Services Directory ('NHSD'), published by Healthdirect Australia, is a comprehensive and consolidated national directory of health services and related health service providers in both the public and private sectors across all Australian jurisdictions. The purpose of the NHSD is to provide consistent, authoritative, reliable and easily accessible information about health services to the public and to support health professionals with the delivery of healthcare. Service types in the NHSD are classified using the SNOMED CT-AU terminology standard. Refer to the National Clinical Terminology Service for more information. This dataset was derived from service, location and organisation profiles accessed through the Healthdirect-provided NHSD API and has been spatialised as a point dataset. This directory is presented as a snapshot in time as at August 2023. This dataset is available for access to academic users once they have agreed to the NHSD terms of use; other AURIN users can apply for access to the dataset here.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Center for Human Neuroscience (CHN) Retinotopic Mapping Dataset collected at the University of Washington is part of "Improving the reliability and accuracy of population receptive field measures using a 'log-bar' stimulus" by Kelly Chang, Ione Fine, and Geoffrey M. Boynton.
The full dataset is comprised of the raw, preprocessed (with fMRIPrep), and pRF estimated data from 12 participants across 2 sessions.
dataset
This directory contains the raw, unprocessed data for each participant.
dataset/derivatives/fmriprep
This directory contains the fMRIPrep processed data for each participant.
dataset/derivatives/freesurfer
This directory contains the standard FreeSurfer processed data for each participant.
dataset/derivatives/prf-estimation
This directory contains the pRF estimation data and results for each participant.
dataset/derivatives/prf-estimation/files
This directory contains miscellaneous files used for pRF estimation or visualizations.
angle_lut.json: Custom polar angle lookup table for visualization with FreeSurfer's freeview.
eccen_lut.json: Custom eccentricity lookup table for visualization with FreeSurfer's freeview.
participants_hrf_paramters.json: Corresponding metadata for participants_hrf_paramters.tsv.
participants_hrf_paramters.tsv: Estimated HRF parameters used during pRF estimation, by participant and hemisphere.
dataset/derivatives/prf-estimation/stimuli
This directory contains the stimuli used in the experiment and stimulus apertures used in pRF estimation.
task-(fixed|log)bar_run-<n>: Name of the stimulus condition and run number.
*_desc-full_stim.mat: Stimulus images (uint8) at full resolution of 540 by 540 pixels and 6 Hz.
*_desc-down_aperture.mat: Stimulus aperture (binary), where 1s indicated stimulus and 0s indicated the background, at a downsampled (down) resolution of 108 by 108 pixels and 1 Hz.
dataset/derivatives/prf-estimation/sub-<n>/anat
This directory contains the participant's surface (inflated and sphere) and curvature files for visualization using FreeSurfer's freeview.
dataset/derivatives/prf-estimation/sub-<n>/func
This directory contains the preprocessed and denoised functional data, sampled onto the participant's surface, used during pRF estimation.
dataset/derivatives/prf-estimation/sub-<n>/prfs
This directory contains the estimated pRF parameter maps separated by which data was used during estimation.
ses-(01|02|all): Sessions used during pRF estimation, either Session 1, Session 2, or both.
task-(fixedbar|logbar|all): Stimulus type used during pRF estimation, either fixed-bar, log-bar, or both.
Within the pRF estimate directories are the estimated pRF parameter maps listed below (a loading sketch follows the list):
- *_angle.mgz: Polar angle maps, degrees from (-180, 180). Negative values represent the left hemifield and positive values represent the right hemifield.
- *_eccen.mgz: Eccentricity maps, visual degrees.
- *_sigma.mgz: pRF size maps, visual degrees.
- *_vexpl.mgz: Proportion of variance explained maps.
- *_x0.mgz: x-coordinate maps, visual degrees, with origin (0,0) at screen center.
- *_y0.mgz: y-coordinate maps, visual degrees, with origin (0,0) at screen center.
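The parameter maps are FreeSurfer .mgz surface files, so they can be read with nibabel. The sketch below is a minimal example, not part of the dataset's code; the filename prefixes are hypothetical (only the *_<map>.mgz suffixes above are given), and the 0.1 variance-explained threshold is an arbitrary choice.

import numpy as np
import nibabel as nib

def load_map(path):
    # Surface .mgz maps load as (n_vertices, 1, 1); squeeze to one value per vertex.
    return np.squeeze(nib.load(path).get_fdata())

angle = load_map("lh_angle.mgz")   # polar angle, degrees in (-180, 180)
eccen = load_map("lh_eccen.mgz")   # eccentricity, visual degrees
vexpl = load_map("lh_vexpl.mgz")   # proportion of variance explained

mask = vexpl > 0.1                 # keep reasonably well-fit vertices (example threshold)
print(f"{mask.sum()} vertices above threshold; "
      f"median eccentricity {np.median(eccen[mask]):.2f} deg")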
dataset/derivatives/prf-estimation/sub-<n>/rois
This directory contains the ROI (.label) files for each participant.
*_evc.label: Early visual cortex (EVC). A liberal ROI that covered V1, V2, and V3, used for pRF estimation.
*_fovea.label: Foveal confluence ROI.
*_v<n>.label: Corresponding visual area ROI files.
dataset/tutorials
This directory contains tutorial scripts in MATLAB and Python to generate logarithmically distorted images from a directory of input images.
create_distorted_images.[m,ipynb]: Tutorial script that generates logarithmically distorted images when given an image input directory.
logarithmic_distortion_demo.[m,ipynb]: Tutorial script that demonstrates the logarithmic distortion warping on a single image.
fixed-bar: Sample image input directory for create_distorted_images.[m,ipynb].
log-bar: Sample image output directory for create_distorted_images.[m,ipynb].
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Directory of Standard Malaysian Glove (SMG) Certified Suppliers
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Last Version: 4
Authors: Carlota Balsa-Sánchez, Vanesa Loureiro
Date of data collection: 2022/12/15
General description: Publishing datasets according to the FAIR principles can be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v4.xlsx: full list of 140 academic journals in which data papers or/and software papers could be published
- data_articles_journal_list_v4.csv: full list of 140 academic journals in which data papers or/and software papers could be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 4th version
- Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types
- Information added: listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR), Scopus and Web of Science (WOS), Journal Master List.
Version: 3
Authors: Carlota Balsa-Sánchez, Vanesa Loureiro
Date of data collection: 2022/10/28
General description: Publishing datasets according to the FAIR principles can be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v3.xlsx: full list of 124 academic journals in which data papers or/and software papers could be published
- data_articles_journal_list_3.csv: full list of 124 academic journals in which data papers or/and software papers could be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 3rd version
- Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types
- Information added: listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR).
Erratum - Data articles in journals Version 3:
Botanical Studies -- ISSN 1999-3110 -- JCR (JIF) Q2
Data -- ISSN 2306-5729 -- JCR (JIF) n/a
Data in Brief -- ISSN 2352-3409 -- JCR (JIF) n/a
Version: 2
Author: Francisco Rubio, Universitat Politècnica de València.
Date of data collection: 2020/06/23
General description: Publishing datasets according to the FAIR principles can be achieved by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v2.xlsx: full list of 56 academic journals in which data papers or/and software papers could be published
- data_articles_journal_list_v2.csv: full list of 56 academic journals in which data papers or/and software papers could be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 2nd version
- Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types
- Information added: listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Scimago Journal and Country Rank (SJR)
Total size: 32 KB
Version 1: Description
This dataset contains a list of journals that publish data articles, code, software articles and database articles.
The search strategy in DOAJ and Ulrichsweb was to search for the word "data" in journal titles.
Acknowledgements:
Xaquín Lores Torres for his invaluable help in preparing this dataset.
The dataset contains data on publishers participating in the ISBN (International Standard Book Numbering) system in the Czech Republic since 1989 and in the ISMN (International Standard Music Numbering) system in the Czech Republic since 1996. It also contains data on publishers who have not registered with either of the two aforementioned systems, but these data are not updated. This dataset contains almost 20,000 records.
The National Center for Education Statistics' (NCES) Education Demographic and Geographic Estimate (EDGE) program develops annually updated point locations (latitude and longitude) for public elementary and secondary schools included in the NCES Common Core of Data (CCD). The CCD program annually collects administrative and fiscal data about all public schools, school districts, and state education agencies in the United States. The data are supplied by state education agency officials and include basic directory and contact information for schools and school districts, as well as characteristics about student demographics, number of teachers, school grade span, and various other administrative conditions. CCD school and agency point locations are derived from reported information about the physical location of schools and agency administrative offices. The point locations and administrative attributes in this data layer represent the most current CCD collection. For more information about NCES school point data, see: https://nces.ed.gov/programs/edge/Geographic/SchoolLocations. For more information about these CCD attributes, as well as additional attributes not included, see: https://nces.ed.gov/ccd/files.asp.
Notes:
-1 or M: Indicates that the data are missing.
-2 or N: Indicates that the data are not applicable.
-9: Indicates that the data do not meet NCES data quality standards.
Collections are available for the following years: 2022-23, 2021-22, 2020-21, 2019-20, 2018-19, 2017-18.
All information contained in this file is in the public domain. Data users are advised to review NCES program documentation and feature class metadata to understand the limitations and appropriate use of these data.
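As a hedged illustration of how the special codes above might be handled when reading a CCD/EDGE extract with pandas (the filename is hypothetical, and treating all three code families as missing values is an analysis choice, not an NCES recommendation):

import pandas as pd

# Codes used in the CCD attributes, per the notes above
MISSING = {"-1", "M"}           # data are missing
NOT_APPLICABLE = {"-2", "N"}    # data are not applicable
LOW_QUALITY = {"-9"}            # data do not meet NCES quality standards

df = pd.read_csv("ccd_school_locations.csv", dtype=str)   # hypothetical filename
flags = df.isin(MISSING | NOT_APPLICABLE | LOW_QUALITY)   # per-cell flag of coded values
df = df.mask(flags, pd.NA)                                 # blank them out for analysis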
https://creativecommons.org/publicdomain/zero/1.0/
The provided code processes a Tajweed dataset, which appears to be a collection of audio recordings categorized by different Tajweed rules (Ikhfa, Izhar, Idgham, Iqlab). Let's break down the dataset's structure and the code's functionality:
Dataset Structure:
Code Functionality:
Initialization and Imports: The code begins with necessary imports (pandas, pydub) and mounts Google Drive. Pydub is used for audio file format conversion.
Directory Listing: It initially checks if a specified directory exists (for example, Alaa_alhsri/Ikhfa) and lists its files, demonstrating basic file system access.
Metadata Creation: The core of the script is the generation of metadata, which provides essential information about each audio file. The tajweed_paths dictionary maps each Tajweed rule to a list of paths, associating each path with the reciter's name. The metadata records the following fields:
global_id: A unique identifier for each audio file.
original_filename: The original filename of the audio file.
new_filename: A standardized filename that incorporates the Tajweed rule (label), sheikh's ID, audio number, and a global ID.
label: The Tajweed rule.
sheikh_id: A numerical identifier for each sheikh.
sheikh_name: The name of the reciter.
audio_number: A sequential number for the audio files within a specific sheikh and Tajweed rule combination.
original_path: Full path to the original audio file.
new_path: Full path to the intended location for the renamed and potentially converted audio file.
File Renaming and Conversion: Each audio file is renamed to its new_filename and stored in the designated directory, converted to .wav format, creating standardized files in a new output_dataset directory. The new filenames are based on rule, sheikh and a counter.
Metadata Export: Finally, the compiled metadata is saved as a CSV file (metadata.csv) in the output directory. This CSV file is crucial for training any machine learning model using this data.
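A minimal sketch of the kind of pipeline described above, not the dataset's actual script: the tajweed_paths contents, directory names and filename pattern below are hypothetical placeholders (only "Alaa_alhsri/Ikhfa", the field names and metadata.csv come from the description).

import os
import pandas as pd
from pydub import AudioSegment

# Hypothetical mapping of Tajweed rules to (reciter name, source directory) pairs
tajweed_paths = {
    "Ikhfa": [("Alaa_alhsri", "Alaa_alhsri/Ikhfa")],
    "Izhar": [("Alaa_alhsri", "Alaa_alhsri/Izhar")],
}

rows, global_id = [], 0
os.makedirs("output_dataset", exist_ok=True)
for label, sources in tajweed_paths.items():
    for sheikh_id, (sheikh_name, src_dir) in enumerate(sources, start=1):
        for audio_number, fname in enumerate(sorted(os.listdir(src_dir)), start=1):
            global_id += 1
            new_filename = f"{label}_{sheikh_id}_{audio_number}_{global_id}.wav"  # assumed pattern
            new_path = os.path.join("output_dataset", new_filename)
            # Convert the source audio (whatever its format) to .wav with pydub
            AudioSegment.from_file(os.path.join(src_dir, fname)).export(new_path, format="wav")
            rows.append({
                "global_id": global_id,
                "original_filename": fname,
                "new_filename": new_filename,
                "label": label,
                "sheikh_id": sheikh_id,
                "sheikh_name": sheikh_name,
                "audio_number": audio_number,
                "original_path": os.path.join(src_dir, fname),
                "new_path": new_path,
            })

# Export the compiled metadata for downstream model training
pd.DataFrame(rows).to_csv(os.path.join("output_dataset", "metadata.csv"), index=False)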
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Directory of Standard Malaysian Rubber (SMR) Producer 2019
The World Ocean Database 1998 (WOD98) is comprised of five CD-ROMs containing profile and plankton/biomass data in compressed format. WOD98-01 through WOD98-04 contain observed level data; WOD98-05 contains all the standard level data. World Ocean Database 1998 (WOD98) expands on World Ocean Atlas 1994 (WOA94) by including the additional variables nitrite, pH, alkalinity, chlorophyll, and plankton, as well as all available metadata and meteorology. WOD98 is an International Year of the Ocean product. WOD98-01 Observed Level Data: North Atlantic 30° N-90° N; WOD98-02 Observed Level Data: North Atlantic 0°-30° N, South Atlantic; WOD98-03 Observed Level Data: North Pacific 20° N-90° N; WOD98-04 Observed Level Data: North Pacific 0°-20° N, South Pacific, Indian; WOD98-05 Standard Level Data for all Ocean Basins. Discs may be created by burning the appropriate .iso file(s) in the data/0-data/disc_image/ directory to blank CD-ROM media using standard CD-ROM authoring software. Software that was developed or provided with this NODC Standard Product may be included in the disc_image/ directory as part of a disc image, but executable software that was developed or provided with this NODC Standard Product has been excluded from the disc_contents/ directory.
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The open data portal catalogue is a downloadable dataset containing some key metadata for the general datasets available on the Government of Canada's Open Data portal. Resource 1 is generated using the ckanapi tool (external link). Resources 2-8 are generated using the Flatterer (external link) utility.
Description of resources:
1. Dataset is a JSON Lines (external link) file where the metadata of each Dataset/Open Information Record is one line of JSON. The file is compressed with GZip. The file is heavily nested and recommended for users familiar with working with nested JSON.
2. Catalogue is an XLSX workbook where the nested metadata of each Dataset/Open Information Record is flattened into worksheets for each type of metadata.
3. datasets metadata contains metadata at the dataset level. This is also referred to as the package in some CKAN documentation. This is the main table/worksheet in the SQLite database and XLSX output.
4. Resources Metadata contains the metadata for the resources contained within each dataset.
5. resource views metadata contains the metadata for the views applied to each resource, if a resource has a view configured.
6. datastore fields metadata contains the DataStore information for CSV datasets that have been loaded into the DataStore. This information is displayed in the Data Dictionary for DataStore-enabled CSVs.
7. Data Package Fields contains a description of the fields available in each of the tables within the Catalogue, as well as the count of the number of records each table contains.
8. data package entity relation diagram displays the title and format of each column, in each table in the Data Package, in the form of an ERD diagram. The Data Package resource offers a text-based version.
9. SQLite Database is a .db database, similar in structure to Catalogue. It can be queried with database or analytical software tools for analysis.
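A minimal sketch of working with resource 1 (the GZip-compressed JSON Lines file), assuming it has been downloaded locally under a hypothetical filename and that each record exposes common CKAN package keys such as "title" and "resources":

import gzip
import json

# Hypothetical local filename for the downloaded "Dataset" resource
with gzip.open("od-do-canada.jsonl.gz", "rt", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)  # one Dataset/Open Information Record per line
        print(record.get("title"), "-", len(record.get("resources", [])), "resources")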
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Summary
One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires records of brain responses under massive amounts of naturalistic stimuli. Although the scientific community has put considerable effort into collecting large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,120 naturalistic images from 30 participants. NOD strives for a balance between sampling variation between individuals and sampling variation between stimuli. This enables NOD to be used not only for determining whether an observation is generalizable across many individuals, but also for testing whether a response pattern generalizes to a variety of naturalistic stimuli. We anticipate that NOD, together with existing naturalistic neuroimaging datasets, will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.
Data record
The data were organized according to the Brain-Imaging-Data-Structure (BIDS) Specification version 1.7.0 and can be accessed from the OpenNeuro public repository (accession number: XXX). In short, raw data of each subject were stored in “sub-<ID>” directories.
Stimulus images The stimulus images for different fMRI experiments are deposited in separate folders: “stimuli/imagenet”, “stimuli/coco”, “stimuli/prf”, and “stimuli/floc”. Each experiment folder contains corresponding stimulus images, and the auxiliary files can be found within the “info” subfolder.
Raw MRI data Each participant folder consists of several session folders: anat, coco, imagenet, prf, floc. Each session folder in turn includes “anat”, “func”, or “fmap” folders for corresponding modality data. The scan information for each session is provided in a TSV file.
Preprocessed volume data from fMRIprep The preprocessed volume-based fMRI data are in subject's native space, saved as “sub-
Preprocessed surface-based data from ciftify The preprocessed surface-based data are in standard fsLR space, saved as “sub-
Brain activation data from surface-based GLM analyses The brain activation data are derived from GLM analyses on the standard fsLR space, saved as “sub-
2023 Updates to the National Incident Feature Service and Event Geodatabase
For 2023, there are no schema updates and no major changes to GeoOps or the GISS Workflow! This is a conscious choice and is intended to provide a needed break for both users and administrators. Over the last 5 years, nearly every aspect of the GISS position has seen a major overhaul and while the advancements have been overwhelmingly positive, many of us are experiencing change fatigue. This is not to say there is no room for improvement. Many great suggestions were received throughout the season and in the GISS Survey, and they will be considered for inclusion in 2024. That there are no critical updates necessary also indicates that we have reached a level of maturity with the current state, and that is good news for everyone. Please continue to submit your ideas; they are appreciated and valuable insight, even if the change is not implemented. For information on 2023 AGOL updates please see the Create and Share Web Maps | NWCG page.
There are three smaller changes worth noting this year:
Standard Symbology is now the default on the NIFS. For most workflows, the update will be seamless. All the Event Standard symbols are now supported in Field Maps and Map Viewer. Most users will now see the same symbols in all print and digital products. However, in AGOL some web apps do not support the complex line symbols. The simplified lines will still be present in the official Editing Apps (Operations, SITL, and GISS), and any custom apps built with the Web App Builder (WAB) interface. Experience Builder can be used for any new app creation. If you must use WAB or another app that cannot display the complex line symbology in the NIFS, please contact wildfireresponse@firenet.gov for guidance.
Event Line now has Preconfigured Labels. Labels on Event Line have historically been uncommon, but to speed their implementation when necessary, color-coded label classes have been added to the NIFS and the lyrx files provided in the GIS Folder Structure. They can be disabled or modified as needed, should they interfere with any of your workflows.
“Restricted” Folder added to GeoOps Folder Structure. At the base level within the 2023_Template, a ‘restricted’ folder is now included. This folder should be used for all data and products that contain sensitive, restricted, or controlled-unclassified information. This will aid the DOCL and any future FOIA liaisons in protecting this information. When using OneDrive, this folder can optionally be password protected.
Reminder: Sensitive Data is not allowed to be hosted within the NIFC Org.
The World Ocean Database 1998 (WOD98) is comprised of five CD-ROMs containing profile and plankton/biomass data in compressed format. WOD98-01 through WOD98-04 contain observed level data, WOD98-05 contains all the standard level data.
World Ocean Database 1998 (WOD98) expands on World Ocean Atlas 1994 (WOA94) by including the additional variables nitrite, pH, alkalinity, chlorophyll, and plankton, as well as all available metadata and meteorology. WOD98 is an International Year of the Ocean product.
WOD98-01 Observed Level Data: North Atlantic 30° N-90° N; WOD98-02 Observed Level Data: North Atlantic 0°-30° N, South Atlantic; WOD98-03 Observed Level Data: North Pacific 20° N-90° N; WOD98-04 Observed Level Data: North Pacific 0°-20° N, South Pacific, Indian; WOD98-05 Standard Level Data for all Ocean Basins.
Copies of the World Ocean Atlas 1998 (WOA98) version 1 CD-ROMs are no longer available from the NODC Online Store. Duplicate discs may be created by burning the appropriate .iso file(s) in the data/0-data/disc_image/ directory to blank CD-ROM media using standard CD-ROM authoring software. Software that was developed or provided with this NODC Standard Product may be included in the disc_image/ directory as part of a disc image, but executable software that was developed or provided with this NODC Standard Product has been excluded from the disc_contents/ directory.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Summary
Human action recognition is one of our critical living abilities, allowing us to interact easily with the environment and others in everyday life. Although the neural basis of action recognition has been widely studied using a few categories of actions from simple contexts as stimuli, how the human brain recognizes diverse human actions in real-world environments still needs to be explored. Here, we present the Human Action Dataset (HAD), a large-scale functional magnetic resonance imaging (fMRI) dataset for human action recognition. HAD contains fMRI responses to 21,600 video clips from 30 participants. The video clips encompass 180 human action categories and offer comprehensive coverage of complex activities in daily life. We demonstrate that the data are reliable within and across participants and, notably, capture rich representation information of the observed human actions. This extensive dataset, with its vast number of action categories and exemplars, has the potential to deepen our understanding of human action recognition in natural environments.
Data record
The data were organized according to the Brain-Imaging-Data-Structure (BIDS) Specification version 1.7.0 and can be accessed from the OpenNeuro public repository (accession number: ds004488). The raw data of each subject were stored in "sub-<ID>" directories. The preprocessed volume data and the derived surface-based data were stored in “derivatives/fmriprep” and “derivatives/ciftify” directories, respectively. The video clip stimuli were stored in the “stimuli” directory.
Video clips stimuli The video clips stimuli selected from HACS are deposited in the "stimuli" folder. Each of the 180 action categories holds a folder in which 120 unique video clips are stored.
Raw data The data for each participant are distributed in three sub-folders, including the “anat” folder for the T1 MRI data, the “fmap” folder for the field map data, and the “func” folder for the functional MRI data. The events file in the “func” folder contains the onset, duration, and trial type (category index) for each scanning run.
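A hedged sketch of reading one of these events files with pandas; the path below is hypothetical, and the column names (onset, duration, trial_type) follow the description above and the BIDS convention for events files.

import pandas as pd

# Hypothetical events file path for one run of one participant
events = pd.read_csv(
    "sub-01/func/sub-01_task-action_run-01_events.tsv",
    sep="\t",
)
print(events[["onset", "duration", "trial_type"]].head())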
Preprocessed volume data from fMRIprep The preprocessed volume-based fMRI data are in subject's native space, saved as “sub-
Preprocessed surface data from ciftify Under the “results” folder, the preprocessed surface-based data are saved in standard fsLR space, named as “sub-
Automatically describing images using natural sentences is an essential task for the inclusion of visually impaired people on the Internet. Although there are many datasets in the literature, most of them contain only English captions, whereas datasets with captions described in other languages are scarce.
The PraCegoVer movement arose on the Internet, encouraging social media users to publish images, tag them with #PraCegoVer, and add a short description of their content. Inspired by this movement, we have proposed #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese with freely annotated images.
Dataset Structure
The dataset consists of the file dataset.json and an images directory containing the images. The file dataset.json comprises a list of JSON objects with the attributes:
user: anonymized user that made the post;
filename: image file name;
raw_caption: raw caption;
caption: clean caption;
date: post date.
Each instance in dataset.json is associated with exactly one image in the images directory whose filename is pointed by the attribute filename. Also, we provide a sample with five instances, so the users can download the sample to get an overview of the dataset before downloading it completely.
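A minimal sketch of pairing captions with images after extraction, assuming dataset.json and the extracted images directory sit in the current working directory (the "filename" and "caption" attributes are the ones listed above):

import json
import os

with open("dataset.json", encoding="utf-8") as f:
    instances = json.load(f)  # list of JSON objects, one per post

# Build (image path, caption) pairs using the filename attribute of each instance
pairs = [(os.path.join("images", item["filename"]), item["caption"]) for item in instances]
print(f"{len(pairs)} image-caption pairs; first:", pairs[0])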
Download Instructions
If you just want to have an overview of the dataset structure, you can download sample.tar.gz. But, if you want to use the dataset, or any of its subsets (63k and 173k), you must download all the files and run the following commands to uncompress and join the files:
cat images.tar.gz.part* > images.tar.gz
tar -xzvf images.tar.gz
Alternatively, you can download the entire dataset from the terminal using the python script download_dataset.py available in PraCegoVer repository. In this case, first, you have to download the script and create an access token here. Then, you can run the following command to download and uncompress the image files:
python download_dataset.py --access_token=