The 2021-2022 School Learning Modalities dataset provides weekly estimates of school learning modality (in-person, remote, or hybrid) for U.S. K-12 public and independent charter school districts for the 2021-2022 school year and the Fall 2022 semester, from August 2021 – December 2022. These data were modeled using multiple sources of input data (see below) to infer the most likely learning modality of a school district for a given week. They should be considered district-level estimates and may not always reflect the true learning modality, particularly for districts in which data are unavailable. If a district reports multiple modality types within the same week, the modality offered for the majority of those days is reflected in the weekly estimate. All school district metadata are sourced from the National Center for Education Statistics (NCES) for 2020-2021.

School learning modality types are defined as follows:
In-Person: All schools within the district offer face-to-face instruction 5 days per week to all students at all available grade levels.
Remote: Schools within the district do not offer face-to-face instruction; all learning is conducted online/remotely for all students at all available grade levels.
Hybrid: Schools within the district offer a combination of in-person and remote learning; face-to-face instruction is offered less than 5 days per week, or only to a subset of students.

Data Information
School learning modality data provided here are model estimates using combined input data and are not guaranteed to be 100% accurate. This learning modality dataset was generated by combining data from four different sources: Burbio [1], MCH Strategic Data [2], the AEI/Return to Learn Tracker [3], and state dashboards [4-20]. These data were combined using a hidden Markov model that infers, for each district, the sequence of learning modalities (In-Person, Hybrid, or Remote) most likely to produce the modalities reported by these sources. The model was trained using data from the 2020-2021 school year. Metadata describing the location, number of schools, and number of students in each district comes from NCES [21]. You can read more about the model in the CDC MMWR: COVID-19–Related School Closures and Learning Modality Changes — United States, August 1–September 17, 2021.

The metrics listed for each school learning modality reflect totals by district and the number of enrolled students per district for which data are available. School districts represented here exclude private schools and include the following NCES subtypes:
Public school district that is NOT a component of a supervisory union
Public school district that is a component of a supervisory union
Independent charter district
“BI” in the state column refers to school districts funded by the Bureau of Indian Education.

Technical Notes
Data from August 1, 2021 to June 24, 2022 correspond to the 2021-2022 school year. During this time frame, data from the AEI/Return to Learn Tracker and most state dashboards were not available. Inferred modalities with a probability below 0.6 were deemed inconclusive and were omitted. During the Fall 2022 semester, modalities for districts with a school closure reported by Burbio were updated to either “Remote”, if the closure spanned the entire week, or “Hybrid”, if the closure spanned 1-4 days of the week. Data from August
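The description above names the decoding task (find the modality sequence most likely to have produced the sources' weekly reports) but not its mechanics. Below is a minimal, hypothetical Viterbi sketch of that kind of hidden Markov decoding: the three states match the dataset, but the start, transition, and emission probabilities are invented for illustration and are not the CDC's trained parameters.

```python
import numpy as np

# Hypothetical HMM parameters; the CDC's trained values are not published
# in this description, so these numbers are purely illustrative.
STATES = ["In-Person", "Hybrid", "Remote"]
start = np.array([0.60, 0.25, 0.15])
# Transition: districts tend to keep the same modality from week to week.
trans = np.array([[0.90, 0.07, 0.03],
                  [0.10, 0.80, 0.10],
                  [0.05, 0.15, 0.80]])
# Emission: probability a source reports modality j given true modality i.
emit = np.array([[0.85, 0.10, 0.05],
                 [0.15, 0.75, 0.10],
                 [0.05, 0.15, 0.80]])

def viterbi(obs):
    """Most likely state sequence for observed modality indices
    (0=In-Person, 1=Hybrid, 2=Remote), one observation per week."""
    n, k = len(obs), len(STATES)
    logp = np.zeros((n, k))
    back = np.zeros((n, k), dtype=int)
    logp[0] = np.log(start) + np.log(emit[:, obs[0]])
    for t in range(1, n):
        for j in range(k):
            scores = logp[t - 1] + np.log(trans[:, j])
            back[t, j] = np.argmax(scores)
            logp[t, j] = scores[back[t, j]] + np.log(emit[j, obs[t]])
    path = [int(np.argmax(logp[-1]))]          # best final state
    for t in range(n - 1, 0, -1):              # backtrack to week 1
        path.append(int(back[t, path[-1]]))
    return [STATES[s] for s in reversed(path)]

print(viterbi([0, 0, 1, 2, 2]))  # a district drifting toward Remote
```

The 0.6 threshold mentioned under Technical Notes would presumably apply to per-week posterior probabilities (a forward-backward computation) rather than to this single best path.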
ngqtrung/full-modality-data dataset hosted on Hugging Face and contributed by the HF Datasets community
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains the raw fMRI data of a preregistered study. Dataset includes:
session pre:
1. anat/: anatomical scans (T1-weighted images) for each subject
2. func/: whole-brain EPI data from all task runs (8x single task, 2x dual task, 1x resting state, and 2x localizer task)
3. fmap/: fieldmaps with magnitude1, magnitude2, and phasediff images

session post:
1. func/: whole-brain EPI data from all task runs (8x single task, 2x dual task)
2. fmap/: fieldmaps with magnitude1, magnitude2, and phasediff images
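For orientation, the session layout above maps onto BIDS-style paths roughly as sketched below; the subject and session labels, run numbering, and filename entities are assumptions for illustration and were not verified against the actual files.

```python
from pathlib import Path

# Hypothetical BIDS-style paths for one subject in the pre session;
# real labels and run counts may differ (see the dataset itself).
root, sub, ses = Path("dataset_root"), "sub-01", "ses-pre"
base = root / sub / ses

paths = [base / "anat" / f"{sub}_{ses}_T1w.nii.gz"]
paths += [base / "func" / f"{sub}_{ses}_task-single_run-{r:02d}_bold.nii.gz"
          for r in range(1, 9)]                       # 8 single-task runs
paths += [base / "fmap" / f"{sub}_{ses}_{suffix}.nii.gz"
          for suffix in ("magnitude1", "magnitude2", "phasediff")]
for p in paths:
    print(p)
```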
Please note that some participants did not complete the post session. We updated our consent form to obtain explicit permission to publish individual data, but not all participants re-signed the new version. Those participants are excluded here but are part of the t-maps on NeuroVault (compare participants.tsv).
Tasks always included visual and/or auditory input and required manual and/or vocal responses (visual+manual and auditory+vocal are modality compatible; visual+vocal and auditory+manual are modality incompatible). Tasks were presented as either single tasks or dual tasks. Participants completed a practice intervention prior to the post session, in which one group worked for 80 minutes outside the scanner on modality-incompatible dual tasks, one group on modality-compatible dual tasks, and the third group paused for 80 minutes.
For exact task descriptions, materials, and scripts, please see the preregistration: https://osf.io/whpz8
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a collection of medical imaging files for use in the "Medical Image Processing with Python" lesson, developed by the Netherlands eScience Center.
The dataset includes:
These files represent various medical imaging modalities and formats commonly used in clinical research and practice. They are intended for educational purposes, allowing students to practice image processing techniques, machine learning applications, and statistical analysis of medical images using Python libraries such as scikit-image, pydicom, and SimpleITK.
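As a taste of the exercises the lesson targets, the sketch below reads one image with pydicom and one with SimpleITK; the file names are placeholders, not files known to be in this dataset.

```python
import pydicom
import SimpleITK as sitk

# Placeholder file names; substitute paths to files from the dataset.
ds = pydicom.dcmread("example.dcm")        # one DICOM slice plus metadata
print(ds.Modality, ds.pixel_array.shape)   # e.g. 'CT' (512, 512)

img = sitk.ReadImage("example.nii.gz")     # a NIfTI volume
arr = sitk.GetArrayFromImage(img)          # numpy array, (z, y, x) order
print(arr.shape, img.GetSpacing())
```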
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Multi-modal Exercises Dataset (MEx) is a multi-sensor, multi-modal dataset built to benchmark Human Activity Recognition (HAR) and multi-modal fusion algorithms. Collection of this dataset was inspired by the need to recognise and evaluate the quality of exercise performance to support patients with Musculoskeletal Disorders (MSD). The MEx dataset contains data from 25 people recorded with four sensors: two accelerometers, a pressure mat, and a depth camera. Seven exercises that are highly recommended by physiotherapists for patients with low-back pain were selected for this data collection. The two accelerometers were placed on the wrist and the thigh, and participants performed the exercises on the pressure mat while being recorded by the depth camera from above. Each exercise was performed for a maximum of 60 seconds. The dataset contains three data modalities (numerical time-series, video, and pressure sensor data), posing interesting research challenges for HAR and exercise quality assessment. With recent advances in multi-modal fusion, we also believe MEx is instrumental in benchmarking not only HAR algorithms but also fusion algorithms for heterogeneous data types in multiple application domains.
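HAR pipelines on sensor streams like MEx's accelerometer data usually begin by segmenting each recording into fixed-length windows before classification. The generic sketch below shows that step; the sampling rate, window length, and array layout are assumptions for illustration, not the dataset's documented format.

```python
import numpy as np

def sliding_windows(signal, rate_hz, win_s=5.0, overlap=0.5):
    """Split a (samples, channels) sensor array into fixed-length
    overlapping windows, the usual input unit for HAR classifiers."""
    win = int(win_s * rate_hz)
    step = max(1, int(win * (1 - overlap)))
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

# Example: 60 s of synthetic 100 Hz tri-axial data -> (23, 500, 3)
acc = np.random.randn(6000, 3)
print(sliding_windows(acc, rate_hz=100).shape)
```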
Dataset description: This is a study of examples of Russian impersonal constructions with the modal word možno ‘can, be possible’ with and without the future copula budet ‘will be,’ i.e., možno + budet + INF and možno + INF. The data was collected in 2020-2021 from the old version of the Russian National Corpus (ruscorpora.ru). In the spreadsheet 01DataMoznoBudet, the data merges the results of four searches conducted to extract examples of sentences with the following construction types: možno + budet + INF.PFV, možno + budet + INF.IPFV, možno + INF.PFV, and možno + INF.IPFV. The results for each search were downloaded and pseudorandomized, and the first 200 examples were manually annotated, based on the syntactic analyses given in the corpus. The syntactic and morphological categories used in the corpus are explained here: https://ruscorpora.ru/corpus/main. In the spreadsheet 01DataZavtraMoznoBudet, the data merges the results of four searches conducted to extract examples of sentences with the following structure: zavtra + možno + budet + INF.PFV, zavtra + možno + budet + INF.IPFV, zavtra + možno + INF.PFV, and zavtra + možno + INF.IPFV. All of the examples (103 sentences) were imported to a spreadsheet and annotated manually, based on the syntactic analyses given in the corpus.

Article abstract: This paper examines Russian impersonal constructions with the modal word možno ‘can, be possible’ with and without the future copula budet ‘will be,’ i.e., možno + budet + INF and možno + INF. My contribution can be summarized as follows. First, corpus-based evidence reveals that možno + INF constructions are vastly more frequent than constructions with the copula. Second, the meaning of constructions without the future copula is more flexible: while the possibility is typically located in the present, the situation denoted by the infinitive may be located in the present or the future. Third, I show that the možno + INF construction is more ambiguous and can denote present, gnomic, or future situations. Fourth, I identify a number of contextual factors that unambiguously locate the situation in the future. I demonstrate that such factors are more frequently used with the future copula, and thus motivate the choice between the two constructions. Finally, I illustrate the interpretations in a straightforward manner by means of schemas of the type used in cognitive linguistics.
Data and response to the request posted at: https://www.data.govt.nz/datasetrequest/show/536
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2. Top predictive genes by CMOT in human brain [1] and DEX-treated A549 lung cancer cells [2]
This dataset encompasses data from the Main corpus of the Russian National Corpus (RNC, ruscorpora.ru) used for the analysis provided in Chapter 3 of the Introductory Chapter of the doctoral dissertation "The many faces of "možno" in Russian and across Slavic. Corpus investigation of constructions with the modal možno". Chapter 3 presents a study of 500 examples of Russian constructions with the modal word možno ‘can, be possible’. The query consisted of the single word možno without specification of a time period. The search returned 361,755 examples; 5,000 examples were downloaded in .xlsx format and pseudorandomized, and the first 500 examples were extracted for the analysis. The data in the spreadsheet 01DataTheManyFacesOfMozno comprises these 500 examples. The data was collected in March 2023 from the RNC. All of the examples are semantically and syntactically annotated by hand, based on the syntactic analyses given in the corpus. The syntactic and morphological categories used in the corpus are explained here: https://ruscorpora.ru/corpus/main.
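The sampling procedure described above (download, pseudorandomize, keep the first 500) takes only a few lines to reproduce; in this sketch the input file name and the random seed are placeholders, not the author's originals.

```python
import pandas as pd

# Placeholder file name and seed; the author's originals are not given.
hits = pd.read_excel("rnc_mozno_export.xlsx")      # the 5,000 downloaded examples
sample = (hits.sample(frac=1, random_state=42)     # pseudorandomize the order
              .head(500)                           # keep the first 500
              .reset_index(drop=True))
sample.to_csv("mozno_sample_for_annotation.csv", index=False)
```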
There has been a tremendous increase in the volume of Earth Science data over the last decade from modern satellites, in-situ sensors and different climate models. All these datasets need to be co-analyzed for finding interesting patterns or for searching for extremes or outliers. Information extraction from such rich data sources using advanced data mining methodologies is a challenging task not only due to the massive volume of data, but also because these datasets are physically stored at different geographical locations. Moving these petabytes of data over the network to a single location may waste a lot of bandwidth, and can take days to finish. To solve this problem, in this paper, we present a novel algorithm which can identify outliers in the global data without moving all the data to one location. The algorithm is highly accurate (close to 99%) and requires centralizing less than 5% of the entire dataset. We demonstrate the performance of the algorithm using data obtained from the NASA MODerate-resolution Imaging Spectroradiometer (MODIS) satellite images.
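The abstract does not spell out the algorithm itself, so the sketch below is purely illustrative of the general pattern such methods follow, not the authors' method: each site forwards cheap summary statistics plus only its most extreme local candidates, letting a coordinator flag global outliers while centralizing a small fraction of the data.

```python
import numpy as np

def local_candidates(x, k=50):
    """Each site keeps only its k most extreme points by local |z-score|."""
    z = np.abs((x - x.mean()) / x.std())
    return x[np.argsort(z)[-k:]]

def global_outliers(sites, k=50, thresh=3.0):
    """Pool candidates and flag points extreme under global statistics
    assembled from per-site sums (no raw data movement required)."""
    n = sum(len(s) for s in sites)
    mu = sum(s.sum() for s in sites) / n
    sigma = np.sqrt(sum((s**2).sum() for s in sites) / n - mu**2)
    pooled = np.concatenate([local_candidates(s, k) for s in sites])
    return pooled[np.abs(pooled - mu) / sigma > thresh]

rng = np.random.default_rng(0)
sites = [rng.normal(size=100_000) for _ in range(4)]
sites[0][:5] += 8                       # inject a few global outliers
print(global_outliers(sites)[:10])      # found without moving all the data
```

In this toy run only 200 of 400,000 points (0.05%) leave their sites, in the spirit of the paper's sub-5% centralization, though the real algorithm and its 99% accuracy figure concern a different, unspecified procedure.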
https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.11588/DATA/68HOOP
This dataset contains source code and data used in the PhD thesis "Measuring the Contributions of Vision and Text Modalities in Multimodal Transformers". The dataset is split into five repositories:
1. Code and resources related to chapter 2 of the thesis (Section 2.2, method described in "Using Scene Graph Representations and Knowledge Bases").
2. Code and resources related to chapter 3 of the thesis (VALSE dataset).
3. Code and resources related to chapter 4 of the thesis: MM-SHAP measure and experiments code.
4. Code and resources related to chapter 5 of the thesis: CC-SHAP measure and experiments code related to large language models (LLMs).
5. Code and resources related to the experiments with vision and language model decoders from chapters 3, 4, and 5.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
When individuals are asked to reproduce intervals of stimuli that are intermixedly presented at various times, longer intervals are often underestimated and shorter intervals overestimated. This phenomenon may be attributed to the central tendency of time perception, and suggests that our brain optimally encodes a stimulus interval based on current stimulus input and prior knowledge of the distribution of stimulus intervals. Two distinct systems are thought to be recruited in the perception of sub- and supra-second intervals. Sub-second timing is subject to local sensory processing, whereas supra-second timing depends on more centralized mechanisms. To clarify the factors that influence time perception, the present study investigated how both sensory modality and timescale affect the central tendency. In Experiment 1, participants were asked to reproduce sub- or supra-second intervals, defined by visual or auditory stimuli. In the sub-second range, the magnitude of the central tendency was significantly larger for visual intervals compared to auditory intervals, while visual and auditory intervals exhibited a correlated and comparable central tendency in the supra-second range. In Experiment 2, the ability to discriminate sub-second intervals in the reproduction task was controlled across modalities by using an interval discrimination task. Even when the ability to discriminate intervals was controlled, visual intervals exhibited a larger central tendency than auditory intervals in the sub-second range. In addition, the magnitude of the central tendency for visual and auditory sub-second intervals was significantly correlated. These results suggest that a common modality-independent mechanism is responsible for the supra-second central tendency, and that both the modality-dependent and modality-independent components of the timing system contribute to the central tendency in the sub-second range.
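The 'optimal encoding' account invoked above is commonly formalized as Bayesian cue combination: the reproduced interval is a reliability-weighted average of the noisy sensory measurement and the prior mean, which pulls long intervals down and short ones up. The sketch below illustrates this under Gaussian assumptions; the numbers are illustrative, not fit to this study's data.

```python
def bayes_estimate(measured_ms, prior_mean_ms, sigma_sense, sigma_prior):
    """Posterior mean for a Gaussian prior times a Gaussian likelihood:
    a reliability-weighted average of measurement and prior mean."""
    w = sigma_prior**2 / (sigma_prior**2 + sigma_sense**2)
    return w * measured_ms + (1 - w) * prior_mean_ms

# Noisier sensing (larger sigma_sense) -> smaller w -> estimates regress
# more strongly toward the 600 ms prior mean (stronger central tendency).
for sigma_sense in (50, 150):           # illustrative noise levels, in ms
    print(sigma_sense, [round(bayes_estimate(t, 600, sigma_sense, 100))
                        for t in (300, 600, 900)])
```

On this account, the larger central tendency for visual sub-second intervals would correspond to a noisier visual measurement and hence a larger weight on the prior.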
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Oral diseases affect nearly 3.5 billion people, with the majority residing in low- and middle-income countries. Due to limited healthcare resources, many individuals are unable to access proper oral healthcare services. Image-based machine learning technology is one of the most promising approaches to improving oral healthcare services and reducing patient costs. Openly accessible datasets play a crucial role in facilitating the development of machine learning techniques. However, existing dental datasets have limitations such as a scarcity of Cone Beam Computed Tomography (CBCT) data, lack of matched multi-modal data, and insufficient complexity and diversity of the data. This project addresses these challenges by providing a dataset that includes 329 CBCT images from 169 patients, multi-modal data with matching modalities, and images representing various oral health conditions.
700,000 sets of images and descriptions. The image types include landscapes, animals, flowers and trees, people, cars, sports, industry, and buildings, plus an aesthetic subset. Each image has no fewer than two descriptions of one sentence each; a small number of images have only one description. The descriptions are in English and Chinese.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Brazil Lending Rate: Average: Financial System: Nonearmarked: Households: Overdraft data was reported at 12.400 % per Month in Jun 2018. This records a decrease from the previous number of 12.500 % per Month for May 2018. Brazil Lending Rate: Average: Financial System: Nonearmarked: Households: Overdraft data is updated monthly, averaging 12.800 % per Month from Dec 2015 (Median) to Jun 2018, with 31 observations. The data reached an all-time high of 12.900 % per Month in Apr 2017 and a record low of 11.900 % per Month in Dec 2015. Brazil Lending Rate: Average: Financial System: Nonearmarked: Households: Overdraft data remains active status in CEIC and is reported by the Central Bank of Brazil. The data is categorized under Brazil Premium Database’s Interest and Foreign Exchange Rates – Table BR.MC003: Lending Rate: By Modality.
https://www.gnu.org/licenses/gpl-3.0-standalone.html
README.md includes the Mudestreda description and the images Mudestreda.png and Mudestreda_Stage.png. The tool states are Sharp, Used, and Dulled.
Read Me Raw Data: read me file describing the data files.
Call Duration Raw Data (CallDurationRaw.txt): raw data for female responses to the call duration stimuli.
Call Period Raw Data (CallPeriodRaw.txt): raw data for female responses to the call period stimuli.
Call Frequency Raw Data (CallFrequencyRaw.txt): raw data for female responses to the call frequency stimuli.
Call Duration Preference Function Data (CallDurationPF.txt): descriptive measurements for female preference functions measured in response to the call duration stimuli.
Call Period Preference Function Data (CallPeriodPF.txt): descriptive measurements for female preference functions measured in response to the call period stimuli.
Call Frequency Preference Function Data (CallFrequencyPF.txt): descriptive measurements for female preference functions measured in response to the call frequency stimuli.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset on Modal (Im)possibility in isiNdebele, isiZulu, siSwati, Xitsonga, and isiXhosa © 2024 by Thera Marie Crane, Remah Lubambo, M Petrus Mabena, Cordelia Nkwinika, Muhle Sibisi, Onelisa Slater is licensed under CC BY 4.0
This spreadsheet is the work of Thera Marie Crane (University of Helsinki), Remah Lubambo (University of South Africa), M Petrus Mabena (University of South Africa), Cordelia Nkwinika (University of South Africa), Muhle Sibisi (University of KwaZulu-Natal), and Onelisa Slater (Rhodes University).
This is a work in progress and may have typos and mistakes. Please use with caution and contact one of the authors regarding any questions or uncertainties. The information reflects our current views and understanding and may be updated over time.
Contact information (corresponding author):
thera.crane@helsinki.fi
theracrane@gmail.com
Current version (17 Feb 2024): v1.0.0, DOI: 10.5281/zenodo.10673293
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The AVCaps dataset is an audio-visual captioning resource designed to advance research in multimodal machine perception. Derived from the VidOR dataset, it features 2061 video clips spanning a total of 28.8 hours.
For each clip, the dataset provides:
AVCaps is a valuable resource for researchers working on tasks such as multimodal captioning, audio-visual alignment, and video content understanding. By providing separate and combined modality-specific annotations, it enables fine-grained studies in the interaction and alignment of audio and visual modalities.
The video clips are provided in three ZIP files:
train_videos.zip: 1661 training clips
val_videos.zip: 200 validation clips
test_videos.zip: 200 testing clips

The captions are available in three JSON files:
train_captions.json
val_captions.json
test_captions.json
Each JSON file contains entries with video filenames as keys, and the corresponding values include audio captions, visual captions, audio-visual captions, and LLM-generated audio-visual captions.
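Given that structure, loading a split's captions takes a few lines of Python; note that the inner key names below are assumptions, since the exact field names are not spelled out above, so check the actual files.

```python
import json

# The top-level keys are video filenames (as described above); the inner
# key names here are assumed for illustration; verify against the files.
with open("train_captions.json") as f:
    captions = json.load(f)

for filename, entry in list(captions.items())[:3]:
    print(filename)
    print("  audio       :", entry["audio_captions"])
    print("  visual      :", entry["visual_captions"])
    print("  audio-visual:", entry["audio_visual_captions"])
    print("  LLM AV      :", entry["llm_av_captions"])
```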