License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
Summary statistics are fundamental to data science and are the building blocks of statistical reasoning. Most of the data and statistics made available on government web sites are aggregate; however, until now, we have not had a suitable linked data representation available. We propose a way to express summary statistics across aggregate groups as linked data using Web Ontology Language (OWL) Class-based sets, where members of the set contribute to the overall aggregate value. Additionally, many clinical studies in the biomedical field rely on demographic summaries of their study cohorts and the patients assigned to each arm. While most data query languages, including SPARQL, allow for computation of summary statistics, they do not provide a way to integrate those values back into the RDF graphs they were computed from. We represent this knowledge, which would otherwise be lost, through the use of OWL 2 punning semantics, the expression of aggregate grouping criteria as OWL classes with variables, and constructs from the Semanticscience Integrated Ontology (SIO) and the World Wide Web Consortium's provenance ontology, PROV-O, providing interoperable representations that are well supported across the web of Linked Data. We evaluate these semantics using a Resource Description Framework (RDF) representation of patient case information from the Genomic Data Commons, a data portal from the National Cancer Institute.
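As a rough illustration of the idea (not the paper's actual vocabulary: the class, predicate, and file names below are hypothetical, and the paper attaches such values with SIO and PROV-O constructs rather than an ad hoc predicate), a SPARQL aggregate can be computed over an RDF graph with rdflib and the resulting value attached back to the group's OWL class treated as an individual:

```python
from rdflib import Graph, Literal, Namespace, XSD

EX = Namespace("http://example.org/")  # hypothetical namespace for this sketch

g = Graph()
g.parse("gdc_cases.ttl", format="turtle")  # hypothetical RDF export of patient cases

# SPARQL can compute the summary statistic, but cannot by itself
# write the value back into the graph it was computed from.
result = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT (AVG(?age) AS ?meanAge) WHERE {
        ?case a ex:FemalePatientCase ;
              ex:ageAtDiagnosis ?age .
    }
""")
mean_age = next(iter(result))[0]

# Re-attach the aggregate to the group. Using the OWL class
# ex:FemalePatientCase in the subject position relies on OWL 2 punning
# (the class is also treated as an individual).
g.add((EX.FemalePatientCase, EX.meanAgeAtDiagnosis,
       Literal(float(mean_age), datatype=XSD.double)))
g.serialize("gdc_cases_with_summaries.ttl", format="turtle")
```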
License: U.S. Government Works, https://www.usa.gov/government-works
This part of the data release includes graphical representation (figures) of data from sediment cores collected in 2009 offshore of Palos Verdes, California. This file graphically presents combined data for each core (one core per page). Data on each figure are continuous core photograph, CT scan (where available), graphic diagram core description (graphic legend included at right; visual grain size scale of clay, silt, very fine sand [vf], fine sand [f], medium sand [med], coarse sand [c], and very coarse sand [vc]), multi-sensor core logger (MSCL) p-wave velocity (meters per second) and gamma-ray density (grams per cc), radiocarbon age (calibrated years before present) with analytical error (years), and pie charts that present grain-size data as percent sand (white), silt (light gray), and clay (dark gray). This is one of seven files included in this U.S. Geological Survey data release that include data from a set of sediment cores acquired from the continental slope, offshore L ...
License: https://paper.erudition.co.in/terms
Question Paper Solutions of the chapter Data Representation of Computer Architecture, 2nd Semester, Bachelor of Computer Applications
License: https://paper.erudition.co.in/terms
Question Paper Solutions of the chapter Data Representation of Computer Architecture, 2nd Semester, Bachelor of Computer Applications, 2020-2021
Users should cite: Asgari E, Mofrad MRK. Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics. PLoS ONE 10(11): e0141287. doi:10.1371/journal.pone.0141287. This archive also contains the family classification data used in the above-mentioned PLoS ONE paper. These data can be used as a benchmark for the family classification task.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This repository includes a Linked Data representation of the covid19-ita dataset provided by the Department of Civil Protection in Italy, following the RDF Data Cube Vocabulary and the KPIOnto ontology. The dataset includes measurements of various indicators related to the spread of COVID-19 at the province, regional, and country levels, on a daily basis from February 24th to November 20th, 2020. The RDF format allows statistical multidimensional data to be described as Linked Data on the Web. As such, each data point is represented as an observation of the relevant measures along two dimensions, namely time and geographical area.
The project is also available on GitHub.
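For orientation, a single daily observation in the RDF Data Cube Vocabulary could be asserted roughly as sketched below with rdflib; the dataset URI, dimension properties, measure, and value shown here are assumptions for illustration, not the actual covid19-ita/KPIOnto terms:

```python
from rdflib import Graph, Literal, Namespace, RDF, XSD

QB = Namespace("http://purl.org/linked-data/cube#")   # RDF Data Cube Vocabulary
EX = Namespace("http://example.org/covid19-ita/")     # hypothetical URIs for this sketch

g = Graph()
g.bind("qb", QB)

obs = EX["obs/lombardia/2020-03-01/newCases"]
g.add((obs, RDF.type, QB.Observation))
g.add((obs, QB.dataSet, EX.dataset))                                   # hypothetical dataset URI
g.add((obs, EX.refArea, EX["region/lombardia"]))                       # geographical dimension (assumed property)
g.add((obs, EX.refPeriod, Literal("2020-03-01", datatype=XSD.date)))   # time dimension (assumed property)
g.add((obs, EX.newCases, Literal(1254, datatype=XSD.integer)))         # measure (assumed property and value)

print(g.serialize(format="turtle"))
```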
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Public speaking is an important skill, the acquisition of which requires dedicated and time-consuming training. In recent years, researchers have started to investigate automatic methods to support public speaking skills training. These methods include assessment of the trainee's oral presentation delivery skills, which may be accomplished through automatic understanding and processing of social and behavioral cues displayed by the presenter. In this study, we propose an automatic scoring system for presentation delivery skills that uses a novel active data representation method to automatically rate segments of a full video presentation. Most previous approaches employ a two-step strategy in which multiple events are first detected and then classified; this requires annotated data to build the individual event detectors and a data representation derived from their output for classification. Our method requires no event detectors: the proposed data representation is generated in an unsupervised manner from low-level audiovisual descriptors using self-organizing maps, and is then used for video classification. This representation is also used to analyse video segments within a full video presentation in terms of several characteristics of the presenter's performance. The audio representation provides the best prediction results for self-confidence and enthusiasm, posture and body language, structure and connection of ideas, and overall presentation delivery. The video data representation provides the best results for presentation of relevant information with good pronunciation, usage of language according to audience, and maintenance of adequate voice volume for the audience. The fusion of audio and video data provides the best results for eye contact. Applications of the method to the provision of feedback to teachers and trainees are discussed.
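A minimal sketch of one plausible reading of this pipeline, assuming frame-level audiovisual descriptors have already been extracted: the minisom package trains a self-organizing map, and each video segment is then summarised as a normalised histogram of winning map nodes, i.e. a fixed-length, unsupervised representation of the kind described above (map size, descriptor dimensionality, and segmenting are assumptions):

```python
import numpy as np
from minisom import MiniSom

# One row of low-level audiovisual features per frame (placeholder for real descriptors).
descriptors = np.random.rand(5000, 32)
segments = np.array_split(descriptors, 50)    # frames grouped into 50 video segments

# Train a small self-organizing map on all frame-level descriptors (unsupervised).
som = MiniSom(8, 8, input_len=descriptors.shape[1], sigma=1.0, learning_rate=0.5)
som.train_random(descriptors, num_iteration=10000)

def segment_representation(frames: np.ndarray) -> np.ndarray:
    """Histogram of SOM node activations over a segment, L1-normalised."""
    hist = np.zeros(8 * 8)
    for frame in frames:
        i, j = som.winner(frame)      # best-matching map node for this frame
        hist[i * 8 + j] += 1
    return hist / max(hist.sum(), 1)

X = np.vstack([segment_representation(s) for s in segments])
# X (one fixed-length vector per segment) can then be fed to a classifier or
# regressor that predicts the delivery-skill ratings.
print(X.shape)  # (50, 64)
```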
License: MIT License, https://opensource.org/licenses/MIT
This record contains the data and code for the paper: Guanzhou Ke, Yang Yu, Guoqing Chao, Xiaoli Wang, Chenyang Xu, and Shengfeng He. 2023. "Disentangling Multi-view Representations Beyond Inductive Bias." In Proceedings of the 31st ACM International Conference on Multimedia (MM '23), October 29–November 3, 2023, Ottawa, ON, Canada. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3581783.3611794
dmrib-weights is the file containing the pre-trained weights. DMRIB-main is a copy of the project's GitHub repository at https://github.com/Guanzhou-Ke/DMRIB, the official repository for "Disentangling Multi-view Representations Beyond Inductive Bias" (DMRIB). Status: accepted at ACM MM 2023.
Training steps
We show how DMRIB is trained on the EdgeMnist dataset. Before training, set CUDA_VISIBLE_DEVICES, because faiss will otherwise use all GPUs; this can cause errors if you use tensor.to() to select a specific device.
1. Set the environment: export CUDA_VISIBLE_DEVICES=0
2. Train the pretext model. First, run the pretext training script src/train_pretext.py. We train a SimCLR-style self-supervised model to mine neighbor information. The pretext configs are kept under configs/pretext. Run the following command in your terminal: python train_pretext.py -f ./configs/pretext/pretext_EdgeMnist.yaml
3. Train the self-label clustering model. Use the pretext model to train the clustering model via src/train_scan.py: python train_scan.py -f ./configs/scan/scan_EdgeMnist.yaml
After that, fine-tune the clustering model with src/train_selflabel.py: python train_selflabel.py -f ./configs/scan/selflabel_EdgeMnist.yaml
4. Train the view-specific encoders and the disentangling stage. Finally, set the self-label clustering model as the consistent encoder and train the second stage via src/train_dmrib.py: python train_dmrib.py -f ./configs/dmrib/dmrib_EdgeMnist.yaml
Validation
Note: the pre-trained weights can be found in the dmrib-weights file. Put the pretrained models into the following folder path: {config.train.log_dir}/{results}/{config.dataset.name}/eid-{config.experiment_id}/dmrib/final_model.pth, respectively. For example, to validate on the EdgeMnist dataset, the default folder is ./experiments/results/EdgeMnist/eid-0/dmrib; put the pretrained model edge-mnist.pth into this folder and rename it to final_model.pth. If you do not want to use the default settings, modify line 58 of validate.py. Then run: python validate.py -f ./configs/dmrib/dmrib_EdgeMnist.yaml
Credit
Thanks: Van Gansbeke, Wouter, et al. "SCAN: Learning to classify images without labels." Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part X. Cham: Springer International Publishing, 2020.
Citation
Guanzhou Ke, Yang Yu, Guoqing Chao, Xiaoli Wang, Chenyang Xu, and Shengfeng He. 2023. Disentangling Multi-view Representations Beyond Inductive Bias. In Proceedings of the 31st ACM International Conference on Multimedia (MM '23), October 29–November 3, 2023, Ottawa, ON, Canada. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3581783.3611794
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This is the main bulk data repository for the Digitally Accountable Public Representation (DAPR) Database, an innovative archive that systematically tracks and analyzes the online communications of federal, state, and local officials in the U.S. Focusing on X/Twitter and Facebook, the current database includes 28,834 public officials, their demographic information, and 5,769,904 Tweets along with 450,972 Facebook posts, dating from January 2020 to December 2024 for X/Twitter and January 2020 to December 2021 for Facebook. It offers a rich historical perspective on digital political discourse by elected officials in the U.S. To comply with the terms of data access on platform APIs, the raw post-level data is aggregated to the week level, and we disseminate content information as bags-of-words rather than the original raw text of posts. The data does include URLs to the original posts, which can be used to "rehydrate" the original data. We also distribute metadata on the officials in the DAPR database. Due to the size of the individual files, we disseminate each post data file in a compressed format. Note that due to changes in the Twitter/X API in 2023, as well as funding for the DAPR project, the scope and coverage of data collected from X shifted in 2024. See readme.pdf for details on the data as well as a description of how the data collection was affected by changes in the X API.
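To make the week-level bag-of-words aggregation concrete, it can be built from post-level text roughly as sketched below; the column names and example rows are hypothetical, since the released DAPR files are already aggregated and only their URLs can be used to rehydrate the original posts:

```python
from collections import Counter
import pandas as pd

# Hypothetical post-level table, for illustration only.
posts = pd.DataFrame({
    "official_id": [1, 1, 2],
    "date": pd.to_datetime(["2020-01-06", "2020-01-08", "2020-01-07"]),
    "text": ["infrastructure bill vote", "vote today", "town hall friday"],
})

# Bucket posts by ISO week.
posts["week"] = posts["date"].dt.to_period("W").astype(str)

def bag_of_words(texts: pd.Series) -> dict:
    """Token counts over all posts in the group (simple whitespace tokens)."""
    return dict(Counter(tok for t in texts for tok in t.lower().split()))

# One bag-of-words per official per week; the raw text is not retained.
weekly = (
    posts.groupby(["official_id", "week"])["text"]
         .apply(bag_of_words)
         .reset_index(name="bag_of_words")
)
print(weekly)
```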
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Attribute reclassification for fixed amplitude and varying Cmin.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature coverage. This can lead to RL agents encoding these misleading correlations in their latent representation, preventing the agent from generalising if the correlation changes within the environment or when deployed in the real world. Disentangled representations can improve robustness, but existing disentanglement techniques that minimise mutual information between features require independent features and thus cannot disentangle correlated features. We propose an auxiliary task for RL algorithms that learns a disentangled representation of high-dimensional observations with correlated features by minimising the conditional mutual information between features in the representation. We demonstrate experimentally, using continuous control tasks, that our approach improves generalisation under correlation shifts, as well as improving the training performance of RL algorithms in the presence of correlated features.
This is the data for the experimental results in the paper 'Conditional Mutual Information for Disentangled Representations in Reinforcement Learning' (https://arxiv.org/abs/2305.14133). These files contain the evaluation returns for all algorithms and seeds used to create Figures 4 and 5 in the paper. Further details are provided in the README file.
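For reference, conditional mutual information in its standard form is shown below; the auxiliary task minimises this quantity between representation features, although the paper's exact estimator and conditioning variable are not restated here:

```latex
I(X; Y \mid Z) = \mathbb{E}_{p(x,y,z)}\!\left[ \log \frac{p(x, y \mid z)}{p(x \mid z)\, p(y \mid z)} \right]
```

This quantity is zero exactly when X and Y are independent given Z, which is why features that co-vary only through a shared conditioning factor can still be disentangled, unlike with unconditional mutual information.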
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
The dataset contains the list of 213 journal articles and book chapters that deal with empirical evidence on femicide in the media, available in Scopus, Web of Science (WoS), and EBSCOhost. The dataset includes three sheets in one file: the first contains a list of the 213 documents available in the three databases, including their bibliographic information, a general description, and keywords describing the main results; the second sheet contains the codes for the variables; and the third sheet contains the list of a subgroup of articles available in Scopus and Web of Science (WoS).
License: https://paper.erudition.co.in/terms
Question Paper Solutions of the chapter Data Representation of Computer Architecture, 2nd Semester, Bachelor of Computer Applications, 2023-2024
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
From hyperbard.net: "Hyperbard is a new dataset of diverse relational data representations derived from William Shakespeare’s plays. Our representations range from simple graphs capturing character co-occurrence in single scenes to hypergraphs encoding complex communication settings and character contributions as hyperedges with edge-specific node weights. By making multiple intuitive representations readily available for experimentation, we facilitate rigorous representation robustness checks in graph learning, graph mining, and network analysis, highlighting the advantages and drawbacks of specific representations."
If using these data, please cite "All the world’s a (hyper)graph: A data drama" by Corinna Coupette, Jilles Vreeken, and Bastian Rieck.
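As a hedged illustration of the simplest of these representations, a character co-occurrence graph can be built with networkx from a scene-by-character table; the CSV layout and column names below are assumptions, not Hyperbard's actual file format:

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Hypothetical table of who appears in which scene (columns: scene_id, character).
appearances = pd.read_csv("romeo_and_juliet_scenes.csv")

G = nx.Graph()
for _, group in appearances.groupby("scene_id"):
    chars = sorted(set(group["character"]))
    # Connect every pair of characters sharing a scene; count co-occurrences as weights.
    for u, v in combinations(chars, 2):
        if G.has_edge(u, v):
            G[u][v]["weight"] += 1
        else:
            G.add_edge(u, v, weight=1)

print(G.number_of_nodes(), "characters,", G.number_of_edges(), "co-occurrence edges")
```

The hypergraph representations described above would instead keep each scene as a single hyperedge with per-character weights, which this pairwise projection cannot capture.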
License: https://www.law.cornell.edu/uscode/text/17/106
Graph data represents complex relationships across diverse domains, from social networks to healthcare and chemical sciences. However, real-world graph data often spans multiple modalities, including time-varying signals from sensors, semantic information from textual representations, and domain-specific encodings. This dissertation introduces innovative multimodal learning techniques for graph-based predictive modeling, addressing the intricate nature of these multidimensional data representations. The research systematically advances graph learning through innovative methodological approaches across three critical modalities. Initially, we establish robust graph-based methodological foundations through advanced techniques, including prompt tuning for heterogeneous graphs and a comprehensive framework for imbalanced learning on graph data. We then extend these methods to time series analysis, demonstrating their practical utility through applications such as hierarchical spatio-temporal modeling for COVID-19 forecasting and graph-based density estimation for anomaly detection in unmanned aerial systems. Finally, we explore textual representations of graphs in the chemical domain, reformulating reaction yield prediction as an imbalanced regression problem to enhance performance in underrepresented high-yield regions critical to chemists.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This MATLAB code is part of the study titled "Joint Image Processing with Learning-Driven Data Representation and Model Behavior for Non-Intrusive Anemia Diagnosis in Pediatric Patients", which has been accepted for publication in the Journal of Imaging (MDPI). The code supports image processing, feature extraction, and deep learning model training (including LSTM and RexNet) to classify pediatric patients as anemic or non-anemic based on palm, conjunctival, and fingernail images. Full study details are available in this paper:
Berghout T. Joint Image Processing with Learning-Driven Data Representation and Model Behavior for Non-Intrusive Anemia Diagnosis in Pediatric Patients. Journal of Imaging. 2024; 10(10):245. https://doi.org/10.3390/jimaging10100245
The datasets used in this work are:
Asare, J. W., Appiahene, P. & Donkoh, E. (2022). Anemia Detection using Palpable Palm Image Datasets from Ghana. Mendeley Data. https://doi.org/10.17632/ccr8cm22vz.1
Asare, J. W., Appiahene, P. & Donkoh, E. (2023). CP-AnemiC (A Conjunctival Pallor) Dataset from Ghana. Mendeley Data. https://doi.org/10.17632/m53vz6b7fx.1
Asare, J. W., Appiahene, P. & Donkoh, E. (2020). Detection of Anemia using Colour of the Fingernails Image Datasets from Ghana. Mendeley Data. https://doi.org/10.17632/2xx4j3kjg2.1
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Raw data and analysis scripts associated with Experiment 3 in the paper "The Cost of Multiple Representations: Learning Number Symbols with Abstract and Concrete Representations".
License: Apache License 2.0, https://www.apache.org/licenses/LICENSE-2.0.html
This is the data for the R program.
Visual reinforcement learning (RL) has made significant progress in recent years, but the choice of visual feature extractor remains a crucial design decision.
License: U.S. Government Works, https://www.usa.gov/government-works
Lists of diplomatic representations and international organizations with the same privileges (zero value-added tax) since 2018, with amendments