Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Sets of Cause-Effect Pairs first used in the following paper:
Answering Binary Causal Questions Through Large-Scale Text Mining: An Evaluation Using Cause-Effect Pairs from Human Experts
Oktie Hassanzadeh, Debarun Bhattacharjya, Mark Feblowitz, Kavitha Srinivas, Michael Perrone, Shirin Sohrabi, Michael Katz
IJCAI 2019
@inproceedings{Hassanzadeh19,
  author    = {Oktie Hassanzadeh and Debarun Bhattacharjya and Mark Feblowitz and Kavitha Srinivas and Michael Perrone and Shirin Sohrabi and Michael Katz},
  title     = {Answering Binary Causal Questions Through Large-Scale Text Mining: An Evaluation Using Cause-Effect Pairs from Human Experts},
  booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, {IJCAI} 2019, August 10-16, 2019, Macao, China},
  year      = {2019}
}
See README.txt for details.
$ wc -l *
    319 ce_me_benchmark_v1.csv
    118 nato_sfa_benchmark_v1.csv
    804 risk_models_benchmark_v1.csv
   1730 semeval_benchmark_v1.csv
   2971 total
$ ls -lh * | awk '{print $5,$9}'
23K ce_me_benchmark_v1.csv
11K nato_sfa_benchmark_v1.csv
73K risk_models_benchmark_v1.csv
42K semeval_benchmark_v1.csv
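For a quick cross-check of the listing above, the benchmark CSVs can be inspected in Python. The sketch below only counts rows and columns, since the column meanings are documented in README.txt and are not assumed here; it assumes the files sit in the current directory.

# Sketch: inspect the four benchmark CSVs listed above.
# Column semantics are described in README.txt and are not assumed here.
import csv

files = [
    "ce_me_benchmark_v1.csv",
    "nato_sfa_benchmark_v1.csv",
    "risk_models_benchmark_v1.csv",
    "semeval_benchmark_v1.csv",
]

for name in files:
    with open(name, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    # Parsed row counts should roughly match the `wc -l` output above
    # (they can differ if any field contains embedded newlines).
    print(f"{name}: {len(rows)} rows, {len(rows[0])} columns in the first row")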
The NATO SFA Benchmark was created from the tables in the Appendix of the following publicly available document: STRATEGIC FORESIGHT ANALYSIS 2017 REPORT
Links:
https://www.act.nato.int/images/stories/media/doclibrary/171004_sfa_2017_report_hr.pdf
https://www.act.nato.int/images/stories/media/doclibrary/171004_sfa_2017_report_txt.pdf
https://www.act.nato.int/futures-work
SemEval data is published under the Creative Commons Attribution 3.0 Unported license: https://creativecommons.org/licenses/by/3.0/
Details: https://docs.google.com/document/d/1QO_CnmvNRnYwNWu1-QCAeR5ToQYkXUqFeAJbdEhsq7w/preview
Original source: https://drive.google.com/file/d/0B_jQiLugGTAkMDQ5ZjZiMTUtMzQ1Yy00YWNmLWJlZDYtOWY1ZDMwY2U4YjFk/view?sort=name&layout=list&num=50
The rest of the data sets are covered by the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA 4.0) license: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE DATA OR THE USE OR OTHER DEALINGS IN THE DATA.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Sudip Ghosh
Released under CC0: Public Domain
Large language models are enabling rapid progress in robotic verbal communication, but nonverbal communication is not keeping pace. Physical humanoid robots struggle to express and communicate using facial movement, relying primarily on voice. The challenge is twofold: first, actuating an expressively versatile robotic face is mechanically challenging; second, the robot must know what expression to generate so that it appears natural, timely, and genuine. Here we propose that both barriers can be alleviated by training a robot to anticipate future facial expressions and execute them simultaneously with a human. Whereas delayed facial mimicry looks disingenuous, facial co-expression feels more genuine, since it requires correctly inferring the human's emotional state for timely execution. We find that a robot can learn to predict a forthcoming smile about 839 milliseconds before the human smiles and, using a learned inverse kinematic facial self-model, co-express the smile simultaneously with the human.

During the data collection phase, the robot generated symmetrical facial expressions, which we expected to cover most situations and to reduce the size of the model. We used an Intel RealSense D435i to capture RGB images and cropped them to 480 x 320. We logged each motor command value together with the robot images to form a single data pair, without any human labeling.

# Dataset for Paper "Human-Robot Facial Co-expression"
This dataset accompanies the research on human-robot facial co-expression, aiming to enhance nonverbal interaction by training robots to anticipate and simultaneously execute human facial expressions. Our study proposes a method where robots can learn to predict forthcoming human facial expressions and execute them in real time, thereby making the interaction feel more genuine and natural.
https://doi.org/10.5061/dryad.gxd2547t7
The dataset is organized into several zip files, each containing different components essential for replicating our study's results or for use in related research projects.
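For illustration only, here is a minimal Python sketch of how a single training pair of the kind described above (a cropped RGB frame plus the corresponding motor command vector) might be assembled. The frame here is synthetic, the 480 x 320 crop size comes from the description, and the motor count, crop orientation, and file name are hypothetical.

# Hypothetical sketch: form one (image, motor command) data pair as described
# in the data collection notes. The frame is synthetic; in the real setup,
# frames came from an Intel RealSense D435i and were cropped to 480 x 320.
import numpy as np

frame = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)  # stand-in RGB frame

# Center-crop to 480 x 320 (width x height assumed), per the dataset description.
h, w = 320, 480
y0 = (frame.shape[0] - h) // 2
x0 = (frame.shape[1] - w) // 2
cropped = frame[y0:y0 + h, x0:x0 + w]

motor_command = np.random.uniform(0.0, 1.0, size=13)  # hypothetical number of facial motors

# Log the pair without any human labeling, e.g. as a compressed .npz file.
np.savez_compressed("pair_000001.npz", image=cropped, motors=motor_command)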
This data release includes three sets of data collected for a farm- and field-focused phosphorus reduction study in south-central Wisconsin, USA. Paired samples collected in the control and treatment watersheds and analyzed for suspended sediment, total phosphorus, and total dissolved phosphorus during the calibration and post-treatment phases are presented. Samples were collected from fall 2006 through fall 2016. The data sets include: 1) paired storm event loads parsed into calibration and post-treatment periods and by the presence of frozen ground; 2) paired low flow concentrations; and 3) daily load data for each watershed separated by total flow and baseflow. These data are interpreted in journal articles to be published in the Journal of Soil and Water Conservation and the Journal of Water Quality.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
134,758 global import shipment records of Pairs Pair, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Paired Omics Data Platform is a community-based initiative that standardizes links between genomic and metabolomic data in a computer-readable format to further the field of natural products discovery. The goals are to link molecules to their producers, find large-scale genome-metabolome associations, use genomic data to assist in structural elucidation of molecules, and provide a centralized database for paired datasets. This dataset contains the projects in http://pairedomicsdata.bioinformatics.nl/.
The JSON documents adhere to the http://pairedomicsdata.bioinformatics.nl/schema.json JSON schema.
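Because each project document is stated to conform to the published JSON schema, a submission can be checked locally before upload. The sketch below assumes a hypothetical local file project.json holding one project document and uses the requests and jsonschema packages.

# Sketch: validate a paired-omics project document against the platform schema.
# "project.json" is a hypothetical local file holding one project document.
import json
import requests
from jsonschema import validate, ValidationError

schema = requests.get("http://pairedomicsdata.bioinformatics.nl/schema.json", timeout=30).json()

with open("project.json", encoding="utf-8") as f:
    project = json.load(f)

try:
    validate(instance=project, schema=schema)
    print("project.json conforms to the schema")
except ValidationError as err:
    print("validation failed:", err.message)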
Overview
Off-the-shelf parallel corpus data (Translation Data) covers many fields, including spoken language, traveling, medical treatment, news, and finance. Data cleaning, desensitization, and quality inspection have been carried out.
Specifications
Storage format: TXT
Data content: Parallel Corpus Data
Data size: 200 million pairs
Language: 20 languages
Application scenario: machine translation
Accuracy rate: 90%
About Nexdata
Nexdata owns off-the-shelf PB-level Large Language Model (LLM) data, 1 million hours of audio data, and 800 TB of annotated imagery data. These ready-to-go translation data sets support instant delivery and can quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/nlu?source=Datarade
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data set contains the simulation data, input files and post-processing files used in the publication "Jetting mechanisms in bubble-pair interactions".
Agencies can upload their translation pairs to this dataset. Languages available: Amharic, Arabic, Armenian, Burmese, Chinese_simplified, Chinese_traditional, Farsi/Dari, Hmong, Korean, Russian, Somali, Spanish, Tagalog, Urdu, Vietnamese.
File name format: [agencyName]_[sourceLanguage]_[targetLanguage]. For example, cityofsanjose_english_spanish, where the agency name "City of San Jose" becomes cityofsanjose, the source language "English" becomes english, and the target language "Spanish" becomes spanish.
Schema format: 8 columns: source_text, target_text, source_language, target_language, source_dialect_optional, target_dialect_optional, human_verification, user_tested.
Upload in TSV, CSV, or Excel file formats. Please use utf-8-sig character encoding to ensure that all characters are correctly displayed in the files.
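As a worked example of the file naming and schema above, the sketch below writes a small CSV in the required utf-8-sig encoding; the two sample rows are invented for illustration.

# Sketch: write an agency translation-pair file following the naming convention
# [agencyName]_[sourceLanguage]_[targetLanguage] and the column schema above.
import csv

columns = [
    "source_text", "target_text", "source_language", "target_language",
    "source_dialect_optional", "target_dialect_optional",
    "human_verification", "user_tested",
]

rows = [  # invented example rows
    ["Welcome to City Hall.", "Bienvenido al Ayuntamiento.", "english", "spanish", "", "", "yes", "no"],
    ["Office hours are 9am-5pm.", "El horario de oficina es de 9am a 5pm.", "english", "spanish", "", "", "yes", "no"],
]

# utf-8-sig writes a BOM so non-ASCII characters display correctly, e.g. in Excel.
with open("cityofsanjose_english_spanish.csv", "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.writer(f)
    writer.writerow(columns)
    writer.writerows(rows)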
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Survey of violations (2).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Percentage distribution of the pair victim/reported person by place of birth of both. VGD (API identifier: 28317)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from http://data.europa.eu/88u/dataset/urn-ine-es-tabla-t3-312-28317 on 07 January 2022.
--- Dataset description provided by original source is as follows ---
Table of INEBase Percentage distribution of the pair victim/reported person by place of birth of both. Annual. National. Statistics on Domestic Violence and Gender Violence
--- Original source retains full ownership of the source dataset ---
Capture histories used to model pair fidelity and survival in E-SURGE. Data include capture histories (column 'H:') for a) female blue and great tits of Wytham Woods; b) male blue and great tits of Wytham Woods; c) female great tits of Wytham and Bagley Woods; d) male great tits of Wytham and Bagley Woods. For each dataset, the capture history is followed by the column 'S:' (the number of individuals with that capture history), the column '$COV:Mgp' (covariate coding for age, either Juvenile or Adult), and the column '$COV:Sp' or '$COV:Pop' (coding for species: G = great tit, B = blue tit; or population: Bag = Bagley Wood, Wyth = Wytham Woods). These capture histories were used to model pair fidelity and survival in the program E-SURGE, as described in the Supplementary material of Culina et al. (2015). Capture histories consist of 6 different codes that describe the 'event' that happened for a particular bird in a particular season. The original data on breeding pairs come from the long-ter...
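The exact on-disk layout of the E-SURGE input is not reproduced here, but assuming whitespace-separated records with the labelled fields described above (the capture history after 'H:', the count after 'S:', then the age and species/population covariates), a record could be parsed along these lines; the sample line and field order are assumptions, not taken from the files.

# Hypothetical sketch: parse one capture-history record of the labelled form
# described above. The field layout and the sample line are assumptions;
# consult the Supplementary material of Culina et al. (2015) for the real format.
sample = "H: 010203 S: 5 $COV:Mgp Juvenile $COV:Sp G"

def parse_record(line):
    tokens = line.split()
    record = {}
    i = 0
    while i < len(tokens):
        key = tokens[i].rstrip(":")   # strips the trailing colon of 'H:' and 'S:'
        record[key] = tokens[i + 1]
        i += 2
    return record

print(parse_record(sample))  # {'H': '010203', 'S': '5', '$COV:Mgp': 'Juvenile', '$COV:Sp': 'G'}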
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The XBT/CTD pairs dataset (Version 2) contains additional datasets and updated datasets from the Version 1 data. Version 1 data was used to update the calculation of historical XBT fall rate and temperature corrections presented in Cowley, R., Wijffels, S., Cheng, L., Boyer, T., and Kizu, S. (2013). Biases in Expendable Bathythermograph Data: A New View Based on Historical Side-by-Side Comparisons. Journal of Atmospheric and Oceanic Technology, 30, 1195–1225, doi:10.1175/JTECH-D-12-00127.1. http://journals.ametsoc.org/doi/abs/10.1175/JTECH-D-12-00127.1

Version 2 contains 1,188 pairs from seven datasets that add to Version 1, which contains 4,115 pairs from 114 datasets. There are also 10 updated datasets included in Version 2. The updates apply to the CTD depth data in the quality-controlled version of those 10 datasets, and the 10 updated Version 2 datasets should be used in preference to the copies in Version 1. Each dataset contains the scientifically quality-controlled version and (where available) the originator's data. The XBT/CTD pairs are identified in the document 'XBT_CTDpairs_metadata_V2.csv'. Although the XBT data in the additional datasets were collected after 2008, many of the probes in the ss2012t01 dataset were manufactured during the mid-1980s.

Lineage: Data is sourced from the CSIRO Oceans and Atmosphere Flagship, the Australian Antarctic Division, and the Italian National Agency for New Technologies, Energy and Sustainable Economic Development. Original and raw data files are included where available. Quality-controlled datasets follow the procedure of Bailey, R., Gronell, A., Phillips, H., Tanner, E., and Meyers, G. (1994). Quality control cookbook for XBT data, Version 1.1. CSIRO Marine Laboratories Reports, 221. Quality-controlled data is in the 'MQNC' format used at CSIRO Marine and Atmospheric Research; the MQNC format is described in the document 'XBT_CTDpairs_descriptionV2.pdf'.

Note that future versions of the XBT/CTD pairs database may supersede this version. Please check more recent versions for updates to individual datasets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cause-effect is a two-dimensional database of two-variable cause-effect pairs chosen from different datasets, created by the Max Planck Institute for Biological Cybernetics in Tübingen, Germany.
Size: 83 datasets of various sizes
Number of features: 2 in every dataset
Ground truth: available for every dataset
Type of graph: directed
Extension of the datasets used in the CauseEffectPairs task. Each dataset consists of samples of a pair of statistically dependent random variables, where one variable is known to cause the other. The task is to identify, for each pair, which of the two variables is the cause and which is the effect, using the observed samples only.
More information about the dataset is contained in causal_description.html file.
Reference
J. M. Mooij, J. Peters, D. Janzing, J. Zscheischler, B. Schoelkopf: “Distinguishing cause from effect using observational data: methods and benchmarks”, Journal of Machine Learning Research 17(32):1-102, 2016
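To make the task concrete, here is a minimal Python sketch of one common baseline from this literature, an additive-noise-model style test: regress each variable on the other and prefer the direction whose residuals look less dependent on the putative cause. The data are synthetic, mutual information is used as a crude dependence score, and this is an illustration rather than any of the benchmark methods from the reference above.

# Sketch of an additive-noise-model style direction test on a synthetic pair.
# Not one of the benchmark methods; dependence is scored with a crude
# mutual-information estimate.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 1000)
y = x ** 3 + rng.normal(0, 1, 1000)   # ground truth: x causes y

def residual_dependence(cause, effect):
    # Fit effect ~ cause, then measure how dependent the residuals are on the cause.
    model = KNeighborsRegressor(n_neighbors=20).fit(cause.reshape(-1, 1), effect)
    residuals = effect - model.predict(cause.reshape(-1, 1))
    return mutual_info_regression(cause.reshape(-1, 1), residuals, random_state=0)[0]

score_xy = residual_dependence(x, y)  # residual dependence if x -> y
score_yx = residual_dependence(y, x)  # residual dependence if y -> x
print("inferred direction:", "x -> y" if score_xy < score_yx else "y -> x")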
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A list of 1,774 active global Pairs Pair buyers and a directory of global Pairs Pair importers, compiled from actual global import shipments of Pairs Pair.
asymmetric model: C code, parameter file, and Mathematica file for the asymmetric model. The positions of entries in the parameter file can be found in the "getinfo" function in the C file.
asymmetric feedback model: C code, parameter file, and Mathematica file for the asymmetric feedback model. The positions of entries in the parameter file can be found in the "getinfo" function in the C file.
symmetric model: C code, parameter file, and Mathematica file for the symmetric model. The positions of entries in the parameter file can be found in the "getinfo" function in the C file.
symmetric model with fertility sel: C code, parameter file, and Mathematica file for the symmetric model with fertility selection. The positions of entries in the parameter file can be found in the "getinfo" function in the C file.
asymmetric model with fertility sel: C code, parameter file, and Mathematica file for the asymmetric model with fertility selection. The positions of entries in the parameter file can be found...
analphipy is a Python package to calculate metrics for classical models of pair potentials. It provides a simple and extensible API for creating pair potentials, and several routines to calculate metrics are included in the package. The main features of analphipy are: 1) pre-defined spherically symmetric potentials; 2) a simple interface for extending to user-defined pair potentials; 3) routines to calculate Noro-Frenkel effective parameters; and 4) routines to calculate the Jensen-Shannon divergence.
Twelve Data is a technology-driven company that provides financial market data, financial tools, and dedicated solutions. Large audiences, from individuals to financial institutions, use our products to stay ahead of the competition and succeed.

At Twelve Data we feel responsible for where the markets are going and how people are able to explore them. Coming from different technological backgrounds, we see that the world lacks a unique and simple place where financial data can be accessed by anyone, at any time. This is what distinguishes us from others: we do not only supply financial data; we want you to benefit from it through convenient formats, tools, and special solutions.

We believe that the human factor is still a very important aspect of our work, and therefore our ethics guide how we treat people: with convenient and understandable resources. This includes world-class documentation, human support, and dedicated solutions.
Data used in the analysis presented in "Pair Correlations in Doped Hubbard Ladders". The DOI of the paper will be provided once available.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
LC-HRMS experiments detect thousands of compounds, with only a small fraction of them identified in most studies. Traditional data processing pipelines contain an alignment step to assemble the measurements of overlapping features across samples into a unified table. However, data sets acquired under nonidentical conditions are not amenable to this process, mostly due to significant alterations in chromatographic retention times. Alignment of features between disparately acquired LC-MS metabolomics data could aid collaborative compound identification efforts and enable meta-analyses of expanded data sets. Here, we describe metabCombiner, a new computational pipeline for matching known and unknown features in a pair of untargeted LC-MS data sets and concatenating their abundances into a combined table of intersecting feature measurements. metabCombiner groups features by mass-to-charge (m/z) values to generate a search space of possible feature pair alignments, fits a spline through a set of selected retention time ordered pairs, and ranks alignments by m/z, mapped retention time, and relative abundance similarity. We evaluated this workflow on a pair of plasma metabolomics data sets acquired with different gradient elution methods, achieving a mean absolute retention time prediction error of roughly 0.06 min and a weighted per-compound matching accuracy of approximately 90%. We further demonstrate the utility of this method by comprehensively mapping features in urine and muscle metabolomics data sets acquired from different laboratories. metabCombiner has the potential to bridge the gap between otherwise incompatible metabolomics data sets and is available as an R package at https://github.com/hhabra/metabCombiner and Bioconductor.
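The core matching idea described above (group features by m/z, then fit a smooth mapping between the two sets of retention times and score candidate matches by their distance from it) can be illustrated with a short Python sketch on synthetic feature tables. This is only a conceptual illustration, not the metabCombiner R API; the tolerances and synthetic data are invented.

# Conceptual sketch of the alignment idea described above, on synthetic data:
# pair features across two datasets by m/z tolerance, then fit a smooth
# retention-time mapping through the candidate pairs. Not the metabCombiner API.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)

# Synthetic feature tables: columns are (m/z, retention time in minutes).
mz = np.sort(rng.uniform(100, 900, 200))
rt_a = np.sort(rng.uniform(1, 20, 200))
rt_b = 0.5 + 0.9 * rt_a + rng.normal(0, 0.05, 200)   # dataset B uses a different gradient

dataset_a = np.column_stack([mz, rt_a])
dataset_b = np.column_stack([mz + rng.normal(0, 0.002, 200), rt_b])

# 1) Group features by m/z within an (invented) absolute tolerance.
pairs = []
for mz_a, t_a in dataset_a:
    close = np.abs(dataset_b[:, 0] - mz_a) < 0.01
    for t_b in dataset_b[close, 1]:
        pairs.append((t_a, t_b))
pairs = np.array(sorted(pairs))

# 2) Fit a spline mapping retention times of dataset A onto dataset B.
spline = UnivariateSpline(pairs[:, 0], pairs[:, 1], s=len(pairs))

# 3) Score candidate matches by how far they sit from the fitted mapping.
errors = np.abs(spline(pairs[:, 0]) - pairs[:, 1])
print(f"{len(pairs)} candidate pairs, mean |RT prediction error| = {errors.mean():.3f} min")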