http://rightsstatements.org/vocab/InC/1.0/
This dataset comprises a set of information cascades generated by Singapore Twitter users. Here, a cascade is defined as a set of tweets about the same topic. The dataset was collected via the Twitter REST and streaming APIs as follows. Starting from popular seed users (i.e., users with many followers), we crawled their follow, retweet, and user mention links. We then added those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. This yielded a total of 184,794 Twitter user accounts. We then crawled the tweets posted by these users from 1 April to 31 August 2012, obtaining 32,479,134 tweets in all. To identify cascades, we extracted all URL links and hashtags from these tweets and treated each URL link or hashtag as the identity of a cascade. In other words, all tweets that contain the same URL link (or the same hashtag) represent a cascade. Mathematically, a cascade is represented as a set of user-timestamp pairs. Figure 1 provides an example, i.e., cascade C = {<u1, t1>, <u2, t2>, <u1, t3>, <u3, t4>, <u4, t5>}. For evaluation, the dataset was split into two parts: the first four months of data for training and the last month of data for testing. Table 1 summarizes the basic (count) statistics of the dataset. Each line in each file represents a cascade. The first term in each line is a hashtag or URL; the second term is a list of user-timestamp pairs. Due to privacy concerns, all user identities are anonymized.
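The cascade construction described above can be illustrated with a minimal Python sketch. It is not the authors' code: the regular expressions, the function names `build_cascades` and `split_by_time`, the integer timestamps, and the rule for cascades that straddle the train/test cutoff are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Simplified patterns (assumption); real tweet entity extraction is more involved.
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")


def build_cascades(tweets):
    """Group (user, timestamp, text) tweets into cascades keyed by hashtag or URL."""
    cascades = defaultdict(list)
    for user, timestamp, text in tweets:
        keys = set(HASHTAG_RE.findall(text)) | set(URL_RE.findall(text))
        for key in keys:
            cascades[key].append((user, timestamp))
    # Order each cascade's user-timestamp pairs chronologically.
    for pairs in cascades.values():
        pairs.sort(key=lambda pair: pair[1])
    return dict(cascades)


def split_by_time(cascades, cutoff):
    """Split each cascade at a cutoff timestamp (hypothetical splitting rule)."""
    train, test = {}, {}
    for key, pairs in cascades.items():
        early = [p for p in pairs if p[1] < cutoff]
        late = [p for p in pairs if p[1] >= cutoff]
        if early:
            train[key] = early
        if late:
            test[key] = late
    return train, test


# Toy data mirroring the example cascade C from the description.
tweets = [
    ("u1", 1, "Rain again #sgweather"),
    ("u2", 2, "So wet today #sgweather"),
    ("u1", 3, "Still raining #sgweather http://example.com/a"),
    ("u3", 4, "#sgweather"),
    ("u4", 5, "Flooding? #sgweather"),
]
cascades = build_cascades(tweets)
train, test = split_by_time(cascades, cutoff=4)
```

Note that a user can appear more than once in a cascade (u1 above), which is why each cascade is stored as a list of pairs rather than a mapping from user to timestamp.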
http://rightsstatements.org/vocab/InC/1.0/
This dataset comprises a set of Twitter accounts in Singapore that were used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here, a bot is defined as a Twitter account that generates content and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:
Broadcast bot. This bot aims at disseminating information to a general audience by providing, e.g., benign links to news, blogs, or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).
Consumption bot. The main purpose of this bot is to aggregate content from various sources and/or provide update services (e.g., horoscope reading, weather update) for personal consumption or use.
Spam bot. This type of bot posts malicious content (e.g., to trick people by hijacking certain accounts or redirecting them to malicious sites), or aggressively promotes harmless but invalid/irrelevant content.
This categorization is general enough to cater for new, emerging types of bots (e.g., chatbots can be viewed as a special type of broadcast bot). The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and streaming APIs. Starting from popular seed users (i.e., users with many followers), their follow, retweet, and user mention links were crawled. The data collection proceeded by adding those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. Using this procedure, a total of 159,724 accounts were collected. To identify bots, the first step was to select active accounts, i.e., those that tweeted at least 15 times during April 2014. These accounts were then manually checked and labelled, and 589 of them were found to be bots. As many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked. With this, 1,024 human accounts were identified. In total, this results in 1,613 labelled accounts.
Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo’16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
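The first labelling step for the bot dataset (keeping accounts with at least 15 tweets in April 2014) could be sketched as follows. This is only an illustration under assumptions: the function name `active_accounts`, the `(user, timestamp)` input layout, and the toy data are hypothetical and not taken from the dataset or the paper.

```python
from collections import Counter
from datetime import datetime


def active_accounts(tweets, start, end, min_tweets=15):
    """Return users with at least `min_tweets` tweets in the window [start, end)."""
    counts = Counter(user for user, ts in tweets if start <= ts < end)
    return {user for user, n in counts.items() if n >= min_tweets}


# Window covering the month of April 2014.
start = datetime(2014, 4, 1)
end = datetime(2014, 5, 1)

# Hypothetical toy data: "a" tweets 15 times in April, "b" 14 times,
# and "c" tweets 20 times but outside the window.
tweets = [("a", datetime(2014, 4, day)) for day in range(1, 16)]
tweets += [("b", datetime(2014, 4, day)) for day in range(1, 15)]
tweets += [("c", datetime(2014, 3, 1))] * 20

active = active_accounts(tweets, start, end)
```

Only the accounts returned by such a filter would then go on to manual checking and labelling.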
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This record contains the data and code for the paper "SCANet: Self-Paced Semi-Curricular Attention Network for Non-Homogeneous Image Dehazing", published in the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
Requirement
Python 3.7
PyTorch 1.9.1
Network Architecture (figure omitted)
Train
Place the training and test image pairs in the data folder.
Run data/makedataset.py to generate the NH-Haze20-21-23.h5 file.
Run train.py to start training.
Test
Place the pre-trained weight in the checkpoint folder.
Place the test hazy images in the input folder.
Modify the weight name in test.py:
    parser.add_argument("--model_name", type=str, default='Gmodel_40', help='model name')
Run test.py.
The results are saved in the output folder.
Pre-trained Weight Download
The weight40 Gmodel_40.tar for the NTIRE2023 val/test datasets, i.e., the weight used in the NTIRE2023 challenge.
The weight105 Gmodel_105.tar for the NTIRE2020/2021/2023 datasets.
The weight120 Gmodel_120.tar for the NTIRE2020/2021/2023 datasets (with the 15 tested images added to the training dataset).
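As a rough illustration of the test step, the sketch below shows how a `--model_name` option like the one quoted from test.py can be overridden instead of editing the default. Only the `add_argument` call comes from the record; the helper `checkpoint_path` and the assumption that weights are stored as `<folder>/<model_name>.tar` are hypothetical and may not match the actual test.py.

```python
import argparse
import os


def build_parser():
    """Build a parser with the --model_name option quoted in the record."""
    parser = argparse.ArgumentParser(description="hypothetical test-time options")
    parser.add_argument("--model_name", type=str, default="Gmodel_40", help="model name")
    return parser


def checkpoint_path(model_name, folder="checkpoint"):
    """Assumption: weights live at <folder>/<model_name>.tar."""
    return os.path.join(folder, model_name + ".tar")


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--model_name", "Gmodel_105"])
path = checkpoint_path(args.model_name)

# With no arguments, the default weight name is used.
default_args = build_parser().parse_args([])
```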
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This record contains the underlying data/supplementary materials/appendix for the publication "Socially responsible firms" published in Journal of Financial Economics in 2016.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the replication data for the paper "The Intergenerational Mortality Tradeoff of COVID-19 Lockdown Policies" by Lin Ma, Gil Shapira, Damien de Walque, Quy-Toan Do, Jed Friedman, and Andrei Levchenko.
For steps to replicate the results, please refer to the readme file included alongside the data files.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Replication Data for "JUE Insight: Migration, Transportation Infrastructure, and the Spatial Transmission of COVID-19 in China"
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
See the readme file inside for replication steps
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This record contains the underlying research data for the publication "Extended Comprehensive Study of Association Measures for Fault Localization"; the full text is available from: https://ink.library.smu.edu.sg/sis_research/1818
Spectrum-based fault localization is a promising approach to automatically locate root causes of failures quickly. Two well-known spectrum-based fault localization techniques, Tarantula and Ochiai, measure how likely a program element is to be a root cause of failures based on profiles of correct and failed program executions. These techniques are conceptually similar to association measures that have been proposed in statistics and data mining and have been utilized to quantify the relationship strength between two variables of interest (e.g., the use of a medicine and the cure rate of a disease). In this paper, we view fault localization as a measurement of the relationship strength between the execution of program elements and program failures. We investigate the effectiveness of 40 association measures from the literature on locating bugs. Our empirical evaluations involve single-bug and multiple-bug programs. We find that there is no single best measure for all cases. Klosgen and Ochiai outperform other measures for localizing single-bug programs. For localizing multiple-bug programs, Added Value could localize the bugs with, on average, the smallest percentage of inspected code, whereas a number of other measures have similar performance. The accuracies of the measures in localizing multiple-bug programs are lower than in single-bug programs, which calls for future research.
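For readers unfamiliar with the two techniques named above, the following sketch computes the standard Tarantula and Ochiai suspiciousness scores from execution counts. It uses the commonly cited formulas rather than anything taken from this record's data or code; the variable names and the toy spectra are assumptions for illustration.

```python
import math


def tarantula(ef, ep, total_failed, total_passed):
    """Tarantula: (ef/F) / (ef/F + ep/P), where ef/ep count failed/passed runs covering the element."""
    fail_ratio = ef / total_failed if total_failed else 0.0
    pass_ratio = ep / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def ochiai(ef, ep, total_failed):
    """Ochiai: ef / sqrt(F * (ef + ep))."""
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


# Hypothetical spectra: element -> (failed runs covering it, passed runs covering it),
# for a test suite with 2 failed and 4 passed executions.
spectra = {"line 3": (2, 0), "line 7": (1, 3)}
total_failed, total_passed = 2, 4

scores = {
    element: (tarantula(ef, ep, total_failed, total_passed), ochiai(ef, ep, total_failed))
    for element, (ef, ep) in spectra.items()
}
```

In this toy case both measures rank "line 3" (covered only by failing runs) above "line 7"; the study summarized above compares many such measures by how much code must be inspected before the bug is reached.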
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This is the online appendix to the working paper "The GATT/WTO welfare effects: 1950–2015" available at: https://ink.library.smu.edu.sg/soe_research/1957/
https://spdx.org/licenses/CC0-1.0.html
Both attractiveness judgements and mate preferences vary considerably cross-culturally. We investigated whether men's preference for femininity in women's faces varies between 28 countries with diverse health conditions by analysing responses of 1972 heterosexual participants. Although men in all countries preferred feminized over masculinized female faces, we found substantial differences between countries in the magnitude of men's preferences. Using an average femininity preference for each country, we found men's facial femininity preferences correlated positively with the health of the nation, which explained 50.4% of the variation among countries. The weakest preferences for femininity were found in Nepal and strongest in Japan. As high femininity in women is associated with lower success in competition for resources and lower dominance, it is possible that in harsher environments, men prefer cues to resource holding potential over high fecundity.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This e-companion contains four sets of supporting materials for the main paper. EC.1 provides algorithmic treatments to handle key market implementation issues. EC.2 examines effects of active market intermediation on market performance and the dealer’s wealth under the controlled market experiment. EC.3 studies market liquidity and heterogeneous market participation in a randomized market environment. EC.4 includes proofs of Lemmas and Corollaries.
http://rightsstatements.org/vocab/InC/1.0/
Fieldwork conducted in August 2023 in Shandong Province, China, investigating forms of agricultural production in several sectors.
Fieldwork sites:
1. Rongcheng City, Weihai
2. Qixia, Yantai
3. Changyi, Weifang
4. Shouguang, Weifang
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These files are used to replicate all analyses in Media in a Time of Crisis: Newspaper Coverage of Covid-19 in East Asia, available at https://ink.library.smu.edu.sg/soss_research/3348/.
http://rightsstatements.org/vocab/InC/1.0/
The dataset and source code for the paper "Automating Intention Mining".
The code is based on dennybritz's implementation of Yoon Kim's paper "Convolutional Neural Networks for Sentence Classification".
By default, the code uses TensorFlow 0.12. Errors may be reported with other versions of TensorFlow because some APIs are incompatible.
By running 'online_prediction.py', you can input any sentence and check the classification result produced by a pre-trained CNN model. The model uses all sentences of the four GitHub projects as training data.
By running 'play.py', you can get the evaluation result of cross-project prediction. Please check the code for more details of the configuration. By default, it uses the four GitHub projects as training data to predict the sentences in the DECA dataset; in this setting, the categories 'aspect evaluation' and 'others' are dropped since the DECA dataset does not contain these two categories.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This record contains the underlying research data for the publication "Singaporean mothers' perception of their three-year-old child's weight status: A cross-sectional study"; the full text is available from: https://ink.library.smu.edu.sg/soss_research/2459
Objective: Inaccurate parental perception of their child's weight status is commonly reported in Western countries. It is unclear whether similar misperception exists in Asian populations. This study aimed to evaluate the ability of Singaporean mothers to accurately describe their three-year-old child's weight status verbally and visually.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This is the underlying dataset for the PhD dissertation "Debiasing a decision maker facing supply uncertainties in a newsvendor setting", available at: https://ink.library.smu.edu.sg/etd_coll/347/
Companies must be prepared to manage uncontrollable events that will disrupt their supply chain and add uncertainty to their inventory models. This thesis first studies the effect of different types of supply disruption risks on the ordering performance of profit-maximizing decision makers in a newsvendor setting. It then extends the literature on the newsvendor model by studying the effect of a Decision Support System and the effect of a Secondary Task on the ordering performance of profit-maximizing decision makers who face supply uncertainties in a newsvendor setting. Finally, implications for scholars and practitioners are discussed.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The zip file includes datasets and code used for the paper "Public health insurance and pharmaceutical innovation: Evidence from China" published in the Journal of Development Economics, January 2021.
http://rightsstatements.org/vocab/InC/1.0/
This is the underlying research data for the published PhD dissertation "Fuelling effects of unique opinion holder’s emotions on team creativity: A collective information processing perspective", available at: https://ink.library.smu.edu.sg/etd_coll/336/
This research examined the influence of a unique opinion holder's emotions on team creativity. Compared with teams that interacted with a neutral unique opinion holder, teams working with either an angry or a happy unique opinion holder were found to use qualitatively different ways of achieving creative ideas. The team dataset and code can be found on this page, whereas the main paper describes the methodology and measures.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the replication data for "Globalization and Top Income Shares" published in Journal of International Economics, Volume 125, July 2020.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Previous research has found that young adults exhibit patterns of poor sleep and that poor sleep is associated with a host of negative psychological consequences. One potential intervention to improve sleep quality is listening to music at bedtime. Although there exist previous works investigating the efficacy of listening to music as a form of sleep aid, these works have been hindered by statistically weak designs, a lack of systematic investigation of critical characteristics of music that may affect its efficacy, and limited generalizability. In light of the limitations in the existing literature, a 15-day randomized cross-over trial was carried out with 62 young adults. Participants completed 5 nights of bedtime listening in each condition (happy music vs. sad music vs. pink noise, which acted as an active control condition) over 3 weeks. Upon awakening each morning, participants rated their subjective sleep quality, current stress, positive and negative affective states, and current life satisfaction. Frequentist and Bayesian multilevel modeling revealed that happy and sad music were both beneficial for subjective sleep quality and next-morning well-being, compared with the pink noise condition; potential nuances are discussed. The current study bears potential practical applications for health-care professionals and lay individuals.