6 datasets found
  1. R codes and dataset for Visualisation of Diachronic Constructional Change...

    • researchdata.edu.au
    • bridges.monash.edu
    Updated Apr 1, 2019
    Cite
    Gede Primahadi Wijaya Rajeg (2019). R codes and dataset for Visualisation of Diachronic Constructional Change using Motion Chart [Dataset]. http://doi.org/10.26180/5c844c7a81768
    Dataset updated
    Apr 1, 2019
    Dataset provided by
    Monash University
    Authors
    Gede Primahadi Wijaya Rajeg
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Publication


    Primahadi Wijaya R., Gede. 2014. Visualisation of diachronic constructional change using Motion Chart. In Zane Goebel, J. Herudjati Purwoko, Suharno, M. Suryadi & Yusuf Al Aried (eds.). Proceedings: International Seminar on Language Maintenance and Shift IV (LAMAS IV), 267-270. Semarang: Universitas Diponegoro. doi: https://doi.org/10.4225/03/58f5c23dd8387

    Description of R codes and data files in the repository

    This repository is imported from its GitHub repo. Versioning of this figshare repository is associated with the GitHub repo's Release. So, check the Releases page for updates (the next version is to include the unified version of the codes in the first release with the tidyverse).

    The raw input data consists of two files (i.e. will_INF.txt and go_INF.txt). They represent the co-occurrence frequency of top-200 infinitival collocates for will and be going to respectively across the twenty decades of Corpus of Historical American English (from the 1810s to the 2000s).

    These two input files are used in the R code file 1-script-create-input-data-raw.r. The codes preprocess and combine the two files into a long format data frame consisting of the following columns: (i) decade, (ii) coll (for "collocate"), (iii) BE going to (for frequency of the collocates with be going to) and (iv) will (for frequency of the collocates with will); it is available in the input_data_raw.txt.

    Then, the script 2-script-create-motion-chart-input-data.R processes the input_data_raw.txt for normalising the co-occurrence frequency of the collocates per million words (the COHA size and normalising base frequency are available in coha_size.txt). The output from the second script is input_data_futurate.txt.

    Next, input_data_futurate.txt contains the relevant input data for generating (i) the static motion chart as an image plot in the publication (using the script 3-script-create-motion-chart-plot.R), and (ii) the dynamic motion chart (using the script 4-script-motion-chart-dynamic.R).
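
    The per-million-words normalisation described above can be sketched in a few lines of R. This is only an illustration: the frequency values, the corpus sizes, the `size` column and the `*_pmw` output names are assumptions, and the actual second script may be organised differently. Only the column names `decade`, `coll`, `will` and `BE going to` come from the description.

```r
# Toy long-format data with the columns described above (values are made up)
input_raw <- data.frame(
  decade = c("1810s", "1810s", "2000s", "2000s"),
  coll   = c("be", "go", "be", "go"),
  will   = c(120, 30, 900, 250),
  `BE going to` = c(2, 0, 400, 90),
  check.names = FALSE
)

# Hypothetical corpus sizes per decade (in words); the real ones are in coha_size.txt
coha_size <- data.frame(
  decade = c("1810s", "2000s"),
  size   = c(1000000, 20000000)
)

# Look up the corpus size for each row's decade
size <- coha_size$size[match(input_raw$decade, coha_size$decade)]

# Scale raw co-occurrence counts to a rate per million words
normalised <- input_raw
normalised$will_pmw     <- input_raw$will / size * 1e6
normalised$going_to_pmw <- input_raw[["BE going to"]] / size * 1e6

print(normalised)
```

    Normalising by decade size keeps the larger recent decades from looking as if every collocate had become more frequent.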

    The repository adopts the project-oriented workflow in RStudio; double-click on the Future Constructions.Rproj file to open an RStudio session whose working directory is associated with the contents of this repository.

  2. Datasets and R source code of manuscript "Adding insult to injury:...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Mar 8, 2023
    Cite
    Fernandez Declerck; Rojas; Prosnier; Teulier; Dechaume-Moncharmont; Médoc (2023). Datasets and R source code of manuscript "Adding insult to injury: anthropogenic noise intensifies predation risk by an invasive freshwater fish species" by Fernandez Declerck et al. [Dataset]. http://doi.org/10.5281/zenodo.7706393
    Available download formats: zip
    Dataset updated
    Mar 8, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Fernandez Declerck; Rojas; Prosnier; Teulier; Dechaume-Moncharmont; Médoc
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets and R source code of manuscript "Adding insult to injury: anthropogenic noise intensifies predation risk by an invasive freshwater fish species" by Fernandez Declerck et al. (submitted)

    Project
    ├── README
    ├── data_functional_response.txt
    ├── data_prey_behaviour.txt
    ├── R_script.R
    └── BINV-D-22-00447_Playback.wav

    Main dataset: 'data_functional_response.txt'. Dataset for the functional response of the predator. Fish behaviour was recorded under two noise conditions (either boat noise or ambient noise). The dataset corresponds to data frame "d" in the script. Variables description:
    - id: identity of the fish
    - condition: noise condition, either "boat noise" or "ambient noise"
    - prey_number: number of chironomid larvae introduced in the tank
    - prey_captured: number of chironomid larvae consumed
    - fish_mass: fish body mass (g)
    - swim_distance: swim distance (m)
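
    As a minimal sketch of how these variables relate, the proportion of prey consumed can be summarised per noise condition. The rows below are invented and the name `prop_captured` is an assumption; the analysis in the deposited script may differ.

```r
# Toy rows using the variable names listed above (values are invented)
d <- data.frame(
  id            = 1:4,
  condition     = c("ambient noise", "ambient noise", "boat noise", "boat noise"),
  prey_number   = c(10, 20, 10, 20),
  prey_captured = c(5, 8, 7, 15)
)

# Proportion of the introduced larvae that were consumed in each trial
d$prop_captured <- d$prey_captured / d$prey_number

# Mean proportion consumed per noise condition
summary_by_condition <- aggregate(prop_captured ~ condition, data = d, FUN = mean)
print(summary_by_condition)
```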

    Secondary dataset: 'data_prey_behaviour.txt'. Dataset for the control experiment on prey behaviour. Prey behaviour was recorded under two noise conditions (either boat noise or ambient noise). The dataset corresponds to data frame "f" in the script. We used 20 replicates: 10 for the ambient noise condition and 10 for the boat noise condition. Two focal prey larvae were observed per replicate. Each prey larva was observed during two time periods (corresponding to two noise sequences). Variables description:
    - condition: noise condition, either "boat noise" or "ambient noise"
    - replicate: number of the replicate
    - unique_id: identity of each focal larva
    - noise_sequence: number of the noise sequence (either second or third)
    - prop_inactive: proportion of time spent inactive by the focal larva
    - prop_active: proportion of time spent active by the focal larva

    R source code: 'R_script.R'. Complete analysis as one single R script. See comments for additional information.

    The last file `BINV-D-22-00447_Playback.wav` is an audio file (Waveform Audio File Format). It is the soundtrack used in the playback experiments.

  3. Market Basket Analysis

    • kaggle.com
    zip
    Updated Dec 9, 2021
    Cite
    Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
    Available download formats: zip (23875170 bytes)
    Dataset updated
    Dec 9, 2021
    Authors
    Aslan Ahmedov
    Description

    Market Basket Analysis

    Market basket analysis with Apriori algorithm

    The retailer wants to target customers with suggestions on the itemsets that a customer is most likely to purchase. I was given a retailer's dataset; the transaction data covers all the transactions that happened over a period of time. The retailer will use the results to grow the business and to suggest itemsets to customers, so that we can increase customer engagement, improve customer experience and identify customer behavior. I will solve this problem using association rules, a type of unsupervised learning technique that checks for the dependency of one data item on another data item.

    Introduction

    Association rules are most often used when you want to find associations between different objects in a set, that is, frequent patterns in a transaction database. They can tell you which items customers frequently buy together, and they allow the retailer to identify relationships between the items.

    An Example of Association Rules

    Assume there are 100 customers: 10 of them bought a computer mouse, 9 bought a mouse mat and 8 bought both. For the rule "bought computer mouse => bought mouse mat":

    • support = P(mouse & mat) = 8/100 = 0.08
    • confidence = support/P(mouse) = 0.08/0.10 = 0.80
    • lift = confidence/P(mat) = 0.80/0.09 = 8.9

    This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
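
    The three measures can be checked with a few lines of R, using the standard definitions for a rule "antecedent => consequent": confidence is support divided by P(antecedent), and lift is confidence divided by P(consequent). The variable names are chosen for illustration only.

```r
n_customers <- 100
n_mouse     <- 10   # bought a computer mouse (antecedent)
n_mat       <- 9    # bought a mouse mat (consequent)
n_both      <- 8    # bought both items

# Support: share of all customers who bought both items
support <- n_both / n_customers

# Confidence: among mouse buyers, the share who also bought a mat
confidence <- support / (n_mouse / n_customers)

# Lift: how much more likely a mat purchase is given a mouse purchase
lift <- confidence / (n_mat / n_customers)

print(round(c(support = support, confidence = confidence, lift = lift), 2))
```

    A lift well above 1, as here, indicates that the two items are bought together far more often than would be expected if the purchases were independent.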

    Strategy

    • Data Import
    • Data Understanding and Exploration
    • Transformation of the data – so that is ready to be consumed by the association rules algorithm
    • Running association rules
    • Exploring the rules generated
    • Filtering the generated rules
    • Visualization of Rule

    Dataset Description

    • File name: Assignment-1_Data
    • List name: retaildata
    • File format: .xlsx
    • Number of rows: 522065
    • Number of Attributes: 7

      • BillNo: 6-digit number assigned to each transaction. Nominal.
      • Itemname: Product name. Nominal.
      • Quantity: The quantities of each product per transaction. Numeric.
      • Date: The day and time when each transaction was generated. Numeric.
      • Price: Product price. Numeric.
      • CustomerID: 5-digit number assigned to each customer. Nominal.
      • Country: Name of the country where each customer resides. Nominal.

    Image: https://user-images.githubusercontent.com/91852182/145270162-fc53e5a3-4ad1-4d06-b0e0-228aabcf6b70.png

    Libraries in R

    First, we need to load the required libraries. Each library is briefly described below.

    • arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).
    • arulesViz - Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration.
    • tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.
    • readxl - Read Excel Files in R.
    • plyr - Tools for Splitting, Applying and Combining Data.
    • ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
    • knitr - Dynamic Report generation in R.
    • magrittr - Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.
    • dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
    • tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

    Image: https://user-images.githubusercontent.com/91852182/145270210-49c8e1aa-9753-431b-a8d5-99601bc76cb5.png

    Data Pre-processing

    Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.

    Image: https://user-images.githubusercontent.com/91852182/145270229-514f0983-3bbb-4cd3-be64-980e92656a02.png
    Image: https://user-images.githubusercontent.com/91852182/145270251-6f6f6472-8817-435c-a995-9bc4bfef10d1.png

    Next, we clean the data frame by removing missing values.

    Image: https://user-images.githubusercontent.com/91852182/145270286-05854e1a-2b6c-490e-ab30-9e99e731eacb.png

    To apply association rule mining, we need to convert the data frame into transaction data, so that all items that are bought together in one invoice will be in ...
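
    The cleaning and grouping steps can be sketched in base R as follows. This is an illustration under assumptions, not the author's code (which is only shown as screenshots): the rows are invented, only the `BillNo` and `Itemname` columns are used, and `clean` and `baskets` are hypothetical names.

```r
# Toy invoice lines using column names from the dataset description (values invented)
retaildata <- data.frame(
  BillNo   = c("536365", "536365", "536366", "536366", "536367"),
  Itemname = c("MUG", "PLATE", "MUG", NA, "LAMP"),
  stringsAsFactors = FALSE
)

# Drop rows that contain missing values
clean <- retaildata[complete.cases(retaildata), ]

# Group item names by invoice: one character vector of unique items per BillNo
baskets <- lapply(split(clean$Itemname, clean$BillNo), unique)

print(baskets)
```

    With the arules package loaded, a list of this shape can typically be coerced with `as(baskets, "transactions")` before mining rules.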

  4. Rcode – Custom code written the R programming language that will translate...

    • plos.figshare.com
    txt
    Updated Nov 19, 2025
    Cite
    Anthony Nearman; Alriana Buller-Jarrett; Dawn Boncristiani; Eugene Ryabov; Yanping Chen; Jay D. Evans (2025). Rcode – Custom code written the R programming language that will translate an open reading frame for an existing sequence, then compare it to a data frame of nucleotide polymorphisms at specific locations, and retranslate the amino acid changes into a new data frame. [Dataset]. http://doi.org/10.1371/journal.pone.0337191.s009
    Available download formats: txt
    Dataset updated
    Nov 19, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Anthony Nearman; Alriana Buller-Jarrett; Dawn Boncristiani; Eugene Ryabov; Yanping Chen; Jay D. Evans
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Rcode – Custom code written the R programming language that will translate an open reading frame for an existing sequence, then compare it to a data frame of nucleotide polymorphisms at specific locations, and retranslate the amino acid changes into a new data frame.
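
    The general idea (translate a reading frame, apply nucleotide changes at given positions, retranslate and tabulate the amino acid changes) can be sketched in base R. This is not the deposited code: the function names, the `pos`/`alt` columns and the example sequence are all hypothetical. The sketch assumes the standard genetic code, a frame starting at position 1, and a sequence of at least three bases.

```r
# Standard genetic code, with codons ordered T, C, A, G at each position
bases <- c("T", "C", "A", "G")
grid  <- expand.grid(third = bases, second = bases, first = bases,
                     stringsAsFactors = FALSE)
codon_table <- setNames(
  strsplit("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", "")[[1]],
  paste0(grid$first, grid$second, grid$third)
)

# Translate a DNA string into one-letter amino acids ("*" marks a stop codon)
translate_orf <- function(dna) {
  starts <- seq(1, nchar(dna) - 2, by = 3)
  unname(codon_table[substring(dna, starts, starts + 2)])
}

# Apply all polymorphisms (1-based position, alternative base) to the sequence,
# retranslate, and return the codons whose amino acid differs from the reference
compare_variants <- function(dna, snps) {
  ref_aa  <- translate_orf(dna)
  alt_dna <- dna
  for (i in seq_len(nrow(snps))) {
    substr(alt_dna, snps$pos[i], snps$pos[i]) <- snps$alt[i]
  }
  alt_aa  <- translate_orf(alt_dna)
  changed <- which(ref_aa != alt_aa)
  data.frame(codon = changed, ref_aa = ref_aa[changed], alt_aa = alt_aa[changed],
             stringsAsFactors = FALSE)
}

orf  <- "ATGGCCTAA"                                            # Met-Ala-stop
snps <- data.frame(pos = 5, alt = "T", stringsAsFactors = FALSE)  # GCC -> GTC
res  <- compare_variants(orf, snps)
print(res)
```

    Note that this sketch applies all listed polymorphisms to the sequence at once; evaluating each polymorphism separately would need a loop over single-row data frames.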

  5. Data used in "A summer heatwave reduced activity, heart rate and autumn body...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Mar 31, 2023
    Cite
    Evans, Alina L.; Albon, Steve; Król, Elżbieta; Trondrud, L. Monica; Kumpula, Jouko; Speakman, john; Loe, Leif Egil; Pigeon, Gabriel; Ropstad, Erik (2023). Data used in "A summer heatwave reduced activity, heart rate and autumn body mass in a cold-adapted ungulate" [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001093609
    Dataset updated
    Mar 31, 2023
    Authors
    Evans, Alina L.; Albon, Steve; Król, Elżbieta; Trondrud, L. Monica; Kumpula, Jouko; Speakman, john; Loe, Leif Egil; Pigeon, Gabriel; Ropstad, Erik
    Description

    Overview

    This dataset contains biologging data and R script used to produce the results in "A summer heatwave reduced activity, heart rate and autumn body mass in a cold-adapted ungulate", a submitted manuscript. The longitudinal data of female reindeer and calf body masses used in the paper is owned by the Finnish Reindeer Herders’ Association. Natural Resources Institute Finland (Luke) updates, saves and administrates this long-term reindeer herd data.

    Methods of data collection

    Animals and study area

    The study involved biologging (see below) 14 adult semi-domesticated reindeer females (Focal animals: Table S1) at the Kutuharju Reindeer Research Facility (Kaamanen, Northern Finland, 69° 8’ N, 26° 59’ E, Figure S1), during June–September 2018. Ten of these individuals had been intensively handled in June as part of another study (Trondrud, 2021). The 14 females were part of a herd of ~100 animals, belonging to the Reindeer Herders’ Association. The herding management included keeping reindeer in two large enclosures (~13.8 and ~15 km2) after calving until the rut, after which animals were moved to a winter enclosure (~15 km2) and then in spring to a calving paddock (~0.3 km2) to give birth (See Supporting Information for further details on the study area). Kutuharju reindeer graze freely on natural pastures from May to November and after that are provided with silage and pellets as a supplementary feed in winter. During the period from September to April animals are weighed 5–6 times. In September, body masses of the focal females did not differ from the rest of the herd.

    Heart rate (HR) and subcutaneous body temperature (Tsc) data

    In February 2018, the focal females were instrumented with a heart rate (HR) and temperature logger (DST centi-HRT, Star-Oddi, Gardabaer, Iceland). The surgical protocol is described in the Supporting Information. The DST centi-HRT sensors recorded HR and subcutaneous body temperature (Tsc) every 15 min. HR was automatically calculated from a 4-sec electrocardiogram (ECG) at 150 Hz measurement frequency, alongside an index for signal quality. Additional data processing is described in Supporting Information.

    Activity data

    The animals were fitted with collar-mounted tri-axial accelerometers (Vertex Plus Activity Sensor, Vectronic Aerospace GmbH, Berlin, Germany) to monitor their activity levels. These sensors recorded acceleration (g) in three directions representing back-forward, lateral, and dorsal-ventral movements at 8 Hz resolution. For each axis, partial dynamic body acceleration (PDBA) was calculated by subtracting the static acceleration using a 4 sec running average from the raw acceleration (Shepard et al., 2008). We estimated vectorial dynamic body acceleration (VeDBA) by calculating the square root of the sum of squared PDBAs (Wilson et al., 2020). We aggregated VeDBA data into 15-min sums (hereafter “sum VeDBA”) to match with HR and Tsc records. Corrections for time offsets are described in Supporting Information. Due to logger failures, only 10 of the 14 individuals had complete data from both loggers (activity and heart rate).

    Weather and climate data

    We set up a HOBO weather station (Onset Computer Corporation, Bourne, MA, USA) mounted on a 2 m tall tripod in May 2018 that measured air temperature (Ta, °C) at 15-minute intervals. The placement of the station was between the two summer paddocks. These measurements were matched to the nearest timestamps for VeDBA, HR and Tsc recordings. Also, we obtained weather records from the nearest public weather stations for the years 1990–2021 (Table S2). Weather station IDs and locations relative to the study area are shown in Figure S1 in the Supporting Information. The temperatures at the study site and the nearest weather station were strongly correlated (Pearson’s, r = 0.99), but temperatures were on average ~1.0°C higher at the study site (Figure S2).

    Statistical analyses

    All statistical analyses were conducted in R version 4.1.1 (The R Core Team, 2021). Mean values are presented with standard deviation (SD), and parameter estimates with standard error (SE).

    Environmental effects on activity states and transition probabilities

    We fitted hidden Markov models (HMM) to 15-min sum VeDBA using the package ‘momentuHMM’ (McClintock & Michelot, 2018). HMMs assume that the observed pattern is driven by an underlying latent state sequence (a finite Markov chain). These states can then be used as proxies to interpret the animal’s unobserved behaviour (Langrock et al., 2012). We assumed only two underlying states, thought to represent ‘inactive’ and ‘active’ (Figure S3). The ‘active’ state thus contains multiple forms of movement, e.g., foraging, walking, and running, but reindeer spend more than 50% of the time foraging in summer (Skogland, 1980). We fitted several HMMs to evaluate both external (temperature and time of day) and individual-level (calf status) effects on the probability to occupy each state (stationary state probabilities). The combination of the explanatory variables in each HMM is listed in Table S5. Ta was fitted as a continuous variable with piecewise polynomial spline with 8 knots, asserted from visual inspection of the model outputs. We included sine and cosine terms for time of day to account for cyclicity. In addition, to assess the impact of Ta on activity patterns, we fitted five temperature-day categories in interaction with time of day. These categories were based on 20% intervals of the distribution of temperature data from our local weather station, in the period 19 June to 19 August 2018, with ranges of < 10°C (cold), 10−13°C (cool), 13−16°C (intermediate), 16−20°C (warm) and ≥ 20°C (hot). We evaluated the significance of each variable on the transition probabilities from the confidence intervals of each estimate, and the goodness-of-fit of each model using Akaike information criteria (AIC) (Burnham & Anderson, 2002), retaining models within ΔAIC < 5. We extracted the most likely state occupied by an individual using the viterbi function, returning the optimal state pathway, i.e., a two-level categorical variable indicating whether the individual was most likely resting or active. We used this output to calculate daily activity budgets (% time spent active).

    Drivers of heart rate (HR) and subcutaneous body temperature (Tsc)

    We matched the activity states derived from the HMM to the HR and Tsc data. We opted to investigate the drivers of variation in HR and Tsc only within the inactive state. HR and Tsc were fitted as response variables in separate generalised additive mixed-effects models (GAMM), which included the following smooth terms: calendar day as a thin-plate regression spline, time of day (ToD, in hours, knots [k] = 10) as a cubic circular regression spline and individual as random intercept. All models were fitted using restricted maximum likelihood, a penalization value (λ) of 1.4 (Wood, 2017), and an autoregressive structure (AR1) to account for temporal autocorrelation. We used the ‘gam.check’ function from the ‘mgcv’ package to select k. The sum of VeDBA in the past 15 minutes was included as a predictor in all models. All models were fitted with the same set of explanatory variables: sum VeDBA, age, body mass (BM), lactation status, Ta, as well as the interaction between lactation status and Ta.

    Description of files

    1. Data

    • "kutuharju_weather.csv": weather data recorded from local weather station during study period
    • "Inari_Ivalo_lentoasema.csv": public weather data from weather station ID 102033, owned and managed by the Finnish Meteorological Institute
    • "activitydata.Rdata": dataset used in analyses of activity patterns in reindeer
    • "HR_temp_data.Rdata": dataset used in analyses of heart rate and body temperature responses in reindeer
    • "HRfigureData.Rdata" and "TempFigureData.Rdata": data files (lists) with model outputs generated in "heartrate_bodytemp_analyses.R" and used in "figures_in_paper.R"
    • "HMM_df_withStates.Rdata": data frame used in HMM models including output from viterbi function
    • "plotdf_m16.Rdata": dataframe for plotting output from model 16
    • "plotdf_m22.Rdata": dataframe for plotting output from model 22

    2. Scripts

    • "activitydata_HMMs.R": R script for data prep and hidden Markov models to analyse activity patterns in reindeer
    • "heartrate_bodytemp_analyses.R": R script for data prep and generalized additive mixed models to analyse heart rate and body temperature responses in reindeer
    • "figures_in_paper.R": R script for generating figures 1-3 in the manuscript

    3. HMM_model

    • "modelList.Rdata": list containing 2 items: string of all 25 HMM models created, and dataframe with model number and formula
    • "m16.Rdata" and "m22.Rdata": direct access to two best-fit models
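
    Two of the processing steps in the Methods, the VeDBA calculation with 15-min sums and the temperature categories, can be sketched in base R. This is an illustration under assumptions rather than the authors' script: the function names are hypothetical, the running mean is taken as a centred window, the data are simulated, and the category cut is applied directly to a numeric temperature vector.

```r
# VeDBA from tri-axial acceleration sampled at `hz` Hz: subtract a running mean
# (static acceleration) from each axis, then take the vector norm of the residuals
vedba <- function(ax, ay, az, hz = 8, window_sec = 4) {
  k <- hz * window_sec  # number of samples in the running-mean window
  run_mean <- function(v) as.numeric(stats::filter(v, rep(1 / k, k), sides = 2))
  sqrt((ax - run_mean(ax))^2 + (ay - run_mean(ay))^2 + (az - run_mean(az))^2)
}

# Sum VeDBA over consecutive 15-minute blocks of samples (edge NAs are ignored)
sum_vedba_15min <- function(v, hz = 8) {
  block <- (seq_along(v) - 1) %/% (hz * 60 * 15)
  as.numeric(tapply(v, block, sum, na.rm = TRUE))
}

# Temperature categories with the ranges quoted in the Methods
temp_category <- function(ta) {
  cut(ta, breaks = c(-Inf, 10, 13, 16, 20, Inf), right = FALSE,
      labels = c("cold", "cool", "intermediate", "warm", "hot"))
}

set.seed(1)
n  <- 8 * 60 * 30                       # 30 minutes of simulated 8 Hz data
ax <- rnorm(n, 0, 0.1)
ay <- rnorm(n, 0, 0.1)
az <- rnorm(n, 1, 0.1)

sums <- sum_vedba_15min(vedba(ax, ay, az))
cats <- temp_category(c(8.5, 10, 15.9, 20))
print(sums)
print(cats)
```

    Using `right = FALSE` makes each interval closed on the left, so 10°C falls in "cool" and 20°C in "hot", matching the "< 10°C" and "≥ 20°C" boundaries.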

  6. Surface soil moisture for Europe 2014-2024 at 1 km annual and quarterly...

    • repository.soilwise-he.eu
    Updated Feb 7, 2025
    Cite
    (2025). Surface soil moisture for Europe 2014-2024 at 1 km annual and quarterly aggregates [Dataset]. http://doi.org/10.5281/zenodo.14833053
    Dataset updated
    Feb 7, 2025
    Description

    Copernicus Land Monitoring Services provides Surface Soil Moisture 2014-present (raster 1 km), Europe, daily – version 1. Each daily image covers only 5 to 10% of the European land mask and shows lines of scenes (obvious artifacts). This dataset provides long-term aggregates of the daily soil moisture images (0–100%) based on two types of aggregation:

    • Long-term quarterly (qr.1 - winter, qr.2 - spring, qr.3 - summer and qr.4 - autumn),
    • Annual quantiles P.05, P.50 and P.95,

    The soil moisture rasters are based on Sentinel 1 and described in detail in:

    • Bauer-Marschallinger, B. ; Freeman, V. ; Cao, S. ; Paulik, C. ; Schaufler, S. ; Stachl, T. ; Modanesi, S. ; Massari, C. ; Ciabatta, L. ; Brocca, L. ; Wagner, W. Toward Global Soil Moisture Monitoring With Sentinel-1: Harnessing Assets and Overcoming Obstacles. IEEE Transactions on Geoscience and Remote Sensing 2019, 1 - 20. DOI 10.1109/TGRS.2018.2858004

    You can access and download the original data as .nc files from: https://globalland.vito.be/download/manifest/ssm_1km_v1_daily_netcdf/.

    Aggregation has been generated using the terra package in R in combination with the matrixStats::rowQuantiles function. Tiling system and land mask for pan-EU is also available.

    library(terra)
    library(matrixStats)
    g1 = terra::vect('/mnt/inca/EU_landmask/tilling_filter/eu_ard2_final_status.gpkg')
    ## 1254 tiles
    tile = g1[534]
    nc.lst = list.files('/mnt/landmark/SM1km/ssm_1km_v1_daily_netcdf/', pattern = glob2rx('*.nc$'), full.names=TRUE)
    ## 3726
    ## test it
    #r = terra::rast(nc.lst[100:210])
    agg_tile = function(r, tile, pv=c(0.05,0.5,0.95), out.year='2015.annual'){
      bb = paste(as.vector(ext(tile)), collapse = '.')
      out.tif = paste0('./eu_tmp/', out.year, '/sm1km_', pv, '_', out.year, '_', bb, '.tif')
      if(any(!file.exists(out.tif))){
        r.t = terra::crop(r, ext(tile))
        ## each tile is 100x100 pixels 365 days
        r.t = as.data.frame(r.t, xy=TRUE, na.rm=FALSE)
        sel.c = grep(glob2rx('ssm$'), colnames(r.t))
        ## remove everything outside the range
        t1s = cbind(data.frame(matrixStats::rowQuantiles(as.matrix(r.t[,sel.c]), probs = pv, na.rm=TRUE)), data.frame(x=r.t$x, y=r.t$y))
        #str(t1s)
        ## write to GeoTIFFs
        r.o = terra::rast(t1s[,c('x','y','X5.','X50.','X95.')], type='xyz', crs='+proj=longlat +datum=WGS84 +no_defs')
        for(k in 1:length(pv)){
          terra::writeRaster(r.o[[k]], filename=out.tif[k], gdal=c('COMPRESS=DEFLATE'), datatype='INT2U', NAflag=32768, overwrite=FALSE)
        }
        rm(r.t); gc()
        tmpFiles(remove=TRUE)
      }
    }
    ## quarterly values:
    lA = data.frame(filename=nc.lst)
    library(lubridate)
    lA$Date = ymd(sapply(lA$filename, function(i){substr(strsplit(basename(i), '_')[[1]][4], 1, 8)}))
    #summary(is.na(lA$Date))
    #hist(lA$Date, breaks=60)
    lA$quarter = quarter(lA$Date, fiscal_start = 11)
    summary(as.factor(lA$quarter))
    for(qr in 1:4){
      #qr=1
      pth = paste0('A.q', qr)
      rs = terra::rast(lA$filename[lA$quarter==qr])
      #agg_tile(rs, tile, out.year=pth)
      x = parallel::mclapply(sample(1:length(g1)), function(i){try( agg_tile(rs, tile=g1[i], out.year=pth) )}, mc.cores=20)
      for(type in c(0.05,0.5,0.95)){
        x <- list.files(path=paste0('./eu_tmp/', pth), pattern=glob2rx(paste0('sm1km_', type, '_*.tif$')), full.names=TRUE)
        out.tmp <- paste0(pth, '.', type, '.sm1km_eu.txt')
        vrt.tmp <- paste0(pth, '.', type, '.sm1km_eu.vrt')
        cat(x, sep='\n', file=out.tmp)
        system(paste0('gdalbuildvrt -input_file_list ', out.tmp, ' ', vrt.tmp))
        system(paste0('gdal_translate ', vrt.tmp, ' ./cogs/soil.moisture_s1.clms.qr.', qr, '.p', type, '_m_1km_20140101_20241231_eu_epsg4326_v20250206.tif -ot \'Byte\' -r \'near\' --config GDAL_CACHEMAX 9216 -co BIGTIFF=YES -co NUM_THREADS=80 -co COMPRESS=DEFLATE -of COG -projwin -32 72 45 27'))
      }
    }
    ## per year ----
    for(year in 2015:2023){
      l.lst = nc.lst[grep(year, basename(nc.lst))]
      r = terra::rast(l.lst)
      ## test it:
      pth = paste0(year, '.annual')
      x = parallel::mclapply(sample(1:length(g1)), function(i){try( agg_tile(r, tile=g1[i], out.year=pth) )}, mc.cores=40)
      ## Mosaics:
      for(type in c(0.05,0.5,0.95)){
        x <- list.files(path=paste0('./eu_tmp/', pth), pattern=glob2rx(paste0('sm1km_', type, '_*.tif$')), full.names=TRUE)
        out.tmp <- paste0(pth, '.', type, '.sm1km_eu.txt')
        vrt.tmp <- paste0(pth, '.', type, '.sm1km_eu.vrt')
        cat(x, sep='\n', file=out.tmp)
        system(paste0('gdalbuildvrt -input_file_list ', out.tmp, ' ', vrt.tmp))
        system(paste0('gdal_translate ', vrt.tmp, ' ./cogs/soil.moisture_s1.clms.annual.', type, '_m_1km_', year, '0101_', year, '1231_eu_epsg4326_v20250206.tif -ot \'Byte\' -r \'near\' --config GDAL_CACHEMAX 9216 -co BIGTIFF=YES -co NUM_THREADS=80 -co COMPRESS=DEFLATE -of COG -projwin -32 72 45 27'))
      }
    }
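
    The core of the aggregation, per-pixel quantiles across the time dimension, can be illustrated on a tiny matrix. This sketch uses base R `quantile()` as a stand-in for `matrixStats::rowQuantiles` so that it runs without extra packages; the toy values and the `P05`/`P50`/`P95` column labels are assumptions for illustration.

```r
# Toy matrix: 3 pixels (rows) x 5 daily soil moisture values in percent (columns)
ssm <- rbind(
  c(10, 20, 30, 40, 50),
  c(NA,  5,  5,  5, NA),
  c( 0, 25, 50, 75, 100)
)
pv <- c(0.05, 0.5, 0.95)

# Per-pixel quantiles across days, ignoring days without an observation
q <- t(apply(ssm, 1, quantile, probs = pv, na.rm = TRUE))
colnames(q) <- c("P05", "P50", "P95")

print(q)
```

    Because each daily image covers only a small part of Europe, most cells of a pixel's time series are missing, which is why `na.rm = TRUE` matters in both the sketch and the original code.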

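Dataset 1 (R codes and dataset for Visualisation of Diachronic Constructional Change using Motion Chart) is cited by 2 scholarly articles, according to Google Scholar.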