31 datasets found
  1. Replication Data for: The long and the short of it: Russian predicate...

    • dataverse.no
    • dataverse.azure.uit.no
    pdf +4
    Updated Sep 2, 2023
    Cite
    Laura Alexis Janda (2023). Replication Data for: The long and the short of it: Russian predicate adjectives with zero copula [Dataset]. http://doi.org/10.18710/XKDBLF
    Explore at:
    txt(7215), text/x-r-notebook(11795), text/comma-separated-values(2122117), xlsx(1093987), pdf(60832)
    Dataset updated
    Sep 2, 2023
    Dataset provided by
    DataverseNO
    Authors
    Laura Alexis Janda
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    1960 - 2016
    Area covered
    Russian Federation
    Description

    Description of Dataset This is a study of examples of Russian predicate adjectives in clauses with zero-copula present tense, where the adjective is a short form (SF) or a long form nominative (LF). The data was collected in 2022 from SynTagRus (https://universaldependencies.org/treebanks/ru_syntagrus/index.html), the syntactic subcorpus of the Russian National Corpus (https://ruscorpora.ru/new/). The data merges the results of several searches conducted to extract examples of sentences with long form and short form adjectives in predicate position, as identified by the corpus. The examples were imported to a spreadsheet and annotated manually, based on the syntactic analyses given in the corpus. For present tense sentences with no copula (Река спокойна or Река спокойная), it was necessary to search for an adjective as the top (root) node in the syntactic structure. The syntactic and morphological categories used in the corpus are explained here: https://ruscorpora.ru/page/instruction-syntax/. In order for the R code to run from these files, one needs to set up an R project with the data files in a folder named "data" and the R markdown files in a folder named "scripts". Method: Logistic regression analysis of corpus data carried out in R (R version 4.2.3 (2023-03-15)-- "Shortstop Beagle" Copyright (C) 2023 The R Foundation for Statistical Computing) and documented in an .Rmd file. Publication Abstract The present article presents an empirical investigation of the choice between so-called long (e.g., prostoj ‘simple’) and short forms (e.g., prost ‘simple’) of predicate adjectives in Russian based on data from the syntactic subcorpus of the Russian National Corpus. The data under scrutiny suggest that short forms represent the dominant option for predicate adjectives. It is proposed that long forms are descriptions of thematic participants in sentences with no complement, while short forms may take complements and describe both participants (thematic and rhematic) and situations. Within the “space of competition” where both long and short forms are well attested, it is argued that the choice of form to some extent depends on subject type, gender/number, and frequency. On the methodological level, the approach adopted in the present study may be extended to other cases of competition in morphosyntax. It is suggested that one should first “peel off” contexts where (nearly) categorical rules are at work, before one undertakes a statistical analysis of the “space of competition”.
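The description above outlines the analysis pipeline (an R project with "data" and "scripts" folders, and a logistic regression over the annotated corpus examples). A minimal R sketch of that kind of model is shown below; the file name and predictor columns are placeholders for illustration, not the actual replication files.

```r
# Minimal sketch of a logistic regression on the annotated corpus examples.
# "data/predicate_adjectives.csv" and the column names are assumptions;
# the real replication data and .Rmd files define their own.
adj <- read.csv("data/predicate_adjectives.csv", stringsAsFactors = TRUE)

# Binary outcome: long form nominative (LF) vs. short form (SF).
adj$Form <- factor(adj$Form, levels = c("LF", "SF"))

# Predictors named in the abstract: subject type, gender/number, frequency.
fit <- glm(Form ~ SubjectType + GenderNumber + log(Frequency),
           data = adj, family = binomial)
summary(fit)
```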

  2. Replication Data for: Contextually determined or semantically distinct? The...

    • dataverse.azure.uit.no
    • dataverse.no
    • +1 more
    Updated Feb 4, 2025
    Cite
    Laura Alexis Janda (2025). Replication Data for: Contextually determined or semantically distinct? The competition between instrumental, long form nominative and short form nominative in Russian predicate adjectives [Dataset]. http://doi.org/10.18710/ZTQURH
    Explore at:
    txt(11823), text/comma-separated-values(626056), txt(10025), txt(6203), text/comma-separated-values(909349), text/comma-separated-values(127532)
    Dataset updated
    Feb 4, 2025
    Dataset provided by
    DataverseNO
    Authors
    Laura Alexis Janda
    License

    https://dataverse.no/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18710/ZTQURH

    Time period covered
    1900 - 2015
    Area covered
    Russian Federation
    Dataset funded by
    Norwegian Directorate for Higher Education and Skills
    Description

    Dataset description This post provides the data and R scripts for analysis of data on the variation between long form nominative, short form nominative, and instrumental case in Russian predicate adjectives in sentences containing an overt copula verb. We analyze the various factors associated with the choice of form of the adjective. This is the abstract of the article: Based on data from the syntactic subcorpus of the Russian National Corpus, we undertake a quantitative analysis of the competition between Russian predicate adjectives in the instrumental (e.g., pustym ‘empty’), the long form nominative (e.g., pustoj ‘empty’), and the short form nominative (e.g., pust ‘empty’). It is argued that the choice of adjective form is partly determined by the context. Four (nearly) categorical rules are proposed based on the following contextual factors: the form of the copula verb, the presence/absence of a complement, and the nature of the subject of the sentence. At the same time, a “space of competition” is identified, where all three adjective forms are attested. It is hypothesized that within the space of competition, the three forms are recruited to convey different meanings, and it is argued that our analysis lends support to the traditional idea that the short form nominative is closely related to verbs. Our findings are furthermore compatible with the idea that the short form nominative expresses temporary states, rather than inherent permanent characteristics.
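Because the outcome here has three competing forms rather than two, a multinomial model is one natural way to explore the contextual factors listed in the abstract. The sketch below (using nnet::multinom) is illustrative only; the file name and column names are assumptions, not the variables used in the actual scripts.

```r
# Three-way outcome: instrumental vs. long form nominative vs. short form nominative.
library(nnet)

adj <- read.csv("data/copula_adjectives.csv", stringsAsFactors = TRUE)  # placeholder file name

# Contextual factors named in the abstract: copula form, presence/absence of a
# complement, and subject type (column names are assumptions).
fit <- multinom(Form ~ CopulaForm + HasComplement + SubjectType, data = adj)
summary(fit)

# Predicted probability of each adjective form for the observed contexts.
head(fitted(fit))
```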

  3. Reddit: /r/technology (Submissions & Comments)

    • kaggle.com
    Updated Dec 18, 2022
    Cite
    The Devastator (2022). Reddit: /r/technology (Submissions & Comments) [Dataset]. https://www.kaggle.com/datasets/thedevastator/uncovering-technology-insights-through-reddit-di
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 18, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Reddit: /r/technology (Submissions & Comments)

    Title, Score, ID, URL, Comment Number, and Timestamp

    By Reddit [source]

    About this dataset

    This dataset, labeled Reddit Technology Data, captures conversations and interactions around technology-related topics shared on Reddit, a well-known Internet discussion forum. It contains discussion titles, user-contributed scores, the unique IDs assigned to discussions, the URLs associated with those discussions (if any), the comment count in each discussion thread, and timestamps of when the conversations were started. This makes the data useful for anyone who wants to follow developments in technology, track industry trends, or simply make sense of what is happening in the technology world at large.

    How to use the dataset

    The dataset's columns cover the discussion title, its score, the URL of the discussion page on Reddit, the comment count, the creation timestamp, and the body text of each post. Analyzing these columns separately shows what kinds of information users engage with in technology discussions and supports hypotheses about correlations between factors such as score and comment count: for example, which types of posts draw the most comments or reactions, and whether highly rated posts tend to attract long comment threads. Approached this way, a public forum like Reddit yields a large amount of rich information about users' interests in technology topics, which can inform research and, if monitored over time, point to potential business opportunities.

    Research Ideas

    • Companies can use this dataset to create targeted online marketing campaigns directed towards Reddit users interested in specific areas of technology.
    • Academic researchers can use the data to track and analyze trends in conversations related to technology on Reddit over time.
    • Technology professionals can utilize the comments and discussions on this dataset as a way of gauging public opinion and consumer sentiment towards certain technological advancements or products

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and Reddit.

    License

    License: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

    Columns

    File: technology.csv

    | Column name | Description |
    |:------------|:------------|
    | title | The title of the discussion. (String) |
    | score | The score of the discussion as measured by Reddit contributors. (Integer) |
    | url | The website URL associated with the discussion. (String) |
    | comms_num | The number of comments associated with the discussion. (Integer) |
    | created | The date and time the discussion was created. (DateTime) |
    | body | The body content of the discussion. (String) |
    | timestamp | The timestamp of the discussion. (Integer) |
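Below is a small R sketch of how these columns might be explored; it assumes only technology.csv and the column names listed above.

```r
# Load the submissions file and look at the score/comment-count relationship.
reddit <- read.csv("technology.csv", stringsAsFactors = FALSE)

# Rank correlation between score and number of comments.
cor(reddit$score, reddit$comms_num, method = "spearman", use = "complete.obs")

# Ten highest-scoring submissions.
head(reddit[order(-reddit$score), c("title", "score", "comms_num")], 10)
```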

  4. Data from: The long and short of it: converting between maximum and minimum...

    • data.niaid.nih.gov
    Updated Mar 12, 2022
    Cite
    Anthony Caravaggi; Sam Bayley; Richard J. Facey; Ivan de la Hera; Mike P. Shewring; Jez A. Smith (2022). The long and short of it: converting between maximum and minimum tarsus measurements in passerine birds [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6347025
    Explore at:
    Dataset updated
    Mar 12, 2022
    Dataset provided by
    University of South Wales
    Authors
    Anthony Caravaggi; Sam Bayley; Richard J. Facey; Ivan de la Hera; Mike P. Shewring; Jez A. Smith
    Description

    Data and R scripts relating to:

    Caravaggi A, Bayley S, Facey RJ, de la Hera I, Shewring M, Smith JA. (2022) The long and short of it: converting between maximum and minimum tarsus measurements in passerine birds. Ringing & Migration. doi: 10.1080/03078698.2022.2050937
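The article derives conversion relationships between the two tarsus measurements; a generic least-squares version of such a conversion is sketched below. The file and column names are assumptions, and the published species-specific equations should be preferred over refitting.

```r
# Fit a simple conversion from minimum to maximum tarsus length
# (placeholder file and column names; not the published equations).
tarsus <- read.csv("tarsus_measurements.csv")

fit <- lm(tarsus_max ~ tarsus_min, data = tarsus)
summary(fit)

# Convert a new minimum-tarsus measurement (mm) to an estimated maximum-tarsus value.
predict(fit, newdata = data.frame(tarsus_min = 19.8))
```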

  5. DHS Annual Report 2013-14 Dataset - Additional Data - output performance

    • researchdata.edu.au
    Updated Oct 20, 2014
    + more versions
    Cite
    data.vic.gov.au (2014). DHS Annual Report 2013-14 Dataset - Additional Data - output performance [Dataset]. https://researchdata.edu.au/dhs-annual-report-output-performance/634336
    Explore at:
    Dataset updated
    Oct 20, 2014
    Dataset provided by
    data.vic.gov.au
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset supports the 2013-2014 annual report of the Department of Human Services, which details how the department met its objectives and highlights key achievements for the reporting period. This particular dataset is additional information providing a summary of social housing data, including public rental housing, public housing client profiles, rental stock, stock management program activities, social housing dwellings and changes to Director-owned dwellings during 2013–14.

    Social housing assistance focuses on providing adequate, affordable and accessible housing targeted to those in greatest need, delivered cost-effectively and in coordination with support services where required. Social housing assistance is provided on a long or short-term basis.

    Long-term social housing assistance includes public rental accommodation, community-managed housing in Director-owned properties and community-owned stock for designated client groups, and rental accommodation for low income Victorians with identified support needs. Long-term public rental housing also includes movable units.

    In recent years, housing assistance has been increasingly targeted to people in greatest need. Targeting to high need groups has impacts in terms of stock turnover and costs.

    Short-term social housing is provided to Victoria's homeless individuals and families. Clients are assisted under the Crisis Supported Accommodation and Transitional Housing Management programs.

  6. Data from: Data and R Code to Derive Estimates of Groundwater Levels Using...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Nov 21, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Data and R Code to Derive Estimates of Groundwater Levels Using MOVE.1 Regression and Compute Monthly Percentiles for Select Wells in Massachusetts [Dataset]. https://catalog.data.gov/dataset/data-and-r-code-to-derive-estimates-of-groundwater-levels-using-move-1-regression-and-comp
    Explore at:
    Dataset updated
    Nov 21, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Massachusetts
    Description

    This data release contains extended estimates of daily groundwater levels and monthly percentiles at 27 short-term monitoring wells in Massachusetts. The Maintenance of Variance Extension Type 1 (MOVE.1) regression method was used to extend short-term groundwater levels at wells with less than 10 years of continuous data. This method uses groundwater level data from a correlated long-term monitoring well (index well) to estimate the groundwater level record for the short-term monitoring well. MOVE.1 regressions are used widely throughout the hydrologic community to extend flow records from streamgaging stations but are less commonly used to extend groundwater records at wells. The data in this data release document the results of the MOVE.1 regressions to estimate groundwater levels and compute updated monthly percentiles for select wells used in the groundwater index in the Massachusetts Drought Management Plan (2019). The U.S. Geological Survey (USGS) groundwater identification site numbers and groundwater level data are available via the USGS National Water Information System (NWIS) database (available at https://waterdata.usgs.gov/nwis). Groundwater levels provided are in depth to water level, in feet below land surface datum. This data release accompanies a USGS scientific investigations report that describes the methods and results in detail (Ahearn and Crozier, 2024). Reference: Massachusetts Executive Office of Energy and Environmental Affairs and Massachusetts Emergency Management Agency, 2019, Massachusetts drought management plan: Executive Office of Energy and Environmental Affairs, 115 p., accessed September 2022, at https://www.mass.gov/doc/massachusetts-drought-management-plan. The following are included in the data release:
    (1) R input file that lists the final site pairings (R_Input_MOVE1_Site_List.csv)
    (2) R script that performs the MOVE.1 and produces outputs for evaluation purposes (MOVE1_R_code.R)
    (3) MOVE.1 model outputs (MOVE1_Models.zip)
    (4) Estimates of daily groundwater levels using the MOVE.1 regression technique (MOVE1_Estimated_Record_Tables.zip)
    (5) Plots showing time series of estimated daily groundwater levels from the MOVE.1 technique (MOVE1_Estimated_Record_Plots.zip)
    (6) Plots showing time series of estimated daily groundwater levels from the MOVE.1 technique zoomed into the period of observed daily groundwater levels for the short-term site (Zoomed_MOVE1_Estimated_Record_Plots.zip)
    (7) Plots showing residuals (Residuals_WL_Plots.zip)
    (8) Monthly percentile table for 27 study wells (GWL_Percentiles_All_Study_Wells.csv)
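MOVE.1 estimates values at the short-term well from a correlated index well by matching means and standard deviations over the concurrent period (a line-of-organic-correlation fit rather than ordinary regression). The R sketch below illustrates the idea with made-up numbers; the actual workflow is in MOVE1_R_code.R in this release.

```r
# Line of organic correlation (MOVE.1):
#   y_hat = mean(y) + sign(r) * sd(y) / sd(x) * (x - mean(x))
move1_fit <- function(x, y) {
  ok <- complete.cases(x, y)
  list(mx = mean(x[ok]), my = mean(y[ok]),
       slope = sign(cor(x[ok], y[ok])) * sd(y[ok]) / sd(x[ok]))
}
move1_predict <- function(fit, x_new) fit$my + fit$slope * (x_new - fit$mx)

# Toy example: concurrent records at the index (x) and short-term (y) wells,
# then extension of the short-term record over a longer index record.
set.seed(1)
x_concurrent <- rnorm(120, mean = 10, sd = 2)
y_concurrent <- 3 + 0.8 * x_concurrent + rnorm(120, sd = 0.5)
fit <- move1_fit(x_concurrent, y_concurrent)
y_extended <- move1_predict(fit, rnorm(3650, mean = 10, sd = 2))
```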

  7. Meteorological Data: Balbina R. 15 - 20 July 2013 (5 minute averaged)

    • knb.ecoinformatics.org
    • search.dataone.org
    • +1 more
    Updated Sep 27, 2021
    + more versions
    Cite
    Sally MacIntyre (2021). Meteorological Data: Balbina R. 15 - 20 July 2013 (5 minute averaged) [Dataset]. http://doi.org/10.5063/4Q7SDG
    Explore at:
    Dataset updated
    Sep 27, 2021
    Dataset provided by
    Knowledge Network for Biocomplexity
    Authors
    Sally MacIntyre
    Time period covered
    Jul 15, 2013 - Jul 20, 2013
    Area covered
    Description

    As part of a physical limnology experiment in July 2013, an Onset meteorological station was deployed on a floating platform at an offshore site in Balbina Reservoir, Brazil (01°54'38.5'' S; 59°28'08.5'' W), using an Onset D-WSET-B to measure wind speed, maximum wind speed, wind direction and standard deviation of wind direction (set to 0), an Onset S-THB-MOO2 to measure air temperature and relative humidity, and an Onset pyranometer to measure downwelling shortwave radiation. The anemometer height was 2.8 m. Downwelling and upwelling longwave radiation were measured at an inshore site (01°54'33.3'' S; 59°27'42.21'' W) using a net radiometer (Kipp and Zonen CGR3). These data were corrected for temperature in post-processing based on the mean of the temperature measured by the upwelling and downwelling sensors. Sampling of all sensors was once per second, and 5-minute averages were stored. Attenuation of photosynthetically available radiation was computed following Beer's Law from irradiance measurements with a Li-Cor 2π quantum sensor. The value was 0.5 m^-1.
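The attenuation coefficient quoted at the end follows directly from Beer's Law, I(z) = I0 * exp(-k * z). A small R sketch with illustrative numbers (not the deployment data) shows the calculation.

```r
# Attenuation coefficient of PAR from irradiance at the surface and at depth z.
beer_k <- function(I_z, I_0, z) -log(I_z / I_0) / z   # k = -ln(I_z / I_0) / z

# Example: irradiance dropping from 1500 to about 552 (umol m^-2 s^-1) over 2 m
# gives k of about 0.5 m^-1, matching the value reported above.
beer_k(I_z = 552, I_0 = 1500, z = 2)
```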

  8. Acoustic Doppler Current Profiler (ADCP) and underway SCS data collected...

    • data.griidc.org
    Updated Apr 18, 2023
    + more versions
    Cite
    April Cook (2023). Acoustic Doppler Current Profiler (ADCP) and underway SCS data collected aboard from R/V Point Sur cruise PS23_04 (DP08) in the northern Gulf of Mexico from 2022-07-27 to 2022-08-07 [Dataset]. http://doi.org/10.7266/nysgy6h4
    Explore at:
    Dataset updated
    Apr 18, 2023
    Dataset provided by
    GRIIDC
    Authors
    April Cook
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Description

    This dataset contains raw Acoustic Doppler Current Profiler (ADCP) and underway SCS data collected aboard the R/V Point Sur cruise PS23_04 (DP08, cruise doi https://doi.org/10.7284/909796) in the northern Gulf of Mexico from 2022-07-27 to 2022-08-07. The overall purpose of the R/V Point Sur cruise PS23-04, led by chief scientist Dr. Tracey Sutton, was to perform deep water sampling of in-situ seawater and associated fauna. Fauna data from the cruise are found in related datasets under GRIIDC Unique Dataset Identifiers (UDIs): NO.x959.000:0006 (https://doi.org/10.7266/2JR97MGC; MOCNESS trawl data) and NO.x959.000:0010 (fauna inventory). The CTD data are available in GRIIDC Unique Dataset Identifier (UDI) NO.x959.000:0008 (https://doi.org/10.7266/81EA5Q5G). The dataset also includes the cruise logbook and cruise data documentation. This dataset is a result of research funded by the National Oceanic and Atmospheric Administration's RESTORE Science Program (ROR - https://ror.org/0042xzm63) under award #NA19NOS4510193 to Nova Southeastern University.

  9. Stress-strain data sets.

    • plos.figshare.com
    txt
    Updated Apr 29, 2025
    + more versions
    Cite
    Changsheng Li; Xinsong Zhang (2025). Stress-strain data sets. [Dataset]. http://doi.org/10.1371/journal.pone.0321478.s002
    Explore at:
    txt
    Dataset updated
    Apr 29, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Changsheng Li; Xinsong Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Deep learning has significantly advanced the prediction of stress-strain curves. However, because of the complex mechanical properties of rock materials, existing deep learning methods are not sufficiently accurate when predicting the stress-strain curves of rock materials. This paper proposes a deep learning method based on a long short-term memory autoencoder (LSTM-AE) for predicting stress-strain curves of rock materials in discrete element numerical simulations. The LSTM-AE approach uses the LSTM network to construct both the encoder and decoder, where the encoder extracts features from the input data and the decoder generates the target sequence for prediction. The mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2) between the predicted and true values are used as the evaluation metrics. The proposed LSTM-AE network is compared with the LSTM network, recurrent neural network (RNN), BP neural network (BPNN), and XGBoost model. The results indicate that the LSTM-AE network is more accurate than LSTM, RNN, BPNN, and XGBoost. Furthermore, the robustness of the LSTM-AE network is confirmed by predicting 10 sets of special samples. However, the scalability of the LSTM-AE network in handling large datasets and its applicability to predicting laboratory datasets need further verification. Nevertheless, this study provides a valuable reference for improving the prediction accuracy of stress-strain curves in rock materials.
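For reference, the four evaluation metrics named above have their usual definitions (y_i observed, ŷ_i predicted, ȳ the mean of the observations):

```latex
\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2,\quad
\mathrm{RMSE}=\sqrt{\mathrm{MSE}},\quad
\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\lvert y_i-\hat{y}_i\rvert,\quad
R^2=1-\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2}
```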

  10. Data from: A model-derived short-term estimation method of effective size...

    • datadryad.org
    • data.niaid.nih.gov
    zip
    Updated Dec 22, 2016
    Cite
    Annegret Grimm; Bernd Gruber; Marion Hoehn; Katrin Enders; Klaus Henle (2016). A model-derived short-term estimation method of effective size for small populations with overlapping generations [Dataset]. http://doi.org/10.5061/dryad.9h7p4
    Explore at:
    zip
    Dataset updated
    Dec 22, 2016
    Dataset provided by
    Dryad
    Authors
    Annegret Grimm; Bernd Gruber; Marion Hoehn; Katrin Enders; Klaus Henle
    Time period covered
    Dec 22, 2015
    Area covered
    New South Wales, Australia, Kinchega National Park
    Description

    Grimm_etal_Gehyra_micsats: This data file contains the microsatellite loci analysed from genetic samples and individual traits of the gecko Gehyra variegata (alias Gehyra versicolor) population of Kinchega National Park, Australia. These data were used for an application of the R package NEff.

  11. LSS-FMCWR-2.0: Multi-band multi-angle FMCW radar low-slow-small target...

    • scidb.cn
    Updated Apr 23, 2025
    Cite
    chen xiao long; Yuan Wang; guanjian (2025). LSS-FMCWR-2.0:Multi-band multi-angle FMCW radar low-slow-small target detection dataset [Dataset]. http://doi.org/10.57760/sciencedb.radars.00054
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    Science Data Bank
    Authors
    chen xiao long; Yuan Wang; guanjian
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    The multi-band, multi-angle FMCW radar low-slow-small target detection dataset (LSS-FMCWR-2.0) is an expansion of the previously published multi-band FMCW low-slow-small detection dataset (LSS-FMCWR-1.0) in the Journal of Radars. The dataset was built by collecting echoes from multiple types of low-slow-small targets (rotary-wing drones, flying birds, helicopters, etc.) using frequency modulated continuous wave radars in two different bands (K and L). The K-band radar data include low-slow-small target echoes collected from different angles, with angles between the two radars of 0°, 60°, 90°, 120°, and 180°. The dataset supplements high-resolution radar feature data for low-slow-small targets and aims to promote the development of radar detection technology for such targets. Echo data for 6 target types were collected at a certain distance while the radar modulation period and modulation bandwidth were varied; the dataset contains 90 data sets in total. The copyright of the LSS-FMCWR-2.0 dataset belongs to the Marine Target Detection Team of Naval Aviation University, and the editorial department of the Journal of Radars holds the editing and publishing rights. Readers may use the data free of charge for teaching, research, and similar purposes, but must cite or acknowledge it when reporting results. The radar settings, acquisition scenarios, and signal processing flow of the LSS-FMCWR-2.0 dataset are described in the paper "Multi-band and multi-angle FMCW radar low-slow-small detection dataset (LSS-FMCWR-2.0) and feature fusion classification method". The DJI M350, a quadcopter drone, the DJI Wu 2, and the DJI Yu 2 each contribute 20 sets of data, the AC311 helicopter contributes 2 sets, and the simulated flying bird contributes 4 sets. The LSS-FMCWR-2.0 dataset is organized through a uniform file naming scheme: AA time collection-angle BB-CC-DD-L/K (FF), where AA is the target type and ranges from 01 to 06; the collection angles are 0, 60, 90, 120, and 180 degrees; BB is the modulation period, with values 0.300 and 1.024 (in ms); CC is the bandwidth; DD is the range bin in which the target is located; L/K indicates whether the data were collected by the L-band or K-band radar; and FF is the data number. The data matrix files have the suffix ".mat". Two points to note: the L-band radar data are real-valued, which does not affect time-frequency analysis; and some parameters and the position were kept unchanged during L-band data collection, so those data are unaltered.
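The naming convention above can be unpacked programmatically. The R sketch below assumes one literal arrangement of the fields and separators, which may differ from the actual files; adjust the pattern to the real names.

```r
# Illustrative parser for the naming scheme described above. The exact
# separators in the real filenames are an assumption; adjust the regex as needed.
parse_lss_name <- function(fname) {
  # Assumed pattern: AA_angle_BB-CC-DD-L/K(FF).mat
  m <- regmatches(fname,
                  regexec("^(\\d{2})_(\\d+)_([0-9.]+)-([0-9.]+)-(\\d+)-([LK])\\((\\d+)\\)\\.mat$",
                          fname))[[1]]
  if (length(m) == 0) return(NULL)
  data.frame(target_type = m[2], angle_deg = m[3], period_ms = m[4],
             bandwidth = m[5], range_bin = m[6], band = m[7], index = m[8])
}

parse_lss_name("01_60_0.300-40-12-K(03).mat")
```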

  12. Trend analysis of select hydrologic metrics in the Mobile Bay contributing...

    • gimi9.com
    Updated May 24, 2025
    + more versions
    Cite
    (2025). Trend analysis of select hydrologic metrics in the Mobile Bay contributing watershed | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_trend-analysis-of-select-hydrologic-metrics-in-the-mobile-bay-contributing-watershed/
    Explore at:
    Dataset updated
    May 24, 2025
    Area covered
    Mobile Bay
    Description

    This data release provides comprehensive results of monotonic trend assessment for long-term U.S. Geological Survey (USGS) streamgages in or proximal to the watersheds of Mobile and Perdido Bays, south-central United States (Tatum and others, 2024). Long-term is defined as streamgages having at least five complete decades of daily streamflow data since January 1, 1950, exclusive to those streamgages also having the entire 2010s decade represented. Input data for the trend assessment are daily streamflow data retrieved on March 8, 2024 (U.S. Geological Survey, 2024) and formatted using the fill_dvenv() function in akqdecay (Crowley-Ornelas and others, 2024). Monotonic trends were assessed for each of 69 streamgages using 26 Mann-Kendall hypothesis tests for 20 hydrologic metrics understood as particularly useful in ecological studies (Henriksen and others, 2006) with another 6 metrics measuring well-known streamflow properties, such as annual harmonic mean streamflow (Asquith and Heitmuller, 2008) and annual mean streamflow with decadal flow-duration curve quantiles (10th, 50th, and 90th percentiles) (Crowley-Ornelas and others, 2023). Helsel and others (2020) provide background and description of the Mann-Kendall hypothesis test. Some of the trend analyses are based on the annual values of a hydrologic metric (calendar year is the time interval test) whereas others are decadal (decade is the time interval for the test). The principal result output for this data release (monotrnd_1hyp.txt) clearly distinguishes the time interval for the respective tests. This data release includes the computational workflow to conduct the hypothesis testing and requisite data manipulations to do so. The workflow is comprised of the core computation script monotrnd_script.R and an auxiliary script containing functions for 20 ecological flow metrics. This means that script monotrnd_script.R requires additional functions to be loaded into the R workspace and sources the file monotrnd_ecomets_include.R. This design is useful as part of isolation of the 20 ecological-oriented hydrologic metrics (subroutines) (logic and nomenclature therein is informed by Henriksen and others, 2006) from the streamgage-looping workflow and other data manipulation features in monotrnd_script.R. The monotrnd_script.R is designed to use time series of daily mean streamflow stored in an R environment data object using the streamgage identification number as the key and a data frame (table) of the daily streamflows in the format defined by the dvget() and filled by the filldv_env() functions of the akqdecay R package (See supplemental information section; Crowley-Ornelas and others, 2024). Additionally, monotrnd_script.R tags a specific subset of streamgages within the workflow, identified by the authors as "major nodes," with a binary indicator (1 or 0) to support targeted analyses on these selected locations. The data in file monotrnd_1hyp.txt are comma-delimited results of Kendall tau or other test statistics and p-values of the Mann-Kendall hypothesis tests as part of monotonic trend assessment for 69 USGS streamgages using 26 Mann–Kendall hypothesis tests on a variety of streamflow metrics. 
The data include USGS streamgage identification numbers with prepended "S" character, decimal latitudes and longitudes for the streamgage locations, range of calendar years and decades of streamflow processed along with integer counts of the number of calendar years and decades, and Kendall tau (or other test statistic) and associated p-value of the test statistic for the 26 streamflow metrics considered. Broadly, the "left side of the table" presents the results for the tests on metrics using calendar year time steps, and the "right side of the table" presents the results for the tests on metrics using decade time steps. The content of the file does not assign or draw conclusions on statistical significance because the p-values are provided. The file monotrnd_dictionary_1hyp.txt is a simple plain-text, pipe-delimited file of directly human-readable short definitions for the columns in monotrnd_1hyp.txt. (This dictionary and two others accompany this data release to facilitate potential reuse of information by some users.) The source of monotrnd_1hyp.txt stems from ending computational steps in script monotrnd_script.R. Short summaries synthesizing information in file monotrnd_1hyp.txt are available in files monotrnd_3cnt.txt and monotrnd_2stn.txt, also accompanying this data release.

The data in file monotrnd_2stn.txt are comma-delimited summaries by streamgage identification number of the monotonic trend assessments for 26 Mann-Kendall hypothesis tests on streamflow metrics as described elsewhere in this data release. The summary data are composed of records (rows) by streamgage that include columns of (1) streamgage identification numbers with a prepended "S" character, (2) decimal latitudes and longitudes for the streamgage locations, (3) the integer counts of the number of hypothesis tests, (4) the integer count of the number of tests for which the computed hypothesis test p-values are less than the 0.05 level of statistical significance (so-called alpha = 0.05), and (5) colon-delimited strings of alphanumeric characters identifying each of the statistically significant tests for the respective streamgage. The file monotrnd_dictionary_2stn.txt is a simple plain-text, pipe-delimited file of directly human-readable short definitions for the columns in monotrnd_2stn.txt. The source of monotrnd_2stn.txt stems from ending computational steps in script monotrnd_script.R described elsewhere in this data release from its production of monotrnd_1hyp.txt; this latter data file provides the values used to assemble monotrnd_2stn.txt.

The file monotrnd_3cnt.txt contains comma-delimited summaries of Kendall tau or other test statistic arithmetic means, as well as integer counts of statistically significant trends, as part of the monotonic trend assessment using 26 Mann-Kendall hypothesis tests on a variety of streamflow metrics for 69 USGS streamgages as described elsewhere in this data release. The two-column summary data are composed of a first row indicating by character string the integer number of streamgages (69) and then subsequent rows in pairs of a three-decimal character-string representation of the mean Kendall tau (or the test statistics of a seasonal Mann-Kendall test) followed by a character string of the integer count of statistically significant tests for the respective test as it was applied to the 69 streamgages. Statistical significance is defined as p-values less than the 0.05 level of statistical significance (so-called alpha = 0.05). The file monotrnd_dictionary_3cnt.txt is a simple plain-text, pipe-delimited file of directly human-readable short definitions for the columns in monotrnd_3cnt.txt. The source of monotrnd_3cnt.txt stems from ending computational steps in script monotrnd_script.R described elsewhere in this data release from its production of monotrnd_1hyp.txt; this latter data file provides the values used to assemble monotrnd_3cnt.txt.
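For orientation, a single Mann-Kendall test on annual values of one metric can be run in base R as below; this is only an illustration with simulated numbers, not the scripted workflow (monotrnd_script.R) used to produce the files described above.

```r
# Mann-Kendall trend test on annual values of one hydrologic metric,
# via Kendall's tau between the metric and time (simulated example data).
set.seed(10)
annual <- data.frame(year = 1950:2019)
annual$metric <- 100 + 0.3 * (annual$year - 1950) + rnorm(nrow(annual), sd = 5)

res <- cor.test(annual$year, annual$metric, method = "kendall")
res$estimate   # Kendall's tau
res$p.value    # p-value; compare against alpha = 0.05 as in the summaries above
```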

  13. Seair Exim Solutions

    • seair.co.in
    Updated Mar 3, 2025
    Cite
    Seair Exim (2025). Seair Exim Solutions [Dataset]. https://www.seair.co.in
    Explore at:
    .bin, .xml, .csv, .xls
    Dataset updated
    Mar 3, 2025
    Dataset provided by
    Seair Info Solutions PVT LTD
    Authors
    Seair Exim
    Area covered
    United States
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  14. Data analysis and graphs for publication

    • portal.edirepository.org
    bin, jpeg, png
    Updated 2017
    Cite
    Douglas Moore; Seth Newsome; Esteban Muldavin; Alesia Hallmark; Sydney Jones (2017). Data analysis and graphs for publication [Dataset]. http://doi.org/10.6073/pasta/c6e0aa643a5740a2c905bff3dcf2fa41
    Explore at:
    jpeg, png, bin
    Dataset updated
    2017
    Dataset provided by
    EDI
    Authors
    Douglas Moore; Seth Newsome; Esteban Muldavin; Alesia Hallmark; Sydney Jones
    Time period covered
    Jan 1, 1992 - Sep 18, 2016
    Area covered
    Description

    Sevilleta LTER datasets 182 and 129 (http://sev.lternet.edu/data/sev-182, -129) and the R code veg_analysis.r were used to calculate standing biomass and net primary productivity (NPP) for the Creosote and Black Grama core sites. Gap-filled meteorological data from station 49, detailed in data packet 1, were used to plot precipitation in relation to these plant productivity metrics. Long-term small mammal trapping data from 1989 to present (http://sev.lternet.edu/data/sev-8) were used to plot temporal variation in the small mammal community at both the creosote and black grama core sites. The program veg_analysis.r uses all of these data to generate a multi-panel figure illustrating precipitation, standing biomass, seasonal net primary productivity, and small mammal population trends for each site. In addition, we used ordination and community dynamics metrics (CoDyn in R) to show temporal differences in community composition of the small mammal data in each of the core sites (sev_rodent_com_analysis.r).

  15. R script to generate climate indicators to support adaptation of vegetable...

    • entrepot.recherche.data.gouv.fr
    pdf, tsv, txt +1
    Updated Feb 13, 2023
    Cite
    Kevin Morel; Nabil Touili (2023). R script to generate climate indicators to support adaptation of vegetable farms [Dataset]. http://doi.org/10.57745/0BWBPD
    Explore at:
    tsv(435317), tsv(3187), txt(2496608), tsv(389), txt(33174), tsv(2015), txt(3848), txt(427799), type/x-r-syntax(33227), pdf(261998), tsv(22002), tsv(2561071), txt(35742), txt(4210), tsv(292), tsv(3385)
    Dataset updated
    Feb 13, 2023
    Dataset provided by
    Recherche Data Gouv
    Authors
    Kevin Morel; Nabil Touili
    License

    https://spdx.org/licenses/etalab-2.0.html

    Dataset funded by
    Conseil départemental de l'Essonne
    LEADER
    Labex BASC
    Description

    This folder contains an R script that generates synthetic tables (at seasonal and annual scale) of climate indicators relevant to supporting vegetable farmers as they consider adaptation to climate change over the short term (2021-2040) and the longer term (2060). The script takes as input climate projections from the DRIAS portal. An example of input data and output tables is given for the Saclay area (Essonne, France). This work was carried out in the framework of the project "CLIMALEG: adaptation des producteurs de légumes au changement climatique", from 2021 to 2022 in the Ile-de-France region, France. The files are organized into folders, so use the "Tree" view to see them.
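A minimal sketch of the kind of seasonal summary table such a script produces, assuming daily projection data with date, tmax, and precip columns (the real DRIAS input files and the CLIMALEG indicators differ):

```r
# Aggregate daily climate projections into seasonal indicators per year.
library(dplyr)

proj <- read.csv("data/drias_projection_saclay.csv")   # assumed columns: date, tmax, precip
proj$date <- as.Date(proj$date)

season_of <- function(m) c("winter", "winter", "spring", "spring", "spring",
                           "summer", "summer", "summer", "autumn", "autumn",
                           "autumn", "winter")[m]

indicators <- proj %>%
  mutate(year = as.integer(format(date, "%Y")),
         season = season_of(as.integer(format(date, "%m")))) %>%
  group_by(year, season) %>%
  summarise(mean_tmax = mean(tmax),
            hot_days = sum(tmax >= 30),     # days at or above 30 degrees C (assumed threshold)
            total_precip = sum(precip),
            .groups = "drop")
```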

  16. David r small USA Import & Buyer Data

    • seair.co.in
    Updated Dec 18, 2016
    + more versions
    Cite
    Seair Exim Solutions (2016). David r small USA Import & Buyer Data [Dataset]. https://www.seair.co.in/us-importers/david-r-small.aspx
    Explore at:
    .text/.csv/.xml/.xls/.bin
    Dataset updated
    Dec 18, 2016
    Dataset authored and provided by
    Seair Exim Solutions
    Area covered
    United States
    Description

    View David R Small USA import data, including customs records, shipments, HS codes, suppliers, buyer details and company profile, at Seair Exim.

  17. Edward R Small Export Import Data | Eximpedia

    • eximpedia.app
    Updated Sep 4, 2025
    + more versions
    Cite
    Seair Exim (2025). Edward R Small Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/
    Explore at:
    .bin, .xml, .csv, .xls
    Dataset updated
    Sep 4, 2025
    Dataset provided by
    Eximpedia PTE LTD
    Eximpedia Export Import Trade Data
    Authors
    Seair Exim
    Area covered
    Guinea, Christmas Island, Belgium, Russian Federation, Saint Lucia, Sri Lanka, Myanmar, Bahrain, Nicaragua, Montenegro
    Description

    Edward R Small Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

  18. The forecasting results in different models.

    • plos.figshare.com
    xls
    Updated Apr 29, 2025
    Cite
    Changsheng Li; Xinsong Zhang (2025). The forecasting results in different models. [Dataset]. http://doi.org/10.1371/journal.pone.0321478.t004
    Explore at:
    xls
    Dataset updated
    Apr 29, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Changsheng Li; Xinsong Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Deep learning has significantly advanced the prediction of stress-strain curves. However, because of the complex mechanical properties of rock materials, existing deep learning methods are not sufficiently accurate when predicting the stress-strain curves of rock materials. This paper proposes a deep learning method based on a long short-term memory autoencoder (LSTM-AE) for predicting stress-strain curves of rock materials in discrete element numerical simulations. The LSTM-AE approach uses the LSTM network to construct both the encoder and decoder, where the encoder extracts features from the input data and the decoder generates the target sequence for prediction. The mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2) between the predicted and true values are used as the evaluation metrics. The proposed LSTM-AE network is compared with the LSTM network, recurrent neural network (RNN), BP neural network (BPNN), and XGBoost model. The results indicate that the LSTM-AE network is more accurate than LSTM, RNN, BPNN, and XGBoost. Furthermore, the robustness of the LSTM-AE network is confirmed by predicting 10 sets of special samples. However, the scalability of the LSTM-AE network in handling large datasets and its applicability to predicting laboratory datasets need further verification. Nevertheless, this study provides a valuable reference for improving the prediction accuracy of stress-strain curves in rock materials.

  19. Original data.

    • plos.figshare.com
    xlsx
    Updated Mar 7, 2024
    Cite
    Shengchao Zhu; Yongjun Qin; Xin Meng; Liangfu Xie; Yongkang Zhang; Yangchun Yuan (2024). Original data. [Dataset]. http://doi.org/10.1371/journal.pone.0298524.s001
    Explore at:
    xlsx
    Dataset updated
    Mar 7, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Shengchao Zhu; Yongjun Qin; Xin Meng; Liangfu Xie; Yongkang Zhang; Yangchun Yuan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The uneven settlement of the surrounding ground surface caused by subway construction is not only complicated but also liable to cause casualties and property damage, so a timely understanding of ground settlement deformation during subway excavation, and its prediction in real time, is of practical significance. Because of the complex nonlinear relationship between subway settlement deformation and its numerous influencing factors, as well as time lag effects in the process, the performance and accuracy of traditional prediction methods no longer meet industry demands. This paper therefore proposes a surface settlement deformation prediction model that combines noise reduction and an attention mechanism (AM) with a long short-term memory (LSTM) network. The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and independent component analysis (ICA) methods are used to denoise the original input data and are then combined with the AM and LSTM for prediction, yielding the CEEMDAN-ICA-AM-LSTM (CIAL) prediction model. An analysis of settlement monitoring data from the construction site of Urumqi Rail Transit Line 1 shows that the proposed model predicts surface settlement deformation more effectively and with wider applicability than several alternative prediction models. The CIAL model achieves RMSE, MAE and MAPE values of 0.041, 0.033 and 0.384%, respectively, together with the largest R2 among the compared models, giving the most accurate and most reliable predictions. The new method is effective for monitoring the safety of surface settlement deformation.

  20. Simulation study comparing length of time series

    • figshare.com
    txt
    Updated Nov 6, 2025
    Cite
    Alfonso Ruiz Moreno (2025). Simulation study comparing length of time series [Dataset]. http://doi.org/10.6084/m9.figshare.28783709.v1
    Explore at:
    txt
    Dataset updated
    Nov 6, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Alfonso Ruiz Moreno
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains all the scripts and data used in the simulation study titled “Simulation study comparing length of time series”, presented in the manuscript “Longer time series with missing data improve parameter estimation in State-Space models in coral reef fish communities”. There are 108 files in total. All model fits were run on the HPC cluster at James Cook University. Depending on the simulated dataset, run times ranged from several hours up to 1-2 days per fit.

    Simulated data
    • multisp_sim_dat.R: Simulates communities of 20 species across 41 reefs. We used this script to generate 20 simulated communities, which were saved as sim_s1.RData through sim_s20.RData.

    Model fitting to simulated data
    • fit_11y.R: Script to implement the Short fit from the manuscript. For each simulated dataset it: 1) saves posterior model parameters, 2) computes diagnostics (divergent transitions, tree depth saturation, E-BFMI), and 3) generates effective sample size (ESS) and R-hat plots. Example output: the input simulated dataset sim_s1.RData produces the output fit_11y_s1.RData. We include all fit output files up to fit_11y_s20.RData. We did not include the diagnostics figures in this repository, but users can generate them by running the code.
    • fit_18y.R: Script to implement the Intermediate fit from the main text. It follows the same process as above. Output files range from fit_18y_s1.RData up to fit_18y_s20.RData.
    • fit_25yNA.R: Script to implement the Missing-data fit from the main text. It follows the same process as above. Output files range from fit_25yNA_s1.RData up to fit_25yNA_s20.RData.
    • fit_25yc.R: Script to implement the Full fit from the main text. It follows the same process as above. Output files range from fit_25yc_s1.RData up to fit_25yc_s20.RData.

    Stan model
    • MARPLN_LV.stan: Stan code for the multivariate autoregressive Poisson-Lognormal model with the latent variables. This model is used in the files fit_11y.R, fit_18y.R and fit_25yc.R.
    • MARPLN_LV_withNA.stan: Same as the model above, but it can also handle missing data. This model is used in the file fit_25yNA.R.

    Figure and analysis script
    • Figure 2.R: This script calculates the accuracy and precision estimates for all key parameters across simulations and generates Figure 2 from the manuscript.
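As a toy analogue of the simulated data described above (one species at one reef rather than 20 species across 41 reefs), the R sketch below generates counts from a lognormal AR(1) latent state observed with Poisson noise, then blanks some years to mimic the missing-data scenario; all parameter values are assumptions, not those used in multisp_sim_dat.R.

```r
# Toy single-species analogue of the multispecies simulation.
set.seed(42)
n_years <- 25
phi <- 0.7; mu <- 2; sigma <- 0.3      # assumed AR(1) parameters on the log scale

log_abund <- numeric(n_years)
log_abund[1] <- mu
for (t in 2:n_years) {
  log_abund[t] <- mu + phi * (log_abund[t - 1] - mu) + rnorm(1, sd = sigma)
}

# Observed counts: Poisson noise around the latent lognormal abundance.
counts <- rpois(n_years, lambda = exp(log_abund))

# Mimic the "Missing-data fit" scenario by blanking five survey years.
counts_with_gaps <- counts
counts_with_gaps[sample(n_years, 5)] <- NA
```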
