30 datasets found

Dataset: The plural interpretability of German linking elements...
live.european-language-grid.eu
data.niaid.nih.gov
csv
Updated Aug 15, 2021
Cite
(2021). Dataset: The plural interpretability of German linking elements ("Morphology") [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/7422
Explore at:
Available download formats: csv
Dataset updated
Aug 15, 2021
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
This dataset accompanies a paper to be published in "Morphology" (JOMO, Springer). Under the present DOI, all data generated for this research as well as all scripts used are stored. The paper itself is not CC-licensed; refer to Springer's "Morphology" website for details.
Abstract: In this paper, we take a closer theoretical and empirical look at the linking elements in German N1+N2 compounds which are identical to the plural marker of N1 (such as -er with umlaut, as in Häus-er-meer 'sea of houses'). Various perspectives on the actual extent of plural interpretability of these pluralic linking elements are expressed in the literature. We aim to clarify this question by empirically examining to what extent there may be a relationship between plural form and meaning which informs in which sorts of compounds pluralic linking elements appear. Specifically, we investigate whether pluralic linking elements occur especially frequently in compounds where a plural meaning of the first constituent is induced either externally (through plural inflection of the entire compound) or internally (through a relation between the constituents such that N2 forces N1 to be conceptually plural, as in the example above). The results of a corpus study using the DECOW16A corpus and a split-100 experiment show that in the internal but not external plural meaning conditions, a pluralic linking element is preferred over a non-pluralic one, though there is considerable inter-speaker variability, and limitations imposed by other constraints on linking element distribution also play a role. However, we show the overall tendency that German language users do use pluralic linking elements as cues to the plural interpretation of N1+N2 compounds. Our interpretation does not reference a specific morphological framework. Instead, we view our data as strengthening the general approach of probabilistic morphology.
Common English Parts-of-speech
kaggle.com
zip
Updated Nov 3, 2022
Cite
The Devastator (2022). Common English Parts-of-speech [Dataset]. https://www.kaggle.com/datasets/thedevastator/common-english-parts-of-speech
Explore at:
Available download formats: zip (1253764 bytes)
Dataset updated
Nov 3, 2022
Authors
The Devastator
Description
Common English Parts-of-speech

Over 8,000 words and their plural forms

About this dataset

This dataset provides ample information on over 8,000 various English words, including nouns and their plural forms. By mining this data, researchers can gain valuable insights into understanding the English language in a more efficient way

How to use the dataset

This dataset can be used to help researchers understand the English language in a new and innovative way. The data includes information on over 8,000 different English words, including nouns and their plural forms. This dataset is particularly useful for investigating the relationships between words and their plural forms

Research Ideas

To create a program that can automatically generate plural forms of nouns.

To study the relationships between different words and their plural forms.

To develop a better understanding of the English language for non-native speakers
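
The first research idea above can be sketched as a small rule-based pluralizer. This is only an illustration and not part of the dataset: the function name `pluralize`, the three suffix rules, and the tiny `IRREGULAR` table are assumptions, and a real program would presumably draw its irregular forms from the dataset's noun files, whose column layout is not documented here.

```python
# Hypothetical rule-based pluralizer. The rules and the IRREGULAR table are
# illustrative assumptions; they are not derived from the dataset.
IRREGULAR = {"child": "children", "man": "men", "mouse": "mice", "sheep": "sheep"}

VOWELS = "aeiou"


def pluralize(noun: str) -> str:
    """Return a plausible English plural for a lowercase singular noun."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    # Sibilant endings take -es: bus -> buses, box -> boxes, church -> churches.
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    # Consonant + y becomes -ies: city -> cities (but day -> days).
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in VOWELS:
        return noun[:-1] + "ies"
    # Default rule: append -s.
    return noun + "s"
```

These rules cover only the most regular patterns; forms such as "leaf"/"leaves" or "quiz"/"quizzes" would need further rules or lookup entries, which is where a word list with attested plural forms could help.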

Acknowledgements

License

See the dataset description for more information.

Columns

File: adjectives.csv

File: adverbs.csv

File: nouns.csv

| Column name | Description |
|:--------------|:-----------------------------------------------------------|
| 007 | The code name of the character. (String) |
| 007s | The number of times the character has been used. (Integer) |

File: plural-nouns.csv

File: verbs.csv

| Column name | Description |
|:--------------|:-----------------------------|
| awake | (adjective) to stop sleeping |
| awoke | (verb) to stop sleeping |
| awoken | (verb) to stop sleeping |

File: words-multiple-present-participle.csv

| Column name | Description |
|:-----------------------------------|:-------------------------------------------------------------|
| Word | The word being described. (String) |
| Present Participle | The present participle form of the word. (String) |
| Present Participle Alternative | An alternative present participle form of the word. (String) |
Data from: On the Approximation of Singular Functions by Series of...
data.niaid.nih.gov
data-staging.niaid.nih.gov
Updated Sep 7, 2023
Cite
Mohan Zhao; Kirill Serkh (2023). On the Approximation of Singular Functions by Series of Non-integer Powers [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8323314
Explore at:
Dataset updated
Sep 7, 2023
Dataset provided by
University of Toronto
Authors
Mohan Zhao; Kirill Serkh
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This file provides the singular powers and collocation points described in the paper "On the Approximation of Singular Functions by Series of Non-integer Powers," available on arXiv, for several useful combinations of the parameters a, b, and the precision ε. It also includes a MATLAB script which demonstrates the effectiveness of these singular powers and collocation points for approximating singular functions of the form x^c, where c is in the interval [a,b].
Replication Data for: Understanding ‘many’ through the lens of Ukrainian...
search-demo.dataone.org
dataverse.no
Updated Sep 25, 2024
Cite
Janda, Laura Alexis (2024). Replication Data for: Understanding ‘many’ through the lens of Ukrainian багато [Dataset]. http://doi.org/10.18710/Y7VGQE
Explore at:
Unique identifier
https://doi.org/10.18710/Y7VGQE
Dataset updated
Sep 25, 2024
Dataset provided by
DataverseNO
Authors
Janda, Laura Alexis
Time period covered
Jan 1, 1742 - Jan 1, 2023
Description
Dataset description: The General Regionally Annotated Corpus of Ukrainian (GRAC, Shvedova et al. 2017-2024, uacorpus.org) was consulted to collect data for further analysis concerning the distribution of Singular vs. Plural verb forms in the target bahato construction. GRAC is a Sketch Engine corpus of over 1.8 billion words, representing texts from over 30,000 authors created between 1816 and 2023. This corpus is designed to serve as source material for linguistic research on Standard Ukrainian. Our data was collected during the month of February 2024. We extracted and annotated 28,491 examples of the bahato construction. An additional set of examples was collected from the Russian National Corpus (ruscorpora.ru) during the month of August 2024 to provide comparison with the Russian mnogo construction. For this purpose, 6,612 examples were extracted and annotated for word order and Singular vs. Plural verb agreement. Both the Ukrainian and the Russian data are included in this dataset, along with the R scripts used to analyze this data.
Article abstract: We reveal an ongoing language change in Ukrainian involving a construction with a subject comprised of the indefinite quantifier багато ‘many’ modifying a noun phrase in the Genitive Plural. Number agreement on the verb varies, allowing both Singular (in 69.1% of attestations) and Plural (in 30.9% of attestations). Based on statistical analysis of corpus data, we investigate the influence of the factors of year of creation, word order of subject and verb, and animacy of the subject on the choice of verb number. We find that, while all combinations of word order and animacy are robustly attested, VS word order and inanimate subjects tend to prefer Singular, whereas SV word order and animate subjects tend to prefer Plural. Since about the 1950s, the proportion of Plural has been increasing, overtaking Singular in the current decade. We propose that this Singular vs. Plural variation is motivated by the human embodied experience of construing a group of items as either a homogeneous mass (and therefore Singular) or a multiplicity of individuals (and therefore Plural). This proposal is supported by the identification of micro-constructions that prefer Singular and show reduced individuation of human beings.
Data for: Filling the data gaps within GRACE missions using Singular...
darus.uni-stuttgart.de
Updated May 14, 2021
Cite
Shuang Yi; Nico Sneeuw (2021). Data for: Filling the data gaps within GRACE missions using Singular Spectrum Analysis [Dataset]. http://doi.org/10.18419/DARUS-807
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.18419/DARUS-807
Dataset updated
May 14, 2021
Dataset provided by
DaRUS
Authors
Shuang Yi; Nico Sneeuw
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dozens of missing epochs in the monthly gravity product of the satellite mission Gravity Recovery and Climate Experiment (GRACE) and its follow-on (GRACE-FO) mission greatly inhibit the complete analysis and full utilization of the data. Despite previous attempts to handle this problem, a general all-purpose gap-filling solution is still lacking. Here we propose a non-parametric, data-adaptive and easy-to-implement approach - composed of the Singular Spectrum Analysis (SSA) gap-filling technique, cross-validation, and spectral testing for significant components - to produce reasonable gap-filling results in the form of spherical harmonic coefficients (SHCs). We demonstrate that this approach is adept at inferring missing data from long-term and oscillatory changes extracted from available observations. A comparison in the spectral domain reveals that the gap-filling result resembles the product of GRACE missions below spherical harmonic degree 30 very well. As the degree increases above 30, the amplitude per degree of the gap-filling result decreases more rapidly than that of GRACE/GRACE-FO SHCs, showing effective suppression of noise. As a result, our approach can reduce noise in the oceans without sacrificing resolution on land. The gap-filling dataset is stored in the "SSA_filing/" folder. Each file represents a monthly result in the form of spherical harmonics. The data format follows the convention of the site ftp://isdcftp.gfz-potsdam.de/grace/. Low degree corrections (degree-1, C20, C30) have been made. The code to generate the dataset is located in the "code_share/" folder, with an example for C30. The model-based Greenland mass balance result for data validation (results given in the paper) is provided in the "Greenland_SMB-D.txt" file.
Error rates (in %) for each error type of Experiment 1, averaged across...
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz (2023). Error rates (in %) for each error type of Experiment 1, averaged across items for each participant. [Dataset]. http://doi.org/10.1371/journal.pone.0200723.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0200723.t003
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Standard deviations are presented in parentheses.
Data from: Production of Dutch variable plurals in language corpora
ssh.datastations.nl
pdf, tsv, txt +3
Updated Aug 24, 2021
Cite
T.J. Zee; L.F.M. ten Bosch; I. Plag; M.T.C. Ernestus (2021). Production of Dutch variable plurals in language corpora [Dataset]. http://doi.org/10.17026/DANS-XVR-QSCF
Explore at:
Available download formats: tsv(32355), txt(305001), zip(21019), txt(48119), tsv(140592), type/x-r-syntax(52913), txt(2916), txt(12328), pdf(195426), xml(5232), txt(1562)
Unique identifier
https://doi.org/10.17026/DANS-XVR-QSCF
Dataset updated
Aug 24, 2021
Dataset provided by
DANS Data Station Social Sciences and Humanities
Authors
T.J. Zee; L.F.M. ten Bosch; I. Plag; M.T.C. Ernestus
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A growing body of work in psycholinguistics suggests that morphological relations between word forms affect the processing of complex words. Previous studies have usually focused on a particular type of paradigmatic relation, for example the relation between paradigm members, or the relation between alternative forms filling a particular paradigm cell. However, potential interactions between different types of paradigmatic relations have remained relatively unexplored. The data in this data set were used in two corpus studies of variable plurals in Dutch to test hypotheses about potentially interacting paradigmatic effects. The first study (which uses the s_dist data) shows that generalization across noun paradigms predicts the distribution of plural variants, and that this effect is diminished for paradigms in which the plural variants are more likely to have a strong representation in the mental lexicon. The second study (which uses the s_dur data) demonstrates that the pronunciation of a target plural variant is affected by coactivation of the alternative variant, resulting in shorter segmental durations. This effect is dependent on the representational strength of the alternative plural variant. In sum, the distributional and durational measurements in these data provide evidence that storage of morphologically complex words may affect the role of generalization and coactivation during production. A full description of the data gathering process and the analyses is given in the Methodology file. The Readme file describes how the remaining files relate to the research.
Background data (adapted from Jenset & McGillivray 2017) for: Down-sampling...
dataverse.no
dataverse.azure.uit.no
+1 more
bin, text/tsv, txt
Updated Jul 17, 2025
Cite
Lukas Sönning (2025). Background data (adapted from Jenset & McGillivray 2017) for: Down-sampling from hierarchically structured corpus data [Dataset]. http://doi.org/10.18710/5KCE4U
Explore at:
Available download formats: bin(13462), text/tsv(2120816), txt(12381)
Unique identifier
https://doi.org/10.18710/5KCE4U
Dataset updated
Jul 17, 2025
Dataset provided by
DataverseNO
Authors
Lukas Sönning
License
https://dataverse.no/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.18710/5KCE4U
Time period covered
Jan 1, 1500 - Dec 31, 1707
Area covered
United Kingdom
Description
Dataset description: This dataset, which is adapted from Jenset and McGillivray (2017), contains tabular files documenting the alternating usage of -(e)th and -(e)s to mark third-person verb inflection in Early Modern English. The data provided by Jenset and McGillivray (2017) are drawn from the PPCEME corpus (Kroch et al. 2004) and cover the period from 1500 to 1700. In total, 13,757 third-person singular tokens (excluding the verb BE) were annotated by these authors for a range of variables. For the purposes of the present methodological study, this dataset was reduced to a subset of 11,645 tokens, and the coding of variables was in some parts revised, completed, or modified. The dataset includes information about the Author and Verb Lemma, as well as a number of predictor variables, including Genre, Year, Frequency (of the verb lemma in the third-person singular), Phonological Context (stem-final sound), and the Gender of the author.
Abstract for related publication: Resource constraints often force researchers to down-size the list of tokens returned by a corpus query. This paper sketches a methodology for down-sampling and offers a survey of current practices. We build on earlier work and extend the evaluation of down-sampling designs to settings where tokens are clustered by text file and lexeme. Our case study deals with third-person present-tense verb inflection in Early Modern English and focuses on five predictors: Year, Gender, Genre, Frequency, and Phonological Context. We evaluate two strategies for selecting 2,000 (out of 11,645) tokens: simple down-sampling, where each hit has the same selection probability; and structured down-sampling, where this probability is inversely proportional to the author- and verb-specific token count. We form 500 sub-samples using each scheme and compare regression results to a reference model fit to the full set of cases. We observe that structured down-sampling shows better performance on several evaluation criteria.
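The two selection schemes named in the abstract can be illustrated with a short sketch. It is not the author's code: the token representation (dicts with hypothetical `author` and `verb` keys), the function names, and the reading of "author- and verb-specific token count" as the number of tokens sharing the same author and verb pair are all assumptions. The weighted draw uses Efraimidis-Spirakis keys, a generic technique swapped in here, and because sampling is without replacement the resulting inclusion probabilities are only approximately proportional to the weights.

```python
import random
from collections import Counter


def simple_downsample(tokens, k, rng):
    """Simple down-sampling: every token has the same selection probability."""
    return rng.sample(tokens, k)


def structured_downsample(tokens, k, rng):
    """Structured down-sampling with weight 1 / (author, verb) cell size.

    Assumption: the cell is the author-verb pair; the source does not
    spell out how the two counts are combined.
    """
    cell_sizes = Counter((t["author"], t["verb"]) for t in tokens)
    # Efraimidis-Spirakis: keep the k largest keys u ** (1 / w). With
    # w = 1 / cell_size the exponent 1 / w is simply cell_size.
    keyed = [
        (rng.random() ** cell_sizes[(t["author"], t["verb"])], i)
        for i, t in enumerate(tokens)
    ]
    keyed.sort(reverse=True)
    return [tokens[i] for _, i in keyed[:k]]


# Toy data (invented): author A contributes 90 tokens, author B only 10.
rng = random.Random(0)
toy = [{"id": i, "author": "A" if i < 90 else "B", "verb": "say"} for i in range(100)]
sample = structured_downsample(toy, 10, rng)
```

Under this weighting, tokens from sparsely attested author-verb cells are more likely to be kept than they would be under simple down-sampling, which is the intended contrast between the two designs.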
Computing a partial elastic shape registration of 3D surfaces using dynamic...
data.nist.gov
catalog.data.gov
Updated Oct 30, 2023
+ more versions
Cite
National Institute of Standards and Technology (2023). Computing a partial elastic shape registration of 3D surfaces using dynamic programming [Dataset]. http://doi.org/10.18434/mds2-3056
Explore at:
Unique identifier
https://doi.org/10.18434/mds2-3056, https://identifiers.org/ark:/88434/mds2-3056
Dataset updated
Oct 30, 2023
Dataset provided by
National Institute of Standards and Technology http://www.nist.gov/
License
https://www.nist.gov/open/license
Description
Fortran and Matlab programs, Matlab mex file of Fortran program, compiled mex file, and sample data files, etc. for computing a partial elastic shape registration of two simple surfaces in 3-dimensional space and the elastic shape distance between them corresponding to the partial registration.
Data Files for Tresoldi/Robinson article on spelling variation in Canterbury...
zenodo.org
data-staging.niaid.nih.gov
+1 more
zip
Updated Nov 23, 2024
Cite
Tiago Henrique Tresoldi Tresoldi (2024). Data Files for Tresoldi/Robinson article on spelling variation in Canterbury Tales manuscripts [Dataset]. http://doi.org/10.5281/zenodo.14209129
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.14209129
Dataset updated
Nov 23, 2024
Dataset provided by
Zenodo http://zenodo.org/
Authors
Tiago Henrique Tresoldi Tresoldi
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data is at https://github.com/peterrobinson/CTSpellingArticle2024. It is contained in three folders, each folder corresponding to one of the three sets of data used in this analysis, as follows:

“sorted by regularization”. This folder contains spelling data and results derived from the regularization process, where (for example) spellings of forms regularized to “goode” are distinguished from spellings of forms regularized to “god”;

“sorted by part of speech”. This folder contains spelling data and results derived from a lemmatization and part-of-speech identification process, where (for example) spellings of forms lemmatized to “goode” singular adjective are distinguished from spellings of forms regularized to “goode” plural adjective, and forms lemmatized to “gode” singular noun nominative case are distinguished from “gode” singular noun oblique case (as in “to gode”).

“all spellings unsorted”. This folder contains spelling data as undifferentiated counts of “bags of words”: for each witness: so many occurrences of “good”, so many of “goode”, so many of “god”, so many of “gode”.

Each folder contains the following files (under various names):

A .json file holding all the data, structured according to its categorization. The “sorted by part of speech” folder contains two .json files, one with spellings organized by headword lemma, the other organized by part-of-speech;

Two .nex Nexus files containing all the data. In the “sorted by regularization” and “sorted by part of speech” folders one Nexus file groups spellings by variant sites within each line, the other Nexus file groups spellings by words within each line. In the “all spellings unsorted” folder one Nexus file contains all the spellings organized by spelling; the second holds a Nexus distance matrix with distances created according to the Manhattan distance algorithm;

A .dst distance matrix file, containing a distance matrix constructed with distances calculated by the Manhattan distance algorithm;

A “features” file, containing a spreadsheet ranking each variant site according to its impact on the analysis

Multiple .pdf files visualizing the results of our analysis, with the names reflecting the analysis each contains. Files with names including “Splits” were created using the SplitsTree algorithm and software (Huson and Bryant 2006; “SplitsTree | Universität Tübingen,” n.d.)

The “sorted by regularization” folder also contains a single image file, “tiagoplot1.jpg”, visualizing the results of PCA analysis on the “sorted by regularization” data.
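
The Manhattan distance used for the distance matrices can be illustrated on the "bags of words" representation described above. The sketch below is not taken from the repository: the function name `manhattan_distance`, the witness names, and the counts are invented for illustration, and the authors' actual pipeline may normalize or weight counts differently.

```python
from collections import Counter


def manhattan_distance(a: Counter, b: Counter) -> int:
    """Sum of absolute count differences over every spelling in either bag."""
    # A Counter returns 0 for spellings it has never seen.
    return sum(abs(a[s] - b[s]) for s in set(a) | set(b))


# Invented spelling counts for two hypothetical witnesses.
witness_1 = Counter({"good": 3, "goode": 5, "god": 2})
witness_2 = Counter({"good": 1, "goode": 5, "gode": 4})

print(manhattan_distance(witness_1, witness_2))  # 2 + 0 + 2 + 4 = 8
```

Computing this value for every pair of witnesses yields a symmetric matrix with zeros on the diagonal, which is the kind of input a distance-based method such as SplitsTree expects.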
Replication Data for: A network of allostructions: quantified subject...
dataverse.no
search.dataone.org
bin, csv, html, pdf +2
Updated Sep 28, 2023
Cite
Laura A Janda; Tore Nesset (2023). Replication Data for: A network of allostructions: quantified subject constructions in Russian [Dataset]. http://doi.org/10.18710/4D2QII
Explore at:
Available download formats: xlsx(12830911), csv(1814986), csv(36013332), pdf(2269386), xlsx(1381766), html(2397841), txt(8856), pdf(175984), bin(20016)
Unique identifier
https://doi.org/10.18710/4D2QII
Dataset updated
Sep 28, 2023
Dataset provided by
DataverseNO
Authors
Laura A Janda; Tore Nesset
License
https://dataverse.no/api/datasets/:persistentId/versions/1.2/customlicense?persistentId=doi:10.18710/4D2QII
Time period covered
1800 - 2017
Area covered
Russian Federation
Dataset funded by
Norwegian Directorate for Higher Education and Skills
Description
Data and R code are provided for statistical analysis of approximately 39,000 corpus examples of predicate agreement in constructions with quantified subjects in Russian. The analysis indicates that these constructions constitute a network of constructions (“allostructions”) with various preferences for singular or plural agreement. Factors pull in different directions, and we observe a relatively stable situation in the face of variation. We present an analysis of a multidimensional network of allostructions in Russian, thus contributing to our understanding of allostructional relationships in Construction Grammar. With regard to historical linguistics, language stability is an understudied field. We illustrate an interplay of divergent factors that apparently resists language change. The syntax of numerals and other quantifiers represents a notoriously complex phenomenon of the Russian language. Our study sheds new light on the contributions of factors that favor singular or plural agreement in sentences with quantified subjects.
Mean item characteristics of French materials.
figshare.com
xls
Updated Jun 1, 2023
Cite
Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz (2023). Mean item characteristics of French materials. [Dataset]. http://doi.org/10.1371/journal.pone.0200723.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0200723.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
French
Description
Standard deviations are shown in parentheses.
Data from: Singular Dirichlet (p, q)-equations
resodate.org
Updated Jul 6, 2021
Cite
Nikolaos S. Papageorgiou; Patrick Winkert (2021). Singular Dirichlet (p, q)-equations [Dataset]. http://doi.org/10.14279/depositonce-12146
Explore at:
Unique identifier
https://doi.org/10.14279/depositonce-12146
Dataset updated
Jul 6, 2021
Dataset provided by
Technische Universität Berlin
DepositOnce
Authors
Nikolaos S. Papageorgiou; Patrick Winkert
Description
We consider a nonlinear Dirichlet problem driven by the (p, q)-Laplacian and with a reaction having the combined effects of a singular term and of a parametric (p−1)-superlinear perturbation. We prove a bifurcation-type result describing the changes in the set of positive solutions as the parameter λ>0 varies. Moreover, we prove the existence of a minimal positive solution u∗λ and study the monotonicity and continuity properties of the map λ→u∗λ.
Data from: Gender context effects in noun recognition: grammatical cues or...
tandf.figshare.com
figshare.com
pdf
Updated May 31, 2023
Cite
Cindy Bellanger; Jean-Pierre Chevrot; Elsa Spinelli (2023). Gender context effects in noun recognition: grammatical cues or co-occurrence effects? [Dataset]. http://doi.org/10.6084/m9.figshare.5092159.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.5092159.v1
Dataset updated
May 31, 2023
Dataset provided by
Taylor & Francis
Authors
Cindy Bellanger; Jean-Pierre Chevrot; Elsa Spinelli
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Determiners with congruent gender facilitate the recognition of the following noun. We examine two explanations of this effect: either gender information is retrieved and influences lexical access, or gender effects are due to the determiner-noun co-occurrence. French nouns are either feminine or masculine and are preceded by feminine or masculine determiners in the singular. Plural articles are unmarked for gender. Because some nouns (peanuts) occur more frequently in the plural than in the singular, they frequently co-occur with determiners that do not provide gender information. Conversely, nouns that occur more frequently in their singular form (cathedral) co-occur more frequently with gender-marked determiners. We examined the recognition of plural- and singular-oriented nouns preceded by gender-marked and unmarked determiners. Singular-oriented nouns were recognised faster after gender-marked (singular) articles than after gender-unmarked (plural) ones. However, plural-oriented nouns were recognised faster after gender-unmarked (plural) articles, suggesting that article-noun co-occurrence outweighs abstract gender cues.
Data from: On best rank one approximation of tensors
resodate.org
Updated Dec 17, 2021
Cite
Shmuel Friedland; Volker Mehrmann; Renato Pajarola; Susanne Suter (2021). On best rank one approximation of tensors [Dataset]. http://doi.org/10.14279/depositonce-14528
Explore at:
Unique identifier
https://doi.org/10.14279/depositonce-14528
Dataset updated
Dec 17, 2021
Dataset provided by
Technische Universität Berlin
DepositOnce
Authors
Shmuel Friedland; Volker Mehrmann; Renato Pajarola; Susanne Suter
Description
In this paper we suggest a new algorithm for the computation of a best rank one approximation of tensors, called 'alternating singular value decomposition'. This method is based on the computation of maximal singular values and the corresponding singular vectors of matrices. We also introduce a modification for this method and the alternating least squares method, which ensures that alternating iterations will always converge to a semi-maximal point. Finally, we introduce a new simple Newton-type method for speeding up the convergence of alternating methods near the optimum. We present several numerical examples that illustrate the computational performance of the new method in comparison to the alternating least square method.
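For orientation, the alternating least squares baseline mentioned in the description can be sketched for a 3-way tensor. This is a generic textbook-style iteration, not the authors' alternating singular value decomposition or their Newton-type acceleration; the function names `unit` and `rank_one_als`, the all-ones starting vectors, the fixed iteration count, and the toy tensor are assumptions made for illustration. The sketch also assumes that no contraction vanishes, since it divides by the contraction's norm.

```python
import math


def unit(v):
    """Return (v / ||v||, ||v||); assumes v is not the zero vector."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v], norm


def rank_one_als(T, iters=50):
    """Fit sigma * (x outer y outer z) to a 3-way tensor T (nested lists)."""
    I, J, K = len(T), len(T[0]), len(T[0][0])
    y, _ = unit([1.0] * J)
    z, _ = unit([1.0] * K)
    x, sigma = [0.0] * I, 0.0
    for step in range(iters):
        # With two factors fixed, the least-squares optimum for the third is
        # the contraction of T against them, rescaled to unit length.
        x, _ = unit([
            sum(T[i][j][k] * y[j] * z[k] for j in range(J) for k in range(K))
            for i in range(I)
        ])
        y, _ = unit([
            sum(T[i][j][k] * x[i] * z[k] for i in range(I) for k in range(K))
            for j in range(J)
        ])
        # The norm of the last contraction is the current scale factor sigma.
        z, sigma = unit([
            sum(T[i][j][k] * x[i] * y[j] for i in range(I) for j in range(J))
            for k in range(K)
        ])
    return sigma, x, y, z


# Toy input: an exactly rank-one tensor 2 * (a outer b outer c), unit vectors.
a, b, c = [0.6, 0.8], [1 / 3, 2 / 3, 2 / 3], [0.8, 0.6]
T = [[[2 * ai * bj * ck for ck in c] for bj in b] for ai in a]
sigma, x, y, z = rank_one_als(T)
```

On an exactly rank-one input the iteration recovers the scale and the factor vectors almost immediately; for general tensors it may settle on a point that is not the best approximation, which is the convergence behaviour the description says the paper addresses.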
Replication data for: Constructions are not predictable but are motivated:...
search.dataone.org
dataverse.no
Updated Jul 29, 2024
Cite
Lewandowski, Wojciech (2024). Replication data for: Constructions are not predictable but are motivated: evidence from the Spanish completive reflexive [Dataset]. http://doi.org/10.18710/4QHOBK
Explore at:
Unique identifier
https://doi.org/10.18710/4QHOBK
Dataset updated
Jul 29, 2024
Dataset provided by
DataverseNO
Authors
Lewandowski, Wojciech
Description
Many researchers seem to think that construction grammar posits the existence of just wholly idiosyncratic constructions or form-meaning pairings. However, this idea demonstrates a deep misunderstanding of the approach, since constructions rarely emerge sui generis. Rather, construction grammar aims to balance the fact that some linguistic uses cannot be fully predicted from other well-established uses, with the fact that extensions of a construction, while not predictable, are motivated by other senses in the constructional network. This study illustrates this tenet of constructional approaches to language by providing an analysis of the Spanish completive reflexive marker se. In order to identify the different senses of the completive se-construction I used data from the Spanish corpus CREA (Corpus de Referencia del Español Actual, http://corpus.rae.es/creanet.html). Given the large size of the corpus (200 million words), the frequency search, which is merely indicative, was arbitrarily limited to constructions in which the verb appeared in the 3rd person singular and was directly followed by a direct object headed by the definite articles el ‘the’ (masculine) or la ‘the’ (feminine) in the singular. The data set includes all the instances of the completive reflexive found in the sample described above.
Summaries of posterior distributions for singular values and variance...
plos.figshare.com
xls
Updated Jun 9, 2023
Cite
Luciano Antonio de Oliveira; Carlos Pereira da Silva; Alessandra Querino da Silva; Cristian Tiago Erazo Mendes; Joel Jorge Nuvunga; Joel Augusto Muniz; Júlio Sílvio de Sousa Bueno Filho; Marcio Balestre (2023). Summaries of posterior distributions for singular values and variance components for the BGGE and BGGEE models. [Dataset]. http://doi.org/10.1371/journal.pone.0256882.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0256882.t003
Dataset updated
Jun 9, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Luciano Antonio de Oliveira; Carlos Pereira da Silva; Alessandra Querino da Silva; Cristian Tiago Erazo Mendes; Joel Jorge Nuvunga; Joel Augusto Muniz; Júlio Sílvio de Sousa Bueno Filho; Marcio Balestre
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summaries of posterior distributions for singular values and variance components for the BGGE and BGGEE models.
GI625 optical fiber data imaged on a Zeiss Versa XRM-500 microCT at 12 tube...
gimi9.com
Updated Dec 3, 2019
(2019). GI625 optical fiber data imaged on a Zeiss Versa XRM-500 microCT at 12 tube voltages | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_gi625-optical-fiber-data-imaged-on-a-zeiss-versa-xrm-500-microct-at-12-tube-voltages/
Dataset updated
Dec 3, 2019
Description
This is tomography data as acquired using a commercial X-ray tomography instrument. We obtained reconstructions of a graded-index optical fiber with voxels of edge length 1.05 µm at 12 tube voltages. The fiber manufacturer created a graded index in the central region by varying the germanium concentration from a peak value in the center of the core to a very small value at the core-cladding boundary. Operating on 12 tube voltages, we show by a singular value decomposition that there are only two singular vectors with significant weight. Physically, this means scans beyond two tube voltages contain largely redundant information. We concentrate on an analysis of the images associated with these two singular vectors. The first singular vector is dominant, and images of the coefficients of the first singular vector at each voxel look similar to any of the single-energy reconstructions. Images of the coefficients of the second singular vector by themselves appear to be noise. However, by averaging the reconstructed voxels in each of several narrow bands of radii, we can obtain values of the second singular vector at each radius. In the core region, where we expect the germanium doping to go from a peak value at the fiber center to zero at the core-cladding boundary, we find that a plot of the two coefficients of the singular vectors forms a line in the two-dimensional space consistent with the dopant decreasing linearly with radial distance from the core center. The coating, made of a polymer rather than silica, is not on this line, indicating that the two-dimensional results are sensitive not only to the density but also to the elemental composition. A stack of reconstructions is given here as tiff files of individual slices. Each zip file corresponds to a tilt series at a given tube voltage, given in the file name. The power is also given in the file name. (For example, file “30kV-2W.zip” was tube voltage at 30kV, power 2W.)
The power was varied so that the signal-to-noise was approximately equal for the various reconstructions. The experiment is described in: ZH Levine, AP Peskin, EJ Garboczi, and AD Holmgren, Multi-Energy X-Ray Tomography of an Optical Fiber: The Role of Spatial Averaging, Microscopy and Microanalysis 25 (1) 70-76 (2019). https://doi.org/10.1017/S1431927618016136
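The claim that only two singular vectors carry significant weight can be illustrated with a small sketch. It does not use the dataset or the authors' code: the voxel values are synthetic, built from two hypothetical per-voxel quantities (`density`, `dopant`) with made-up voltage responses and noise level, and the singular values are obtained by plain power iteration on the 12 x 12 matrix A Aᵀ rather than by a library SVD routine.

```python
import math
import random

def gram(rows):
    """Small (n_rows x n_rows) matrix A A^T for a matrix A given as a list of rows."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in rows] for r1 in rows]

def top_eigenpair(m, iters=300):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(m)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, mv)), v  # Rayleigh quotient, unit vector

def singular_values(rows, k):
    """First k singular values of A: square roots of the top eigenvalues of A A^T."""
    m = gram(rows)
    n = len(m)
    out = []
    for _ in range(k):
        lam, v = top_eigenpair(m)
        out.append(math.sqrt(max(lam, 0.0)))
        # Deflate: remove the component just found before looking for the next one.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return out

# Synthetic stand-in for the scans (all numbers below are assumptions):
# each voxel has a density and a dopant level, each voltage sees a different mix.
random.seed(0)
n_voltages, n_voxels = 12, 200
density = [random.uniform(0.5, 1.5) for _ in range(n_voxels)]
dopant = [random.uniform(0.0, 1.0) for _ in range(n_voxels)]
data = []
for i in range(n_voltages):
    a_i = 1.0 - 0.03 * i   # hypothetical response to density at voltage i
    b_i = 0.3 / (1 + i)    # hypothetical response to composition at voltage i
    data.append([a_i * d + b_i * c + random.gauss(0.0, 0.001)
                 for d, c in zip(density, dopant)])

sv = singular_values(data, 4)
print([round(s, 4) for s in sv])  # two sizeable values, then noise-level ones
```

Because the synthetic voxels depend on only two underlying quantities, the third and later singular values sit at the noise level, which is the pattern the description reports for the real scans.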
Widefield imaging data from the publication, Cortical State Fluctuations...
rdr.ucl.ac.uk
zip
Updated Jun 1, 2023
Elina Jacobs (2023). Widefield imaging data from the publication, Cortical State Fluctuations during Sensory Decision Making [Dataset]. http://doi.org/10.5522/04/13194452.v1
Unique identifier
https://doi.org/10.5522/04/13194452.v1
Dataset updated
Jun 1, 2023
Dataset provided by
University College London
Authors
Elina Jacobs
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
This site contains the widefield imaging datasets from the publication Cortical State Fluctuations during Sensory Decision Making, by Jacobs et al. in Current Biology. This data is from the behavioural tasks described in the publication, and is in a compressed SVD format (see Methods in the publication for more details). The companion code is designed to take the data in this format. The datasets provided here contain the top 500 singular values, which is how the data in the publication was analysed, as this was found to sufficiently capture the data. The data containing up to 2000 singular values can be shared on request. The timestamps of the datasets here are not all aligned with the behavioural datasets; the companion code takes care of this. The data is organised by experimental subject; most subjects were recorded from on multiple days, which form subfolders within the subject folder. Within a day, there may have been several experiments, which again form subfolders within the day folder. The companion code expects this data organisation.
The companion code is available at: https://github.com/eakjacobs/Jacobs_et_al_CurrentBiology
For more information and links to the behavioural and pupil datasets, please follow this link: https://doi.org/10.6084/m9.figshare.13084805
The research article can be found (freely available) at https://www.cell.com/current-biology/fulltext/S0960-9822(20)31437-8
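As a rough illustration of what a compressed SVD format implies, the sketch below rebuilds one image frame from spatial components and their temporal weights, keeping only the first k components. The names `U`, `SVT` and `reconstruct_frame` are hypothetical and the numbers are toy values; the actual file layout and loading routines are defined by the companion code, not by this example.

```python
def reconstruct_frame(U, SVT, t, k=None):
    """Rebuild frame t as a flat list of pixel values from the first k components.

    U   : list of pixels, each a list of component weights (n_pixels x n_components)
    SVT : list of components, each a list over time (n_components x n_timepoints),
          i.e. singular values already multiplied into the temporal components
    """
    n_comp = len(SVT) if k is None else min(k, len(SVT))
    return [sum(U[p][c] * SVT[c][t] for c in range(n_comp)) for p in range(len(U))]

# Toy example: 3 pixels, 2 components, 2 time points (values are made up).
U = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
SVT = [[2.0, 3.0],
       [0.5, 0.25]]

full = reconstruct_frame(U, SVT, t=0)          # uses both components
truncated = reconstruct_frame(U, SVT, t=0, k=1)  # keeps only the first component
print(full, truncated)
```

Truncating to fewer components trades fidelity for storage, which is the trade-off behind distributing 500 rather than 2000 components.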
Data from: Collisional effects on resonant particles in quasilinear theory
dataverse.harvard.edu
osti.gov
Updated May 11, 2021
Peter J. Catto (2021). Collisional effects on resonant particles in quasilinear theory [Dataset]. http://doi.org/10.7910/DVN/ZWVNQF
Unique identifier
https://doi.org/10.7910/DVN/ZWVNQF
Dataset updated
May 11, 2021
Dataset provided by
Harvard Dataverse
Authors
Peter J. Catto
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/ZWVNQF
Description
A careful examination of the effects of collisions on resonant wave-particle interactions leads to an alternate interpretation and deeper understanding of the quasilinear operator originally formulated by Kennel and Engelmann (Phys. Fluids vol. 9, 1966, pp. 2377-2388) for collisionless, magnetized plasmas, and widely used to model radio frequency heating and current drive. The resonant and nearly resonant particles are particularly sensitive to collisions that pitch angle scatter them out of and into resonance. As a result, the resonant particle-wave interactions occur in the center of a narrow collisional boundary layer when the collision frequency nu is very small compared to the wave frequency omega. The diffusive nature of the pitch angle scattering combined with the wave-particle resonance condition enhances the collision frequency by (omega/nu)^(2/3) >> 1, resulting in an effective resonant particle collision time of tau_int ~ (nu/omega)^(2/3)/nu << 1/nu. A rigorous collisional boundary layer analysis generalizes the standard quasilinear operator to a form that is fully consistent with Kennel-Engelmann, but allows replacing the delta function appearing in the diffusivity with a simple integral (having the appropriate delta function limit) retaining the new physics associated with the narrow boundary layer, while preserving the entropy production principle. The limitations of the collisional boundary layer treatment are also estimated, and indicate that substantial departures from Maxwellian are not permitted.
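The scaling quoted in the description can be written out as a short worked relation. This is a reading of the abstract rather than a derivation from the paper: the symbol \nu_eff for the enhanced collision frequency is introduced here for illustration, and the numerical ratio is an assumed example value.

```latex
\nu_{\mathrm{eff}} \sim \nu \left(\frac{\omega}{\nu}\right)^{2/3} \gg \nu,
\qquad
\tau_{\mathrm{int}} \sim \frac{1}{\nu_{\mathrm{eff}}}
  = \frac{1}{\nu}\left(\frac{\nu}{\omega}\right)^{2/3} \ll \frac{1}{\nu}.
% Illustrative numbers (assumed, not from the paper): \nu/\omega = 10^{-3}
% gives (\omega/\nu)^{2/3} = 10^{2}, so \tau_{\mathrm{int}} \sim 10^{-2}/\nu.
```

With that assumed ratio, resonant particles are scattered about a hundred times faster than the bare collision frequency would suggest.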

