30 datasets found
  1. Dataset: The plural interpretability of German linking elements...

    • live.european-language-grid.eu
    • data.niaid.nih.gov
    csv
    Updated Aug 15, 2021
    Cite
    (2021). Dataset: The plural interpretability of German linking elements ("Morphology") [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/7422
    Explore at:
    Available download formats: csv
    Dataset updated
    Aug 15, 2021
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This dataset accompanies a paper to be published in "Morphology" (JOMO, Springer). Under the present DOI, all data generated for this research as well as all scripts used are stored. The paper itself is not CC-licensed; refer to Springer's "Morphology" website for details.

    Abstract: In this paper, we take a closer theoretical and empirical look at the linking elements in German N1+N2 compounds which are identical to the plural marker of N1 (such as -er with umlaut, as in Häus-er-meer 'sea of houses'). Various perspectives on the actual extent of plural interpretability of these pluralic linking elements are expressed in the literature. We aim to clarify this question by empirically examining to what extent there may be a relationship between plural form and meaning which informs in which sorts of compounds pluralic linking elements appear. Specifically, we investigate whether pluralic linking elements occur especially frequently in compounds where a plural meaning of the first constituent is induced either externally (through plural inflection of the entire compound) or internally (through a relation between the constituents such that N2 forces N1 to be conceptually plural, as in the example above). The results of a corpus study using the DECOW16A corpus and a split-100 experiment show that in the internal but not external plural meaning conditions, a pluralic linking element is preferred over a non-pluralic one, though there is considerable inter-speaker variability, and limitations imposed by other constraints on linking element distribution also play a role. However, we show the overall tendency that German language users do use pluralic linking elements as cues to the plural interpretation of N1+N2 compounds. Our interpretation does not reference a specific morphological framework. Instead, we view our data as strengthening the general approach of probabilistic morphology.

  2. Common English Parts-of-speech

    • kaggle.com
    zip
    Updated Nov 3, 2022
    Cite
    The Devastator (2022). Common English Parts-of-speech [Dataset]. https://www.kaggle.com/datasets/thedevastator/common-english-parts-of-speech
    Explore at:
    Available download formats: zip (1253764 bytes)
    Dataset updated
    Nov 3, 2022
    Authors
    The Devastator
    Description

    Common English Parts-of-speech

    Over 8,000 words and their plural forms

    About this dataset

    This dataset provides information on over 8,000 English words, including nouns and their plural forms. By mining this data, researchers can gain insights into the English language more efficiently.

    How to use the dataset

    This dataset can be used to help researchers understand the English language in a new way. The data includes information on over 8,000 different English words, including nouns and their plural forms. This dataset is particularly useful for investigating the relationships between words and their plural forms.

    Research Ideas

    • To create a program that can automatically generate plural forms of nouns.
    • To study the relationships between different words and their plural forms.
    • To develop a better understanding of the English language for non-native speakers.
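
    The first research idea can be illustrated with a deliberately naive sketch. The function name `naive_plural` and the three suffix rules are assumptions for illustration only; they are not taken from the dataset, and irregular nouns (for example "child" or "mouse") would need a lookup table such as the word pairs this dataset is said to provide.

```python
def naive_plural(noun):
    """Return a rule-based English plural (hypothetical helper, regular nouns only)."""
    lower = noun.lower()
    # Sibilant endings take "-es": box -> boxes, church -> churches.
    if lower.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    # Consonant + "y" becomes "-ies": city -> cities (but day -> days).
    if lower.endswith("y") and len(lower) > 1 and lower[-2] not in "aeiou":
        return noun[:-1] + "ies"
    # Default rule: append "-s".
    return noun + "s"


for word in ["box", "city", "day", "cat"]:
    print(word, "->", naive_plural(word))
```

    Comparing the output of such rules against the attested plural forms would be one way to find the nouns that the rules get wrong.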

    Acknowledgements

    License

    See the dataset description for more information.

    Columns

    File: adjectives.csv

    File: adverbs.csv

    File: nouns.csv

    | Column name   | Description                                                |
    |:--------------|:-----------------------------------------------------------|
    | 007           | The code name of the character. (String)                   |
    | 007s          | The number of times the character has been used. (Integer) |

    File: plural-nouns.csv

    File: verbs.csv

    | Column name   | Description                  |
    |:--------------|:-----------------------------|
    | awake         | (adjective) to stop sleeping |
    | awoke         | (verb) to stop sleeping      |
    | awoken        | (verb) to stop sleeping      |

    File: words-multiple-present-participle.csv

    | Column name                        | Description                                                  |
    |:-----------------------------------|:-------------------------------------------------------------|
    | Word                               | The word being described. (String)                           |
    | Present Participle                 | The present participle form of the word. (String)            |
    | Present Participle Alternative     | An alternative present participle form of the word. (String) |

  3. Data from: On the Approximation of Singular Functions by Series of...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Sep 7, 2023
    Cite
    Mohan Zhao; Kirill Serkh (2023). On the Approximation of Singular Functions by Series of Non-integer Powers [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8323314
    Explore at:
    Dataset updated
    Sep 7, 2023
    Dataset provided by
    University of Toronto
    Authors
    Mohan Zhao; Kirill Serkh
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file provides the singular powers and collocation points described in the paper "On the Approximation of Singular Functions by Series of Non-integer Powers," available on arXiv, for several useful combinations of the parameters a, b, and the precision ε. It also includes a MATLAB script which demonstrates the effectiveness of these singular powers and collocation points for approximating singular functions of the form x^c, where c is in the interval [a,b].

  4. Replication Data for: Understanding ‘many’ through the lens of Ukrainian...

    • search-demo.dataone.org
    • dataverse.no
    Updated Sep 25, 2024
    Cite
    Janda, Laura Alexis (2024). Replication Data for: Understanding ‘many’ through the lens of Ukrainian багато [Dataset]. http://doi.org/10.18710/Y7VGQE
    Explore at:
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    DataverseNO
    Authors
    Janda, Laura Alexis
    Time period covered
    Jan 1, 1742 - Jan 1, 2023
    Description

    Dataset description: The General Regionally Annotated Corpus of Ukrainian (GRAC, Shvedova et al. 2017-2024, uacorpus.org) was consulted to collect data for further analysis concerning the distribution of Singular vs. Plural verb forms in the target bahato construction. GRAC is a Sketch Engine corpus of over 1.8 billion words, representing texts from over 30,000 authors created between 1816 and 2023. This corpus is designed to serve as source material for linguistic research on Standard Ukrainian. Our data was collected during the month of February 2024. We extracted and annotated 28,491 examples of the bahato construction. An additional set of examples was collected from the Russian National Corpus (ruscorpora.ru) during the month of August 2024 to provide comparison with the Russian mnogo construction. For this purpose, 6,612 examples were extracted and annotated for word order and Singular vs. Plural verb agreement. Both the Ukrainian and the Russian data are included in this dataset, along with the R scripts used to analyze this data.

    Article abstract: We reveal an ongoing language change in Ukrainian involving a construction with a subject comprised of the indefinite quantifier багато ‘many’ modifying a noun phrase in the Genitive Plural. Number agreement on the verb varies, allowing both Singular (in 69.1% of attestations) and Plural (in 30.9% of attestations). Based on statistical analysis of corpus data, we investigate the influence of the factors of year of creation, word order of subject and verb, and animacy of the subject on the choice of verb number. We find that, while all combinations of word order and animacy are robustly attested, VS word order and inanimate subjects tend to prefer Singular, whereas SV word order and animate subjects tend to prefer Plural. Since about the 1950s, the proportion of Plural has been increasing, overtaking Singular in the current decade. We propose that this Singular vs. Plural variation is motivated by the human embodied experience of construing a group of items as either a homogeneous mass (and therefore Singular) or a multiplicity of individuals (and therefore Plural). This proposal is supported by the identification of micro-constructions that prefer Singular and show reduced individuation of human beings.

  5. Data for: Filling the data gaps within GRACE missions using Singular...

    • darus.uni-stuttgart.de
    Updated May 14, 2021
    Cite
    Shuang Yi; Nico Sneeuw (2021). Data for: Filling the data gaps within GRACE missions using Singular Spectrum Analysis [Dataset]. http://doi.org/10.18419/DARUS-807
    Explore at:
    Dataset updated
    May 14, 2021
    Dataset provided by
    DaRUS
    Authors
    Shuang Yi; Nico Sneeuw
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dozens of missing epochs in the monthly gravity product of the satellite mission Gravity Recovery and Climate Experiment (GRACE) and its follow-on (GRACE-FO) mission greatly inhibit the complete analysis and full utilization of the data. Despite previous attempts to handle this problem, a general all-purpose gap-filling solution is still lacking. Here we propose a non-parametric, data-adaptive and easy-to-implement approach - composed of the Singular Spectrum Analysis (SSA) gap-filling technique, cross-validation, and spectral testing for significant components - to produce reasonable gap-filling results in the form of spherical harmonic coefficients (SHCs). We demonstrate that this approach is adept at inferring missing data from long-term and oscillatory changes extracted from available observations. A comparison in the spectral domain reveals that the gap-filling result resembles the product of GRACE missions below spherical harmonic degree 30 very well. As the degree increases above 30, the amplitude per degree of the gap-filling result decreases more rapidly than that of GRACE/GRACE-FO SHCs, showing effective suppression of noise. As a result, our approach can reduce noise in the oceans without sacrificing resolution on land. The gap-filling dataset is stored in the “SSA_filing/” folder. Each file represents a monthly result in the form of spherical harmonics. The data format follows the convention of the site ftp://isdcftp.gfz-potsdam.de/grace/. Low degree corrections (degree-1, C20, C30) have been made. The code to generate the dataset is located in the “code_share/” folder, with an example for C30. The model-based Greenland mass balance result for data validation (results given in the paper) is provided in the “Greenland_SMB-D.txt” file.

  6. Error rates (in %) for each error type of Experiment 1, averaged across...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz (2023). Error rates (in %) for each error type of Experiment 1, averaged across items for each participant. [Dataset]. http://doi.org/10.1371/journal.pone.0200723.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Standard deviations are presented in parentheses.

  7. Data from: Production of Dutch variable plurals in language corpora

    • ssh.datastations.nl
    pdf, tsv, txt +3
    Updated Aug 24, 2021
    Cite
    T.J. Zee; L.F.M. ten Bosch; I. Plag; M.T.C. Ernestus (2021). Production of Dutch variable plurals in language corpora [Dataset]. http://doi.org/10.17026/DANS-XVR-QSCF
    Explore at:
    Available download formats: tsv(32355), txt(305001), zip(21019), txt(48119), tsv(140592), type/x-r-syntax(52913), txt(2916), txt(12328), pdf(195426), xml(5232), txt(1562)
    Dataset updated
    Aug 24, 2021
    Dataset provided by
    DANS Data Station Social Sciences and Humanities
    Authors
    T.J. Zee; L.F.M. ten Bosch; I. Plag; M.T.C. Ernestus
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A growing body of work in psycholinguistics suggests that morphological relations between word forms affect the processing of complex words. Previous studies have usually focused on a particular type of paradigmatic relation, for example the relation between paradigm members, or the relation between alternative forms filling a particular paradigm cell. However, potential interactions between different types of paradigmatic relations have remained relatively unexplored. The data in this data set were used in two corpus studies of variable plurals in Dutch to test hypotheses about potentially interacting paradigmatic effects. The first study (which uses the s_dist data) shows that generalization across noun paradigms predicts the distribution of plural variants, and that this effect is diminished for paradigms in which the plural variants are more likely to have a strong representation in the mental lexicon. The second study (which uses the s_dur data) demonstrates that the pronunciation of a target plural variant is affected by coactivation of the alternative variant, resulting in shorter segmental durations. This effect is dependent on the representational strength of the alternative plural variant. In sum, the distributional and durational measurements in these data provide evidence that storage of morphologically complex words may affect the role of generalization and coactivation during production. A full description of the data gathering process and the analyses is given in the Methodology file. The Readme file describes how the remaining files relate to the research.

  8. Background data (adapted from Jenset & McGillivray 2017) for: Down-sampling...

    • dataverse.no
    • dataverse.azure.uit.no
    • +1more
    bin, text/tsv, txt
    Updated Jul 17, 2025
    Cite
    Lukas Sönning (2025). Background data (adapted from Jenset & McGillivray 2017) for: Down-sampling from hierarchically structured corpus data [Dataset]. http://doi.org/10.18710/5KCE4U
    Explore at:
    Available download formats: bin(13462), text/tsv(2120816), txt(12381)
    Dataset updated
    Jul 17, 2025
    Dataset provided by
    DataverseNO
    Authors
    Lukas Sönning
    License

    https://dataverse.no/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.18710/5KCE4U

    Time period covered
    Jan 1, 1500 - Dec 31, 1707
    Area covered
    United Kingdom
    Description

    Dataset description: This dataset, which is adapted from Jenset and McGillivray (2017), contains tabular files documenting the alternating usage of -(e)th and -(e)s to mark third-person verb inflection in Early Modern English. The data provided by Jenset and McGillivray (2017) are drawn from the PPCEME corpus (Kroch et al. 2004) and cover the period from 1500 to 1700. In total, 13,757 third-person singular tokens (excluding the verb BE) were annotated by these authors for a range of variables. For the purposes of the present methodological study, this dataset was reduced to a subset of 11,645 tokens, and the coding of variables was in some parts revised, completed, or modified. The dataset includes information about the Author and Verb Lemma, as well as a number of predictor variables, including Genre, Year, Frequency (of the verb lemma in the third-person singular), Phonological Context (stem-final sound), and the Gender of the author.

    Abstract for related publication: Resource constraints often force researchers to down-size the list of tokens returned by a corpus query. This paper sketches a methodology for down-sampling and offers a survey of current practices. We build on earlier work and extend the evaluation of down-sampling designs to settings where tokens are clustered by text file and lexeme. Our case study deals with third-person present-tense verb inflection in Early Modern English and focuses on five predictors: Year, Gender, Genre, Frequency, and Phonological Context. We evaluate two strategies for selecting 2,000 (out of 11,645) tokens: simple down-sampling, where each hit has the same selection probability; and structured down-sampling, where this probability is inversely proportional to the author- and verb-specific token count. We form 500 sub-samples using each scheme and compare regression results to a reference model fit to the full set of cases. We observe that structured down-sampling shows better performance on several evaluation criteria.
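
    The two selection schemes named in the abstract can be illustrated with a minimal sketch. This is not the author's code: the field names `author` and `verb`, both function names, and the toy data are hypothetical, and the sketch assumes that the "author- and verb-specific token count" is the number of tokens sharing the same author and verb lemma. Weighted sampling without replacement is done here with the Efraimidis-Spirakis key method, which makes selection probabilities only approximately proportional to the weights and is one possible choice among several.

```python
import random
from collections import Counter


def simple_downsample(tokens, n, seed=0):
    """Simple down-sampling: every token has the same selection probability."""
    return random.Random(seed).sample(tokens, n)


def structured_downsample(tokens, n, seed=0):
    """Structured down-sampling: weight each token by 1 / its cell size.

    Assumption: a cell is an (author, verb) pair; field names are hypothetical.
    """
    rng = random.Random(seed)
    cell_sizes = Counter((t["author"], t["verb"]) for t in tokens)
    keyed = []
    for t in tokens:
        weight = 1.0 / cell_sizes[(t["author"], t["verb"])]
        # Efraimidis-Spirakis: key = u ** (1 / weight); keep the n largest keys.
        keyed.append((rng.random() ** (1.0 / weight), t))
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in keyed[:n]]


# Toy data (invented): one frequent cell and one rare cell.
tokens = (
    [{"id": i, "author": "A", "verb": "say"} for i in range(50)]
    + [{"id": 50 + i, "author": "B", "verb": "go"} for i in range(2)]
)
sample = structured_downsample(tokens, 10)
print(len(sample), "tokens selected")
```

    Under the structured scheme, tokens from small author-verb cells receive larger weights, so rare combinations are less likely to vanish from the sub-sample than under simple down-sampling.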

  9. Computing a partial elastic shape registration of 3D surfaces using dynamic...

    • data.nist.gov
    • catalog.data.gov
    Updated Oct 30, 2023
    + more versions
    Cite
    National Institute of Standards and Technology (2023). Computing a partial elastic shape registration of 3D surfaces using dynamic programming [Dataset]. http://doi.org/10.18434/mds2-3056
    Explore at:
    Dataset updated
    Oct 30, 2023
    Dataset provided by
    National Institute of Standards and Technology http://www.nist.gov/
    License

    https://www.nist.gov/open/license

    Description

    Fortran and Matlab programs, Matlab mex file of Fortran program, compiled mex file, and sample data files, etc. for computing a partial elastic shape registration of two simple surfaces in 3-dimensional space and the elastic shape distance between them corresponding to the partial registration.

  10. Data Files for Tresoldi/Robinson article on spelling variation in Canterbury...

    • zenodo.org
    • data-staging.niaid.nih.gov
    • +1more
    zip
    Updated Nov 23, 2024
    Cite
    Tiago Henrique Tresoldi Tresoldi (2024). Data Files for Tresoldi/Robinson article on spelling variation in Canterbury Tales manuscripts [Dataset]. http://doi.org/10.5281/zenodo.14209129
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 23, 2024
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Tiago Henrique Tresoldi Tresoldi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data is at https://github.com/peterrobinson/CTSpellingArticle2024. It is contained in three folders, each folder corresponding to one of the three sets of data used in this analysis, as follows:

    1. “sorted by regularization”. This folder contains spelling data and results derived from the regularization process, where (for example) spellings of forms regularized to “goode” are distinguished from spellings of forms regularized to “god”;

    2. “sorted by part of speech”. This folder contains spelling data and results derived from a lemmatization and part-of-speech identification process, where (for example) spellings of forms lemmatized to “goode” singular adjective are distinguished from spellings of forms regularized to “goode” plural adjective, and forms lemmatized to “gode” singular noun nominative case are distinguished from “gode” singular noun oblique case (as in “to gode”).

    3. “all spellings unsorted”. This folder contains spelling data as undifferentiated counts of “bags of words”: for each witness: so many occurrences of “good”, so many of “goode”, so many of “god”, so many of “gode”.

    Each folder contains the following files (under various names):

    1. A .json file holding all the data, structured according to its categorization. The “sorted by part of speech” folder contains two .json files, one with spellings organized by headword lemma, the other organized by part-of-speech;

    2. Two .nex Nexus files containing all the data. In the “sorted by regularization” and “sorted by part of speech” folders one Nexus file groups spellings by variant sites within each line, the other Nexus file groups spellings by words within each line. In the “all spellings unsorted” folder one Nexus file contains all the spellings organized by spelling; the second holds a Nexus distance matrix with distances created according to the Manhattan distance algorithm;

    3. A .dst distance matrix file, containing a distance matrix constructed with distances calculated by the Manhattan distance algorithm;

    4. A “features” file, containing a spreadsheet ranking each variant site according to its impact on the analysis;

    5. Multiple .pdf files visualizing the results of our analysis, with the names reflecting the analysis each contains. Files with names including “Splits” were created using the SplitsTree algorithm and software (Huson and Bryant 2006; “SplitsTree | Universität Tübingen,” n.d.).

    The “sorted by regularization” folder also contains a single image file, “tiagoplot1.jpg”, visualizing the results of PCA analysis on the “sorted by regularization” data.
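
    The Manhattan distance matrices mentioned above can be illustrated with a small sketch over "bag of words" spelling counts. The witness labels `W1` and `W2`, the counts, and the function names are invented for illustration; the repository's actual file layout and code are not reproduced here.

```python
from collections import Counter


def manhattan(a, b):
    """Sum of absolute count differences over all spellings seen in a or b."""
    spellings = set(a) | set(b)
    return sum(abs(a.get(s, 0) - b.get(s, 0)) for s in spellings)


def distance_matrix(witnesses):
    """Return sorted witness names and the symmetric matrix of pairwise distances."""
    names = sorted(witnesses)
    matrix = [[manhattan(witnesses[r], witnesses[c]) for c in names] for r in names]
    return names, matrix


# Toy spelling counts per witness (hypothetical values).
witnesses = {
    "W1": Counter({"good": 3, "goode": 1}),
    "W2": Counter({"good": 1, "gode": 2}),
}
names, matrix = distance_matrix(witnesses)
print(names)
print(matrix)
```

    Here the distance between the two toy witnesses is |3-1| + |1-0| + |0-2| = 5, so witnesses that share spellings in similar proportions end up close together.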

  11. Replication Data for: A network of allostructions: quantified subject...

    • dataverse.no
    • search.dataone.org
    bin, csv, html, pdf +2
    Updated Sep 28, 2023
    Cite
    Laura A Janda; Tore Nesset (2023). Replication Data for: A network of allostructions: quantified subject constructions in Russian [Dataset]. http://doi.org/10.18710/4D2QII
    Explore at:
    Available download formats: xlsx(12830911), csv(1814986), csv(36013332), pdf(2269386), xlsx(1381766), html(2397841), txt(8856), pdf(175984), bin(20016)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Laura A Janda; Tore Nesset
    License

    https://dataverse.no/api/datasets/:persistentId/versions/1.2/customlicense?persistentId=doi:10.18710/4D2QII

    Time period covered
    1800 - 2017
    Area covered
    Russian Federation
    Dataset funded by
    Norwegian Directorate for Higher Education and Skills
    Description

    Data and R code are provided for statistical analysis of approximately 39,000 corpus examples of predicate agreement in constructions with quantified subjects in Russian. The analysis indicates that these constructions constitute a network of constructions (“allostructions”) with various preferences for singular or plural agreement. Factors pull in different directions, and we observe a relatively stable situation in the face of variation. We present an analysis of a multidimensional network of allostructions in Russian, thus contributing to our understanding of allostructional relationships in Construction Grammar. With regard to historical linguistics, language stability is an understudied field. We illustrate an interplay of divergent factors that apparently resists language change. The syntax of numerals and other quantifiers represents a notoriously complex phenomenon of the Russian language. Our study sheds new light on the contributions of factors that favor singular or plural agreement in sentences with quantified subjects.

  12. Mean item characteristics of French materials.

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz (2023). Mean item characteristics of French materials. [Dataset]. http://doi.org/10.1371/journal.pone.0200723.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Elisabeth Beyersmann; Britta Biedermann; F.-Xavier Alario; Niels O. Schiller; Solène Hameau; Antje Lorenz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    French
    Description

    Standard deviations are shown in parentheses.

  13. Data from: Singular Dirichlet (p, q)-equations

    • resodate.org
    Updated Jul 6, 2021
    Cite
    Nikolaos S. Papageorgiou; Patrick Winkert (2021). Singular Dirichlet (p, q)-equations [Dataset]. http://doi.org/10.14279/depositonce-12146
    Explore at:
    Dataset updated
    Jul 6, 2021
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Nikolaos S. Papageorgiou; Patrick Winkert
    Description

    We consider a nonlinear Dirichlet problem driven by the (p, q)-Laplacian and with a reaction having the combined effects of a singular term and of a parametric (p−1)-superlinear perturbation. We prove a bifurcation-type result describing the changes in the set of positive solutions as the parameter λ>0 varies. Moreover, we prove the existence of a minimal positive solution $u^*_\lambda$ and study the monotonicity and continuity properties of the map $\lambda \mapsto u^*_\lambda$.

  14. Data from: Gender context effects in noun recognition: grammatical cues or...

    • tandf.figshare.com
    • figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Cindy Bellanger; Jean-Pierre Chevrot; Elsa Spinelli (2023). Gender context effects in noun recognition: grammatical cues or co-occurrence effects? [Dataset]. http://doi.org/10.6084/m9.figshare.5092159.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Cindy Bellanger; Jean-Pierre Chevrot; Elsa Spinelli
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Determiners with congruent gender facilitate the recognition of the following noun. We examine two explanations of this effect: either gender information is retrieved and influences lexical access, or gender effects are due to the determiner-noun co-occurrence. French nouns are either feminine or masculine and are preceded by feminine or masculine determiners in the singular. Plural articles are unmarked for gender. Because some nouns (peanuts) occur more frequently in the plural than in the singular, they frequently co-occur with determiners that do not provide gender information. Conversely, nouns that occur more frequently in their singular form (cathedral) co-occur more frequently with gender-marked determiners. We examined the recognition of plural- and singular-oriented nouns preceded by gender-marked and unmarked determiners. Singular-oriented nouns were recognised faster after gender-marked (singular) articles than after gender-unmarked (plural) ones. However, plural-oriented nouns were recognised faster after gender-unmarked (plural) articles, suggesting that article-noun co-occurrence outweighs abstract gender cues.

  15. Data from: On best rank one approximation of tensors

    • resodate.org
    Updated Dec 17, 2021
    Cite
    Shmuel Friedland; Volker Mehrmann; Renato Pajarola; Susanne Suter (2021). On best rank one approximation of tensors [Dataset]. http://doi.org/10.14279/depositonce-14528
    Explore at:
    Dataset updated
    Dec 17, 2021
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Shmuel Friedland; Volker Mehrmann; Renato Pajarola; Susanne Suter
    Description

    In this paper we suggest a new algorithm for the computation of a best rank one approximation of tensors, called 'alternating singular value decomposition'. This method is based on the computation of maximal singular values and the corresponding singular vectors of matrices. We also introduce a modification for this method and the alternating least squares method, which ensures that alternating iterations will always converge to a semi-maximal point. Finally, we introduce a new simple Newton-type method for speeding up the convergence of alternating methods near the optimum. We present several numerical examples that illustrate the computational performance of the new method in comparison to the alternating least squares method.

  16. Replication data for: Constructions are not predictable but are motivated:...

    • search.dataone.org
    • dataverse.no
    Updated Jul 29, 2024
    Cite
    Lewandowski, Wojciech (2024). Replication data for: Constructions are not predictable but are motivated: evidence from the Spanish completive reflexive [Dataset]. http://doi.org/10.18710/4QHOBK
    Explore at:
    Dataset updated
    Jul 29, 2024
    Dataset provided by
    DataverseNO
    Authors
    Lewandowski, Wojciech
    Description

    Many researchers seem to think that construction grammar posits the existence of just wholly idiosyncratic constructions or form-meaning pairings. However, this idea demonstrates a deep misunderstanding of the approach, since constructions rarely emerge sui generis. Rather, construction grammar aims to balance the fact that some linguistic uses cannot be fully predicted from other well-established uses, with the fact that extensions of a construction, while not predictable, are motivated by other senses in the constructional network. This study illustrates this tenet of constructional approaches to language by providing an analysis of the Spanish completive reflexive marker se. In order to identify the different senses of the completive se-construction I used data from the Spanish corpus CREA (Corpus de Referencia del Español Actual, http://corpus.rae.es/creanet.html). Given the large size of the corpus (200 million words), the frequency search, which is merely indicative, was arbitrarily limited to constructions in which the verb appeared in the 3rd person singular and was directly followed by a direct object headed by the definite articles el ‘the’ (masculine) or la ‘the’ (feminine) in the singular. The data set includes all the instances of the completive reflexive found in the sample described above.

  17. Summaries of posterior distributions for singular values and variance...

    • plos.figshare.com
    xls
    Updated Jun 9, 2023
    Cite
    Luciano Antonio de Oliveira; Carlos Pereira da Silva; Alessandra Querino da Silva; Cristian Tiago Erazo Mendes; Joel Jorge Nuvunga; Joel Augusto Muniz; Júlio Sílvio de Sousa Bueno Filho; Marcio Balestre (2023). Summaries of posterior distributions for singular values and variance components for the BGGE and BGGEE models. [Dataset]. http://doi.org/10.1371/journal.pone.0256882.t003
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Luciano Antonio de Oliveira; Carlos Pereira da Silva; Alessandra Querino da Silva; Cristian Tiago Erazo Mendes; Joel Jorge Nuvunga; Joel Augusto Muniz; Júlio Sílvio de Sousa Bueno Filho; Marcio Balestre
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summaries of posterior distributions for singular values and variance components for the BGGE and BGGEE models.

  18. GI625 optical fiber data imaged on a Zeiss Versa XRM-500 microCT at 12 tube...

    • gimi9.com
    Updated Dec 3, 2019
    Cite
    (2019). GI625 optical fiber data imaged on a Zeiss Versa XRM-500 microCT at 12 tube voltages | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_gi625-optical-fiber-data-imaged-on-a-zeiss-versa-xrm-500-microct-at-12-tube-voltages/
    Dataset updated
    Dec 3, 2019
    Description

    This is tomography data as acquired using a commercial X-ray tomography instrument. We obtained reconstructions of a graded-index optical fiber with voxels of edge length 1.05 µm at 12 tube voltages. The fiber manufacturer created a graded index in the central region by varying the germanium concentration from a peak value in the center of the core to a very small value at the core-cladding boundary. Operating on 12 tube voltages, we show by a singular value decomposition that there are only two singular vectors with significant weight. Physically, this means scans beyond two tube voltages contain largely redundant information. We concentrate on an analysis of the images associated with these two singular vectors. The first singular vector is dominant, and images of the coefficients of the first singular vector at each voxel look similar to any of the single-energy reconstructions. Images of the coefficients of the second singular vector by themselves appear to be noise. However, by averaging the reconstructed voxels in each of several narrow bands of radii, we can obtain values of the second singular vector at each radius. In the core region, where we expect the germanium doping to go from a peak value at the fiber center to zero at the core-cladding boundary, we find that a plot of the two coefficients of the singular vectors forms a line in the two-dimensional space, consistent with the dopant decreasing linearly with radial distance from the core center. The coating, made of a polymer rather than silica, does not lie on this line, indicating that the two-dimensional results are sensitive not only to the density but also to the elemental composition. A stack of reconstructions is given here as TIFF files of individual slices. Each zip file corresponds to a tilt series at a given tube voltage, given in the file name. The power is also given in the file name. (For example, the file “30kV-2W.zip” was acquired at a tube voltage of 30 kV and a power of 2 W.) The power was varied so that the signal-to-noise ratio was approximately equal for the various reconstructions. The experiment is described in: ZH Levine, AP Peskin, EJ Garboczi, and AD Holmgren, Multi-Energy X-Ray Tomography of an Optical Fiber: The Role of Spatial Averaging, Microscopy and Microanalysis 25 (1) 70-76 (2019). https://doi.org/10.1017/S1431927618016136
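
The kind of analysis described above can be sketched on synthetic data. The sketch assumes the reconstructions are stacked into a matrix with one flattened volume per tube voltage, and counts the singular values that carry significant weight. The function names, the 1% relative threshold, and the synthetic two-component data are illustrative assumptions, not taken from the dataset or the paper.

```python
import numpy as np


def count_significant_components(scans, rel_tol=0.01):
    """Count singular values larger than rel_tol times the largest one.

    scans has shape (n_voltages, n_voxels): one flattened reconstruction
    per row. The threshold is an arbitrary illustrative choice.
    """
    s = np.linalg.svd(scans, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0])), s


def synthetic_scans(n_voltages=12, n_voxels=1000, noise=1e-3, seed=0):
    """Build fake scans that mix two voxel maps with voltage-dependent weights."""
    rng = np.random.default_rng(seed)
    t = np.linspace(1.0, -1.0, n_voltages)
    # Two hypothetical voltage responses, shape (n_voltages, 2).
    responses = np.column_stack([1.5 - 0.5 * t, t])
    # Two hypothetical voxel maps, shape (2, n_voxels).
    maps = rng.standard_normal((2, n_voxels))
    return responses @ maps + noise * rng.standard_normal((n_voltages, n_voxels))
```

Calling `count_significant_components(synthetic_scans())` returns the count together with all 12 singular values. When only two components rise above the noise floor, additional voltages add little new information, which is the redundancy the description refers to.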

  19. Widefield imaging data from the publication, Cortical State Fluctuations...

    • rdr.ucl.ac.uk
    zip
    Updated Jun 1, 2023
    Cite
    Elina Jacobs (2023). Widefield imaging data from the publication, Cortical State Fluctuations during Sensory Decision Making [Dataset]. http://doi.org/10.5522/04/13194452.v1
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    University College London
    Authors
    Elina Jacobs
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This site contains the widefield imaging datasets from the publication Cortical State Fluctuations during Sensory Decision Making, by Jacobs et al. in Current Biology. This data is from the behavioural tasks described in the publication, and is in a compressed SVD format (see Methods in the publication for more details). The companion code is designed to take the data in this format. The datasets provided here contain the top 500 singular values, which is how the data in the publication was analysed, as this was found to sufficiently capture the data. The data containing up to 2000 singular values can be shared on request. The timestamps of the datasets here are not all aligned with the behavioural datasets; the companion code takes care of this. The data is organised by experimental subject; most subjects were recorded from on multiple days, which form subfolders within the subject folder. Within a day, there may have been several experiments, which again form subfolders within the day folder. The companion code expects this data organisation. The companion code is available at: https://github.com/eakjacobs/Jacobs_et_al_CurrentBiology For more information and links to the behavioural and pupil datasets, please follow this link: https://doi.org/10.6084/m9.figshare.13084805 The research article can be found (freely available) at https://www.cell.com/current-biology/fulltext/S0960-9822(20)31437-8
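
As a rough illustration of what a compressed SVD format means in general, the sketch below keeps only the leading `k` singular components of a pixels-by-frames movie and reconstructs frames from them. It does not describe the actual file layout of these datasets, which is documented in the publication's Methods and handled by the companion code. The function names and array shapes are assumptions.

```python
import numpy as np


def compress_movie(movie, k):
    """Keep the top-k SVD components of a movie of shape (n_pixels, n_frames).

    Returns spatial components (n_pixels, k), singular values (k,), and
    temporal components (k, n_frames).
    """
    u, s, vt = np.linalg.svd(movie, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def reconstruct_frames(u_k, s_k, vt_k, frame_slice=slice(None)):
    """Rebuild the selected frames from the truncated components."""
    # Scale each spatial component by its singular value, then mix over time.
    return (u_k * s_k) @ vt_k[:, frame_slice]
```

Storing the three truncated arrays needs `k * (n_pixels + n_frames + 1)` numbers rather than `n_pixels * n_frames`, and individual frames can be rebuilt on demand by passing a slice. The reconstruction is exact only when the movie has rank at most `k`; otherwise it is the best rank-`k` approximation in the least-squares sense.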

  20. Data from: Collisional effects on resonant particles in quasilinear theory

    • dataverse.harvard.edu
    • osti.gov
    Updated May 11, 2021
    Cite
    Peter J. Catto (2021). Collisional effects on resonant particles in quasilinear theory [Dataset]. http://doi.org/10.7910/DVN/ZWVNQF
    Available format: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 11, 2021
    Dataset provided by
    Harvard Dataverse
    Authors
    Peter J. Catto
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/ZWVNQF

    Description

    A careful examination of the effects of collisions on resonant wave-particle interactions leads to an alternate interpretation and deeper understanding of the quasilinear operator originally formulated by Kennel and Engelmann (Phys. Fluids vol. 9, 1966, pp. 2377-2388) for collisionless, magnetized plasmas, and widely used to model radio frequency heating and current drive. The resonant and nearly resonant particles are particularly sensitive to collisions that pitch angle scatter them out of and into resonance. As a result, the resonant particle-wave interactions occur in the center of a narrow collisional boundary layer when the collision frequency nu is very small compared to the wave frequency omega. The diffusive nature of the pitch angle scattering combined with the wave-particle resonance condition enhances the collision frequency by (omega/nu)^(2/3) >> 1, resulting in an effective resonant particle collision time of tau_int ~ (nu/omega)^(2/3)/nu << 1/nu. A rigorous collisional boundary layer analysis generalizes the standard quasilinear operator to a form that is fully consistent with Kennel-Engelmann, but allows replacing the delta function appearing in the diffusivity with a simple integral (having the appropriate delta function limit) retaining the new physics associated with the narrow boundary layer, while preserving the entropy production principle. The limitations of the collisional boundary layer treatment are also estimated, and indicate that substantial departures from Maxwellian are not permitted.
