https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
The Dutch CELEX data is derived from R.H. Baayen, R. Piepenbrock & L. Gulikers, The CELEX Lexical Database (CD-ROM), Release 2, Dutch Version 3.1, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, 1995.

Apart from orthographic features, the CELEX database comprises representations of the phonological, morphological, syntactic and frequency properties of lemmata. For the Dutch data, frequencies have been disambiguated on the basis of the 42.4m Dutch Instituut voor Nederlandse Lexicologie text corpora.

To make for greater compatibility with other operating systems, the databases have not been tailored to fit any particular database management program. Instead, the information is presented in a series of plain ASCII files, which can be queried with tools such as AWK and ICON. Unique identity numbers allow the linking of information from different files.

This database can be divided into different subsets:
· orthography: with or without diacritics, with or without word division positions, alternative spellings, number of letters/syllables;
· phonology: phonetic transcriptions with syllable boundaries or primary and secondary stress markers, consonant-vowel patterns, number of phonemes/syllables, alternative pronunciations, frequency per phonetic syllable within words;
· morphology: division into stems and affixes, flat or hierarchical representations, stems and their inflections;
· syntax: word class, subcategorisations per word class;
· frequency of the entries: disambiguated for homographic lemmata.
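The linking of plain ASCII files through unique identity numbers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the backslash field separator, the column names, and the sample rows are all hypothetical stand-ins, not the actual CELEX file layout, which is documented in the README files and User Guide shipped with the data.

```python
import io


def read_table(handle, columns, sep="\\"):
    """Read a delimited ASCII file into a dict keyed by its first column (the ID).

    The separator and the column names are assumptions for this sketch.
    """
    table = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        record = dict(zip(columns, line.split(sep)))
        table[record[columns[0]]] = record
    return table


def link_by_id(left, right):
    """Merge records that carry the same identity number in both tables."""
    shared = left.keys() & right.keys()
    return {key: {**left[key], **right[key]} for key in sorted(shared)}


# Made-up sample data standing in for two lexicon files.
orth_file = io.StringIO("1\\aap\\3\n2\\boek\\4\n")
freq_file = io.StringIO("1\\120\n2\\950\n3\\400\n")

orth = read_table(orth_file, ("id", "head", "letters"))
freq = read_table(freq_file, ("id", "freq"))
linked = link_by_id(orth, freq)

print(linked["2"])
```

Only identity numbers present in both files survive the merge, so entry 3 in the hypothetical frequency file is dropped. All fields stay as strings; convert counts to integers where arithmetic is needed.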
Introduction

This corpus contains ASCII versions of the CELEX lexical databases of English (Version 2.5), Dutch (Version 3.1) and German (Version 2.0). CELEX was developed as a joint enterprise of the University of Nijmegen, the Institute for Dutch Lexicology in Leiden, the Max Planck Institute for Psycholinguistics in Nijmegen, and the Institute for Perception Research in Eindhoven. Pre-mastering and production were done by the LDC.

For each language, this data set contains detailed information on:
· orthography (variations in spelling, hyphenation)
· phonology (phonetic transcriptions, variations in pronunciation, syllable structure, primary stress)
· morphology (derivational and compositional structure, inflectional paradigms)
· syntax (word class, word class-specific subcategorizations, argument structures)
· word frequency (summed word and lemma counts, based on recent and representative text corpora)

The databases have not been tailored to fit any particular database management program. Instead, the information is in ASCII files in a UNIX directory tree that can be queried with tools such as AWK or ICON. Unique identity numbers allow the linking of information from different files. Some kinds of information have to be computed online; wherever necessary, AWK functions have been provided to recover this information. README files specify the details of their use. A detailed User Guide describing the various kinds of lexical information available is supplied. All sections of this guide are PostScript files, except for some additional notes on the German lexicon in plain ASCII.

CELEX-2

The second release of CELEX contains an enhanced, expanded version of the German lexical database (2.5), featuring approximately 1,000 new lemma entries, revised morphological parses, verb argument structures, inflectional paradigm codes and a corpus type lexicon. A complete PostScript version of the Germanic Linguistic Guide is also included, in both European A4 format and American Letter format. For German, the total number of lemmas included is now 51,728, while all their inflected forms number 365,530. Moreover, phonetic syllable frequencies have been added for (British) English and Dutch. Apart from this, and the provision of frequency information alongside every lexical feature, no changes have been made to the Dutch and English lexicons. Complete AWK scripts are now provided to compute representations not found in the (plain ASCII) lexical data files, corresponding to the features described in the CELEX User Guide, which is included as well.

For each language, i.e. English, German and Dutch, the data contains detailed information on the orthography (variations in spelling, hyphenation), the phonology (phonetic transcriptions, variations in pronunciation, syllable structure, primary stress), the morphology (derivational and compositional structure, inflectional paradigms), the syntax (word class, word-class specific subcategorisation, argument structures) and word frequency (summed word and lemma counts, based on recent and representative text corpora) of both wordforms and lemmas. Unique identity numbers allow the linking of information from different files with the aid of an efficient, index-based C program.

Like its predecessor, this release is mastered using the ISO 9660 data format, with the Rock Ridge extensions, allowing it to be used in VMS, MS-DOS, Macintosh and UNIX environments. As the new release does not omit any data from the first edition, the current release will replace the old one.

Updates

Petra Steiner has developed a number of scripts to modify and update CELEX2 to a modern format. They are available on her GitHub page. LREC papers related to these updates are accessible at the following URLs: http://aclweb.org/anthology/W17-7619 & http://www.lrec-conf.org/proceedings/lrec2016/summaries/761.html.
Access updated Celex A import data India with HS Code, price, importers list, Indian ports, exporting countries, and verified Celex A buyers in India.
Celex Group Srl Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Explore detailed Artichoke Extract import data of Celex Laboratories Inc in the USA—product details, price, quantity, origin countries, and US ports.
Celex Laboratories Inc Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Explore detailed Ashwagandha import data of Celex Laboratories Inc in the USA—product details, price, quantity, origin countries, and US ports.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Words with larger morphological families elicit shorter response times (RTs) in lexical decision experiments (e.g., Bertram, Baayen, & Schreuder, 2000). One possible account of this family size (FS) effect draws on Discriminative Lexicon Theory (Baayen et al., 2011), positing that morphological families strengthen relationships between forms and meanings. While Discriminative Lexicon Theory successfully explains FS effects in reading (Mulder et al., 2014), we investigate whether it also does so in listening. We employed the computational model LDL-AURIS (Shafaei-Bajestan et al., 2023), which is based on Discriminative Lexicon Theory, and show that, while it predicts auditory lexical decision RTs collected in a large-scale Dutch lexical decision experiment (BALDEY; Ernestus & Cutler, 2015), it only partially explains the variance in the RTs that is explained by FS. This shows that Discriminative Lexicon Theory in its current form cannot fully explain FS effects in listening. We discuss possible reasons for this finding.
This Data Sharing Collection (DSC) includes: a) LDL-AURIS-based predictions of reaction times (RTs) in BALDEY. b) A BALDEY dataset enriched with three different family size (FS) measures. c) Control variables used in the associated experiment. d) All necessary scripts to derive the above materials.
Changes in this version compared with the first version: a) The analysis script now includes code to recreate the figures shown in the associated article. b) Enhancements have been made to the preprocessing steps in the analysis script. c) The documentation has been updated.
Please note that the collection does not include the CELEX or CGN databases used for computing FS measures and training the LDL-AURIS model, as we do not have the license to share them. Researchers with access to CELEX and CGN can use the provided scripts to recreate the LDL-AURIS model and FS measures. Those without access will need to use the enriched materials provided in this collection.
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
The German-French Reciprocal Parallel Corpus (GeFRePaC) was produced by the Multilinguale Forschung/Multilingual Research Abteilung Lexik, Institut für Deutsche Sprache (Germany) with funding from ELRA in the framework of the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335).

GeFRePaC is a 30 million word corpus (15 million for each language) for the purpose of developing, enhancing and improving translation aids (dictionaries, lexicons, platforms) for French-German and German-French translation. The database consists of the following parallel corpora:
· European Union CELEX Database: Treaties, Foreign relations, Law, Complementary Law and all the published documents of the "European Parliament". Celex-Database: 22,000,000 words (German+French)
· Europarl: 8,320,000 words (German+French)

It covers natural general language as used in public socio-political discourse, with a focus on multilingual administration and on commercial and legal documentation. GeFRePaC comprises a large variety of text types for which there is a rapidly growing need for translation but which currently defy successful machine translation.

The corpus is encoded according to the PAROLE guidelines. It was aligned on the sentence level and, for single-word translation units, on the lexical level, POS-tagged in conformity with EAGLES recommendations, and validated according to the most current version of the ELRA guidelines. The parallel German-French texts were aligned using a program developed at the Equipe Langue et Dialogue, Laboratoire Loria, Nancy. The text files containing markup for paragraphs and sentences were processed by the Tree Tagger developed at the IMS Stuttgart. The text files are automatically converted into TEI-conformant SGML format.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The epoch data are stored as a FieldTrip-format structure. All epochs are aligned to fixation onset on a given word and are one second long.

Epochs are named by combining acquisition date, subject code, and data type. Epochs whose names end with 'Targ60' are from the first sentence set; those ending with 'JEP60' are from the second sentence set (adapted from Degno et al. 2019, Journal of Experimental Psychology); those ending with 'BL' are the baseline period of the first sentence set.

The field 'trialinfo' holds the information about each trial. Its columns are as follows:
1- sentence_id: sentence number for this epoch
2- word_loc: the location of the current word in a sentence
3- loc2targ: location distance between the current word and the target word; loc2targ for pre-target, target, and post-target words is -1, 0, and 1
4- word_freq: CELEX frequency
5- word_length
6- saccade2this_duration: duration of the saccade toward this word
7- fixation_on_MEG: MEG trigger for fixation onset on this word
8- fixation_duration
9- NextOrder: next word location minus the current word location; a negative value indicates a backward saccade to previous words
10- FirstPassFix: whether this fixation is the first one on this word or not
11- PreviousOrder: previous word location minus the current word location; a negative value indicates a forward saccade to the next words
12- SentenceCondition: whether the current word is in a sentence with a high or low frequency target word; 1 -- low, 2 -- high
13- PupilSize: averaged pupil size during this fixation
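The column layout above can be turned into named records for easier filtering. The following is a minimal sketch, assuming the 'trialinfo' matrix has already been exported from the FieldTrip structure into plain rows of numbers, and assuming that FirstPassFix is coded as 1 for a first fixation (the description does not state the coding). The function names and the sample rows are invented for illustration.

```python
# Column names in the order given in the dataset description.
TRIALINFO_COLUMNS = (
    "sentence_id", "word_loc", "loc2targ", "word_freq", "word_length",
    "saccade2this_duration", "fixation_on_MEG", "fixation_duration",
    "NextOrder", "FirstPassFix", "PreviousOrder", "SentenceCondition",
    "PupilSize",
)


def as_records(trialinfo):
    """Pair every row of the trialinfo matrix with the column names."""
    return [dict(zip(TRIALINFO_COLUMNS, row)) for row in trialinfo]


def first_pass_target_fixations(records):
    """Keep fixations on the target word (loc2targ == 0) that are first-pass.

    Assumption: FirstPassFix == 1 marks the first fixation on a word.
    """
    return [r for r in records if r["loc2targ"] == 0 and r["FirstPassFix"] == 1]


# Made-up rows, one per epoch, 13 values each.
demo_trialinfo = [
    [1, 3, -1, 2.1, 5, 30, 11, 210, 1, 1, -1, 1, 900],
    [1, 4, 0, 1.2, 6, 28, 12, 250, 1, 1, -1, 1, 905],
    [1, 4, 0, 1.2, 6, 25, 12, 180, 1, 0, 0, 1, 910],
    [2, 5, 0, 3.4, 4, 27, 12, 230, 1, 1, -1, 2, 880],
]

records = as_records(demo_trialinfo)
targets = first_pass_target_fixations(records)
mean_duration = sum(r["fixation_duration"] for r in targets) / len(targets)
print(len(targets), mean_duration)
```

Naming the columns once keeps later selections readable, for example splitting target fixations by SentenceCondition to compare low and high frequency targets.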