14 datasets found

CELEX Dutch lexical database - Frequency Subset
catalogue.elra.info
live.european-language-grid.eu
Updated Oct 5, 2005
Cite
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2005). CELEX Dutch lexical database - Frequency Subset [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-L0029_07/
Dataset updated
Oct 5, 2005
Dataset provided by
ELRA (European Language Resources Association)
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
Description
The Dutch CELEX data is derived from R.H. Baayen, R. Piepenbrock & L. Gulikers, The CELEX Lexical Database (CD-ROM), Release 2, Dutch Version 3.1, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, 1995.
Apart from orthographic features, the CELEX database comprises representations of the phonological, morphological, syntactic and frequency properties of lemmata. For the Dutch data, frequencies have been disambiguated on the basis of the 42.4m Dutch Instituut voor Nederlandse Lexicologie text corpora.
To make for greater compatibility with other operating systems, the databases have not been tailored to fit any particular database management program. Instead, the information is presented in a series of plain ASCII files, which can be queried with tools such as AWK and ICON. Unique identity numbers allow the linking of information from different files.
This database can be divided into different subsets:
· orthography: with or without diacritics, with or without word division positions, alternative spellings, number of letters/syllables;
· phonology: phonetic transcriptions with syllable boundaries or primary and secondary stress markers, consonant-vowel patterns, number of phonemes/syllables, alternative pronunciations, frequency per phonetic syllable within words;
· morphology: division into stems and affixes, flat or hierarchical representations, stems and their inflections;
· syntax: word class, subcategorisations per word class;
· frequency of the entries: disambiguated for homographic lemmata.
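The description says that unique identity numbers link information across the plain ASCII files. A minimal Python sketch of that kind of join follows. It is an illustration only: the backslash field delimiter, the two-column layouts, and the sample values are assumptions made for the example, not the documented CELEX file format, and the function names are hypothetical.

```python
from typing import Dict, List


def read_by_id(text: str, delimiter: str = "\\") -> Dict[str, List[str]]:
    """Index delimited records by their first field (assumed to be the ID number)."""
    records: Dict[str, List[str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        records[fields[0]] = fields[1:]
    return records


def link(left: Dict[str, List[str]], right: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Join two indexed files on the IDs they share, concatenating their fields."""
    return {key: left[key] + right[key] for key in left if key in right}


# Hypothetical sample contents; real files have many more columns.
ORTHOGRAPHY = "1\\aap\n2\\boom\n"
FREQUENCY = "1\\120\n2\\950\n3\\7\n"

linked = link(read_by_id(ORTHOGRAPHY), read_by_id(FREQUENCY))
for id_number, fields in linked.items():
    print(id_number, fields)
```

Only IDs present in both inputs survive the join, so record 3 in the hypothetical frequency data is dropped. The delimiter is a parameter so the sketch can be adapted once the actual layout has been checked against the README files shipped with the data.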
Data from: CELEX2
dataone.org
borealisdata.ca
Updated Dec 28, 2023
Cite
Baayen, R H.; Piepenbrock, R; Gulikers, L (2023). CELEX2 [Dataset]. http://doi.org/10.5683/SP2/XGW4WY
Unique identifier
https://doi.org/10.5683/SP2/XGW4WY
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Baayen, R H.; Piepenbrock, R; Gulikers, L
Description
Introduction
This corpus contains ASCII versions of the CELEX lexical databases of English (Version 2.5), Dutch (Version 3.1) and German (Version 2.0). CELEX was developed as a joint enterprise of the University of Nijmegen, the Institute for Dutch Lexicology in Leiden, the Max Planck Institute for Psycholinguistics in Nijmegen, and the Institute for Perception Research in Eindhoven. Pre-mastering and production were done by the LDC.
For each language, this data set contains detailed information on:
· orthography (variations in spelling, hyphenation)
· phonology (phonetic transcriptions, variations in pronunciation, syllable structure, primary stress)
· morphology (derivational and compositional structure, inflectional paradigms)
· syntax (word class, word class-specific subcategorizations, argument structures)
· word frequency (summed word and lemma counts, based on recent and representative text corpora)
The databases have not been tailored to fit any particular database management program. Instead, the information is in ASCII files in a UNIX directory tree that can be queried with tools such as AWK or ICON. Unique identity numbers allow the linking of information from different files. Some kinds of information have to be computed online; wherever necessary, AWK functions have been provided to recover this information. README files specify the details of their use. A detailed User Guide describing the various kinds of lexical information available is supplied. All sections of this guide are PostScript files, except for some additional notes on the German lexicon in plain ASCII.
CELEX-2
The second release of CELEX contains an enhanced, expanded version of the German lexical database (2.5), featuring approximately 1,000 new lemma entries, revised morphological parses, verb argument structures, inflectional paradigm codes and a corpus type lexicon. A complete PostScript version of the Germanic Linguistic Guide is also included, in both European A-4 format and American Letter format. For German, the total number of lemmas included is now 51,728, while all their inflected forms number 365,530. Moreover, phonetic syllable frequencies have been added for (British) English and Dutch. Apart from this, and provision of frequency information alongside every lexical feature, no changes have been made to the Dutch and English lexicons. Complete AWK scripts are now provided to compute representations not found in the (plain ASCII) lexical data files, corresponding to the features described in the CELEX User Guide, which is included as well.
For each language, i.e. English, German and Dutch, the data contains detailed information on the orthography (variations in spelling, hyphenation), the phonology (phonetic transcriptions, variations in pronunciation, syllable structure, primary stress), the morphology (derivational and compositional structure, inflectional paradigms), the syntax (word class, word-class specific subcategorisation, argument structures) and word frequency (summed word and lemma counts, based on recent and representative text corpora) of both wordforms and lemmas. Unique identity numbers allow the linking of information from different files with the aid of an efficient, index-based C program. Like its predecessor, this release is mastered using the ISO 9660 data format, with the Rock Ridge extensions, allowing it to be used in VMS, MS-DOS, Macintosh and UNIX environments. As the new release does not omit any data from the first edition, the current release will replace the old one.
Updates
Petra Steiner has developed a number of scripts to modify and update CELEX2 to a modern format. They are available on her GitHub page. LREC papers related to these updates are accessible at the following URLs: http://aclweb.org/anthology/W17-7619 and http://www.lrec-conf.org/proceedings/lrec2016/summaries/761.html.
CELEX Dutch lexical database - Complete set
catalogue.elra.info
live.european-language-grid.eu
Updated Oct 5, 2005
Cite
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2005). CELEX Dutch lexical database - Complete set [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-L0029_01/
Dataset updated
Oct 5, 2005
Dataset provided by
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
ELRA (European Language Resources Association)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
Description
The Dutch CELEX data is derived from R.H. Baayen, R. Piepenbrock & L. Gulikers, The CELEX Lexical Database (CD-ROM), Release 2, Dutch Version 3.1, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, 1995.
Apart from orthographic features, the CELEX database comprises representations of the phonological, morphological, syntactic and frequency properties of lemmata. For the Dutch data, frequencies have been disambiguated on the basis of the 42.4m Dutch Instituut voor Nederlandse Lexicologie text corpora.
To make for greater compatibility with other operating systems, the databases have not been tailored to fit any particular database management program. Instead, the information is presented in a series of plain ASCII files, which can be queried with tools such as AWK and ICON. Unique identity numbers allow the linking of information from different files.
This database can be divided into different subsets:
· orthography: with or without diacritics, with or without word division positions, alternative spellings, number of letters/syllables;
· phonology: phonetic transcriptions with syllable boundaries or primary and secondary stress markers, consonant-vowel patterns, number of phonemes/syllables, alternative pronunciations, frequency per phonetic syllable within words;
· morphology: division into stems and affixes, flat or hierarchical representations, stems and their inflections;
· syntax: word class, subcategorisations per word class;
· frequency of the entries: disambiguated for homographic lemmata.
CELEX Dutch lexical database - Inflectional Morphology Subset
catalogue.elra.info
live.european-language-grid.eu
Updated Oct 5, 2005
Cite
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2005). CELEX Dutch lexical database - Inflectional Morphology Subset [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-L0029_04/
Dataset updated
Oct 5, 2005
Dataset provided by
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
ELRA (European Language Resources Association)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
Description
The Dutch CELEX data is derived from R.H. Baayen, R. Piepenbrock & L. Gulikers, The CELEX Lexical Database (CD-ROM), Release 2, Dutch Version 3.1, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, 1995.
Apart from orthographic features, the CELEX database comprises representations of the phonological, morphological, syntactic and frequency properties of lemmata. For the Dutch data, frequencies have been disambiguated on the basis of the 42.4m Dutch Instituut voor Nederlandse Lexicologie text corpora.
To make for greater compatibility with other operating systems, the databases have not been tailored to fit any particular database management program. Instead, the information is presented in a series of plain ASCII files, which can be queried with tools such as AWK and ICON. Unique identity numbers allow the linking of information from different files.
This database can be divided into different subsets:
· orthography: with or without diacritics, with or without word division positions, alternative spellings, number of letters/syllables;
· phonology: phonetic transcriptions with syllable boundaries or primary and secondary stress markers, consonant-vowel patterns, number of phonemes/syllables, alternative pronunciations, frequency per phonetic syllable within words;
· morphology: division into stems and affixes, flat or hierarchical representations, stems and their inflections;
· syntax: word class, subcategorisations per word class;
· frequency of the entries: disambiguated for homographic lemmata.
CELEX - Dataset - LDM
service.tib.eu
Updated Dec 16, 2024
Cite
(2024). CELEX - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/celex
Dataset updated
Dec 16, 2024
Description
The CELEX dataset is a large lexical database containing phonological transcriptions of English words.
Celex A Import Data India – Buyers & Importers List
seair.co.in
Cite
Seair Exim Solutions, Celex A Import Data India – Buyers & Importers List [Dataset]. https://www.seair.co.in/celex-a-import-data.aspx
Available download formats: .text, .csv, .xml, .xls, .bin
Dataset authored and provided by
Seair Exim Solutions
Area covered
India
Description
Access updated Celex A import data India with HS Code, price, importers list, Indian ports, exporting countries, and verified Celex A buyers in India.
Celex Group Srl Export Import Data | Eximpedia
eximpedia.app
Updated Jan 12, 2025
Cite
(2025). Celex Group Srl Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/companies/celex-group-srl/67560859
Dataset updated
Jan 12, 2025
Description
Celex Group Srl Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Artichoke Extract Import Data | Celex Laboratories Inc
seair.co.in
Updated Mar 6, 2024
Cite
Seair Exim Solutions (2024). Artichoke Extract Import Data | Celex Laboratories Inc [Dataset]. https://www.seair.co.in/us-import/product-artichoke-extract/i-celex-laboratories-inc.aspx
Available download formats: .text, .csv, .xml, .xls, .bin
Dataset updated
Mar 6, 2024
Dataset authored and provided by
Seair Exim Solutions
Description
Explore detailed Artichoke Extract import data of Celex Laboratories Inc in the USA—product details, price, quantity, origin countries, and US ports.
Celex Laboratories Inc Export Import Data | Eximpedia
eximpedia.app
Updated Sep 14, 2025
Cite
(2025). Celex Laboratories Inc Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/companies/celex-laboratories-inc/98894285
Dataset updated
Sep 14, 2025
Description
Celex Laboratories Inc Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Ashwagandha Import Data | Celex Laboratories Inc
seair.co.in
Updated Feb 26, 2024
Cite
Seair Exim Solutions (2024). Ashwagandha Import Data | Celex Laboratories Inc [Dataset]. https://www.seair.co.in/us-import/product-ashwagandha/i-celex-laboratories-inc.aspx
Available download formats: .text, .csv, .xml, .xls, .bin
Dataset updated
Feb 26, 2024
Dataset authored and provided by
Seair Exim Solutions
Description
Explore detailed Ashwagandha import data of Celex Laboratories Inc in the USA—product details, price, quantity, origin countries, and US ports.
Data from: Can Discriminative Lexicon Theory account for the family size...
data.ru.nl
Updated Oct 26, 2025
Cite
Hanno Müller; Louis ten Bosch; Mirjam Ernestus (2025). Can Discriminative Lexicon Theory account for the family size effect in auditory word recognition? [Dataset]. http://doi.org/10.34973/x6v3-yj45
Available download formats (61601763 bytes)
Unique identifier
https://doi.org/10.34973/x6v3-yj45
Dataset updated
Oct 26, 2025
Dataset provided by
Radboud University
Authors
Hanno Müller; Louis ten Bosch; Mirjam Ernestus
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Words with larger morphological families elicit shorter response times (RTs) in lexical decision experiments (e.g., Bertram, Baayen, & Schreuder, 2000). One possible account of this family size (FS) effect draws on Discriminative Lexicon Theory (Baayen et al., 2011), positing that morphological families strengthen relationships between forms and meanings. While Discriminative Lexicon Theory successfully explains FS effects in reading (Mulder et al., 2014), we investigate whether it also does so in listening. We employed the computational model LDL-AURIS (Shafaei-Bajestan et al., 2023), which is based on Discriminative Lexicon Theory, and show that, while it predicts auditory lexical decision RTs collected in a large-scale Dutch lexical decision experiment (BALDEY; Ernestus & Cutler, 2015), it only partially explains the variance in the RTs that is explained by FS. This shows that Discriminative Lexicon Theory in its current form cannot fully explain FS effects in listening. We discuss possible reasons for this finding.

This Data Sharing Collection (DSC) includes: a) LDL-AURIS-based predictions of reaction times (RTs) in BALDEY. b) A BALDEY dataset enriched with three different family size (FS) measures. c) Control variables used in the associated experiment. d) All necessary scripts to derive the above materials.

Changes in this version in comparison to the first version are: a) The analysis script now includes code to recreate the figures shown in the associated article. b) Enhancements have been made to the preprocessing steps in the analysis script. c) The documentation has been updated.

Please note that the collection does not include the CELEX or CGN databases used for computing FS measures and training the LDL-AURIS model, as we do not have the license to share them. Researchers with access to CELEX and CGN can use the provided scripts to recreate the LDL-AURIS model and FS measures. Those without access will need to use the enriched materials provided in this collection.
Celex Laboratories Inc Export Import Data | Eximpedia
eximpedia.app
Updated Jan 9, 2025
Cite
Seair Exim (2025). Celex Laboratories Inc Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/
Available download formats: .bin, .xml, .csv, .xls
Dataset updated
Jan 9, 2025
Dataset provided by
Eximpedia PTE LTD
Eximpedia Export Import Trade Data
Authors
Seair Exim
Area covered
United Arab Emirates, Angola, Sierra Leone, Nauru, Western Sahara, Canada, Iran (Islamic Republic of), Egypt, Burundi, Jordan
Description
Celex Laboratories Inc Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
GeFRePaC - German French Reciprocal Parallel Corpus
catalogue.elra.info
live.european-language-grid.eu
Updated Jun 26, 2017
Cite
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2017). GeFRePaC - German French Reciprocal Parallel Corpus [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-W0031/
Dataset updated
Jun 26, 2017
Dataset provided by
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
ELRA (European Language Resources Association)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
Area covered
French
Description
The German-French Reciprocal Parallel Corpus (GeFRePaC) was produced by the Multilinguale Forschung/Multilingual Research Abteilung Lexik, Institut für Deutsche Sprache (Germany) with funding from ELRA in the framework of the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335).
GeFRePaC is a 30 million word corpus (15 million for each language) for the purpose of developing, enhancing and improving translation aids (dictionaries, lexicons, platforms) for French-German and German-French translation. The database consists of the following parallel corpora:
· European Union CELEX Database: Treaties, Foreign relations, Law, Complementary Law and all the published documents of the "European Parliament".
· Celex-Database: 22,000,000 words (German+French)
· Europarl: 8,320,000 words (German+French)
It covers natural general language as used in public socio-political discourse and has a focus on multilingual administration and commercial and legal documentation. GeFRePaC comprises a large variety of text types for which there is a rapidly growing need for translation but which currently defy successful machine translation.
The corpus is encoded according to the PAROLE guidelines. It was aligned on the sentence level and, for single-word translation units, on the lexical level; POS-tagged in conformity with EAGLES recommendations; and validated according to the most current version of the ELRA guidelines. The parallel German-French texts were aligned using a program developed at the Equipe Langue et Dialogue, Laboratoire Loria, Nancy. The text files containing markup for paragraphs and sentences were processed by the TreeTagger developed at the IMS Stuttgart. The text files are automatically converted into TEI-conformant SGML format.
Epoch data
figshare.com
hdf
Updated Jul 14, 2021
Cite
Yali Pan (2021). Epoch data [Dataset]. http://doi.org/10.6084/m9.figshare.14963730.v10
Available download formats: hdf
Unique identifier
https://doi.org/10.6084/m9.figshare.14963730.v10
Dataset updated
Jul 14, 2021
Dataset provided by
figshare
Figshare (http://figshare.com/)
Authors
Yali Pan
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The epoch data are a FieldTrip-format structure; all epochs are aligned to fixation onset on a given word and are one second long.
Epochs are named as the combination of acquisition date, subject code, and data type. Epochs ending in 'Targ60' are from the first sentence set; those ending in 'JEP60' are from the second sentence set (adapted from Degno et al. 2019, Journal of Experimental Psychology); those ending in 'BL' are the baseline period of the first sentence set.
The field 'trialinfo' holds the information about each trial. The column headers are as follows:
1- sentence_id: sentence number for this epoch
2- word_loc: the location of the current word in a sentence
3- loc2targ: location distance between the current word and target word; loc2targ for pre-target, target, post-target are -1, 0, and 1
4- word_freq: CELEX frequency
5- word_length
6- saccade2this_duration: saccade duration toward this word
7- fixation_on_MEG: MEG trigger for fixation onset to this word
8- fixation_duration
9- NextOrder: next word location minus the current word location; negative value indicates saccade backward to the previous words
10- FirstPassFix: whether this fixation is the first for this word or not
11- PreviousOrder: previous word location minus the current word location; negative value indicates saccade forward to the next words
12- SentenceCondition: the current word is in a sentence with high or low frequency target word; 1 -- low, 2 -- high
13- PupilSize: averaged pupil size during this fixation
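The 'trialinfo' column layout described above can be sketched in Python as a mapping from column position to name. This is a hedged illustration, not code shipped with the dataset: it assumes each trial has already been loaded as a sequence of 13 numeric values in the listed order, and the helper name `label_trial` and the added `position` and `target_frequency` keys are hypothetical.

```python
from typing import Dict, List, Sequence

# Column names in the order given in the dataset description.
TRIALINFO_COLUMNS: List[str] = [
    "sentence_id", "word_loc", "loc2targ", "word_freq", "word_length",
    "saccade2this_duration", "fixation_on_MEG", "fixation_duration",
    "NextOrder", "FirstPassFix", "PreviousOrder", "SentenceCondition",
    "PupilSize",
]

# Codes stated in the description.
LOC2TARG_LABELS = {-1: "pre-target", 0: "target", 1: "post-target"}
SENTENCE_CONDITION_LABELS = {1: "low", 2: "high"}


def label_trial(row: Sequence[float]) -> Dict[str, object]:
    """Turn one trialinfo row into a dict keyed by column name."""
    if len(row) != len(TRIALINFO_COLUMNS):
        raise ValueError(
            f"expected {len(TRIALINFO_COLUMNS)} values, got {len(row)}"
        )
    trial: Dict[str, object] = dict(zip(TRIALINFO_COLUMNS, row))
    # Hypothetical convenience keys decoding the two coded columns.
    trial["position"] = LOC2TARG_LABELS.get(int(row[2]), "other")
    trial["target_frequency"] = SENTENCE_CONDITION_LABELS.get(int(row[11]), "unknown")
    return trial


# Made-up example row, for illustration only.
example = label_trial([3, 5, 0, 12.5, 6, 0.03, 64, 0.21, 1, 1, -1, 2, 1050.0])
print(example["position"], example["target_frequency"])
```

Words further than one position from the target get the label "other", since the description only defines the codes -1, 0, and 1.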