The idea behind this dataset is to make negative and positive word senses available for research purposes. It was made for research related to linguistics, such as NLP, AI, behaviour detection and more. It draws on an analysis of scientific abstracts that:
1. Researched whether language used in science abstracts can skew towards the use of strikingly positive and negative words over time.
2. Normalised the yearly frequencies of positive, negative, and neutral words, plus 100 randomly selected words, for the total number of abstracts.
3. Included subanalyses of pattern quantification of individual words, specificity for selected high impact journals, and comparison between author affiliations within or outside countries with English as the official majority language.
In that analysis, frequency patterns were compared with 4% of all books ever printed and digitised, by use of the Google Books Ngram Viewer. Main outcome measures: frequencies of positive and negative words in abstracts compared with frequencies of words with a neutral and random connotation, expressed as relative change since 1980. This dataset can help in these tasks too. Results: the absolute frequency of positive words increased from 2.0% (1974-80) to 17.5% (2014), a relative increase of 880% over four decades. All 25 individual positive words contributed to the rise, particularly the words “robust,” “novel,” “innovative,” and “unprecedented,” which increased in relative frequency up to 15 000%. Comparable but less pronounced results were obtained when restricting the analysis to selected journals with high impact factors. Authors affiliated to an institute in a non-English speaking country used significantly more positive words. Negative word frequencies increased from 1.3% (1974-80) to 3.2% (2014), a relative increase of 257%. Over the same time period, no apparent increase was found in neutral or random word use, or in the frequency of positive word use in published books. The lexicographic analysis thus indicates that scientific abstracts are currently written with more positive and negative words, and provides an insight into the evolution of scientific writing. Apparently scientists look on the bright side of research results. This dataset can therefore play a major role in research.
About the dataset:
1. The dataset is in Excel file format.
2. The dataset has two columns: (I) Negative Word List and (II) Positive Word List.
3. The dataset contains 4699 positive words and 4722 negative words in total.
4. The data were collected from different sources.
5. The dataset has some null (NaN) values.
6. Please check the data before use.
The aim is to see how it can help in many NLP-related tasks.
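Because the description mentions a two-column layout with some null values, a cleaning pass before use could look like the following minimal sketch. It assumes the Excel sheet has been exported to CSV; the column headers are taken from the description, while the sample rows and the `load_word_lists` helper are hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical CSV export of the Excel sheet. The headers follow the
# description above; the rows are made-up examples, not real data.
SAMPLE = """Negative Word List,Positive Word List
abnormal,admirable
nan,brilliant
awful,
bad,nan
"""

# Cell values treated as null. This set is an assumption; extend it if
# the real file marks missing values differently.
NULL_MARKERS = {"", "nan"}


def load_word_lists(handle):
    """Return (negative, positive) word lists with null-like cells dropped."""
    negative, positive = [], []
    for row in csv.DictReader(handle):
        for column, bucket in (
            ("Negative Word List", negative),
            ("Positive Word List", positive),
        ):
            word = (row.get(column) or "").strip().lower()
            if word not in NULL_MARKERS:
                bucket.append(word)
    return negative, positive


negative_words, positive_words = load_word_lists(io.StringIO(SAMPLE))
```

The two columns have different lengths in the real dataset (4722 versus 4699 words), so they are collected into separate lists rather than kept as paired rows.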
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Last Version: 4
Authors: Carlota Balsa-Sánchez, Vanesa Loureiro
Date of data collection: 2022/12/15
General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v4.xlsx: full list of 140 academic journals in which data papers and/or software papers could be published
- data_articles_journal_list_v4.csv: full list of 140 academic journals in which data papers and/or software papers could be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 4th version
- Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types
- Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR), Scopus and Web of Science (WOS), Journal Master List.
Version: 3
Authors: Carlota Balsa-Sánchez, Vanesa Loureiro
Date of data collection: 2022/10/28
General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v3.xlsx: full list of 124 academic journals in which data papers and/or software papers could be published
- data_articles_journal_list_3.csv: full list of 124 academic journals in which data papers and/or software papers could be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 3rd version
- Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types
- Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR).
Erratum - Data articles in journals Version 3:
Botanical Studies -- ISSN 1999-3110 -- JCR (JIF) Q2
Data -- ISSN 2306-5729 -- JCR (JIF) n/a
Data in Brief -- ISSN 2352-3409 -- JCR (JIF) n/a
Version: 2
Author: Francisco Rubio, Universitat Politècnica de València.
Date of data collection: 2020/06/23
General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in data journals or in standard academic journals. The Excel and CSV files contain a list of academic journals that publish data papers and software papers.
File list:
- data_articles_journal_list_v2.xlsx: full list of 56 academic journals in which data papers and/or software papers could be published
- data_articles_journal_list_v2.csv: full list of 56 academic journals in which data papers and/or software papers could be published
Relationship between files: both files have the same information. Two different formats are offered to improve reuse
Type of version of the dataset: final processed version
Versions of the files: 2nd version
- Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types
- Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Scimago Journal and Country Rank (SJR)
Total size: 32 KB
Version 1: Description
This dataset contains a list of journals that publish data articles, code, software articles and database articles.
The search strategy in DOAJ and Ulrichsweb was to search for the word "data" in journal titles.
Acknowledgements:
Xaquín Lores Torres for his invaluable help in preparing this dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We provide two files: an Excel file named "Global data set on micro- and mesoplastic loads in marine sediments" and a PDF file named "Metadata-Dataset". The Excel file provides the dataset and the list of references from which the data were extracted or derived. The PDF file provides a detailed description of the dataset and of the methods used to extract and derive data.
The EU-U.S. and Swiss-U.S. Privacy Shield Frameworks are mechanisms that companies can use to comply with data protection requirements when transferring personal data from the European Union and Switzerland to the United States. ITA's Privacy Shield Team maintains two lists that are made available to the public: 1) the Privacy Shield Active List, and 2) the Privacy Shield Inactive List. The Active List is an authoritative list of U.S. organizations that have self-certified to the Department of Commerce and declared their commitment to adhere to the Privacy Shield Principles. The Inactive List is an authoritative list of U.S. organizations that are no longer self-certified under Privacy Shield and are therefore no longer assured of the benefits of using Privacy Shield to receive personal data from the European Union and/or Switzerland. Upon request, the Privacy Shield Team may provide a copy of the list in the form of an Excel spreadsheet.
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
ISO 3166-1-alpha-2 English country names and code elements. This list states the country names (official short names in English) in alphabetical order as given in ISO 3166-1 and the corresponding ISO 3166-1-alpha-2 code elements.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General description
This dataset contains some markers of Open Science in the publications of the Chemical Biology Consortium Sweden (CBCS) between 2010 and July 2023. The sample of CBCS publications during this period consists of 188 articles. Every publication was visited manually at its DOI URL to answer the following questions.
1. Is the research article an Open Access publication?
2. Does the research article have a Creative Commons license or a similar license?
3. Does the research article contain a data availability statement?
4. Did the authors submit data of their study to a repository such as EMBL, Genbank, Protein Data Bank PDB, Cambridge Crystallographic Data Centre CCDC, Dryad or a similar repository?
5. Does the research article contain supplementary data?
6. Do the supplementary data have a persistent identifier that makes them citable as a defined research output?
Variables
The data were compiled in a Microsoft Excel 365 document that includes the following variables.
1. DOI URL of research article
2. Year of publication
3. Research article published with Open Access
4. License for research article
5. Data availability statement in article
6. Supplementary data added to article
7. Persistent identifier for supplementary data
8. Authors submitted data to NCBI or EMBL or PDB or Dryad or CCDC
Visualization
Parts of the data were visualized in two figures as bar diagrams using Microsoft Excel 365. The first figure displays the number of publications during a year, the number of publications that are published with open access and the number of publications that contain a data availability statement (Figure 1). The second figure shows the number of publications per year and how many publications contain supplementary data. This figure also shows how many of the supplementary datasets have a persistent identifier (Figure 2).
File formats and software
The file formats used in this dataset are:
- .csv (Text file)
- .docx (Microsoft Word 365 file)
- .jpg (JPEG image file)
- .pdf/A (Portable Document Format for archiving)
- .png (Portable Network Graphics image file)
- .pptx (Microsoft Power Point 365 file)
- .txt (Text file)
- .xlsx (Microsoft Excel 365 file)
All files can be opened with Microsoft Office 365 and are likely to also work with the older versions Office 2019 and 2016.
MD5 checksums
Here is a list of all files of this dataset and of their MD5 checksums.
1. Readme.txt (MD5: 795f171be340c13d78ba8608dafb3e76)
2. Manifest.txt (MD5: 46787888019a87bb9d897effdf719b71)
3. Materials_and_methods.docx (MD5: 0eedaebf5c88982896bd1e0fe57849c2)
4. Materials_and_methods.pdf (MD5: d314bf2bdff866f827741d7a746f063b)
5. Materials_and_methods.txt (MD5: 26e7319de89285fc5c1a503d0b01d08a)
6. CBCS_publications_until_date_2023_07_05.xlsx (MD5: 532fec0bd177844ac0410b98de13ca7c)
7. CBCS_publications_until_date_2023_07_05.csv (MD5: 2580410623f79959c488fdfefe8b4c7b)
8. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.xlsx (MD5: 9c67dd84a6b56a45e1f50a28419930e5)
9. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.csv (MD5: fb3ac69476bfc57a8adc734b4d48ea2b)
10. Aggregated_data_from_CBCS_publications_until_2023_07_05.xlsx (MD5: 6b6cbf3b9617fa8960ff15834869f793)
11. Aggregated_data_from_CBCS_publications_until_2023_07_05.csv (MD5: b2b8dd36ba86629ed455ae5ad2489d6e)
12. Figure_1_CBCS_publications_until_2023_07_05_Open_Access_and_data_availablitiy_statement.xlsx (MD5: 9c0422cf1bbd63ac0709324cb128410e)
13. Figure_1.pptx (MD5: 55a1d12b2a9a81dca4bb7f333002f7fe)
14. Image_of_figure_1.jpg (MD5: 5179f69297fbbf2eaaf7b641784617d7)
15. Image_of_figure_1.png (MD5: 8ec94efc07417d69115200529b359698)
16. Figure_2_CBCS_publications_until_2023_07_05_supplementary_data_and_PID_for_supplementary_data.xlsx (MD5: f5f0d6e4218e390169c7409870227a0a)
17. Figure_2.pptx (MD5: 0fd4c622dc0474549df88cf37d0e9d72)
18. Image_of_figure_2.jpg (MD5: c6c68b63b7320597b239316a1c15e00d)
19. Image_of_figure_2.png (MD5: 24413cc7d292f468bec0ac60cbaa7809)
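The MD5 checksums listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify` are illustrative and not part of the dataset.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 matches the published checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Example call, assuming Readme.txt has been downloaded to the
# current directory:
# verify("Readme.txt", "795f171be340c13d78ba8608dafb3e76")
```

Reading in chunks keeps memory use constant regardless of file size. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.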
https://creativecommons.org/publicdomain/zero/1.0/
Shark Tank India: Season 1 to Season 4 information, with 80 fields/columns and 630+ records.
All seasons/episodes of 🦈 SHARKTANK INDIA 🇮🇳 were broadcast on SonyLiv OTT/Sony TV.
Here is the data dictionary for the Shark Tank India seasons dataset.
This data set contains the first name statistics for newborns in Münster from 2007 to 2021. Two different lists are made available:
- A first name hit list with the top 30 most commonly used first names, grouped by year of birth and gender.
- A list of “first name numbers”. This list shows how many babies have been given multiple first names.
First name hit list
The table with the first name hit list contains the following columns:
- Year = year of birth
- Rank = top 30 rank
- Gender = girl or boy
- Name = the chosen name
- Number = number of children with this name
Please note the following additional information: all given first names are taken into account for the calculation of the first name list, i.e. including the second and third names. For example, if “Tom” leads the list in a year, that does not mean that Tom was the most popular name, but that Tom was the most frequently mentioned name among the total first, second, third and other given names for babies.
First name number
The table with the first name number contains the following columns:
- Year = year of birth
- Children with.. = how many first names
- Number = number of children
The data are provided as an Excel file, which contains both lists in different spreadsheets, as well as two corresponding CSV files.
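As an illustration of how the hit list columns described above could be used, the sketch below extracts the rank 1 name for each year and gender. It assumes a comma-separated file with English column headers matching the description; the actual CSV files may use different headers or delimiters, and the sample names and counts are made up.

```python
import csv
import io

# Made-up sample rows in the column layout described above.
SAMPLE = """Year,Rank,Gender,Name,Number
2021,1,girl,Emma,40
2021,2,girl,Mia,35
2021,1,boy,Tom,42
2020,1,girl,Mia,38
"""


def top_names(handle):
    """Map (year, gender) to the rank 1 name and its count."""
    leaders = {}
    for row in csv.DictReader(handle):
        if int(row["Rank"]) == 1:
            key = (int(row["Year"]), row["Gender"])
            leaders[key] = (row["Name"], int(row["Number"]))
    return leaders


leaders = top_names(io.StringIO(SAMPLE))
```

Note that, as the description points out, a rank 1 entry counts every occurrence of a name among all given names, not only the first given name.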
We describe a bibliometric network characterizing co-authorship collaborations in the entire Italian academic community. The network, consisting of 38,220 nodes and 507,050 edges, is built upon two distinct data sources: faculty information provided by the Italian Ministry of University and Research and publications available in Semantic Scholar. Both nodes and edges are associated with a large variety of semantic data, including gender, bibliometric indexes, authors' and publications' research fields, and temporal information. While linking data between the two original sources posed many challenges, the network has been carefully validated to assess its reliability and to understand its graph-theoretic characteristics. Because it resembles social networks in several features, our dataset can be profitably leveraged in experimental studies in the wide social network analytics domain as well as in more specific bibliometric contexts.
The proposed network is built starting from two distinct data sources:
- the entire dataset dump from Semantic Scholar (with particular emphasis on the authors and papers datasets);
- the entire list of Italian faculty members as maintained by Cineca (under appointment by the Italian Ministry of University and Research).
By means of a custom name-identity recognition algorithm (details are available in the accompanying paper published in Scientific Data), the names of the authors in the Semantic Scholar dataset have been mapped against the names contained in the Cineca dataset, and authors with no match (e.g., because they are not part of an Italian university) have been discarded. The remaining authors compose the nodes of the network, which have been enriched with node-related (i.e., author-related) attributes. In order to build the network edges, we leveraged the papers dataset from Semantic Scholar: specifically, any two authors are said to be connected if there is at least one pap...
Data cleaning and enrichment through data integration: networking the Italian academia
https://doi.org/10.5061/dryad.wpzgmsbwj
Manuscript published in Scientific Data with DOI .
This repository contains two main data files:
- edge_data_AGG.csv, the full network in comma-separated edge list format (this file contains mainly temporal co-authorship information);
- Coauthorship_Network_AGG.graphml, the full network in GraphML format.
along with several supplementary data files, listed below, useful only to build the network (i.e., for reproducibility only):
- University-City-match.xlsx, an Excel file that maps the name of a university against the city where its respective headquarter is located;
- Areas-SS-CINECA-match.xlsx, an Excel file that maps the research areas in Cineca against the research areas in Semantic Scholar.
The `Coauthorship_Networ...
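To illustrate how a comma-separated edge list can be turned into an undirected co-authorship graph, here is a minimal sketch using only the standard library. The column names `source` and `target` and the sample author identifiers are assumptions made for illustration; the real header of edge_data_AGG.csv is not given in this description and should be checked first.

```python
import csv
import io
from collections import defaultdict

# Hypothetical edge list; the real file's column names may differ.
SAMPLE = """source,target
a1,a2
a1,a3
a2,a3
a3,a4
"""


def build_adjacency(handle, source_col="source", target_col="target"):
    """Build an undirected adjacency map from an edge list."""
    adjacency = defaultdict(set)
    for row in csv.DictReader(handle):
        u, v = row[source_col], row[target_col]
        if u == v:
            continue  # skip self-loops, which carry no co-authorship
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


adjacency = build_adjacency(io.StringIO(SAMPLE))
# Degree = number of distinct co-authors per author.
degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
```

Using sets for neighbours means repeated collaborations between the same pair count once, which matches the rule that two authors are connected if they share at least one paper.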
https://creativecommons.org/publicdomain/zero/1.0/
This is a direct download and copy of the dataset from https://github.com/dwyl/english-words/. As the creator of the GitHub project said:
A text file containing over 466k English words.
While searching for a list of English words (for an auto-complete tutorial) I found: https://stackoverflow.com/questions/2213607/how-to-get-english-language-word-database which refers to https://www.infochimps.com/datasets/word-list-350000-simple-english-words-excel-readable (archived).
No idea why infochimps put the word list inside an Excel (.xls) file.
I pulled out the words into a simple new-line-delimited text file, which is more useful when building apps or importing into databases etc.
Copyright still belongs to them.
There are two other datasets here in Kaggle that I've seen which are
But they don't seem to contain the latest updates made from Github, thus I'm uploading it and will use this dataset for a notebook that I will be writing here in Kaggle
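Since the word list was originally collected for an auto-complete tutorial, a minimal sketch of prefix completion over a newline-delimited word list is shown here. It uses binary search from the standard `bisect` module; the helper names and the tiny sample list are hypothetical, and a real run would pass the lines of the downloaded text file instead.

```python
from bisect import bisect_left


def load_words(lines):
    """Normalise newline-delimited words into a sorted, de-duplicated list."""
    return sorted({line.strip().lower() for line in lines if line.strip()})


def complete(words, prefix, limit=5):
    """Return up to `limit` words from the sorted list that start with `prefix`."""
    prefix = prefix.lower()
    # In a sorted list, all words sharing a prefix are contiguous and
    # begin at the insertion point of the prefix itself.
    start = bisect_left(words, prefix)
    matches = []
    for word in words[start:]:
        if not word.startswith(prefix) or len(matches) == limit:
            break
        matches.append(word)
    return matches


# Tiny stand-in for the real file; with the dataset one could pass
# an open file handle to load_words instead.
words = load_words(["apple\n", "apply\n", "apt\n", "banana\n", "band\n", "\n"])
```

Binary search keeps each lookup logarithmic in the list size, which matters for a list of several hundred thousand words.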
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset contains the list and classification of electoral districts mentioned in the national and regional datasets. The Excel file version provides the information on districts for both datasets in a single file in two sheets for ease of use. The CSV files (UTF-8) provide the information for each sheet in two separate files.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary File 1 is an Excel spreadsheet containing a list of molecules found in IGD, along with the molecules of Subset-IGD.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The present data set provides an Excel file in a zip archive. The file lists 334 samples of size-fractionated eukaryotic plankton community with a suite of associated metadata (Database W1). Note that while most samples represented the piconano- (0.8-5 µm, 73 samples), nano- (5-20 µm, 74 samples), micro- (20-180 µm, 70 samples), and meso- (180-2000 µm, 76 samples) planktonic size fractions, some represented different organismal size fractions: 0.2-3 µm (1 sample), 0.8-20 µm (6 samples), 0.8 µm - infinity (33 samples), and 3-20 µm (1 sample). The table contains the following fields: a unique sample sequence identifier; the sampling station identifier; the Tara Oceans sample identifier (TARA_xxxxxxxxxx); an INSDC accession number allowing retrieval of raw sequence data from the major nucleotide databases (short read archives at EBI, NCBI or DDBJ); the depth of sampling (Subsurface - SUR or Deep Chlorophyll Maximum - DCM); the targeted size range; the sequence template (either DNA or WGA/DNA if DNA extracted from the filters was Whole Genome Amplified); the latitude of the sampling event (decimal degrees); the longitude of the sampling event (decimal degrees); the time and date of the sampling event; the device used to collect the sample; the logsheet event corresponding to the sampling event; the volume of water sampled (liters). Then follows information on the bioinformatics cleaning pipeline shown in Figure W2 of the supplementary literature publication: the number of merged pairs present in the raw sequence file; the number of those sequences matching both primers; the number of sequences after quality-check filtering; the number of sequences after chimera removal; and finally the number of sequences after selecting only barcodes present in at least three copies in total and in at least two samples. Finally, the following are given for each sequence sample: the number of distinct sequences (metabarcodes); the number of OTUs; the average number of barcodes per OTU; the Shannon diversity index based on barcodes for each sample (URL of W4 dataset in PANGAEA); and the Shannon diversity index based on each OTU (URL of W5 dataset in PANGAEA).
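For readers unfamiliar with the Shannon diversity index mentioned above, it is commonly defined as H = -Σ p_i ln(p_i), where p_i is the proportion of reads belonging to barcode or OTU i. The sketch below computes it from a list of counts. The use of the natural logarithm is an assumption, since the description does not state which base was used for the published values.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Zero counts contribute nothing to the sum and are skipped to
    # avoid evaluating log(0).
    return -sum(
        (count / total) * math.log(count / total)
        for count in counts
        if count > 0
    )
```

Two equally abundant OTUs give H = ln 2, and a sample with a single OTU gives H = 0, so higher values indicate a more even and richer community.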
This interactive sales dashboard is designed in Excel for B2C businesses such as Dmart, Walmart, Amazon, shops and supermarkets, using slicers, pivot tables and pivot charts.
The first column is the date of sale. The second column is the product ID. The third column is the quantity. The fourth column is the sales type: direct sale, purchase by a wholesaler, or online order. The fifth column is the mode of payment, which is online or cash. You can update these two as required. The last column is a discount percentage; if you want to offer a discount, you can add it here.
So, basically, these are the four sheets mentioned above, each with a different task.
A sales dashboard enables organizations to visualize their real-time sales data and boost productivity.
A dashboard is a very useful tool that brings together all the data in the form of charts, graphs, statistics and other visualizations, which supports data-driven decision making.
Following a request from the European Commission, in 2018 EFSA released a renovated database of host plant species of Xylella spp. (including both species X. fastidiosa and X. taiwanensis) together with a scientific report (EFSA, 2018). EFSA was tasked to maintain and update this database periodically. In May 2021, EFSA released the fourth update of the Xylella spp. host plant database (VERSION 4) with information retrieved from literature search up to December 2020, Europhyt outbreak notifications up to 18 March 2021, and communications of research groups and national authorities (EFSA, 2021). The protocol applied for the extensive literature review, data collection and reporting, as well as results and lists of host plants, are described in detail in the related scientific report (EFSA, 2021). The overall number of Xylella spp. host plants determined with at least two different detection methods, or positive with one method (either sequencing or pure culture isolation), now reaches 385 plant species, 179 genera and 67 families (category A, see section 2.4.2 of EFSA (2021)). These numbers rise to 638 plant species, 289 genera and 87 families if considered regardless of the detection method applied (category E, see section 2.4.2 of EFSA (2021)). The Excel files attached here represent VERSION 4 of the Xylella spp. host plants database. For a detailed description of the information included in the database, please consult the related scientific report (EFSA, 2021). The Excel file “Xylella spp. host plants database – VERSION 4” contains several sheets: the LEGENDA (with extensive description of each table), the full detailed raw data of the Xylella spp. host plant database (sheet “observation”) and several examples of data extraction. Additional Excel files contain the lists of host plant species of X. fastidiosa (subsp. unknown (i.e. not reported), fastidiosa, multiplex, pauca, morus, sandyi, tashke, fastidiosa/sandyi) and X. taiwanensis infected naturally, artificially and in not specified conditions, and according to different categories (A, B, C, D, E; see section 2.4.2 of EFSA (2021)). The Excel file “new_host_plant_species_v4” contains the list of new host plant species added to the database in this fourth update.
Question number: EFSA-Q-2017-00221
Correspondence: alpha@efsa.europa.eu
Bibliography:
EFSA (European Food Safety Authority), 2018. Scientific report on the update of the Xylella spp. host plant database. EFSA Journal 2018;16(9):5408, 87 pp. https://doi.org/10.2903/j.efsa.2018.5408
EFSA (European Food Safety Authority), Delbianco A, Gibin D, Pasinato L and Morelli M, 2021. Scientific report on the update of the Xylella spp. host plant database – systematic literature search up to 31 December 2020. EFSA Journal 2021;19(6):6674, 70 pp. https://doi.org/10.2903/j.efsa.2021.6674
I am showcasing a financial commissions model on Kaggle. In Excel we can use IF statements to chart rates that reward workers based on quotas. By compiling sales on a large or small scale we can easily derive the necessary compensation for workers.
The first sheet uses simple IF statements to derive a commission payment for different rates. The sales company exceeded its quota of $95,000.00 and reached $99,343.00, which is 104.6% of quota.
On sheet 2 there is a detailed breakdown of individual employee rates and the commission each has earned. The difference in sheet 2 is the use of nested IF statements, which can get much more complex if not catalogued properly.
There are two guides on YouTube which I credit heavily for these models. Here are the links: https://www.youtube.com/watch?v=bkrSVS7-CYo&list=PLQnuraB9JKXdUlDVZtcfG2_sO_uL_XyMm&index=4 https://www.youtube.com/watch?v=0Ahqr6Xdkos&list=PLQnuraB9JKXdUlDVZtcfG2_sO_uL_XyMm&index=12
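The nested IF logic described above can be expressed outside Excel as an ordered check of tiers. The sketch below is only an illustration: the tier thresholds and commission rates in `TIERS` are hypothetical and are not taken from the workbook, while the quota and sales figures come from the description.

```python
def quota_attainment(sales, quota):
    """Return sales as a fraction of quota (1.0 means quota exactly met)."""
    return sales / quota


def commission(sales, quota, tiers):
    """Return the commission for the first tier whose threshold is met.

    `tiers` is a list of (minimum attainment, rate) pairs sorted from
    the highest threshold to the lowest, mirroring a nested IF chain.
    """
    attainment = quota_attainment(sales, quota)
    for threshold, rate in tiers:
        if attainment >= threshold:
            return round(sales * rate, 2)
    return 0.0


# Hypothetical tiers: 5% at or above quota, 3% from 80% of quota,
# 1% otherwise. These rates are assumptions, not the workbook's values.
TIERS = [(1.0, 0.05), (0.8, 0.03), (0.0, 0.01)]

attainment_pct = round(quota_attainment(99343, 95000) * 100, 1)
payout = commission(99343, 95000, TIERS)
```

Keeping the tiers in a single ordered table, instead of nesting conditions, makes it easier to add or change a rate without rewriting the whole formula.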
Thanks for reading, and enjoy!
The dataset contains a list of some UK counties and their classification into regions.
The Excel file has two columns: county and region.