100+ datasets found
  1. Examples of boilerplate text from PLOS ONE papers based on targeted n-gram...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1more
    xls
    Updated Jun 14, 2023
    Cite
    Nicole M. White; Thirunavukarasu Balasubramaniam; Richi Nayak; Adrian G. Barnett (2023). Examples of boilerplate text from PLOS ONE papers based on targeted n-gram searches (sentence level). [Dataset]. http://doi.org/10.1371/journal.pone.0264360.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Nicole M. White; Thirunavukarasu Balasubramaniam; Richi Nayak; Adrian G. Barnett
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Examples of boilerplate text from PLOS ONE papers based on targeted n-gram searches (sentence level).

  2. 190k+ Medium Articles

    • kaggle.com
    zip
    Updated Apr 26, 2022
    + more versions
    Cite
    Fabio Chiusano (2022). 190k+ Medium Articles [Dataset]. https://www.kaggle.com/datasets/fabiochiusano/medium-articles
    Explore at:
    Available download formats: zip (386824829 bytes)
    Dataset updated
    Apr 26, 2022
    Authors
    Fabio Chiusano
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Data source

    This data has been collected through a standard scraping process from the Medium website, looking for published articles.

    Data description

    Each row in the data is a different article published on Medium. For each article, you have the following features:

    • title [string]: The title of the article.
    • text [string]: The text content of the article.
    • url [string]: The URL associated to the article.
    • authors [list of strings]: The article authors.
    • timestamp [string]: The publication datetime of the article.
    • tags [list of strings]: List of tags associated to the article.
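    The field list above can be illustrated with a minimal parsing sketch. It assumes the data ships as a CSV file in which the list-valued columns (authors, tags) are stored as Python-style list literals; the dataset page does not confirm this encoding, and the two sample rows are invented for illustration.

```python
import ast
import csv
import io

# Hypothetical two-row sample; the real file's name and the exact encoding
# of list columns are assumptions, not taken from the dataset page.
SAMPLE_CSV = '''title,text,url,authors,timestamp,tags
"Intro to pandas","Some text","https://medium.com/p/1","['Ann Lee']","2020-05-01 10:00:00","['Python', 'Data']"
"On writing","More text","https://medium.com/p/2","['Bo Chan', 'Cy Diaz']","2021-01-15 08:30:00","['Writing']"
'''

def parse_rows(handle):
    """Yield one dict per article, decoding the list-valued columns."""
    for row in csv.DictReader(handle):
        row["authors"] = ast.literal_eval(row["authors"])
        row["tags"] = ast.literal_eval(row["tags"])
        yield row

articles = list(parse_rows(io.StringIO(SAMPLE_CSV)))

# Filter by tag, as one might do before fine-tuning on a specific domain.
python_articles = [a["title"] for a in articles if "Python" in a["tags"]]
print(python_articles)
```

    If the real file encodes lists differently (for example as JSON), only the two `ast.literal_eval` calls would need to change.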

    Data analysis

    You can find a very quick data analysis in this notebook.

    What can I do with this data?

    • A multilabel classification model that assigns tags to articles.
    • A seq2seq model that generates article titles.
    • Text analysis.
    • Finetune text generation models on the general domain of Medium, or on specific domains by filtering articles by the appropriate tags.

    Collection methodology

    Scraping has been done with Python and the requests library. Starting from a random article on Medium, the next articles to scrape are selected by visiting:

    1. The author archive pages.
    2. The publication archive pages (if present).
    3. The tags archives (if present).

    The article HTML pages have been parsed with the newspaper Python library.

    Published articles have been filtered for English articles only, using the Python langdetect library.
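    The collection steps described above can be sketched as a small crawl loop. This is not the author's scraper: it uses an invented in-memory link table instead of network requests, a lookup table instead of a language detector, and a breadth-first visiting order, which the description does not specify.

```python
from collections import deque

# Hypothetical in-memory stand-in for Medium: each article maps to the
# articles reachable via its author, publication, and tag archive pages.
ARCHIVE_LINKS = {
    "a1": {"author": ["a2"], "publication": ["a3"], "tags": ["a4"]},
    "a2": {"author": ["a1"], "publication": [], "tags": ["a4"]},
    "a3": {"author": [], "publication": ["a1"], "tags": []},
    "a4": {"author": [], "publication": [], "tags": ["a2", "a5"]},
    "a5": {"author": [], "publication": [], "tags": []},
}
# Hypothetical detected language per article.
LANGUAGE = {"a1": "en", "a2": "en", "a3": "fr", "a4": "en", "a5": "en"}

def crawl(start, limit=100):
    """Breadth-first crawl from `start`, keeping English articles only."""
    seen, queue, kept = {start}, deque([start]), []
    while queue and len(kept) < limit:
        article = queue.popleft()
        if LANGUAGE[article] == "en":  # stand-in for a language detector
            kept.append(article)
        # Follow the three archive types in the order listed in the text.
        for source in ("author", "publication", "tags"):
            for nxt in ARCHIVE_LINKS[article][source]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return kept

print(crawl("a1"))
```

    Because the crawl spreads outward from a single seed, which articles it reaches depends on the link structure, which is consistent with the non-uniform date distribution the description mentions.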

    As a consequence of the collection methodology, the publication dates of the scraped articles are not uniformly distributed. The dataset contains articles published in both 2016 and 2022, for example, but not in equal numbers. In particular, there is a strong prevalence of articles published in 2020. Have a look at the accompanying notebook to see the distribution of the publication dates.

  3. Statistical Analysis of Individual Participant Data Meta-Analyses: A...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    tiff
    Updated Jun 8, 2023
    Cite
    Gavin B. Stewart; Douglas G. Altman; Lisa M. Askie; Lelia Duley; Mark C. Simmonds; Lesley A. Stewart (2023). Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice [Dataset]. http://doi.org/10.1371/journal.pone.0046042
    Explore at:
    Available download formats: tiff
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Gavin B. Stewart; Douglas G. Altman; Lisa M. Askie; Lelia Duley; Mark C. Simmonds; Lesley A. Stewart
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Individual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way.

    Methods and Findings: We included data from 24 randomised controlled trials evaluating antiplatelet agents for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using antiplatelets (relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of women benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model.

    Conclusions: For these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials.
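    To make the “two-stage” idea concrete, here is a minimal sketch under stated assumptions: stage one reduces each trial to a log relative risk and its variance, and stage two pools them with fixed-effect inverse-variance weights. The trial counts are invented, and the abstract does not say which pooling model the authors used, so this is one common variant rather than their method.

```python
import math

# Hypothetical per-trial counts:
# (events_treated, n_treated, events_control, n_control).
TRIALS = [
    (30, 500, 40, 500),
    (12, 200, 15, 210),
    (55, 1000, 60, 990),
]

def log_rr_and_variance(a, n1, c, n0):
    """Stage one: log relative risk and its approximate variance for one trial."""
    log_rr = math.log((a / n1) / (c / n0))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n0
    return log_rr, variance

def pool_fixed_effect(trials):
    """Stage two: inverse-variance weighted average of the log relative risks."""
    stats = [log_rr_and_variance(*t) for t in trials]
    weights = [1 / v for _, v in stats]
    pooled = sum(w * y for w, (y, _) in zip(weights, stats)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    low, high = pooled - 1.96 * se, pooled + 1.96 * se
    # Back-transform from the log scale to the relative-risk scale.
    return math.exp(pooled), math.exp(low), math.exp(high)

rr, ci_low, ci_high = pool_fixed_effect(TRIALS)
print(f"pooled RR = {rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

    A one-stage analysis would instead fit a single regression model to all participants at once, which is where the additional statistical support mentioned in the abstract comes in.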

  4. Statistical Analysis - pwID

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Feb 11, 2022
    Cite
    Sousa, Carla; Damásio, Manuel José; Neves, José Carlos (2022). Statistical Analysis - pwID [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000281176
    Explore at:
    Dataset updated
    Feb 11, 2022
    Authors
    Sousa, Carla; Damásio, Manuel José; Neves, José Carlos
    Description

    Dataset for the statistical analysis of the article "Empowerment through Participatory Game Creation: A Case Study with Adults with Intellectual Disability".

  5. Showing statistical analysis of study data.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Oct 13, 2021
    Cite
    Singh, Vikram P.; Vasanth, Shruthi; Tejan, Nidhi; Patel, Vikas; Garg, Atul; Ghoshal, Ujjala; Arya, Akshay K.; Pandey, Ankita (2021). Showing statistical analysis of study data. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000754250
    Explore at:
    Dataset updated
    Oct 13, 2021
    Authors
    Singh, Vikram P.; Vasanth, Shruthi; Tejan, Nidhi; Patel, Vikas; Garg, Atul; Ghoshal, Ujjala; Arya, Akshay K.; Pandey, Ankita
    Description

    Showing statistical analysis of study data.

  6. Statistical analysis.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    • +1more
    Updated Feb 21, 2013
    + more versions
    Cite
    Strommenger, Birgit; Layer, Franziska; Nathaus, Rolf; Witte, Wolfgang; Cuny, Christiane; Altmann, Doris (2013). Statistical analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001732210
    Explore at:
    Dataset updated
    Feb 21, 2013
    Authors
    Strommenger, Birgit; Layer, Franziska; Nathaus, Rolf; Witte, Wolfgang; Cuny, Christiane; Altmann, Doris
    Description

    Statistical analysis.

  7. Sample.

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Coosje L. S. Veldkamp; Michèle B. Nuijten; Linda Dominguez-Alvarez; Marcel A. L. M. van Assen; Jelte M. Wicherts (2023). Sample. [Dataset]. http://doi.org/10.1371/journal.pone.0114876.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Coosje L. S. Veldkamp; Michèle B. Nuijten; Linda Dominguez-Alvarez; Marcel A. L. M. van Assen; Jelte M. Wicherts
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample. Note. 5-yr IF = five-year Impact Factor in 2011. Articles = number of articles published in 2012. Empirical = number of empirical articles published in 2012.

  8. An instrument to assess the statistical intensity of medical research papers...

    • plos.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Pentti Nieminen; Jorma I. Virtanen; Hannu Vähänikkilä (2023). An instrument to assess the statistical intensity of medical research papers [Dataset]. http://doi.org/10.1371/journal.pone.0186882
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Pentti Nieminen; Jorma I. Virtanen; Hannu Vähänikkilä
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: There is widespread evidence that statistical methods play an important role in original research articles, especially in medical research. The evaluation of statistical methods and reporting in journals suffers from a lack of standardized methods for assessing the use of statistics. The objective of this study was to develop and evaluate an instrument to assess the statistical intensity in research articles in a standardized way.

    Methods: A checklist-type measure scale was developed by selecting and refining items from previous reports about the statistical contents of medical journal articles and from published guidelines for statistical reporting. A total of 840 original medical research articles that were published between 2007–2015 in 16 journals were evaluated to test the scoring instrument. The total sum of all items was used to assess the intensity between sub-fields and journals. Inter-rater agreement was examined using a random sample of 40 articles. Four raters read and evaluated the selected articles using the developed instrument.

    Results: The scale consisted of 66 items. The total summary score adequately discriminated between research articles according to their study design characteristics. The new instrument could also discriminate between journals according to their statistical intensity. The inter-observer agreement measured by the ICC was 0.88 between all four raters. Individual item analysis showed very high agreement between the rater pairs; the percentage agreement ranged from 91.7% to 95.2%.

    Conclusions: A reliable and applicable instrument for evaluating the statistical intensity in research papers was developed. It is a helpful tool for comparing the statistical intensity between sub-fields and journals. The novel instrument may be applied in manuscript peer review to identify papers in need of additional statistical review.
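    The pairwise percentage agreement reported above is straightforward to compute. The sketch below assumes binary item scores and uses invented ratings for four hypothetical raters; it does not reproduce the study's 66-item scale or its ICC calculation.

```python
from itertools import combinations

# Hypothetical binary item scores (1 = item present) from four raters
# on the same six checklist items of one article.
RATINGS = {
    "rater_a": [1, 0, 1, 1, 0, 1],
    "rater_b": [1, 0, 1, 0, 0, 1],
    "rater_c": [1, 0, 1, 1, 0, 1],
    "rater_d": [1, 1, 1, 1, 0, 1],
}

def percent_agreement(x, y):
    """Share of items on which two raters gave the same score, in percent."""
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return 100.0 * matches / len(x)

# One value per unordered rater pair (six pairs for four raters).
pairwise = {
    (r1, r2): percent_agreement(RATINGS[r1], RATINGS[r2])
    for r1, r2 in combinations(RATINGS, 2)
}
for pair, value in pairwise.items():
    print(pair, f"{value:.1f}%")
```

    Percentage agreement does not correct for agreement expected by chance, which is one reason a study might report it alongside a coefficient such as the ICC.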

  9. Analysis of CBCS publications for Open Access, data availability statements...

    • figshare.scilifelab.se
    • researchdata.se
    • +2more
    txt
    Updated Jan 15, 2025
    Cite
    Theresa Kieselbach (2025). Analysis of CBCS publications for Open Access, data availability statements and persistent identifiers for supplementary data [Dataset]. http://doi.org/10.17044/scilifelab.23641749.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    Umeå University
    Authors
    Theresa Kieselbach
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    General description

    This dataset contains some markers of Open Science in the publications of the Chemical Biology Consortium Sweden (CBCS) between 2010 and July 2023. The sample of CBCS publications during this period consists of 188 articles. Every publication was visited manually at its DOI URL to answer the following questions.

    1. Is the research article an Open Access publication?
    2. Does the research article have a Creative Commons license or a similar license?
    3. Does the research article contain a data availability statement?
    4. Did the authors submit data of their study to a repository such as EMBL, Genbank, Protein Data Bank PDB, Cambridge Crystallographic Data Centre CCDC, Dryad or a similar repository?
    5. Does the research article contain supplementary data?
    6. Do the supplementary data have a persistent identifier that makes them citable as a defined research output?

    Variables

    The data were compiled in a Microsoft Excel 365 document that includes the following variables.

    1. DOI URL of research article
    2. Year of publication
    3. Research article published with Open Access
    4. License for research article
    5. Data availability statement in article
    6. Supplementary data added to article
    7. Persistent identifier for supplementary data
    8. Authors submitted data to NCBI or EMBL or PDB or Dryad or CCDC

    Visualization

    Parts of the data were visualized in two figures as bar diagrams using Microsoft Excel 365. The first figure displays the number of publications per year, the number of publications published with open access, and the number of publications that contain a data availability statement (Figure 1). The second figure shows the number of publications per year and how many publications contain supplementary data. This figure also shows how many of the supplementary datasets have a persistent identifier (Figure 2).

    File formats and software

    The file formats used in this dataset are:

    • .csv (Text file)
    • .docx (Microsoft Word 365 file)
    • .jpg (JPEG image file)
    • .pdf/A (Portable Document Format for archiving)
    • .png (Portable Network Graphics image file)
    • .pptx (Microsoft PowerPoint 365 file)
    • .txt (Text file)
    • .xlsx (Microsoft Excel 365 file)

    All files can be opened with Microsoft Office 365 and likely also work with the older versions Office 2019 and 2016.

    MD5 checksums

    Here is a list of all files of this dataset and of their MD5 checksums.

    1. Readme.txt (MD5: 795f171be340c13d78ba8608dafb3e76)
    2. Manifest.txt (MD5: 46787888019a87bb9d897effdf719b71)
    3. Materials_and_methods.docx (MD5: 0eedaebf5c88982896bd1e0fe57849c2)
    4. Materials_and_methods.pdf (MD5: d314bf2bdff866f827741d7a746f063b)
    5. Materials_and_methods.txt (MD5: 26e7319de89285fc5c1a503d0b01d08a)
    6. CBCS_publications_until_date_2023_07_05.xlsx (MD5: 532fec0bd177844ac0410b98de13ca7c)
    7. CBCS_publications_until_date_2023_07_05.csv (MD5: 2580410623f79959c488fdfefe8b4c7b)
    8. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.xlsx (MD5: 9c67dd84a6b56a45e1f50a28419930e5)
    9. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.csv (MD5: fb3ac69476bfc57a8adc734b4d48ea2b)
    10. Aggregated_data_from_CBCS_publications_until_2023_07_05.xlsx (MD5: 6b6cbf3b9617fa8960ff15834869f793)
    11. Aggregated_data_from_CBCS_publications_until_2023_07_05.csv (MD5: b2b8dd36ba86629ed455ae5ad2489d6e)
    12. Figure_1_CBCS_publications_until_2023_07_05_Open_Access_and_data_availablitiy_statement.xlsx (MD5: 9c0422cf1bbd63ac0709324cb128410e)
    13. Figure_1.pptx (MD5: 55a1d12b2a9a81dca4bb7f333002f7fe)
    14. Image_of_figure_1.jpg (MD5: 5179f69297fbbf2eaaf7b641784617d7)
    15. Image_of_figure_1.png (MD5: 8ec94efc07417d69115200529b359698)
    16. Figure_2_CBCS_publications_until_2023_07_05_supplementary_data_and_PID_for_supplementary_data.xlsx (MD5: f5f0d6e4218e390169c7409870227a0a)
    17. Figure_2.pptx (MD5: 0fd4c622dc0474549df88cf37d0e9d72)
    18. Image_of_figure_2.jpg (MD5: c6c68b63b7320597b239316a1c15e00d)
    19. Image_of_figure_2.png (MD5: 24413cc7d292f468bec0ac60cbaa7809)
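    Published MD5 checksums like those listed above let a downloader confirm that a file arrived intact. The following is a minimal sketch using Python's standard hashlib module; the helper names `md5_of_file` and `verify` are illustrative, and the demonstration runs on a throwaway temporary file rather than on the actual dataset files.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_md5):
    """True when the file's digest matches the published checksum."""
    return md5_of_file(path) == expected_md5.lower()

# Demonstration on a throwaway file; for the real dataset, pass the path of
# a downloaded file and the checksum published for it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_ok = verify(demo_path, hashlib.md5(b"hello").hexdigest())
demo_bad = verify(demo_path, "0" * 32)
os.remove(demo_path)
print(demo_ok, demo_bad)
```

    MD5 is adequate for detecting accidental corruption during transfer, but it is not collision-resistant and should not be relied on to detect deliberate tampering.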

  10. Medium articles dataset

    • kaggle.com
    • crawlfeeds.com
    zip
    Updated May 9, 2021
    Cite
    Crawl Feeds (2021). Medium articles dataset [Dataset]. https://www.kaggle.com/crawlfeeds/medium-articles-dataset
    Explore at:
    Available download formats: zip (21800753 bytes)
    Dataset updated
    May 9, 2021
    Authors
    Crawl Feeds
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Medium Articles dataset

    Medium is an American online publishing platform launched in August 2012. Crawl Feeds team extracted data from medium articles for research and analysis purposes.

    Fields

    Total fields: 15

    url, crawled_at, id, title, author, published_at, author_url, reading_time, total_claps, raw_description, source, description, tags, images, modified_at

    The complete dataset, with more than 500K records, is available from Crawl Feeds.

  11. Appendix A. Detailed methods, statistical analysis, figures, and references....

    • datasetcatalog.nlm.nih.gov
    • wiley.figshare.com
    • +1more
    Updated Aug 10, 2016
    Cite
    Teller, Brittany J.; Shea, Katriona; Campbell, Colin (2016). Appendix A. Detailed methods, statistical analysis, figures, and references. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001517073
    Explore at:
    Dataset updated
    Aug 10, 2016
    Authors
    Teller, Brittany J.; Shea, Katriona; Campbell, Colin
    Description

    Detailed methods, statistical analysis, figures, and references.

  12. R code for statistical analysis

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Jan 31, 2019
    Cite
    Juliano, Steven; Chandrasegaran, Karthikeyan (2019). R code for statistical analysis [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000126767
    Explore at:
    Dataset updated
    Jan 31, 2019
    Authors
    Juliano, Steven; Chandrasegaran, Karthikeyan
    Description

    This file contains R code for the data analyzed in the paper from Frontiers in Ecology and Evolution

  13. Data from: Data files used to study change dynamics in software systems

    • figshare.swinburne.edu.au
    pdf
    Updated Jul 22, 2024
    Cite
    Rajesh Vasa (2024). Data files used to study change dynamics in software systems [Dataset]. http://doi.org/10.25916/sut.26288227.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Swinburne
    Authors
    Rajesh Vasa
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    It is a widely accepted fact that evolving software systems change and grow. However, it is less well-understood how change is distributed over time, specifically in object oriented software systems. The patterns and techniques used to measure growth permit developers to identify specific releases where significant change took place as well as to inform them of the longer term trend in the distribution profile. This knowledge assists developers in recording systemic and substantial changes to a release, as well as to provide useful information as input into a potential release retrospective. However, these analysis methods can only be applied after a mature release of the code has been developed. But in order to manage the evolution of complex software systems effectively, it is important to identify change-prone classes as early as possible. Specifically, developers need to know where they can expect change, the likelihood of a change, and the magnitude of these modifications in order to take proactive steps and mitigate any potential risks arising from these changes.

    Previous research into change-prone classes has identified some common aspects, with different studies suggesting that complex and large classes tend to undergo more changes and classes that changed recently are likely to undergo modifications in the near future. Though the guidance provided is helpful, developers need more specific guidance in order for it to be applicable in practice. Furthermore, the information needs to be available at a level that can help in developing tools that highlight and monitor evolution prone parts of a system as well as support effort estimation activities.

    The specific research questions that we address in this chapter are:

    (1) What is the likelihood that a class will change from a given version to the next? (a) Does this probability change over time? (b) Is this likelihood project specific, or general?
    (2) How is modification frequency distributed for classes that change?
    (3) What is the distribution of the magnitude of change? Are most modifications minor adjustments, or substantive modifications?
    (4) Does structural complexity make a class susceptible to change?
    (5) Does popularity make a class more change-prone?

    We make recommendations that can help developers to proactively monitor and manage change. These are derived from a statistical analysis of change in approximately 55000 unique classes across all projects under investigation. The analysis methods that we applied took into consideration the highly skewed nature of the metric data distributions. The raw metric data (4 .txt files and 4 .log files in a .zip file measuring ~2MB in total) is provided as a comma separated values (CSV) file, and the first line of the CSV file contains the header. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
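    Research question (1) can be illustrated with a small sketch that estimates, for each pair of consecutive versions, the fraction of classes whose metrics changed. The class names, metric tuples, and the definition of "change" as any difference in recorded metrics are assumptions for illustration; the description does not state how the study defines a changed class.

```python
# Hypothetical per-version metric snapshots: class name -> metric tuple.
VERSIONS = [
    {"Parser": (10, 3), "Lexer": (5, 1), "Ast": (7, 2)},
    {"Parser": (12, 3), "Lexer": (5, 1), "Ast": (7, 2), "Visitor": (4, 1)},
    {"Parser": (12, 4), "Lexer": (6, 1), "Ast": (7, 2), "Visitor": (4, 1)},
]

def change_probability(old, new):
    """Fraction of classes present in both versions whose metrics differ."""
    shared = old.keys() & new.keys()
    changed = sum(1 for name in shared if old[name] != new[name])
    return changed / len(shared) if shared else 0.0

# One estimate per consecutive version pair, which shows whether the
# likelihood of change drifts over the life of the project.
probabilities = [
    change_probability(old, new) for old, new in zip(VERSIONS, VERSIONS[1:])
]
print(probabilities)
```

    Classes added or removed between versions are ignored here because they have no counterpart to compare against.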

  14. Data set used for statistical analysis.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Sep 6, 2024
    Cite
    Beierle, Alexander; Beckers, Stefan K.; Beierle, Syrina; Rossaint, Rolf; Felzen, Marc; Schröder, Hanna (2024). Data set used for statistical analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001322824
    Explore at:
    Dataset updated
    Sep 6, 2024
    Authors
    Beierle, Alexander; Beckers, Stefan K.; Beierle, Syrina; Rossaint, Rolf; Felzen, Marc; Schröder, Hanna
    Description

    Although prehospital emergency anesthesia (PHEA), with a specific focus on intubation attempts, is frequently studied in prehospital emergency care, there is a gap in the knowledge on aspects related to adherence to PHEA guidelines. This study investigates adherence to the “Guidelines for Prehospital Emergency Anesthesia in Adults” with regard to the induction of PHEA, including the decision making, rapid sequence induction, preoxygenation, standard monitoring, intubation attempts, adverse events, and administration of appropriate medications and their side effects. This retrospective study examined PHEA interventions from 01/01/2020 to 12/31/2021 in the city of Aachen, Germany. The inclusion criteria were adult patients who met the indication criteria for the PHEA. Data were obtained from emergency medical protocols. A total of 127 patients were included in this study. All the patients met the PHEA indication criteria. Despite having a valid indication, 29 patients did not receive the PHEA. 98 patients were endotracheally intubated. For these patients, monitoring had conformed to the guidelines. The medications were used according to the guidelines. A significant increase in oxygen saturation was reported after anesthesia induction (p < 0.001). The patients were successfully intubated endotracheally on the third attempt. Guideline adherence was maintained in terms of execution of PHEA, rapid sequence induction, preoxygenation, monitoring, selection, and administration of relevant medications. Emergency physicians demonstrated the capacity to effectively respond to cardiorespiratory events. Further investigations are needed on the group of patients who did not receive PHEA despite meeting the criteria. The underlying causes of decision making in these cases need to be evaluated in the future.

  15. Dataset utilized for statistical analysis.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Dec 5, 2024
    Cite
    Raihan, Abu; Habib, Tasmia; Siddique, Mahbubul Pratik; Hossain, Zawad; Chouhan, Chandra Shaker; Ehsan, Amimul; Yeasmin, Farzana; Rahman, A. K. M. Anisur; Rahman, Siddiqur; Nahar, Azimun; Kabir, Ajran (2024). Dataset utilized for statistical analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001362177
    Explore at:
    Dataset updated
    Dec 5, 2024
    Authors
    Raihan, Abu; Habib, Tasmia; Siddique, Mahbubul Pratik; Hossain, Zawad; Chouhan, Chandra Shaker; Ehsan, Amimul; Yeasmin, Farzana; Rahman, A. K. M. Anisur; Rahman, Siddiqur; Nahar, Azimun; Kabir, Ajran
    Description

    Background: Ehrlichia canis, a rickettsial organism, is responsible for causing ehrlichiosis, a tick-borne disease affecting dogs.

    Objectives: This study aimed to estimate ehrlichiosis prevalence and identify associated risk factors in pet dogs.

    Methods: A total of 246 peripheral blood samples were purposively collected from pet dogs in Dhaka, Mymensingh, and Rajshahi districts between December 2018 and December 2020. Risk factor data were obtained through face-to-face interviews with dog owners using a pre-structured questionnaire. Multivariable logistic regression analysis identified risk factors. Polymerase chain reaction targeting the 16S rRNA gene confirmed Ehrlichia spp. PCR results were further validated by sequencing.

    Results: The prevalence and case fatality of ehrlichiosis were 6.9% and 47.1%, respectively. Dogs in rural areas had 5.8 times higher odds of ehrlichiosis (odds ratio, OR: 5.84; 95% CI: 1.72–19.89) compared to urban areas. Dogs with access to other dogs had 5.14 times higher odds of ehrlichiosis (OR: 5.14; 95% CI: 1.63–16.27) than those without such access. Similarly, dogs irregularly treated with ectoparasitic drugs had 4.01 times higher odds of ehrlichiosis (OR: 4.01; 95% CI: 1.17–14.14) compared to regularly treated dogs. The presence of ticks on dogs increased ehrlichiosis odds nearly 3 times (OR: 3.02; 95% CI: 1.02–8.97). Phylogenetic analysis, based on 17 commercially sequenced isolates, showed different clusters of aggregation; however, BAUMAH-13 (PP321265) perfectly settled with a China isolate (OK667945), BAUMAH-05 (PP321257) with a Greece isolate (MN922610), BAUMAH-16 (PP321268) with an Italian isolate (KX180945), and BAUMAH-07 (PP321259) with a Thailand isolate (OP164610).

    Conclusions: Pet owners and veterinarians in rural areas should be vigilant in monitoring dogs for ticks and ensuring proper preventive care. Limiting access to other dogs in high-risk areas can help mitigate disease spread. Tick prevention measures and regular treatment with ectoparasitic drugs will reduce the risk of ehrlichiosis in dogs. The observed genetic similarity of the Bangladeshi Ehrlichia canis strain highlights the need for ongoing surveillance and research to develop effective control and prevention strategies, both within Bangladesh and globally.
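    For readers unfamiliar with how an odds ratio and its confidence interval are formed, here is a minimal sketch for a single 2x2 table using the Woolf (log-based) interval. It is a simplification: the study reports odds ratios from multivariable logistic regression, which adjusts for other factors, and the counts below are invented rather than taken from the dataset.

```python
import math

def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval."""
    a, b = exposed_cases, exposed_noncases
    c, d = unexposed_cases, unexposed_noncases
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(10, 60, 7, 169)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

    With few cases in a cell the interval becomes wide, which is consistent with the broad intervals reported in the results above.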

  16. CNBC Economy Dataset - 17K Economy Articles CSV

    • crawlfeeds.com
    csv, zip
    Updated Nov 24, 2025
    Cite
    Crawl Feeds (2025). CNBC Economy Dataset - 17K Economy Articles CSV [Dataset]. https://crawlfeeds.com/datasets/cnbc-economy-articles-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Nov 24, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    CNBC Economy Articles Dataset is an invaluable collection of data extracted from CNBC’s economy section, offering deep insights into global and U.S. economic trends, market dynamics, financial policies, and industry developments.

    This dataset encompasses a diverse array of economic articles on critical topics like GDP growth, inflation rates, employment statistics, central bank policies, and major global events influencing the market. Designed for researchers, analysts, and businesses, it serves as an essential resource for understanding economic patterns, conducting sentiment analysis, and developing financial forecasting models.

    Dataset Highlights

    Each record in the dataset is meticulously structured and includes:

    • Article Titles
    • Publication Dates
    • Author Names
    • Content Summaries
    • URLs to Original Articles

    This rich combination of fields ensures seamless integration into data science projects, research papers, and market analyses.

    Key Features

    • Number of Articles: Hundreds of articles sourced directly from CNBC.
    • Data Fields: Includes title, publication date, author, article content, summary, URL, and relevant keywords.
    • Topics Covered: U.S. and global economy, GDP trends, inflation, employment, financial markets, and monetary policies.
    • Format: Delivered in CSV format for easy integration with research tools and analytical platforms.
    • Source: Extracted directly from CNBC’s economy news section, ensuring accuracy and relevance.

    Use Cases

    • Economic Research: Gain insights into U.S. and global economic policies, market trends, and industry developments.
    • Sentiment Analysis: Assess the sentiment of economic articles to gauge market perspectives and investor confidence.
    • Financial Modeling: Build forecasting models leveraging key economic indicators discussed in the dataset.
    • Content Creation: Develop research-backed reports, articles, and presentations on economic topics.

    Who Benefits?

    • Researchers & Academics studying macro-economics or financial policy.
    • Data Scientists building AI models, trend analyzers, or economic forecasting tools.
    • Economists & Analysts who need real-world news data for policy analysis.
    • Content Strategists who write data-backed articles about economic trends.

    Why Choose This Dataset?

    • No need to manually scrape CNBC — data is pre-extracted and clean.
    • High-quality economy news metadata enables detailed filtering (by date, author, topic).
    • Ready for machine learning, sentiment analysis, or building news-based economic models.
    • Well-suited for trend tracking, policy analysis, and economic forecasting.
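
    The date and author filtering mentioned above could look roughly like the sketch below. It is only an illustration: the column names (title, publication_date, author, url), the ISO date format, and the sample rows are assumptions, not taken from the actual CSV file.

```python
import csv
import io
from datetime import date, datetime

# Hypothetical sample standing in for the real CSV; header names and rows are assumed.
SAMPLE_CSV = """title,publication_date,author,url
Fed holds rates steady,2024-03-20,Jane Doe,https://example.com/a
Inflation cools in April,2024-05-15,John Roe,https://example.com/b
Jobs report beats forecasts,2024-06-07,Jane Doe,https://example.com/c
"""


def filter_articles(rows, author=None, start=None, end=None):
    """Return rows matching an optional author and an inclusive date range."""
    selected = []
    for row in rows:
        # Assumes dates are stored as YYYY-MM-DD; adjust the format if they are not.
        published = datetime.strptime(row["publication_date"], "%Y-%m-%d").date()
        if author is not None and row["author"] != author:
            continue
        if start is not None and published < start:
            continue
        if end is not None and published > end:
            continue
        selected.append(row)
    return selected


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
recent_by_doe = filter_articles(rows, author="Jane Doe", start=date(2024, 4, 1))
print([row["title"] for row in recent_by_doe])
```

    To run this against the downloaded file, replace the in-memory sample with an opened file handle and rename the keys to match the real header row.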

    Explore More News Datasets

    Interested in additional structured news datasets for your research or analytics needs? Check out our news dataset collection to find datasets tailored for diverse analytical applications.

  17. ODM Data Analysis—A tool for the automatic validation, monitoring and...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    mp4
    Updated May 31, 2023
    Cite
    Tobias Johannes Brix; Philipp Bruland; Saad Sarfraz; Jan Ernsting; Philipp Neuhaus; Michael Storck; Justin Doods; Sonja Ständer; Martin Dugas (2023). ODM Data Analysis—A tool for the automatic validation, monitoring and generation of generic descriptive statistics of patient data [Dataset]. http://doi.org/10.1371/journal.pone.0199242
    Available download formats: mp4
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Tobias Johannes Brix; Philipp Bruland; Saad Sarfraz; Jan Ernsting; Philipp Neuhaus; Michael Storck; Justin Doods; Sonja Ständer; Martin Dugas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: A required step for presenting results of clinical studies is the declaration of participants demographic and baseline characteristics as claimed by the FDAAA 801. The common workflow to accomplish this task is to export the clinical data from the used electronic data capture system and import it into statistical software like SAS software or IBM SPSS. This software requires trained users, who have to implement the analysis individually for each item. These expenditures may become an obstacle for small studies. Objective of this work is to design, implement and evaluate an open source application, called ODM Data Analysis, for the semi-automatic analysis of clinical study data.

    Methods: The system requires clinical data in the CDISC Operational Data Model format. After uploading the file, its syntax and data type conformity of the collected data is validated. The completeness of the study data is determined and basic statistics, including illustrative charts for each item, are generated. Datasets from four clinical studies have been used to evaluate the application’s performance and functionality.

    Results: The system is implemented as an open source web application (available at https://odmanalysis.uni-muenster.de) and also provided as Docker image which enables an easy distribution and installation on local systems. Study data is only stored in the application as long as the calculations are performed which is compliant with data protection endeavors. Analysis times are below half an hour, even for larger studies with over 6000 subjects.

    Discussion: Medical experts have ensured the usefulness of this application to grant an overview of their collected study data for monitoring purposes and to generate descriptive statistics without further user interaction. The semi-automatic analysis has its limitations and cannot replace the complex analysis of statisticians, but it can be used as a starting point for their examination and reporting.
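
    The completeness check and per-item basic statistics described above can be illustrated with a small sketch. This is not the ODM Data Analysis implementation and it does not parse CDISC ODM files; the item names, the values, and the summarize helper are hypothetical, and missing entries are assumed to be represented as None.

```python
from statistics import mean, median, stdev

# Hypothetical item values per subject; None marks a missing entry.
records = {
    "age": [34, 51, None, 47],
    "weight_kg": [70.5, None, None, 82.0],
}


def summarize(values):
    """Compute completeness and basic descriptive statistics for one item."""
    present = [v for v in values if v is not None]
    summary = {
        "n": len(present),
        # Share of subjects with a recorded value for this item.
        "completeness": len(present) / len(values) if values else 0.0,
    }
    if present:
        summary["mean"] = mean(present)
        summary["median"] = median(present)
        summary["min"] = min(present)
        summary["max"] = max(present)
    if len(present) >= 2:
        # The sample standard deviation needs at least two observations.
        summary["stdev"] = stdev(present)
    return summary


report = {item: summarize(values) for item, values in records.items()}
for item, summary in report.items():
    print(item, summary)
```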

  18. Python Codes for Data Analysis of The Impact of COVID-19 on Technical...

    • dataverse.harvard.edu
    • figshare.com
    Updated Mar 21, 2022
    Cite
    Elizabeth Szkirpan (2022). Python Codes for Data Analysis of The Impact of COVID-19 on Technical Services Units Survey Results [Dataset]. http://doi.org/10.7910/DVN/SXMSDZ
    Dataset updated
    Mar 21, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Elizabeth Szkirpan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Copies of Anaconda 3 Jupyter Notebooks and Python script for holistic and clustered analysis of "The Impact of COVID-19 on Technical Services Units" survey results. Data was analyzed holistically using cleaned and standardized survey results and by library type clusters. To streamline data analysis in certain locations, an off-shoot CSV file was created so data could be standardized without compromising the integrity of the parent clean file. Three Jupyter Notebooks/Python scripts are available in relation to this project: COVID_Impact_TechnicalServices_HolisticAnalysis (a holistic analysis of all survey data) and COVID_Impact_TechnicalServices_LibraryTypeAnalysis (a clustered analysis of impact by library type, clustered files available as part of the Dataverse for this project).

  19. Data used in statistical analyses

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Oct 25, 2022
    Cite
    Mayerl, Christopher (2022). Data used in statistical analyses [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000407573
    Dataset updated
    Oct 25, 2022
    Authors
    Mayerl, Christopher
    Description

    Data used in statistical analyses.

  20. Data from: Data on xylem sap proteins from Mn- and Fe-deficient tomato...

    • agdatacommons.nal.usda.gov
    • datasets.ai
    • +3more
    bin
    Updated Nov 21, 2025
    Cite
    Laura Ceballos-Laita; Elain Gutierrez-Carbonell; Daisuke Takahashi; Anunciación Abadía; Matsuo Uemura; Javier Abadía; Ana Flor López-Millán (2025). Data from: Data on xylem sap proteins from Mn- and Fe-deficient tomato plants obtained using shotgun proteomics [Dataset]. http://doi.org/10.1016/j.dib.2018.01.034
    Available download formats: bin
    Dataset updated
    Nov 21, 2025
    Dataset provided by
    ProteomeXchange
    Authors
    Laura Ceballos-Laita; Elain Gutierrez-Carbonell; Daisuke Takahashi; Anunciación Abadía; Matsuo Uemura; Javier Abadía; Ana Flor López-Millán
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article contains consolidated proteomic data obtained from xylem sap collected from tomato plants grown in Fe- and Mn-sufficient control, as well as Fe-deficient and Mn-deficient conditions. Data presented here cover proteins identified and quantified by shotgun proteomics and Progenesis LC-MS analyses: proteins identified with at least two peptides and showing statistically significant changes (ANOVA; p ≤ 0.05) above a biologically relevant selected threshold (fold ≥ 2) between treatments are listed. The comparison between Fe-deficient, Mn-deficient and control xylem sap samples using a multivariate statistical data analysis (Principal Component Analysis, PCA) is also included. Data included in this article are discussed in depth in "Effects of Fe and Mn deficiencies on the protein profiles of tomato (Solanum lycopersicum) xylem sap as revealed by shotgun analyses", Ceballos-Laita et al., J. Proteomics, 2018. This dataset is made available to support the cited study as well as to extend analyses at a later stage.

    Resources in this dataset:

    Resource Title: ProteomeExchange submission PXD007517. Xylem sap shotgun proteomics from Fe- and Mn-deficient and Mn-toxic tomato plants.
    File Name: Web Page, url: http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD007517

    The MS proteomics data have been deposited to the ProteomeXchange Consortium via the Pride partner repository with the data set identifier PXD007517. Also includes FTP location. Files available at https://www.ebi.ac.uk/pride/archive/projects/PXD007517 via HTML, FTP, or Fast (Aspera) download: 1 SEARCH.xml file, 1 Peak file, 24 RAW files, 1 Mascot information.xlsx file. Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.dib.2018.01.034
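
    The two selection criteria quoted above (ANOVA p ≤ 0.05 and fold ≥ 2) can be expressed as a simple filter. The sketch below is an illustration, not the authors' pipeline: the protein labels and abundance values are invented, and it assumes the fold threshold applies in either direction (increase or decrease relative to control), which the description does not state explicitly.

```python
def fold_change(treatment, control):
    """Symmetric fold change: the larger abundance divided by the smaller."""
    if treatment <= 0 or control <= 0:
        raise ValueError("abundances must be positive")
    return max(treatment / control, control / treatment)


def is_selected(p_value, treatment, control, alpha=0.05, min_fold=2.0):
    """Apply both thresholds: significance (p <= alpha) and effect size (fold >= min_fold)."""
    return p_value <= alpha and fold_change(treatment, control) >= min_fold


# Hypothetical rows: (label, ANOVA p-value, treatment abundance, control abundance).
proteins = [
    ("P1", 0.01, 200.0, 100.0),  # significant, 2-fold increase
    ("P2", 0.20, 300.0, 100.0),  # large change, but not significant
    ("P3", 0.03, 40.0, 100.0),   # significant, 2.5-fold decrease
    ("P4", 0.04, 150.0, 100.0),  # significant, but only 1.5-fold
]

selected = [name for name, p, trt, ctl in proteins if is_selected(p, trt, ctl)]
print(selected)
```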
