8 datasets found

Table_1_Gender-based time discrepancy in diagnosis of coronary artery...
figshare.com
docx
Updated Jun 12, 2023
Cite
Maryam Panahiazar; Andrew M. Bishara; Yorick Chern; Roohallah Alizadehsani; Sheikh M. Shariful Islam; Dexter Hadley; Rima Arnaout; Ramin E. Beygui (2023). Table_1_Gender-based time discrepancy in diagnosis of coronary artery disease based on data analytics of electronic medical records.DOCX [Dataset]. http://doi.org/10.3389/fcvm.2022.969325.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fcvm.2022.969325.s001
Dataset updated
Jun 12, 2023
Dataset provided by
Frontiers
Authors
Maryam Panahiazar; Andrew M. Bishara; Yorick Chern; Roohallah Alizadehsani; Sheikh M. Shariful Islam; Dexter Hadley; Rima Arnaout; Ramin E. Beygui
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background: Women continue to have worse Coronary Artery Disease (CAD) outcomes than men. The causes of this discrepancy have yet to be fully elucidated. The main objective of this study is to detect gender discrepancies in the diagnosis and treatment of CAD.

Methods: We used data analytics to risk stratify ~32,000 patients with CAD of the total 960,129 patients treated at the UCSF Medical Center over an 8-year period. We implemented a multidimensional data analytics framework to trace patients from admission through treatment to create a path of events. Events are any medications or noninvasive and invasive procedures. The time between events for a similar set of paths was calculated. Then, the average waiting time for each step of the treatment was calculated. Finally, we applied statistical analysis to determine differences in time between diagnosis and treatment steps for men and women.

Results: There is a significant time difference from the first time of admission to diagnostic Cardiac Catheterization between genders (p-value = 0.000119), while the time difference from diagnostic Cardiac Catheterization to CABG is not statistically significant.

Conclusion: Women had a significantly longer interval between their first physician encounter indicative of CAD and their first diagnostic cardiac catheterization compared to men. Avoiding this delay in diagnosis may provide more timely treatment and a better outcome for patients at risk. Finally, we conclude by discussing the impact of the study on improving patient care with early detection and managing individual patients at risk of rapid progression of CAD.
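The waiting-time step described in the methods can be illustrated with a minimal sketch. The record layout, event names, and sample values below are hypothetical and are not taken from the study; the sketch only averages the days between two events per gender and does not reproduce the study's statistical test.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical event records: (patient_id, gender, event_name, event_date).
events = [
    ("p1", "F", "admission", date(2015, 1, 1)),
    ("p1", "F", "cardiac_catheterization", date(2015, 1, 21)),
    ("p2", "M", "admission", date(2015, 2, 1)),
    ("p2", "M", "cardiac_catheterization", date(2015, 2, 11)),
    ("p3", "F", "admission", date(2015, 3, 1)),
    ("p3", "F", "cardiac_catheterization", date(2015, 3, 31)),
]


def mean_wait_days(events, start, end):
    """Average days from the first `start` event to the first later `end` event, per gender."""
    by_patient = defaultdict(list)
    for pid, gender, name, day in events:
        by_patient[(pid, gender)].append((day, name))

    waits = defaultdict(list)
    for (pid, gender), path in by_patient.items():
        path.sort()  # order each patient's path of events chronologically
        start_day = next((d for d, n in path if n == start), None)
        if start_day is None:
            continue  # patient never had the start event
        end_day = next((d for d, n in path if n == end and d >= start_day), None)
        if end_day is not None:
            waits[gender].append((end_day - start_day).days)
    return {g: mean(v) for g, v in waits.items()}


result = mean_wait_days(events, "admission", "cardiac_catheterization")
```

A significance test on the two groups of waiting times would follow this step; the description does not name the test that was used.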
Data for manuscript "Reciprocal Radicalization: The Rise of Culture War...
zenodo.org
explore.openaire.eu
bin
Updated Dec 7, 2021
Cite
David Rozado (2021). Data for manuscript "Reciprocal Radicalization: The Rise of Culture War Terminology in British and American News Coverage" [Dataset]. http://doi.org/10.5281/zenodo.5709760
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.5709760
Dataset updated
Dec 7, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
David Rozado
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United Kingdom, United States
Description
This data set contains frequency counts of target words in 16 million news and opinion articles from 10 popular news media outlets in the United Kingdom: The Guardian, The Times, The Independent, The Daily Mirror, BBC, Financial Times, Metro, Telegraph, The and The Daily Mail, plus a few additional American-based outlets used as a comparison reference. The target words are listed in the associated manuscript and are mostly words that denote some type of prejudice, social-justice-related terms, or counterreactions to them. A few additional words are also available since they are used in the manuscript for illustration purposes.

The textual content of news and opinion articles from the outlets listed in Figure 3 of the main manuscript is available in the outlets' online domains and/or public cache repositories such as Google cache (https://webcache.googleusercontent.com), The Internet Wayback Machine (https://archive.org/web/web.php), and Common Crawl (https://commoncrawl.org). We derived relative frequency counts from these sources. The textual content included in our analysis is limited to article headlines and the main body text of the articles and does not include other article elements such as figure captions.

Targeted textual content was located in raw HTML data using outlet-specific XPath expressions. Tokens were lowercased prior to estimating frequency counts. To prevent outlets with sparse text content for a year from distorting aggregate frequency counts, we only include outlet frequency counts from years for which there are at least 1 million words of article content from an outlet.
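As a rough illustration of the extraction step, the sketch below locates headline and body text with XPath expressions, lowercases it, and counts tokens. The markup, the two XPath expressions, and the tokenization rule are hypothetical; it uses the limited XPath subset of Python's standard library on well-formed markup, which is an assumption and not necessarily how the authors' scripts work.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, well-formed article markup; real outlet pages differ.
PAGE = """
<html><body>
  <h1 class="headline">Culture War Debate Grows</h1>
  <div class="article-body">
    <p>The debate over Prejudice continues.</p>
    <p>Critics call the debate a culture war.</p>
  </div>
  <div class="footer"><p>Follow us on Facebook</p></div>
</body></html>
"""

# Hypothetical outlet-specific expressions: headline plus main body paragraphs.
OUTLET_XPATHS = [".//h1[@class='headline']", ".//div[@class='article-body']/p"]


def count_tokens(page, xpaths):
    """Count lowercased tokens in the text selected by the XPath expressions."""
    root = ET.fromstring(page.strip())
    text = " ".join(
        "".join(node.itertext()) for xp in xpaths for node in root.findall(xp)
    )
    tokens = re.findall(r"[a-z]+", text.lower())  # lowercase before counting
    return Counter(tokens)


counts = count_tokens(PAGE, OUTLET_XPATHS)
```

Because the footer is outside the selected elements, "facebook" is not counted here; an expression that is too broad would pick it up, which is the kind of deviation discussed in the usage notes.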

The yearly usage frequency of a target word in an outlet was estimated by dividing the total number of occurrences of the target word in all of the outlet's articles from a given year by the total number of words in all of its articles from that year. This method of estimating frequency accounts for the variable volume of total article output over time.
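The calculation above, together with the 1-million-word inclusion threshold, can be sketched as follows. The outlet names and counts are hypothetical, and the dictionary layout is an assumption made for illustration only.

```python
MIN_WORDS_PER_YEAR = 1_000_000  # inclusion threshold stated in the description

# Hypothetical counts: (outlet, year) -> (target word occurrences, total words).
counts = {
    ("outlet_a", 2015): (150, 3_000_000),
    ("outlet_a", 2016): (400, 4_000_000),
    ("outlet_b", 2015): (10, 200_000),  # below the threshold, so excluded
}


def yearly_relative_frequency(counts, min_words=MIN_WORDS_PER_YEAR):
    """Return target occurrences divided by total words for each included outlet-year."""
    freqs = {}
    for key, (target, total) in counts.items():
        if total >= min_words:
            freqs[key] = target / total
    return freqs


freqs = yearly_relative_frequency(counts)
```

Dividing by the total word count is what makes a year with 4 million words comparable to a year with 3 million words.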

The compressed files in this data set are listed next:

-analysisScripts.rar contains the analysis scripts used in the main manuscript

-targetWordsInArticlesCounts.rar contains counts of target words in outlets' articles as well as total counts of words in articles

-targetWordsInArticlesCountsGuardianExampleWords contains counts of target words in outlets' articles as well as total counts of words in articles for the illustrative Figure 1 in the main manuscript

Usage Notes

In a small percentage of articles, outlet-specific XPath expressions can fail to properly capture the content of the article due to the heterogeneity of HTML elements and CSS styling combinations with which article text is arranged in outlets' online domains. As a result, the total and target word count metrics for a small subset of articles are not precise. In a random sample of articles and outlets, manual estimates of target word counts agreed with the automatically derived counts for over 90% of the articles.

Most of the incorrect frequency counts were minor deviations from the actual counts, for instance counting the word "Facebook" in an article footnote that encourages readers to follow the journalist’s Facebook profile and that the XPath expression mistakenly included as part of the article’s main text. To conclude, in a data analysis of 16 million articles, we cannot manually check the correctness of frequency counts for every single article, and one hundred percent accuracy at capturing articles’ content is elusive due to the small number of difficult-to-detect boundary cases, such as incorrect HTML markup syntax in online domains. Overall, however, we are confident that our frequency metrics are representative of word prevalence in print news media content (see Figure 1 of the main manuscript for supporting evidence).
Dataset for Report: "The Increasing Prominence of Prejudice and Social...
data.niaid.nih.gov
Updated Jun 13, 2022
Cite
David Rozado (2022). Dataset for Report: "The Increasing Prominence of Prejudice and Social Justice Rhetoric in UK News Media" [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_6482344
Dataset updated
Jun 13, 2022
Dataset authored and provided by
David Rozado
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United Kingdom
Description
This data set contains frequency counts of target words in 16 million news and opinion articles from 10 popular news media outlets in the United Kingdom. The target words are listed in the associated report and are mostly words that denote prejudice or are often associated with social justice discourse. A few additional words not denoting prejudice are also available since they are used in the report for illustration purposes of the method.

The textual content of news and opinion articles from the outlets is available in the outlets' online domains and/or public cache repositories such as Google cache (https://webcache.googleusercontent.com), The Internet Wayback Machine (https://archive.org/web/web.php), and Common Crawl (https://commoncrawl.org). We derived word frequency counts from these sources. The textual content included in our analysis is limited to article headlines and the main body text of the articles and does not include other article elements such as figure captions.

Targeted textual content was located in raw HTML data using outlet-specific XPath expressions. Tokens were lowercased prior to estimating frequency counts. To prevent outlets with sparse text content for a year from distorting aggregate frequency counts, we only include outlet frequency counts from years for which there are at least 1 million words of article content from an outlet. This threshold was chosen to maximize the inclusion in our analysis of outlets with sparse amounts of article text per year.

The yearly usage frequency of a target word in an outlet was estimated by dividing the total number of occurrences of the target word in all of the outlet's articles from a given year by the total number of words in all of its articles from that year. This method of estimating frequency accounts for the variable volume of total article output over time.

In a small percentage of articles, outlet-specific XPath expressions might fail to properly capture the content of the article due to the heterogeneity of HTML elements and CSS styling combinations with which article text is arranged in outlets' online domains. As a result, the total and target word count metrics for a small subset of articles are not precise. In a random sample of articles and outlets, manual estimates of target word counts agreed with the automatically derived counts for over 90% of the articles.

Most of the incorrect frequency counts are minor deviations from the actual counts, for instance counting the word "Facebook" in an article footnote that encourages readers to follow the journalist’s Facebook profile and that the XPath expression mistakenly included as part of the article’s main text. To conclude, in a data analysis of over 16 million articles, we cannot manually check the correctness of frequency counts for every single article, and one hundred percent accuracy at capturing articles’ content is elusive due to the small number of difficult-to-detect boundary cases, such as incorrect HTML markup syntax in online domains. Overall, however, we are confident that our frequency metrics are representative of word prevalence in print news media content (see Figure 2 of the main manuscript for supporting evidence of the temporal precision of the method).
data result.xlsx
figshare.com
bin
Updated Aug 3, 2023
Cite
Yuhan Liu (2023). data result.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.23828130.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.23828130.v1
Dataset updated
Aug 3, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Yuhan Liu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We use content analysis to compare local media coverage with central media coverage in Chengdu, the main city of the protests. Six online media platforms and 987 COVID-related tweets are included to analyze the crisis coverage gap. We encode the tweets, and the tables present the results of the data analysis.
Data for manuscript "The Prevalence of Terms Denoting Far-right and Far-left...
data.niaid.nih.gov
zenodo.org
Updated Mar 22, 2022
Cite
Rozado, David (2022). Data for manuscript "The Prevalence of Terms Denoting Far-right and Far-left Political Extremism in U.S. and U.K. News Media" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5437015
Dataset updated
Mar 22, 2022
Dataset authored and provided by
Rozado, David
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United Kingdom, United States
Description
This data set belongs to an academic manuscript examining longitudinally (2000-2019) the prevalence of terms denoting far-right and far-left political extremism in a large corpus of more than 32 million written news and opinion articles from 54 news media outlets popular in the United States and the United Kingdom.

The textual content of news and opinion articles from the 54 outlets listed in the main manuscript is available in the outlets' online domains and/or public cache repositories such as Google cache (https://webcache.googleusercontent.com), The Internet Wayback Machine (https://archive.org/web/web.php), and Common Crawl (https://commoncrawl.org). We derived word frequency counts from these sources. The textual content included in our analysis is limited to article headlines and the main body text of the articles and does not include other article elements such as figure captions.

Targeted textual content was located in raw HTML data using outlet-specific XPath expressions. Tokens were lowercased prior to estimating frequency counts. To prevent outlets with sparse text content for a year from distorting aggregate frequency counts, we only include outlet frequency counts from years for which there are at least 1 million words of article content from an outlet. This threshold was chosen to maximize the inclusion in our analysis of outlets with sparse amounts of article text per year.

The yearly usage frequency of a target word in an outlet was estimated by dividing the total number of occurrences of the target word in all of the outlet's articles from a given year by the total number of words in all of its articles from that year. This method of estimating frequency accounts for the variable volume of total article output over time.

The compressed files in this data set are listed next:

-analysisScripts.rar contains the analysis scripts used in the main manuscript

-articlesContainingTargetWords.rar contains counts of target words in outlets' articles as well as total counts of words in articles

Usage Notes

In a small percentage of articles, outlet-specific XPath expressions failed to properly capture the content of the article due to the heterogeneity of HTML elements and CSS styling combinations with which article text is arranged in outlets' online domains. As a result, the total and target word count metrics for a small subset of articles are not precise. In a random sample of articles and outlets, manual estimates of target word counts agreed with the automatically derived counts for over 90% of the articles.

Most of the incorrect frequency counts were minor deviations from the actual counts, for instance counting the word "Facebook" in an article footnote that encourages readers to follow the journalist’s Facebook profile and that the XPath expression mistakenly included as part of the article’s main text. Some additional outlet-specific inaccuracies that we could identify occurred in the "The Hill" and "Newsmax" news outlets, where XPath expressions had some shortfalls at precisely capturing articles’ content. For "The Hill", in the years 2007-2009, XPath expressions failed to capture the complete text of the article in about 40% of the articles. This does not necessarily result in incorrect frequency counts for that outlet, but it does yield a sample of article words that is about 40% smaller than the total population of article words for those three years. In the case of "Newsmax", the issue was that for some articles, XPath expressions captured the entire text of the article twice. Note that this does not result in incorrect frequency counts: if a word appears x times in an article with a total of y words, the same relative frequency is still derived when our scripts count the word 2x times in the version of the article with a total of 2y words.
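The duplication argument (x/y equals 2x/2y) can be checked with a small sketch. The sample sentence is hypothetical, and exact fractions are used only to avoid floating-point comparison issues.

```python
from fractions import Fraction


def relative_frequency(tokens, target):
    """Occurrences of `target` divided by the total number of tokens."""
    return Fraction(tokens.count(target), len(tokens))


# Hypothetical article text, already lowercased and tokenized.
article = "the hill reported that the vote passed".split()
duplicated = article + article  # the entire text captured twice

once = relative_frequency(article, "the")
twice = relative_frequency(duplicated, "the")
```

Both values reduce to the same fraction, so a doubled capture leaves the per-article relative frequency unchanged.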

To conclude, in a data analysis of 32 million articles, we cannot manually check the correctness of frequency counts for every single article, and one hundred percent accuracy at capturing articles’ content is elusive due to the small number of difficult-to-detect boundary cases, such as incorrect HTML markup syntax in online domains. Overall, however, we are confident that our frequency metrics are representative of word prevalence in print news media content (see Figure 1 in the main manuscript for an illustration of the accuracy of the frequency counts).
Compilation of stomach content data for mesopelagic fish and predator...
doi.pangaea.de
gis.ices.dk
html, tsv
Updated Jul 8, 2022
Cite
Mónica A Silva; Catarina T Fonseca; M Pilar Olivar; Ainhoa Bernal; Jérôme Spitz; Gui M Menezes; Tone Falkenhaug; Odd Aksel Bergstad; Sergi Pérez-Jorge; Vanda Carmo; Tracey T Sutton (2022). Compilation of stomach content data for mesopelagic fish and predator species from the central and Northeast Atlantic, and the Mediterranean Sea [Dataset]. http://doi.org/10.1594/PANGAEA.946139
Available download formats: tsv, html
Unique identifier
https://doi.org/10.1594/PANGAEA.946139
Dataset updated
Jul 8, 2022
Dataset provided by
PANGAEA
Authors
Mónica A Silva; Catarina T Fonseca; M Pilar Olivar; Ainhoa Bernal; Jérôme Spitz; Gui M Menezes; Tone Falkenhaug; Odd Aksel Bergstad; Sergi Pérez-Jorge; Vanda Carmo; Tracey T Sutton
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Gear, Size, Class, Month, Order, Family, Phylum, Comment, LATITUDE, Location, and 20 more
Description
Stomach contents analysis is a standard dietary assessment method that potentially enables quantifying diet components with high taxonomic resolution. We compiled diet compositions from stomach content analysis from 75 unique species or genera: 32 fish, 19 marine mammals, 14 elasmobranchs, 9 seabirds and one marine turtle. Data were gathered from 89 published sources that included samples collected between 1885 and 2016 throughout the central and Northeast Atlantic, and the Mediterranean Sea. When available, we reported the percentage number of individuals of a prey type as a proportion of the total number of prey items (%N), the proportion of a prey item by weight (%W), and the proportion of stomachs containing a particular prey item (i.e. percent frequency of occurrence, %F). For each data record, we also provided the sampling location, geographic coordinates, month and year of sample collection, method of sample collection, taxonomic ranks (phylum, class, order, family), number and size (or size range) of sampled organisms, as well as the reference and DOI of the original data source, for further details on the samples analysed and/or the analytical techniques used.
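The three diet indices named above (%N, %W, %F) can be illustrated with a minimal sketch. The prey names, counts, and weights are hypothetical, and including empty stomachs in the %F denominator is an assumption; conventions vary between the original sources.

```python
# Hypothetical stomach records for one predator species:
# each stomach maps prey type -> (number of prey items, prey weight in grams).
stomachs = [
    {"krill": (8, 4.0), "myctophid": (2, 6.0)},
    {"krill": (10, 5.0)},
    {},  # empty stomach (assumed to count toward the %F denominator)
    {"myctophid": (5, 15.0)},
]


def diet_indices(stomachs):
    """Return %N, %W and %F for each prey type across all stomachs."""
    total_n = sum(n for s in stomachs for n, _ in s.values())
    total_w = sum(w for s in stomachs for _, w in s.values())
    prey_types = {p for s in stomachs for p in s}
    out = {}
    for prey in prey_types:
        n = sum(s[prey][0] for s in stomachs if prey in s)
        w = sum(s[prey][1] for s in stomachs if prey in s)
        f = sum(1 for s in stomachs if prey in s)
        out[prey] = {
            "%N": 100 * n / total_n,        # share of all prey items
            "%W": 100 * w / total_w,        # share of total prey weight
            "%F": 100 * f / len(stomachs),  # share of stomachs containing the prey
        }
    return out


indices = diet_indices(stomachs)
```

%N and %W each sum to 100 across prey types, while %F values need not, because one stomach can contain several prey types.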
Global Navigation Satellite System (GNSS) IGS Ionosphere Vertical Total...
cmr.earthdata.nasa.gov
s.cnmilf.com
ascii
Updated Jun 21, 2019
Cite
(2019). Global Navigation Satellite System (GNSS) IGS Ionosphere Vertical Total Electron Content (VTEC) Analysis Center (AC) Rapid Product from NASA CDDIS [Dataset]. http://doi.org/10.5067/GNSS/GNSS_IGSRIONOACTEC_001
Available download formats: ascii (5 MB)
Unique identifier
https://doi.org/10.5067/GNSS/GNSS_IGSRIONOACTEC_001
Dataset updated
Jun 21, 2019
Time period covered
Jan 1, 1998 - Present
Area covered
Earth
Description
This derived product set consists of Global Navigation Satellite System Rapid Ionosphere Vertical Total Electron Content (VTEC) product (daily files) from the NASA Crustal Dynamics Data Information System (CDDIS). The VTEC product files also include Differential Code Bias (DCB) values for GNSS satellites and ground receivers derived during the analysis. GNSS provide autonomous geo-spatial positioning with global coverage. GNSS data sets from ground receivers at the CDDIS consist primarily of the data from the U.S. Global Positioning System (GPS) and the Russian GLObal NAvigation Satellite System (GLONASS). Since 2011, the CDDIS GNSS archive includes data from other GNSS (Europe’s Galileo, China’s Beidou, Japan’s Quasi-Zenith Satellite System/QZSS, the Indian Regional Navigation Satellite System/IRNSS, and worldwide Satellite Based Augmentation Systems/SBASs), which are similar to the U.S. GPS in terms of the satellite constellation, orbits, and signal structure. GNSS observations from a global network can be utilized for atmospheric measurements. Analysis Centers (ACs) of the International GNSS Service (IGS) retrieve GNSS data on regular schedules to produce independently computed VTEC maps. The IGS Ionosphere Analysis Center Coordinator (ACC) uses these individual AC solutions to generate the official IGS VTEC maps. The AC VTEC maps are computed with a resolution of 2 hours in UT, 5 degrees in longitude and 2.5 degrees in latitude; they are available with a latency of 1-2 days.
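To make the stated map resolution concrete, the sketch below builds the grid axes implied by 2-hour, 5-degree, and 2.5-degree steps and finds the node nearest to a location. The latitude and longitude extents are assumptions chosen for illustration; the description gives only the step sizes, and the actual file headers should be consulted for the real extents.

```python
HOURS_STEP = 2   # temporal resolution in UT (from the description)
LON_STEP = 5.0   # degrees of longitude (from the description)
LAT_STEP = 2.5   # degrees of latitude (from the description)

# Assumed grid extents, not stated in the description.
LAT_MIN, LAT_MAX = -87.5, 87.5
LON_MIN, LON_MAX = -180.0, 180.0


def axis(start, stop, step):
    """Evenly spaced axis values from start to stop, both inclusive."""
    n = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n)]


lats = axis(LAT_MIN, LAT_MAX, LAT_STEP)
lons = axis(LON_MIN, LON_MAX, LON_STEP)
epochs = list(range(0, 24, HOURS_STEP))  # map start hours within one UT day


def nearest_node(lat, lon):
    """Return the grid node closest to the given latitude and longitude."""
    i = min(range(len(lats)), key=lambda k: abs(lats[k] - lat))
    j = min(range(len(lons)), key=lambda k: abs(lons[k] - lon))
    return lats[i], lons[j]


node = nearest_node(37.8, -122.4)
```

Under these assumed extents, each map has 71 latitude rows and 73 longitude columns, with 12 map start times per day.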
Components obtained from each block of questions having applied a factorial...
plos.figshare.com
xls
Updated Jul 13, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Candela Ollé; Alexandre López-Borrull; Remedios Melero; Juan-José Boté-Vericad; Josep-Manuel Rodríguez-Gairín; Ernest Abadal (2023). Components obtained from each block of questions having applied a factorial analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0288313.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0288313.t004
Dataset updated
Jul 13, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Candela Ollé; Alexandre López-Borrull; Remedios Melero; Juan-José Boté-Vericad; Josep-Manuel Rodríguez-Gairín; Ernest Abadal
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Components obtained from each block of questions having applied a factorial analysis.