13 datasets found

Data from: Journal Ranking Dataset
kaggle.com
Updated Aug 15, 2023
Cite
Abir (2023). Journal Ranking Dataset [Dataset]. https://www.kaggle.com/datasets/xabirhasan/journal-ranking-dataset
Dataset updated
Aug 15, 2023
Dataset provided by
Kaggle
Authors
Abir
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Journals & Ranking

An academic journal or research journal is a periodical publication in which research articles relating to a particular academic discipline are published, according to Wikipedia. Currently, there are more than 25,000 peer-reviewed journals indexed in citation index databases such as Scopus and Web of Science. Journals in these indexes are ranked on the basis of various metrics such as CiteScore, H-index, etc. The metrics are calculated from the journal's yearly citation data. Considerable effort goes into designing a metric that reflects a journal's quality.

Journal Ranking Dataset

This is a comprehensive dataset on academic journals covering their metadata as well as citation, metrics, and ranking information. Detailed data on their subject areas is also given in this dataset. The dataset is collected from the following indexing databases:
- Scimago Journal Ranking
- Scopus
- Web of Science Master Journal List

The data was collected by scraping and then cleaned; details can be found HERE.

Key Features

Rank: Overall rank of journal (derived from sorted SJR index).

Title: Name or title of journal.

OA: Open Access or not.

Country: Country of origin.

SJR-index: A citation index calculated by Scimago.

CiteScore: A citation index calculated by Scopus.

H-index: Hirsch index, the largest number h such that at least h articles in that journal were cited at least h times each.

Best Quartile: Top Q-index or quartile a journal has in any subject area.

Best Categories: Subject areas with top quartile.

Best Subject Area: Highest ranking subject area.

Best Subject Rank: Rank of the highest ranking subject area.

Total Docs.: Total number of documents of the journal.

Total Docs. 3y: Total number of documents in the past 3 years.

Total Refs.: Total number of references of the journal.

Total Cites 3y: Total number of citations in the past 3 years.

Citable Docs. 3y: Total number of citable documents in the past 3 years.

Cites/Doc. 2y: Total number of citations divided by the total number of documents in the past 2 years.

Refs./Doc.: Total number of references divided by the total number of documents.

Publisher: Name of the publisher company of the journal.

Core Collection: Web of Science core collection name.

Coverage: Starting year of coverage.

Active: Active or inactive.

In-Press: Articles in press or not.

ISO Language Code: Three-letter ISO 639 code for language.

ASJC Codes: All Science Journal Classification codes for the journal.

The rest of the features provide further details on the journal's subject area or category:
- Life Sciences: Top level subject area.
- Social Sciences: Top level subject area.
- Physical Sciences: Top level subject area.
- Health Sciences: Top level subject area.
- 1000 General: ASJC main category.
- 1100 Agricultural and Biological Sciences: ASJC main category.
- 1200 Arts and Humanities: ASJC main category.
- 1300 Biochemistry, Genetics and Molecular Biology: ASJC main category.
- 1400 Business, Management and Accounting: ASJC main category.
- 1500 Chemical Engineering: ASJC main category.
- 1600 Chemistry: ASJC main category.
- 1700 Computer Science: ASJC main category.
- 1800 Decision Sciences: ASJC main category.
- 1900 Earth and Planetary Sciences: ASJC main category.
- 2000 Economics, Econometrics and Finance: ASJC main category.
- 2100 Energy: ASJC main category.
- 2200 Engineering: ASJC main category.
- 2300 Environmental Science: ASJC main category.
- 2400 Immunology and Microbiology: ASJC main category.
- 2500 Materials Science: ASJC main category.
- 2600 Mathematics: ASJC main category.
- 2700 Medicine: ASJC main category.
- 2800 Neuroscience: ASJC main category.
- 2900 Nursing: ASJC main category.
- 3000 Pharmacology, Toxicology and Pharmaceutics: ASJC main category.
- 3100 Physics and Astronomy: ASJC main category.
- 3200 Psychology: ASJC main category.
- 3300 Social Sciences: ASJC main category.
- 3400 Veterinary: ASJC main category.
- 3500 Dentistry: ASJC main category.
- 3600 Health Professions: ASJC main category.
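
The H-index definition in the feature list (the largest h such that at least h articles were cited at least h times each) can be illustrated with a short sketch. The function name and the citation counts below are hypothetical and are not part of the dataset.

```python
def h_index(citations):
    """Return the largest h such that at least h items have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position  # the top `position` articles all have >= position citations
        else:
            break
    return h


# Hypothetical per-article citation counts for one journal.
example_citations = [10, 8, 5, 4, 3, 0]
example_h = h_index(example_citations)
```

Sorting in descending order means the loop can stop at the first article whose citation count falls below its rank.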
ABS - Index of Household Advantage and Disadvantage (IHAD) (LGA) 2016
researchdata.edu.au
Updated Jun 28, 2023
Cite
Government of the Commonwealth of Australia - Australian Bureau of Statistics (2023). ABS - Index of Household Advantage and Disadvantage (IHAD) (LGA) 2016 [Dataset]. https://researchdata.edu.au/abs-index-household-lga-2016/2747823
Dataset updated
Jun 28, 2023
Dataset provided by
Australian Urban Research Infrastructure Network (AURIN)
Authors
Government of the Commonwealth of Australia - Australian Bureau of Statistics
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset presents information from 2016 at the household level; the percentage of households within each Index of Household Advantage and Disadvantage (IHAD) quartile for Local Government Area (LGA) 2017 boundaries.

The IHAD is an experimental analytical index developed by the Australian Bureau of Statistics (ABS) that provides a summary measure of relative socio-economic advantage and disadvantage for households. It utilises information from the 2016 Census of Population and Housing.

IHAD quartiles: All households are ordered from lowest to highest disadvantage. The lowest 25% of households are given a quartile number of 1, the next lowest 25% are given a quartile number of 2, and so on, up to the highest 25% of households, which are given a quartile number of 4. This means that households are divided into four groups, depending on their score.

This data is ABS data (catalogue number: 4198.0) used with permission from the Australian Bureau of Statistics.

For more information please visit the Australian Bureau of Statistics.

Please note:

AURIN has generated this dataset through aggregating the original SA1 level data (with calculated number of households/quartile) to LGA level.

Aggregation was achieved through calculating the centroid for each SA1 and assigning it to the LGA it fell within.

The number of occupied private dwellings and the number of households in each of the IHAD quartiles were calculated for each LGA by aggregating the previously assigned SA1 values of each of those specified columns from the SA1 dataset. Percentages of households in each of the IHAD quartiles were calculated for each LGA from these aggregated totals.

A household is defined as one or more persons, at least one of whom is at least 15 years of age, usually resident in the same private dwelling. All occupants of a dwelling form a household. For Census purposes, the total number of households is equal to the total number of occupied private dwellings (Census of Population and Housing: Census Dictionary, 2016 cat. no. 2901.0).

IHAD output has been confidentialised to meet ABS requirements. In line with standard ABS procedures to minimise the risk of identifying individuals, a technique has been applied to randomly adjust cell values of the output tables. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals.
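
The aggregation described in the notes above (summing SA1 household counts per quartile within each LGA, then deriving percentages from the totals) can be sketched as follows. This is a minimal illustration, assuming each SA1 has already been assigned to an LGA via its centroid; the record layout, area names, and counts are hypothetical and are not AURIN's actual schema.

```python
from collections import defaultdict

# Hypothetical SA1 records: (sa1_code, assigned_lga, households in quartiles 1..4).
sa1_rows = [
    ("SA1-001", "LGA A", [10, 20, 30, 40]),
    ("SA1-002", "LGA A", [30, 20, 10, 40]),
    ("SA1-003", "LGA B", [5, 5, 5, 5]),
]


def aggregate_to_lga(rows):
    """Sum SA1 quartile counts per LGA and convert the totals to percentages."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for _sa1_code, lga, counts in rows:
        for i, count in enumerate(counts):
            totals[lga][i] += count
    percentages = {}
    for lga, counts in totals.items():
        households = sum(counts)
        percentages[lga] = [
            100.0 * count / households if households else 0.0 for count in counts
        ]
    return percentages


lga_percentages = aggregate_to_lga(sa1_rows)
```

Percentages are computed from the aggregated totals rather than by averaging SA1-level percentages, which matches the order of operations stated in the notes.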
Gender, Age, and Emotion Detection from Voice
kaggle.com
Updated May 29, 2021
Cite
Rohit Zaman (2021). Gender, Age, and Emotion Detection from Voice [Dataset]. https://www.kaggle.com/datasets/rohitzaman/gender-age-and-emotion-detection-from-voice/suggestions
Dataset updated
May 29, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Rohit Zaman
Description
Context

Our goal was to predict gender, age, and emotion from audio. We found labeled audio datasets in Mozilla Common Voice and RAVDESS. Using the R programming language, 20 statistical features were extracted, and the labels were then added to form these datasets. Audio files were collected from "Mozilla Common Voice" and the "Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)".

Content

The datasets contain 20 feature columns and 1 column denoting the label. The 20 statistical features were extracted through frequency spectrum analysis using the R programming language. They are:
1) meanfreq - The mean frequency (in kHz) is a pitch measure that assesses the center of the distribution of power across frequencies.
2) sd - The standard deviation of frequency is a statistical measure that describes the dispersion of the data relative to its mean and is calculated as the square root of the variance.
3) median - The median frequency (in kHz) is the middle value in the sorted list of frequencies.
4) Q25 - The first quartile (in kHz), referred to as Q1, is the median of the lower half of the data set. About 25 percent of the values are below Q1, and about 75 percent are above Q1.
5) Q75 - The third quartile (in kHz), referred to as Q3, is the central point between the median and the highest value of the distribution.
6) IQR - The interquartile range (in kHz) is a measure of statistical dispersion, equal to the difference between the 75th and 25th percentiles, that is, between the upper and lower quartiles.
7) skew - The skewness is the degree of distortion from the normal distribution. It measures the lack of symmetry in the data distribution.
8) kurt - The kurtosis is a statistical measure of how much the tails of a distribution differ from the tails of a normal distribution. It is often read as an indicator of outliers present in the data distribution.
9) sp.ent - The spectral entropy is a measure of signal irregularity that sums up the normalized signal's spectral power.
10) sfm - The spectral flatness or tonality coefficient, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. It is usually measured in decibels and quantifies how noise-like, as opposed to tone-like, a sound is.
11) mode - The mode frequency is the most frequently observed value in a data set.
12) centroid - The spectral centroid is a metric used to describe a spectrum in digital signal processing. It indicates where the spectrum's center of mass is located.
13) meanfun - The meanfun is the average of the fundamental frequency measured across the acoustic signal.
14) minfun - The minfun is the minimum fundamental frequency measured across the acoustic signal.
15) maxfun - The maxfun is the maximum fundamental frequency measured across the acoustic signal.
16) meandom - The meandom is the average of the dominant frequency measured across the acoustic signal.
17) mindom - The mindom is the minimum of the dominant frequency measured across the acoustic signal.
18) maxdom - The maxdom is the maximum of the dominant frequency measured across the acoustic signal.
19) dfrange - The dfrange is the range of the dominant frequency measured across the acoustic signal.
20) modindx - The modindx is the modulation index, which expresses the degree of frequency modulation numerically as the ratio of the frequency deviation to the frequency of the modulating signal for a pure tone modulation.
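
Several of the features listed above (meanfreq, sd, Q25, Q75, IQR, sfm) can be read as statistics of a spectrum treated as a distribution over frequency. The sketch below illustrates that reading in Python. It is an assumption-laden illustration, not the R code used to build the dataset: the function name, the amplitude-weighted definitions, and the toy spectrum are all hypothetical.

```python
import math


def spectral_summary(freqs_khz, amplitudes):
    """Summarise a magnitude spectrum as a distribution over frequency.

    Assumes freqs_khz is sorted ascending and all amplitudes are > 0
    (the geometric mean below takes logarithms).
    """
    total = sum(amplitudes)
    weights = [a / total for a in amplitudes]
    mean_freq = sum(f * w for f, w in zip(freqs_khz, weights))
    variance = sum(w * (f - mean_freq) ** 2 for f, w in zip(freqs_khz, weights))

    def quantile(p):
        # First frequency at which the cumulative weight reaches p.
        cumulative = 0.0
        for f, w in zip(freqs_khz, weights):
            cumulative += w
            if cumulative >= p:
                return f
        return freqs_khz[-1]

    q25, q75 = quantile(0.25), quantile(0.75)
    geometric_mean = math.exp(sum(math.log(a) for a in amplitudes) / len(amplitudes))
    arithmetic_mean = total / len(amplitudes)
    return {
        "meanfreq": mean_freq,
        "sd": math.sqrt(variance),
        "Q25": q25,
        "Q75": q75,
        "IQR": q75 - q25,
        "sfm": geometric_mean / arithmetic_mean,  # 1.0 for a perfectly flat spectrum
    }


# Toy spectra: a flat (noise-like) one and a peaked (tone-like) one.
flat = spectral_summary([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0])
peaked = spectral_summary([1.0, 2.0, 3.0, 4.0], [0.01, 0.01, 10.0, 0.01])
```

In this sketch the flatness is returned as a plain ratio between 0 and 1 rather than in decibels; values near 1 indicate a noise-like spectrum and values near 0 a tone-like one.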

Acknowledgements

Gender and Age Audio Data Source: https://commonvoice.mozilla.org/en
Emotion Audio Data Source: https://smartlaboratory.org/ravdess/
House price to workplace-based earnings ratio
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Mar 24, 2025
Cite
Office for National Statistics (2025). House price to workplace-based earnings ratio [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/housing/datasets/ratioofhousepricetoworkplacebasedearningslowerquartileandmedian
Available download formats: xlsx
Dataset updated
Mar 24, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Affordability ratios calculated by dividing house prices by gross annual workplace-based earnings. Based on the median and lower quartiles of both house prices and earnings in England and Wales.
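
The affordability ratio described above is a simple division of a house-price statistic by the matching earnings statistic. A minimal sketch follows, using hypothetical figures that are not taken from the ONS tables; the function name and the sample values are assumptions for illustration only.

```python
import statistics


def affordability_ratio(house_price, annual_earnings):
    """House price divided by gross annual earnings."""
    if annual_earnings <= 0:
        raise ValueError("annual earnings must be positive")
    return house_price / annual_earnings


# Hypothetical figures for one area.
house_prices = [150_000, 200_000, 250_000, 300_000, 400_000]
annual_earnings = [20_000, 25_000, 30_000, 35_000, 50_000]

# Median-based ratio: median price over median earnings.
median_ratio = affordability_ratio(
    statistics.median(house_prices), statistics.median(annual_earnings)
)

# Lower-quartile ratio: first quartile of prices over first quartile of earnings.
lower_quartile_ratio = affordability_ratio(
    statistics.quantiles(house_prices, n=4)[0],
    statistics.quantiles(annual_earnings, n=4)[0],
)
```

A higher ratio means a house costs more years of earnings, so it indicates lower affordability.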
ABS - Index of Household Advantage and Disadvantage (IHAD) (SA2) 2016
researchdata.edu.au
Updated Jun 28, 2023
Cite
Government of the Commonwealth of Australia - Australian Bureau of Statistics (2023). ABS - Index of Household Advantage and Disadvantage (IHAD) (SA2) 2016 [Dataset]. https://researchdata.edu.au/abs-index-household-sa2-2016/2748282
Dataset updated
Jun 28, 2023
Dataset provided by
Australian Urban Research Infrastructure Network (AURIN)
Authors
Government of the Commonwealth of Australia - Australian Bureau of Statistics
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset presents information from 2016 at the household level; the percentage of households within each Index of Household Advantage and Disadvantage (IHAD) quartile for Statistical Area Level 2 (SA2) 2016 boundaries.

The IHAD is an experimental analytical index developed by the Australian Bureau of Statistics (ABS) that provides a summary measure of relative socio-economic advantage and disadvantage for households. It utilises information from the 2016 Census of Population and Housing.

IHAD quartiles: All households are ordered from lowest to highest disadvantage. The lowest 25% of households are given a quartile number of 1, the next lowest 25% are given a quartile number of 2, and so on, up to the highest 25% of households, which are given a quartile number of 4. This means that households are divided into four groups, depending on their score.

This data is ABS data (catalogue number: 4198.0) used with permission from the Australian Bureau of Statistics.

For more information please visit the Australian Bureau of Statistics.

Please note:

AURIN has generated this dataset through aggregating the original SA1 level data (with calculated number of households/quartile) to SA2 level.

The number of occupied private dwellings, and number of households in each of the IHAD quartiles for each SA2 were calculated by aggregating the values of each of those specified columns from the SA1 dataset. Percentages of households in each of the IHAD quartiles were calculated for each SA2 from these aggregated totals.

A household is defined as one or more persons, at least one of whom is at least 15 years of age, usually resident in the same private dwelling. All occupants of a dwelling form a household. For Census purposes, the total number of households is equal to the total number of occupied private dwellings (Census of Population and Housing: Census Dictionary, 2016 cat. no. 2901.0).

IHAD output has been confidentialised to meet ABS requirements. In line with standard ABS procedures to minimise the risk of identifying individuals, a technique has been applied to randomly adjust cell values of the output tables. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals.
House price to residence-based earnings ratio
ons.gov.uk
cloud.csiss.gmu.edu
+2 more
xlsx
Updated Mar 24, 2025
Cite
Office for National Statistics (2025). House price to residence-based earnings ratio [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/housing/datasets/ratioofhousepricetoresidencebasedearningslowerquartileandmedian
Available download formats: xlsx
Dataset updated
Mar 24, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Affordability ratios calculated by dividing house prices by gross annual residence-based earnings. Based on the median and lower quartiles of both house prices and earnings in England and Wales.
Lettuce Growth Days Analysis
kaggle.com
Updated Oct 15, 2024
Cite
Jurijs Ručko (2024). Lettuce Growth Days Analysis [Dataset]. https://www.kaggle.com/datasets/jurijsruko/lettuce/discussion?sort=undefined
Dataset updated
Oct 15, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Jurijs Ručko
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
The project aims to investigate the relationship between temperature, humidity, TDS value, pH level, and growth days to understand how these factors influence lettuce growth. The project also aims to calculate summary statistics such as mean, median, quartiles, and min/max values for each variable to gain insights into the distribution of the data.

Plant Identifier (Plant_ID): A distinctive identifier assigned to each individual plant.

Date: The timestamp of the observation, marking key milestones in the growth process.

Temperature (°C): The recorded temperature in degrees Celsius, a pivotal environmental variable.

Humidity (%): The percentage representing the humidity level, influencing the plant’s water uptake.

Total Dissolved Solids (TDS) Value (ppm): A measurement of dissolved solids in parts per million, reflecting nutrient availability.

pH Level: The environmental pH level, a crucial factor impacting nutrient absorption.

Growth Days: The duration, in days, from the initial growth stage to the plant’s full maturity.
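
The summary statistics mentioned in the description (mean, median, quartiles, and min/max for each variable) can be computed with Python's standard library. This is a minimal sketch; the function name and the sample values are hypothetical and do not come from the dataset.

```python
import statistics


def summarize(values):
    """Return mean, median, quartiles, and min/max for one numeric variable."""
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }


# Hypothetical Growth Days values for five plants.
growth_days = [45, 47, 48, 46, 50]
summary = summarize(growth_days)
```

The same function could be applied to each of the other numeric columns (temperature, humidity, TDS value, pH level) in turn.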
Descriptive statistics of the 2 datasets with mean, standard deviation (SD),...
plos.figshare.com
xls
Updated Jun 18, 2023
Cite
Achim Langenbucher; Nóra Szentmáry; Alan Cayless; Jascha Wendelstein; Peter Hoffmann (2023). Descriptive statistics of the 2 datasets with mean, standard deviation (SD), median, the lower (quantile 2.5%) and upper (quantile 97.5%) boundary of the 95% confidence interval, and the interquartile range IQR (quartile 75%—quartile 25%). [Dataset]. http://doi.org/10.1371/journal.pone.0282213.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0282213.t001
Dataset updated
Jun 18, 2023
Dataset provided by
PLOS ONE
Authors
Achim Langenbucher; Nóra Szentmáry; Alan Cayless; Jascha Wendelstein; Peter Hoffmann
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
AL refers to the axial length, CCT to the central corneal thickness, ACD to the external phakic anterior chamber depth measured from the corneal front apex to the front apex of the crystalline lens, LT to the central thickness of the crystalline lens, R1 and R2 to the corneal radii of curvature for the flat and steep meridians, Rmean to the average of R1 and R2, PIOL to the refractive power of the intraocular lens implant, and SEQ to the spherical equivalent power achieved 5 to 12 weeks after cataract surgery.
2016 Census of Canada - Housing Suitability and Shelter-cost-to-income Ratio...
open.library.ubc.ca
borealisdata.ca
Updated Feb 25, 2020
Cite
Statistics Canada (2020). 2016 Census of Canada - Housing Suitability and Shelter-cost-to-income Ratio by Status of Primary Household Maintainer for BC CSDs [custom tabulation] [Dataset]. http://doi.org/10.14288/1.0388705
Unique identifier
https://doi.org/10.14288/1.0388705
Dataset updated
Feb 25, 2020
Authors
Statistics Canada
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Time period covered
Dec 31, 2016
Area covered
British Columbia
Description
This dataset includes one table that was custom ordered from Statistics Canada. The table includes information on housing suitability and shelter-cost-to-income ratio by number of bedrooms, housing tenure, status of primary household maintainer, household type, and income quartile ranges for census subdivisions in British Columbia.

The dataset is in Beyond 20/20 (.ivt) format. The Beyond 20/20 browser is required in order to open it. This software can be freely downloaded from the Statistics Canada website:
https://www.statcan.gc.ca/eng/public/beyond20-20 (Windows only).
For information on how to use Beyond 20/20, please see:
http://odesi2.scholarsportal.info/documentation/Beyond2020/beyond20-quickstart.pdf
https://wiki.ubc.ca/Library:Beyond_20/20_Guide

Custom order from Statistics Canada includes the following dimensions and variables:

Geography:
Non-reserve CSDs in British Columbia - 299 geographies
The global non-response rate (GNR) is an important measure of census data quality. It combines total non-response (households) and partial non-response (questions). A lower GNR indicates a lower risk of non-response bias and, as a result, a lower risk of inaccuracy. The counts and estimates for geographic areas with a GNR equal to or greater than 50% are not published in the standard products. The counts and estimates for these areas have a high risk of non-response bias, and in most cases, should not be released. All the geographies requested for this tabulation have been cleared for the release of income data and have a GNR under 50%.

Housing Tenure Including Presence of Mortgage (5)
1. Total – Private non-band non-farm off-reserve households with an income greater than zero by housing tenure
2. Households who own
3. With a mortgage [note 1]
4. Without a mortgage
5. Households who rent
Note: 1) Presence of mortgage - Refers to whether the owner households reported mortgage or loan payments for their dwelling.

2015 Before-tax Household Income Quartile Ranges (5)
1. Total – Private households by quartile ranges [notes 1, 2, 3]
2. Count of households under or at quartile 1
3. Count of households between quartile 1 and quartile 2 (median) (including at quartile 2)
4. Count of households between quartile 2 (median) and quartile 3 (including at quartile 3)
5. Count of households over quartile 3
Notes: 1) A private household will be assigned to a quartile range depending on its CSD-level location and depending on its tenure (owned and rented). Quartile ranges for owned households in a specific CSD are delimited by the 2015 before-tax income quartiles of owned households with an income greater than zero and residing in non-farm off-reserve dwellings in that CSD. Quartile ranges for rented households in a specific CSD are delimited by the 2015 before-tax income quartiles of rented households with an income greater than zero and residing in non-farm off-reserve dwellings in that CSD.
2) For the income quartiles dollar values (the delimiters) please refer to Table 1.
3) Quartiles 1 to 3 are suppressed if the number of actual records used in the calculation (not rounded or weighted) is less than 16. For cases in which the renters’ quartiles or the owners’ quartiles (figures from Table 1) of a CSD are suppressed the CSD is assigned to a quartile range depending on the provincial renters’ or owners’ quartile figures.

Number of Bedrooms (Unit Size) (6)
1. Total – Private households by number of bedrooms [note 1]
2. 0 bedrooms (Bachelor/Studio)
3. 1 bedroom
4. 2 bedrooms
5. 3 bedrooms
6. 4 bedrooms
Note: 1) Dwellings with 5 bedrooms or more included in the total count only.

Housing Suitability (6)
1. Total - Housing suitability
2. Suitable
3. Not suitable
4. One bedroom shortfall
5. Two bedroom shortfall
6. Three or more bedroom shortfall
Note: 1) 'Housing suitability' refers to whether a private household is living in suitable accommodations according to the National Occupancy Standard (NOS); that is, whether the dwelling has enough bedrooms for the size and composition of the household. A household is deemed to be living in suitable accommodations if its dwelling has enough bedrooms, as calculated using the NOS.
'Housing suitability' assesses the required number of bedrooms for a household based on the age, sex, and relationships among household members. An alternative variable, 'persons per room,' considers all rooms in a private dwelling and the number of household members.
Housing suitability and the National Occupancy Standard (NOS) on which it is based were developed by Canada Mortgage and Housing Corporation (CMHC) through consultations with provincial housing agencies.

Shelter-cost-to-income-ratio (4)
1. Total – Private non-band non-farm off-reserve households with an income greater than zero
2. Spending less than 30% of households total income on shelter costs
3. Spending 30% or more of households total income on shelter costs
4. Spending 50% or more of households total income on shelter costs
Note: 'Shelter-cost-to-income ratio' refers to the proportion of average total income of household which is spent on shelter costs.
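
The shelter-cost-to-income categories listed above can be sketched as threshold checks on a single ratio. This is an illustration only: computing the ratio from annual income and monthly shelter costs is an assumption made here, and the function name and figures are hypothetical rather than Statistics Canada's method.

```python
def shelter_cost_flags(annual_income, monthly_shelter_cost):
    """Classify a household by the share of income spent on shelter costs."""
    if annual_income <= 0:
        # The tabulation only covers households with income greater than zero.
        raise ValueError("annual income must be greater than zero")
    ratio = 12 * monthly_shelter_cost / annual_income
    return {
        "ratio": ratio,
        "under_30": ratio < 0.30,
        "at_or_over_30": ratio >= 0.30,
        "at_or_over_50": ratio >= 0.50,  # subset of the 30%-or-more group
    }


# Hypothetical households.
moderate = shelter_cost_flags(annual_income=60_000, monthly_shelter_cost=1_000)
severe = shelter_cost_flags(annual_income=40_000, monthly_shelter_cost=2_000)
```

Note that the "50% or more" category is nested inside "30% or more", so the three spending categories do not sum to the total.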

Household Statistics (8)
1. Total – Private non-band non-farm off-reserve households with an income greater than zero [note 1]
2. Average household income in 2015 ($) [note 2]
3. Median household income in 2015 ($) [note 3]
4. Quartile 1 of household income in 2015 ($) [note 4]
5. Quartile 2 (median) of household income in 2015 ($) [note 4]
6. Quartile 3 of household income in 2015 ($) [note 4]
7. Average monthly shelter costs ($) [notes 2, 5]
8. Median monthly shelter costs ($) [notes 3, 5]
Notes: 1) All households statistics are calculated based on the distribution of private households in non-farm off-reserve non-band occupied private dwellings with a before-tax household income greater than zero.
2) The average is suppressed if the number of actual records used in the calculation (not rounded or weighted) is less than 4.
3) The median is suppressed if the number of actual records used in the calculation (not rounded or weighted) is less than 8.
4) Quartiles 1 to 3 are suppressed if the number of actual records used in the calculation (not rounded or weighted) is less than 16.
5) Shelter costs for owner households include, where applicable, mortgage payments, property taxes and condominium fees, along with the costs of electricity, heat, water and other municipal services. For renter households, shelter costs include, where applicable, the rent and the costs of electricity, heat, water and other municipal services.
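
The suppression thresholds in notes 2 to 4 (fewer than 4 records for the average, 8 for the median, 16 for the quartiles) can be expressed as a small sketch. The function name and the use of None for suppressed values are assumptions for illustration; the actual quartile method used by Statistics Canada is not stated in the source.

```python
import statistics

# Minimum number of unrounded, unweighted records required per statistic.
MIN_RECORDS = {"average": 4, "median": 8, "quartiles": 16}


def household_statistics(incomes):
    """Compute income statistics, returning None where a value would be suppressed."""
    n = len(incomes)
    return {
        "average": statistics.mean(incomes) if n >= MIN_RECORDS["average"] else None,
        "median": statistics.median(incomes) if n >= MIN_RECORDS["median"] else None,
        "quartiles": (
            statistics.quantiles(incomes, n=4)
            if n >= MIN_RECORDS["quartiles"]
            else None
        ),
    }


# Hypothetical CSD with only 8 household records: the quartiles are suppressed.
small_csd = household_statistics([30_000 + 1_000 * i for i in range(8)])
```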

Status of Primary Household Maintainer (11)
1. Total – Private households by Aboriginal identity of the primary household maintainer
2. PHM is Aboriginal [note 2]
3. PHM is not Aboriginal
4. Total – Private households by immigration status of the primary household maintainer
5. PHM is a non-immigrant [note 3]
6. PHM is an immigrant or a non-permanent resident
7. PHM is a non-permanent resident [note 4]
8. PHM is an immigrant [notes 5, 6]
9. Officially landed in Canada between 2011 and 2016 [note 7]
10. Officially landed in Canada between 2006 and 2010
11. Officially landed in Canada before 2006

Notes: 1) The Primary Household Maintainer is the first person in the household identified as someone who pays the rent or the mortgage, or the taxes, or the electricity bill, and so on, for the dwelling.
In the case of a household where two or more people are listed as household maintainers, the first person listed is chosen as the primary household maintainer.
2) 'Aboriginal identity' includes persons who are First Nations (North American Indian), Métis or Inuk (Inuit) and/or those who are Registered or Treaty Indians (that is, registered under the Indian Act of Canada) and/or those who have membership in a First Nation or Indian band. Aboriginal peoples of Canada are defined in the Constitution Act, 1982, section 35 (2) as including the Indian, Inuit and Métis peoples of Canada.
3) 'Non-immigrants' includes persons who are Canadian citizens by birth.
4) 'Non-permanent residents' includes persons from another country who have a work or study permit or who are refugee claimants, and their family members sharing the same permit and living in Canada with them.
5) 'Immigrants' includes persons who are, or who have ever been, landed immigrants or permanent residents. Such persons have been granted the right to live in Canada permanently by immigration authorities. Immigrants who have obtained Canadian citizenship by naturalization are included in this category. In the 2016 Census of Population, 'Immigrants' includes immigrants who landed in Canada on or prior to May 10, 2016.
6) Immigrants may not have a complete year of applicable income. The income data for the 2016 Census of Population are for the year 2015.
7) Includes immigrants who landed in Canada on or prior to May 10, 2016.

Original file name: CRO0163850_CT.5 (BC_Cultural).ivt
Long Covid Risk
figshare.com
txt
Updated Apr 13, 2024
Cite
Ahmed Shaheen (2024). Long Covid Risk [Dataset]. http://doi.org/10.6084/m9.figshare.25599591.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.25599591.v1
Dataset updated
Apr 13, 2024
Dataset provided by
figshare
Authors
Ahmed Shaheen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Feature preparation

Preprocessing was applied to the data, including creating dummy variables and performing transformations (centering, scaling, Yeo-Johnson) using the preProcess() function from the “caret” package in R. The correlation among the variables was examined and no serious multicollinearity problems were found. A stepwise variable selection was performed using a logistic regression model. The final set of variables included:

Demographic: age, body mass index, sex, ethnicity, smoking.
History of disease: heart disease, migraine, insomnia, gastrointestinal disease.
COVID-19 history: COVID-19 vaccination, rashes, conjunctivitis, shortness of breath, chest pain, cough, runny nose, dysgeusia, muscle and joint pain, fatigue, fever, COVID-19 reinfection, and ICU admission.

These variables were used to train and test various machine-learning models.

Model selection and training

The data was randomly split into 80% training and 20% testing subsets. The “h2o” package in R version 4.3.1 was employed to implement different algorithms. AutoML was used first, which automatically explored a range of models with different configurations. Gradient Boosting Machines (GBM), Random Forest (RF), and Regularized Generalized Linear Model (GLM) were identified as the best-performing models on our data, and their parameters were fine-tuned. An ensemble method that stacked different models together was also used, as stacking can sometimes improve accuracy. The models were evaluated using the area under the curve (AUC) and C-statistics as diagnostic measures. The model with the highest AUC was selected for further analysis using the confusion matrix, accuracy, sensitivity, specificity, and F1 and F2 scores. The optimal prediction threshold was determined by plotting the sensitivity, specificity, and accuracy and choosing the point of intersection, as it balanced the trade-off between the three metrics.

The model’s predictions were also plotted, and the quantile ranges were used to classify the model’s prediction as follows: > 1st quantile, > 2nd quantile, > 3rd quartile and < 3rd quartile (very low, low, moderate, high) respectively.

Metric: Formula
C-statistics: (TPR + TNR - 1) / 2
Sensitivity/Recall: TP / (TP + FN)
Specificity: TN / (TN + FP)
Accuracy: (TP + TN) / (TP + TN + FP + FN)
F1 score: 2 * (precision * recall) / (precision + recall)

Model interpretation

We used the variable importance plot, which measures how much each variable contributes to the predictive power of a machine learning model. In the H2O package, variable importance for GBM and RF is calculated by measuring the decrease in the model's error when a variable is split on. The more a variable's split decreases the error, the more important that variable is considered to be. The error is calculated using the formula SE = MSE * N = VAR * N, and the result is then scaled between 0 and 1 and plotted.

We also used the SHAP summary plot, a graphical tool for visualizing the impact of input features on the prediction of a machine learning model. SHAP stands for SHapley Additive exPlanations, a method that calculates the contribution of each feature to the prediction by averaging over all possible subsets of features [28]. The SHAP summary plot shows the distribution of the SHAP values for each feature across the data instances. We use the h2o.shap_summary_plot() function in R to generate the SHAP summary plot for our GBM model. We pass the model object and the test data as arguments, and optionally specify the columns (features) we want to include in the plot. The plot shows the SHAP values for each feature on the x-axis, and the features on the y-axis. The color indicates whether the feature value is low (blue) or high (red). The plot also shows the distribution of the feature values as a density plot on the right.
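As a rough illustration of the confusion-matrix metrics listed above, the following Python sketch computes sensitivity, specificity, accuracy, and the F1 and F2 scores from raw counts. It is not the authors' code (the description says they worked in R with h2o); the `ConfusionMatrix` class, the function names, and the example counts are all hypothetical, and the F2 score uses the standard F-beta definition, which the description does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Hypothetical container for binary-classification counts."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def sensitivity(cm: ConfusionMatrix) -> float:
    # Sensitivity/Recall = TP / (TP + FN)
    return cm.tp / (cm.tp + cm.fn)


def specificity(cm: ConfusionMatrix) -> float:
    # Specificity = TN / (TN + FP)
    return cm.tn / (cm.tn + cm.fp)


def accuracy(cm: ConfusionMatrix) -> float:
    # Accuracy = (TP + TN) / (TP + TN + FP + FN)
    return (cm.tp + cm.tn) / (cm.tp + cm.tn + cm.fp + cm.fn)


def precision(cm: ConfusionMatrix) -> float:
    # Precision = TP / (TP + FP)
    return cm.tp / (cm.tp + cm.fp)


def f_beta(cm: ConfusionMatrix, beta: float = 1.0) -> float:
    # Standard F-beta score; beta=1 gives F1, beta=2 gives F2.
    # Assumes precision and recall are not both zero.
    p, r = precision(cm), sensitivity(cm)
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# Made-up counts, for illustration only.
example = ConfusionMatrix(tp=40, fp=10, tn=45, fn=5)
print(f"sensitivity={sensitivity(example):.3f}")
print(f"specificity={specificity(example):.3f}")
print(f"accuracy={accuracy(example):.3f}")
print(f"F1={f_beta(example, 1.0):.3f}  F2={f_beta(example, 2.0):.3f}")
```

With beta = 2, recall is weighted more heavily than precision, which is a common reason to report F2 alongside F1 when missed positive cases are the costlier error.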
Adult age and gender adjusted associations of milk intake (breastfed vs....
plos.figshare.com
figshare.com
xls
Updated May 30, 2023
Dylan M. Williams; Richard M. Martin; George Davey Smith; K. G. M. M. Alberti; Yoav Ben-Shlomo; Anne McCarthy (2023). Adult age and gender adjusted associations of milk intake (breastfed vs. formula/cows' milk fed, or by quartile of formula/cows' milk consumed) at 10 days, 6 weeks and 3 months during infancy of subjects included in final analysis (mean and (95% CI), unless stated as %). [Dataset]. http://doi.org/10.1371/journal.pone.0034161.t002
Explore at:
xls (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pone.0034161.t002
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Dylan M. Williams; Richard M. Martin; George Davey Smith; K. G. M. M. Alberti; Yoav Ben-Shlomo; Anne McCarthy
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
P for trend was calculated across quartiles of formula/cows' milk intake.
Baseline characteristics of the study population according to CMI quartiles....
figshare.com
plos.figshare.com
xls
Updated Feb 25, 2025
Qiming Xu; Junyan Lin; Lin Liao; Jing Hu; Jianrao Lu (2025). Baseline characteristics of the study population according to CMI quartiles. [Dataset]. http://doi.org/10.1371/journal.pone.0318736.t002
Explore at:
xls (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pone.0318736.t002
Dataset updated
Feb 25, 2025
Dataset provided by
PLOS ONE
Authors
Qiming Xu; Junyan Lin; Lin Liao; Jing Hu; Jianrao Lu
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Baseline characteristics of the study population according to CMI quartiles.
Quartile values of the number of tandem repeats at each locus in each major...
plos.figshare.com
xls
Updated Jun 21, 2023
Shinichiro Hirai; Eiji Yokoyama; Naoshi Ando; Junji Seto; Kyoko Hazama; Keigo Enomoto; Hidemasa Izumiya; Yukihiro Akeda; Makoto Ohnishi (2023). Quartile values of the number of tandem repeats at each locus in each major clade using all strains isolated in Chiba prefecture a. [Dataset]. http://doi.org/10.1371/journal.pone.0283684.t002
Explore at:
xls (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pone.0283684.t002
Dataset updated
Jun 21, 2023
Dataset provided by
PLOS ONE
Authors
Shinichiro Hirai; Eiji Yokoyama; Naoshi Ando; Junji Seto; Kyoko Hazama; Keigo Enomoto; Hidemasa Izumiya; Yukihiro Akeda; Makoto Ohnishi
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Chiba
Description
Quartile values of the number of tandem repeats at each locus in each major clade using all strains isolated in Chiba prefecture a.
Abir (2023). Journal Ranking Dataset [Dataset]. https://www.kaggle.com/datasets/xabirhasan/journal-ranking-dataset

Data from: Journal Ranking Dataset

A dataset of journal ranking based on Scimago, Web of Science, and Scopus.

Dataset updated

Aug 15, 2023

Dataset provided by

Kaggle

Authors

Abir

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Journals & Ranking

An academic journal or research journal is a periodical publication in which research articles relating to a particular academic discipline are published, according to Wikipedia. Currently, there are more than 25,000 peer-reviewed journals that are indexed in citation index databases such as Scopus and Web of Science. These journals are ranked on the basis of various metrics such as CiteScore, H-index, etc. The metrics are calculated from the yearly citation data of the journal. Considerable effort goes into devising a metric that reflects a journal's quality.

Journal Ranking Dataset

This is a comprehensive dataset on academic journals, covering their metadata as well as citation, metrics, and ranking information. Detailed data on their subject area is also given in this dataset. The dataset is collected from the following indexing databases:
- Scimago Journal Ranking
- Scopus
- Web of Science Master Journal List

The data was collected by scraping and then cleaned; details can be found HERE.

Key Features

Rank: Overall rank of journal (derived from sorted SJR index).
Title: Name or title of journal.
OA: Open Access or not.
Country: Country of origin.
SJR-index: A citation index calculated by Scimago.
CiteScore: A citation index calculated by Scopus.
H-index: Hirsch index, the largest number h such that at least h articles in that journal were cited at least h times each.
Best Quartile: Top Q-index or quartile a journal has in any subject area.
Best Categories: Subject areas with top quartile.
Best Subject Area: Highest ranking subject area.
Best Subject Rank: Rank of the highest ranking subject area.
Total Docs.: Total number of documents of the journal.
Total Docs. 3y: Total number of documents in the past 3 years.
Total Refs.: Total number of references of the journal.
Total Cites 3y: Total number of citations in the past 3 years.
Citable Docs. 3y: Total number of citable documents in the past 3 years.
Cites/Doc. 2y: Total number of citations divided by the total number of documents in the past 2 years.
Refs./Doc.: Total number of references divided by the total number of documents.
Publisher: Name of the publisher company of the journal.
Core Collection: Web of Science core collection name.
Coverage: Starting year of coverage.
Active: Active or inactive.
In-Press: Articles in press or not.
ISO Language Code: Three-letter ISO 639 code for language.
ASJC Codes: All Science Journal Classification codes for the journal.

The rest of the features provide further details on the journal's subject area or category:
Life Sciences: Top level subject area.
Social Sciences: Top level subject area.
Physical Sciences: Top level subject area.
Health Sciences: Top level subject area.
1000 General: ASJC main category.
1100 Agricultural and Biological Sciences: ASJC main category.
1200 Arts and Humanities: ASJC main category.
1300 Biochemistry, Genetics and Molecular Biology: ASJC main category.
1400 Business, Management and Accounting: ASJC main category.
1500 Chemical Engineering: ASJC main category.
1600 Chemistry: ASJC main category.
1700 Computer Science: ASJC main category.
1800 Decision Sciences: ASJC main category.
1900 Earth and Planetary Sciences: ASJC main category.
2000 Economics, Econometrics and Finance: ASJC main category.
2100 Energy: ASJC main category.
2200 Engineering: ASJC main category.
2300 Environmental Science: ASJC main category.
2400 Immunology and Microbiology: ASJC main category.
2500 Materials Science: ASJC main category.
2600 Mathematics: ASJC main category.
2700 Medicine: ASJC main category.
2800 Neuroscience: ASJC main category.
2900 Nursing: ASJC main category.
3000 Pharmacology, Toxicology and Pharmaceutics: ASJC main category.
3100 Physics and Astronomy: ASJC main category.
3200 Psychology: ASJC main category.
3300 Social Sciences: ASJC main category.
3400 Veterinary: ASJC main category.
3500 Dentistry: ASJC main category.
3600 Health Professions: ASJC main category.
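
To make the H-index definition in the Key Features list concrete, here is a minimal Python sketch that computes it from a list of per-article citation counts. The function name `h_index` and the sample counts are hypothetical; the dataset itself ships the precomputed H-index column, so this only illustrates the definition rather than how Scimago or the dataset author derived the values.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that at least h articles have at least h citations each."""
    # Rank articles from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            # The top `rank` articles all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Made-up citation counts for five articles of an imaginary journal.
print(h_index([10, 8, 5, 4, 3]))
```

In this example four articles have at least four citations each, but there are not five articles with at least five citations, so the function returns 4.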
