100+ datasets found
  1. File S1 - Evaluation of Bias-Variance Trade-Off for Commonly Used...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated May 31, 2023
    Cite
    Xing Qiu; Rui Hu; Zhixin Wu (2023). File S1 - Evaluation of Bias-Variance Trade-Off for Commonly Used Post-Summarizing Normalization Procedures in Large-Scale Gene Expression Studies [Dataset]. http://doi.org/10.1371/journal.pone.0099380.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Xing Qiu; Rui Hu; Zhixin Wu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supporting tables and figures. Table S1. The impact of different effect sizes on gene selection strategies when the sample size is fixed and relatively small. Mean (STD) of true positives computed from SIMU1 with 20 repetitions are reported. Sample size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05. Table S2. The impact of different effect sizes on gene selection strategies when the sample size is fixed and relatively small. Mean (STD) of false positives computed from SIMU1 with 20 repetitions are reported. Sample size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05. Table S3. The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively small. Mean (STD) of true positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05. Table S4. The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively small. Mean (STD) of false positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05. Table S5. The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively large. Mean (STD) of true positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05. Table S6. 
The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively large. Mean (STD) of false positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05. Table S7. The impact of different sample sizes on gene selection strategies with simulation based on biological data. Mean (STD) of true positives computed from SIMU-BIO with 20 repetitions are reported. Total number of genes: 9005. Number of permutations for Nstat: 100000. The significance threshold: 0.05. Table S8. The impact of different sample sizes on gene selection strategies with simulation based on biological data. Mean (STD) of false positives computed from SIMU-BIO with 20 repetitions are reported. Total number of genes: 9005. Number of permutations for Nstat: 100000. The significance threshold: 0.05. Table S9. The numbers of differentially expressed genes detected by different selection strategies. Total number of genes: 9005. Number of permutations for Nstat: 100000. The significance threshold: 0.05. Figure S1. Histogram of pairwise Pearson correlation coefficients between genes computed from HYPERDIP without normalization. Number of genes: 9005. Number of arrays: 88. (PDF)

  2. Data from: Average salary

    • froghire.ai
    Updated Apr 6, 2025
    + more versions
    Cite
    FrogHire.ai (2025). Average salary [Dataset]. https://www.froghire.ai/major/Math%20And%20Statistics
    Explore at:
    Dataset updated
    Apr 6, 2025
    Dataset provided by
    FrogHire.ai
    Description

    Explore the progression of average salaries for graduates in Math And Statistics from 2020 to 2023 through this detailed chart. It compares these figures against the national average for all graduates, offering a comprehensive look at the earning potential of Math And Statistics relative to other fields. This data is essential for students assessing the return on investment of their education in Math And Statistics, providing a clear picture of financial prospects post-graduation.

  3. Spain PISA math scores - data, chart | TheGlobalEconomy.com

    • theglobaleconomy.com
    csv, excel, xml
    Updated Jan 6, 2025
    Cite
    Globalen LLC (2025). Spain PISA math scores - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/Spain/pisa_math_scores/
    Explore at:
    Available download formats: excel, xml, csv
    Dataset updated
    Jan 6, 2025
    Dataset authored and provided by
    Globalen LLC
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 2003 - Dec 31, 2022
    Area covered
    Spain
    Description

    Spain: PISA math scores: The latest value from 2022 is 473.14 index points, a decline from 481.393 index points in 2018. In comparison, the world average is 439.569 index points, based on data from 78 countries. Historically, the average for Spain from 2003 to 2022 is 481.893 index points. The minimum value, 473.14 index points, was reached in 2022 while the maximum of 485.843 index points was recorded in 2015.

  4. Data from: Average salary

    • froghire.ai
    Updated Apr 3, 2025
    + more versions
    Cite
    FrogHire.ai (2025). Average salary [Dataset]. https://www.froghire.ai/major/Applied%20And%20Computational%20Math%20And%20Statistics
    Explore at:
    Dataset updated
    Apr 3, 2025
    Dataset provided by
    FrogHire.ai
    Description

    Explore the progression of average salaries for graduates in Applied And Computational Math And Statistics from 2020 to 2023 through this detailed chart. It compares these figures against the national average for all graduates, offering a comprehensive look at the earning potential of Applied And Computational Math And Statistics relative to other fields. This data is essential for students assessing the return on investment of their education in Applied And Computational Math And Statistics, providing a clear picture of financial prospects post-graduation.

  5. Saudi Arabia Phone Number Data

    • listtodata.com
    .csv, .xls, .txt
    Updated Jul 17, 2025
    Cite
    List to Data (2025). Saudi Arabia Phone Number Data [Dataset]. https://listtodata.com/saudi-arabia-number-data
    Explore at:
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Jul 17, 2025
    Dataset authored and provided by
    List to Data
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Area covered
    Saudi Arabia
    Variables measured
    phone numbers, Email Address, full name, Address, City, State, gender, age, income, ip address
    Description

    Saudi Arabia phone number data is another important collection of phone numbers. These numbers come from trusted sources. We carefully check every number. This means you only get real numbers from reliable places. Furthermore, this data includes source URLs. You can use these URLs to find out where the numbers came from. This adds transparency to the data. If you have questions, you can get help anytime. Support is available 24/7. Moreover, the phone data has an opt-in feature. With customer support always on hand to help, you can feel confident using this data. Saudi Arabia number data is a special collection of phone numbers. Besides, this list includes numbers from people living in Saudi Arabia. Each number in this database has verification for accuracy. If you ever find a number that does not work, there is a replacement guarantee. This means any invalid number gets replaced with a valid one at no extra cost. The data comes from people who have given permission. Thus, this respect for privacy makes it a great tool for businesses. At List to Data, we help you find important phone numbers easily and quickly.

  6. Analysis of biological data.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Tomasz Zielinski; Anne M. Moore; Eilidh Troup; Karen J. Halliday; Andrew J. Millar (2023). Analysis of biological data. [Dataset]. http://doi.org/10.1371/journal.pone.0096462.t012
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Tomasz Zielinski; Anne M. Moore; Eilidh Troup; Karen J. Halliday; Andrew J. Millar
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Biological data were analysed with all 6 methods; the mean period value is reported in the table (standard deviation in brackets). The expected period is 24 h, as the clock is entrained by a 24 h light:dark cycle. 1) The data were collected in two different conditions, LD and SD, monitoring 5 output genes in each of them. 2) (All) represents aggregated results from all data sets. 3) NoCAT3 represents aggregated results from all data sets except the CAT3 marker. +) The cases for which the mean period is not statistically different from 24 h are marked with +.

  7. Vietnam No of Transaction: Check

    • ceicdata.com
    Updated May 18, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CEICdata.com (2020). Vietnam No of Transaction: Check [Dataset]. https://www.ceicdata.com/en/vietnam/domestic-transaction-means-of-liquidity
    Explore at:
    Dataset updated
    May 18, 2020
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2017 - Dec 1, 2019
    Area covered
    Vietnam
    Description

    No of Transaction: Check data was reported at 23,310.000 Unit in Dec 2019. This records an increase from the previous number of 20,747.000 Unit for Sep 2019. No of Transaction: Check data is updated quarterly, averaging 167,515.000 Unit from Mar 2013 (Median) to Dec 2019, with 28 observations. The data reached an all-time high of 250,046.000 Unit in Dec 2016 and a record low of 20,747.000 Unit in Sep 2019. No of Transaction: Check data remains active status in CEIC and is reported by State Bank of Vietnam. The data is categorized under Global Database’s Vietnam – Table VN.KA009: Domestic Transaction: Means of Liquidity. [COVID-19-IMPACT]

  8. China Industrial Enterprise: No of Employee: Average

    • ceicdata.com
    Updated Feb 15, 2025
    Cite
    CEICdata.com (2025). China Industrial Enterprise: No of Employee: Average [Dataset]. https://www.ceicdata.com/en/china/industrial-financial-data/industrial-enterprise-no-of-employee-average
    Explore at:
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2024 - Dec 1, 2024
    Area covered
    China
    Variables measured
    Economic Activity
    Description

    China Industrial Enterprise: Number of Employee: Average data was reported at 72,892.000 Person th in Mar 2025. This records an increase from the previous number of 72,243.000 Person th for Feb 2025. China Industrial Enterprise: Number of Employee: Average data is updated monthly, averaging 77,451.100 Person th from Dec 1992 (Median) to Mar 2025, with 186 observations. The data reached an all-time high of 99,772.100 Person th in Dec 2014 and a record low of 54,408.390 Person th in Dec 2001. China Industrial Enterprise: Number of Employee: Average data remains active status in CEIC and is reported by National Bureau of Statistics. The data is categorized under China Premium Database’s Industrial Sector – Table CN.BF: Industrial Financial Data.

  9. China PISA math scores - data, chart | TheGlobalEconomy.com

    • theglobaleconomy.com
    csv, excel, xml
    Updated Jul 16, 2024
    Cite
    Globalen LLC (2024). China PISA math scores - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/China/pisa_math_scores/
    Explore at:
    Available download formats: csv, excel, xml
    Dataset updated
    Jul 16, 2024
    Dataset authored and provided by
    Globalen LLC
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 2015
    Area covered
    China
    Description

    China: PISA math scores: The latest value from 2015 is 531.296 index points; no previous value is available for comparison. In comparison, the world average is 463.913 index points, based on data from 67 countries. Because 2015 is the only year covered, the historical average, minimum, and maximum for China are all 531.296 index points.

  10. "THE ROLE OF FIELD AND LABORATORY DATA AND MATHEMATICAL MODELS IN REDUCING...

    • data.wu.ac.at
    pdf
    Updated Sep 29, 2016
    Cite
    (2016). "THE ROLE OF FIELD AND LABORATORY DATA AND MATHEMATICAL MODELS IN REDUCING THE UNCERTAINTY OF ECONOMIC STUDIES" [Dataset]. https://data.wu.ac.at/odso/edx_netl_doe_gov/ODVmYmJmMWEtZjJlMS00ZGVlLWEzOTYtZWRmZTVlMzE5MmQ1
    Explore at:
    Available download formats: pdf (1393900.0)
    Dataset updated
    Sep 29, 2016
    Description

    "In order to complete an economic study of UCG, a preliminary design must be made for the process. The design and economic study both require the estimation of a considerable number of variables such as depth and thickness of the coal seam, well spacing, gas heating value and production rate, air injection requirements, percentage coal recovery, thermal efficiency of the process, and rate of advance of the gasification zone. Almost never will sufficient experimental data be available to determine all variables with confidence. Furthermore, not all of the variables cited are independent of each other. The purpose of this paper is to show how mathematical models and laboratory data can lead to a major reduction in uncertainties resulting from assumptions associated with economic analyses of UCG. An economic analysis is used to illustrate this method. Mathematical model calculations are used to establish the relationships between variables such as the gas heating value and thermal efficiency. The actual correlations are developed from operating data from the Hanna field tests, but model calculations provide the theoretical explanation for the shape and sensitivity of the experimental curves. Model calculations also allow confident interpolation and extrapolation of the experimental data. The end result is an economic analysis with improved accuracy and few assumptions. Finally, it is shown how economic studies can provide valuable feedback for an ongoing research program. Certain variables have yet to be fully determined, such as maximum well spacing, probable gas leakage due to subsidence, and long-term average gas heating value. Economic studies show which of these variables have the greatest economic impact. Those variables with maximum impact should receive the greatest emphasis in research."

  11. University SET data, with faculty and courses characteristics

    • openicpsr.org
    Updated Sep 12, 2021
    + more versions
    Cite
    Under blind review in refereed journal (2021). University SET data, with faculty and courses characteristics [Dataset]. http://doi.org/10.3886/E149801V1
    Explore at:
    Dataset updated
    Sep 12, 2021
    Authors
    Under blind review in refereed journal
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper explores a unique dataset of all the SET ratings provided by students of one university in Poland at the end of the winter semester of the 2020/2021 academic year. The SET questionnaire used by this university is presented in Appendix 1. The dataset is unique for several reasons. It covers all SET surveys filled by students in all fields and levels of study offered by the university. In the period analysed, the university was entirely in the online regime amid the Covid-19 pandemic. While the expected learning outcomes formally have not been changed, the online mode of study could have affected the grading policy and could have implications for some of the studied SET biases. This Covid-19 effect is captured by econometric models and discussed in the paper. The average SET scores were matched with the characteristics of the teacher for degree, seniority, gender, and SET scores in the past six semesters; the course characteristics for time of day, day of the week, course type, course breadth, class duration, and class size; the attributes of the SET survey responses as the percentage of students providing SET feedback; and the grades of the course for the mean, standard deviation, and percentage failed. Data on course grades are also available for the previous six semesters. This rich dataset allows many of the biases reported in the literature to be tested for and new hypotheses to be formulated, as presented in the introduction section. The unit of observation or the single row in the data set is identified by three parameters: teacher unique id (j), course unique id (k) and the question number in the SET questionnaire (n ϵ {1, 2, 3, 4, 5, 6, 7, 8, 9} ). It means that for each pair (j,k), we have nine rows, one for each SET survey question, or sometimes less when students did not answer one of the SET questions at all. 
    For example, the dependent variable SET_score_avg(j,k,n) for the triplet (j=John Smith, k=Calculus, n=2) is calculated as the average of all Likert-scale answers to question 2 in the SET survey distributed to all students who took the Calculus course taught by John Smith. The data set has 8,015 such observations or rows. The full list of variables or columns in the data set included in the analysis is presented in the attached file section. Their description refers to the triplet (teacher id = j, course id = k, question number = n). When the last value of the triplet (n) is dropped, it means that the variable takes the same values for all n ϵ {1, 2, 3, 4, 5, 6, 7, 8, 9}.

    Two attachments:
    - Word file with variables description
    - Rdata file with the data set (for R language)

    Appendix 1. The SET questionnaire used for this paper.

    Evaluation survey of the teaching staff of [university name]. Please complete the following evaluation form, which aims to assess the lecturer's performance. Only one answer should be indicated for each question. The answers are coded in the following way: 5 - I strongly agree; 4 - I agree; 3 - Neutral; 2 - I don't agree; 1 - I strongly don't agree.

    Questions (each answered on the scale 1 2 3 4 5):
    - I learnt a lot during the course.
    - I think that the knowledge acquired during the course is very useful.
    - The professor used activities to make the class more engaging.
    - If it was possible, I would enroll for the course conducted by this lecturer again.
    - The classes started on time.
    - The lecturer always used time efficiently.
    - The lecturer delivered the class content in an understandable and efficient way.
    - The lecturer was available when we had doubts.
    - The lecturer treated all students equally regardless of their race, background and ethnicity.
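    The averaging behind SET_score_avg(j,k,n) described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the tuple layout, and the sample answers are hypothetical and are not taken from the dataset's Rdata file.

```python
from collections import defaultdict
from statistics import mean


def set_score_avg(responses):
    """Average Likert answers per (teacher j, course k, question n) triplet.

    `responses` is an iterable of (teacher_id, course_id, question_no, answer)
    tuples; this layout is an assumption made for illustration only.
    """
    grouped = defaultdict(list)
    for teacher_id, course_id, question_no, answer in responses:
        grouped[(teacher_id, course_id, question_no)].append(answer)
    # One output row per triplet, mirroring the unit of observation.
    return {key: mean(answers) for key, answers in grouped.items()}


# Hypothetical answers on the 1-5 Likert scale (not real data).
sample = [
    ("John Smith", "Calculus", 2, 5),
    ("John Smith", "Calculus", 2, 4),
    ("John Smith", "Calculus", 2, 3),
    ("John Smith", "Calculus", 1, 4),
]
averages = set_score_avg(sample)
print(averages[("John Smith", "Calculus", 2)])
```

    Triplets for which no student answered a question simply do not appear in the result, which matches the note that some (j,k) pairs have fewer than nine rows.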

  12. Meta-Analysis Codebook of Final Articles with Effect Sizes, Sample Sizes,...

    • ldbase.org
    csv
    Updated Jul 7, 2021
    Cite
    Dr. Amy R. Napoli; Dr. Jamie M. Quinn; Dr. Sarah G. Wood; Mia C. Daucourt; Sara A. Hart (2021). Meta-Analysis Codebook of Final Articles with Effect Sizes, Sample Sizes, and Moderators [Dataset]. https://ldbase.org/datasets/421c34e2-7c32-4bf2-ae72-db5e30bae07a
    Explore at:
    Available download formats: csv
    Dataset updated
    Jul 7, 2021
    Authors
    Dr. Amy R. Napoli; Dr. Jamie M. Quinn; Dr. Sarah G. Wood; Mia C. Daucourt; Sara A. Hart
    License

    Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    The data are in long form, with some studies having multiple lines, and include a sample of children ranging from 3.54 to 13.75 years old. The main effect size is r, the correlation coefficient, and the accompanying sample size is also included. Each article is coded to include a study number, the article name, and its authors, as well as a set of moderators. The moderators are as follows:
    - grade_new2 = sample grade category, where 1 = preschool/kindergarten, 2 = secondary
    - HME_comp_new = HME component, where 1 = direct activities, 2 = indirect activities, 3 = combination direct and indirect activities, 4 = parent attitudes and/or beliefs, 5 = parent math expectations, 6 = spatial activities, 7 = math talk
    - hme_type_nocombo = HME measurement method, where 1 = frequency-based scale, 2 = rating scale, 3 = checklist, 4 = observation
    - obs_pr = two-level HME measurement method variable, where 1 = observation-based, 2 = parent-report
    - math_dom_nospat = math domain, where 1 = arithmetic operations, 2 = relations, 3 = numbering, 4 = multiple domains
    - symbolic_nonsymbolic = refers to math assessment, where 1 = symbolic, 2 = non-symbolic, 3 = combination symbolic and non-symbolic
    - timed_new = refers to math assessment, where 1 = timed, 2 = untimed, . = combination timed and untimed
    - composite = refers to math assessment, where 1 = composite, 2 = single math assessment
    - std_new = refers to math assessment, where 1 = standardized, 2 = unstandardized, . = combination standardized and unstandardized
    - hme_calc = hme calculation method, where 1 = latent factor score, 2 = sum score, 3 = single item
    - age = sample age in years
    - long_new = refers to effect size, where 1 = longitudinal relation, 2 = concurrent relation
    - low_SES = sample SES, where 1 = low SES (50% or more), 2 = average SES, 3 = high SES (50% or more)
    - parent_ed = sample SES in terms of parent education level, based on the percentage of parents reported to have completed any post-secondary education (included a vocational certification, attended some college, and/or completed an associate’s, bachelor’s, or graduate degree program). The percentage was converted into a decimal value ranging from .00 to 1.00.

  13. Data from: Average salary

    • froghire.ai
    Updated Apr 6, 2025
    + more versions
    Cite
    FrogHire.ai (2025). Average salary [Dataset]. https://www.froghire.ai/major/Math%20Of%20Finance
    Explore at:
    Dataset updated
    Apr 6, 2025
    Dataset provided by
    FrogHire.ai
    Description

    Explore the progression of average salaries for graduates in Math Of Finance from 2020 to 2023 through this detailed chart. It compares these figures against the national average for all graduates, offering a comprehensive look at the earning potential of Math Of Finance relative to other fields. This data is essential for students assessing the return on investment of their education in Math Of Finance, providing a clear picture of financial prospects post-graduation.

  14. San Marino PISA math scores - data, chart | TheGlobalEconomy.com

    • theglobaleconomy.com
    csv, excel, xml
    Updated Mar 29, 2024
    + more versions
    Cite
    Globalen LLC (2024). San Marino PISA math scores - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/San-Marino/pisa_math_scores/
    Explore at:
    Available download formats: csv, excel, xml
    Dataset updated
    Mar 29, 2024
    Dataset authored and provided by
    Globalen LLC
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    San Marino
    Description

    San Marino: PISA math scores: The latest value from is index points, unavailable from index points in . In comparison, the world average is 0.000 index points, based on data from countries. Historically, the average for San Marino from to is index points. The minimum value, index points, was reached in while the maximum of index points was recorded in .

  15. The Ultimate Film Statistics Dataset - for ML🏆🎬

    • kaggle.com
    Updated Jul 9, 2023
    Cite
    Alessandro Lo Bello (2023). The Ultimate Film Statistics Dataset - for ML🏆🎬 [Dataset]. https://www.kaggle.com/datasets/alessandrolobello/the-ultimate-film-statistics-dataset-for-ml/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 9, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alessandro Lo Bello
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description: This dataset provides comprehensive movie statistics compiled from multiple sources, including Wikipedia, The Numbers, and IMDb. It offers a rich collection of information and insights into various aspects of movies, such as movie titles, production dates, genres, runtime minutes, director information, average ratings, number of votes, approval index, production budgets, domestic gross earnings, and worldwide gross earnings.

    The dataset combines data scraped from Wikipedia, which includes details about movie titles, production dates, genres, runtime minutes, and director information, with data from The Numbers, a reliable source for box office statistics. Additionally, IMDb data is integrated to provide information on average ratings, number of votes, and other movie-related attributes.

    With this dataset, users can analyze and explore trends in the film industry, assess the financial success of movies, identify popular genres, and investigate the relationship between average ratings and box office performance. Researchers, movie enthusiasts, and data analysts can leverage this dataset for various purposes, including data visualization, predictive modeling, and deeper understanding of the movie landscape.

    Features:
    - Movie_title
    - Production_date
    - Genres
    - Runtime_minutes
    - Director_name (primaryName)
    - Director_professions (primaryProfession)
    - Director_birthYear
    - Director_deathYear
    - Movie_averageRating: the average rating given by online users for a particular movie
    - Movie_numberOfVotes: the number of votes given by online users for a particular movie
    - Approval_Index: a normalized indicator (on a 0-10 scale) calculated by multiplying the logarithm of the number of votes by the average user rating. It provides a concise measure of a movie's overall popularity and approval among online viewers, penalizing both films that got too few reviews and blockbusters that got too many.
    - Production_budget ($)
    - Domestic_gross ($)
    - Worldwide_gross ($)
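    A rough sketch of how an index like Approval_Index could be computed is given below. The description only states that the logarithm of the vote count is multiplied by the average rating and that the result is normalized to 0-10; the base-10 logarithm, the divide-by-maximum scaling, the function name, and the sample values are all assumptions made for illustration, not the dataset author's actual method.

```python
import math


def approval_index(ratings, votes):
    """Scale log10(votes) * rating to a 0-10 range across a list of movies.

    Assumptions (not stated in the dataset description): base-10 logarithm,
    vote counts of at least 1, and scaling by the largest raw score.
    """
    raw = [math.log10(v) * r for r, v in zip(ratings, votes)]
    top = max(raw)
    if top <= 0:
        # Every movie has a single vote or a zero rating: nothing to scale.
        return [0.0 for _ in raw]
    return [10 * x / top for x in raw]


# Hypothetical movies: (average rating, number of votes).
ratings = [8.0, 8.0, 6.0]
votes = [1_000, 10, 100_000]
scores = approval_index(ratings, votes)
print([round(s, 2) for s in scores])
```

    With these made-up inputs, the two movies rated 8.0 receive different scores because the logarithm rewards the one with more votes while damping very large vote counts.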

    Potential Applications:

    - Box office analysis: Analyze the relationship between production budgets, domestic and worldwide gross earnings, and profitability.
    - Genre analysis: Identify the most popular genres based on movie counts and analyze their performance.
    - Rating analysis: Explore the relationship between average ratings, number of votes, and financial success.
    - Director analysis: Investigate the impact of directors on movie ratings and financial performance.
    - Time-based analysis: Study movie trends over different production years and observe changes in production budgets, box office earnings, and genre preferences.

    By utilizing this dataset, users can gain valuable insights into the movie industry and uncover patterns that can inform decision-making, market research, and creative strategies.

  16. Netherlands PISA math scores - data, chart | TheGlobalEconomy.com

    • theglobaleconomy.com
    csv, excel, xml
    Updated Apr 5, 2024
    Cite
    Globalen LLC (2024). Netherlands PISA math scores - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/Netherlands/pisa_math_scores/
    Explore at:
    Available download formats: xml, excel, csv
    Dataset updated
    Apr 5, 2024
    Dataset authored and provided by
    Globalen LLC
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 2003 - Dec 31, 2022
    Area covered
    Netherlands
    Description

    The Netherlands: PISA math scores: The latest value from 2022 is 492.676 index points, a decline from 519.231 index points in 2018. In comparison, the world average is 439.569 index points, based on data from 78 countries. Historically, the average for the Netherlands from 2003 to 2022 is 520.206 index points. The minimum value, 492.676 index points, was reached in 2022 while the maximum of 537.823 index points was recorded in 2003.

  17. Thailand Average Monthly Revenue: Per Mobile Phone Number: Prepaid

    • ceicdata.com
    Updated Feb 15, 2025
    Cite
    CEICdata.com (2025). Thailand Average Monthly Revenue: Per Mobile Phone Number: Prepaid [Dataset]. https://www.ceicdata.com/en/thailand/telecommunication-statistics-office-of-the-national-broadcasting-and-telecommunications-commission/average-monthly-revenue-per-mobile-phone-number-prepaid
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2016 - Sep 1, 2019
    Area covered
    Thailand
    Variables measured
    Phone Statistics
    Description

    Thailand Average Monthly Revenue: Per Mobile Phone Number: Prepaid data was reported at 151.000 THB in Sep 2019, a decrease from the previous figure of 152.000 THB for Jun 2019. The data is updated quarterly, with a median of 152.000 THB from Mar 2014 to Sep 2019 across 23 observations. It reached an all-time high of 165.000 THB in Mar 2016 and a record low of 134.000 THB in Sep 2014. The series remains active in CEIC and is reported by the Office of The National Broadcasting and Telecommunications Commission. It is categorized under Global Database’s Thailand – Table TH.TB006: Telecommunication Statistics: Office of The National Broadcasting and Telecommunications Commission.

  18. Normalized First Street Census-Tract Data V1.3

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 17, 2024
    Cite
    First Street Foundation (2024). Normalized First Street Census-Tract Data V1.3 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5710939
    Dataset updated
    Jun 17, 2024
    Dataset authored and provided by
    First Street Foundation
    Description

    Normalized 2020 and 2050 First Street flood risk data aggregated at the census-tract level. A lower number indicates less risk (0 is minimum) and a higher number indicates more risk (1 is maximum). The normalization subtracts the overall mean from each tract's value and divides the difference by the standard deviation: (tract_value - overall mean) / stand_dev. The overall mean is the national average across all census tracts.
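
    The normalization formula in this description can be illustrated with a minimal sketch. The function name, the sample values, and the choice of the population standard deviation are assumptions made for illustration; the dataset description does not say which standard deviation estimator was used, nor whether any further rescaling was applied afterwards.

```python
from statistics import mean, pstdev


def normalize_tracts(tract_values):
    """Apply (tract_value - overall mean) / stand_dev to every value.

    Assumption: the population standard deviation (pstdev) is used;
    the dataset description does not specify the estimator.
    """
    overall_mean = mean(tract_values)
    stand_dev = pstdev(tract_values)
    if stand_dev == 0:
        raise ValueError("all tract values are identical; cannot normalize")
    return [(value - overall_mean) / stand_dev for value in tract_values]


# Hypothetical raw risk values for three census tracts.
raw_values = [2.0, 4.0, 6.0]
scores = normalize_tracts(raw_values)
print(scores)
```

    In this sketch the tract equal to the overall mean maps to zero, and the other tracts are expressed in units of standard deviations from that mean.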

    If you are interested in acquiring First Street flood data, you can request to access the data here. More information on First Street's flood risk statistics can be found here and information on First Street's hazards can be found here.

  19. LinkedIn Profile Data

    • kaggle.com
    Updated Mar 22, 2020
    Cite
    Om Ashish Mishra (2020). LinkedIn Profile Data [Dataset]. https://www.kaggle.com/datasets/omashish/linkedin-profile-data/data
    Dataset updated
    Mar 22, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Om Ashish Mishra
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    LinkedIn is a place for building connections and showcasing skills and achievements. This data was collected to help understand features such as promotions, regional patterns, and facial characteristics.

    Content

    The data consists of around 15000 profiles. It covers many features, such as region, the way the images were uploaded, the emotions shown in them, and the growth of users over time.

    The attributes are described below.

    User IDs are private and are not disclosed, although user characteristics are provided so that behavior patterns of people on LinkedIn can be studied.
    c id: An identifier for each record; it serves as the primary key.

    Profession Columns
    avg time in previous position: The amount of time, in years, spent in the previous position
    avg current position length: The average amount of time the user has been in the current position
    avg previous position length: The average amount of time the user was in the previous position
    m urn: The user id for each profile
    m urn id: The user id reduced to a distinct code
    no of promotions: Total number of times the user was promoted
    no of previous positions: The number of previous positions the user has held
    current position length: The number of months the person has been in the current position
    age: The age of the person
    gender: Male or Female
    ethnicity: The percentage of ethnicity
    n followers: Number of followers

    Image Clarity
    beauty: The index used for the analysis of beauty
    beauty female: Predicts whether the user image is more likely to be female
    beauty male: Predicts whether the user image is more likely to be male
    blur: The degree of shadiness of the image

    Emotion Captured
    emo anger: The percentage of anger found
    emo disgust: The percentage of disgust found
    emo fear: The percentage of fear found
    emo happiness: The percentage of happiness found
    emo neutral: The percentage of neutral
    emo sadness: The percentage of sadness
    emo surprise: The percentage of surprise

    Orientation & Facial Accessories
    glass: Whether the person is wearing glasses, sunglasses, or neither
    head pitch: The orientation of the head (up or down)
    head roll: The orientation of the head (sideways rolling; horizontal or vertical)
    head yaw: The orientation of the head (side facing; left or right)
    mouth close: The percentage of closed mouth
    mouth mask: The percentage of masked mouth
    mouth open: The percentage of open mouth
    mouth other: The percentage of other mouth things
    skin acne: The percentage of skin tone
    skin dark_circle: The percentage of dark circles on the skin
    skin health: The growth of the skin percentage
    skin stain: The stain percentage on the skin
    smile: The smile percentage

    Region Columns
    nationality: The nationality, followed by the percentage of each of the following:
    african
    celtic english
    east asian
    european
    greek
    hispanic
    jewish
    muslim
    nordic
    south asian

    face_quality: The quality of the face recognized.

    Acknowledgements

    We wouldn't be here without the help of Kagglers. If you owe any attributions or thanks, include them here along with any citations of past research.

    Inspiration

    Always wanted to contribute to the data science community and open up to questions.

  20. Telephone number for government agencies and municipalities

    • data.europa.eu
    unknown
    Updated Jan 31, 2022
    Cite
    (2022). Telephone number for government agencies and municipalities [Dataset]. https://data.europa.eu/data/datasets/https-data-norge-no-node-579?locale=en
    Available download formats: unknown
    Dataset updated
    Jan 31, 2022
    License

    https://data.norge.no/nlod/en/2.0/

    Description

    Data set of phone numbers for government agencies, municipalities and county authorities. It is intended to be used together with the data set of the units of public administration.

    This dataset is part of several data sets about public enterprises. The data sets are referred to as the agency base and were previously on Norge.no. They contain an overview of public enterprises, i.e. government agencies and enterprises’ central, regional and local units, county municipalities and municipalities. The data sets are not updated.

    The data sets contain information about the name of the enterprise, visiting address, postal address, telephone number, e-mail address, web address (URL), map coordinates (position), coverage (which municipalities the enterprise covers), organisation number, overarching activity, type of organisation, type of affiliation (the way in which an enterprise is linked to the executive government) and quality assessments of the website. Look up the keyword/tag agency base to see the other datasets.

    The agency base is closed and is no longer maintained by the Directorate of Digitalisation (formerly Difi). The datasets were last updated in January 2012. Note that this does not mean that all data was updated in January 2012, but that the last changes were made at that time.

    Reference to the source: When using this dataset, we ask that the source be referred to as follows (cf. the NLOD license): The service is based on open data sets from the Directorate of Digitalisation and is subject to the Norwegian License for Public Data (NLOD). The data was last updated in 2012 and is no longer maintained by the Directorate of Digitalisation.

Cite
Xing Qiu; Rui Hu; Zhixin Wu (2023). File S1 - Evaluation of Bias-Variance Trade-Off for Commonly Used Post-Summarizing Normalization Procedures in Large-Scale Gene Expression Studies [Dataset]. http://doi.org/10.1371/journal.pone.0099380.s001

File S1 - Evaluation of Bias-Variance Trade-Off for Commonly Used Post-Summarizing Normalization Procedures in Large-Scale Gene Expression Studies

Available download formats: pdf
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Xing Qiu; Rui Hu; Zhixin Wu
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Supporting tables and figures.

Table S1. The impact of different effect sizes on gene selection strategies when the sample size is fixed and relatively small. Mean (STD) of true positives computed from SIMU1 with 20 repetitions are reported. Sample size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05.

Table S2. The impact of different effect sizes on gene selection strategies when the sample size is fixed and relatively small. Mean (STD) of false positives computed from SIMU1 with 20 repetitions are reported. Sample size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05.

Table S3. The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively small. Mean (STD) of true positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05.

Table S4. The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively small. Mean (STD) of false positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05.

Table S5. The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively large. Mean (STD) of true positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05.

Table S6. The impact of different sample sizes on gene selection strategies when the effect size is fixed and relatively large. Mean (STD) of false positives computed from SIMU2 with 20 repetitions are reported. Effect size: . Total number of genes: 1000. Number of differentially expressed genes: 100. Number of permutations for Nstat: 10000. The significance threshold: 0.05.

Table S7. The impact of different sample sizes on gene selection strategies with simulation based on biological data. Mean (STD) of true positives computed from SIMU-BIO with 20 repetitions are reported. Total number of genes: 9005. Number of permutations for Nstat: 100000. The significance threshold: 0.05.

Table S8. The impact of different sample sizes on gene selection strategies with simulation based on biological data. Mean (STD) of false positives computed from SIMU-BIO with 20 repetitions are reported. Total number of genes: 9005. Number of permutations for Nstat: 100000. The significance threshold: 0.05.

Table S9. The numbers of differentially expressed genes detected by different selection strategies. Total number of genes: 9005. Number of permutations for Nstat: 100000. The significance threshold: 0.05.

Figure S1. Histogram of pairwise Pearson correlation coefficients between genes computed from HYPERDIP without normalization. Number of genes: 9005. Number of arrays: 88.

(PDF)
