100+ datasets found
  1. ‘Kaggle Datasets Ranking’ analyzed by Analyst-2

    • analyst-2.ai
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Kaggle Datasets Ranking’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-kaggle-datasets-ranking-2744/64eafea2/?iid=003-662&v=presentation
    Explore at:
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Kaggle Datasets Ranking’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/vivovinco/kaggle-datasets-ranking on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This dataset contains Kaggle ranking of datasets.

    Content

    800+ rows and 8 columns. Column descriptions are listed below.

    • Rank : Rank of the user
    • Tier : Grandmaster, Master or Expert
    • Username : Name of the user
    • Join Date : Year the user joined
    • Gold Medals : Number of gold medals
    • Silver Medals : Number of silver medals
    • Bronze Medals : Number of bronze medals
    • Points : Total points
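
    For illustration, a minimal pandas sketch of how a table with the eight columns above could be summarized (the sample rows and usernames here are hypothetical, not taken from the dataset):

    ```python
    import pandas as pd

    # Hypothetical sample rows mirroring the 8 columns described above
    df = pd.DataFrame({
        "Rank": [1, 2, 3],
        "Tier": ["Grandmaster", "Grandmaster", "Master"],
        "Username": ["alice", "bob", "carol"],
        "Join Date": [2015, 2017, 2016],
        "Gold Medals": [40, 25, 10],
        "Silver Medals": [12, 30, 22],
        "Bronze Medals": [5, 8, 15],
        "Points": [90210, 75330, 60125],
    })

    # Total medals per user, and total points held by Grandmasters
    df["Total Medals"] = df[["Gold Medals", "Silver Medals", "Bronze Medals"]].sum(axis=1)
    gm_points = df.loc[df["Tier"] == "Grandmaster", "Points"].sum()
    ```

    The same pattern applies to any export of this ranking, since the schema is fixed at eight columns.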

    Acknowledgements

    Data from Kaggle. Image from The Guardian.

    If you're reading this, please upvote.

    --- Original source retains full ownership of the source dataset ---

  2. ‘QS World University Rankings 2017 - 2022’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Aug 1, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘QS World University Rankings 2017 - 2022’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-qs-world-university-rankings-2017-2022-7fc4/d793e726/?iid=007-103&v=presentation
    Explore at:
    Dataset updated
    Aug 1, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘QS World University Rankings 2017 - 2022’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/padhmam/qs-world-university-rankings-2017-2022 on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    QS World University Rankings is an annual publication of global university rankings by Quacquarelli Symonds. The QS ranking receives approval from the International Ranking Expert Group (IREG), and is viewed as one of the three most-widely read university rankings in the world. QS publishes its university rankings in partnership with Elsevier.

    Content

    This dataset contains university data from 2017 to 2022, with a total of 15 features:

    • university : name of the university
    • year : year of ranking
    • rank_display : rank given to the university
    • score : score of the university based on the six key metrics mentioned above
    • link : link to the university profile page on the QS website
    • country : country in which the university is located
    • city : city in which the university is located
    • region : continent in which the university is located
    • logo : link to the logo of the university
    • type : type of university (public or private)
    • research_output : quality of research at the university
    • student_faculty_ratio : number of students per faculty member
    • international_students : number of international students enrolled at the university
    • size : size of the university in terms of area
    • faculty_count : number of faculty or academic staff at the university
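    As a sketch of how the ranking columns could be used, the snippet below follows a university's rank across years with pandas. The rows are hypothetical examples, and note that rank_display is a string in the source (it can contain ranges such as "601-650"), so it is coerced to numeric first:

    ```python
    import pandas as pd

    # Hypothetical rows using the column names described above
    df = pd.DataFrame({
        "university": ["MIT", "MIT", "MIT", "ETH Zurich", "ETH Zurich"],
        "year": [2020, 2021, 2022, 2021, 2022],
        "rank_display": ["1", "1", "1", "6", "8"],
        "country": ["United States"] * 3 + ["Switzerland"] * 2,
    })

    # Coerce rank_display to numbers; non-numeric entries (e.g. "601-650") become NaN
    df["rank"] = pd.to_numeric(df["rank_display"], errors="coerce")

    # One column per university, one row per year; missing years show as NaN
    trend = df.pivot(index="year", columns="university", values="rank")
    ```

    The resulting table makes year-over-year rank movements easy to plot or compare.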

    Acknowledgements

    This dataset was acquired by scraping the QS World University Rankings website with Python and Selenium. Cover Image: Source

    Inspiration

    Some of the questions that can be answered with this dataset:

    1. What makes a top-ranked university?
    2. Does the location of a university play a role in its ranking?
    3. What do the best universities have in common?
    4. How important is academic research for a university?
    5. Which country is preferred by international students?

    --- Original source retains full ownership of the source dataset ---

  3. Empirical Analysis of Ranking Models for an Adaptable Dataset Search:...

    • figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Angelo Batista Neves Júnior; Luiz André Portes Paes Leme; Marco Antonio Casanova (2023). Empirical Analysis of Ranking Models for an Adaptable Dataset Search: complementary material [Dataset]. http://doi.org/10.6084/m9.figshare.5620651.v4
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Angelo Batista Neves Júnior; Luiz André Portes Paes Leme; Marco Antonio Casanova
    License

    GNU GPL 3.0: https://www.gnu.org/licenses/gpl-3.0.html

    Description

    This repository contains performance measures of dataset ranking models.

    Usage: from Results/src, run python results m1 m2 ... where each mi can be omitted or be any element of the list of model labels ['bayesian-12C', 'bayesian-5L', 'bayesian-5L12C', 'cos-12C', 'cos-5L', 'cos-5L5C', 'j48-12C', 'j48-5L', 'j48-5L5C', 'jrip-12C', 'jrip-5L', 'jrip-5L5C', 'sn-12C', 'sn-5L', 'sn-5L12C']. Results of the selected models will be plotted in a 2D line plot. If no model is provided, all models will be listed.

  4. Play Fairway Analysis CA-NV-OR: 2km Grid Based Analysis...

    • data.wu.ac.at
    Updated Mar 6, 2018
    + more versions
    Cite
    HarvestMaster (2018). Play Fairway Analysis CA-NV-OR: 2km Grid Based Analysis 2kmGridStandardDeviations (1).xlsx [Dataset]. https://data.wu.ac.at/schema/geothermaldata_org/ZmM2OTI5MmQtY2VlMS00Y2JiLWFjYTItNzgyNjI3NWM2YWFh
    Explore at:
    Dataset updated
    Mar 6, 2018
    Dataset provided by
    HarvestMaster
    Description

    Combined geochemical and geophysical data, weighted and ranked for geothermal prospect favorability. Data were converted to grids, with weights applied to various characteristics.

  5. Academic ranking of world universities Analytics

    • kaggle.com
    Updated Aug 27, 2020
    Cite
    shivan kumar (2020). Academic ranking of world universities Analytics [Dataset]. https://www.kaggle.com/shivan118/world-university-rankings-analytics/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 27, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    shivan kumar
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    With this dataset, you can examine how a university's performance depends on factors such as its location, quality of faculty, facilities, alumni employment, and more.

    Content

    This is an academic ranking dataset of the top 1000 world universities.

    Columns Name

    • World Rank
    • Institution
    • Location
    • National Rank
    • Quality of Education
    • Alumni Employment
    • Quality of Faculty
    • Research Output
    • Quality Publications
    • Influence Citations Score

    Inspiration

    Good luck and enjoy the learning!

  6. Data from: Global network centrality of university rankings

    • dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Jun 9, 2025
    Cite
    Weisi Guo; Marco Del Vecchio; Ganna Pogrebna (2025). Global network centrality of university rankings [Dataset]. http://doi.org/10.5061/dryad.fv5mn
    Explore at:
    Dataset updated
    Jun 9, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Weisi Guo; Marco Del Vecchio; Ganna Pogrebna
    Time period covered
    Jul 14, 2020
    Description

    Universities and higher education institutions form an integral part of the national infrastructure and prestige. As academic research benefits increasingly from international exchange and cooperation, many universities have increased investment in improving and enabling their global connectivity. Yet, the relationship between university performance and global physical connectedness has not been explored in detail. We conduct the first large-scale data-driven analysis into whether there is a correlation between university relative ranking performance and its global connectivity via the air transport network. The results show that local access to global hubs (as measured by air transport network betweenness) strongly and positively correlates with ranking growth (statistical significance in different models ranges between the 5% and 1% level). We also showed that the local airport's aggregate flight paths (degree) and capacity (weighted degree) have no effect on university ranking, further...

  7. Replication data for: Top 10 Law School Home Pages of 2010

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Nov 8, 2011
    Cite
    Harvard Dataverse (2011). Replication data for: Top 10 Law School Home Pages of 2010 [Dataset]. http://doi.org/10.7910/DVN/DOVSSH
    Explore at:
    xlsx (51105), text/plain; charset=us-ascii (28573) (available download formats)
    Dataset updated
    Nov 8, 2011
    Dataset provided by
    Harvard Dataverse
    Time period covered
    Jan 1, 2010 - Dec 31, 2010
    Area covered
    United States
    Description

    This ranking report attempts to identify the best law school home pages based exclusively on objective criteria. The goal is to assess elements that make websites easier to use for sighted as well as visually-impaired users. Most elements require no special design skills, sophisticated technology or significant expenses. Ranking results in this report represent reasonably relevant elements. In this report, 200 ABA-accredited law school home pages are analyzed and ranked for twenty elements in three broad categories: Design Patterns & Metadata; Accessibility & Validation; and Marketing & Communications. As was the case in 2009, there is still no objective way to account for good taste. For interpreting these results, we don't try to decide if any whole is greater or less than the sum of its parts.

  8. QS World University Rankings

    • kaggle.com
    Updated Jun 12, 2020
    Cite
    Divyansh Agrawal (2020). QS World University Rankings [Dataset]. https://www.kaggle.com/divyansh22/qs-world-university-rankings/notebooks
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 12, 2020
    Dataset provided by
    Kaggle
    Authors
    Divyansh Agrawal
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains rankings of world universities as maintained by Quacquarelli Symonds. QS is a British think-tank specializing in the analysis of higher education institutions throughout the world. QS uses six factors in its ranking framework, viz. Academic Reputation, Employer Reputation, Faculty to Student Ratio, Citations per Faculty, International Faculty, and International Students. Another feature included in this data is Classification (not used for ranking), which covers the institution's size, subject range, research intensity, age, and status.

    This data can be used to analyze a specific set of universities' performance over the years as seen by QS. The scores indicate how well or poorly universities have performed in comparison to the previous year.

  9. ‘Kaggle Notebooks Ranking’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Kaggle Notebooks Ranking’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-kaggle-notebooks-ranking-ebf3/e7f75ea8/?iid=002-618&v=presentation
    Explore at:
    Dataset updated
    Feb 13, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Kaggle Notebooks Ranking’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/vivovinco/kaggle-notebooks-ranking on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This dataset contains Kaggle ranking of notebooks.

    Content

    3,000+ rows and 8 columns. Column descriptions are listed below.

    • Rank : Rank of the user
    • Tier : Grandmaster, Master or Expert
    • Username : Name of the user
    • Join Date : Year the user joined
    • Gold Medals : Number of gold medals
    • Silver Medals : Number of silver medals
    • Bronze Medals : Number of bronze medals
    • Points : Total points

    Acknowledgements

    Data from Kaggle. Image from Wikiwand.

    If you're reading this, please upvote.

    --- Original source retains full ownership of the source dataset ---

  10. Data for "A benchmarking method to rank the performance of...

    • zenodo.org
    • produccioncientifica.ucm.es
    • +1more
    zip
    Updated May 2, 2024
    Cite
    Octavi Gómez-Novell; Francesco Visini; Bruno Pace; José Antonio Álvarez Gómez; Paula Herrero-Barbero (2024). Data for "A benchmarking method to rank the performance of physics-based earthquake simulations" [Dataset]. http://doi.org/10.5281/zenodo.10143779
    Explore at:
    zip (available download formats)
    Dataset updated
    May 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Octavi Gómez-Novell; Francesco Visini; Bruno Pace; José Antonio Álvarez Gómez; Paula Herrero-Barbero
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 30, 2023
    Description

    This repository contains the datasets and codes supplementary to the article "A benchmarking method to rank the performance of physics-based earthquake simulations" submitted to Seismological Research Letters.

    The datasets include the codes to run the ranking analyses, inputs and outputs for the RSQSim earthquake simulation cases explained in the paper: a single fault and the fault system of the Eastern Betics Shear Zone (simulations from Herrero-Barbero et al. 2021). The results and data are stored in a separate folder for each case study presented in the paper: "Single fault" and "EBSZ". Each folder contains a series of subfolders and a Python script to run the ranking analysis for that specific case study. The script contains the default path references to read all necessary input files for the analysis and automatically save all the outputs. The subfolders are:

    ./Inputs: This folder contains the input files required for the RSQSim simulations. This includes:

    a. The fault model ("Nodes_RSQSim.flt" and "EBSZ_model.csv" for the single fault and EBSZ cases, respectively), which specifies the coordinate nodes of the fault triangular meshes and fault properties such as rake (º) and slip rate (m/yr).

    b. Neighbor file ("neighbors.dat"/"neighbors.12") that contains lists of triangular patches of the fault model that are neighboring. This file is used in RSQSim.

    c. Input parameter file ("Input_Parameters.txt"): this file specifies the parameters that are variable in each catalogue. This file is just for information purposes and is not used for the calculations.

    d. Parameter file(s) to run the RSQSim calculations.

    *For the single fault, this file is common ("test_normal.in") and is updated during the calculation when executing the "Run.sh" file in the terminal when running RSQSim. This file contains a script that loops through the input parameters a, b and normal stress explored in the study and changes the input parameter file accordingly in each iteration.

    *For the EBSZ, this file is specific for each simulation ("param_EBSZ_(n).in"), as each simulation was run separately.

    e. (Only for the EBSZ case) Input paleoseismic data for the paleorate benchmark. One file ("coord_sites_EBSZ.csv") contains a list of UTM coordinates of each paleoseismic site in the EBSZ and another ("paleo_rates_EBSZ.csv") contains the mean recurrence intervals and annual paleoearthquake rates in those sites (data from Herrero-Barbero et al., 2021).

    ./Simulation_models: contains several subfolders, one for each simulated catalogue (96 for the single fault case and 11 for the EBSZ). Each subfolder contains data that is read by the ranking code to perform the analysis.

    *For the single fault, the folder names follow the structure "model_(normal stress)(a)(b)".

    *For the EBSZ, the folder names are "cat-(n)".

    ./Ranking_results: contains the outputs of the ranking analysis, which are two figures and one text file.

    *Figure 1 ("Final_ranking.pdf"): visualization of the final ranking analysis for all models against the analyzed benchmarks.

    *Figure 2 ("Parameter_sensitivity.pdf"): visualization of the final and benchmark performance versus the input parameter of the models.

    *Text file ("Ranking_results.txt"): contains the final and benchmark scores of each simulation model. This file is outputted so the user can reproduce and customize their own figures with the ranking results.

    To use the ranking codes on your own datasets, please replicate the folder structure explained above. Use the code that best suits your data: use the single-fault code if you do not wish to use the paleorate benchmarks, and the EBSZ code if you wish to include these data in your analysis. At the beginning of the respective codes (before the "Start" block comment) you will find the variables where the file names of the fault model and paleoseismic data are indicated. Change them to adapt the code to your data. There you can also assign weights to the respective benchmarks in the analysis (the default is equal weight for all benchmarks).

    For updates of the code please visit our GitHub: https://github.com/octavigomez/Ranking-physics-based-EQ-simulations

  11. Crinacle iem ranking

    • kaggle.com
    Updated Sep 11, 2023
    Cite
    Piyush (2023). Crinacle iem ranking [Dataset]. https://www.kaggle.com/datasets/piyushsharmaxyz/crinacle-iem-ranking/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 11, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Piyush
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Explore the world of In-Ear Monitors (IEMs) with the "Crinacle IEM List" dataset. This comprehensive dataset offers a detailed ranking and analysis of a wide range of In-Ear Monitors, meticulously compiled by Crinacle.

    Key Features:

    IEM Rankings: Discover how various IEM models stack up against each other in terms of audio signature, price, and rank

    Technical Insights: Gain valuable insights into the technical specifications of each IEM, including driver setup

    Brand Diversity: Explore IEMs from a diverse range of brands, providing you with a comprehensive overview of the market.

    Informed Decision-Making: Whether you're an audiophile, a music enthusiast, or a consumer looking for the perfect IEM, this dataset equips you with the information you need to make informed decisions.

    Audio Enthusiast's Resource: An invaluable resource for audiophiles, audio reviewers, and anyone passionate about high-quality audio equipment.

  12. Relevance and Redundancy ranking: Code and Supplementary material

    • springernature.figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Arvind Kumar Shekar; Tom Bocklisch; Patricia Iglesias Sanchez; Christoph Nikolas Straehle; Emmanuel Mueller (2023). Relevance and Redundancy ranking: Code and Supplementary material [Dataset]. http://doi.org/10.6084/m9.figshare.5418706.v1
    Explore at:
    pdf (available download formats)
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Arvind Kumar Shekar; Tom Bocklisch; Patricia Iglesias Sanchez; Christoph Nikolas Straehle; Emmanuel Mueller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the code for Relevance and Redundancy ranking (RaR), an efficient filter-based feature ranking framework for evaluating relevance based on multi-feature interactions and redundancy on mixed datasets. Source code is in .scala and .sbt format, metadata in .xml, all of which can be accessed and edited in standard, openly accessible text editing software. Diagrams are in the openly accessible .png format.

    Supplementary_2.pdf: contains the results of experiments on multiple classifiers, along with parameter settings and a description of how KLD converges to mutual information based on its symmetricity.

    dataGenerator.zip: synthetic data generator inspired by NIPS: Workshop on variable and feature selection (2001), http://www.clopinet.com/isabelle/Projects/NIPS2001/

    rar-mfs-master.zip: Relevance and Redundancy Framework containing an overview diagram, example datasets, source code and metadata. Details on installing and running are provided below.

    Background. Feature ranking is beneficial to gain knowledge and to identify the relevant features from a high-dimensional dataset. However, in several datasets, a few features by themselves might have small correlation with the target classes, but by combining these features with some other features, they can be strongly correlated with the target. This means that multiple features exhibit interactions among themselves. It is necessary to rank the features based on these interactions for better analysis and classifier performance. However, evaluating these interactions on large datasets is computationally challenging. Furthermore, datasets often have features with redundant information. Using such redundant features hinders both efficiency and generalization capability of the classifier. The major challenge is to efficiently rank the features based on relevance and redundancy on mixed datasets.

    In the related publication, we propose a filter-based framework based on Relevance and Redundancy (RaR). RaR computes a single score that quantifies the feature relevance by considering interactions between features and redundancy. The top-ranked features of RaR are characterized by maximum relevance and non-redundancy. The evaluation on synthetic and real-world datasets demonstrates that our approach outperforms several state-of-the-art feature selection techniques.

    Relevance and Redundancy Framework (rar-mfs). rar-mfs is an algorithm for feature selection and can be employed to select features from labelled data sets. The Relevance and Redundancy Framework (RaR), which is the theory behind the implementation, is a novel feature selection algorithm that works on large data sets (polynomial runtime), can handle differently typed features (e.g. nominal and continuous features), and handles multivariate correlations.

    Installation. The tool is written in Scala and uses the Weka framework to load and handle data sets. You can either run it independently, providing the data as an .arff or .csv file, or include the algorithm as a (maven/ivy) dependency in your project. As an example data set we use heart-c.

    Project dependency. The project is published to Maven Central. To depend on the project use:
    - maven: groupId de.hpi.kddm, artifactId rar-mfs_2.11, version 1.0.2
    - sbt: libraryDependencies += "de.hpi.kddm" %% "rar-mfs" % "1.0.2"

    To run the algorithm from Scala:

        import de.hpi.kddm.rar._
        val dataSet = de.hpi.kddm.rar.Runner.loadCSVDataSet(new File("heart-c.csv"), isNormalized = false, "")
        val algorithm = new RaRSearch(
          HicsContrastPramsFA(numIterations = config.samples, maxRetries = 1, alphaFixed = config.alpha, maxInstances = 1000),
          RaRParamsFixed(k = 5, numberOfMonteCarlosFixed = 5000, parallelismFactor = 4))
        algorithm.selectFeatures(dataSet)

    Command line tool. EITHER download the prebuilt binary, which requires only an installation of a recent Java version (>= 6): download the prebuilt jar from the releases tab (latest), then run java -jar rar-mfs-1.0.2.jar --help. Example usage of the prebuilt jar:

        > java -jar rar-mfs-1.0.2.jar arff --samples 100 --subsetSize 5 --nonorm heart-c.arff
        Feature Ranking: 1 - age (12) 2 - sex (8) 3 - cp (11) ...

    OR build the repository on your own: make sure sbt is installed, clone the repository, and run sbt run. Simple example using sbt directly after cloning the repository:

        > sbt "run arff --samples 100 --subsetSize 5 --nonorm heart-c.arff"
        Feature Ranking: 1 - age (12) 2 - sex (8) 3 - cp (11) ...

    [Optional] To speed up the algorithm, consider using a fast solver such as Gurobi (http://www.gurobi.com/). Install the solver and put the provided gurobi.jar into the java classpath.

    Algorithm idea. An abstract overview of the different steps of the proposed feature selection algorithm is shown at https://github.com/tmbo/rar-mfs/blob/master/docu/images/algorithm_overview.png. The Relevance and Redundancy ranking framework (RaR) is a method able to handle large-scale data sets and data sets with mixed features. Instead of directly selecting a subset, a feature ranking gives a more detailed overview of the relevance of the features. The method consists of a multistep approach where we 1. repeatedly sample subsets from the whole feature space and examine their relevance and redundancy (exploration of the search space to gather more and more knowledge about the relevance and redundancy of features), 2. deduce scores for features based on the scores of the subsets, and 3. create the best possible ranking given the sampled insights.

    Parameters:

    • m - contrast iterations (default 100): number of different slices to evaluate while comparing marginal and conditional probabilities
    • alpha - subspace slice size (default 0.01): percentage of all instances to use as part of a slice which is used to compare distributions
    • n - sampling iterations (default 1000): number of different subsets to select in the sampling phase
    • k - sample set size (default 5): maximum size of the subsets to be selected in the sampling phase

  13. Associated data underlying the article "Comparing open data benchmarks:...

    • data.4tu.nl
    zip
    Updated May 13, 2021
    Cite
    Anneke Zuiderwijk; Ali Pirannejad; Iryna Susha (2021). Associated data underlying the article "Comparing open data benchmarks: which metrics and methodologies determine countries’ positions in the ranking lists?" [Dataset]. http://doi.org/10.4121/14604330.v1
    Explore at:
    zip (available download formats)
    Dataset updated
    May 13, 2021
    Dataset provided by
    4TU.ResearchData
    Authors
    Anneke Zuiderwijk; Ali Pirannejad; Iryna Susha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    Swedish Research Council
    European Commission
    Description

    An understanding of the similar and divergent metrics and methodologies underlying open government data benchmarks can reduce the risks of the potential misinterpretation and misuse of benchmarking outcomes by policymakers, politicians, and researchers. Hence, this study aims to compare the metrics and methodologies used to measure, benchmark, and rank governments' progress in open government data initiatives. Using a critical meta-analysis approach, we compare nine benchmarks with reference to meta-data, meta-methods, and meta-theories. This study finds that both existing open government data benchmarks and academic open data progress models use a great variety of metrics and methodologies, although open data impact is not usually measured. While several benchmarks’ methods have changed over time, and variables measured have been adjusted, we did not identify a similar pattern for academic open data progress models. This study contributes to open data research in three ways: 1) it reveals the strengths and weaknesses of existing open government data benchmarks and academic open data progress models; 2) it reveals that the selected open data benchmarks employ relatively similar measures as the theoretical open data progress models; and 3) it provides an updated overview of the different approaches used to measure open government data initiatives’ progress. Finally, this study offers two practical contributions: 1) it provides the basis for combining the strengths of benchmarks to create more comprehensive approaches for measuring governments’ progress in open data initiatives; and 2) it explains why particular countries are ranked in a certain way. This information is essential for governments and researchers to identify and propose effective measures to improve their open data initiatives.

  14. DATA for Nano Ranking Analysis: determining NPF event occurrence and...

    • zenodo.org
    zip
    Updated Nov 30, 2023
    Cite
    DIEGO ALIAGA (2023). DATA for Nano Ranking Analysis: determining NPF event occurrence and intensity based on the concentration spectrum of formed (sub-5 nm) particles [Dataset]. http://doi.org/10.5281/zenodo.10231189
    Explore at:
    zip (available download formats)
    Dataset updated
    Nov 30, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    DIEGO ALIAGA; DIEGO ALIAGA
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data used for: Nano Ranking Analysis: determining NPF event occurrence and intensity based on the concentration spectrum of formed (sub-5 nm) particles (https://doi.org/10.5194/ar-2023-5)

  15. argument_quality_ranking_30k

    • huggingface.co
    Updated Nov 6, 2023
    Cite
    IBM Research (2023). argument_quality_ranking_30k [Dataset]. https://huggingface.co/datasets/ibm-research/argument_quality_ranking_30k
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Nov 6, 2023
    Dataset provided by
    IBM Research
    IBM (http://ibm.com/)
    Authors
    IBM Research
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    Dataset Card for Argument-Quality-Ranking-30k Dataset

      Dataset Summary

      Argument Quality Ranking

    The dataset contains 30,497 crowd-sourced arguments for 71 debatable topics labeled for quality and stance, split into train, validation and test sets. The dataset was originally published as part of our paper: A Large-scale Dataset for Argument Quality Ranking: Construction and Analysis.

      Argument Topic
    

    This subset contains 9,487 of the arguments only with… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/argument_quality_ranking_30k.
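The dataset itself can be loaded with Hugging Face's `datasets` library via `load_dataset("ibm-research/argument_quality_ranking_30k")`. As a self-contained illustration of the kind of ranking its quality labels enable, the sketch below sorts crowd-sourced arguments by an assumed quality score in [0, 1]; the arguments and scores are invented, and the actual column names should be checked against the dataset card.

```python
# Toy sketch: rank arguments for a topic by a quality score (invented data;
# the real dataset's column names may differ).
def rank_arguments(scored_args):
    """Sort (argument, quality_score) pairs by descending quality."""
    return [arg for arg, _ in sorted(scored_args, key=lambda p: p[1], reverse=True)]

scored = [
    ("We should ban X because of long-term harms.", 0.42),
    ("Banning X protects vulnerable groups.", 0.91),
    ("X is bad.", 0.17),
]
print(rank_arguments(scored))  # highest-quality argument first
```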

  16. Ranking water utilities in a competitive scenario using two years...

    • data.mendeley.com
    Updated Dec 13, 2024
    + more versions
    Cite
    Ranking water utilities in a competitive scenario using two years information and data envelopment analysis [Dataset]. https://data.mendeley.com/datasets/c37v6zs5vg/2
    Explore at:
    Dataset updated
    Dec 13, 2024
    Authors
    Dickson Kasese Gidion
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data for DEA for absolute with titled manuscript
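Data envelopment analysis (DEA), the method named in this entry's title, scores each decision-making unit against the best performers in the set. The stdlib-only sketch below covers the special case of a single input and a single output, where the CCR efficiency reduces to each unit's output/input ratio normalized by the best ratio; the utility names and figures are invented, and real multi-input/multi-output DEA requires solving a linear program per unit.

```python
# Single-input/single-output DEA sketch (invented data). Efficiency is the
# unit's output/input ratio divided by the best ratio in the comparison set,
# so the best performer scores exactly 1.0.
def dea_efficiency(units):
    """units: name -> (input, output); returns name -> efficiency in (0, 1]."""
    ratios = {name: out / inp for name, (inp, out) in units.items()}
    best = max(ratios.values())
    return {name: r / best for name, r in ratios.items()}

utilities = {"A": (100, 80), "B": (120, 60), "C": (90, 81)}
scores = dea_efficiency(utilities)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # -> ['C', 'A', 'B']
```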

  17. Scientific JOURNALS Indicators & Info - SCImagoJR

    • kaggle.com
    Updated Apr 9, 2025
    Cite
    Ali Jalaali (2025). Scientific JOURNALS Indicators & Info - SCImagoJR [Dataset]. https://www.kaggle.com/datasets/alijalali4ai/scimagojr-scientific-journals-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 9, 2025
    Dataset provided by
    Kaggle
    Authors
    Ali Jalaali
    Description


    The SCImago Journal & Country Rank is a publicly available portal that includes the journals and country scientific indicators developed from the information contained in the Scopus® database (Elsevier B.V.). These indicators can be used to assess and analyze scientific domains. Journals can be compared or analysed separately.


    💬Also have a look at
    💡 COUNTRIES Research & Science Dataset - SCImagoJR
    💡 UNIVERSITIES & Research INSTITUTIONS Rank - SCImagoIR

    • Journals can be grouped by subject area (27 major thematic areas), subject category (309 specific subject categories) or by country.
    • Citation data is drawn from over 34,100 titles from more than 5,000 international publishers.
    • This platform takes its name from the SCImago Journal Rank (SJR) indicator, developed by SCImago from the widely known Google PageRank™ algorithm. This indicator shows the visibility of the journals contained in the Scopus® database from 1996.
    • SCImago is a research group from the Consejo Superior de Investigaciones Científicas (CSIC), University of Granada, Extremadura, Carlos III (Madrid) and Alcalá de Henares, dedicated to information analysis, representation and retrieval by means of visualisation techniques.

    ☢️❓The entire dataset is obtained from public and open-access data of ScimagoJR (SCImago Journal & Country Rank)
    ScimagoJR Journal Rank
    SCImagoJR About Us

    Available indicators:

    • SJR (SCImago Journal Rank) indicator: It expresses the average number of weighted citations received in the selected year by the documents published in the selected journal in the three previous years, i.e. weighted citations received in year X to documents published in the journal in years X-1, X-2 and X-3. See detailed description of SJR (PDF).
    • H Index: The h index expresses the journal's number of articles (h) that have received at least h citations. It quantifies both journal scientific productivity and scientific impact, and it is also applicable to scientists, countries, etc. (see H-index wikipedia definition)
    • Total Documents: Output of the selected period. All types of documents are considered, including citable and non-citable documents.
    • Total Documents (3 years): Published documents in the three previous years (selected year documents are excluded), i.e. when year X is selected, documents published in years X-1, X-2 and X-3 are retrieved. All types of documents are considered, including citable and non-citable documents.
    • Citable Documents (3 years): Number of citable documents published by a journal in the three previous years (selected year documents are excluded). Exclusively articles, reviews and conference papers are considered.
    • Non-citable Documents (available in the graphics): Ratio of non-citable documents in the period being considered.
    • Total Cites (3 years): Number of citations received in the selected year by a journal to the documents published in the three previous years, i.e. citations received in year X to documents published in years X-1, X-2 and X-3. All types of documents are considered.
    • Cites per Document (2 years): Average citations per document in a 2-year period. It is computed considering the number of citations received by a journal in the current year to the documents published in the two previous years, i.e. citations received in year X to documents published in years X-1 and X-2.
    • Cites per Document (3 years): Average citations per document in a 3-year period. It is computed considering the number of citations received by a journal in the current year to the documents published in the three previous years, i.e. citations received in year X to documents published in years X-1, X-2 and X-3.
    • Self Cites: Number of a journal's self-citations in the selected year to its own documents published in the three previous years, i.e. self-citations in year X to documents published in years X-1, X-2 and X-3. All types of documents are considered.
    • Cited Documents: Number of documents cited at least once in the three previous years, i.e. years X-1, X-2 and X-3.
    • Uncited Documents: Number of uncited documents in the three previous years, i.e. years X-1, X-2 and X-3.
    • Total References: Includes all the bibliographical references in a journal in the selected period.
    • References per Document: Average number of references per document in the selected year.
    • % International Collaboration: Ratio of documents whose affiliations include more than one country address.
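Two of the indicators above are easy to make concrete in code. The h index is the largest h such that the journal has h documents with at least h citations each; cites per document is a plain ratio of total cites to citable documents over the window. The citation counts below are illustrative, not taken from SCImagoJR.

```python
# Sketch of the H Index and Cites per Document definitions above
# (illustrative citation counts, not real SCImagoJR data).
def h_index(citations):
    """Largest h such that h documents have at least h citations each."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

def cites_per_document(total_cites, citable_docs):
    """E.g. Cites per Document (3 years) = Total Cites (3y) / Citable Docs (3y)."""
    return total_cites / citable_docs

print(h_index([10, 8, 5, 4, 3]))    # -> 4 (four papers with >= 4 citations)
print(cites_per_document(120, 40))  # -> 3.0
```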
  18. Data for: Deep Ranking Analysis by Power Eigenvectors (DRAPE): a...

    • data.mendeley.com
    • board.unimib.it
    Updated Jul 4, 2020
    + more versions
    Cite
    Roberto Todeschini (2020). Data for: Deep Ranking Analysis by Power Eigenvectors (DRAPE): a polypharmacology case study [Dataset]. http://doi.org/10.17632/w57dst72t8.1
    Explore at:
    Dataset updated
    Jul 4, 2020
    Authors
    Roberto Todeschini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A dataset comprising 55 molecules described by seven criteria was used. The criteria consist of binding activity values for each target, expressed as the half-maximal activity concentration (AC50) based on the dose-response curves; thus, the smaller the concentration, the more active the molecule. Seven targets belonging to the nuclear receptor family are taken into account: Estrogen Receptor Alpha (ERα), Estrogen Receptor Beta (ERβ), Farnesoid X Receptor (FXR), Progesterone Receptor (PR), Pregnane X Receptor (PXR), Peroxisome Proliferator-Activated Receptor Gamma (PPARγ) and Peroxisome Proliferator-Activated Receptor Delta (PPARδ). To create the dataset, we collected the Tox21 databases [23, 24] of agonism/antagonism activity for the seven nuclear receptors from [22].

  19. ATP Tour Ranking - Decade-wise and year-wise

    • kaggle.com
    Updated Jun 3, 2025
    Cite
    Kalilur Rahman (2025). ATP Tour Ranking - Decade-wise and year-wise [Dataset]. https://www.kaggle.com/datasets/kalilurrahman/atp-tennis-player-ranking-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jun 3, 2025
    Dataset provided by
    Kaggle
    Authors
    Kalilur Rahman
    Description

    The ATP Tour (known as the ATP World Tour from January 2009 until December 2018) is a worldwide top-tier tennis tour for men organized by the Association of Tennis Professionals. The second-tier tour is the ATP Challenger Tour and the third-tier is ITF Men's World Tennis Tour. The equivalent women's organisation is the WTA Tour.

    The ATP Tour comprises ATP Masters 1000, ATP 500, and ATP 250.[1] The ATP also oversees the ATP Challenger Tour,[2] a level below the ATP Tour, and the ATP Champions Tour for seniors. Grand Slam tournaments, a small portion of the Olympic tennis tournament, the Davis Cup, and the entry-level ITF World Tennis Tour do not fall under the purview of the ATP, but are overseen by the ITF instead and the International Olympic Committee (IOC) for the Olympics. In these events, however, ATP ranking points are awarded, with the exception of the Olympics. The four-week ITF Satellite tournaments were discontinued in 2007. Players and doubles teams with the most ranking points (collected during the calendar year) play in the season-ending ATP Finals, which, from 2000–2008, was run jointly with the International Tennis Federation (ITF). The details of the professional tennis tour are:

    Event               | Number | Total prize money (USD) | Winner's ranking points | Governing body
    Grand Slam          | 4      | See individual articles | 2,000                   | ITF
    ATP Finals          | 1      | 4,450,000               | 1,100–1,500             | ATP (2009–present)
    ATP Masters 1000    | 9      | 2,450,000 to 3,645,000  | 1000                    | ATP
    ATP 500             | 13     | 755,000 to 2,100,000    | 500                     | ATP
    ATP 250             | 39     | 416,000 to 1,024,000    | 250                     | ATP
    Olympics            | 1      | See individual articles | 0                       | IOC
    ATP Challenger Tour | 178    | 40,000 to 220,000       | 80 to 125               | ATP
    ITF Men's Circuit   | 534    | 10,000 and 25,000       | 18 to 35                | ITF
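The winner's points in the table accumulate over the calendar year to determine the season-ending ranking. The sketch below sums those points for an invented list of titles; the ATP Finals entry is a range (1,100-1,500 depending on round-robin results), so the 1500 used here assumes an undefeated champion.

```python
# Calendar-year ranking points from the winner's points column above.
# The list of titles won is invented for illustration.
WINNER_POINTS = {
    "Grand Slam": 2000,
    "ATP Finals": 1500,      # upper end of the 1,100-1,500 range (undefeated)
    "ATP Masters 1000": 1000,
    "ATP 500": 500,
    "ATP 250": 250,
    "Olympics": 0,           # no ranking points awarded
}

def season_points(titles_won):
    """Sum winner's ranking points over event tiers won in a calendar year."""
    return sum(WINNER_POINTS[tier] for tier in titles_won)

print(season_points(["Grand Slam", "ATP Masters 1000", "ATP 500"]))  # -> 3500
```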

    The dataset is from Jeff Sackmann(https://github.com/JeffSackmann/tennis_atp)

  20. Second-order ranking data analysis for BHK cells (with attenuation).

    • plos.figshare.com
    xls
    Updated Jun 10, 2023
    Cite
    Kwang-il Lim; John Yin (2023). Second-order ranking data analysis for BHK cells (with attenuation). [Dataset]. http://doi.org/10.1371/journal.pcbi.1000283.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 10, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Kwang-il Lim; John Yin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The first column and the first row list component i and j for Pairs, respectively.
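A matrix indexed by component pairs (i, j) like this one can be turned into an overall ranking in several ways. The sketch below uses a simple Copeland-style win count, which is an illustration only and not necessarily the method used in the cited paper; the matrix values are invented.

```python
# Hedged illustration (invented data, not the paper's method): given a
# pairwise matrix M where M[i][j] > M[j][i] means component i outranks
# component j, rank components by their number of pairwise wins.
def copeland_ranking(M):
    """Return component indices sorted by pairwise wins, best first."""
    n = len(M)
    wins = [sum(1 for j in range(n) if j != i and M[i][j] > M[j][i])
            for i in range(n)]
    return sorted(range(n), key=lambda i: wins[i], reverse=True)

M = [[0, 2, 3],
     [1, 0, 4],
     [1, 1, 0]]
print(copeland_ranking(M))  # -> [0, 1, 2]
```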
