Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Kaggle Datasets Ranking’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/vivovinco/kaggle-datasets-ranking on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains the Kaggle ranking of datasets.
More than 800 rows and 8 columns. Column descriptions are listed below.
Data from Kaggle. Image from The Guardian.
If you're reading this, please upvote.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘QS World University Rankings 2017 - 2022’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/padhmam/qs-world-university-rankings-2017-2022 on 13 February 2022.
--- Dataset description provided by original source is as follows ---
QS World University Rankings is an annual publication of global university rankings by Quacquarelli Symonds. The QS ranking receives approval from the International Ranking Expert Group (IREG), and is viewed as one of the three most-widely read university rankings in the world. QS publishes its university rankings in partnership with Elsevier.
This dataset contains university data from the year 2017 to 2022. It has a total of 15 features:
- university - name of the university
- year - year of ranking
- rank_display - rank given to the university
- score - score of the university based on the six key metrics mentioned above
- link - link to the university profile page on the QS website
- country - country in which the university is located
- city - city in which the university is located
- region - continent in which the university is located
- logo - link to the logo of the university
- type - type of university (public or private)
- research_output - quality of research at the university
- student_faculty_ratio - number of students per faculty member
- international_students - number of international students enrolled at the university
- size - size of the university in terms of area
- faculty_count - number of faculty or academic staff at the university
This dataset was acquired by scraping the QS World University Rankings website with Python and Selenium. Cover Image: Source
Some of the questions that can be answered with this dataset:
1. What makes a top-ranked university?
2. Does the location of a university play a role in its ranking?
3. What do the best universities have in common?
4. How important is academic research for a university?
5. Which countries are preferred by international students?
--- Original source retains full ownership of the source dataset ---
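As a sketch of how such questions could be approached, here is a minimal plain-Python example that tracks a university's rank across years. The rows and values below are invented for illustration; only the column names (university, year, rank_display, country) come from the feature list above.

```python
# Minimal plain-Python sketch over the schema above. Rows and values are
# invented; only the column names (university, year, rank_display, country)
# come from the feature list. In the real data rank_display is a string and
# can hold ranges such as "601-650", so more careful parsing is needed there.

sample = [
    {"university": "MIT", "year": 2021, "rank_display": "1", "country": "United States"},
    {"university": "MIT", "year": 2022, "rank_display": "1", "country": "United States"},
    {"university": "ETH Zurich", "year": 2021, "rank_display": "6", "country": "Switzerland"},
    {"university": "ETH Zurich", "year": 2022, "rank_display": "8", "country": "Switzerland"},
]

def rank_change(rows, university):
    """Rank difference between the earliest and latest year (positive = dropped)."""
    entries = sorted((r for r in rows if r["university"] == university),
                     key=lambda r: r["year"])
    return int(entries[-1]["rank_display"]) - int(entries[0]["rank_display"])

print(rank_change(sample, "ETH Zurich"))  # 2 (dropped two places)
```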
https://www.gnu.org/licenses/gpl-3.0.html
This repository contains performance measures of dataset ranking models.
Usage: from Results/src, run `python results m1 m2 ...`, where each mi can be omitted or be any element of the list of model labels ['bayesian-12C', 'bayesian-5L', 'bayesian-5L12C', 'cos-12C', 'cos-5L', 'cos-5L5C', 'j48-12C', 'j48-5L', 'j48-5L5C', 'jrip-12C', 'jrip-5L', 'jrip-5L5C', 'sn-12C', 'sn-5L', 'sn-5L12C']. Results of the selected models will be plotted in a 2D line plot. If no model is provided, all models will be listed.
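The argument handling described above might look roughly like this sketch. The real script is `results` under Results/src; `select_models` is a hypothetical helper name, and the 2D line plot step is omitted here.

```python
# Sketch of the argument handling described above: with no arguments, list
# every model label; otherwise validate the requested labels. The helper
# name `select_models` is an assumption for illustration.

MODELS = ['bayesian-12C', 'bayesian-5L', 'bayesian-5L12C', 'cos-12C',
          'cos-5L', 'cos-5L5C', 'j48-12C', 'j48-5L', 'j48-5L5C',
          'jrip-12C', 'jrip-5L', 'jrip-5L5C', 'sn-12C', 'sn-5L', 'sn-5L12C']

def select_models(argv):
    """Return the models to process: all labels if none requested,
    otherwise the validated subset."""
    if not argv:
        return MODELS  # no model provided: all models are listed
    unknown = [m for m in argv if m not in MODELS]
    if unknown:
        raise ValueError("unknown model label(s): %s" % unknown)
    return argv

print(select_models([]))                      # all 15 labels
print(select_models(["cos-5L", "j48-5L5C"]))  # the two selected models
```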
Combined geochemical and geophysical data, weighted and ranked for geothermal prospect favorability. Conversion of data to grids. Weight added to various characteristics.
https://creativecommons.org/publicdomain/zero/1.0/
In this dataset, you can examine how a university's performance depends on factors such as the location of the university, quality of faculty, facilities, alumni employment, and so on.
This is the academic ranking dataset of the top 1,000 world universities.
Good luck and enjoy the learning!
Universities and higher education institutions form an integral part of the national infrastructure and prestige. As academic research benefits increasingly from international exchange and cooperation, many universities have increased investment in improving and enabling their global connectivity. Yet, the relationship between university performance and global physical connectedness has not been explored in detail. We conduct the first large-scale data-driven analysis of whether there is a correlation between a university's relative ranking performance and its global connectivity via the air transport network. The results show that local access to global hubs (as measured by air transport network betweenness) strongly and positively correlates with ranking growth (statistical significance in different models ranges between the 5% and 1% levels). We also showed that the local airport's aggregate flight paths (degree) and capacity (weighted degree) have no effect on university ranking, further...
This ranking report attempts to identify the best law school home pages based exclusively on objective criteria. The goal is to assess elements that make websites easier to use for sighted as well as visually-impaired users. Most elements require no special design skills, sophisticated technology or significant expenses. Ranking results in this report represent reasonably relevant elements. In this report, 200 ABA-accredited law school home pages are analyzed and ranked for twenty elements in three broad categories: Design Patterns & Metadata; Accessibility & Validation; and Marketing & Communications. As was the case in 2009, there is still no objective way to account for good taste. For interpreting these results, we don't try to decide if any whole is greater or less than the sum of its parts.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains rankings of the world universities as maintained by Quacquarelli Symonds. QS is a British think-tank company specializing in the analysis of higher education institutions throughout the world. QS uses six factors for its ranking framework, viz. Academic Reputation, Employer Reputation, Faculty to Student Ratio, Citations per Faculty, International Faculty, and International Students. Another feature included in this data is Classification (not used for ranking), which covers the institution's size, subject range, research intensity, age, and status.
This data can be used to analyze a specific set of universities' performance over the years as seen by QS. The scores can be an indicator of how well or poorly the universities have performed compared with the previous year.
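To illustrate how six factor scores might roll up into a single comparable score, here is a sketch with placeholder weights. The weights are invented assumptions, not QS's official methodology; only the six factor names come from the description above.

```python
# Illustrative sketch of rolling six indicator scores into one overall
# score. The weights below are invented placeholders, NOT QS's official
# methodology; only the six factor names come from the description above.

WEIGHTS = {
    "academic_reputation":    0.40,  # placeholder weight
    "employer_reputation":    0.10,  # placeholder weight
    "faculty_student_ratio":  0.20,  # placeholder weight
    "citations_per_faculty":  0.20,  # placeholder weight
    "international_faculty":  0.05,  # placeholder weight
    "international_students": 0.05,  # placeholder weight
}

def overall_score(indicators):
    """Weighted sum of per-indicator scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[k] * indicators[k] for k in WEIGHTS)

example = {k: 80.0 for k in WEIGHTS}
print(overall_score(example))  # ~80 when every indicator is 80
```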
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Kaggle Notebooks Ranking’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/vivovinco/kaggle-notebooks-ranking on 13 February 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains the Kaggle ranking of notebooks.
More than 3,000 rows and 8 columns. Column descriptions are listed below.
Data from Kaggle. Image from Wikiwand.
If you're reading this, please upvote.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the datasets and code supplementary to the article "A benchmarking method to rank the performance of physics-based earthquake simulations" submitted to Seismological Research Letters.
The datasets include the code to run the ranking analyses, plus inputs and outputs for the RSQSim earthquake simulation cases explained in the paper: a single fault and the fault system of the Eastern Betics Shear Zone (simulations from Herrero-Barbero et al. 2021). The results and data are stored in a separate folder for each case study presented in the paper: "Single fault" and "EBSZ". Each folder contains a series of subfolders and a Python script to run the ranking analysis for that specific case study. The script contains the default path references to read all necessary input files for the analysis and automatically saves all the outputs. The subfolders are:
./Inputs: This folder contains the input files required for the RSQSim simulations. This includes:
a. The fault model ("Nodes_RSQSim.flt" and "EBSZ_model.csv" for the single fault and EBSZ cases, respectively), which specifies the coordinate nodes of the fault triangular meshes and fault properties such as rake (º) and slip rate (m/yr).
b. Neighbor file ("neighbors.dat"/"neighbors.12") that lists the neighboring triangular patches of the fault model. This file is used in RSQSim.
c. Input parameter file ("Input_Parameters.txt"): this file specifies the parameters that are variable in each catalogue. This file is just for information purposes and is not used for the calculations.
d. Parameter file(s) to run the RSQSim calculations.
*For the single fault, this file is common ("test_normal.in") and is updated during the calculation when executing the "Run.sh" file in the terminal to run RSQSim. "Run.sh" contains a script that loops through the input parameters a, b and normal stress explored in the study and changes the input parameter file accordingly in each iteration.
*For the EBSZ, this file is specific for each simulation ("param_EBSZ_(n).in"), as each simulation was run separately.
e. (Only for the EBSZ case) Input paleoseismic data for the paleorate benchmark. One file ("coord_sites_EBSZ.csv") contains a list of UTM coordinates of each paleoseismic site in the EBSZ and another ("paleo_rates_EBSZ.csv") contains the mean recurrence intervals and annual paleoearthquake rates in those sites (data from Herrero-Barbero et al., 2021).
./Simulation_models: contains several subfolders, one for each simulated catalogue (96 for the single fault case and 11 for the EBSZ). Each subfolder contains data that is read by the ranking code to perform the analysis.
*For the single fault, the folder names follow the structure "model_(normal stress)(a)(b)".
*For the EBSZ, the folder names are "cat-(n)".
./Ranking_results: contains the outputs of the ranking analysis, which are two figures and one text file.
*Figure 1 ("Final_ranking.pdf"): visualization of the final ranking analysis for all models against the analyzed benchmarks.
*Figure 2 ("Parameter_sensitivity.pdf"): visualization of the final and benchmark performance versus the input parameter of the models.
*Text file ("Ranking_results.txt"): contains the final and benchmark scores of each simulation model. This file is output so that users can reproduce and customize their own figures from the ranking results.
To use the ranking codes on your own datasets, please replicate the folder structure explained above. Use the code that best suits your data: the single-fault code if you do not wish to use the paleorate benchmarks, or the EBSZ code if you wish to include these data in your analysis. At the beginning of the respective codes (before the "Start" block comment) you will find the variables where the file names of the fault model and paleoseismic data are indicated. Change them to adapt the code to your data. There you can also assign weights to the respective benchmarks in the analysis (the default is equal weight for all benchmarks).
For updates of the code please visit our GitHub: https://github.com/octavigomez/Ranking-physics-based-EQ-simulations
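The benchmark weighting described above can be illustrated with a small sketch. Model names, benchmark labels ("MFD", "paleorate"), scores, and the `final_scores` helper are hypothetical; the repository's Python scripts implement the actual analysis for the paper's case studies.

```python
# Sketch of combining per-benchmark scores into a final weighted score and
# ranking. All names and values below are hypothetical stand-ins for the
# repository's real inputs.

def final_scores(benchmark_scores, weights):
    """Weighted average of each model's benchmark scores."""
    total = sum(weights.values())
    return {
        model: sum(weights[b] * s for b, s in scores.items()) / total
        for model, scores in benchmark_scores.items()
    }

scores = {
    "model_A": {"MFD": 0.9, "paleorate": 0.6},
    "model_B": {"MFD": 0.6, "paleorate": 0.8},
}
weights = {"MFD": 1.0, "paleorate": 1.0}  # default: equal weight per benchmark

# Best-performing model first.
ranking = sorted(final_scores(scores, weights).items(), key=lambda kv: -kv[1])
print(ranking[0][0])  # model_A (0.75 vs 0.70)
```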
https://creativecommons.org/publicdomain/zero/1.0/
Explore the world of In-Ear Monitors (IEMs) with the "Crinacle IEM List" dataset. This comprehensive dataset offers a detailed ranking and analysis of a wide range of In-Ear Monitors, meticulously compiled by Crinacle.
Key Features:
IEM Rankings: Discover how various IEM models stack up against each other in terms of audio signature, price, and rank.
Technical Insights: Gain valuable insights into the technical specifications of each IEM, including driver setup.
Brand Diversity: Explore IEMs from a diverse range of brands, providing you with a comprehensive overview of the market.
Informed Decision-Making: Whether you're an audiophile, a music enthusiast, or a consumer looking for the perfect IEM, this dataset equips you with the information you need to make informed decisions.
Audio Enthusiast's Resource: An invaluable resource for audiophiles, audio reviewers, and anyone passionate about high-quality audio equipment.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the code for Relevance and Redundancy ranking (RaR): an efficient filter-based feature ranking framework for evaluating relevance based on multi-feature interactions and redundancy on mixed datasets. Source code is in .scala and .sbt format and metadata in .xml, all of which can be accessed and edited in standard, openly accessible text editing software. Diagrams are in the openly accessible .png format.
- Supplementary_2.pdf: contains the results of experiments on multiple classifiers, along with parameter settings and a description of how KLD converges to mutual information based on its symmetricity.
- dataGenerator.zip: synthetic data generator inspired by the NIPS Workshop on variable and feature selection (2001), http://www.clopinet.com/isabelle/Projects/NIPS2001/
- rar-mfs-master.zip: the Relevance and Redundancy framework, containing an overview diagram, example datasets, source code and metadata. Details on installing and running are provided below.

Background. Feature ranking is beneficial for gaining knowledge and for identifying the relevant features of a high-dimensional dataset. However, in several datasets, a few features by themselves may have small correlation with the target classes yet be strongly correlated with the target when combined with other features; that is, multiple features exhibit interactions among themselves. It is necessary to rank the features based on these interactions for better analysis and classifier performance, but evaluating these interactions on large datasets is computationally challenging. Furthermore, datasets often have features with redundant information, and using such redundant features hinders both the efficiency and the generalization capability of the classifier. The major challenge is to efficiently rank the features based on relevance and redundancy on mixed datasets.
In the related publication, we propose a filter-based framework based on Relevance and Redundancy (RaR). RaR computes a single score that quantifies feature relevance by considering interactions between features and redundancy. The top-ranked features of RaR are characterized by maximum relevance and non-redundancy. The evaluation on synthetic and real-world datasets demonstrates that our approach outperforms several state-of-the-art feature selection techniques.

# Relevance and Redundancy Framework (rar-mfs)

rar-mfs is an algorithm for feature selection and can be employed to select features from labelled data sets. The Relevance and Redundancy framework (RaR), which is the theory behind the implementation, is a novel feature selection algorithm that
- works on large data sets (polynomial runtime),
- can handle differently typed features (e.g. nominal features and continuous features), and
- handles multivariate correlations.

## Installation

The tool is written in Scala and uses the Weka framework to load and handle data sets. You can either run it independently, providing the data as an .arff or .csv file, or you can include the algorithm as a (maven / ivy) dependency in your project. As an example data set we use heart-c.

### Project dependency

The project is published to maven central (link). To depend on the project use:

- maven:

```xml
<dependency>
  <groupId>de.hpi.kddm</groupId>
  <artifactId>rar-mfs_2.11</artifactId>
  <version>1.0.2</version>
</dependency>
```

- sbt:

```sbt
libraryDependencies += "de.hpi.kddm" %% "rar-mfs" % "1.0.2"
```

To run the algorithm use:

```scala
import de.hpi.kddm.rar._
// ...
val dataSet = de.hpi.kddm.rar.Runner.loadCSVDataSet(
  new File("heart-c.csv"), isNormalized = false, "")
val algorithm = new RaRSearch(
  HicsContrastPramsFA(
    numIterations = config.samples,
    maxRetries = 1,
    alphaFixed = config.alpha,
    maxInstances = 1000),
  RaRParamsFixed(
    k = 5,
    numberOfMonteCarlosFixed = 5000,
    parallelismFactor = 4))
algorithm.selectFeatures(dataSet)
```
### Command line tool

- EITHER download the prebuilt binary, which requires only an installation of a recent Java version (>= 6):
  1. download the prebuilt jar from the releases tab (latest)
  2. run `java -jar rar-mfs-1.0.2.jar --help`
Using the prebuilt jar, here is an example usage:

```sh
rar-mfs > java -jar rar-mfs-1.0.2.jar arff --samples 100 --subsetSize 5 --nonorm heart-c.arff
Feature Ranking:
  1 - age (12)
  2 - sex (8)
  3 - cp (11)
  ...
```
- OR build the repository on your own:
  1. make sure sbt is installed
  2. clone the repository
  3. run `sbt run`
Simple example using sbt directly after cloning the repository:

```sh
rar-mfs > sbt "run arff --samples 100 --subsetSize 5 --nonorm heart-c.arff"
Feature Ranking:
  1 - age (12)
  2 - sex (8)
  3 - cp (11)
  ...
```
### [Optional]

To speed up the algorithm, consider using a fast solver such as Gurobi (http://www.gurobi.com/). Install the solver and put the provided gurobi.jar into the java classpath.

## Algorithm

### Idea

Abstract overview of the different steps of the proposed feature selection algorithm:

![Algorithm Overview](https://github.com/tmbo/rar-mfs/blob/master/docu/images/algorithm_overview.png)

The Relevance and Redundancy ranking framework (RaR) is a method able to handle large-scale data sets and data sets with mixed features. Instead of directly selecting a subset, a feature ranking gives a more detailed overview of the relevance of the features. The method consists of a multistep approach where we
1. repeatedly sample subsets from the whole feature space and examine their relevance and redundancy: an exploration of the search space to gather more and more knowledge about the relevance and redundancy of features,
2. deduce scores for features based on the scores of the subsets, and
3. create the best possible ranking given the sampled insights.

### Parameters

| Parameter | Default value | Description |
| --------- | ------------- | ----------- |
| m - contrast iterations | 100 | Number of different slices to evaluate while comparing marginal and conditional probabilities |
| alpha - subspace slice size | 0.01 | Percentage of all instances to use as part of a slice which is used to compare distributions |
| n - sampling iterations | 1000 | Number of different subsets to select in the sampling phase |
| k - sample set size | 5 | Maximum size of the subsets to be selected in the sampling phase |
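The sample-credit-rank steps of the multistep approach can be sketched in Python as follows. The subset-relevance function here is a toy stand-in, not RaR's actual relevance and redundancy computation; the feature names echo the heart-c example.

```python
import random

# Toy sketch of the multistep approach: sample feature subsets, score each
# subset, credit each member feature, and rank by accumulated score. The
# subset "relevance" function is a stand-in, not RaR's actual computation.

def rank_features(features, subset_relevance, n_samples=1000, k=3, seed=0):
    rng = random.Random(seed)
    credit = {f: 0.0 for f in features}
    for _ in range(n_samples):
        subset = rng.sample(features, k)   # 1. sample a subset
        score = subset_relevance(subset)   # 2. examine its relevance
        for f in subset:                   #    and credit its members
            credit[f] += score
    # 3. final ranking from the accumulated per-feature scores
    return sorted(features, key=lambda f: -credit[f])

features = ["age", "sex", "cp", "chol", "thalach"]
# Stand-in relevance: pretend "age" and "cp" carry the signal.
relevant = {"age": 1.0, "cp": 0.8}
order = rank_features(features, lambda s: sum(relevant.get(f, 0.1) for f in s))
print(order[:2])
```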
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An understanding of the similar and divergent metrics and methodologies underlying open government data benchmarks can reduce the risks of the potential misinterpretation and misuse of benchmarking outcomes by policymakers, politicians, and researchers. Hence, this study aims to compare the metrics and methodologies used to measure, benchmark, and rank governments' progress in open government data initiatives. Using a critical meta-analysis approach, we compare nine benchmarks with reference to meta-data, meta-methods, and meta-theories. This study finds that both existing open government data benchmarks and academic open data progress models use a great variety of metrics and methodologies, although open data impact is not usually measured. While several benchmarks’ methods have changed over time, and variables measured have been adjusted, we did not identify a similar pattern for academic open data progress models. This study contributes to open data research in three ways: 1) it reveals the strengths and weaknesses of existing open government data benchmarks and academic open data progress models; 2) it reveals that the selected open data benchmarks employ relatively similar measures as the theoretical open data progress models; and 3) it provides an updated overview of the different approaches used to measure open government data initiatives’ progress. Finally, this study offers two practical contributions: 1) it provides the basis for combining the strengths of benchmarks to create more comprehensive approaches for measuring governments’ progress in open data initiatives; and 2) it explains why particular countries are ranked in a certain way. This information is essential for governments and researchers to identify and propose effective measures to improve their open data initiatives.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used for:
Nano Ranking Analysis: determining NPF event occurrence and intensity based on the concentration spectrum of formed (sub-5 nm) particles
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Dataset Card for Argument-Quality-Ranking-30k Dataset
Dataset Summary
Argument Quality Ranking
The dataset contains 30,497 crowd-sourced arguments for 71 debatable topics labeled for quality and stance, split into train, validation and test sets. The dataset was originally published as part of our paper: A Large-scale Dataset for Argument Quality Ranking: Construction and Analysis.
Argument Topic
This subset contains 9,487 of the arguments only with… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/argument_quality_ranking_30k.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data for DEA for absolute with titled manuscript
💬Also have a look at
💡 COUNTRIES Research & Science Dataset - SCImagoJR
💡 UNIVERSITIES & Research INSTITUTIONS Rank - SCImagoIR
☢️❓The entire dataset is obtained from public and open-access data of ScimagoJR (SCImago Journal & Country Rank)
ScimagoJR Journal Rank
SCImagoJR About Us
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A dataset comprising 55 molecules described by seven criteria was used. The criteria are binding activity values for each target, expressed as the half maximal activity concentration (AC50) based on the dose-response curves; thus, the smaller the concentration, the more active the molecule. Seven targets are taken into account, belonging to the nuclear receptor family: Estrogen Receptor Alpha (ERα), Estrogen Receptor Beta (ERβ), Farnesoid X Receptor (FXR), Progesterone Receptor (PR), Pregnane X Receptor (PXR), Peroxisome Proliferator-Activated Receptor Gamma (PPARγ) and Peroxisome Proliferator-Activated Receptor Delta (PPARδ). To create the dataset, we collected from [22] the Tox21 databases [23, 24] of agonism/antagonism activity for the seven nuclear receptors.
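Since a smaller AC50 means a more active molecule, one simple way to order molecules across several targets is by ascending mean AC50. This toy sketch uses invented molecule names and values; the paper's actual multi-criteria ranking method may differ.

```python
# Sketch of ordering molecules by activity across several targets, where a
# smaller AC50 means a more active molecule. Names and values (in uM) are
# invented; the dataset itself has 55 molecules and seven targets.

ac50 = {
    "mol1": {"ERa": 0.1, "PXR": 5.0},
    "mol2": {"ERa": 2.0, "PXR": 0.5},
    "mol3": {"ERa": 9.0, "PXR": 8.0},
}

def mean_ac50(values):
    return sum(values.values()) / len(values)

# Most active first: ascending mean AC50 across the targets.
order = sorted(ac50, key=lambda m: mean_ac50(ac50[m]))
print(order)  # ['mol2', 'mol1', 'mol3']
```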
The ATP Tour (known as the ATP World Tour from January 2009 until December 2018) is a worldwide top-tier tennis tour for men organized by the Association of Tennis Professionals. The second-tier tour is the ATP Challenger Tour and the third-tier is ITF Men's World Tennis Tour. The equivalent women's organisation is the WTA Tour.
The ATP Tour comprises ATP Masters 1000, ATP 500, and ATP 250.[1] The ATP also oversees the ATP Challenger Tour,[2] a level below the ATP Tour, and the ATP Champions Tour for seniors. Grand Slam tournaments, a small portion of the Olympic tennis tournament, the Davis Cup, and the entry-level ITF World Tennis Tour do not fall under the purview of the ATP, but are overseen by the ITF instead and the International Olympic Committee (IOC) for the Olympics. In these events, however, ATP ranking points are awarded, with the exception of the Olympics. The four-week ITF Satellite tournaments were discontinued in 2007. Players and doubles teams with the most ranking points (collected during the calendar year) play in the season-ending ATP Finals, which, from 2000–2008, was run jointly with the International Tennis Federation (ITF). The details of the professional tennis tour are:
| Event | Number | Total prize money (USD) | Winner's ranking points | Governing body |
| ----- | ------ | ----------------------- | ----------------------- | -------------- |
| Grand Slam | 4 | See individual articles | 2,000 | ITF |
| ATP Finals | 1 | 4,450,000 | 1,100–1,500 | ATP (2009–present) |
| ATP Masters 1000 | 9 | 2,450,000 to 3,645,000 | 1,000 | ATP |
| ATP 500 | 13 | 755,000 to 2,100,000 | 500 | ATP |
| ATP 250 | 39 | 416,000 to 1,024,000 | 250 | ATP |
| Olympics | 1 | See individual articles | 0 | IOC |
| ATP Challenger Tour | 178 | 40,000 to 220,000 | 80 to 125 | ATP |
| ITF Men's Circuit | 534 | 10,000 and 25,000 | 18 to 35 | ITF |
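The calendar-year points system can be illustrated with a toy tally: titles won at each tier contribute the winner's points from the tiers described above, and season totals decide ATP Finals qualification. The players and results below are hypothetical.

```python
# Toy tally of calendar-year ranking points. Winner's points per tier follow
# the tour description; players and their title lists are invented.

WINNER_POINTS = {
    "Grand Slam": 2000,
    "ATP Masters 1000": 1000,
    "ATP 500": 500,
    "ATP 250": 250,
}

results = {
    "Player A": ["Grand Slam", "ATP Masters 1000"],  # titles won this season
    "Player B": ["ATP 500", "ATP 250", "ATP 250"],
}

season_points = {
    player: sum(WINNER_POINTS[t] for t in titles)
    for player, titles in results.items()
}
print(season_points)  # {'Player A': 3000, 'Player B': 1000}
```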
The dataset is from Jeff Sackmann (https://github.com/JeffSackmann/tennis_atp).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The first column and the first row list components i and j of each pair, respectively.