Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Probabilistic models such as logistic regression, Bayesian classification, neural networks, and models for natural language processing are increasingly present in both undergraduate and graduate statistics and data science curricula due to their wide range of applications. In this article, we present a one-week course module on variational inference, a popular optimization-based approach for approximate inference with probabilistic models, for students in advanced undergraduate and applied graduate courses. Our proposed module is guided by active learning principles: in addition to lecture materials on variational inference, we provide an accompanying class activity, an R Shiny app, and guided labs, with R code, based on real data applications of logistic regression and document clustering using Latent Dirichlet Allocation. The main goal of our module is to expose students to a method that facilitates statistical modeling and inference with large datasets. Using our proposed module as a foundation, instructors can adopt and adapt it to introduce more realistic case studies and applications in data science, Bayesian statistics, multivariate analysis, and statistical machine learning courses.
https://paper.erudition.co.in/terms
Question Paper Solutions of chapter Descriptive Statistics of Basic Data Science, 3rd Semester, Master of Computer Applications (2 Years)
https://paper.erudition.co.in/terms
Question Paper Solutions of chapter Inferential Statistics of Basic Data Science, 3rd Semester, Master of Computer Applications (2 Years)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Finland Education Expenditure: University of Applied Sciences Education was reported at 916.000 EUR mn in 2016, an increase from 883.000 EUR mn in 2015. The series is updated yearly, with a median of 733.000 EUR mn from Dec 1995 to 2016 across 22 observations. It reached an all-time high of 928.000 EUR mn in 2012 and a record low of 145.000 EUR mn in 1995. The series remains in active status in CEIC and is reported by Statistics Finland. The data is categorized under Global Database’s Finland – Table FI.G005: Education Statistics.
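As a small illustration of the year-over-year comparison in the record above, the relative change between the 2015 and 2016 figures can be computed as follows. This is only a sketch: the function name `pct_change` is hypothetical, and the two input values are the ones quoted in the description.

```python
def pct_change(previous, current):
    """Relative change from `previous` to `current`, in percent."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# Values quoted in the record above, in EUR mn.
expenditure_2015 = 883.0
expenditure_2016 = 916.0

change = pct_change(expenditure_2015, expenditure_2016)
print(f"2015 -> 2016: {change:.2f}% change")
```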
In the winter semester of 2023/24, the **************** of students in Germany were enrolled in universities, followed by universities of applied sciences. Smaller numbers were distributed in higher education establishments with a specific subject focus.
Class in session
In general, the German academic year in higher education is split into a winter and a summer semester. Actual class starting and ending dates may differ depending on the type of university or college attended, as well as the type of course. In most cases, the German winter semester starts in October and ends in March, while the summer semester begins in April and ends in September. The time during the semester when no classes take place is called the non-lecture period (vorlesungsfreie Zeit), otherwise known among students as semester vacation (Semesterferien). On average, German university graduates who completed their first degree studied for * semesters. It is not uncommon to consider a second degree after a bachelor's, for example a master's.
Tuition-free
First-year student numbers dropped between 2019 and 2022, most probably because universities were closed and operating only online during the COVID-19 pandemic. However, numbers have been increasing again since 2023. On a positive note, the number of students taking out a loan to finance their studies has generally been decreasing, as all ** German states have abolished tuition fees for first degrees, although there has been an increase again after a low during the pandemic.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
RDM file with the statistical analysis performed on the DGS and RSL data for the project "Body-anchored verbs and argument omission in two sign languages".
This data set consists of upward looking sonar draft data collected by submarines in the Arctic Ocean. It includes data from both U.S. Navy and Royal Navy submarines. Maps showing submarine tracks are available. Data are provided as ice draft profiles and as statistics derived from the profile data. Statistics files include information concerning ice draft characteristics, keels, level ice, leads, undeformed and deformed ice. Data from the U.S. Navy's Digital Ice Profiling System (DIPS) have been interpolated and processed for release as unclassified data at the U.S. Army's Cold Regions Research and Engineering Laboratory (CRREL) in Hanover, New Hampshire. Data from the analog draft recording system were digitized and then processed by the Polar Science Center, Applied Physics Laboratory, University of Washington. Data from British submarines were provided by the Department of Applied Mathematics and Theoretical Physics, University of Cambridge. All data sources used similar processing methods in order to ensure a consistent data set. Access to the Submarine Upward Looking Sonar Ice Draft Profile Data and Statistics data set is unrestricted, but users are encouraged to register for the data. Registered users will receive e-mail notification about any product changes.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains a dataset of 400 higher education institutions in Germany, including universities, universities of applied sciences, and higher institutes such as the Higher Institute of Engineering, the Higher Institute of Biotechnologies, and a few others. This dataset was compiled in response to a cybersecurity investigation of German higher education institutions' websites [1]. The data is being made publicly available to promote open science principles [2].
The data includes the following fields for each institution:
The dataset was created from two sources: the European Higher Education Sector Observatory (ETER) [3] and the Eurostat NUTS (Nomenclature of Territorial Units for Statistics) classifications for 2013-16 [4] and 2021 [5]. The data was collected on December 26, 2024.
This section outlines the methodology used to create the dataset of Higher Education Institutions (HEIs) in Germany. The dataset consolidates information from various sources, processes the data, and enriches it to provide accurate and reliable insights.
Data Sources
eter-export-2021-DE.xlsx
NUTS2013-NUTS2016.xlsx
NUTS2021.xlsx
Data Cleaning and Preprocessing
Column Renaming
Columns in the raw dataset were renamed for consistency and readability. Examples include:
ETER ID → ETER_ID
Institution Name → Name
Legal status → Category
Value Replacement
The Category column was cleaned, with government-dependent institutions classified as "public."
Handling Missing or Incorrect Data
ETER_ID. For instance:
DE0012 (updated to www.zeppelin-university.com)
FR0906 (updated to hmtm.de)
FR0104 (updated to www.dhfpg.de)
FR0466 (updated to fhf.brandenburg.de)
FR0907 (updated to hr-nord.niedersachsen.de)
FR0333 (updated to www.srh-university.de)
Regional Data Integration
Final Dataset
The final dataset was saved as a CSV file, germany-heis.csv, encoded in UTF-8 for compatibility. It includes detailed information about HEIs in Germany, their categories, regional affiliations, and membership in European alliances.
Summary
This methodology ensures that the dataset is accurate, consistent, and enriched with valuable regional and institutional details. The final dataset is intended to serve as a reliable resource for analyzing German HEIs.
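The column renaming and value replacement steps described above might be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' actual pipeline: the function name `clean_row`, the sample record, and the assumption that the raw legal-status text contains the phrase "government-dependent" are all hypothetical. Only the three column renames and the "public" label come from the description.

```python
# Column renames listed in the methodology above.
RENAMES = {
    "ETER ID": "ETER_ID",
    "Institution Name": "Name",
    "Legal status": "Category",
}


def clean_row(raw):
    """Rename columns and normalise the Category value of one record."""
    # Keys without a rename entry are kept unchanged.
    row = {RENAMES.get(key, key): value for key, value in raw.items()}
    category = row.get("Category", "").strip().lower()
    # Assumption: government-dependent institutions are recognisable by
    # this phrase in the raw legal-status text; they become "public".
    if "government-dependent" in category:
        category = "public"
    row["Category"] = category
    return row


# Hypothetical sample record, for illustration only.
sample = {
    "ETER ID": "DE0001",
    "Institution Name": "Example University",
    "Legal status": "Private government-dependent",
}
print(clean_row(sample))
```

Working row by row keeps the sketch independent of any particular spreadsheet library; with the real .xlsx exports, a reader for that format would be needed first.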
This data is available under the Creative Commons Zero (CC0) license and can be used for any purpose, including academic research purposes. We encourage the sharing of knowledge and the advancement of research in this field by adhering to open science principles [2].
If you use this data in your research, please cite the source and include a link to this repository. To properly attribute this data, please use the following DOI: 10.5281/zenodo.7614862
If you have any updates or corrections to the data, please feel free to open a pull request or contact us directly. Let's work together to keep this data accurate and up-to-date.
We would like to acknowledge the support of the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project "Cybers SeC IP" (NORTE-01-0145-FEDER-000044). This study was also developed as part of the Master in Cybersecurity Program at the Instituto Politécnico de Viana do Castelo, Portugal.
United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve both as a current landscape analysis and as a baseline for future studies of ag research data.
Purpose
As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication.
The goals of this study were to:
establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals
compare how much data is in institutional vs. domain-specific vs. federal platforms
determine which repositories are recommended by top journals that require or recommend the publication of supporting data
ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data
Approach
The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data.
To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.
Search methods
We first compiled a list of known domain-specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain-specific, though some contained a mix of data subjects. We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag-specific university repositories and general university repositories that housed a portion of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. 
General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories. Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo. Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals. 
Evaluation
We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.
Results
A summary of the major findings from our data review:
Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.
There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.
Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.
See included README file for descriptions of each individual data file in this dataset.
Resources in this dataset:
Resource Title: Journals. File Name: Journals.csv
Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
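The evaluation thresholds described in this record (a search term comprising at least 1% or 5% of a collection, and returning more than 100 or more than 500 results) could be applied to a single search result along these lines. This is a hedged sketch, not the study's code: the function name `evaluate_term`, the dictionary keys, and the example counts are hypothetical.

```python
def evaluate_term(total_datasets, hits):
    """Flag one search term against the thresholds named in the evaluation."""
    if total_datasets <= 0:
        raise ValueError("total_datasets must be positive")
    share = hits / total_datasets * 100.0
    return {
        "percent_of_collection": share,
        "at_least_1_percent": share >= 1.0,
        "at_least_5_percent": share >= 5.0,
        "over_100_results": hits > 100,
        "over_500_results": hits > 500,
    }


# Hypothetical counts: a repository of 10,000 datasets, 600 of which
# match one ag search term.
flags = evaluate_term(total_datasets=10_000, hits=600)
for name, value in flags.items():
    print(name, value)
```

A repository meeting both `at_least_5_percent` and `over_500_results` would correspond to the "large and significant portion of ag data" group mentioned in the results.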
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is originally from Dhaka Stock Exchange Ltd. and was collected using the Dhaka Stock Exchange API. It was created to assign analytical report writing tasks to Summer 2020 students enrolled in the ASDS18: Data Mining course, in partial fulfillment of the requirements for the Professional Masters in Applied Statistics and Data Science (PMASDS) degree.
The datasets consist of several stock company predictor (independent) variables and one target (dependent) variable, Outcome. Independent variables include the last price, net asset value (NAV) of the stock, Earnings Per Share (EPS), price-to-earnings (P/E) ratio of the stock, paid-up capital per share, and so on.
It contains information on 374 listed companies from Dhaka Stock Exchange - DSE, Bangladesh. The outcome tested was Category, 258 tested positive and 500 tested negative. Therefore, there is one target (dependent) variable and 8 attributes.
Dr. Md. Rezaul Karim, Associate Professor, Department of Statistics, Jahangirnagar University, Dhaka, Bangladesh (2021) provided us with this dataset. The data set was collected using the Dhaka Stock Exchange API to assign analytical report writing tasks to Summer 2020 students, in partial fulfillment of the requirements for the Professional Masters in Applied Statistics and Data Science (PMASDS) degree.
This dataset offers a set of statistics on the number of students enrolled from 2006-07 to 2022-23 in each public institution under the supervision of the French Ministry of Higher Education: universities, universities of technology, grands établissements, COMUE, écoles normales supérieures, écoles centrales, INSA, other engineering schools, and so on. Unless otherwise noted, the indicators in this dataset do not take into account double CPGE enrollments. Students enrolled in parallel in IFSI (nursing training institutes) are not counted in the institutions' enrollment figures.
The data are taken from the Student Monitoring Information System (SISE). Enrollments are observed on January 15, except for the University of New Caledonia, which is given additional time to account for the Southern Hemisphere calendar. Each line of this dataset provides one institution’s statistics for one academic year. The dataset breaks these statistics down by a set of variables on the student (sex, baccalaureate, age at the baccalaureate, national attractiveness, international attractiveness) and on the main program the student follows (LMD cycle, type of diploma, diploma, major discipline, discipline, and disciplinary sector). The geographical data in this dataset relate to the seat of the institution, not to the actual location of the program followed by the student. Cross-tabulated and more detailed data are available in the dataset “[Numbers of students enrolled in public institutions under the supervision of the Ministry of Higher Education](https://data.enseignementsup-recherche.gouv.fr/explore/dataset/fr-esr-sise-effectifs-d-etudiants-inscrits-esr-public/)”.
National Training Framework and EPSCP-CPGE agreements: impact on measured enrollment changes
Two regulatory provisions affect trends from 2018-19 onwards and create statistical breaks:
- The new National Training Framework (CNF), put in place for bachelor’s degrees (licences). The CNF significantly reduces the number of degree titles. Some of these titles have become more precise, making classification by discipline easier: this is the case for science licences, which are less frequently classified in “Plurisciences” and more often in “fundamental sciences and applications” or “sciences of nature and life”. Other titles, on the other hand, are more general, particularly in literary disciplines (e.g. the Humanities licence), and are more frequently classified as “plurilettres, languages, humanities”.
- The progressive implementation of agreements between high schools with preparatory classes for the Grandes écoles (CPGE) and the public institutions of a scientific, cultural and professional nature (EPSCP), which include universities, significantly increases the number of LMD licence enrollments from that year onwards, even though double enrollments were already possible and in effect before. University enrollments include these double enrollments.
These two developments mainly affect enrollment by discipline in L1, which hosts the vast majority of new entrants. The impact on total enrollment is more marginal. Trends that take double enrollments into account are at constant regulatory scope.
In 2015-2016, the 2014-15 data were carried over for these institutions: University of New Caledonia, ENS Cachan, ENS Rennes. For more information on this dataset, see dataset documentation.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The document identifier of the csv version of the data file is 10.21942/uva.12443465. The identifier of the metadata table is 10.21942/uva.12443468. This long-form documentation is identified with 10.21942/uva.12443474
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains annual statistics from Finnish scientific libraries for 2013 according to international library statistics standards. The dataset covers data and indicators on library premises, branches, collections, services, acquisitions, use, users, opening hours, finances, and staff. The statistical data have been collected from the National Library of Finland, university libraries, university of applied sciences libraries, and a few specialised libraries. The collecting organisations are specified in the data file. The collecting organisations compiled the statistical data in accordance with the following standards: ISO 2789 (International library statistics), ISO 11620 (Library performance indicators), and ISO TR 28118 (Performance indicators for national libraries). More information on the data collection can be found in the KITT user guide (available only in Finnish, see Related Materials).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This research designs and develops a software innovation for Pattavia pineapple cultivation and productivity distribution planning to increase farmers' income. It formulates and introduces a machine learning (ML) model, a new intelligent particle swarm optimization algorithm with extreme learning machine (NIPSO-ELM), to forecast Pattavia pineapple productivity with a notable degree of precision and dependability. In this work, an artificial neural network (ANN) and the standard ELM were built and assessed for their ability to forecast the productivity of Pattavia pineapples. The findings indicate that the ELM neural network is an innovative model characterized by its straightforward architecture and exceptional performance. Moreover, the use of particle swarm optimization (PSO), ant colony optimization (ACO), and the NIPSO algorithm significantly enhanced ELM performance when forecasting the productivity of Pattavia pineapples. The NIPSO-ELM model emerged as the best ML model for accurately, reliably, and stably forecasting the productivity of Pattavia pineapples in practical scenarios. The best NIPSO-ELM model for Loei province, Thailand, exhibits the following performance metrics: RMSE = 304.36389, MAE = 243.29531, MAPE = 0.03753, and MASE = 0.93157. The best NIPSO-ELM model for Nong Khai province, Thailand, exhibits RMSE = 304.57352, MAE = 244.67834, MAPE = 0.03756, and MASE = 0.93296.
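For readers unfamiliar with the four error metrics reported above, the following sketch shows how they are commonly computed. It assumes the usual textbook definitions: MAPE is expressed as a fraction rather than a percentage, and MASE scales the forecast MAE by the in-sample MAE of a one-step naive forecast on the training series. The study's exact definitions are not given in the description, and the function names and sample numbers here are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actuals must be non-zero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mase(actual, predicted, train):
    """MAE scaled by the MAE of a one-step naive forecast on `train`."""
    naive_mae = sum(
        abs(train[i] - train[i - 1]) for i in range(1, len(train))
    ) / (len(train) - 1)
    return mae(actual, predicted) / naive_mae


# Hypothetical yields, for illustration only.
train = [50.0, 70.0, 100.0]
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print("RMSE", rmse(actual, predicted))
print("MAE ", mae(actual, predicted))
print("MAPE", mape(actual, predicted))
print("MASE", mase(actual, predicted, train))
```

Under this definition, a MASE below 1 means the model's average error is smaller than that of the naive forecast on the training data.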
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data for a study into the relationship between student GPA, literacy skills, and numeracy skills. Students are from a third-year university course focusing on statistics applied to biological sciences. Also includes raw data related to the average H-index, citation rate, and publication rate of university researchers across three broad fields (biological theory, biological statistics, and science communications).