Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Pen-and-paper homework and project-based learning are both commonly used instructional methods in introductory statistics courses. However, few studies have compared these two methods directly. In this case study, each was used in two different sections of the same introductory statistics course at a regional state university. Students’ statistical literacy was measured by exam scores across the course, including the final. The comparison of the two instructional methods uses descriptive statistics and two-sample t-tests, as well as the authors’ reflections on the instructional methods. Results indicated that there is no statistically discernible difference between the two instructional methods in the introductory statistics course.
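As a rough illustration of the kind of two-sample comparison described above, the sketch below computes a Welch t statistic and its approximate degrees of freedom using only the Python standard library. The description does not say whether a pooled or Welch test was used, so the Welch variant is an assumption, and the scores are invented for illustration, not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variances (n - 1 denominator) divided by sample size
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical exam scores, invented purely for illustration
homework_scores = [72, 85, 78, 90, 66, 81]
project_scores = [75, 80, 79, 88, 70, 84]

t_stat, dof = welch_t(homework_scores, project_scores)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A p-value would then come from the t distribution with `dof` degrees of freedom, for example via a statistics package.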
the Department of Energy’s Enterprise Project Management Organization (EPMO), providing leadership and assistance in developing and implementing DOE-wide policies, procedures, programs, and management systems pertaining to project management, and independently monitors, assesses, and reports on project execution performance. The office validates project performance baselines (scope, cost and schedule) of the Department’s largest construction and environmental clean-up projects prior to budget request to Congress, an active project portfolio totaling over $30 billion. The office also serves as Executive Secretariat for the Department’s Energy Systems Acquisition Advisory Board (ESAAB) and the Project Management Risk Committee (PMRC). In these capacities, the Director is accountable to the Deputy Secretary.
This dataset was created by Dhinesh Gupthaa K
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Machine learning (ML) has gained much attention and has been incorporated into our daily lives. While there are numerous publicly available ML projects on open source platforms such as GitHub, there have been limited attempts at filtering those projects to curate ML projects of high quality. The limited availability of such a high-quality dataset poses an obstacle to understanding ML projects. To help clear this obstacle, we present NICHE, a manually labelled dataset consisting of 572 ML projects. Based on evidence of good software engineering practices, we label 441 of these projects as engineered and 131 as non-engineered. In this repository we provide the "NICHE.csv" file, which contains the list of project names along with their labels, descriptive information for every dimension, and several basic statistics, such as the number of stars and commits. This dataset can help researchers understand the practices that are followed in high-quality ML projects. It can also be used as a benchmark for classifiers designed to identify engineered ML projects.
GitHub page: https://github.com/soarsmu/NICHE
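A minimal sketch of how the label distribution in such a file might be checked. The column names and rows below are hypothetical stand-ins, since the actual header of "NICHE.csv" is not given here; on the real file, the counts would be expected to match the 441 engineered and 131 non-engineered projects stated above.

```python
import csv
import io
from collections import Counter


def count_labels(csv_text, label_column):
    """Count how many rows carry each value in the given label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


# Tiny stand-in data: the column names and rows are hypothetical,
# not the real NICHE.csv header or contents.
sample = (
    "project,label\n"
    "owner/repo-a,engineered\n"
    "owner/repo-b,non-engineered\n"
    "owner/repo-c,engineered\n"
)

counts = count_labels(sample, "label")
print(counts)
```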
This dataset was created by Glitch_in_Vector
Chunk_0 for me, Choose others as you want.
https://creativecommons.org/publicdomain/zero/1.0/
Comprehensive football (soccer) data lake from Transfermarkt, clean and structured for analysis and machine learning.
Everything in raw CSV format – perfect for EDA, ML, and advanced football analytics.
A complete football data lake covering players, teams, transfers, performances, market values, injuries, and national team stats. Perfect for analysts, data scientists, researchers, and enthusiasts.
Here’s the high-level schema to help you understand the dataset structure:
Transfermarkt Dataset ER Diagram: https://i.imgur.com/WXLIx3L.png
Organized into 10 well-structured CSV categories:
Most football datasets are pre-processed and restrictive. This one is raw, rich, and flexible:
I’m always excited to collaborate on innovative football data projects. If you’ve got an idea, let’s make it happen together!
If this dataset helps you:
- Upvote on Kaggle
- Star the GitHub repo
- Share with others in the football analytics community
Keywords: football analytics, soccer dataset, transfermarkt, sports analytics, machine learning, football research, player statistics
🔥 Analyze football like never before. Your next AI or analytics project starts here.
City of Pittsburgh Capital Projects Budgets. NOTE: The data in this dataset has not been updated since 2021 because of a broken data feed. We're working to fix it.
Annual Statistics of Approved Projects under General Support Programme
In 2024, the total number of open source projects taken up was about *** million. Of these, the majority were in JavaScript, with about *** million projects, far more than in any other language.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Variables
Agile Effectiveness (measured on a Likert scale from 2 to 5): This variable captures how respondents perceive the effectiveness of Agile methodology in enhancing project management processes.
Risk Mitigation (Likert scale 2 to 5): This variable reflects respondents' views on how well Agile methodology supports the mitigation of risks throughout the project lifecycle.
Management Satisfaction (Likert scale 2 to 5): This variable measures how satisfied the management is with the outcomes of projects where Agile methodologies were implemented.
Supply Chain Improvement (Likert scale 2 to 5): This variable captures the perceived improvements in supply chain processes that result from using Agile methods.
Time Efficiency (Likert scale 2 to 5): This measures the impact of Agile methodology on improving the efficiency of time management within projects.
Cost Savings (percentage from 10% to 48%): This variable quantifies the percentage of cost savings achieved as a result of implementing Agile methods.
Project Success (binary: 0 = Failure, 1 = Success): This is the dependent variable and represents whether or not the project was considered successful.
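The ranges listed above can be turned into a simple sanity check before any modelling. The sketch below is illustrative only: the field names are hypothetical (the actual column names are not given), and only the ranges come from the variable descriptions.

```python
# Field names are hypothetical; only the value ranges come from the
# variable descriptions above.
LIKERT_FIELDS = (
    "agile_effectiveness",
    "risk_mitigation",
    "management_satisfaction",
    "supply_chain_improvement",
    "time_efficiency",
)


def validate_record(record):
    """Return a list of range violations for one survey record (empty if valid)."""
    problems = []
    for field in LIKERT_FIELDS:
        if not 2 <= record[field] <= 5:
            problems.append(f"{field} outside 2-5: {record[field]}")
    if not 10 <= record["cost_savings_pct"] <= 48:
        problems.append(f"cost_savings_pct outside 10-48: {record['cost_savings_pct']}")
    if record["project_success"] not in (0, 1):
        problems.append(f"project_success not 0 or 1: {record['project_success']}")
    return problems


# Invented example record, for illustration only
example = {
    "agile_effectiveness": 4,
    "risk_mitigation": 3,
    "management_satisfaction": 5,
    "supply_chain_improvement": 2,
    "time_efficiency": 4,
    "cost_savings_pct": 25,
    "project_success": 1,
}
print(validate_record(example))
```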
The statistic shows the success rate of various big data initiatives as of 2019, according to a survey of industry-leading firms, primarily in the United States. As of that time, **** percent of respondents reported having seen measurable results from big data initiatives to decrease expenses.
This data set contains DOT construction project information. The data is refreshed nightly from multiple data sources, so it becomes stale rather quickly.
Analysis of the projects proposed by the seven finalists to USDOT's Smart City Challenge, including challenge addressed, proposed project category, and project description. The times reported for the speed profiles are between 2:00 PM and 8:00 PM in increments of 10 minutes.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Crowdfunding has become one of the main sources of initial capital for small businesses and start-up companies that are looking to launch their first products. Websites like Kickstarter and Indiegogo provide a platform for millions of creators to present their innovative ideas to the public. This is a win-win situation: creators can accumulate initial funds while the public gets access to cutting-edge prototype products that are not yet available on the market.
At any given point, Indiegogo has around 10,000 live campaigns while Kickstarter has 6,000. It has become increasingly difficult for projects to stand out from the crowd. Of course, advertising via various channels is by far the most important factor in a successful campaign. However, creators with a smaller budget are left wondering:
"How do we increase the probability of success of our campaign starting from the very moment we create our project on these websites?"
All of my raw data are scraped from Kickstarter.com.
First 4000 live projects that are currently campaigning on Kickstarter (live.csv)
Top 4000 most backed projects ever on Kickstarter (most_backed.csv)
See more at http://datapolymath.paperplane.io/
This list includes all pipeline projects that have submitted an Intake. Some may be held at Intake due to early concept status or because the developer has reached their maximum project limit in ORCA.
Smart local energy systems require the communication, automation and engagement of individual systems, each producing substantial quantities of data. Data analytics and digital communication technologies need to be used to explore and investigate hypothetical and real-time operating scenarios for future smart local energy systems. To conduct experimental data analysis of the way energy is generated, stored, shared and consumed, the Newcastle team created the SLES database to store data from our partners and results of our case studies. A novel, generic and scalable smart energy platform for optimised design and the real-time efficient control of power generation and delivery for local energy systems through IoT services for data analytics was proposed. The structure of each table of the SLES database is shown in the attached SQL file (sles_db.sql). The design and functionality of the smart energy platform are presented in the paper submitted to IREC2021 - 12th International Renewable Engineering Conference (https://irec2021.meu.edu.jo/). The title of the paper is "An integration platform for optimised design and real-time control of smart local energy systems".
Replacement project detail by school.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Algeria Value of Projects: Private: Construction and Public Works data was reported at 903,269.000 DZD mn in 2017. This records an increase from the previous number of 94,134.000 DZD mn for 2016. Algeria Value of Projects: Private: Construction and Public Works data is updated yearly, averaging 61,535.000 DZD mn from Dec 1994 (Median) to 2017, with 22 observations. The data reached an all-time high of 903,269.000 DZD mn in 2017 and a record low of 1,784.000 DZD mn in 1994. Algeria Value of Projects: Private: Construction and Public Works data remains in active status in CEIC and is reported by the National Office of Statistics. The data is categorized under Global Database’s Algeria – Table DZ.O004: Value of Private Projects.
https://spdx.org/licenses/CC0-1.0.html
The continuous growth of the global human population results in increased use and change of landscapes, with infrastructures like transportation or energy facilities being a particular risk to large carnivores. Environmental Impact Assessments were established to identify the probable environmental consequences of any new proposed project, find ways to reduce impacts, and provide evidence to inform decision making and mitigation. Portugal has a wolf population of around 300 individuals, designated as an endangered species with full legal protection. They occupy the northern mountainous areas of the country, which have also been the focus of new human infrastructures over the last 20 years. Consequently, dozens of wolf monitoring programs have been established to evaluate wolf population status, to identify impacts, and to inform appropriate mitigation or compensation measures. We reviewed Portuguese wolf monitoring programs to answer four key questions: do wolf programs examine adequate biological parameters to meet monitoring objectives? is the study design suitable for measuring impacts? are data collection methods and effort sufficient for the stated inference objectives? and do statistical analyses of the data lead to robust conclusions? Overall, we found a mismatch between the stated aims of wolf monitoring and the results reported, and often neither aligns with the existing national wolf monitoring guidelines. Despite the vast effort expended and the diversity of methods used, data analysis makes almost exclusive use of relative indices or summary statistics, with little consideration of the potential biases that arise through the (imperfect) observational process. This makes comparisons of impacts across space and time difficult and is therefore unlikely to contribute to a general understanding of wolf responses to infrastructure-related disturbance.
We recommend the development of standardized monitoring protocols and advocate for the use of statistical methods that account for imperfect detection to guarantee accuracy, reproducibility, and efficacy of the programs.
Methods
We reviewed all major wolf monitoring programs developed for environmental impact assessments in Portugal since 2002 (Table S1, Supplementary material). Given that the focus here is on the adequacy of targeted wolf monitoring for delivering conclusions about the effects of infrastructure development, we reviewed only monitoring programs that were specifically designed for wolves and not those concerned with general mammalian assessment. The starting point was a compilation from the 2019-2021 National Wolf Census (Pimenta et al., 2023), where every wolf monitoring program that occurred between 2014 and 2019 in Portugal was identified. The list was completed with projects that started before 2014 or after 2019 based on personal knowledge and inquiries to principal scientific teams, governmental agencies, and EIA consultants. Depending on duration, wolf monitoring programs can produce several, usually annual, reports that are not peer-reviewed and do not appear on standard search engines (e.g., Web of Science or Google Scholar) but are publicly available from the Portuguese Environmental Agency (APA – www.apambiente.pt). We conducted an online search on APA's search engine (https://siaia.apambiente.pt/) and identified a total of 30 projects. For each of these projects, we were interested in the first and the last report to identify any methodological changes. If the last report was not present, we reviewed the most recent one. If no report was present, we requested it from the team responsible.
Our investigation centred on characterizing and quantifying four components of wolf monitoring programs that are interlinked and that should be ideally determined by the initial objectives: (1) biological parameters, i.e., what wolf parameters were studied to assess impacts; (2) study design, i.e., what sampling schemes were followed to collect and analyse data; (3) data collection, i.e., which sampling methodology and how much effort was used to collect data; and (4) data analysis, i.e., how data were analysed to estimate relevant parameters and assess impact. Biological parameters were identified and classified under two categories: occurrence and demography, which broadly correspond to the necessary inputs to assess impacts like exclusion effect and changes in reproductive patterns. Occurrence-related parameters refer to variables used to measure the presence or absence of wolves, whereas demographic parameters refer to variables that intend to measure population-level effects such as abundance, density, survival, or reproduction. We also recorded whether any effort was made to quantify prey population distribution or abundance as recommended in the guidelines. For study design, we reviewed the sampling design of the project, with specific focus on the spatial and temporal aspect of the study such as total area surveyed, the definition of a sampling site within this region (i.e., resolution), the duration of the study and the number of sampling seasons. The goal here was to determine whether the sampling scheme used was appropriate for assessing infrastructure impacts on wolf distribution or demography, depending on what the focus was. For data collection, we identified the main data collection methodologies used and the corresponding sampling effort. By far the most frequent method used is sign surveys, and specifically scat surveys, and for these studies we recorded whether genetic identification of species or individuals based on faecal DNA was attempted. 
We compare how sampling effort varies by the various inference objectives and, as above, assess which, if any, project or data collection approach is most likely to produce evidence of impact. We divided the Analysis component into two groups: single-year and multi-year analyses. For single-year analysis we identified how monitoring projects used data to make inferences about the state of the biological parameters of interest and discuss the associated strengths and weaknesses. For multi-year analyses, we recorded how differences or trends were quantified and associated with infrastructure impacts, commenting on the statistical robustness of the analyses used across the projects.
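To make the concern about imperfect detection concrete, here is a minimal worked example under a standard occupancy-style framing. The symbols and numbers are illustrative assumptions, not values from the reviewed programs: ψ is the probability that a site is occupied, p is the per-visit detection probability (assumed constant, with independent visits), and K is the number of visits.

```latex
% psi: probability a site is occupied
% p:   per-visit detection probability (assumed constant, visits independent)
% K:   number of visits to the site
p^{*} = 1 - (1 - p)^{K}, \qquad
\mathbb{E}[\text{naive occupancy}] = \psi \, p^{*}

% Illustrative values only: p = 0.3, K = 3, psi = 0.5
p^{*} = 1 - 0.7^{3} = 0.657, \qquad
\psi \, p^{*} = 0.5 \times 0.657 \approx 0.33
```

Under these assumed values, a raw index based on sites with detections would suggest roughly 33% occupancy when the true value is 50%, which is the kind of bias that detection-corrected methods are meant to address.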
Since the beginning of the 1960s, Statistics Sweden, in collaboration with various research institutions, has carried out follow-up surveys in the school system. These surveys have taken place within the framework of the IS project (Individual Statistics Project) at the University of Gothenburg and the UGU project (Evaluation through follow-up of students) at the University of Teacher Education in Stockholm, which since 1990 have been merged into a research project called 'Evaluation through Follow-up'. The follow-up surveys are part of the central evaluation of the school and are based on large nationally representative samples from different cohorts of students.
Evaluation through follow-up (UGU) is one of the country's largest research databases in the field of education. UGU is part of the central evaluation of the school and is based on large nationally representative samples from different cohorts of students. The longitudinal database contains information on nationally representative samples of school pupils from ten cohorts, born between 1948 and 2004. The sampling process was based on the student's birthday for the first two and on the school class for the other cohorts.
For each cohort, data of mainly two types are collected. School administrative data is collected annually by Statistics Sweden during the time that pupils are in the general school system (primary and secondary school), for most cohorts starting in compulsory school year 3. This information is provided by the school offices and, among other things, includes characteristics of school, class, special support, study choices and grades. Information obtained has varied somewhat, e.g. due to changes in curricula. A more detailed description of this data collection can be found in reports published by Statistics Sweden and linked to datasets for each cohort.
Survey data from the pupils is collected for the first time in compulsory school year 6 (for most cohorts). The questionnaire in the year 6 survey includes questions related to self-perception and interest in learning, attitudes to school, hobbies, school motivation and future plans. For some cohorts, questionnaire data are also collected in year 3 and year 9 in compulsory school and in upper secondary school.
Furthermore, results from various intelligence tests and standardized knowledge tests are included in the year 6 data collection. The intelligence tests have been identical for all cohorts (except the cohort born in 1987, from which questionnaire data were first collected in year 9). The intelligence test consists of a verbal, a spatial and an inductive test, each containing 40 tasks and specially designed for the UGU project. The verbal test is a vocabulary test of the opposite type. The spatial test is a so-called ‘sheet metal folding test’ and the inductive test is made up of series of numbers. The reliability of the test, intercorrelations and connection with school grades are reported by Svensson (1971).
For the first three cohorts (1948, 1953 and 1967), the standardized knowledge tests in year 6 consist of the standard tests in Swedish, mathematics and English that up to and including the beginning of the 1980s were offered to all pupils in compulsory school year 6. For the 1972 cohort, specially prepared tests in reading and mathematics were used. The reading test consists of 27 tasks and aimed to identify students with reading difficulties. The mathematics test, which was also offered to the fifth cohort (1977), includes 19 assignments. A revised version of the test was used for the cohort born in 1982, because the previously used test was judged to be somewhat too simple. Results on the mathematics test are not available for the 1987 cohort. The mathematics test was not offered to the students in the 1992 cohort, as the test did not seem to fully correspond with current curriculum intentions in mathematics. For further information, see the description of the dataset for each cohort.
For several of the samples, questionnaires were also collected from the students' parents and teachers in year 6. The teacher questionnaire contains questions about the teacher, class size and composition, the teacher's assessments of the class' knowledge level, etc., school resources, working methods and parental involvement and questions about the existence of evaluations. The questionnaire for the guardians includes questions about the child's upbringing conditions, ambitions and wishes regarding the child's education, views on the school's objectives and the parents' own educational and professional situation.
The students are followed up even after they have left primary school. Among other things, data collection is done during the time they are in high school. School administrative data, such as choice of upper secondary school line/program and grades after completing studies, are then collected. For some of the cohorts, in addition to school administrative data, questionnaire data were also collected from the students.
The sample consisted of students born on the 5th, 15th and 25th of any month in 1953, a total of 10,723 students.
The data obtained in 1966 were: 1. School administrative data (school form, class type, year and grades). 2. Information about the parents' profession and education, number of siblings, the distance between home and school, etc.
This information was collected for 93% of all those born on the days in question. The reason for this is reduced resources for Statistics Sweden for follow-up work (reminders etc.). Annual data for the 1953 cohort were collected by Statistics Sweden up to and including academic year 1972/73.
Response rate for test and questionnaire data is 88%. Standard test results were received for just over 85% of those who took the tests.
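The birthday-based sampling rule described above for the 1953 cohort can be sketched as a simple predicate. The function and variable names are hypothetical, and the calendar-day fraction printed at the end is only a rough guide to the sampling fraction, since it assumes births are spread evenly over the year.

```python
from datetime import date, timedelta

# Days of the month used for sampling, as stated in the cohort description
SAMPLE_DAYS = {5, 15, 25}


def in_sample(birth_date, cohort_year=1953):
    """Return True if a birth date falls on a sampled day in the cohort year."""
    return birth_date.year == cohort_year and birth_date.day in SAMPLE_DAYS


# Enumerate every calendar day of 1953 (not a leap year, so 365 days)
days_1953 = [date(1953, 1, 1) + timedelta(days=i) for i in range(365)]
selected_days = [d for d in days_1953 if in_sample(d)]

print(len(selected_days), "sampled days out of", len(days_1953))
print(f"share of calendar days: {len(selected_days) / len(days_1953):.3f}")
```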
The sample included a total of 9955 students, for whom some form of information was obtained.
Part of the "Individual Statistics Project" together with cohort 1953.