This data set contains example data for exploring the theory of regression-based regionalization. The 90th percentile of annual maximum streamflow is provided as an example response variable for 293 streamgages in the conterminous United States. Several explanatory variables are drawn from the GAGES-II database to demonstrate how multiple linear regression is applied. Example scripts demonstrate how to collect the original streamflow data provided and how to recreate the figures from the associated Techniques and Methods chapter.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is tailored for learning and practicing Multiple Linear Regression, a core concept in machine learning and data science. It includes features that influence a student's performance, making it an excellent resource for beginners and practitioners.
Features:
StudentHours: Hours spent studying per week (int).
ExtraParticipation: Participation in extracurricular activities (Yes/No).
PapersPracticed: Number of practice papers attempted (int).
PreviousMarks: Academic performance in previous exams (int).
SleepingHours: Average sleep hours per day (int).
Target Variable:
PerformanceIndex: A numeric measure representing the student's overall academic performance (int).
Key Highlights:
📊 Explore the relationships between study habits, extracurricular involvement, and sleep patterns.
🧮 Apply Multiple Linear Regression models to predict the PerformanceIndex.
⚙️ Evaluate model performance with metrics like Mean Squared Error (MSE) and Adjusted R-squared (adjusted R²).
The dataset contains approximately 1000 rows, offering ample data points to build, train, and test regression models effectively. It's ideal for learning how multiple factors collectively influence outcomes in a practical scenario.
Feel free to use this dataset for practice, tutorials, or projects. Let it guide your journey in mastering regression techniques. Happy coding and analyzing!
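As a minimal sketch of the workflow described above, the following fits a multiple linear regression by ordinary least squares and reports MSE and adjusted R² on a held-out split. The data here are synthetic stand-ins generated in the script, and the coefficients, variable names, and 80/20 split are assumptions made for illustration; none of the numbers come from the actual dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # roughly the row count the description mentions

# Synthetic stand-ins for the five features (NOT the real data).
study_hours = rng.integers(1, 10, n)
extra = rng.integers(0, 2, n)          # Yes/No encoded as 1/0
papers = rng.integers(0, 10, n)
prev_marks = rng.integers(40, 100, n)
sleep_hours = rng.integers(4, 10, n)

X = np.column_stack(
    [study_hours, extra, papers, prev_marks, sleep_hours]
).astype(float)

# Hypothetical generating coefficients, chosen only for this demo.
true_coef = np.array([2.8, 0.6, 0.2, 1.0, 0.5])
y = -30.0 + X @ true_coef + rng.normal(0.0, 2.0, n)


def fit_ols(X, y):
    """Return [intercept, coef_1, ..., coef_p] from a least-squares fit."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(beta, X):
    return beta[0] + X @ beta[1:]


def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))


def adjusted_r2(y, y_hat, n_features):
    """R² penalized for the number of explanatory variables."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    m = len(y)
    return float(1.0 - (1.0 - r2) * (m - 1) / (m - n_features - 1))


# Simple 80/20 train/test split.
idx = rng.permutation(n)
train, test = idx[:800], idx[800:]

beta = fit_ols(X[train], y[train])
y_hat = predict(beta, X[test])
test_mse = mse(y[test], y_hat)
test_adj_r2 = adjusted_r2(y[test], y_hat, X.shape[1])

print("coefficients:", np.round(beta, 3))
print("test MSE:", round(test_mse, 3))
print("test adjusted R²:", round(test_adj_r2, 4))
```

With the real file, the Yes/No column would need the same 1/0 encoding before fitting.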
This dataset was created by #Feba2005
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Sandy ocean beaches are a popular recreational destination, often surrounded by communities containing valuable real estate. Development is on the rise despite the fact that coastal infrastructure is subjected to flooding and erosion. As a result, there is an increased demand for accurate information regarding past and present shoreline changes. To meet these national needs, the Coastal and Marine Geology Program of the U.S. Geological Survey (USGS) is compiling existing reliable historical shoreline data along open-ocean sandy shores of the conterminous United States and parts of Alaska and Hawaii under the National Assessment of Shoreline Change project. There is no widely accepted standard for analyzing shoreline change. Existing shoreline data measurements and rate calculation methods vary from study to study and prevent combining results into state-wide or regional assessments. The impetus behind the National Assessment project was to develop a standardized method of measuring changes in shoreline position that is consistent from coast to coast. The goal was to facilitate the process of periodically and systematically updating the results in an internally consistent manner.
This dataset was created by Parisan Ahmadi
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
The partner company’s historical data could be used to develop a data-driven prediction model with project division details as its inputs and project division labor-hours as the desired output. The BIM models contain 42 design features and 1559 records, each record denoting a division of fabrication. The BIM design features are listed in Table 1. Labor-hours spent on each division were extracted from job costing databases and serve as the output parameter in the regression model. Although the variables in Table 1 are all considered related, some of them are inter-correlated and can be explained by others. For instance, material length and weight are highly correlated; knowing one, the other can be deduced. A variable selection technique is therefore instrumental in removing these inter-correlations analytically. Note that the dataset was linearly scaled before analysis so as not to reveal sensitive information about the partner company, while preserving the patterns and relationships inherent in the data.
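One simple way to illustrate removing inter-correlated variables is a pairwise correlation filter, sketched below. This is not necessarily the variable selection technique used with this dataset; the function name `drop_correlated`, the 0.9 threshold, and the synthetic `length`, `weight`, and `num_welds` columns are all assumptions made for the example.

```python
import numpy as np


def drop_correlated(X, names, threshold=0.9):
    """Greedy filter: keep a column only if its absolute Pearson
    correlation with every already-kept column is below `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]


rng = np.random.default_rng(1)

# Hypothetical features: weight is almost fully determined by length,
# mirroring the length/weight example in the description.
length = rng.uniform(1.0, 10.0, 500)
weight = 7.85 * length + rng.normal(0.0, 0.1, 500)
num_welds = rng.integers(0, 20, 500).astype(float)

X = np.column_stack([length, weight, num_welds])
selected = drop_correlated(X, ["length", "weight", "num_welds"])
print(selected)
```

Because the filter is greedy, the column order decides which member of a correlated pair survives. Methods such as stepwise selection or variance inflation factors handle correlations involving more than two variables, which a pairwise filter cannot detect.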
This dataset consists of short-term (1970-2009) linear regression shoreline change rates for the Boston region of Massachusetts. Rates of short-term shoreline change were computed within a GIS using the Digital Shoreline Analysis System (DSAS) version 4.3, an ArcGIS extension developed by the U.S. Geological Survey. The baseline is used as a reference line for the transects cast by the DSAS software. The transects intersect each shoreline at the measurement points, which are then used to calculate the short-term rates. Due to continued coastal population growth and increased threats of erosion, current data on trends and rates of shoreline movement are required to inform shoreline and floodplain management. The Massachusetts Office of Coastal Zone Management launched the Shoreline Change Project in 1989 to identify erosion-prone areas of the coast. In 2001, a 1994 shoreline was added to calculate both long- and short-term shoreline change rates at 40-meter intervals along ocean-facing sections of the Massachusetts coast. The Coastal and Marine Geology Program of the U.S. Geological Survey (USGS) in cooperation with the Massachusetts Office of Coastal Zone Management, has compiled reliable historical shoreline data along open-facing sections of the Massachusetts coast under the Massachusetts Shoreline Change Mapping and Analysis Project 2013 Update. Two oceanfront shorelines for Massachusetts (approximately 1,800 km) were (1) delineated using 2008/09 color aerial orthoimagery, and (2) extracted from topographic LIDAR datasets (2007) obtained from NOAA's Ocean Service, Coastal Services Center. The new shorelines were integrated with existing Massachusetts Office of Coastal Zone Management and USGS historical shoreline data in order to compute long- and short-term rates using the latest version of the Digital Shoreline Analysis System (DSAS).
This dataset consists of short-term (~30-year) shoreline change rates for the Florida west coastal region from Anclote Key to Estero Island. Rate calculations were computed within a GIS using the Digital Shoreline Analysis System (DSAS) version 4.3, an ArcGIS extension developed by the U.S. Geological Survey. Where three or more shorelines exist in the dataset for the short-term time period (1960 and later), rates of shoreline change were calculated using a linear regression rate method. A reference baseline was used as the originating point for the orthogonal transects cast by the DSAS software. The transects intersect each shoreline establishing measurement points, which are then used to calculate short-term rates. Sandy ocean beaches are a popular recreational destination, often surrounded by communities containing valuable real estate. Development is on the rise despite the fact that coastal infrastructure is subjected to flooding and erosion. As a result, there is an increased demand for accurate information regarding past and present shoreline changes. To meet these national needs, the Coastal and Marine Geology Program of the U.S. Geological Survey (USGS) is compiling existing reliable historical shoreline data along open-ocean sandy shores of the conterminous United States and parts of Alaska and Hawaii under the National Assessment of Shoreline Change project. There is no widely accepted standard for analyzing shoreline change. Existing shoreline data measurements and rate calculation methods vary from study to study and prevent combining results into state-wide or regional assessments. The impetus behind the National Assessment project was to develop a standardized method of measuring changes in shoreline position that is consistent from coast to coast. The goal was to facilitate the process of periodically and systematically updating the results in an internally consistent manner.
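The linear regression rate described above can be sketched for a single transect as the slope of a least-squares line through shoreline position versus survey date. The snippet below is an illustrative stand-in, not DSAS code: the function name, the survey years, the distances, and the sign convention (negative meaning the shoreline moved toward the baseline) are all assumptions.

```python
import numpy as np


def linear_regression_rate(years, distances):
    """Slope (distance units per year) of the least-squares line through
    shoreline distance from the baseline versus survey year."""
    years = np.asarray(years, dtype=float)
    distances = np.asarray(distances, dtype=float)
    if years.size < 3:
        # The description applies the method only with three or more shorelines.
        raise ValueError("need three or more shorelines")
    slope, _intercept = np.polyfit(years, distances, 1)
    return float(slope)


# Hypothetical transect: distance (m) from the baseline to each shoreline.
years = [1970.0, 1978.0, 1994.0, 2001.0, 2009.0]
distances = [50.0, 46.0, 38.0, 34.5, 30.5]

rate = linear_regression_rate(years, distances)
print(f"rate: {rate:.2f} m/yr")
```

In practice the same fit would be repeated for every transect, giving one rate per measurement location along the coast.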
This dataset consists of short-term (1970-2009) linear regression shoreline change rates for the Buzzards Bay region of Massachusetts. Rates of short-term shoreline change were computed within a GIS using the Digital Shoreline Analysis System (DSAS) version 4.3, an ArcGIS extension developed by the U.S. Geological Survey. The baseline is used as a reference line for the transects cast by the DSAS software. The transects intersect each shoreline at the measurement points, which are then used to calculate the short-term rates. Due to continued coastal population growth and increased threats of erosion, current data on trends and rates of shoreline movement are required to inform shoreline and floodplain management. The Massachusetts Office of Coastal Zone Management launched the Shoreline Change Project in 1989 to identify erosion-prone areas of the coast. In 2001, a 1994 shoreline was added to calculate both long- and short-term shoreline change rates at 40-meter intervals along ocean-facing sections of the Massachusetts coast. The Coastal and Marine Geology Program of the U.S. Geological Survey (USGS) in cooperation with the Massachusetts Office of Coastal Zone Management, has compiled reliable historical shoreline data along open-facing sections of the Massachusetts coast under the Massachusetts Shoreline Change Mapping and Analysis Project 2013 Update. Two oceanfront shorelines for Massachusetts (approximately 1,800 km) were (1) delineated using 2008/09 color aerial orthoimagery, and (2) extracted from topographic LIDAR datasets (2007) obtained from NOAA's Ocean Service, Coastal Services Center. The new shorelines were integrated with existing Massachusetts Office of Coastal Zone Management and USGS historical shoreline data in order to compute long- and short-term rates using the latest version of the Digital Shoreline Analysis System (DSAS).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
It includes five different datasets. The first four datasets contain student projects collected from different offerings of two undergraduate-level courses – Object-Oriented Analysis and Design (OOAD) and Software Engineering (SE) – taught in a renowned private university in Lahore over a period of six years. The fifth dataset contains real-life industry projects collected from a renowned software house (i.e. member of Pakistan Software Houses Association for IT and ITeS (P@SHA)) in Lahore.
Dataset #1 consists of 31 C++ GUI-based desktop applications. Dataset #2 consists of 19 Java GUI-based desktop applications. Dataset #3 consists of 12 Java web applications. Dataset #4 consists of 31 Java applications from both categories (desktop and web). Dataset #5 consists of 11 VB.NET GUI-based desktop applications.
Attributes are used as follows:
Project Code – Project ID for identification purposes
NOC – The total number of classes in a class diagram
NOA – The total number of attributes in a class diagram
NOM – The total number of methods/operations in a class diagram
NODep – The total number of dependency relationships in a class diagram
NOAss – The total number of association relationships in a class diagram
NOComp – The total number of composition relationships in a class diagram
NOAgg – The total number of aggregation relationships in a class diagram
NOGen – The total number of generalization relationships in a class diagram
NORR – The total number of realization relationships in a class diagram
NOOM – The total number of one-to-one multiplicity relationships in a class diagram
NOMM – The total number of one-to-many multiplicity relationships in a class diagram
NMMM – The total number of many-to-many multiplicity relationships in a class diagram
OCP – objective class points
EOCP – enhanced objective class points
WEOCP – weighted enhanced objective class points
SLOC – software size measured in source lines of code
Hi, everyone. This dataset contains two files, train_1 and test_1. About the 'train_1' dataset: it relates to the banking sector and has 18 columns and more than 30k rows. Its columns are ID, AGE, JOB, MARITAL, EDUCATION, DEFAULT, BALANCE, HOUSING, LOAN, CONTACT, DAY, MONTH, DURATION, CAMPAIGN, PDAYS, PREVIOUS, POUTCOME, and SUBSCRIBED. About the 'test_1' dataset: it has 17 columns and a little over 13k rows (approximately). Its columns are the same as in 'train_1', except that it has no SUBSCRIBED column, because that is the column to be predicted: a model is trained on 'train_1', which includes SUBSCRIBED, and is then used to predict SUBSCRIBED for 'test_1'.
The details of both datasets are given above. Look through both files to get a feel for the data types.
This is my first project on Kaggle. The dataset is suitable for applying logistic regression and linear regression. I used these two datasets for my logistic regression project. The code for that project is also available; please take a look and share any suggestions that would help me improve my data science skills.
Note: I obtained both the 'train_1' and 'test_1' datasets from an internshala.com training.
Automated monitoring of turbidity using in-situ water quality sondes can be used to infer sediment loading and offers several advantages over manual, event-based sampling. However, because turbidity is an optical property and not a true measurement of gravimetric TSS concentration, a regression model between turbidity and TSS must be used prior to any loading estimations. The relationship between turbidity and TSS is dependent on a number of site-specific factors including dissolved organic material, watershed mineralogy and sedimentology, particle density, etc. Therefore, an important step for using turbidity data to estimate sediment loading is the development of a site-specific regression model between turbidity and TSS. Regression model development is conducted by analyzing several samples (n > 100) concurrently for turbidity and TSS across the expected range of turbidity levels. This dataset was generated from a pre-restoration monitoring project funded through the National Fish and Wildlife Foundation - Gulf Environmental Benefit Fund (Project #67265). The dataset includes information on turbidity readings and TSS measurements for water samples collected at a site upstream of the confluence of Schoolhouse Branch and Magnolia River. Additionally, the R script for the data regressions is provided.
Purpose
The purpose of this project was to develop a site-specific TSS-turbidity regression model that can be used to infer TSS from turbidity data for the Week’s Bay watershed. Samples collected at the Schoolhouse Branch restoration site over the course of early- to mid-2024 by an automated sampler were analyzed to develop the regression model. Completion of this work allows for more robust pre- and post-restoration sediment monitoring and will help inform stream restoration and watershed management planning activities.
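To make the idea of a site-specific TSS-turbidity regression concrete, here is a minimal Python sketch. It is not the R script shipped with the dataset. The paired samples are synthetic, the straight-line form is an assumption (a log-log fit is a common alternative), and the units, variable names, and helper `tss_from_turbidity` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic paired samples (NOT the project's measurements): n > 100,
# spread across an assumed turbidity range.
turbidity = rng.uniform(5.0, 400.0, 120)                          # e.g. NTU
tss = 1.5 * turbidity + 4.0 + rng.normal(0.0, 10.0, 120)          # e.g. mg/L

# Ordinary least-squares line: TSS = slope * turbidity + intercept.
slope, intercept = np.polyfit(turbidity, tss, 1)

# Coefficient of determination of the fitted line.
fitted = slope * turbidity + intercept
r2 = 1.0 - np.sum((tss - fitted) ** 2) / np.sum((tss - tss.mean()) ** 2)


def tss_from_turbidity(t):
    """Infer TSS from sonde turbidity readings using the fitted line."""
    return slope * np.asarray(t, dtype=float) + intercept


print(f"TSS = {slope:.3f} * turbidity + {intercept:.3f}  (R² = {r2:.3f})")
print(tss_from_turbidity([20.0, 150.0]))
```

A fitted relationship like this should only be applied within the turbidity range covered by the calibration samples, which is why the samples are collected across the expected range.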
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Model transformation languages are special-purpose languages, which are designed to define transformations as comfortably as possible, i.e., often in a declarative way. With the increasing use of transformations in various domains, the complexity and size of input models are also increasing. However, developers often lack suitable models for performance testing. We have therefore conducted experiments in which we predict the performance of model transformations based on characteristics of input models using machine learning approaches. This dataset contains our raw and processed input data, the scripts necessary to repeat our experiments, and the results we obtained.
Our input data consists of the time measurements for six different transformations defined in the Atlas Transformation Language (ATL), as well as the collected characteristics of the real-world input models that were transformed. We provide the script that implements our experiments. We predict the execution time of ATL transformations using the machine learning approaches linear regression, random forests and support vector regression using a radial basis function kernel. We also investigate different sets of characteristics of input models as input for the machine learning approaches. These are described in detail in the provided documentation.pdf. The results of the experiments are provided as raw data in individual csv files. Additionally, we calculated the mean absolute percentage error in % and the 95th percentile of the absolute percentage error in % for each experiment and provide these results. Furthermore, we provide our Eclipse plugin, which collects the characteristics for a set of given models, the Java projects used to measure the execution time of the transformations, and other supporting scripts, e.g. for the analysis of the results.
A short introduction with a quick start guide can be found in README.md, and detailed documentation in documentation.pdf.
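The two error metrics mentioned above, the mean absolute percentage error and the 95th percentile of the absolute percentage error, can be sketched as follows. This is an illustration rather than the authors' analysis script; the function name and the measured and predicted execution times are made up for the example, and the percentile uses NumPy's default linear interpolation.

```python
import numpy as np


def absolute_percentage_errors(measured, predicted):
    """Per-sample absolute percentage error, in %."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.abs(predicted - measured) / np.abs(measured)


# Hypothetical execution times in seconds (not taken from the dataset).
measured = [2.0, 4.0, 5.0, 10.0]
predicted = [2.2, 3.6, 5.0, 12.0]

ape = absolute_percentage_errors(measured, predicted)
mape = float(np.mean(ape))           # mean absolute percentage error
p95 = float(np.percentile(ape, 95))  # 95th percentile of the APE

print(f"MAPE: {mape:.2f} %")
print(f"95th percentile APE: {p95:.2f} %")
```

Reporting the 95th percentile alongside the mean shows how large the error gets in the worst cases, which the mean alone can hide.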
This dataset consists of short-term (1970-2009) linear regression shoreline change rates for the South Shore region of Massachusetts. Rates of short-term shoreline change were computed within a GIS using the Digital Shoreline Analysis System (DSAS) version 4.3, an ArcGIS extension developed by the U.S. Geological Survey. The baseline is used as a reference line for the transects cast by the DSAS software. The transects intersect each shoreline at the measurement points, which are then used to calculate the short-term rates. Due to continued coastal population growth and increased threats of erosion, current data on trends and rates of shoreline movement are required to inform shoreline and floodplain management. The Massachusetts Office of Coastal Zone Management launched the Shoreline Change Project in 1989 to identify erosion-prone areas of the coast. In 2001, a 1994 shoreline was added to calculate both long- and short-term shoreline change rates at 40-meter intervals along ocean-facing sections of the Massachusetts coast. The Coastal and Marine Geology Program of the U.S. Geological Survey (USGS) in cooperation with the Massachusetts Office of Coastal Zone Management, has compiled reliable historical shoreline data along open-facing sections of the Massachusetts coast under the Massachusetts Shoreline Change Mapping and Analysis Project 2013 Update. Two oceanfront shorelines for Massachusetts (approximately 1,800 km) were (1) delineated using 2008/09 color aerial orthoimagery, and (2) extracted from topographic LIDAR datasets (2007) obtained from NOAA's Ocean Service, Coastal Services Center. The new shorelines were integrated with existing Massachusetts Office of Coastal Zone Management and USGS historical shoreline data in order to compute long- and short-term rates using the latest version of the Digital Shoreline Analysis System (DSAS).