ssurgoOnDemand

The purpose of these tools is to give users the ability to get Soil Survey Geographic Database (SSURGO) properties and interpretations in an efficient manner. They are very similar to the United States Department of Agriculture - Natural Resources Conservation Service's distributed Soil Data Viewer (SDV), although there are distinct differences. The most important difference is that the SSURGO On-Demand (SOD) tools collect data in real time via web requests to Soil Data Access (https://sdmdataaccess.nrcs.usda.gov/). SOD tools do not require users to have the data found in a traditional SSURGO download from the NRCS's official repository, Web Soil Survey (https://websoilsurvey.sc.egov.usda.gov/App/HomePage.htm). The main intent of both SOD and SDV is to hide the complex relationships of the SSURGO tables and allow users to focus on asking the question they need to get the information they want. This is accomplished in the user interface of the tools, and the subsequent SQL is built and executed for the user. Currently, the tools packaged here are designed to run within the ESRI ArcGIS Desktop application - ArcMap, version 10.1 or greater. However, much of the Python code is reusable and could run within a Python interpreter or other GIS applications such as Quantum GIS with some modification.

NOTE: The queries in these tools only consider the major components of soil map units.

Within the SOD tools are 2 primary toolsets, described as follows:

1. Areasymbol

The Areasymbol tools collect SSURGO properties and interpretations based on a user-supplied list of soil survey areasymbols (e.g. NC123). After the areasymbols have been collected, an aggregation method (see below) is selected. The aggregation method has no effect on interpretations other than how the SSURGO data are aggregated. For soil properties, the aggregation method drives what properties can be run. For example, you can't run the weighted average aggregation method on Taxonomic Order. Similarly, for that same soil property, you wouldn't specify a depth range. The point here is that the aggregation method affects what parameters need to be supplied for the SQL generation. It is important to note the user can specify any number of areasymbols and any number of interpretations. This is another distinct advantage of these tools. You could collect all of the SSURGO interpretations for every soil survey area (areasymbol) by executing the tool 1 time. This also demonstrates the flexibility SOD has in defining the geographic extent over which information is collected. The only constraint is the extent of the soil survey areas selected to run (and these can be discontinuous).

As the SOD Areasymbol tools execute, 2 lists are collected from the tool dialog: a list of interpretations/properties and a list of areasymbols. As each interpretation/property is run, every areasymbol is run against the interpretation/property requested. For instance, suppose you wanted to collect the weighted average of sand, silt, and clay for 5 soil survey areas. The sand property would run for all 5 soil survey areas and be built into a table. Next, the silt would run for all 5 soil survey areas and be built into a table, and so on. In this example a total of 15 web requests would have been sent and 3 tables built. Two VERY IMPORTANT things here:
A. All the Areasymbol tools do is generate tables. They are not collecting spatial data.
B. They are collecting stored information. They are not making calculations (with the exception of the weighted average aggregation method).

2. Express

The Express toolset is nearly identical to the Areasymbol toolset, with 2 exceptions.
A. The area over which to collect SSURGO information is defined by the user. The user digitizes coordinates into a 'feature set' after the tool is opened. The points in the feature set are closed (first point is also the last) into a polygon. The polygon is sent to Soil Data Access and the feature set points (polygon) are used to clip SSURGO spatial data. The geometries of the clip operation are returned, along with the mapunit keys (unique identifiers). It is best to keep the points in the feature set simple and beware of self-intersections, as they are fatal.
B. Instead of running on a list of areasymbols, the SQL queries on a list of mapunit keys.

The properties and interpretations options are identical to what was discussed for the Areasymbol toolset. The Express tools present the user the option of creating layer files (.lyr) where the resultant interpretation/property is joined to the geometry and saved to disk as a virtual join. Additionally, for soil properties, an option exists to append all of the selected soil properties to a single table. In this case, if the user ran the sand, silt, and clay properties, instead of 3 output tables there is only 1 table with a sand column, a silt column, and a clay column.

Supplemental Information: Aggregation Method

Aggregation is the process by which a set of component attribute values is reduced to a single value to represent the map unit as a whole.

A map unit is typically composed of one or more "components". A component is either some type of soil or some nonsoil entity, e.g., rock outcrop. The components in the map unit name represent the major soils within a map unit delineation. Minor components make up the balance of the map unit. Great differences in soil properties can occur between map unit components and within short distances. Minor components may be very different from the major components. Such differences could significantly affect use and management of the map unit. Minor components may or may not be documented in the database. The results of aggregation do not reflect the presence or absence of limitations of the components which are not listed in the database. An on-site investigation is required to identify the location of individual map unit components. For queries of soil properties, only major components are considered for the Dominant Component (numeric) and Weighted Average aggregation methods (see below). Additionally, the aggregation method selected drives the available properties to be queried. For queries of soil interpretations, all components are considered.

For each of a map unit's components, a corresponding percent composition is recorded. A percent composition of 60 indicates that the corresponding component typically makes up approximately 60% of the map unit. Percent composition is a critical factor in some, but not all, aggregation methods.

For the attribute being aggregated, the first step of the aggregation process is to derive one attribute value for each of a map unit's components. From this set of component attributes, the next step of the aggregation process derives a single value that represents the map unit as a whole. Once a single value for each map unit is derived, a thematic map for soil map units can be generated. Aggregation must be done because, on any soil map, map units are delineated but components are not.

The aggregation method "Dominant Component" returns the attribute value associated with the component with the highest percent composition in the map unit. If more than one component shares the highest percent composition, the value of the first named component is returned.

The aggregation method "Dominant Condition" first groups like attribute values for the components in a map unit. For each group, percent composition is set to the sum of the percent composition of all components participating in that group. These groups now represent "conditions" rather than components. The attribute value associated with the group with the highest cumulative percent composition is returned. If more than one group shares the highest cumulative percent composition, the value of the group having the first named component of the map unit is returned.

The aggregation method "Weighted Average" computes a weighted average value for all components in the map unit. Percent composition is the weighting factor. The result returned by this aggregation method represents a weighted average value of the corresponding attribute throughout the map unit.

The aggregation method "Minimum or Maximum" returns either the lowest or highest attribute value among all components of the map unit, depending on the corresponding "tie-break" rule. In this case, the "tie-break" rule indicates whether the lowest or highest value among all components should be returned. For this aggregation method, percent composition ties cannot occur. The result may correspond to a map unit component of very minor extent. This aggregation method is appropriate for either numeric attributes or attributes with a ranked or logically ordered domain.
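To make the kind of SQL that SOD builds and submits to Soil Data Access more concrete, below is a minimal, hand-written sketch of a weighted average query for a single property and a single areasymbol. It is illustrative only: the table and column names (legend, mapunit, component, chorizon, comppct_r, claytotal_r) are standard SSURGO fields, but the surface-horizon filter and the overall structure are assumptions and will differ from the SQL the tools actually generate.

```sql
-- Sketch of a weighted average query against Soil Data Access.
-- Assumes standard SSURGO tables/columns; not the exact SQL built by SOD.
SELECT m.mukey,
       m.muname,
       SUM(c.comppct_r * h.claytotal_r) / SUM(c.comppct_r) AS wtd_avg_clay
FROM legend AS l
JOIN mapunit   AS m ON m.lkey  = l.lkey
JOIN component AS c ON c.mukey = m.mukey AND c.majcompflag = 'Yes'
JOIN chorizon  AS h ON h.cokey = c.cokey
WHERE l.areasymbol = 'NC123'       -- one of the user-supplied areasymbols
  AND h.hzdept_r = 0               -- surface horizon only (assumed filter)
  AND h.claytotal_r IS NOT NULL
GROUP BY m.mukey, m.muname;
```

A statement along these lines would be submitted once per areasymbol and per property, which is how the 15 web requests in the sand/silt/clay example above arise.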
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comes as an SQL-importable file and is compatible with the widely available MariaDB and MySQL databases.
It is based on (and incorporates/extends) the dataset "1151 commits with software maintenance activity labels (corrective,perfective,adaptive)" by Levin and Yehudai (https://doi.org/10.5281/zenodo.835534).
The extensions to this dataset were obtained using Git-Tools, a tool that is included in the Git-Density (https://doi.org/10.5281/zenodo.2565238) suite. For each of the projects in the original dataset, Git-Tools was run in extended mode.
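As a minimal sketch of how such a dump might be loaded with the MySQL/MariaDB command-line client (the file name and database name below are assumptions, not part of the dataset):

```sql
-- Assumed file and database names; adjust to the actual download.
CREATE DATABASE IF NOT EXISTS commits_density;
USE commits_density;
-- SOURCE is a mysql/mariadb client command that replays the dump file.
SOURCE commits_dataset.sql;
```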
The dataset contains these tables:
x1151: The original dataset from Levin and Yehudai.
Despite its name, this dataset has only 1,149 commits, as two commits were duplicates in the original dataset.
This dataset spanned 11 projects, each of which had between 99 and 114 commits.
This dataset has 71 features and spans the projects RxJava, hbase, elasticsearch, intellij-community, hadoop, drools, Kotlin, restlet-framework-java, orientdb, camel and spring-framework.
gtools_ex (short for Git-Tools, extended)
Contains 359,569 commits, analyzed using Git-Tools in extended mode
It spans all commits and projects from the x1151 dataset as well.
All 11 projects were analyzed, from the initial commit until the end of January 2019. For the projects IntelliJ and Kotlin, only the first 35,000 and 30,000 commits, respectively, were analyzed.
This dataset introduces 35 new features (see list below), 22 of which are size- or density-related.
The dataset contains these views:
geX_L (short for Git-tools, extended, with labels)
Joins the commits' labels from x1151 with the extended attributes from gtools_ex, using the commits' hashes.
jeX_L (short for joined, extended, with labels)
Joins the datasets x1151 and gtools_ex entirely, based on the commits' hashes.
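For orientation, a query against the labeled view might look like the sketch below. The gtools_ex columns are taken from the feature list that follows; the name of the label column contributed by x1151 is an assumption and may differ in the actual dump.

```sql
-- Sketch: distribution of maintenance-activity labels and mean code density
-- per project. 'label' is an assumed column name coming from x1151.
SELECT RepoPathOrUrl,
       label,
       COUNT(*)     AS commits,
       AVG(Density) AS avg_density
FROM geX_L
GROUP BY RepoPathOrUrl, label
ORDER BY RepoPathOrUrl, label;
```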
Features of the gtools_ex dataset:
SHA1
RepoPathOrUrl
AuthorName
CommitterName
AuthorTime (UTC)
CommitterTime (UTC)
MinutesSincePreviousCommit: Double, describing the number of minutes that passed since the previous commit. "Previous" refers to the parent commit, not the commit that is previous in time.
Message: The commit's message/comment
AuthorEmail
CommitterEmail
AuthorNominalLabel: All authors of a repository are analyzed and merged by Git-Density using a heuristic, even if they do not always use the same email address or name. This label is a unique string that helps identify the same author across commits, even if the author did not always use the exact same identity.
CommitterNominalLabel: The same as AuthorNominalLabel, but for the committer this time.
IsInitialCommit: A boolean indicating whether a commit has no parent (i.e., whether it is an initial commit).
IsMergeCommit: A boolean indicating whether a commit has more than one parent.
NumberOfParentCommits
ParentCommitSHA1s: A comma-concatenated string of the parents' SHA1 IDs
NumberOfFilesAdded
NumberOfFilesAddedNet: Like the previous property, but if the net-size of all changes of an added file is zero (i.e. when adding a file that is empty/whitespace or does not contain code), then this property does not count the file.
NumberOfLinesAddedByAddedFiles
NumberOfLinesAddedByAddedFilesNet: Like the previous property, but counts the net-lines
NumberOfFilesDeleted
NumberOfFilesDeletedNet: Like the previous property, but considers only files that had net-changes
NumberOfLinesDeletedByDeletedFiles
NumberOfLinesDeletedByDeletedFilesNet: Like the previous property, but counts the net-lines
NumberOfFilesModified
NumberOfFilesModifiedNet: Like the previous property, but considers only files that had net-changes
NumberOfFilesRenamed
NumberOfFilesRenamedNet: Like the previous property, but considers only files that had net-changes
NumberOfLinesAddedByModifiedFiles
NumberOfLinesAddedByModifiedFilesNet: Like the previous property, but counts the net-lines
NumberOfLinesDeletedByModifiedFiles
NumberOfLinesDeletedByModifiedFilesNet: Like the previous property, but counts the net-lines
NumberOfLinesAddedByRenamedFiles
NumberOfLinesAddedByRenamedFilesNet: Like the previous property, but counts the net-lines
NumberOfLinesDeletedByRenamedFiles
NumberOfLinesDeletedByRenamedFilesNet: Like the previous property, but counts the net-lines
Density: The ratio between the sum of all net lines (added + deleted + modified + renamed) and the corresponding sum of gross lines. A density of zero means that the sum of net lines is zero (i.e. all line changes were just whitespace, comments etc.). A density of 1 means that all changed lines count towards the net size of the commit (i.e. there are no useless lines containing e.g. only comments or whitespace).
AffectedFilesRatioNet: The ratio between the sums of NumberOfFilesXXX and NumberOfFilesXXXNet
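As a rough illustration of how the size- and density-related features relate to each other, the sketch below recomputes a net/gross line ratio from the columns listed above. Whether this reproduces the stored Density value exactly is an assumption; the query is only meant to show the relationship between the Net columns and their gross counterparts.

```sql
-- Sketch: net/gross ratio of changed lines per commit.
-- NULLIF avoids division by zero for commits without line changes.
SELECT SHA1,
       Density,
       (NumberOfLinesAddedByAddedFilesNet
        + NumberOfLinesDeletedByDeletedFilesNet
        + NumberOfLinesAddedByModifiedFilesNet
        + NumberOfLinesDeletedByModifiedFilesNet
        + NumberOfLinesAddedByRenamedFilesNet
        + NumberOfLinesDeletedByRenamedFilesNet)
       / NULLIF(
           NumberOfLinesAddedByAddedFiles
           + NumberOfLinesDeletedByDeletedFiles
           + NumberOfLinesAddedByModifiedFiles
           + NumberOfLinesDeletedByModifiedFiles
           + NumberOfLinesAddedByRenamedFiles
           + NumberOfLinesDeletedByRenamedFiles, 0) AS recomputed_ratio
FROM gtools_ex;
```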
This dataset supports the paper "Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities", as submitted to the QRS2019 conference (The 19th IEEE International Conference on Software Quality, Reliability, and Security). Citation: Hönel, S., Ericsson, M., Löwe, W. and Wingkvist, A., 2019. Importance and Aptitude of Source code Density for Commit Classification into Maintenance Activities. In The 19th IEEE International Conference on Software Quality, Reliability, and Security.
You are an Analytics Engineer at an EdTech company focused on improving customer learning experiences. Your team relies on in-depth analysis of user data to enhance the learning journey and inform product feature updates.
The learning content follows the hierarchy Track → Course → Topic → Lesson. Each lesson can take various formats, such as videos, practice exercises, exams, etc. Any learning activity a user performs on a lesson is recorded in the user_lesson_progress_log table; a user can have multiple logs for a lesson in a day.
DB Diagram: https://dbdiagram.io/d/627100b17f945876b6a93e54 (use the 'Highlight' option to understand the relationships)
track_table: Contains all tracks
Column | Description | Schema |
---|---|---|
track_id | unique id for an individual track | string |
track_title | name of the track | string |
course_table: Contains all courses
Column | Description | Schema |
---|---|---|
course_id | unique id for an individual course | string |
track_id | track id to which this course belongs | string |
course_title | name of the course | string |
topic_table: Contains all topics
Column | Description | Schema |
---|---|---|
topic_id | unique id for an individual topic | string |
course_id | course id to which this topic belongs | string |
topic_title | name of the topic | string |
lesson_table: Contains all lessons
Column | Description | Schema |
---|---|---|
lesson_id | unique id for individual lesson | string |
topic_id | topic id to which this lesson belongs | string |
lesson_title | name of the lesson | string |
lesson_type | type of the lesson i.e., it may be practice, video, exam | string |
duration_in_sec | ideal duration (in seconds) within which a user can complete the lesson | float |
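To make the content hierarchy concrete, a join across the four content tables could look like the following sketch (table and column names as listed above):

```sql
-- Sketch: resolve every lesson to its topic, course, and track.
SELECT t.track_title,
       c.course_title,
       tp.topic_title,
       l.lesson_title,
       l.lesson_type
FROM lesson_table AS l
JOIN topic_table  AS tp ON tp.topic_id  = l.topic_id
JOIN course_table AS c  ON c.course_id  = tp.course_id
JOIN track_table  AS t  ON t.track_id   = c.track_id;
```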
user_registrations: Contains the registration information of the users. A user has only one entry.
Column | Description | Schema |
---|---|---|
user_id | unique id for an individual user | string |
registration_date | date at which a user registered | string |
user_info | contains information about the users. The field stores address, education_info, and profile in JSON format | string |
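Because user_info stores JSON as a string, individual fields can be pulled out with the JSON functions available in MySQL/MariaDB. The key names below follow the field description above, but the exact JSON structure is an assumption:

```sql
-- Sketch: extract fields from the JSON stored in user_info.
SELECT user_id,
       registration_date,
       JSON_EXTRACT(user_info, '$.address')        AS address,
       JSON_EXTRACT(user_info, '$.education_info') AS education_info
FROM user_registrations;
```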
user_lesson_progress_log: Any learning activity done by the user on a lesson is stored in logs. A user can have multiple logs for a lesson in a day. Every time a user's completion percentage of a lesson is updated, a log is recorded here.
Column | Description | Schema |
---|---|---|
id | unique id for each entry | string |
user_id | unique id for an individual user | string |
lesson_id | unique id for a particular lesson | string |
overall_completion_percentage | total completion percentage of the lesson at the time of log | float |
completion_percentage_difference | difference between the overall_completion_percentage of the lesson and the immediately preceding overall_completion_percentage | float |
activity_recorded_datetime_in_utc | datetime at which the user has done some activity on the lesson | datetime |
Example: If a user u1 started the lesson lesson1 and completed 10% of it by May 1st 2022 08:00:00 UTC, then completed a further 30% of the lesson by May 1st 2022 10:00:00 UTC and a further 20% by May 3rd 2022 10:00:00 UTC, the logs are recorded as follows:
id | user_id | lesson_id | overall_completion_percentage | completion_percentage_difference | activity_recorded_datetime_in_utc |
---|---|---|---|---|---|
id1 | u1 | lesson1 | 10 | 10 | 2022-05-01 08:00:00 |
id2 | u1 | lesson1 | 40 | 30 | 2022-05-01 10:00:00 |
id3 | u1 | lesson1 | 60 | 20 | 2022-05-03 10:00:00 |
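Given this log structure, a common step is to take the latest overall_completion_percentage per user and lesson. A sketch using only the columns above (for the example data it would return 60 for u1 / lesson1):

```sql
-- Sketch: latest known completion percentage per user and lesson.
SELECT p.user_id,
       p.lesson_id,
       p.overall_completion_percentage AS latest_completion
FROM user_lesson_progress_log AS p
JOIN (
    SELECT user_id,
           lesson_id,
           MAX(activity_recorded_datetime_in_utc) AS last_activity
    FROM user_lesson_progress_log
    GROUP BY user_id, lesson_id
) AS latest
  ON  latest.user_id = p.user_id
  AND latest.lesson_id = p.lesson_id
  AND latest.last_activity = p.activity_recorded_datetime_in_utc;
```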
user_feedback: Contains the feedback data given by the users. A user can give feedback on a lesson multiple times. Each feedback contains multiple questions. Each question and response is stored in a separate entry.
Column | Description | Schema |
---|---|---|
id | unique id for each entry | string |
feedback_id | unique id for each feedback | string |
creation_datetime | datetime at which user gave a feedback | string |
user_id | user id who gave the feedback | float |
lesson_id | ... |
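The column listing for user_feedback is truncated above, but using only the columns that are listed, a simple per-lesson feedback count could be sketched as follows:

```sql
-- Sketch: number of distinct feedback submissions per lesson.
SELECT lesson_id,
       COUNT(DISTINCT feedback_id) AS feedback_count
FROM user_feedback
GROUP BY lesson_id;
```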
According to our latest research, the SQL Server Performance Monitoring Tools and Software market size reached USD 1.87 billion in 2024, with a compound annual growth rate (CAGR) of 13.2% projected over the forecast period. By 2033, the market is anticipated to achieve a value of USD 5.73 billion. The primary growth factor driving this market is the increasing demand for real-time database performance optimization and the rapid digital transformation across industries, which is compelling organizations to ensure seamless, reliable, and high-performing SQL Server environments.
One of the most significant growth drivers for the SQL Server Performance Monitoring Tools and Software market is the exponential increase in data volumes and the complexity of enterprise IT infrastructures. As organizations migrate more workloads to SQL Server databases, the need to maintain optimal performance, uptime, and security becomes paramount. This scenario is further complicated by the proliferation of hybrid and multi-cloud environments, which require advanced monitoring solutions that can provide unified visibility across diverse deployments. Enterprises are investing in sophisticated monitoring tools to proactively identify bottlenecks, predict potential failures, and automate performance tuning, all of which contribute to higher operational efficiency and reduced downtime. The growing emphasis on digital transformation and data-driven decision-making ensures that robust performance monitoring remains a top priority for IT leaders globally.
Another key factor propelling the adoption of SQL Server Performance Monitoring Tools and Software is the rise in regulatory compliance and cybersecurity requirements across various industries. Sectors such as BFSI, healthcare, and government are subject to stringent data protection regulations, necessitating continuous monitoring of database activity and performance. Advanced monitoring tools now offer features such as anomaly detection, predictive analytics, and real-time alerting, which help organizations not only optimize performance but also maintain compliance with industry standards like GDPR, HIPAA, and PCI DSS. The integration of artificial intelligence and machine learning into these tools further enhances their capability to detect unusual patterns and mitigate risks proactively, thereby reinforcing the need for comprehensive performance monitoring solutions.
The surge in cloud adoption and the shift towards cloud-native architectures are also significantly impacting the SQL Server Performance Monitoring Tools and Software market. As businesses increasingly deploy SQL Server instances in public, private, or hybrid clouds, they require monitoring tools that are cloud-agnostic and scalable to dynamic workloads. Cloud-based monitoring solutions offer the flexibility, scalability, and cost-effectiveness that modern enterprises demand, enabling them to monitor performance metrics in real-time, regardless of deployment model. This trend is particularly pronounced among small and medium enterprises (SMEs), which benefit from the lower upfront costs and ease of management associated with cloud-based tools. As a result, vendors are intensifying their focus on delivering SaaS-based monitoring platforms with advanced analytics and intuitive dashboards, further accelerating market growth.
Regionally, the SQL Server Performance Monitoring Tools and Software market is witnessing robust growth in North America, driven by early technology adoption, a large base of SQL Server users, and the presence of leading market players. Europe follows closely, with strong demand from sectors such as BFSI, healthcare, and government, while Asia Pacific is emerging as a high-growth region due to rapid digitalization and increasing cloud adoption. Latin America and the Middle East & Africa are gradually catching up, supported by investments in IT infrastructure and the expansion of enterprise applications. As organizations worldwide seek to modernize their database environments and enhance operational resilience, the demand for advanced SQL Server performance monitoring solutions is expected to remain strong throughout the forecast period.