Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spatial association rule mining (SARM) is an important data mining task for understanding implicit and sophisticated interactions in spatial data. The usefulness of SARM results, represented as sets of rules, depends on their reliability: the abundance of rules, control over the risk of spurious rules, and accuracy of rule interestingness measure (RIM) values. This study presents crisp-fuzzy SARM, a novel SARM method that can enhance the reliability of resultant rules. The method first prunes dubious rules using statistically sound tests and crisp supports for the patterns involved, and then evaluates RIMs of accepted rules using fuzzy supports. For the RIM evaluation stage, the study also proposes a Gaussian-curve-based fuzzy data discretization model for SARM with improved design for spatial semantics. The proposed techniques were evaluated on both synthetic and real-world data. The synthetic data was generated with predesigned rules and RIM values, so the reliability of SARM results could be confidently and quantitatively evaluated. The proposed techniques showed high efficacy in enhancing the reliability of SARM results in all three aspects. The abundance of resultant rules was improved by 50% or more compared with using conventional fuzzy SARM. Minimal risk of spurious rules was guaranteed by statistically sound tests. The probability that the entire result contained any spurious rules was below 1%. The RIM values also avoided large positive errors committed by crisp SARM, which typically exceeded 50% for representative RIMs. The real-world case study on New York City points of interest reconfirms the improved reliability of crisp-fuzzy SARM results, and demonstrates that such improvement is critical for practical spatial data analytics and decision support.
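To make the distinction between crisp and fuzzy supports more concrete, here is a minimal Python sketch. It is not the paper's discretization model: the Gaussian membership centered at zero, the `sigma` value, the 100-metre crisp interval, the use of `min` to combine memberships, and the sample distances are all assumptions chosen for illustration.

```python
import math


def gaussian_membership(value, center, sigma):
    """Degree in (0, 1] to which `value` belongs to a fuzzy set centered at `center`."""
    return math.exp(-((value - center) ** 2) / (2 * sigma ** 2))


def crisp_membership(value, low, high):
    """Crisp discretization: 1.0 inside the interval [low, high], 0.0 outside."""
    return 1.0 if low <= value <= high else 0.0


def pattern_support(rows):
    """Support of a pattern: mean over records of the weakest item membership."""
    degrees = [min(row) for row in rows]
    return sum(degrees) / len(degrees)


# Hypothetical distances (in metres) from four locations to two feature types.
dist_a = [50, 120, 300, 80]
dist_b = [60, 400, 90, 100]

# Support of the pattern "near A and near B" under each discretization.
crisp_support = pattern_support(
    [(crisp_membership(a, 0, 100), crisp_membership(b, 0, 100))
     for a, b in zip(dist_a, dist_b)]
)
fuzzy_support = pattern_support(
    [(gaussian_membership(a, 0, 100), gaussian_membership(b, 0, 100))
     for a, b in zip(dist_a, dist_b)]
)
print(crisp_support, fuzzy_support)
```

With crisp intervals, a record just outside the boundary contributes nothing, whereas the Gaussian curve lets it contribute partially; this is the kind of difference that can change the support values entering an interestingness measure.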
This dataset is a collection of scraped public Twitter updates used in coordination with an academic project to study the geolocation data related to tweeting. We provide both the training set and the test set used in the paper "You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users" (CIKM 2010). The training set contains 115,886 Twitter users and 3,844,612 updates from those users. All user locations in the training set are self-labeled, within the United States, at city-level granularity. The test set contains 5,136 Twitter users and 5,156,047 tweets from those users. All user locations in the test set were uploaded from their smartphones in the form "UT: Latitude,Longitude".
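The "UT: Latitude,Longitude" form can be parsed with a short regular expression. The sketch below assumes decimal-degree values and tolerates optional whitespace; the exact formatting in the dataset files is not documented here, so the pattern and the function name `parse_ut_location` are illustrative assumptions.

```python
import re

# Assumed layout: "UT:" followed by two signed decimal numbers separated by a comma.
UT_PATTERN = re.compile(r"^UT:\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)$")


def parse_ut_location(text):
    """Return (latitude, longitude) as floats, or None if the text does not match."""
    match = UT_PATTERN.match(text.strip())
    if match is None:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    # Reject values outside the valid coordinate ranges.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon


print(parse_ut_location("UT: 40.7128,-74.0060"))
print(parse_ut_location("Austin, TX"))
```

Self-labeled city names such as "Austin, TX" fall through to `None`, which keeps the device-reported coordinates separate from free-text locations.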
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
High Utility Co-location Pattern Mining (HUCPM), as an important branch of spatial data mining, aims to extract patterns with utility values that meet or exceed a predefined threshold based on user-defined utility criteria (e.g., cost, profit). However, due to the non-uniformity of spatial distribution, the utility associations between spatial features exhibit significant differences across different regions. As data scale and complexity continue to increase, mining efficiency faces significant challenges. Although various pruning strategies have been proposed to enhance mining efficiency, they cannot adaptively adjust based on the characteristics of the data distribution, making them difficult to apply widely across different datasets. To address these issues, this paper introduces the AUW-CE Miner (Adaptive Utility-Weighted Cross-Entropy Miner), a heuristic algorithm built upon an enhanced cross-entropy framework. By integrating a heuristic search mechanism, the algorithm can quickly converge to potential high utility patterns and effectively reduce redundant computational processes. Moreover, in response to the limitations of conventional cross-entropy methods for HUCPM, four core optimization strategies are designed: optimization of the initial probability distribution to guide the search direction, enhancement of sample diversity to prevent local convergence, dynamic adjustment of sample size to reduce redundant calculations, and incorporation of utility weights to improve the accuracy of probability updates. Experimental results show that the AUW-CE Miner significantly outperforms other algorithms in terms of runtime efficiency, with an average efficiency improvement of up to 56.5%, demonstrating exceptional mining efficiency and stability.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Three semi-automated detection approaches using Sentinel-1 Synthetic Aperture Radar (SAR) were applied to identify artisanal and small-scale mining (ASM) riverine dredges on the Madeira River in Brazil. The methods are: i) Search for Unidentified Maritime Objects (SUMO), an established method for large ocean ship detection; and two techniques specifically developed for riverine environments: ii) a local detection method; and iii) a global threshold method. The results from each method are contained on this landing page along with the visual interpretation dataset of SAR data used as the validation dataset. The pre-processed SAR data used to produce these results are also found on this page.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a collection of point-of-interest (POI) data sets for the cities of Shenzhen, Guangzhou, Beijing, and Shanghai, China.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The increasing popularity of social networks has made them a viable data source for many data mining applications, and event detection is no exception. Researchers aim not only to find events that happen in networks but, more importantly, to identify and locate events occurring in the real world. In this paper, we propose an enhanced version of the quadtree - the convolutional quadtree (ConvTree) - and demonstrate its advantage compared to the standard quadtree. We also introduce an algorithm for searching for events of different scales using geospatial data obtained from social networks. The algorithm is based on statistical analysis of historical data, generation of ConvTrees representing the normal state of the city, and evaluation of anomalies for event detection. An experimental study conducted on a dataset of 60 million geotagged Instagram posts in the New York City area demonstrates that the proposed approach is able to find a wide range of events, from very local (indie band concert or wedding party) to city (baseball game or holiday march) and even country scale (political protest or Christmas) events. This opens up the prospect of building a simple, fast, yet powerful system for real-time multiscale event monitoring.
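For readers unfamiliar with the baseline structure, the following Python sketch shows a standard point quadtree, the structure that ConvTree is compared against. It does not implement ConvTree itself; the class name, the `capacity` and `max_depth` parameters, and the unit-square coordinates are assumptions made for illustration.

```python
class QuadTree:
    """Standard point quadtree: a cell splits into four quadrants when it is full."""

    def __init__(self, x_min, y_min, x_max, y_max, capacity=4, depth=0, max_depth=16):
        self.bounds = (x_min, y_min, x_max, y_max)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points = []
        self.children = None  # None while this cell is a leaf

    def contains(self, x, y):
        x_min, y_min, x_max, y_max = self.bounds
        return x_min <= x < x_max and y_min <= y < y_max

    def insert(self, x, y):
        """Insert a point; return False if it lies outside this cell."""
        if not self.contains(x, y):
            return False
        if self.children is None:
            # Store in the leaf if there is room, or if splitting is no longer allowed.
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y))
                return True
            self._subdivide()
        return any(child.insert(x, y) for child in self.children)

    def _subdivide(self):
        x_min, y_min, x_max, y_max = self.bounds
        x_mid, y_mid = (x_min + x_max) / 2, (y_min + y_max) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(x_min, y_min, x_mid, y_mid, *args),
            QuadTree(x_mid, y_min, x_max, y_mid, *args),
            QuadTree(x_min, y_mid, x_mid, y_max, *args),
            QuadTree(x_mid, y_mid, x_max, y_max, *args),
        ]
        # Move the points held by this cell down into the new quadrants.
        old_points, self.points = self.points, []
        for px, py in old_points:
            any(child.insert(px, py) for child in self.children)

    def count(self):
        """Total number of points stored in this cell and all of its descendants."""
        if self.children is None:
            return len(self.points)
        return sum(child.count() for child in self.children)


tree = QuadTree(0.0, 0.0, 1.0, 1.0, capacity=2)
for point in [(0.1, 0.1), (0.2, 0.2), (0.8, 0.8), (0.6, 0.3), (0.15, 0.12)]:
    tree.insert(*point)
print(tree.count())
```

Dense areas end up in small, deep cells while sparse areas stay coarse, which is what makes quadtrees a natural fit for describing activity at several spatial scales.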
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: The aim of this paper is the acquisition of geographic data from the Foursquare application, using data mining to perform exploratory and spatial analyses of the distribution of tourist attractions and their density in Rio de Janeiro city. In accordance with the Extraction, Transformation, and Load methodology, three research algorithms were developed using a hierarchical tree structure to collect information for the categories of Museums, Monuments and Landmarks, Historic Sites, Scenic Lookouts, and Trails in the Foursquare database. A quantitative analysis of check-ins per neighborhood of Rio de Janeiro city was performed, and kernel density (hot spot) maps were generated. The results presented in this paper show the need for the data filtering process - less than 50% of the mined data were used - and that a large part of the density of the Museums, Historic Sites, and Monuments and Landmarks categories is in the center of the city, while the Scenic Lookouts and Trails categories predominate in the south zone. This kind of analysis was shown to be a tool to support the city's tourist management in relation to the spatial localization of these categories, the tourists' evaluations of the places, and the frequency of the target public.
From the site: "Coverages containing industrial mineral mining data by quadrangle for the state of Pennsylvania. Digitized from the Harrisburg Bureau of Mining and Reclamation mylar map system each quadrangle contains multiple coverages identifying seams in that quad. Also includes coverages ('noncoal') indicating industrial minerals and coal mining refuse disposal sites, permitted sites, point coverages of deep mine entry and other surface features of deep mines and Small Operators Assistance Program (SOAP) areas."
A largely commercial website with some public information. It holds large amounts of data on global mining statistics, mine locations, and ownership; however, payment is required to view it. These pages link to a wide range of maps and spatial data, including maps of mines and geology for a range of geographic regions.
Website: http://www.infomine.com/maps/linkstree.aspx#cat1404
From the site: "An Industrial Mineral Mining Operation is a DEP primary facility type related to the Industrial Mineral Mining Program. The sub-facility types are listed below:
Deep Mine: Underground mining of industrial minerals, i.e., noncoal mining. Includes, but is not limited to, industrial minerals extracted from beneath the surface by means of shafts, tunnels, adits, or other mine openings.
Discharge Point: Discharge of water from an area as a result of industrial mining activities, i.e., noncoal mining.
Mineral Preparation Plant: Facility at which industrial minerals (i.e., noncoal minerals) are cleaned and processed.
Mining Stormwater GP: General permit for Stormwater discharges associated with industrial mineral mining activities in which the main pollutant is sediment. Discharge is not into a High Quality or Exceptional Value designated stream.
Surface Mine: Surface mining of industrial minerals (i.e., noncoal minerals) by removing material which lies above the industrial minerals. Includes, but is not limited to, strip, auger, quarry, dredging, and leaching mines."
This repository contains the scripts required to implement the Wikidata-based geocoding pipeline described in the accompanying paper.
geocode.sh: Shell script for setting up and executing Stanford CoreNLP with the required language models and entitylink annotator. Automates preprocessing, named entity recognition (NER), and wikification across a directory of plain-text (.txt) files. Configured for both local execution and high-performance computing (HPC) environments.
geocode.py: Python script that processes the list of extracted location entities (entities.txt) and retrieves latitude/longitude coordinates from Wikidata using Pywikibot. Handles redirects, missing pages, and missing coordinate values, returning standardized placeholder codes where necessary. Outputs results as a CSV file with columns for place name, latitude, longitude, and source file.
geocode.sbatch: Optional SLURM submission script for running run_corenlp.sh on HPC clusters. Includes configurable resource requests for scalable processing of large corpora.
README.md: Detailed README file including a line-by-line explanation of the geocode.sh file.
Together, these files provide a reproducible workflow for geocoding textual corpora via wikification, suitable for projects ranging from small-scale literary analysis to large-scale archival datasets.
https://doi.org/10.5061/dryad.83bk3j9zv
M. Delia Basanta Department of Biology, University of Nevada Reno. Reno, Nevada, USA. delibasanta@gmail.com
Julián A. Velasco Instituto de Ciencias de la Atmósfera y Cambio Climático, Universidad Nacional Autónoma de México. Ciudad de México, México. javelasco@atmosfera.unam.mx
Constantino González-Salazar Instituto de Ciencias de la Atmósfera y Cambio Climático, Universidad Nacional Autónoma de México. Ciudad de México, México. cgsalazar@atmosfera.unam.mx
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Objective: To geographically identify the beneficiaries categorized as prone to Type 2 Diabetes Mellitus by recognizing patterns in the database of a health plan operator through data mining. Method: The work proceeded in three steps: an initial step, consisting of the information survey; a development step, consisting of building the process for extracting, transforming, and loading the database; and a deployment step, consisting of presenting the geographical information through a georeferencing tool. Results: The mapping of Paraná according to its health care network and the concentration of Type 2 Diabetes Mellitus is presented, enabling the identification of cause-and-effect relationships. Conclusion: The analysis of georeferenced information, linked to health information obtained through data mining, can be an excellent tool for the health management of a health plan operator, contributing to the decision-making process in Health.
From the site: "Coal Pillar Locations are pillars of coal that must remain in place to provide support for a coal mine."
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Yearly citation counts for the publication titled "Spatial analysis and data mining techniques for identifying risk factors of Out-of-Hospital Cardiac Arrest".
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Travel regions are not necessarily defined by political or administrative boundaries. For example, in the Schengen region of Europe, tourists can travel freely irrespective of national borders. Identifying transboundary travel regions is an interesting problem, which we aim to solve using mobility analysis of Twitter users. Our proposed solution comprises collecting geotagged tweets, combining them into trajectories, and thus mining thousands of trips undertaken by Twitter users. After aggregating these trips into a mobility graph, we apply a community detection algorithm to find coherent regions throughout the world. The discovered regions provide insights into international travel and can reveal both domestic and transnational travel regions.
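The aggregation step can be sketched in a few lines of Python. This is only an illustration of turning per-user trajectories into a weighted, undirected mobility graph; the country-code place labels, the rule that a trip is any pair of consecutive distinct places, and the function names are assumptions, and the community detection step itself is not shown.

```python
from collections import Counter


def trips_from_trajectory(places):
    """Treat each pair of consecutive, distinct places as one trip."""
    return [(a, b) for a, b in zip(places, places[1:]) if a != b]


def build_mobility_graph(trajectories):
    """Count trips between each unordered pair of places across all users."""
    graph = Counter()
    for places in trajectories.values():
        for a, b in trips_from_trajectory(places):
            graph[tuple(sorted((a, b)))] += 1
    return graph


# Hypothetical trajectories: one time-ordered list of visited places per user.
trajectories = {
    "user_1": ["DE", "FR", "FR", "ES"],
    "user_2": ["FR", "DE"],
    "user_3": ["US", "CA"],
}

graph = build_mobility_graph(trajectories)
for (place_a, place_b), weight in sorted(graph.items()):
    print(place_a, place_b, weight)
```

The edge weights produced here are what a community detection algorithm would consume: pairs of places linked by many trips tend to end up in the same region.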
From the site: "Location of mined areas, including surface and deep coal and non-coal mining. Data incomplete, areas not mapped when screened at small scales during low level radioactive waste siting analysis."
Air pollution directly affects human health endpoints including growth, respiratory processes, cardiovascular health, fertility, pregnancy outcomes, and cancer. Therefore, the distribution of air pollution is a topic that is relevant to all, and of direct interest to many students. Air quality varies across space and time, often disproportionately affecting minority communities and impoverished neighborhoods. Air pollution is usually higher in locations where pollution sources are concentrated, such as industrial production facilities, highways, and coal-fired power plants. The United States Environmental Protection Agency manages a national air-quality monitoring program to measure and report air-pollutant levels across the United States. These data cover multiple decades and are publicly available via a website interface. For this lesson, students learn how to mine data from this website. They work in pairs to develop their own questions about air quality or air pollution that span spatial and/or temporal scales, and then gather the data needed to answer their question. The students analyze their data and write a scientific paper describing their work. This laboratory experience requires the students to generate their own questions, gather and interpret data, and draw conclusions, allowing for creativity and instilling ownership and motivation for deeper learning gains.
https://www.technavio.com/content/privacy-notice
Geographic Information System Analytics Market Size 2024-2028
The geographic information system analytics market size is forecast to increase by USD 12 billion at a CAGR of 12.41% between 2023 and 2028.
The GIS analytics market is experiencing significant growth, driven by the increasing need for efficient land management and emerging methods in data collection and generation. The defense industry's reliance on geospatial technology for situational awareness and real-time location monitoring is a major factor fueling market expansion. Additionally, the oil and gas industry's adoption of GIS for resource exploration and management is a key trend. Building Information Modeling (BIM) and smart city initiatives are also contributing to market growth, as they require multiple layered maps for effective planning and implementation. The Internet of Things (IoT) and Software as a Service (SaaS) are transforming GIS analytics by enabling real-time data processing and analysis.
Augmented reality is another emerging trend, as it enhances the user experience and provides valuable insights through visual overlays. Overall, heavy investments are required for setting up GIS stations and accessing data sources, making this a promising market for technology innovators and investors alike.
What will be the Size of the GIS Analytics Market during the forecast period?
The geographic information system analytics market encompasses various industries, including government sectors, agriculture, and infrastructure development. Smart city projects, building information modeling, and infrastructure development are key areas driving market growth. Spatial data plays a crucial role in sectors such as transportation, mining, and oil and gas. Cloud technology is transforming GIS analytics by enabling real-time data access and analysis. Startups are disrupting traditional GIS markets with innovative location-based services and smart city planning solutions. Infrastructure development in sectors like construction and green buildings relies on modern GIS solutions for efficient planning and management. Smart utilities and telematics navigation are also leveraging GIS analytics for improved operational efficiency.
GIS technology is essential for zoning and land use management, enabling data-driven decision-making. Smart public works and urban planning projects utilize mapping and geospatial technology for effective implementation. Surveying is another sector that benefits from advanced GIS solutions. Overall, the GIS analytics market is evolving, with a focus on providing actionable insights to businesses and organizations.
How is this Geographic Information System Analytics Industry segmented?
The geographic information system analytics industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.
End-user
Retail and Real Estate
Government
Utilities
Telecom
Manufacturing and Automotive
Agriculture
Construction
Mining
Transportation
Healthcare
Defense and Intelligence
Energy
Education and Research
BFSI
Components
Software
Services
Deployment Modes
On-Premises
Cloud-Based
Applications
Urban and Regional Planning
Disaster Management
Environmental Monitoring
Asset Management
Surveying and Mapping
Location-Based Services
Geospatial Business Intelligence
Natural Resource Management
Geography
North America
US
Canada
Europe
France
Germany
UK
APAC
China
India
South Korea
Middle East and Africa
UAE
South America
Brazil
Rest of World
By End-user Insights
The retail and real estate segment is estimated to witness significant growth during the forecast period.
The GIS analytics market is witnessing significant growth due to the increasing demand for advanced technologies in various industries. In the retail sector, for instance, retailers are utilizing GIS analytics to gain a competitive edge by analyzing customer demographics and buying patterns through real-time location monitoring and multiple layered maps. The retail industry's success relies heavily on these insights for effective marketing strategies. Moreover, the defense industries are integrating GIS analytics into their operations for infrastructure development, permitting, and public safety. Building Information Modeling (BIM) and 4D GIS software are increasingly being adopted for construction project workflows, while urban planning and designing require geospatial data for smart city planning and site selection.
The oil and gas industry is leveraging satellite imaging and IoT devices for land acquisition and mining operations. In the public sector, gover
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Preprocessed envelope EEG features based on a spatial filter approach. The features were computed across multiple within-trial SVIPT events for a large hyperparameter space on data of an exemplary subject.
The file "components.bsv" contains the preprocessed envelope features of all investigated configurations and provides the underlying parameters as well as a relative path, under the key "record_dir", to additional component information. Specifically, for each configuration the spatial filter, the spatial activity pattern, and the time-resolved within-trial envelope signal are provided under "records/".