Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this course, you will explore a variety of open-source technologies for working with geospatial data, performing spatial analysis, and undertaking general data science. The first component of the class focuses on the use of QGIS and associated technologies (GDAL, PROJ, GRASS, SAGA, and Orfeo Toolbox). The second component of the class introduces Python and associated open-source libraries and modules (NumPy, Pandas, Matplotlib, Seaborn, GeoPandas, Rasterio, WhiteboxTools, and Scikit-Learn) used by geospatial scientists and data scientists. We also provide an introduction to Structured Query Language (SQL) for performing table and spatial queries. This course is designed for individuals who have a background in GIS, such as working in the ArcGIS environment, but no prior experience using open-source software or coding. You will be asked to work through a series of lecture modules and videos broken into several topic areas, as outlined below. Fourteen assignments and the required data have been provided as hands-on opportunities to work with data and the discussed technologies and methods. If you have any questions or suggestions, feel free to contact us. We hope to continue to update and improve this course. This course was produced by West Virginia View (http://www.wvview.org/) with support from AmericaView (https://americaview.org/). This material is based upon work supported by the U.S. Geological Survey under Grant/Cooperative Agreement No. G18AP00077. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the opinions or policies of the U.S. Geological Survey. Mention of trade names or commercial products does not constitute their endorsement by the U.S. Geological Survey. After completing this course you will be able to: apply QGIS to visualize, query, and analyze vector and raster spatial data; use available resources to further expand your knowledge of open-source technologies; describe and use a variety of open data formats; code in Python at an intermediate level; read, summarize, visualize, and analyze data using open Python libraries; create spatial predictive models using Python and associated libraries; and use SQL to perform table and spatial queries at an intermediate level.
According to our latest research, the global Read-Only Replicas for BAS Analytics market size stood at USD 2.62 billion in 2024, reflecting robust demand for advanced analytics and business intelligence solutions across industries. The market is anticipated to grow at a CAGR of 13.8% during the forecast period, reaching an estimated value of USD 8.14 billion by 2033. This significant growth is fueled by the increasing need for real-time data accessibility, enhanced disaster recovery capabilities, and the growing adoption of cloud-based analytics platforms worldwide.
A primary driver of expansion in the Read-Only Replicas for BAS Analytics market is the surging demand for high-availability data systems that support business analytics services. Organizations are increasingly leveraging read-only replicas to offload analytics workloads from primary databases, ensuring uninterrupted performance for mission-critical applications. The proliferation of big data and the necessity for immediate insights have made it imperative for enterprises to deploy systems that can efficiently handle concurrent analytical queries without impacting transactional operations. This trend is particularly evident in sectors such as BFSI, healthcare, and retail, where real-time analytics can translate into competitive advantages, improved customer experiences, and operational efficiencies.
Another crucial growth factor is the rapid evolution of cloud technologies and their integration with advanced analytics platforms. Cloud-based deployment modes offer scalability, flexibility, and cost-effectiveness, enabling organizations of all sizes to implement read-only replica solutions without significant upfront investments in infrastructure. The shift towards cloud-native architectures has facilitated seamless disaster recovery, simplified data management, and enhanced the ability to scale analytics workloads dynamically. As a result, both large enterprises and small and medium enterprises (SMEs) are accelerating their migration to cloud environments, driving further adoption of read-only replicas for business analytics systems.
Additionally, the market is benefiting from increasing regulatory requirements and the need for robust data governance frameworks. Industries such as finance, healthcare, and government are subject to stringent compliance mandates that necessitate reliable data backup, audit trails, and disaster recovery mechanisms. Read-only replicas play a pivotal role in meeting these requirements by providing consistent, tamper-proof copies of data that can be used for analytics, reporting, and compliance audits. The integration of business intelligence and reporting tools with read-only replicas further enhances data transparency and accountability, fostering trust among stakeholders and regulatory bodies.
From a regional perspective, North America currently leads the global Read-Only Replicas for BAS Analytics market, accounting for the largest share in 2024 due to early technology adoption and the presence of major cloud service providers. However, the Asia Pacific region is poised for the fastest growth over the forecast period, driven by rapid digital transformation, expanding IT infrastructure, and increasing investments in data-driven decision-making across emerging economies. Europe also demonstrates significant potential, with organizations focusing on compliance, data security, and advanced analytics to maintain competitiveness in a dynamic business landscape.
The Read-Only Replicas for BAS Analytics market is segmented by component into software, hardware, and services, each playing a critical role in the overall ecosystem. The software segment dominates the market, driven by the continuous innovation in analytics platforms, database management systems, and business intelligence tools. Modern software solutions are designed to seamlessly integrate with existing data architectures, enabling organizations to deploy, manage,
Excel spreadsheets by species (the 4-letter code is an abbreviation for the genus and species used in the study; the year, 2010 or 2011, is the year the data were collected; SH indicates data for Science Hub; the date is the date of file preparation). The data in a file are described in a read-me file, which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read-me description of the columns in the data set for chemical analysis. In this file, one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk, D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee, and M. Plocher. Plant reproduction is altered by simulated herbicide drift to constructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Wikipedia is the largest and most-read free online encyclopedia currently in existence. As such, Wikipedia offers a large amount of data on all its own contents and the interactions around them, as well as different types of open data sources. This makes Wikipedia a unique data source that can be analyzed with quantitative data science techniques. However, the enormous amount of data makes it difficult to get an overview, and many of the analytical possibilities that Wikipedia offers remain unknown. To reduce the complexity of identifying and collecting data on Wikipedia and to expand its analytical potential, we have collected data from various sources, processed them, and generated a dedicated Wikipedia Knowledge Graph aimed at facilitating the analysis and contextualization of the activity and relations of Wikipedia pages, in this case limited to its English edition. We share this Knowledge Graph dataset openly, aiming to be useful to a wide range of researchers, such as informetricians, sociologists, or data scientists.
There are a total of 9 files, all of them in tsv format, built under a relational structure. The main one, acting as the core of the dataset, is the page file; after it there are 4 files with different entities related to the Wikipedia pages (the category, url, pub, and page_property files) and 4 other files that act as "intermediate tables", making it possible to connect the pages both with the latter and with each other (the page_category, page_url, page_pub, and page_link files).
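As a sketch of how the relational structure can be used, the snippet below joins the page file with one of the intermediate tables using pandas; the column names are assumptions, since the actual schema is documented in Dataset_summary.

import pandas as pd

# File paths follow the dataset's file names; the column names are
# assumptions, so check Dataset_summary for the actual schema.
pages = pd.read_csv("page.tsv", sep="\t")
page_links = pd.read_csv("page_link.tsv", sep="\t")

# Example join: attach page attributes to the source page of each link.
linked = page_links.merge(pages, left_on="source_page_id",
                          right_on="page_id", how="left")
print(linked.head())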
The document Dataset_summary includes a detailed description of the dataset.
Thanks to Nees Jan van Eck and the Centre for Science and Technology Studies (CWTS) for the valuable comments and suggestions.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the updated version of the dataset from 10.5281/zenodo.6320761
Information
The diverse publicly available compound/bioactivity databases constitute a key resource for data-driven applications in chemogenomics and drug design. Analysis of their coverage of compound entries and biological targets revealed considerable differences, however, suggesting the benefit of a consensus dataset. Therefore, we have combined and curated information from five esteemed databases (ChEMBL, PubChem, BindingDB, IUPHAR/BPS and Probes&Drugs) to assemble a consensus compound/bioactivity dataset comprising 1,144,648 compounds with 10,915,362 bioactivities on 5,613 targets (including defined macromolecular targets as well as cell-lines and phenotypic readouts). It also provides simplified information on the assay types underlying the bioactivity data and on bioactivity confidence by comparing data from different sources. We have unified the source databases, brought them into a common format, and combined them, enabling ease of use in multiple applications such as chemogenomics and data-driven drug design.
The consensus dataset provides increased target coverage and contains a higher number of molecules compared to the source databases, which is also evident from a larger number of scaffolds. These features render the consensus dataset a valuable tool for machine learning and other data-driven applications in (de novo) drug design and bioactivity prediction. The increased chemical and bioactivity coverage of the consensus dataset may improve the robustness of such models compared to the single source databases. In addition, semi-automated structure and bioactivity annotation checks, with flags for divergent data from different sources, may help data selection and further accurate curation.
This dataset belongs to the publication: https://doi.org/10.3390/molecules27082513
Structure and content of the dataset
ChEMBL ID | PubChem ID | IUPHAR ID | Target | Activity type | Assay type | Unit | Mean C (0) | ... | Mean PC (0) | ... | Mean B (0) | ... | Mean I (0) | ... | Mean PD (0) | ... | Activity check annotation | Ligand names | Canonical SMILES C | ... | Structure check (Tanimoto) | Source
The dataset was created using the Konstanz Information Miner (KNIME) (https://www.knime.com/) and was exported as a CSV file and a compressed CSV file.
Except for the canonical SMILES columns, all columns are filled with the datatype 'string'. The datatype for the canonical SMILES columns is the SMILES format. We recommend the File Reader node for using the dataset in KNIME. With the help of this node, the data types of the columns can be adjusted exactly. In addition, only this node can read the compressed format.
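Outside of KNIME, the compressed file can also be read with pandas in Python; a minimal sketch, assuming a gzip-compressed CSV and a hypothetical file name:

import pandas as pd

# Read every column as a string, mirroring the 'string' datatype described
# above; the file name is a hypothetical placeholder.
df = pd.read_csv("consensus_dataset.csv.gz", dtype=str, compression="gzip")
print(df.shape)

# The canonical SMILES columns can then be parsed with a cheminformatics
# toolkit such as RDKit if molecule objects are needed.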
Column content:
These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets capture multiple levels of user interaction, ranging from adding a book to a shelf to rating and reading it.
Metadata includes
reviews
add-to-shelf, read, review actions
book attributes: title, isbn
graph of similar books
Basic Statistics:
Items: 1,561,465
Users: 808,749
Interactions: 225,394,930
Market basket analysis with Apriori algorithm
The retailer wants to target customers with suggestions for the itemsets they are most likely to purchase. I was given a dataset containing the data of a retailer; the transaction data covers all the transactions that happened over a period of time. The retailer will use the results to grow in his industry and to provide customers with itemset suggestions, so we will be able to increase customer engagement, improve the customer experience, and identify customer behavior. I will solve this problem using Association Rules, an unsupervised learning technique that checks for the dependency of one data item on another data item.
Association rule mining is most useful when you are planning to build associations between different objects in a set. It works well when you are planning to find frequent patterns in a transaction database. It can tell you what items customers frequently buy together, and it allows the retailer to identify relationships between the items.
Assume there are 100 customers; 10 of them bought a computer mouse, 9 bought a mouse mat, and 8 bought both. For the rule "bought computer mouse => bought mouse mat":
- support = P(mouse & mat) = 8/100 = 0.08
- confidence = support / P(computer mouse) = 0.08/0.10 = 0.80
- lift = confidence / P(mouse mat) = 0.80/0.09 ≈ 8.9
This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
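To make the arithmetic concrete, here is a minimal Python sketch that reproduces the toy numbers above (the variable names are illustrative only):

# Toy numbers from the example above: 100 customers, 10 bought a
# computer mouse, 9 bought a mouse mat, and 8 bought both.
n_customers = 100
n_mouse, n_mat, n_both = 10, 9, 8

support = n_both / n_customers                   # P(mouse & mat) = 0.08
confidence = support / (n_mouse / n_customers)   # 0.08 / 0.10 = 0.80
lift = confidence / (n_mat / n_customers)        # 0.80 / 0.09 ≈ 8.9

print(round(support, 2), round(confidence, 2), round(lift, 1))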
Number of Attributes: 7
First, we need to load the required libraries.
Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.
Next, we will clean our data frame by removing missing values.
To apply association rule mining, we need to convert the data frame into transaction data so that all items bought together in one invoice will be in ...
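The walkthrough above uses R. For comparison, a hedged Python sketch of the same pipeline using the mlxtend library (not the assignment's original toolchain) might look as follows; the toy invoice list is made up and stands in for Assignment-1_Data.xlsx:

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical invoices: each inner list is the set of items on one invoice.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

# One-hot encode the transactions, then mine frequent itemsets and rules.
encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit(transactions).transform(transactions),
                      columns=encoder.columns_)
frequent_itemsets = apriori(onehot, min_support=0.25, use_colnames=True)
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])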
The Digital Surficial Geologic-GIS Map of the Big Thicket National Preserve Area, Texas is composed of GIS data layers and GIS tables, and is available in the following GRI-supported GIS data formats: 1.) an ESRI file geodatabase (btam_surficial_geology.gdb), a 2.) Open Geospatial Consortium (OGC) geopackage, and 3.) 2.2 KMZ/KML file for use in Google Earth, however, this format version of the map is limited in data layers presented and in access to GRI ancillary table information. The file geodatabase format is supported with a 1.) ArcGIS Pro map file (.mapx) file (btam_surficial_geology.mapx) and individual Pro layer (.lyrx) files (for each GIS data layer). The OGC geopackage is supported with a QGIS project (.qgz) file. Upon request, the GIS data is also available in ESRI shapefile format. Contact Stephanie O'Meara (see contact information below) to acquire the GIS data in these GIS data formats. In addition to the GIS data and supporting GIS files, three additional files comprise a GRI digital geologic-GIS dataset or map: 1.) a readme file (bith_geology_gis_readme.pdf), 2.) the GRI ancillary map information document (.pdf) file (bith_geology.pdf) which contains geologic unit descriptions, as well as other ancillary map information and graphics from the source map(s) used by the GRI in the production of the GRI digital geologic-GIS data for the park, and 3.) a user-friendly FAQ PDF version of the metadata (btam_surficial_geology_metadata_faq.pdf). Please read the bith_geology_gis_readme.pdf for information pertaining to the proper extraction of the GIS data and other map files. Google Earth software is available for free at: https://www.google.com/earth/versions/. QGIS software is available for free at: https://www.qgis.org/en/site/. Users are encouraged to only use the Google Earth data for basic visualization, and to use the GIS data for any type of data analysis or investigation. The data were completed as a component of the Geologic Resources Inventory (GRI) program, a National Park Service (NPS) Inventory and Monitoring (I&M) Division funded program that is administered by the NPS Geologic Resources Division (GRD). For a complete listing of GRI products visit the GRI publications webpage: https://www.nps.gov/subjects/geology/geologic-resources-inventory-products.htm. For more information about the Geologic Resources Inventory Program visit the GRI webpage: https://www.nps.gov/subjects/geology/gri.htm. At the bottom of that webpage is a "Contact Us" link if you need additional information. You may also directly contact the program coordinator, Jason Kenworthy (jason_kenworthy@nps.gov). Source geologic maps and data used to complete this GRI digital dataset were provided by the following: Texas Water Development Board. Detailed information concerning the sources used and their contribution to the GRI product are listed in the Source Citation section(s) of this metadata record (btam_surficial_geology_metadata.txt or btam_surficial_geology_metadata_faq.pdf). Users of this data are cautioned about the locational accuracy of features within this dataset. Based on the source map scale of 1:250,000 and United States National Map Accuracy Standards, features are within (horizontally) 127 meters or 416.7 feet of their actual location as presented by this dataset. Users of this data should thus not assume the location of features is exactly where they are portrayed in Google Earth, ArcGIS Pro, QGIS or other software used to display this dataset.
All GIS and ancillary tables were produced as per the NPS GRI Geology-GIS Geodatabase Data Model v. 2.3. (available at: https://www.nps.gov/articles/gri-geodatabase-model.htm).
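For users working outside of Esri software, the OGC geopackage can be read with GeoPandas. A minimal sketch follows, assuming the geopackage shares the geodatabase's base name (an assumption; check the actual file and layer names):

import fiona
import geopandas as gpd

# The geopackage file name is an assumption based on the geodatabase name.
gpkg = "btam_surficial_geology.gpkg"

# List the layers the geopackage actually contains, then read the first one.
layers = fiona.listlayers(gpkg)
print(layers)
surficial_units = gpd.read_file(gpkg, layer=layers[0])
print(surficial_units.crs, len(surficial_units))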
https://www.technavio.com/content/privacy-notice
Cloud Advertising Market Size 2024-2028
The cloud advertising market size is forecast to increase by USD 380.3 billion at a CAGR of 21.23% between 2023 and 2028. The market is experiencing significant growth due to the increasing adoption of cloud services and the shift from traditional to online advertising. Customer experience and brand loyalty are key priorities for businesses in the era of Internet commerce, leading them to explore advanced marketing strategies utilizing cloud-based advertising platforms. Data science and machine learning are integral components of these platforms, enabling personalized targeting and real-time campaign optimization. Moreover, artificial intelligence (AI) and machine learning-driven mobile SaaS and app-based solutions are gaining traction, offering agility and flexibility to marketers. However, data security concerns persist, necessitating strong security measures to protect sensitive customer information.
Request Free Sample
The market is witnessing significant growth as organizations increasingly adopt cloud-based solutions to enhance their digital advertising strategies. This shift is driven by the need for advanced consumer and customer analytics, which are crucial for effective omnichannel brand interactions. Cloud advertising services, including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS), enable businesses to store, process, and analyze large volumes of data in real-time. This capability is essential for data-driven marketing campaigns that leverage big data from various sources, such as social media, email marketing, in-app marketing, and company websites.
Furthermore, private cloud and hybrid cloud solutions are popular choices for organizations due to their enhanced security features and flexibility. Data warehouse solutions integrated with cloud advertising platforms offer advanced data analytics capabilities, enabling businesses to gain valuable insights into consumer behavior and preferences. Artificial intelligence (AI) and machine learning (ML) technologies are integral to cloud advertising services. These technologies enable automated targeting, personalized messaging, and real-time optimization, resulting in improved campaign performance and higher ROI. The market segmentation by organization size reveals that mid-sized and large enterprises dominate the market. These organizations have larger marketing budgets and a greater need for advanced analytics capabilities to manage complex digital advertising campaigns.
Market Segmentation
The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.
End-user
Retail
Media and entertainment
IT and telecom
BFSI
Others
Deployment
Private
Public
Hybrid
Geography
North America
US
Europe
Germany
UK
APAC
China
India
Middle East and Africa
South America
By End-user Insights
The retail segment is estimated to witness significant growth during the forecast period. In today's digital age, retail businesses are increasingly focusing on providing customer-centric experiences to stay competitive. With growing consumer spending power, the retail sector is poised for significant expansion. Traditionally, shoppers would visit physical stores to make purchases, but now they can access a vast array of goods online using mobile devices. Before making a purchase, consumers compare prices, read product reviews, and explore competitors' offerings. To attract customers to their brick-and-mortar stores, retailers employ various marketing strategies. However, managing customer data in large quantities is a challenge for these organizations. Hybrid environments, serverless architecture, and containers are becoming increasingly popular in the retail cloud market to address these data management issues.
Furthermore, cloud advertising services, such as programmatic advertising, are being adopted to reach potential customers more effectively. In the SaaS market, these solutions offer cost savings, flexibility, and scalability. However, data security concerns and strict cloud restrictions remain significant challenges. Retailers must ensure that their cloud solutions provide strong security measures to protect sensitive customer information. In conclusion, the retail industry's shift towards digital transformation has created a need for advanced cloud solutions. Hybrid environments, serverless architecture, and containers are key technologies driving growth in the retail cloud market. Cloud advertising services, such as programmatic advertising, offer retailers an effective way to reach potential customers. However, data security and cloud restrictions remain
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Natural history collections are leading successful large-scale projects of specimen digitization (images, metadata, DNA barcodes), transforming taxonomy into a big data science. Yet, little effort has been directed towards safeguarding and subsequently mobilizing the considerable amount of original data generated during the process of naming 15–20,000 species every year. From the perspective of alpha-taxonomists, we provide a review of the properties and diversity of taxonomic data, assess their volume and use, and establish criteria for optimizing data repositories. We surveyed 4113 alpha-taxonomic studies in representative journals for 2002, 2010, and 2018, and found an increasing yet comparatively limited use of molecular data in species diagnosis and description. In 2018, of the 2661 papers published in specialized taxonomic journals, molecular data were widely used in mycology (94%), regularly in vertebrates (53%), but rarely in botany (15%) and entomology (10%). Images play an important role in taxonomic research on all taxa, with photographs used in >80% and drawings in 58% of the surveyed papers. The use of omics (high-throughput) approaches or 3D documentation is still rare. Improved archiving strategies for metabarcoding consensus reads, genome and transcriptome assemblies, and chemical and metabolomic data could help to mobilize the wealth of high-throughput data for alpha-taxonomy. Because long term — ideally perpetual — data storage is of particular importance for taxonomy, energy footprint reduction via less storage-demanding formats is a priority if their information content suffices for the purpose of taxonomic studies. Whereas taxonomic assignments are quasi-facts for most biological disciplines, they remain hypotheses pertaining to evolutionary relatedness of individuals for alpha-taxonomy. For this reason, an improved re-use of taxonomic data, including machine-learning-based species identification and delimitation pipelines, requires a cyberspecimen approach—linking data via unique specimen identifiers, and thereby making them findable, accessible, interoperable, and reusable for taxonomic research. This poses both qualitative challenges to adapt the existing infrastructure of data centers to a specimen-centered concept and quantitative challenges to host and connect an estimated ≤2 million images produced per year by alpha-taxonomic studies, plus many millions of images from digitization campaigns. Of the 30–40,000 taxonomists globally, many are thought to be non-professionals, and capturing the data for online storage and reuse therefore requires low-complexity submission workflows and cost-free repository use. Expert taxonomists are the main stakeholders able to identify and formalize the needs of the discipline; their expertise is needed to implement the envisioned virtual collections of cyberspecimens.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The "LLM World of Words" (LWOW) [1] is a collection of datasets of English free association norms generated by various large language models (LLMs). Currently, the collection consists of datasets generated by Mistral, LLaMA3, and Claude Haiku. The datasets are modeled after the "Small World of Words" (SWOW) (https://smallworldofwords.org/en/project/) [2] English free association norms, generated by humans, consisting of over 12,000 cue words and over 3 million responses. The purpose of the LWOW datasets is to provide a way to investigate various aspects of the semantic memory of LLMs using an approach that has been applied extensively for investigating the semantic memory of humans. These datasets, together with the SWOW dataset, can be used to gain insights about similarities and differences in the language structures possessed by humans and LLMs.
Free associations are implicit mental connections between words or concepts. They are typically accessed by presenting humans (or AI agents) with a cue word and then asking them to respond with the first words that come to mind. The responses represent implicit associations that connect different concepts in the mind, reflecting the semantic representations that underlie patterns of thought, memory, and language. For example, given the cue word "woman", a common free association response might be "man", reflecting the associative mental relation between these two concepts.
Free associations have been extensively used in cognitive psychology and linguistics as a tool for studying language and cognitive information processing. They provide a way for researchers to understand how conceptual knowledge is organized and accessed in the mind. Free associations are often used to build network models of semantic memory by connecting cue words to their responses. When thousands of cues and responses are connected in this way, the result is a complex network model that represents the complex organization of semantic knowledge. Such models enable the investigation of cognitive processes that take place within semantic memory, and can be used to study a variety of cognitive phenomena such as language learning, creativity, personality traits, and cognitive biases.
The LWOW datasets were validated using data from the Semantic Priming Project (https://www.montana.edu/attmemlab/spp.html) [3], which implements a lexical decision task (LDT) to study semantic priming. The semantic priming effect is the cognitive phenomenon that a target word (e.g. nurse) is more easily recognized when it is prompted by a related prime word (e.g. doctor) compared to an unrelated prime word (e.g. doctrine). We simulated the semantic priming effect within network models of semantic memory built from both the LWOW and the SWOW free association norms by implementing spreading activation processes within the networks [4]. We found that the final activation levels of prime-target pairs correlated significantly with reaction time data for the same prime-target pairs from the LDT. Specifically, the activation of a target node (e.g. nurse) is higher when a related prime node (e.g. doctor) is activated compared to an unrelated prime node (e.g. doctrine). These results demonstrate how the LWOW datasets can be used for investigating cognitive and linguistic phenomena in LLMs, demonstrating the validity of the datasets.
To demonstrate how this dataset can be used to investigate gender biases in LLMs compared to humans, we conducted an analysis using network models of semantic memory built from both the LWOW and the SWOW free association norms. We applied a methodology that simulates semantic priming within the networks to measure the strength of association between pairs of concepts, for example, "woman" and "forceful" vs. "man" and "forceful". We applied this methodology using a set of female-related and male-related primes, and a set of female-related and male-related targets. This analysis revealed that certain adjectives like "forceful" and "strong" are more strongly associated with certain genders, shedding light on the types of stereotypical gender biases that both humans and LLMs possess.
The free associations were generated (either via API or locally, depending on the LLM) by providing each LLM with a set of cue words and the following prompt: "You will be provided with an input word. Write the first 3 words you associate to it separated by a comma." This prompt was repeated 100 times for each cue word, resulting in a dataset of 11,545 unique cue words and 3,463,500 total responses for each LLM.
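As an illustration of that protocol, a minimal sketch follows; query_llm is a hypothetical helper standing in for whichever API or local model call was actually used.

PROMPT = ("You will be provided with an input word. Write the first 3 "
          "words you associate to it separated by a comma.")

def collect_norms(cues, query_llm, n_repeats=100):
    # query_llm is a hypothetical function that sends PROMPT plus the cue
    # to an LLM (via API or locally) and returns its raw text reply.
    norms = []
    for cue in cues:
        for _ in range(n_repeats):
            reply = query_llm(PROMPT, cue)
            responses = [word.strip() for word in reply.split(",")][:3]
            norms.append((cue, responses))
    return norms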
The LWOW datasets for Mistral, Llama3, and Haiku can be found in the LWOW_datasets folder, which contains two subfolders. The .csv files of the processed cues and responses can be found in the processed_datasets folder while the .csv files of the edge lists of the semantic networks constructed from the datasets can be found in the graphs/edge_lists folder.
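For example, one of the edge-list files can be loaded into a NetworkX graph roughly as follows; the file name and column names here are assumptions, so check the actual headers first.

import pandas as pd
import networkx as nx

# File name and column names are assumptions; inspect the files in
# graphs/edge_lists before running.
edges = pd.read_csv("graphs/edge_lists/mistral_edge_list.csv")
G = nx.from_pandas_edgelist(edges, source="source", target="target")
print(G.number_of_nodes(), G.number_of_edges())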
Since the LWOW datasets are intended to be used in comparison to humans, we have further processed the original SWOW dataset to create a Human dataset that is aligned with the processing that we applied to the LWOW datasets. While this Human dataset is not included in this repository due to the license of the original SWOW dataset, it can be easily reproduced by running the code provided in the reproducibility folder. We highly encourage you to generate this dataset, as it enables a direct comparison between humans and LLMs. The Human dataset can be generated with the following steps:
To reproduce the analyses, first the required external files need to be downloaded:
Once the files are saved in the correct folders, follow the instructions in each script, which can be found in the reproducibility folder. The scripts should be run in the following order:
Abramski, K., et al. (2024). The "LLM World of Words" English free association norms generated by large language models (https://arxiv.org/abs/2412.01330)
For speaking requests and enquiries, please contact:
[1] Abramski, K., et al. (2024). The "LLM World of Words" English free association norms generated by large language models (https://arxiv.org/abs/2412.01330)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Studying the graph characteristics of these networks is beneficial; moreover, understanding the vulnerabilities and attack possibilities unique to these networks allows us to develop proactive defense mechanisms and mitigate potential threats.
Data collection method: ask all reachable nodes continuously for their known peers. In Bitcoin's parlance, we send GETADDR messages and store all ADDR replies, drawing a connection between the sending node and all IP addresses contained in the ADDR message.
All IP addresses have been replaced by numbers (NodeID) for ethical reasons. NodeIDs are consistent across all files: the same NodeID corresponds to the same IP in ALL files (if present). Filenames contain the timestamp and the corresponding network. The date-time format is YYYYMMDD-HHMISS.
File contents: the edgelist files store information about the structure of the connectivity graph. Each file represents an edgelist of a graph at the specified timestamp. Each line in a file corresponds to the list of peers known to a node. The NodeID of the node is the first number of each line. Example: the following line
S N1 N2 N3 N4
means that node S knows of nodes N1..N4; their IP addresses were included in S's ADDR responses.
To process the files in SNAP or NetworkX, proper transformations have to be made. Please read the relevant documentation to find the appropriate input format.
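As one example of such a transformation, the sketch below expands each "S N1 N2 ..." line into a star of directed edges for NetworkX; the example file name is hypothetical.

import networkx as nx

def load_peer_edgelist(path):
    # Parse lines like "S N1 N2 N3 N4" into directed edges (S, N1) ... (S, N4),
    # pointing from the responding node to each peer it reported.
    G = nx.DiGraph()
    with open(path) as fh:
        for line in fh:
            ids = line.split()
            if len(ids) < 2:
                continue
            source, peers = ids[0], ids[1:]
            G.add_edges_from((source, peer) for peer in peers)
    return G

# Hypothetical file name following the YYYYMMDD-HHMISS naming scheme:
# G = load_peer_edgelist("20210101-000000_bitcoin.txt")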
This dataset has been used in the following works:
@inproceedings{aris_ssec,
author = {Paphitis, Aristodemos and Kourtellis, Nicolas and Sirivianos, Michael},
title = {Graph Analysis of Blockchain {P2P} Overlays and their Security Implications},
booktitle = {Proceedings of the 9th International Symposium on Security and Privacy in Social Networks and Big Data (SocialSec 2023)},
series = {Lecture Notes in Computer Science},
volume = {13983},
publisher = {Springer Nature},
year = {2023},
}
Please cite as:
Aristodemos Paphitis, Nicolas Kourtellis, and Michael Sirivianos. A First Look into the Structural Properties of Blockchain P2P Overlays. DOI:https://doi.org/10.6084/m9.figshare.23522919
bibtex:
@misc{paphitis_first_nodate,
author = {Paphitis, Aristodemos and Kourtellis, Nicolas and Sirivianos, Michael},
title = {A First Look into the Structural Properties of Blockchain {P2P} Overlays},
howpublished = {Public dataset with figshare},
doi = {10.6084/m9.figshare.23522919},
}
https://www.technavio.com/content/privacy-notice
Direct Attached AI Storage System Market Size 2025-2029
The direct attached AI storage system market size is forecast to increase by USD 19.98 billion, at a CAGR of 24.2% from 2024 to 2029. The imperative for low latency and high throughput will drive the direct attached AI storage system market.
Major Market Trends & Insights
North America dominated the market, accounting for 39% of growth during the forecast period.
By Product - Hardware segment was valued at USD 1.03 billion in 2023
By Method - Block storage segment accounted for the largest market revenue share in 2023
Market Size & Forecast
Market Opportunities: USD 1.00 million
Market Future Opportunities: USD 19,983.20 million
CAGR from 2024 to 2029: 24.2%
Market Summary
The market is experiencing significant growth, with an increasing number of businesses recognizing the importance of low latency and high throughput in handling artificial intelligence (AI) workloads. These systems, which allow direct access to storage devices without the need for a network intermediary, are essential for handling the massive data requirements of AI applications. One notable trend in this market is the drive towards petabyte-scale, hyper-dense systems. This reflects the growing need for businesses to store and process vast amounts of data in a compact space. Cloud native and cloud-adjacent technologies, like machine learning and artificial intelligence, are transforming industries, from edge computing to big data analysis. However, this scalability comes with its challenges. Data sharing limitations can pose a hurdle, making it difficult for multiple users or applications to access the same data simultaneously.
Despite these challenges, the market for Direct Attached AI Storage Systems is expected to continue expanding. According to recent market research, the global market for AI storage systems is projected to reach USD 15.7 billion by 2026, growing at a compound annual growth rate of 34.5% from 2021 to 2026. This underscores the market's potential for businesses seeking to harness the power of AI and handle the resulting data demands. In conclusion, the market is a critical component in the evolving landscape of AI technology. Its inherent scalability and data sharing limitations make it a key area of focus for businesses seeking to optimize their AI workloads and manage the resulting data requirements.
The market's continued growth is a testament to its importance in the broader context of AI adoption and innovation.
What will be the Size of the Direct Attached AI Storage System Market during the forecast period?
Get Key Insights on Market Forecast (PDF) Request Free Sample
How is the Direct Attached AI Storage System Market Segmented?
The direct attached AI storage system industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Product
Hardware
Software
Method
Block storage
File storage
Object storage
Type
Solid state drive
Hard disk drive
End-user
Enterprises
Cloud services providers
Government bodies
Telecom companies
Geography
North America
US
Canada
Europe
France
Germany
UK
APAC
China
India
Japan
South Korea
South America
Brazil
Rest of World (ROW)
By Product Insights
The hardware segment is estimated to witness significant growth during the forecast period.
The market continues to evolve, with the hardware segment serving as the critical foundation for data retention and swift delivery to AI computational resources. This segment comprises specialized components designed to meet the stringent performance requirements of artificial intelligence and machine learning workloads. The cornerstone of this infrastructure is the storage media, predominantly solid state drives (SSDs) based on NVMe technology. These enterprise-grade SSDs are optimized for high sustained write performance, low read latency, and superior endurance to handle the intensive, repetitive read-write cycles inherent in AI model training. In the realm of data management, power efficiency, storage lifecycle management, and data security protocols are paramount.
Strategies for data migration, redundancy, deduplication, and compression are essential for optimizing storage provisioning and capacity planning. Performance monitoring, fault tolerance, and system reliability are also crucial, with throughput improvement and latency optimization key performance metrics. Scalability solutions, storage virtualization, and RAID configurations further enhance storage performance and system resilience. Data encryption, access control, and disaster recovery planning are integral components of the data man
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A Long-term Gap-free High-resolution Air Pollutants concentration dataset (abbreviated as LGHAP) is of great significance for environmental management and earth system science analysis. In the current release of the LGHAP dataset (LGHAP v2), we provide 22-year-long gap-free aerosol optical depth (AOD) and near-surface PM2.5 concentrations with daily 1-km resolution covering the global land area from 2000 to 2021. Leveraging an improved big earth data analytic framework with attention-reinforced tensor construction and adaptive background information updating schemes, gap-free AOD grids were first derived via an integration of multimodal AODs and air quality measurements acquired from diverse satellites, ground monitors, and numerical models. For better predicting PM2.5 concentration across the globe, a scene-aware ensemble learning graph attention network (SCAGAT) was then developed to account for large modeling bias over regions with limited or even no in situ air quality measurements. These datasets were archived in the NetCDF (nc) format, with each year's data archived as an individual submission. Python, MATLAB, R, and IDL codes were also provided to help users read and visualize the LGHAP v2 data.
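The bundled scripts are the authoritative readers; for Python users without them, a minimal xarray sketch along these lines should work, though the file and variable names here are assumptions:

import xarray as xr

# File and variable names are assumptions; the bundled reader scripts
# document the actual layout of each yearly NetCDF archive.
ds = xr.open_dataset("LGHAP.PM25_0.01deg_2021.nc")
print(ds)

# Select one day's gap-free PM2.5 grid and compute its global mean.
pm25_day = ds["PM25"].sel(time="2021-01-01")
print(float(pm25_day.mean()))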
https://www.cognitivemarketresearch.com/privacy-policy
According to Cognitive Market Research, the global application testing services market size is USD 50.14 billion in 2024 and will expand at a compound annual growth rate (CAGR) of 13.5% from 2024 to 2031.
Market Dynamics of Application Testing Services Market
Key Drivers for Application Testing Services Market
Growing Complexity of Modern Software - One major factor driving market growth is that comprehensive testing is essential to guarantee the functionality and performance of applications, due to the increasing complexity of these applications and their integration with different technologies and platforms. The need for application testing services capable of handling various systems and requirements is driven by this complexity. Advancements in digital technology and the emphasis on software solutions in application testing services accelerate market growth.
Key Restraints for Application Testing Services Market
One major obstacle is the shortage of qualified workers and the limited availability of free testing resources. Other factors expected to slow the market are rising concerns about securing application testing services and high implementation costs.
Test environment management becomes a challenge as advancements grow more complex
Virtual test environments mimic actual scenarios, have strong security, and allow testers to establish unlimited user configurations and test with alternate user profiles, but issues such as equipment conflicts between drivers of the equipment being tested and the virtual environment may arise and compound the already present complexities. Manual testing environments are costly, time-intensive, and complicated, and can potentially consume 40% of the tester's time, impacting quality and productivity, as stated by Wipro. Application testing services firms know that dealing with numerous slow, inconsistent, and unreliable test environments means higher operating costs and reduced test coverage. Configuration problems in test environments, disjointed manual test environments, and no monitoring of test environment assets are just a few of the most significant test environment issues.
Opportunities for Application Testing Services
Advances in technologies like AI, Big Data and others is going to propel the Application Testing Services Market
Implementation of new advanced technologies like Big Data, AI, machine learning, and so on assists in delivering accurate real-time data analysis and is viewed as among the key factors driving application testing services market growth. The increasing speed of innovation in emerging technologies, networks, and practices is pushing firms to implement newer approaches to obtain intelligent testing solutions that can cope with the intricacies of new product launches while improving product quality, reducing costs, and speeding up time-to-revenue. In 2019, the World Quality Report, a combined publication of Capgemini, Microfocus, and Sogeti, reported that artificial intelligence (AI) for testing is an emerging trend to speed up the testing process for digital initiatives in financial services.
Introduction of the Application Testing Services Market
Application testing services provide a whole range of testing procedures to guarantee that software applications work as expected across various network, device, and operating system settings. This market is experiencing strong demand due to the widespread requirement for powerful software applications in multiple industries, such as technology, healthcare, retail, and finance. Factors including the increasing use of mobile applications, the necessity of continuously ensuring the quality of software updates that are released quickly, and the popularity of agile and DevOps approaches in software development are driving this industry forward. A key component propelling the market forward is the surge in demand from organizations seeking to enhance the performance of application testing software. The need for application testing services is increasing due to the expanding popularity of agile testing and crowdsourced testing. The expansion of cities and the popularity of mobile apps are two major trends shaping the growth of the application testing services industry.
https://www.technavio.com/content/privacy-notice
SRAM And ROM Design IP Market Size and Trends
The SRAM And ROM Design IP market size is forecast to increase by USD 31.2 million, at a CAGR of 1.8% between 2023 and 2028. The market is experiencing significant growth due to the increasing demand for power-efficient solutions in various applications. One of the primary drivers is the expanding data center sector, where low-power SRAM and ROM designs are essential for data processing and power consumption management. Additionally, the proliferation of connected devices, 5G networks, and edge computing are fueling the need for high-density, low-power memory solutions. Autonomous vehicles and non-volatile memory technologies are also contributing to market growth. In the automotive industry, SRAM and ROM IP are crucial for data processing and storage in advanced driver assistance systems (ADAS) and autonomous driving applications. Furthermore, the adoption of non-volatile memory, such as MRAM and ReRAM, is increasing due to their ability to offer higher density, lower power consumption, and faster access times compared to traditional volatile memory. Enterprise storage and cloud storage are other significant markets for SRAM and ROM Design IP. Hyper-converged infrastructure (HCI) and other data processing applications require high-performance, low-power memory solutions to handle increasing data workloads efficiently. The market analysis report also highlights the challenges faced by the market, including scaling issues and the need for further innovation to meet the evolving demands of various applications.
Request Free Sample
The SRAM (Static Random Access Memory) and ROM (Read-Only Memory) Design IP market represents a significant segment in the semiconductor industry, catering to the increasing demand for memory solutions in data-intensive applications. These applications span across various domains, including artificial intelligence (AI), machine learning, big data analytics, Internet of Things (IoT), and other electronics components. Memory-intensive applications, such as microcontrollers, embedded systems, programmable devices, and application-specific integrated circuits (ASICs), require high-performance, compact memory technologies. SRAM and ROM design IPs offer solutions that meet these demands, ensuring optimal memory density and power consumption. SRAM technologies have gained popularity due to their fast read and write access times, making them suitable for applications requiring frequent data access. On the other hand, ROM technologies offer the benefits of non-volatility, ensuring data retention even without power. The SRAM market continues to evolve, with a focus on low power consumption and compact technologies.
Small SRAM modules, such as Serial Peripheral Interface (SPI) SRAMs and Synchronous On-Chip Memory (SOM), cater to the needs of low-power computer vision applications and other power-sensitive systems. Design intellectual property (IP) plays a crucial role in the semiconductor manufacturing process. SRAM and ROM design IPs enable OEM manufacturers to integrate these memory solutions into their products efficiently, reducing development time and costs. In the context of data centers, SRAM and ROM design IPs contribute to the development of high-performance, energy-efficient memory solutions. These solutions enable faster data processing, reducing latency and enhancing overall system performance. In conclusion, the market offers essential memory solutions for memory-intensive applications across various domains. The focus on compact technologies, low power consumption, and high memory density ensures that these solutions cater to the evolving needs of the semiconductor industry. By integrating SRAM and ROM design IPs into their products, OEM manufacturers can deliver innovative, high-performance solutions to their customers.
Market Segmentation
The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.
Type
SRAM
ROM
Geography
APAC
China
India
Japan
South Korea
North America
Canada
US
Europe
Germany
France
Italy
Spain
South America
Middle East and Africa
By Type Insights
The SRAM segment is estimated to witness significant growth during the forecast period. The market encompasses solutions that cater to power consumption requirements in various sectors, including data centers, connected devices, and autonomous vehicles. Power efficiency is crucial in data processing for data centers, where 5G and edge computing are driving the need for low-power solutions. In the realm of connected devices, the demand for high-density memory and non-volatile memory is escalating. These memory types ensure data retention even when
The Digital Geologic-GIS Map of Big Cypress National Preserve and Vicinity, Florida is composed of GIS data layers and GIS tables, and is available in the following GRI-supported GIS data formats: 1.) a 10.1 file geodatabase (bicy_geology.gdb), a 2.) Open Geospatial Consortium (OGC) geopackage, and 3.) 2.2 KMZ/KML file for use in Google Earth, however, this format version of the map is limited in data layers presented and in access to GRI ancillary table information. The file geodatabase format is supported with a 1.) ArcGIS Pro map file (.mapx) file (bicy_geology.mapx) and individual Pro layer (.lyrx) files (for each GIS data layer), as well as with a 2.) 10.1 ArcMap (.mxd) map document (bicy_geology.mxd) and individual 10.1 layer (.lyr) files (for each GIS data layer). The OGC geopackage is supported with a QGIS project (.qgz) file. Upon request, the GIS data is also available in ESRI 10.1 shapefile format. Contact Stephanie O'Meara (see contact information below) to acquire the GIS data in these GIS data formats. In addition to the GIS data and supporting GIS files, three additional files comprise a GRI digital geologic-GIS dataset or map: 1.) a GIS readme file (bicy_geology_gis_readme.pdf), 2.) the GRI ancillary map information document (.pdf) file (bicy_geology.pdf) which contains geologic unit descriptions, as well as other ancillary map information and graphics from the source map(s) used by the GRI in the production of the GRI digital geologic-GIS data for the park, and 3.) a user-friendly FAQ PDF version of the metadata (bicy_geology_metadata_faq.pdf). Please read the bicy_geology_gis_readme.pdf for information pertaining to the proper extraction of the GIS data and other map files. Google Earth software is available for free at: https://www.google.com/earth/versions/. QGIS software is available for free at: https://www.qgis.org/en/site/. Users are encouraged to only use the Google Earth data for basic visualization, and to use the GIS data for any type of data analysis or investigation. The data were completed as a component of the Geologic Resources Inventory (GRI) program, a National Park Service (NPS) Inventory and Monitoring (I&M) Division funded program that is administered by the NPS Geologic Resources Division (GRD). For a complete listing of GRI products visit the GRI publications webpage: https://www.nps.gov/subjects/geology/geologic-resources-inventory-products.htm. For more information about the Geologic Resources Inventory Program visit the GRI webpage: https://www.nps.gov/subjects/geology/gri.htm. At the bottom of that webpage is a "Contact Us" link if you need additional information. You may also directly contact the program coordinator, Jason Kenworthy (jason_kenworthy@nps.gov). Source geologic maps and data used to complete this GRI digital dataset were provided by the following: Florida Geological Survey, U.S. Geological Survey and Earthfx Incorporated/BEM Systems Inc. Detailed information concerning the sources used and their contribution to the GRI product are listed in the Source Citation section(s) of this metadata record (bicy_geology_metadata.txt or bicy_geology_metadata_faq.pdf). Users of this data are cautioned about the locational accuracy of features within this dataset. Based on the source map scale of 1:675,000 and United States National Map Accuracy Standards, features are within (horizontally) 342.9 meters or 1125 feet of their actual location as presented by this dataset.
Users of this data should thus not assume the location of features is exactly where they are portrayed in Google Earth, ArcGIS, QGIS or other software used to display this dataset. All GIS and ancillary tables were produced as per the NPS GRI Geology-GIS Geodatabase Data Model v. 2.3. (available at: https://www.nps.gov/articles/gri-geodatabase-model.htm).
The Digital Geologic-GIS Map of the Big Pine 15' Quadrangle, California is composed of GIS data layers and GIS tables, and is available in the following GRI-supported GIS data formats: 1.) a 10.1 file geodatabase (bigp_geology.gdb), and a 2.) Open Geospatial Consortium (OGC) geopackage. The file geodatabase format is supported with a 1.) ArcGIS Pro map file (.mapx) file (bigp_geology.mapx) and individual Pro layer (.lyrx) files (for each GIS data layer), as well as with a 2.) 10.1 ArcMap (.mxd) map document (bigp_geology.mxd) and individual 10.1 layer (.lyr) files (for each GIS data layer). Upon request, the GIS data is also available in ESRI 10.1 shapefile format. Contact Stephanie O'Meara (see contact information below) to acquire the GIS data in these GIS data formats. In addition to the GIS data and supporting GIS files, three additional files comprise a GRI digital geologic-GIS dataset or map: 1.) a GIS readme file (seki_manz_geology_gis_readme.pdf), 2.) the GRI ancillary map information document (.pdf) file (seki_manz_geology.pdf) which contains geologic unit descriptions, as well as other ancillary map information and graphics from the source map(s) used by the GRI in the production of the GRI digital geologic-GIS data for the park, and 3.) a user-friendly FAQ PDF version of the metadata (bigp_geology_metadata_faq.pdf). Please read the seki_manz_geology_gis_readme.pdf for information pertaining to the proper extraction of the GIS data and other map files. QGIS software is available for free at: https://www.qgis.org/en/site/. The data were completed as a component of the Geologic Resources Inventory (GRI) program, a National Park Service (NPS) Inventory and Monitoring (I&M) Division funded program that is administered by the NPS Geologic Resources Division (GRD). For a complete listing of GRI products visit the GRI publications webpage: https://www.nps.gov/subjects/geology/geologic-resources-inventory-products.htm. For more information about the Geologic Resources Inventory Program visit the GRI webpage: https://www.nps.gov/subjects/geology/gri.htm. At the bottom of that webpage is a "Contact Us" link if you need additional information. You may also directly contact the program coordinator, Jason Kenworthy (jason_kenworthy@nps.gov). Source geologic maps and data used to complete this GRI digital dataset were provided by the following: U.S. Geological Survey. Detailed information concerning the sources used and their contribution to the GRI product are listed in the Source Citation section(s) of this metadata record (bigp_geology_metadata.txt or bigp_geology_metadata_faq.pdf). Users of this data are cautioned about the locational accuracy of features within this dataset. Based on the source map scale of 1:62,500 and United States National Map Accuracy Standards, features are within (horizontally) 31.8 meters or 104.2 feet of their actual location as presented by this dataset. Users of this data should thus not assume the location of features is exactly where they are portrayed in ArcGIS, QGIS or other software used to display this dataset. All GIS and ancillary tables were produced as per the NPS GRI Geology-GIS Geodatabase Data Model v. 2.3. (available at: https://www.nps.gov/articles/gri-geodatabase-model.htm).
The Digital Geomorphic-GIS Map of Big Thicket National Preserve and Vicinity, Texas is composed of GIS data layers and GIS tables, and is available in the following GRI-supported GIS data formats: 1.) an ESRI file geodatabase (bitl_geomorphology.gdb), 2.) an Open Geospatial Consortium (OGC) geopackage, and 3.) a 2.2 KMZ/KML file for use in Google Earth; note, however, that this format version of the map is limited in the data layers presented and in access to GRI ancillary table information. The file geodatabase format is supported with 1.) an ArcGIS Pro map (.mapx) file (bitl_geomorphology.mapx) and individual Pro layer (.lyrx) files (for each GIS data layer). The OGC geopackage is supported with a QGIS project (.qgz) file; a minimal sketch of reading the geopackage with open-source tools follows this entry. Upon request, the GIS data is also available in ESRI shapefile format. Contact Stephanie O'Meara (see contact information below) to acquire the GIS data in these GIS data formats. In addition to the GIS data and supporting GIS files, three additional files comprise a GRI digital geologic-GIS dataset or map: 1.) a readme file (bith_geology_gis_readme.pdf), 2.) the GRI ancillary map information document (.pdf) file (bith_geology.pdf), which contains geologic unit descriptions, as well as other ancillary map information and graphics from the source map(s) used by the GRI in the production of the GRI digital geologic-GIS data for the park, and 3.) a user-friendly FAQ PDF version of the metadata (bitl_geomorphology_metadata_faq.pdf). Please read the bith_geology_gis_readme.pdf for information pertaining to the proper extraction of the GIS data and other map files. Google Earth software is available for free at: https://www.google.com/earth/versions/. QGIS software is available for free at: https://www.qgis.org/en/site/. Users are encouraged to use the Google Earth data only for basic visualization, and to use the GIS data for any type of data analysis or investigation. The data were completed as a component of the Geologic Resources Inventory (GRI) program, a National Park Service (NPS) Inventory and Monitoring (I&M) Division funded program that is administered by the NPS Geologic Resources Division (GRD). For a complete listing of GRI products visit the GRI publications webpage: https://www.nps.gov/subjects/geology/geologic-resources-inventory-products.htm. For more information about the Geologic Resources Inventory Program visit the GRI webpage: https://www.nps.gov/subjects/geology/gri.htm. At the bottom of that webpage is a "Contact Us" link if you need additional information. You may also directly contact the program coordinator, Jason Kenworthy (jason_kenworthy@nps.gov). Source geologic maps and data used to complete this GRI digital dataset were provided by the following: Texas Bureau of Economic Geology, University of Texas at Austin. Detailed information concerning the sources used and their contribution to the GRI product are listed in the Source Citation section(s) of this metadata record (bitl_geomorphology_metadata.txt or bitl_geomorphology_metadata_faq.pdf). Users of this data are cautioned about the locational accuracy of features within this dataset. Based on the source map scale of 1:500,000 and United States National Map Accuracy Standards, features are within (horizontally) 254 meters or 833.3 feet of their actual location as presented by this dataset. Users of this data should therefore not assume that features are located exactly where they are portrayed in Google Earth, ArcGIS Pro, QGIS, or other software used to display this dataset.
All GIS and ancillary tables were produced as per the NPS GRI Geology-GIS Geodatabase Data Model v. 2.3 (available at: https://www.nps.gov/articles/gri-geodatabase-model.htm).
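Because each GRI map is distributed as an OGC geopackage, the data can be inspected entirely with open-source tools such as GeoPandas and Fiona. A minimal sketch, assuming the geopackage has been downloaded locally (the file name below is a placeholder modeled on the distribution described above, and the layer contents should be enumerated rather than assumed):

# Minimal sketch: inspecting a GRI geopackage with open-source tools.
# The file path below is a placeholder; list the layers the file
# actually contains instead of assuming their names.
import fiona
import geopandas as gpd

gpkg_path = "bitl_geomorphology.gpkg"  # placeholder: actual file name may differ

layers = fiona.listlayers(gpkg_path)  # enumerate the layers present
print(layers)

gdf = gpd.read_file(gpkg_path, layer=layers[0])  # read one layer into a GeoDataFrame
print(gdf.crs)      # coordinate reference system of the layer
print(gdf.head())   # first few features and their attributes

From here the layer can be queried, joined to the GRI ancillary tables, or plotted, exactly the kind of analysis the records above recommend performing on the GIS data rather than on the Google Earth version.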
International Journal of Computational Intelligence Systems - The International Journal of Computational Intelligence Systems is an international peer-reviewed journal and the official publication of the European Society for Fuzzy Logic and Technologies (EUSFLAT). The journal publishes original research on all aspects of applied computational intelligence, especially targeting papers demonstrating the use of techniques and methods originating from computational intelligence theory. This is an open-access journal, i.e., all articles are immediately and permanently free to read, download, copy, and distribute. The journal is published under the CC BY-NC 4.0 user license, which defines the permitted third-party reuse of its articles. Aims & Scope: The core theories of computational intelligence are fuzzy logic, neural networks, evolutionary computation, and probabilistic reasoning. The journal publishes only articles related to the use of computational intelligence and broadly covers the following topics: autonomous reasoning, bio-informatics, cloud computing, condition monitoring, data science, data mining, data visualization, decision support systems, fault diagnosis, intelligent information retrieval, human-machine interaction and interfaces, image processing, internet and networks, noise analysis, pattern recognition, prediction systems, power (nuclear) safety systems, process and system control, real-time systems, risk analysis and safety-related issues, robotics, signal and image processing, IoT and smart environments, systems integration, system control, system modelling and optimization, telecommunications, time series prediction, warning systems, virtual reality, web intelligence, and deep learning.