Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The various performance criteria applied in this analysis include the probability of reaching the ultimate target, the costs, elapsed times and system vulnerability resulting from any intrusion. This Excel file contains all the logical, probabilistic and statistical data entered by a user, and required for the evaluation of the criteria. It also reports the results of all the computations.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example below, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019 and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart—that cell represents the slope for the temporal range 2004–2014. This publication entry also includes an excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel. Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624 TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006). 
TSMx R script:

# import packages
library(dplyr)
library(readr)
library(ggplot2)
library(tibble)
library(tidyr)
library(forcats)
library(Kendall)

options(warn = -1) # disable warnings

# read data (.csv file with "Year" and "Value" columns)
data <- read_csv("EVI.csv")

# prepare row/column names for output matrices
years <- data %>% pull("Year")
r.names <- years[-length(years)]
c.names <- years[-1]
years <- years[-length(years)]

# initialize output matrices
sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

# function to return remaining years given a start year
getRemain <- function(start.year) {
  years <- data %>% pull("Year")
  start.ind <- which(data[["Year"]] == start.year) + 1
  remain <- years[start.ind:length(years)]
  return (remain)
}

# function to subset data for a start/end year combination
splitData <- function(end.year, start.year) {
  keep <- which(data[['Year']] >= start.year & data[['Year']] <= end.year)
  batch <- data[keep,]
  return(batch)
}

# function to fit linear regression and return slope direction
fitReg <- function(batch) {
  trend <- lm(Value ~ Year, data = batch)
  slope <- coefficients(trend)[[2]]
  return(sign(slope))
}

# function to fit linear regression and return slope magnitude
fitRegv2 <- function(batch) {
  trend <- lm(Value ~ Year, data = batch)
  slope <- coefficients(trend)[[2]]
  return(slope)
}

# function to implement Mann-Kendall (MK) trend test and return significance
# the test is implemented only for n>=8
getMann <- function(batch) {
  if (nrow(batch) >= 8) {
    mk <- MannKendall(batch[['Value']])
    pval <- mk[['sl']]
  } else {
    pval <- NA
  }
  return(pval)
}

# function to return slope direction for all combinations given a start year
getSign <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  signs <- lapply(combs, fitReg)
  return(signs)
}

# function to return MK significance for all combinations given a start year
getPval <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  pvals <- lapply(combs, getMann)
  return(pvals)
}

# function to return slope magnitude for all combinations given a start year
getMagn <- function(start.year) {
  remaining <- getRemain(start.year)
  combs <- lapply(remaining, splitData, start.year = start.year)
  magns <- lapply(combs, fitRegv2)
  return(magns)
}

# retrieve slope direction, MK significance, and slope magnitude
signs <- lapply(years, getSign)
pvals <- lapply(years, getPval)
magns <- lapply(years, getMagn)

# fill-in output matrices
dimension <- nrow(sign.matrix)
for (i in 1:dimension) {
  sign.matrix[i, i:dimension] <- unlist(signs[i])
  pval.matrix[i, i:dimension] <- unlist(pvals[i])
  slope.matrix[i, i:dimension] <- unlist(magns[i])
}
sign.matrix <-...
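For reference, a minimal Python sketch (illustrative values only) of the two-column input layout the script expects; the file name EVI.csv matches the read_csv() call above, but the numbers here are placeholders, not real EVI data:

# Write a toy "EVI.csv" with the "Year" and "Value" columns that the TSMx script reads.
import csv

with open("EVI.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Year", "Value"])
    for year in range(2001, 2020):
        writer.writerow([year, 0.30 + 0.001 * (year - 2001)])  # placeholder values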
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using the User Manual as a guide and the Excel Graph Input Data Example file as a reference, the user enters the semantics of the graph model in this file.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This database collates 3552 development indicators from different studies with data by country and year, including single-year and multiple-year time series. The data are presented as charts; the underlying data can be downloaded from the linked project pages/references for each set, and each presented graph is available as a CSV file as well as a visual download of the graph (both available via the download link under each chart).
Use the Chart Viewer template to display bar charts, line charts, pie charts, histograms, and scatterplots to complement a map. Include multiple charts to view with a map or side by side with other charts for comparison. Up to three charts can be viewed side by side or stacked, but you can access and view all the charts that are authored in the map.
Examples
- Present a bar chart representing average property value by county for a given area.
- Compare charts based on multiple population statistics in your dataset.
- Display an interactive scatterplot based on two values in your dataset along with an essential set of map exploration tools.
Data requirements
The Chart Viewer template requires a map with at least one chart configured.
Key app capabilities
- Multiple layout options - Choose Stack to display charts stacked with the map, or choose Side by side to display charts side by side with the map.
- Manage chart - Reorder, rename, or turn charts on and off in the app.
- Multiselect chart - Compare two charts in the panel at the same time.
- Bookmarks - Allow users to zoom and pan to a collection of preset extents that are saved in the map.
- Home, Zoom controls, Legend, Layer List, Search
Supportability
This web app is designed responsively to be used in browsers on desktops, mobile phones, and tablets. We are committed to ongoing efforts towards making our apps as accessible as possible. Please feel free to leave a comment on how we can improve the accessibility of our apps for those who use assistive technologies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool’s interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different “visual channel” of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Figures in scientific publications are critically important because they often show the data supporting key findings. Our systematic review of research articles published in top physiology journals (n = 703) suggests that, as scientists, we urgently need to change our practices for presenting continuous data in small sample size studies. Papers rarely included scatterplots, box plots, and histograms that allow readers to critically evaluate continuous data. Most papers presented continuous data in bar and line graphs. This is problematic, as many different data distributions can lead to the same bar or line graph. The full data may suggest different conclusions from the summary statistics. We recommend training investigators in data presentation, encouraging a more complete presentation of data, and changing journal editorial policies. Investigators can quickly make univariate scatterplots for small sample size studies using our Excel templates.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Business process event data modeled as labeled property graphs
Data Format
-----------
The dataset comprises one labeled property graph in two different file formats.
#1) Neo4j .dump format
A Neo4j (https://neo4j.com) database dump that contains the entire graph and can be imported into a fresh Neo4j database instance using the following command (see also the Neo4j documentation: https://neo4j.com/docs/):
<neo4j-home>/bin/neo4j-admin.(bat|sh) load --database=graph.db --from=<dump-file>
The .dump was created with Neo4j v3.5.
#2) .graphml format
A .zip file containing a .graphml file of the entire graph
Data Schema
-----------
The graph is a labeled property graph over business process event data. Each graph uses the following concepts:
:Event nodes - each event node describes a discrete event, i.e., an atomic observation described by attribute "Activity" that occurred at the given "timestamp"
:Entity nodes - each entity node describes an entity (e.g., an object or a user), it has an EntityType and an identifier (attribute "ID")
:Log nodes - describes a collection of events that were recorded together, most graphs only contain one log node
:Class nodes - each class node describes a type of observation that has been recorded, e.g., the different types of activities that can be observed, :Class nodes group events into sets of identical observations
:CORR relationships - from :Event to :Entity nodes, describes whether an event is correlated to a specific entity; an event can be correlated to multiple entities
:DF relationships - "directly-followed by" between two :Event nodes describes which event is directly-followed by which other event; both events in a :DF relationship must be correlated to the same entity node. All :DF relationships form a directed acyclic graph.
:HAS relationship - from a :Log to an :Event node, describes which events had been recorded in which event log
:OBSERVES relationship - from an :Event to a :Class node, describes to which event class an event belongs, i.e., which activity was observed in the graph
:REL relationship - placeholder for any structural relationship between two :Entity nodes
The concepts are further defined in Stefan Esser, Dirk Fahland: Multi-Dimensional Event Data in Graph Databases. CoRR abs/2005.14552 (2020), https://arxiv.org/abs/2005.14552
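For illustration only (not part of the data release), a minimal sketch of querying this schema with the Neo4j Python driver; the connection URI, credentials, and database contents are placeholders, and the property name EntityType is taken from the schema description above:

from neo4j import GraphDatabase

# Placeholder connection details for a local Neo4j instance holding the imported dump.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Count correlated events per entity type via the :CORR relationship.
query = """
MATCH (e:Event)-[:CORR]->(n:Entity)
RETURN n.EntityType AS entity_type, count(e) AS num_events
ORDER BY num_events DESC
"""

with driver.session() as session:
    for record in session.run(query):
        print(record["entity_type"], record["num_events"])

driver.close()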
Data Contents
-------------
neo4j-bpic19-2021-02-17 (.dump|.graphml.zip)
An integrated graph describing the raw event data of the entire BPI Challenge 2019 dataset.
van Dongen, B.F. (Boudewijn) (2019): BPI Challenge 2019. 4TU.ResearchData. Collection. https://doi.org/10.4121/uuid:d06aff4b-79f0-45e6-8ec8-e19730c248f1
This data originated from a large multinational company operating from The Netherlands in the area of coatings and paints, and we ask participants to investigate the purchase order handling process for some of its 60 subsidiaries. In particular, the process owner has compliance questions. In the data, each purchase order (or purchase document) contains one or more line items. For each line item, there are roughly four types of flows in the data:
(1) 3-way matching, invoice after goods receipt: For these items, the value of the goods receipt message should be matched against the value of an invoice receipt message and the value put during creation of the item (indicated by both the GR-based flag and the Goods Receipt flags set to true).
(2) 3-way matching, invoice before goods receipt: Purchase Items that do require a goods receipt message, while they do not require GR-based invoicing (indicated by the GR-based IV flag set to false and the Goods Receipt flags set to true). For such purchase items, invoices can be entered before the goods are received, but they are blocked until goods are received. This unblocking can be done by a user, or by a batch process at regular intervals. Invoices should only be cleared if goods are received and the value matches with the invoice and the value at creation of the item.
(3) 2-way matching (no goods receipt needed): For these items, the value of the invoice should match the value at creation (in full or partially until the PO value is consumed), but there is no separate goods receipt message required (indicated by both the GR-based flag and the Goods Receipt flags set to false).
(4) Consignment: For these items, there are no invoices on PO level as this is handled fully in a separate process. Here we see the GR indicator is set to true but the GR IV flag is set to false, and we also know by item type (consignment) that we do not expect an invoice against this item.
Unfortunately, the complexity of the data goes further than just this division into four categories. For each purchase item, there can be many goods receipt messages and corresponding invoices which are subsequently paid. Consider for example the process of paying rent. There is a Purchase Document with one item for paying rent, but a total of 12 goods receipt messages with (cleared) invoices with a value equal to 1/12 of the total amount. For logistical services, there may even be hundreds of goods receipt messages for one line item. Overall, for each line item, the amounts of the line item, the goods receipt messages (if applicable) and the invoices have to match for the process to be compliant.
Of course, the log is anonymized, but some semantics are left in the data, for example: The resources are split between batch users and normal users, indicated by their name. The batch users are automated processes executed by different systems. The normal users refer to human actors in the process. The monetary values of each event are anonymized from the original data using a linear translation respecting 0, i.e. addition of multiple invoices for a single item should still lead to the original item worth (although there may be small rounding errors for numerical reasons). Company, vendor, system and document names and IDs are anonymized in a consistent way throughout the log. The company has the key, so any result can be translated by them to business insights about real customers and real purchase documents.
The case ID is a combination of the purchase document and the purchase item. There is a total of 76,349 purchase documents containing in total 251,734 items, i.e. there are 251,734 cases. In these cases, there are 1,595,923 events relating to 42 activities performed by 627 users (607 human users and 20 batch users). Sometimes the user field is empty, or NONE, which indicates no user was recorded in the source system. For each purchase item (or case) the following attributes are recorded:
- concept:name: A combination of the purchase document id and the item id
- Purchasing Document: The purchasing document ID
- Item: The item ID
- Item Type: The type of the item
- GR-Based Inv. Verif.: Flag indicating if GR-based invoicing is required (see above)
- Goods Receipt: Flag indicating if 3-way matching is required (see above)
- Source: The source system of this item
- Doc. Category name: The name of the category of the purchasing document
- Company: The subsidiary of the company from where the purchase originated
- Spend classification text: A text explaining the class of purchase item
- Spend area text: A text explaining the area for the purchase item
- Sub spend area text: Another text explaining the area for the purchase item
- Vendor: The vendor to which the purchase document was sent
- Name: The name of the vendor
- Document Type: The document type
- Item Category: The category as explained above (3-way with GR-based invoicing, 3-way without, 2-way, consignment)
The data contains the following entities and their events
- PO - Purchase Order documents handled at a large multinational company operating from The Netherlands
- POItem - an item in a Purchase Order document describing a specific item to be purchased
- Resource - the user or worker handling the document or a specific item
- Vendor - the external organization from which an item is to be purchased
Data Size
---------
BPIC19, nodes: 1926651, relationships: 15082099
This part of the data release includes graphical representation (figures) of data from sediment cores collected in 2009 offshore of Palos Verdes, California. This file graphically presents combined data for each core (one core per page). Data on each figure are continuous core photograph, CT scan (where available), graphic diagram core description (graphic legend included at right; visual grain size scale of clay, silt, very fine sand [vf], fine sand [f], medium sand [med], coarse sand [c], and very coarse sand [vc]), multi-sensor core logger (MSCL) p-wave velocity (meters per second) and gamma-ray density (grams per cc), radiocarbon age (calibrated years before present) with analytical error (years), and pie charts that present grain-size data as percent sand (white), silt (light gray), and clay (dark gray). This is one of seven files included in this U.S. Geological Survey data release that include data from a set of sediment cores acquired from the continental slope, offshore Los Angeles and the Palos Verdes Peninsula, adjacent to the Palos Verdes Fault. Gravity cores were collected by the USGS in 2009 (cruise ID S-I2-09-SC; http://cmgds.marine.usgs.gov/fan_info.php?fan=SI209SC), and vibracores were collected with the Monterey Bay Aquarium Research Institute's remotely operated vehicle (ROV) Doc Ricketts in 2010 (cruise ID W-1-10-SC; http://cmgds.marine.usgs.gov/fan_info.php?fan=W110SC). One spreadsheet (PalosVerdesCores_Info.xlsx) contains core name, location, and length. One spreadsheet (PalosVerdesCores_MSCLdata.xlsx) contains Multi-Sensor Core Logger P-wave velocity, gamma-ray density, and magnetic susceptibility whole-core logs. One zipped folder of .bmp files (PalosVerdesCores_Photos.zip) contains continuous core photographs of the archive half of each core. One spreadsheet (PalosVerdesCores_GrainSize.xlsx) contains laser particle grain size sample information and analytical results. One spreadsheet (PalosVerdesCores_Radiocarbon.xlsx) contains radiocarbon sample information, results, and calibrated ages. One zipped folder of DICOM files (PalosVerdesCores_CT.zip) contains raw computed tomography (CT) image files. One .pdf file (PalosVerdesCores_Figures.pdf) contains combined displays of data for each core, including graphic diagram descriptive logs. This particular metadata file describes the information contained in the file PalosVerdesCores_Figures.pdf. All cores are archived by the U.S. Geological Survey Pacific Coastal and Marine Science Center.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## GMarsGT
For rare cell identification from matched scRNA-seq (snRNA-seq) and scATAC-seq (snATAC-seq), GMarsGT includes genes, enhancers, and cells in a heterogeneous graph to simultaneously identify major cell clusters and rare cell clusters based on eRegulon.
## Data Collection
The data were collected from the GEO database.
## Data Format
The data are stored as TSV and MTX files, where each row represents a gene and each column represents a sample.
## Variables
- Gene IDs: Gene symbols (e.g., MALAT1)
- Sample IDs: Sample identifiers (e.g., AAACATGCAAATTCGT-1)
- Expression level: Raw gene expression level.
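For illustration only, a minimal Python sketch of loading data in this layout; the file names used here (matrix.mtx, features.tsv, barcodes.tsv) are assumptions, not part of the release:

import scipy.io as sio
import pandas as pd

# Sparse expression matrix in MatrixMarket (.mtx) format: rows are genes, columns are samples.
expr = sio.mmread("matrix.mtx").tocsr()

# Gene symbols and sample identifiers from the accompanying TSV files (assumed names).
genes = pd.read_csv("features.tsv", sep="\t", header=None)[0].tolist()
samples = pd.read_csv("barcodes.tsv", sep="\t", header=None)[0].tolist()

print(expr.shape, len(genes), len(samples))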
Our consumer data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.
Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences. 1. Geography - City, State, ZIP, County, CBSA, Census Tract, etc. 2. Demographics - Gender, Age Group, Marital Status, Language etc. 3. Financial - Income Range, Credit Rating Range, Credit Type, Net worth Range, etc 4. Persona - Consumer type, Communication preferences, Family type, etc 5. Interests - Content, Brands, Shopping, Hobbies, Lifestyle etc. 6. Household - Number of Children, Number of Adults, IP Address, etc. 7. Behaviours - Brand Affinity, App Usage, Web Browsing etc. 8. Firmographics - Industry, Company, Occupation, Revenue, etc 9. Retail Purchase - Store, Category, Brand, SKU, Quantity, Price etc. 10. Auto - Car Make, Model, Type, Year, etc. 11. Housing - Home type, Home value, Renter/Owner, Year Built etc.
Consumer Graph Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU, and Monthly Location Pings.
Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).
Consumer Graph Use Cases: 360-Degree Customer View: Get a comprehensive picture of customers by means of internal and external data aggregation. Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting using user data enrichment. Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity. Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.
Here's the schema of Consumer Data:
person_id
first_name
last_name
age
gender
linkedin_url
twitter_url
facebook_url
city
state
address
zip
zip4
country
delivery_point_bar_code
carrier_route
walk_sequence_code
fips_state_code
fips_county_code
country_name
latitude
longitude
address_type
metropolitan_statistical_area
core_based_statistical_area
census_tract
census_block_group
census_block
primary_address
pre_address
street
post_address
address_suffix
address_secondline
address_abrev
census_median_home_value
home_market_value
property_build_year
property_with_ac
property_with_pool
property_with_water
property_with_sewer
general_home_value
property_fuel_type
year
month
household_id
Census_median_household_income
household_size
marital_status
length_of_residence
number_of_kids
pre_school_kids
single_parents
working_women_in_house_hold
homeowner
children
adults
generations
net_worth
education_level
occupation
education_history
credit_lines
credit_card_user
newly_issued_credit_card_user
credit_range_new
credit_cards
loan_to_value
mortgage_loan2_amount
mortgage_loan_type
mortgage_loan2_type
mortgage_lender_code
mortgage_loan2_lender_code
mortgage_lender
mortgage_loan2_lender
mortgage_loan2_ratetype
mortgage_rate
mortgage_loan2_rate
donor
investor
interest
buyer
hobby
personal_email
work_email
devices
phone
employee_title
employee_department
employee_job_function
skills
recent_job_change
company_id
company_name
company_description
technologies_used
office_address
office_city
office_country
office_state
office_zip5
office_zip4
office_carrier_route
office_latitude
office_longitude
office_cbsa_code
office_census_block_group
office_census_tract
office_county_code
company_phone
company_credit_score
company_csa_code
company_dpbc
company_franchiseflag
company_facebookurl
company_linkedinurl
company_twitterurl
company_website
company_fortune_rank
company_government_type
company_headquarters_branch
company_home_business
company_industry
company_num_pcs_used
company_num_employees
company_firm_individual
company_msa
company_msa_name
company_naics_code
company_naics_description
company_naics_code2
company_naics_description2
company_sic_code2
company_sic_code2_description
company_sic_code4
company_sic_code4_description
company_sic_code6
company_sic_code6_description
company_sic_code8
company_sic_code8_description
company_parent_company
company_parent_company_location
company_public_private
company_subsidiary_company
company_residential_business_code
company_revenue_at_side_code
company_revenue_range
company_revenue
company_sales_volume
company_small_business
company_stock_ticker
company_year_founded
company_minorityowned
company_female_owned_or_operated
company_franchise_code
company_dma
company_dma_name
company_hq_address
company_hq_city
company_hq_duns
company_hq_state
company_hq_zip5
company_hq_zip4
co...
DRAKO is a leader in providing Device Graph Data, focusing on understanding the relationships between consumer devices and identities. Our data allows businesses to create holistic profiles of users, track engagement across platforms, and measure the effectiveness of advertising efforts.
Device Graph Data is essential for accurate audience targeting, cross-device attribution, and understanding consumer journeys. By integrating data from multiple sources, we provide a unified view of user interactions, helping businesses make informed decisions.
Key Features: - Comprehensive device mapping to understand user behaviour across multiple platforms - Detailed Identity Graph Data for cross-device identification and engagement tracking - Integration with Connected TV Data for enhanced insights into video consumption habits - Mobile Attribution Data to measure the effectiveness of mobile campaigns - Customizable analytics to segment audiences based on device usage and demographics - Some ID types offered: AAID, idfa, Unified ID 2.0, AFAI, MSAI, RIDA, AAID_CTV, IDFA_CTV
Use Cases: - Cross-device marketing strategies - Attribution modelling and campaign performance measurement - Audience segmentation and targeting - Enhanced insights for Connected TV advertising - Comprehensive consumer journey mapping
Data Compliance: All of our Device Graph Data is sourced responsibly and adheres to industry standards for data privacy and protection. We ensure that user identities are handled with care, providing insights without compromising individual privacy.
Data Quality: DRAKO employs robust validation techniques to ensure the accuracy and reliability of our Device Graph Data. Our quality assurance processes include continuous monitoring and updates to maintain data integrity and relevance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is the dataset used to train and test the object condensation particle flow approach described in arxiv:2002.03605.
The data can be read with DeepJetCore 3.1 (https://github.com/DL4Jets/DeepJetCore). The entries in the truth array are of dimension (batch, 200, N_truth). The truth inputs are:
isElectron, isGamma, isPositron, true_energy, true_x, true_y
The entries in the feature array are of dimension (batch, 200, N_features), with the features being:
rechit_energy, rechit_x, rechit_y, rechit_z, rechit_layer, rechit_detid
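As an orientation aid only (not from the original release), a small Python sketch showing how the named columns map onto the last axis of arrays with these shapes, assuming the column order given above; the arrays here are placeholders rather than data loaded through DeepJetCore:

import numpy as np

TRUTH_COLS = ["isElectron", "isGamma", "isPositron", "true_energy", "true_x", "true_y"]
FEATURE_COLS = ["rechit_energy", "rechit_x", "rechit_y", "rechit_z", "rechit_layer", "rechit_detid"]

# Placeholder arrays with the documented shapes (batch, 200, N_truth) and (batch, 200, N_features).
truth = np.zeros((4, 200, len(TRUTH_COLS)))
features = np.zeros((4, 200, len(FEATURE_COLS)))

# Select a named column across all events and hits, e.g. the true energy and the rec-hit energy.
true_energy = truth[..., TRUTH_COLS.index("true_energy")]          # shape (4, 200)
rechit_energy = features[..., FEATURE_COLS.index("rechit_energy")]  # shape (4, 200)
print(true_energy.shape, rechit_energy.shape)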
The "train.zip" file contains the training sample The "test.zip" file the test sample
The main test sample is identical to the training sample in composition, but statistically independent. Other samples can be found in subfolders:
- test/flatNpart: sample with a flat distribution of additional particles in the event w.r.t. each individual particle
- test/hiNPart: sample with up to 15 particles per event
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This resource contains Jupyter Notebooks with examples for accessing USGS NWIS data via web services and performing subsequent analysis related to drought with particular focus on sites in Utah and the southwestern United States (could be modified to any USGS sites). The code uses the Python DataRetrieval package. The resource is part of set of materials for hydroinformatics and water data science instruction. Complete learning module materials are found in HydroLearn: Jones, A.S., Horsburgh, J.S., Bastidas Pacheco, C.J. (2022). Hydroinformatics and Water Data Science. HydroLearn. https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about.
This resource consists of 6 example notebooks:
1. Example 1: Import and plot daily flow data
2. Example 2: Import and plot instantaneous flow data for multiple sites
3. Example 3: Perform analyses with USGS annual statistics data
4. Example 4: Retrieve data and find daily flow percentiles
5. Example 5: Further examination of drought year flows
6. Coding challenge: Assess drought severity
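As a hedged illustration of the kind of retrieval these notebooks perform with the Python DataRetrieval package (dataretrieval), a minimal sketch; the site number, parameter code, and date range below are examples, not values taken from the notebooks:

from dataretrieval import nwis

# Daily-values service: mean discharge (USGS parameter code 00060) for an example gage and period.
df = nwis.get_record(sites="09380000", service="dv",
                     start="2020-01-01", end="2020-12-31",
                     parameterCd="00060")

# Columns follow NWIS naming conventions, e.g. "00060_Mean"; summarize whatever mean columns exist.
flow_cols = [c for c in df.columns if c.endswith("_Mean")]
print(df[flow_cols].describe())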
Graph Database Market Size 2025-2029
The graph database market size is forecast to increase by USD 11.24 billion at a CAGR of 29% between 2024 and 2029.
The market is experiencing significant growth, driven by the increasing popularity of open knowledge networks and the rising demand for low-latency query processing. These trends reflect the growing importance of real-time data analytics and the need for more complex data relationships to be managed effectively. However, the market also faces challenges, including the lack of standardization and programming flexibility. These obstacles require innovative solutions from market participants to ensure interoperability and ease of use for businesses looking to adopt graph databases.
Companies seeking to capitalize on market opportunities must focus on addressing these challenges while also offering advanced features and strong performance to differentiate themselves. Effective navigation of these dynamics will be crucial for success in the evolving graph database landscape. Compliance requirements and data privacy regulations drive the need for security access control and data anonymization methods. Graph databases are deployed in both on-premises data centers and cloud regions, providing flexibility for businesses with varying IT infrastructures.
What will be the Size of the Graph Database Market during the forecast period?
Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.
In the dynamic market, security and data management are increasingly prioritized. Authorization mechanisms and encryption techniques ensure data access control and confidentiality. Query optimization strategies and indexing enhance query performance, while data anonymization methods protect sensitive information. Fault tolerance mechanisms and data governance frameworks maintain data availability and compliance with regulations. Data quality assessment and consistency checks address data integrity issues, and authentication protocols secure concurrent graph updates. This model is particularly well-suited for applications in social networks, recommendation engines, and business processes that require real-time analytics and visualization.
Graph database tuning and monitoring optimize hardware resource usage and detect performance bottlenecks. Data recovery procedures and replication methods ensure data availability during disasters and maintain data consistency. Data version control and concurrent graph updates address versioning and conflict resolution challenges. Data anomaly detection and consistency checks maintain data accuracy and reliability. Distributed transactions and data recovery procedures ensure data consistency across nodes in a distributed graph database system.
How is this Graph Database Industry segmented?
The graph database industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
End-user
Large enterprises
SMEs
Type
RDF
LPG
Solution
Native graph database
Knowledge graph engines
Graph processing engines
Graph extension
Geography
North America
US
Canada
Europe
France
Germany
Italy
Spain
UK
APAC
China
India
Japan
Rest of World (ROW)
By End-user Insights
The Large enterprises segment is estimated to witness significant growth during the forecast period. In today's business landscape, large enterprises are turning to graph databases to manage intricate data relationships and improve decision-making processes. Graph databases offer unique advantages over traditional relational databases, enabling superior agility in modeling and querying interconnected data. These systems are particularly valuable for applications such as fraud detection, supply chain optimization, customer 360 views, and network analysis. Graph databases provide the scalability and performance required to handle large, dynamic datasets and uncover hidden patterns and insights in real time. Their support for advanced analytics and AI-driven applications further bolsters their role in enterprise digital transformation strategies. Additionally, their flexibility and integration capabilities make them well-suited for deployment in hybrid and multi-cloud environments.
Graph databases offer various features that cater to diverse business needs. Data lineage tracking ensures accountability and transparency, while graph analytics engines provide advanced insights. Graph database benchmarking helps organizations evaluate performance, and relationship property indexing streamlines data access. Node relationship management facilitates complex data modeling, an
https://doi.org/10.5061/dryad.nk98sf823
Images obtained on Lightsheet Z.1 (Zeiss) with incubation and dual pco.edge 4.2 cameras (PCO), with 20X/1.0 plan apochromat water-dipping detection objective (RI=1.34–1.35) and dual 10X/0.2 illumination objectives. The specimen tank was filled with culture medium and maintained at 37°C and 5% CO2. Mouse post-gastrulation embryos were subjected to 3 hours of imaging at 18-minute intervals, imaged from the ventral aspect in two views offset by 100, using our Zeiss Lightsheet Adaptive Position System (ZLAPS), linked below. The .czi files were compressed with bzip2 and presented here. Also included is the fused .klb dataset, which is the result of deconvol...
Observed linkages between emails and mobile advertising identifiers (MAIDs) from website and device activity.
Files are updated daily. These are highly comprehensive datasets from multiple live sources. The linkages are scored by recency, frequency, intensity, and strength.
BIGDBM Privacy Policy: https://bigdbm.com/privacy.html
NOTE: This dataset is now outdated. Please see https://doi.org/10.11583/DTU.17091101 for the updated version with many more problems. This is a collection of min-cut/max-flow problem instances that can be used for benchmarking min-cut/max-flow algorithms. The collection is released in companionship with the paper: Jensen et al., "Review of Serial and Parallel Min-Cut/Max-Flow Algorithms for Computer Vision", T-PAMI, 2022. The problem instances are collected from a wide selection of sources to be as representative as possible. Specifically, this collection contains: Most of the problem instances (some are unavailable due to dead links) published by the University of Waterloo: https://vision.cs.uwaterloo.ca/data/maxflow Super-resolution, texture restoration, deconvolution, decision tree field (DTF) and automatic labeling environment from Verma & Batra, "MaxFlow Revisited: An Empirical Comparison of Maxflow Algorithms for Dense Vision Problems", 2012, BMVC Sparse Layered Graph instances from Jeppesen et al., "Sparse Layered Graphs for Multi-Object Segmentation", 2020, CVPR Multi-object surface fitting from Jensen et al., "Multi-Object Graph-Based Segmentation With Non-Overlapping Surfaces", 2020, CVPRW The reason for releasing this collection is to provide a single place to download all datasets used in our paper (and various previous paper) instead of having to scavenge from multiple sources. Furthermore, several of the problem instances typically used for benchmarking min-cut/max-flow algorithms are no longer available at their original locations and may be difficult to find. By storing the data on Zenodo with a dedicated DOI we hope to avoid this. For license information, see below. Files and formats We provide all problem instances in two file formats: DIMACS and a custom binary format. Both are described below. Each file has been zipped, and similar files have then been grouped into their own zip file (i.e. it is a zip of zips). DIMACS files have been prefixed with dimacs_
and binary files have been prefixed with bin_.
DIMACS
All problem instances are available in DIMACS format (explained here: http://lpsolve.sourceforge.net/5.5/DIMACS_maxf.htm). For the larger problem instances, we have also published a partition of the graph nodes into blocks for block-based parallel min-cut/max-flow. The partition matches the one used in the companion review paper (Jensen et al., 2021). For a problem instance with filename <name>.max the blocks are in the text file <name>.blk.txt. The first line gives the number of blocks. The second gives the block index (zero-indexed) for every node; each entry is separated by a space. Example for a sixteen node graph with four blocks:
4
0 0 1 1 2 2 3 3 0 0 1 1 2 2 3 3 0 0 1 1 2 2 3 3 0 0 1 1 2 2 3 3
Binary files
While DIMACS has the advantage of being human-readable, storing everything as text requires a lot of space. This makes the files unnecessarily large and slow to parse. To overcome this, we also release all problem instances in a simple binary storage format. We have two formats: one for graphs and one for quadratic pseudo-boolean optimization (QPBO) problems. Code to convert to/from DIMACS is also available at: https://www.doi.org/10.5281/zenodo.4903946 or https://github.com/patmjen/maxflow_algorithms.
Binary BK (.bbk) files are for storing normal graphs for min-cut/max-flow. They closely follow the internal storage format used in the original implementation of the Boykov-Kolmogorov algorithm, meaning that terminal arcs are stored in a separate list from normal neighbor arcs. The format is:
Uncompressed:
- Header: (3 x uint8) 'BBQ'
- Type codes: (2 x uint8) captype, tcaptype
- Sizes: (3 x uint64) num_nodes, num_terminal_arcs, num_neighbor_arcs
- Terminal arcs: (num_terminal_arcs x BkTermArc)
- Neighbor arcs: (num_neighbor_arcs x BkNborArc)
Compressed (using Google's snappy: https://github.com/google/snappy):
- Header: (3 x uint8) 'bbq'
- Type codes: (2 x uint8) captype, tcaptype
- Sizes: (3 x uint64) num_nodes, num_terminal_arcs, num_neighbor_arcs
- Terminal arcs: (1 x uint64) compressed_bytes_1, (compressed_bytes_1 x uint8) compressed num_terminal_arcs x BkTermArc
- Neighbor arcs: (1 x uint64) compressed_bytes_2, (compressed_bytes_2 x uint8) compressed num_neighbor_arcs x BkNborArc
Where:
/** Enum for switching over POD types. */
enum TypeCode : uint8_t {
    TYPE_UINT8, TYPE_INT8, TYPE_UINT16, TYPE_INT16, TYPE_UINT32, TYPE_INT32,
    TYPE_UINT64, TYPE_INT64, TYPE_FLOAT, TYPE_DOUBLE, TYPE_INVALID = 0xFF
};
/** Terminal arc with source and sink capacity for given node. */
template ...
/** Neighbor arc with forward and reverse capacity. */
template ...
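A minimal Python sketch (file name assumed) for reading a block-partition file in the <name>.blk.txt layout described above:

def read_blocks(path):
    # First line: number of blocks; second line: a zero-indexed block id per node, space-separated.
    with open(path) as f:
        num_blocks = int(f.readline().strip())
        block_ids = [int(tok) for tok in f.readline().split()]
    return num_blocks, block_ids

num_blocks, block_ids = read_blocks("example.blk.txt")  # hypothetical file name
print(num_blocks, len(block_ids))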
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Sequence Distance Graph (SDG) framework works with genome assembly graphs and raw data from paired, linked and long reads. It includes a simple deBruijn graph module, and can import graphs using the graphical fragment assembly (GFA) format. It also maps raw reads onto graphs, and provides a Python application programming interface (API) to navigate the graph, access the mapped and raw data and perform interactive or scripted analyses. Its complete workspace can be dumped to and loaded from disk, decoupling mapping from analysis and supporting multi-stage pipelines. We present the design and implementation of the framework, and example analyses scaffolding a short read graph with long reads, and navigating paths in a heterozygous graph for a simulated parent-offspring trio dataset. SDG is freely available under the MIT license at
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Existing Home Sales in the United States decreased to 3930 Thousand in June from 4040 Thousand in May of 2025. This dataset provides the latest reported value for - United States Existing Home Sales - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.