100+ datasets found

Data sources for anti-fraud data analytics initiatives in global...
statista.com
Updated May 23, 2022
Cite
Statista (2022). Data sources for anti-fraud data analytics initiatives in global organizations 2019 [Dataset]. https://www.statista.com/statistics/1043542/worldwide-fraud-fight-data-analytics-data-sources/
Dataset updated
May 23, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2019
Area covered
Worldwide
Description
Internal structured data is the most commonly used data source for anti-fraud data analytics initiatives in organizations, according to a global company survey in 2019. Almost three quarters of the respondents said that internal structured data was used in their companies for anti-fraud analytics tests.
Spreadsheet of resistance values and data sources used to compile the...
catalog.data.gov
datasets.ai
Updated Feb 22, 2025
Cite
U.S. Fish and Wildlife Service (2025). Spreadsheet of resistance values and data sources used to compile the resistance surface - A landscape connectivity analysis for the coastal marten (Martes caurina humboldtensis) [Dataset]. https://catalog.data.gov/dataset/spreadsheet-of-resistance-values-and-data-sources-used-to-compile-the-resistance-surface-a
Dataset updated
Feb 22, 2025
Dataset provided by
U.S. Fish and Wildlife Service
Description
This spreadsheet contains a list of component raster data layers that were used to compile our resistance surface, the classes of data represented within each of these rasters, and the resistance value we assigned to each class. It also provides a web reference for each data layer to provide additional context and information about the source datasets. Please refer to the embedded spatial metadata and the information in our full report for details on the development of the resulting ResistanceSurface, as well as these component data layers: ResistanceData_Roads, ResistanceData_ForestedCover, ResistanceData_Rivers, ResistanceData_Waterbodies, ResistanceData_NonForestedCover, ResistanceData_BaysEstuaries, ResistancePostProcessing_Serpentine.
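
As a rough illustration of how a class-to-resistance table like the one described above might be applied to a single component layer, the sketch below reclassifies a small grid of class labels. The class names and resistance values are hypothetical and are not taken from the spreadsheet; how the layers are combined into the final surface is not covered here.

```python
from typing import Dict, List

# Hypothetical class-to-resistance table for one component layer.
# These labels and values are illustrative only, not from the spreadsheet.
ROAD_RESISTANCE: Dict[str, float] = {
    "no_road": 1.0,
    "minor_road": 5.0,
    "highway": 50.0,
}


def reclassify(grid: List[List[str]], table: Dict[str, float]) -> List[List[float]]:
    """Replace each cell's class label with its assigned resistance value."""
    return [[table[cell] for cell in row] for row in grid]


if __name__ == "__main__":
    classes = [["no_road", "minor_road"], ["highway", "no_road"]]
    print(reclassify(classes, ROAD_RESISTANCE))
```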
Importance of data sources for analytics vs access among U.S. businesses...
statista.com
Updated Mar 25, 2016
Cite
Statista (2016). Importance of data sources for analytics vs access among U.S. businesses 2015 [Dataset]. https://www.statista.com/statistics/562625/united-states-data-analytics-importance-vs-access/
Dataset updated
Mar 25, 2016
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
This statistic illustrates the importance of various data sources for business analytics, compared to the level of access businesses have to those data sources, according to a marketing survey of C-level executives, conducted in December 2015 by Black Ink. As of December 2015, product and service usage data was listed as important by 68 percent of respondents, but the degree of access to that data was put at 33 percent.
Sentiment Analysis for Mental Health
kaggle.com
Updated Jul 5, 2024
Cite
Suchintika Sarkar (2024). Sentiment Analysis for Mental Health [Dataset]. https://www.kaggle.com/datasets/suchintikasarkar/sentiment-analysis-for-mental-health
Dataset updated
Jul 5, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Suchintika Sarkar
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
This comprehensive dataset is a meticulously curated collection of mental health statuses tagged from various statements. The dataset amalgamates raw data from multiple sources, cleaned and compiled to create a robust resource for developing chatbots and performing sentiment analysis.

Data Source:

The dataset integrates information from the following Kaggle datasets:

3k Conversations Dataset for Chatbot

Depression Reddit Cleaned

Human Stress Prediction

Predicting Anxiety in Mental Health Data

Mental Health Dataset Bipolar

Reddit Mental Health Data

Students Anxiety and Depression Dataset

Suicidal Mental Health Dataset

Suicidal Tweet Detection Dataset

Data Overview:

The dataset consists of statements tagged with one of the following seven mental health statuses: Normal, Depression, Suicidal, Anxiety, Stress, Bi-Polar, Personality Disorder.

Data Collection:

The data is sourced from diverse platforms including social media posts, Reddit posts, Twitter posts, and more. Each entry is tagged with a specific mental health status, making it an invaluable asset for:

Developing intelligent mental health chatbots.

Performing in-depth sentiment analysis.

Research and studies related to mental health trends.

Features:

unique_id: A unique identifier for each entry.

Statement: The textual data or post.

Mental Health Status: The tagged mental health status of the statement.
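
A minimal sketch of how the three fields above could be read to count entries per status. It assumes the data is a CSV file with a header row whose column names match the feature names listed here; the actual file name and header spelling may differ, so the column name is passed as a parameter. The sample rows are made up.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_statuses(handle: TextIO, status_column: str = "Mental Health Status") -> Counter:
    """Count how many statements carry each mental health status tag."""
    reader = csv.DictReader(handle)
    return Counter(row[status_column].strip() for row in reader)


if __name__ == "__main__":
    # Made-up rows standing in for the real file.
    sample = io.StringIO(
        "unique_id,Statement,Mental Health Status\n"
        "1,example text a,Normal\n"
        "2,example text b,Anxiety\n"
        "3,example text c,Normal\n"
    )
    print(count_statuses(sample).most_common())
```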

Usage:

This dataset is ideal for training machine learning models aimed at understanding and predicting mental health conditions based on textual data. It can be used in various applications such as:

Chatbot development for mental health support.

Sentiment analysis to gauge mental health trends.

Academic research on mental health patterns.

Acknowledgments:

This dataset was created by aggregating and cleaning data from various publicly available datasets on Kaggle. Special thanks to the original dataset creators for their contributions.
‘Statistics on the Open Data site’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 12, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Statistics on the Open Data site’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-europa-eu-statistics-on-the-open-data-site-55ba/8b2737b0/?iid=002-180&v=presentation
Dataset updated
Jan 12, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Statistics on the Open Data site’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from http://data.europa.eu/88u/dataset/https-mon-saint-quentin-hub-arcgis-com-datasets-5426305826594a33a561acfd02d25808_0 on 12 January 2022.

--- Dataset description provided by original source is as follows ---

Statistics on official and obsolete consignments broken down by actor in the portal.

Definition of obsolete: A batch of data is considered obsolete when obvious defects have been detected by a quality check, or when the business department responsible for maintaining the batch no longer carries out an update strategy.

Definition of official: The lot is usable and suitable.

--- Original source retains full ownership of the source dataset ---
Data from: CaImAn: An open source tool for scalable Calcium Imaging data...
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Brandon L. Brown (2020). CaImAn: An open source tool for scalable Calcium Imaging data Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1659148
Dataset updated
Jan 24, 2020
Dataset provided by
Eftychios A. Pnevmatikakis
Brandon L. Brown
David W. Tank
Pengcheng Zhou
Andrea Giovannucci
Dmitri Chklovskii
Jeffrey L. Gauthier
Johannes Friedrich
Jiannis Taxidis
Pat Gunn
Baljit S. Khakh
Sue Ann Koay
Farzaneh Najafi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Advances in fluorescence microscopy enable monitoring larger brain areas in-vivo with finer time resolution. The resulting data rates require reproducible analysis pipelines that are reliable, fully automated, and scalable to datasets generated over the course of months. We present CaImAn, an open-source library for calcium imaging data analysis. CaImAn provides automatic and scalable methods to address problems common to preprocessing, including motion correction, neural activity identification, and registration across different sessions of data collection. It does this while requiring minimal user intervention, with good scalability on computers ranging from laptops to high-performance computing clusters. CaImAn is suitable for two-photon and one-photon imaging, and also enables real-time analysis on streaming data.

To benchmark the performance of CaImAn, we collected and combined a corpus of manual annotations from multiple labelers on nine mouse two-photon datasets, which are contained in this open access repository. We demonstrate that CaImAn achieves near-human performance in detecting locations of active neurons.

In order to reproduce the results of the paper or download the annotations and the raw movies, please refer to the readme.md at:

https://github.com/flatironinstitute/CaImAn/blob/master/use_cases/eLife_scripts/README.md
GLO climate data stats summary
data.gov.au
demo.dev.magda.io
+1 more
zip
Updated Apr 13, 2022
Cite
Bioregional Assessment Program (2022). GLO climate data stats summary [Dataset]. https://data.gov.au/data/dataset/afed85e0-7819-493d-a847-ec00a318e657
Available download formats: zip (8810)
Dataset updated
Apr 13, 2022
Dataset authored and provided by
Bioregional Assessment Program
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

Summary of various climate variables for all 15 subregions, based on Bureau of Meteorology Australian Water Availability Project (BAWAP) climate grids, including:

Time series of mean annual BAWAP rainfall from 1900 - 2012.

Long-term average BAWAP rainfall and Penman Potential Evapotranspiration (PET) from Jan 1981 - Dec 2012 for each month.

Values calculated over the years 1981 - 2012 (inclusive), for 17 time periods (i.e., annual, 4 seasons and 12 months) for the following 8 meteorological variables: (i) BAWAP_P (precipitation); (ii) Penman ETp; (iii) Tavg (average temperature); (iv) Tmax (maximum temperature); (v) Tmin (minimum temperature); (vi) VPD (Vapour Pressure Deficit); (vii) Rn (net radiation); and (viii) Wind speed. For each of the 17 time periods and each of the 8 meteorological variables, the following were calculated: (a) average; (b) maximum; (c) minimum; (d) average plus standard deviation (stddev); (e) average minus stddev; (f) stddev; and (g) trend.

Correlation coefficients (-1 to 1) between rainfall and 4 remote rainfall drivers between 1957-2006 for the four seasons. The data and methodology are described in Risbey et al. (2009).

As described in the Risbey et al. (2009) paper, the rainfall was from 0.05 degree gridded data described in Jeffrey et al. (2001 - known as the SILO datasets); sea surface temperature was from the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) on a 1 degree grid. BLK=Blocking; DMI=Dipole Mode Index; SAM=Southern Annular Mode; SOI=Southern Oscillation Index; DJF=December, January, February; MAM=March, April, May; JJA=June, July, August; SON=September, October, November. The analysis is a summary of Fig. 15 of Risbey et al. (2009).
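
The seven summary statistics listed above for each variable and time period can be sketched as follows. This is only an illustration under stated assumptions: the source does not say whether the standard deviation is the sample or population form, nor how the trend was estimated, so the sketch uses the sample standard deviation and an ordinary least-squares slope per year. The function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def linear_trend(years: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of values against years (units per year)."""
    y_bar = mean(years)
    v_bar = mean(values)
    num = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den


def summarize(years: Sequence[float], values: Sequence[float]) -> Dict[str, float]:
    """Compute the seven statistics (a) to (g) for one variable and time period."""
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation: an assumption
    return {
        "average": avg,
        "maximum": max(values),
        "minimum": min(values),
        "average_plus_stddev": avg + sd,
        "average_minus_stddev": avg - sd,
        "stddev": sd,
        "trend": linear_trend(years, values),  # OLS slope: an assumption
    }


if __name__ == "__main__":
    # Made-up values standing in for one variable over three years.
    print(summarize([1981, 1982, 1983], [10.0, 12.0, 14.0]))
```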

There are 4 csv files here:

BAWAP_P_annual_BA_SYB_GLO.csv

Desc: Time series mean annual BAWAP rainfall from 1900 - 2012.

Source data: annual BILO rainfall

P_PET_monthly_BA_SYB_GLO.csv

Long-term average BAWAP rainfall and Penman PET for each month, from 198101 to 201212.

Climatology_Trend_BA_SYB_GLO.csv

Values calculated over the years 1981 - 2012 (inclusive), for 17 time periods (i.e., annual, 4 seasons and 12 months) for the following 8 meteorological variables: (i) BAWAP_P; (ii) Penman ETp; (iii) Tavg; (iv) Tmax; (v) Tmin; (vi) VPD; (vii) Rn; and (viii) Wind speed. For each of the 17 time periods and each of the 8 meteorological variables, the following were calculated: (a) average; (b) maximum; (c) minimum; (d) average plus standard deviation (stddev); (e) average minus stddev; (f) stddev; and (g) trend.

Risbey_Remote_Rainfall_Drivers_Corr_Coeffs_BA_NSB_GLO.csv

Correlation coefficients (-1 to 1) between rainfall and 4 remote rainfall drivers between 1957-2006 for the four seasons. The data and methodology are described in Risbey et al. (2009). As described in the Risbey et al. (2009) paper, the rainfall was from 0.05 degree gridded data described in Jeffrey et al. (2001 - known as the SILO datasets); sea surface temperature was from the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) on a 1 degree grid. BLK=Blocking; DMI=Dipole Mode Index; SAM=Southern Annular Mode; SOI=Southern Oscillation Index; DJF=December, January, February; MAM=March, April, May; JJA=June, July, August; SON=September, October, November. The analysis is a summary of Fig. 15 of Risbey et al. (2009).
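
For readers unfamiliar with a correlation coefficient in the range -1 to 1, the sketch below computes a Pearson correlation between two equal-length series. It assumes the Pearson form; the exact method used for the file above is described in Risbey et al. (2009) and is not reproduced here. The sample numbers are made up.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


if __name__ == "__main__":
    # Made-up seasonal rainfall totals and a made-up driver index.
    rainfall = [120.0, 95.0, 140.0, 80.0]
    driver_index = [0.4, -0.1, 0.9, -0.6]
    print(round(pearson(rainfall, driver_index), 3))
```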

Dataset History

The dataset was created from various BAWAP source data, including monthly BAWAP rainfall, Tmax, Tmin, VPD, etc., and from other source data, including monthly Penman PET and correlation coefficient data. Data were extracted from national datasets for the GLO subregion.

Dataset Citation

Bioregional Assessment Programme (2014) GLO climate data stats summary. Bioregional Assessment Derived Dataset. Viewed 18 July 2018, http://data.bioregionalassessments.gov.au/dataset/afed85e0-7819-493d-a847-ec00a318e657.

Dataset Ancestors

Derived From Natural Resource Management (NRM) Regions 2010

Derived From Bioregional Assessment areas v03

Derived From BILO Gridded Climate Data: Daily Climate Data for each year from 1900 to 2012

Derived From Bioregional Assessment areas v01

Derived From Bioregional Assessment areas v02

Derived From GEODATA TOPO 250K Series 3

Derived From NSW Catchment Management Authority Boundaries 20130917

Derived From Geological Provinces - Full Extent

Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Global Big Data in the Oil and Gas Sector Market Report 2025 Edition, Market...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Apr 12, 2025
Cite
Cognitive Market Research (2025). Global Big Data in the Oil and Gas Sector Market Report 2025 Edition, Market Size, Share, CAGR, Forecast, Revenue [Dataset]. https://www.cognitivemarketresearch.com/big-data-in-the-oil-and-gas-sector-market-report
Available download formats: pdf, excel, csv, ppt
Dataset updated
Apr 12, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global Big Data in Oil and Gas Sector market size is projected to reach USD XX million by 2024 and is expected to expand at a compound annual growth rate (CAGR) of XX% from 2024 to 2031.

The global Big Data in Oil and Gas Sector market is anticipated to grow significantly, with a projected CAGR of XX% between 2024 and 2031. North America is expected to hold a major market share of more than XX%, with a market size of USD XX million in 2024, and is forecasted to grow at a CAGR of XX% from 2024 to 2031 due to the advanced technological infrastructure and the high adoption rate of digital technologies in the oil and gas sector. The upstream application segment held the highest Big Data in Oil and Gas Sector market revenue share in 2024, attributed to the critical role of big data in exploration and production activities, optimizing reservoir performance, and minimizing risks.

Market Dynamics - Key Drivers of the Big Data in Oil and Gas Sector

Integration of Advanced Analytics for Enhanced Decision-Making Drives the Big Data in Oil & Gas Market

The Big Data in Oil & Gas market is driven by the adoption of advanced analytics, with cost efficiency a major benefit. Big data analytics processes complex datasets to support better predictions and optimisations. As big data is integrated further, the oil and gas sector benefits from improved decision-making, efficiency, and safety.

For instance, ExxonMobil, in their "2020 Energy & Carbon Summary" report, highlighted the use of advanced seismic imaging and data analytics to improve the accuracy of subsurface exploration, thereby reducing drilling risks and enhancing operational efficiency.

IoT Deployment for Real-Time Monitoring and Efficiency Further Propels the Big Data in Oil & Gas Market

The rising demand for monitored infographics and data analytics is expected to fuel the Big Data in Oil & Gas market. The deployment of IoT devices facilitates real-time monitoring and operational efficiency. This development aligns with the broader shift towards self-sufficiency and positive capital allocations. IoT sensors on equipment and in operations provide critical data for predictive maintenance and decision-making, contributing to the shift from capital expenditure to operational expenditure in multiple outsourced activities.

Schlumberger, in their "Digital Transformation in the Oil and Gas Industry" report, discussed implementing IoT solutions to monitor well operations, which has led to significant improvements in maintenance strategies and operational efficiencies.

Market Dynamics - Key Restraints of the Big Data in Oil and Gas Sector

Data Security and Privacy Concerns Are a Challenge for the Big Data in Oil & Gas Market

Companies store data on every aspect of their business in order to work more efficiently in future, which leaves room for avoidable threats. Data security and privacy are significant concerns with the increasing use of big data analytics, given the oil and gas sector's sensitive nature. Cyber threats limit the adoption of big data solutions and so restrain demand in the Big Data in Oil & Gas market.

The International Energy Agency (IEA), in its "Digitalization & Energy" report, highlighted the cybersecurity challenges facing the energy sector, emphasizing the need for robust security measures in the adoption of digital technologies, including big data analytics.

Integration and Interoperability Challenges Will Restrain the Big Data in Oil & Gas Market

Data access, analysis, and storage are becoming more and more of an issue for businesses. Compatibility and interoperability issues arise when big data technologies are integrated with legacy systems. The integration process is made more difficult by the diversity of data sources and formats. Most firms are finding it necessary to evaluate new technologies and legacy infrastructure as the needs of Big Data outpace those of traditional relational databases.

A study by Deloitte, titled "Digital Transformation: Shaping the Future of the Oil and Gas Industry", identified integration of new technologies with existin...
Most reliable sources of data for market researchers in the U.S. 2017
statista.com
Updated Dec 10, 2024
Cite
Statista (2024). Most reliable sources of data for market researchers in the U.S. 2017 [Dataset]. https://www.statista.com/statistics/917534/market-research-industry-us-most-reliable-sources-of-data/
Dataset updated
Dec 10, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2017
Area covered
United States
Description
This statistic displays the most reliable sources of data according to professionals in the market research industry in the United States in 2017. During the survey, 32 percent of respondents cited marketing analytics as the most reliable data source.
‘Sign Line Task & Work Order Data’ analyzed by Analyst-2
analyst-2.ai
Updated Oct 22, 2017
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2017). ‘Sign Line Task & Work Order Data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-sign-line-task-work-order-data-bb94/latest
Dataset updated
Oct 22, 2017
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Sign Line Task & Work Order Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/6687ab3e-87a4-423c-8e4d-11dc1d9c2943 on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Sign (Task & Work Order) Information from eWork.

--- Original source retains full ownership of the source dataset ---
Enterprise-Driven Open Source Software
data.niaid.nih.gov
opendatalab.com
Updated Apr 22, 2020
Cite
Kravvaritis, Konstantinos (2020). Enterprise-Driven Open Source Software [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3653877
Dataset updated
Apr 22, 2020
Dataset provided by
Theodorou, Georgios
Kravvaritis, Konstantinos
Louridas, Panos
Kotti, Zoe
Spinellis, Diomidis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present a dataset of open source software developed mainly by enterprises rather than volunteers. This can be used to address known generalizability concerns, and, also, to perform research on open source business software development. Based on the premise that an enterprise's employees are likely to contribute to a project developed by their organization using the email account provided by it, we mine domain names associated with enterprises from open data sources as well as through white- and blacklisting, and use them through three heuristics to identify 17,264 enterprise GitHub projects. We provide these as a dataset detailing their provenance and properties. A manual evaluation of a dataset sample shows an identification accuracy of 89%. Through an exploratory data analysis we found that projects are staffed by a plurality of enterprise insiders, who appear to be pulling more than their weight, and that in a small percentage of relatively large projects development happens exclusively through enterprise insiders.

The main dataset is provided as a 17,264 record tab-separated file named enterprise_projects.txt with the following 29 fields.

url: the project's GitHub URL

project_id: the project's GHTorrent identifier

sdtc: true if selected using the same domain top committers heuristic (9,016 records)

mcpc: true if selected using the multiple committers from a valid enterprise heuristic (8,314 records)

mcve: true if selected using the multiple committers from a probable company heuristic (8,015 records)

star_number: number of GitHub watchers

commit_count: number of commits

files: number of files in current main branch

lines: corresponding number of lines in text files

pull_requests: number of pull requests

github_repo_creation: timestamp of the GitHub repository creation

earliest_commit: timestamp of the earliest commit

most_recent_commit: date of the most recent commit

committer_count: number of different committers

author_count: number of different authors

dominant_domain: the project's dominant email domain

dominant_domain_committer_commits: number of commits made by committers whose email matches the project's dominant domain

dominant_domain_author_commits: corresponding number for commit authors

dominant_domain_committers: number of committers whose email matches the project's dominant domain

dominant_domain_authors: corresponding number for commit authors

cik: SEC's EDGAR "central index key"

fg500: true if this is a Fortune Global 500 company (2,233 records)

sec10k: true if the company files SEC 10-K forms (4,180 records)

sec20f: true if the company files SEC 20-F forms (429 records)

project_name: GitHub project name

owner_login: GitHub project's owner login

company_name: company name as derived from the SEC and Fortune 500 data

owner_company: GitHub project's owner company name

license: SPDX license identifier
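
The field list above suggests a straightforward way to load the file. The sketch below is a minimal example, assuming that enterprise_projects.txt starts with a header row naming these fields and that the flag fields are spelled as "true"/"false" or similar; neither detail is confirmed by the description. The helper names and the sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def read_projects(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per project from a tab-separated file with a header row."""
    yield from csv.DictReader(handle, delimiter="\t")


def is_true(value: str) -> bool:
    """Interpret a flag field; the accepted spellings are an assumption."""
    return value.strip().lower() in {"true", "t", "1"}


def insider_commit_share(row: Dict[str, str]) -> float:
    """Fraction of commits made by committers from the project's dominant domain."""
    total = int(row["commit_count"])
    if total == 0:
        return 0.0
    return int(row["dominant_domain_committer_commits"]) / total


if __name__ == "__main__":
    # Made-up rows with a subset of the 29 fields.
    sample = io.StringIO(
        "url\tfg500\tcommit_count\tdominant_domain_committer_commits\n"
        "https://github.com/example/alpha\ttrue\t200\t150\n"
        "https://github.com/example/beta\tfalse\t40\t10\n"
    )
    for project in read_projects(sample):
        if is_true(project["fg500"]):
            print(project["url"], insider_commit_share(project))
```

In practice the `io.StringIO` sample would be replaced by `open("enterprise_projects.txt", newline="", encoding="utf-8")`.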

The file cohost_project_details.txt provides the full set of 311,223 cohort projects that are not part of the enterprise data set, but have comparable quality attributes.

url: the project's GitHub URL

project_id: the project's GHTorrent identifier

stars: number of GitHub watchers

commit_count: number of commits
Land Cover Summary Statistics Data Package for Greater Yellowstone Network...
gimi9.com
catalog.data.gov
Updated Dec 16, 2023
Cite
(2023). Land Cover Summary Statistics Data Package for Greater Yellowstone Network Park Units [Dataset]. https://gimi9.com/dataset/data-gov_land-cover-summary-statistics-data-package-for-greater-yellowstone-network-park-units/
Dataset updated
Dec 16, 2023
Description
This report documents the acquisition of source data, and calculation of land cover summary statistics datasets for four National Park Service Greater Yellowstone Network park units and six custom areas of analysis: Bighorn Canyon National Recreation Area, Grand Teton National Park, John D. Rockefeller Jr. Memorial Parkway, Yellowstone National Park, and the six custom areas of analysis. The source data and land cover calculations are available for use within the National Park Service (NPS) Inventory and Monitoring Program. Land cover summary statistics datasets can be calculated for all geographic regions within the extent of the NPS; this report includes statistics calculated for the conterminous United States. The land cover summary statistics datasets are calculated from multiple sources, including Multi-Resolution Land Characteristics Consortium products in the National Land Cover Database (NLCD) and the United States Geological Survey’s (USGS) Earth Resources Observation and Science (EROS) Center products in the Land Change Monitoring, Assessment, and Projection (LCMAP) raster dataset. These summary statistics calculate land cover at up to three classification scales: Level 1, modified Anderson Level 2, and Natural versus Converted land cover. The output land cover summary statistics datasets produced here for the four Greater Yellowstone Network park units and six custom areas of analysis utilize the most recent versions of the source datasets (NLCD and LCMAP). These land cover summary statistics datasets are used in the NPS Inventory and Monitoring Program, including the NPS Environmental Settings Monitoring Protocol and may be used by networks and parks for additional efforts.
Impact and Risk Analysis Database Documentation
data.gov.au
cloud.csiss.gmu.edu
+3 more
zip
Updated Nov 20, 2019
Cite
Bioregional Assessment Program (2019). Impact and Risk Analysis Database Documentation [Dataset]. https://data.gov.au/data/dataset/groups/05e851cf-57a5-4127-948a-1b41732d538c
Available download formats: zip (3577368)
Dataset updated
Nov 20, 2019
Dataset provided by
Bioregional Assessment Program
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract

Four documents describe the specifications, methods and scripts of the Impact and Risk Analysis Databases developed for the Bioregional Assessments Programme. They are:

Bioregional Assessment Impact and Risk Databases Installation Advice (IMIA Database Installation Advice v1.docx).

Naming Convention of the Bioregional Assessment Impact and Risk Databases (IMIA Project Naming Convention v39.docx).

Data treatments for the Bioregional Assessment Impact and Risk Databases (IMIA Project Data Treatments v02.docx).

Quality Assurance of the Bioregional Assessment Impact and Risk Databases (IMIA Project Quality Assurance Protocol v17.docx).

This dataset also includes the Materialised View Information Manager (MatInfoManager.zip). This Microsoft Access database is used to manage the overlay definitions of materialized views of the Impact and Risk Analysis Databases. For more information about this tool, refer to the Data Treatments document.

The documentation supports all five Impact and Risk Analysis Databases developed for the assessment areas:

Maranoa-Balonne-Condamine: http://data.bioregionalassessments.gov.au/dataset/69075f3e-67ba-405b-8640-96e6cb2a189a

Gloucester: http://data.bioregionalassessments.gov.au/dataset/d78c474c-5177-42c2-873c-64c7fe2b178c

Hunter: http://data.bioregionalassessments.gov.au/dataset/7c170d60-ff09-4982-bd89-dd3998a88a47

Namoi: http://data.bioregionalassessments.gov.au/dataset/1549c88d-927b-4cb5-b531-1d584d59be58

Galilee: http://data.bioregionalassessments.gov.au/dataset/3dbb5380-2956-4f40-a535-cbdcda129045

Purpose

These documents describe end-to-end treatments of scientific data for the Impact and Risk Analysis Databases, developed and published by the Bioregional Assessment Programme. The applied approach to data quality assurance is also described. These documents are intended for people with an advanced knowledge in geospatial analysis and database administration, who seek to understand, restore or utilise the Analysis Databases and their underlying methods of analysis.

Dataset History

The Impact and Risk Analysis Database Documentation was created for and by the Information Modelling and Impact Assessment Project (IMIA Project).

Dataset Citation

Bioregional Assessment Programme (2018) Impact and Risk Analysis Database Documentation. Bioregional Assessment Source Dataset. Viewed 12 December 2018, http://data.bioregionalassessments.gov.au/dataset/05e851cf-57a5-4127-948a-1b41732d538c.
Data And Analytics Software Market Report
promarketreports.com
doc, pdf, ppt
Updated Feb 23, 2025
Pro Market Reports (2025). Data And Analytics Software Market Report [Dataset]. https://www.promarketreports.com/reports/data-and-analytics-software-market-18429
Explore at:
Available download formats: ppt, doc, pdf
Dataset updated
Feb 23, 2025
Dataset authored and provided by
Pro Market Reports
License
https://www.promarketreports.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The data and analytics software market is poised to experience significant growth, expanding from USD 108.69 billion in 2025 to a projected USD 248.84 billion by 2033, exhibiting a CAGR of 9.72% during the forecast period. This growth is fueled by the increasing adoption of big data and cloud computing, as well as the rising demand for data-driven insights to improve decision-making and gain a competitive edge in various industries. Major market drivers include the growing volume and complexity of data, technological advancements in data management and analytics, and the need for real-time insights to optimize operations and customer experiences. Market trends include the rise of artificial intelligence (AI) and machine learning (ML), which enable more advanced data analysis and predictive modeling. The adoption of cloud-based data analytics solutions is also gaining traction, offering flexibility, cost-effectiveness, and scalability. Some market restraints include data security and privacy concerns, the lack of skilled data analytics professionals, and the integration challenges associated with diverse data sources. The market is highly competitive, with established vendors such as Qlik, Informatica, Oracle, Microsoft, and Teradata, along with emerging players like Databricks, Amazon Web Services (AWS), and Google Cloud Platform (GCP) vying for market share. Key drivers for this market are: (1) self-service analytics tools; (2) integration with other cloud applications; (3) prescriptive and predictive analytics; (4) artificial intelligence and machine learning; (5) data storytelling. Potential restraints include: cloud adoption, real-time analytics, and artificial intelligence.
Big Data Analytics in Retail Market - Trends & Industry Analysis
mordorintelligence.com
pdf,excel,csv,ppt
Mordor Intelligence, Big Data Analytics in Retail Market - Trends & Industry Analysis [Dataset]. https://www.mordorintelligence.com/industry-reports/big-data-analytics-in-retail-marketing-market
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset authored and provided by
Mordor Intelligence
License
https://www.mordorintelligence.com/privacy-policy
Time period covered
2021 - 2030
Area covered
Global
Description
The Data Analytics in Retail Industry is segmented by Application (Merchandising and Supply Chain Analytics, Social Media Analytics, Customer Analytics, Operational Intelligence, Other Applications), by Business Type (Small and Medium Enterprises, Large-scale Organizations), and Geography. The market size and forecasts are provided in terms of value (USD billion) for all the above segments.
News Events Data in Latin America( Techsalerator)
datarade.ai
Updated Mar 20, 2024
Techsalerator (2024). News Events Data in Latin America( Techsalerator) [Dataset]. https://datarade.ai/data-products/news-events-data-in-latin-america-techsalerator-techsalerator
Explore at:
Available download formats: .json, .csv, .xls, .txt
Dataset updated
Mar 20, 2024
Dataset provided by
Techsalerator LLC
Authors
Techsalerator
Area covered
Americas, Latin America, Aruba, Chile, Falkland Islands (Malvinas), French Guiana, Martinique, Montserrat, Dominican Republic, Ecuador, Cuba, Argentina
Description
Techsalerator’s News Event Data in Latin America offers a detailed and extensive dataset designed to provide businesses, analysts, journalists, and researchers with an in-depth view of significant news events across the Latin American region. This dataset captures and categorizes key events reported from a wide array of news sources, including press releases, industry news sites, blogs, and PR platforms, offering valuable insights into regional developments, economic changes, political shifts, and cultural events.

Key Features of the Dataset:

Comprehensive Coverage: The dataset aggregates news events from numerous sources such as company press releases, industry news outlets, blogs, PR sites, and traditional news media. This broad coverage ensures a wide range of information from multiple reporting channels.

Categorization of Events: News events are categorized into various types including business and economic updates, political developments, technological advancements, legal and regulatory changes, and cultural events. This categorization helps users quickly locate and analyze information relevant to their interests or sectors.

Real-Time Updates: The dataset is updated regularly to include the most recent events, ensuring users have access to the latest news and can stay informed about current developments.

Geographic Segmentation: Events are tagged with their respective countries and regions within Latin America. This geographic segmentation allows users to filter and analyze news events based on specific locations, facilitating targeted research and analysis.

Event Details: Each event entry includes comprehensive details such as the date of occurrence, source of the news, a description of the event, and relevant keywords. This thorough detailing helps in understanding the context and significance of each event.

Historical Data: The dataset includes historical news event data, enabling users to track trends and perform comparative analysis over time. This feature supports longitudinal studies and provides insights into how news events evolve.

Advanced Search and Filter Options: Users can search and filter news events based on criteria such as date range, event type, location, and keywords. This functionality allows for precise and efficient retrieval of relevant information.

Latin American Countries Covered:

South America: Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Paraguay, Peru, Suriname, Uruguay, Venezuela

Central America: Belize, Costa Rica, El Salvador, Guatemala, Honduras, Nicaragua, Panama

Caribbean: Cuba, Dominican Republic, Haiti (Note: Primarily French-speaking but included due to geographic and cultural ties), Jamaica, Trinidad and Tobago

Benefits of the Dataset:

Strategic Insights: Businesses and analysts can use the dataset to gain insights into significant regional developments, economic conditions, and political changes, aiding in strategic decision-making and market analysis.

Market and Industry Trends: The dataset provides valuable information on industry-specific trends and events, helping users understand market dynamics and emerging opportunities.

Media and PR Monitoring: Journalists and PR professionals can track relevant news across Latin America, enabling them to monitor media coverage, identify emerging stories, and manage public relations efforts effectively.

Academic and Research Use: Researchers can utilize the dataset for longitudinal studies, trend analysis, and academic research on various topics related to Latin American news and events.

Techsalerator’s News Event Data in Latin America is a crucial resource for accessing and analyzing significant news events across the region. By providing detailed, categorized, and up-to-date information, it supports effective decision-making, research, and media monitoring across diverse sectors.

Global Data Element Market Research Report: By Data Source (Relational...

wiseguyreports.com

Updated Jul 23, 2024

Wiseguy Research Consultants Pvt Ltd (2024). Global Data Element Market Research Report: By Data Source (Relational Databases, NoSQL Databases, Big Data Platforms, Cloud-based Data Warehouses), By Type (Structured Data, Unstructured Data, Semi-Structured Data), By Format (XML, JSON, CSV, Parquet), By Purpose (Data Analysis, Machine Learning, Data Visualization, Data Governance), By Deployment Model (On-premises, Cloud-based, Hybrid) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/data-element-market

Explore at:

Dataset updated

Jul 23, 2024

Dataset authored and provided by

Wiseguy Research Consultants Pvt Ltd

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Jan 7, 2024

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2024
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2023	7.6 (USD Billion)
MARKET SIZE 2024	8.66 (USD Billion)
MARKET SIZE 2032	24.7 (USD Billion)
SEGMENTS COVERED	Data Source, Type, Format, Purpose, Deployment Model, Regional
COUNTRIES COVERED	North America, Europe, APAC, South America, MEA
KEY MARKET DYNAMICS	AI-driven data element management; Data privacy and regulations; Cloud-based data element platforms; Data sharing and collaboration; Increasing demand for real-time data
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Informatica, Micro Focus, IBM, SAS, Denodo, Oracle, TIBCO, Talend, SAP
MARKET FORECAST PERIOD	2024 - 2032
KEY MARKET OPPORTUNITIES	1. Adoption of AI and ML; 2. Growing demand for data analytics; 3. Increasing cloud adoption; 4. Data privacy and security concerns; 5. Integration with emerging technologies
COMPOUND ANNUAL GROWTH RATE (CAGR)	13.99% (2024 - 2032)
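
As a rough check on the figures in the table above, the reported growth rate can be recomputed from the 2024 and 2032 market sizes. The sketch below is illustrative only: the `cagr` helper and the variable names are not part of the source, and it assumes eight annual compounding periods between 2024 and 2032.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.14 means 14%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes taken from the table above, in USD billion.
market_size_2024 = 8.66
market_size_2032 = 24.7

# Assumption: the forecast period 2024 - 2032 spans 8 compounding periods.
forecast_years = 2032 - 2024

rate = cagr(market_size_2024, market_size_2032, forecast_years)
print(f"Implied CAGR: {rate:.2%}")
```

Under that assumption the implied rate comes out at about 14%, which is in line with the 13.99% stated in the table.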

Data Preparation Tool Market Report
promarketreports.com
doc, pdf, ppt
Updated Feb 3, 2025
Pro Market Reports (2025). Data Preparation Tool Market Report [Dataset]. https://www.promarketreports.com/reports/data-preparation-tool-market-18555
Explore at:
Available download formats: pdf, ppt, doc
Dataset updated
Feb 3, 2025
Dataset authored and provided by
Pro Market Reports
License
https://www.promarketreports.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global data preparation tool market is estimated to be valued at $674.52 million in 2025, with a compound annual growth rate (CAGR) of 16.46% from 2025 to 2033. The rising need to manage and analyze large volumes of complex data from various sources is driving the growth of the market. Additionally, the increasing adoption of cloud-based data management solutions and the growing demand for data-driven decision-making are contributing to the market's expansion. Key market trends include the growing adoption of artificial intelligence (AI) and machine learning (ML) technologies for data preparation automation, the increasing use of data visualization tools for data analysis, and the growing popularity of data fabric architectures for data integration and management. The market is segmented by deployment (on-premises, cloud, hybrid), data volume (small data, big data), data type (structured data, unstructured data, semi-structured data), industry vertical (BFSI, healthcare, retail, manufacturing), and use case (data integration, data cleansing, data transformation, data enrichment). North America is the largest regional market, followed by Europe and Asia Pacific. IBM, Collibra, Talend, Microsoft, Informatica, SAP, SAS Institute, and Denodo are some of the key players in the market. Key drivers for this market are: cloud-based deployment, AI/ML integration, self-service capabilities, real-time data processing, and data governance and compliance. Potential restraints include: increasing cloud adoption, growing volume of data, advancements in artificial intelligence (AI) and machine learning (ML), stringent regulatory compliance, and rising demand for self-service data preparation.
Comprehensive Median Household Income and Distribution Dataset for Fall...
neilsberg.com
Updated Jan 11, 2024
Neilsberg Research (2024). Comprehensive Median Household Income and Distribution Dataset for Fall River, MA: Analysis by Household Type, Size and Income Brackets [Dataset]. https://www.neilsberg.com/research/datasets/cd9a4b22-b041-11ee-aaca-3860777c1fe6/
Explore at:
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Fall River, Massachusetts
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the median household income in Fall River. It can be utilized to understand the trend in median household income and to analyze the income distribution in Fall River by household type, size, and across various income brackets.

Content

The dataset includes the following datasets, when applicable:

Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

Fall River, MA Median Household Income Trends (2010-2021, in 2022 inflation-adjusted dollars)

Median Household Income Variation by Family Size in Fall River, MA: Comparative analysis across 7 household sizes

Income Distribution by Quintile: Mean Household Income in Fall River, MA

Fall River, MA households by income brackets: family, non-family, and total, in 2022 inflation-adjusted dollars

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability, and thus to a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Interested in deeper insights and visual analysis?

Explore our comprehensive data analysis and visual representations for a deeper understanding of Fall River median household income.
Comprehensive Median Household Income and Distribution Dataset for Widener,...
neilsberg.com
Updated Jan 11, 2024
Neilsberg Research (2024). Comprehensive Median Household Income and Distribution Dataset for Widener, AR: Analysis by Household Type, Size and Income Brackets [Dataset]. https://www.neilsberg.com/research/datasets/cdc89507-b041-11ee-aaca-3860777c1fe6/
Explore at:
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Widener, Arkansas
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the median household income in Widener. It can be utilized to understand the trend in median household income and to analyze the income distribution in Widener by household type, size, and across various income brackets.

Content

The dataset includes the following datasets, when applicable:

Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

Widener, AR Median Household Income Trends (2010-2021, in 2022 inflation-adjusted dollars)

Median Household Income Variation by Family Size in Widener, AR: Comparative analysis across 7 household sizes

Income Distribution by Quintile: Mean Household Income in Widener, AR

Widener, AR households by income brackets: family, non-family, and total, in 2022 inflation-adjusted dollars

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability, and thus to a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Interested in deeper insights and visual analysis?

Explore our comprehensive data analysis and visual representations for a deeper understanding of Widener median household income.
