100+ datasets found
  1. Data sources for anti-fraud data analytics initiatives in global...

    • statista.com
    Updated May 23, 2022
    Cite
    Statista (2022). Data sources for anti-fraud data analytics initiatives in global organizations 2019 [Dataset]. https://www.statista.com/statistics/1043542/worldwide-fraud-fight-data-analytics-data-sources/
    Dataset updated
    May 23, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2019
    Area covered
    Worldwide
    Description

    Internal structured data is the most commonly used data source for anti-fraud data analytics initiatives in organizations, according to a global company survey in 2019. Almost three quarters of the respondents said that internal structured data was used in their companies for anti-fraud analytics tests.

  2. Spreadsheet of resistance values and data sources used to compile the...

    • catalog.data.gov
    • datasets.ai
    Updated Feb 22, 2025
    + more versions
    Cite
    U.S. Fish and Wildlife Service (2025). Spreadsheet of resistance values and data sources used to compile the resistance surface - A landscape connectivity analysis for the coastal marten (Martes caurina humboldtensis) [Dataset]. https://catalog.data.gov/dataset/spreadsheet-of-resistance-values-and-data-sources-used-to-compile-the-resistance-surface-a
    Dataset updated
    Feb 22, 2025
    Dataset provided by
    U.S. Fish and Wildlife Service
    Description

    This spreadsheet contains a list of component raster data layers that were used to compile our resistance surface, the classes of data represented within each of these rasters, and the resistance value we assigned to each class. It also provides a web reference for each data layer to provide additional context and information about the source datasets. Please refer to the embedded spatial metadata and the information in our full report for details on the development of the resulting ResistanceSurface, as well as these component data layers: ResistanceData_Roads, ResistanceData_ForestedCover, ResistanceData_Rivers, ResistanceData_Waterbodies, ResistanceData_NonForestedCover, ResistanceData_BaysEstuaries, ResistancePostProcessing_Serpentine.
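
    Assigning a resistance value to each class of a raster layer amounts to a lookup-table reclassification. The sketch below illustrates that single step only. The class codes, resistance values, and the reclassify helper are hypothetical, and the way the component layers are combined into the final surface is not described here, so it is not attempted.

```python
# Hypothetical class codes and resistance values for one component layer.
# The real values are in the spreadsheet and are not reproduced here.
ROADS_RESISTANCE = {0: 1, 1: 50, 2: 100}  # class code -> resistance value


def reclassify(layer, lookup, nodata=None):
    """Replace each class code in a 2-D grid with its resistance value.

    Codes missing from the lookup table become `nodata`.
    """
    return [[lookup.get(cell, nodata) for cell in row] for row in layer]


# Made-up 2 x 3 grid of class codes standing in for a raster layer.
roads_classes = [
    [0, 0, 1],
    [0, 2, 1],
]
roads_resistance = reclassify(roads_classes, ROADS_RESISTANCE)
print(roads_resistance)
```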

  3. Importance of data sources for analytics vs access among U.S. businesses...

    • statista.com
    Updated Mar 25, 2016
    Cite
    Statista (2016). Importance of data sources for analytics vs access among U.S. businesses 2015 [Dataset]. https://www.statista.com/statistics/562625/united-states-data-analytics-importance-vs-access/
    Dataset updated
    Mar 25, 2016
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    This statistic illustrates the importance of various data sources for business analytics, compared to the level of access businesses have to those data sources, according to a marketing survey of C-level executives, conducted in December 2015 by Black Ink. As of December 2015, product and service usage data was listed as important by 68 percent of respondents, but the degree of access to that data was put at 33 percent.

  4. Sentiment Analysis for Mental Health

    • kaggle.com
    Updated Jul 5, 2024
    Cite
    Suchintika Sarkar (2024). Sentiment Analysis for Mental Health [Dataset]. https://www.kaggle.com/datasets/suchintikasarkar/sentiment-analysis-for-mental-health
    Dataset updated
    Jul 5, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Suchintika Sarkar
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This comprehensive dataset is a meticulously curated collection of mental health statuses tagged from various statements. The dataset amalgamates raw data from multiple sources, cleaned and compiled to create a robust resource for developing chatbots and performing sentiment analysis.

    Data Source:

    The dataset integrates information from the following Kaggle datasets:

    Data Overview:

    The dataset consists of statements tagged with one of the following seven mental health statuses:

    • Normal
    • Depression
    • Suicidal
    • Anxiety
    • Stress
    • Bi-Polar
    • Personality Disorder

    Data Collection:

    The data is sourced from diverse platforms including social media posts, Reddit posts, Twitter posts, and more. Each entry is tagged with a specific mental health status, making it an invaluable asset for:

    • Developing intelligent mental health chatbots.
    • Performing in-depth sentiment analysis.
    • Research and studies related to mental health trends.

    Features:

    • unique_id: A unique identifier for each entry.
    • Statement: The textual data or post.
    • Mental Health Status: The tagged mental health status of the statement.
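
    A first step with a table of this shape is usually to count how many statements carry each status. The sketch below does that on a tiny made-up sample. The header names are copied from the feature list above and are an assumption about the actual file, which may spell them differently; count_statuses is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Made-up rows in the layout the feature list describes.
# Header names are assumed; the real file may use different spellings.
SAMPLE_CSV = """unique_id,Statement,Mental Health Status
1,Slept well and feel fine today,Normal
2,I cannot stop worrying about everything,Anxiety
3,Work deadlines are piling up,Stress
4,Had a quiet weekend,Normal
"""


def count_statuses(csv_text, label_column="Mental Health Status"):
    """Count how many statements are tagged with each status."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


status_counts = count_statuses(SAMPLE_CSV)
print(status_counts.most_common())
```

    Checking the class balance this way before training helps show whether some statuses are heavily under-represented.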

    Usage:

    This dataset is ideal for training machine learning models aimed at understanding and predicting mental health conditions based on textual data. It can be used in various applications such as:

    • Chatbot development for mental health support.
    • Sentiment analysis to gauge mental health trends.
    • Academic research on mental health patterns.

    Acknowledgments:

    This dataset was created by aggregating and cleaning data from various publicly available datasets on Kaggle. Special thanks to the original dataset creators for their contributions.

  5. ‘Statistics on the Open Data site’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 12, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Statistics on the Open Data site ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-europa-eu-statistics-on-the-open-data-site-55ba/8b2737b0/?iid=002-180&v=presentation
    Dataset updated
    Jan 12, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Statistics on the Open Data site ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from http://data.europa.eu/88u/dataset/https-mon-saint-quentin-hub-arcgis-com-datasets-5426305826594a33a561acfd02d25808_0 on 12 January 2022.

    --- Dataset description provided by original source is as follows ---

    Statistics on official and obsolete batches of data, broken down by actor in the portal.

    Definition of obsolete: A batch of data is considered obsolete when obvious defects have been detected by a quality check, or when the business department responsible for maintaining the batch no longer has an update strategy for it.

    Definition of official: The batch is usable and suitable.

    --- Original source retains full ownership of the source dataset ---

  6. Data from: CaImAn: An open source tool for scalable Calcium Imaging data...

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Brandon L. Brown (2020). CaImAn: An open source tool for scalable Calcium Imaging data Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1659148
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Eftychios A. Pnevmatikakis
    Brandon L. Brown
    David W. Tank
    Pengcheng Zhou
    Andrea Giovannucci
    Dmitri Chklovskii
    Jeffrey L. Gauthier
    Johannes Friedrich
    Jiannis Taxidis
    Pat Gunn
    Baljit S. Khakh
    Sue Ann Koay
    Farzaneh Najafi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Advances in fluorescence microscopy enable monitoring larger brain areas in-vivo with finer time resolution. The resulting data rates require reproducible analysis pipelines that are reliable, fully automated, and scalable to datasets generated over the course of months. We present CaImAn, an open-source library for calcium imaging data analysis. CaImAn provides automatic and scalable methods to address problems common to preprocessing, including motion correction, neural activity identification, and registration across different sessions of data collection. It does this while requiring minimal user intervention, with good scalability on computers ranging from laptops to high-performance computing clusters. CaImAn is suitable for two-photon and one-photon imaging, and also enables real-time analysis on streaming data.

    To benchmark the performance of CaImAn we collected and combined a corpus of manual annotations from multiple labelers on nine mouse two-photon datasets, that are contained in this open access repository. We demonstrate that CaImAn achieves near-human performance in detecting locations of active neurons.

    In order to reproduce the results of the paper or download the annotations and the raw movies, please refer to the readme.md at:

    https://github.com/flatironinstitute/CaImAn/blob/master/use_cases/eLife_scripts/README.md

  7. GLO climate data stats summary

    • data.gov.au
    • demo.dev.magda.io
    • +1 more
    zip
    Updated Apr 13, 2022
    Cite
    Bioregional Assessment Program (2022). GLO climate data stats summary [Dataset]. https://data.gov.au/data/dataset/afed85e0-7819-493d-a847-ec00a318e657
    Available download formats: zip (8810)
    Dataset updated
    Apr 13, 2022
    Dataset authored and provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    Summaries of various climate variables for all 15 subregions, based on Bureau of Meteorology Australian Water Availability Project (BAWAP) climate grids, including:

    1. Time series mean annual BAWAP rainfall from 1900 - 2012.

    2. Long term average BAWAP rainfall and Penman Potential Evapotranspiration (PET) from Jan 1981 - Dec 2012 for each month.

    3. Values calculated over the years 1981 - 2012 (inclusive), for 17 time periods (i.e., annual, 4 seasons and 12 months) for the following 8 meteorological variables: (i) BAWAP_P (precipitation); (ii) Penman ETp; (iii) Tavg (average temperature); (iv) Tmax (maximum temperature); (v) Tmin (minimum temperature); (vi) VPD (Vapour Pressure Deficit); (vii) Rn (net radiation); and (viii) Wind speed. For each of the 17 time periods and each of the 8 meteorological variables, the following were calculated: (a) average; (b) maximum; (c) minimum; (d) average plus standard deviation (stddev); (e) average minus stddev; (f) stddev; and (g) trend.

    4. Correlation coefficients (-1 to 1) between rainfall and 4 remote rainfall drivers between 1957-2006 for the four seasons. The data and methodology are described in Risbey et al. (2009).

    As described in the Risbey et al. (2009) paper, the rainfall was from 0.05 degree gridded data described in Jeffrey et al. (2001 - known as the SILO datasets); sea surface temperature was from the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) on a 1 degree grid. BLK=Blocking; DMI=Dipole Mode Index; SAM=Southern Annular Mode; SOI=Southern Oscillation Index; DJF=December, January, February; MAM=March, April, May; JJA=June, July, August; SON=September, October, November. The analysis is a summary of Fig. 15 of Risbey et al. (2009).
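
    The seven per-variable statistics listed in item 3 can be sketched for a single variable and time period as follows. The input values are made up. Using the sample standard deviation and an ordinary least-squares slope for the trend are assumptions, since the description does not state which definitions were used; summarise is a hypothetical helper.

```python
from statistics import mean, stdev


def summarise(years, values):
    """Return the seven summary statistics for one variable and time period."""
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation (assumed definition)
    year_mean = mean(years)
    # Ordinary least-squares slope in value units per year (assumed trend definition).
    slope = sum((y - year_mean) * (v - avg) for y, v in zip(years, values)) / sum(
        (y - year_mean) ** 2 for y in years
    )
    return {
        "average": avg,
        "maximum": max(values),
        "minimum": min(values),
        "average_plus_stddev": avg + sd,
        "average_minus_stddev": avg - sd,
        "stddev": sd,
        "trend": slope,
    }


years = [1981, 1982, 1983, 1984, 1985]
rainfall_mm = [600.0, 620.0, 640.0, 660.0, 680.0]  # made-up annual totals
summary = summarise(years, rainfall_mm)
print(summary)
```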

    There are 4 csv files here:

    BAWAP_P_annual_BA_SYB_GLO.csv

    Desc: Time series mean annual BAWAP rainfall from 1900 - 2012.

    Source data: annual BILO rainfall

    P_PET_monthly_BA_SYB_GLO.csv

    Desc: Long term average BAWAP rainfall and Penman PET from 198101 - 201212 for each month.

    Climatology_Trend_BA_SYB_GLO.csv

    Desc: Values calculated over the years 1981 - 2012 (inclusive), for 17 time periods (i.e., annual, 4 seasons and 12 months) for the following 8 meteorological variables: (i) BAWAP_P; (ii) Penman ETp; (iii) Tavg; (iv) Tmax; (v) Tmin; (vi) VPD; (vii) Rn; and (viii) Wind speed. For each of the 17 time periods and each of the 8 meteorological variables, the following were calculated: (a) average; (b) maximum; (c) minimum; (d) average plus standard deviation (stddev); (e) average minus stddev; (f) stddev; and (g) trend.

    Risbey_Remote_Rainfall_Drivers_Corr_Coeffs_BA_NSB_GLO.csv

    Correlation coefficients (-1 to 1) between rainfall and 4 remote rainfall drivers between 1957-2006 for the four seasons. The data and methodology are described in Risbey et al. (2009). As described in the Risbey et al. (2009) paper, the rainfall was from 0.05 degree gridded data described in Jeffrey et al. (2001 - known as the SILO datasets); sea surface temperature was from the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) on a 1 degree grid. BLK=Blocking; DMI=Dipole Mode Index; SAM=Southern Annular Mode; SOI=Southern Oscillation Index; DJF=December, January, February; MAM=March, April, May; JJA=June, July, August; SON=September, October, November. The analysis is a summary of Fig. 15 of Risbey et al. (2009).
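
    As a rough illustration of a correlation coefficient in the range -1 to 1, the sketch below computes a plain Pearson coefficient between a made-up driver index and made-up seasonal rainfall. That the file holds plain Pearson coefficients is an assumption; the description defers to Risbey et al. (2009) for the actual method. The pearson helper and both series are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up seasonal values: one driver index and rainfall for the same seasons.
soi_index = [-1.0, 0.0, 1.0, 2.0]
rainfall_mm = [80.0, 100.0, 120.0, 140.0]

r = pearson(soi_index, rainfall_mm)
print(round(r, 3))
```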

    Dataset History

    Dataset was created from various BAWAP source data, including Monthly BAWAP rainfall, Tmax, Tmin, VPD, etc, and other source data including monthly Penman PET, Correlation coefficient data. Data were extracted from national datasets for the GLO subregion.


    Dataset Citation

    Bioregional Assessment Programme (2014) GLO climate data stats summary. Bioregional Assessment Derived Dataset. Viewed 18 July 2018, http://data.bioregionalassessments.gov.au/dataset/afed85e0-7819-493d-a847-ec00a318e657.

    Dataset Ancestors

  8. Global Big Data in the Oil and Gas Sector Market Report 2025 Edition, Market...

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Apr 12, 2025
    Cite
    Cognitive Market Research (2025). Global Big Data in the Oil and Gas Sector Market Report 2025 Edition, Market Size, Share, CAGR, Forecast, Revenue [Dataset]. https://www.cognitivemarketresearch.com/big-data-in-the-oil-and-gas-sector-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Apr 12, 2025
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    According to Cognitive Market Research, the global Big Data in Oil and Gas Sector market size is projected to reach USD XX million by 2024 and is expected to expand at a compound annual growth rate (CAGR) of XX% from 2024 to 2031.

    • The global Big Data in Oil and Gas Sector market is anticipated to grow significantly, with a projected CAGR of XX% between 2024 and 2031.
    • North America is expected to hold a major market share of more than XX%, with a market size of USD XX million in 2024, and is forecasted to grow at a CAGR of XX% from 2024 to 2031 due to the advanced technological infrastructure and the high adoption rate of digital technologies in the oil and gas sector.
    • The upstream application segment held the highest Big Data in Oil and Gas Sector market revenue share in 2024, attributed to the critical role of big data in exploration and production activities, optimizing reservoir performance, and minimizing risks.

    Market Dynamics - Key Drivers of the Big Data in Oil and Gas Sector

    Integration of Advanced Analytics for Enhanced Decision-Making Drives the Big Data in Oil & Gas Market

    The Big Data in Oil & Gas market is driven by the adoption of advanced analytics, with cost efficiency as a major benefit. Big data analytics processes complex datasets to support better predictions and optimisations. As big data is integrated further, it supports the development of the oil and gas sector by enhancing decision-making, efficiency, and safety.

    For instance, ExxonMobil, in their "2020 Energy & Carbon Summary" report, highlighted the use of advanced seismic imaging and data analytics to improve the accuracy of subsurface exploration, thereby reducing drilling risks and enhancing operational efficiency.

    IoT Deployment for Real-Time Monitoring and Efficiency Further Propels the Big Data in Oil & Gas Market

    The rising demand for monitoring, infographics, and data analytics is expected to fuel the Big Data in Oil & Gas market. The deployment of IoT devices facilitates real-time monitoring and operational efficiency. This development aligns with the broader shift towards self-sufficiency and positive capital allocations. IoT sensors on equipment and in operations provide critical data for predictive maintenance and decision-making, contributing to the shift from capital expenditure to operational expenditure in multiple outsourced business activities.

    Schlumberger, in their "Digital Transformation in the Oil and Gas Industry" report, discussed implementing IoT solutions to monitor well operations, which has led to significant improvements in maintenance strategies and operational efficiencies.

    Market Dynamics - Key Restraints of the Big Data in Oil and Gas Sector

    Data Security and Privacy Concerns Are a Challenge for the Big Data in Oil & Gas Market

    Companies store data on every aspect of their business in order to work more efficiently in the future, which leaves room for avoidable threats. Given the sensitive nature of the oil and gas sector, data security and privacy are significant concerns as the use of big data analytics increases. Cyber threats limit the adoption of big data solutions, which in turn limits demand in the Big Data in Oil & Gas market.

    The International Energy Agency (IEA), in its "Digitalization & Energy" report, highlighted the cybersecurity challenges facing the energy sector, emphasizing the need for robust security measures in the adoption of digital technologies, including big data analytics.

    Integration and Interoperability Challenges Will Restrain the Big Data in Oil & Gas Market

    Data access, analysis, and storage are becoming more and more of an issue for businesses. Compatibility and interoperability issues arise when big data technologies are integrated with legacy systems. The integration process is made more difficult by the diversity of data sources and formats. Most firms are finding it necessary to evaluate new technologies and legacy infrastructure as the needs of Big Data outpace those of traditional relational databases.

    A study by Deloitte, titled "Digital Transformation: Shaping the Future of the Oil and Gas Industry", identified integration of new technologies with existin...

  9. Most reliable sources of data for market researchers in the U.S. 2017

    • statista.com
    Updated Dec 10, 2024
    Cite
    Statista (2024). Most reliable sources of data for market researchers in the U.S. 2017 [Dataset]. https://www.statista.com/statistics/917534/market-research-industry-us-most-reliable-sources-of-data/
    Dataset updated
    Dec 10, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2017
    Area covered
    United States
    Description

    This statistic displays the most reliable sources of data according to professionals in the market research industry in the United States in 2017. During the survey, 32 percent of respondents cited marketing analytics as the most reliable data source.

  10. ‘Sign Line Task & Work Order Data’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Oct 22, 2017
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2017). ‘Sign Line Task & Work Order Data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-sign-line-task-work-order-data-bb94/latest
    Dataset updated
    Oct 22, 2017
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Sign Line Task & Work Order Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/6687ab3e-87a4-423c-8e4d-11dc1d9c2943 on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Sign (Task & Work Order) Information from eWork.

    --- Original source retains full ownership of the source dataset ---

  11. Enterprise-Driven Open Source Software

    • data.niaid.nih.gov
    • opendatalab.com
    Updated Apr 22, 2020
    Cite
    Kravvaritis, Konstantinos (2020). Enterprise-Driven Open Source Software [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3653877
    Dataset updated
    Apr 22, 2020
    Dataset provided by
    Theodorou, Georgios
    Kravvaritis, Konstantinos
    Louridas, Panos
    Kotti, Zoe
    Spinellis, Diomidis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present a dataset of open source software developed mainly by enterprises rather than volunteers. This can be used to address known generalizability concerns, and, also, to perform research on open source business software development. Based on the premise that an enterprise's employees are likely to contribute to a project developed by their organization using the email account provided by it, we mine domain names associated with enterprises from open data sources as well as through white- and blacklisting, and use them through three heuristics to identify 17,264 enterprise GitHub projects. We provide these as a dataset detailing their provenance and properties. A manual evaluation of a dataset sample shows an identification accuracy of 89%. Through an exploratory data analysis we found that projects are staffed by a plurality of enterprise insiders, who appear to be pulling more than their weight, and that in a small percentage of relatively large projects development happens exclusively through enterprise insiders.

    The main dataset is provided as a 17,264 record tab-separated file named enterprise_projects.txt with the following 29 fields.

    url: the project's GitHub URL

    project_id: the project's GHTorrent identifier

    sdtc: true if selected using the same domain top committers heuristic (9,016 records)

    mcpc: true if selected using the multiple committers from a valid enterprise heuristic (8,314 records)

    mcve: true if selected using the multiple committers from a probable company heuristic (8,015 records)

    star_number: number of GitHub watchers

    commit_count: number of commits

    files: number of files in current main branch

    lines: corresponding number of lines in text files

    pull_requests: number of pull requests

    github_repo_creation: timestamp of the GitHub repository creation

    earliest_commit: timestamp of the earliest commit

    most_recent_commit: date of the most recent commit

    committer_count: number of different committers

    author_count: number of different authors

    dominant_domain: the project's dominant email domain

    dominant_domain_committer_commits: number of commits made by committers whose email matches the project's dominant domain

    dominant_domain_author_commits: corresponding number for commit authors

    dominant_domain_committers: number of committers whose email matches the project's dominant domain

    dominant_domain_authors: corresponding number for commit authors

    cik: SEC's EDGAR "central index key"

    fg500: true if this is a Fortune Global 500 company (2,233 records)

    sec10k: true if the company files SEC 10-K forms (4,180 records)

    sec20f: true if the company files SEC 20-F forms (429 records)

    project_name: GitHub project name

    owner_login: GitHub project's owner login

    company_name: company name as derived from the SEC and Fortune 500 data

    owner_company: GitHub project's owner company name

    license: SPDX license identifier
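
    A tab-separated file with these fields can be read with Python's standard csv module. The sketch below uses two made-up records and a subset of the 29 fields. The presence of a header row and the literal "true"/"false" encoding of the flag fields are assumptions, and load_projects and insider_share are hypothetical helpers.

```python
import csv
import io

# Two made-up records with a subset of the 29 fields.
# A header row and "true"/"false" flag values are assumed.
SAMPLE_TSV = (
    "url\tsdtc\tmcpc\tmcve\tcommit_count\tdominant_domain\t"
    "dominant_domain_committer_commits\n"
    "https://github.com/example-org/alpha\ttrue\tfalse\tfalse\t200\texample.com\t150\n"
    "https://github.com/example-org/beta\tfalse\ttrue\ttrue\t80\texample.org\t20\n"
)


def load_projects(tsv_text):
    """Parse tab-separated project records into a list of dicts."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def insider_share(project):
    """Fraction of commits made by committers from the dominant domain."""
    return int(project["dominant_domain_committer_commits"]) / int(
        project["commit_count"]
    )


projects = load_projects(SAMPLE_TSV)
# Projects selected by the same-domain-top-committers heuristic.
sdtc_projects = [p["url"] for p in projects if p["sdtc"] == "true"]
shares = {p["url"]: insider_share(p) for p in projects}
print(sdtc_projects)
print(shares)
```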

    The file cohost_project_details.txt provides the full set of 311,223 cohort projects that are not part of the enterprise data set, but have comparable quality attributes.

    url: the project's GitHub URL

    project_id: the project's GHTorrent identifier

    stars: number of GitHub watchers

    commit_count: number of commits

  12. Land Cover Summary Statistics Data Package for Greater Yellowstone Network...

    • gimi9.com
    • catalog.data.gov
    Updated Dec 16, 2023
    Cite
    (2023). Land Cover Summary Statistics Data Package for Greater Yellowstone Network Park Units [Dataset]. https://gimi9.com/dataset/data-gov_land-cover-summary-statistics-data-package-for-greater-yellowstone-network-park-units/
    Dataset updated
    Dec 16, 2023
    Description

    This report documents the acquisition of source data, and calculation of land cover summary statistics datasets for four National Park Service Greater Yellowstone Network park units and six custom areas of analysis: Bighorn Canyon National Recreation Area, Grand Teton National Park, John D. Rockefeller Jr. Memorial Parkway, Yellowstone National Park, and the six custom areas of analysis. The source data and land cover calculations are available for use within the National Park Service (NPS) Inventory and Monitoring Program. Land cover summary statistics datasets can be calculated for all geographic regions within the extent of the NPS; this report includes statistics calculated for the conterminous United States. The land cover summary statistics datasets are calculated from multiple sources, including Multi-Resolution Land Characteristics Consortium products in the National Land Cover Database (NLCD) and the United States Geological Survey’s (USGS) Earth Resources Observation and Science (EROS) Center products in the Land Change Monitoring, Assessment, and Projection (LCMAP) raster dataset. These summary statistics calculate land cover at up to three classification scales: Level 1, modified Anderson Level 2, and Natural versus Converted land cover. The output land cover summary statistics datasets produced here for the four Greater Yellowstone Network park units and six custom areas of analysis utilize the most recent versions of the source datasets (NLCD and LCMAP). These land cover summary statistics datasets are used in the NPS Inventory and Monitoring Program, including the NPS Environmental Settings Monitoring Protocol and may be used by networks and parks for additional efforts.
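
    A land cover summary statistic of this kind is, at its simplest, the share of cells in an area that fall into each class. The sketch below computes percent cover per class and per Natural versus Converted group for a made-up list of cell values. The four class codes follow NLCD numbering, but the grouping table is an illustrative assumption rather than the classification the report uses, and percent_cover is a hypothetical helper.

```python
from collections import Counter

# Illustrative grouping of a few NLCD-style class codes (assumed, not from the report):
# 11 open water, 41 deciduous forest, 21 developed open space, 82 cultivated crops.
NATURAL_OR_CONVERTED = {
    11: "Natural",
    41: "Natural",
    21: "Converted",
    82: "Converted",
}


def percent_cover(cells, grouping=None):
    """Percent of cells per class, or per group when a grouping table is given."""
    labels = [grouping[c] if grouping else c for c in cells]
    total = len(labels)
    return {label: 100.0 * n / total for label, n in Counter(labels).items()}


cells = [41, 41, 41, 11, 21, 82, 41, 41]  # made-up cell values for one area
by_class = percent_cover(cells)
by_group = percent_cover(cells, NATURAL_OR_CONVERTED)
print(by_class)
print(by_group)
```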

  13. Impact and Risk Analysis Database Documentation

    • data.gov.au
    • cloud.csiss.gmu.edu
    • +3 more
    zip
    Updated Nov 20, 2019
    Cite
    Bioregional Assessment Program (2019). Impact and Risk Analysis Database Documentation [Dataset]. https://data.gov.au/data/dataset/groups/05e851cf-57a5-4127-948a-1b41732d538c
    Available download formats: zip (3577368)
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    Four documents describe the specifications, methods and scripts of the Impact and Risk Analysis Databases developed for the Bioregional Assessments Programme. They are:

    1. Bioregional Assessment Impact and Risk Databases Installation Advice (IMIA Database Installation Advice v1.docx).

    2. Naming Convention of the Bioregional Assessment Impact and Risk Databases (IMIA Project Naming Convention v39.docx).

    3. Data treatments for the Bioregional Assessment Impact and Risk Databases (IMIA Project Data Treatments v02.docx).

    4. Quality Assurance of the Bioregional Assessment Impact and Risk Databases (IMIA Project Quality Assurance Protocol v17.docx).

    This dataset also includes the Materialised View Information Manager (MatInfoManager.zip). This Microsoft Access database is used to manage the overlay definitions of materialized views of the Impact and Risk Analysis Databases. For more information about this tool, refer to the Data Treatments document.

    The documentation supports all five Impact and Risk Analysis Databases developed for the assessment areas:

    Purpose

    These documents describe end-to-end treatments of scientific data for the Impact and Risk Analysis Databases, developed and published by the Bioregional Assessment Programme. They also describe the approach applied to data quality assurance. The documents are intended for people with advanced knowledge of geospatial analysis and database administration who seek to understand, restore or utilise the Analysis Databases and their underlying methods of analysis.

    Dataset History

    The Impact and Risk Analysis Database Documentation was created for and by the Information Modelling and Impact Assessment Project (IMIA Project).

    Dataset Citation

    Bioregional Assessment Programme (2018) Impact and Risk Analysis Database Documentation. Bioregional Assessment Source Dataset. Viewed 12 December 2018, http://data.bioregionalassessments.gov.au/dataset/05e851cf-57a5-4127-948a-1b41732d538c.

  14. Data And Analytics Software Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Feb 23, 2025
    Cite
    Pro Market Reports (2025). Data And Analytics Software Market Report [Dataset]. https://www.promarketreports.com/reports/data-and-analytics-software-market-18429
    Explore at:
    Available download formats: ppt, doc, pdf
    Dataset updated
    Feb 23, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The data and analytics software market is poised for significant growth, expanding from USD 108.69 billion in 2025 to a projected USD 248.84 billion by 2033, with a stated CAGR of 9.72% over the forecast period. This growth is fueled by the increasing adoption of big data and cloud computing, as well as the rising demand for data-driven insights to improve decision-making and gain a competitive edge in various industries. Major market drivers include the growing volume and complexity of data, technological advancements in data management and analytics, and the need for real-time insights to optimize operations and customer experiences.

    Market trends include the rise of artificial intelligence (AI) and machine learning (ML), which enable more advanced data analysis and predictive modeling. Cloud-based data analytics solutions are also gaining traction, offering flexibility, cost-effectiveness, and scalability. Market restraints include data security and privacy concerns, the lack of skilled data analytics professionals, and the integration challenges associated with diverse data sources. The market is highly competitive, with established vendors such as Qlik, Informatica, Oracle, Microsoft, and Teradata, along with emerging players like Databricks, Amazon Web Services (AWS), and Google Cloud Platform (GCP), vying for market share.

    Key drivers for this market are: 1. Self-service analytics tools 2. Integration with other cloud applications 3. Prescriptive and predictive analytics 4. Artificial intelligence and machine learning 5. Data storytelling. Potential restraints include: cloud adoption, real-time analytics, and artificial intelligence.

  15. Big Data Analytics in Retail Market - Trends & Industry Analysis

    • mordorintelligence.com
    pdf,excel,csv,ppt
    Cite
    Mordor Intelligence, Big Data Analytics in Retail Market - Trends & Industry Analysis [Dataset]. https://www.mordorintelligence.com/industry-reports/big-data-analytics-in-retail-marketing-market
    Explore at:
    Available download formats: pdf, excel, csv, ppt
    Dataset authored and provided by
    Mordor Intelligence
    License

    https://www.mordorintelligence.com/privacy-policy

    Time period covered
    2021 - 2030
    Area covered
    Global
    Description

    The Data Analytics in Retail Industry is segmented by Application (Merchandising and Supply Chain Analytics, Social Media Analytics, Customer Analytics, Operational Intelligence, Other Applications), by Business Type (Small and Medium Enterprises, Large-scale Organizations), and Geography. The market size and forecasts are provided in terms of value (USD billion) for all the above segments.

  16. News Events Data in Latin America( Techsalerator)

    • datarade.ai
    Updated Mar 20, 2024
    Cite
    Techsalerator (2024). News Events Data in Latin America( Techsalerator) [Dataset]. https://datarade.ai/data-products/news-events-data-in-latin-america-techsalerator-techsalerator
    Explore at:
    Available download formats: .json, .csv, .xls, .txt
    Dataset updated
    Mar 20, 2024
    Dataset provided by
    Techsalerator LLC
    Authors
    Techsalerator
    Area covered
    Americas, Latin America, Aruba, Chile, Falkland Islands (Malvinas), French Guiana, Martinique, Montserrat, Dominican Republic, Ecuador, Cuba, Argentina
    Description

    Techsalerator’s News Event Data in Latin America offers a detailed and extensive dataset designed to provide businesses, analysts, journalists, and researchers with an in-depth view of significant news events across the Latin American region. This dataset captures and categorizes key events reported from a wide array of news sources, including press releases, industry news sites, blogs, and PR platforms, offering valuable insights into regional developments, economic changes, political shifts, and cultural events.

    Key Features of the Dataset:

    Comprehensive Coverage: The dataset aggregates news events from numerous sources such as company press releases, industry news outlets, blogs, PR sites, and traditional news media. This broad coverage ensures a wide range of information from multiple reporting channels.

    Categorization of Events: News events are categorized into various types including business and economic updates, political developments, technological advancements, legal and regulatory changes, and cultural events. This categorization helps users quickly locate and analyze information relevant to their interests or sectors.

    Real-Time Updates: The dataset is updated regularly to include the most recent events, ensuring users have access to the latest news and can stay informed about current developments.

    Geographic Segmentation: Events are tagged with their respective countries and regions within Latin America. This geographic segmentation allows users to filter and analyze news events based on specific locations, facilitating targeted research and analysis.

    Event Details: Each event entry includes comprehensive details such as the date of occurrence, source of the news, a description of the event, and relevant keywords. This thorough detailing helps in understanding the context and significance of each event.

    Historical Data: The dataset includes historical news event data, enabling users to track trends and perform comparative analysis over time. This feature supports longitudinal studies and provides insights into how news events evolve.

    Advanced Search and Filter Options: Users can search and filter news events based on criteria such as date range, event type, location, and keywords. This functionality allows for precise and efficient retrieval of relevant information.

    Latin American Countries Covered:

    • South America: Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Paraguay, Peru, Suriname, Uruguay, Venezuela
    • Central America: Belize, Costa Rica, El Salvador, Guatemala, Honduras, Nicaragua, Panama
    • Caribbean: Cuba, Dominican Republic, Haiti (Note: Primarily French-speaking but included due to geographic and cultural ties), Jamaica, Trinidad and Tobago

    Benefits of the Dataset:

    • Strategic Insights: Businesses and analysts can use the dataset to gain insights into significant regional developments, economic conditions, and political changes, aiding in strategic decision-making and market analysis.
    • Market and Industry Trends: The dataset provides valuable information on industry-specific trends and events, helping users understand market dynamics and emerging opportunities.
    • Media and PR Monitoring: Journalists and PR professionals can track relevant news across Latin America, enabling them to monitor media coverage, identify emerging stories, and manage public relations efforts effectively.
    • Academic and Research Use: Researchers can utilize the dataset for longitudinal studies, trend analysis, and academic research on various topics related to Latin American news and events.

    Techsalerator’s News Event Data in Latin America is a crucial resource for accessing and analyzing significant news events across the region. By providing detailed, categorized, and up-to-date information, it supports effective decision-making, research, and media monitoring across diverse sectors.

  17. Global Data Element Market Research Report: By Data Source (Relational...

    • wiseguyreports.com
    Updated Jul 23, 2024
    Cite
    Wiseguy Research Consultants Pvt Ltd (2024). Global Data Element Market Research Report: By Data Source (Relational Databases, NoSQL Databases, Big Data Platforms, Cloud-based Data Warehouses), By Type (Structured Data, Unstructured Data, Semi-Structured Data), By Format (XML, JSON, CSV, Parquet), By Purpose (Data Analysis, Machine Learning, Data Visualization, Data Governance), By Deployment Model (On-premises, Cloud-based, Hybrid) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/data-element-market
    Explore at:
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Jan 7, 2024
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 7.6 (USD Billion)
    MARKET SIZE 2024: 8.66 (USD Billion)
    MARKET SIZE 2032: 24.7 (USD Billion)
    SEGMENTS COVERED: Data Source, Type, Format, Purpose, Deployment Model, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: AI-driven data element management; Data privacy and regulations; Cloud-based data element platforms; Data sharing and collaboration; Increasing demand for real-time data
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: Informatica, Micro Focus, IBM, SAS, Denodo, Oracle, TIBCO, Talend, SAP
    MARKET FORECAST PERIOD: 2024 - 2032
    KEY MARKET OPPORTUNITIES: 1. Adoption of AI and ML 2. Growing demand for data analytics 3. Increasing cloud adoption 4. Data privacy and security concerns 5. Integration with emerging technologies
    COMPOUND ANNUAL GROWTH RATE (CAGR): 13.99% (2024 - 2032)
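
As a rough cross-check, the CAGR quoted for this entry can be recomputed from the 2024 and 2032 market sizes listed with it. The sketch below assumes eight compounding years (2024 to 2032) and uses the standard compound-growth formula; the function name `cagr` is illustrative and does not come from the report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.14 means 14%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    # Solve start_value * (1 + r) ** years == end_value for r.
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes from the listing, in USD billion: 8.66 (2024) and 24.7 (2032).
rate = cagr(8.66, 24.7, 2032 - 2024)
print(f"Implied CAGR: {rate:.2%}")
```

Under these assumptions the implied rate comes out at roughly 14%, in line with the listed figure. Small differences in the last decimal place are expected because the published market sizes are rounded.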

  18. Data Preparation Tool Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Feb 3, 2025
    Cite
    Pro Market Reports (2025). Data Preparation Tool Market Report [Dataset]. https://www.promarketreports.com/reports/data-preparation-tool-market-18555
    Explore at:
    Available download formats: pdf, ppt, doc
    Dataset updated
    Feb 3, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global data preparation tool market is estimated to be valued at $674.52 million in 2025, with a compound annual growth rate (CAGR) of 16.46% from 2025 to 2033. The rising need to manage and analyze large volumes of complex data from various sources is driving the growth of the market. Additionally, the increasing adoption of cloud-based data management solutions and the growing demand for data-driven decision-making are contributing to the market's expansion.

    Key market trends include the growing adoption of artificial intelligence (AI) and machine learning (ML) technologies for data preparation automation, the increasing use of data visualization tools for data analysis, and the growing popularity of data fabric architectures for data integration and management. The market is segmented by deployment (on-premises, cloud, hybrid), data volume (small data, big data), data type (structured data, unstructured data, semi-structured data), industry vertical (BFSI, healthcare, retail, manufacturing), and use case (data integration, data cleansing, data transformation, data enrichment). North America is the largest regional market, followed by Europe and Asia Pacific. IBM, Collibra, Talend, Microsoft, Informatica, SAP, SAS Institute, and Denodo are some of the key players in the market.

    Key drivers for this market are: cloud-based deployment, AI/ML integration, self-service capabilities, real-time data processing, and data governance and compliance. Potential restraints include: increasing cloud adoption, growing volume of data, advancements in artificial intelligence (AI) and machine learning (ML), stringent regulatory compliance, and rising demand for self-service data preparation.

  19. Comprehensive Median Household Income and Distribution Dataset for Fall...

    • neilsberg.com
    Updated Jan 11, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Comprehensive Median Household Income and Distribution Dataset for Fall River, MA: Analysis by Household Type, Size and Income Brackets [Dataset]. https://www.neilsberg.com/research/datasets/cd9a4b22-b041-11ee-aaca-3860777c1fe6/
    Explore at:
    Dataset updated
    Jan 11, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Fall River, Massachusetts
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the median household income in Fall River. It can be utilized to understand the trend in median household income and to analyze the income distribution in Fall River by household type, size, and across various income brackets.

    Content

    The dataset will have the following datasets when applicable

    Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

    • Fall River, MA Median Household Income Trends (2010-2021, in 2022 inflation-adjusted dollars)
    • Median Household Income Variation by Family Size in Fall River, MA: Comparative analysis across 7 household sizes
    • Income Distribution by Quintile: Mean Household Income in Fall River, MA
    • Fall River, MA households by income brackets: family, non-family, and total, in 2022 inflation-adjusted dollars

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Interested in deeper insights and visual analysis?

    Explore our comprehensive data analysis and visual representations for a deeper understanding of Fall River median household income. You can refer to the same here

  20. Comprehensive Median Household Income and Distribution Dataset for Widener,...

    • neilsberg.com
    Updated Jan 11, 2024
    Cite
    Neilsberg Research (2024). Comprehensive Median Household Income and Distribution Dataset for Widener, AR: Analysis by Household Type, Size and Income Brackets [Dataset]. https://www.neilsberg.com/research/datasets/cdc89507-b041-11ee-aaca-3860777c1fe6/
    Explore at:
    Dataset updated
    Jan 11, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Widener, Arkansas
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the median household income in Widener. It can be utilized to understand the trend in median household income and to analyze the income distribution in Widener by household type, size, and across various income brackets.

    Content

    The dataset will have the following datasets when applicable

    Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

    • Widener, AR Median Household Income Trends (2010-2021, in 2022 inflation-adjusted dollars)
    • Median Household Income Variation by Family Size in Widener, AR: Comparative analysis across 7 household sizes
    • Income Distribution by Quintile: Mean Household Income in Widener, AR
    • Widener, AR households by income brackets: family, non-family, and total, in 2022 inflation-adjusted dollars

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Interested in deeper insights and visual analysis?

    Explore our comprehensive data analysis and visual representations for a deeper understanding of Widener median household income. You can refer to the same here
