8 datasets found
  1. Company Financial Data | Multi-Source Docs | Extraction & Structuring (100+ Languages, 5K Docs/Hour) | Standardized Outputs | Compliance & Analysis

    • datarade.ai
    Updated Feb 14, 2025
    Cite
    Elsai (2025). Company Financial Data | Multi-Source Docs | Extraction & Structuring (100+ Languages, 5K Docs/Hour) | Standardized Outputs | Compliance & Analysis [Dataset]. https://datarade.ai/data-products/company-financial-data-multi-source-docs-extraction-str-elsai
    Explore at:
    Available download formats: .json, .xml, .csv, .sql, .txt
    Dataset updated
    Feb 14, 2025
    Dataset authored and provided by
    Elsai
    Area covered
    Belize, Togo, Åland Islands, Virgin Islands (U.S.), Lao People's Democratic Republic, Guinea-Bissau, El Salvador, Bangladesh, Tonga, Sri Lanka
    Description

    Transform Unstructured Financial Docs into Actionable Insights: Harness proprietary AI models to extract, validate, and standardize financial data from any document format, including scanned images, handwritten notes, and multi-language PDFs. Unlike basic OCR tools, our solution handles complex layouts, merged cells, poor-quality PDFs, and low-quality scans with industry-leading precision.

    Key Features

    1. Universal Format Support: Extract data from scanned PDFs, images (JPEG/PNG), Excel, Word, and handwritten documents.

    2. AI-Driven OCR & LLM Standardization: Convert unstructured text into standardized fields (e.g., "Net Profit" → ISO 20022-compliant tags) and resolve inconsistencies (e.g., "$1M" vs. "1,000,000 USD") using context-aware LLMs; a minimal normalization sketch follows this list.

    3. 100+ Language Coverage: Process financial docs in Arabic, Bulgarian, and more with automated translation.

    4. Up to 99% Accuracy: Triple validation via AI cross-checks, rule-based audits, and human-in-the-loop reviews.

    5. Prebuilt Templates: Auto-detect formats for common documents (e.g., IFRS-compliant P&L statements, IRS tax forms).
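
    The normalization described in item 2 can be illustrated with a minimal, hypothetical sketch (not the vendor's implementation); the symbol table and regex below are assumptions for illustration only:

      # Hypothetical illustration of value standardization; not Elsai's actual pipeline.
      import re

      MULTIPLIERS = {"k": 1e3, "m": 1e6, "b": 1e9}
      SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP"}

      def normalize_amount(raw: str) -> dict:
          """Parse strings like '$1M' or '1,000,000 USD' into a standardized amount + currency."""
          text = raw.strip()
          currency = None
          for symbol, code in SYMBOLS.items():
              if symbol in text:
                  currency = code
                  text = text.replace(symbol, "")
          iso = re.search(r"\b([A-Z]{3})\b", text)      # explicit currency code, e.g. 'USD'
          if iso:
              currency = iso.group(1)
              text = text.replace(iso.group(1), "")
          num = re.search(r"([\d.,]+)\s*([kKmMbB]?)", text)
          value = float(num.group(1).replace(",", ""))
          if num.group(2):
              value *= MULTIPLIERS[num.group(2).lower()]
          return {"amount": value, "currency": currency}

      print(normalize_amount("$1M"))            # {'amount': 1000000.0, 'currency': 'USD'}
      print(normalize_amount("1,000,000 USD"))  # {'amount': 1000000.0, 'currency': 'USD'}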

    Data Sourcing & Output

    Supported Documents: Balance sheets, invoices, tax filings, bank statements, receipts, and more.

    Export Formats: Excel, CSV, JSON, API, PostgreSQL, or direct integration with tools like QuickBooks and SAP.

    Use Cases

    1. Credit Risk Analysis: Automate financial health assessments for loan approvals and vendor analysis.

    2. Audit Compliance: Streamline data aggregation for GAAP/IFRS audits.

    3. Due Diligence: Verify company legitimacy for mergers, investments, acquisitions, or partnerships.

    4. Compliance: Streamline KYC/AML workflows with automated financial checks.

    5. Invoice Processing: Extract vendor payment terms, due dates, and amounts.

    Technical Edge

    1. AI Architecture: A proprietary pipeline combining vision transformers and OCR for layout detection, LLMs for context analysis, and rule-based validation.

    2. Security: SOC 2 compliance and on-premise storage options.

    3. Latency: Process up to 10,000 pages/hour with sub-60-second extractions.

    Pricing & Trials

    Pay-as-you-go (minimum 1,000 docs/month).

    Enterprise: Custom pricing for volume discounts, SLA guarantees, and white-glove onboarding.

    Free Trial Available

  2. Arabic news and public opinion dataset from YouTube

    • data.mendeley.com
    Updated Mar 3, 2025
    Cite
    Hezam Gawbah (2025). Arabic news and public opinion dataset from YouTube [Dataset]. http://doi.org/10.17632/3mnjw5hjkh.2
    Explore at:
    Dataset updated
    Mar 3, 2025
    Authors
    Hezam Gawbah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    • The dataset contains 19,404,395 public comments and replies from 70,000 news videos published by 20 renowned Arabic news YouTube channels; each channel contributes 3,500 videos. Each video record includes 10 properties: video URL, ID, title, likes, views, date of publishing, hashtags, description, number of comments, and comment details (comment time, comment text, likes on the comment, and reply count), providing a comprehensive corpus for analysis.
    • The data is organized in a standardized Excel format, making it easy to access and analyze; each file includes 10 columns and 3,500 records.
    • The final curated datasets are saved in 20 primary folders, one per news channel. Each folder contains two files: a raw data file in Arabic and a file translated into English. Both files contain the video URL, ID, title, likes, views, date of publishing, hashtags, description, number of comments, and comment details (comment time, comment text, likes on the comment, and reply count).
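
    A minimal sketch of loading one channel's file with pandas; the folder/file path and column labels below are illustrative assumptions, since the exact names are not listed here:

      # Minimal sketch, assuming one of the 20 per-channel Excel files described above.
      # Path and column names are placeholders, not the dataset's exact naming.
      import pandas as pd

      df = pd.read_excel("Channel_01/channel_01_english.xlsx")

      # Sanity-check against the documented structure (10 columns, 3,500 video records).
      print(df.shape)             # expected: (3500, 10)
      print(df.columns.tolist())  # video URL, ID, title, likes, views, publish date, hashtags, ...

      # Example: the ten most-viewed videos for this channel (assumes a 'views' column).
      print(df.sort_values("views", ascending=False).head(10))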

  3. Byrd Polar and Climate Research Center Ice Core Paleoclimatology Datasets in a Standardized Excel Format

    • zenodo.org
    bin
    Updated Oct 10, 2023
    Cite
    Austin M. Weber; Austin M. Weber (2023). Byrd Polar and Climate Research Center Ice Core Paleoclimatology Datasets in a Standardized Excel Format [Dataset]. http://doi.org/10.5281/zenodo.8353857
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Austin M. Weber; Austin M. Weber
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All published datasets from the ice core paleoclimatology (ICP) group at the Byrd Polar and Climate Research Center (BPCRC) are archived in the NOAA-NCEI Paleoclimatology Database (https://www.ncei.noaa.gov/access/paleo-search/?dataTypeId=7). However, the formatting of these datasets is not consistent across the archival files, making it difficult to download and aggregate multiple datasets for research purposes. This repository is intended to provide a simple, consistently formatted archive of Excel files containing the published data for more than 16 ice core records collected by the BPCRC-ICP group since the 1980s.

    The file "2023-ByrdICP-datasets.xlsx" contains a column for each ice core location and a list of the sheet names within the corresponding Excel file for that ice core location.
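
    A minimal sketch of using the index workbook with pandas; the location column and the per-core file name below are placeholders, since the actual names vary by ice core:

      # Minimal sketch, assuming the index workbook plus one per-core Excel file from this archive.
      # "Quelccaya" is used here only as a placeholder location/file name.
      import pandas as pd

      # Index: one column per ice core location, listing the sheet names in that core's workbook.
      index = pd.read_excel("2023-ByrdICP-datasets.xlsx")
      print(index.columns.tolist())        # available ice core locations
      print(index["Quelccaya"].dropna())   # sheets in that location's workbook (placeholder column)

      # Read every sheet of one core's workbook into a dict of DataFrames.
      core = pd.read_excel("Quelccaya.xlsx", sheet_name=None)   # placeholder file name
      for sheet, frame in core.items():
          print(sheet, frame.shape)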

  4. Geospatial Data from the Alpine Treeline Warming Experiment (ATWE) on Niwot Ridge, Colorado, USA

    • osti.gov
    • data.ess-dive.lbl.gov
    • +2 more
    Updated Jan 1, 2021
    Cite
    Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) (United States) (2021). Geospatial Data from the Alpine Treeline Warming Experiment (ATWE) on Niwot Ridge, Colorado, USA [Dataset]. http://doi.org/10.15485/1804896
    Explore at:
    Dataset updated
    Jan 1, 2021
    Dataset provided by
    Office of Science (http://www.er.doe.gov/)
    Department of Energy Biological and Environmental Research Program
    Subalpine and Alpine Species Range Shifts with Climate Change: Temperature and Soil Moisture Manipulations to Test Species and Population Responses (Alpine Treeline Warming Experiment)
    Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) (United States)
    Area covered
    United States, Colorado, Niwot Ridge
    Description

    This is a collection of all GPS- and computer-generated geospatial data specific to the Alpine Treeline Warming Experiment (ATWE), located on Niwot Ridge, Colorado, USA. The experiment ran between 2008 and 2016, and consisted of three sites spread across an elevation gradient. Geospatial data for all three experimental sites and cone/seed collection locations are included in this package.

    Geospatial files include cone collection, experimental site, seed trap, and other GPS location/terrain data. File types include ESRI shapefiles, ESRI grid files (Arc/Info binary grids), TIFFs (.tif), and keyhole markup language (.kml) files. Trimble-imported data include plain text files (.txt), Trimble COR files, and Trimble SSF (Standard Storage Format) files. Microsoft Excel (.xlsx) and comma-separated values (.csv) files corresponding to the attribute tables of many files within this package are also included. A complete list of files can be found in the “Data File Organization” section of the included Data User's Guide.

    Maps are also included in this data package for reference and use. These maps are separated into two categories: 2021 maps, and legacy maps made in 2010. Each 2021 map has one copy in portable network graphics (.png) format and one in .pdf format; all legacy maps are in .pdf format. The .png image files can be opened with any compatible program, such as Preview (macOS) or Photos (Windows).

    All GIS files were imported into GeoPackages (.gpkg) using QGIS and double-checked for compatibility and data/attribute integrity using ESRI ArcGIS Pro. Note that files packaged within GeoPackages will open in ArcGIS Pro with “main.” preceding each file name, and with an extra column named “geom” defining the geometry type in the attribute table. The contents of each geospatial file remain intact unless otherwise stated in “niwot_geospatial_data_list_07012021.pdf/.xlsx”; this list of files is included in the archive as both an .xlsx and a .pdf.

    Because GeoPackage is an open-source format, the files within the .gpkg archives (TIFFs, shapefiles, ESRI grid/Arc-Info binary grids) can be read using QGIS, ArcGIS Pro, and other geospatial software. Text and .csv files can be read with TextEdit, Notepad, or any simple text editor; .csv files can also be opened in Microsoft Excel or R. The .kml files can be opened using Google Maps or Google Earth, and Trimble files are most compatible with Trimble's GPS Pathfinder Office software. The .xlsx files can be opened in Microsoft Excel, and PDFs can be opened with Adobe Acrobat Reader or other compatible programs.

    A selection of original shapefiles within this archive were generated using ArcMap with associated FGDC-standardized metadata (XML format). We are including these original files because they contain metadata that is currently only accessible using ESRI programs, and so that the relationship between shapefiles and XML files is maintained. Individual XML files can be opened (without a GIS-specific program) using TextEdit or Notepad. Since ESRI's compatibility with FGDC metadata has changed since these files were generated, many shapefiles will require upgrading to be compatible with ESRI's latest versions of geospatial software. These details are also noted in the “niwot_geospatial_data_list_07012021” file.
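
    For programmatic access outside QGIS or ArcGIS Pro, a minimal sketch using fiona and geopandas (the GeoPackage file name below is a placeholder; see the data list for actual names):

      # Minimal sketch, assuming one of the GeoPackages in this archive; the file name is a placeholder.
      import fiona
      import geopandas as gpd

      gpkg = "atwe_site_data.gpkg"   # placeholder; see niwot_geospatial_data_list_07012021

      # List the vector layers stored in the GeoPackage.
      layers = fiona.listlayers(gpkg)
      print(layers)

      # Read one layer; note the "geom" geometry column mentioned in the description above.
      gdf = gpd.read_file(gpkg, layer=layers[0])
      print(gdf.crs)
      print(gdf.head())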

  5. Standardization in Quantitative Imaging: A Multi-center Comparison of Radiomic Feature Values

    • cancerimagingarchive.net
    n/a, nifti and zip +1
    Cite
    The Cancer Imaging Archive, Standardization in Quantitative Imaging: A Multi-center Comparison of Radiomic Feature Values [Dataset]. http://doi.org/10.7937/tcia.2020.9era-gg29
    Explore at:
    Available download formats: xlsx, n/a, nifti and zip
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Jun 9, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    This dataset was used by the NCI's Quantitative Imaging Network (QIN) PET-CT Subgroup for their project titled: Multi-center Comparison of Radiomic Features from Different Software Packages on Digital Reference Objects and Patient Datasets. The purpose of this project was to assess the agreement among radiomic features when computed by several groups using different software packages under very tightly controlled conditions, which included common image data sets and standardized feature definitions. The image datasets (and Volumes of Interest, VOIs) provided here are the same ones used in that project and reported in the publication listed below (ISSN 2379-1381, https://doi.org/10.18383/j.tom.2019.00031). In addition, we have provided detailed information about the software packages used (Table 1 in that publication) as well as the individual feature value results for each image dataset and each software package that were used to create the summary tables (Tables 2, 3, and 4) in that publication. For that project, nine common quantitative imaging features were selected for comparison, including features that describe morphology, intensity, shape, and texture, as described in detail in the Image Biomarker Standardisation Initiative (IBSI, https://arxiv.org/abs/1612.07003) and its publication (Zwanenburg A, Vallières M, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology. 2020 May;295(2):328-338. doi: https://doi.org/10.1148/radiol.2020191145). There are three datasets provided: two image datasets and one dataset consisting of four Excel spreadsheets containing feature values.

    1. The first image dataset is a set of three Digital Reference Objects (DROs) used in the project, which are: (a) a sphere with uniform intensity, (b) a sphere with intensity variation (c) a nonspherical (but mathematically defined) object with uniform intensity. These DROs were created by the team at Stanford University and are described in (Jaggi A, Mattonen SA, McNitt-Gray M, Napel S. Stanford DRO Toolkit: digital reference objects for standardization of radiomic features. Tomography. 2019;6:–.) and are a subset of the DROs described in DRO Toolkit. Each DRO is represented in both DICOM and NIfTI format and the VOI was provided in each format as well (DICOM Segmentation Object (DSO) as well as NIfTI segmentation boundary).
    2. The second image dataset is the set of 10 patient CT scans, originating from the LIDC-IDRI dataset, that were used in the QIN multi-site collection of Lung CT data with Nodule Segmentations project ( https://doi.org/10.7937/K9/TCIA.2015.1BUVFJR7 ). In that QIN study, a single lesion from each case was identified for analysis and then nine VOIs were generated using three repeat runs of three segmentation algorithms (one from each of three academic institutions) on each lesion. To eliminate one source of variability in our project, only one of the VOIs previously created for each lesion was identified and all sites used that same VOI definition. The specific VOI chosen for each lesion was the first run of the first algorithm (algorithm 1, run 1). DICOM images were provided for each dataset and the VOI was provided in both DICOM Segmentation Object (DSO) and NIfTI segmentation formats.
    3. The third dataset is a collection of four Excel spreadsheets, each of which contains detailed information corresponding to one of the four tables in the publication, including the raw feature values and the summary tables for Tables 2, 3, and 4 reported in the cited publication (https://doi.org/10.18383/j.tom.2019.00031). These tables are (a brief loading sketch follows the list):
      • Software Package details: Detailed information about the software packages used in the study (listed in Table 1 of the publication), including version numbers and any parameters specified in the calculation of the reported features.
      • DRO results: The original feature values obtained by each software package for each DRO, as well as the table summarizing results across software packages (Table 2 in the publication).
      • Patient Dataset results: The original feature values obtained by each software package for each patient dataset (one lesion per case), as well as the table summarizing results across software packages and patient datasets (Table 3 in the publication).
      • Harmonized GLCM Entropy Results: The values of the “Harmonized” GLCM Entropy feature for each patient dataset and each software package, as well as the summary across software packages (Table 4 in the publication).
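
    A minimal sketch of comparing feature values across software packages from one of the spreadsheets; the file, sheet, and column names are assumptions for illustration, not the archive's exact labels:

      # Minimal sketch, assuming a results spreadsheet with one row per (package, object, feature) value.
      import pandas as pd

      dro = pd.read_excel("DRO_results.xlsx", sheet_name=0)   # placeholder file name

      # Pivot so each software package becomes a column, enabling side-by-side comparison.
      wide = dro.pivot_table(index=["dro_object", "feature"],
                             columns="software_package",
                             values="value")
      print(wide)

      # Spread of each feature across packages as a rough agreement check.
      print((wide.max(axis=1) - wide.min(axis=1)).sort_values(ascending=False).head())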

  6. Relaxed Naïve Bayes Data

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Relaxed Naïve Bayes Team (2023). Relaxed Naïve Bayes Data [Dataset]. http://doi.org/10.7910/DVN/7KNKLL
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Relaxed Naïve Bayes Team
    Description

    NaiveBayes_R.xlsx: This Excel file shows how the probabilities of observed features given recidivism (P(x_ij|R)) are calculated in the training data. Each cell is embedded with an Excel function to render the appropriate figures. Its tabs are:
      • P(Xi|R): Probabilities of feature attributes among recidivated offenders.
      • NIJ_Recoded: Re-coded NIJ recidivism challenge data following our coding schema described in Table 1.
      • Recidivated_Train: Re-coded features of recidivated offenders.
      • Tabs from [Gender] through [Condition_Other]: Probabilities of feature attributes given recidivism. We use these conditional probabilities to replace the raw values of each feature in the P(Xi|R) tab.

    NaiveBayes_NR.xlsx: This Excel file shows how the probabilities of observed features given non-recidivism (P(x_ij|N)) are calculated in the training data. Each cell is embedded with an Excel function to render the appropriate figures. Its tabs are:
      • P(Xi|N): Probabilities of feature attributes among non-recidivated offenders.
      • NIJ_Recoded: Re-coded NIJ recidivism challenge data following our coding schema described in Table 1.
      • NonRecidivated_Train: Re-coded features of non-recidivated offenders.
      • Tabs from [Gender] through [Condition_Other]: Probabilities of feature attributes given non-recidivism. We use these conditional probabilities to replace the raw values of each feature in the P(Xi|N) tab.

    Training_LnTransformed.xlsx: Figures in each cell are log-transformed ratios of the probabilities in NaiveBayes_R.xlsx (P(Xi|R)) to the probabilities in NaiveBayes_NR.xlsx (P(Xi|N)); see the sketch following this list.

    TestData.xlsx: This Excel file includes the following tabs based on the test data: P(Xi|R), P(Xi|N), NIJ_Recoded, and Test_LnTransformed (log-transformed P(Xi|R)/P(Xi|N)).

    Training_LnTransformed.dta: Training_LnTransformed.xlsx converted to a Stata data set using the Stat/Transfer 13 software package.

    StataLog.smcl: Results of the logistic regression analysis. Both the estimated intercept and the coefficient estimates in this Stata log correspond to the raw weights and standardized weights in Figure 1.

    Brier Score_Re-Check.xlsx: This Excel file recalculates the Brier scores of the Relaxed Naïve Bayes Classifier in Table 3, showing that the results displayed in Table 3 are correct.

    Full list: NaiveBayes_R.xlsx; NaiveBayes_NR.xlsx; Training_LnTransformed.xlsx; TestData.xlsx; Training_LnTransformed.dta; StataLog.smcl; Brier Score_Re-Check.xlsx; Data for Weka (Training Set): Bayes_2022_NoID; Data for Weka (Test Set): BayesTest_2022_NoID; Weka output for machine learning models (conventional naïve Bayes, AdaBoost, Multilayer Perceptron, Logistic Regression, and Random Forest).
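
    The log-transformed ratio behind Training_LnTransformed.xlsx and Test_LnTransformed is the per-feature naive Bayes log-likelihood ratio, ln(P(x_i|R) / P(x_i|N)). A minimal sketch of that transform in Python (the feature-column layout is an assumption; the workbook and tab names are as described above):

      # Minimal sketch of the log-likelihood-ratio transform: for each feature value x_i,
      # ln( P(x_i|R) / P(x_i|N) ), then summed across features per offender.
      # Assumes the probability tabs share identically named numeric feature columns.
      import numpy as np
      import pandas as pd

      p_r = pd.read_excel("NaiveBayes_R.xlsx", sheet_name="P(Xi|R)")    # P(x_i | recidivism)
      p_n = pd.read_excel("NaiveBayes_NR.xlsx", sheet_name="P(Xi|N)")   # P(x_i | non-recidivism)

      feature_cols = [c for c in p_r.select_dtypes("number").columns if c in p_n.columns]

      # Elementwise log ratio, mirroring Training_LnTransformed.xlsx.
      ln_ratio = np.log(p_r[feature_cols] / p_n[feature_cols])

      # Summing per-feature log ratios gives each offender's naive Bayes evidence score;
      # these transformed features feed the logistic regression reported in StataLog.smcl.
      score = ln_ratio.sum(axis=1)
      print(score.describe())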

  7. A database of non-aqueous proton conductors

    • zenodo.org
    bin, tsv
    Updated Mar 10, 2025
    Cite
    Harrison J. Cassady; Harrison J. Cassady; Emeline Martin; Yifan Liu; Yifan Liu; Debjyoti Bhattacharya; Debjyoti Bhattacharya; Maria F. Rochow; Maria F. Rochow; Brock A. Dyer; Brock A. Dyer; Wesley F. Reinhart; Wesley F. Reinhart; Valentino R. Cooper; Valentino R. Cooper; Michael A. Hickner; Michael A. Hickner; Emeline Martin (2025). A database of non-aqueous proton conductors [Dataset]. http://doi.org/10.5281/zenodo.14853828
    Explore at:
    Available download formats: tsv, bin
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Harrison J. Cassady; Harrison J. Cassady; Emeline Martin; Yifan Liu; Yifan Liu; Debjyoti Bhattacharya; Debjyoti Bhattacharya; Maria F. Rochow; Maria F. Rochow; Brock A. Dyer; Brock A. Dyer; Wesley F. Reinhart; Wesley F. Reinhart; Valentino R. Cooper; Valentino R. Cooper; Michael A. Hickner; Michael A. Hickner; Emeline Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset supports the publication titled "A database of non-aqueous proton conducting materials," which compiles experimental data on non-aqueous proton conductors from 48 peer-reviewed papers. The dataset encompasses 74 distinct compounds, yielding a total of 3152 data points that cover a broad temperature range from −70°C to 260°C.

    Contents of the Dataset:

    1. Chemical Structures: Molecules are encoded using SMILES (Simplified Molecular-Input Line-Entry System) for easy parsing and compatibility with cheminformatics tools.

    2. Experimental Data: The dataset includes proton conductivity and proton diffusion coefficients. Parameters are reported for both doped and undoped systems, with doping levels explicitly quantified.

    3. File Formats:

      • Raw Data: An Excel spreadsheet (.xlsx) with two sheets (“Compounds” and “Parameters”) containing original data as extracted from the papers.

      • Cleaned Data: Two tab-separated values (.tsv) files, one for conductivity and one for diffusion coefficients, standardized for easy integration into machine learning models; a minimal loading sketch follows.
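
    A minimal sketch of loading the cleaned conductivity file with pandas; the file name and column labels are illustrative assumptions, since only the formats are listed here:

      # Minimal sketch, assuming the cleaned conductivity .tsv; the file and column names
      # ('smiles', 'temperature_C', 'conductivity_S_cm') are placeholders for illustration.
      import pandas as pd

      cond = pd.read_csv("conductivity_clean.tsv", sep="\t")
      print(cond.shape)   # the full database spans 3152 data points from -70 degC to 260 degC

      # Example query: measurements above 100 degC, summarized per compound (SMILES string).
      hot = cond[cond["temperature_C"] > 100]
      print(hot.groupby("smiles")["conductivity_S_cm"].median().sort_values(ascending=False).head())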

  8. Municipal Financial and Statistical Data

    • data.wu.ac.at
    Updated Jun 27, 2018
    + more versions
    Cite
    Government of Alberta | Gouvernement de l'Alberta (2018). Municipal Financial and Statistical Data [Dataset]. https://data.wu.ac.at/schema/www_data_gc_ca/ODgzOGYwZTEtYzY1ZS00ZTc4LWFiMmEtNjljYzgwZDhkOWQz
    Explore at:
    Dataset updated
    Jun 27, 2018
    Dataset provided by
    Government of Alberta | Gouvernement de l'Alberta
    License

    http://open.alberta.ca/licence

    Description

    Municipal Financial and Statistical Data includes the information submitted annually by all Alberta municipalities via the financial information return (FIR) and statistical information return (SIR). Information is available in both Excel and zipped formats. The FIR is a standardized summary of the information contained in the annual audited financial statements of each municipality, including assets, liabilities, revenue, expenses, long-term debt, and property taxes. The municipal data is converted to Excel for analysis purposes. The SIR provides basic municipal statistics, including population, assessment, and tax rate information.
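
    A minimal sketch of loading one year's FIR workbook with pandas; the file name, sheet, and column labels are placeholders, since the published layout is not described here:

      # Minimal sketch, assuming a downloaded FIR Excel workbook; names are placeholders only.
      import pandas as pd

      fir = pd.read_excel("FIR_2018.xlsx", sheet_name=None)   # dict of DataFrames, one per sheet
      print(list(fir))                                        # sheet names

      # Example: total long-term debt by municipality from a hypothetical summary sheet.
      summary = fir["Summary"]
      print(summary.groupby("Municipality")["Long Term Debt"].sum()
                   .sort_values(ascending=False).head())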

