6 datasets found
  1. Data from: An Automatic Method to Extract Patent Citations from Google

    • figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Kayvan Kousha; Mike Thelwall (2023). An Automatic Method to Extract Patent Citations from Google [Dataset]. http://doi.org/10.6084/m9.figshare.1418234.v1
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Kayvan Kousha; Mike Thelwall
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Appendices A1-A16 report the top 10 most highly cited articles in Google Patents, as found through Bing API searches, in each of the sixteen selected fields. They show that a minority of articles have many patent citations but few or no Scopus citations.

  2. Data from: PatCit: A Comprehensive Dataset of Patent Citations

    • search.datacite.org
    Updated Dec 23, 2020
    Cite
    Cyril Verluise; Gabriele Cristelli; Kyle Higham; Lucas Violon; Gaétan De Rassenfosse (2020). PatCit: A Comprehensive Dataset of Patent Citations [Dataset]. http://doi.org/10.5281/zenodo.4391095
    Dataset updated
    Dec 23, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    DataCite (https://www.datacite.org/)
    Authors
    Cyril Verluise; Gabriele Cristelli; Kyle Higham; Lucas Violon; Gaétan De Rassenfosse
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    patCit: A Comprehensive Dataset of Patent Citations

    Patents are at the crossroads of many innovation nodes: science, industry, products, competition, etc. Such interactions can be identified through citations in a broad sense. It is now common to use front-page patent citations to study some aspects of the innovation system. However, there is much more buried in the Non Patent Literature (NPL) citations and in the patent text itself. patCit extracts and structures these citations. Want to know more? Read the patCit academic presentation or dive into the usage and technical guides on the patCit documentation website.

    In practice

    At patCit, we are building a comprehensive dataset of patent citations to help the community explore this terra incognita. patCit has the following features:
    • global coverage
    • front-page and in-text citations
    • all categories of NPL documents

    Front-page: patCit builds on DOCDB, the largest database of Non Patent Literature (NPL) citations. First, we deduplicate this corpus and organize it into 10 categories (bibliographical reference, database, norm & standard, etc.). Then, we design and apply category-specific information extraction models using spaCy. Finally, when possible, we enrich the data using external domain-specific high-quality databases (e.g. Crossref for bibliographical references).

    In-text: patCit builds on the Google Patents corpus of USPTO full-text patents. First, we extract patent and bibliographical reference citations. Then, we parse detected in-text citations into a series of category-dependent attributes using grobid. Patent citations are matched with a standard publication number using the Google Patents matching API, and bibliographical references are matched with a DOI using biblio-glutton. Finally, when possible, we enrich the data using external domain-specific high-quality databases (e.g. Crossref for bibliographical references).

    FAIR

    Find - The patCit dataset is available on BigQuery in an interactive environment. For those who have a smattering of SQL, this is the perfect place to explore the data. It can also be downloaded from Zenodo.

    Interoperate - Interoperability is at the core of patCit's ambition. We take care to extract unique identifiers whenever possible, to enable data enrichment from domain-specific high-quality databases. This includes the DOI, PMID and PMCID for bibliographical references, the Technical Doc Number for standards, the Accession Number for genetic databases, the publication number for PATSTAT and Claims, etc. See the specific table for more details.

    Reproduce - Our GitHub repository is the project factory. You can learn more about data recipes and models on the patCit documentation website.

  3. USPTO Patent Examiner Data System (PEDS) API Data

    • console.cloud.google.com
    Updated Aug 10, 2023
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:Google%20Patents%20Public%20Datasets&hl=en-GB (2023). USPTO Patent Examiner Data System (PEDS) API Data [Dataset]. https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/uspto-peds?hl=en-GB
    Dataset updated
    Aug 10, 2023
    Dataset provided by
    Google (http://google.com/)
    Description

    USPTO Patent Examiner Data System (PEDS) API Data contains data from the examination process of USPTO patent applications. PEDS contains the bibliographic, published document and patent term extension data tabs in Public PAIR from 1981 to present. There is also some data dating back to 1935.

  4. USPTO Patent Trial and Appeal Board (PTAB) API Data

    • console.cloud.google.com
    Updated Apr 30, 2021
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:Google%20Patents%20Public%20Datasets (2021). USPTO Patent Trial and Appeal Board (PTAB) API Data [Dataset]. https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/uspto-ptab
    Dataset updated
    Apr 30, 2021
    Dataset provided by
    Google (http://google.com/)
    Description

    USPTO Patent Trial and Appeal Board (PTAB) API Data contains data from the PTAB E2E (end-to-end) system, which makes America Invents Act (AIA) trial information and documents publicly available.

  5. Data from: Reliance on Science in Patenting

    • explore.openaire.eu
    • zenodo.org
    Updated Oct 13, 2020
    Cite
    Matt Marx; Aaron Fuegi (2020). Reliance on Science in Patenting [Dataset]. http://doi.org/10.5281/zenodo.3236339
    Dataset updated
    Oct 13, 2020
    Authors
    Matt Marx; Aaron Fuegi
    Description

    This dataset contains citations from USPTO patents granted 1947-2018 to articles captured by the Microsoft Academic Graph (MAG) from 1800-2018. If you use the data, please cite these two papers:
    • For the dataset of citations: Marx, Matt and Aaron Fuegi, "Reliance on Science in Patenting: USPTO Front-Page Citations to Scientific Articles" (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331686).
    • For the underlying dataset of papers: Sinha, Arnab, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15 Companion). ACM, New York, NY, USA, 243-246.

    The main file, pcs.tsv, contains the resolved citations. Fields are tab-separated. Each match has the patent number, the MAG ID, the original citation from the patent, an indicator for whether the citation was supplied by the applicant, the examiner, or an unknown source, and a confidence score (1-10) indicating how likely the match is to be correct. Note that this distribution does not contain matches with confidence 2 or 1. There is also a PubMed-specific match in pcs-pubmed.tsv.

    The remaining files are a redistribution of the 1 January 2019 release of the Microsoft Academic Graph. All of these files are compressed using ZIP compression under CentOS5. Original files, documented at https://docs.microsoft.com/en-us/academic-services/graph/reference-data-schema, can be downloaded from https://aka.ms/msracad; this redistribution carves up the original files into smaller, variable-specific files that can be loaded individually (see _relianceonscience.pdf for full details).

    Source code for generating the patent citations to science in pcs.tsv is available at https://github.com/mattmarx/reliance_on_science. Source code for generating jif.zip and jcif.zip (Journal Impact Factor and Journal Commercial Impact Factor) is at https://github.com/mattmarx/jcif.

    Although MAG contains authors and affiliations for each paper, it does not contain the location for affiliations. We have created a dataset of locations for affiliations appearing at least 100 times using Bing Maps and Google Maps; however, it is unclear to us whether the API licensing terms allow us to repost their data. In any case, you can download our source code for doing so here: https://github.com/ksjiaxian/api-requester-locations.

    MAG extracts field keywords for each paper (paperfieldid.zip and fieldidname.zip), more than 200,000 fields in all. When looking to study industries or technical areas, you might find this a bit overwhelming. We mapped the MAG subjects to six OECD fields and 39 subfields, defined here: http://www.oecd.org/science/inno/38235147.pdf. Clarivate provides a crosswalk between the OECD classifications and Web of Science fields, so we include WoS fields as well. This file is magfield_oecd_wos_crosswalk.zip.
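
    As an illustration of the pcs.tsv layout described above, here is a minimal Python sketch that reads tab-separated citation matches and keeps those at or above a chosen confidence. The column order (patent number, MAG ID, raw citation, source indicator, confidence) simply follows the order in which the description lists the fields and is an assumption; the real file may order its columns differently or carry a header row, so check _relianceonscience.pdf before relying on it. The names `PatentCitation`, `read_citations`, and the two sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class PatentCitation(NamedTuple):
    patent: str        # citing patent number
    mag_id: str        # Microsoft Academic Graph paper ID
    raw_citation: str  # original citation text from the patent
    source: str        # who supplied it: applicant, examiner, or unknown
    confidence: int    # match confidence on the 1-10 scale


def read_citations(handle: TextIO, min_confidence: int = 3) -> Iterator[PatentCitation]:
    """Yield matches with confidence >= min_confidence.

    ASSUMPTION: five tab-separated columns in the order of PatentCitation.
    Rows that do not fit (for example a header line) are skipped.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 5 or not row[4].strip().isdigit():
            continue
        record = PatentCitation(row[0], row[1], row[2], row[3], int(row[4]))
        if record.confidence >= min_confidence:
            yield record


# Hypothetical sample rows, invented for illustration only.
SAMPLE = (
    "US1234567\t111\tSmith J. (2001) Nature\tapplicant\t10\n"
    "US7654321\t222\tDoe A. (1999) Science\texaminer\t4\n"
)

high_confidence = list(read_citations(io.StringIO(SAMPLE), min_confidence=5))
for match in high_confidence:
    print(match.patent, match.mag_id, match.confidence)
```

    To run this against the real file, pass an open file handle instead of the in-memory sample, for example `open("pcs.tsv", newline="", encoding="utf-8")`. Streaming row by row avoids loading the whole file into memory.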

  6. Data from: Recognizing Figure Labels in Patents [AAAI 2021 SDU Workshop]

    • laro.lanl.gov
    • figshare.com
    Updated Dec 19, 2020
    Cite
    Jian Wu; Diane Oyen; Ming Gong; A A BB (2020). Recognizing Figure Labels in Patents [AAAI 2021 SDU Workshop] [Dataset]. https://laro.lanl.gov/esploro/outputs/dataset/Recognizing-Figure-Labels-in-Patents-AAAI/9916370739103761
    Dataset updated
    Dec 19, 2020
    Dataset provided by
    figshare
    Authors
    Jian Wu; Diane Oyen; Ming Gong; A A BB
    Time period covered
    Dec 19, 2020
    Description
    • 100patents_design_original_png.zip: 100 figures extracted from 100 US DESIGN patents, original (unrotated), PNG format
    • 100patents_design_original-tif.zip: 100 figures extracted from 100 US DESIGN patents, original (unrotated), TIF format
    • 100patents_design_rotated-tif.zip: 100 figures extracted from 100 US DESIGN patents, rotated to upright if needed, TIF format
    • 100patents_design_rotated-png.zip: 100 figures extracted from 100 US DESIGN patents, rotated to upright if needed, PNG format
    • all_results_labels.xml: ground truth labels and labels extracted by 8 tools: SWT, Adobe Acrobat, EAST, Amazon Textract, Tesseract, Google Vision API, Abbyy, and the alpha-shape method.

    For details, see the following paper: Gong Ming, Xin Wei, Diane Oyen, Jian Wu, Martin Gryder, and Liping Yang. "Recognizing Figure Labels in Patents." In: AAAI-2021 workshop on Scientific Document Understanding (SDU). Virtual Event.