Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Appendices A1-A16 report the top 10 most highly cited articles in Google Patents from the Bing API search in the sixteen selected fields. They show that a minority of articles have many patent citations but few or no Scopus citations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
patCit: A Comprehensive Dataset of Patent Citations [Newsletter, GitHub]

Patents are at the crossroads of many innovation nodes: science, industry, products, competition, etc. Such interactions can be identified through citations in a broad sense. It is now common to use front-page patent citations to study some aspects of the innovation system. However, there is much more buried in the Non Patent Literature (NPL) citations and in the patent text itself. patCit extracts and structures these citations. Want to know more? Read the patCit academic presentation or dive into the usage and technical guides on the patCit documentation website.

IN PRACTICE

At patCit, we are building a comprehensive dataset of patent citations to help the community explore this terra incognita. patCit has the following features: global coverage; front-page and in-text citations; all categories of NPL documents.

Front-page: patCit builds on DOCDB, the largest database of Non Patent Literature (NPL) citations. First, we deduplicate this corpus and organize it into 10 categories (bibliographical reference, database, norm & standard, etc.). Then, we design and apply category-specific information extraction models using spaCy. Eventually, when possible, we enrich the data using external domain-specific, high-quality databases (e.g. Crossref for bibliographical references).

In-text: patCit builds on the Google Patents corpus of USPTO full-text patents. First, we extract patent and bibliographical reference citations. Then, we parse detected in-text citations into a series of category-dependent attributes using GROBID. Patent citations are matched with a standard publication number using the Google Patents matching API, and bibliographical references are matched with a DOI using biblio-glutton. Eventually, when possible, we enrich the data using external domain-specific, high-quality databases (e.g. Crossref for bibliographical references).

FAIR

Find - The patCit dataset is available on BigQuery in an interactive environment. For those who have a smattering of SQL, this is the perfect place to explore the data. It can also be downloaded from Zenodo.

Interoperate - Interoperability is at the core of patCit's ambition. We take care to extract unique identifiers whenever possible to enable data enrichment from domain-specific, high-quality databases. This includes the DOI, PMID and PMCID for bibliographical references, the Technical Doc Number for standards, the Accession Number for genetic databases, the publication number for PATSTAT and Claims, etc. See the specific tables for more details.

Reproduce - Our GitHub repository is the project factory. You can learn more about data recipes and models on the patCit documentation website.
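For readers who want to try the BigQuery route described above, here is a minimal sketch using the Python BigQuery client. The fully qualified table name and the column names are placeholder assumptions; the actual identifiers are listed on the patCit documentation website.

# Minimal sketch: querying a patCit table on BigQuery with the Python client.
# Requires: pip install google-cloud-bigquery, plus Google Cloud credentials.
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table and column names; replace with those given in the patCit docs.
TABLE = "your-project.patcit.frontpage_bibliographical_reference"

query = f"""
SELECT DOI, title
FROM `{TABLE}`
WHERE DOI IS NOT NULL
LIMIT 10
"""

for row in client.query(query).result():
    print(dict(row))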
The USPTO Patent Examiner Data System (PEDS) API provides data from the examination process of USPTO patent applications. PEDS contains the bibliographic, published document, and patent term extension data tabs in Public PAIR from 1981 to the present. There is also some data dating back to 1935.
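A minimal sketch of how one might query PEDS programmatically from Python. The endpoint URL, payload fields, and the example application number are assumptions to be checked against the USPTO developer hub documentation; this is not the authoritative request format.

# Minimal sketch: searching PEDS for a single application (assumed endpoint and fields).
import requests

PEDS_URL = "https://ped.uspto.gov/api/queries"  # assumed endpoint; verify on the USPTO developer hub

payload = {
    "searchText": "applId:(14412875)",  # hypothetical example application ID
    "start": 0,
    "rows": 20,
}

resp = requests.post(PEDS_URL, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())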
The USPTO Patent Trial and Appeal Board (PTAB) API provides data from the PTAB E2E (end-to-end) system, making America Invents Act (AIA) trial information and documents publicly available.
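A minimal sketch of retrieving AIA trial proceedings from Python. The endpoint, query parameters, and response fields are assumptions; consult the current PTAB API specification on the USPTO developer hub before relying on them.

# Minimal sketch: paging through PTAB proceedings (assumed endpoint and parameters).
import requests

PTAB_URL = "https://developer.uspto.gov/ptab-api/proceedings"  # assumed endpoint

params = {
    "recordStartNumber": 0,     # hypothetical paging parameter
    "recordTotalQuantity": 25,  # hypothetical page size parameter
}

resp = requests.get(PTAB_URL, params=params, timeout=30)
resp.raise_for_status()
for proceeding in resp.json().get("results", []):
    print(proceeding.get("proceedingNumber"))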
This dataset contains citations from USPTO patents granted 1947-2018 to articles captured by the Microsoft Academic Graph (MAG) from 1800-2018. If you use the data, please cite these two papers:

For the dataset of citations: Marx, Matt and Aaron Fuegi, "Reliance on Science in Patenting: USPTO Front-Page Citations to Scientific Articles" (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331686).

For the underlying dataset of papers: Sinha, Arnab, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). ACM, New York, NY, USA, 243-246.

The main file, pcs.tsv, contains the resolved citations. Fields are tab-separated. Each match has the patent number, MAG ID, the original citation from the patent, an indicator for whether the citation was supplied by the applicant, examiner, or unknown, and a confidence score (1-10) indicating how likely the match is correct. Note that this distribution does not contain matches with confidence 1 or 2. There is also a PubMed-specific match in pcs-pubmed.tsv.

The remaining files are a redistribution of the 1 January 2019 release of the Microsoft Academic Graph. All of these files are compressed using ZIP compression under CentOS 5. The original files, documented at https://docs.microsoft.com/en-us/academic-services/graph/reference-data-schema, can be downloaded from https://aka.ms/msracad; this redistribution carves up the original files into smaller, variable-specific files that can be loaded individually (see _relianceonscience.pdf for full details).

Source code for generating the patent citations to science in pcs.tsv is available at https://github.com/mattmarx/reliance_on_science. Source code for generating jif.zip and jcif.zip (Journal Impact Factor and Journal Commercial Impact Factor) is at https://github.com/mattmarx/jcif.

Although MAG contains authors and affiliations for each paper, it does not contain the locations of affiliations. We have created a dataset of locations for affiliations appearing at least 100 times using Bing Maps and Google Maps; however, it is unclear to us whether the API licensing terms allow us to repost their data. In any case, you can download our source code for doing so here: https://github.com/ksjiaxian/api-requester-locations.

MAG extracts field keywords for each paper (paperfieldid.zip and fieldidname.zip) -- more than 200,000 fields in all! When looking to study industries or technical areas, you might find this a bit overwhelming. We mapped the MAG subjects to six OECD fields and 39 subfields, defined here: http://www.oecd.org/science/inno/38235147.pdf. Clarivate provides a crosswalk between the OECD classifications and Web of Science fields, so we include WoS fields as well. This file is magfield_oecd_wos_crosswalk.zip.
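As a starting point for working with pcs.tsv, here is a minimal pandas sketch that loads the tab-separated matches and keeps only high-confidence ones. The column names are illustrative assumptions based on the field description above; the authoritative layout is documented in _relianceonscience.pdf.

# Minimal sketch: loading the resolved patent-to-paper citations in pcs.tsv with pandas.
import pandas as pd

# Assumed column names for illustration only; check _relianceonscience.pdf for the real layout.
cols = ["patent", "magid", "rawcitation", "wherefound", "confscore"]

pcs = pd.read_csv(
    "pcs.tsv",
    sep="\t",
    names=cols,  # drop `names=` if the distributed file already ships with a header row
    dtype={"patent": str, "magid": str},
)

# Keep only high-confidence matches, e.g. scores of 8 or above on the 1-10 scale.
high_conf = pcs[pcs["confscore"] >= 8]
print(high_conf.head())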