13 datasets found

h
sql-create-context-copy
huggingface.co
Updated Apr 21, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Philipp Schmid (2023). sql-create-context-copy [Dataset]. https://huggingface.co/datasets/philschmid/sql-create-context-copy
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 21, 2023
Authors
Philipp Schmid
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Fork of b-mc2/sql-create-context

Overview

This dataset builds from WikiSQL and Spider. There are 78,577 examples of natural language queries, SQL CREATE TABLE statements, and SQL Query answering the question using the CREATE statement as context. This dataset was built with text-to-sql LLMs in mind, intending to prevent hallucination of column and table names often seen when trained on text-to-sql datasets. The CREATE TABLE statement can often be copy and pasted from… See the full description on the dataset page: https://huggingface.co/datasets/philschmid/sql-create-context-copy.
USPTO Patent Examiner Data System (PEDS) Data
kaggle.com
zip
Updated Feb 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). USPTO Patent Examiner Data System (PEDS) Data [Dataset]. https://www.kaggle.com/bigquery/uspto-peds
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Feb 12, 2019
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Patent Examination Data System gives users access to multiple records of USPTO patent application or patent filing status at no cost. PEDS is updated daily and mirrors the data available in the Patent Application Location and Monitoring system (PALM). PEDS provides access to public applications including: published patent applications and patents. PCT applications that have not been published by WIPO. Any applications that have not been released by the USPTO will not be available in PEDS.

Content

USPTO Patent Examiner Data System (PEDS) API Data contains data from the examination process of USPTO patent applications. PEDS contains the bibliographic, published document and patent term extension data tabs in Public PAIR from 1981 to present. There is also some data dating back to 1935.

Fork this notebook to get started on accessing data in the BigQuery dataset using the BQhelper package to write SQL queries.

Acknowledgements

"Patent Examination Data System" by the USPTO, for public use.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:uspto_peds

Banner photo by Thought Catalog on Unsplash
h
sql-create-context-id
huggingface.co
Updated Nov 18, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gede Putra Nugraha (2023). sql-create-context-id [Dataset]. https://huggingface.co/datasets/detakarang/sql-create-context-id
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 18, 2023
Authors
Gede Putra Nugraha
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview

This dataset is a fork from sql-create-context This dataset builds from WikiSQL and Spider. There are 78,577 examples of natural language queries, SQL CREATE TABLE statements, and SQL Query answering the question using the CREATE statement as context. This dataset was built with text-to-sql LLMs in mind, intending to prevent hallucination of column and table names often seen when trained on text-to-sql datasets. The CREATE TABLE statement can often be copy and pasted from… See the full description on the dataset page: https://huggingface.co/datasets/detakarang/sql-create-context-id.
Google Patents Public Data
kaggle.com
zip
Updated Sep 19, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2018). Google Patents Public Data [Dataset]. https://www.kaggle.com/datasets/bigquery/patents
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Sep 19, 2018
Dataset provided by
Googlehttp://google.com/
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Google Patents Public Data, provided by IFI CLAIMS Patent Services, is a worldwide bibliographic and US full-text dataset of patent publications. Patent information accessibility is critical for examining new patents, informing public policy decisions, managing corporate investment in intellectual property, and promoting future scientific innovation. The growing number of available patent data sources means researchers often spend more time downloading, parsing, loading, syncing and managing local databases than conducting analysis. With these new datasets, researchers and companies can access the data they need from multiple sources in one place, thus spending more time on analysis than data preparation.

Content

The Google Patents Public Data dataset contains a collection of publicly accessible, connected database tables for empirical analysis of the international patent system.

Acknowledgements

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:patents

For more info, see the documentation at https://developers.google.com/web/tools/chrome-user-experience-report/

“Google Patents Public Data” by IFI CLAIMS Patent Services and Google is licensed under a Creative Commons Attribution 4.0 International License.

Banner photo by Helloquence on Unsplash
USPTO Cancer Moonshot Patent Data
kaggle.com
zip
Updated Feb 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). USPTO Cancer Moonshot Patent Data [Dataset]. https://www.kaggle.com/datasets/bigquery/uspto-oce-cancer
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Feb 12, 2019
Dataset provided by
Googlehttp://google.com/
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

This curated dataset consists of 269,353 patent documents (published patent applications and granted patents) spanning the 1976 to 2016 period and is intended to help identify promising R&D on the horizon in diagnostics, therapeutics, data analytics, and model biological systems.

Content

USPTO Cancer Moonshot Patent Data was generated using USPTO examiner tools to execute a series of queries designed to identify cancer-specific patents and patent applications. This includes drugs, diagnostics, cell lines, mouse models, radiation-based devices, surgical devices, image analytics, data analytics, and genomic-based inventions.

Acknowledgements

“USPTO Cancer Moonshot Patent Data” by the USPTO, for public use. Frumkin, Jesse and Myers, Amanda F., Cancer Moonshot Patent Data (August, 2016).

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:uspto_oce_cancer

Banner photo by Jaron Nix on Unsplash
The Office Action Research Dataset for Patents
kaggle.com
zip
Updated Feb 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). The Office Action Research Dataset for Patents [Dataset]. https://www.kaggle.com/datasets/bigquery/uspto-oce-office-actions
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Feb 12, 2019
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

The Office Action Research Dataset for Patents contains detailed information derived from the Office actions issued by patent examiners to applicants during the patent examination process. The “Office action” is a written notification to the applicant of the examiner’s decision on patentability and generally discloses the grounds for a rejection, the claims affected, and the pertinent prior art.

Content

This initial release consists of three files derived from 4.4 million Office actions mailed during the 2008 to mid-2017 period from USPTO examiners to the applicants of 2.2 million unique patent applications.

Acknowledgements

A working paper describing this dataset is available and can be cited as Lu, Qiang and Myers, Amanda F. and Beliveau, Scott, USPTO Patent Prosecution Research Data: Unlocking Office Action Traits (November 20, 2017). USPTO Economic Working Paper No. 2017-10. Available at SSRN: https://ssrn.com/abstract=3024621 (link is external).

This effort is made possible by the USPTO Digital Services & Big Data portfolio and collaboration with the USPTO Office of the Chief Economist (OCE). The OCE provides these data files for public use and encourages users to identify fixes and improvements. Please provide all feedback to: EconomicsData@uspto.gov.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:uspto_oce_office_actions

Banner photo by Trent Erwin on Unsplash
USPTO OCE Patent Claims Research Data
kaggle.com
zip
Updated Feb 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). USPTO OCE Patent Claims Research Data [Dataset]. https://www.kaggle.com/datasets/bigquery/uspto-oce-claims
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Feb 12, 2019
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

The Patent Claims Research Dataset contain detailed information on claims from U.S. patents granted between 1976 and 2014 and U.S. patent applications published between 2001 and 2014. The dataset is derived from the Patent Application Publication Full-Text and Patent Grant Full Text files, available at https://bulkdata.uspto.gov/, to which the Office of Chief Economist (OCE) applied a Python algorithm to identify individual claims as well as the dependency relationship between claims. From the parsed claims text, OCE created six data files containing individually-parsed claims, claim-level statistics, and document-level statistics, including newly-developed measures of patent scope.

Content

USPTO OCE Patent Claims Research data contains detailed information on claims from U.S. patents granted between 1976 and 2014 and U.S. patent applications published between 2001 and 2014.

Acknowledgements

"USPTO OCE Patent Claims Research Data" by the USPTO, for public use. Marco, Alan C. and Sarnoff, Joshua D. and deGrazia, Charles, "Patent Claims and Patent Scope" (October 2016). USPTO Economic Working Paper 2016-04.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:uspto_oce_claims

Banner photo by William Iven on Unsplash
World Development Indicators (WDI) Data
kaggle.com
zip
Updated Aug 27, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2018). World Development Indicators (WDI) Data [Dataset]. https://www.kaggle.com/datasets/bigquery/worldbank-wdi
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Aug 27, 2018
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

World Development Indicators (WDI) by World Bank includes data spanning up to 56 years—from 1960 to 2016. WDI frames global trends with indicators on population, population density, urbanization, GNI, and GDP. These indicators measure the world’s economy and progress toward improving lives, achieving sustainable development, providing support for vulnerable populations, and reducing gender disparities.

Content

World Development Indicators Data is the primary World Bank collection of development indicators, compiled from officially-recognized international sources. It presents the most current and accurate global development data available, and includes national, regional and global estimates.

Acknowledgements

“World Development Indicators” by the World Bank, used under CC BY 3.0 IGO.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:worldbank_wdi

Banner photo by Joshua Rawson-Harris on Unsplash
Cooperative Patent Classification (CPC) Data
kaggle.com
zip
Updated Mar 7, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). Cooperative Patent Classification (CPC) Data [Dataset]. https://www.kaggle.com/bigquery/cpc
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Mar 7, 2019
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

The CPC is the result of a partnership between the EPO and the USPTO in their joint effort to develop a common, internationally compatible classification system for technical documents, in particular patent publications, which will be used by both offices in the patent granting process.

Content

Cooperative Patent Classification Data contains the scheme and definitions of the Cooperative Patent Classification system for classifying patent documents.

Acknowledgments

“Cooperative Patent Classification” by the EPO and USPTO, for public use. Modifications have been made to parse the XML description sections to extract references to other classification symbols.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:cpc

Banner photo by Helloquence on Unsplash
USPTO OCE Patent Litigation Docket Reports Data
kaggle.com
zip
Updated Feb 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). USPTO OCE Patent Litigation Docket Reports Data [Dataset]. https://www.kaggle.com/bigquery/uspto-oce-litigation
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Feb 12, 2019
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

OCE collected all of the data from the Public Access to Court Electronics Records (PACER) and RECAP, an independent project designed to serve as a repository for litigation data sourced from PACER. The final output datasets include information on the litigating parties involved and their attorneys; the cause of action; the court location; important dates in the litigation history; and descriptions of all documents submitted in a given case, which cover more than 5 million separate documents contained in the case docket reports.

Content

USPTO OCE Patent Litigation Docket Reports Data contains detailed patent litigation data on 74,623 unique district court cases filed during the period 1963-2015.

Acknowledgements

"USPTO OCE Patent Litigation Docket Reports Data" by the USPTO, for public use. Marco, A., A. Tesfayesus, A. Toole (2017). “Patent Litigation Data from US District Court Electronic Records (1963-2015).” USPTO Economic Working Paper No. 2017-06.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:uspto_oce_litigation

Banner photo by Samuel Zeller on Unsplash
Disclosed Standard Essential Patents (dSEP) Data
kaggle.com
zip
Updated Aug 27, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2018). Disclosed Standard Essential Patents (dSEP) Data [Dataset]. https://www.kaggle.com/datasets/bigquery/dsep
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Aug 27, 2018
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Context

This database is the result of a combined effort of several European and US researchers to collect, clean and harmonise disclosed SEPs data at thirteen major standard setting organisations (including ETSI, ITU, IEEE, ISO, and more).

Content

Disclosed Standard Essential Patents (dSEP) Data provides a full overview of disclosed intellectual property rights at setting organizations worldwide.

Fork this notebook to get started on accessing data in the BigQuery dataset using the BQhelper package to write SQL queries.

Acknowledgements

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:dsep

“Disclosed Standard Essential Patents Database” by Bekkers, R., Catalini, C., Martinelli, A., & Simcoe, T. (2012). Intellectual Property Disclosure in Standards Development. Proceedings from NBER conference on Standards, Patents & Innovation, Tucson (AZ), January 20 and 21, 2012.

Banner photo by Helloquence on Unsplash
ChEMBL EBI Small Molecules Database
kaggle.com
zip
Updated Feb 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). ChEMBL EBI Small Molecules Database [Dataset]. https://www.kaggle.com/bigquery/ebi-chembl
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Feb 12, 2019
Dataset provided by
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Context

ChEMBL is maintained by the European Bioinformatics Institute (EBI), of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK.

Content

ChEMBL is a manually curated database of bioactive molecules with drug-like properties used in drug discovery, including information about existing patented drugs.

Schema: http://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_23/chembl_23_schema.png

Documentation: http://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_23/schema_documentation.html

Fork this notebook to get started on accessing data in the BigQuery dataset using the BQhelper package to write SQL queries.

Acknowledgements

“ChEMBL” by the European Bioinformatics Institute (EMBL-EBI), used under CC BY-SA 3.0. Modifications have been made to add normalized publication numbers.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:ebi_chembl

Banner photo by rawpixel on Unsplash
PatentsView Data
kaggle.com
zip
Updated Feb 12, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Google BigQuery (2019). PatentsView Data [Dataset]. https://www.kaggle.com/bigquery/patentsview
Explore at:
zip(0 bytes)Available download formats
Dataset updated
Feb 12, 2019
Dataset provided by
Googlehttp://google.com/
BigQueryhttps://cloud.google.com/bigquery
Authors
Google BigQuery
Description
Context

The USPTO grants US patents to inventors and assignees all over the world. For researchers in particular, PatentsView is intended to encourage the study and understanding of the intellectual property (IP) and innovation system; to serve as a fundamental function of the government in creating “public good” platforms in these data; and to eliminate redundant cleaning, converting and matching of these data by individual researchers, thus freeing up researcher time to do what they do best—study IP, innovation, and technological change.

Content

PatentsView Data is a database that longitudinally links inventors, their organizations, locations, and overall patenting activity. The dataset uses data derived from USPTO bulk data files.

Fork this notebook to get started on accessing data in the BigQuery dataset using the BQhelper package to write SQL queries.

Acknowledgements

“PatentsView” by the USPTO, US Department of Agriculture (USDA), the Center for the Science of Science and Innovation Policy, New York University, the University of California at Berkeley, Twin Arch Technologies, and Periscopic, used under CC BY 4.0.

Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:patentsview

Banner photo by rawpixel on Unsplash
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Philipp Schmid (2023). sql-create-context-copy [Dataset]. https://huggingface.co/datasets/philschmid/sql-create-context-copy

sql-create-context-copy

sql-create-context

philschmid/sql-create-context-copy

Explore at:

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Apr 21, 2023

Authors

Philipp Schmid

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Fork of b-mc2/sql-create-context

  Overview

This dataset builds from WikiSQL and Spider. There are 78,577 examples of natural language queries, SQL CREATE TABLE statements, and SQL Query answering the question using the CREATE statement as context. This dataset was built with text-to-sql LLMs in mind, intending to prevent hallucination of column and table names often seen when trained on text-to-sql datasets. The CREATE TABLE statement can often be copy and pasted from… See the full description on the dataset page: https://huggingface.co/datasets/philschmid/sql-create-context-copy.

Clear search

Close search

Google apps

Main menu

sql-create-context-copy

USPTO Patent Examiner Data System (PEDS) Data

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgements

sql-create-context-id

Google Patents Public Data

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgements

USPTO Cancer Moonshot Patent Data

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgements

The Office Action Research Dataset for Patents

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgements

USPTO OCE Patent Claims Research Data

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgements

World Development Indicators (WDI) Data

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgements

Cooperative Patent Classification (CPC) Data

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgments

USPTO OCE Patent Litigation Docket Reports Data

Fork this notebook to get started on accessing data in the BigQuery dataset by writing SQL queries using the BQhelper module.

Context

Content

Acknowledgements

Disclosed Standard Essential Patents (dSEP) Data

Context

Content

Acknowledgements

ChEMBL EBI Small Molecules Database

Context

Content

Acknowledgements

PatentsView Data

Context

Content

Acknowledgements

sql-create-context-copySee More Versions

sql-create-context

philschmid/sql-create-context-copy

sql-create-context-copy