2 datasets found

g
IP Australia - [Superseded] Intellectual Property Government Open Data 2019...
gimi9.com
Updated Jul 20, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2018). IP Australia - [Superseded] Intellectual Property Government Open Data 2019 | gimi9.com [Dataset]. https://gimi9.com/dataset/au_intellectual-property-government-open-data-2019
Explore at:
Dataset updated
Jul 20, 2018
Area covered
Australia
Description
What is IPGOD? The Intellectual Property Government Open Data (IPGOD) includes over 100 years of registry data on all intellectual property (IP) rights administered by IP Australia. It also has derived information about the applicants who filed these IP rights, to allow for research and analysis at the regional, business and individual level. This is the 2019 release of IPGOD. # How do I use IPGOD? IPGOD is large, with millions of data points across up to 40 tables, making them too large to open with Microsoft Excel. Furthermore, analysis often requires information from separate tables which would need specialised software for merging. We recommend that advanced users interact with the IPGOD data using the right tools with enough memory and compute power. This includes a wide range of programming and statistical software such as Tableau, Power BI, Stata, SAS, R, Python, and Scalar. # IP Data Platform IP Australia is also providing free trials to a cloud-based analytics platform with the capabilities to enable working with large intellectual property datasets, such as the IPGOD, through the web browser, without any installation of software. IP Data Platform # References The following pages can help you gain the understanding of the intellectual property administration and processes in Australia to help your analysis on the dataset. * Patents * Trade Marks * Designs * Plant Breeder’s Rights # Updates ### Tables and columns Due to the changes in our systems, some tables have been affected. * We have added IPGOD 225 and IPGOD 325 to the dataset! * The IPGOD 206 table is not available this year. * Many tables have been re-built, and as a result may have different columns or different possible values. Please check the data dictionary for each table before use. ### Data quality improvements Data quality has been improved across all tables. * Null values are simply empty rather than '31/12/9999'. * All date columns are now in ISO format 'yyyy-mm-dd'. * All indicator columns have been converted to Boolean data type (True/False) rather than Yes/No, Y/N, or 1/0. * All tables are encoded in UTF-8. * All tables use the backslash \ as the escape character. * The applicant name cleaning and matching algorithms have been updated. We believe that this year's method improves the accuracy of the matches. Please note that the "ipa_id" generated in IPGOD 2019 will not match with those in previous releases of IPGOD.
movielens-20m-dataset_train_test
kaggle.com
Updated May 7, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sas Pav (2023). movielens-20m-dataset_train_test [Dataset]. https://www.kaggle.com/datasets/saspav/movielens-20m-dataset-train-test
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 7, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Sas Pav
Description
def train_test_split(X, train_size=0.7, user_col='userId', item_col='movieId', rating_col='rating', time_col='timestamp'): X.sort_values(by=[time_col], inplace=True) user_ids = X[user_col].unique() X_train_data = [] X_test_data = [] for user_id in tqdm_notebook(user_ids): cur_user = X[X[user_col] == user_id] idx = int(cur_user.shape[0] * train_size) X_train_data.append(cur_user[[user_col, item_col, rating_col]].iloc[:idx, :].values) X_test_data.append(cur_user[[user_col, item_col, rating_col]].iloc[idx:, :].values) X_train = pd.DataFrame(np.vstack(X_train_data), columns=[user_col, item_col, rating_col]) X_test = pd.DataFrame(np.vstack(X_test_data), columns=[user_col, item_col, rating_col]) return X_train, X_test

# аккуратно, очень долгий процесс

X_train, X_test = train_test_split(data)
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

(2018). IP Australia - [Superseded] Intellectual Property Government Open Data 2019 | gimi9.com [Dataset]. https://gimi9.com/dataset/au_intellectual-property-government-open-data-2019

IP Australia - [Superseded] Intellectual Property Government Open Data 2019 | gimi9.com

Explore at:

Dataset updated

Jul 20, 2018

Area covered

Australia

Description

What is IPGOD? The Intellectual Property Government Open Data (IPGOD) includes over 100 years of registry data on all intellectual property (IP) rights administered by IP Australia. It also has derived information about the applicants who filed these IP rights, to allow for research and analysis at the regional, business and individual level. This is the 2019 release of IPGOD. # How do I use IPGOD? IPGOD is large, with millions of data points across up to 40 tables, making them too large to open with Microsoft Excel. Furthermore, analysis often requires information from separate tables which would need specialised software for merging. We recommend that advanced users interact with the IPGOD data using the right tools with enough memory and compute power. This includes a wide range of programming and statistical software such as Tableau, Power BI, Stata, SAS, R, Python, and Scalar. # IP Data Platform IP Australia is also providing free trials to a cloud-based analytics platform with the capabilities to enable working with large intellectual property datasets, such as the IPGOD, through the web browser, without any installation of software. IP Data Platform # References The following pages can help you gain the understanding of the intellectual property administration and processes in Australia to help your analysis on the dataset. * Patents * Trade Marks * Designs * Plant Breeder’s Rights # Updates ### Tables and columns Due to the changes in our systems, some tables have been affected. * We have added IPGOD 225 and IPGOD 325 to the dataset! * The IPGOD 206 table is not available this year. * Many tables have been re-built, and as a result may have different columns or different possible values. Please check the data dictionary for each table before use. ### Data quality improvements Data quality has been improved across all tables. * Null values are simply empty rather than '31/12/9999'. * All date columns are now in ISO format 'yyyy-mm-dd'. * All indicator columns have been converted to Boolean data type (True/False) rather than Yes/No, Y/N, or 1/0. * All tables are encoded in UTF-8. * All tables use the backslash \ as the escape character. * The applicant name cleaning and matching algorithms have been updated. We believe that this year's method improves the accuracy of the matches. Please note that the "ipa_id" generated in IPGOD 2019 will not match with those in previous releases of IPGOD.

Clear search

Close search

Google apps

Main menu

IP Australia - [Superseded] Intellectual Property Government Open Data 2019...

movielens-20m-dataset_train_test

# аккуратно, очень долгий процесс

X_train, X_test = train_test_split(data)

IP Australia - [Superseded] Intellectual Property Government Open Data 2019 | gimi9.com