100+ datasets found

Number of Feedbacks by Category of Text.Cortex (Bar Chart)
toolkitly.com
Updated Feb 23, 2025
Toolkitly (2025). Number of Feedbacks by Category of Text.Cortex (Bar Chart) [Dataset]. https://www.toolkitly.com/feedbacks/text-cortex
Dataset updated
Feb 23, 2025
Dataset authored and provided by
Toolkitly
Description
Data for generating a bar chart on feedback counts by category for Text.Cortex.
TEXT - Chart patterns
atmatix.pl
Updated Mar 25, 2025
ATmatix (2025). TEXT - Chart patterns [Dataset]. https://www.atmatix.pl/en/patterns/all/wse/TEXT
Dataset updated
Mar 25, 2025
Dataset provided by
ATmatix
License
https://www.atmatix.pl/help/terms-of-service#copyright
Description
TEXT (TXT) - Text SA - Technical analysis chart patterns - pattern list, candlestick charts and statistics
AutoChart Dataset
paperswithcode.com
opendatalab.com
Updated Apr 25, 2023
Jiawen Zhu; Jinye Ran; Roy Ka-Wei Lee; Kenny Choo; Zhi Li (2023). AutoChart Dataset [Dataset]. https://paperswithcode.com/dataset/autochart
Dataset updated
Apr 25, 2023
Authors
Jiawen Zhu; Jinye Ran; Roy Ka-Wei Lee; Kenny Choo; Zhi Li
Description
AutoChart is a dataset for chart-to-text generation, a task that consists of generating analytical descriptions of visual plots.
Text.Cortex Feedback by Category (Pie Chart)
toolkitly.com
Updated Feb 23, 2025
Text.Cortex Feedback by Category (Pie Chart) [Dataset]. https://www.toolkitly.com/feedbacks/text-cortex
Dataset updated
Feb 23, 2025
Dataset authored and provided by
Toolkitly
Description
Data for generating a pie chart on the distribution of feedback categories of Text.Cortex.
Illuminated labels for ArcGIS Pro text
cacgeoportal.com
hub.arcgis.com
Updated Mar 19, 2019
Esri Styles (2019). Illuminated labels for ArcGIS Pro text [Dataset]. https://www.cacgeoportal.com/content/5189d6227cae42de89c1cdfaee396792
Dataset updated
Mar 19, 2019
Dataset provided by
Esri (http://esri.com/)
Authors
Esri Styles
Description
Sometimes a basic solid color for your map's labels and text just isn't going to cut it. Here is an ArcGIS Pro style with light and dark gradient fills and shadow/glow effects that you can apply to map text via the "Text fill symbol" picker in your label pane. Level up those labels! Make them look touchable. Glassy. Shady. Intriguing. Find a how-to here. Save this style, add it to your ArcGIS Pro project, then use it for any text (including labels). UPDATE: I've added a symbol that makes text look like it is being illuminated from below, casting a shadow upwards and behind. Pretty dramatic if you ask me. Here is an example: Happy Mapping! John Nelson
Open Text | OTC - Market Capitalization
tradingeconomics.com
csv, excel, json, xml
Updated Feb 22, 2018
TRADING ECONOMICS (2018). Open Text | OTC - Market Capitalization [Dataset]. https://tradingeconomics.com/otc:cn:market-capitalization
Available download formats: csv, xml, excel, json
Dataset updated
Feb 22, 2018
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Mar 26, 2025
Area covered
Canada
Description
Open Text reported CAD10.55B in Market Capitalization in March 2025, considering the latest stock price and the number of outstanding shares. Data for Open Text | OTC - Market Capitalization, including historical data, tables, and charts, were last updated by Trading Economics in March 2025.
Michigan Stratigraphic Nomenclature Chart
datadiscoverystudio.org
data.wu.ac.at
pdf
Updated Feb 8, 2013
Steve Wilson (2013). Michigan Stratigraphic Nomenclature Chart [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/3975fade7f464b649d7cd44ff47f81ce/html
Available download formats: pdf
Dataset updated
Feb 8, 2013
Authors
Steve Wilson
Description
Large format chart of Michigan stratigraphic formations. For information or to download this resource, please see links provided.
Publication text: code, data, and new measures
zenodo.org
data.niaid.nih.gov
csv
Updated Jul 11, 2024
Sam Arts; Nicola Melluso; Reinhilde Veugelers; Leonidas Aristodemou (2024). Publication text: code, data, and new measures [Dataset]. http://doi.org/10.5281/zenodo.8283353
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.8283353
Dataset updated
Jul 11, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sam Arts; Nicola Melluso; Reinhilde Veugelers; Leonidas Aristodemou
License
Attribution-NonCommercial 1.0 (CC BY-NC 1.0), https://creativecommons.org/licenses/by-nc/1.0/
License information was derived automatically
Description
This Zenodo page describes data collection, processing, and different open access data files related to the text of scientific publications from Microsoft Academic Graph (MAG) (now OpenAlex). If you use the code or data, please cite the following paper:
Arts S, Melluso N, Veugelers R (2023). Beyond Citations: Measuring Novel Scientific Ideas and their Impact in Publication Text. https://doi.org/10.48550/arXiv.2309.16437
CompanyKG Dataset V2.0: A Large-Scale Heterogeneous Graph for Company...
zenodo.org
data.niaid.nih.gov
application/gzip, bin +1
Updated Jun 4, 2024
Lele Cao; Vilhelm von Ehrenheim; Mark Granroth-Wilding; Richard Anselmo Stahl; Drew McCornack; Armin Catovic; Dhiana Deva Cavacanti Rocha (2024). CompanyKG Dataset V2.0: A Large-Scale Heterogeneous Graph for Company Similarity Quantification [Dataset]. http://doi.org/10.5281/zenodo.11391315
Available download formats: application/gzip, bin, txt
Unique identifier
https://doi.org/10.5281/zenodo.11391315
Dataset updated
Jun 4, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lele Cao; Vilhelm von Ehrenheim; Mark Granroth-Wilding; Richard Anselmo Stahl; Drew McCornack; Armin Catovic; Dhiana Deva Cavacanti Rocha
Time period covered
May 29, 2024
Description
CompanyKG is a heterogeneous graph consisting of 1,169,931 nodes and 50,815,503 undirected edges, with each node representing a real-world company and each edge signifying a relationship between the connected pair of companies.

Edges: We model 15 different inter-company relations as undirected edges, each of which corresponds to a unique edge type. These edge types capture various forms of similarity between connected company pairs. Associated with each edge of a certain type, we calculate a real-numbered weight as an approximation of the similarity level of that type. It is important to note that the constructed edges do not represent an exhaustive list of all possible edges due to incomplete information. Consequently, this leads to a sparse and occasionally skewed distribution of edges for individual relation/edge types. Such characteristics pose additional challenges for downstream learning tasks. Please refer to our paper for a detailed definition of edge types and weight calculations.

Nodes: The graph includes all companies connected by edges defined previously. Each node represents a company and is associated with a descriptive text, such as "Klarna is a fintech company that provides support for direct and post-purchase payments ...". To comply with privacy and confidentiality requirements, we encoded the text into numerical embeddings using four different pre-trained text embedding models: mSBERT (multilingual Sentence BERT), ADA2, SimCSE (fine-tuned on the raw company descriptions) and PAUSE.

Evaluation Tasks. The primary goal of CompanyKG is to develop algorithms and models for quantifying the similarity between pairs of companies. In order to evaluate the effectiveness of these methods, we have carefully curated three evaluation tasks:

Similarity Prediction (SP). To assess the accuracy of pairwise company similarity, we constructed the SP evaluation set comprising 3,219 pairs of companies that are labeled either as positive (similar, denoted by "1") or negative (dissimilar, denoted by "0"). Of these pairs, 1,522 are positive and 1,697 are negative.

Competitor Retrieval (CR). Each sample contains one target company and one of its direct competitors. The set contains 76 distinct target companies, each of which has 5.3 annotated competitors on average. For a given target company A with N direct competitors in this CR evaluation set, we expect a competent method to retrieve all N competitors when searching for similar companies to A.

Similarity Ranking (SR) is designed to assess the ability of any method to rank candidate companies (numbered 0 and 1) based on their similarity to a query company. Paid human annotators, with backgrounds in engineering, science, and investment, were tasked with determining which candidate company is more similar to the query company. It resulted in an evaluation set comprising 1,856 rigorously labeled ranking questions. We retained 20% (368 samples) of this set as a validation set for model development.

Edge Prediction (EP) evaluates a model's ability to predict future or missing relationships between companies, providing forward-looking insights for investment professionals. The EP dataset, derived (and sampled) from new edges collected between April 6, 2023, and May 25, 2024, includes 40,000 samples, with edges not present in the pre-existing CompanyKG (a snapshot up until April 5, 2023).
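
As a rough illustration of the Competitor Retrieval (CR) expectation described above (retrieve all N annotated competitors of a target company), the sketch below scores a ranked result list with recall at K. Recall at K is an assumption chosen here for illustration; the description does not say which metric the authors use. The function name and the toy company labels are hypothetical.

```python
def recall_at_k(retrieved, competitors, k):
    """Fraction of annotated competitors found among the top-k retrieved items.

    Assumption: recall@K is used here as one plausible way to score the
    CR task; it is not confirmed as the dataset authors' metric.
    """
    if not competitors:
        raise ValueError("competitors must be non-empty")
    top_k = set(retrieved[:k])
    hits = sum(1 for c in competitors if c in top_k)
    return hits / len(competitors)


# Toy example with made-up company labels (not taken from CompanyKG).
ranked = ["B", "X", "C", "Y", "D"]   # search results for target company A
truth = {"B", "C", "D"}              # annotated direct competitors of A

print(recall_at_k(ranked, truth, k=3))  # two of three competitors in the top 3
print(recall_at_k(ranked, truth, k=5))  # all three competitors in the top 5
```

A method that meets the stated expectation would reach a recall of 1.0 once K is large enough to cover all N competitors.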

Background and Motivation

In the investment industry, it is often essential to identify similar companies for a variety of purposes, such as market/competitor mapping and Mergers & Acquisitions (M&A). Identifying comparable companies is a critical task, as it can inform investment decisions, help identify potential synergies, and reveal areas for growth and improvement. The accurate quantification of inter-company similarity, also referred to as company similarity quantification, is the cornerstone to successfully executing such tasks. However, company similarity quantification is often a challenging and time-consuming process, given the vast amount of data available on each company, and the complex and diversified relationships among them.

While there is no universally agreed definition of company similarity, researchers and practitioners in the PE industry have adopted various criteria to measure similarity, typically reflecting the companies' operations and relationships. These criteria can embody one or more dimensions such as industry sectors, employee profiles, keywords/tags, customer reviews, financial performance, co-appearance in news, and so on. Investment professionals usually begin with a limited number of companies of interest (a.k.a. seed companies) and require an algorithmic approach to expand their search to a larger list of companies for potential investment.

In recent years, transformer-based Language Models (LMs) have become the preferred method for encoding textual company descriptions into vector-space embeddings. Then companies that are similar to the seed companies can be searched in the embedding space using distance metrics like cosine similarity. The rapid advancements in Large LMs (LLMs), such as GPT-3/4 and LLaMA, have significantly enhanced the performance of general-purpose conversational models. These models, such as ChatGPT, can be employed to answer questions related to similar company discovery and quantification in a Q&A format.
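
The embedding-space search mentioned above can be sketched as follows: given one embedding vector per company, rank all other companies by cosine similarity to a seed company. This is a minimal sketch under stated assumptions, not the CompanyKG code. The function name, the toy 2-D vectors, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np


def top_k_similar(embeddings, seed_index, k=3):
    """Return indices of the k rows most cosine-similar to the seed row."""
    # Normalize each row so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    scores = unit @ unit[seed_index]
    scores[seed_index] = -np.inf  # exclude the seed company itself
    order = np.argsort(-scores)   # highest similarity first
    return order[:k].tolist()


# Toy 2-D "company embeddings" (made-up values, not from CompanyKG).
toy = np.array([
    [1.0, 0.0],    # seed company
    [0.9, 0.1],    # nearly the same direction as the seed
    [0.0, 1.0],    # orthogonal to the seed
    [-1.0, 0.0],   # opposite direction
])

print(top_k_similar(toy, seed_index=0, k=2))  # [1, 2]
```

Real company embeddings would have hundreds of dimensions, and at the scale of more than a million nodes an approximate nearest-neighbor index would likely replace the brute-force matrix product used here.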

However, a graph is still the most natural choice for representing and learning diverse company relations due to its ability to model complex relationships between a large number of entities. By representing companies as nodes and their relationships as edges, we can form a Knowledge Graph (KG). Utilizing this KG allows us to efficiently capture and analyze the network structure of the business landscape. Moreover, KG-based approaches allow us to leverage powerful tools from network science, graph theory, and graph-based machine learning, such as Graph Neural Networks (GNNs), to extract insights and patterns to facilitate similar company analysis. While there are various company datasets (mostly commercial/proprietary and non-relational) and graph datasets available (mostly for single link/node/graph-level predictions), there is a scarcity of datasets and benchmarks that combine both to create a large-scale KG dataset expressing rich pairwise company relations.

Source Code and Tutorial:
https://github.com/llcresearch/CompanyKG2

Paper: to be published
Pathway2Text: Dataset for Biomedical Pathway Description Generation
zenodo.org
data.niaid.nih.gov
zip
Updated May 3, 2022
Junwei Yang; Zequn Liu; Ming Zhang; Sheng Wang (2022). Pathway2Text: Dataset for Biomedical Pathway Description Generation [Dataset]. http://doi.org/10.5281/zenodo.6510039
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.6510039
Dataset updated
May 3, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Junwei Yang; Zequn Liu; Ming Zhang; Sheng Wang
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the dataset of the NAACL 2022 paper:

Pathway2Text: Dataset and Method for Biomedical Pathway Description Generation.

This dataset contains 2,367 pairs of biomedical pathways and textual descriptions. It can be used for automatic pathway description generation. In our paper, we showed it is also appropriate for Text2Graph and BioNER.

Read readme.pdf for detailed information.
Statistical and text graph data of each dataset.
figshare.com
xls
Updated Jun 9, 2023
Hend Alrasheed (2023). Statistical and text graph data of each dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0255127.t006
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0255127.t006
Dataset updated
Jun 9, 2023
Dataset provided by
PLOS ONE
Authors
Hend Alrasheed
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Number of words and number of tokens denote the number of words in the dataset before and after preprocessing respectively. Direct edges and indirect edges represent the number of direct and indirect synonym relationships between words in the text graph respectively.
Open Text | OTC - PE Price to Earnings
tradingeconomics.com
cdn.tradingeconomics.com
csv, excel, json, xml
Updated Sep 15, 2024
TRADING ECONOMICS (2024). Open Text | OTC - PE Price to Earnings [Dataset]. https://tradingeconomics.com/otc:cn:pe
Available download formats: json, excel, csv, xml
Dataset updated
Sep 15, 2024
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Mar 27, 2025
Area covered
Canada
Description
Open Text reported 16.43 in PE Price to Earnings for its fiscal quarter ending in September of 2024. Data for Open Text | OTC - PE Price to Earnings, including historical data, tables, and charts, were last updated by Trading Economics in March 2025.
Text 100 H1B cases
f1hire.com
Updated Sep 30, 2024
FrogHire.ai (2024). Text 100 H1B cases [Dataset]. https://www.f1hire.com/company/text-100
Dataset updated
Sep 30, 2024
Dataset provided by
FrogHire.ai
Description
The H1B Sponsorship Trends linear chart shows the number of H1B cases filed by Text 100 from 2020 to 2023, providing a clear view of filing trends over time. Alongside, the horizontal bar chart titled Distribution of Job Fields Receiving H1B Sponsorship breaks down which roles and industries are most commonly sponsored.
Configuration of reduced datasets.
plos.figshare.com
xls
Updated May 31, 2023
Leila M. Naeni; Hugh Craig; Regina Berretta; Pablo Moscato (2023). Configuration of reduced datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0157988.t006
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0157988.t006
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Leila M. Naeni; Hugh Craig; Regina Berretta; Pablo Moscato
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Configuration of reduced datasets.
Table to Text Generation Utils
kaggle.com
Updated Feb 4, 2022
Aishik Rakshit (2022). Table to Text Generation Utils [Dataset]. https://www.kaggle.com/datasets/aishikai/table-to-text-generation-utils/suggestions?status=pending&yourSuggestions=true
Available format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
Dataset updated
Feb 4, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Aishik Rakshit
Description
Dataset

This dataset was created by Aishik Rakshit

Contents
Data and code for: Variational Graph Author Topic Modeling
figshare.com
researchdata.smu.edu.sg
zip
Updated Jun 3, 2023
ZHANG, CE (SMU); Hady Wirawan LAUW (2023). Data and code for: Variational Graph Author Topic Modeling [Dataset]. http://doi.org/10.25440/smu.21378237.v1
Available download formats: zip
Unique identifier
https://doi.org/10.25440/smu.21378237.v1
Dataset updated
Jun 3, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
ZHANG, CE (SMU); Hady Wirawan LAUW
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the TensorFlow implementation of the KDD 2022 paper "Variational Graph Author Topic Modeling" by Delvin Ce Zhang and Hady W. Lauw.

VGATM is a Graph Neural Network model that extracts interpretable topics for documents with authors and venues. The document topics can then be used for tasks such as document classification and citation prediction.
Best solutions found by iMA-Net in 10 kNN graphs derived from each reduced...
figshare.com
xls
Updated Jun 15, 2023
Leila M. Naeni; Hugh Craig; Regina Berretta; Pablo Moscato (2023). Best solutions found by iMA-Net in 10 kNN graphs derived from each reduced dataset (G1-G5). [Dataset]. http://doi.org/10.1371/journal.pone.0157988.t007
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0157988.t007
Dataset updated
Jun 15, 2023
Dataset provided by
PLOS ONE
Authors
Leila M. Naeni; Hugh Craig; Regina Berretta; Pablo Moscato
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The highest values of NMI, ARI and NMI×ARI in each dataset are denoted in bold.
Additional file 2 of Mining a stroke knowledge graph from literature
figshare.com
xlsx
Updated Feb 22, 2024
Xi Yang; Chengkun Wu; Goran Nenadic; Wei Wang; Kai Lu (2024). Additional file 2 of Mining a stroke knowledge graph from literature [Dataset]. http://doi.org/10.6084/m9.figshare.15080412.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.15080412.v1
Dataset updated
Feb 22, 2024
Dataset provided by
figshare
Authors
Xi Yang; Chengkun Wu; Goran Nenadic; Wei Wang; Kai Lu
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 2. The list of stroke-related symptoms.
Open Text | OTC - Ebit
tradingeconomics.com
csv, excel, json, xml
Updated Sep 15, 2024
TRADING ECONOMICS (2024). Open Text | OTC - Ebit [Dataset]. https://tradingeconomics.com/otc:cn:ebit
Available download formats: csv, excel, json, xml
Dataset updated
Sep 15, 2024
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Mar 26, 2025
Area covered
Canada
Description
Open Text reported $411.68M in EBIT for its fiscal quarter ending in September of 2024. Data for Open Text | OTC - Ebit, including historical data, tables, and charts, were last updated by Trading Economics in March 2025.
Table containing descriptions of column headings in...
catalog.data.gov
search.dataone.org
Updated Jul 6, 2024
Table containing descriptions of column headings in All_georef_images_descriptive_information_table.csv table [Dataset]. https://catalog.data.gov/dataset/table-containing-descriptions-of-column-headings-in-all-georef-images-descriptive-informat
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
The .csv table is part of a dataset package that was compiled for use as mineral assessment guidance in the Sagebrush Mineral-Resource Assessment project (SaMiRA). Mineral potential maps from previous mineral-resource assessments that included parts of the SaMiRA project areas were georeferenced. The images were clipped to the extent of the map, and all explanatory text, gathered from map explanations or report text, was recorded in the All_georef_images_descriptive_information_table.csv table. This table lists and describes the column headings in the All_georef_images_descriptive_information_table.csv table.