Data for generating a bar chart on feedback counts by category for Text.Cortex.
https://www.atmatix.pl/help/terms-of-service#copyright
TEXT (TXT) - Text SA - Technical analysis chart patterns - pattern list, candlestick charts and statistics
AutoChart is a dataset for chart-to-text generation, a task that consists of generating analytical descriptions of visual plots.
Data for generating a pie chart on the distribution of feedback categories of Text.Cortex.
Sometimes a basic solid color for your map's labels and text just isn't going to cut it. Here is an ArcGIS Pro style with light and dark gradient fills and shadow/glow effects that you can apply to map text via the "Text fill symbol" picker in your label pane. Level up those labels! Make them look touchable. Glassy. Shady. Intriguing. Find a how-to here. Save this style, add it to your ArcGIS Pro project, then use it for any text (including labels). **UPDATE** I've added a symbol that makes text look like it is being illuminated from below, casting a shadow upwards and behind. Pretty dramatic if you ask me. Happy Mapping! John Nelson
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Text reported CAD10.55B in market capitalization in March 2025, based on the latest stock price and the number of outstanding shares. Data for Open Text | OTC - Market Capitalization, including historical values, tables and charts, was last updated by Trading Economics in March 2025.
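Market capitalization, as described above, is the latest share price multiplied by the number of outstanding shares. A minimal sketch of that arithmetic follows; the function name and the input figures are hypothetical placeholders chosen for illustration and are not Open Text's actual share price or share count.

```python
def market_capitalization(share_price: float, shares_outstanding: int) -> float:
    """Return market capitalization as share price times shares outstanding."""
    if share_price < 0 or shares_outstanding < 0:
        raise ValueError("share price and share count must be non-negative")
    return share_price * shares_outstanding


# Hypothetical inputs, for illustration only.
example_cap = market_capitalization(share_price=40.0, shares_outstanding=250_000_000)
print(f"Market capitalization: {example_cap / 1e9:.2f}B")
```

The result is expressed in the same currency as the share price, so a price quoted in CAD gives a capitalization in CAD.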
Large format chart of Michigan stratigraphic formations. For information or to download this resource, please see links provided.
Attribution-NonCommercial 1.0 (CC BY-NC 1.0) https://creativecommons.org/licenses/by-nc/1.0/
License information was derived automatically
This Zenodo page describes data collection, processing, and different open access data files related to the text of scientific publications from Microsoft Academic Graph (MAG) (now OpenAlex). If you use the code or data, please cite the following paper:
Arts S, Melluso N, Veugelers R (2023). Beyond Citations: Measuring Novel Scientific Ideas and their Impact in Publication Text. https://doi.org/10.48550/arXiv.2309.16437
CompanyKG is a heterogeneous graph consisting of 1,169,931 nodes and 50,815,503 undirected edges, with each node representing a real-world company and each edge signifying a relationship between the connected pair of companies.
Edges: We model 15 different inter-company relations as undirected edges, each of which corresponds to a unique edge type. These edge types capture various forms of similarity between connected company pairs. Associated with each edge of a certain type, we calculate a real-numbered weight as an approximation of the similarity level of that type. It is important to note that the constructed edges do not represent an exhaustive list of all possible edges due to incomplete information. Consequently, this leads to a sparse and occasionally skewed distribution of edges for individual relation/edge types. Such characteristics pose additional challenges for downstream learning tasks. Please refer to our paper for a detailed definition of edge types and weight calculations.
Nodes: The graph includes all companies connected by edges defined previously. Each node represents a company and is associated with a descriptive text, such as "Klarna is a fintech company that provides support for direct and post-purchase payments ...". To comply with privacy and confidentiality requirements, we encoded the text into numerical embeddings using four different pre-trained text embedding models: mSBERT (multilingual Sentence BERT), ADA2, SimCSE (fine-tuned on the raw company descriptions) and PAUSE.
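To make the structure described above concrete, here is a minimal in-memory sketch of an undirected graph with typed, weighted edges and per-node embeddings. It is not the CompanyKG loading code: the class name, the method names, the edge type labels ("ET1", "ET2"), and the toy vectors are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, str, float]  # (neighbour id, edge type, weight)


class CompanyGraph:
    """Toy undirected graph: each edge carries a type label and a real-valued weight."""

    def __init__(self) -> None:
        # node id -> {embedding model name -> embedding vector}
        self.embeddings: Dict[int, Dict[str, List[float]]] = {}
        # node id -> list of (neighbour id, edge type, weight)
        self.adjacency: Dict[int, List[Edge]] = defaultdict(list)

    def add_node(self, node_id: int, model: str, vector: List[float]) -> None:
        """Attach an embedding produced by the named model to a node."""
        self.embeddings.setdefault(node_id, {})[model] = list(vector)

    def add_edge(self, u: int, v: int, edge_type: str, weight: float) -> None:
        """Store an undirected edge by recording it in both directions."""
        self.adjacency[u].append((v, edge_type, weight))
        self.adjacency[v].append((u, edge_type, weight))

    def neighbours(self, node_id: int, edge_type: Optional[str] = None) -> List[Edge]:
        """Return the node's edges, optionally restricted to one edge type."""
        edges = self.adjacency.get(node_id, [])
        return [e for e in edges if edge_type is None or e[1] == edge_type]


# Hypothetical toy data: three companies, two edge types.
graph = CompanyGraph()
graph.add_node(0, "msbert", [0.1, 0.2])
graph.add_node(1, "msbert", [0.3, 0.1])
graph.add_node(2, "msbert", [0.0, 0.9])
graph.add_edge(0, 1, "ET1", 0.8)
graph.add_edge(0, 2, "ET2", 0.3)
graph.add_edge(1, 2, "ET1", 0.5)
print(graph.neighbours(0, edge_type="ET1"))
```

Keeping the edge type on every edge lets a downstream model filter or weight relations separately, which matters when some types are sparse or skewed. An adjacency list this simple would not scale to tens of millions of edges without a more compact representation.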
Evaluation Tasks. The primary goal of CompanyKG is to develop algorithms and models for quantifying the similarity between pairs of companies. In order to evaluate the effectiveness of these methods, we have carefully curated three evaluation tasks:
Background and Motivation
In the investment industry, it is often essential to identify similar companies for a variety of purposes, such as market/competitor mapping and Mergers & Acquisitions (M&A). Identifying comparable companies is a critical task, as it can inform investment decisions, help identify potential synergies, and reveal areas for growth and improvement. The accurate quantification of inter-company similarity, also referred to as company similarity quantification, is the cornerstone to successfully executing such tasks. However, company similarity quantification is often a challenging and time-consuming process, given the vast amount of data available on each company, and the complex and diversified relationships among them.
While there is no universally agreed definition of company similarity, researchers and practitioners in the private equity (PE) industry have adopted various criteria to measure similarity, typically reflecting the companies' operations and relationships. These criteria can embody one or more dimensions such as industry sectors, employee profiles, keywords/tags, customer reviews, financial performance, co-appearance in news, and so on. Investment professionals usually begin with a limited number of companies of interest (a.k.a. seed companies) and require an algorithmic approach to expand their search to a larger list of companies for potential investment.
In recent years, transformer-based Language Models (LMs) have become the preferred method for encoding textual company descriptions into vector-space embeddings. Companies similar to the seed companies can then be retrieved from the embedding space using measures such as cosine similarity. The rapid advancements in Large LMs (LLMs), such as GPT-3/4 and LLaMA, have significantly enhanced the performance of general-purpose conversational models. These models, such as ChatGPT, can be employed to answer questions related to similar company discovery and quantification in a Q&A format.
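The embedding-space search described above can be sketched as a nearest-neighbour ranking by cosine similarity. The example below is a minimal illustration in plain Python; the function names, the company identifiers, and the three-dimensional toy vectors are assumptions, and real description embeddings would have hundreds of dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    seed_id: str, embeddings: Dict[str, List[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Rank every other company by cosine similarity to the seed company."""
    seed_vector = embeddings[seed_id]
    scored = [
        (company_id, cosine_similarity(seed_vector, vector))
        for company_id, vector in embeddings.items()
        if company_id != seed_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings, for illustration only.
toy_embeddings = {
    "seed": [1.0, 0.0, 0.0],
    "company_a": [0.9, 0.1, 0.0],
    "company_b": [0.0, 1.0, 0.0],
    "company_c": [-1.0, 0.0, 0.0],
}
ranking = most_similar("seed", toy_embeddings, top_k=2)
print(ranking)
```

A brute-force scan like this is linear in the number of companies per query. For a graph with over a million nodes, an approximate nearest-neighbour index would be the more practical choice.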
However, a graph is still the most natural choice for representing and learning diverse company relations due to its ability to model complex relationships between a large number of entities. By representing companies as nodes and their relationships as edges, we can form a Knowledge Graph (KG). Utilizing this KG allows us to efficiently capture and analyze the network structure of the business landscape. Moreover, KG-based approaches allow us to leverage powerful tools from network science, graph theory, and graph-based machine learning, such as Graph Neural Networks (GNNs), to extract insights and patterns to facilitate similar company analysis. While there are various company datasets (mostly commercial/proprietary and non-relational) and graph datasets available (mostly for single link/node/graph-level predictions), there is a scarcity of datasets and benchmarks that combine both to create a large-scale KG dataset expressing rich pairwise company relations.
Source Code and Tutorial:
https://github.com/llcresearch/CompanyKG2
Paper: to be published
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset of the NAACL 2022 paper:
Pathway2Text: Dataset and Method for Biomedical Pathway Description Generation.
This dataset contains 2,367 pairs of biomedical pathways and textual descriptions. It can be used for automatic pathway description generation. In our paper, we showed it is also appropriate for Text2Graph and BioNER.
Read readme.pdf for detailed information.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of words and number of tokens denote the number of words in the dataset before and after preprocessing respectively. Direct edges and indirect edges represent the number of direct and indirect synonym relationships between words in the text graph respectively.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Text reported a price-to-earnings (PE) ratio of 16.43 for its fiscal quarter ending in September 2024. Data for Open Text | OTC - PE Price to Earnings, including historical values, tables and charts, was last updated by Trading Economics in March 2025.
The H1B Sponsorship Trends line chart shows the number of H1B cases filed by Text 100 from 2020 to 2023, providing a clear view of filing trends over time. Alongside it, the horizontal bar chart titled Distribution of Job Fields Receiving H1B Sponsorship breaks down which roles and industries are most commonly sponsored.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Configuration of reduced datasets.
This dataset was created by Aishik Rakshit
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the TensorFlow implementation of the KDD 2022 paper "Variational Graph Author Topic Modeling" by Delvin Ce Zhang and Hady W. Lauw.
VGATM is a Graph Neural Network model that extracts interpretable topics for documents with authors and venues. The extracted document topics can then be used for downstream tasks such as document classification and citation prediction.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The highest values of NMI, ARI and NMIĆARI in each dataset are denoted in bold.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2. The list of stroke-related symptoms.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Text reported $411.68M in EBIT for its fiscal quarter ending in September 2024. Data for Open Text | OTC - EBIT, including historical values, tables and charts, was last updated by Trading Economics in March 2025.
The .csv table is part of a dataset package that was compiled for use as mineral assessment guidance in the Sagebrush Mineral-Resource Assessment project (SaMiRA). Mineral potential maps from previous mineral-resource assessments that covered parts of the SaMiRA project areas were georeferenced. The images were clipped to the extent of the map, and all explanatory text, gathered from map explanations or report text, was recorded in the All_georef_images_descriptive_information_table.csv table. This table lists and describes the column headings in the All_georef_images_descriptive_information_table.csv table.