6 datasets found

DBLP-Scholar
kaggle.com
zip
Updated Apr 19, 2022
Cite
Mostafa Massoud (2022). DBLP-Scholar [Dataset]. https://www.kaggle.com/datasets/mostafafathy4869/dblpscholar/suggestions
Available download formats: zip (4211634 bytes)
Dataset updated
Apr 19, 2022
Authors
Mostafa Massoud
Description
Datasets for Binary Entity Resolution

Source page: DBLP-Source

In the VLDB 2010 paper [1], we present a first comparative evaluation of the relative match quality and runtime efficiency of entity resolution approaches using challenging real-world match tasks. The evaluation considers existing approaches both with and without machine learning to find suitable parameterizations and combinations of similarity functions. In addition to approaches from the research community, a state-of-the-art commercial entity resolution implementation is considered. Our results indicate significant quality and efficiency differences between the approaches. We also find that some challenging resolution tasks, such as matching product entities from online shops, are not sufficiently solved by conventional approaches based on the similarity of attribute values.

The dataset consists of 3 tables. Two are lists of academic publications (DBLP and Scholar), and the third maps between them:

1. DBLP1.csv: contains no redundant entities.
2. Scholar.csv: contains messy data with redundant entities.
3. DBLP-Scholar_PerfectMapping.csv: the perfect mapping between entities in the two tables.

Workflow:

Provide an approach that recovers the perfect mapping between entities in the DBLP1 and Scholar datasets, that is, one that finds every DBLP document that also appears in Scholar, including documents that are duplicated within Scholar.
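As a rough illustration of the workflow above, the following sketch matches records by normalized title similarity and scores the result against a ground-truth mapping. It is a baseline sketch under stated assumptions, not the method of the cited paper: the toy records, the identifiers, the `threshold` value, and the use of titles only are all hypothetical, and the real CSV column names are not given in this listing.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase the title and replace punctuation with single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def match_records(dblp, scholar, threshold=0.85):
    """Return (dblp_id, scholar_id) pairs whose normalized titles are similar.

    The threshold is an assumed value chosen for illustration only.
    """
    pairs = set()
    for d_id, d_title in dblp.items():
        for s_id, s_title in scholar.items():
            score = SequenceMatcher(
                None, normalize(d_title), normalize(s_title)
            ).ratio()
            if score >= threshold:
                pairs.add((d_id, s_id))
    return pairs


def precision_recall(predicted, truth):
    """Compare predicted pairs with the ground-truth mapping."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy records standing in for DBLP1.csv and Scholar.csv.
dblp = {
    "d1": "Efficient Query Processing in Relational Databases",
    "d2": "A Survey of Entity Resolution Techniques",
}
scholar = {
    "s1": "Efficient query processing in relational databases.",
    "s2": "efficient query-processing in relational databases",
    "s3": "Graph Neural Networks for Molecule Generation",
}
# Hypothetical stand-in for DBLP-Scholar_PerfectMapping.csv.
truth = {("d1", "s1"), ("d1", "s2")}

predicted = match_records(dblp, scholar)
precision, recall = precision_recall(predicted, truth)
print(sorted(predicted), precision, recall)
```

This compares every DBLP record with every Scholar record, which is quadratic in the table sizes; on the full tables a blocking step that restricts comparisons to plausible candidates would likely be needed.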
Cora Dataset
search.gesis.org
Updated Oct 29, 2021
Cite
Ramezani, Mahin (2021). Cora Dataset [Dataset]. http://doi.org/10.3886/E109167V2-11132
Unique identifier
https://doi.org/10.3886/E109167V2-11132
Dataset updated
Oct 29, 2021
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
GESIS search
Authors
Ramezani, Mahin
License
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de675664
Description
Abstract (en): The Cora data contains bibliographic records of machine learning papers that have been manually clustered into groups that refer to the same publication. Originally, Cora was prepared by Andrew McCallum, and his versions of this data set are available on his Data web page. The data is also hosted here. Note that various versions of the Cora data set have been used by many publications in record linkage and entity resolution over the years.
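Because the Cora records are clustered into groups that refer to the same publication, a pairwise ground truth for record linkage can be derived from the cluster labels. The sketch below shows one way to do that. The record and cluster identifiers are hypothetical, and the actual file layout of this Cora version is not described in the listing.

```python
from collections import defaultdict
from itertools import combinations


def cluster_pairs(labels):
    """Turn a record -> cluster mapping into the set of matching record pairs.

    Every unordered pair of records in the same cluster counts as a match;
    each pair is stored with the smaller identifier first.
    """
    groups = defaultdict(list)
    for record_id, cluster_id in labels.items():
        groups[cluster_id].append(record_id)

    pairs = set()
    for members in groups.values():
        for first, second in combinations(sorted(members), 2):
            pairs.add((first, second))
    return pairs


# Hypothetical labels: three citations of one paper, one singleton,
# and two citations of another paper.
labels = {
    "r1": "c1", "r2": "c1", "r3": "c1",
    "r4": "c2",
    "r5": "c3", "r6": "c3",
}

pairs = cluster_pairs(labels)
print(sorted(pairs))
```

A cluster of n records yields n(n-1)/2 matching pairs, so large clusters dominate pairwise metrics computed this way.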
Named Entity Linking AI Market Research Report 2033
growthmarketreports.com
csv, pdf, pptx
Updated Sep 1, 2025
Cite
Growth Market Reports (2025). Named Entity Linking AI Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/named-entity-linking-ai-market
Available download formats: pdf, csv, pptx
Dataset updated
Sep 1, 2025
Dataset authored and provided by
Growth Market Reports
Time period covered
2024 - 2032
Area covered
Global
Description
Named Entity Linking AI Market Outlook

According to our latest research, the global Named Entity Linking AI market size in 2024 stands at USD 1.42 billion, demonstrating robust momentum driven by the proliferation of AI-powered data analytics and natural language processing technologies. The market is forecasted to reach USD 7.98 billion by 2033, expanding at a remarkable CAGR of 21.2% during the period from 2025 to 2033. This significant growth is primarily propelled by the escalating adoption of AI for automating information extraction and enhancing digital content understanding across various industries.
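The growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The sketch below assumes 2024 as the base year and nine compounding years to 2033; the report does not state its exact convention, so that choice is an assumption.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start and an end value."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` for `years` years."""
    return start * (1 + rate) ** years


# Figures from the report, in USD billions; the 9-year horizon is assumed.
implied_rate = cagr(1.42, 7.98, 9)
projected_2033 = project(1.42, 0.212, 9)

print(f"implied CAGR: {implied_rate:.2%}")
print(f"projection at 21.2%: USD {projected_2033:.2f} billion")
```

Under that assumption the implied rate comes out at roughly 21.1% and the projection at about USD 8.0 billion, so the stated figures agree with each other to within rounding.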

The surge in demand for advanced natural language processing (NLP) solutions is a major growth driver for the Named Entity Linking AI market. As organizations accumulate vast volumes of unstructured data from multiple digital channels, the need for automated tools to identify, disambiguate, and link entities within text has become critical. Named Entity Linking (NEL) AI solutions enable businesses to extract actionable insights from text, improve search relevance, and enhance customer experiences. Sectors such as BFSI, healthcare, and e-commerce are increasingly leveraging NEL AI to streamline compliance, personalize content, and automate document processing, which is fueling widespread adoption.

Another pivotal growth factor is the integration of Named Entity Linking AI into knowledge graph construction and content recommendation systems. Enterprises are investing heavily in AI-driven knowledge management tools to organize and contextualize data, making information retrieval more efficient. NEL AI plays a crucial role in building and maintaining knowledge graphs by accurately linking entities to real-world concepts and databases. This capability is invaluable for applications ranging from enterprise search and digital assistants to fraud detection and sentiment analysis. The growing focus on digital transformation and intelligent automation is expected to further accelerate the deployment of NEL AI solutions across diverse verticals.

The continuous advancements in machine learning algorithms and the increasing availability of high-quality annotated datasets have significantly enhanced the accuracy and scalability of Named Entity Linking AI. Vendors are developing more sophisticated models capable of handling multilingual data, domain-specific jargon, and context-sensitive entity resolution. The expansion of cloud computing has also democratized access to powerful NEL AI tools, enabling even small and medium enterprises to implement these solutions without substantial upfront investments. As regulatory and ethical considerations around data privacy and AI transparency become more prominent, vendors are also focusing on explainable AI and secure deployment practices, further boosting market confidence and adoption.

From a regional perspective, North America currently dominates the Named Entity Linking AI market, accounting for the largest share due to the early adoption of AI technologies and the presence of leading NLP research institutions and tech companies. However, the Asia Pacific region is witnessing the fastest growth, driven by the rapid digitization of enterprises, government initiatives promoting AI innovation, and the expanding e-commerce and fintech sectors. Europe is also a significant market, with strong investments in AI research and a growing emphasis on data-driven decision-making in both public and private sectors. Latin America and the Middle East & Africa, while still nascent, are expected to offer lucrative opportunities as digital transformation initiatives gain traction in these regions.

Ontology Management AI is increasingly becoming a vital component in the realm of Named Entity Linking AI, as it provides a structured framework for organizing and managing complex data relationships. By integrating Ontology Management AI, organizations can enhance their ability to interpret and contextualize data, leading to more accurate entity linking and improved knowledge graph construction. This integration supports the seamless alignment of data across diverse domains, facilitating better decision-making and strategic insights. As businesses continue to embrace digital transformation, the synergy between Ontology Management AI and Named Ent
Self-contained ground-truths for cross-domain linkage
figshare.com
zip
Updated Apr 28, 2016
Cite
Mayank Kejriwal (2016). Self-contained ground-truths for cross-domain linkage [Dataset]. http://doi.org/10.6084/m9.figshare.3204325.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.3204325.v1
Dataset updated
Apr 28, 2016
Dataset provided by
Figshare (http://figshare.com/)
Authors
Mayank Kejriwal
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cross-domain knowledge bases such as DBpedia, Freebase and YAGO have emerged as encyclopedic hubs in the Web of Linked Data. Despite enabling several practical applications in the Semantic Web, the large-scale, schema-free nature of such graphs often precludes research groups from employing them widely as evaluation test cases for entity resolution and instance-based ontology alignment applications. Although the ground-truth linkages between the three knowledge bases above are available, they are not amenable to resource-limited applications. One reason is that the ground-truth files are not self-contained, meaning that a researcher must usually perform a series of expensive joins (typically in MapReduce) to obtain usable information sets. We constructed this resource by uploading several publicly licensed data resources to the public cloud and used simple Hadoop clusters to compile, and make accessible, three cross-domain self-contained test cases involving linked instances from DBpedia, Freebase and YAGO. Self-containment is enabled by virtue of a simple NoSQL JSON-like serialization format. Potential applications for these resources, particularly related to testing transfer learning research hypotheses, are described in more detail in a paper submission in the resource track at ISWC 2016.
Asia Pacific B2C Consumer Contact Lookup - Privacy-Compliant Identity...
datarade.ai
.csv, .xls
Updated May 23, 2025
Cite
eGentic (2025). Asia Pacific B2C Consumer Contact Lookup - Privacy-Compliant Identity Resolution [Dataset]. https://datarade.ai/data-products/asia-pacific-b2c-consumer-contact-lookup-privacy-compliant-egentic
Available download formats: .csv, .xls
Dataset updated
May 23, 2025
Dataset authored and provided by
eGentic
Area covered
Asia, Australia, New Zealand, Philippines, Hong Kong, Indonesia, Singapore, Malaysia, Thailand, South Africa, Taiwan
Description
Key Features:
• Matches emails, phone numbers, and names to consumer profiles
• Appends additional contact fields and demographic attributes (where available)
• Built on permission-based, privacy-compliant global data sources
• High match rates for reliable identity resolution

What You Can Match & Append:
• Full Name
• Email Address
• Phone Number
• Physical Address (City, Zipcode, Country - based on availability)

Use Cases:
• Customer record enrichment
• Identity resolution and deduplication
• Fraud prevention and validation

Data Format: Emails, Phone Numbers, or Mixed Identifier Inputs

Data Delivery: SFTP

Perfect For:
• Identity & Fraud Solutions
• Data Brokers & Enrichment Providers
• Customer Intelligence & Insights Teams
Decentralized Municipal Entities v1.0 - January 2025
catalegs.ide.cat
Updated Mar 27, 2025
Cite
(2025). Decentralized Municipal Entities v1.0 - January 2025 [Dataset]. https://catalegs.ide.cat/geonetwork/sidl/search?format=DWG
Dataset updated
Mar 27, 2025
Description
Geographic base of the decentralized municipal entities (EMD) of Catalonia, with their names and codes, derived from the delimitation files of the entities and the municipalities to which they belong, and from reference cartographic sources. Of the boundaries of the polygons of the EMD represented in this base, only those that have a recognition act, a resolution published in the DOGC or a judicial resolution that determines the current official line are definitive, as long as they have coordinates in the current reference system. The rest of the boundaries must be considered provisional. The EMD are a type of entity with a territorial scope lower than the municipality, that is, local governments below the municipalities.