Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Here you find the History of Work resources as Linked Open Data. It lets you look up HISCO and HISCAM scores for a large number of occupational titles in numerous languages.
Data can be queried (obtained) via the SPARQL endpoint or via the example queries. If the Linked Open Data format is new to you, you might enjoy these data stories on "History of Work as Linked Open Data" and this user question: "Is there a list of female occupations?"
This version is dated Apr 2025 and is not backwards compatible with the previous version (Feb 2021). The major changes are:
- considerable simplification of the graph representation (from 81 to 12);
- use of sdo (https://schema.org/) rather than schema (http://schema.org);
- replacement of prov:wasDerivedFrom with sdo:isPartOf to link occupational titles to originating datasets;
- ETL files (used for conversion to Linked Data) now publicly available via https://github.com/rlzijdeman/rdf-hisco;
- fixes for issues with language tags;
- specification of language tags for English (e.g. @en-gb instead of @en);
- new preferred API: https://api.druid.datalegend.net/datasets/HistoryOfWork/historyOfWork-all-latest/sparql (the old API will be deprecated at some point: https://api.druid.datalegend.net/datasets/HistoryOfWork/historyOfWork-all-latest/services/historyOfWork-all-latest/sparql).
There are bound to be some issues. Please report them here.
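As a rough illustration of querying the preferred endpoint named above, the following Python sketch builds a SPARQL query and the corresponding HTTP request using only the standard library. Only the endpoint URL, the sdo prefix and the sdo:isPartOf property come from the description; the variable names, the assumption that occupational titles are the subjects of sdo:isPartOf, and the helper functions `build_query`, `build_request` and `run` are hypothetical and should be checked against the live data and the example queries.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint URL as given in the dataset description.
ENDPOINT = "https://api.druid.datalegend.net/datasets/HistoryOfWork/historyOfWork-all-latest/sparql"


def build_query(limit=10):
    """Return a SPARQL query pairing resources with the dataset they are part of.

    Assumption: occupational titles are the subjects of sdo:isPartOf.
    """
    return (
        "PREFIX sdo: <https://schema.org/>\n"
        "SELECT ?title ?dataset WHERE {\n"
        "  ?title sdo:isPartOf ?dataset .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request(query, endpoint=ENDPOINT):
    """Prepare a GET request that passes the query as a URL parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run(query):
    """Send the query (needs network access) and return the result bindings."""
    with urlopen(build_request(query), timeout=30) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run(build_query(5))` would attempt a live request; each returned binding maps a variable name to a dictionary whose `"value"` key holds the IRI or literal.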
Figure 1. Part of model illustrating the basic relation between occupations, schema.org and HISCO.
https://druid.datalegend.net/HistoryOfWork/historyOfWork-all-latest/assets/601beed0f7d371035bca5521
Figure 2. Part of model illustrating the relation between occupation, provenance and HISCO auxiliary variables.
https://druid.datalegend.net/HistoryOfWork/historyOfWork-all-latest/assets/601beed0f7d371035bca551e
https://www.datainsightsmarket.com/privacy-policy
The graph technology market is experiencing robust growth, driven by the increasing need for advanced data analytics and the rising adoption of artificial intelligence (AI) and machine learning (ML) applications. The market's expansion is fueled by the ability of graph databases to handle complex, interconnected data more efficiently than traditional relational databases. This is particularly crucial in industries like finance (fraud detection, risk management), healthcare (patient relationship mapping, drug discovery), and e-commerce (recommendation systems, personalized marketing). Key trends include the move towards cloud-based graph solutions, the integration of graph technology with other data management systems, and the development of more sophisticated graph algorithms for advanced analytics. While challenges remain, such as the need for skilled professionals and the complexity of implementing graph databases, the overall market outlook remains positive, with a projected Compound Annual Growth Rate (CAGR), conservatively estimated here at 25%, for the forecast period 2025-2033. This growth will be driven by ongoing digital transformation initiatives across various sectors, leading to an increased demand for efficient data management and analytics capabilities. We can expect to see continued innovation in both open-source and commercial graph database solutions, further fueling the market's expansion.
The competitive landscape is characterized by a mix of established players like Oracle, IBM, and Microsoft, alongside graph-focused vendors such as Neo4j and TigerGraph and cloud providers such as Amazon Web Services. These companies are constantly vying for market share through product innovation, strategic partnerships, and acquisitions. The presence of both open-source and proprietary solutions caters to a diverse range of needs and budgets.
The market segmentation, while not explicitly detailed, likely includes categories based on deployment (cloud, on-premise), database type (property graph, RDF), and industry vertical. The regional distribution will likely show strong growth in North America and Europe, reflecting the higher adoption of advanced technologies in these regions, followed by a steady rise in Asia-Pacific and other developing markets. Looking ahead, the convergence of graph technology with other emerging technologies like blockchain and the Internet of Things (IoT) promises to unlock even greater opportunities for growth and innovation in the years to come.
https://dataintelo.com/privacy-and-policy
According to our latest research, the global Master Data Graph Platforms market size reached USD 2.14 billion in 2024, reflecting robust demand across diverse industries. The market is projected to expand at a CAGR of 18.2% from 2025 to 2033, culminating in a forecasted market size of USD 10.53 billion by 2033. This remarkable growth is primarily driven by the increasing adoption of graph-based master data management solutions to address complex data relationships, enhance decision-making, and ensure regulatory compliance in an era defined by digital transformation and data-centric business models.
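The relationship between a base-year value, a CAGR and a projected value is plain compounding, which the short Python sketch below makes explicit. The function names are hypothetical, and the number of compounding periods is an assumption the reader has to choose (for example, 9 periods if 2024 is treated as the base year and 2033 as the end year); depending on that choice, the result need not match a report's rounded figures exactly.

```python
def project_value(base, cagr, periods):
    """Value after `periods` years of compounding at annual rate `cagr`."""
    return base * (1.0 + cagr) ** periods


def implied_cagr(start, end, periods):
    """Annual growth rate that turns `start` into `end` over `periods` years."""
    return (end / start) ** (1.0 / periods) - 1.0


# Example inputs from the paragraph above; 9 periods is an assumption.
projected_2033 = project_value(2.14, 0.182, 9)
rate_needed = implied_cagr(2.14, 10.53, 9)
```

Comparing `projected_2033` with the stated forecast, or `rate_needed` with the stated CAGR, is a quick way to see how sensitive such figures are to the base year and period count.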
The primary growth factor fueling the Master Data Graph Platforms market is the explosive surge in enterprise data volumes and complexity. As organizations accumulate vast amounts of structured and unstructured data from multiple sources, the limitations of traditional relational databases have become increasingly apparent. Businesses are recognizing that graph platforms offer a highly flexible and scalable approach to modeling, integrating, and querying interconnected data. This capability is especially critical for organizations seeking to derive actionable insights from complex relationships, such as customer journeys, supply chain dependencies, and risk exposure. The market is further propelled by the need for real-time data integration and management, as enterprises strive to achieve a single, unified view of their master data across disparate systems and geographies.
Another significant driver is the growing emphasis on data governance, compliance, and risk management across regulated industries such as BFSI, healthcare, and government. Regulatory mandates like GDPR, HIPAA, and CCPA have heightened the importance of data lineage, transparency, and traceability. Master Data Graph Platforms excel at mapping data relationships and tracking data flows, enabling organizations to meet stringent compliance requirements while minimizing operational risk. The ability to visualize and audit data connections in real-time is a compelling value proposition, prompting enterprises to invest in advanced graph-based solutions that can adapt to evolving regulatory landscapes and safeguard sensitive information.
The proliferation of digital transformation initiatives, cloud migration, and the adoption of advanced analytics and artificial intelligence are also fueling market expansion. As organizations modernize their IT infrastructure and transition to cloud-native architectures, the demand for scalable, cloud-based master data management solutions is accelerating. The integration of Master Data Graph Platforms with AI and machine learning tools enhances the ability to uncover hidden patterns, automate data quality processes, and deliver personalized customer experiences. This convergence of technologies is creating new opportunities for innovation and competitive differentiation, further amplifying the market's growth trajectory.
From a regional perspective, North America continues to dominate the Master Data Graph Platforms market, accounting for the largest share in 2024, driven by early technology adoption, a mature digital ecosystem, and significant investments in data-driven initiatives. Europe is witnessing robust growth due to stringent data privacy regulations and the widespread adoption of advanced analytics in sectors such as finance, healthcare, and manufacturing. The Asia Pacific region is emerging as a high-growth market, fueled by rapid digitalization, expanding IT infrastructure, and increasing demand for data management solutions among enterprises in China, India, Japan, and Southeast Asia. Latin America and the Middle East & Africa are also showing promising growth, albeit from a smaller base, as organizations in these regions embark on digital transformation journeys and seek to enhance operational efficiency through better data management.
The Component segment of the Master Data Graph Platforms market is primarily categorized into software and services, both of which play pivotal roles in driving the adoption and effectiveness of graph-based master data management solutions. The software component encompasses graph databases, data modeling tools, integration frameworks, and analytics engines that form the backbone of modern master data platforms. These software solutions are designed to facilitate the ingestion, storage, querying, and
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This is your workbench for historical occupations, as all graphs from the historyOfWork are combined here. This version is dated Feb 2021. Use this dataset to retrieve HISCO and HISCAM scores for a large number of occupations in numerous languages.
Data can be queried (obtained) via the SPARQL endpoint or via the example queries. If the Linked Open Data format is new to you, you might enjoy these data stories on "History of Work as Linked Open Data" and this user question: "Is there a list of female occupations?"
Figure 1. Part of model illustrating the basic relation between occupations, schema.org and HISCO.
https://druid.datalegend.net/HistoryOfWork/historyOfWork-all-latest/assets/601beed0f7d371035bca5521
Figure 2. Part of model illustrating the relation between occupation, provenance and HISCO auxiliary variables.
https://druid.datalegend.net/HistoryOfWork/historyOfWork-all-latest/assets/601beed0f7d371035bca551e
According to our latest research, the global Graph Database Platforms for Supply Chain market size reached USD 1.84 billion in 2024, reflecting robust demand across multiple industries. The market is projected to register a compelling CAGR of 19.7% from 2025 to 2033, with the total market value expected to reach USD 9.16 billion by 2033. This impressive growth is primarily driven by the increasing complexity of global supply chains, the need for real-time data analytics, and the rapid adoption of digital transformation initiatives in logistics, manufacturing, and retail sectors. As per our most recent analysis, organizations are increasingly leveraging graph database platforms to enhance visibility, optimize operations, and address supply chain disruptions more effectively.
The primary growth factor fueling the expansion of the Graph Database Platforms for Supply Chain market is the escalating demand for advanced data management solutions capable of handling the intricate relationships and dependencies inherent in modern supply chains. Traditional relational databases often struggle with the dynamic and interconnected nature of supply chain data, which includes suppliers, manufacturers, logistics partners, and end customers. Graph databases, by contrast, are designed to efficiently map and analyze these complex networks, enabling organizations to gain actionable insights, identify bottlenecks, and mitigate risks. The ability to visualize and traverse vast data sets in real time is particularly valuable in scenarios involving multi-tier suppliers, global logistics, and compliance requirements, thus propelling the adoption of graph database platforms across industries.
Another significant driver is the growing emphasis on supply chain resilience and risk management, especially in the wake of global disruptions such as pandemics, geopolitical tensions, and natural disasters. Organizations are increasingly recognizing the importance of end-to-end supply chain visibility to anticipate and respond to potential threats. Graph database platforms facilitate real-time monitoring and predictive analytics, empowering businesses to proactively manage risks and ensure business continuity. Enhanced traceability and compliance capabilities also support industries with stringent regulatory requirements, such as healthcare, automotive, and food & beverage, further accelerating market growth. Additionally, the integration of artificial intelligence and machine learning with graph databases amplifies their value, allowing for advanced scenario modeling, anomaly detection, and optimization.
Digital transformation initiatives, particularly the adoption of cloud computing and the Internet of Things (IoT), are further catalyzing the growth of the Graph Database Platforms for Supply Chain market. Cloud-based deployment models offer scalability, flexibility, and cost-effectiveness, making graph database solutions accessible to organizations of all sizes, including small and medium enterprises. The proliferation of IoT devices throughout supply chains generates massive volumes of interconnected data, which graph databases are uniquely equipped to manage and analyze. This convergence of technologies is fostering innovative applications in inventory management, logistics optimization, and supplier collaboration, thereby expanding the addressable market and driving sustained investment in graph database platforms.
From a regional perspective, North America currently dominates the Graph Database Platforms for Supply Chain market, accounting for the largest revenue share in 2024. This leadership position can be attributed to the region’s advanced IT infrastructure, high levels of digitalization, and strong presence of major technology providers. However, Asia Pacific is projected to exhibit the highest CAGR over the forecast period, fueled by rapid industrialization, expanding e-commerce, and significant investments in supply chain modernization. Europe is also witnessing robust growth, driven by regulatory requirements for traceability and sustainability, particularly in manufacturing and automotive sectors. Latin America and the Middle East & Africa are emerging markets with increasing adoption of graph database technologies, supported by growing awareness and digital transformation initiatives.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets and experiments from two areas: 1) measurements of how much disk space graphs occupy before and after the transformation, and 2) case studies showing how our proposal can be extended with additional semantics.
https://creativecommons.org/publicdomain/zero/1.0/
This project is inspired by https://github.com/neo4j-graph-examples/twitter-v2.
Show data from your personal Twitter account
The Graph Your Network application inserts your Twitter activity into Neo4j.
https://neo4jsandbox.com/guides/twitter/img/twitter-data-model.svg
~10 MB of graph data (CSV)
43,325 nodes, with labels: Hashtag, Link, Me, Source, Tweet, User
57,896 relationships, with types: AMPLIFIES, CONTAINS, FOLLOWS, INTERACTS_WITH, MENTIONS, POSTS, REPLY_TO, RETWEETS, RT_MENTIONS, SIMILAR_TO, TAGS, USING
https://www.marketreportanalytics.com/privacy-policy
Discover the booming Knowledge Graph Technology market! This comprehensive analysis reveals key trends, growth drivers, and regional market shares from 2025-2033. Learn about market size, CAGR, and top players shaping this transformative technology.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data set and complete code
https://www.marketreportanalytics.com/privacy-policy
The Knowledge Graph Technology market is experiencing robust growth, driven by the increasing need for enhanced data interoperability, improved data analysis capabilities, and the rising adoption of artificial intelligence (AI) and machine learning (ML) across various industries. The market's expansion is fueled by the advantages of knowledge graphs in improving decision-making processes, streamlining operations, and fostering innovation. Specific applications, such as semantic search, personalized recommendations, and fraud detection, are witnessing significant traction. While precise market size figures are unavailable, a conservative estimate places the 2025 market value at $5 billion, with a Compound Annual Growth Rate (CAGR) of 25% projected through 2033. This growth trajectory is supported by the escalating demand for efficient data management solutions in sectors like healthcare, finance, and retail, where knowledge graphs can significantly enhance operational efficiency and strategic decision-making. Technological advancements, particularly in graph database technologies and semantic web technologies, further bolster market expansion. However, the market faces challenges such as the complexity of knowledge graph implementation, the need for specialized expertise, and data integration issues across disparate sources. Despite these challenges, the long-term outlook for knowledge graph technology remains positive, driven by continuous technological innovations and the growing recognition of its transformative potential across diverse sectors. The segmentation of the Knowledge Graph Technology market reveals significant opportunities within various application areas and technology types. Application-wise, semantic search and recommendation engines are currently leading the market, while emerging applications in areas such as risk management and supply chain optimization are poised for rapid growth in the coming years. 
In terms of technology types, ontology engineering and graph databases are experiencing high demand. Regionally, North America and Europe currently dominate the market due to early adoption and established technological infrastructure. However, the Asia-Pacific region is projected to witness significant growth, spurred by increasing digitalization and investments in AI and ML initiatives. Competitive landscape analysis reveals a mix of established technology providers and emerging startups, creating a dynamic and competitive ecosystem. The continuous evolution of technologies and the expansion into new applications will continue to shape the market's growth and trajectory over the forecast period.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present the Data Set Knowledge Graph (DSKG.org), an RDF dataset about datasets that are linked to publications (modeled in the Microsoft Academic Knowledge Graph, MAKG) that mention the datasets. The metadata of the datasets is based on datasets that are registered in OpenAIRE and Wikidata.
What exactly do we provide?
Periodically updated RDF dump files of the Data Set Knowledge Graph.
URI resolution of the Data Set Knowledge Graph within the Linked Open Data.
A publicly accessible SPARQL endpoint containing the latest Dataset Knowledge Graph data.
How big is the Dataset Knowledge Graph?
The Dataset Knowledge Graph models, among others,
2,208 datasets from all scientific disciplines
813,551 links to 634,803 unique papers
1,169 authors of datasets
208 ORCID IDs.
Potential use cases:
Use the DSKG for the development of semantic search engines (e.g. use the metadata of the linked publications of the datasets for advanced search capabilities)
Easier data integration by using the RDF standard vocabulary DCAT and by linking resources to other data sources (e.g., combining the DSKG with other dataset collections in RDF).
Data analysis to measure and award the provisioning of datasets (e.g., determine the scientific influence of datasets and authors).
https://dataintelo.com/privacy-and-policy
According to our latest research, the global Graph Database-as-a-Service market size reached USD 2.1 billion in 2024, reflecting a robust expansion across multiple industries. The market is exhibiting a strong compound annual growth rate (CAGR) of 25.6%, and is projected to attain a value of USD 15.2 billion by 2033. This impressive growth trajectory is primarily driven by the increasing demand for highly scalable, flexible, and cloud-native data management solutions that can efficiently handle complex, interconnected datasets. The proliferation of digital transformation initiatives, surging adoption of advanced analytics, and the critical need for real-time data insights are further propelling the market forward, as organizations across sectors strive to optimize operations and unlock new business opportunities through graph-based technologies.
A significant factor fueling the expansion of the Graph Database-as-a-Service market is the escalating complexity of enterprise data environments. Traditional relational databases are often ill-equipped to manage the intricate relationships and dynamic data structures prevalent in modern business contexts. As a result, organizations are turning to graph databases for their ability to model, store, and analyze highly connected data efficiently. The rise of artificial intelligence, machine learning, and big data analytics has also intensified the need for data platforms that can seamlessly integrate with these technologies. Graph Database-as-a-Service solutions, with their cloud-native architecture and managed service offerings, enable businesses to rapidly deploy, scale, and maintain graph databases without the overhead of on-premises infrastructure, thus accelerating innovation and reducing operational costs.
Another key growth driver is the surge in demand for real-time analytics and personalized customer experiences across industries such as BFSI, retail, healthcare, and telecommunications. Graph databases excel at uncovering hidden patterns, detecting fraud, and enabling recommendation engines, which are critical for delivering tailored services and mitigating risks. Enterprises are leveraging Graph Database-as-a-Service platforms to enhance customer analytics, streamline risk and compliance management, and optimize network and IT operations. The flexibility of deployment models—including public, private, and hybrid cloud—further amplifies adoption, as organizations can select the architecture that best aligns with their security, scalability, and regulatory requirements. The integration of graph databases with existing IT ecosystems and the availability of robust APIs and developer tools are making it increasingly accessible for businesses of all sizes to harness the power of connected data.
From a regional perspective, North America continues to dominate the Graph Database-as-a-Service market, owing to its advanced technological infrastructure, early adoption of cloud computing, and a vibrant ecosystem of innovative startups and established enterprises. Europe is witnessing rapid growth, driven by stringent data privacy regulations and the increasing digitalization of industries. The Asia Pacific region is emerging as a significant growth engine, propelled by the expansion of e-commerce, financial services, and healthcare sectors, coupled with substantial investments in digital transformation initiatives. As organizations worldwide recognize the strategic value of graph data management, the market is expected to experience widespread adoption across both developed and emerging economies, with tailored solutions catering to diverse industry verticals and regulatory landscapes.
The Graph Database-as-a-Service market is segmented by component into software and services, each playing a pivotal role in shaping the overall market dynamics. The software segment encompasses the core graph database platforms and associated tools that facilitate data modeling, querying, visualization, and integration. These platforms are designed to deliver high performance, scalability, and ease of use, enabling organizations to manage complex relationships and large volumes of interconnected data seamlessly. Leading vendors are continuously innovating, introducing advanced features such as multi-model support, enhanced security, and automated scaling, which are driving widespread adoption across various industry verticals. The software component is particularly critical for enterprise
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Zhang et al. (https://link.springer.com/article/10.1140/epjb/e2017-80122-8) propose a temporal random network whose dynamics follow a Markov process. This yields a continuous-time network history and moves from the static definition of a random graph, with a fixed number of nodes n and edge probability p, to a temporal one. With lambda defined as the probability per time granule that a new edge appears and mu as the probability per time granule that an existing edge disappears, Zhang et al. show that the equilibrium probability of an edge is p = lambda / (lambda + mu). Our implementation, a Python package that we refer to as RandomDynamicGraph (https://github.com/ScanLab-ossi/DynamicRandomGraphs), generates large-scale dynamic random graphs according to the defined density. The package focuses on massive data generation: it uses efficient math calculations, writes to file instead of keeping data in memory when datasets are too large, and supports multi-processing. Please note that the datetime is arbitrary.
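To make the equilibrium formula concrete, here is a minimal Python simulation of independent node pairs, each following the two-state Markov chain described above. It is a sketch for illustration only and does not use or reproduce the RandomDynamicGraph package; the function name, the parameter values and the choice to start with no edges are all assumptions.

```python
import random


def simulate_edge_density(n_pairs, lam, mu, steps, seed=0):
    """Fraction of node pairs that hold an edge after `steps` time granules.

    Each pair is an independent two-state chain: an absent edge appears with
    probability `lam`, a present edge disappears with probability `mu`.
    """
    rng = random.Random(seed)
    present = [False] * n_pairs  # assumption: start from an empty graph
    for _ in range(steps):
        for i in range(n_pairs):
            if present[i]:
                if rng.random() < mu:
                    present[i] = False
            elif rng.random() < lam:
                present[i] = True
    return sum(present) / n_pairs


lam, mu = 0.1, 0.3
expected = lam / (lam + mu)  # equilibrium edge probability from the text
observed = simulate_edge_density(n_pairs=2000, lam=lam, mu=mu, steps=200)
```

After enough steps, `observed` should fluctuate around `expected`, with the size of the fluctuation shrinking as the number of simulated pairs grows.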
https://www.law.cornell.edu/uscode/text/17/106
Graph data represents complex relationships across diverse domains, from social networks to healthcare and chemical sciences. However, real-world graph data often spans multiple modalities, including time-varying signals from sensors, semantic information from textual representations, and domain-specific encodings. This dissertation introduces multimodal learning techniques for graph-based predictive modeling, addressing the intricate nature of these multidimensional data representations. The research advances graph learning across three modalities. First, we establish graph-based methodological foundations, including prompt tuning for heterogeneous graphs and a comprehensive framework for imbalanced learning on graph data. We then extend these methods to time series analysis, demonstrating their practical utility through applications such as hierarchical spatio-temporal modeling for COVID-19 forecasting and graph-based density estimation for anomaly detection in unmanned aerial systems. Finally, we explore textual representations of graphs in the chemical domain, reformulating reaction yield prediction as an imbalanced regression problem to enhance performance in the underrepresented high-yield regions that are critical to chemists.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Following the format of the Open Graph Benchmark (OGB), we design four prediction tasks of relations (mag-write, mag-cite) and higher-order patterns (tags-math, DBLP-coauthor) and construct the corresponding datasets over heterogeneous graphs and hypergraphs [1]. The original ogb-mag dataset only contains features for 'paper'-type nodes. We add the node embedding provided by [2] as raw features for other node types in MAG(P-A)/(P-P). For these four tasks, the model is evaluated by one positive query paired with a certain number of randomly sampled negative queries (1:1000 by default, except for tags-math 1:100).
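The evaluation setup described above, one positive query scored against a pool of sampled negatives, can be sketched in Python as follows. The description does not name the metric, so the use of rank-based measures (mean reciprocal rank and Hits@K) is an assumption, as are the function names and the tie-handling rule.

```python
def rank_of_positive(pos_score, neg_scores):
    """1-based rank of the positive among its negatives.

    Assumption: ties are resolved in favour of the positive, so only
    negatives with a strictly higher score push it down.
    """
    return 1 + sum(1 for s in neg_scores if s > pos_score)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all evaluated positive queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Share of positive queries ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy example with three negatives per positive instead of 1000.
ranks = [
    rank_of_positive(0.9, [0.1, 0.95, 0.5]),
    rank_of_positive(0.8, [0.2, 0.3, 0.4]),
]
```

With 1,000 negatives per positive, the rank ranges from 1 to 1,001, so a model that scores at random would have a low reciprocal rank on average.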
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Graph databases have developed rapidly and now play an important role in research. They help scientists in various ways, e.g., finding related works, exploring works in a research area, or gaining knowledge from connections between different nodes. Some graph databases for research are already available on the Internet. However, they do not meet the needs of Digital Humanities (DH) scientists, who mainly work with historical data. Therefore, we created a graph database specifically for DH scientists. This database is part of MINE, a service that facilitates data acquisition and big data analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
November 2020: Please check out the newer version of the OpenAIRE Research Graph dump available at https://doi.org/10.5281/zenodo.4201546. The newer version contains JSON files that are more compact and easier to process. Learn more about the OpenAIRE Research Graph at https://graph.openaire.eu.
The OpenAIRE Research Graph is exported as several dumps, so you can download the parts you are interested in.
Please go to http://develop.openaire.eu/graph-dumps.html for instructions on how to consume the dumps.
Libraries: this blog describes the openairegraph libraries, which can be used to perform analytics on this dataset.
Global trade data of Graph under 6815190000, 6815190000 global trade data, trade data of Graph from 80+ Countries.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Graphs are a fundamental type of data structure, capable of representing complex association relationships in diverse domains. For large-scale graph processing, stream graphs have become an efficient tool for handling dynamically evolving graph data. Subgraph counting is a key problem when processing stream graphs, and it poses significant computational challenges because it is #P-complete. This work introduces StreamSC, a novel framework that efficiently estimates subgraph counts on stream graphs through two key innovations: (i) it is the first learning-based framework to address the subgraph counting problem on stream graphs; and (ii) it handles the dynamic changes of the data graph caused by the insertion or deletion of edges. Experiments on 5 real-world graphs show the superiority of StreamSC in accuracy and efficiency.
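To illustrate what counting a subgraph pattern under edge insertions and deletions involves, the Python sketch below maintains an exact triangle count over an edge stream. It is a simple exact baseline for the smallest interesting pattern, not the learning-based StreamSC estimator, and the class and method names are hypothetical.

```python
from collections import defaultdict


class StreamTriangleCounter:
    """Exact triangle count for an undirected graph under edge updates."""

    def __init__(self):
        self.adj = defaultdict(set)  # node -> set of neighbours
        self.triangles = 0

    def insert(self, u, v):
        """Add edge (u, v); every common neighbour closes one new triangle."""
        if u == v or v in self.adj[u]:
            return  # ignore self-loops and duplicate edges
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete(self, u, v):
        """Remove edge (u, v); every common neighbour loses one triangle."""
        if v not in self.adj[u]:
            return  # edge not present
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        self.triangles -= len(self.adj[u] & self.adj[v])


counter = StreamTriangleCounter()
for edge in [(1, 2), (2, 3), (1, 3), (3, 4), (1, 4)]:
    counter.insert(*edge)
after_inserts = counter.triangles
counter.delete(1, 3)
after_delete = counter.triangles
```

Each update costs one set intersection, which stays cheap for triangles; for larger patterns the per-update work grows quickly, which is the kind of cost an estimator aims to avoid.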
These data were used to examine grammatical structures and patterns within a set of geospatial glossary definitions. Objectives of our study were to analyze the semantic structure of input definitions, use this information to build triple structures of RDF graph data, upload our lexicon to a knowledge graph software, and perform SPARQL queries on the data. Upon completion of this study, SPARQL queries were proven to effectively convey graph triples which displayed semantic significance. These data represent and characterize the lexicon of our input text which are used to form graph triples. These data were collected in 2024 by passing text through multiple Python programs utilizing spaCy (a natural language processing library) and its pre-trained English transformer pipeline. Before data was processed by the Python programs, input definitions were first rewritten as natural language and formatted as tabular data. Passages were then tokenized and characterized by their part-of-speech, tag, dependency relation, dependency head, and lemma. Each word within the lexicon was tokenized. A stop-words list was utilized only to remove punctuation and symbols from the text, excluding hyphenated words (ex. bowl-shaped) which remained as such. The tokens’ lemmas were then aggregated and totaled to find their recurrences within the lexicon. This procedure was repeated for tokenizing noun chunks using the same glossary definitions.