As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of 1244.08; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle, and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.
Database Management Systems
As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.
As of June 2024, the most popular relational database management system (RDBMS) worldwide was Oracle, with a ranking score of 1244.08. Oracle was also the most popular DBMS overall. MySQL and Microsoft SQL Server rounded out the top three.
Market Overview: The global in-memory database (IMDB) market is poised for substantial growth, with a projected CAGR of 19.00% from 2025 to 2033. The market, valued at XX million in 2025, is growing on the back of increasing adoption of IMDBs in industries including telecommunications, BFSI, logistics, retail, entertainment, and healthcare. Key drivers behind this growth include the need for real-time data processing, improved performance, and the rise of big data and analytics.
Market Dynamics: The IMDB market is influenced by several trends and challenges. The growing adoption of cloud-based IMDB solutions is a key trend, as it provides flexibility and cost-effectiveness. However, security concerns and latency issues associated with cloud-based deployments pose challenges. Additionally, the increasing demand for high-performance computing and the need for faster data processing are driving the development of advanced IMDB technologies. The market is fragmented, with established players such as IBM, Oracle, and Microsoft competing alongside emerging startups like VoltDB and MemSQL. Regional variations in market maturity and adoption rates are also observed, with North America leading in market penetration.
Recent developments include:
May 2022: IBM and SAP announced the extension of their collaboration as IBM embarks on a corporate transformation initiative to optimize its business operations using RISE and SAP S/4HANA Cloud. IBM said that, as part of the extended relationship, it is migrating to SAP S/4HANA, SAP's most recent ERP system, to run operations for more than 1,000 legal entities in over 120 countries and multiple IBM businesses supporting hardware, software, consulting, and finance. SAP S/4HANA, the successor to SAP R/3 and SAP ERP, is SAP's ERP system for large businesses. It is intended to work optimally with SAP's in-memory database, SAP HANA.
November 2022: Redis, a provider of real-time in-memory databases, and Amazon Web Services announced a multi-year strategic alliance. Redis is a networked, open-source NoSQL system that keeps data in memory and can persist it to disk for durability. It can function as a streaming engine, message broker, database, or cache. The company claims that when Redis is used as a database, apps can instantly search across tens of millions of rows of customer data to locate information specific to one particular customer. Redis Enterprise Cloud is available on AWS as a managed real-time database-as-a-service product.
December 2022: The National Stock Exchange, the largest stock exchange in India, chose the Raima Database Manager (RDM) Workgroup 12.0 in-memory system as a foundational component for the next iterations of its trading platform front-end, the National Exchange for Automated Trading (NEAT).
Key drivers for this market are: decreasing hardware cost; increasing penetration of trends like big data and IoT; and an increase in the volume of data generated and the shift of enterprise operations.
Potential restraints include: resilience in integration with VLDBs.
Notable trends are: the telecommunication end-user industry is expected to hold a significant market share.
Graph Database Market size was valued at USD 2.86 Billion in 2024 and is projected to reach USD 14.58 Billion by 2032, growing at a CAGR of 22.6% from 2026 to 2032.
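As a quick sanity check on the figures above, the compound annual growth rate implied by two endpoint values can be recomputed directly. The sketch below is illustrative only: the function name `cagr` and the choice of an 8-year window (2024 to 2032) are assumptions, not something the report states. Under that assumption the quoted endpoints appear consistent with the quoted rate of roughly 22.6%.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text, in USD billions.
value_2024 = 2.86
value_2032 = 14.58

# Assumption: growth compounds over the 8 years from 2024 to 2032.
implied_rate = cagr(value_2024, value_2032, years=8)
print(f"Implied CAGR 2024-2032: {implied_rate:.1%}")
```

Note that the text gives the CAGR window as 2026 to 2032, while the endpoint values are for 2024 and 2032; the check above only shows which window the arithmetic matches.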
Global Graph Database Market Drivers
The growth of the Graph Database Market is attributed to several main market drivers. These factors strongly influence how graph databases are demanded and adopted in different sectors. The major market forces are as follows:
Growth of Connected Data: Graph databases excel at expressing and querying relationships, which matters as businesses work with datasets that are more complex and interconnected. Demand for graph databases is rising as connected data gains significance across multiple industries.
Knowledge Graph Emergence: In fields like artificial intelligence, machine learning, and data analytics, knowledge graphs, which arrange information in a graph structure, are becoming increasingly popular. Graph databases are a natural fit for creating and querying knowledge graphs, which is contributing to their widespread use.
Analytics and Machine Learning Advancements: Graph databases handle relationships and patterns in data effectively, enabling applications related to advanced analytics and machine learning. Demand for graph databases combined with analytics and machine learning is growing as businesses seek to extract more insights from their data.
Real-Time Data Processing: Graph databases can process data in real time, which makes them appropriate for applications that need quick answers and insights. This is especially helpful in fraud detection, recommendation systems, and network analysis.
Increasing Need for Security and Fraud Detection: Graph databases are useful for security and fraud detection applications because they can identify patterns and anomalies in linked data. The growing need for graph databases in security solutions is a result of the ongoing evolution of cybersecurity threats.
The in-memory database (IMDB) market is experiencing robust growth, driven by the increasing need for real-time data processing and analytics across diverse sectors. The market's Compound Annual Growth Rate (CAGR) of 19% from 2019 to 2024 signifies a strong upward trajectory, projected to continue in the forecast period (2025-2033). This growth is fueled by the expanding adoption of cloud computing, big data analytics, and the Internet of Things (IoT). Industries like telecommunications, BFSI (Banking, Financial Services, and Insurance), and retail are major contributors, demanding high-speed transactional capabilities and advanced analytical insights from their data. The market is segmented by industry type (small and medium-sized enterprises versus large enterprises) and end-user industry, reflecting varied needs and adoption rates across different sectors. The competitive landscape includes both established players like IBM, Oracle, and Microsoft, alongside specialized IMDB vendors such as VoltDB, Redis Labs, and DataStax, fostering innovation and diverse solutions. The substantial growth in the IMDB market is anticipated to continue, driven by several factors. The rise of real-time applications, including fraud detection in BFSI, personalized customer experiences in retail, and efficient logistics management, necessitates faster data processing than traditional database systems can provide. Furthermore, the increasing volume and velocity of data generated by IoT devices necessitates solutions capable of handling high data throughput and low latency. While challenges remain, such as data security concerns and the potential for high implementation costs, the overall market outlook remains positive, with opportunities for growth across various regions, particularly in North America and Asia Pacific, due to their advanced technological infrastructure and high adoption rates of cloud-based services. 
The continued development of advanced features like enhanced scalability, improved security protocols, and integration with other data management tools will further propel market expansion.
In 2024, the number of data compromises in the United States stood at 3,158 cases. Meanwhile, over 1.35 billion individuals were affected in the same year by data compromises, including data breaches, leakage, and exposure. While these are three different events, they have one thing in common: in all three, sensitive data is accessed by an unauthorized threat actor.
Industries most vulnerable to data breaches
Some industry sectors usually see more significant cases of private data violations than others. This is determined by the type and volume of the personal information that organizations in these sectors store. In 2024, financial services, healthcare, and professional services were the three industry sectors that recorded the most data breaches. Overall, the number of data breaches in some industry sectors in the United States, healthcare among them, has gradually increased within the past few years. However, some sectors saw a decrease.
Largest data exposures worldwide
In 2020, an adult streaming website, CAM4, experienced a leakage of nearly 11 billion records. This is by far the most extensive reported data leakage. The case is unusual, though, because cybersecurity researchers found the vulnerability before cybercriminals did. The second-largest data breach is the Yahoo data breach, dating back to 2013. The company first reported about one billion exposed records, then later, in 2017, updated the number of leaked records to three billion. In March 2018, the third-biggest data breach happened, involving India’s national identification database Aadhaar. As a result of this incident, over 1.1 billion records were exposed.
(Note: This description is taken from a draft report entitled "Creation of a Database of Lakes in the St. Johns River Water Management District of Northeast Florida" by Palmer Kinser.)
Introduction
“Lakes are among the District’s most valued resources. Their aesthetic appeal adds substantially to waterfront property values, which in turn generate tax revenues for local governments. Fish camps and other businesses, that provide lake visitors with supplies and services, benefit local economies directly. Commercial fishing on the District’s larger lakes produces some income, but far greater economic benefits are produced from sport fishing. Some of the best bass fishing lakes in the world occur in the District. Trophy fishing, guide services and high-stakes fishing tournaments, which they support, also generate substantial revenues for local economies. In addition, the high quality of District lakes has allowed swimming, fishing, and boating to become among the most popular outdoor activities for many District residents and attracts many visitors. Others frequently take advantage of the abundant opportunities afforded for duck hunting, bird watching, photography, and other nature related activities.” (from likelihood of harm to lakes report).
Objective
The objective of this work was to create a consistent database of natural lake polygon features for the St. Johns River Water Management District (SJRWMD). Other databases examined contained point features only, contained polygons representing a wide range of dates, did not separate or code water bodies adequately by feature type (i.e., no distinctions were made between lakes, rivers, excavations, etc.), or were incomplete. This new database will allow users to better characterize and measure the lakes resource of the District, allowing comparisons to be made and trends detected, thereby facilitating better protection and management of the resource.
Background
Prior to creation of this database, the District had two waterbody databases. The first of these, the 2002 FDEP Primary Lake Location database, contained 3859 lake point features statewide, 1418 of which were in SJRWMD. Only named lakes were included. Data sources were the Geographic Names Information System (GNIS), USGS 1:24000 hydrography data, 1994 digital orthophoto quarter quadrangles (DOQQs), and USGS digital raster graphics (DRGs). The second was the SJRWMD Hydrologic Network (Lake / Pond and Reservoir classes). This database contained 42,002 lake / pond and reservoir features for the SJRWMD. Lakes with multiple pools of open water were often mapped as multiple features, and many man-made features (borrow pits, reservoirs, etc.) were included. This dataset was developed from USGS map data of varying dates.
Methods
Polygons in this new lakes dataset were derived from a "wet period" landcover map (SJRWMD, 1999), in which most lake levels were relatively high. Polygons from other dates, mostly 2009, were used for lakes in regionally dry locations or for lakes that were uncharacteristically wet in 1999, e.g. Alachua Sink. Our intention was to capture lakes in a basin-full condition, neither unusually high nor low. To build the dataset, a selection was made of polygons coded as lakes (5200), marshy lakes (5250), enclosed saltwater ponds in salt marsh (5430), slough waters (5600), and emergent aquatic vegetation (6440). Some large, regionally significant or named man-made reservoirs were also included, as well as a small number of named excavations. All polygons were inspected and edited, where appropriate, to correct lake shores and merge adjacent lake basin features. Water polygons separated by marshes or other low-ground features were grouped and merged to form multipart features when clearly associated within a single lake basin. The initial set of lake names was captured from the Florida Primary Lake Location database. Labels were then moved where needed to ensure that they fell within the water bodies referenced. Additional lake names were hand entered using data from USGS 7.5 minute quads, Google Maps, MapQuest, Florida Department of Transportation (FDOT) county maps, and other sources. The final dataset contains 4892 polygons, many of which are multi-part.
Operationally, lakes, as captured in this database, are those features that were identified and mapped using the District’s landuse/landcover scheme in the 5200, 5250, 5430, and 5600 classes referenced above, in addition to some areas mapped in the 6440 class. Some additional features named as lakes, ponds, or reservoirs were also included, even when not currently appearing to be lakes. Some are now very marshy or even dry, but apparently held deeper pools of water in the past. A size limit of 1 acre or more was enforced, except for named features, 30 of which were smaller. The smallest lake was Fox Lake, a doline of 0.04 acres in Orange County. The largest lake, Lake George, covered 43,212.8 acres.
The lakes of the SJRWMD are a diverse set of features that may be classified in many ways. These include: by surrounding landforms or landcover, by successional stage (lacustrine to palustrine gradient), by hydrology (presence of inflows and/or outflows, groundwater linkages, permanence, etc.), by water quality (trophic state, water color, dissolved solids, etc.), and by origin. We chose to classify the lakes in this set by origin, based on the lake type concepts of Hutchinson (1957). These types are listed in the table below (Table 1). We added some additional types and modified the descriptions to better reflect Florida’s geological conditions (Table 2). Some types were readily identified; others are admittedly conjectural or were of mixed origins, making it difficult to pick a primary mechanism. Geological map layers, particularly total thickness of overburden above the Floridan aquifer system and thickness of the intermediate confining unit, were used to estimate the likelihood of sinkhole formation. Wind sculpting appears to be common and sometimes is a primary mechanism but can be difficult to judge from remotely sensed imagery. For these and others, the classification should be considered provisional. Many District lakes appear to have been formed by several processes; for instance, sinkholes may occur within lakes that lie between sand dunes. Here these would be classified as dune / karst. Mixtures of dunes, deflation and karst are common. Saltmarsh ponds vary in origin and were not further classified. In the northern coastal area they are generally small and circular in outline, and appear to have been formed by the collapse and breakdown of a peat substrate, Hutchinson type 70. Further south along the coast, additional ponds have been formed by the blockage of tidal creeks, a fluvial process, perhaps of Hutchinson’s Type 52, lateral lakes, in which sediments deposited by a main stream back up the waters of a tributary. In the Cape Canaveral area, many salt marsh ponds clearly occupy dune swales flooded by rising ocean levels. A complete listing of lake types and combinations is in Table 3.
Type | Sub-Type | Secondary Type
Tectonic Basins | Marine Basin |
Tectonic Basins | Marine Basin | Compound doline
Tectonic Basins | Marine Basin | karst
Tectonic Basins | Marine Basin | Phytogenic dam
Tectonic Basins | Marine Basin | Abandoned channel
Tectonic Basins | Marine Basin | Karst
Solution Lakes | Compound doline |
Solution Lakes | Compound doline | Fluvial
Solution Lakes | Compound doline | Phytogenic
Solution Lakes | Doline |
Solution Lakes | Doline | Deflation
Solution Lakes | Doline | Dredged
Solution Lakes | Doline | Excavated
Solution Lakes | Doline | Excavation
Solution Lakes | Doline | Fluvial
Solution Lakes | Karst | Karst / Excavation
Solution Lakes | Karst | Karst / Fluvial
Solution Lakes | Karst | Deflation
Solution Lakes | Karst | Deflation / excavation
Solution Lakes | Karst | Excavation
Solution Lakes | Karst | Fluvial
Solution Lakes | Polje |
Solution Lakes | Spring pool |
Solution Lakes | Spring pool | Fluvial
Fluvial | Abandoned channel |
Fluvial | Fluvial |
Fluvial | Fluvial | Phytogenic
Fluvial | Levee |
Fluvial | Oxbow lake |
Fluvial | Strath |
Fluvial | Strath | Phytogenic
Aeolian | Deflation |
Aeolian | Deflation | Dune
Aeolian | Deflation | Excavation
Aeolian | Deflation | Karst
Aeolian | Dune |
Aeolian | Dune | Deflation
Aeolian | Dune | Excavation
Aeolian | Dune |
Aeolian | Dune | Karst
Shoreline lakes | Maritime coastal | Karst / Excavation
Organic accumulation | Phytogenic dam |
Salt Marsh Ponds | |
Man made | Excavation |
Man made | Dam |
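The selection rule described in the Methods (a set of land-cover codes plus a 1-acre minimum that is waived for named features) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the District's actual workflow: the `WaterPolygon` class, its field names, and the land-cover codes assigned to the sample records are hypothetical. The sketch also treats every 6440 polygon as a candidate, whereas the report says only some 6440 areas were included, and it ignores the manually added reservoirs and excavations.

```python
from dataclasses import dataclass
from typing import Optional

# Land-cover codes named in the Methods text.
LAKE_CODES = {5200, 5250, 5430, 5600, 6440}
MIN_ACRES = 1.0  # size limit stated in the text


@dataclass
class WaterPolygon:
    """Hypothetical record for one mapped water polygon."""
    landcover_code: int
    acres: float
    name: Optional[str] = None


def is_candidate_lake(polygon: WaterPolygon) -> bool:
    """Keep polygons with a lake-type code that are >= 1 acre or are named."""
    if polygon.landcover_code not in LAKE_CODES:
        return False
    return polygon.acres >= MIN_ACRES or polygon.name is not None


# Sample records; acreages for the two named lakes come from the text,
# while the land-cover codes assigned here are assumptions for illustration.
polygons = [
    WaterPolygon(5200, 43212.8, "Lake George"),
    WaterPolygon(5200, 0.04, "Fox Lake"),    # named, below 1 acre: kept
    WaterPolygon(5250, 0.5),                 # unnamed, below 1 acre: dropped
    WaterPolygon(1100, 12.0, "Not a lake"),  # hypothetical non-lake code: dropped
]

selected = [p for p in polygons if is_candidate_lake(p)]
print([p.name for p in selected])
```

In practice the report describes a manual inspection and editing pass after selection, so a filter like this would only produce the starting set of polygons.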
Given the current state of scientific research on startups, the question this study seeks to answer is: "How is the current research panorama of competencies used in ventures classified as startups?" To answer this research problem, the general objective of this study is (i) to show the current research panorama about startup competencies. A secondary objective is (ii) to map the main competencies of startups present in the literature. The scientific gap that this work seeks to address is the creation of a competency framework for early-stage enterprises, in this case startups. A group of startup competencies operationalized in the literature was found and can guide the theoretical framework of future work on startup business environments, their association with the startup life cycle, and the main research themes. The database was built to support the systematic review according to these principles [1]: (i) create a research problem that guides the research; (ii) choose the aspect to be analyzed in the literature; (iii) filter the collected data according to their relevance to the research problem; and (iv) analyze and interpret the data. The databases chosen were Scopus and Web of Science, taking into account the areas "business", "business finance", "economics" and "management" in Web of Science, and "business, management and accounting" and "economics, econometrics and finance" in Scopus. The research was carried out between January and April 2020 and in January 2022. These databases were chosen because they contain the most relevant journals in these areas [2,3]. No time period was predetermined, but the authors only considered publications that used a blind peer review process.
The authors used the keywords "startups", "startup capabilities", "startup competence", "startup competencies" and "competencies" to collect the data and obtain the largest possible number of publications on the proposed subject. A total of 1,953 scientific articles published between 1938 and 2021 were found. First, the authors excluded works that were duplicated between the two platforms (the same work appearing on both) or that had no relationship with the area of administration, reducing the number of scientific articles to 1,205, published between 1976 and 2021. Next, works that did not specifically deal with topics related to startups were excluded, leaving 148 articles published between 1996 and 2021. Finally, works that did not deal with startup competencies were excluded, leaving 71 articles published between 2000 and 2021, which are described in detail in the database. From a bibliometric perspective, the articles described in this database were classified according to descriptive characteristics (most cited journals, country of origin and year of publication) and methodological characteristics (research method), in addition to results (main research topic) and citations (most cited journals). All selected articles were managed in this database, built in Microsoft Excel, with the reference of each publication and its basic information (abstract, keywords, database, type of study and so on). From the reading of the selected papers, the main competencies related to startups were mapped and identified.
References
[1] Souza, M. d., & Ribeiro, H. C. M. (2013). Sustentabilidade ambiental: uma meta-análise da produção brasileira em periódicos de administração. Revista de Administração Contemporânea, 17(3), 368-396. DOI: http://doi.org/10.1590/s1415-65552013000300007
[2] Almeida, C.C., & Gracio, M.C.C. (2019). Produção científica brasileira sobre o indicador "Fator de Impacto": um estudo nas bases SciELO, Scopus e Web of Science. Encontros Bibli: Revista Eletrônica de Biblioteconomia e Ciência da informação, 24(54), 62-77. DOI: https://doi.org/10.5007/1518-2924.2019v24n54p62
[3] Noronha, M.E.S., Rodrigues, C.D., Longo, L.R., & Avrichir, I. (2021). An analysis of international scientific production on business accelerators from 1990 to 2019. Iberoamerican Journal of Entrepreneurship and Small Business, 11(1), in copyediting process. DOI: https://doi.org/10.14211/ibjesb.e2072
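The first screening step described above, removing works that appear in both Scopus and Web of Science, could be sketched as follows. This is only an illustration of one common approach, not the authors' procedure (the description says the records were managed in Microsoft Excel): the `Record` class, the matching rule (DOI when present, otherwise a normalized title), and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical bibliographic record exported from one database."""
    title: str
    source: str  # e.g. "Scopus" or "Web of Science"
    doi: Optional[str] = None


def record_key(record: Record) -> str:
    """Match on DOI when available, otherwise on a whitespace/case-normalized title."""
    if record.doi:
        return "doi:" + record.doi.strip().lower()
    return "title:" + " ".join(record.title.lower().split())


def deduplicate(records: Iterable[Record]) -> List[Record]:
    """Keep the first occurrence of each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up sample entries; the DOIs are placeholders, not real identifiers.
records = [
    Record("Startup Competencies: A Review", "Scopus", "10.0000/example.1"),
    Record("Startup competencies: a review", "Web of Science", "10.0000/EXAMPLE.1"),
    Record("Capabilities in  early-stage ventures", "Scopus"),
    Record("Capabilities in early-stage ventures", "Web of Science"),
]

unique_records = deduplicate(records)
print(len(records), "->", len(unique_records))
```

One limitation of this rule is that a work listed with a DOI in one database and without one in the other would not be matched, so a manual check of near-duplicates would still be needed.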
Note: This description is taken from a draft report entitled "Creation of a Database of Lakes in the St. Johns River Water Management District of Northeast Florida" by Palmer Kinser. Introduction“Lakes are among the District’s most valued resources. Their aesthetic appeal adds substantially to waterfront property values, which in turn generate tax revenues for local governments. Fish camps and other businesses, that provide lake visitors with supplies and services, benefit local economies directly. Commercial fishing on the District’s larger lakes produces some income, , but far greater economic benefits are produced from sport fishing. Some of the best bass fishing lakes in the world occur in the District. Trophy fishing, guide services and high-stakes fishing tournaments, which they support, also generate substantial revenues for local economies. In addition, the high quality of District lakes has allowed swimming, fishing, and boating to become among the most popular outdoor activities for many District residents and attracts many visitors. Others frequently take advantage of the abundant opportunities afforded for duck hunting, bird watching, photography, and other nature related activities.”(from likelihood of harm to lakes report).ObjectiveThe objective of this work was to create a consistent database of natural lake polygon features for the St. Johns River Water Management District. Other databases examined contained point features only, polygons representing a wide range of dates, water bodies not separated or coded adequately by feature type (i.e. no distinctions were made between lakes, rivers, excavations, etc.), or were incomplete. This new database will allow users to better characterize and measure the lakes resource of the District, allowing comparisons to be made and trends detected; thereby facilitating better protection and management of the resource.BackgroundPrior to creation of this database, the District had 2 waterbody databases. 
The first of these, the 2002 FDEP Primary Lake Location database, contained 3859 lake point features, state-wide, 1418 of which were in SJRWMD. Only named lakes were included. Data sources were the Geographic Names Information System (GNIS), USGS 1:24000 hydrography data, 1994 Digital orthophoto quarter quadrangles (DOQQs), and USGS digital raster graphics (DRGs). The second was the SJRWMD Hydrologic Network (Lake / Pond and Reservoir classes). This data base contained 42,002 lake / pond and reservoir features for the SJRWMD. Lakes with multiple pools of open water were often mapped as multiple features and many man-made features (borrow pits, reservoirs, etc.) were included. This dataset was developed from USGS map data of varying dates.MethodsPolygons in this new lakes dataset were derived from a "wet period" landcover map (SJRWMD, 1999), in which most lake levels were relatively high. Polygons from other dates, mostly 2009, were used for lakes in regionally dry locations or for lakes that were uncharacteristically wet in 1999, e.g. Alachua Sink. Our intension was to capture lakes in a basin-full condition; neither unusually high nor low. To build the data set, a selection was made of polygons coded as lakes (5200), marshy lakes (5250, enclosed saltwater ponds in salt marsh (5430), slough waters (5600), and emergent aquatic vegetation (6440). Some large, regionally significant or named man-made reservoirs were also included, as well as a small number of named excavations. All polygons were inspected and edited, where appropriate, to correct lake shores and merge adjacent lake basin features. Water polygons separated by marshes or other low-ground features were grouped and merged to form multipart features when clearly associated within a single lake basin. The initial set of lake names were captured from the Florida Primary Lake Location database. Labels were then moved where needed to insure that they fell within the water bodies referenced. 
Additional lake names were hand entered using data from USGS 7.5 minute quads, Google Maps, MapQuest, Florida Department of Transportation (FDOT) county maps, and other sources. The final dataset contains 4892 polygons, many of which are multi-part.Operationally, lakes, as captured in this data base, are those features that were identified and mapped using the District’s landuse/landcover scheme in the 5200, 5250, 5430, 5600 classes referenced above; in addition to some areas mapped tin the 6440 class. Some additional features named as lakes, ponds, or reservoirs were also included, even when not currently appearing to be lakes. Some are now very marshy or even dry, but apparently held deeper pools of water in the past. A size limit of 1 acre or more was enforced, except for named features, 30 of which were smaller. The smallest lake was Fox Lake, a doline of 0.04 acres in Orange county. The largest lake, Lake George covered 43,212.8 acres.The lakes of the SJRWMD are a diverse set of features that may be classified in many ways. These include: by surrounding landforms or landcover, by successional stage (lacustrine to palustrine gradient), by hydrology (presence of inflows and/or outflows, groundwater linkages, permanence, etc.), by water quality (trophic state, water color, dissolved solids, etc.), and by origin. We chose to classify the lakes in this set by origin, based on the lake type concepts of Hutchinson (1957). These types are listed in the table below (Table 1). We added some additional types and modified the descriptions to better reflect Florida’s geological conditions (Table 2). Some types were readily identified, others are admittedly conjectural or were of mixed origins, making it difficult to pick a primary mechanism. Geological map layers, particularly total thickness of overburden above the Floridan aquifer system and thickness of the intermediate confining unit, were used to estimate the likelihood of sinkhole formation. 
Wind sculpting appears to be common and sometimes is a primary mechanism but can be difficult to judge from remotely sensed imagery. For these and others, the classification should be considered provisional. Many District lakes appear to have been formed by several processes, for instance, sinkholes may occur within lakes which lie between sand dunes. Here these would be classified as dune / karst. Mixtures of dunes, deflation and karst are common. Saltmarsh ponds vary in origin and were not further classified. In the northern coastal area they are generally small, circular in outline and appear to have been formed by the collapse and breakdown of a peat substrate, Hutchinson type 70. Further south along the coast additional ponds have been formed by the blockage of tidal creeks, a fluvial process, perhaps of Hutchinson’s Type 52, lateral lakes, in which sediments deposited by a main stream back up the waters of a tributary. In the area of the Cape Canaveral, many salt marsh ponds clearly occupy dune swales flooded by rising ocean levels. A complete listing of lake types and combinations is in Table 3. 
Type | Sub-Type | Secondary Type
Tectonic Basins | Marine Basin |
Tectonic Basins | Marine Basin | Compound doline
Tectonic Basins | Marine Basin | karst
Tectonic Basins | Marine Basin | Phytogenic dam
Tectonic Basins | Marine Basin | Abandoned channel
Tectonic Basins | Marine Basin | Karst
Solution Lakes | Compound doline |
Solution Lakes | Compound doline | Fluvial
Solution Lakes | Compound doline | Phytogenic
Solution Lakes | Doline |
Solution Lakes | Doline | Deflation
Solution Lakes | Doline | Dredged
Solution Lakes | Doline | Excavated
Solution Lakes | Doline | Excavation
Solution Lakes | Doline | Fluvial
Solution Lakes | Karst | Karst / Excavation
Solution Lakes | Karst | Karst / Fluvial
Solution Lakes | Karst | Deflation
Solution Lakes | Karst | Deflation / excavation
Solution Lakes | Karst | Excavation
Solution Lakes | Karst | Fluvial
Solution Lakes | Polje |
Solution Lakes | Spring pool |
Solution Lakes | Spring pool | Fluvial
Fluvial | Abandoned channel |
Fluvial | Fluvial |
Fluvial | Fluvial | Phytogenic
Fluvial | Levee |
Fluvial | Oxbow lake |
Fluvial | Strath |
Fluvial | Strath | Phytogenic
Aeolian | Deflation |
Aeolian | Deflation | Dune
Aeolian | Deflation | Excavation
Aeolian | Deflation | Karst
Aeolian | Dune |
Aeolian | Dune | Deflation
Aeolian | Dune | Excavation
Aeolian | Dune |
Aeolian | Dune | Karst
Shoreline lakes | Maritime coastal | Karst / Excavation
Organic accumulation | Phytogenic dam |
Salt Marsh Ponds | |
Man made | Excavation |
Man made | Dam |
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data collected during a study "Understanding the development of public data ecosystems: from a conceptual model to a six-generation model of the evolution of public data ecosystems" conducted by Martin Lnenicka (University of Hradec Králové, Czech Republic), Anastasija Nikiforova (University of Tartu, Estonia), Mariusz Luterek (University of Warsaw, Warsaw, Poland), Petar Milic (University of Pristina - Kosovska Mitrovica, Serbia), Daniel Rudmark (Swedish National Road and Transport Research Institute, Sweden), Sebastian Neumaier (St. Pölten University of Applied Sciences, Austria), Karlo Kević (University of Zagreb, Croatia), Anneke Zuiderwijk (Delft University of Technology, Delft, the Netherlands), Manuel Pedro Rodríguez Bolívar (University of Granada, Granada, Spain).
There is a lack of understanding of the elements that constitute different types of value-adding public data ecosystems and of how these elements form and shape the development of these ecosystems over time, which can lead to misguided efforts to develop future public data ecosystems. The aim of the study is therefore (1) to explore how public data ecosystems have developed over time and (2) to identify the value-adding elements and formative characteristics of public data ecosystems. Using an exploratory retrospective analysis and a deductive approach, we systematically review 148 studies published between 1994 and 2023. Based on the results, this study presents a typology of public data ecosystems, develops a conceptual model of the elements and formative characteristics that contribute most to value-adding public data ecosystems, and develops a conceptual model of the evolutionary generations of public data ecosystems, represented by six generations and called the Evolutionary Model of Public Data Ecosystems (EMPDE). Finally, three avenues for a future research agenda are proposed.
This dataset is being made public to act as supplementary data for "Understanding the development of public data ecosystems: from a conceptual model to a six-generation model of the evolution of public data ecosystems", Telematics and Informatics, and for the Systematic Literature Review component that informs the study.
***Description of the data in this data set***
PublicDataEcosystem_SLR.docx provides the structure of the protocol
PDEtypes.png provides a typology of public data ecosystems
PDE_conceptual_model.png provides a conceptual model of elements and formative characteristics that contribute most to value-adding public data ecosystems
PublicDataEcosystem_SLR.xlsx, Spreadsheet #1 provides the list of results after the search over three indexing databases and filtering out irrelevant studies
Spreadsheet #2 provides the protocol structure.
Spreadsheet #3 provides the filled protocol for relevant studies.
The information on each selected study - presented in PublicDataEcosystem_SLR.xlsx - was collected in four categories:
(1) descriptive information,
(2) approach- and research design- related information,
(3) quality-related information,
(4) Public Data Ecosystem-related information
***Format of the file***
.xls, .csv (for the first spreadsheet only), .docx
***Licenses or restrictions***
CC-BY
For more info, see README.txt
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PUDL v2025.2.0 Data Release
This is our regular quarterly release for 2025Q1. It includes updates to all the datasets that are published with quarterly or higher frequency, plus initial versions of a few new data sources that have been in the works for a while.
One major change this quarter is that we are now publishing all processed PUDL data as Apache Parquet files, alongside our existing SQLite databases. See Data Access for more on how to access these outputs.
Some potentially breaking changes to be aware of:
In the EIA Form 930 – Hourly and Daily Balancing Authority Operations Report, a number of new energy sources have been added, and some old energy sources have been split into more granular categories. See Changes in energy source granularity over time.
We are now running the EPA’s CAMD to EIA unit crosswalk code for each individual year starting from 2018, rather than just 2018 and 2021, resulting in more connections between these two datasets and changes to some sub-plant IDs. See the note below for more details.
Many thanks to the organizations who make these regular updates possible! Especially GridLab, RMI, and the ZERO Lab at Princeton University. If you rely on PUDL and would like to help ensure that the data keeps flowing, please consider joining them as a PUDL Sustainer, as we are still fundraising for 2025.
New Data
EIA 176
Added a couple of semi-transformed interim EIA-176 (natural gas sources and dispositions) tables. They aren’t yet being written to the database, but are one step closer. See #3555 and PRs #3590, #3978. Thanks to @davidmudrauskas for moving this dataset forward.
Extracted these interim tables up through the latest 2023 data release. See #4002 and #4004.
EIA 860
Added EIA 860 Multifuel table. See #3438 and #3946.
FERC 1
Added three new output tables containing granular utility accounting data. See #4057, #3642 and the table descriptions in the data dictionary:
out_ferc1_yearly_detailed_income_statements
out_ferc1_yearly_detailed_balance_sheet_assets
out_ferc1_yearly_detailed_balance_sheet_liabilities
SEC Form 10-K Parent-Subsidiary Ownership
We have added some new tables describing the parent-subsidiary company ownership relationships reported in the SEC’s Form 10-K, Exhibit 21 “Subsidiaries of the Registrant”. Where possible these tables link the SEC filers or their subsidiary companies to the corresponding EIA utilities. This work was funded by a grant from the Mozilla Foundation. Most of the ML models and data preparation took place in the mozilla-sec-eia repository separate from the main PUDL ETL, as it requires processing hundreds of thousands of PDFs and the deployment of some ML experiment tracking infrastructure. The new tables are handed off as nearly finished products to the PUDL ETL pipeline. Note that these are preliminary, experimental data products and are known to be incomplete and to contain errors. Extracting data tables from unstructured PDFs and the SEC to EIA record linkage are necessarily probabilistic processes.
See PRs #4026, #4031, #4035, #4046, #4048, #4050 and check out the table descriptions in the PUDL data dictionary:
out_sec10k_parents_and_subsidiaries
core_sec10k_quarterly_filings
core_sec10k_quarterly_exhibit_21_company_ownership
core_sec10k_quarterly_company_information
Expanded Data Coverage
EPA CEMS
Added 2024 Q4 of CEMS data. See #4041 and #4052.
EPA CAMD EIA Crosswalk
In the past, the crosswalk in PUDL has used the EPA’s published crosswalk (run with 2018 data), and an additional crosswalk we ran with 2021 EIA 860 data. To ensure that the crosswalk reflects updates in both EIA and EPA data, we re-ran the EPA R code which generates the EPA CAMD EIA crosswalk with 4 new years of data: 2019, 2020, 2022 and 2023. Re-running the crosswalk pulls the latest data from the CAMD FACT API, which results in some changes to the generator and unit IDs reported on the EPA side of the crosswalk, which feeds into the creation of core_epa_assn_eia_epacamd.
The changes only result in the addition of new units and generators in the EPA data, with no changes to matches at the plant level. However, the updates to generator and unit IDs have resulted in changes to the subplant IDs: some EIA boilers and generators which previously had no matches to EPA data have now been matched to EPA unit data, resulting in an overall reduction in the number of rows in the core_epa_assn_eia_epacamd_subplant_ids table. See issue #4039 and PR #4056 for a discussion of the changes observed in the course of this update.
EIA 860M
Added EIA 860m through December 2024. See #4038 and #4047.
EIA 923
Added EIA 923 monthly data through September 2024. See #4038 and #4047.
EIA Bulk Electricity Data
Updated the EIA Bulk Electricity data to include data published up through 2024-11-01. See #4042 and PR #4051.
EIA 930
Updated the EIA 930 data to include data published up through the beginning of February 2025. See #4040 and PR #4054. 10 new energy sources were added and 3 were retired; see Changes in energy source granularity over time for more information.
Bug Fixes
Fix an accidentally swapped set of starting balance / ending balance column rename parameters in the pre-2021 DBF derived data that feeds into core_ferc1_yearly_other_regulatory_liabilities_sched278. See issue #3952 and PRs #3969, #3979. Thanks to @yolandazzz13 for making this fix.
Added preliminary data validation checks for several FERC 1 tables that were missing them. See #3860.
Fix spelling of Lake Huron and Lake Saint Clair in out_vcerare_hourly_available_capacity_factor and related tables. See issue #4007 and PR #4029.
Quality of Life Improvements
We added a sources parameter to pudl.metadata.classes.DataSource.from_id() in order to make it possible to use the pudl-archiver repository to archive datasets that won’t necessarily be ingested into PUDL. See this PUDL archiver issue and PRs #4003 and #4013.
Other PUDL v2025.2.0 Resources
PUDL v2025.2.0 Data Dictionary
PUDL v2025.2.0 Documentation
PUDL in the AWS Open Data Registry
PUDL v2025.2.0 in a free, public AWS S3 bucket: s3://pudl.catalyst.coop/v2025.2.0/
PUDL v2025.2.0 in a requester-pays GCS bucket: gs://pudl.catalyst.coop/v2025.2.0/
Zenodo archive of the PUDL GitHub repo for this release
PUDL v2025.2.0 release on GitHub
PUDL v2025.2.0 package in the Python Package Index (PyPI)
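Since this release publishes processed tables as Apache Parquet files and lists a public S3 bucket among its resources, the following is a minimal sketch of how a table might be read with pandas. The object layout (one `<table>.parquet` file directly under the release prefix) is an assumption not stated in these notes, and `parquet_url` and `read_table` are hypothetical helper names, not part of the PUDL package; check the Data Access documentation for the actual paths.

```python
"""Sketch: locate and read a PUDL Parquet output from the public S3 bucket."""

# Bucket and release prefix as listed in the release notes.
BUCKET = "s3://pudl.catalyst.coop"
RELEASE = "v2025.2.0"


def parquet_url(table: str, release: str = RELEASE, bucket: str = BUCKET) -> str:
    """Return the assumed location of one table's Parquet file.

    ASSUMPTION: each table is stored as `<table>.parquet` directly under
    the release prefix. This layout is not confirmed by the release notes.
    """
    if not table or "/" in table:
        raise ValueError(f"not a plain table name: {table!r}")
    return f"{bucket}/{release}/{table}.parquet"


def read_table(table: str, release: str = RELEASE):
    """Load one table into a pandas DataFrame (needs pandas, pyarrow, s3fs)."""
    # Imported lazily so that URL building works without pandas installed.
    import pandas as pd

    # anon=True asks s3fs for unsigned requests, which suits a public bucket.
    return pd.read_parquet(
        parquet_url(table, release),
        storage_options={"anon": True},
    )


if __name__ == "__main__":
    # Print the assumed URL for one of the new FERC 1 output tables.
    print(parquet_url("out_ferc1_yearly_detailed_income_statements"))
```

Calling `read_table("core_sec10k_quarterly_filings")` would then attempt an anonymous download of that table, provided the assumed path exists.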
Contact Us
If you're using PUDL, we would love to hear from you! Even if it's just a note to let us know that you exist, and how you're using the software or data. Here's a bunch of different ways to get in touch:
Follow us on GitHub
Use the PUDL GitHub issue tracker to let us know about any bugs or data issues you encounter
GitHub Discussions is where we provide user support.
Watch our GitHub Project to see what we're working on.
Email us at hello@catalyst.coop for private communications.
On Mastodon: @CatalystCoop@mastodon.energy
On BlueSky: @catalyst.coop
On Twitter: @CatalystCoop
Connect with us on LinkedIn
Play with our data and notebooks on Kaggle
Combine our data with ML models on HuggingFace
Learn more about us on our website: https://catalyst.coop
Subscribe to our announcements list for email updates.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Our most comprehensive database of AI models, containing over 800 models that are state of the art, highly cited, or otherwise historically notable. It tracks key factors driving machine learning progress and includes over 300 training compute estimates.