36 datasets found
  1. Sample Dataset - HR Subject Areas

    • data.niaid.nih.gov
    Updated Jan 18, 2023
    Cite
    Weber, Marc (2023). Sample Dataset - HR Subject Areas [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7447111
    Dataset updated
    Jan 18, 2023
    Dataset authored and provided by
    Weber, Marc
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset created as part of the Master Thesis "Business Intelligence – Automation of Data Marts modeling and its data processing".

    Lucerne University of Applied Sciences and Arts

    Master of Science in Applied Information and Data Science (MScIDS)

    Autumn Semester 2022

    Change log Version 1.1:

    The following SQL scripts were added:

        Index  Type              Name
        1      View              pg.dictionary_table
        2      View              pg.dictionary_column
        3      View              pg.dictionary_relation
        4      View              pg.accesslayer_table
        5      View              pg.accesslayer_column
        6      View              pg.accesslayer_relation
        7      View              pg.accesslayer_fact_candidate
        8      Stored Procedure  pg.get_fact_candidate
        9      Stored Procedure  pg.get_dimension_candidate
        10     Stored Procedure  pg.get_columns

    The scripts are based on Microsoft SQL Server 2017 and are compatible with a data warehouse built with Datavault Builder. The object scripts for the sample data warehouse are restricted and cannot be shared.

  2. Most popular database management systems worldwide 2024

    • statista.com
    Updated Jun 19, 2024
    Cite
    Statista (2024). Most popular database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/809750/worldwide-popularity-ranking-database-management-systems/
    Dataset updated
    Jun 19, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 2024
    Area covered
    Worldwide
    Description

    As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of 1244.08; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.

    Database Management Systems

    As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.

  3. Spider Realistic Dataset In Structure-Grounded Pretraining for Text-to-SQL

    • zenodo.org
    bin, json, txt
    Updated Aug 16, 2021
    + more versions
    Cite
    Xiang Deng; Ahmed Hassan Awadallah; Christopher Meek; Oleksandr Polozov; Huan Sun; Matthew Richardson (2021). Spider Realistic Dataset In Structure-Grounded Pretraining for Text-to-SQL [Dataset]. http://doi.org/10.5281/zenodo.5205322
    Available download formats: txt, json, bin
    Dataset updated
    Aug 16, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Xiang Deng; Ahmed Hassan Awadallah; Christopher Meek; Oleksandr Polozov; Huan Sun; Matthew Richardson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This folder contains the Spider-Realistic dataset used for evaluation in the paper "Structure-Grounded Pretraining for Text-to-SQL". The dataset is created based on the dev split of the Spider dataset (2020-06-07 version from https://yale-lily.github.io/spider). We manually modified the original questions to remove the explicit mention of column names while keeping the SQL queries unchanged to better evaluate the model's capability in aligning the NL utterance and the DB schema. For more details, please check our paper at https://arxiv.org/abs/2010.12773.

    It contains the following files:

    - spider-realistic.json
    # The spider-realistic evaluation set
    # Examples: 508
    # Databases: 19
    - dev.json
    # The original dev split of Spider
    # Examples: 1034
    # Databases: 20
    - tables.json
    # The original DB schemas from Spider
    # Databases: 166
    - README.txt
    - license

    The Spider-Realistic dataset is created based on the dev split of the Spider dataset released by Yu, Tao, et al. "Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task." It is a subset of the original dataset with explicit mentions of the column names removed. The SQL queries and databases are kept unchanged.
    For the format of each json file, please refer to the github page of Spider https://github.com/taoyds/spider.
    For the database files please refer to the official Spider release https://yale-lily.github.io/spider.
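
    The example and database counts listed above can be checked after download with a short script. This is a minimal sketch, assuming each JSON file is a list of objects carrying a `db_id` field as in the Spider format; the helper names `summarize_examples` and `summarize_file` are hypothetical and not part of the release.

```python
import json
from pathlib import Path


def summarize_examples(examples):
    """Return (number of examples, number of distinct databases).

    Assumes each example is a dict with a "db_id" key (Spider-style).
    """
    db_ids = {example["db_id"] for example in examples}
    return len(examples), len(db_ids)


def summarize_file(path):
    """Load a JSON file holding a list of examples and summarize it."""
    with Path(path).open(encoding="utf-8") as handle:
        return summarize_examples(json.load(handle))


if __name__ == "__main__":
    # Tiny in-memory stand-in; point summarize_file at spider-realistic.json
    # or dev.json to check the real files.
    sample = [
        {"db_id": "concert_singer", "question": "How many singers are there?"},
        {"db_id": "concert_singer", "question": "List all singers."},
        {"db_id": "pets_1", "question": "How many pets are there?"},
    ]
    print(summarize_examples(sample))
```

    Calling `summarize_file("spider-realistic.json")` should then report the figures given in the file list if the assumption about the file layout holds.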

    This dataset is distributed under the CC BY-SA 4.0 license.

    If you use the dataset, please cite the following papers including the original Spider datasets, Finegan-Dollak et al., 2018 and the original datasets for Restaurants, GeoQuery, Scholar, Academic, IMDB, and Yelp.

    @article{deng2020structure,
    title={Structure-Grounded Pretraining for Text-to-SQL},
    author={Deng, Xiang and Awadallah, Ahmed Hassan and Meek, Christopher and Polozov, Oleksandr and Sun, Huan and Richardson, Matthew},
    journal={arXiv preprint arXiv:2010.12773},
    year={2020}
    }

    @inproceedings{Yu&al.18c,
    year = 2018,
    title = {Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task},
    booktitle = {EMNLP},
    author = {Tao Yu and Rui Zhang and Kai Yang and Michihiro Yasunaga and Dongxu Wang and Zifan Li and James Ma and Irene Li and Qingning Yao and Shanelle Roman and Zilin Zhang and Dragomir Radev }
    }

    @InProceedings{P18-1033,
    author = "Finegan-Dollak, Catherine
    and Kummerfeld, Jonathan K.
    and Zhang, Li
    and Ramanathan, Karthik
    and Sadasivam, Sesh
    and Zhang, Rui
    and Radev, Dragomir",
    title = "Improving Text-to-SQL Evaluation Methodology",
    booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2018",
    publisher = "Association for Computational Linguistics",
    pages = "351--360",
    location = "Melbourne, Australia",
    url = "http://aclweb.org/anthology/P18-1033"
    }

    @InProceedings{data-sql-imdb-yelp,
    dataset = {IMDB and Yelp},
    author = {Navid Yaghmazadeh and Yuepeng Wang and Isil Dillig and Thomas Dillig},
    title = {SQLizer: Query Synthesis from Natural Language},
    booktitle = {International Conference on Object-Oriented Programming, Systems, Languages, and Applications, ACM},
    month = {October},
    year = {2017},
    pages = {63:1--63:26},
    url = {http://doi.org/10.1145/3133887},
    }

    @article{data-academic,
    dataset = {Academic},
    author = {Fei Li and H. V. Jagadish},
    title = {Constructing an Interactive Natural Language Interface for Relational Databases},
    journal = {Proceedings of the VLDB Endowment},
    volume = {8},
    number = {1},
    month = {September},
    year = {2014},
    pages = {73--84},
    url = {http://dx.doi.org/10.14778/2735461.2735468},
    }

    @InProceedings{data-atis-geography-scholar,
    dataset = {Scholar, and Updated ATIS and Geography},
    author = {Srinivasan Iyer and Ioannis Konstas and Alvin Cheung and Jayant Krishnamurthy and Luke Zettlemoyer},
    title = {Learning a Neural Semantic Parser from User Feedback},
    booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
    year = {2017},
    pages = {963--973},
    location = {Vancouver, Canada},
    url = {http://www.aclweb.org/anthology/P17-1089},
    }

    @inproceedings{data-geography-original,
    dataset = {Geography, original},
    author = {John M. Zelle and Raymond J. Mooney},
    title = {Learning to Parse Database Queries Using Inductive Logic Programming},
    booktitle = {Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 2},
    year = {1996},
    pages = {1050--1055},
    location = {Portland, Oregon},
    url = {http://dl.acm.org/citation.cfm?id=1864519.1864543},
    }

    @inproceedings{data-restaurants-logic,
    author = {Lappoon R. Tang and Raymond J. Mooney},
    title = {Automated Construction of Database Interfaces: Integrating Statistical and Relational Learning for Semantic Parsing},
    booktitle = {2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora},
    year = {2000},
    pages = {133--141},
    location = {Hong Kong, China},
    url = {http://www.aclweb.org/anthology/W00-1317},
    }

    @inproceedings{data-restaurants-original,
    author = {Ana-Maria Popescu and Oren Etzioni and Henry Kautz},
    title = {Towards a Theory of Natural Language Interfaces to Databases},
    booktitle = {Proceedings of the 8th International Conference on Intelligent User Interfaces},
    year = {2003},
    location = {Miami, Florida, USA},
    pages = {149--157},
    url = {http://doi.acm.org/10.1145/604045.604070},
    }

    @inproceedings{data-restaurants,
    author = {Alessandra Giordani and Alessandro Moschitti},
    title = {Automatic Generation and Reranking of SQL-derived Answers to NL Questions},
    booktitle = {Proceedings of the Second International Conference on Trustworthy Eternal Systems via Evolving Software, Data and Knowledge},
    year = {2012},
    location = {Montpellier, France},
    pages = {59--76},
    url = {https://doi.org/10.1007/978-3-642-45260-4_5},
    }

  4. SQL Server Monitoring Tools Market size will grow at a CAGR of 5.50% from...

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Cite
    Cognitive Market Research (2025). SQL Server Monitoring Tools Market size will grow at a CAGR of 5.50% from 2023 to 2030! [Dataset]. https://www.cognitivemarketresearch.com/sql-server-monitoring-tools-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    According to Cognitive Market Research, the global SQL Server Monitoring Tools market will be USD XX million in 2023 and will expand at a compound annual growth rate (CAGR) of 5.50% from 2023 to 2030.

    - North America held the largest share, more than 40% of global revenue, and will grow at a compound annual growth rate (CAGR) of 3.7% from 2023 to 2030.
    - Europe accounted for over 30% of global revenue and is projected to expand at a CAGR of 5.50% from 2023 to 2030.
    - Asia Pacific held more than 23% of global revenue and will grow at a CAGR of 7.5% from 2023 to 2030.
    - Latin America has more than 5% of global revenue and will grow at a CAGR of 4.9% from 2023 to 2030.
    - Middle East and Africa held more than 3% of global revenue and will grow at a CAGR of 5.2% from 2023 to 2030.
    - Demand for SQL Server Monitoring Tools is rising due to the increasing complexity and volume of data.
    - Demand for Web remains higher in the SQL Server Monitoring Tools market.
    - The consumer and retail category held the highest SQL Server Monitoring Tools market revenue share in 2023.
    

    Increasing Complexity and Volume of Data to Provide Viable Market Output

    In today's data-intensive market, enterprises must deal with massive data quantities, which strains SQL Server performance. To solve this difficulty, monitoring solutions have become essential for guaranteeing the proper operation and availability of crucial workloads. These technologies monitor database performance parameters in real-time, finding bottlenecks and optimizing queries to improve overall system efficiency. Organizations may reduce performance concerns, avoid downtime, and ensure database dependability by proactively monitoring SQL Server environments. As a result, SQL Server monitoring solutions play an important role in assisting businesses as they traverse the complexity of maintaining and extracting value from large amounts of information.

    Digital Transformation to Propel Market Growth
    

    The growing reliance on digital services and apps has increased the need for performance monitoring and uptime technologies. Maintaining consistent performance becomes critical as businesses rely more on digital platforms for operations, customer interactions, and data management. Real-time monitoring, optimization, and troubleshooting tools are critical for avoiding disruptions and downtime while providing a consistent user experience. This increased demand reflects a growing realization of the vital role that digital services play in modern operations, prompting organizations to invest in solutions that ensure the performance and availability of their digital infrastructure.

    Market Restraints of the SQL Server Monitoring Tools

    High Cost to Restrict Market Growth
    

    Monitoring tool adoption and maintenance costs can be prohibitively expensive for smaller enterprises. While these technologies are critical for guaranteeing optimal system performance, smaller companies' financial constraints may limit their use. The initial setup costs, recurring license fees, and the need for qualified personnel to manage and interpret monitoring data can all burden tight budgets. As a result, smaller firms may need to carefully consider cost-effective alternatives or alternate techniques to overcome these constraints while still providing important monitoring capabilities without jeopardizing their financial stability.

    Impact of COVID–19 on the SQL Server Monitoring Tools Market

    COVID-19 had a dual impact on the market for SQL Server Monitoring Tools. On the one hand, growing remote work highlighted the significance of robust database monitoring for dispersed systems, driving up demand. On the other hand, economic uncertainty prompted some enterprises to reconsider investments, influencing purchasing decisions. The requirement for efficient database management, particularly in remote operations, fostered market resilience. Adaptable tools to manage performance difficulties were critical, reflecting a market dynamic in which the pandemic increased the adoption of monitoring solutions while influencing decision-making based on economic restrictions.

    Introduction of SQL Server Monitoring Tools

    The SQL Serv...

  5. GetSeq : a PC program for extracting data from a remote SQL database : an...

    • data.gov.au
    pdf
    Updated Jan 1, 1990
    Cite
    Bureau of Mineral Resources, Geology and Geophysics (1990). GetSeq : a PC program for extracting data from a remote SQL database : an example of client-server programming [Dataset]. https://data.gov.au/dataset/ds-ga-a05f7892-755f-7506-e044-00144fdd4fa6
    Available download formats: pdf
    Dataset updated
    Jan 1, 1990
    Dataset provided by
    Bureau of Mineral Resources, Geology and Geophysics
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Legacy product - no abstract available

  6. Purchase Order Data

    • data.ca.gov
    csv, docx, pdf
    Updated Oct 23, 2019
    Cite
    California Department of General Services (2019). Purchase Order Data [Dataset]. https://data.ca.gov/dataset/purchase-order-data
    Available download formats: pdf, csv, docx
    Dataset updated
    Oct 23, 2019
    Dataset authored and provided by
    California Department of General Services
    Description

    The State Contract and Procurement Registration System (SCPRS) was established in 2003 as a centralized database of information on State contracts and purchases over $5000. eSCPRS represents the data captured in the State's eProcurement (eP) system, Bidsync, as of March 16, 2009. The data provided is an extract from that system for fiscal years 2012-2013, 2013-2014, and 2014-2015.

    Data Limitations:
    Some purchase orders have multiple UNSPSC numbers; however, only the first was used to identify the purchase order. Multiple UNSPSC numbers were included to provide additional data for a DGS special event; however, this affects the formatting of the file. The source system Bidsync is being deprecated and these issues will be resolved in the future as state systems transition to Fi$cal.

    Data Collection Methodology:

    The data collection process starts with a data file from eSCPRS that is scrubbed and standardized prior to being uploaded into a SQL Server database. There are four primary tables. The Supplier, Department and United Nations Standard Products and Services Code (UNSPSC) tables are reference tables. The Supplier and Department tables are updated and mapped to the appropriate numbering schema and naming conventions. The UNSPSC table is used to categorize line item information and requires no further manipulation. The Purchase Order table contains raw data that requires conversion to the correct data format and mapping to the corresponding data fields. A stacking method is applied to the table to eliminate blanks where needed. Extraneous characters are removed from fields. The four tables are joined together and queries are executed to update the final Purchase Order Dataset table. Once the scrubbing and standardization process is complete the data is then uploaded into the SQL Server database.
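
    The join step described above can be sketched roughly as follows. This is a minimal illustration, not the department's actual process: it uses an in-memory SQLite database as a stand-in for SQL Server, and every table name, column name, and sample value (`supplier_code`, `total_price`, 'Acme Supply', and so on) is a hypothetical placeholder.

```python
import sqlite3

# Hypothetical schema: three reference tables plus the raw purchase order table.
SCHEMA = """
CREATE TABLE supplier   (supplier_code TEXT PRIMARY KEY, supplier_name TEXT);
CREATE TABLE department (department_code TEXT PRIMARY KEY, department_name TEXT);
CREATE TABLE unspsc     (unspsc_code TEXT PRIMARY KEY, commodity_title TEXT);
CREATE TABLE purchase_order (
    po_number TEXT, supplier_code TEXT, department_code TEXT,
    unspsc_code TEXT, total_price TEXT  -- raw text, converted in the query
);
"""

# Join the four tables, trim stray whitespace, and convert the price to a number.
JOIN_QUERY = """
SELECT TRIM(po.po_number) AS po_number,
       s.supplier_name,
       d.department_name,
       u.commodity_title,
       CAST(REPLACE(REPLACE(po.total_price, '$', ''), ',', '') AS REAL) AS total_price
FROM purchase_order AS po
JOIN supplier   AS s ON s.supplier_code   = po.supplier_code
JOIN department AS d ON d.department_code = po.department_code
JOIN unspsc     AS u ON u.unspsc_code     = po.unspsc_code
ORDER BY po_number
"""


def build_purchase_order_dataset(purchase_orders):
    """Load sample reference rows, insert raw orders, and return the joined rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.execute("INSERT INTO supplier VALUES ('S1', 'Acme Supply')")
        conn.execute("INSERT INTO department VALUES ('D1', 'General Services')")
        conn.execute("INSERT INTO unspsc VALUES ('U1', 'Office supplies')")
        conn.executemany(
            "INSERT INTO purchase_order VALUES (?, ?, ?, ?, ?)", purchase_orders
        )
        return conn.execute(JOIN_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    raw_orders = [(" PO-1 ", "S1", "D1", "U1", "$5,250.00")]
    for row in build_purchase_order_dataset(raw_orders):
        print(row)
```

    Doing the cleanup inside the query keeps the raw table untouched, which makes it easier to re-run the standardization if the mapping rules change.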

    Secondary/Related Resources:

  7. Structured Query Language Server Transformation Market Trend | SQL Server...

    • emergenresearch.com
    pdf,excel,csv,ppt
    Updated Jun 21, 2022
    Cite
    Emergen Research (2022). Structured Query Language Server Transformation Market Trend | SQL Server Transformation Industry Forecast by 2030 [Dataset]. https://www.emergenresearch.com/industry-report/structured-query-language-server-transformation-market
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Jun 21, 2022
    Dataset authored and provided by
    Emergen Research
    License

    https://www.emergenresearch.com/privacy-policy

    Area covered
    Global
    Variables measured
    Base Year, No. of Pages, Growth Drivers, Forecast Period, Segments covered, Historical Data for, Pitfalls Challenges, 2030 Value Projection, Tables, Charts, and Figures, Forecast Period 2022 - 2030 CAGR, and 1 more
    Description

    The global Structured Query Language Server Transformation market size reached USD 12.42 Billion in 2021 and is expected to reach USD 26.93 Billion in 2030 registering a CAGR of 9.4%. SQL Server Transformation industry report classifies global market by share, trend, growth and based on function typ...
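
    A growth rate like the one quoted above can be sanity-checked with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below is only an illustration: the helper name `cagr` is hypothetical, and the number of compounding years is an assumption, since the excerpt does not state whether the rate is measured from 2021 or from the 2022 start of the forecast period.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Market sizes in USD billion, taken from the description above.
    # Assumption: nine compounding years between 2021 and 2030.
    implied = cagr(12.42, 26.93, 2030 - 2021)
    print(f"Implied CAGR over 2021-2030: {implied:.1%}")
```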

  8. CoSQL Dataset

    • paperswithcode.com
    Updated Aug 8, 2024
    + more versions
    Cite
    Tao Yu; Rui Zhang; He Yang Er; Suyi Li; Eric Xue; Bo Pang; Xi Victoria Lin; Yi Chern Tan; Tianze Shi; Zihan Li; Youxuan Jiang; Michihiro Yasunaga; Sungrok Shim; Tao Chen; Alexander Fabbri; Zifan Li; Luyao Chen; Yuwen Zhang; Shreya Dixit; Vincent Zhang; Caiming Xiong; Richard Socher; Walter S. Lasecki; Dragomir Radev (2024). CoSQL Dataset [Dataset]. https://paperswithcode.com/dataset/cosql
    Dataset updated
    Aug 8, 2024
    Authors
    Tao Yu; Rui Zhang; He Yang Er; Suyi Li; Eric Xue; Bo Pang; Xi Victoria Lin; Yi Chern Tan; Tianze Shi; Zihan Li; Youxuan Jiang; Michihiro Yasunaga; Sungrok Shim; Tao Chen; Alexander Fabbri; Zifan Li; Luyao Chen; Yuwen Zhang; Shreya Dixit; Vincent Zhang; Caiming Xiong; Richard Socher; Walter S. Lasecki; Dragomir Radev
    Description

    CoSQL is a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise informing of unanswerable questions.

  9. GetSeq : a PC program for extracting data from a remote SQL database : an...

    • devweb.dga.links.com.au
    Updated Jan 20, 2025
    + more versions
    Cite
    Geoscience Australia (2025). GetSeq : a PC program for extracting data from a remote SQL database : an example of client-server programming [Dataset]. https://devweb.dga.links.com.au/data/dataset/getseq-a-pc-program-for-extracting-data-from-a-remote-sql-database-an-example-of-client-server-
    Available download formats: pdf, 0main%20features32008
    Dataset updated
    Jan 20, 2025
    Dataset authored and provided by
    Geoscience Australia (http://ga.gov.au/)
    Description

    Legacy product - no abstract available

  10. Australian Employee Salary/Wages DATAbase by detailed occupation, location...

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Richard Ferrers; Australian Taxation Office (2023). Australian Employee Salary/Wages DATAbase by detailed occupation, location and year (2002-14); (plus Sole Traders) [Dataset]. http://doi.org/10.6084/m9.figshare.4522895.v5
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Richard Ferrers; Australian Taxation Office
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The ATO (Australian Tax Office) made a dataset openly available (see links) showing all the Australian Salary and Wages (2002, 2006, 2010, 2014) by detailed occupation (around 1,000) and over 100 SA4 regions. Sole Trader sales and earnings are also provided. This open data (csv) is now packaged into a database (*.sql) with 45 sample SQL queries (backupSQL[date]_public.txt). See more description at related Figshare #datavis record.

    Versions:
    V5: Following #datascience course, I have made main data (individual salary and wages) available as csv and Jupyter Notebook. Checksum matches #dataTotals. In 209,xxx rows. Also provided Jobs, and SA4(Locations) description files as csv. More details at: Where are jobs growing/shrinking? Figshare DOI: 4056282 (linked below). Noted 1% discrepancy ($6B) in 2010 wages total - to follow up.

    #dataTotals - Salary and Wages
    Year  Workers (M)  Earnings ($B)
    2002  8.5          285
    2006  9.4          372
    2010  10.2         481
    2014  10.3         584

    #dataTotal - Sole Traders
    Year  Workers (M)  Sales ($B)  Earnings ($B)
    2002  0.9          61          13
    2006  1.0          88          19
    2010  1.1          112         26
    2014  1.1          96          30

    #links
    See ATO request for data at ideascale link below.
    See original csv open data set (CC-BY) at data.gov.au link below.
    This database was used to create maps of change in regional employment - see Figshare link below (m9.figshare.4056282).

    #package
    This file package contains a database (analysing the open data) in SQL package and sample SQL text, interrogating the DB. DB name: test. There are 20 queries relating to Salary and Wages.

    #analysis
    The database was analysed and outputs provided on Nectar(.org.au) resources at: http://118.138.240.130. (offline) This is only resourced for max 1 year, from July 2016, so will expire in June 2017. Hence the filing here. The sample home page is provided here (and pdf), but not all the supporting files, which may be packaged and added later. Until then all files are available at the Nectar URL.
    Nectar URL now offline - server files attached as package (html_backup[date].zip), including php scripts, html, csv, jpegs.

    #install
    IMPORT: DB SQL dump e.g. test_2016-12-20.sql (14.8Mb)
    1. Started MAMP on OSX.
    1.1 Go to PhpMyAdmin
    2. New Database:
    3. Import: Choose file: test_2016-12-20.sql -> Go (about 15-20 seconds on MacBookPro 16Gb, 2.3 Ghz i5)
    4. four tables appeared: jobTitles 3,208 rows | salaryWages 209,697 rows | soleTrader 97,209 rows | stateNames 9 rows
       plus views e.g. deltahair, Industrycodes, states
    5. Run test query under **#; Sum of Salary by SA4 e.g. 101 $4.7B, 102 $6.9B

    #sampleSQL
    select sa4,
      (select sum(count) from salaryWages where year = '2014' and sa4 = sw.sa4) as thisYr14,
      (select sum(count) from salaryWages where year = '2010' and sa4 = sw.sa4) as thisYr10,
      (select sum(count) from salaryWages where year = '2006' and sa4 = sw.sa4) as thisYr06,
      (select sum(count) from salaryWages where year = '2002' and sa4 = sw.sa4) as thisYr02
    from salaryWages sw
    group by sa4
    order by sa4
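
    The sample query above builds one column per year with a correlated subquery. As a rough sketch, the same per-region pivot can also be written with conditional aggregation, which scans the table once. The example runs both forms against a toy in-memory SQLite table rather than the MySQL dump; the three-column layout of `salaryWages` and the sample rows are assumptions made for illustration.

```python
import sqlite3

# The dataset's sample query, with "count" quoted so it reads as a column name.
CORRELATED = """
SELECT sa4,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2014' AND sa4 = sw.sa4) AS thisYr14,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2010' AND sa4 = sw.sa4) AS thisYr10,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2006' AND sa4 = sw.sa4) AS thisYr06,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2002' AND sa4 = sw.sa4) AS thisYr02
FROM salaryWages sw
GROUP BY sa4
ORDER BY sa4
"""

# Alternative form (not from the dataset): conditional aggregation.
PIVOT = """
SELECT sa4,
  SUM(CASE WHEN year = '2014' THEN "count" END) AS thisYr14,
  SUM(CASE WHEN year = '2010' THEN "count" END) AS thisYr10,
  SUM(CASE WHEN year = '2006' THEN "count" END) AS thisYr06,
  SUM(CASE WHEN year = '2002' THEN "count" END) AS thisYr02
FROM salaryWages
GROUP BY sa4
ORDER BY sa4
"""


def run_both(rows):
    """Return (correlated-subquery result, conditional-aggregation result)."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical minimal layout; the real table has more columns.
        conn.execute('CREATE TABLE salaryWages (sa4 TEXT, year TEXT, "count" INTEGER)')
        conn.executemany("INSERT INTO salaryWages VALUES (?, ?, ?)", rows)
        return conn.execute(CORRELATED).fetchall(), conn.execute(PIVOT).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    toy_rows = [
        ("101", "2014", 10), ("101", "2014", 5),
        ("101", "2010", 7), ("102", "2014", 3),
    ]
    correlated, pivot = run_both(toy_rows)
    print(correlated == pivot)
    for row in pivot:
        print(row)
```

    Both forms return NULL for a region with no rows in a given year, so they should agree row for row on the real table as well.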

  11. SECs Compiled Financial Statements & Notes Dataset

    • kaggle.com
    Updated Jul 31, 2024
    Cite
    Deny Tran (2024). SECs Compiled Financial Statements & Notes Dataset [Dataset]. https://www.kaggle.com/datasets/denytran/im-a-dataset
    Dataset updated
    Jul 31, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Deny Tran
    License

    https://www.usa.gov/government-works/

    Description

    This dataset is from the SEC's Financial Statements and Notes Data Set.
    It was a personal project to see if I could make the queries efficient.
    It's just been collecting dust ever since, maybe someone will make good use of it.
    Data is up to about early-2024.
    It doesn't differ from the source, other than it's compiled - so maybe you can try it out, then compile your own (with the link below).
    Dataset was created using SEC Files and SQL Server on Docker.
    For details on the SQL Server database this came from, see the "dataset-previous-life-info" folder, which will contain:
    - Row Counts
    - Primary/Foreign Keys
    - SQL Statements to recreate database tables
    - Example queries on how to join the data tables.
    - A pretty picture of the table associations.

    Source: https://www.sec.gov/data-research/financial-statement-notes-data-sets

    Happy coding!

  12. Illinois Coastal Zone Water Quality Database (ICoastalDB)

    • databank.illinois.edu
    Updated Apr 1, 2025
    Cite
    Elias Getahun; Atticus Zavelle; Laura Keefer (2025). Illinois Coastal Zone Water Quality Database (ICoastalDB) [Dataset]. http://doi.org/10.13012/B2IDB-7799136_V2
    Dataset updated
    Apr 1, 2025
    Authors
    Elias Getahun; Atticus Zavelle; Laura Keefer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Illinois
    Dataset funded by
    Illinois Department of Natural Resources (IDNR) - Illinois Coastal Management Program (ICMP)
    Description

    ICoastalDB, which was developed using Microsoft structured query language (SQL) Server, consists of water quality and related data in the Illinois coastal zone that were collected by various organizations. The information in the dataset includes, but is not limited to, sample data type, method of data sampling, location, time and date of sampling and data units.

  13. Moving Violations Issued in August 2024

    • gimi9.com
    • visionzero.dc.gov
    • +3more
    Updated Aug 31, 2024
    + more versions
    Cite
    (2024). Moving Violations Issued in August 2024 [Dataset]. https://gimi9.com/dataset/data-gov_moving-violations-issued-in-august-2024/
    Dataset updated
    Aug 31, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Moving citation locations in the District of Columbia. The Vision Zero data contained in this layer pertain to moving violations issued by the District of Columbia's Metropolitan Police Department (MPD) and partner agencies with the authority. For example, DC's enforcement camera program cites speeders, blocking the box, and other moving offenses. Moving violation locations are summarized ticket counts based on time of day, week of year, year, and category of violation. Data was originally downloaded from the District Department of Motor Vehicle's eTIMS meter work order management system. Data was exported into DDOT’s SQL server, where the Office of the Chief Technology Officer (OCTO) geocoded citation data to the street segment level. Data was then visualized using the street segment centroid coordinates.

  14. Geodatabase for the Baltimore Ecosystem Study Spatial Data

    • portal.edirepository.org
    • search.dataone.org
    application/vnd.rar
    Updated May 4, 2012
    Cite
    Jarlath O'Neal-Dunne; Morgan Grove (2012). Geodatabase for the Baltimore Ecosystem Study Spatial Data [Dataset]. http://doi.org/10.6073/pasta/377da686246f06554f7e517de596cd2b
    Available download formats: application/vnd.rar (29574980 kilobyte)
    Dataset updated
    May 4, 2012
    Dataset provided by
    EDI
    Authors
    Jarlath O'Neal-Dunne; Morgan Grove
    Time period covered
    Jan 1, 1999 - Jun 1, 2014
    Description

    The establishment of a BES Multi-User Geodatabase (BES-MUG) allows for the storage, management, and distribution of geospatial data associated with the Baltimore Ecosystem Study. At present, BES data is distributed over the internet via the BES website. While having geospatial data available for download is a vast improvement over having the data housed at individual research institutions, it still suffers from some limitations. BES-MUG overcomes these limitations, improving the quality of the geospatial data available to BES researchers and thereby leading to more informed decision-making.

    BES-MUG builds on Environmental Systems Research Institute's (ESRI) ArcGIS and ArcSDE technology. ESRI was selected because its geospatial software offers robust capabilities. ArcGIS is implemented agency-wide within the USDA and is the predominant geospatial software package used by collaborating institutions.

    Commercially available enterprise database packages (DB2, Oracle, SQL) provide an efficient means to store, manage, and share large datasets. However, standard database capabilities are limited with respect to geographic datasets because they lack the ability to deal with complex spatial relationships. By using ESRI's ArcSDE (Spatial Database Engine) in conjunction with database software, geospatial data can be handled much more effectively through the implementation of the Geodatabase model. Through ArcSDE and the Geodatabase model, the database's capabilities are expanded, allowing for multiuser editing, intelligent feature types, and the establishment of rules and relationships. ArcSDE also allows users to connect to the database using ArcGIS software without being burdened by the intricacies of the database itself.

    For an example of how BES-MUG will help improve the quality and timeliness of BES geospatial data, consider a census block group layer that is in need of updating. Rather than the researcher downloading the dataset, editing it, and resubmitting it through ORS, access rules will allow the authorized user to edit the dataset over the network. Established rules will ensure that attribute and topological integrity is maintained, so that key fields are not left blank and block group boundaries stay within tract boundaries. Metadata will automatically be updated to show who edited the dataset and when, in the event any questions arise.

    Currently, a functioning prototype Multi-User Database has been developed for BES at the University of Vermont Spatial Analysis Lab, using ArcSDE and IBM's DB2 Enterprise Database as a back-end architecture. This database, which is currently only accessible to those on the UVM campus network, will shortly be migrated to a Linux server where it will be accessible for database connections over the Internet. Passwords can then be handed out to all interested researchers on the project, who will be able to make a database connection through the Geographic Information Systems software interface on their desktop computer.

    This database will include a very large number of thematic layers. Those layers are currently divided into biophysical, socio-economic and imagery categories. Biophysical includes data on topography, soils, forest cover, habitat areas, hydrology and toxics. Socio-economics includes political and administrative boundaries, transportation and infrastructure networks, property data, census data, household survey data, parks, protected areas, land use/land cover, zoning, public health and historic land use change. Imagery includes a variety of aerial and satellite imagery.

    See the readme: http://96.56.36.108/geodatabase_SAL/readme.txt

    See the file listing: http://96.56.36.108/geodatabase_SAL/diroutput.txt
    
  15. Moving Violations Issued in November 2024

    • hub.arcgis.com
    • opendata.dc.gov
    • +4 more
    Updated Jan 17, 2025
    Cite
    City of Washington, DC (2025). Moving Violations Issued in November 2024 [Dataset]. https://hub.arcgis.com/datasets/38711a587d4643739c784cef7481c6fb
    Dataset updated
    Jan 17, 2025
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Moving citation locations in the District of Columbia. The Vision Zero data contained in this layer pertain to moving violations issued by the District of Columbia's Metropolitan Police Department (MPD) and partner agencies with the authority to issue them. For example, DC's enforcement camera program issues citations for speeding, blocking the box, and other moving offenses. Moving violation locations are summarized as ticket counts by time of day, week of year, year, and category of violation. Data was originally downloaded from the District Department of Motor Vehicles' eTIMS meter work order management system. Data was exported into DDOT’s SQL server, where the Office of the Chief Technology Officer (OCTO) geocoded citation data to the street segment level. Data was then visualized using the street segment centroid coordinates.

  16. Moving Violations Issued in September 2024

    • opendata.dc.gov
    • gimi9.com
    • +3 more
    Updated Sep 1, 2024
    Cite
    City of Washington, DC (2024). Moving Violations Issued in September 2024 [Dataset]. https://opendata.dc.gov/datasets/DCGIS::moving-violations-issued-in-september-2024
    Dataset updated
    Sep 1, 2024
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Moving citation locations in the District of Columbia. The Vision Zero data contained in this layer pertain to moving violations issued by the District of Columbia's Metropolitan Police Department (MPD) and partner agencies with the authority to issue them. For example, DC's enforcement camera program issues citations for speeding, blocking the box, and other moving offenses. Moving violation locations are summarized as ticket counts by time of day, week of year, year, and category of violation. Data was originally downloaded from the District Department of Motor Vehicles' eTIMS meter work order management system. Data was exported into DDOT’s SQL server, where the Office of the Chief Technology Officer (OCTO) geocoded citation data to the street segment level. Data was then visualized using the street segment centroid coordinates.

  17. Registered Programs and Drop In Courses Offering - Dataset - CKAN

    • ckan0.cf.opendata.inter.prod-toronto.ca
    Updated Jul 23, 2019
    Cite
    (2019). Registered Programs and Drop In Courses Offering - Dataset - CKAN [Dataset]. https://ckan0.cf.opendata.inter.prod-toronto.ca/gl_ES/dataset/registered-programs-and-drop-in-courses-offering
    Dataset updated
    Jul 23, 2019
    Description

    The City of Toronto has a multitude of exciting programs for all ages. Whether the public is looking for swimming, fitness, skating, skiing, arts and crafts, or dance, the Parks, Forestry, and Recreation Division of the City of Toronto has something for everyone. Information presented in this dataset is used as the data source for the Parks, Forestry, and Recreation website. For examples of registered and drop-in recreation programs at particular locations, please refer to: Examples.

    The dataset contains 4 resources:

    • Registered Programs. Each row of the spreadsheet describes a course of a registered program. Course_ID is the unique identifier. Recreational programs fall under various categories. A registered program course runs under an activity, on a schedule, and has age limits for registrants. You can get information on course location from the Locations tab by LocationID. There are multiple courses running at one location.
    • Drop-in. Each row of the spreadsheet describes a drop-in course. A Drop-In Course is a program offered within PFR facilities where registration is not required. Drop-In Participation is informal involvement in a program, where space permits, and allows for a pay-as-you-go option (e.g. fitness class, Aquafit class). A drop-in course runs under a category, on a schedule, and has age limits for registrants. You can get information on course location from the Locations tab by LocationID. There are multiple courses running at one location.
    • Locations. Each row of the spreadsheet describes a location. LocationID is the unique identifier. A location has a name, is characterized by a type (park, rec centre), address, and short description, may be fully or partially accessible, and may have a parent location.
    • Facilities. Each row of the spreadsheet describes a facility. FacilityID is the unique identifier. A facility is described by a Type (Display Name) and has multiple assets. The 'Facility Rating' column contains facility ratings (used to learn about amenities that might be available at each facility; see metadata). Facilities of some types can be permitted; the 'Permit' column contains information on the necessary permit. You can get information on a facility's location from the Locations tab by LocationID. Every location has multiple facilities.

    The source of the data is the Parks, Forestry & Recreation SQL Server database, which combines information from the City of Toronto Recreation Management System and the Parks, Forestry & Recreation Asset Management System. The dataset updates are scheduled daily at 8:00 AM.

    Comments: The file contains data on recreation programs presented on the City of Toronto Parks, Forestry, and Recreation website. The information in the data repositories that are sources for the website is updated daily and reflects the situation as of 1 day prior to the dataset update. Changes might not be reflected right away in the dataset due to processing lags. Examples are program cancellations, community centre closures, location changes, etc. In cases where data changes require additional time for implementation, this gap may exceed 24 hours. The Parks, Forestry, and Recreation Division makes every effort to keep this Open Data set updated and accurate.

  18. Moving Violations Issued in May 2025

    • hub.arcgis.com
    • opendata.dc.gov
    Updated Jun 5, 2025
    Cite
    City of Washington, DC (2025). Moving Violations Issued in May 2025 [Dataset]. https://hub.arcgis.com/datasets/d6ae87d0da9d4a84a53417584af29b8f
    Dataset updated
    Jun 5, 2025
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Moving citation locations in the District of Columbia. The Vision Zero data contained in this layer pertain to moving violations issued by the District of Columbia's Metropolitan Police Department (MPD) and partner agencies with the authority to issue them. For example, DC's enforcement camera program issues citations for speeding, blocking the box, and other moving offenses. Moving violation locations are summarized as ticket counts by time of day, week of year, year, and category of violation. Data was originally downloaded from the District Department of Motor Vehicles' eTIMS meter work order management system. Data was exported into DDOT’s SQL server, where the Office of the Chief Technology Officer (OCTO) geocoded citation data to the street segment level. Data was then visualized using the street segment centroid coordinates.

  19. Moving Violations Issued in April 2023

    • gimi9.com
    • opendata.dc.gov
    • +3 more
    Updated Apr 30, 2023
    Cite
    (2023). Moving Violations Issued in April 2023 [Dataset]. https://gimi9.com/dataset/data-gov_moving-violations-issued-in-april-2023/
    Dataset updated
    Apr 30, 2023
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Moving citation locations in the District of Columbia. The Vision Zero data contained in this layer pertain to moving violations issued by the District of Columbia's Metropolitan Police Department (MPD) and partner agencies with the authority to issue them. For example, DC's enforcement camera program issues citations for speeding, blocking the box, and other moving offenses. Moving violation locations are summarized as ticket counts by time of day, week of year, year, and category of violation. Data was originally downloaded from the District Department of Motor Vehicles' eTIMS meter work order management system. Data was exported into DDOT’s SQL server, where the Office of the Chief Technology Officer (OCTO) geocoded citation data to the street segment level. Data was then visualized using the street segment centroid coordinates.

  20. Moving Violations Issued in January 2025

    • dc-vision-zero-dcgis.hub.arcgis.com
    • opendata.dc.gov
    • +2 more
    Updated Jan 1, 2025
    Cite
    City of Washington, DC (2025). Moving Violations Issued in January 2025 [Dataset]. https://dc-vision-zero-dcgis.hub.arcgis.com/datasets/moving-violations-issued-in-january-2025
    Dataset updated
    Jan 1, 2025
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Moving citation locations in the District of Columbia. The Vision Zero data contained in this layer pertain to moving violations issued by the District of Columbia's Metropolitan Police Department (MPD) and partner agencies with the authority to issue them. For example, DC's enforcement camera program issues citations for speeding, blocking the box, and other moving offenses. Moving violation locations are summarized as ticket counts by time of day, week of year, year, and category of violation. Data was originally downloaded from the District Department of Motor Vehicles' eTIMS meter work order management system. Data was exported into DDOT’s SQL server, where the Office of the Chief Technology Officer (OCTO) geocoded citation data to the street segment level. Data was then visualized using the street segment centroid coordinates.

