12 datasets found
  1. Canada Takes Different Approach for Pipeline

    • data.amerigeoss.org
    • data.wu.ac.at
    Updated Aug 9, 2019
    Cite
    Energy Data Exchange (2019). Canada Takes Different Approach for Pipeline [Dataset]. https://data.amerigeoss.org/en/dataset/canada-takes-different-approach-for-pipeline
    Explore at:
    Dataset updated
    Aug 9, 2019
    Dataset provided by
    Energy Data Exchange
    Area covered
    Canada
    Description

    Most of TransCanada/ExxonMobil's proposed 1,717-mile natural gas pipeline from Alaska's North Slope would be built in Canada, where it faces government scrutiny remarkably similar to the oversight under way in the United States. Canadian government agencies (federal, provincial and territorial) still must issue final approvals for the pipeline project. They are empowered to ensure the pipeline is designed, constructed and operated safely. They have a strong environmental voice over the project. This includes say-so on how the pipeline crosses streams, how land may be disturbed to trench and assemble the pipe, and what happens when the pipeline path penetrates acreage used by woodland caribou and other important wildlife. But there's one significant difference between U.S. and Canadian oversight: the pipeline project sponsor already has in hand some important Canadian authorizations, including arguably the most important ones of all, federal certificates to build and operate the pipeline. While the U.S. and Canadian governments both approved the gas pipeline project when it was initially proposed in the 1970s, project sponsors in the U.S. later gave up their rights. However, that 1970s-era pipeline project, with its certificates in hand, continues to exist in Canada. And the Alaska Pipeline Project (a joint effort of TransCanada Corp. and ExxonMobil) has structured the Canadian portion of its multibillion-dollar pipeline proposal around the plan that first gelled when Jimmy Carter was U.S. president, "Laverne & Shirley" was the top-rated TV show and Alaskans were taking their first strides as newly christened oil tycoons.

  2. temporal_cookbook_db

    • huggingface.co
    Updated Aug 20, 2020
    Cite
    Tomoro AI Ltd (2020). temporal_cookbook_db [Dataset]. https://huggingface.co/datasets/TomoroAI/temporal_cookbook_db
    Explore at:
    Dataset updated
    Aug 20, 2020
    Dataset authored and provided by
    Tomoro AI Ltd
    License

    https://choosealicense.com/licenses/cc/

    Description

    🧠 Temporal Cookbook DB

    A multi-table dataset designed to represent structured, relational data used in event extraction, temporal reasoning, and fact representation pipelines. Originally built as an SQLite database and converted into CSVs for hosting on the Hugging Face Hub. The data tables are created by processing a subset of data from jlh-ibm/earnings_call, covering the companies AMD and Nvidia.

      📦 Dataset Structure
    

    This dataset is organized as multiple… See the full description on the dataset page: https://huggingface.co/datasets/TomoroAI/temporal_cookbook_db.
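    The description's workflow (an SQLite database whose tables are dumped to per-table CSVs) can be sketched with the standard library alone. The schema below is illustrative only, not the dataset's actual tables:

    ```python
    # Hypothetical sketch of an SQLite -> CSV conversion like the one the
    # description mentions. Table and column names are made up for illustration.
    import csv
    import io
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, ticker TEXT)")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, company_id INTEGER, "
        "quarter TEXT, fact TEXT, FOREIGN KEY(company_id) REFERENCES companies(id))"
    )
    conn.execute("INSERT INTO companies VALUES (1, 'AMD'), (2, 'NVDA')")
    conn.execute("INSERT INTO events VALUES (1, 1, '2020Q2', 'revenue guidance raised')")

    def table_to_csv(conn, table):
        """Dump one table to CSV text, header row first."""
        cur = conn.execute(f"SELECT * FROM {table}")
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow([col[0] for col in cur.description])
        writer.writerows(cur.fetchall())
        return buf.getvalue()

    companies_csv = table_to_csv(conn, "companies")
    events_csv = table_to_csv(conn, "events")
    ```

    Each table becomes one CSV file, which is how a relational database is typically flattened for hosting on the Hub.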

  3. Moving Alaska Gas from Canada to the Lower 48

    • data.amerigeoss.org
    • cloud.csiss.gmu.edu
    • +1 more
    Updated Aug 9, 2019
    Cite
    Energy Data Exchange (2019). Moving Alaska Gas from Canada to the Lower 48 [Dataset]. https://data.amerigeoss.org/de/dataset/moving-alaska-gas-from-canada-to-the-lower-48
    Explore at:
    Dataset updated
    Aug 9, 2019
    Dataset provided by
    Energy Data Exchange
    Area covered
    Canada, Contiguous United States, Alaska
    Description

    What would happen to Alaska's natural gas once it reaches the end of the proposed pipeline, 1,700 miles from Prudhoe Bay? The gas would flow into a vast network of Canadian and U.S. pipelines assembled over the past 60 years. Some key components of that network were built or expanded in the early 1980s in anticipation of Alaska gas starting to flow back then. Those components went into service without Alaska gas and helped Canada double its natural gas exports to the United States in the 1980s, then double them again in the 1990s. In all, the entire network today can move 15 billion to 20 billion cubic feet a day of natural gas, roughly three to four times the volume the Alaska pipeline would deliver to the British Columbia-Alberta border northwest of Edmonton. Of course, the network still moves billions of cubic feet of gas daily. But the volume it handles has been declining, leaving room for Alaska gas, and even if the flow is relatively flush when the Alaska pipeline is finished, the network's capacity could be expanded. No longer is there serious talk of needing a pipeline stretching all the way from Prudhoe Bay to Chicago. But why end the Alaska pipeline near the B.C.-Alberta border as opposed to somewhere else? The answer is simple: Three major North American gas pipeline systems converge there, in the heart of some of Canada's hottest natural gas plays.

  4. beautiVis

    • huggingface.co
    Updated Apr 16, 2025
    Cite
    beautiVis (2025). beautiVis [Dataset]. https://huggingface.co/datasets/beautiVis/beautiVis
    Explore at:
    Dataset updated
    Apr 16, 2025
    Authors
    beautiVis
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    beautiVis

      About the Dataset
    

    beautiVis is a richly annotated dataset of 50,000+ static images sourced from Reddit's r/dataisbeautiful subreddit between February 2012 and January 2025. The dataset was built through a three-phase pipeline. Phase 1: Data Collection. First, we downloaded the complete post history from r/dataisbeautiful using the Arctic-Shift Reddit Download Tool, which provided raw JSON data containing post metadata, titles, and image URLs. During this initial… See the full description on the dataset page: https://huggingface.co/datasets/beautiVis/beautiVis.
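    In the spirit of the Phase 1 step described above, a minimal sketch of filtering raw Reddit post JSON down to static-image posts might look like this. The field names follow Reddit's public post schema (`title`, `url`, `created_utc`); the records here are invented for illustration, not taken from the dataset:

    ```python
    # Hypothetical sketch: keep only posts whose URL points at a static image.
    # The JSON records below are made-up examples, not actual beautiVis data.
    import json

    raw = json.loads("""[
      {"title": "[OC] Global temperatures", "url": "https://i.redd.it/abc.png", "created_utc": 1700000000},
      {"title": "Discussion thread", "url": "https://reddit.com/r/dataisbeautiful/xyz", "created_utc": 1700000100}
    ]""")

    IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")

    def image_posts(posts):
        """Keep only posts whose URL ends in a static-image file extension."""
        return [p for p in posts if p["url"].lower().endswith(IMAGE_SUFFIXES)]

    kept = image_posts(raw)
    ```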

  5. PARROT

    • huggingface.co
    Cite
    momo, PARROT [Dataset]. https://huggingface.co/datasets/momo006/PARROT
    Explore at:
    Authors
    momo
    Description

    PARROT is a large-scale benchmark designed to evaluate the ability of models to generate executable data preparation (DP) pipelines from natural language instructions. It introduces a new task that aims to lower the technical barrier of data preparation by translating human-written instructions into code. To reflect real-world usage, the benchmark includes ~18,000 pipelines spanning 16 core transformation operations, built from 23,009 tables across six public datasets. This benchmark is… See the full description on the dataset page: https://huggingface.co/datasets/momo006/PARROT.

  6. SynCPR

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Delong Liu (2025). SynCPR [Dataset]. https://huggingface.co/datasets/a1557811266/SynCPR
    Explore at:
    Dataset updated
    Jun 1, 2025
    Authors
    Delong Liu
    Description

    SynCPR Dataset

      Overview
    

    The SynCPR dataset is a large-scale, fully synthetic dataset designed specifically for the composed person retrieval task. Built using our automated construction pipeline, SynCPR offers unmatched diversity, quality, and realism for person-centric image retrieval research.

    For more details, see https://github.com/Delong-liu-bupt/Composed_Person_Retrieval

      Construction Pipeline
    

    The dataset is constructed in three main stages:

    Textual… See the full description on the dataset page: https://huggingface.co/datasets/a1557811266/SynCPR.

  7. S2R-HDR-2

    • huggingface.co
    Updated May 29, 2025
    + more versions
    Cite
    Yujin (2025). S2R-HDR-2 [Dataset]. https://huggingface.co/datasets/iimmortall/S2R-HDR-2
    Explore at:
    Dataset updated
    May 29, 2025
    Authors
    Yujin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    🌐Homepage | 📖Arxiv | GitHub

      ✨Dataset Summary
    

    S2R-HDR is a large-scale synthetic dataset for high dynamic range (HDR) reconstruction tasks. It contains 1,000 motion sequences, each comprising 24 images at 1920×1080 resolution, with a total of 24,000 images. To support flexible data augmentation, all images are stored in EXR format with linear HDR values. The dataset is rendered using Unreal Engine 5 and our custom pipeline built upon XRFeitoria, encompassing diverse dynamic… See the full description on the dataset page: https://huggingface.co/datasets/iimmortall/S2R-HDR-2.

  8. US Electric Grid Outages

    • kaggle.com
    Updated Apr 1, 2025
    Cite
    willian oliveira (2025). US Electric Grid Outages [Dataset]. http://doi.org/10.34740/kaggle/dsv/11245146
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    Kaggle
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The United States electric grid, a vast and complex infrastructure, has experienced numerous outages from 2002 to 2023, with causes ranging from extreme weather events to cyberattacks and aging infrastructure. The resilience of the grid has been tested repeatedly as demand for electricity continues to grow while climate change exacerbates the frequency and intensity of storms, wildfires, and other natural disasters.

    Between 2002 and 2023, the U.S. Department of Energy recorded thousands of power outages, varying in scale from localized blackouts to large-scale regional failures affecting millions. The Northeast blackout of 2003 was one of the most significant, impacting 50 million people across the United States and Canada. A software bug in an alarm system prevented operators from recognizing and responding to transmission line failures, leading to a cascading effect that took hours to contain and days to restore completely.

    Weather-related disruptions have been among the most common causes of outages, particularly hurricanes, ice storms, and heatwaves. In 2005, Hurricane Katrina devastated the Gulf Coast, knocking out power for over 1.7 million customers. Similarly, in 2012, Hurricane Sandy caused widespread destruction in the Northeast, leaving over 8 million customers in the dark. More recently, the Texas winter storm of February 2021 resulted in one of the most catastrophic power failures in state history. Unusually cold temperatures overwhelmed the state’s independent power grid, leading to equipment failures, frozen natural gas pipelines, and rolling blackouts that lasted days. The event highlighted vulnerabilities in grid preparedness for extreme weather, particularly in regions unaccustomed to such conditions.

    Wildfires in California have also played a significant role in grid outages. The state's largest utility companies, such as Pacific Gas and Electric (PG&E), have implemented preemptive power shutoffs to reduce wildfire risks during high-wind events. These Public Safety Power Shutoffs (PSPS) have affected millions of residents, causing disruptions to businesses, emergency services, and daily life. The 2018 Camp Fire, the deadliest and most destructive wildfire in California history, was ignited by faulty PG&E transmission lines, leading to increased scrutiny over utility maintenance and fire mitigation efforts.

    In addition to natural disasters, cyber threats have emerged as a growing concern for the U.S. electric grid. In 2015 and 2016, Russian-linked cyberattacks targeted Ukraine’s power grid, serving as a stark warning of the potential vulnerabilities in American infrastructure. In 2021, the Colonial Pipeline ransomware attack, while not directly targeting the electric grid, demonstrated how critical energy infrastructure could be compromised, leading to widespread fuel shortages and economic disruptions. Federal agencies and utility companies have since ramped up investments in cybersecurity measures to protect against potential attacks.

    Aging infrastructure remains another pressing issue. Many parts of the U.S. grid were built decades ago and have not kept pace with modern energy demands or technological advancements. The shift towards renewable energy sources, such as solar and wind, presents new challenges for grid stability, requiring updated transmission systems and improved energy storage solutions. Federal and state governments have initiated grid modernization efforts, including investments in smart grids, microgrids, and battery storage to enhance resilience and reliability.

    Looking forward, the future of the U.S. electric grid depends on continued investments in infrastructure, cybersecurity, and climate resilience. With the increasing electrification of transportation and industry, demand for reliable and clean energy will only grow. Policymakers, utility companies, and regulators must collaborate to address vulnerabilities, adapt to emerging threats, and ensure a more robust, efficient, and sustainable electric grid for the decades to come.

  9. Residential Real Estate Data via API | USA Coverage | 74% Right Party...

    • datarade.ai
    Updated Mar 13, 2024
    Cite
    BatchData (2024). Residential Real Estate Data via API | USA Coverage | 74% Right Party Contact Rate | BatchData [Dataset]. https://datarade.ai/data-products/batchdata-property-search-lookup-api-real-estate-and-homeow-batchservice
    Explore at:
    .json, .xml, .csv, .xls, .sql, .txt (available download formats)
    Dataset updated
    Mar 13, 2024
    Dataset authored and provided by
    BatchData
    Area covered
    United States
    Description

    In the realm of real estate data solutions, BatchData Property Data Search API emerges as a technical marvel, tailored for product and engineering leadership seeking robust and scalable solutions. This purpose-built API seamlessly integrates diverse datasets, offering over 600 data points, to provide a holistic view of property characteristics, valuation, homeowner information, listing data, county assessor details, photos, and foreclosure information. With state-of-the-art infrastructure and performance features, BatchData sets the standard for efficiency, reliability, and developer satisfaction.

    Unraveling the Technical Prowess of BatchData Property Data Search API:

    State-of-the-Art Infrastructure: At the heart of BatchData lies a state-of-the-art infrastructure that leverages the latest technologies available. Our systems are engineered to handle increased loads and growing datasets with ease, ensuring optimal performance without significant degradation. This commitment to technological advancement ensures that our data infrastructure and API systems operate at peak efficiency, even in the face of evolving demands and complexities.

    Integration Capabilities: BatchData boasts integration capabilities that are second to none, thanks to our innovative data lake house architecture. This architecture empowers us to seamlessly integrate our data with any data platforms or pipelines in a matter of minutes. Whether it's connecting with existing data systems, third-party applications, or internal pipelines, our API offers limitless integration possibilities, enabling product and engineering teams to unlock the full potential of property data with minimal effort.

    Developer Documentation: One of the hallmarks of BatchData is our clear and comprehensive developer documentation, which developers love. We understand the importance of providing developers with the resources they need to integrate our API seamlessly into their projects. Our documentation offers detailed guides, code samples, API reference materials, and best practices, empowering developers to hit the ground running and leverage the full capabilities of BatchData with confidence.

    Performance Features: BatchData Property Search API is engineered for performance, delivering lightning-fast response times and seamless scalability. Our API is designed to efficiently handle increased loads and growing datasets, ensuring that users experience minimal latency and maximum reliability. Whether it's retrieving property data, conducting complex queries, or accessing real-time updates, our API delivers exceptional performance, empowering product and engineering teams to build high-performance applications and systems with ease. BatchData's APIs work for both residential real estate data and commercial real estate data.

    Common Use Cases for BatchData Property Data Search API:

    Powering Data-Driven Applications: Product and engineering teams can leverage BatchData Property Data Search API to power data-driven applications tailored for the real estate industry. Whether it's building real estate websites, mobile applications, or internal tools, our API offers comprehensive property data that can drive informed decision-making, enhance user experiences, and streamline operations.

    Enabling Advanced Analytics: With BatchData, product and engineering leaders can unlock the power of advanced analytics and reporting capabilities. Our API provides access to rich property data, enabling analysts and researchers to uncover insights, identify trends, and make data-driven recommendations with confidence. Whether it's analyzing market trends, evaluating investment opportunities, or conducting competitive analysis, BatchData empowers teams to derive actionable insights from vast property datasets.

    Optimizing Data Infrastructure: BatchData Property Data Search API can play a pivotal role in optimizing data infrastructure within organizations. By seamlessly integrating our API with existing data platforms and pipelines, product and engineering teams can streamline data workflows, improve data accessibility, and enhance overall data infrastructure efficiency. Our API's integration capabilities and performance features ensure that organizations can leverage property data seamlessly across their data ecosystem, driving operational excellence and innovation.

    Conclusion: BatchData Property Data Search API stands at the forefront of real estate data solutions, offering product and engineering leaders a comprehensive, scalable, and high-performance API for accessing property data. With state-of-the-art infrastructure, seamless integration capabilities, clear developer documentation, and exceptional performance features, BatchData empowers teams to build data-driven applications, optimize data infrastructure, and unlock actionable insights with ease. As the real estate industry continues to evolve, BatchData remains committed to delivering innovative sol...
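    As a rough illustration of the integration workflow described above, the sketch below composes a query URL for a property-search REST endpoint using only the standard library. The host, path, and parameter names are hypothetical, not BatchData's actual API; consult their developer documentation for the real interface:

    ```python
    # Hypothetical sketch of building a property-search API request URL.
    # Endpoint path and parameter names are illustrative only.
    from urllib.parse import urlencode, urlunsplit

    def build_search_url(base_host, params):
        """Compose a GET query URL for a property search."""
        query = urlencode(params)
        return urlunsplit(("https", base_host, "/api/v1/property/search", query, ""))

    url = build_search_url(
        "api.example.com",
        {"state": "CA", "city": "Los Angeles", "min_beds": 3},
    )
    ```

    A real integration would send this request with an HTTP client and an API key, then feed the JSON response into the consuming data pipeline.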

  10. alpaca

    • huggingface.co
    • opendatalab.com
    Updated Mar 14, 2023
    + more versions
    Cite
    Tatsu Lab (2023). alpaca [Dataset]. https://huggingface.co/datasets/tatsu-lab/alpaca
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 14, 2023
    Dataset authored and provided by
    Tatsu Lab
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for Alpaca

      Dataset Summary
    

    Alpaca is a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine. This instruction data can be used to conduct instruction-tuning for language models and make them follow instructions better. The authors built on the data generation pipeline from the Self-Instruct framework and made the following modifications:

    The text-davinci-003 engine to generate the instruction data instead… See the full description on the dataset page: https://huggingface.co/datasets/tatsu-lab/alpaca.
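    Each Alpaca record has `instruction`, `input`, and `output` fields; for instruction-tuning, records are typically rendered into a prompt string. The template below mirrors the commonly used Alpaca prompt format, as a sketch rather than a definitive recipe:

    ```python
    # Sketch: render one Alpaca-style record into a training prompt.
    # Template mirrors the commonly used Alpaca format; adapt as needed.
    def to_prompt(record):
        if record.get("input"):
            return (
                "Below is an instruction that describes a task, paired with an input "
                "that provides further context. Write a response that appropriately "
                "completes the request.\n\n"
                f"### Instruction:\n{record['instruction']}\n\n"
                f"### Input:\n{record['input']}\n\n### Response:\n"
            )
        return (
            "Below is an instruction that describes a task. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n### Response:\n"
        )

    example = {"instruction": "Name three primary colors.", "input": "", "output": "Red, yellow, blue."}
    prompt = to_prompt(example)
    ```

    During fine-tuning, the record's `output` field is appended after the `### Response:` marker as the target completion.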

  11. Best genes and proteins for each dataset.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Feb 9, 2024
    Cite
    Stanislav Listopad; Christophe Magnan; Le Z. Day; Aliya Asghar; Andrew Stolz; John A. Tayek; Zhang-Xu Liu; Jon M. Jacobs; Timothy R. Morgan; Trina M. Norden-Krichmar (2024). Best genes and proteins for each dataset. [Dataset]. http://doi.org/10.1371/journal.pdig.0000447.t005
    Explore at:
    xls (available download formats)
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    PLOS Digital Health
    Authors
    Stanislav Listopad; Christophe Magnan; Le Z. Day; Aliya Asghar; Andrew Stolz; John A. Tayek; Zhang-Xu Liu; Jon M. Jacobs; Timothy R. Morgan; Trina M. Norden-Krichmar
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For the integrated datasets, the matching genes and proteins are bolded.

  12. GlotCC-V1

    • huggingface.co
    • hf.qhduan.com
    Updated Feb 23, 2024
    Cite
    CIS, LMU Munich (2024). GlotCC-V1 [Dataset]. https://huggingface.co/datasets/cis-lmu/GlotCC-V1
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 23, 2024
    Dataset authored and provided by
    CIS, LMU Munich
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Summary

    GlotCC-V1.0 is a document-level, general domain dataset derived from CommonCrawl, covering more than 1000 languages. It is built using GlotLID language identification and the Ungoliant pipeline over CommonCrawl. We release our pipeline as open source at https://github.com/cisnlp/GlotCC.
    List of languages: see https://datasets-server.huggingface.co/splits?dataset=cis-lmu/GlotCC-V1 to get the list of available splits.

      Usage (Huggingface Hub -- Recommended)… See the full description on the dataset page: https://huggingface.co/datasets/cis-lmu/GlotCC-V1.
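    The per-language split listing mentioned above can be queried from the Hugging Face datasets-server endpoint. A small sketch that only constructs the query URL (fetching it requires network access):

    ```python
    # Build the datasets-server URL that lists a dataset's splits.
    # Works for any Hub dataset id; GlotCC-V1 is the one referenced above.
    from urllib.parse import quote

    def splits_url(dataset_id):
        """URL of the Hugging Face datasets-server endpoint listing a dataset's splits."""
        return "https://datasets-server.huggingface.co/splits?dataset=" + quote(dataset_id, safe="/")

    url = splits_url("cis-lmu/GlotCC-V1")
    ```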
    