12 datasets found
  1. Canada Takes Different Approach for Pipeline

    • data.amerigeoss.org
    • data.wu.ac.at
    Updated Aug 9, 2019
    Cite
    Energy Data Exchange (2019). Canada Takes Different Approach for Pipeline [Dataset]. https://data.amerigeoss.org/en/dataset/canada-takes-different-approach-for-pipeline
    Explore at:
    Dataset updated
    Aug 9, 2019
    Dataset provided by
    Energy Data Exchange
    Area covered
    Canada
    Description

    Most of TransCanada/ExxonMobil's proposed 1,717-mile natural gas pipeline from Alaska's North Slope would be built in Canada, where it faces government scrutiny remarkably similar to the oversight under way in the United States. Canadian government agencies (federal, provincial and territorial) still must issue final approvals for the pipeline project. They are empowered to ensure the pipeline is designed, constructed and operated safely. They have a strong environmental voice over the project. This includes say-so on how the pipeline crosses streams, how land may be disturbed to trench and assemble the pipe, and what happens when the pipeline path penetrates acreage used by woodland caribou and other important wildlife. But there's one significant difference between U.S. and Canadian oversight: the pipeline project sponsor already has in hand some important Canadian authorizations, including arguably the most important ones of all, federal certificates to build and operate the pipeline. While the U.S. and Canadian governments both approved the gas pipeline project when it was initially proposed in the 1970s, project sponsors in the U.S. later gave up their rights. However, that 1970s-era pipeline project, with its certificates in hand, continues to exist in Canada. And the Alaska Pipeline Project (a joint effort of TransCanada Corp. and ExxonMobil) has structured the Canadian portion of its multibillion-dollar pipeline proposal around the plan that first gelled when Jimmy Carter was U.S. president, "Laverne & Shirley" was the top-rated TV show and Alaskans were taking their first strides as newly christened oil tycoons.

  2. temporal_cookbook_db

    • huggingface.co
    Updated Aug 20, 2020
    Cite
    Tomoro AI Ltd (2020). temporal_cookbook_db [Dataset]. https://huggingface.co/datasets/TomoroAI/temporal_cookbook_db
    Explore at:
    Dataset updated
    Aug 20, 2020
    Dataset authored and provided by
    Tomoro AI Ltd
    License

    https://choosealicense.com/licenses/cc/

    Description

    🧠 Temporal Cookbook DB

    A multi-table dataset designed to represent structured, relational data used in event extraction, temporal reasoning, and fact representation pipelines. Originally built as an SQLite database and converted into CSVs for hosting on the Hugging Face Hub. The data tables are created by processing a subset of data from jlh-ibm/earnings_call, covering the companies AMD and Nvidia.

      📦 Dataset Structure
    

    This dataset is organized as multiple… See the full description on the dataset page: https://huggingface.co/datasets/TomoroAI/temporal_cookbook_db.
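    The description's workflow (an SQLite database whose tables are dumped to per-table CSVs) can be sketched with the standard library alone. The schema below is illustrative only, not the dataset's actual tables:

    ```python
    # Hypothetical sketch of an SQLite -> CSV conversion like the one the
    # description mentions. Table and column names are made up for illustration.
    import csv
    import io
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, ticker TEXT)")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, company_id INTEGER, "
        "quarter TEXT, fact TEXT, FOREIGN KEY(company_id) REFERENCES companies(id))"
    )
    conn.execute("INSERT INTO companies VALUES (1, 'AMD'), (2, 'NVDA')")
    conn.execute("INSERT INTO events VALUES (1, 1, '2020Q2', 'revenue guidance raised')")

    def table_to_csv(conn, table):
        """Dump one table to CSV text, header row first."""
        cur = conn.execute(f"SELECT * FROM {table}")
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow([col[0] for col in cur.description])
        writer.writerows(cur.fetchall())
        return buf.getvalue()

    companies_csv = table_to_csv(conn, "companies")
    events_csv = table_to_csv(conn, "events")
    ```

    Each table becomes one CSV file, which is how a relational database is typically flattened for hosting on the Hub.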

  3. Moving Alaska Gas from Canada to the Lower 48

    • data.amerigeoss.org
    • cloud.csiss.gmu.edu
    • +1 more
    Updated Aug 9, 2019
    Cite
    Energy Data Exchange (2019). Moving Alaska Gas from Canada to the Lower 48 [Dataset]. https://data.amerigeoss.org/de/dataset/moving-alaska-gas-from-canada-to-the-lower-48
    Explore at:
    Dataset updated
    Aug 9, 2019
    Dataset provided by
    Energy Data Exchange
    Area covered
    Canada, Contiguous United States, Alaska
    Description

    What would happen to Alaska's natural gas once it reaches the end of the proposed pipeline, 1,700 miles from Prudhoe Bay? The gas would flow into a vast network of Canadian and U.S. pipelines assembled over the past 60 years. Some key components of that network were built or expanded in the early 1980s in anticipation of Alaska gas starting to flow back then. Those components went into service without Alaska gas and helped Canada double its natural gas exports to the United States in the 1980s, then double them again in the 1990s. In all, the entire network today can move 15 billion to 20 billion cubic feet a day of natural gas, roughly three to four times the volume the Alaska pipeline would deliver to the British Columbia-Alberta border northwest of Edmonton. Of course, the network still moves billions of cubic feet of gas daily. But the volume it handles has been declining, leaving room for Alaska gas, and even if the flow is relatively flush when the Alaska pipeline is finished, the network's capacity could be expanded. No longer is there serious talk of needing a pipeline stretching all the way from Prudhoe Bay to Chicago. But why end the Alaska pipeline near the B.C.-Alberta border as opposed to somewhere else? The answer is simple: Three major North American gas pipeline systems converge there, in the heart of some of Canada's hottest natural gas plays.

  4. beautiVis

    • huggingface.co
    Updated Apr 16, 2025
    Cite
    beautiVis (2025). beautiVis [Dataset]. https://huggingface.co/datasets/beautiVis/beautiVis
    Explore at:
    Dataset updated
    Apr 16, 2025
    Authors
    beautiVis
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    beautiVis

      About the Dataset
    

    beautiVis is a richly annotated dataset of 50,000+ static images sourced from Reddit's r/dataisbeautiful subreddit between February 2012 and January 2025. The dataset was built through a three-phase pipeline. Phase 1: Data Collection. First, we downloaded the complete post history from r/dataisbeautiful using the Arctic-Shift Reddit Download Tool, which provided raw JSON data containing post metadata, titles, and image URLs. During this initial… See the full description on the dataset page: https://huggingface.co/datasets/beautiVis/beautiVis.
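    In the spirit of the Phase 1 step described above, a minimal sketch of filtering raw Reddit post JSON down to static-image posts might look like this. The field names follow Reddit's public post schema (`title`, `url`, `created_utc`); the records here are invented for illustration, not taken from the dataset:

    ```python
    # Hypothetical sketch: keep only posts whose URL points at a static image.
    # The JSON records below are made-up examples, not actual beautiVis data.
    import json

    raw = json.loads("""[
      {"title": "[OC] Global temperatures", "url": "https://i.redd.it/abc.png", "created_utc": 1700000000},
      {"title": "Discussion thread", "url": "https://reddit.com/r/dataisbeautiful/xyz", "created_utc": 1700000100}
    ]""")

    IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")

    def image_posts(posts):
        """Keep only posts whose URL ends in a static-image file extension."""
        return [p for p in posts if p["url"].lower().endswith(IMAGE_SUFFIXES)]

    kept = image_posts(raw)
    ```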

  5. PARROT

    • huggingface.co
    Cite
    momo, PARROT [Dataset]. https://huggingface.co/datasets/momo006/PARROT
    Explore at:
    Authors
    momo
    Description

    PARROT is a large-scale benchmark designed to evaluate the ability of models to generate executable data preparation (DP) pipelines from natural language instructions. It introduces a new task that aims to lower the technical barrier of data preparation by translating human-written instructions into code. To reflect real-world usage, the benchmark includes ~18,000 pipelines spanning 16 core transformation operations, built from 23,009 tables across six public datasets. This benchmark is… See the full description on the dataset page: https://huggingface.co/datasets/momo006/PARROT.

  6. SynCPR

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Delong Liu (2025). SynCPR [Dataset]. https://huggingface.co/datasets/a1557811266/SynCPR
    Explore at:
    Dataset updated
    Jun 1, 2025
    Authors
    Delong Liu
    Description

    SynCPR Dataset

      Overview
    

    The SynCPR dataset is a large-scale, fully synthetic dataset designed specifically for the composed person retrieval task. Built using our automated construction pipeline, SynCPR offers unmatched diversity, quality, and realism for person-centric image retrieval research.

    For more details, see https://github.com/Delong-liu-bupt/Composed_Person_Retrieval

      Construction Pipeline
    

    The dataset is constructed in three main stages:

    Textual… See the full description on the dataset page: https://huggingface.co/datasets/a1557811266/SynCPR.

  7. S2R-HDR-2

    • huggingface.co
    Updated May 29, 2025
    + more versions
    Cite
    Yujin (2025). S2R-HDR-2 [Dataset]. https://huggingface.co/datasets/iimmortall/S2R-HDR-2
    Explore at:
    Dataset updated
    May 29, 2025
    Authors
    Yujin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    🌐Homepage | 📖Arxiv | GitHub

      ✨Dataset Summary
    

    S2R-HDR is a large-scale synthetic dataset for high dynamic range (HDR) reconstruction tasks. It contains 1,000 motion sequences, each comprising 24 images at 1920×1080 resolution, with a total of 24,000 images. To support flexible data augmentation, all images are stored in EXR format with linear HDR values. The dataset is rendered using Unreal Engine 5 and our custom pipeline built upon XRFeitoria, encompassing diverse dynamic… See the full description on the dataset page: https://huggingface.co/datasets/iimmortall/S2R-HDR-2.

  8. US Electric Grid Outages

    • kaggle.com
    Updated Apr 1, 2025
    Cite
    willian oliveira (2025). US Electric Grid Outages [Dataset]. http://doi.org/10.34740/kaggle/dsv/11245146
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    Kaggle
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The United States electric grid, a vast and complex infrastructure, has experienced numerous outages from 2002 to 2023, with causes ranging from extreme weather events to cyberattacks and aging infrastructure. The resilience of the grid has been tested repeatedly as demand for electricity continues to grow while climate change exacerbates the frequency and intensity of storms, wildfires, and other natural disasters.

    Between 2002 and 2023, the U.S. Department of Energy recorded thousands of power outages, varying in scale from localized blackouts to large-scale regional failures affecting millions. The Northeast blackout of 2003 was one of the most significant, impacting 50 million people across the United States and Canada. A software bug in an alarm system prevented operators from recognizing and responding to transmission line failures, leading to a cascading effect that took hours to contain and days to restore completely.

    Weather-related disruptions have been among the most common causes of outages, particularly hurricanes, ice storms, and heatwaves. In 2005, Hurricane Katrina devastated the Gulf Coast, knocking out power for over 1.7 million customers. Similarly, in 2012, Hurricane Sandy caused widespread destruction in the Northeast, leaving over 8 million customers in the dark. More recently, the Texas winter storm of February 2021 resulted in one of the most catastrophic power failures in state history. Unusually cold temperatures overwhelmed the state’s independent power grid, leading to equipment failures, frozen natural gas pipelines, and rolling blackouts that lasted days. The event highlighted vulnerabilities in grid preparedness for extreme weather, particularly in regions unaccustomed to such conditions.

    Wildfires in California have also played a significant role in grid outages. The state's largest utility companies, such as Pacific Gas and Electric (PG&E), have implemented preemptive power shutoffs to reduce wildfire risks during high-wind events. These Public Safety Power Shutoffs (PSPS) have affected millions of residents, causing disruptions to businesses, emergency services, and daily life. The 2018 Camp Fire, the deadliest and most destructive wildfire in California history, was ignited by faulty PG&E transmission lines, leading to increased scrutiny over utility maintenance and fire mitigation efforts.

    In addition to natural disasters, cyber threats have emerged as a growing concern for the U.S. electric grid. In 2015 and 2016, Russian-linked cyberattacks targeted Ukraine’s power grid, serving as a stark warning of the potential vulnerabilities in American infrastructure. In 2021, the Colonial Pipeline ransomware attack, while not directly targeting the electric grid, demonstrated how critical energy infrastructure could be compromised, leading to widespread fuel shortages and economic disruptions. Federal agencies and utility companies have since ramped up investments in cybersecurity measures to protect against potential attacks.

    Aging infrastructure remains another pressing issue. Many parts of the U.S. grid were built decades ago and have not kept pace with modern energy demands or technological advancements. The shift towards renewable energy sources, such as solar and wind, presents new challenges for grid stability, requiring updated transmission systems and improved energy storage solutions. Federal and state governments have initiated grid modernization efforts, including investments in smart grids, microgrids, and battery storage to enhance resilience and reliability.

    Looking forward, the future of the U.S. electric grid depends on continued investments in infrastructure, cybersecurity, and climate resilience. With the increasing electrification of transportation and industry, demand for reliable and clean energy will only grow. Policymakers, utility companies, and regulators must collaborate to address vulnerabilities, adapt to emerging threats, and ensure a more robust, efficient, and sustainable electric grid for the decades to come.

  9. Residential Real Estate Data via API | USA Coverage | 74% Right Party...

    • datarade.ai
    Updated Mar 13, 2024
    Cite
    BatchData (2024). Residential Real Estate Data via API | USA Coverage | 74% Right Party Contact Rate | BatchData [Dataset]. https://datarade.ai/data-products/batchdata-property-search-lookup-api-real-estate-and-homeow-batchservice
    Explore at:
    .json, .xml, .csv, .xls, .sql, .txt (available download formats)
    Dataset updated
    Mar 13, 2024
    Dataset authored and provided by
    BatchData
    Area covered
    United States
    Description

    In the realm of real estate data solutions, BatchData Property Data Search API emerges as a technical marvel, tailored for product and engineering leadership seeking robust and scalable solutions. This purpose-built API seamlessly integrates diverse datasets, offering over 600 data points, to provide a holistic view of property characteristics, valuation, homeowner information, listing data, county assessor details, photos, and foreclosure information. With state-of-the-art infrastructure and performance features, BatchData sets the standard for efficiency, reliability, and developer satisfaction.

    Unraveling the Technical Prowess of BatchData Property Data Search API:

    State-of-the-Art Infrastructure: At the heart of BatchData lies a state-of-the-art infrastructure that leverages the latest technologies available. Our systems are engineered to handle increased loads and growing datasets with ease, ensuring optimal performance without significant degradation. This commitment to technological advancement ensures that our data infrastructure and API systems operate at peak efficiency, even in the face of evolving demands and complexities.

    Integration Capabilities: BatchData boasts integration capabilities that are second to none, thanks to our innovative data lake house architecture. This architecture empowers us to seamlessly integrate our data with any data platforms or pipelines in a matter of minutes. Whether it's connecting with existing data systems, third-party applications, or internal pipelines, our API offers limitless integration possibilities, enabling product and engineering teams to unlock the full potential of property data with minimal effort.

    Developer Documentation: One of the hallmarks of BatchData is our clear and comprehensive developer documentation, which developers love. We understand the importance of providing developers with the resources they need to integrate our API seamlessly into their projects. Our documentation offers detailed guides, code samples, API reference materials, and best practices, empowering developers to hit the ground running and leverage the full capabilities of BatchData with confidence.

    Performance Features: BatchData Property Search API is engineered for performance, delivering lightning-fast response times and seamless scalability. Our API is designed to efficiently handle increased loads and growing datasets, ensuring that users experience minimal latency and maximum reliability. Whether it's retrieving property data, conducting complex queries, or accessing real-time updates, our API delivers exceptional performance, empowering product and engineering teams to build high-performance applications and systems with ease. BatchData's APIs work for both residential real estate data and commercial real estate data.

    Common Use Cases for BatchData Property Data Search API:

    Powering Data-Driven Applications: Product and engineering teams can leverage BatchData Property Data Search API to power data-driven applications tailored for the real estate industry. Whether it's building real estate websites, mobile applications, or internal tools, our API offers comprehensive property data that can drive informed decision-making, enhance user experiences, and streamline operations.

    Enabling Advanced Analytics: With BatchData, product and engineering leaders can unlock the power of advanced analytics and reporting capabilities. Our API provides access to rich property data, enabling analysts and researchers to uncover insights, identify trends, and make data-driven recommendations with confidence. Whether it's analyzing market trends, evaluating investment opportunities, or conducting competitive analysis, BatchData empowers teams to derive actionable insights from vast property datasets.

    Optimizing Data Infrastructure: BatchData Property Data Search API can play a pivotal role in optimizing data infrastructure within organizations. By seamlessly integrating our API with existing data platforms and pipelines, product and engineering teams can streamline data workflows, improve data accessibility, and enhance overall data infrastructure efficiency. Our API's integration capabilities and performance features ensure that organizations can leverage property data seamlessly across their data ecosystem, driving operational excellence and innovation.

    Conclusion: BatchData Property Data Search API stands at the forefront of real estate data solutions, offering product and engineering leaders a comprehensive, scalable, and high-performance API for accessing property data. With state-of-the-art infrastructure, seamless integration capabilities, clear developer documentation, and exceptional performance features, BatchData empowers teams to build data-driven applications, optimize data infrastructure, and unlock actionable insights with ease. As the real estate industry continues to evolve, BatchData remains committed to delivering innovative sol...
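    As a rough illustration of the integration workflow described above, the sketch below composes a query URL for a property-search REST endpoint using only the standard library. The host, path, and parameter names are hypothetical, not BatchData's actual API; consult their developer documentation for the real interface:

    ```python
    # Hypothetical sketch of building a property-search API request URL.
    # Endpoint path and parameter names are illustrative only.
    from urllib.parse import urlencode, urlunsplit

    def build_search_url(base_host, params):
        """Compose a GET query URL for a property search."""
        query = urlencode(params)
        return urlunsplit(("https", base_host, "/api/v1/property/search", query, ""))

    url = build_search_url(
        "api.example.com",
        {"state": "CA", "city": "Los Angeles", "min_beds": 3},
    )
    ```

    A real integration would send this request with an HTTP client and an API key, then feed the JSON response into the consuming data pipeline.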

  10. alpaca

    • huggingface.co
    • opendatalab.com
    Updated Mar 14, 2023
    + more versions
    Cite
    Tatsu Lab (2023). alpaca [Dataset]. https://huggingface.co/datasets/tatsu-lab/alpaca
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 14, 2023
    Dataset authored and provided by
    Tatsu Lab
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for Alpaca

      Dataset Summary
    

    Alpaca is a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine. This instruction data can be used to conduct instruction-tuning for language models and make them follow instructions better. The authors built on the data generation pipeline from the Self-Instruct framework and made the following modifications:

    The text-davinci-003 engine to generate the instruction data instead… See the full description on the dataset page: https://huggingface.co/datasets/tatsu-lab/alpaca.
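    Each Alpaca record has `instruction`, `input`, and `output` fields; for instruction-tuning, records are typically rendered into a prompt string. The template below mirrors the commonly used Alpaca prompt format, as a sketch rather than a definitive recipe:

    ```python
    # Sketch: render one Alpaca-style record into a training prompt.
    # Template mirrors the commonly used Alpaca format; adapt as needed.
    def to_prompt(record):
        if record.get("input"):
            return (
                "Below is an instruction that describes a task, paired with an input "
                "that provides further context. Write a response that appropriately "
                "completes the request.\n\n"
                f"### Instruction:\n{record['instruction']}\n\n"
                f"### Input:\n{record['input']}\n\n### Response:\n"
            )
        return (
            "Below is an instruction that describes a task. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n### Response:\n"
        )

    example = {"instruction": "Name three primary colors.", "input": "", "output": "Red, yellow, blue."}
    prompt = to_prompt(example)
    ```

    During fine-tuning, the record's `output` field is appended after the `### Response:` marker as the target completion.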

  11. Best genes and proteins for each dataset.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Feb 9, 2024
    Cite
    Stanislav Listopad; Christophe Magnan; Le Z. Day; Aliya Asghar; Andrew Stolz; John A. Tayek; Zhang-Xu Liu; Jon M. Jacobs; Timothy R. Morgan; Trina M. Norden-Krichmar (2024). Best genes and proteins for each dataset. [Dataset]. http://doi.org/10.1371/journal.pdig.0000447.t005
    Explore at:
    xls (available download formats)
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    PLOS Digital Health
    Authors
    Stanislav Listopad; Christophe Magnan; Le Z. Day; Aliya Asghar; Andrew Stolz; John A. Tayek; Zhang-Xu Liu; Jon M. Jacobs; Timothy R. Morgan; Trina M. Norden-Krichmar
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For the integrated datasets, the matching genes and proteins are bolded.

  12. GlotCC-V1

    • huggingface.co
    • hf.qhduan.com
    Updated Feb 23, 2024
    Cite
    CIS, LMU Munich (2024). GlotCC-V1 [Dataset]. https://huggingface.co/datasets/cis-lmu/GlotCC-V1
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 23, 2024
    Dataset authored and provided by
    CIS, LMU Munich
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Summary

    GlotCC-V1.0 is a document-level, general domain dataset derived from CommonCrawl, covering more than 1000 languages. It is built using GlotLID language identification and the Ungoliant pipeline over CommonCrawl. We release our pipeline as open source at https://github.com/cisnlp/GlotCC.
    List of languages: see https://datasets-server.huggingface.co/splits?dataset=cis-lmu/GlotCC-V1 to get the list of available splits.

      Usage (Huggingface Hub -- Recommended)… See the full description on the dataset page: https://huggingface.co/datasets/cis-lmu/GlotCC-V1.
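    The per-language split listing mentioned above can be queried from the Hugging Face datasets-server endpoint. A small sketch that only constructs the query URL (fetching it requires network access):

    ```python
    # Build the datasets-server URL that lists a dataset's splits.
    # Works for any Hub dataset id; GlotCC-V1 is the one referenced above.
    from urllib.parse import quote

    def splits_url(dataset_id):
        """URL of the Hugging Face datasets-server endpoint listing a dataset's splits."""
        return "https://datasets-server.huggingface.co/splits?dataset=" + quote(dataset_id, safe="/")

    url = splits_url("cis-lmu/GlotCC-V1")
    ```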
    