This data set represents information reported after June 30, 2022 to the Department of Energy and Environmental Protection (CT DEEP), generally to the CT DEEP Dispatch Center, regarding releases of substances to the environment, generally through accidental spills. The update frequency of this data set is about once a month. For current information related to releases reported to CT DEEP from July 1, 2022 to the present, go to Incident Reports for Releases Reported to CT DEEP July 1, 2022 to present at: https://connecticut.hazconnect.com/listincidentpublic.aspx. For a dataset related to releases reported to CT DEEP from July 1, 1996 to June 30, 2022, refer to the CT Open Data dataset: https://data.ct.gov/Environment-and-Natural-Resources/Spill-Incidents-from-January-1-1996-to-June-30-202/wr2a-rnsg This dataset is a snapshot of data from the live system that is updated periodically to give public users the download capabilities offered by the CT Open Data platform. Connecticut General Statutes Section 22a-450 requires anyone who causes any discharge, spillage, uncontrolled loss, seepage or filtration of oil or petroleum or chemical liquids or solid, liquid or gaseous products, or hazardous wastes which poses a potential threat to human health or the environment to report that release to the CT DEEP. Reports of releases from other persons are also included in this dataset. Examples of what may be included in a spill incident record include: administrative information (unique spill case number), spill date/time, location, spill source and cause, and material(s) and material type spilled. Data limitations and factors to consider when using this data: This data is limited to information about a spill incident as it was known at the time it was reported to CT DEEP. Although some data reflects updated information after the time of the initial notification, CT DEEP does not field check and verify all reported information.
Therefore, information later determined to be incomplete or inaccurate may exist in this data set. There may also be spelling errors or other unintentionally inaccurate data that was transcribed in the spill incident report. This dataset is a subset of records and information that may be available about releases that have occurred at specific locations. This dataset does not replace a full review of files publicly available either on-line and/or at CT DEEP’s Records Center. For a complete review of agency records for this or other agency programs, you can perform your own search in our DEEP public file room (https://portal.ct.gov/DEEP/About/Environmental-Quality-Records-Records-Center-File-Room) located at 79 Elm Street, Hartford CT or at our DEEP Online Search Portal at: https://filings.deep.ct.gov/DEEPDocumentSearchPortal/Home If errors are found or there are questions about the data, please contact the program unit using the following email address: DEEP.SpillsDocs@ct.gov
Note: Please use the following view to be able to see the entire Dataset Description: https://data.ct.gov/Environment-and-Natural-Resources/Hazardous-Waste-Portal-Manifest-Metadata/x2z6-swxe Dataset Description Outline (5 sections) • INTRODUCTION • WHY USE THE CONNECTICUT OPEN DATA PORTAL MANIFEST METADATA DATASET INSTEAD OF THE DEEP DOCUMENT ONLINE SEARCH PORTAL ITSELF? • WHAT MANIFESTS ARE INCLUDED IN DEEP’S MANIFEST PERMANENT RECORDS ARE ALSO AVAILABLE VIA THE DEEP DOCUMENT SEARCH PORTAL AND CT OPEN DATA? • HOW DOES THE PORTAL MANIFEST METADATA DATASET RELATE TO THE OTHER TWO MANIFEST DATASETS PUBLISHED IN CT OPEN DATA? • IMPORTANT NOTES INTRODUCTION • All of DEEP’s paper hazardous waste manifest records were recently scanned and “indexed”. • Indexing consisted of 6 basic pieces of information or “metadata” taken from each manifest about the Generator and stored with the scanned image. The metadata enables searches by: Site Town, Site Address, Generator Name, Generator ID Number, Manifest ID Number and Date of Shipment. • All of the metadata and scanned images are available electronically via DEEP’s Document Online Search Portal at: https://filings.deep.ct.gov/DEEPDocumentSearchPortal/ • Therefore, it is no longer necessary to visit the DEEP Records Center in Hartford for manifest records or information. • This CT Data dataset “Hazardous Waste Portal Manifest Metadata” (or “Portal Manifest Metadata”) was copied from the DEEP Document Online Search Portal, and includes only the metadata – no images. WHY USE THE CONNECTICUT OPEN DATA PORTAL MANIFEST METADATA DATASET INSTEAD OF THE DEEP DOCUMENT ONLINE SEARCH PORTAL ITSELF? The Portal Manifest Metadata is a good search tool to use along with the Portal. 
Searching the Portal Manifest Metadata can provide the following advantages over searching the Portal: • faster searches, especially for “large searches” - those with a large number of search returns; • unlimited number of search returns (Portal is limited to 500); • larger display of search returns; • search returns can be sorted and filtered online in CT Data; • search returns and the entire dataset can be downloaded from CT Data and used offline (e.g. download to Excel format); and • metadata from searches can be copied from CT Data and pasted into the Portal search fields to quickly find single scanned images. The main advantages of the Portal are: • it provides access to scanned images of manifest documents (CT Data does not); and • images can be downloaded one or multiple at a time. WHAT MANIFESTS ARE INCLUDED IN DEEP’S MANIFEST PERMANENT RECORDS ARE ALSO AVAILABLE VIA THE DEEP DOCUMENT SEARCH PORTAL AND CT OPEN DATA? All hazardous waste manifest records received and maintained by the DEEP Manifest Program, including: • manifests originating from a Connecticut Generator or sent to a Connecticut Destination Facility including manifests accompanying an exported shipment • manifests with RCRA hazardous waste listed on them (such manifests may also have non-RCRA hazardous waste listed) • manifests from a Generator with a Connecticut Generator ID number (permanent or temporary number) • manifests with sufficient quantities of RCRA hazardous waste listed for DEEP to consider the Generator to be a Small or Large Quantity Generator • manifests with PCBs listed on them from 2016 to 6-29-2018. • Note: manifests sent to a CT Destination Facility were indexed by the Connecticut or Out of State Generator. Searches by CT Designated Facility are not possible unless such facility is the Generator for the purposes of manifesting. All other manifests were considered “non-hazardous” manifests and not scanned.
They were discarded after 2 years in accord with DEEP records retention schedule. Non-hazardous manifests include: • Manifests with only non-RCRA hazardous waste listed • Manifests from generators that did not have a permanent or temporary Generator ID number • Sometimes non-hazardous manifests were considered “Hazar
https://whoisdatacenter.com/terms-of-use/
Explore historical ownership and registration records by performing a reverse Whois lookup for the email address darren.deep@gmail.com.
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to deep.tv (Domain). Get insights into ownership history and changes over time.
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
The lookup tables relate deep-water significant wave height, peak wave period, and peak wave direction with nearshore significant wave height, peak wave period, mean wave period, peak wave direction, and directional spreading at 4,802 stations spaced ~100 m apart along the 10 m bathymetric contour and 20 intermediate-water stations coincident with CDIP buoys within the Southern California Bight from Point Conception to the Mexican border. Deep-water significant wave height bin size = 0.25 m. Deep-water peak wave period bin size = 3 s. Deep-water peak wave direction bin size = 8 deg. Station file = socal.loc is a text file with 4822 lat,lon coordinates for stations starting from 1 to 4822. Lookup table files = LUT_STNtoSTN.mat. Each lookup table file is a mat file containing the lookup table for 600 stations, as noted in the filename. Structures of input/deep-water significant wave height, peak wave period, and peak wave direction and output/nearshore significant wave height, peak wave period, mean wave period, peak wave direction, and directional spreading allow for easy indexing of deep-water wave conditions to find nearshore conditions at a station of interest. These lookup tables were generated with the Simulating WAves Nearshore model. Detailed modeling information can be found in: Hegermiller et al., submitted. Controls of multimodal wave conditions in a complex coastal setting. Please email chegermiller@usgs.gov for more information.
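The stated bin sizes suggest how a deep-water sea state could be mapped to a table position. The sketch below is only an illustration under assumptions: the function names are hypothetical, bins are assumed to start at zero and to be half-open, and the actual layout of the structures inside the .mat files is not described here and may differ.

```python
import math

# Bin sizes stated in the dataset description.
HS_BIN_M = 0.25      # deep-water significant wave height bin (m)
TP_BIN_S = 3.0       # deep-water peak wave period bin (s)
DIR_BIN_DEG = 8.0    # deep-water peak wave direction bin (deg)


def bin_index(value, bin_size, origin=0.0):
    """Return the zero-based index of the bin that contains value.

    Assumption: bins are half-open [lower, upper) and start at `origin`.
    The real tables may use another origin or edge convention.
    """
    return math.floor((value - origin) / bin_size)


def deep_water_bins(hs_m, tp_s, dir_deg):
    """Map one deep-water sea state to (Hs, Tp, direction) bin indices."""
    wrapped = dir_deg % 360.0  # fold any direction into [0, 360)
    return (
        bin_index(hs_m, HS_BIN_M),
        bin_index(tp_s, TP_BIN_S),
        bin_index(wrapped, DIR_BIN_DEG),
    )
```

For example, `deep_water_bins(1.3, 14.0, 275.0)` yields the index triple for a 1.3 m, 14 s swell from 275 degrees; whether the files index from 0 or 1 should be checked against the data itself.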
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to deep.im (Domain). Get insights into ownership history and changes over time.
https://whoisdatacenter.com/terms-of-use/
Uncover historical ownership history and changes over time by performing a reverse Whois lookup for the company deep.
The NF-19-09 expedition on board the NOAA Ship Nancy Foster was part of the multi-year Deep SEARCH project. This cruise occurred from October 21 to October 30, 2019. The cruise focused on several seep sites, canyons, and hard bottom features located less than 100 nm offshore. This was the third research expedition associated with the Deep SEARCH project, which is focused on exploring and characterizing seeps, corals, and canyon environments along the Atlantic margin. This project is a collaboration among three federal agencies: Bureau of Ocean Energy Management (BOEM), NOAA Office of Ocean Exploration and Research (OER), and the U.S. Geological Survey (USGS). TDI Brooks with academic partners has been selected to serve as BOEM contractor for this study. Data gathered during this mission and past cruises for this project will help inform multiple management issues concerning this region. The goal of this expedition was to recover benthic lander deployments, conduct mid-water trawling of the deep-scattering layer, collect water samples for chemistry and microbial diversity analyses, perform multibeam mapping at specific targeted areas, and collect sediment, water, and faunal samples for eDNA work.
NOTES: 1. Please use this link to leave the data view to see the full description: https://data.ct.gov/Environment-and-Natural-Resources/Hazardous-Waste-Manifest-Data-CT-1984-2008/h6d8-qiar 2. Please Use ALL CAPS when searching using the "Filter" function on text such as: LITCHFIELD. But not needed for the upper right corner "Find in this Dataset" search where for example "Litchfield" can be used. Dataset Description: We know there are errors in the data although we strive to minimize them. Examples include: • Manifests completed incorrectly by the generator or the transporter - data was entered based on the incorrect information. We can only enter the information we receive. • Data entry errors – we now have QA/QC procedures in place to prevent or catch and fix a lot of these. • Historically there are multiple records of the same generator. Each variation in spelling in name or address generated a separate handler record. We have worked to minimize these but many remain. The good news is that as long as they all have the same EPA ID they will all show up in your search results. • Handlers provide erroneous data to obtain an EPA ID - data entry was based on erroneous information. Examples include incorrect or bogus addresses and names. There are also a lot of MISSPELLED NAMES AND ADDRESSES! • Missing manifests – Not every required manifest gets submitted to DEEP. Also, of the more than 100,000 paper manifests we receive each year, some were incorrectly handled and never entered. • Missing data – we know that the records for approximately 25 boxes of manifests, mostly prior to 1985 were lost from the database in the 1980’s. • Translation errors – the data has been migrated to newer data platforms numerous times, and each time there have been errors and data losses. • Wastes incorrectly entered – mostly due to complex names that were difficult to spell, or typos in quantities or units of measure. 
Since Summer 2019, scanned images of manifest hardcopies may be viewed at the DEEP Document Online Search Portal: https://filings.deep.ct.gov/DEEPDocumentSearchPortal/
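Because spelling variants of one generator are said to share an EPA ID, a downloaded copy of the data can be grouped on that ID to see the variants side by side. This is a minimal sketch only: the field names `epa_id` and `generator_name` and the sample rows are hypothetical and are not taken from the dataset's actual column list.

```python
from collections import defaultdict


def group_by_epa_id(records):
    """Collect the distinct name spellings recorded under each EPA ID."""
    names_by_id = defaultdict(set)
    for record in records:
        # Normalize case and surrounding whitespace so the key is stable.
        epa_id = record["epa_id"].strip().upper()
        names_by_id[epa_id].add(record["generator_name"].strip())
    return {epa_id: sorted(names) for epa_id, names in names_by_id.items()}


# Made-up sample rows for illustration only.
sample = [
    {"epa_id": "CTD000000001", "generator_name": "ACME PLATING CO"},
    {"epa_id": "ctd000000001 ", "generator_name": "ACME PLATTING CO"},
    {"epa_id": "CTD000000002", "generator_name": "EXAMPLE DRY CLEANERS"},
]
grouped = group_by_epa_id(sample)
```

Grouping on the ID instead of the name avoids having to guess which misspellings belong together.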
Note: Please use this link to leave the data view and to see the full description: https://data.ct.gov/Environment-and-Natural-Resources/Spill-Incidents/wr2a-rnsg Description of Dataset: This data set represents information reported between July 1, 1996 and June 30, 2022 to the Department of Energy and Environmental Protection (CT DEEP), generally to the CT DEEP Dispatch Center, regarding releases of substances to the environment, generally through accidental spills. For information related to releases reported to CT DEEP from July 1, 2022 to the present, go to Incident Reports for Releases Reported to CT DEEP July 1, 2022 to present at: https://connecticut.hazconnect.com/listincidentpublic.aspx For a dataset related to releases reported to CT DEEP from July 1, 2022 to recent, refer to the CT Open Data dataset: https://data.ct.gov/Environment-and-Natural-Resources/Spill-Incidents-from-July-1-2022-to-Recent-for-Dow/ffju-s5c5 Connecticut General Statutes Section 22a-450 requires anyone who causes any discharge, spillage, uncontrolled loss, seepage or filtration of oil or petroleum or chemical liquids or solid, liquid or gaseous products, or hazardous wastes which poses a potential threat to human health or the environment to report that release to the CT DEEP. Reports of releases from other persons are also included in this dataset. Examples of what may be included in a spill incident record include: Administrative information (unique spill case number). Spill date/time. Location. Spill source and cause. Material(s) and material type spilled. Quantity spilled. Measurement units. Surface water bodies affected. Data limitations and factors to consider when using this data: This data is limited to information about a spill incident as it was known at the time it was reported to CT DEEP. Although some data reflects updated information after the time of the initial notification, CT DEEP is unable to field check and verify all reported information.
Therefore, information later determined to be incomplete or inaccurate may exist in this data set. There may also be spelling errors or other unintentionally inaccurate data that was transcribed in the spill incident report. This dataset is a subset of records and information that may be available about releases that have occurred at specific locations. This dataset does not replace a full review of files publicly available either on-line and/or at CT DEEP’s Records Center. For a complete review of agency records for this or other agency programs, you can perform your own search in our DEEP public file room located at 79 Elm Street, Hartford CT or at our DEEP Online Search Portal at: https://filings.deep.ct.gov/DEEPDocumentSearchPortal/Home . If errors are found or there are questions about the data, please contact the program unit using the following email address: DEEP.SpillsDocs@ct.gov
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this study, we used unidirectional and bidirectional long short-term memory (LSTM) deep learning networks for Chinese news classification and characterized the effects of contextual information on text classification, achieving a high level of accuracy. A Chinese glossary was created using jieba—a word segmentation tool—stop-word removal, and word frequency analysis. Next, word2vec was used to map the processed words into word vectors, creating a convenient lookup table for word vectors that could be used as feature inputs for the LSTM model. A bidirectional LSTM (BiLSTM) network was used for feature extraction from word vectors to facilitate the transfer of information in both the backward and forward directions to the hidden layer. Subsequently, an LSTM network was used to perform feature integration on all the outputs of the BiLSTM network, with the output from the last layer of the LSTM being treated as the mapping of the text into a feature vector. The output feature vectors were then connected to a fully connected layer to construct a feature classifier using the integrated features, finally classifying the news articles. The hyperparameters of the model were optimized based on the loss between the true and predicted values using the adaptive moment estimation (Adam) optimizer. Additionally, multiple dropout layers were added to the model to reduce overfitting. As text classification models for Chinese news articles, the Bi-LSTM and unidirectional LSTM models obtained f1-scores of 94.15% and 93.16%, respectively, with the former outperforming the latter in terms of feature extraction.
https://whoisdatacenter.com/terms-of-use/
Investigate historical ownership changes and registration details by initiating a reverse Whois lookup for the name Deep.
Spatial Recorder Deed Lookup. This dataset uses the auditor's conveyance number to match with the recorder's document number
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
This dataset contains calibrated images of comet 9P/Tempel 1 acquired by the Medium Resolution Instrument Visible CCD (MRI) from 01 May through 06 July 2005 during the encounter phase of the Deep Impact mission. Version 3.0 was calibrated by the EPOXI mission pipeline and includes corrected observation times with a maximum difference of about 40 milliseconds, a change to decompress the camera's zero-DN lookup table entry to the top of its range and flag the affected pixels as saturated, the replacement of the I-over-F data products by multiplicative constants for converting radiance products to I-over-F, and the application of a horizontal destriping process and improved absolute radiometric calibration constants.
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to deep-tutorial.com (Domain). Get insights into ownership history and changes over time.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
This dataset contains calibrated images of comet 9P/Tempel 1 acquired by the Impactor Targeting Sensor Visible CCD (ITS) after the impactor was released from the flyby spacecraft on 03 July 2005 during the Deep Impact mission. Version 3.0 was calibrated by the EPOXI mission pipeline and includes corrected observation times with a maximum difference of about 40 milliseconds, a change to decompress the camera's zero-DN lookup table entry to the top of its range and flag the affected pixels as saturated, and the replacement of the I-over-F data products by multiplicative constants for converting radiance products to I-over-F.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Experimental Evidence Code Only
This is a PostgreSQL database backup using the pgvector extension to store high-dimensional protein embeddings. It contains precomputed embeddings and functional annotations from the UniProt July 2025 release, including only entries supported by experimental evidence.
This lookup table was generated using version v2.0.0 of the Protein Information System (PIS), an integrated biological information system designed for the automated extraction, processing, and management of protein-related data. PIS consolidates information from UniProt, PDB, and GOA, allowing efficient retrieval and organization of sequences, structures, and annotations.
The resulting database is designed for compatibility with FANTASIA V3, an advanced pipeline for large-scale functional annotation of proteins using state-of-the-art Protein Language Models (PLMs). While the lookup table is stored in a vector database for persistence, FANTASIA loads the relevant data into memory at runtime to enable high-speed annotation.
FANTASIA uses precomputed deep learning embeddings to perform nearest-neighbor searches in embedding space and transfer Gene Ontology (GO) terms from experimentally annotated proteins to query sequences.
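The nearest-neighbor transfer described above can be illustrated with a toy example. This is a sketch under stated assumptions, not FANTASIA's implementation: the cosine distance, the choice of `k`, the function names, and the three-dimensional embeddings and protein IDs are all hypothetical and chosen only to keep the example small.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def transfer_go_terms(query, reference, k=1):
    """Return the GO terms of the k reference proteins closest to query.

    reference: list of (protein_id, embedding, go_terms) tuples.
    """
    ranked = sorted(reference, key=lambda entry: cosine_distance(query, entry[1]))
    terms = set()
    for _, _, go_terms in ranked[:k]:
        terms.update(go_terms)
    return sorted(terms)


# Toy reference set; real PLM embeddings have far more dimensions.
reference = [
    ("P_A", [1.0, 0.0, 0.0], ["GO:0003677"]),
    ("P_B", [0.0, 1.0, 0.0], ["GO:0005524"]),
    ("P_C", [0.9, 0.1, 0.0], ["GO:0003677", "GO:0006355"]),
]
query = [0.8, 0.2, 0.0]
nearest_terms = transfer_go_terms(query, reference, k=1)
```

A brute-force scan like this is fine for a handful of vectors; at the scale of the table described here, an in-memory matrix operation or a vector index would be the more realistic choice.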
Total proteins: 127,546
Total sequences: 124,397
Total embeddings: 621,849
Total GO annotations: 627,932
Included evidence codes (Gene Ontology, experimental only):
EXP – Inferred from Experiment
IDA – Inferred from Direct Assay
IPI – Inferred from Physical Interaction
IMP – Inferred from Mutant Phenotype
IGI – Inferred from Genetic Interaction
IEP – Inferred from Expression Pattern
TAS – Traceable Author Statement
IC – Inferred by Curator
ESM-2 (650M parameters)
A transformer-based protein language model trained on UniRef50 using masked language modeling. It captures structural and functional features directly from raw sequences without requiring MSAs. ESM-2 is widely used for contact map prediction, unsupervised learning, and representation extraction.
ProtT5-XL-UniRef50 (~1.2B parameters)
A large-scale encoder-decoder model using the T5 architecture, trained on UniRef50 via masked span prediction. It generates high-dimensional sequence representations that perform well across structure and function prediction tasks.
ProstT5 (~1.2B parameters)
A multi-modal extension of ProtT5, trained to predict both sequence and coarse-grained 3Di structural states. Useful for downstream applications like contact prediction, functional annotation, and classification.
Ankh3-Large (620M parameters)
An encoder-only T5-style model trained with masked span prediction. Optimized for fast inference, it encodes both semantic and structural protein information and can replace ProtT5 in many ML pipelines.
ESM3c (Cambrian 600M)
Part of the new ESM C model family, trained on UniRef, MGnify, and JGI datasets. With rotary embeddings and 36 layers, it offers enhanced performance for masked language modeling, producing high-quality structural and functional embeddings without alignments.
A small subset of proteins could not be processed on the Finisterrae III (CESGA) supercomputer due to memory limitations with 40 GB A100 GPUs.
The file missing_proteins.csv lists all affected UniProt identifiers. These entries are excluded from the final lookup table.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
This list provides Underground Storage Tank (UST) site, tank, contact and enforcement information for the approximately 45,000 commercial underground storage tanks (USTs) previously and currently registered in Connecticut, of which about 8,000 are still in use. (There are 3 other related data sets: 1-Contact Details, 2-Enforcement Summary, and 3-Compliance Details.)
Annually, or when a UST is installed, removed, or altered, a notification form must be completed via EZFile (see link) and submitted to CT DEEP. Notification is required for non-residential underground storage tanks, including those for oil, petroleum and chemical liquids, as well as residential home heating oil tanks serving five or more units. See online at: https://www.ct.gov/deep/cwp/view.asp?q=322600
The underground storage tank regulations and the Connecticut underground storage tank enforcement program have been in effect since November 1985. This list is based on notification information submitted by the public since 1985, and is updated weekly. This list contains information on both active and non-active USTs, as well as federally regulated and state regulated USTs.
Factors to Consider When Using this data:
-Not every required notification form is submitted to the DEEP. We can only enter the information we receive.
-We know there are errors in the data although we strive to minimize them. Error examples may include: notification forms completed incorrectly by the owner/operator, data entry errors, duplicate site information, misspelled names and addresses and/or missing data.
https://whoisdatacenter.com/index.php/terms-of-use/
Explore historical ownership and registration records by performing a reverse Whois lookup for the email address deep.poudel30@gmail.com.