21 datasets found
  1. Working With Messy Data in OpenRefine Workshop

    • search.dataone.org
    Updated Dec 28, 2023
    Cite
    Kelly Schultz (2023). Working With Messy Data in OpenRefine Workshop [Dataset]. http://doi.org/10.5683/SP3/YSM3JM
    Explore at:
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Kelly Schultz
    Description

    This workshop will introduce OpenRefine, a powerful open source tool for exploring, cleaning and manipulating "messy" data. Through hands-on activities, using a variety of datasets, participants will learn how to: Explore and identify patterns in data; Normalize data using facets and clusters; Manipulate and generate new textual and numeric data; Transform and reshape datasets; Use the General Regular Expression Language (GREL) to undertake manipulations, such as concatenating strings.
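The facet-and-cluster normalization mentioned above can be approximated outside OpenRefine. The sketch below is a minimal Python imitation of OpenRefine's fingerprint keying method, not the tool's exact implementation, and the sample values are made up:

```python
import re
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Approximate OpenRefine's fingerprint key: trim, lowercase,
    drop punctuation, then sort and deduplicate the tokens."""
    tokens = re.sub(r"[^\w\s]", "", value.strip().lower()).split()
    return " ".join(sorted(set(tokens)))

def cluster(values):
    """Group raw strings whose fingerprints collide; these are the
    candidate clusters a user would review and merge."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return {key: vs for key, vs in groups.items() if len(vs) > 1}

messy = ["New York", "new york.", "York New", "Boston"]
# All three "New York" variants collapse onto the key "new york".
```

In OpenRefine itself this corresponds to Edit cells > Cluster and edit with the key collision (fingerprint) method.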

  2. Criteria for evaluating and qualifying public datasets obtained from the...

    • data.mendeley.com
    Updated May 19, 2025
    Cite
    Gyslla Vasconcelos (2025). Criteria for evaluating and qualifying public datasets obtained from the Brazilian Federal Government's Open Data Portal - dados.gov [Dataset]. http://doi.org/10.17632/x8sgcykthn.2
    Explore at:
    Dataset updated
    May 19, 2025
    Authors
    Gyslla Vasconcelos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These criteria (file 1) were drawn up empirically, based on the practical challenges faced during the development of the thesis research and on tests carried out with various datasets applied to process mining tools. The criteria were prepared with the aim of creating a ranking of the selected and published datasets (https://doi.org/10.6084/m9.figshare.25514884.v3), in order to classify them according to their score. The criteria are divided into informative (In), importance (I), difficulty (D) and ease (F) of handling (file 2). The datasets were selected (file 3) and, for ranking, calculations were made (file 5) to normalize the values for standardization (file 4). This data is part of a study on the application of process mining techniques to Brazilian public service data, available on the open data portal dados.gov.

  3. ARCS White Beam Vanadium Normalization Data for SNS Cycle 2024B

    • osti.gov
    Updated Jun 30, 2025
    Cite
    Abernathy, Douglas; Balz, Christian; Goyette, Rick; Granroth, Garrett (2025). ARCS White Beam Vanadium Normalization Data for SNS Cycle 2024B [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2570733
    Explore at:
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    Office of Science (http://www.er.doe.gov/)
    Spallation Neutron Source (SNS)
    Authors
    Abernathy, Douglas; Balz, Christian; Goyette, Rick; Granroth, Garrett
    Description

    A data set used to normalize the detector response of the ARCS instrument; see ARCS_269548.md in the data set for more details.

  4. CYGNSS Level 1 Science Data Record Version 2.1 - Dataset - NASA Open Data...

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    Updated Apr 1, 2025
    + more versions
    Cite
    nasa.gov (2025). CYGNSS Level 1 Science Data Record Version 2.1 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/cygnss-level-1-science-data-record-version-2-1-c4d25
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This Level 1 (L1) dataset contains the Version 2.1 geo-located Delay Doppler Maps (DDMs) calibrated into Power Received (Watts) and Bistatic Radar Cross Section (BRCS) expressed in units of meters squared from the Delay Doppler Mapping Instrument aboard the CYGNSS satellite constellation. This version supersedes Version 2.0. Other useful scientific and engineering measurement parameters include the DDM of Normalized Bistatic Radar Cross Section (NBRCS), the Delay Doppler Map Average (DDMA) of the NBRCS near the specular reflection point, and the Leading Edge Slope (LES) of the integrated delay waveform. The L1 dataset contains a number of other engineering and science measurement parameters, including sets of quality flags/indicators, error estimates, and bias estimates, as well as a variety of orbital, spacecraft/sensor health, timekeeping, and geolocation parameters. At most, 8 netCDF data files (each file corresponding to a unique spacecraft in the CYGNSS constellation) are provided each day; under nominal conditions, there are typically 6-8 spacecraft retrieving data each day, but this can rise to 8 spacecraft under special circumstances in which higher than normal retrieval frequency is needed (e.g., during tropical storms and/or hurricanes). Latency is approximately 6 days (or better) from the last recorded measurement time. The Version 2.1 release represents the second science-quality release.

    Here is a summary of improvements in the Version 2.1 data release: 1) data is now available when the CYGNSS satellites are rolled away from nadir during orbital high beta-angle periods, resulting in a significant amount of additional data; 2) corrections to coordinate frames result in more accurate estimates of receiver antenna gain at the specular point; 3) improved calibration for analog-to-digital conversion results in better consistency between CYGNSS satellite measurements at nearly the same location and time; 4) improved GPS EIRP and transmit antenna pattern calibration results in significantly reduced PRN-dependence in the observables; 5) improved estimation of the location of the specular point within the DDM; 6) an altitude-dependent scattering area is used to normalize the scattering cross section (v2.0 used a simpler scattering area model that varied with incidence and azimuth angles but not altitude); 7) corrections added for noise floor-dependent biases in scattering cross section and leading edge slope of the delay waveform observed in the v2.0 data. Users should also note that the receiver antenna pattern calibration is not applied per-DDM-bin in this v2.1 release.

  5. US State populations - 2018

    • kaggle.com
    Updated May 29, 2018
    Cite
    Vikas (2018). US State populations - 2018 [Dataset]. https://www.kaggle.com/lucasvictor/us-state-populations-2018/data?select=State+Populations.csv
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 29, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vikas
    Description

    Context

    While working on the gun violence data set, I wanted to normalize the number of incidents because some states are more populous than others, and normalizing gun incidents per million people gave me a different outlook on the data. The source of this data is unofficial, as the last official numbers from the US Census Bureau were available only for 2010. I just wanted a quick unofficial source of this data and stumbled upon this site:

    http://worldpopulationreview.com/states/

    Content

    Simple two columns - state and population as of 2018
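The per-million normalization described in the Context section is a one-line calculation once these two columns are joined with incident counts. A minimal sketch, where both the population figures and the incident counts are illustrative, not taken from the dataset:

```python
# Illustrative figures only: the populations mimic the dataset's two-column
# layout, and the incident counts are hypothetical.
populations = {"California": 39_776_830, "Wyoming": 573_720}
incidents = {"California": 3184, "Wyoming": 45}

def per_million(count: int, population: int) -> float:
    """Normalize a raw count to a rate per million residents."""
    return count / population * 1_000_000

rates = {state: round(per_million(incidents[state], populations[state]), 1)
         for state in incidents}
# Large raw-count gaps between states can shrink (or even flip)
# once population is factored in.
```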

    Acknowledgements

    http://worldpopulationreview.com/states/

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  6. ARCS White Beam Vanadium Normalization Data for SNS Cycle 2022B

    • osti.gov
    Updated Feb 12, 2025
    Cite
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL) (2025). ARCS White Beam Vanadium Normalization Data for SNS Cycle 2022B [Dataset]. http://doi.org/10.14461/oncat.data/2515590
    Explore at:
    Dataset updated
    Feb 12, 2025
    Dataset provided by
    Department of Energy Basic Energy Sciences Program (http://science.energy.gov/user-facilities/basic-energy-sciences/)
    Office of Science (http://www.er.doe.gov/)
    Spallation Neutron Source (SNS)
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL)
    Description

    Neutron scattering data from a vanadium cylinder, acquired on the ARCS spectrometer in white beam mode during Cycle 2022B to normalize the detector efficiencies.

  7. FinTech Climate Data API Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Jun 29, 2025
    Cite
    Growth Market Reports (2025). FinTech Climate Data API Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/fintech-climate-data-api-market
    Explore at:
    pptx, pdf, csv (available download formats)
    Dataset updated
    Jun 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    FinTech Climate Data API Market Outlook



    According to our latest research, the FinTech Climate Data API market size reached USD 1.23 billion globally in 2024, demonstrating robust momentum as financial institutions increasingly integrate climate data into their operations. The market is projected to grow at a CAGR of 22.7% from 2025 to 2033, reaching a forecasted value of USD 9.94 billion by 2033. This rapid expansion is driven by mounting regulatory pressures, rising investor demand for climate transparency, and the urgent need for financial entities to assess climate-related risks and opportunities.




    A primary growth driver for the FinTech Climate Data API market is the global shift towards sustainable finance and the intensifying focus on environmental, social, and governance (ESG) criteria. Financial institutions are under increasing pressure from regulators and investors to quantify and disclose climate-related risks embedded in their portfolios. This has led to a surge in demand for sophisticated climate data APIs that can deliver real-time, granular, and actionable insights. These APIs enable banks, asset managers, and insurance companies to integrate climate risk analytics directly into their existing risk assessment, investment analysis, and compliance workflows. As a result, the market is witnessing accelerated adoption, particularly among organizations aiming to align with international frameworks such as the Task Force on Climate-related Financial Disclosures (TCFD) and the European Union’s Sustainable Finance Disclosure Regulation (SFDR).




    Another significant factor propelling the FinTech Climate Data API market is the rapid digital transformation within the financial services sector. The proliferation of cloud computing, artificial intelligence, and big data analytics has enabled the development of advanced climate data solutions that are scalable, interoperable, and easily integrated via API infrastructure. Financial technology (FinTech) companies are leveraging these capabilities to offer innovative services such as climate-adjusted portfolio management, carbon accounting, and scenario analysis. This technological evolution is lowering barriers to entry for smaller financial institutions and fintech startups, broadening the market’s user base and fostering a competitive ecosystem. Moreover, the growing collaboration between climate data providers and financial software vendors is catalyzing the creation of end-to-end solutions tailored to specific use cases across banking, asset management, and insurance.




    The increasing frequency and severity of climate-related events, such as floods, wildfires, and hurricanes, have heightened awareness of the financial risks associated with climate change. This has compelled financial institutions to seek more accurate and timely data to model potential impacts on asset values, loan portfolios, and insurance liabilities. The FinTech Climate Data API market is responding by offering APIs that aggregate and normalize data from diverse sources, including satellite imagery, meteorological data, and corporate emissions disclosures. By facilitating comprehensive risk modeling and scenario analysis, these APIs are becoming indispensable tools for financial decision-makers. The trend is particularly pronounced in developed markets, where regulatory frameworks and investor expectations are driving the integration of climate data into mainstream financial analysis.




    From a regional perspective, North America and Europe currently dominate the FinTech Climate Data API market, accounting for the largest share of global revenues. This leadership is attributed to the presence of major financial hubs, stringent regulatory requirements, and a high level of technological maturity. However, the Asia Pacific region is emerging as a key growth engine, supported by rapid fintech adoption, expanding financial markets, and increasing government initiatives to promote sustainable finance. Latin America and the Middle East & Africa, while still nascent, are expected to offer significant opportunities as awareness of climate risk grows and digital infrastructure improves. The regional landscape is thus characterized by a dynamic interplay of regulatory, technological, and market-driven factors shaping the adoption of climate data APIs.



  8. ARCS White Beam Vanadium Normalization Data for SNS Cycle 2022B (May 15 -...

    • osti.gov
    Updated May 29, 2025
    Cite
    Spallation Neutron Source (SNS) (2025). ARCS White Beam Vanadium Normalization Data for SNS Cycle 2022B (May 15 - Jun., 14, 2022) [Dataset]. http://doi.org/10.14461/oncat.data/2568320
    Explore at:
    Dataset updated
    May 29, 2025
    Dataset provided by
    Department of Energy Basic Energy Sciences Program (http://science.energy.gov/user-facilities/basic-energy-sciences/)
    Office of Science (http://www.er.doe.gov/)
    Spallation Neutron Source (SNS)
    Description

    A data set used to normalize the detector response of the ARCS instrument; see ARCS_226797.md in the data set for more details.

  9. Data from: Attributes for NHDPlus Catchments (Version 1.1) for the...

    • data.usgs.gov
    • search.dataone.org
    • +2more
    Updated Aug 24, 2024
    + more versions
    Cite
    United States Geological Survey (2024). Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Ammonium (NH4) [Dataset]. http://doi.org/10.5066/P9Y0FDV0
    Explore at:
    Dataset updated
    Aug 24, 2024
    Dataset authored and provided by
    United States Geological Survey (http://www.usgs.gov/)
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    2002
    Area covered
    United States
    Description

    This data set represents the average normalized atmospheric (wet) deposition, in kilograms, of Ammonium (NH4) for the year 2002 compiled for every catchment of NHDPlus for the conterminous United States. Estimates of NH4 deposition are based on National Atmospheric Deposition Program (NADP) measurements (B. Larsen, U.S. Geological Survey, written commun., 2007). De-trending methods applied to the year 2002 are described in Alexander and others, 2001. NADP site selection met the following criteria: stations must have records from 1995 to 2002 and have a minimum of 30 observations.

    The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,000-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus also includes elevation-derived ca ...
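The NADP site-selection criteria quoted above (records spanning 1995 through 2002, with at least 30 observations) amount to a simple filter. A sketch with hypothetical station records; the field names and values are invented for illustration:

```python
# Hypothetical station metadata; field names are invented for this sketch.
stations = [
    {"id": "NY20", "first_year": 1990, "last_year": 2002, "n_obs": 96},
    {"id": "TX02", "first_year": 1997, "last_year": 2002, "n_obs": 60},  # record starts too late
    {"id": "CA45", "first_year": 1995, "last_year": 2002, "n_obs": 12},  # too few observations
]

def meets_criteria(station, start=1995, end=2002, min_obs=30):
    """Apply the stated NADP selection rules."""
    return (station["first_year"] <= start
            and station["last_year"] >= end
            and station["n_obs"] >= min_obs)

selected = [s["id"] for s in stations if meets_criteria(s)]
# Only stations covering the full window with enough observations survive.
```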

  10. CYGNSS Level 1 Science Data Record Version 2.1

    • podaac.jpl.nasa.gov
    • s.cnmilf.com
    • +3more
    html
    + more versions
    Cite
    PO.DAAC, CYGNSS Level 1 Science Data Record Version 2.1 [Dataset]. http://doi.org/10.5067/CYGNS-L1X21
    Explore at:
    html (available download formats)
    Dataset provided by
    PO.DAAC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 18, 2017 - Present
    Variables measured
    RADAR CROSS-SECTION, RADAR REFLECTIVITY, SIGMA NAUGHT, FLIGHT DATA LOGS
    Description

    This Level 1 (L1) dataset contains the Version 2.1 geo-located Delay Doppler Maps (DDMs) calibrated into Power Received (Watts) and Bistatic Radar Cross Section (BRCS) expressed in units of meters squared from the Delay Doppler Mapping Instrument aboard the CYGNSS satellite constellation. This version supersedes Version 2.0. Other useful scientific and engineering measurement parameters include the DDM of Normalized Bistatic Radar Cross Section (NBRCS), the Delay Doppler Map Average (DDMA) of the NBRCS near the specular reflection point, and the Leading Edge Slope (LES) of the integrated delay waveform. The L1 dataset contains a number of other engineering and science measurement parameters, including sets of quality flags/indicators, error estimates, and bias estimates, as well as a variety of orbital, spacecraft/sensor health, timekeeping, and geolocation parameters. At most, 8 netCDF data files (each file corresponding to a unique spacecraft in the CYGNSS constellation) are provided each day; under nominal conditions, there are typically 6-8 spacecraft retrieving data each day, but this can rise to 8 spacecraft under special circumstances in which higher than normal retrieval frequency is needed (e.g., during tropical storms and/or hurricanes). Latency is approximately 6 days (or better) from the last recorded measurement time. The Version 2.1 release represents the second science-quality release.

    Here is a summary of improvements in the Version 2.1 data release: 1) data is now available when the CYGNSS satellites are rolled away from nadir during orbital high beta-angle periods, resulting in a significant amount of additional data; 2) corrections to coordinate frames result in more accurate estimates of receiver antenna gain at the specular point; 3) improved calibration for analog-to-digital conversion results in better consistency between CYGNSS satellite measurements at nearly the same location and time; 4) improved GPS EIRP and transmit antenna pattern calibration results in significantly reduced PRN-dependence in the observables; 5) improved estimation of the location of the specular point within the DDM; 6) an altitude-dependent scattering area is used to normalize the scattering cross section (v2.0 used a simpler scattering area model that varied with incidence and azimuth angles but not altitude); 7) corrections added for noise floor-dependent biases in scattering cross section and leading edge slope of the delay waveform observed in the v2.0 data. Users should also note that the receiver antenna pattern calibration is not applied per-DDM-bin in this v2.1 release.

  11. Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United...

    • data.usgs.gov
    • gimi9.com
    • +3more
    Updated Aug 24, 2024
    + more versions
    Cite
    United States Geological Survey (2024). Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Nitrate (NO3) [Dataset]. https://data.usgs.gov/datacatalog/data/USGS:da57b127-7d06-47e6-8a1f-b7917784489f
    Explore at:
    Dataset updated
    Aug 24, 2024
    Dataset authored and provided by
    United States Geological Survey (http://www.usgs.gov/)
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    2002
    Area covered
    United States
    Description

    This data set represents the average normalized atmospheric (wet) deposition, in kilograms, of Nitrate (NO3) for the year 2002 compiled for every catchment of NHDPlus for the conterminous United States. Estimates of NO3 deposition are based on National Atmospheric Deposition Program (NADP) measurements (B. Larsen, U.S. Geological Survey, written commun., 2007). De-trending methods applied to the year 2002 are described in Alexander and others, 2001. NADP site selection met the following criteria: stations must have records from 1995 to 2002 and have a minimum of 30 observations.

    The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,000-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus also includes elevation-derived cat ...

  12. A radiometric normalization dataset of Shandong Province based on Gaofen-1...

    • scidb.cn
    Updated Feb 20, 2020
    Cite
    黄莉婷; 焦伟利; 龙腾飞 (2020). A radiometric normalization dataset of Shandong Province based on Gaofen-1 WFV image (2018) [Dataset]. http://doi.org/10.11922/sciencedb.947
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 20, 2020
    Dataset provided by
    Science Data Bank
    Authors
    黄莉婷; 焦伟利; 龙腾飞
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Shandong
    Description

    Surface reflectance is a critical physical variable that affects the energy budget in land-atmosphere interactions, feature recognition and classification, and climate change research. This dataset uses the relative radiometric normalization method, taking the Landsat-8 Operational Land Imager (OLI) surface reflectance products as the reference images to normalize the cloud-free GF-1 satellite WFV sensor images of Shandong Province in 2018. Relative radiometric normalization processing mainly includes atmospheric correction, image resampling, image registration, masking, extraction of the no-change pixels, and calculation of the normalization coefficients. After relative radiometric normalization, for the no-change pixels of each GF-1 WFV image and its reference image, R2 is above 0.7295 and RMSE is below 0.0172. The surface reflectance accuracy of the GF-1 WFV images is improved, so they can be used together with Landsat data to provide data support for quantitative remote sensing inversion. This dataset is in GeoTIFF format, and the spatial resolution of the images is 16 m.
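The core of relative radiometric normalization, fitting gain/offset coefficients over no-change pixels and then checking agreement via R2 and RMSE, can be sketched with synthetic data. This is a toy illustration under invented parameters, not the authors' processing chain:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic no-change pixels: a reference reflectance band (stand-in for
# Landsat-8 OLI) and a subject band (stand-in for GF-1 WFV) related by an
# unknown linear response plus sensor noise.
ref = rng.uniform(0.05, 0.4, 500)
subject = 0.9 * ref + 0.02 + rng.normal(0.0, 0.005, 500)

# Fit normalization coefficients so that ref ~ gain * subject + offset.
gain, offset = np.polyfit(subject, ref, 1)
normalized = gain * subject + offset

# Agreement metrics of the kind reported for the dataset (R2, RMSE).
rmse = float(np.sqrt(np.mean((normalized - ref) ** 2)))
ss_res = float(np.sum((ref - normalized) ** 2))
ss_tot = float(np.sum((ref - ref.mean()) ** 2))
r2 = 1.0 - ss_res / ss_tot
```

With the simulated response above, the recovered gain lands near 1/0.9 and the residual error near the injected noise level.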

  13. White Beam Normalization

    • osti.gov
    Updated Jun 28, 2023
    Cite
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL) (2023). White Beam Normalization [Dataset]. http://doi.org/10.14461/oncat.data.649c9ec01c1bb5a8e6465d80/1987352
    Explore at:
    Dataset updated
    Jun 28, 2023
    Dataset provided by
    Department of Energy Basic Energy Sciences Program (http://science.energy.gov/user-facilities/basic-energy-sciences/)
    Office of Science (http://www.er.doe.gov/)
    Spallation Neutron Source (SNS)
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL)
    Description

    Raw data used to normalize detector performance on the ARCS instrument for the run cycle starting in June 2023. The sample is a vanadium (V) cylinder; the T0 chopper is set to 150 Hz and phased for 300 meV, and all Fermi choppers are out of the beam.

  14. Data from: Flowmapper.org: a web-based framework for designing...

    • tandf.figshare.com
    • figshare.com
    docx
    Updated Dec 15, 2023
    Cite
    Caglar Koylu; Geng Tian; Mary Windsor (2023). Flowmapper.org: a web-based framework for designing origin–destination flow maps [Dataset]. http://doi.org/10.6084/m9.figshare.18142635.v2
    Explore at:
    docx (available download formats)
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Caglar Koylu; Geng Tian; Mary Windsor
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    FlowMapper.org is a web-based framework for automated production and design of origin-destination flow maps. FlowMapper has four major features that contribute to the advancement of existing flow mapping systems. First, users can upload and process their own data to design and share customized flow maps. The ability to save data, cartographic design and map elements in a project file allows users to easily share their data and/or cartographic design with others. Second, users can generate customized flow symbols to support different flow map reading tasks such as comparing flow magnitudes and directions and identifying flow and location clusters that are strongly connected with each other. Third, FlowMapper supports supplementary layers such as node symbols, choropleth, and base maps to contextualize flow patterns with location references and characteristics. Finally, the web-based architecture of FlowMapper supports server-side computational capabilities to process and normalize large flow data and reveal natural patterns of flows.

  15. Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United...

    • data.usgs.gov
    • search.dataone.org
    • +2more
    Updated Sep 3, 2024
    + more versions
    Cite
    United States Geological Survey (2024). Attributes for NHDPlus Catchments (Version 1.1) for the Conterminous United States: Normalized Atmospheric Deposition for 2002, Total Inorganic Nitrogen [Dataset]. http://doi.org/10.5066/P9PMFH4F
    Explore at:
    Dataset updated
    Sep 3, 2024
    Dataset authored and provided by
    United States Geological Survey (http://www.usgs.gov/)
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    2002
    Area covered
    United States
    Description

    This data set represents the average normalized atmospheric (wet) deposition, in kilograms, of Total Inorganic Nitrogen for the year 2002 compiled for every catchment of NHDPlus for the conterminous United States. Estimates of Total Inorganic Nitrogen deposition are based on National Atmospheric Deposition Program (NADP) measurements (B. Larsen, U.S. Geological Survey, written commun., 2007). De-trending methods applied to the year 2002 are described in Alexander and others, 2001. NADP site selection met the following criteria: stations must have records from 1995 to 2002 and have a minimum of 30 observations.

    The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,000-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus als ...

  16. Selection of Suitable Reference Genes for qPCR Normalization under Abiotic...

    • plos.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Qian Jiang; Feng Wang; Meng-Yao Li; Jing Ma; Guo-Fei Tan; Ai-Sheng Xiong (2023). Selection of Suitable Reference Genes for qPCR Normalization under Abiotic Stresses in Oenanthe javanica (BI.) DC [Dataset]. http://doi.org/10.1371/journal.pone.0092262
    Explore at:
    pdf (available download formats)
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Qian Jiang; Feng Wang; Meng-Yao Li; Jing Ma; Guo-Fei Tan; Ai-Sheng Xiong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Accurate normalization of gene expression data is an absolute prerequisite to obtain reliable results in qPCR analysis. Oenanthe javanica, an aquatic perennial herb, belongs to the Oenanthe genus in Apiaceae family, with known medicinal properties. In the current study, O. javanica was subjected to hormone stimuli (gibberellin, salicylic acid, methyl jasmonate, and abscisic acid) and abiotic stresses (heat, cold, salt, and drought), and the expression of nine candidate reference genes (eIF-4α, ACT7, TIP41, GAPDH, SAND, EF-1α, PP2A, TBP, and TUB) was evaluated. Stability of the genes was assessed using geNorm, NormFinder and BestKeeper. All the genes presented distinct expression profiles under the experimental conditions analyzed. Under abiotic stress conditions, ACT7 and PP2A genes displayed the maximum stability; PP2A and SAND were the most stable genes under hormone stimuli. Even though PP2A gene was most stable across all the samples, individual analysis revealed changes in expression profile. To further validate the suitability of the reference genes identified in this study, the expression level of M6PR gene under salt treatment was studied. Based on our data, we propose that it is essential to normalize the target gene expression with specific reference genes under different experimental conditions for most accurate results. To our knowledge, this is the first systematic analysis for reference genes under abiotic stress and hormone stimuli conditions in O. javanica. This will be beneficial for future studies on O. javanica and other plants in Apiaceae family at molecular level.
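A rough way to see what "stability" means here is to rank candidate reference genes by how much their Ct values vary across conditions. The sketch below uses the coefficient of variation as a simplified proxy; geNorm, NormFinder, and BestKeeper use more elaborate pairwise measures, and the Ct values here are invented for illustration:

```python
import statistics

# Hypothetical qPCR Ct values across treatments; the gene names follow the
# study, but the numbers are made up for this sketch.
ct_values = {
    "PP2A": [22.1, 22.3, 22.0, 22.2],
    "ACT7": [19.5, 19.9, 19.4, 19.7],
    "GAPDH": [18.0, 20.5, 17.2, 21.1],  # visibly unstable across conditions
}

def cv(values):
    """Coefficient of variation: a simple expression-stability proxy."""
    return statistics.stdev(values) / statistics.mean(values)

# Most stable (lowest CV) first; an unstable gene would be a poor
# normalizer for target-gene expression.
ranked = sorted(ct_values, key=lambda gene: cv(ct_values[gene]))
```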

  17. DDSP EMG dataset.xlsx

    • commons.datacite.org
    • figshare.com
    Updated Jul 14, 2019
    + more versions
    Cite
    Marta Cercone (2019). DDSP EMG dataset.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.8864411
    Explore at:
    Dataset updated
    Jul 14, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    DataCite (https://www.datacite.org/)
    Authors
    Marta Cercone
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study was performed in accordance with the PHS Policy on Humane Care and Use of Laboratory Animals and with federal and state regulations, and was approved by the Institutional Animal Care and Use Committee (IACUC) of Cornell University and the Ethics and Welfare Committee at the Royal Veterinary College.

    Study design: Adult horses were recruited if in good health and following evaluation of the upper airways by endoscopic exam, at rest and during exercise, either overground or on a high-speed treadmill using a wireless videoendoscope. Horses were categorized as "DDSP" horses if they consistently presented exercise-induced intermittent dorsal displacement of the soft palate during multiple (n=3) exercise tests, or as "control" horses if they did not experience dorsal displacement of the soft palate during exercise and had no signs compatible with DDSP, such as palatal instability during exercise or soft palate or sub-epiglottic ulcerations. Horses were instrumented with intramuscular electrodes in one or both thyro-hyoid (TH) muscles for EMG recording, hard-wired to a wireless transmitter implanted in the cervical area for remote recording. EMG recordings were then made during an incremental exercise test based on the percentage of maximum heart rate (HRmax).

    Incremental exercise test: After surgical instrumentation, each horse performed a 4-step incremental test while TH electromyographic activity, heart rate, upper airway videoendoscopy, pharyngeal airway pressures, and gait frequency were recorded. Horses were evaluated at exercise intensities corresponding to 50, 80, 90, and 100% of their maximum heart rate, with each speed maintained for 1 minute. Pharyngeal function during the incremental test was recorded using a wireless videoendoscope (Optomed, Les Ulis, France) placed into the nasopharynx via the right ventral nasal meatus. Nasopharyngeal pressure was measured using a Teflon catheter (1.3 mm ID, Neoflon) inserted through the left ventral nasal meatus to the level of the left guttural pouch ostium. The catheter was attached to differential pressure transducers (Celesco LCVR, Celesco Transducers Products, Canoga Park, CA, USA) referenced to atmospheric pressure and calibrated from -70 to 70 mmHg. Occurrences of dorsal displacement of the soft palate were recorded, and the number of swallows during each exercise trial was counted for each speed interval.
    EMG recording: EMG data were recorded through a wireless transmitter device implanted subcutaneously. Two different transmitters were used: 1) TR70BB (Telemetry Research Ltd, Auckland, New Zealand), with 12-bit A/D conversion resolution, AC-coupled amplifier, -3 dB point at 1.5 Hz, and 2 kHz sampling frequency (n=5 horses); or 2) ELI (Center for Medical Physics and Biomedical Engineering, Medical University of Vienna, Vienna, Austria) [23], with 12-bit A/D conversion resolution, AC-coupled amplifier, amplifier gain of 1450, and 1 kHz sampling frequency (n=4 horses). The EMG signal was transmitted through a receiver (TR70BB) or Bluetooth (ELI) to a data acquisition system (PowerLab 16/30 - ML880/P, ADInstruments, Bella Vista, Australia). The EMG signal was amplified with an octal bio-amplifier (Octal Bioamp, ML138, ADInstruments, Bella Vista, Australia) with a bandwidth of 20-1000 Hz (input impedance = 200 MΩ, common mode rejection ratio = 85 dB, gain = 1000) and transmitted to a personal computer. All EMG and pharyngeal pressure signals were collected at a 2000 Hz rate with LabChart 6 software (ADInstruments, Bella Vista, Australia), which allows real-time monitoring and storage for post-processing and analysis.
    EMG signal processing: Electromyographic signals from the TH muscles were processed using two methods: 1) a classical approach based on myoelectrical activity and median frequency, and 2) wavelet decomposition. For both methods, the beginning and end of recording segments comprising twenty consecutive breaths at the end of each speed interval were marked with comments in the acquisition software (LabChart). The relationship of EMG activity to the phase of the respiratory cycle was determined by comparing pharyngeal pressure waveforms with the raw EMG and time-averaged EMG traces. For the classical approach, in graphical user interface-based software (LabChart), a sixth-order Butterworth filter was applied (common mode rejection ratio, 90 dB; band pass, 20 to 1,000 Hz); the EMG signal was then amplified, full-wave rectified, and smoothed using a triangular Bartlett window (time constant: 150 ms). The digitized area under the time-averaged full-wave rectified EMG signal was calculated to define the raw mean electrical activity (MEA) in mV.s. The median power frequency (MF) of the EMG power spectrum was calculated after a Fast Fourier Transformation (1024 points, Hann cosine window processing). For the wavelet decomposition, the whole dataset, including comments and comment locations, was exported as .mat files for processing in MATLAB R2018a with the Signal Processing Toolbox (The MathWorks Inc, Natick, MA, USA). A custom-written automated script based on Hodson-Tole & Wakeling [24] was used first to cut the .mat file into the selected 20-breath segments and then to process each segment. A bank of 16 wavelets with time and frequency resolution optimized for EMG was used. The center frequencies of the bank ranged from 6.9 Hz to 804.2 Hz [25].
    The intensity was summed (mV2) to a total, and the intensity contribution of each wavelet was calculated across all 20 breaths for each horse, with separate results for each trial date and exercise level (80, 90, and 100% of HRmax, as well as the period preceding episodes of DDSP). To determine the relevant bandwidths for the analysis, a Fast Fourier transform frequency analysis was performed on the horses unaffected by DDSP from 0 to 1000 Hz in increments of 50 Hz, and the contribution of each interval was calculated as a percentage of the total spectrum, reported as median and interquartile range. According to the Shannon-Nyquist sampling theorem, the relevant signal lies below half the sample rate; because the instrumentation sampled at either 1000 Hz or 2000 Hz, the frequency analysis was performed up to 1000 Hz. The 0-50 Hz interval, consisting mostly of stride frequency and background noise, was excluded from further analysis. Of the remaining frequency spectrum, all intervals from 50-100 Hz to 450-500 Hz were included; the rest were excluded because they contributed less than 5% to the total amplitude.

    Data analysis: At the end of each exercise speed interval, twenty consecutive breaths were selected and analyzed as described above. To standardize MEA, MF, and mV2 within and between horses and trials, and to control for different electrode sizes (i.e., different impedance and area of sampling), data were normalized to the value at 80% of HRmax (HRmax80), referred to as normalized MEA (nMEA), normalized MF (nMF), and normalized mV2 (nmV2). During initial processing, it became clear that the TH muscle is inconsistently activated at 50% of HRmax, so that speed level was excluded from further analysis. The endoscopy video was reviewed, and episodes of palatal displacement were marked with comments. For both the classical approach and the wavelet analysis, an EMG segment preceding and concurrent with the DDSP episode was analyzed. If multiple episodes were recorded during the same trial, only the period preceding the first palatal displacement was analyzed. In horses that had both TH muscles implanted, the average of the two sides was used for the analysis. Averaged data from multiple trials were considered for each horse. Descriptive data are expressed as means with standard deviation (SD). Normal distribution of the data was assessed using the Kolmogorov-Smirnov test and quantile-quantile (Q-Q) plots. To determine the frequency clusters in the EMG signal, a hierarchical agglomerative dendrogram was applied using the Matplotlib, pandas, NumPy, and SciPy packages in Python (version 3.6.6), executed through Spyder (version 3.2.2) and Anaconda Navigator. Based on the frequency analysis, the wavelets included in the cluster analysis were 92.4 Hz, 128.5 Hz, 170.4 Hz, 218.1 Hz, 271.5 Hz, 330.6 Hz, 395.4 Hz, and 465.9 Hz. The number of frequency clusters was set to two based on the maximum acceleration in a scree plot and the maximum vertical distance in the dendrogram. For continuous outcome measures (number of swallows, MEA, MF, and mV2), a mixed-effects model was fitted to the data to determine the relationship between the outcome variable and relevant fixed effects (breed, sex, age, weight, speed, group), using horse as a random effect. Tukey's post hoc tests and linear contrasts were used as appropriate. Statistical analysis was performed using JMP Pro 13 (SAS Institute, Cary, NC, USA). Significance was set at P < 0.05 throughout.
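The classical processing chain described above (band-pass filtering, full-wave rectification, Bartlett-window smoothing for MEA, and FFT-based median power frequency) can be sketched in Python with SciPy. This is an illustrative reimplementation under stated assumptions, not the authors' LabChart/MATLAB code; the function name and defaults are invented.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, periodogram

def classical_emg_features(emg, fs=2000.0, smooth_ms=150.0):
    """Sketch of the classical EMG pipeline: band-pass filter, full-wave
    rectification, Bartlett-window smoothing (MEA), and median power
    frequency (MF). Names and defaults are illustrative assumptions."""
    nyq = fs / 2.0
    high = min(1000.0, 0.99 * nyq)            # keep the upper edge below Nyquist
    sos = butter(6, [20.0, high], btype="band", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, emg)          # zero-phase band-pass, ~20-1000 Hz

    rectified = np.abs(filtered)              # full-wave rectification

    win = np.bartlett(int(fs * smooth_ms / 1000.0))  # ~150 ms triangular window
    envelope = np.convolve(rectified, win / win.sum(), mode="same")

    mea = envelope.sum() / fs                 # area under the envelope (mV.s)

    # Median power frequency: frequency splitting the power spectrum in half.
    freqs, psd = periodogram(filtered, fs=fs, window="hann")
    cumulative = np.cumsum(psd)
    mf = freqs[np.searchsorted(cumulative, cumulative[-1] / 2.0)]
    return mea, mf
```

For a narrow-band input, MF lands near the dominant frequency; on real EMG both measures would then be normalized to their HRmax80 values, as the study describes.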

  18. Data from: Foreign allometric exponents adequately normalize isokinetic knee extension strength to identify muscle weakness and mobility limitation in Portuguese older adults: A cross-sectional study

    • figshare.com
    bin
    Updated Mar 22, 2022
    Pedro Pugliesi Abdalla; Dalmo Roberto Lopes Machado; Lucimere Bohn; Gareth Stratton; Jorge Mota (2022). Foreign allometric exponents adequately normalize isokinetic knee extension strength to identify muscle weakness and mobility limitation in Portuguese older adults: A cross-sectional study [Dataset]. http://doi.org/10.6084/m9.figshare.19401032.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    Mar 22, 2022
    Dataset provided by
    figshare
    Authors
    Pedro Pugliesi Abdalla; Dalmo Roberto Lopes Machado; Lucimere Bohn; Gareth Stratton; Jorge Mota
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset encompasses 226 older adults: 132 Portuguese and 94 Brazilian. Measures taken were mobility limitation (six-minute walk test, lowest quartile), lower-limb strength (isokinetic knee extension strength at 60º/s), and body dimensions. Foreign allometric exponents (b) were used to normalize the Portuguese strength values (strength / body-size variable^b).
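The allometric normalization described above divides a strength measure by a body-size variable raised to the exponent b. A minimal sketch follows; the exponent and measurement values are placeholders, not the study's fitted exponents.

```python
def allometric_normalize(strength, body_size, b):
    """Normalize a strength measure by a body-size variable raised to the
    allometric exponent b. Values used below are illustrative only."""
    return strength / body_size ** b

# Example: knee-extension torque of 120 N.m for a 70 kg adult, hypothetical b = 0.67
normalized = allometric_normalize(120.0, 70.0, 0.67)
```

This removes the body-size dependence from the raw strength score, so cut-offs for muscle weakness can be compared across people of different sizes.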

  19. ARCS White Beam Vanadium Normalization for SNS Cycle 2020A

    • osti.gov
    Updated Dec 17, 2024
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL) (2024). ARCS White Beam Vanadium Normalization for SNS Cycle 2020A [Dataset]. http://doi.org/10.14461/oncat.data/2482438
    Explore at:
    Dataset updated
    Dec 17, 2024
    Dataset provided by
    Department of Energy Basic Energy Sciences Program (http://science.energy.gov/user-facilities/basic-energy-sciences/)
    Office of Science (http://www.er.doe.gov/)
    Spallation Neutron Source (SNS)
    High Flux Isotope Reactor (HFIR) & Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL)
    Description

    This is a white-beam data set from vanadium (V) used to normalize relative detector performance. See the ARCS_155050.md file for more information.
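The idea behind a white-beam vanadium normalization is that vanadium scatters neutrons nearly isotropically, so its integrated counts expose each detector's relative efficiency; sample data are then divided by those efficiencies. The sketch below uses invented counts, not the ARCS data.

```python
import numpy as np

# Integrated white-beam vanadium counts per detector pixel (hypothetical values).
vanadium = np.array([980.0, 1020.0, 1005.0, 950.0, 1045.0])

# Relative efficiency of each detector: vanadium counts scaled to unit mean.
efficiency = vanadium / vanadium.mean()

# Sample counts corrected for detector-to-detector efficiency differences.
sample = np.array([210.0, 245.0, 230.0, 198.0, 252.0])
corrected = sample / efficiency
```

In practice, facility reduction software applies this per-pixel correction (along with masking of bad detectors) before any further reduction of the sample data.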

  20. LLM Fine Tuning Dataset of Indian Legal Texts

    • kaggle.com
    Updated Jul 30, 2024
    Akshat Gupta (2024). LLM Fine Tuning Dataset of Indian Legal Texts [Dataset]. https://www.kaggle.com/datasets/akshatgupta7/llm-fine-tuning-dataset-of-indian-legal-texts/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 30, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Akshat Gupta
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    This dataset comprises curated question-answer pairs derived from key legal texts pertinent to Indian law, specifically the Indian Penal Code (IPC), the Code of Criminal Procedure (CrPC), and the Indian Constitution. The goal of this dataset is to facilitate the development and fine-tuning of language models and AI applications that assist legal professionals in India.

    Dataset Details:

    • Sources: The questions and answers in this dataset are extracted from the Indian Constitution, Indian Penal Code (IPC), and the Code of Criminal Procedure (CrPC), ensuring relevance and accuracy in legal contexts.
    • Content: Each entry in the dataset contains a clear and concise question alongside its corresponding answer. The questions are designed to cover fundamental concepts, key provisions, and significant terms found within these legal documents.

    Use Cases:

    • Legal Research: A valuable tool for lawyers, legal researchers, and students seeking to understand legal terminology and principles as outlined in Indian law.
    • Natural Language Processing (NLP): This dataset is ideal for training AI models for question-answering systems that require a strong understanding of Indian legal texts.
    • Educational Resources: Useful for creating educational tools and materials for law students and legal practitioners.

    Note on Use and Limitations:

    • Misuse of Dataset: This dataset is intended for educational, research, and development purposes only. Users should exercise caution to ensure that any AI applications developed using this dataset do not misrepresent or distort legal information. The dataset should not be used for legal advice or to influence legal decisions without proper context and verification.

    • Relevance and Context: While every effort has been made to ensure the accuracy and relevance of the question-answer pairs, some entries may be out of context or may not fully represent the legal concepts they aim to explain. Users are strongly encouraged to conduct thorough reviews of the entries, particularly when using them in formal applications or legal research.

    • Data Preprocessing Recommended: Due to the nature of natural language, the QA pairs may include variations in phrasing, potential redundancies, or entries that may not align perfectly with the intended legal context. Therefore, it is highly recommended that users perform data preprocessing to cleanse, normalize, or filter out any irrelevant or out-of-context pairs before integrating the dataset into machine learning models or systems.

    • Dynamic Nature of Law: The legal landscape is subject to change over time. As laws and interpretations evolve, some answers may become outdated or less applicable. Users should verify the current applicability of legal concepts and check sources for updates when necessary.

    • Credits and Citations: If you use this dataset in your research or projects, appropriate credits should be provided. Users are also encouraged to share any improvements, corrections, or updates they make to the dataset for the benefit of the community.
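The preprocessing recommended above (cleansing, normalizing, and filtering QA pairs before fine-tuning) might look like the minimal sketch below. The `question`/`answer` field names and the length threshold are assumptions about the dataset's schema, not its documented format.

```python
import re

def clean_qa_pairs(pairs, min_len=10):
    """Normalize whitespace, drop near-empty entries, and deduplicate QA pairs.
    The 'question'/'answer' keys and min_len threshold are assumed, not taken
    from the dataset's documentation."""
    seen = set()
    cleaned = []
    for pair in pairs:
        q = re.sub(r"\s+", " ", pair.get("question", "")).strip()
        a = re.sub(r"\s+", " ", pair.get("answer", "")).strip()
        if len(q) < min_len or len(a) < min_len:
            continue  # drop fragments unlikely to help fine-tuning
        key = q.lower()
        if key in seen:
            continue  # drop duplicate questions (case-insensitive)
        seen.add(key)
        cleaned.append({"question": q, "answer": a})
    return cleaned
```

A pass like this catches the redundancies and fragments the dataset notes warn about; domain review is still needed to spot out-of-context legal answers.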
