71 datasets found
  1. NCOM Region 10 Aggregation/Best Time Series

    • datadiscoverystudio.org
    opendap
    Updated Nov 21, 2018
    Cite
    kelly.r.wood@navy.mil; jeffery.rayburn@navy.mil (2018). NCOM Region 10 Aggregation/Best Time Series [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/5cdc21deb99c4b25bb51704b576e14c6/html
    Available download formats: opendap
    Dataset updated
    Nov 21, 2018
    Authors
    kelly.r.wood@navy.mil; jeffery.rayburn@navy.mil
    Area covered
    Description

    Best time series, taking the data from the most recent run available.

  2. Envestnet | Yodlee's De-Identified Online Purchase Data | Row/Aggregate...

    • datarade.ai
    .sql, .txt
    Cite
    Envestnet | Yodlee, Envestnet | Yodlee's De-Identified Online Purchase Data | Row/Aggregate Level | USA Consumer Data covering 3600+ corporations | 90M+ Accounts [Dataset]. https://datarade.ai/data-products/envestnet-yodlee-s-de-identified-online-purchase-data-row-envestnet-yodlee
    Available download formats: .sql, .txt
    Dataset provided by
    Yodlee
    Envestnet (http://envestnet.com/)
    Authors
    Envestnet | Yodlee
    Area covered
    United States of America
    Description

    Envestnet®| Yodlee®'s Online Purchase Data (Aggregate/Row) Panels consist of de-identified, near-real time (T+1) USA credit/debit/ACH transaction level data – offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users leveraging the aggregation portion of the Envestnet®| Yodlee®'s financial technology platform.

    Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.

    We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.

    Investors, corporate researchers, and corporates can use our data to answer key business questions such as:
    - How much are consumers spending with specific merchants/brands and how is that changing over time?
    - Is the share of consumer spend at a specific merchant increasing or decreasing?
    - How are consumers reacting to new products or services launched by merchants?
    - For loyal customers, how is the share of spend changing over time?
    - What is the company’s market share in a region for similar customers?
    - Is the company’s loyal user base increasing or decreasing?
    - Is the lifetime customer value increasing or decreasing?

    Additional Use Cases:
    - Use spending data to analyze sales/revenue broadly (sector-wide) or granularly (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level.
    - Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
    - Study the effects of inflation rates via such metrics as increased total spend, ticket size, and number of transactions.
    - Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.

    Use Case Categories (our data supports a wide range of use cases, and we look forward to working with new ones):

    1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking

    2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)

    3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence

    4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis

  3. HYCOM Surface Aggregation/Best Time Series

    • datadiscoverystudio.org
    opendap
    Updated Nov 21, 2018
    Cite
    kelly.r.wood@navy.mil; jeffery.rayburn@navy.mil (2018). HYCOM Surface Aggregation/Best Time Series [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/6df5daf0d39f4bb690e01fc65f9f1e08/html
    Available download formats: opendap
    Dataset updated
    Nov 21, 2018
    Authors
    kelly.r.wood@navy.mil; jeffery.rayburn@navy.mil
    Area covered
    North Atlantic Ocean, Atlantic Ocean
    Description

    Best time series, taking the data from the most recent run available.

  4. Data from: Usable observations over Europe: Evaluation of compositing...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jul 8, 2024
    Cite
    Katarzyna Ewa Lewińska; David Frantz; Ulf Leser; Patrick Hostert (2024). Usable observations over Europe: Evaluation of compositing windows for Landsat and Sentinel-2 time series [Dataset]. http://doi.org/10.5061/dryad.5tb2rbp94
    Available download formats: bin
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Katarzyna Ewa Lewińska; David Frantz; Ulf Leser; Patrick Hostert
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Measurement technique
    We used all Landsat surface reflectance Level 2, Tier 1 (Collection 2) scenes from 1984 through 2021 and Sentinel-2 TOA reflectance Level-1C (pre-Collection-1; European Space Agency, 2021) scenes from 2016 through 2021 acquired over Europe, as available in Google Earth Engine (data accessed in June 2022; Gorelick et al., 2017). We utilized Sentinel-2 Level-1C data instead of Level-2A because the Level-2A inherent quality data lack the desired scope and accuracy (Baetens et al., 2019; Coluzzi et al., 2018). Yet, the Level-1C products are accompanied by cloud probabilities (Zupanc, 2017) facilitating improved cloud screening. Furthermore, for cloud screening we also used Band 10 (Cirrus), which is not available as Level-2A. Because we performed data availability analyses, i.e., we tallied the daily presence/absence of usable observations, the disparity between Landsat and Sentinel-2 reflectance values was here irrelevant, and the intra-sensor normalization was not needed. The difference in processing levels, however, played out in cloud, shadow, and snow masking accuracy, where the Sentinel-2 workflow assembles several approaches with known accuracies (Skakun et al., 2022), but has not been evaluated as a whole. We acknowledge that for real-life reflectance-based applications, data from corresponding processing levels need to be used and the reflectance normalized among the sensors (Okujeni et al., 2024). We thus recommend either preprocessing the Sentinel-2 TOA data to achieve the desired quality of masks, or linking Sentinel-2 Level-2A data with Level-1C Band 10 and relevant Cloud Probability scenes for more rigorous cloud screening.

    To ensure that only pixels with the highest quality entered the analysis, we applied conservative pixel-quality screening. For Landsat scenes, we excluded all pixels flagged as cloud, shadow, or snow using the inherent pixel quality bands (Foga et al., 2017; Z. Zhu & Woodcock, 2012) and discarded saturated pixels (Zhang et al., 2022). We further used the quality bands to exclude all data gaps in the Landsat 7 acquisitions occurring due to the SLC scanline failure (Andréfouët et al., 2003). Although the accuracy of the inherent pixel-quality bands differs among the Landsat sensors due to the differences in the sensor's build and thus availability of thermal and cirrus-specific bands (Foga et al., 2017), the Landsat quality bands are an acclaimed standardized quality product. Finally, owing to Landsat 7's orbit drift (Qiu et al., 2021), we excluded all ETM+ scenes acquired after 31 December 2020.

    We used a 20-km grid of 16,642 equidistant points to analyze the availability of usable Landsat and Sentinel-2 observations over Europe. We distributed points according to the Lambert azimuthal equal-area projection (LAEA, EPSG:3035), which is the preferred projection for EU-wide products. Despite LAEA being the equal-area projection, the distance distortion within our study area was mostly below 10 m, which is less than one pixel in high-resolution Sentinel-2 bands. The systematic gridded sampling design ensured good representation of the West-East and South-North climatic and phenological gradients, and facilitated graphical presentation of results.

    We derived the time series of usable Landsat and Sentinel-2 observations over Europe by sampling individual pixels spaced systematically every 20 km in the latitudinal and longitudinal directions. We identified sampling locations according to the Lambert azimuthal equal-area projection (LAEA, EPSG:3035), which is the preferred projection for EU-wide products. Despite LAEA being the equal-area projection, the distance distortion within our study area was mostly below 10 m, which is less than one pixel in high-resolution Sentinel-2 bands. The systematic point sampling design is used to derive overview statistics for big datasets and in nearest neighbor-based rescaling of rasters. The 20-km sampling interval resulted in 16,642 locations over land, ensuring good representation of the West-East and South-North climatic and phenological gradients, as well as facilitating graphical presentation of results.

    For each sampled pixel we recorded the date of the valid cloud-, shadow-, and snow-free Landsat and Sentinel-2 acquisition. We used the information at the original resolution and assumed each sampled pixel to be a probabilistic sample of the surrounding 20x20-km area, making the process analogous to the nearest neighbor resampling. We excluded duplicated data entries coming from the vertical overlaps among Landsat tiles in the same row, and vertical and horizontal overlaps among Sentinel-2 granules from the same swath. This resulted in daily data availability for 1984-2021 (1: valid observation; 0: no data or no valid observation), which we used to derive availability information for composites with aggregation periods of five, 10, 15, 20, and 25 days; one, two, three, four, six and 12 months. The non-overlapping compositing windows compartmentalized daily information for each year into 73, 37, 24, 18, 15, 12, six, four, three, two, and one composites for each calendar year, respectively. We used January 1 as the starting date for the compositing window sequence for each year. When the last compositing window was shorter than half its window width, we merged it with the penultimate composite. For each data point and every considered aggregation period we recorded the number of available observations and considered a composite as 'successful' if at least one valid observation was available.
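The day-based windowing rule described above (non-overlapping windows starting on January 1, with a trailing window shorter than half the nominal width merged into the penultimate one) can be sketched as follows. This is a minimal illustration, not the authors' code; the function names `compositing_windows` and `successful_composites` are hypothetical, and only the day-based windows (5 to 25 days) are covered.

```python
def compositing_windows(days_in_year: int, width: int) -> list[tuple[int, int]]:
    """Return (start, end) day offsets from January 1, end exclusive."""
    windows = [(s, min(s + width, days_in_year))
               for s in range(0, days_in_year, width)]
    # Merge a trailing window shorter than half the nominal width
    # into the penultimate window.
    if len(windows) > 1 and (windows[-1][1] - windows[-1][0]) < width / 2:
        _, end = windows.pop()
        windows[-1] = (windows[-1][0], end)
    return windows


def successful_composites(daily_valid: list[int], width: int) -> list[bool]:
    """Flag each composite that contains at least one valid observation."""
    return [sum(daily_valid[s:e]) >= 1
            for s, e in compositing_windows(len(daily_valid), width)]


counts = [len(compositing_windows(365, w)) for w in (5, 10, 15, 20, 25)]
print(counts)  # [73, 37, 24, 18, 15]
```

For a 365-day year the sketch reproduces the composite counts quoted in the text for the five day-based windows.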
    Description

    Landsat and Sentinel-2 data archives provide ever-increasing amounts of satellite data. However, the availability of usable observations varies greatly in space and time. Pixel-based compositing that generates temporally equidistant cloud-free synthetic images can mitigate temporal variability by constructing uninterrupted time series using different compositing windows. Here, we evaluated the feasibility of using compositing windows ranging from five days to one year for 1984-2021 Landsat and 2015-2021 Sentinel-2 time series to derive uninterrupted time series across Europe. We considered separate and joint use of both data archives and analyzed the spatio-temporal availability of composites during each calendar year and pixel-specific growing season across a variety of time windows and hypothesizing data interpolation. Our results demonstrated opportunities and limitations in the available data records to support medium- and long-term analyses requiring uninterrupted time series of composites with sub-annual temporal resolution. Spatial disparities across different compositing windows provide guidance on the feasibility of workflows relying on different data densities and on the challenges in wall-to-wall analyses. The feasibility of consistent time series based on composites with sub-monthly aggregation periods was mostly limited to the combined Landsat and Sentinel-2 archives after 2015, yet in some geographies requires interpolation of up to 50% of data.

  5. CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 30, 2024
    Cite
    Hynek, Karel (2024). CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13382426
    Dataset updated
    Sep 30, 2024
    Dataset provided by
    Hynek, Karel
    Čejka, Tomáš
    Šiška, Pavel
    Koumar, Josef
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    CESNET-TimeSeries24: The dataset for network traffic forecasting and anomaly detection

    The dataset called CESNET-TimeSeries24 was collected by long-term monitoring of selected statistical metrics for 40 weeks for each IP address on the ISP network CESNET3 (Czech Education and Science Network). The dataset encompasses network traffic from more than 275,000 active IP addresses, assigned to a wide variety of devices, including office computers, NATs, servers, WiFi routers, honeypots, and video-game consoles found in dormitories. Moreover, the dataset is also rich in network anomaly types since it contains all types of anomalies, ensuring a comprehensive evaluation of anomaly detection methods. Last but not least, the CESNET-TimeSeries24 dataset provides traffic time series on institutional and IP subnet levels to cover all possible anomaly detection or forecasting scopes. Overall, the time series dataset was created from 66 billion IP flows that contain 4 trillion packets carrying approximately 3.7 petabytes of data. The CESNET-TimeSeries24 dataset is a complex real-world dataset that will finally bring insights into the evaluation of forecasting models in real-world environments.

    Please cite the usage of our dataset as:

    Josef Koumar, Karel Hynek, Tomáš Čejka, Pavel Šiška, "CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting", arXiv e-prints (2024): https://doi.org/10.48550/arXiv.2409.18874

    @misc{koumar2024cesnettimeseries24timeseriesdataset,
      title={CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting},
      author={Josef Koumar and Karel Hynek and Tomáš Čejka and Pavel Šiška},
      year={2024},
      eprint={2409.18874},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2409.18874},
    }

    Time series

    We create evenly spaced time series for each IP address by aggregating IP flow records into time series datapoints. The created datapoints represent the behavior of IP addresses within a defined time window of 10 minutes. The vector of time-series metrics v_{ip, i} describes the IP address ip in the i-th time window. Thus, IP flows for vector v_{ip, i} are captured in time windows starting at t_i and ending at t_{i+1}. The time series are built from these datapoints.

    Datapoints created by the aggregation of IP flows contain the following time-series metrics:

    Simple volumetric metrics: the number of IP flows, the number of packets, and the transmitted data size (i.e. number of bytes)

    Unique volumetric metrics: the number of unique destination IP addresses, the number of unique destination Autonomous System Numbers (ASNs), and the number of unique destination transport layer ports. The aggregation of unique volumetric metrics is memory intensive since all unique values must be stored in an array. We used a server with 41 GB of RAM, which was enough for 10-minute aggregation on the ISP network.

    Ratios metrics: the ratio of UDP/TCP packets, the ratio of UDP/TCP transmitted data size, the direction ratio of packets, and the direction ratio of transmitted data size

    Average metrics: the average flow duration, and the average Time To Live (TTL)
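As an illustration of how flow records might be turned into 10-minute datapoints with the simple, unique, and average metrics listed above, here is a minimal sketch. It is not the dataset authors' pipeline: the flow-record keys (`start`, `packets`, `bytes`, `dst_ip`, `dst_asn`, `dst_port`, `duration`, `ttl`) are hypothetical, and assigning each flow to a window by its start time is an assumption.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # 10-minute aggregation window


def aggregate_flows(flows, t0):
    """Group the flow records of one IP address into 10-minute datapoints.

    `flows` is a list of dicts with hypothetical keys; `t0` is the start
    of the first window in seconds. Windows without flows are omitted here.
    """
    buckets = defaultdict(list)
    for flow in flows:
        # Assumption: a flow belongs to the window containing its start time.
        buckets[(flow["start"] - t0) // WINDOW_SECONDS].append(flow)

    datapoints = {}
    for i, group in sorted(buckets.items()):
        datapoints[i] = {
            "n_flows": len(group),
            "n_packets": sum(f["packets"] for f in group),
            "n_bytes": sum(f["bytes"] for f in group),
            # Unique metrics need every distinct value of the window in memory.
            "n_dest_ip": len({f["dst_ip"] for f in group}),
            "n_dest_asn": len({f["dst_asn"] for f in group}),
            "n_dest_port": len({f["dst_port"] for f in group}),
            "avg_duration": sum(f["duration"] for f in group) / len(group),
            "avg_ttl": sum(f["ttl"] for f in group) / len(group),
        }
    return datapoints


flows = [
    {"start": 10, "packets": 4, "bytes": 400, "dst_ip": "192.0.2.1",
     "dst_asn": 64500, "dst_port": 443, "duration": 2.0, "ttl": 64},
    {"start": 30, "packets": 6, "bytes": 600, "dst_ip": "192.0.2.1",
     "dst_asn": 64500, "dst_port": 53, "duration": 4.0, "ttl": 60},
    {"start": 700, "packets": 1, "bytes": 50, "dst_ip": "198.51.100.7",
     "dst_asn": 64501, "dst_port": 443, "duration": 1.0, "ttl": 128},
]
points = aggregate_flows(flows, t0=0)
```

The sets used for the unique metrics show why that part of the aggregation is memory intensive: every distinct destination seen in a window has to be held until the window closes.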

    Multiple time aggregation: The original datapoints in the dataset are aggregated by 10 minutes of network traffic. The size of the aggregation interval influences anomaly detection procedures, mainly the training speed of the detection model. However, the 10-minute intervals can be too short for longitudinal anomaly detection methods. Therefore, we added two more aggregation intervals to the datasets--1 hour and 1 day.

    Time series of institutions: We identify 283 institutions inside the CESNET3 network. These time series aggregated per each institution ID provide a view of the institution's data.

    Time series of institutional subnets: We identify 548 institution subnets inside the CESNET3 network. These time series aggregated per each institution ID provide a view of the institution subnet's data.

    Data Records

    The file hierarchy is described below:

    cesnet-timeseries24/
     |- institution_subnets/
     |   |- agg_10_minutes/<id_institution>.csv
     |   |- agg_1_hour/<id_institution>.csv
     |   |- agg_1_day/<id_institution>.csv
     |   |- identifiers.csv
     |- institutions/
     |   |- agg_10_minutes/<id_institution_subnet>.csv
     |   |- agg_1_hour/<id_institution_subnet>.csv
     |   |- agg_1_day/<id_institution_subnet>.csv
     |   |- identifiers.csv
     |- ip_addresses_full/
     |   |- agg_10_minutes/<id_ip_folder>/<id_ip>.csv
     |   |- agg_1_hour/<id_ip_folder>/<id_ip>.csv
     |   |- agg_1_day/<id_ip_folder>/<id_ip>.csv
     |   |- identifiers.csv
     |- ip_addresses_sample/
     |   |- agg_10_minutes/<id_ip>.csv
     |   |- agg_1_hour/<id_ip>.csv
     |   |- agg_1_day/<id_ip>.csv
     |   |- identifiers.csv
     |- times/
     |   |- times_10_minutes.csv
     |   |- times_1_hour.csv
     |   |- times_1_day.csv
     |- ids_relationship.csv
     |- weekends_and_holidays.csv
    

    The following list describes time series data fields in CSV files:

    id_time: Unique identifier for each aggregation interval within the time series, used to segment the dataset into specific time periods for analysis.

    n_flows: Total number of flows observed in the aggregation interval, indicating the volume of distinct sessions or connections for the IP address.

    n_packets: Total number of packets transmitted during the aggregation interval, reflecting the packet-level traffic volume for the IP address.

    n_bytes: Total number of bytes transmitted during the aggregation interval, representing the data volume for the IP address.

    n_dest_ip: Number of unique destination IP addresses contacted by the IP address during the aggregation interval, showing the diversity of endpoints reached.

    n_dest_asn: Number of unique destination Autonomous System Numbers (ASNs) contacted by the IP address during the aggregation interval, indicating the diversity of networks reached.

    n_dest_port: Number of unique destination transport layer ports contacted by the IP address during the aggregation interval, representing the variety of services accessed.

    tcp_udp_ratio_packets: Ratio of packets sent using TCP versus UDP by the IP address during the aggregation interval, providing insight into the transport protocol usage pattern. This metric belongs to the interval <0, 1> where 1 is when all packets are sent over TCP, and 0 is when all packets are sent over UDP.

    tcp_udp_ratio_bytes: Ratio of bytes sent using TCP versus UDP by the IP address during the aggregation interval, highlighting the data volume distribution between protocols. This metric belongs to the interval <0, 1> with same rule as tcp_udp_ratio_packets.

    dir_ratio_packets: Ratio of packet directions (inbound versus outbound) for the IP address during the aggregation interval, indicating the balance of traffic flow directions. This metric belongs to the interval <0, 1>, where 1 is when all packets are sent in the outgoing direction from the monitored IP address, and 0 is when all packets are sent in the incoming direction to the monitored IP address.

    dir_ratio_bytes: Ratio of byte directions (inbound versus outbound) for the IP address during the aggregation interval, showing the data volume distribution in traffic flows. This metric belongs to the interval <0, 1> with the same rule as dir_ratio_packets.

    avg_duration: Average duration of IP flows for the IP address during the aggregation interval, measuring the typical session length.

    avg_ttl: Average Time To Live (TTL) of IP flows for the IP address during the aggregation interval, providing insight into the lifespan of packets.
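The four ratio fields are only characterized by their endpoints (1 for all TCP or all outgoing, 0 for all UDP or all incoming). One form consistent with those endpoints is a share `a / (a + b)`; the sketch below assumes that form, and both the helper name `share` and the handling of windows with no traffic are hypothetical.

```python
def share(a: float, b: float):
    """Return a / (a + b) in [0, 1], or None when both counts are zero.

    Assumed form only: the field descriptions give the endpoints
    (1 = all `a`, 0 = all `b`), not the exact formula.
    """
    total = a + b
    return a / total if total else None


# Hypothetical per-window counts for one IP address.
tcp_packets, udp_packets = 3, 1
out_bytes, in_bytes = 0, 500

tcp_udp_ratio_packets = share(tcp_packets, udp_packets)  # 0.75
dir_ratio_bytes = share(out_bytes, in_bytes)              # 0.0
```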

    Moreover, the time series created by re-aggregation contain the following time series metrics instead of n_dest_ip, n_dest_asn, and n_dest_port:

    sum_n_dest_ip: Sum of numbers of unique destination IP addresses.

    avg_n_dest_ip: The average number of unique destination IP addresses.

    std_n_dest_ip: Standard deviation of numbers of unique destination IP addresses.

    sum_n_dest_asn: Sum of numbers of unique destination ASNs.

    avg_n_dest_asn: The average number of unique destination ASNs.

    std_n_dest_asn: Standard deviation of numbers of unique destination ASNs.

    sum_n_dest_port: Sum of numbers of unique destination transport layer ports.

    avg_n_dest_port: The average number of unique destination transport layer ports.

    std_n_dest_port: Standard deviation of numbers of unique destination transport layer ports.
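Unique counts from separate 10-minute windows cannot simply be added to obtain a unique count for the longer window, because the same destination may recur, which is presumably why the re-aggregated series carry sum, average, and standard deviation instead. A minimal sketch of re-aggregating six 10-minute datapoints into one hourly datapoint follows. The function name `reaggregate` is hypothetical, summing the simple volumetric fields is an assumption, and so is the use of the population standard deviation (`statistics.pstdev`), since the description does not say which estimator was used.

```python
from statistics import mean, pstdev

UNIQUE_FIELDS = ("n_dest_ip", "n_dest_asn", "n_dest_port")
SUMMED_FIELDS = ("n_flows", "n_packets", "n_bytes")


def reaggregate(datapoints, factor=6):
    """Combine consecutive datapoints, e.g. six 10-minute ones into 1 hour.

    Assumes `datapoints` is a gap-free, time-ordered list of dicts.
    """
    out = []
    for i in range(0, len(datapoints), factor):
        chunk = datapoints[i:i + factor]
        row = {key: sum(p[key] for p in chunk) for key in SUMMED_FIELDS}
        for key in UNIQUE_FIELDS:
            values = [p[key] for p in chunk]
            row[f"sum_{key}"] = sum(values)
            row[f"avg_{key}"] = mean(values)
            row[f"std_{key}"] = pstdev(values)  # assumption: population std
        out.append(row)
    return out


ten_minute = [
    {"n_flows": 1, "n_packets": 10, "n_bytes": 100,
     "n_dest_ip": k, "n_dest_asn": 1, "n_dest_port": 2}
    for k in range(1, 7)
]
hourly = reaggregate(ten_minute)
```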

    Moreover, the identifiers.csv file in each dataset type contains the IDs of the time series present in that dataset type. Furthermore, the ids_relationship.csv file contains the relationships between IP addresses, institutions, and institution subnets. The weekends_and_holidays.csv file contains information about the non-working days in the Czech Republic.

  6. Data from: A consistent data model for different data granularity in control...

    • figshare.com
    • tandf.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Scott D. Grimshaw (2023). A consistent data model for different data granularity in control charts [Dataset]. http://doi.org/10.6084/m9.figshare.19829476.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Scott D. Grimshaw
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    After a long-running show was canceled, control charts are used to identify if and when viewing drops. The finest granularity daily viewing has high autocorrelation and control charts use residuals from a seasonal ARIMA model. For coarse granularity data (weekly and monthly viewing) an approximate AR model is derived to be consistent with the finest granularity model. With the proposed approach, a longer memory model is used in the granular data control charts that reduces the number of false alarms from control charts constructed treating granular data as a different measurement.

  7. NCOM SFC8 Hindcast Aggregation/Best Time Series

    • datadiscoverystudio.org
    opendap
    Updated May 4, 2018
    Cite
    USA/NAVY/NAVO (2018). NCOM SFC8 Hindcast Aggregation/Best Time Series [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/128083e1df8b454280d17ca40a93d412/html
    Available download formats: opendap
    Dataset updated
    May 4, 2018
    Authors
    USA/NAVY/NAVO
    Area covered
    Africa
    Description

    Best time series, taking the data from the most recent run available.

  8. Data from: Using partial aggregation in Spatial Capture Recapture

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated May 28, 2022
    Cite
    Cyril Milleret; Pierre Dupont; Henrik Brøseth; Jonas Kindberg; J. Andrew Royle; Richard Bischof (2022). Data from: Using partial aggregation in Spatial Capture Recapture [Dataset]. http://doi.org/10.5061/dryad.pd612qp
    Available download formats: bin
    Dataset updated
    May 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Cyril Milleret; Pierre Dupont; Henrik Brøseth; Jonas Kindberg; J. Andrew Royle; Richard Bischof
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description
    1. Spatial capture-recapture (SCR) models are commonly used for analyzing data collected using non-invasive genetic sampling (NGS). Opportunistic NGS often leads to detections that do not occur at discrete detector locations. Therefore, spatial aggregation of individual detections into fixed detectors (e.g. center of grid cells) is an option to increase computing speed of SCR analyses. However, it may reduce precision and accuracy of parameter estimations.
    2. Using simulations, we explored the impact that spatial aggregation of detections has on a trade-off between computing time and parameter precision and bias, under a range of biological conditions. We used three different observation models: the commonly used Poisson and Bernoulli models, as well as a novel way to partially aggregate detections (Partially Aggregated Binary model (PAB)) to reduce the loss of information after aggregating binary detections. The PAB model divides detectors into K subdetectors and models the frequency of subdetectors with more than one detection as a binomial response with a sample size of K. Finally, we demonstrate the consequences of aggregation and the use of the PAB model using NGS data from the monitoring of wolverine (Gulo gulo) in Norway.
    3. Spatial aggregation of detections, while reducing computation time, does indeed incur costs in terms of reduced precision and accuracy, especially for the parameters of the detection function. SCR models estimated abundance with a low bias (< 10%) even at a high degree of aggregation, but only for the Poisson and PAB models. Overall, the cost of aggregation is mitigated when using the Poisson and PAB models. At the same level of aggregation, the PAB observation model outperforms the Bernoulli model in terms of accuracy of estimates, while offering the benefits of a binary observation model (fewer assumptions about the underlying ecological process) over the count-based model.
    4. We recommend that detector spacing after aggregation does not exceed 1.5 times the scale-parameter of the detection function in order to limit bias. We recommend the use of the PAB observation model when performing spatial aggregation of binary data as it can mitigate the cost of aggregation, compared to the Bernoulli model.
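To make the partial-aggregation idea concrete, the sketch below counts, for one aggregated detector cell, how many of its K subdetectors reach a detection threshold, giving the numerator and sample size of the binomial response. It is an illustration under stated assumptions, not the authors' implementation: a square cell split into `k_per_side` x `k_per_side` subdetectors is assumed, all names are hypothetical, and the threshold is left as an explicit argument because the abstract's wording ("more than one detection") is all the source gives. The second helper encodes the spacing recommendation from point 4.

```python
def pab_response(detections, cell_origin, cell_size, k_per_side, *, min_detections):
    """Return (y, K) for one detector cell.

    y is the number of subdetectors with at least `min_detections`
    detections; K = k_per_side ** 2 is the binomial sample size.
    `detections` holds (x, y) coordinates assumed to lie inside the cell.
    """
    x0, y0 = cell_origin
    sub = cell_size / k_per_side
    counts = {}
    for x, y in detections:
        # Clamp so points on the upper cell edge fall in the last subdetector.
        col = min(int((x - x0) // sub), k_per_side - 1)
        row = min(int((y - y0) // sub), k_per_side - 1)
        counts[(row, col)] = counts.get((row, col), 0) + 1
    y_obs = sum(1 for c in counts.values() if c >= min_detections)
    return y_obs, k_per_side ** 2


def spacing_within_recommendation(spacing, sigma):
    """Check that detector spacing does not exceed 1.5 times the scale parameter."""
    return spacing <= 1.5 * sigma


dets = [(0.5, 0.5), (0.6, 0.4), (3.5, 3.5)]
print(pab_response(dets, (0.0, 0.0), 4.0, 2, min_detections=1))  # (2, 4)
print(pab_response(dets, (0.0, 0.0), 4.0, 2, min_detections=2))  # (1, 4)
```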
  9. Monthly aggregated GLASS FAPAR V6 (250 m): 50th percentile monthly...

    • zenodo.org
    png, tiff
    Updated Jul 11, 2024
    Cite
    Yu-Feng Ho; Xuemeng Tian; Davide Consoli; Julia Hackländer; Tomislav Hengl (2024). Monthly aggregated GLASS FAPAR V6 (250 m): 50th percentile monthly time-series (2009) [Dataset]. http://doi.org/10.5281/zenodo.8417513
    Available download formats: tiff, png
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yu-Feng Ho; Xuemeng Tian; Davide Consoli; Julia Hackländer; Tomislav Hengl
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Time period covered
    Mar 1, 2000 - Dec 31, 2021
    Description

    List of Subdatasets:

    General Description

    The monthly aggregated Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) dataset is derived from the 250 m, 8-day GLASS V6 FAPAR product. The GLASS product is derived from Moderate Resolution Imaging Spectroradiometer (MODIS) reflectance and LAI data, using several other FAPAR products (MODIS Collection 6, GLASS FAPAR V5, and PROBA-V1 FAPAR) to train a bidirectional long short-term memory (Bi-LSTM) model that estimates FAPAR. The dataset spans March 2000 to December 2021 and covers the entire globe. It can be used in applications such as land degradation modeling, land productivity mapping, and land potential mapping. The dataset includes:

    • Long-term:

    Derived from monthly time-series. This dataset provides a linear trend model for the p95 variable: slope beta mean (p95.beta_m), p-value for beta (p95.beta_pv), intercept alpha mean (p95.alpha_m), p-value for alpha (p95.alpha_pv), and coefficient of determination R2 (p95.r2_m).

    • Monthly time-series:

    Monthly aggregation with three standard statistics: 5th percentile (p05), median (p50), and 95th percentile (p95). For each month, we aggregate all composites within that month plus one composite each before and after, ending up with 5 to 6 composites for a single month depending on the number of images within that month.
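
    The monthly windowing described above can be sketched for a single pixel as follows. This is an illustration only, not the Scikit-map implementation: the function names, the made-up FAPAR values, the simplified fixed 8-day cadence, and the linear-interpolation percentile are all assumptions.

```python
from datetime import date, timedelta


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def monthly_window(composites, year, month):
    """Values of all composites in the month, plus one before and one after.

    `composites` is a date-sorted list of (date, value) pairs.
    """
    idx = [i for i, (d, _) in enumerate(composites)
           if d.year == year and d.month == month]
    if not idx:
        return []
    first = max(idx[0] - 1, 0)
    last = min(idx[-1] + 1, len(composites) - 1)
    return [v for _, v in composites[first:last + 1]]


# Hypothetical single-pixel series: one composite every 8 days (assumed cadence).
start = date(2009, 1, 1)
series = [(start + timedelta(days=8 * i), 0.30 + 0.01 * i) for i in range(12)]

window = monthly_window(series, 2009, 2)
stats = {name: percentile(window, q)
         for name, q in (("p05", 5), ("p50", 50), ("p95", 95))}
```

    With this toy series, February contains four composites, so the window holds six values, consistent with the "5 to 6 composites" stated above.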

    Data Details

    • Time period: March 2000 – December 2021
    • Type of data: Fraction of Absorbed Photosynthetically Active Radiation (FAPAR)
    • How the data was collected or derived: Derived from 250 m 8-day GLASS V6 FAPAR using Python running on a local HPC. The time-series analyses were computed using the Scikit-map Python package.
    • Statistical methods used: for the long-term, Ordinary Least Squares (OLS) of the p95 monthly variable; for the monthly time-series, percentiles 05, 50, and 95.
    • Limitations or exclusions in the data: The dataset does not include data for Antarctica.
    • Coordinate reference system: EPSG:4326
    • Bounding box (Xmin, Ymin, Xmax, Ymax): (-180.00000, -62.0008094, 179.9999424, 87.37000)
    • Spatial resolution: 1/480 d.d. = 0.00208333 (250m)
    • Image size: 172,800 x 71,698
    • File format: Cloud Optimized Geotiff (COG) format.
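
    As a quick consistency check, the stated image size follows from the bounding box and the spatial resolution listed above. The variable names below are illustrative; the numbers are copied from the list.

```python
# Values copied from the Data Details list.
pixel_deg = 1 / 480  # spatial resolution in decimal degrees
xmin, ymin, xmax, ymax = -180.00000, -62.0008094, 179.9999424, 87.37000

# Number of pixels along each axis, rounded to the nearest whole pixel.
cols = round((xmax - xmin) / pixel_deg)
rows = round((ymax - ymin) / pixel_deg)

print(cols, rows)
```

    Both values agree with the listed image size of 172,800 x 71,698.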

    Support

    If you discover a bug, artifact, or inconsistency, or if you have a question please raise a GitHub issue: https://github.com/Open-Earth-Monitor/Global_FAPAR_250m/issues

    Reference

    Hackländer, J., Parente, L., Ho, Y.-F., Hengl, T., Simoes, R., Consoli, D., Şahin, M., Tian, X., Herold, M., Jung, M., Duveiller, G., Weynants, M., Wheeler, I., (2023?) "Land potential assessment and trend-analysis using 2000–2021 FAPAR monthly time-series at 250 m spatial resolution", submitted to PeerJ, preprint available at: https://doi.org/10.21203/rs.3.rs-3415685/v1

    Name convention

    To ensure consistency and ease of use across and within the projects, we follow the standard Open-Earth-Monitor file-naming convention. The convention uses 10 fields that describe important properties of the data, so that users can search files, prepare data analyses, etc., without needing to open the files. The fields are:

    1. generic variable name: fapar = Fraction of Absorbed Photosynthetically Active Radiation
    2. variable procedure combination: essd.lstm = Earth System Science Data with bidirectional long short-term memory (Bi–LSTM)
    3. Position in the probability distribution / variable type: p05/p50/p95 = 5th/50th/95th percentile
    4. Spatial support: 250m
    5. Depth reference: s = surface
    6. Time reference begin time: 20000301 = 2000-03-01
    7. Time reference end time: 20211231 = 2021-12-31
    8. Bounding box: go = global (without Antarctica)
    9. EPSG code: epsg.4326 = EPSG:4326
    10. Version code: v20230628 = 2023-06-28 (creation date)
  10. Australia Aggregate Monthly Hours Worked: Trend: Part Time: Female

    • ceicdata.com
    Updated Mar 19, 2025
    Cite
    Australia Aggregate Monthly Hours Worked: Trend: Part Time: Female [Dataset]. https://www.ceicdata.com/en/australia/aggregate-monthly-hours-worked/aggregate-monthly-hours-worked-trend-part-time-female
    Explore at:
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 1, 2024 - Jan 1, 2025
    Area covered
    Australia
    Variables measured
    Hours Worked
    Description

    Australia Aggregate Monthly Hours Worked: Trend: Part Time: Female data was reported at 225,972.411 Hour th in Jan 2025. This records an increase from the previous number of 225,859.486 Hour th for Dec 2024. The data is updated monthly, with a median of 123,113.874 Hour th from Jul 1978 to Jan 2025 (559 observations). The data reached an all-time high of 225,972.411 Hour th in Jan 2025 and a record low of 46,197.368 Hour th in Jul 1978. The series remains in active status in CEIC and is reported by the Australian Bureau of Statistics. The data is categorized under Global Database’s Australia – Table AU.G052: Aggregate Monthly Hours Worked.

  11. Script for aggregating Norfolk, VA environmental data to daily time scale

    • hydroshare.org
    • search.dataone.org
    zip
    Updated Mar 1, 2018
    Cite
    Jeff Sadler (2018). Script for aggregating Norfolk, VA environmental data to daily time scale [Dataset]. http://doi.org/10.4211/hs.41c8d8f8788c4ba0b0bfbb924fe1d403
    Explore at:
    Available download formats: zip (145.7 KB)
    Dataset updated
    Mar 1, 2018
    Dataset provided by
    HydroShare
    Authors
    Jeff Sadler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2010 - Nov 1, 2016
    Area covered
    Description

    Script and accompanying ipython notebook written in Python 2.7 for aggregating sub-daily environmental data (rainfall, tide, wind, groundwater) to a daily timescale. The input data are from Norfolk, Virginia. Several different methods of aggregation are used including averages and maximums. The processed/aggregated data are combined with street flood report data to be used in data-driven, predictive modeling. The script in this resource was used in the analysis described in this Journal of Hydrology paper: https://doi.org/10.1016/j.jhydrol.2018.01.044.

  12. India CS: Aggregate Deposits of Residents: Time

    • ceicdata.com
    Updated Mar 26, 2025
    Cite
    India CS: Aggregate Deposits of Residents: Time [Dataset]. https://www.ceicdata.com/en/india/commercial-bank-survey/cs-aggregate-deposits-of-residents-time
    Explore at:
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 1, 2017 - Sep 1, 2018
    Area covered
    India
    Variables measured
    Loans
    Description

    India CS: Aggregate Deposits of Residents: Time data was reported at 103,241,410.000 INR mn in Sep 2018. This records an increase from the previous number of 102,539,530.000 INR mn for Aug 2018. The data is updated monthly, with a median of 30,494,680.000 INR mn from Mar 1999 to Sep 2018 (235 observations). The data reached an all-time high of 103,241,410.000 INR mn in Sep 2018 and a record low of 5,454,360.000 INR mn in Mar 1999. The series remains in active status in CEIC and is reported by the Reserve Bank of India. The data is categorized under Global Database’s India – Table IN.KAC003: Commercial Bank Survey.

  13. On the Stability of the Excess Sensitivity of Aggregate Consumption Growth...

    • b2find.dkrz.de
    Updated Oct 24, 2023
    Cite
    (2023). On the Stability of the Excess Sensitivity of Aggregate Consumption Growth in the USA (replication data) - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/59734c18-eb3f-5cb2-b52e-d42d14b0d180
    Explore at:
    Dataset updated
    Oct 24, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This paper investigates whether there is time variation in the excess sensitivity of aggregate consumption growth to anticipated aggregate disposable income growth using quarterly US data over the period 1953-2014. Our empirical framework contains the possibility of stickiness in aggregate consumption growth and takes into account measurement error and time aggregation. Our empirical specification is cast into a Bayesian state-space model and estimated using Markov chain Monte Carlo (MCMC) methods. We use a Bayesian model selection approach to deal with the non-regular test for the null hypothesis of no time variation in the excess sensitivity parameter. Anticipated disposable income growth is calculated by incorporating an instrumental variables estimation approach into our MCMC algorithm. Our results suggest that the excess sensitivity parameter in the USA is stable at around 0.23 over the entire sample period.

  14. WWF Italy (aggregated per 1-degree cell)

    • erddap.eurobis.org
    • obis.org
    • +2more
    Updated Jan 15, 2025
    Cite
    Casale (2025). WWF Italy (aggregated per 1-degree cell) [Dataset]. https://erddap.eurobis.org/erddap/info/zd_1826_1deg/index.html
    Explore at:
    Dataset updated
    Jan 15, 2025
    Dataset authored and provided by
    Casale
    Area covered
    Variables measured
    sex, time, Notes, aphia_id, latitude, TimeOfDay, lifestage, longitude, DayCollected, BasisOfRecord, and 5 more
    Description

    Original provider: Paolo Casale Dataset credits: Data provider WWF Italy's Sea Turtle Network Originating data center Satellite Tracking and Analysis Tool (STAT) Supplemental information: Visit STAT's project page for additional information. This dataset is a summarized representation of the telemetry locations aggregated per species per 1-degree cell. AccConID=24 AccConstrDescription=This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms AccConstrDisplay=This dataset is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. AccConstrEN=Attribution-NonCommercial (CC BY-NC) AccessConstraint=Attribution-NonCommercial (CC BY-NC) AccessConstraints=This dataset is a summarized representation of the telemetry locations aggregated per species per 1-degree cell. Acronym=None added_date=2024-06-04 11:58:39.543000 BrackishFlag=0 CDate=2023-04-12 cdm_data_type=Other CheckedFlag=0 Citation=Casale P. 2021. WWF Italy. Data originated from Satellite Tracking and Analysis Tool (STAT; http://www.seaturtle.org/tracking/index.shtml?project_id=184). Comments=None ContactEmail=paolo.casale1@gmail.com Conventions=COARDS, CF-1.6, ACDD-1.3 CurrencyDate=None DasID=8288 DasOrigin=None DasType=None DasTypeID=None DateLastModified={'date': '2025-02-18 01:34:01.301036', 'timezone_type': 1, 'timezone': '+01:00'} DescrCompFlag=0 DescrTransFlag=0 Easternmost_Easting=49.5 EmbargoDate=None EngAbstract=Original provider: Paolo Casale Dataset credits: Data provider WWF Italy's Sea Turtle Network Originating data center Satellite Tracking and Analysis Tool (STAT) Supplemental information: Visit STAT's project page for additional information. This dataset is a summarized representation of the telemetry locations aggregated per species per 1-degree cell. 
EngDescr=None FreshFlag=0 GBIF_UUID=5e413639-a91c-41ba-aa33-8583c479a3fa geospatial_lat_max=47.5 geospatial_lat_min=25.5 geospatial_lat_units=degrees_north geospatial_lon_max=49.5 geospatial_lon_min=-53.5 geospatial_lon_units=degrees_east infoUrl=None InputNotes=None institution=WWF License=https://creativecommons.org/licenses/by-nc/4.0 Lineage=None MarineFlag=1 modified_sync=2024-05-21 00:00:00 Northernmost_Northing=47.5 OrigAbstract=None OrigDescr=None OrigDescrLang=None OrigDescrLangNL=None OrigLangCode=None OrigLangCodeExtended=None OrigLangID=None OrigTitle=None OrigTitleLang=None OrigTitleLangCode=None OrigTitleLangID=None OrigTitleLangNL=None Progress=None PublicFlag=1 ReleaseDate=Apr 24 2021 12:00AM ReleaseDate0=2021-04-24 RevisionDate=None SizeReference=None sourceUrl=(local files) Southernmost_Northing=25.5 standard_name_vocabulary=CF Standard Name Table v70 StandardTitle=WWF Italy (aggregated per 1-degree cell) StatusID=1 subsetVariables=ScientificName,BasisOfRecord,YearCollected,MonthCollected,DayCollected,sex,lifestage,aphia_id TerrestrialFlag=0 UDate=2023-04-20 VersionDate=Apr 24 2021 12:00AM VersionDay=None VersionMonth=None VersionName=None VersionYear=None VlizCoreFlag=1 Westernmost_Easting=-53.5

  15. NMME CCSM4 Pressure at Sea Level Daily Aggregation R01 PSL By time,...

    • ncei.noaa.gov
    Cite
    NMME CCSM4 Pressure at Sea Level Daily Aggregation R01 PSL By time, latitude, longitude [Dataset]. https://www.ncei.noaa.gov/erddap/info/nmme_ccsm4_psl_day_r01_by_time_LAT_LON/index.html
    Explore at:
    Time period covered
    Jan 1, 2018 - Feb 28, 2026
    Area covered
    Variables measured
    PSL, time, latitude, longitude
    Description

    NMME CCSM4 Pressure at Sea Level Daily Aggregation R01 PSL Dimensioned By time, latitude, longitude. _CoordSysBuilder=ucar.nc2.dataset.conv.CF1Convention cdm_data_type=Grid contact=Dughong Min (dmin@rsmas.miami.edu) and Ben Kirtman (bkirtman@rsmas.miami.edu) Conventions=CF-1.4 Easternmost_Easting=359.0 endmonth=01 endyear=2026 experiment=Febuary 2025 Forecast experiment_id=Mon Mar 10 12:06:20 PM EDT 2025 frequency=day Generator=NCL v.6.0 geospatial_lat_max=90.0 geospatial_lat_min=-90.0 geospatial_lat_resolution=1.0 geospatial_lat_units=degrees_north geospatial_lon_max=359.0 geospatial_lon_min=0.0 geospatial_lon_resolution=1.0 geospatial_lon_units=degrees_east history=FMRC Best Dataset infoUrl=https://www.ncei.noaa.gov/thredds/catalog/model-nmme_ccsm4_psl_day_r01_agg/catalog.html?dataset=model-nmme_ccsm4_psl_day_r01_agg/NMME_CCSM4_Pressure_at_Sea_Level_Daily_Aggregation_R01_best.ncd institution=Univ. of Miami - Rosenstiel School of Marine & Atmosphereric Science institution_id=UM-RSMAS location=Proto fmrc:NMME_CCSM4_Pressure_at_Sea_Level_Daily_Aggregation_R01 model_id=CCSM4_0_a02 modeling_realm=atmos Northernmost_Northing=90.0 project_id=National Multi-Model Ensembles(NMME) project realization=01 References=Ben P. Kirtman, Dughong Min. (2009) Multimodel Ensemble ENSO Prediction with CCSM and CFS. Monthly Weather Review 137:9, 2908-2930 sourceUrl=https://www.ncei.noaa.gov/thredds/dodsC/model-nmme_ccsm4_psl_day_r01_agg/NMME_CCSM4_Pressure_at_Sea_Level_Daily_Aggregation_R01_best.ncd Southernmost_Northing=-90.0 startmonth=02 startyear=2025 time_coverage_end=2026-02-28T12:00:00Z time_coverage_start=2018-01-01T12:00:00Z Westernmost_Easting=0.0

  16. Data_Sheet_1_An Expanded Polyproline Domain Maintains Mutant Huntingtin...

    • frontiersin.figshare.com
    docx
    Updated Jun 9, 2023
    Cite
    Maria Lucia Pigazzini; Mandy Lawrenz; Anca Margineanu; Gabriele S. Kaminski Schierle; Janine Kirstein (2023). Data_Sheet_1_An Expanded Polyproline Domain Maintains Mutant Huntingtin Soluble in vivo and During Aging.docx [Dataset]. http://doi.org/10.3389/fnmol.2021.721749.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    Frontiers
    Authors
    Maria Lucia Pigazzini; Mandy Lawrenz; Anca Margineanu; Gabriele S. Kaminski Schierle; Janine Kirstein
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Huntington’s disease is a dominantly inherited neurodegenerative disorder caused by the expansion of a CAG repeat, encoding the amino acid glutamine (Q), present in the first exon of the protein huntingtin. Above the threshold of Q39, HTT exon 1 (HTTEx1) tends to misfold and aggregate into large intracellular structures, but whether these end-stage aggregates or their on-pathway intermediates are responsible for cytotoxicity is still debated. HTTEx1 can be separated into three domains: an N-terminal 17 amino acid region, the polyglutamine (polyQ) expansion, and a C-terminal proline-rich domain (PRD). Alongside the expanded polyQ, these flanking domains influence the aggregation propensity of HTTEx1, with the N17 initiating and promoting aggregation and the PRD modulating it. In this study we focus on the first 11 amino acids of the PRD, a stretch of pure prolines, which are an evolutionarily recent addition to the expanding polyQ region. We hypothesize that this proline region is expanding alongside the polyQ to counteract its ability to misfold and cause toxicity, and that expanding this proline region would be overall beneficial. We generated HTTEx1 mutants lacking both flanking domains singularly, missing the first 11 prolines of the PRD, or with this stretch of prolines expanded. We then followed their aggregation landscape in vitro with a battery of biochemical assays, and in vivo in novel models of C. elegans expressing the HTTEx1 mutants pan-neuronally. Employing fluorescence lifetime imaging, we could observe the aggregation propensity of all HTTEx1 mutants during aging and correlate this with toxicity via various phenotypic assays. We found that the presence of an expanded proline stretch is beneficial in maintaining HTTEx1 soluble over time, regardless of polyQ length. However, the expanded prolines were only advantageous in promoting the survival and fitness of an organism carrying a pathogenic stretch of Q48, and were extremely deleterious to the nematode expressing a physiological stretch of Q23. Our results reveal the unique importance of the prolines, which have been and still are evolving alongside expanding glutamines to promote the function of HTTEx1 and avoid pathology.

  17. Data from: S1 Data -

    • plos.figshare.com
    txt
    Updated Dec 14, 2023
    Cite
    Mahadee Al Mobin; Md. Kamrujjaman (2023). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0295803.s001
    Explore at:
    Available download formats: txt
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mahadee Al Mobin; Md. Kamrujjaman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data scarcity and discontinuity are common in healthcare and epidemiological datasets, yet such data are often needed to make informed decisions and forecast upcoming scenarios. To avoid these problems, the data are often processed as monthly or yearly aggregates, for which prevalent forecasting tools such as Autoregressive Integrated Moving Average (ARIMA), Seasonal Autoregressive Integrated Moving Average (SARIMA), and TBATS often fail to provide satisfactory results. Artificial data synthesis methods have proven to be a powerful tool for tackling these challenges. The paper proposes a novel algorithm, the Stochastic Bayesian Downscaling (SBD) algorithm, based on the Bayesian approach, that can regenerate downscaled time series of varying time lengths from aggregated data while preserving most of the statistical characteristics and the aggregated sum of the original data. The paper presents two epidemiological time series case studies from Bangladesh (Dengue, Covid-19) to showcase the workflow of the algorithm. The case studies illustrate that the synthesized data agree with the original data in statistical properties, trend, seasonality, and residuals. In terms of forecasting performance, using the last 12 years of Dengue infection data in Bangladesh, we were able to decrease error terms by up to 72.76% using synthetic data over actual aggregated data.

  18. Bat-aggregated time series workflow

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Oct 15, 2024
    Cite
    Brian Lee (2024). Bat-aggregated time series workflow [Dataset]. http://doi.org/10.5061/dryad.w0vt4b8zf
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 15, 2024
    Dataset provided by
    University of California, Santa Barbara
    Authors
    Brian Lee
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    This dataset and code provide radar-based detections of Brazilian free-tailed bats (Tadarida brasiliensis) across select regions of California and Texas, compiled using weather radar data from the NEXRAD (NEXt-generation weather RADar) system. NEXRAD radars, operated by the US National Weather Service, continuously monitor the airspace, detecting various airborne organisms including birds, insects, and bats. The dataset was generated using the ‘BATS’ Python toolkit (program included), which automates the retrieval, processing, and classification of radar data. It employs a pre-trained machine learning model specifically designed to detect radar echoes associated with Brazilian free-tailed bats. The dataset includes the results from machine learning models trained and tested on radar data, which achieved an AUC of 0.963, demonstrating high accuracy in identifying bat activity. The dataset also includes pre-trained neural network and random forest models for reproducibility. This dataset provides valuable spatiotemporal information on bat presence at a large landscape scale and across extended timeframes. By distilling radar data into efficient summaries of bat occurrence, the dataset enables researchers to explore patterns in bat activity and their potential ecosystem services, such as insect consumption, in agricultural regions.

    Methods

    Data Description

    This dataset provides detailed radar-based detections of Brazilian free-tailed bats (Tadarida brasiliensis) across select regions of California and Texas. The data were compiled from the NEXRAD (NEXt-generation weather RADar) system, which operates S-band Doppler weather radars across the United States. NEXRAD radars detect various airborne targets such as birds, insects, and bats. The dataset is processed using the 'BATS' Python toolkit, which automates the retrieval and classification of radar data. Using radar data sourced from the Amazon Web Services (AWS) repository, the BATS toolkit classifies radar echoes based on a machine learning model trained to identify Brazilian free-tailed bats. The dataset contains bat presence information at a pixel resolution of 70 meters, derived from radar data over multiple time periods in 2018 and 2019. This data will be useful for researchers exploring bat ecology, insectivorous bat ecosystem services, and landscape-level bat monitoring. The dataset includes:

    • Radar data processed to detect bat presence in California (2018) and Texas (2019)
    • Classified radar pixels indicating bat presence or absence
    • Machine learning-derived bat occurrence probabilities (thresholded for binary classification)
    • Geotiff files that aggregate radar data over six-month periods

    Data Collection

    The dataset was generated using NEXRAD radar data, sourced from AWS. The BATS Python toolkit facilitated the collection and processing of radar data files, automating the pipeline from raw radar retrieval to bat detection. Radar data was selected based on specific regions, timeframes, and weather conditions associated with confirmed Brazilian free-tailed bat emergence events. The radar data collected spans 11 weather-free days in California (2018) and 7 days in Texas (2019). Reference data on bat emergence was gathered from field observations provided by local bat monitoring organizations.

    Data Processing

    Once downloaded, the raw radar data (Level II “.gz” files) was processed using the Py-ART library, which is designed for radar data manipulation. Py-ART converted the radar data from its native polar coordinates into a uniform Cartesian grid, with a resampled pixel resolution of 70 meters to facilitate accurate bat detection. The processed radar data was then classified using a machine learning pipeline. The BATS toolkit includes scripts for classification, in which radar echoes were evaluated by pre-trained machine learning models. The dataset was classified using three machine learning models: random forest (RF), support vector machines (SVM), and artificial neural networks (ANN). The ANN model, selected for its superior performance (AUC of 0.963), was used to classify each radar pixel as either containing or not containing Brazilian free-tailed bats. The model outputs a binary classification based on a 90% probability threshold to ensure accurate detection while minimizing false positives.

    Evaluation and Quality Control

    To ensure the accuracy of the model and its classifications, the dataset was evaluated using standard binary classification metrics: precision, recall, AUC (Area Under the ROC Curve), and precision-recall curves. Hyperparameter tuning and spatial cross-validation were performed to account for spatial autocorrelation in the radar data and to improve the generalization of the machine learning models. Training data for the model was primarily sourced from California, while independent testing was conducted using radar data from Texas. The dataset also includes labeled data representing noise sources (such as birds, vehicles, and weather phenomena) to reduce false positives during classification. By processing large volumes of radar data and applying machine learning algorithms, the BATS toolkit condensed terabytes of raw radar data into concise geotiff maps of bat presence, enabling efficient analysis of bat populations across landscapes.
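
    The thresholding and evaluation steps described above can be illustrated with a small standard-library sketch. It is not code from the BATS toolkit: the function names, the sample probabilities and labels, and the choice to treat a probability exactly at the threshold as a detection are assumptions.

```python
def classify(probabilities, threshold=0.90):
    """Turn per-pixel bat probabilities into binary presence flags."""
    # Assumption: a probability exactly at the threshold counts as presence.
    return [p >= threshold for p in probabilities]


def precision_recall(predicted, actual):
    """Precision and recall for two equal-length lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical model outputs and reference labels for six radar pixels.
probs = [0.97, 0.91, 0.85, 0.40, 0.95, 0.10]
labels = [True, True, True, False, False, False]

flags = classify(probs)
precision, recall = precision_recall(flags, labels)
```

    A high threshold such as 0.90 trades recall for precision: in this toy example the pixel scored 0.85 is a true bat pixel that goes undetected.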

  19. Aggregation of 1 micron latex spheres suspended in sterile EPS isolated from...

    • data.gulfresearchinitiative.org
    • data.griidc.org
    Updated Jun 25, 2018
    Cite
    GRIIDC (2018). Aggregation of 1 micron latex spheres suspended in sterile EPS isolated from bacterial isolates and consortia on a micro-scale crude oil droplet [Dataset]. http://doi.org/10.7266/N7BV7F6V
    Explore at:
    Dataset updated
    Jun 25, 2018
    Dataset provided by
    GRIIDC
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The effect of extracellular polymeric substances (EPS) extracted from Sagittula stellata and natural Gulf of Mexico bacterial consortia on aggregation of 1 micron latex particles on a crude oil drop surface is observed. Crude oil droplets between 100-200 microns are pinned in a microchannel while time lapse microscopy observes particle aggregation with time. Aggregation rates correlate positively with increasing protein-carbohydrate ratios in the varying EPS compositions studied.

  20. CARESAT (aggregated per 1-degree cell)

    • erddap.eurobis.org
    • emodnet.ec.europa.eu
    • +2more
    Updated Jan 17, 2018
    Cite
    Luschi (2018). CARESAT (aggregated per 1-degree cell) [Dataset]. https://erddap.eurobis.org/erddap/info/zd_1686_1deg/index.html
    Explore at:
    Dataset updated
    Jan 17, 2018
    Dataset authored and provided by
    Luschi
    Time period covered
    Jan 1, 2014 - Aug 1, 2020
    Area covered
    Variables measured
    sex, time, Notes, aphia_id, latitude, TimeOfDay, lifestage, longitude, DayCollected, BasisOfRecord, and 5 more
    Description

    CARESAT is a project funded by the Tuscany Region (Italy) aiming to use satellite telemetry to increase the limited information currently available on the movements of loggerhead turtles frequenting Tuscany waters and the Pelagos Marine Sanctuary. To this aim, turtles found in Tuscan waters and rehabilitated in Tuscan rescue centers will be equipped with satellite transmitters, to reconstruct the movements made by tracked individuals, to identify the areas of the Sanctuary that are mainly frequented, and to reveal hitherto unknown aspects of their ecology and behavior. AccConID=24 AccConstrDescription=This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms AccConstrDisplay=This dataset is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. AccConstrEN=Attribution-NonCommercial (CC BY-NC) AccessConstraint=Attribution-NonCommercial (CC BY-NC) AccessConstraints=None Acronym=None added_date=2023-04-27 11:03:36.983000 BrackishFlag=None CDate=2023-02-08 cdm_data_type=Other CheckedFlag=0 Citation=Luschi P. 2021. CARESAT. Data originated from Satellite Tracking and Analysis Tool (STAT; http://www.seaturtle.org/tracking/index.shtml?project_id=1050). Comments=Only data aggregated per 1-degree cell are available through OBIS. The non-aggregated data are available through the OBIS-SEAMAP Portal. 
ContactEmail=pluschi@biologia.unipi.it Conventions=COARDS, CF-1.6, ACDD-1.3 CurrencyDate=None DasID=8204 DasOrigin=Sensor platform DasType=Data DasTypeID=1 DateLastModified={'date': '2025-02-13 01:37:29.538097', 'timezone_type': 1, 'timezone': '+01:00'} DescrCompFlag=0 DescrTransFlag=0 Easternmost_Easting=14.5 EmbargoDate=None EngAbstract=CARESAT is a project funded by the Tuscany Region (Italy) aiming to use satellite telemetry to increase the limited information currently available on the movements of loggerhead turtles frequenting Tuscany waters and the Pelagos Marine Sanctuary. To this aim, turtles found in Tuscan waters and rehabilitated in Tuscan rescue centers will be equipped with satellite transmitters, to reconstruct the movements made by tracked individuals, to identify the areas of the Sanctuary that are mainly frequented, and to reveal hitherto unknown aspects of their ecology and behavior. EngDescr=Original provider: Islameta Group, University of Pisa

    Dataset credits:
    Data provider: Islameta Group, Dept. of Biology - University of Pisa
    Originating data center: Satellite Tracking and Analysis Tool (STAT)
    Project partner: Parco Regionale della Maremma (Maremma Regional Park)
    Project sponsor or sponsor description: Osservatorio Toscano Cetacei e Tartarughe (Tuscan Observatory Cetaceans and Turtles)

    Abstract: CARESAT is a project funded by the Tuscany Region (Italy) aiming to use satellite telemetry to increase the limited information currently available on the movements of loggerhead turtles frequenting Tuscany waters and the Pelagos Marine Sanctuary. To this aim, turtles found in Tuscan waters and rehabilitated in Tuscan rescue centers will be equipped with satellite transmitters, to reconstruct the movements made by tracked individuals, to identify the areas of the Sanctuary that are mainly frequented, and to reveal hitherto unknown aspects of their ecology and behavior.

    Supplemental information: Visit STAT's project page for additional information.

    This dataset is a summarized representation of the telemetry locations aggregated per species per 1-degree cell. FreshFlag=None GBIF_UUID=03fdaaf6-227f-4731-9ee4-72ad5b28d80d geospatial_lat_max=47.5 geospatial_lat_min=37.5 geospatial_lat_units=degrees_north geospatial_lon_max=14.5 geospatial_lon_min=-16.5 geospatial_lon_units=degrees_east infoUrl=None InputNotes=None institution=None License=https://creativecommons.org/licenses/by-nc/4.0 Lineage=None MarineFlag=1 modified_sync=2023-03-31 00:00:00 Northernmost_Northing=47.5 OrigAbstract=None OrigDescr=None OrigDescrLang=None OrigDescrLangNL=None OrigLangCode=None OrigLangCodeExtended=None OrigLangID=None OrigTitle=None OrigTitleLang=None OrigTitleLangCode=None OrigTitleLangID=None OrigTitleLangNL=None Progress=None PublicFlag=1 ReleaseDate=Jul 11 2021 10:00PM ReleaseDate0=2021-07-11 RevisionDate=None SizeReference=None sourceUrl=(local files) Southernmost_Northing=37.5 standard_name_vocabulary=CF Standard Name Table v70 StandardTitle=CARESAT (aggregated per 1-degree cell) StatusID=1 subsetVariables=ScientificName,BasisOfRecord,YearCollected,MonthCollected,DayCollected,sex,lifestage,aphia_id TerrestrialFlag=None time_coverage_end=2020-08-01T01:00:00Z time_coverage_start=2014-01-01T01:00:00Z UDate=2023-11-20 VersionDate=Jul 11 2021 10:00PM VersionDay=12 VersionMonth=7 VersionName=None VersionYear=2021 VlizCoreFlag=1 Westernmost_Easting=-16.5
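    The description above says the telemetry locations are "aggregated per species per 1-degree cell". A minimal Python sketch of that kind of binning follows. It is an illustration only, not the OBIS-SEAMAP procedure: the helper names `cell_center` and `aggregate` are hypothetical, the sample positions are made up, and labeling each cell by its center (x.5) is an assumption suggested by the geospatial bounds listed in the metadata (for example 37.5 to 47.5 degrees north).

```python
import math
from collections import Counter


def cell_center(lat, lon, size=1.0):
    """Return the center of the size-degree cell that contains (lat, lon)."""
    # floor() keeps negative longitudes in the correct cell (-0.3 -> cell [-1, 0)).
    return (
        math.floor(lat / size) * size + size / 2,
        math.floor(lon / size) * size + size / 2,
    )


def aggregate(records, size=1.0):
    """Count records per (species, cell-center latitude, cell-center longitude)."""
    counts = Counter()
    for species, lat, lon in records:
        counts[(species, *cell_center(lat, lon, size))] += 1
    return counts


# Made-up positions for illustration; these are NOT values from the dataset.
records = [
    ("Caretta caretta", 43.2, 10.1),
    ("Caretta caretta", 43.9, 10.8),
    ("Caretta caretta", 42.4, 9.7),
    ("Caretta caretta", 38.1, -0.3),
]

counts = aggregate(records)
for key, n in sorted(counts.items()):
    print(key, n)
```

    The first two sample positions fall in the same cell and are counted together; the other two each occupy their own cell.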

