12 datasets found
  1. Data from: Continuous-time spatially explicit capture-recapture models, with...

    • data.niaid.nih.gov
    • dataone.org
    • +2 more
    zip
    Updated Apr 21, 2014
    Cite
    Rebecca Foster; Bart Harmsen; Lorenzo Milazzo; Greg Distiller; David Borchers (2014). Continuous-time spatially explicit capture-recapture models, with an application to a jaguar camera-trap survey [Dataset]. http://doi.org/10.5061/dryad.mg5kv
    Explore at:
    zip (available download formats)
    Dataset updated
    Apr 21, 2014
    Dataset provided by
    University of St Andrews
    University of Cape Town
    University of Cambridge
    University of Belize
    Authors
    Rebecca Foster; Bart Harmsen; Lorenzo Milazzo; Greg Distiller; David Borchers
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Belize, Cockscomb Basin Wildlife Sanctuary
    Description

    Many capture-recapture surveys of wildlife populations operate in continuous time but detections are typically aggregated into occasions for analysis, even when exact detection times are available. This discards information and introduces subjectivity, in the form of decisions about occasion definition. We develop a spatio-temporal Poisson process model for spatially explicit capture-recapture (SECR) surveys that operate continuously and record exact detection times. We show that, except in some special cases (including the case in which detection probability does not change within occasion), temporally aggregated data do not provide sufficient statistics for density and related parameters, and that when detection probability is constant over time our continuous-time (CT) model is equivalent to an existing model based on detection frequencies. We use the model to estimate jaguar density from a camera-trap survey and conduct a simulation study to investigate the properties of a CT estimator and discrete-occasion estimators with various levels of temporal aggregation. This includes investigation of the effect on the estimators of spatio-temporal correlation induced by animal movement. The CT estimator is found to be unbiased and more precise than discrete-occasion estimators based on binary capture data (rather than detection frequencies) when there is no spatio-temporal correlation. It is also found to be only slightly biased when there is correlation induced by animal movement, and to be more robust to inadequate detector spacing, while discrete-occasion estimators with binary data can be sensitive to occasion length, particularly in the presence of inadequate detector spacing. Our model includes as a special case a discrete-occasion estimator based on detection frequencies, and at the same time lays a foundation for the development of more sophisticated CT models and estimators. 
It allows modelling within-occasion changes in detectability, readily accommodates variation in detector effort, removes subjectivity associated with user-defined occasions, and fully utilises CT data. We identify a need for developing CT methods that incorporate spatio-temporal dependence in detections and see potential for CT models being combined with telemetry-based animal movement models to provide a richer inference framework.
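    The temporal aggregation that the description argues against can be illustrated with a minimal sketch. It bins exact detection times into equal-length occasions and derives both detection frequencies and binary capture data. The function name, the equal-width occasions, and the sample times are assumptions made for illustration; this is not the authors' code.

```python
def aggregate_detections(times, survey_length, n_occasions):
    """Bin exact detection times into equal-length occasions.

    Returns (counts, binary): detection frequencies per occasion and
    the corresponding 0/1 capture history.
    """
    width = survey_length / n_occasions
    counts = [0] * n_occasions
    for t in times:
        if not 0.0 <= t <= survey_length:
            raise ValueError(f"detection time {t} lies outside the survey")
        # A detection exactly at the end of the survey goes in the last occasion.
        k = min(int(t // width), n_occasions - 1)
        counts[k] += 1
    binary = [1 if c > 0 else 0 for c in counts]
    return counts, binary


# Hypothetical detection times (in days) for one animal at one detector.
times = [0.4, 0.9, 3.2, 3.3, 3.8, 9.5]
counts, binary = aggregate_detections(times, survey_length=10.0, n_occasions=5)
print(counts)  # detection frequencies per occasion
print(binary)  # binary capture history
```

    Binary histories drop the number of detections within an occasion, and both forms drop the exact times. Changing `n_occasions` changes the aggregated data, which is the subjectivity the description refers to.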

  2. Meta data and supporting documentation

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Meta data and supporting documentation [Dataset]. https://catalog.data.gov/dataset/meta-data-and-supporting-documentation
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    We include a description of the data sets in the metadata as well as sample code and results from a simulated data set.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    It can be accessed through the following means: The R code is available online here: https://github.com/warrenjl/SpGPCW.

    Format:

    Abstract: The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability: Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    File format: R workspace file.

    Metadata (including data dictionary):

    • y: Vector of binary responses (1: preterm birth, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
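    The weekly standardization described above (subtract the week's median, divide by the week's IQR) can be sketched as follows. This is an illustrative Python sketch, not the authors' R code. The function name, the row-per-individual layout, and the quartile convention (`method="inclusive"`) are assumptions, since the description does not say how the IQR was computed.

```python
from statistics import median, quantiles


def standardize_by_week(exposures):
    """Standardize each week (column) by its median and IQR.

    `exposures` is a list of rows, one per individual, each holding one
    exposure value per week.
    """
    n_weeks = len(exposures[0])
    result = [[0.0] * n_weeks for _ in exposures]
    for j in range(n_weeks):
        column = [row[j] for row in exposures]
        # Assumed quartile convention; other conventions give other IQRs.
        q1, _, q3 = quantiles(column, n=4, method="inclusive")
        iqr = q3 - q1
        if iqr == 0:
            raise ValueError(f"week {j} has zero IQR; cannot standardize")
        med = median(column)
        for i, value in enumerate(column):
            result[i][j] = (value - med) / iqr
    return result


# Hypothetical raw exposures: five individuals, two weeks.
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
z = standardize_by_week(raw)
print(z)
```

    Because the published medians and IQRs are withheld, the transformation cannot be inverted from the released `z` matrix alone.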

  3. Statistical Data Catalog Cologne

    • ckan.mobidatalab.eu
    Updated Jul 26, 2023
    Cite
    Köln (2023). Statistical Data Catalog Cologne [Dataset]. https://ckan.mobidatalab.eu/dataset/statisticaldatacatalogue-coln
    Explore at:
    http://publications.europa.eu/resource/authority/file-type/csv(307022), http://publications.europa.eu/resource/authority/file-type/csv(272780), http://publications.europa.eu/resource/authority/file-type/json, http://publications.europa.eu/resource/authority/file-type/csv(3746), http://publications.europa.eu/resource/authority/file-type/csv(3752), http://publications.europa.eu/resource/authority/file-type/csv(274184), http://publications.europa.eu/resource/authority/file-type/csv(3735), http://publications.europa.eu/resource/authority/file-type/csv(275264), http://publications.europa.eu/resource/authority/file-type/csv(5356), http://publications.europa.eu/resource/authority/file-type/csv(273265), http://publications.europa.eu/resource/authority/file-type/csv(3730), http://publications.europa.eu/resource/authority/file-type/csv(19787), http://publications.europa.eu/resource/authority/file-type/csv(273515), http://publications.europa.eu/resource/authority/file-type/csv(272571), http://publications.europa.eu/resource/authority/file-type/csv(3748), http://publications.europa.eu/resource/authority/file-type/csv(3753), http://publications.europa.eu/resource/authority/file-type/csv(271286), http://publications.europa.eu/resource/authority/file-type/csv(3754), http://publications.europa.eu/resource/authority/file-type/csv(273516), http://publications.europa.eu/resource/authority/file-type/csv(273403), http://publications.europa.eu/resource/authority/file-type/csv(3764), http://publications.europa.eu/resource/authority/file-type/csv(1215), http://publications.europa.eu/resource/authority/file-type/csv(3758) (available download formats)
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    Köln
    License

    Data licence Germany – Attribution – Version 2.0 (https://www.govdata.de/dl-de/by-2-0)
    License information was derived automatically

    Description

    Data from various sources are updated in the Statistical Information System of the City of Cologne. The annual statistical yearbook publishes these in tabular, graphic and cartographic form at the level of the city districts and districts. Furthermore, definitions and calculation bases are explained. Small-scale statistics at the level of the 86 districts can be obtained from the Cologne district information. All levels of the local area structure are explained in this publication.

    This statistical data catalogue supplements the range of small-scale data. Selected structural data can be called up here in compact tabular form at the level of the 570 statistical quarters or the 86 districts. The two overviews provide information about which data is available and from which source it originates. The data itself is provided annually.

    Notes:

    • Data sources are indicated in the summary tables. When using the data, the data license Germany - attribution - version 2.0 must be observed.
    • Some values cannot be given, in order to protect statistical confidentiality. For the data sets of the Federal Employment Agency, these are values from 1 to < 10; for all other data sets, values from 1 to < 5. Such values are marked in the data by a *.
    • The differentiation of population figures by gender is currently made according to female and male residents. The case numbers of those who define themselves as non-binary/diverse are so low at a small-scale level that they cannot be reported for reasons of statistical confidentiality.
    • Residents with a migration background are determined by combining various characteristics from the resident registration procedure. The data are to be interpreted as estimates. The statistical yearbook of the city of Cologne provides further details.
    • The information on households comes from the household generation process. This is a statistical procedure in which residents within an address are assigned to a household as far as possible by querying certain criteria. If the procedure does not identify any connections, residents are allocated to single-person households. The statistical yearbook of the city of Cologne provides further details.
    • The data set on pupils at general schools (spatial location by place of residence) is available from 2013 onwards.
    • The number of the statistical quarter or district serves as a spatial reference and can be linked to the geodata (see related resource below).
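    When reading the CSV files, the `*` suppression marker needs explicit handling so that confidential cells are not mistaken for zeros or for text. The sketch below is a minimal illustration using the Python standard library. The column names, the `;` delimiter, and the sample rows are hypothetical and not taken from the catalogue files.

```python
import csv
import io

SUPPRESSED = "*"  # marker for values withheld for statistical confidentiality


def read_with_suppression(text, delimiter=";"):
    """Parse CSV text, mapping suppressed cells to None and digits to int."""
    rows = []
    for record in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        parsed = {}
        for key, value in record.items():
            value = value.strip()
            if value == SUPPRESSED:
                parsed[key] = None  # unknown small count, not zero
            else:
                try:
                    parsed[key] = int(value)
                except ValueError:
                    parsed[key] = value  # keep non-numeric text as-is
        rows.append(parsed)
    return rows


# Hypothetical sample with made-up column names and values.
sample = "district;unemployed;households\n101;*;1520\n102;37;*\n"
rows = read_with_suppression(sample)
print(rows)
```

    Keeping suppressed cells as `None` rather than 0 matters for sums and rates: a suppressed value stands for a small non-zero count, so totals computed from such rows are lower bounds.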

  4. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

    Metadata (including data dictionary):

    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code

    Abstract: We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    Description:

    • “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
    • “Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Optional Information (complete as necessary)

    Required R packages:

    • For running “CWVS_LMC.txt”:
      • msm: Sampling from the truncated normal distribution
      • mnormt: Sampling from the multivariate normal distribution
      • BayesLogit: Sampling from the Polya-Gamma distribution
    • For running “Results_Summary.txt”:
      • plotrix: Plotting the posterior means and credible intervals

    Instructions for Use

    Reproducibility (Mandatory)

    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:

    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

    Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

    Data: The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability: Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).

  5. Statistical Data Catalogue Cologne

    • gimi9.com
    Cite
    Statistical Data Catalogue Cologne [Dataset]. https://gimi9.com/dataset/eu_8b29c92d-48d9-40f0-881e-dd820669794d
    Description

    Data from various sources are updated in the Statistical Information System of the City of Cologne. The annual Statistical Yearbook publishes these in tabular, graphic and cartographic form at the level of the city districts and districts. Definitions and calculation bases are also explained. Small-scale statistics at the level of the 86 districts can be obtained via the Cologne district information. All levels of local territorial division are explained in this publication.

    This statistical data catalogue complements the range of small-scale data. Selected structural data can be retrieved here at the level of the 570 statistical quarters or the 86 districts in compact tabular form. The two overviews provide information about which data is available and from which source it comes. The data itself is provided annually.

    Notes:

    • Data sources are specified in the overview tables. When using the data, please note the data license Germany - Attribution - Version 2.0.
    • In some cases, values cannot be specified, in order to maintain statistical confidentiality. For the data sets of the Federal Employment Agency, these are values from 1 to < 10; for all other data sets, values from 1 to < 5. In the data, this is marked by an *.
    • The differentiation of population numbers by gender is currently made by female and male inhabitants. The case numbers of those who define themselves as non-binary/diverse are so low at a small scale that they cannot be reported for reasons of statistical confidentiality.
    • Residents with a migrant background are identified by combining different characteristics from the population registration procedure. The data are to be interpreted as estimates. Further details can be found in the Statistical Yearbook of the City of Cologne.
    • The data on households are derived from the household generation process. This is a statistical procedure in which residents are assigned to a household within an address by querying certain criteria. If the procedure does not recognise any connections, residents are assigned to single-person households. Further details can be found in the Statistical Yearbook of the City of Cologne.
    • The data set on pupils at general education schools (spatial location by place of residence) is available from 2013 onwards.
    • The number of the statistical quarter or district provides a spatial location and can be linked to the spatial data (see Related Resource below).

  6. Census Population, Age and Gender

    • open.hamilton.ca
    Updated Aug 13, 2023
    Cite
    City of Hamilton (2023). Census Population, Age and Gender [Dataset]. https://open.hamilton.ca/datasets/d7de5931aef846fe87e1ae72aaca3d62
    Dataset updated
    Aug 13, 2023
    Dataset authored and provided by
    City of Hamilton
    License

    https://www.hamilton.ca/city-initiatives/strategies-actions/open-data-licence-terms-and-conditions

    Area covered
    Description

    In the 2021 Census, Statistics Canada introduced the concept of gender. Given that the non-binary population is small, data aggregation to a two-category gender variable was necessary to protect the confidentiality of responses provided. In these cases, individuals in the category “non-binary persons” are distributed into the other two gender categories and are denoted by the “+” symbol. Data is derived from custom tabulations of Statistics Canada’s Census obtained by the City of Hamilton as a consortium member of the Canadian Community Economic Development Network (CCEDNet) Community Data Program. For more information about Statistics Canada’s Census including methods, questionnaires, data quality and reporting definitions, visit https://www.statcan.gc.ca/en/start.

  7. Summary of definitions of predictors and how they were characterized in...

    • plos.figshare.com
    xls
    Updated Jul 30, 2025
    Cite
    Eliezer Ofori Odei-Lartey; Stephaney Gyaase; Solomon Nyame; Dominic Asamoah; Kwaku Poku Asante; James Ben Hayfron-Acquah (2025). Summary of definitions of predictors and how they were characterized in studies. [Dataset]. http://doi.org/10.1371/journal.pdig.0000965.t005
    Explore at:
    xls (available download formats)
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    PLOS Digital Health
    Authors
    Eliezer Ofori Odei-Lartey; Stephaney Gyaase; Solomon Nyame; Dominic Asamoah; Kwaku Poku Asante; James Ben Hayfron-Acquah
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of definitions of predictors and how they were characterized in studies.

  8. This file includes key information for each of the studies reviewed, which...

    • plos.figshare.com
    xlsx
    Updated Jul 30, 2025
    Cite
    Eliezer Ofori Odei-Lartey; Stephaney Gyaase; Solomon Nyame; Dominic Asamoah; Kwaku Poku Asante; James Ben Hayfron-Acquah (2025). This file includes key information for each of the studies reviewed, which include predictors used, outcome definitions, modelling approaches, encoding strategies, and feature engineering notes. [Dataset]. http://doi.org/10.1371/journal.pdig.0000965.s002
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    PLOS Digital Health
    Authors
    Eliezer Ofori Odei-Lartey; Stephaney Gyaase; Solomon Nyame; Dominic Asamoah; Kwaku Poku Asante; James Ben Hayfron-Acquah
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file includes key information for each of the studies reviewed, which include predictors used, outcome definitions, modelling approaches, encoding strategies, and feature engineering notes.

  9. Details on cases definitions for the UKB binary phenotypes based on ICD10...

    • plos.figshare.com
    xls
    Updated Aug 6, 2025
    Cite
    Merina Shrestha; Zhonghao Bai; Tahereh Gholipourshahraki; Astrid J. Hjelholt; Sile Hu; Mads Kjolby; Palle Duun Rohde; Peter Sørensen (2025). Details on cases definitions for the UKB binary phenotypes based on ICD10 codes and self-reported diseases, total number of cases, controls along with the distribution of age (mean and standard deviation [sd]) and number (No) of females within cases and controls. [Dataset]. http://doi.org/10.1371/journal.pgen.1011783.t002
    Explore at:
    xls (available download formats)
    Dataset updated
    Aug 6, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Merina Shrestha; Zhonghao Bai; Tahereh Gholipourshahraki; Astrid J. Hjelholt; Sile Hu; Mads Kjolby; Palle Duun Rohde; Peter Sørensen
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Details on cases definitions for the UKB binary phenotypes based on ICD10 codes and self-reported diseases, total number of cases, controls along with the distribution of age (mean and standard deviation [sd]) and number (No) of females within cases and controls.

  10. Data from: Dataset of lightning flashovers on medium voltage distribution...

    • data.niaid.nih.gov
    Updated Jan 31, 2023
    Cite
    Sarajcev, Petar (2023). Dataset of lightning flashovers on medium voltage distribution lines [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7382547
    Dataset updated
    Jan 31, 2023
    Dataset provided by
    University of Split, FESB
    Authors
    Sarajcev, Petar
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This synthetic dataset was generated from Monte Carlo simulations of lightning flashovers on medium voltage (MV) distribution lines. It is suitable for training machine learning models for classifying lightning flashovers on distribution lines. The dataset is hierarchical in nature (see below for more information) and class imbalanced.

    The following five types of lightning interaction with the MV distribution line have been simulated: (1) direct strike to phase conductor (when there is no shield wire present on the line), (2) direct strike to phase conductor with shield wire(s) present on the line (i.e. shielding failure), (3) direct strike to shield wire with backflashover event, (4) indirect near-by lightning strike to ground where shield wire is not present, and (5) indirect near-by lightning strike to ground where shield wire is present on the line. The last two types of lightning interaction induce overvoltage on the phase conductors by radiating EM fields from the strike channel that are coupled to the line conductors. Three different methods of indirect strike analysis have been implemented: Rusck's model, the Chowdhuri-Gross model and the Liew-Mar model. Shield wire(s) provide shielding effects against direct lightning strikes, as well as screening effects against indirect ones.

    The dataset covers two independent distribution lines, with heights of 12 m and 15 m, each with a flat configuration of phase conductors. Twin shield wires, if present, are 1.5 m above the phase conductors and 3 m apart [2]. The CFO level of the 12 m distribution line is 150 kV and that of the 15 m distribution line is 160 kV. The dataset consists of 10,000 simulations for each of the distribution lines.

    The dataset contains the following variables (features):

    'dist': perpendicular distance of the lightning strike location from the distribution line axis (m), generated from the Uniform distribution [0, 500] m,

    'ampl': lightning current amplitude of the strike (kA), generated from the Log-Normal distribution (see IEC 60071 for additional information),

    'front': lightning current wave-front time (us), generated from the Log-Normal distribution; it needs to be emphasized that amplitudes (ampl) and wave-front times (front), as random variables, have been generated from the appropriate bivariate probability distribution which includes statistical correlation between these variates,

    'veloc': velocity of the lightning return-stroke current defined indirectly through the parameter "w" that is generated from the Uniform distribution [50, 500] m/us, which is then used for computing the velocity from the following relation: v = c/sqrt(1+w/I), where "c" is the speed of light in free space (300 m/us) and "I" is the lightning-current amplitude,

    'shield': binary indicator that signals presence or absence of the shield wire(s) on the line (0/1), generated from the Bernoulli distribution with a 50% probability,

    'Ri': average value of the impulse impedance of the tower's grounding (Ohm), generated from the Normal distribution (clipped at zero on the left side) with median value of 50 Ohm and standard deviation of 12.5 Ohm; it should be mentioned that the impulse impedance is often much larger than the associated grounding resistance value, which is why a rather high value of 50 Ohm has been used here,

    'EGM': electrogeometric model used for analyzing striking distances of the distribution line's tower; following options are available: 'Wagner', 'Young', 'AW', 'BW', 'Love', and 'Anderson', where 'AW' stands for Armstrong & Whitehead, while 'BW' means Brown & Whitehead model; statistical distribution of EGM models follows a user-defined discrete categorical distribution with respective probabilities: p = [0.1, 0.2, 0.1, 0.1, 0.3, 0.2],

    'ind': indirect stroke model used for analyzing near-by indirect lightning strikes; following options were implemented: 'rusk' for the Rusck's model, 'chow' for the Chowdhuri-Gross model (with Jakubowski modification) and 'liew' for the Liew-Mar model; statistical distribution of these three models follows a user-defined discrete categorical distribution with respective probabilities: p = [0.6, 0.2, 0.2],

    'CFO': critical flashover voltage level of the distribution line's insulation (kV),

    'height': height of the phase conductors of the distribution line (m),

    'flash': binary indicator that signals if the flashover has been recorded (1) or not (0). This variable is the outcome/label (i.e. binary class).
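    A few of the sampling rules listed above can be sketched in Python, including the return-stroke velocity relation v = c/sqrt(1 + w/I) given for 'veloc'. This is an illustrative sketch under stated assumptions, not the author's simulation code. The function name, the seed, and the fixed amplitude of 30 kA are hypothetical, and the Log-Normal parameters for 'ampl' and 'front' are not given in the description, so they are not sampled here.

```python
import math
import random

C_LIGHT = 300.0  # speed of light in free space (m/us), as stated in the description


def return_stroke_velocity(w, amplitude_ka):
    """Velocity from v = c / sqrt(1 + w / I), with I the current amplitude (kA)."""
    return C_LIGHT / math.sqrt(1.0 + w / amplitude_ka)


rng = random.Random(42)  # arbitrary seed, chosen only for repeatability

dist = rng.uniform(0.0, 500.0)         # 'dist': Uniform [0, 500] m
w = rng.uniform(50.0, 500.0)           # parameter "w": Uniform [50, 500]
shield = 1 if rng.random() < 0.5 else 0  # 'shield': Bernoulli with 50% probability

# 'EGM' and 'ind': categorical draws with the probabilities listed above.
egm_models = ["Wagner", "Young", "AW", "BW", "Love", "Anderson"]
egm = rng.choices(egm_models, weights=[0.1, 0.2, 0.1, 0.1, 0.3, 0.2])[0]
ind = rng.choices(["rusk", "chow", "liew"], weights=[0.6, 0.2, 0.2])[0]

# Hypothetical amplitude; the dataset draws it from a Log-Normal distribution.
veloc = return_stroke_velocity(w, amplitude_ka=30.0)
print(dist, shield, egm, ind, veloc)
```

    Since w and I are both positive, the computed velocity always stays below c, and larger current amplitudes give faster return strokes.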

    Mathematical background used for the analysis of lightning interaction with the MV distribution line can be found in the references cited below.

    References:

    A. R. Hileman, "Insulation Coordination for Power Systems", CRC Press, Boca Raton, FL, 1999.

    J. A. Martinez and F. Gonzalez-Molina, "Statistical evaluation of lightning overvoltages on overhead distribution lines using neural networks," in IEEE Transactions on Power Delivery, vol. 20, no. 3, pp. 2219-2226, July 2005.

    A. Borghetti, C. A. Nucci and M. Paolone, "An Improved Procedure for the Assessment of Overhead Line Indirect Lightning Performance and Its Comparison with the IEEE Std. 1410 Method," in IEEE Transactions on Power Delivery, vol. 22, no. 1, pp. 684-692, 2007.

  11. Data from: Lightning flashover simulations on medium voltage distribution...

    • data.niaid.nih.gov
    Updated Jul 17, 2023
    Cite
    Sarajcev, P (2023). Lightning flashover simulations on medium voltage distribution lines [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6381636
    Explore at:
    Dataset updated
    Jul 17, 2023
    Dataset provided by
    University of Split, FESB
    Authors
    Sarajcev, P
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    [Version 1.2] This version of the dataset fixes a bug found in the previous versions (see below for more information).

    The dataset has been generated from Monte Carlo simulations of lightning flashovers on medium voltage (MV) distribution lines. It is suitable for training machine learning models for classifying lightning flashovers on distribution lines, as well as for line insulation coordination studies. The dataset is hierarchical in nature (see below for more information) and class imbalanced.

    The following five types of lightning interaction with the MV distribution line have been simulated: (1) direct strike to phase conductor (when there is no shield wire present on the line), (2) direct strike to phase conductor with shield wire(s) present on the line (i.e. shielding failure), (3) direct strike to shield wire with backflashover event, (4) indirect near-by lightning strike to ground where shield wire is not present, and (5) indirect near-by lightning strike to ground where shield wire is present on the line. The last two types of lightning interaction induce overvoltages on the phase conductors through EM fields that are radiated from the strike channel and coupled to the line conductors. Shield wire(s) provide shielding against direct lightning strikes, as well as screening against indirect ones.

    Dataset consists of the following variables:

    'dist': perpendicular distance of the lightning strike location from the distribution line axis (m), generated from the Uniform distribution [0, 500] m,

    'ampl': lightning current amplitude of the strike (kA), generated from the Log-Normal distribution (see IEC 60071 for additional information),

    'veloc': velocity of the lightning return stroke current (m/us), generated from the Uniform distribution [50, 500] m/us,

    'shield': binary indicator that signals presence or absence of the shield wire(s) on the line (0/1), generated from the Bernoulli distribution with a 50% probability,

    'Ri': average value of the impulse impedance of the tower's grounding (Ohm), generated from the Normal distribution (clipped at zero on the left side) with a median value of 50 Ohm and a standard deviation of 12.5 Ohm; note that the impulse impedance is often much larger than the associated grounding resistance value, which is why a rather high value of 50 Ohm has been used here,

    'EGM': electrogeometric model used for analyzing striking distances of the distribution line's tower; the following options are available: 'Wagner', 'Young', 'AW', 'BW', 'Love', and 'Anderson', where 'AW' stands for the Armstrong & Whitehead model and 'BW' for the Brown & Whitehead model; the choice of EGM model follows a user-defined discrete categorical distribution with respective probabilities: p = [0.1, 0.2, 0.1, 0.1, 0.3, 0.2],

    'CFO': critical flashover voltage level of the distribution line's insulation (kV); three levels have been used: 150, 150, and 160 kV for the three distribution lines of height 10, 12, and 14 m, respectively,

    'height': height of the phase conductors of the distribution line (m); the distribution line has a flat configuration of phase conductors at one of the following heights: 10, 12, or 14 m; twin shield wires, if present, are 1.5 m above the phase conductors and 3 m apart; the dataset contains 10000 simulations for each line height,

    'flash': binary indicator that signals if the flashover has been recorded (1) or not (0). This variable is the outcome (binary class).
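
    The sampling scheme described in the variable list above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the author's simulation code. The function name sample_inputs is hypothetical, and the two Log-Normal parameters are assumed placeholders, because the description defers to IEC 60071 for them.

```python
import math
import random

EGM_MODELS = ["Wagner", "Young", "AW", "BW", "Love", "Anderson"]
EGM_PROBS = [0.1, 0.2, 0.1, 0.1, 0.3, 0.2]

# ASSUMED placeholders: the description points to IEC 60071 for the
# Log-Normal parameters, so these two values are illustrative only.
AMPL_MEDIAN_KA = 31.0
AMPL_SIGMA_LN = 0.55


def sample_inputs(rng):
    """Draw one set of random inputs following the variable list."""
    return {
        # Uniform distribution on [0, 500] m
        "dist": rng.uniform(0.0, 500.0),
        # Log-Normal: mu is the log of the (assumed) median
        "ampl": rng.lognormvariate(math.log(AMPL_MEDIAN_KA), AMPL_SIGMA_LN),
        # Uniform distribution on [50, 500] m/us
        "veloc": rng.uniform(50.0, 500.0),
        # Bernoulli with 50% probability
        "shield": 1 if rng.random() < 0.5 else 0,
        # Normal(50, 12.5) Ohm, clipped at zero on the left side
        "Ri": max(0.0, rng.gauss(50.0, 12.5)),
        # Discrete categorical distribution over the six EGM models
        "EGM": rng.choices(EGM_MODELS, weights=EGM_PROBS, k=1)[0],
    }


rng = random.Random(42)
rows = [sample_inputs(rng) for _ in range(1000)]
```

    Clipping with max(0.0, ...) moves negative draws to exactly zero, which matches the wording "clipped at zero"; redrawing negative values would instead give a truncated Normal distribution.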

    Note: The critical flashover voltage (CFO) level of the line is taken at 150 kV for the first two lines (10 m and 12 m) and 160 kV for the third line (14 m). The diameters of the phase conductors and shield wires for all treated lines are 10 mm and 5 mm, respectively. The average grounding resistance of the shield wire is assumed to be 10 Ohm for all treated cases (it has no discernible influence on the flashover rate). The dataset is class imbalanced and consists of 30000 simulations in total, with 10000 simulations for each of the three different MV distribution line heights (geometry) and CFO levels.
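
    Because the data are grouped by line height and the classes are imbalanced, it may help to inspect the flashover rate per height and to derive class weights before training a classifier. The sketch below is one possible way to do this. The function names are hypothetical and the toy records are invented for illustration; they are not taken from the dataset.

```python
from collections import Counter


def flashover_rates(records):
    """Return {height: fraction of rows with flash == 1}."""
    totals = Counter()
    flashes = Counter()
    for rec in records:
        totals[rec["height"]] += 1
        flashes[rec["height"]] += rec["flash"]
    return {h: flashes[h] / totals[h] for h in totals}


def balanced_class_weights(records):
    """Weights n / (n_classes * count_c), a common 'balanced' heuristic."""
    counts = Counter(rec["flash"] for rec in records)
    n = len(records)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


# Toy records for illustration only (NOT values from the dataset).
toy = [
    {"height": 10, "flash": 0}, {"height": 10, "flash": 0},
    {"height": 10, "flash": 0}, {"height": 10, "flash": 1},
    {"height": 12, "flash": 0}, {"height": 12, "flash": 1},
    {"height": 14, "flash": 0}, {"height": 14, "flash": 0},
]
rates = flashover_rates(toy)
weights = balanced_class_weights(toy)
```

    With weights of this kind the minority class (flash = 1) receives the larger weight, so misclassified flashovers cost more during training.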

    Important: Version 1.2 of the dataset fixes an important bug found in the previous data sets, where the column 'Ri' contained duplicate data from the column 'veloc'. This issue is now resolved.

    Mathematical background used for the analysis of lightning interaction with the MV distribution line can be found in the references below.

    References:

    J. A. Martinez and F. Gonzalez-Molina, "Statistical evaluation of lightning overvoltages on overhead distribution lines using neural networks," in IEEE Transactions on Power Delivery, vol. 20, no. 3, pp. 2219-2226, July 2005, doi: 10.1109/TPWRD.2005.848734.

    A. R. Hileman, "Insulation Coordination for Power Systems", CRC Press, Boca Raton, FL, 1999.

  12. Descriptive statistics for the research sample.

    • plos.figshare.com
    xls
    Updated Sep 8, 2023
    + more versions
    Cite
    Jagienka Rześny-Cieplińska; Tomasz Tomaszewski; Maja Piecyk-Ouellet; Maja Kiba-Janiak (2023). Descriptive statistics for the research sample. [Dataset]. http://doi.org/10.1371/journal.pone.0289915.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 8, 2023
    Dataset provided by
    PLOS, http://plos.org/
    Authors
    Jagienka Rześny-Cieplińska; Tomasz Tomaszewski; Maja Piecyk-Ouellet; Maja Kiba-Janiak
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Aim: Active transportation, meaning non-motorized modes of transport, is promoted and popularized both in practice and in the scientific literature, while its use for urban freight transport has been largely neglected. Thus the main scope of the paper is to indicate the development potential of micromobility use in urban freight transport and to check its influence on urban sustainability.

    Methods: The authors have hypothesized that active means of transport, with a focus on micromobility, have great development potential in freight transportation in cities. Logit and probit modelling were used to analyze the relationship between users' characteristics, micromobility, and its impact on urban sustainable development. The authors' system includes an analysis of factors connected with the topics of sustainability and micromobility, which fills an essential scientific gap that this paper addresses. Logistic (logit) regression is used mainly for binary, ordinal, and multi-level outcomes to find the probability of success (i.e. occurrence of some event). Probit regression, however, is primarily used in binary response models and assumes the normal distribution of data.

    Results: The main finding of the article has led the authors to the statement that active means of transport, including micromobility, have great development potential in freight transportation in cities.

    Conclusions: Knowledge of the acceptance of micromobility solutions is essential for municipal authorities in shaping the development of urban transport systems. Thus proper strategies and actions need to be prioritized to leverage the sustainability-related co-benefits of active transport.
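
    The logit and probit models named in the description above differ mainly in the link function that maps a linear predictor z to a probability: the logit model uses the logistic function, while the probit model uses the standard normal cumulative distribution function. The sketch below compares the two; the function names are hypothetical and nothing here is taken from the paper's own analysis.

```python
import math


def logit_prob(z):
    """Logistic link: P(y = 1) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_prob(z):
    """Probit link: P(y = 1) = Phi(z), the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Compare the two links at a few values of the linear predictor.
for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  logit={logit_prob(z):.3f}  probit={probit_prob(z):.3f}")
```

    Both links give 0.5 at z = 0, and the probit curve approaches 0 and 1 faster in the tails, which is why fitted coefficients from the two models are on different scales.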

Cite
Rebecca Foster; Bart Harmsen; Lorenzo Milazzo; Greg Distiller; David Borchers (2014). Continuous-time spatially explicit capture-recapture models, with an application to a jaguar camera-trap survey [Dataset]. http://doi.org/10.5061/dryad.mg5kv

Data from: Continuous-time spatially explicit capture-recapture models, with an application to a jaguar camera-trap survey

Explore at:
Available download formats: zip
Dataset updated
Apr 21, 2014
Dataset provided by
University of St Andrews
University of Cape Town
University of Cambridge
University of Belize
Authors
Rebecca Foster; Bart Harmsen; Lorenzo Milazzo; Greg Distiller; David Borchers
License

https://spdx.org/licenses/CC0-1.0.html

Area covered
Belize, Cockscomb Basin Wildlife Sanctuary
Description

Many capture-recapture surveys of wildlife populations operate in continuous time but detections are typically aggregated into occasions for analysis, even when exact detection times are available. This discards information and introduces subjectivity, in the form of decisions about occasion definition. We develop a spatio-temporal Poisson process model for spatially explicit capture-recapture (SECR) surveys that operate continuously and record exact detection times. We show that, except in some special cases (including the case in which detection probability does not change within occasion), temporally aggregated data do not provide sufficient statistics for density and related parameters, and that when detection probability is constant over time our continuous-time (CT) model is equivalent to an existing model based on detection frequencies. We use the model to estimate jaguar density from a camera-trap survey and conduct a simulation study to investigate the properties of a CT estimator and discrete-occasion estimators with various levels of temporal aggregation. This includes investigation of the effect on the estimators of spatio-temporal correlation induced by animal movement. The CT estimator is found to be unbiased and more precise than discrete-occasion estimators based on binary capture data (rather than detection frequencies) when there is no spatio-temporal correlation. It is also found to be only slightly biased when there is correlation induced by animal movement, and to be more robust to inadequate detector spacing, while discrete-occasion estimators with binary data can be sensitive to occasion length, particularly in the presence of inadequate detector spacing. Our model includes as a special case a discrete-occasion estimator based on detection frequencies, and at the same time lays a foundation for the development of more sophisticated CT models and estimators. 
It allows modelling within-occasion changes in detectability, readily accommodates variation in detector effort, removes subjectivity associated with user-defined occasions, and fully utilises CT data. We identify a need for developing CT methods that incorporate spatio-temporal dependence in detections and see potential for CT models being combined with telemetry-based animal movement models to provide a richer inference framework.
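
The statement that aggregated data lose nothing when detectability is constant can be illustrated with a much simpler setting than the SECR model itself. The sketch below assumes a single detector and a homogeneous Poisson process with a constant rate; the function names and parameter values are hypothetical. Under that assumption the maximum likelihood estimate of the rate depends only on the number of detections, not on their exact times.

```python
import random


def simulate_detection_times(rate, duration, rng):
    """Simulate a homogeneous Poisson process on [0, duration)."""
    times = []
    t = rng.expovariate(rate)  # exponential waiting time to first event
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def rate_mle(times, duration):
    """MLE of a constant rate: number of detections / survey duration."""
    return len(times) / duration


rng = random.Random(1)
times = simulate_detection_times(rate=0.2, duration=1000.0, rng=rng)
estimate = rate_mle(times, 1000.0)
```

Once the rate is allowed to vary over time, the exact detection times enter the likelihood and the count alone is no longer enough, which is the situation the continuous-time model is designed for.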
