License: Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This paper explores a unique dataset of all the SET (student evaluation of teaching) ratings provided by students of one university in Poland at the end of the winter semester of the 2020/2021 academic year. The SET questionnaire used by this university is presented in Appendix 1. The dataset is unique for several reasons. It covers all SET surveys completed by students in all fields and levels of study offered by the university. In the period analysed, the university operated entirely online amid the Covid-19 pandemic. While the expected learning outcomes formally remained unchanged, the online mode of study could have affected the grading policy and could have implications for some of the studied SET biases. This Covid-19 effect is captured by the econometric models and discussed in the paper.
The average SET scores were matched with teacher characteristics (degree, seniority, gender, and SET scores in the past six semesters); course characteristics (time of day, day of the week, course type, course breadth, class duration, and class size); attributes of the SET survey responses (the percentage of students providing SET feedback); and course grades (mean, standard deviation, and percentage failed). Data on course grades are also available for the previous six semesters. This rich dataset allows many of the biases reported in the literature to be tested for and new hypotheses to be formulated, as presented in the introduction section.
The unit of observation (a single row in the data set) is identified by three parameters: teacher unique id (j), course unique id (k), and the question number in the SET questionnaire (n ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}). For each pair (j, k) there are therefore nine rows, one for each SET survey question, or sometimes fewer when no student answered a given SET question at all. For example, the dependent variable SET_score_avg(j, k, n) for the triplet (j = John Smith, k = Calculus, n = 2) is calculated as the average of all Likert-scale answers to question no. 2 in the SET survey distributed to all students who took the Calculus course taught by John Smith. The data set has 8,015 such observations (rows); a minimal R sketch of this row structure is given after Appendix 1 below.
The full list of variables (columns) included in the analysis is presented in the attached files (see below). Their description refers to the triplet (teacher id = j, course id = k, question number = n). When the last element of the triplet (n) is dropped, the variable takes the same value for all n ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}.
Two attachments:
- Word file with the variable descriptions
- RData file with the data set (for the R language)
Appendix 1. The SET questionnaire used for this paper.
Evaluation survey of the teaching staff of [university name]. Please complete the following evaluation form, which aims to assess the lecturer's performance. Only one answer should be indicated for each question. The answers are coded in the following way: 5 - I strongly agree; 4 - I agree; 3 - Neutral; 2 - I don't agree; 1 - I strongly don't agree.
1. I learnt a lot during the course.
2. I think that the knowledge acquired during the course is very useful.
3. The professor used activities to make the class more engaging.
4. If it was possible, I would enroll for the course conducted by this lecturer again.
5. The classes started on time.
6. The lecturer always used time efficiently.
7. The lecturer delivered the class content in an understandable and efficient way.
8. The lecturer was available when we had doubts.
9. The lecturer treated all students equally regardless of their race, background and ethnicity.
Each statement is answered on the five-point scale described above.
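As a minimal sketch of this row structure in R: the file name set_data.RData, the data-frame name set_data, and the column names teacher_id, course_id, and question_no below are placeholders and are not taken from the attached variable description; only SET_score_avg is named in the text above.
## Load the attached RData file (file and object names are assumptions).
load("set_data.RData")
## Rows for one teacher-course pair: up to nine, one per SET question.
one_pair <- subset(set_data, teacher_id == "T001" & course_id == "C101")
one_pair[order(one_pair$question_no), c("question_no", "SET_score_avg")]
## Average SET score across questions for every teacher-course pair.
aggregate(SET_score_avg ~ teacher_id + course_id, data = set_data, FUN = mean)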
License: CC0 1.0 Universal, https://spdx.org/licenses/CC0-1.0.html
Standardized data on large-scale and long-term patterns of species richness are critical for understanding the consequences of natural and anthropogenic changes in the environment. The North American Breeding Bird Survey (BBS) is one of the largest and most widely used sources of such data, but so far, little is known about the degree to which BBS data provide accurate estimates of regional richness. Here we address this question by comparing estimates of regional richness based on BBS data with spatially and temporally matched estimates based on state Breeding Bird Atlases (BBA). We expected that estimates based on BBA data would provide a more complete (and therefore, more accurate) representation of regional richness due to their larger number of observation units and higher sampling effort within the observation units. Our results were only partially consistent with these predictions: while estimates of regional richness based on BBA data were higher than those based on BBS data, estimates of local richness (number of species per observation unit) were higher in BBS data. The latter result is attributed to higher land-cover heterogeneity in BBS units and higher effectiveness of bird detection (more species are detected per unit time). Interestingly, estimates of regional richness based on BBA blocks were higher than those based on BBS data even when differences in the number of observation units were controlled for. Our analysis indicates that this difference was due to higher compositional turnover between BBA units, probably due to larger differences in habitat conditions between BBA units and a larger number of geographically restricted species. Our overall results indicate that estimates of regional richness based on BBS data suffer from incomplete detection of a large number of rare species, and that corrections of these estimates based on standard extrapolation techniques are not sufficient to remove this bias. Future applications of BBS data in ecology and conservation, and in particular, applications in which the representation of rare species is important (e.g., those focusing on biodiversity conservation), should be aware of this bias, and should integrate BBA data whenever possible.
Methods Overview
This is a compilation of second-generation breeding bird atlas (BBA) data and corresponding breeding bird survey (BBS) data. It contains presence-absence breeding bird observations in 5 U.S. states (MA, MI, NY, PA, VT), sampling effort per sampling unit, geographic location of sampling units, and environmental variables per sampling unit: elevation and elevation range (from SRTM), mean annual precipitation and mean summer temperature (from PRISM), and NLCD 2006 land-use data.
Each row contains all observations for one sampling unit. Additional tables contain information on the impact of sampling effort on richness, a table of species rareness per dataset, and two summary tables, one for bird diversity and one for environmental variables.
The methods for compilation are contained in the supplementary information of the manuscript but also here:
Bird data
For BBA data, shapefiles for blocks and the data on species presences and sampling effort in blocks were received from the atlas coordinators. For BBS data, shapefiles for routes and raw species data were obtained from the Patuxent Wildlife Research Center (https://databasin.org/datasets/02fe0ebbb1b04111b0ba1579b89b7420 and https://www.pwrc.usgs.gov/BBS/RawData).
Using ArcGIS Pro© 10.0, species observations were joined to respective BBS and BBA observation units shapefiles using the Join Table tool. For both BBA and BBS, a species was coded as either present (1) or absent (0). Presence in a sampling unit was based on codes 2, 3, or 4 in the original volunteer birding checklist codes (possible breeder, probable breeder, and confirmed breeder, respectively), and absence was based on codes 0 or 1 (not observed and observed but not likely breeding). Spelling inconsistencies of species names between BBA and BBS datasets were fixed. Species that needed spelling fixes included Brewer’s Blackbird, Cooper’s Hawk, Henslow’s Sparrow, Kirtland’s Warbler, LeConte’s Sparrow, Lincoln’s Sparrow, Swainson’s Thrush, Wilson’s Snipe, and Wilson’s Warbler. In addition, naming conventions were matched between BBS and BBA data. The Alder and Willow Flycatchers were lumped into Traill’s Flycatcher and regional races were lumped into a single species column: Dark-eyed Junco regional types were lumped together into one Dark-eyed Junco, Yellow-shafted Flicker was lumped into Northern Flicker, Saltmarsh Sparrow and the Saltmarsh Sharp-tailed Sparrow were lumped into Saltmarsh Sparrow, and the Yellow-rumped Myrtle Warbler was lumped into Myrtle Warbler (currently named Yellow-rumped Warbler). Three hybrid species were removed: Brewster's and Lawrence's Warblers and the Mallard x Black Duck hybrid. Established “exotic” species were included in the analysis since we were concerned only with detection of richness and not of specific species.
The resultant species tables with sampling effort were pivoted horizontally so that every row was a sampling unit and each species observation was a column. This was done for each state using R version 3.6.2 (R© 2019, The R Foundation for Statistical Computing Platform) and all state tables were merged to yield one BBA and one BBS dataset. Following the joining of environmental variables to these datasets (see below), BBS and BBA data were joined using rbind.data.frame in R© to yield a final dataset with all species observations and environmental variables for each observation unit.
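A minimal sketch of this recoding and reshaping step in R is given below. The column names unit_id, species, and breeding_code and the toy rows are placeholders, and tidyr/dplyr are used here only for illustration; they are not necessarily the functions used in the original workflow.
library(dplyr)
library(tidyr)
## Hypothetical long-format input: one row per species record per sampling unit.
## breeding_code follows the checklist convention described above
## (0-1 = not observed / not likely breeding, 2-4 = possible/probable/confirmed breeder).
obs <- data.frame(
  unit_id = c("BBA_001", "BBA_001", "BBS_012"),
  species = c("Northern Flicker", "Swainson's Thrush", "Northern Flicker"),
  breeding_code = c(4, 1, 2)
)
wide <- obs %>%
  mutate(present = as.integer(breeding_code >= 2)) %>%  ## presence (1) vs absence (0)
  pivot_wider(id_cols = unit_id, names_from = species,
              values_from = present, values_fill = 0)   ## species never recorded = absent
## State-level tables built this way can then be stacked with rbind.data.frame(),
## as described above, once their species columns are aligned.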
Environmental data
Using ArcGIS Pro© 10.0, all environmental raster layers, BBA and BBS shapefiles, and the species observations were integrated in a common coordinate system (North_America Equidistant_Conic) using the Project tool. For BBS routes, 400m buffers were drawn around each route using the Buffer tool. The observation unit shapefiles for all states were merged (separately for BBA blocks and BBS routes and 400m buffers) using the Merge tool to create a study-wide shapefile for each data source. Whether or not a BBA block was adjacent to a BBS route was determined using the Intersect tool based on a radius of 30m around the route buffer (to fit the NLCD map resolution). Area and length of the BBS route inside the proximate BBA block were also calculated. Mean values for annual precipitation and summer temperature, and mean and range for elevation, were extracted for every BBA block and 400m buffer BBS route using Zonal Statistics as Table tool. The area of each land-cover type in each observation unit (BBA block and BBS buffer) was calculated from the NLCD layer using the Zonal Histogram tool.
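The zonal summaries above were produced in ArcGIS. Purely as an illustrative alternative, the same kind of per-unit summary could be computed in R with the terra package; the file names below are placeholders, not files shipped with this dataset.
library(terra)
units  <- vect("observation_units.shp")      ## merged BBA blocks / buffered BBS routes (placeholder)
precip <- rast("prism_annual_precip.tif")    ## placeholder raster file names
elev   <- rast("srtm_elevation.tif")
## Mean annual precipitation and mean/range of elevation per observation unit,
## analogous to the Zonal Statistics as Table step described above.
units$precip_mean <- extract(precip, units, fun = mean, na.rm = TRUE)[, 2]
units$elev_mean   <- extract(elev, units, fun = mean, na.rm = TRUE)[, 2]
units$elev_range  <- extract(elev, units, fun = max, na.rm = TRUE)[, 2] -
  extract(elev, units, fun = min, na.rm = TRUE)[, 2]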
ABSTRACT:
The World Soil Information Service (WoSIS) provides quality-assessed and standardized soil profile data to support digital soil mapping and environmental applications at broad scales. Since the release of the 'WoSIS snapshot 2019', many new soil data were shared with us, registered in the ISRIC data repository, and subsequently standardized in accordance with the licenses specified by the data providers. The source data were contributed by a wide range of data providers; therefore, special attention was paid to the standardization of soil property definitions, soil analytical procedures, and soil property values (and units of measurement).
We presently consider the following soil chemical properties (organic carbon, total carbon, total carbonate equivalent, total nitrogen, phosphorus (extractable P, total P, and P retention), soil pH, cation exchange capacity, and electrical conductivity) and physical properties (soil texture (sand, silt, and clay), bulk density, coarse fragments, and water retention), grouped according to analytical procedures (aggregates) that are operationally comparable.
For each profile we provide the original soil classification (FAO, WRB, USDA, and version) and horizon designations as far as these have been specified in the source databases.
Three measures for 'fitness-for-intended-use' are provided: positional uncertainty (for site locations), time of sampling/description, and a first approximation for the uncertainty associated with the operationally defined analytical methods. These measures should be considered during digital soil mapping and subsequent earth system modelling that use the present set of soil data.
DATA SET DESCRIPTION:
The 'WoSIS 2023 snapshot' comprises data for 228k profiles from 217k geo-referenced sites that originate from 174 countries. The profiles represent over 900k soil layers (or horizons) and over 6 million records. The actual number of measurements for each property varies greatly between profiles and with depth, generally depending on the objectives of the initial soil sampling programmes.
The data are provided in TSV (tab-separated values) format and as a GeoPackage. The zip file (446 MB) contains the following files:
Readme_WoSIS_202312_v2.pdf: Provides a short description of the dataset, file structure, column names, units, and category values (this file is also available directly under 'online resources'). The pdf includes links to tutorials for reading the TSV files into R and Excel, respectively. See also 'HOW TO READ TSV FILES INTO R AND PYTHON' in the next section.
wosis_202312_observations.tsv: This file lists the four- to six-letter codes for each observation, whether the observation is for a site/profile or a layer (horizon), the unit of measurement, and the number of profiles and layers, respectively, represented in the snapshot. It also provides an estimate of the inferred accuracy of the laboratory measurements.
wosis_202312_sites.tsv: This file characterizes the site location where profiles were sampled.
wosis_202312_profiles.tsv: Presents the unique profile ID (i.e. primary key), site_id, source of the data, country ISO code and name, positional uncertainty, latitude and longitude (WGS 1984), maximum depth of soil described and sampled, as well as information on the soil classification system and edition. Depending on the soil classification system used, the number of fields will vary.
wosis_202312_layers.tsv: This file characterises the layers (or horizons) per profile, and lists their upper and lower depths (cm).
wosis_202312_xxxx.tsv: This type of file presents results for each observation (e.g. "xxxx" = "BDFIOD", giving wosis_202312_bdfiod.tsv), as defined under "code" in the file wosis_202312_observations.tsv.
wosis_202312.gpkg: Contains the above datafiles in GeoPackage format (which stores the files within an SQLite database).
HOW TO READ TSV FILES INTO R AND PYTHON:
A) To read the data in R, please uncompress the ZIP file and specify the uncompressed folder.
setwd("/YourFolder/WoSIS_2023_December/") ## For example: setwd('D:/WoSIS_2023_December/')
Then use read_tsv to read the TSV files, specifying the data types for each column (c = character, i = integer, n = number, d = double, l = logical, f = factor, D = date, T = date time, t = time).
observations = readr::read_tsv('wosis_202312_observations.tsv', col_types='cccciid')
observations ## show columns and first 10 rows
sites = readr::read_tsv('wosis_202312_sites.tsv', col_types='iddcccc')
sites
profiles = readr::read_tsv('wosis_202312_profiles.tsv', col_types='icciccddcccccciccccicccci')
profiles
layers = readr::read_tsv('wosis_202312_layers.tsv', col_types='iiciciiilcc')
layers
orgc = readr::read_tsv('wosis_202312_orgc.tsv', col_types='iicciilccdccddccccc')
orgc
Note: One may also use the following base R code (example is for the file 'wosis_202312_observations.tsv'):
observations <- read.table("wosis_202312_observations.tsv", sep = "\t", header = TRUE, quote = "", comment.char = "", stringsAsFactors = FALSE)
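For the GeoPackage version (wosis_202312.gpkg), the sf package can be used in a similar way; note that the layer name passed to st_read below is an assumption, so list the available layers with st_layers first.
library(sf)
st_layers("wosis_202312.gpkg")  ## list the layers stored in the GeoPackage
profiles_gpkg <- st_read("wosis_202312.gpkg", layer = "wosis_202312_profiles")  ## layer name assumed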
B) To read the files into python first decompress the files to your selected folder. Then in python:
import pandas as pd
observations = pd.read_csv("wosis_202312_observations.tsv", sep="\t")
observations.head()  # print the data frame header and some rows
sites = pd.read_csv("wosis_202312_sites.tsv", sep="\t")
profiles = pd.read_csv("wosis_202312_profiles.tsv", sep="\t")
layers = pd.read_csv("wosis_202312_layers.tsv", sep="\t")
cfvo = pd.read_csv("wosis_202312_cfvo.tsv", sep="\t")
CITATION: Calisto, L., de Sousa, L.M., Batjes, N.H., 2023. Standardised soil profile data for the world (WoSIS snapshot – December 2023), https://doi.org/10.17027/isric-wdcsoils-20231130
Supplement to: Batjes N.H., Calisto, L. and de Sousa L.M., 2023. Providing quality-assessed and standardised soil data to support global mapping and modelling (WoSIS snapshot 2023). Earth System Science Data, https://doi.org/10.5194/essd-16-4735-2024.
Photosynthetically Available Radiation (PAR) observations were made in the water column and on deck using Biospherical Instruments Inc. PUV-500 and PUV-510 (cosine, 400-700 nm) units. The observations were made over a six-month period (January-June 1995) from five cruises of the R/V Endeavor to the Georges Bank region.
Ship           Cruise ID   Dates
R/V Endeavor   EN259       1995-01-12 to 1995-01-21
R/V Endeavor   EN262       1995-02-26 to 1995-03-08
R/V Endeavor   EN264       1995-03-28 to 1995-04-03
R/V Endeavor   EN266       1995-04-27 to 1995-05-05
R/V Endeavor   EN267A      1995-05-23 to 1995-06-04
License: Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
These data comprise locations and dive metrics from 19 adult Weddell seals (Leptonychotes weddellii) instrumented in February 2011 with telemetry devices. These devices were Sea Mammal Research Unit Conductivity-Temperature-Depth Satellite Relay Data Loggers. These are animal-attached loggers that are manufactured and programmed by the Sea Mammal Research Unit Instrumentation Group at the University of St Andrews, Scotland. Ethical approval for this work was obtained from the University of St Andrews Animal Welfare and Ethics Committee, and from the corresponding committee at the British Antarctic Survey.
This dataset is linked to the manuscript Photopoulou et al. 2020 "Sex-specific variation in the use of vertical habitat by a resident Antarctic top predator" Proceedings of the Royal Society B (http://dx.doi.org/10.1098/rspb.2020.1447) and a preprint on bioRxiv (https://doi.org/10.1101/2020.06.15.152009).
The data are structured in long format, so that each row in the dataset represents an observation. The columns in the data are as follows.
ID: This is a factor variable identifying each individual seal. It has 19 levels.
sex: This is a character variable identifying the sex of the seal, either "F" or "M" for female and male.
source: This is a factor variable identifying the type of record. It has three levels "dv", "ho" and "sf" for dive, haulout and surface record respectively.
start_date: This is a date variable (POSIXct) that gives the date and time of the beginning of an observation in UTC time.
end_date: This is a date variable (POSIXct) that gives the date and time of the end of an observation in UTC time.
local_start_date: This is a date variable (POSIXct) that gives the local date and time of the beginning of an observation at the geographic location of the seal.
local_end_date: This is a date variable (POSIXct) that gives the local date and time of the end of an observation at the geographic location of the seal.
fg_lon: This is a numeric variable and gives the longitude of the seal at the local end time of each record. This is the product of fitting a statistical model to the observed ARGOS location data, taking account of the error class. This is not the originally observed location. The model used was a continuous time state-space model implemented in the R package foieGras by Ian Jonsen and Toby Patterson (2019) foieGras: Fit Continuous-Time State-Space and Latent Variable Models for Filtering Argos Satellite (and Other) Telemetry Data and Estimating Movement Behaviour. R package version 0.4.01.
fg_lat: This is a numeric variable and gives the latitude of the seal at the local end time of each record. This is the product of fitting a statistical model to the observed ARGOS location data, taking account of the error class. This is not the originally observed location. The model used was a continuous time state-space model implemented in the R package foieGras by Ian Jonsen and Toby Patterson (2019) as above in fg_lon.
hour: This is a positive integer variable giving the hour of the day at the local end date.
week: This is a positive integer variable giving week of the year at the UTC end date.
DURATION: This is a positive integer variable that gives the duration of the record in seconds.
MAX_DEP: This is a positive continuous variable that gives the maximum depth (m) visited during a record. This is only zero for haulout records.
hunt_secs: This is a positive integer variable that gives the total number of seconds spent in hunting behaviour, as calculated using methods developed by Heerah et al. (2019), "Validation of dive foraging indices using archived and transmitted acceleration data: The case of the Weddell seal", Frontiers in Ecology and Evolution, 7: 30, and in the supplementary material of Photopoulou et al. (2020), "Sex-specific variation in the use of vertical habitat by a resident Antarctic top predator".
hunt_prop: This is a numeric variable ranging from 0 to 1. It represents the proportion of dive time spent hunting and is calculated as hunt_secs/DURATION for dive records only.
hunt_avgdep: This is a positive continuous variable giving the mean depth (m) of hunting behaviour as calculated using methods described above.
hunt_mindep: This is a positive continuous variable giving the minimum (shallowest) depth (m) of hunting behaviour, as calculated using methods described above.
hunt_maxdep: This is a positive continuous variable giving the maximum (deepest) depth (m) of hunting behaviour, as calculated using methods described above.
bathy: This is a negative continuous variable giving the depth (m) of the sea floor at the record's location, based on the IBCSO bathymetry dataset: Arndt et al. (2013) The International Bathymetric Chart of the Southern Ocean (IBCSO) Version 1.0—A new bathymetric compilation covering circum-Antarctic waters. Geophysical Research Letters, 40(12): 3111-3117.
bathy_std: This is a numeric variable giving the standardised bathymetry value calculated as (bathy-mean(bathy))/sd(bathy).
prop_bathy: This is a numeric variable ranging from 0 to 1, giving the proportion of bathymetry reached at the maximum dive depth, calculated as MAX_DEP/bathy.
prop_hunt_bathy: This is a numeric variable ranging from 0 to 1, giving the proportion of bathymetry reached at the mean hunting depth, calculated as hunt_avgdep/bathy. (These derived variables are illustrated in the R sketch after this column list.)
dep4ctd: This is a discrete numeric variable giving the depth at which temperature and salinity were extracted from the CTD profile nearest in time to the location of the record. The CTD profile was interpolated at a 10 m resolution.
psal: This is a positive continuous variable giving the salinity (PSU) at the location of the record, at the depth given by dep4ctd. When there was no hunting, there is an NA.
temp: This is a positive continuous variable giving the potential temperature (degrees Celsius) at the location of the record, at the depth given by dep4ctd. When there was no hunting, there is an NA.
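As a quick illustration of how the derived columns above relate to the raw ones, here is a minimal sketch assuming the data have been read into a data frame called seals with the column names listed above; the data-frame name is a placeholder.
## Proportion of dive time spent hunting (dive records only).
seals$hunt_prop <- with(seals, ifelse(source == "dv", hunt_secs / DURATION, NA))
## Standardised bathymetry, as defined for bathy_std above.
seals$bathy_std <- (seals$bathy - mean(seals$bathy, na.rm = TRUE)) / sd(seals$bathy, na.rm = TRUE)
## Proportion of bathymetry reached at the maximum dive depth
## and at the mean hunting depth, using the formulas given above.
seals$prop_bathy      <- seals$MAX_DEP / seals$bathy
seals$prop_hunt_bathy <- seals$hunt_avgdep / seals$bathy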