This data package contains identifiers, metadata, and a map of the locations where field measurements have been conducted at the East River Community Observatory located in the Upper Colorado River Basin, United States. The East River is the primary field site of the Watershed Function Scientific Focus Area (WFSFA) and the Rocky Mountain Biological Laboratory. Researchers from over 30 institutions generate highly diverse hydrological, biogeochemical, climate, vegetation, geological, remote sensing, and model data at the East River in collaboration with the WFSFA. Thus, the purpose of this data package is to maintain an inventory of the field locations and instrumentation to provide information on the field activities in the East River and coordinate data collected across different locations, researchers, and institutions. The data package contains (1) a README file with information on the various files, (2) three csv files describing the metadata collected for each surface point location, plot and region registered with the WFSFA, (3) csv files with metadata and contact information for each surface point location registered with the WFSFA, (4) a csv file with metadata and contact information for plots, (5) a csv file with metadata for geographic regions and sub-regions within the watershed, (6) a compiled xlsx file with all the data and metadata which can be opened in Microsoft Excel, and (7) a kmz map of the locations plotted in the watershed which can be opened in Google Earth. Persistent location identifiers are determined by the WFSFA data management team and are used to track data and samples across locations. Researchers interested in having their East River measurement locations added to this list should reach out to the WFSFA data management team at wfsfa-data@googlegroups.com. Acknowledgements: Please cite this dataset if using any of the location metadata in other publications or derived products.
If using the location metadata for the NEON hyperspectral campaign, additionally cite Chadwick et al. (2020). doi:10.15485/1618130.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset compares FIXED-line broadband internet speeds in five cities:
- Melbourne, AU
- Bangkok, TH
- Shanghai, CN
- Los Angeles, US
- Alice Springs, AU
ERRATA: 1. Data is for Q3 2020, but some files are incorrectly labelled 02-20 or June 20. They should all read Sept 20 or 09-20 (Q3 20) rather than Q2. Will rename and reload. Amended in v7.
* Lines of data for each geojson file; a line equates to a 600m^2 location and includes total tests, devices used, and average upload and download speed:
- MEL: 16181 locations/lines => 0.85M speedtests (16.7 tests per 100 people)
- SHG: 31745 lines => 0.65M speedtests (2.5 per 100 people)
- BKK: 29296 lines => 1.5M speedtests (14.3 per 100 people)
- LAX: 15899 lines => 1.3M speedtests (10.4 per 100 people)
- ALC: 76 lines => 500 speedtests (2 per 100 people)
Geojsons of these 2° by 2° extracts for MEL, BKK, SHG now added, and LAX added v6. Alice Springs added v15.
This dataset unpacks, geospatially, data summaries provided in Speedtest Global Index (linked below). See Jupyter Notebook (*.ipynb) to interrogate geo data. See link to install Jupyter.
** To Do: Will add Google Map versions so everyone can see the data without installing Jupyter.
- Link to Google Map (BKK) added below. Key: Green > 100Mbps (Superfast). Black > 500Mbps (Ultrafast). CSV provided. Code in Speedtestv1.1.ipynb Jupyter Notebook.
- Community (Whirlpool) surprised [Link: https://whrl.pl/RgAPTl] that Melb has 20% at or above 100Mbps. Suggestion: plot Top 20% on map for community. Google Map link now added (and tweet).
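** Python
# au_tiles / tiles are the geopandas tile tables loaded in the notebook.
# .cx slices are [lon_min:lon_max, lat_min:lat_max] (longitude first).
melb = au_tiles.cx[144:146, -39:-37]   # Melbourne lon/lat extract
shg = tiles.cx[120:122, 30:32]         # Shanghai lon/lat extract
bkk = tiles.cx[100:102, 13:15]         # Bangkok lon/lat extract
lax = tiles.cx[-120:-118, 33:35]       # Los Angeles lon/lat extract
ALC = tiles.cx[132:134, -24:-22]       # Alice Springs lon/lat extract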
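The extracts above pick out tiles by a longitude/latitude bounding box (the geopandas `.cx` indexer takes x, i.e. longitude, first). As a rough illustration of the same idea without geopandas, the sketch below tests whether a single point falls inside one of the five boxes. The function name `city_for_point` and the sample coordinates are assumptions made for illustration; real tiles are polygons, so `.cx` selects on intersection rather than on a single point.

```python
# Bounding boxes as (lon_min, lon_max, lat_min, lat_max), taken from the
# slices used in the extracts, with each pair sorted low to high.
CITY_BOXES = {
    "MEL": (144, 146, -39, -37),
    "SHG": (120, 122, 30, 32),
    "BKK": (100, 102, 13, 15),
    "LAX": (-120, -118, 33, 35),
    "ALC": (132, 134, -24, -22),
}


def city_for_point(lon, lat):
    """Return the code of the first box containing the point, or None."""
    for code, (lon_min, lon_max, lat_min, lat_max) in CITY_BOXES.items():
        if lon_min <= lon <= lon_max and lat_min <= lat <= lat_max:
            return code
    return None


# Approximate city-centre coordinates (assumed, for illustration only).
print(city_for_point(144.96, -37.81))   # a point in Melbourne
print(city_for_point(-118.24, 34.05))   # a point in Los Angeles
print(city_for_point(151.21, -33.87))   # a point in Sydney, outside every box
```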
Histograms (v9) and data visualisations (v3, 5, 9, 11) will be provided. Data sourced from: this is an extract of Speedtest Open Data available at Amazon WS (link below - opendata.aws).
** VERSIONS
v24. Add tweet and google map of Top 20% (over 100Mbps locations) in Mel Q322. Add v.1.5 MEL-Superfast notebook, and CSV of results (now on Google Map; link below).
v23. Add graph of 2022 Broadband distribution, and compare 2020 - 2022. Updated v1.4 Jupyter notebook.
v22. Add Import ipynb; workflow-import-4cities.
v21. Add Q3 2022 data; five cities inc ALC. Geojson files. (2020: 4.3M tests; 2022: 2.9M tests)
v20. Speedtest - Five Cities inc ALC.
v19. Add ALC2.ipynb.
v18. Add ALC line graph.
v17. Added ipynb for ALC. Added ALC to title.
v16. Load Alice Springs Data Q221 - csv. Added Google Map link of ALC.
v15. Load Melb Q1 2021 data - csv.
v14. Added Melb Q1 2021 data - geojson.
v13. Added Twitter link to pics.
v12. Add Line-Compare pic (fastest 1000 locations) inc Jupyter (nbn-intl-v1.2.ipynb).
v11. Add Line-Compare pic, plotting Four Cities on a graph.
v10. Add Four Histograms in one pic.
v9. Add Histogram for Four Cities. Add NBN-Intl.v1.1.ipynb (Jupyter Notebook).
v8. Renamed LAX file to Q3, rather than 03.
v7. Amended file names of BKK files to correctly label as Q3, not Q2 or 06.
v6. Added LAX file.
v5. Add screenshot of BKK Google Map.
v4. Add BKK Google map (link below), and BKK csv mapping files.
v3. Replaced MEL map with big key version. Prev key was very tiny in top right corner.
v2. Uploaded MEL, SHG, BKK data and Jupyter Notebook.
v1. Metadata record.
** LICENCE AWS data licence on Speedtest data is "CC BY-NC-SA 4.0", so use of this data must be: - non-commercial (NC) - reuse must be share-alike (SA)(add same licence). This restricts the standard CC-BY Figshare licence.
** Other uses of Speedtest Open Data; - see link at Speedtest below.
https://en.wikipedia.org/wiki/Public_domain
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The primary legal divisions of most states are termed counties. In Louisiana, these divisions are known as parishes. In Alaska, which has no counties, the equivalent entities are the organized boroughs, city and boroughs, municipalities, and for the unorganized area, census areas. The latter are delineated cooperatively for statistical purposes by the State of Alaska and the Census Bureau. In four states (Maryland, Missouri, Nevada, and Virginia), there are one or more incorporated places that are independent of any county organization and thus constitute primary divisions of their states. These incorporated places are known as independent cities and are treated as equivalent entities for purposes of data presentation. The District of Columbia and Guam have no primary divisions, and each area is considered an equivalent entity for purposes of data presentation. The Census Bureau treats the following entities as equivalents of counties for purposes of data presentation: Municipios in Puerto Rico, Districts and Islands in American Samoa, Municipalities in the Commonwealth of the Northern Mariana Islands, and Islands in the U.S. Virgin Islands. The entire area of the United States, Puerto Rico, and the Island Areas is covered by counties or equivalent entities. The boundaries for counties and equivalent entities are as of January 1, 2017, primarily as reported through the Census Bureau's Boundary and Annexation Survey (BAS).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset is the outcome of the paper "Floating Car Data Map-matching Utilizing the Dijkstra Algorithm", accepted for the 3rd International Conference on Data Management, Analytics & Innovation held in Kuala Lumpur, Malaysia in 2019.
The floating car data (FCD, representing the movement of cars with their position in time) is produced by the traffic simulator software (further referred to as Simulator) published in [1] and can be used as an input for data processing and benchmarking. The dataset contains FCD of various quality levels based on the routing graph of the Czech Republic derived from OpenStreetMap (openstreetmap.org).
Should the dataset be used for scientific or other purposes, any acknowledgement of or reference to our paper [1] and the dataset is welcome and highly appreciated.
Archive contents
The archive contains the following folders.
city_oneway and city_roadtrip - FCD from the city of Brno, Czech Republic where FCD is based on Origin-Destination in case of oneway and Origin-Destination-Origin in case of a road trip
intercity_oneway and intercity_roadtrip - FCD from cities of Brno, Ostrava, Olomouc and Zlin, all Czech Republic where FCD is based on Origin-Destination in case of oneway and Origin-Destination-Origin in case of a road trip
Content explanation
All four of the mentioned folders contain raw FCD as they come from our Simulator, post-processed FCD enriching the Simulator FCD, and obfuscated raw FCD (of both low and high obfuscation level). In both obfuscated data sets, each measured point was moved in a random direction by a number of meters drawn from a Gaussian distribution. We utilized two Gaussian distributions, one for the roads outside the city (N(0,10) for the lower and N(0,20) for the higher obfuscation level) and one for the roads inside the city (N(0,15) and N(0,30) respectively). Then a predefined share of randomly chosen points was removed (3% in our case). This approach should roughly represent the real conditions encountered by FCD as described by El Abbous and Samanta [2].
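The obfuscation procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code used to produce the dataset: it assumes that the second parameter of N(0,10) etc. is a standard deviation in meters, converts meters to degrees with a simple spherical approximation, and uses a hypothetical function name (`obfuscate`) and made-up sample points.

```python
import math
import random

METERS_PER_DEG_LAT = 111_320.0  # rough spherical approximation (assumption)


def obfuscate(points, sigma_m, drop_fraction=0.03, seed=0):
    """Shift each (lat, lon) point by a Gaussian distance in a random
    direction, then remove a random fraction of the points."""
    rng = random.Random(seed)
    moved = []
    for lat, lon in points:
        distance = rng.gauss(0.0, sigma_m)        # meters; sign may be negative
        angle = rng.uniform(0.0, 2.0 * math.pi)   # direction of the shift
        dlat = distance * math.cos(angle) / METERS_PER_DEG_LAT
        dlon = distance * math.sin(angle) / (
            METERS_PER_DEG_LAT * math.cos(math.radians(lat))
        )
        moved.append((lat + dlat, lon + dlon))
    n_drop = round(len(moved) * drop_fraction)
    dropped = set(rng.sample(range(len(moved)), n_drop))
    return [p for i, p in enumerate(moved) if i not in dropped]


# 100 hypothetical points near Brno; sigma of 15 m mimics the lower
# obfuscation level for roads inside the city.
track = [(49.19 + i * 1e-4, 16.61 + i * 1e-4) for i in range(100)]
noisy = obfuscate(track, sigma_m=15.0)
print(len(track), len(noisy))
```

A fixed seed keeps the sketch reproducible; whether the original tool seeded its random generator is not stated in the description.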
In case of post-processed road trip data, there is one extra dataset with "cache" suffix representing the very same dataset limited to a 5-minute session memoization. This folder also contains a picture of processed FCD represented on a map.
Data format
Standard UTF-8 encoded CSV files, separated by a semicolon, with the following columns:
RAW
Header
session_id;timestamp;lat;lon;speed;bearing;segment_id
Data
session_id: (Type: unsigned INT) - session (car) identifier
timestamp: (Type: datetime) - timestamp in UTC
lat: (Type: unsigned long) - latitude as used in Google maps
lon: (Type: unsigned long) - longitude as used in Google maps
speed: (Type: unsigned INT) - actual speed in kmh
bearing: (Type: unsigned INT) - actual bearing in degrees (0-360)
segment_id: (Type: unsigned long) - unique edge identifier
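A RAW file with the header and columns listed above could be read along these lines. This is only a sketch: the sample row is invented, the timestamp is left as text because its exact format is not documented here, and latitude/longitude are parsed as decimal degrees on the assumption that "as used in Google maps" means decimal values.

```python
import csv
import io

RAW_COLUMNS = [
    "session_id", "timestamp", "lat", "lon", "speed", "bearing", "segment_id",
]


def read_raw_fcd(handle):
    """Yield one dict per RAW record, converting the numeric columns."""
    reader = csv.DictReader(handle, delimiter=";")
    if reader.fieldnames != RAW_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    for row in reader:
        yield {
            "session_id": int(row["session_id"]),
            "timestamp": row["timestamp"],  # kept as text (format assumed unknown)
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
            "speed": int(row["speed"]),
            "bearing": int(row["bearing"]),
            "segment_id": int(row["segment_id"]),
        }


# Hypothetical sample standing in for a real file from the archive.
sample = io.StringIO(
    "session_id;timestamp;lat;lon;speed;bearing;segment_id\n"
    "1;2018-01-01 08:00:00;49.1951;16.6068;42;180;123456\n"
)
records = list(read_raw_fcd(sample))
print(records[0]["speed"], records[0]["lat"])
```

For a file on disk, `open(path, newline="", encoding="utf-8")` can be passed in place of the `io.StringIO` object.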
POST-PROCESSED
Header
gid;car_id;point_time;lat;lon;segment_id;speed_kmh;speed_avg_kmh;distance_delta_m;distance_total_m;speedup_ratio;duration;segment_changed;duration_segment;moved;duration_move;good;duration_good;bearing;interpolated
Data
gid: (Type: unsigned long) - global identifier of a record
car_id: (Type: unsigned INT) - session (car) identifier
point_time: (Type: datetime) - timestamp with timezone
lat: (Type: unsigned long) - latitude as used in Google maps
lon: (Type: unsigned long) - longitude as used in Google maps
segment_id: (Type: unsigned long) - unique edge identifier
speed_kmh: (Type: unsigned INT) - actual speed in kmh
speed_avg_kmh: (Type: unsigned long) - actual average speed of a car in kmh
distance_delta_m: (Type: unsigned long) - actual distance delta in metres
distance_total_m: (Type: unsigned long) - actual total distance of a car in metres
speedup_ratio: (Type: unsigned long) - actual speed-up ratio of a car
duration: (Type: time) - actual duration of a car
segment_changed: (Type: boolean) - signals if the actual segment of a car differs from the previous one
duration_segment: (Type: time) - actual duration of a car on a segment
moved: (Type: boolean) - signals if the actual position of a car differs from the previous one
duration_move: (Type: time) - actual duration of a car since moving
good: signals if the actual record values satisfy all data constraints (all true as derived from Simulator)
duration_good: actual duration of a car since all constraint conditions were satisfied
bearing: (Type: unsigned INT) - actual bearing in degrees (0-360)
interpolated: (Type: boolean) - signals if the actual segment identifier is calculated (all false as derived from Simulator)
References
[1] V. Ptošek, J. Ševčík, J. Martinovič, K. Slaninová, L. Rapant, and R. Cmar, Real-time traffic simulator for self-adaptive navigation system validation, Proceedings of EMSS-HMS: Modeling & Simulation in Logistics, Traffic & Transportation, 2018.
[2] A. El Abbous and N. Samanta, A modeling of GPS error distributions, In Proceedings of the 2017 European Navigation Conference (ENC), 2017.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Summary:
The files contained herein represent green roof footprints in NYC visible in 2016 high-resolution orthoimagery of NYC (described at https://github.com/CityOfNewYork/nyc-geo-metadata/blob/master/Metadata/Metadata_AerialImagery.md). Previously documented green roofs were aggregated in 2016 from multiple data sources, including the NYC Department of Parks and Recreation, the NYC Department of Environmental Protection, greenroofs.com, and greenhomenyc.org. Footprints of the green roof surfaces were manually digitized based on the 2016 imagery, and a sample of other roof types was digitized to create a set of training data for classification of the imagery. A Mahalanobis distance classifier was employed in Google Earth Engine, and results were manually corrected by removing non-green roofs that had been classified as green and adjusting the shapes/outlines of the classified green roofs to remove significant errors, based on visual inspection of imagery across multiple time points. Ultimately, these initial data represent an estimate of where green roofs existed as of the imagery used, in 2016.
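The classification step above relies on a Mahalanobis distance classifier. The idea can be sketched in plain Python for two features per pixel: each class is summarized by the mean and covariance of its training samples, and a pixel is assigned to the class with the smallest Mahalanobis distance. This is not the Google Earth Engine implementation used for the dataset; the feature choice (red and near-infrared reflectance), the class labels, and all sample values below are hypothetical.

```python
from statistics import mean


def fit_class(samples):
    """Return (mean vector, inverse covariance) for 2-feature samples."""
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = mean(xs), mean(ys)
    n = len(samples)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy * sxy  # must be non-zero for the inverse to exist
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inverse


def mahalanobis_sq(pixel, model):
    """Squared Mahalanobis distance from a pixel to a class model."""
    (mx, my), inv = model
    dx, dy = pixel[0] - mx, pixel[1] - my
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify(pixel, models):
    """Assign the pixel to the class with the smallest distance."""
    return min(models, key=lambda label: mahalanobis_sq(pixel, models[label]))


# Made-up training samples as (red, near-infrared) reflectance pairs.
training = {
    "green_roof": [(0.10, 0.50), (0.12, 0.55), (0.08, 0.48), (0.11, 0.60)],
    "other_roof": [(0.30, 0.32), (0.35, 0.33), (0.28, 0.30), (0.33, 0.36)],
}
models = {label: fit_class(samples) for label, samples in training.items()}

print(classify((0.10, 0.52), models))
print(classify((0.31, 0.32), models))
```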
These data are associated with an existing GitHub Repository, https://github.com/tnc-ny-science/NYC_GreenRoofMapping, and as needed and appropriate pending future work, versioned updates will be released here.
Terms of Use:
The Nature Conservancy and co-authors of this work shall not be held liable for improper or incorrect use of the data described and/or contained herein. Any sale, distribution, loan, or offering for use of these digital data, in whole or in part, is prohibited without the approval of The Nature Conservancy and co-authors. The use of these data to produce other GIS products and services with the intent to sell for a profit is prohibited without the written consent of The Nature Conservancy and co-authors. All parties receiving these data must be informed of these restrictions. Authors of this work shall be acknowledged as data contributors to any reports or other products derived from these data.
Associated Files:
As of this release, the specific files included here are:
Column Information for the datasets:
Some, but not all fields were joined to the green roof footprint data based on building footprint and tax lot data; those datasets are embedded as hyperlinks below.
For GreenRoofData2016_20180917.csv there are two additional columns, representing the coordinates of centroids in geographic coordinates (Lat/Long, WGS84; EPSG 4326):
Acknowledgements:
This work was primarily supported through funding from the J.M. Kaplan Fund, awarded to the New York City Program of The Nature Conservancy, with additional support from the New York Community Trust, through New York City Audubon and the Green Roof Researchers Alliance.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Microdata of the ANTIELAB Research Data Archive - Mobilization Map
Field definition (event.csv):
event-id: a unique event identifier
date: date of the event
time: hour of the day of the event
link: URL link(s) to the call-for-action posts on the Telegram channel
type: type of the event
district: one of the 18 Hong Kong districts (in Chinese)
location1-13: specific locations identified in the post(s) (in Chinese)
The shape file contains the WGS84 coordinates of each of the identified locations of every event. You can look up the geographical coordinates of the events by matching the field ‘event_id’ in the shape file with that in the CSV file. The coordinates are provided by Google’s Geocoding API and Place API.
December 7, 2022: This updated version fixes a data format issue that occurred when exporting the coordinates to the data repository. It also makes the date format of the events consistent to avoid misidentification by users. These changes do not affect the analysis and the results of the original paper.
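The lookup described above, matching the event identifier in the CSV file against the same field in the shape file, amounts to a keyed join. A minimal sketch is given below. It assumes the shape file records have already been read into a dictionary keyed by identifier (for instance with a shapefile reader of your choice), and every row, value, and name in it (`attach_coordinates`, `coords_by_id`, the sample events) is a hypothetical placeholder rather than data from the archive.

```python
import csv
import io


def attach_coordinates(events, coords_by_id, key="event_id"):
    """Add a 'coordinates' list to each event by matching on the shared id."""
    joined = []
    for event in events:
        merged = dict(event)
        # Events without a match in the shape file get an empty list.
        merged["coordinates"] = coords_by_id.get(event[key], [])
        joined.append(merged)
    return joined


# Placeholder rows; real values come from event.csv.
event_csv = io.StringIO(
    "event_id,date,type\n"
    "E001,2019-01-01,example\n"
    "E002,2019-01-02,example\n"
)
events = list(csv.DictReader(event_csv))

# Assumed to be built from the shape file: event_id -> [(lon, lat), ...]
coords_by_id = {"E001": [(114.17, 22.28), (114.16, 22.30)]}

for row in attach_coordinates(events, coords_by_id):
    print(row["event_id"], row["coordinates"])
```

If the identifier column is spelled differently in the CSV header, the `key` argument can be set to match it.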
Reference: Teo, E., Fu, KW. A novel systematic approach of constructing protests repertoires from social media: comparing the roles of organizational and non-organizational actors in social movement. J Comput Soc Sc (2021). https://link.springer.com/article/10.1007/s42001-021-00101-3
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Microdata of the ANTIELAB Research Data Archive - Teargas Map
Field definition:
event-id: a unique event identifier
time_interval: time interval within which the event happened
district: one of the 18 Hong Kong districts (in Chinese)
location1-3: specific locations identified in the post(s) (in Chinese)
The shape file contains the WGS84 coordinates of each of the identified locations of every event. You can look up the geographical coordinates of the events by matching the field ‘event_id’ in the shape file with that in the CSV file. The coordinates are provided by Google’s Geocoding API and Place API.
December 7, 2022: This updated version fixes a data format issue that occurred when exporting the coordinates to the data repository. It also makes the date format of the events consistent to avoid misidentification by users. These changes do not affect the analysis and the results of the original paper.
Reference: Teo, E., Fu, KW. (2021) A novel systematic approach of constructing protests repertoires from social media: comparing the roles of organizational and non-organizational actors in social movement. J Comput Soc Sc. https://link.springer.com/article/10.1007/s42001-021-00101-3
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset contains plant species occurrence records in the upper to mid Tana River Basin, digitized from preserved specimens at the East African Herbarium (EA), Nairobi. The dataset is a subset of the EA BRAHMS database generated over several years through various projects and day-to-day digitization efforts. Specimen records from the Tana River Basin were extracted from the EA BRAHMS dataset, then cleaned and formatted in the Darwin Core Standard. Data cleaning was done using OpenRefine, and the GPS Visualizer tool was used to generate altitude values from coordinates. Elevation data from GPS Visualizer is drawn from a variety of reputable sources including SRTM, ASTER, USGS and Google Maps. Other cleaning tasks included correcting spellings, filling in missing data, and checking the taxonomy, dates, and subnational administrative areas. Records without coordinates were georeferenced using GEOLocate, Google Maps, and gazetteers. Missing values in the altitude (m) field were generated automatically with GPS Visualizer (https://www.gpsvisualizer.com/elevation). The process involved uploading a CSV file with latitude and longitude values to the website, which then sampled these coordinates against inbuilt elevation profile files (Digital Elevation Models) to give altitude values at the specified coordinates. Because the records are legacy data, most coordinates were also retrieved from gazetteers.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables
Census - week number the data was collected.
Calendar date - date data was collected.
Campus - university from where data was collected.
Group_ID - identification tag of group that collected the data.
Lat - latitude approximated using Google Maps.
Long - longitude approximated using Google Maps.
Elevation - approximated elevation.
Rep - data collection number.
abundance.native.plants - number of native plants
abundance.exotic.plants - number of exotic plants
total.number.flowers(quadrats) - flower abundance measured by quadrats
abundance.woody.plants - number of woody plants
canopy.cover - approximated percentage sky is covered by canopy
ground.cover - approximated percentage ground is covered by vegetation
total.flower.numbers(transect) - flower abundance measured by transects
abundance.vertebrate - number of vertebrates
vertebrate.species - number of species of vertebrates
abundance.human - number of humans
abundance.invertebrates.pantraps - number of invertebrates measured by pantraps
abundance.invertebrates.sweep - number of invertebrates measured by sweep nets
Methods:
Data was collected across four observers and later compiled into a single digital document.
Data collected approximately from 15:00 to 17:00 on October 3, 2016 at the York University Keele Reservoir. Conditions were cloudy, mild temperatures.
Two transects were used to encompass a 50m distance.
25 quadrats were placed in each habitat, distanced apart by 2 meters from the last, while also alternating left and right of the transect.
A measuring tape was used to approximate distance more accurately at times.
Woody plant abundance and other similar data were collected every 2 meters along the transect. They were approximated values.
In terms of measuring abundance of vertebrates, etc., the observer walked counter-clockwise around each habitat for 15 minutes.
The transect aided in visualizing the 50m observation range.
Any vertebrate directly above the observer was not counted.
Humans in vehicles were not counted, and humans that made multiple appearances were also not counted. There was no minimum time for a human to be considered in a habitat, just so long as they remained in range for the observer to make note.
6 pan traps were placed of alternating colors (placed at 15:00 at Grasslands, picked up at 16:06, and placed at 16:30 in Disturbance Area, and picked up at 15:30).
The pan traps were distanced along the transect, 3 meters apart from the last.
10 sweep nets traveled the transect distance to capture invertebrates. 5 sweep nets on each side of the transect.
Tools:
1 Quadrat
2 Transects
1 Measuring tape
1 Jug of soap water
2 Sweep nets
12 Pan traps
Hypothesis:
The grasslands will demonstrate a greater diversity of life, in terms of abundances, vegetation, and flowers.
Predictions:
1. There will be greater canopy coverage in the grasslands than in the disturbed area
2. The disturbed area will have a reduced invertebrate/vertebrate abundance
3. There will be more humans in the disturbed area.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables
Census - week number the data was collected.
Calendar date - date data was collected. Month/day.
Campus - university from where data was collected.
Group_ID - identification tag of group that collected the data.
Lat - latitude approximated using Google Maps.
Long - longitude approximated using Google Maps.
Elevation - approximated elevation. Measured in meters.
Rep - data collection number.
abundance.native.plants - number of native plants, determined with the aid of a teaching assistant. A plant was classified as native if it originated in its respective habitat. This is a continuous variable with a ratio scale where 0 represents that no plants were observed.
abundance.exotic.plants - number of exotic plants, determined with the aid of a teaching assistant. A plant was classified as exotic if it originated outside its respective habitat. This is a continuous variable with a ratio scale where 0 represents that no plants were observed.
total.number.flowers(quadrats) - flower abundance measured by quadrats. Only flowers whose roots were inside the quadrat were counted. This is a continuous variable with a ratio scale where 0 represents that no flower heads were observed.
abundance.woody.plants - number of woody plants, i.e. trees. A tree was defined to be at least 1.5m in height. This is a continuous variable with a ratio scale where 0 represents that no plants were observed.
canopy.cover - approximated percentage of sky covered by tree canopy. This is a continuous variable.
ground.cover - approximated percentage of ground covered by vegetation. This is a continuous variable.
total.flower.numbers(transect) - flower abundance measured by transects (1m by 1m of transect). This is a continuous variable.
abundance.vertebrate - number of vertebrates, where a vertebrate is defined as an animal with a structural backbone. This is a continuous variable with a ratio scale where 0 represents that no vertebrates were observed.
vertebrate.species - number of species of vertebrates, where species were distinguished based on morphological traits. This is a continuous variable with a ratio scale where 0 represents that no vertebrates were observed.
abundance.human - number of humans observable in the area. This is a continuous variable with a ratio scale where 0 represents that no humans were observed.
abundance.invertebrates.observed - number of invertebrates, where an invertebrate is defined as an animal lacking a structural backbone. This is a continuous variable with a ratio scale where 0 represents that no invertebrates were observed.
abundance.invertebrates.sweep - number of invertebrates captured by sweep nets. An invertebrate was defined as an animal lacking a structural backbone.
Methods
Data was collected across four observers and later compiled into a single digital document. Data were collected from approximately 15:00 to 17:00 on October 17, 2016 at York University's designated "The Pond Area" and "Impermeable Ground Area". Weather conditions were foggy, with mild temperatures and a slight breeze. The process of collecting data was repeated twice: once at The Pond Area at York U, and then again at the Impermeable Area. All abundances were counted and were clearly visible. Two transects were combined to span 50 meters. Most data collection used the transects as a reference point. 25 quadrats were placed in each habitat, distanced apart by 2 meters from the last, while also alternating left and right of the transect. An observer would count the abundance of native and exotic plants in the quadrat, as well as the number of flower heads.
An observer would walk along the transect. Every 2 meters, they would approximate the abundance of trees and flowers, as well as canopy coverage and vegetative ground coverage. Coverages were approximated as percentages.
The abundance of vertebrates, invertebrates, number of vertebrate species, and abundance of humans were measured along the transect. Vertebrate information was collected within a 50m radius of the transect, and invertebrate information was collected within a 5m radius. Observations were made in a counter-clockwise direction around the transect for 15 minutes. To visualize a 50m observation range, the observer referred to the transect laid out. To cover a 5m observation range, the observer took 5 large steps. Any vertebrate that was more than 0.5m above the observer's view was not counted. Humans in vehicles were not included in the data, nor were humans recounted if they made a second appearance. Humans were counted as long as they were within observation range, regardless of whether they were just passing by.
An observer placed six pan traps in alternating color order. They were placed at 14:52 at The Pond Area and picked up at 16:20, then placed at 16:25 at the Impermeable Area and picked up at 17:09. The pan traps were distanced along the transect, 3 meters apart from the last. 10 sweep nets traveled 50m along the transect to capture invertebrates, 5 on each side of the transect.
Hypothesis:
As the number of plants increases, the number of invertebrates will increase, as the plants provide them with resources.
Prediction:
1. The area near the pond will have a greater abundance of invertebrates than the impermeable ground because there will be a higher abundance of plants, resulting in more resources.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a dataset for "Geography for AI Sustainability and Sustainability for GeoAI". It includes the necessary data and code to reproduce the figures presented in the work. The .csv file contains detailed information on the locations of data centers from three major cloud computing providers: Amazon Web Services, Microsoft Azure, and Google Cloud Platform. It includes the coordinates, launch time, and carbon intensity values (sourced from Electricity Map, as the yearly average for 2023) for each data center.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Column Headers and Descriptions:
census: refers to the sampling week. This dataset was prepared during Week 5 of the Fall 2016 BIOL 2050 course, or the 1st sampling week.
calendar.date: refers to the date on which the data was collected (October 4th, 2016).
campus: the data was collected at York University.
group_ID: refers to the unique group identifier within each BIOL 2050 laboratory.
habitat: refers to the habitat in which observations were recorded. The two habitats that were assessed are grassland and disturbed open space (disturbed grassland). Grassland is defined as an open area at least 250m by 250m in size with few trees. The disturbed open space was defined as a region similar to grassland in size, but containing more mixed vegetation and more trails for people.
lat: refers to the latitude at which observations were recorded. Latitude was approximated using Google Maps.
long: refers to the longitude at which observations were recorded. Longitude was approximated using Google Maps.
rep: refers to the replicate; each replicate is a repetition of the experimental condition.
abundance.native.plants: refers to the number of native plants. Native plants are defined as those plants that develop naturally or existed in the area for a long time. Continuous (numerical) variable.
abundance.exotic.plants: refers to the number of exotic plants. Exotic plants are defined as those plants that do not originate from the location of the study. Continuous (numerical) variable.
total.number.flowers (quadrat): refers to the total number of flowers counted in a quadrat. Continuous (numerical) variable.
abundance.woody.plants: refers to the total number of woody plants counted. Woody plant is defined as a plant that is greater than 1.5 meters in height. Continuous (numerical) variable.
canopy.cover: refers to the coverage area of the projecting tree crown. Expressed as a percentage. Continuous (numerical) variable.
ground.cover: refers to the estimated proportion of vegetative ground cover. Expressed as a percentage. Continuous (numerical) variable.
total.flower.number (transect): refers to the number of flowers counted within 0.5 meters of the transect. Continuous (numerical) variable.
abundance.vertebrates: the number of vertebrates (animals) observed within a 50-meter radius in a 15-minute interval. Continuous (numerical) variable.
vertebrate.species: refers to the species of vertebrate that were observed. Categorical variable. Species that were observed include sparrows, crows, and squirrels.
abundance.human: the number of humans observed within a 50-meter radius in a 15-minute interval. These humans were only recorded if they did not belong to the BIOL 2050 laboratory. Continuous (numerical) variable.
abundance.invertebrates.pantraps: refers to the number of invertebrates counted in the soapy-water-filled bowls after 30 minutes. Continuous (numerical) variable.
abundance.invertebrates.sweeps: refers to the number of invertebrates counted inside the sweep nets after swinging for 50m. Continuous (numerical) variable.
abundance.invertebrates.observed: the number of invertebrates observed within a 5-meter radius in a 15-minute interval. Continuous (numerical) variable.
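As an illustration of how the columns described above might be used, the following is a minimal sketch that averages one numeric column within each habitat. It assumes the data are stored as a CSV file whose header row uses the column names listed above; the rows in SAMPLE_CSV, the habitat labels, and the function name mean_by_habitat are placeholders invented for this example and are not real observations.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Placeholder rows that only mimic the column layout described above.
# They are NOT observations from the dataset.
SAMPLE_CSV = """habitat,rep,abundance.native.plants,abundance.exotic.plants
grassland,1,4,10
grassland,2,6,8
disturbed,1,2,15
disturbed,2,3,11
"""


def mean_by_habitat(handle, column):
    """Average a numeric column within each value of the habitat column."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        value = row[column].strip()
        if value:  # skip blank cells rather than treating them as zero
            groups[row["habitat"]].append(float(value))
    return {habitat: mean(values) for habitat, values in groups.items()}


summary = mean_by_habitat(io.StringIO(SAMPLE_CSV), "abundance.native.plants")
print(summary)
```

To use the sketch on an actual file, pass an open file handle in place of the io.StringIO object.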
Additional Information:
Hypothesis and Predictions:
A. It is hypothesized that there will be a negative relationship between the two variables (abundance of native plants and abundance of exotic plants) in the grassland and the disturbed open space, because the plant species will compete with each other for resources (light, space, food).
It is predicted that as the abundance of native plants increases, fewer exotic plants will be present in a habitat.
B. It was hypothesized that there will be more woody plants in the grassland habitat than in the disturbed open space. In the disturbed area the soil is not favourable for the growth of large woody plants and flowers, thus the ground coverage will be higher and the abundance of woody plants will be close to zero. It was predicted that woody plants will be more abundant in the grassland compared to the disturbed area.
C. It was hypothesized that in the grassland there are more vertebrates and invertebrates than in the disturbed open space.
It is predicted that more vertebrates will be observed in the grassland than in the disturbed open space after surveying a 50-meter radius for 15 minutes. It is also predicted that more invertebrates will be observed in the grassland than in the disturbed open space after surveying a 5-meter radius for 15 minutes.
It is predicted that there will be many invertebrates in the sweep nets in the grassland and few in the sweep nets in the disturbed open space. Likewise, there will be many invertebrates in the pan traps in the grassland and few in the pan traps in the disturbed open space.
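One way a predicted negative relationship between two count variables (for example, native and exotic plant abundance per quadrat) could be checked is with a rank correlation. The sketch below computes Spearman's rank correlation using only the Python standard library. The choice of Spearman's coefficient is an assumption made for this example, not a method stated in the dataset description, and the two short lists are made-up illustrative values, not data from the survey.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative counts (NOT survey data): a perfectly decreasing pattern.
native = [1, 2, 3, 4]
exotic = [8, 6, 4, 2]
rho = spearman(native, exotic)
print(rho)
```

A coefficient near -1 would be consistent with a negative relationship, and a coefficient near 0 would not; the sketch does not compute a significance test.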
Time Data Was Collected: Tuesday, October 4, 2016. 3:20-4:35 EST.
Location of Data: Grassland and adjacent disturbed open space at York University, near York Boulevard. Latitude and longitude: 43.775, -79.505 (grassland) and 43.775, -79.5055 (disturbed grassland).
Weather Conditions: 19 degrees Celsius; clear, sunny, and dry conditions.
Survey Method:
A. The field experiment was conducted using the quadrat sampling technique. A 50m transect was placed on the ground. An individual walked along this transect and placed a quadrat every 2 meters, alternating left and right along the transect. Each time a quadrat was placed, the total number of exotic and native plants, as well as the total number of flowers within the quadrat, were counted and recorded. This sampling was repeated for a total of 25 quadrats. The same method was used in the disturbed open space. However, in order to estimate the abundance of exotic plants there, each quadrat was visually divided into 16 sections. The exotic plants were counted in 1 section, and this count was then used to estimate the abundance in the entire quadrat.
B. This experiment was completed using a transect. 25 replicates were conducted in the grassland and 25 replicates in the disturbed open area. Every two meters along a 50-meter transect, the abundance of woody plants was measured within 0.5 meters on either side of the transect. At these points, canopy coverage was estimated by making a square with the fingers, holding it up, and judging how much of the sky was visible. Ground coverage was estimated using the same method. To measure the ground coverage, the visual area was divided into quadrats and the covered area was summed. At the end, the total number of flowers was also recorded.
C. A 50-meter transect was established perpendicular to the periphery of the grassland. Standing at the beginning of the transect, a 50-meter radius was surveyed. The number of vertebrates, the types of vertebrate species, and the number of people (who were not members of the BIOL 2050 Ecology Lab) observed in 15 minutes were recorded. Then, a 5-meter radius from the beginning of the transect was surveyed for 15 minutes. The number of individual invertebrates observed within this 5-meter radius in 15 minutes was recorded.
The above procedure was repeated in the adjacent disturbed open space.
D. 12 bowls with a 6cm diameter and 5cm depth were filled with soapy water and placed at 3m intervals, with 6 bowls in the grassland and 6 in the disturbed open space. The bowl colors were alternated between yellow, blue, and white. The bowls were left for 30 minutes and then the number of invertebrates in each bowl was counted. During the 30-minute wait for the pan trap bowls, 50 meters were walked while swinging a sweep net with a diameter of 32 cm and a depth of 73 cm from side to side. At the end of the 50 meters, the number of invertebrates captured inside the net was counted and the invertebrates were then released. This method was repeated 10 times in the grassland and 10 times in the disturbed open space.
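The quadrat layout and the subsample estimate described in Survey Method A can be sketched as follows. This is a minimal illustration under stated assumptions: the first quadrat is assumed to sit at 0 m on the left side (the description does not say where placement starts or on which side), and the estimate simply multiplies the count from one section by 16, which assumes exotic plants are spread evenly across the quadrat. The function names are hypothetical.

```python
def quadrat_positions(transect_length_m=50, spacing_m=2, first_side="left"):
    """Return (distance_m, side) for each quadrat placed along the transect.

    Assumes the first quadrat is at 0 m; the starting point and starting
    side are not given in the survey description.
    """
    sides = ("left", "right") if first_side == "left" else ("right", "left")
    n_quadrats = transect_length_m // spacing_m
    return [(i * spacing_m, sides[i % 2]) for i in range(n_quadrats)]


def estimate_quadrat_abundance(section_count, n_sections=16):
    """Scale a count from one visual section up to the whole quadrat.

    Assumes plants are evenly distributed across all sections.
    """
    return section_count * n_sections


positions = quadrat_positions()
print(len(positions))   # number of quadrats along one transect
print(positions[:3])    # first three placements
print(estimate_quadrat_abundance(3))  # e.g. 3 plants counted in one section
```

With a 50 m transect and 2 m spacing, the layout yields the 25 quadrats per habitat reported above.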
Equipment Used:
transects
12 plastic bowls with a 6cm diameter and 5cm depth. 4 bowls were blue, 4 bowls were white, and 4 bowls were yellow.
1 sweep net with a diameter of 32 cm and a depth of 73 cm was used to conduct all the sweep net trials.
quadrats measuring 1 meter by 1 meter
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Data Column Headers and Descriptions:
census: refers to the number of weeks of observations that have taken place to date for the campus ecology survey in BIOL 2050 Lab Section 02. Continuous (numerical) variable.
habitat: refers to the habitat in which observations were recorded. The habitat that was assessed is a forest. Forest is defined as a woodlot on campus.
lat: refers to the latitude at which observations were recorded. Latitude was approximated using Google Maps.
elevation: refers to the elevation at which observations were recorded.
long: refers to the longitude at which observations were recorded. Longitude was approximated using Google Maps.
rep: refers to the replicate; each replicate is a repetition of the experimental condition.
abundance.native.plants: refers to the number of native plants. Native plants are defined as those plants that develop naturally or existed in the area for a long time. Continuous (numerical) variable.
abundance.exotic.plants: refers to the number of exotic plants. Exotic plants are defined as those plants that do not originate from the location of the study. Continuous (numerical) variable.
total.number.flowers (quadrat): refers to the total number of flowers counted in a quadrat. Continuous (numerical) variable.
abundance.woody.plants: refers to the total number of woody plants counted. Woody plant is defined as a plant that is greater than 1.5 meters in height. Continuous (numerical) variable.
canopy.cover: refers to the coverage area of the projecting tree crown. Expressed as a percentage. Continuous (numerical) variable.
ground.cover: refers to the estimated proportion of vegetative ground cover. Expressed as a percentage. Continuous (numerical) variable.
total.flower.number (transect): refers to the number of flowers counted within 0.5 meters of the transect. Continuous (numerical) variable.
abundance.vertebrates: the number of vertebrates (animals) observed within a 50-meter radius in a 15-minute interval. Continuous (numerical) variable.
vertebrate.richness: refers to the number of species of vertebrates that were observed. Categorical variable.
abundance.human: the number of humans observed within a 50-meter radius in a 15-minute interval. These humans were only recorded if they did not belong to the BIOL 2050 laboratory. Continuous (numerical) variable.
abundance.invertebrates.pantraps: refers to the number of invertebrates counted in the soapy-water-filled bowls after 30 minutes. Continuous (numerical) variable.
abundance.invertebrates.sweeps: refers to the number of invertebrates counted inside the sweep nets after swinging for 50m. Continuous (numerical) variable.
abundance.invertebrates.observed: the number of invertebrates (insects) observed within a 5-meter radius in a 15-minute interval. Continuous (numerical) variable.
Additional Information:
Hypothesis and Predictions:
A. It was hypothesized that fewer invertebrates and vertebrates will be found in the forest as temperature decreases.
It is predicted that fewer vertebrates will be observed in the forest on October 25th, 2016 than were observed on September 27th, 2016 after surveying a 50-meter radius for 15 minutes. It is also predicted that fewer invertebrates will be observed in the forest on October 25th, 2016 than were observed on September 27th, 2016 after surveying a 5-meter radius for 15 minutes.
It is predicted that there will be fewer invertebrates in the pan traps on October 25th, 2016 than there were on September 27th, 2016. It is also predicted that there will be fewer invertebrates in the sweep nets on October 25th, 2016 than there were on September 27th, 2016.
B. It is hypothesized that there will be a negative relationship between the two variables (abundance of native plants and abundance of exotic plants) in the forest, because the plant species will compete with each other for resources (light, space, food).
It is predicted that as the abundance of native plants increases, fewer exotic plants will be present in the forest.
C. It was hypothesized that there is a negative correlation between the number of woody plants and the total number of flowers in the forest. It was predicted that the number of woody plants will be higher than the number of flowers. The reason is that the percentage of canopy coverage is higher near the woody plants, which blocks the sunlight needed for the growth of the vegetation and flowers below the trees. Sunlight is essential for the growth and photosynthesis of flowers, so less exposure will result in less growth of the small plants.
Time Data Was Collected: Tuesday, October 25, 2016. 2:45-3:15 EST.
Location of Data: Forest at York University. Latitude and longitude: 43.768756, -79.5079 and 126.30201.
Weather Conditions: 9 degrees Celsius, sunny, clear conditions.
Survey Method:
A. The field experiment was conducted using the quadrat sampling technique. A 50m transect was placed on the ground in the forest. An individual would walk along this transect and place a quadrat every 2 meters, alternating left and right along the transect. Each time the quadrats were placed, the total number of exotic and native plants, as well as the total number of flowers within the quadrats were counted and recorded. This sampling technique was repeated for a total of 25 times.
B. This experiment was completed using transects. 25 replicates were conducted in the forest. Every two meters along a 50-meter transect, the abundance of woody plants was measured within 0.5 meters on either side of the transect. At these points, canopy coverage was estimated by making a square with the fingers, holding it up, and judging how much of the sky was visible. Ground coverage was estimated using the same method. To measure the ground coverage, the visual area was divided into quadrats and the covered area was summed. At the end, the total number of flowers was also recorded.
C. A 50-meter transect was established perpendicular to the periphery of the forest. Standing at the beginning of the transect, a 50-meter radius was surveyed. The number of vertebrates, the types of vertebrate species, and the number of people (who were not members of the BIOL 2050 Ecology Lab) observed in 15 minutes were recorded. Then, a 5-meter radius from the beginning of the transect was surveyed for 15 minutes. The number of individual invertebrates observed within this 5-meter radius in 15 minutes was recorded.
D. 12 bowls with a 6cm diameter and 5cm depth were filled with soapy water and placed at 3m intervals in the forest. The bowl colors were alternated between yellow, blue, and white. The bowls were left for 30 minutes and then the number of invertebrates in each bowl was counted. During the 30-minute wait for the pan trap bowls, 50 meters were walked while swinging a sweep net with a diameter of 32 cm and a depth of 73 cm from side to side. At the end of the 50 meters, the number of invertebrates captured inside the net was counted and the invertebrates were then released back into the forest.
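Because the point surveys in Survey Method C use different radii (50 meters for vertebrates and people, 5 meters for invertebrates), the raw counts cover very different areas. The sketch below shows one possible way to put such counts on a common footing by converting them to a density per 100 square meters. This conversion is not part of the dataset; it assumes a full circular survey area and equal detectability across the whole circle, and the function names are hypothetical.

```python
import math


def survey_area_m2(radius_m):
    """Area of a circular survey plot with the given radius, in square meters."""
    return math.pi * radius_m ** 2


def density_per_100_m2(count, radius_m):
    """Convert a raw count to individuals per 100 square meters.

    Assumes the whole circle was surveyed and that individuals were
    equally detectable everywhere inside it.
    """
    return 100 * count / survey_area_m2(radius_m)


vertebrate_area = survey_area_m2(50)   # 50-meter radius survey
invertebrate_area = survey_area_m2(5)  # 5-meter radius survey
print(round(vertebrate_area, 1), round(invertebrate_area, 1))
print(round(vertebrate_area / invertebrate_area))  # ratio of the two areas
```

The 50-meter survey covers 100 times the area of the 5-meter survey, so equal raw counts in the two surveys would correspond to very different densities.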
Equipment Used:
transects
12 plastic bowls with a 6cm diameter and 5cm depth. 4 bowls were blue, 4 bowls were white, and 4 bowls were yellow.
1 sweep net with a diameter of 32 cm and a depth of 73 cm was used to conduct all the sweep net trials.
quadrats measuring 1 meter by 1 meter