5 datasets found

S
Two residential districts datasets from Kielce, Poland for building semantic...
scidb.cn
Updated Sep 29, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Agnieszka Łysak (2022). Two residential districts datasets from Kielce, Poland for building semantic segmentation task [Dataset]. http://doi.org/10.57760/sciencedb.02955
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.57760/sciencedb.02955
Dataset updated
Sep 29, 2022
Dataset provided by
Science Data Bank
Authors
Agnieszka Łysak
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Area covered
Kielce, Poland
Description
Today, deep neural networks are widely used in many computer vision problems, also for geographic information systems (GIS) data. This type of data is commonly used for urban analyzes and spatial planning. We used orthophotographic images of two residential districts from Kielce, Poland for research including urban sprawl automatic analysis with Transformer-based neural network application.Orthophotomaps were obtained from Kielce GIS portal. Then, the map was manually masked into building and building surroundings classes. Finally, the ortophotomap and corresponding classification mask were simultaneously divided into small tiles. This approach is common in image data preprocessing for machine learning algorithms learning phase. Data contains two original orthophotomaps from Wietrznia and Pod Telegrafem residential districts with corresponding masks and also their tiled version, ready to provide as a training data for machine learning models.Transformed-based neural network has undergone a training process on the Wietrznia dataset, targeted for semantic segmentation of the tiles into buildings and surroundings classes. After that, inference of the models was used to test model's generalization ability on the Pod Telegrafem dataset. The efficiency of the model was satisfying, so it can be used in automatic semantic building segmentation. Then, the process of dividing the images can be reversed and complete classification mask retrieved. This mask can be used for area of the buildings calculations and urban sprawl monitoring, if the research would be repeated for GIS data from wider time horizon.Since the dataset was collected from Kielce GIS portal, as the part of the Polish Main Office of Geodesy and Cartography data resource, it may be used only for non-profit and non-commertial purposes, in private or scientific applications, under the law "Ustawa z dnia 4 lutego 1994 r. o prawie autorskim i prawach pokrewnych (Dz.U. z 2006 r. nr 90 poz 631 z późn. zm.)". There are no other legal or ethical considerations in reuse potential.Data information is presented below.wietrznia_2019.jpg - orthophotomap of Wietrznia districtmodel's - used for training, as an explanatory imagewietrznia_2019.png - classification mask of Wietrznia district - used for model's training, as a target imagewietrznia_2019_validation.jpg - one image from Wietrznia district - used for model's validation during training phasepod_telegrafem_2019.jpg - orthophotomap of Pod Telegrafem district - used for model's evaluation after training phasewietrznia_2019 - folder with wietrznia_2019.jpg (image) and wietrznia_2019.png (annotation) images, divided into 810 tiles (512 x 512 pixels each), tiles with no information were manually removed, so the training data would contain only informative tilestiles presented - used for the model during training (images and annotations for fitting the model to the data)wietrznia_2019_vaidation - folder with wietrznia_2019_validation.jpg image divided into 16 tiles (256 x 256 pixels each) - tiles were presented to the model during training (images for validation model's efficiency); it was not the part of the training datapod_telegrafem_2019 - folder with pod_telegrafem.jpg image divided into 196 tiles (256 x 265 pixels each) - tiles were presented to the model during inference (images for evaluation model's robustness)Dataset was created as described below.Firstly, the orthophotomaps were collected from Kielce Geoportal (https://gis.kielce.eu). Kielce Geoportal offers a .pst recent map from April 2019. It is an orthophotomap with a resolution of 5 x 5 pixels, constructed from a plane flight at 700 meters over ground height, taken with a camera for vertical photos. Downloading was done by WMS in open-source QGIS software (https://www.qgis.org), as a 1:500 scale map, then converted to a 1200 dpi PNG image.Secondly, the map from Wietrznia residential district was manually labelled, also in QGIS, in the same scope, as the orthophotomap. Annotation based on land cover map information was also obtained from Kielce Geoportal. There are two classes - residential building and surrounding. Second map, from Pod Telegrafem district was not annotated, since it was used in the testing phase and imitates situation, where there is no annotation for the new data presented to the model.Next, the images was converted to an RGB JPG images, and the annotation map was converted to 8-bit GRAY PNG image.Finally, Wietrznia data files were tiled to 512 x 512 pixels tiles, in Python PIL library. Tiles with no information or a relatively small amount of information (only white background or mostly white background) were manually removed. So, from the 29113 x 15938 pixels orthophotomap, only 810 tiles with corresponding annotations were left, ready to train the machine learning model for the semantic segmentation task. Pod Telegrafem orthophotomap was tiled with no manual removing, so from the 7168 x 7168 pixels ortophotomap were created 197 tiles with 256 x 256 pixels resolution. There was also image of one residential building, used for model's validation during training phase, it was not the part of the training data, but was a part of Wietrznia residential area. It was 2048 x 2048 pixel ortophotomap, tiled to 16 tiles 256 x 265 pixels each.
d
Residential Schools Locations Dataset (Geodatabase)
search.dataone.org
borealisdata.ca
Updated Dec 28, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Orlandini, Rosa (2023). Residential Schools Locations Dataset (Geodatabase) [Dataset]. http://doi.org/10.5683/SP2/JFQ1SZ
Explore at:
Unique identifier
https://doi.org/10.5683/SP2/JFQ1SZ
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Orlandini, Rosa
Time period covered
Jan 1, 1863 - Jun 30, 1998
Description
The Residential Schools Locations Dataset in Geodatabase format (IRS_Locations.gbd) contains a feature layer "IRS_Locations" that contains the locations (latitude and longitude) of Residential Schools and student hostels operated by the federal government in Canada. All the residential schools and hostels that are listed in the Residential Schools Settlement Agreement are included in this dataset, as well as several Industrial schools and residential schools that were not part of the IRRSA. This version of the dataset doesn’t include the five schools under the Newfoundland and Labrador Residential Schools Settlement Agreement. The original school location data was created by the Truth and Reconciliation Commission, and was provided to the researcher (Rosa Orlandini) by the National Centre for Truth and Reconciliation in April 2017. The dataset was created by Rosa Orlandini, and builds upon and enhances the previous work of the Truth and Reconcilation Commission, Morgan Hite (creator of the Atlas of Indian Residential Schools in Canada that was produced for the Tk'emlups First Nation and Justice for Day Scholar's Initiative, and Stephanie Pyne (project lead for the Residential Schools Interactive Map). Each individual school location in this dataset is attributed either to RSIM, Morgan Hite, NCTR or Rosa Orlandini. Many schools/hostels had several locations throughout the history of the institution. If the school/hostel moved from its’ original location to another property, then the school is considered to have two unique locations in this dataset,the original location and the new location. For example, Lejac Indian Residential School had two locations while it was operating, Stuart Lake and Fraser Lake. If a new school building was constructed on the same property as the original school building, it isn't considered to be a new location, as is the case of Girouard Indian Residential School.When the precise location is known, the coordinates of the main building are provided, and when the precise location of the building isn’t known, an approximate location is provided. For each residential school institution location, the following information is provided: official names, alternative name, dates of operation, religious affiliation, latitude and longitude coordinates, community location, Indigenous community name, contributor (of the location coordinates), school/institution photo (when available), location point precision, type of school (hostel or residential school) and list of references used to determine the location of the main buildings or sites. Access Instructions: there are 47 files in this data package. Please download the entire data package by selecting all the 47 files and click on download. Two files will be downloaded, IRS_Locations.gbd.zip and IRS_LocFields.csv. Uncompress the IRS_Locations.gbd.zip. Use QGIS, ArcGIS Pro, and ArcMap to open the feature layer IRS_Locations that is contained within the IRS_Locations.gbd data package. The feature layer is in WGS 1984 coordinate system. There is also detailed file level metadata included in this feature layer file. The IRS_locations.csv provides the full description of the fields and codes used in this dataset.
Z
Spatial distribution of housing rental value in Amsterdam 1647-1652
data.niaid.nih.gov
zenodo.org
Updated Jul 15, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Li, Weixuan (2024). Spatial distribution of housing rental value in Amsterdam 1647-1652 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7473119
Explore at:
Dataset updated
Jul 15, 2024
Dataset authored and provided by
Li, Weixuan
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Amsterdam
Description
This dataset visualises the spatial distribution of the rental value in Amsterdam between 1647 and 1652. The source of rental value comes from the Verponding registration in Amsterdam. The verponding or the ‘Verpondings-quohieren van den 8sten penning’ was a tax in the Netherlands on the 8th penny of the rental value of immovable property that had to be paid annually. In Amsterdam, the citywide verponding registration started in 1647 and continued into the early 19th century. With the introduction of the cadastre system in 1810, the verponding came to an end.

The original tax registration is kept in the Amsterdam City Archives (Archief nr. 5044) and the four registration books transcribed in this dataset are Archief 5044, inventory 255, 273, 281, 284. The verponding was collected by districts (wijken). The tax collectors documented their collecting route by writing down the street or street-section names as they proceed. For each property, the collector wrote down the names of the owner and, if applicable, the renter (after ‘per’), and the estimated rental value of the property (in guilders). Next to the rental value was the tax charged (in guilders and stuivers). Below the owner/renter names and rental value were the records of tax payments by year.

This dataset digitises four registration books of the verponding between 1647 and 1652 in two ways. First, it transcribes the rental value of all real estate properties listed in the registrations. The names of the owners/renters are transcribed only selectively, focusing on the properties that exceeded an annual rental value of 300 guilders. These transcriptions can be found in Verponding1647-1652.csv. For a detailed introduction to the data, see Verponding1647-1652_data_introduction.txt.

Second, it geo-references the registrations based on the street names and the reconstruction of tax collectors’ travel routes in the verponding. The tax records are then plotted on the historical map of Amsterdam using the first cadaster of 1832 as a reference. Since the geo-reference is based on the street or street sections, the location of each record/house may not be the exact location but rather a close proximation of the possible locations based on the street names and the sequence of the records on the same street or street section. Therefore, this geo-referenced verponding can be used to visualise the rental value distribution in Amsterdam between 1647 and 1652. The preview below shows an extrapolation of rental values in Amsterdam. And for the geo-referenced GIS files, see Verponding_wijken.shp.

GIS specifications:

Coordination Reference System (CRS): Amersfoort/RD New (ESPG:28992)

Historical map tiles URL (From Amsterdam Time Machine)

NB: This verponding dataset is a provisional version. The georeferenced points and the name transcriptions might contain errors and need to be treated with caution.

Contributors

Historical and archival research: Weixuan Li, Bart Reuvekamp

Plotting of geo-referenced points: Bart Reuvekamp

Spatial analysis: Weixuan Li

Mapping software: QGIS

Acknowledgements: Virtual Interiors project, Daan de Groot
K
NZ Populated Places - Polygons
koordinates.com
csv, dwg, geodatabase +6
Updated Jun 16, 2011
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Peter Scott (2011). NZ Populated Places - Polygons [Dataset]. https://koordinates.com/layer/3658-nz-populated-places-polygons/
Explore at:
kml, csv, dwg, mapinfo tab, pdf, geodatabase, shapefile, mapinfo mif, geopackage / sqliteAvailable download formats
Dataset updated
Jun 16, 2011
Authors
Peter Scott
Area covered
Description
ps-places-metadata-v1.01

SUMMARY

This dataset comprises a pair of layers, (points and polys) which attempt to better locate "populated places" in NZ. Populated places are defined here as settled areas, either urban or rural where densitys of around 20 persons per hectare exist, and something is able to be seen from the air.

RATIONALE

The only liberally licensed placename dataset is currently LINZ geographic placenames, which has the following drawbacks: - coordinates are not place centers but left most label on 260 series map - the attributes are outdated

METHODOLOGY

This dataset necessarily involves cleaving the linz placenames set into two, those places that are poplulated, and those unpopulated. Work was carried out in four steps. First placenames were shortlisted according to the following criterion: - all places that rated at least POPL in the linz geographic places layer, ie POPL, METR or TOWN or USAT were adopted. - Then many additional points were added from a statnz meshblock density analysis.
- Finally remaining points were added from a check against linz residential polys, and zenbu poi clusters.

Spelling is broadly as per linz placenames, but there are differences for no particular reason. Instances of LINZ all upper case have been converted to sentance case. Some places not presently in the linz dataset are included in this set, usually new places, or those otherwise unnamed. They appear with no linz id, and are not authoritative, in some cases just wild guesses.

Density was derived from the 06 meshblock boundarys (level 2, geometry fixed), multipart conversion, merging in 06 usually resident MB population then using the formula pop/area*10000. An initial urban/rural threshold level of 0.6 persons per hectare was used.

Step two was to trace the approx extent of each populated place. The main purpose of this step was to determine the relative area of each place, and to create an intersection with meshblocks for population. Step 3 involved determining the political center of each place, broadly defined as the commercial center.

Tracing was carried out at 1:9000 for small places, and 1:18000 for large places using either bing or google satellite views. No attempt was made to relate to actual town 'boundarys'. For example large parks or raceways on the urban fringe were not generally included. Outlying industrial areas were included somewhat erratically depending on their connection to urban areas.

Step 3 involved determining the centers of each place. Points were overlaid over the following layers by way of a base reference:

a. original linz placenames b. OSM nz-locations points layer c. zenbu pois, latest set as of 5/4/11 d. zenbu AllSuburbsRegions dataset (a heavily hand modified) LINZ BDE extract derived dataset courtesy Zenbu. e. LINZ road-centerlines, sealed and highway f. LINZ residential areas, g. LINZ building-locations and building footprints h. Olivier and Co nz-urban-north and south

Therefore in practice, sources c and e, form the effective basis of the point coordinates in this dataset. Be aware that e, f and g are referenced to the LINZ topo data, while c and d are likely referenced to whatever roading dataset google possesses. As such minor discrepencys may occur when moving from one to the other.

Regardless of the above, this place centers dataset was created using the following criteria, in order of priority:

attempts to represent the present (2011) subjective 'center' of each place as defined by its commercial/retail center ie. mainstreets where they exist, any kind of central retail cluster, even a single shop in very small places.

the coordinate is almost always at the junction of two or more roads.

most of the time the coordinate is at or near the centroid of the poi cluster

failing any significant retail presence, the coordinate tends to be placed near the main road junction to the community.

when the above criteria fail to yield a definitive answer, the final criteria involves the centroids of: . the urban polygons . the clusters of building footprints/locations.

To be clear the coordinates are manually produced by eye without any kind of computation. As such the points are placed approximately perhaps plus or minus 10m, but given that the roads layers are not that flash, no attempt was made to actually snap the coordinates to the road junctions themselves.

The final step involved merging in population from SNZ meshblocks (merge+sum by location) of popl polys). Be aware that due to the inconsistent way that meshblocks are defined this will result in inaccurate populations, particular small places will collect population from their surrounding area. In any case the population will generally always overestimate by including meshblocks that just nicked the place poly. Also there are a couple of dozen cases of overlapping meshblocks between two place polys and these will double count. Which i have so far made no attempt to fix.

Merged in also tla and regions from SNZ shapes, a few of the original linz atrributes, and lastly grading the size of urban areas according to SNZ 'urban areas" criteria. Ie: class codes:

Not used.

main urban area 30K+

secondary urban area 10k-30K

minor urban area 1k-10k

rural center 300-1K

village -300

Note that while this terminology is shared with SNZ the actual places differ owing to different decisions being made about where one area ends an another starts, and what constiutes a suburb or satellite. I expect some discussion around this issue. For example i have included tinwald and washdyke as part of ashburton and timaru, but not richmond or waikawa as part of nelson and picton. Im open to discussion on these.

No attempt has or will likely ever be made to locate the entire LOC and SBRB data subsets. We will just have to wait for NZFS to release what is thought to be an authoritative set.

PROJECTION

Shapefiles are all nztm. Orig data from SNZ and LINZ was all sourced in nztm, via koordinates, or SNZ. Satellite tracings were in spherical mercator/wgs84 and converted to nztm by Qgis. Zenbu POIS were also similarly converted.

ATTRIBUTES

Shapefile: Points id : integer unique to dataset name : name of popl place, string class : urban area size as above. integer tcode : SNZ tla code, integer rcode : SNZ region code, 1-16, integer area : area of poly place features, integer in square meters. pop : 2006 usually resident popluation, being the sum of meshblocks that intersect the place poly features. Integer lid : linz geog places id desc_code : linz geog places place type code

Shapefile: Polygons gid : integer unique to dataset, shared by points and polys name : name of popl place, string, where spelling conflicts occur points wins area : place poly area, m2 Integer

LICENSE

Clarification about the minorly derived nature of LINZ and google data needs to be sought. But pending these copyright complications, the actual points data is essentially an original work, released as public domain. I retain no copyright, nor any responsibility for data accuracy, either as is, or regardless of any changes that are subsequently made to it.

Peter Scott 16/6/2011

v1.01 minor spelling and grammar edits 17/6/11
n
NSW Administrative Boundaries | Dataset | SEED
datasets.seed.nsw.gov.au
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
NSW Administrative Boundaries | Dataset | SEED [Dataset]. https://datasets.seed.nsw.gov.au/dataset/nsw-administrative-boundaries
Explore at:
License
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Area covered
New South Wales
Description
The NSW Administrative Boundaries Web Service is a dynamic map of administrative and property boundaries. Administrative Areas Boundaries depict a polygon feature class within the NSW Digital Cadastral Database maintained by Spatial Services (DCS). The administrative boundaries provided through this web service includes: Counties, Suburbs, Parishes, Local Government Areas, State Forests, National Parks, State Electoral Districts.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Agnieszka Łysak (2022). Two residential districts datasets from Kielce, Poland for building semantic segmentation task [Dataset]. http://doi.org/10.57760/sciencedb.02955

Two residential districts datasets from Kielce, Poland for building semantic segmentation task

Explore at:

282 scholarly articles cite this dataset (View in Google Scholar)

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Unique identifier

https://doi.org/10.57760/sciencedb.02955

Dataset updated

Sep 29, 2022

Dataset provided by

Science Data Bank

Authors

Agnieszka Łysak

License

Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically

Area covered

Kielce, Poland

Description

Today, deep neural networks are widely used in many computer vision problems, also for geographic information systems (GIS) data. This type of data is commonly used for urban analyzes and spatial planning. We used orthophotographic images of two residential districts from Kielce, Poland for research including urban sprawl automatic analysis with Transformer-based neural network application.Orthophotomaps were obtained from Kielce GIS portal. Then, the map was manually masked into building and building surroundings classes. Finally, the ortophotomap and corresponding classification mask were simultaneously divided into small tiles. This approach is common in image data preprocessing for machine learning algorithms learning phase. Data contains two original orthophotomaps from Wietrznia and Pod Telegrafem residential districts with corresponding masks and also their tiled version, ready to provide as a training data for machine learning models.Transformed-based neural network has undergone a training process on the Wietrznia dataset, targeted for semantic segmentation of the tiles into buildings and surroundings classes. After that, inference of the models was used to test model's generalization ability on the Pod Telegrafem dataset. The efficiency of the model was satisfying, so it can be used in automatic semantic building segmentation. Then, the process of dividing the images can be reversed and complete classification mask retrieved. This mask can be used for area of the buildings calculations and urban sprawl monitoring, if the research would be repeated for GIS data from wider time horizon.Since the dataset was collected from Kielce GIS portal, as the part of the Polish Main Office of Geodesy and Cartography data resource, it may be used only for non-profit and non-commertial purposes, in private or scientific applications, under the law "Ustawa z dnia 4 lutego 1994 r. o prawie autorskim i prawach pokrewnych (Dz.U. z 2006 r. nr 90 poz 631 z późn. zm.)". There are no other legal or ethical considerations in reuse potential.Data information is presented below.wietrznia_2019.jpg - orthophotomap of Wietrznia districtmodel's - used for training, as an explanatory imagewietrznia_2019.png - classification mask of Wietrznia district - used for model's training, as a target imagewietrznia_2019_validation.jpg - one image from Wietrznia district - used for model's validation during training phasepod_telegrafem_2019.jpg - orthophotomap of Pod Telegrafem district - used for model's evaluation after training phasewietrznia_2019 - folder with wietrznia_2019.jpg (image) and wietrznia_2019.png (annotation) images, divided into 810 tiles (512 x 512 pixels each), tiles with no information were manually removed, so the training data would contain only informative tilestiles presented - used for the model during training (images and annotations for fitting the model to the data)wietrznia_2019_vaidation - folder with wietrznia_2019_validation.jpg image divided into 16 tiles (256 x 256 pixels each) - tiles were presented to the model during training (images for validation model's efficiency); it was not the part of the training datapod_telegrafem_2019 - folder with pod_telegrafem.jpg image divided into 196 tiles (256 x 265 pixels each) - tiles were presented to the model during inference (images for evaluation model's robustness)Dataset was created as described below.Firstly, the orthophotomaps were collected from Kielce Geoportal (https://gis.kielce.eu). Kielce Geoportal offers a .pst recent map from April 2019. It is an orthophotomap with a resolution of 5 x 5 pixels, constructed from a plane flight at 700 meters over ground height, taken with a camera for vertical photos. Downloading was done by WMS in open-source QGIS software (https://www.qgis.org), as a 1:500 scale map, then converted to a 1200 dpi PNG image.Secondly, the map from Wietrznia residential district was manually labelled, also in QGIS, in the same scope, as the orthophotomap. Annotation based on land cover map information was also obtained from Kielce Geoportal. There are two classes - residential building and surrounding. Second map, from Pod Telegrafem district was not annotated, since it was used in the testing phase and imitates situation, where there is no annotation for the new data presented to the model.Next, the images was converted to an RGB JPG images, and the annotation map was converted to 8-bit GRAY PNG image.Finally, Wietrznia data files were tiled to 512 x 512 pixels tiles, in Python PIL library. Tiles with no information or a relatively small amount of information (only white background or mostly white background) were manually removed. So, from the 29113 x 15938 pixels orthophotomap, only 810 tiles with corresponding annotations were left, ready to train the machine learning model for the semantic segmentation task. Pod Telegrafem orthophotomap was tiled with no manual removing, so from the 7168 x 7168 pixels ortophotomap were created 197 tiles with 256 x 256 pixels resolution. There was also image of one residential building, used for model's validation during training phase, it was not the part of the training data, but was a part of Wietrznia residential area. It was 2048 x 2048 pixel ortophotomap, tiled to 16 tiles 256 x 265 pixels each.

Clear search

Close search

Google apps

Main menu

Two residential districts datasets from Kielce, Poland for building semantic...

Residential Schools Locations Dataset (Geodatabase)

Spatial distribution of housing rental value in Amsterdam 1647-1652

NZ Populated Places - Polygons

SUMMARY

RATIONALE

METHODOLOGY

PROJECTION

ATTRIBUTES

LICENSE

NSW Administrative Boundaries | Dataset | SEED

Two residential districts datasets from Kielce, Poland for building semantic segmentation task