License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Overview
This dataset is the repository for the following paper submitted to Data in Brief:
Kempf, M. A dataset to model Levantine landcover and land-use change connected to climate change, the Arab Spring and COVID-19. Data in Brief (submitted: December 2023).
The Data in Brief article contains the supplement information and is the related data paper to:
Kempf, M. Climate change, the Arab Spring, and COVID-19 - Impacts on landcover transformations in the Levant. Journal of Arid Environments (revision submitted: December 2023).
Description/abstract
The Levant region is highly vulnerable to climate change, experiencing prolonged heat waves that have led to societal crises and population displacement. Since 2010, the area has been marked by socio-political turmoil, including the Syrian civil war and currently the escalation of the so-called Israeli-Palestinian Conflict, which has strained neighbouring countries like Jordan through the influx of Syrian refugees and has increased the population's vulnerability to governmental decision-making. Jordan, in particular, has seen rapid population growth and significant changes in land-use and infrastructure, leading to over-exploitation of the landscape through irrigation and construction. This dataset uses climate data, satellite imagery, and land cover information to illustrate the substantial increase in construction activity and highlights the intricate relationship between climate change predictions and current socio-political developments in the Levant.
Folder structure
The main folder after download contains all data; the following subfolders are stored as zipped files:
“code” stores the 9 code chunks described below, which read, extract, process, analyse, and visualize the data.
“MODIS_merged” contains the 16-day, 250 m resolution NDVI imagery merged from three tiles (h20v05, h21v05, h21v06) and cropped to the study area (n=510), covering January 2001 to December 2022 plus January and February 2023.
“mask” contains a single shapefile, which is the merged product of administrative boundaries, including Jordan, Lebanon, Israel, Syria, and Palestine (“MERGED_LEVANT.shp”).
“yield_productivity” contains .csv files of yield information for all countries listed above.
“population” contains two files with the same name but different format. The .csv file is for processing and plotting in R. The .ods file is for enhanced visualization of population dynamics in the Levant (Socio_cultural_political_development_database_FAO2023.ods).
“GLDAS” stores the raw data of the NASA Global Land Data Assimilation System datasets that can be read, extracted (variable name), and processed using code “8_GLDAS_read_extract_trend” from the respective folder. One folder contains data from 1975-2022 and a second the additional January and February 2023 data.
“built_up” contains the landcover and built-up change data from 1975 to 2022. This folder is subdivided into two subfolders: “raw_data” contains the unprocessed datasets and “derived_data” stores the cropped built_up datasets at 5-year intervals, e.g., “Levant_built_up_1975.tif”.
Code structure
1_MODIS_NDVI_hdf_file_extraction.R
This first code chunk extracts MODIS data from the .hdf file format. The required packages must be installed, and the raw data must be downloaded using a simple mass downloader, e.g., the one built into Google Chrome. Packages: terra. Download the MODIS data after registration from https://lpdaac.usgs.gov/products/mod13q1v061/ or https://search.earthdata.nasa.gov/search (MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061, last accessed 9 October 2023). The code reads a list of files, extracts the NDVI, and saves each layer to a single .tif file labelled “NDVI”. Because the study area is quite large, we have to load three spatially distinct time series and merge them later. Note that the time series are temporally consistent.
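A minimal sketch of this step, assuming terra's GDAL build can read HDF4 and that the NDVI layer can be found by name among the MOD13Q1 subdatasets; the folder names are placeholders, not the repository's actual paths:

```r
# Sketch of the NDVI extraction step (hypothetical folder names).
library(terra)

hdf_files <- list.files("your_directory_MODIS/raw", pattern = "\\.hdf$",
                        full.names = TRUE)

for (f in hdf_files) {
  r    <- rast(f)                               # load all subdatasets
  ndvi <- r[[grep("NDVI", names(r))[1]]]        # keep the NDVI layer
  out  <- sub("\\.hdf$", "_NDVI.tif", basename(f))
  writeRaster(ndvi, file.path("your_directory_MODIS/tif", out),
              overwrite = TRUE)
}
```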
2_MERGE_MODIS_tiles.R
In this code, we load and merge the three different stacks to produce a large, consistent time series of NDVI imagery across the study area. We use the package gtools to load the files in natural order (1, 2, 3, 4, 5, 6, etc.). We have three stacks, of which we first merge two (stack 1, stack 2) and store the result; we then merge this stack with stack 3. We produce single files named NDVI_final_*consecutivenumber*.tif. Before saving the final output of single merged files, create a folder called “merged” and set the working directory to this folder, e.g., setwd("your directory_MODIS/merged").
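A sketch of the merge step under the same assumptions; the per-tile folder names (NDVI_h20v05, etc.) are hypothetical stand-ins for wherever the extracted .tif files live:

```r
# Sketch of the tile-merge step (hypothetical tile folders).
library(terra)
library(gtools)

f1 <- mixedsort(list.files("NDVI_h20v05", pattern = "\\.tif$", full.names = TRUE))
f2 <- mixedsort(list.files("NDVI_h21v05", pattern = "\\.tif$", full.names = TRUE))
f3 <- mixedsort(list.files("NDVI_h21v06", pattern = "\\.tif$", full.names = TRUE))

stack1 <- rast(f1)
stack2 <- rast(f2)
stack3 <- rast(f3)

# Merge stack 1 with stack 2 first, then the result with stack 3,
# writing one file per 16-day time step.
for (i in 1:nlyr(stack1)) {
  m <- merge(merge(stack1[[i]], stack2[[i]]), stack3[[i]])
  writeRaster(m, file.path("your_directory_MODIS/merged",
                           sprintf("NDVI_final_%d.tif", i)),
              overwrite = TRUE)
}
```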
3_CROP_MODIS_merged_tiles.R
Now we want to crop the merged MODIS tiles to our study area. We use a mask, provided as a .shp file in the repository, named "MERGED_LEVANT.shp". We load the merged .tif files and crop the stack with the vector. Saving to individual files, we name them “NDVI_merged_clip_*consecutivenumber*.tif”. We have now produced a single cropped NDVI time series from MODIS.
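The crop itself is a two-step crop-and-mask in terra; a short sketch using the repository's mask file:

```r
# Sketch of the crop step, using the mask shipped in the repository.
library(terra)
library(gtools)

mask_vec <- vect("mask/MERGED_LEVANT.shp")
merged   <- mixedsort(list.files("your_directory_MODIS/merged",
                                 pattern = "^NDVI_final.*\\.tif$",
                                 full.names = TRUE))

for (i in seq_along(merged)) {
  r <- mask(crop(rast(merged[i]), mask_vec), mask_vec)  # clip to the outline
  writeRaster(r, sprintf("NDVI_merged_clip_%d.tif", i), overwrite = TRUE)
}
```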
The repository provides the already clipped and merged NDVI datasets.
4_TREND_analysis_NDVI.R
Now, we want to perform trend analysis on the derived data. The data we load are tricky, as they contain images at a 16-day return period across each year for a period of 22 years. Growing season sums cover MAM (March-May), JJA (June-August), and SON (September-November). December is represented as a single file, which means that the period DJF (December-February) is represented by 5 images instead of 6. For the last DJF period (December 2022), the data from January and February 2023 can be added. The code selects the respective images from the stack, depending on which period is under consideration. From these stacks, individual annually resolved growing season sums are generated and the slope is calculated. We can then extract the p-values of the trend and flag all values significant at the 0.05 level. Using the ggplot2 package and the melt function from the reshape2 package, we can create a plot of the reclassified NDVI trends together with a local smoother (LOESS, value 0.3).
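A condensed sketch of the per-pixel trend step for one season. The layer indices (here, layers 5-10 of each assumed 23-layer MODIS year standing in for MAM) and the lm-based slope/p-value extraction are illustrative assumptions, not the exact chunk from the repository:

```r
library(terra)
library(gtools)

ndvi  <- rast(mixedsort(list.files(".", pattern = "^NDVI_merged_clip.*\\.tif$")))
years <- 2001:2022

# Annually resolved MAM sums; layers 5-10 of each 23-layer MODIS year
# are assumed to fall in March-May.
mam_sum <- rast(lapply(seq_along(years), function(y) {
  sum(ndvi[[(y - 1) * 23 + 5:10]])
}))

# Per-pixel linear trend: slope and p-value of the seasonal sum against year.
trend_fun <- function(v) {
  if (all(is.na(v))) return(c(NA, NA))
  fit <- lm(v ~ years)
  c(coef(fit)[2], summary(fit)$coefficients[2, 4])
}
trend <- app(mam_sum, trend_fun)
names(trend) <- c("slope", "p_value")

# Keep only slopes significant at the 0.05 level.
sig_slope <- mask(trend$slope, trend$p_value < 0.05, maskvalues = FALSE)
```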
To increase comparability and show the amplitude of the trends, z-scores were calculated and plotted; they show the deviation of each value from the long-term mean. This normalization was applied to the NDVI values as well as to the GLDAS climate variables.
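Continuing the sketch above, the z-scores for the study-area mean of each annual sum are simply deviations from the 22-year mean in standard-deviation units:

```r
# Z-scores of the study-area mean of each annual MAM sum (mam_sum from above).
s <- global(mam_sum, "mean", na.rm = TRUE)$mean
z <- (s - mean(s)) / sd(s)
plot(2001:2022, z, type = "b", xlab = "", ylab = "z-score (MAM NDVI sum)")
```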
5_BUILT_UP_change_raster.R
Let us look at the landcover changes now. We are working with the terra package and get the raster data from https://ghsl.jrc.ec.europa.eu/download.php?ds=bu (last accessed 3 March 2023; 100 m resolution, global coverage). One can download the temporal coverage of interest and reclassify it using the code after cropping to the individual study area. Here, I summed the individual rasters to characterize the built-up change as continuous values between 1975 and 2022.
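A sketch of that crop-reclassify-sum sequence, assuming the GHSL epoch files sit in “raw_data” and that any positive built-up surface counts as built (the 0/1 threshold is my assumption, not the repository's reclassification table):

```r
library(terra)

mask_vec <- vect("mask/MERGED_LEVANT.shp")
files    <- list.files("built_up/raw_data", pattern = "\\.tif$", full.names = TRUE)

# Crop each GHSL epoch to the study area, reclassify to 0/1 built-up
# presence (assumed threshold: any positive value is built), then sum
# the epochs so long-established built-up cells score highest.
epochs <- lapply(files, function(f) {
  r <- mask(crop(rast(f), mask_vec), mask_vec)
  classify(r, rbind(c(-Inf, 0, 0), c(0, Inf, 1)))
})
built_change <- sum(rast(epochs))
writeRaster(built_change, "built_up/derived_data/Levant_built_up_change.tif",
            overwrite = TRUE)
```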
6_POPULATION_numbers_plot.R
For this plot, one needs to load the .csv-file “Socio_cultural_political_development_database_FAO2023.csv” from the repository. The ggplot script provided produces the desired plot with all countries under consideration.
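A minimal sketch of such a plot; the long-format column names (year, country, population) are assumptions about the .csv layout:

```r
library(ggplot2)

pop <- read.csv("population/Socio_cultural_political_development_database_FAO2023.csv")

# Assumed long-format columns: year, country, population.
ggplot(pop, aes(x = year, y = population, colour = country)) +
  geom_line() +
  labs(x = NULL, y = "Population", colour = NULL) +
  theme_minimal()
```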
7_YIELD_plot.R
In this section, we use the country productivity data from the supplement in the repository folder “yield_productivity” (e.g., "Jordan_yield.csv"). Each single-country yield dataset is plotted with ggplot, and the plots are combined using the patchwork package in R.
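A sketch of the combination step; only Jordan_yield.csv is named in the text, so the other file names and the year/yield columns are assumptions:

```r
library(ggplot2)
library(patchwork)

# Helper around the assumed per-country file layout (columns: year, yield).
plot_yield <- function(file, title) {
  d <- read.csv(file)
  ggplot(d, aes(year, yield)) +
    geom_line() +
    ggtitle(title) +
    theme_minimal()
}

p1 <- plot_yield("yield_productivity/Jordan_yield.csv",  "Jordan")
p2 <- plot_yield("yield_productivity/Lebanon_yield.csv", "Lebanon")
p3 <- plot_yield("yield_productivity/Israel_yield.csv",  "Israel")

(p1 | p2) / p3   # combine the single-country panels with patchwork
```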
8_GLDAS_read_extract_trend
The last code chunk provides the basis for the trend analysis of the climate variables used in the paper. The raw data can be accessed at https://disc.gsfc.nasa.gov/datasets?keywords=GLDAS%20Noah%20Land%20Surface%20Model%20L4%20monthly&page=1 (last accessed 9 October 2023). The raw data come in .nc file format, and individual variables can be extracted using a [“^variable name”] subset on the SpatRaster collection. Each time you run the code, this variable name must be adjusted to the variable of interest (see this link for abbreviations: https://disc.gsfc.nasa.gov/datasets/GLDAS_CLSM025_D_2.0/summary, last accessed 9 October 2023; alternatively, see the respective code chunk that reads a .nc file with the ncdf4 package in R, run print(nc), or call names() on the SpatRaster collection).
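A sketch of that read-and-subset pattern with terra; "Rainf_f_tavg" is one example GLDAS variable name and must be swapped per run, exactly as described above:

```r
library(terra)

nc_files <- list.files("GLDAS", pattern = "\\.nc4?$", full.names = TRUE)

gldas <- sds(nc_files[1])   # SpatRasterDataset, one entry per variable
names(gldas)                # list the available variable names

# Subset one variable across all monthly files; "Rainf_f_tavg" is an
# example name and must be adjusted per run.
rain <- rast(lapply(nc_files, function(f) {
  s <- sds(f)
  s[[grep("^Rainf_f_tavg", names(s))[1]]]
}))
```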
Choosing one variable, the code uses the MERGED_LEVANT.shp mask from the repository to crop and mask the data to the outline of the study area.
From the processed data, trend analyses are conducted and z-scores are calculated following the code described above. Note that annual trends require the frequency of the time series to be set to value = 12. For rainfall, which is aggregated as annual sums rather than means, the chunk r.sum=r.sum/12 has to be removed or set to r.sum=r.sum/1 to avoid calculating annual mean values (see the other variables). Seasonal subsets can be calculated as described in the code. Here, 3-month subsets were chosen for the growing seasons, i.e., March-May (MAM), June-August (JJA), September-November (SON), and December-February (DJF, including January/February of the consecutive year).
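A tiny sketch of the frequency and sum/mean logic, with a placeholder series standing in for one GLDAS variable (the r.sum name mirrors the chunk described above):

```r
# Placeholder monthly series standing in for one GLDAS variable, 1975-2022.
v    <- rnorm(48 * 12)
v.ts <- ts(v, start = c(1975, 1), frequency = 12)   # frequency = 12 for annual trends

r.sum <- aggregate(v.ts, nfrequency = 1, FUN = sum) # one value per year
r.sum <- r.sum / 12   # annual means for temperature-like variables;
                      # drop this line (or divide by 1) for rainfall sums
```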
From the data, mean values over 48 consecutive years are calculated and trend analyses are performed as described above. In the same way, p-values are extracted, and values at the 95 % confidence level are marked with dots on the raster plot. This analysis can be performed with a much longer time series, other variables, and different spatial extents across the globe thanks to the availability of the GLDAS variables.
Zooplankton Meter3 Data - MOCNESS Only
The Zooplankton Meter3 Database for the Georges Bank GLOBEC project was originally located in the laboratory of Ted Durbin at the Graduate School of Oceanography, University of Rhode Island. It was accessed via the U.S. GLOBEC Georges Bank data management system using SQLPlus network access to the database management system at URI. The data were cached and are served from the local computer.
A description of the original URI database is available online and includes the design and variable definitions. A version of this document is shown here.
Note: Our program's Data Acknowledgement Policy requires that any person making substantial use of a data set must communicate with the investigators who acquired the data prior to publication and anticipate that the data collectors will be co-authors of published results.
The following documentation applies to the data found locally on the WHOI GLOBEC Data Server.
The data are served as a hierarchy. The least changing variables are in higher order levels (e.g., cruise id, year, month, etc.), while variables that change the most are in the lower order levels (e.g., time of collection, net number, taxon collected, etc.). There are six levels within the database; variable names and descriptions are given in the metadata.
Most column variable names and instrument names were taken from the U.S. GLOBEC Georges Bank data thesaurus; those that were not follow the GLOBEC data protocols. The taxonomic code variable (taxon_code) is from the National Oceanographic Data Center's Taxonomic List, version 8. Taxonomic information is built into these ten-digit codes as they reflect the systematic nomenclature.
You may contact BCO-DMO for additional help.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This dataset supersedes all earlier versions of 'Predicted Near Future Climate Change Impacts on the HGL of the ACT'. It incorporates HGL boundary and management area edits based on updated soil landscape mapping for the ACT.

The focus of this dataset is climate change impacts in the Australian Capital Territory. It contains digital spatial data developed to assist in land management decision making in the ACT.

The dataset contains an assessment of climate change impacts on 14 variables defined by the NARCliM (NSW/ACT Regional Climate Modelling) project for three selected regional climate projection ensembles (multimodel mean, CCCMA3.1-R2, ECHAM5-R3). Only near-future (1990-2009 to 2020-2039) projections were considered. Each variable was considered using annual and seasonal time periods. Field names in the dataset follow the format:

Field name = MODEL_NARCliM VARIABLE_TIME PERIOD

Values for each element of the field name are summarised as follows:

MODEL (near future, 1990-2009 to 2020-2039):
C – Consensus (NARCliM Multimodel Consensus Scenario)
W – Wetter (NARCliM CCCMA3.1-R2 Wetter Scenario)
D – Drier (NARCliM ECHAM5-R3 Drier Scenario)

NARCliM VARIABLE:
FFDI – Forest fire danger index
FF50 – Forest fire danger index above 50
FFBC – Forest fire danger index bias corrected
FFBC50 – Forest fire danger index bias corrected above 50
PRAC – Precipitation
PRACBC – Precipitation bias corrected
TAME – Temp mean
TAMX – Temp max
TAMN – Temp min
TAMXBC – Temp max bias corrected
TAMNBC – Temp min bias corrected
TAMX35 – Temp max bias corrected over 35
TAMN2 – Temp min bias corrected below 2
WSSM – Wind speed

TIME PERIOD:
A – Annual
D – DJF
M – MAM
J – JJA
S – SON

Hydrogeological landscape (HGL) unit boundaries developed as part of the broader ACT Hydrogeological Landscapes (HGL) Framework project were used to constrain the outputs for this climate change assessment in the ACT. In all, there are 25 HGL units defined. A weighted mean was used to calculate values for each HGL unit based on the proportions of corresponding 10 km gridded data from the NARCliM data set.

Spatial resolution for this dataset is 1:50 000.
historicalMisc is an experiment of CMIP5, the Coupled Model Intercomparison Project Phase 5 (https://pcmdi.llnl.gov/mips/cmip5). CMIP5 is meant to provide a framework for coordinated climate change experiments for the next five years and thus includes simulations for assessment in the AR5 as well as others that extend beyond the AR5.

Experiment design: https://pcmdi.llnl.gov/mips/cmip5/experiment_design.html
List of output variables: https://pcmdi.llnl.gov/mips/cmip5/datadescription.html
Output: time series per variable in model grid spatial resolution in netCDF format
Earth System model and simulation information: CIM repository

Entry names/titles of data are specified according to the Data Reference Syntax (https://pcmdi.llnl.gov/mips/cmip5/docs/cmip5_data_reference_syntax.pdf) as activity/product/institute/model/experiment/frequency/modeling realm/MIP table/ensemble member/version number/variable name/CMOR filename.nc.

Forcings used in individual ensemble runs (see the attached addinfo for more information):
r[1-5]i1p2 – LU: land-use change
r[1-5]i1p3 – SI: solar irradiance
r[1-5]i1p4 – AA: anthropogenic aerosols (a mixture of aerosols, not explicitly defined here)
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This data is used to examine the inflation-unemployment relationship for 18 countries after 1991. Inflation data are obtained from the World Bank database (https://data.worldbank.org/indicator/FP.CPI.TOTL.ZG) and unemployment data from the International Labour Organization (http://www.ilo.org/wesodata/).
The analysis period differs across countries because of structural breaks determined by the single change point detection algorithm included in the changepoint package of Killick & Eckley (2014). Granger causality is tested with the Toda & Yamamoto (1995) procedure. Integration levels are determined with three stationarity tests. VAR models are run with the vars package (Pfaff, Stigler & Pfaff, 2018) without trend and constant terms. The cointegration test is conducted with the urca package (Pfaff, Zivot, Stigler & Pfaff, 2016).
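A hedged sketch of this pipeline in R; the file and column names are hypothetical, and the final causality() call is a simplification (the strict Toda-Yamamoto Wald test restricts only the first p lags of the causing variable, not all p + dmax lags):

```r
library(changepoint)
library(vars)

# Hypothetical file and column names; the repository stores one series per country.
d <- read.csv("country_data.csv")            # assumed columns: inflation, unemployment
y <- d[, c("inflation", "unemployment")]

# Single change point in the mean (AMOC = at most one change), as in
# Killick & Eckley (2014); analyse the post-break sample only.
cp <- cpt.mean(y$inflation, method = "AMOC")
y  <- y[(cpts(cp) + 1):nrow(y), ]

# Toda-Yamamoto: fit a VAR with p + dmax lags (dmax = maximum order of
# integration from the stationarity tests), no trend and no constant.
p    <- VARselect(y, lag.max = 8, type = "none")$selection[["AIC(n)"]]
dmax <- 1
fit  <- VAR(y, p = p + dmax, type = "none")

# Simplified Granger-causality check on the augmented VAR.
causality(fit, cause = "inflation")$Granger
```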
All data files are .csv files. The analyst needs to change the country index (variable name: j) to see individual results. Findings can be seen in the article.
Killick, R., & Eckley, I. (2014). changepoint: An R package for changepoint analysis. Journal of statistical software, 58(3), 1-19.
Pfaff, B., Stigler, M., & Pfaff, M. B. (2018). Package ‘vars’. https://cran.r-project.org/web/packages/vars/vars.pdf.
Pfaff, B., Zivot, E., Stigler, M., & Pfaff, M. B. (2016). Package ‘urca’. Unit root and cointegration tests for time series data. R package version, 1-2.
Toda, H. Y., & Yamamoto, T. (1995). Statistical inference in vector autoregressions with possibly integrated processes. Journal of econometrics, 66(1-2), 225-250.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Zooplankton Meter2 Data from GSO/URI - Bongo Nets only
The Zooplankton Meter2 Database for the Georges Bank GLOBEC project was originally located in the laboratory of Ted Durbin at the Graduate School of Oceanography, University of Rhode Island. It was accessed via the U.S. GLOBEC Georges Bank data management system using SQLPlus network access to the database management system at URI. Data were cached and are served from the local computer.
A description of the original URI database is online and includes the design and variable definitions. A version of this document is available at http://globec.whoi.edu/globec-dir/data_doc/zoo_square_meter_URI.html.
Note: Our program's Data Acknowledgement Policy requires that any person making substantial use of a data set must communicate with the investigators who acquired the data prior to publication and anticipate that the data collectors will be co-authors of published results.
The following documentation applies to the data found locally on the WHOI GLOBEC Data Server:
The data are served as a hierarchy. The least changing variables are in higher order levels (e.g., cruise id, year, month, etc.), while variables that change the most are in the lower order levels (e.g., time of collection, net number, taxon collected, etc.). There are six levels within the data; variable names and descriptions are given in the metadata.
Most column variable names and instrument names were taken from the U.S. GLOBEC Georges Bank data thesaurus; those that were not follow the GLOBEC data protocols. The taxonomic code variable (taxon_code) is from the National Oceanographic Data Center's Taxonomic List, version 8. Taxonomic information is built into these ten-digit codes as they reflect the systematic nomenclature.
You may contact BCO-DMO for additional help.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This dataset provides 30-year averaged climate data for both historical and future periods, with a spatial resolution of 0.01° × 0.01°. Historical data (1991–2020) are based on the China Surface Climate Standard Dataset and were interpolated using ANUSPLIN software. Future climate data are derived from CMIP6 simulations, bias-corrected using the Delta downscaling method. The dataset includes 10 models (9 Global Climate Models, namely GCMs, and 1 ensemble model), 3 scenarios (SSP1-2.6, SSP2-4.5, and SSP5-8.5), and 3 future periods (2021–2040, 2041–2070, 2071–2100). For each period (or scenario), 28 climate variables are provided, including 5 monthly basic climate variables (mean temperature, maximum temperature, minimum temperature, precipitation, and percentage of sunshine) and 23 bioclimatic variables derived from the basic variables (for details, see the dataset documentation file).

The data quality was strictly evaluated. The historical interpolations generated by the ANUSPLIN software showed a strong correlation with observations (all correlation coefficients above 0.91). The bias correction improved the accuracy of most original GCM simulations, reducing the bias by 0.69%–58.63%. This dataset aims to provide high-resolution, bias-corrected long-term historical and future climate data for climate and ecological research. All computations were performed using R, and the corresponding code can be found in the dataset folder “Code”.

All data are provided in GeoTIFF (.tif) format, where each file for the basic climate variables contains 12 bands, representing monthly data in ascending order (e.g., Band 1 corresponds to January). To facilitate data storage, all files are provided in compressed archives, following a consistent naming convention:
(1) Historical data: China_Variable_1km_1991–2020.tif, where Variable is the abbreviation of one of the 28 climate variables. Example: China_pr_1km_1991–2020.tif.
(2) Future data: China_Variable_Model_VariantLabel_1km_StartYear-EndYear_Scenario.tif, where Variable is one of the 28 climate variables; Model is the GCM name; VariantLabel is r1i1p1f1 in this study; StartYear-EndYear is the future period; and Scenario is the SSP climate scenario. Example: China_tasmin_MRI-ESM2-0_r1i1p1f1_1km_2071–2100_SSP585.tif.
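Reading one of these multi-band files in R is straightforward with terra; a small sketch using the example historical precipitation file named above:

```r
library(terra)

# Example file name following the convention above (historical precipitation).
pr  <- rast("China_pr_1km_1991–2020.tif")
nlyr(pr)         # 12 bands, January to December in ascending order
jan <- pr[[1]]   # Band 1 = January
plot(jan, main = "Mean January precipitation, 1991-2020")
```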
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
!!!WARNING!!! This dataset has a large number of flaws and is unable to properly answer many questions that people generally use it to answer, such as whether national hate crimes are changing (or at least they use the data so improperly that they get the wrong answer). A large number of people using this data (academics, advocates, reporters, the US Congress) do so inappropriately and get the wrong answer to their questions as a result. Indeed, many published papers using this data should be retracted. Before using this data I highly recommend that you thoroughly read my book on UCR data, particularly the chapter on hate crimes (https://ucrbook.com/hate-crimes.html), as well as the FBI's own manual on this data. The questions you could potentially answer well are relatively narrow and generally exclude any causal relationships.

Version 8 release notes: Adds 2019 data.
Version 7 release notes: Changes release notes description, does not change data.
Version 6 release notes: Adds 2018 data.
Version 5 release notes: Adds data in the following formats: SPSS, SAS, and Excel. Changes project name to avoid confusing this data for the ones done by NACJD. Adds data for 1991. Fixes bug where bias motivation "anti-lesbian, gay, bisexual, or transgender, mixed group (lgbt)" was labeled "anti-homosexual (gay and lesbian)" prior to 2013, causing there to be two columns and zero values for years with the wrong label. All data is now directly from the FBI, not NACJD. The data initially comes as ASCII+SPSS Setup files and is read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R.
Version 4 release notes: Adds data for 2017. Adds rows that submitted a zero-report (i.e. that agency reported no hate crimes in the year); this is for all years 1992-2017. Made changes to categorical variables (e.g. bias motivation columns) to make categories consistent over time; different years had slightly different names (e.g. 'anti-am indian' and 'anti-american indian'), which I made consistent. Made the 'population' column, which is the total population in that agency.
Version 3 release notes: Adds data for 2016. Orders rows by year (descending) and ORI.
Version 2 release notes: Fixes bug where Philadelphia Police Department had an incorrect FIPS county code.

The Hate Crime data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains information about hate crimes reported in the United States. Please note that the files are quite large and may take some time to open. Each row indicates a hate crime incident for an agency in a given year. I have made a unique ID column ("unique_id") by combining the year, agency ORI9 (the 9-character Originating Identifier code), and incident number columns together. Each column is a variable related to that incident or to the reporting agency. Some of the important columns are the incident date, what crime occurred (up to 10 crimes), the number of victims for each of these crimes, the bias motivation for each of these crimes, and the location of each crime. It also includes the total number of victims, total number of offenders, and race of offenders (as a group). Finally, it has a number of columns indicating whether the victim of each offense was a certain type of victim or not (e.g. individual victim, business victim, religious victim, etc.). The only changes I made to the data are the following: minor changes to column names to make all column names 32 characters or fewer (so the data can be saved in a Stata format), making all character values lower case, and reordering columns. I also generated incident month, weekday, and month-day variables from the incident date variable included in the original data.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
CitiesGOER is a database that provides environmental data for 52,602 cities and 48 environmental variables, including 38 bioclimatic variables, 8 soil variables and 2 topographic variables. Data were extracted from the same 30 arc-seconds global grid layers that were prepared when making the TreeGOER (Tree Globally Observed Environmental Ranges) database that is available from https://doi.org/10.5281/zenodo.7922927. Details on the preparations of these layers are provided by Kindt, R. (2023). TreeGOER: A database with globally observed environmental ranges for 48,129 tree species. Global Change Biology, 00, 1–16. https://onlinelibrary.wiley.com/doi/10.1111/gcb.16914. CitiesGOER was designed to be used together with TreeGOER and possibly also with the GlobalUsefulNativeTrees database (Kindt et al. 2023) to allow users to filter suitable tree species based on environmental conditions of the planting site.
The identities and coordinates of cities were sourced from a data set with information for cities with a population size larger than 1000 that was created by Opendatasoft and made available from https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/table/?disjunctive.cou_name_en&sort=name. The data was downloaded on 22-JULY-2023 and afterwards filtered for cities with a population of 5000 or above. Cities where information on the country was missing were removed. The coordinates of cities were used to extract the environmental data via the terra package (Hijmans et al. 2022, version 1.6-47) in the R 4.2.1 environment.
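The extraction pattern described here is terra's point extraction; a minimal sketch with hypothetical file and column names standing in for the 30 arc-seconds grid layers and the filtered city table:

```r
library(terra)

# Hypothetical inputs standing in for one 30 arc-seconds grid layer and the
# filtered city table (assumed columns: name, lon, lat).
layer  <- rast("bio01_30s.tif")
cities <- read.csv("cities_over_5000.csv")

pts  <- vect(cities, geom = c("lon", "lat"), crs = "EPSG:4326")
vals <- extract(layer, pts)                 # one row per city
head(data.frame(city = cities$name, vals))
```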
Update 2023.08 provided median values from 23 Global Climate Models (GCMs) for Shared Socio-Economic Pathway (SSP) 1-2.6 and from 18 GCMs for SSP 3-7.0, both for the 2050s (2041-2060). Similar methods were used to calculate these median values as in the case studies for the TreeGOER manuscript (calculations were partially done via the BiodiversityR::ensemble.envirem.run function and with downscaled bioclimatic and monthly climate 2.5 arc-minutes future grid layers available from WorldClim 2.1).
The locations of the 52,602 cities are mapped in one of the series available from the TreeGOER Global Zones atlas that can be obtained from https://doi.org/10.5281/zenodo.8252756.
When using CitiesGOER in your work, cite this repository and the TreeGOER article (Kindt, 2023, Global Change Biology) referenced above.
The development of CitiesGOER was supported by the Darwin Initiative to project DAREX001 of Developing a Global Biodiversity Standard certification for tree-planting and restoration, by Norway’s International Climate and Forest Initiative through the Royal Norwegian Embassy in Ethiopia to the Provision of Adequate Tree Seed Portfolio project in Ethiopia, and by the Green Climate Fund through the IUCN-led Transforming the Eastern Province of Rwanda through Adaptation project.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
For a comprehensive guide to this data and other UCR data, please see my book at ucrbook.com.

Version 15 release notes: Adds 2021 data.
Version 14 release notes: Adds 2020 data. Please note that the FBI has retired UCR data ending in 2020, so this will be the last Arrests by Age, Sex, and Race data they release.
Version 13 release notes: Changes R files from .rda to .rds. Fixes bug where the number_of_months_reported variable incorrectly was the largest of the number of months reported for a specific crime variable. For example, if theft was reported Jan-June and robbery was reported July-December in an agency, in total there were 12 months reported. But since each crime was only reported 6 months (assuming no other crime was reported more than 6 months of the year), the number_of_months_reported variable was incorrectly set at 6 months. Now it is the total number of months reported for any crime, so it would be set to 12 months in this example. Thank you to Nick Eubank for alerting me to this issue. Adds rows even when an agency reported zero arrests that month; all arrest values are set to zero for these rows.
Version 12 release notes: Adds 2019 data.
Version 11 release notes: Changes release notes description, does not change data.
Version 10 release notes: The data now has the following age categories (which were previously aggregated into larger groups to reduce file size): under 10, 10-12, 13-14, 40-44, 45-49, 50-54, 55-59, 60-64, over 64. These categories are available for female, male, and total (female+male) arrests. The previous aggregated categories (under 15, 40-49, and over 49) have been removed from the data.
Version 9 release notes: For each offense, adds a variable indicating the number of months that offense was reported; these variables are labeled "num_months_[crime]" where [crime] is the offense name. These variables are generated from the number of times one or more arrests were reported per month for that crime. For example, if there was at least one arrest for assault in January, February, March, and August (and no other months), there would be four months reported for assault. Please note that this does not differentiate between an agency not reporting that month and actually having zero arrests. The variable "number_of_months_reported" is still in the data and is the number of months that any offense was reported. So if an agency reports murder arrests every month but no other crimes, the murder number-of-months variable and the "number_of_months_reported" variable will both be 12 while every other offense's number-of-months variable will be 0. Adds data for 2017 and 2018.
Version 8 release notes: Adds annual data in R format. Changes project name to avoid confusing this data for the ones done by NACJD. Fixes bug where bookmaking was excluded as an arrest category. Changed the number of categories to include more offenses per category to have fewer total files. Added a "total_race" file for each category; this file has total arrests by race for each crime and a breakdown of juvenile/adult by race.
Version 7 release notes: Adds 1974-1979 data. Adds monthly data (only totals by sex and race, not by age categories). All data now from the FBI, not NACJD. Changes some column names so all columns are <=32 characters to be usable in Stata. Changes how the number of months reported is calculated: it is now the number of unique months with arrest data reported; months of data from the monthly header file (i.e. juvenile disposition data) are not considered in this calculation.
Version 6 release notes: Fixes bug where juvenile female columns had the same value as juvenile male columns.
Version 5 release notes: Removes support for SPSS and Excel data. Changes the crimes that are stored in each file; there are more files now with fewer crimes per file, and the files and their included crimes have been updated below. Adds in agencies that report 0 months of the year. Adds a column that indicates the number of months reported, generated by summing up the number of unique months an agency reports data for. Note that this indicates the number of months an agency reported arrests for ANY crime; they may not necessarily report every crime every month. Agencies that did not report a crime will have a value of NA for every arrest column for that crime. Removes data on runaways.
Version 4 release notes: Changes column names from "p