TOOLS SETUP#
The analytical scripts can be downloaded as:
Jupyter notebooks: user-friendly scripts that run in a browser interface. Read more about Jupyter Notebooks.
Python code: gives the user more control and performs better overall by making use of parallel processing.
Either can be downloaded and executed on any Windows or Linux machine. In both cases, the script requires a proper environment setup and input data provided according to the instructions below.
Python environment#
Python 3 needs to be installed on your system. We suggest the latest Anaconda distribution. Mamba is also encouraged.
Create a new CCDR-tools environment according to your operating system: win.yml or linux.yml. In the Anaconda command prompt:
conda create --name CCDR-tools --file <dir/win_env.yml>
activate CCDR-tools
Input data management#
Download the latest version of the notebooks or the parallel code.
Create the folder structure as:
Work dir/
 - Hazard.ipynb    Place the notebooks and related files in the main work directory
 - common.py
 - Parallel/       Place the parallel processing script in a sub-folder
 - ...
 - Data/
   - ADM/    Administrative boundaries layer for each country
   - HZD/    Hazard layers
   - EXP/    Exposure layers
   - RSK/    Output directory
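The folder layout above can also be created with a few lines of Python. This is a minimal sketch, not part of the tools: the function name `create_workdir` is hypothetical, and it only creates the empty folders (the notebooks, `common.py` and the parallel script still need to be copied in by hand).

```python
from pathlib import Path

# Sub-folders of Data/ as listed in the layout above
DATA_SUBDIRS = ("ADM", "HZD", "EXP", "RSK")


def create_workdir(root):
    """Create the empty work directory skeleton and return the created folders."""
    root = Path(root)
    folders = [root / "Parallel"] + [root / "Data" / name for name in DATA_SUBDIRS]
    for folder in folders:
        # parents=True also creates root and Data/; exist_ok makes reruns safe
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling `create_workdir("C:/Work")` would create `C:/Work/Parallel` and the four `C:/Work/Data/...` sub-folders, leaving any existing content untouched.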
Download country boundaries for multiple administrative levels (national, sub-national) sourced from HDX or Geoboundaries. Note that there are often several versions for the same country, so be sure to use the most recent one from official agencies (e.g. United Nations). Verify that shapes, names and codes are consistent across the different levels.
Boundaries must be provided as a GeoPackage file named ISO_ADM.gpkg (e.g. NPL_ADM.gpkg) containing multiple layers, each one representing a different administrative boundary level:
ISO_ADM
 - ADM0 (country)
 - ADM1 (first-level sub-national division)
 - ADM2 (second-level sub-national division)
 - ADM3 (third-level sub-national division)
 - ...
Each layer should include the ADMi_CODE and ADMi_NAME fields of its own level and of every level above it, to facilitate the summary of results:
ADM0 layer
| ISO3166_a2 | ISO3166_a3 | ADM0_CODE | ADM0_NAME |
|---|---|---|---|
| String(2) | String(3) | Integer | String(20) |

ADM1 layer
| ADM0_CODE | ADM0_NAME | ADM1_CODE | ADM1_NAME |
|---|---|---|---|
| Integer | String(20) | Integer | String(20) |

ADM2 layer
| ADM0_CODE | ADM0_NAME | ADM1_CODE | ADM1_NAME | ADM2_CODE | ADM2_NAME |
|---|---|---|---|---|---|
| Integer | String(20) | Integer | String(20) | Integer | String(20) |

ADM3 layer
| ADM0_CODE | ADM0_NAME | ADM1_CODE | ADM1_NAME | ADM2_CODE | ADM2_NAME | ADM3_CODE | ADM3_NAME |
|---|---|---|---|---|---|---|---|
| Integer | String(20) | Integer | String(20) | Integer | String(20) | Integer | String(20) |
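Because a GeoPackage is an SQLite database, the presence of the layers and of the ADMi_CODE / ADMi_NAME fields can be checked with the Python standard library alone. The sketch below is an illustration, not part of the tools. It assumes the layers are named exactly ADM0, ADM1, ... inside the file; the helper names `required_fields` and `missing_fields` are hypothetical, and the check covers field names only, not field types or the ISO3166 fields of the ADM0 layer.

```python
import sqlite3
from pathlib import Path


def required_fields(level):
    """Fields expected in the ADM<level> layer: code and name for levels 0..level."""
    fields = []
    for i in range(level + 1):
        fields += [f"ADM{i}_CODE", f"ADM{i}_NAME"]
    return fields


def missing_fields(gpkg_path, max_level=3):
    """Return {layer: [missing field names]} for layers ADM0..ADM<max_level>."""
    if not Path(gpkg_path).is_file():
        raise FileNotFoundError(gpkg_path)
    report = {}
    con = sqlite3.connect(gpkg_path)
    try:
        # gpkg_contents is the GeoPackage table that registers every layer
        layers = {row[0] for row in con.execute("SELECT table_name FROM gpkg_contents")}
        for level in range(max_level + 1):
            layer = f"ADM{level}"
            if layer not in layers:
                report[layer] = required_fields(level)  # the whole layer is absent
                continue
            # PRAGMA table_info returns one row per column; the name is at index 1
            columns = {row[1] for row in con.execute(f'PRAGMA table_info("{layer}")')}
            missing = [f for f in required_fields(level) if f not in columns]
            if missing:
                report[layer] = missing
    finally:
        con.close()
    return report
```

An empty dictionary returned by `missing_fields("Data/ADM/NPL_ADM.gpkg")` would mean that every checked layer carries the expected code and name fields.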
Download probabilistic hazard data, consisting of multiple return period (RP) scenarios. Each scenario is expected as a raster file (.tif) named ISO_HZD_RPi.tif (example for Nepal flood, RP100: NPL_FL_RP100.tif). Any resolution should work, but a cell size finer than 90 m over large countries could cause very long processing times and memory cap issues.
Download exposure data for population, built-up and agriculture. Layers are expected as raster files (.tif) named ISO_EXP.tif.
ISO_POP.tif: Population, as from Global Human Settlement Layer or WorldPop. Value as number of people per pixel.
ISO_BU.tif: Built-up from Global Human Settlement Layer or World Settlement Footprint. Value could be binary (0/1: absence/presence per pixel) or float (0-1: density per pixel).
ISO_AGR.tif: Agriculture from land cover map, ESA land cover or equivalent. Value could be binary (0/1: absence/presence per pixel) or float (0-1: density per pixel).
Move the verified input data into the proper folders:
Work dir/Data/
 - ADM/
   - ISO_ADM.gpkg
 - HZD/
   - ISO_FL_RP10.tif
   - ISO_FL_RP100.tif
   - ISO_FL_RP1000.tif
   - ...
 - EXP/
   - ISO_POP.tif
   - ISO_BU.tif
   - ISO_AGR.tif
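Before running an analysis it can help to confirm that every file implied by the naming convention is in place. The following sketch lists the expected files for a country and reports the ones that are missing. The function names `expected_inputs` and `missing_inputs` are hypothetical helpers, not part of the tools, and the default exposure file names assume the ISO_POP / ISO_BU / ISO_AGR convention described above.

```python
from pathlib import Path


def expected_inputs(iso, haz_cat, return_periods, exp_cats=("POP", "BU", "AGR")):
    """List the input files implied by the naming convention, relative to Data/."""
    files = [Path("ADM") / f"{iso}_ADM.gpkg"]
    files += [Path("HZD") / f"{iso}_{haz_cat}_RP{rp}.tif" for rp in return_periods]
    files += [Path("EXP") / f"{iso}_{exp}.tif" for exp in exp_cats]
    return files


def missing_inputs(data_dir, iso, haz_cat, return_periods, exp_cats=("POP", "BU", "AGR")):
    """Return the expected files that are not present under data_dir."""
    data_dir = Path(data_dir)
    return [f for f in expected_inputs(iso, haz_cat, return_periods, exp_cats)
            if not (data_dir / f).is_file()]
```

For example, `missing_inputs("C:/Work/data", "NPL", "FL", [10, 100, 1000])` returns an empty list only when the boundary file, the three flood rasters and the three exposure rasters are all found.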
Caution
All spatial data must use the same CRS; EPSG:4326 (WGS 84) is suggested.
Settings#
Edit the .env file inside the notebook directories to specify the working directory:
# Environment variables for the CCDR Climate and Disaster Risk analysis notebooks
# Fill the below with the location of data files
# Use absolute paths with forward slashes ("/"), and keep the trailing slash
DATA_DIR = C:/Work/data
# THE ENTRIES BELOW DO NOT NEED TO BE EDITED
# Location to store results of analyses
OUTPUT_DIR = ${DATA_DIR}/RSK/
# Location to store downloaded rasters and other data
# for the analysis notebooks
CACHE_DIR = ${DATA_DIR}/cache/
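To see what the `${DATA_DIR}` references resolve to, the settings can be expanded by plain text substitution. The sketch below is a simplified stand-in written for illustration: the notebooks may load the file differently (for instance through a dotenv library), and `parse_env` is a hypothetical helper that only handles `KEY = value` lines and `${NAME}` references to earlier entries.

```python
import re


def parse_env(text):
    """Parse KEY = value lines, expanding ${NAME} from entries defined earlier."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Replace each ${NAME} with its earlier value; unknown names are left as-is
        value = re.sub(r"\$\{(\w+)\}", lambda m: values.get(m.group(1), m.group(0)), value)
        values[key] = value
    return values


EXAMPLE = """
DATA_DIR = C:/Work/data
OUTPUT_DIR = ${DATA_DIR}/RSK/
CACHE_DIR = ${DATA_DIR}/cache/
"""
```

With the example values, `parse_env(EXAMPLE)["OUTPUT_DIR"]` is `C:/Work/data/RSK/`. Under this plain substitution, a `DATA_DIR` ending in `/` would produce a double slash in the derived paths.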
Run Jupyter notebooks#
Be sure to activate the correct environment:
activate CCDR-tools
Navigate to your working directory:
cd <Your work directory>
cd C:\Dir\Workdir\
Start Jupyter Notebook:
jupyter notebook
The interface should pop up in your browser.
You can now run the baseline risk screening.
Parallel processing#
Setting parameters#
Edit the main.py file to specify:
country (country): ISO3166_a3 country code
hazard type (haz_cat): 'FL' for floods; 'HS' for heat stress; 'DR' for drought; 'LS' for landslide
return periods (return_periods): list of return period scenarios as in the data, e.g. [5, 10, 20, 50, 75, 100, 200, 250, 500, 1000]
exposure categories (exp_cat_list): list of exposure categories: ['POP', 'BU', 'AGR']
exposure categories file name (exp_nam_list): list of the same length as exp_cat_list with file names for exposure categories, e.g.: ['GHS', 'WSF19', 'ESA20']. If 'None', the default ['POP', 'BU', 'AGR'] applies
analysis approach (analysis_app): ['Classes', 'Function']
 - If 'Function', you can set the minimum hazard threshold value (min_haz_slider). Hazard values below this threshold will be ignored
 - If 'Classes', you can set the number and value of the thresholds used to split hazard intensity values into bins (class_edges)
admin level (adm): specify which boundary level to use for the results summary (must exist in the ISOa3_ADM.gpkg file)
save check (save_check_raster): specify if you want to export intermediate rasters (increases processing time) [True, False]
Example of main.py running a flood analysis (haz_cat) over Cambodia [KHM] (country) for 10 return periods (return_periods) over three exposure categories (exp_cat_list) using hazard classes according to thresholds (class_edges); results are summarised at ADM3 level (adm). Intermediate rasters are not saved (save_check_raster).
Example for function analysis:
# Defining the initial parameters
country = 'KHM'
haz_cat = 'FL'
return_periods = [5, 10, 20, 50, 75, 100, 200, 250, 500, 1000]
min_haz_slider = 0.05
exp_cat_list = ['POP', 'BU', 'AGR']
exp_nam_list = ['GHS', 'WSF19', 'ESA20']
adm = 'ADM3'
analysis_app = 'Function'
# class_edges = [0.05, 0.25, 0.50, 1.00, 2.00]
save_check_raster = False
Example for class analysis:
# Defining the initial parameters
country = 'KHM'
haz_cat = 'FL'
return_periods = [5, 10, 20, 50, 75, 100, 200, 250, 500, 1000]
# min_haz_slider = 0.05
exp_cat_list = ['POP', 'BU', 'AGR']
exp_nam_list = ['GHS', 'WSF19', 'ESA20']
adm = 'ADM3'
analysis_app = 'Classes'
class_edges = [0.05, 0.25, 0.50, 1.00, 2.00]
save_check_raster = False
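The difference between the two approaches can be illustrated on single hazard values. The sketch below shows one plausible reading of `class_edges` (each edge opens a new class, so a value equal to an edge falls into the higher class) and of `min_haz_slider` (values below it are treated as zero). Both conventions are assumptions made for illustration; the tools may treat boundary values differently, and `hazard_class` and `apply_min_threshold` are hypothetical names.

```python
from bisect import bisect_right


def hazard_class(value, class_edges):
    """Class index of a hazard value: 0 below the first edge, len(class_edges) from the last edge up."""
    # class_edges must be sorted in increasing order
    return bisect_right(class_edges, value)


def apply_min_threshold(value, min_haz):
    """Treat hazard values below the minimum threshold as zero."""
    return value if value >= min_haz else 0.0
```

With `class_edges = [0.05, 0.25, 0.50, 1.00, 2.00]`, a flood depth of 0.3 falls into class 2 (between 0.25 and 0.50), while a depth of 0.03 falls into class 0 and is effectively ignored.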
Run the analysis with parallel processing#
$ python main.py
The analysis runs on all selected exposure categories, in sequence, and prints a separate message for each iteration. With 3 exposure categories, it takes three iterations to get all results.
$ Running analysis...
$ Finished analysis
$ Running analysis...
$ Finished analysis
$ Running analysis...
$ Finished analysis
Depending on the number of cores, the size and resolution of the data, and the power of the CPU, the analysis can take from less than a minute to a few minutes. E.g. for Bangladesh on an i9-12900KF (16 cores) with 64 GB RAM: below 100 seconds.