Integrated Discrete Environmental Analytics System

Chiranjib Chaudhuri 2020-07-01

GitHub Documents

This document explains the analytic capabilities of the IDEAS data model.

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

## Linking to GEOS 3.8.0, GDAL 2.5.0, PROJ 6.3.0

We connect to a table containing spatial-time series of annual extreme daily climate variables for entire Canada.

data=tbl(con,"ANUSPLINE3")
head(data)

## # Source:   lazy query [?? x 4]
## # Database: NetezzaConnection
##     DGGID KEY      VALUE   TID
##     <int> <chr>    <dbl> <int>
## 1 2420704 MAX_TEMP  20.4  1950
## 2 2374744 MAX_TEMP  23.5  1950
## 3 2360784 MAX_TEMP  27.0  1950
## 4 2364464 MAX_TEMP  25.6  1950
## 5 2311463 MAX_TEMP  30.7  1950
## 6 2381424 MAX_TEMP  20.5  1950

Next we slice the data set for annual maximum daily precipitation.

datap=data%>%filter(KEY=='PRECIPITATION')
head(datap)

## # Source:   lazy query [?? x 4]
## # Database: NetezzaConnection
##     DGGID KEY           VALUE   TID
##     <int> <chr>         <dbl> <int>
## 1 2297589 PRECIPITATION  7.00  1950
## 2 2473831 PRECIPITATION 18.1   1950
## 3 2393550 PRECIPITATION 24.5   1950
## 4 2479111 PRECIPITATION  5.10  1950
## 5 2417910 PRECIPITATION  5.45  1950
## 6 2493712 PRECIPITATION 11.5   1950

We will calculate time-series of spatial average

avgs=datap%>%group_by(TID)%>%arrange(TID)%>%summarise(VALUE=mean(VALUE))
head(avgs)

## Warning: Missing values are always removed in SQL.
## Use `mean(x, na.rm = TRUE)` to silence this warning
## This warning is displayed only once per session.

## # Source:   lazy query [?? x 2]
## # Database: NetezzaConnection
##     TID VALUE
##   <int> <dbl>
## 1  1950  17.7
## 2  1951  18.0
## 3  1952  19.6
## 4  1953  20.9
## 5  1954  21.1
## 6  1955  20.1

We will calculate spatial distribution of temporal average

avgt=datap%>%group_by(DGGID)%>%arrange(DGGID)%>%summarise(VALUE=mean(VALUE))
head(avgt)

## # Source:   lazy query [?? x 2]
## # Database: NetezzaConnection
##     DGGID VALUE
##     <int> <dbl>
## 1 2460666  26.7
## 2 2277703  27.6
## 3 2414025  24.7
## 4 2481826  26.2
## 5 2266103  34.0
## 6 2292943  40.3

Let us plot some of these basic variables.

avgs=collect(avgs)
plot(avgs$TID,avgs$VALUE)

To plot the spatial variable we need to attach it with the spatial tabls.

grid=tbl(con,"FINALGRID2")
head(grid)

## # Source:   lazy query [?? x 6]
## # Database: NetezzaConnection
##        DGGID RESOLUTION  QUAD     I     J GEOM                                  
##        <dbl>      <int> <int> <int> <int> <ODBC_bnr>                            
## 1    1.38e11         22     5 68729 95619 010100f0e599f2bc145bc06412aaabad145bc…
## 2    1.38e11         22     5 68733 95613 010100b495fa264b145bc0ec7e15e03b145bc…
## 3    1.38e11         22     5 68732 95610 0101009cc6accd35145bc02ce6e78626145bc…
## 4    1.38e11         22     5 68733 95597 01010068a8401c9a135bc0a434bcd58a135bc…
## 5    1.38e11         22     5 68726 95603 0101008cd03a702f145bc000a0ab2920145bc…
## 6    1.38e11         22     5 68728 95595 01010004a6fc39bf135bc0986898f3af135bc…

avgt=avgt%>%inner_join(grid,by=c('DGGID'))%>%mutate(WKT=inza..ST_AsText(GEOM))%>%
  select(DGGID,VALUE,WKT)%>%arrange(DGGID)%>%head(100)%>%collect()

poly=st_as_sf(avgt, wkt='WKT', crs = 4326)
plot(poly['VALUE'])

Lets get a little more complex now. We want to clip the data for one of the eco-zone over Canada say somwhere over BC, Pacific-Maritime (ecozone=13)

ecozone=tbl(con,"ECOZONE_12")%>%filter(VALUE==13)%>%select(DGGID)
head(ecozone)

## # Source:   lazy query [?? x 1]
## # Database: NetezzaConnection
##     DGGID
##     <int>
## 1 2290293
## 2 2290292
## 3 2212852
## 4 2213587
## 5 2214331
## 6 2216513

datape=datap%>%inner_join(ecozone,by=c('DGGID'))
head(datape)

## # Source:   lazy query [?? x 4]
## # Database: NetezzaConnection
##     DGGID KEY           VALUE   TID
##     <int> <chr>         <dbl> <int>
## 1 2222341 PRECIPITATION  89.3  1950
## 2 2222341 PRECIPITATION  89.3  1950
## 3 2231102 PRECIPITATION  34.2  1950
## 4 2231102 PRECIPITATION  34.2  1950
## 5 2233302 PRECIPITATION  25.0  1950
## 6 2233302 PRECIPITATION  25.0  1950

Last update: July 1, 2020