Parallel Training of Multiple Foreign Exchange Return Models

In many machine learning applications, multiple models must be trained. For example, in the Internet of Things (IoT) industry, a separate model is built for each household equipped with sensors that measure temperature, light, or power consumption. Another example comes from online advertising: to serve personalized advertisements or recommendations, a huge number of individualized models must be built and maintained, because each online user has a unique browsing history. Moreover, each model has to be updated frequently to capture changing consumer behavior. As the number of models grows, even a carefully designed and proven algorithm can be challenging to run in production. Time-sensitive applications in particular cannot afford the extra delay of iterating serially through a huge number of models; the financial industry is a good example. In this article, we fit multiple foreign exchange (FX) rate models and demonstrate how to leverage the parallel computation capability provided by Oracle R Enterprise (ORE).

FX Rate Data

The FX rate data can be obtained from Federal Reserve Economic Data (FRED). Instead of fetching the data manually online, the R package 'quantmod' provides a convenient way of downloading it. Here is the code used for this purpose.

library(quantmod)
symbol = "DEXCAUS"
getSymbols(symbol,src="FRED")

The symbol "DEXCAUS" denotes the Canadian dollar to US dollar FX rate. In this example, we downloaded foreign exchange rates for 22 currencies and focused on the time range from 1999 to 2015.

rm(list=ls())
symbols <- c( "DEXBZUS", "DEXCAUS", "DEXCHUS", "DEXDNUS", "DEXHKUS", "DEXINUS", "DEXJPUS", "DEXKOUS", "DEXMXUS", "DEXNOUS", "DEXSDUS", "DEXSFUS", "DEXSIUS", "DEXSLUS", "DEXSZUS", "DEXTAUS", "DEXTHUS", "DEXUSAL", "DEXUSEU", "DEXUSNZ", "DEXUSUK", "DEXVZUS")
for( symbol in symbols){ getSymbols(symbol, src="FRED") }
# build a merge() call over all downloaded series and evaluate it
mergecode <- paste("merge(", paste(symbols, collapse=","), ")", sep="")
merged.df <- eval(parse(text=mergecode))
fxrates.df.raw <- data.frame(date=index(merged.df), coredata(merged.df))
fxrates.df.raw <- fxrates.df.raw[fxrates.df.raw$date>'1999-01-04',]

Non-stationarity

Let us take a first look at the FX rate data by plotting the Canadian dollar to US dollar rate:

library(ggplot2)
ggplot(fxrates.df.raw[fxrates.df.raw$date>'2015-01-01',], aes(date, DEXCAUS)) +
  geom_line() +
  labs(x = "day", title = "CA dollar ~ US dollar FX Rate") +
  theme(plot.title = element_text(hjust = 0.5))
https://blogs.oracle.com/R/resource/fx.jpg 

At first glance, the series does not look stationary. To confirm this, we can run an Augmented Dickey-Fuller (ADF) test to check whether the series has a unit root, i.e., whether it follows F(t) = ρF(t-1) + a(t) with ρ = 1. We can use the R package fUnitRoots to run the test. The null hypothesis is that a unit root exists.

library(fUnitRoots)
adfTest(fxrates.df.raw$DEXCAUS)
The result is as follows:

Title: Augmented Dickey-Fuller Test
Test Results:
  PARAMETER: Lag Order: 1
  STATISTIC: Dickey-Fuller: -0.8459
  P VALUE: 0.3467

Since p >> 0.05, we cannot reject the null hypothesis. This suggests that there is a unit root in the series, confirming that the time series is non-stationary.
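A series with a unit root is integrated of order one, so its first differences should be stationary. As a quick sanity check (a sketch, not part of the original workflow, assuming missing values are dropped first), we can rerun the ADF test on the differenced series and expect a small p-value:

# the first differences of a unit-root series should be stationary,
# so this ADF test should reject the null hypothesis
adfTest(diff(na.omit(fxrates.df.raw$DEXCAUS)))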

FX Rate Prediction 

Foreign exchange rate series are known to be difficult to predict. Their predictability has long been questioned, since the rates often appear disconnected from economic fundamentals. Thus, a random walk model is often used as a benchmark. In this article, we implement a random walk model for demonstration purposes.
A random walk model is formulated as 

F(t) = F(t-1) + a(t), 

where a(t) is zero-mean random noise.
In R, we can use the following function to fit a random walk model. 

arima(data, c(0,1,0))

This fits an ARIMA(0,1,0) model: we remove both the MA and AR parts and retain only the integration (differencing) part, which is exactly the random walk model.
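As a quick illustration (a sketch on simulated data, not part of the original workflow), an ARIMA(0,1,0) fit forecasts the next value to be the last observed one, exactly as a random walk should:

set.seed(42)
rw <- cumsum(rnorm(1000))            # simulate F(t) = F(t-1) + a(t)
fit <- arima(rw, order = c(0,1,0))   # random walk fit
predict(fit, n.ahead = 1)$pred       # one-step forecast: the last value of rw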

The prediction is often backtested in a moving-window fashion. For each time step t, the model is trained on the data over [t-L, t-1], a window of length L. The prediction is then evaluated on out-of-sample (OOS) data, the window is moved forward for every t, and the out-of-sample error is calculated. Here, we use a single observation as OOS data, i.e., we use a window of historical data to predict the next day's FX rate.

There are many ways to evaluate the result of backtesting. Here, we adopt R squared as the measure of goodness of fit; the closer R squared is to 1, the more accurate the prediction.
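Concretely, the quantity computed in the function below is

R.sq = 1 - sum((pred - rate)^2) / sum((mean(rate) - rate)^2),

where pred and rate run over the backtest period, i.e., observations L+1 through N.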
Combining all the ingredients, we can now write an R function that makes the predictions for one currency.

pred.fxrate <- function (data.fxrate) {
  data.fxrate <- data.fxrate[order(data.fxrate$date),]
  N <- nrow(data.fxrate)
  L <- 300                                   # moving-window length
  pred <- rep(0, N-L)
  country <- data.fxrate$country[1]
  for(i in (L+1):N){
    # fit a random walk on the window [i-L, i-1], then predict one step ahead
    model <- arima(data.fxrate$rate[(i-L):(i-1)], c(0,1,0))
    pred[i-L] <- predict(model,1)[[1]][1]
  }
  # out-of-sample R squared over the backtest period
  R.sq <- 1 - sum((pred - data.fxrate$rate[(L+1):N])^2, na.rm=TRUE)/sum((mean(data.fxrate$rate[(L+1):N], na.rm=TRUE) - data.fxrate$rate[(L+1):N])^2, na.rm=TRUE)
  pred.df <- as.data.frame(data.fxrate$date[(L+1):N])
  pred.df$pred <- pred
  names(pred.df) <- c("date", "pred")
  # plot actual rates (black) with the predictions overlaid (red)
  plot(data.fxrate$date, data.fxrate$rate, type = "l")
  lines(pred.df$date, pred.df$pred, col="red")
}

Note that the line that computes the R squared uses the option na.rm = TRUE, because the data contains missing values.

We can test the function on the Canadian dollar using the data from 2014 to 2017. The R squared is 0.97. It seems that we have a decent model!
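A sketch of how such a local test might look, assuming the per-currency schema (date, country, rate) introduced in the next section:

# build a single-currency data frame from the raw data (hypothetical test)
ca <- data.frame(date = fxrates.df.raw$date,
                 country = "DEXCAUS",
                 rate = fxrates.df.raw$DEXCAUS)
ca <- ca[ca$date > '2014-01-01', ]
pred.fxrate(ca)   # draws the plot; R.sq is computed inside the function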

https://blogs.oracle.com/R/resource/dexcaus.jpg 
Parallel Prediction

As mentioned at the beginning, there are quite a few currencies, and we would rather not loop through them serially. A solution is to use the "group apply" capability provided by Oracle R Enterprise (ORE). It allows us to store the data as a table in Oracle Database (in many cases, that is where the data already resides) and then run the function we wrote above in parallel for each currency.

First, we need to merge all the FX data and reshape it into the following long-format schema.

fxrates.df <- data.frame(date=character(),
                         country=character(),
                         rate=double())
date.col <- fxrates.df.raw$date
symbols <- names(fxrates.df.raw)[-1]
n.row <- length(date.col)
for( symbol in symbols){
  # stack one block of rows per currency: (date, country, rate)
  symbol.data <- as.data.frame(date.col)
  symbol.data$country <- rep(symbol, n.row)
  symbol.data$return <- fxrates.df.raw[,symbol]
  fxrates.df <- rbind(fxrates.df, symbol.data)
}
names(fxrates.df) <- c("date", "country", "rate")
fxrates.df <- fxrates.df[fxrates.df$date > '2014-01-01', ]
fxrates.df <- fxrates.df[order(fxrates.df$date),]
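For reference, the same long format can be produced more compactly with the reshape2 package (a sketch, assuming reshape2 is installed; not part of the original workflow):

library(reshape2)
# melt turns each rate column into (date, country, rate) rows
fxrates.df <- melt(fxrates.df.raw, id.vars = "date",
                   variable.name = "country", value.name = "rate")
fxrates.df$country <- as.character(fxrates.df$country)  # melt returns a factor
fxrates.df <- fxrates.df[fxrates.df$date > '2014-01-01', ]
fxrates.df <- fxrates.df[order(fxrates.df$date), ]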

The data frame we obtained looks like:

date        country  rate
2014-01-02  DEXCAUS  1.0634
2014-01-03  DEXCAUS  1.0612
2014-01-06  DEXCAUS  1.0658
2014-01-07  DEXCAUS  1.0742
2014-01-08  DEXCAUS  1.0802
2014-01-09  DEXCAUS  1.0850

Then, we create the table in Oracle Database with ORE.

ore.drop(table="FX_RATE") # to remove the table if it already exists
ore.create(fxrates.df, table="FX_RATE")
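As a quick check (a sketch, assuming an active ORE session), the new table is now visible as an ore.frame proxy object, so we can peek at it without pulling the data to the client:

ore.ls()       # FX_RATE should appear among the proxy objects
head(FX_RATE)  # inspect the first rows in the database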

After the table is created, we call the ore.groupApply function, grouping on the column 'country'. This runs the function pred.fxrate on the FX rates of each currency, using at most four R engines spawned by Oracle Database and executing in parallel.

res <- ore.groupApply(FX_RATE,
                      FX_RATE$country,
                      pred.fxrate,
                      ore.connect=TRUE,
                      parallel = 4) 

Another way to store the results is to create objects in the ORE R datastore. For instance, we can add the following code to the function pred.fxrate.

R.sq <- 1 - sum((pred - data.fxrate$rate[(L+1):N])^2, na.rm=TRUE)/sum((mean(data.fxrate$rate[(L+1):N], na.rm=TRUE) - data.fxrate$rate[(L+1):N])^2, na.rm=TRUE)
name <- paste("Rsq_", country, sep="")
assign(name, R.sq)
try(ore.save(list=name, name="Rsquares", append=TRUE))

Then, after running ore.groupApply, we can retrieve the objects through ORE functions, as shown below.

https://blogs.oracle.com/R/resource/table_fx.jpg 
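A sketch of that retrieval on the client side, assuming the datastore name "Rsquares" used above:

ore.datastoreSummary(name = "Rsquares")  # list the saved Rsq_* objects
ore.load("Rsquares")                     # load them into the current R session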

Based on the R squared values, the results look decent, and they would likely improve if we could access data on other economic fundamentals and build an ensemble model. That exploration is beyond the scope of this blog, so we leave it to the reader.

Invoking R Scripts from the SQL Side

Another scenario may require storing results, such as the R squared scores, in a structured format like a database table, or storing the generated images in the database. These tasks can also be done by invoking the R functions from the SQL side, using capabilities provided by Oracle Advanced Analytics (OAA).

Let us first look at how to store the R squared values as a table. Suppose we want to build the model for each currency from SQL. We first create a SQL function that has the group-apply capability. Recall that all the data is stored in FX_RATE. All we need to do is create the group-apply function and supply the script that builds the model.

CREATE OR REPLACE PACKAGE fxratePkg AS
  TYPE cur IS REF CURSOR RETURN FX_RATE%ROWTYPE;
END fxratePkg;

CREATE OR REPLACE FUNCTION fxrateGroupEval(
    inp_cur fxratePkg.cur,
    par_cur SYS_REFCURSOR,
    out_qry VARCHAR2,
    grp_col VARCHAR2,
    exp_txt CLOB)
  RETURN SYS.AnyDataSet
  PIPELINED PARALLEL_ENABLE (PARTITION inp_cur BY HASH("country"))
  CLUSTER inp_cur BY ("country")
  USING rqGroupEvalImpl;

This PL/SQL function performs the group apply; you can view it as the SQL counterpart of ore.groupApply. Next, we store the script that builds the model in the database.

begin
sys.rqScriptDrop('RW_model'); -- call if the script already exists
sys.rqScriptCreate('RW_model',
    'function (data.fxrate) {
  data.fxrate <- data.fxrate[order(data.fxrate$date),]
  N <- nrow(data.fxrate)
  L <- 300
  pred <- rep(0, N-L)
  country <- data.fxrate$country[1]
  for(i in (L+1):N){
    model <- arima(data.fxrate$rate[(i-L):(i-1)],c(0,1,0))
    pred[i-L] <- predict(model,1)[[1]][1]
  }
  R.sq <- 1 - sum((pred - data.fxrate$rate[(L+1):N])^2, na.rm=TRUE)/sum((mean(data.fxrate$rate[(L+1):N], na.rm=TRUE) - data.fxrate$rate[(L+1):N])^2, na.rm=TRUE)
  data.frame(CURRENCY=country, RSQ = R.sq)
 }');
end;

Note that in order to form a table, the script returns a data frame holding each group's result. With both the group-apply SQL function and the script stored, we can now call it from SQL.

select *
from table(fxrateGroupEval( cursor(select /*+ parallel(t, 4) */ * from FX_RATE t),
cursor(select 1 as "ore.connect" from dual),
'SELECT ''aaaaaaa'' CURRENCY, 1 RSQ FROM DUAL', 'country', 'RW_model'));

Note that 'aaaaaaa' in the output query declares the format of the CURRENCY column: a seven-character text column (likewise, the literal 1 declares RSQ as numeric). Moreover, we can even store the plot generated by each FX model. We modify the script as below.

begin
sys.rqScriptDrop('RW_model_plot');
sys.rqScriptCreate('RW_model_plot',
    'function (data.fxrate) {
  data.fxrate <- data.fxrate[order(data.fxrate$date),]
  N <- nrow(data.fxrate)
  L <- 300
  pred <- rep(0, N-L)
  country <- data.fxrate$country[1]
  for(i in (L+1):N){
    model <- arima(data.fxrate$rate[(i-L):(i-1)],c(0,1,0))
    pred[i-L] <- predict(model,1)[[1]][1]
  }
  pred.df <- as.data.frame(data.fxrate$date[(L+1):N])
  pred.df$pred <- pred
  names(pred.df) <- c("date", "pred")
  plot(data.fxrate$date, data.fxrate$rate, type = "l")
  lines(pred.df$date, pred.df$pred, col="red")
 }');
end;

Then, we can call the new SQL function and generate a table of images.

select * from table(fxrateGroupEval( cursor(select /*+ parallel(t, 4) */ * from FX_RATE t),
 cursor(select 1 as "ore.connect" from dual),
'PNG', 'country', 'RW_model_plot'));

The output, viewed in SQL Developer, is as follows.

https://blogs.oracle.com/R/resource/table_blob.jpg 

Note that each image is generated and stored as a BLOB (binary large object) in the table. We can double-click a BLOB item and view the image in a pop-up window (make sure the 'View as Image' box is checked).

https://blogs.oracle.com/R/resource/blob.png 
Conclusion

In this blog, we demonstrated the parallel training of multiple FX rate models using a benchmark random walk model. The implementation takes only a few lines of code and showcases the powerful in-database functionality enabled by Oracle R Enterprise.

from Oracle Blogs | Oracle R Technologies Blog https://blogs.oracle.com/r/parallel-training-of-multiple-foreign-exchange-return-models

Tech Tip Thursday: Dynamic Power BI reports using Parameters

Did you know that you can dynamically filter data in Power BI using parameters that are stored in an Excel workbook? In this video, Patrick from Guy in a Cube shows us how, using M Functions within Power Query and a gateway to enable data refresh. Check it out!

from Microsoft Power BI Blog | Microsoft Power BI https://powerbi.microsoft.com/en-us/blog/tech-tip-thursday-dynamic-power-bi-reports-using-parameters/

Deployment of Pre-Trained Models on Azure Container Services

This post is authored by Mathew Salvaris, Ilia Karmanov and Jaya Mathew.

Data scientists and engineers routinely encounter issues when moving their final functional software and code from their development environment (laptop, desktop) to a test environment, or from a staging environment to production. These difficulties primarily stem from differences between the underlying software environments and infrastructure, and they eventually end up costing businesses a lot of time and money, as data scientists and engineers work towards narrowing down these incompatibilities and either modify software or update environments to meet their needs.

Containers end up being a great solution in such scenarios, as the entire runtime environment (application, libraries, binaries and other configuration files) get bundled into a package to ensure smooth portability of software across different environments. Using containers can, therefore, improve the speed at which apps can be developed, tested, deployed and shared among users working in different environments. Docker is a leading software container platform for enabling developers, operators and enterprises to overcome their application portability issue.

The goal of Azure Container Services (ACS) is to provide a container hosting environment using popular open-source tools and technologies. Like all software, machine learning (ML) models can be tricky to deploy due to the plethora of libraries used and their dependencies. In this tutorial, we will demonstrate how to deploy a pre-trained deep learning model using ACS. ACS enables the user to configure, construct and manage a cluster of virtual machines preconfigured to run containerized applications. Once the cluster is set up, DC/OS is used for scheduling and orchestration. This is an ideal setup for any ML application, since Docker containers offer great flexibility in the libraries used and are scalable on demand, all while ensuring that the application is performant.

The Docker image used in this tutorial contains a simple Flask web application with Nginx web server and uses Microsoft’s Cognitive Toolkit (CNTK) as the deep learning framework, with a pretrained ResNet 152 model. Our web application is a simple image classification service, where the user submits an image, and the application returns the class the image belongs to. This end-to-end tutorial is split into four sections, namely:

  • Create Docker image of our application (00_BuildImage.ipynb).
  • Test the application locally (01_TestLocally.ipynb).
  • Create an ACS cluster and deploy our web app (02_DeployOnACS.ipynb).
  • Test our web app (03_TestWebApp.ipynb, 04_SpeedTestWebApp.ipynb).

Each section has an accompanying Jupyter notebook with step-by-step instructions on how to create, deploy and test the web application.

Create Docker Image of the Application (00_BuildImage.ipynb)

The Docker image in this tutorial contains three main elements, namely: the web application (web app), pretrained model, and the driver for executing our model, based on the requests made to the web application. The Docker image is based on an Ubuntu 16.04 image to which we added the necessary Python dependencies and installed CNTK (another option would be to test our application in an Ubuntu Data Science Virtual Machine from Azure portal). An important point to be aware of is that the Flask web app is run on port 5000, so we have created a proxy from port 88 to port 5000 using Nginx to expose port 88 in the container. Once the container is built, it is pushed to a public Docker hub account so that the ACS cluster can access it.

Test the Application Locally (01_TestLocally.ipynb)

Having short feedback loops while debugging is very important and ensures quick iterations. Docker images allow the user to do this, as the application can be run and checked locally before going through the entire process of deploying it to ACS. This notebook outlines the process of spinning up the Docker container locally and configuring it properly. Once the container is up and running, the user can send requests to be scored using the model and check the model's performance.

Create an ACS Cluster and Deploy the Web App (02_DeployOnACS.ipynb)

In this notebook, the Azure CLI is used to create an ACS cluster with two nodes (this can also be done via the Azure portal). Each node is a D2 VM, which is quite small but sufficient for this tutorial. Once ACS is set up, to deploy the app the user needs to create an SSH tunnel into the head node. This ensures that the user can send the JSON application schema to Marathon.

In the schema, we have mapped port 80 of the host to port 88 on the container (users can choose different ports as well). This tutorial deploys only one instance of the application (the user can scale this up, but that will not be discussed here). Marathon has a web dashboard that can be accessed through the SSH tunnel by simply pointing the web browser at the tunnel created for deploying the application schema.

Test the Web App (03_TestWebApp.ipynb, 04_SpeedTestWebApp.ipynb)

Once the application has been successfully deployed, the user can send scoring requests. The illustration below shows examples of some of the results returned by the application. The ResNet 152 model seems to be fairly accurate, even when parts of the subject in the image are occluded.


Further, the average response time for these requests is less than a second, which is very performant. Note that this tutorial was run on a virtual machine in the same region as the ACS. Response times across regions may be slower but the performance is still acceptable for a single container on a single VM.

After running the tutorial, to delete ACS and free up the associated Azure resources, run the cells at the end of the 02_DeployOnACS.ipynb notebook.

We hope you found this interesting – do share your thoughts or comments with us below.

Mathew, Ilia & Jaya

from Cortana Intelligence and Machine Learning Blog https://blogs.technet.microsoft.com/machinelearning/2017/05/25/deployment-of-pre-trained-models-on-azure-container-services/

A Revamped Grand Prix at Inspire Europe 2017 – What You Need to Know

Learn how this year's Grand Prix at Inspire Europe 2017 will be more collaborative, and note some important dates to be aware of should you decide to flex your Alteryx muscle and throw your hat in the ring.

from Analytics Blog articles https://community.alteryx.com/t5/Analytics-Blog/A-Revamped-Grand-Prix-at-Inspire-Europe-2017-What-You-Need-to/ba-p/60175

Half of firms expect AI to transform their industry

Nearly half of firms believe the current wave of Artificial Intelligence will be transformational and widespread, fundamentally transforming the industries and markets they work in. However, only a third feel their business has the skills to adopt data-driven technologies, so the UK must act quickly to bridge the knowledge gap as international competition heats up.

from IBM Press Releases – IBM Watson Analytics – UK http://www.ibm.com/press/uk/en/pressrelease/52468.wss