AzureDSVM: a new R package for elastic use of the Azure Data Science Virtual Machine

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft)
The Azure Data Science Virtual Machine (DSVM) is a curated VM which provides commonly-used tools and software for data science and machine learning, pre-installed. AzureDSVM is a new R package that enables seamless interaction with the DSVM from a local R session, by providing functions for the following tasks:

Deployment, deallocation, deletion of one or multiple DSVMs;
Remote execution of local R scripts: compute contexts available in Microsoft R Server can be enabled for enhanced computation efficiency for either a single DSVM or a cluster of DSVMs;
Retrieval of consumption data and the total expense incurred by the DSVM(s).

AzureDSVM is built upon the AzureSMR package and depends on the same set of R packages, such as httr and jsonlite. It requires the same initial setup on Azure Active Directory (for authentication).
To install AzureDSVM with the devtools package:
library(devtools)
devtools::install_github("Azure/AzureDSVM")
library("AzureDSVM")

When deploying a Data Science Virtual Machine, the machine name, size, OS type, etc. must be specified. AzureDSVM supports DSVMs on Ubuntu, CentOS, Windows, and Windows with the Deep Learning Toolkit (on GPU-class instances). For example, the following code fires up a D4 v2 Ubuntu DSVM in the Southeast Asia region:
deployDSVM(context,
           resource.group="example",
           location="southeastasia",
           size="Standard_D4_v2",
           os="Ubuntu",
           hostname="mydsvm",
           username="myname",
           pubkey="pubkey")

where context is an azureActiveContext object, created by the AzureSMR::createAzureContext() function, that encapsulates the credentials (Tenant ID, Client ID, etc.) used for Azure authentication.
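As a minimal sketch of how such a context might be created without hard-coding secrets in a script, the snippet below reads the credentials from environment variables. The helper read_azure_credentials() and the variable names AZURE_TENANT_ID, AZURE_CLIENT_ID and AZURE_AUTH_KEY are assumptions made for this illustration; they are not part of AzureSMR or AzureDSVM.

```r
# Hypothetical helper (not part of AzureSMR/AzureDSVM): collect Azure AD
# credentials from environment variables. The variable names are assumptions.
read_azure_credentials <- function() {
  vars <- c(tenantID = "AZURE_TENANT_ID",
            clientID = "AZURE_CLIENT_ID",
            authKey  = "AZURE_AUTH_KEY")
  creds <- Sys.getenv(vars, unset = NA_character_)
  names(creds) <- names(vars)
  # Treat unset and empty variables the same way.
  missing <- names(creds)[is.na(creds) | creds == ""]
  if (length(missing) > 0) {
    stop("Missing credentials: ", paste(missing, collapse = ", "))
  }
  as.list(creds)
}

# Only attempt authentication when AzureSMR is installed and credentials exist.
if (requireNamespace("AzureSMR", quietly = TRUE) &&
    nzchar(Sys.getenv("AZURE_TENANT_ID"))) {
  creds <- read_azure_credentials()
  context <- AzureSMR::createAzureContext(tenantID = creds$tenantID,
                                          clientID = creds$clientID,
                                          authKey  = creds$authKey)
}
```

Failing early on a missing variable keeps an incomplete credential set from reaching the authentication call.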
In addition to launching a single DSVM, the AzureDSVM package makes it easy to launch a cluster with multiple virtual machines. Multi-deployment supports:

creating a collection of independent DSVMs which can be distributed to a group of data scientists for collaborative projects, as well as
clustering a set of connected DSVMs for high-performance computation.

To create a cluster of 5 Ubuntu DSVMs with default VM size, use:
cluster <- deployDSVMCluster(context,
                             resource.group="example",
                             location="southeastasia",
                             hostnames="mydsvm",
                             usernames="myname",
                             pubkeys="pubkey",
                             count=5)

To execute a local script on a remote cluster of DSVMs with a specified Microsoft R Server compute context, use the executeScript function. (NOTE: only Linux-based DSVM instances are supported at the moment, because remote execution is carried out over SSH under the hood. Microsoft R Server 9.x allows remote interaction for both Linux and Windows; see the Microsoft R Server documentation for details.) Here, we use the RxForeachDoPar context, selected by setting the compute.context option to "clusterParallel":
executeScript(context,
              resource.group="example",
              machines="dsvm_names_in_the_cluster",
              remote="fqdn_of_dsvm_used_as_master",
              user="myname",
              script="path_to_the_script_for_remote_execution",
              master="fqdn_of_dsvm_used_as_master",
              slaves="fqdns_of_dsvms_used_as_slaves",
              compute.context="clusterParallel")
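For illustration, a script passed through the script argument might look roughly like the sketch below. It assumes that the compute context has already been set up on the remote side before the script body runs, so the script only needs to call rxExec from RevoScaleR (which ships with Microsoft R Server). The function names simulate_mean and run_tasks are hypothetical, and the lapply fallback is there only so the sketch can also be tried on a machine without Microsoft R Server.

```r
# Hypothetical worker function: one independent task per seed.
simulate_mean <- function(seed, n = 1000) {
  set.seed(seed)
  mean(rnorm(n))
}

# Distribute the tasks with rxExec when RevoScaleR is available;
# otherwise run them sequentially (fallback for local experimentation only).
run_tasks <- function(seeds) {
  if (requireNamespace("RevoScaleR", quietly = TRUE)) {
    RevoScaleR::rxExec(simulate_mean,
                       seed = RevoScaleR::rxElemArg(seeds))
  } else {
    lapply(seeds, simulate_mean)
  }
}

results <- run_tasks(1:4)
print(unlist(results))
```

Because each task sets its own seed, the results should not depend on which node ends up running which task.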

Consumption data and the resulting expense for a DSVM can be retrieved with:
consum <- expenseCalculator(context,
                            instance="mydsvm",
                            time.start="time_stamp_of_starting_point",
                            time.end="time_stamp_of_ending_point",
                            granularity="Daily",
                            currency="USD",
                            locale="en-US",
                            offerId="offer_id_of_azure_subscription",
                            region="southeastasia")

print(consum)
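The two time stamps above are placeholders. As a rough sketch, they could be generated with a small helper such as the one below, which returns a window ending at the most recent UTC midnight. The helper billing_window() is hypothetical, and the "YYYY-MM-DD HH:MM:SS" format is an assumption that should be checked against the expenseCalculator documentation.

```r
# Hypothetical helper: build a reporting window of whole days, ending at the
# most recent UTC midnight. The time stamp format is an assumption.
billing_window <- function(days_back = 7, now = Sys.time()) {
  end   <- as.POSIXct(format(now, "%Y-%m-%d", tz = "UTC"), tz = "UTC")
  start <- end - days_back * 86400   # whole days, expressed in seconds
  fmt   <- "%Y-%m-%d %H:%M:%S"
  list(time.start = format(start, fmt, tz = "UTC"),
       time.end   = format(end,   fmt, tz = "UTC"))
}

window <- billing_window(days_back = 7)

# Query only when the package is installed and a context object already exists.
if (requireNamespace("AzureDSVM", quietly = TRUE) && exists("context")) {
  consum <- AzureDSVM::expenseCalculator(context,
                                         instance = "mydsvm",
                                         time.start = window$time.start,
                                         time.end = window$time.end,
                                         granularity = "Daily",
                                         currency = "USD",
                                         locale = "en-US",
                                         offerId = "offer_id_of_azure_subscription",
                                         region = "southeastasia")
}
```

Aligning both ends of the window to midnight is a choice made in this sketch because it pairs naturally with granularity="Daily".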

Detailed introductions and tutorials can be found in the AzureDSVM GitHub repository, linked below.
GitHub (Azure): AzureDSVM https://github.com/Azure/AzureDSVM

from Revolutions http://blog.revolutionanalytics.com/2017/05/azuredsvm-a-new-r-package-for-elastic-use-of-the-azure-data-science-virtual-machine.html
