Sunday, January 5, 2014

Generating Dynamic R Scripts with WebFOCUS

WebFOCUS would have only a fraction of its power without its procedural scripting language, Dialogue Manager. Based on user input, a WebFOCUS procedure can use Dialogue Manager to rewrite its own code at run-time and behave as a completely different program.

This robust flexibility makes WebFOCUS a great complement to the R statistical programming language.

In Part I of this multi-part blog posting, I showed you a simple user interface for graphing data obtained from text mining of legacy reporting applications.

When automatically determining the complexity of a reporting procedure, I give the user the option of seeing the results graphically as well as in tabular form. The user can select BoxPlot, Histogram, Plot, or none of the above.



Based on the user's choice, the WebFOCUS procedure needs to generate the appropriate R code.

Here is the line of WebFOCUS code that created the Graph pull-down list:




At run-time, WebFOCUS will have a symbolic variable named &GRAPH containing one of four possible values: BOXPLOT, HISTOGRAM, PLOT, or NONE. If the user was not interested in visualizing the data and selected "NONE," we can skip all of the R logic.
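
As a rough sketch of that skip, a single Dialogue Manager branch is enough. The label name SKIP_R is my own placeholder for illustration and is not taken from the original procedure:

```
-* Sketch only: SKIP_R is a hypothetical label name.
-IF &GRAPH EQ 'NONE' THEN GOTO SKIP_R;
-* ... set the paths, write the R script, and call R here ...
-SKIP_R
-* The tabular report request continues here regardless of the graph choice.
```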

For the three R graph options, however, we need to provide R with several key pieces of information: 
  • Where to find the input file (stored in a WebFOCUS Reporting Server folder called "r_data")
  • Which R script to run (stored in a WebFOCUS Reporting Server folder called "r_scripts")
  • Where to store the output graph (we'll use the same "r_data" folder for input and output)

Visually, here is the interaction between WebFOCUS and R:



Below is the WebFOCUS logic for doing that. First, we set a symbolic variable named &COUNTPATH with the full path of the file containing named pairs of text scan values and their counts.

Next, we need to deal with the R script. We create a symbolic variable for the name of the R script (&SCRIPT) and provide a file definition so that WebFOCUS knows where to write the script (a file allocation called "RPROGRAM").

In the third step, we set &GRAPHPATH with the full path name of where R should write the output graph. So far, so good.
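
A minimal sketch of those three steps might look like the following. The script name and the "r_scripts" folder location come from this series, but the "r_data" folder location and the file names scan_counts.csv and scan_graph.png are assumptions made up for illustration. I use forward slashes in the two data paths because those values end up inside R string literals, where a backslash would start an escape sequence.

```
-* Sketch only: the r_data location and both data file names are assumptions.

-* Step 1: full path of the input file of value/count pairs
-SET &COUNTPATH = 'C:/ibi/apps/r_data/scan_counts.csv';

-* Step 2: name of the R script, plus an allocation telling -WRITE where it goes
-SET &SCRIPT = 'r_dynamic_script.r';
FILEDEF RPROGRAM DISK C:\ibi\apps\r_scripts\&SCRIPT
-* -RUN executes the stacked FILEDEF so the allocation exists before any -WRITE
-RUN

-* Step 3: full path where R should write the output graph
-SET &GRAPHPATH = 'C:/ibi/apps/r_data/scan_graph.png';
```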




Our next step is to create an R script that will use the mined counts from the text scanning to produce the user-selected graph.

Based on the user's graph selection, the WebFOCUS procedure will write one of three different R scripts. Note: I could have made this logic much more succinct and dynamic but I decided to make it more verbose and simpler to read for the purposes of this blog article.



In this step, we write statements to the file allocation "RPROGRAM" which points us back to the particular R script defined earlier as "r_dynamic_script.r" stored in the "r_scripts" folder.

The great thing about WebFOCUS is that, at run-time, it can dynamically change any aspect of the R scripts. My example here is fairly simple, with the only dynamic code being the names of the input and output files (&COUNTPATH and &GRAPHPATH, respectively).
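
To give a feel for the technique, here is a hedged sketch of the branching and the -WRITE statements. The WR_* and SKIP_R labels and the R statements themselves are my illustrative assumptions, not the original code. The sketch also assumes the input is a CSV file with a header row and a numeric column named COUNT.

```
-* Sketch only: labels, R statements, and the COUNT column are assumptions.
-IF &GRAPH EQ 'NONE' THEN GOTO SKIP_R;
-IF &GRAPH EQ 'BOXPLOT' THEN GOTO WR_BOXPLOT;
-IF &GRAPH EQ 'HISTOGRAM' THEN GOTO WR_HIST;
-* Anything else falls through to the PLOT script.
-WRITE RPROGRAM counts <- read.csv("&COUNTPATH")
-WRITE RPROGRAM png("&GRAPHPATH")
-WRITE RPROGRAM plot(counts$COUNT)
-WRITE RPROGRAM dev.off()
-GOTO RUN_RSCRIPT
-WR_BOXPLOT
-WRITE RPROGRAM counts <- read.csv("&COUNTPATH")
-WRITE RPROGRAM png("&GRAPHPATH")
-WRITE RPROGRAM boxplot(counts$COUNT)
-WRITE RPROGRAM dev.off()
-GOTO RUN_RSCRIPT
-WR_HIST
-WRITE RPROGRAM counts <- read.csv("&COUNTPATH")
-WRITE RPROGRAM png("&GRAPHPATH")
-WRITE RPROGRAM hist(counts$COUNT)
-WRITE RPROGRAM dev.off()
-RUN_RSCRIPT
-* The call to R goes here.
-SKIP_R
```

Because Dialogue Manager resolves &COUNTPATH and &GRAPHPATH before each line is written, the file on disk contains ordinary R code with literal path strings.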

After writing the proper R script, WebFOCUS will then go to a block I named "RUN_RSCRIPT."



Here is where WebFOCUS uses the "call_r_dynamic" FOCEXEC discussed in Part II. This call passes in the name of the R script (the value of symbolic variable &SCRIPT) so that R can run the proper code. Within those instructions, we have told R where to find its input file, how to generate the output graph, and where to store the results.
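
A sketch of such a block follows. How the original procedure invokes "call_r_dynamic" is not shown here, so the use of -INCLUDE is an assumption. I chose it because an included procedure shares the caller's local symbolic variables, including &SCRIPT.

```
-RUN_RSCRIPT
-* Sketch only: -INCLUDE is an assumed way of invoking call_r_dynamic.
-* As I understand it, -RUN also closes the file opened by -WRITE,
-* so R reads a complete script.
-RUN
-INCLUDE call_r_dynamic
```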

Continue to Part IV, where we take a closer look at the R scripts.

Wednesday, January 1, 2014

Using WebFOCUS to Call R

In this second of several articles on integrating WebFOCUS and R, I will show how the enterprise BI product from Information Builders can easily call the open-source R statistical programming language. If you have not already, go back and read Part I for an overview.

One simple way for WebFOCUS to call the R product is to use the Dialogue Manager scripting language and the SYSTEM function.

For this example, I created a WebFOCUS procedure named "call_r_dynamic" intended to be executed from within other WebFOCUS procedures. The calling procedure uses a symbolic variable named &SCRIPT to pass in the name of the R script to be run.




Using the Dialogue Manager -SET command, I create three symbolic variables that hold file system locations. Typically, symbolic variables provide dynamically changing values for parameters, but here I am using them mainly for readability.

They are:
  • &R_BINARY variable tells WebFOCUS where the R executable is stored
  • &R_LIBS variable tells WebFOCUS where R packages are stored
  • &R_HOME variable tells WebFOCUS where the R scripts are stored


Based on my Windows environment, here is the WebFOCUS code for my three folder variables:

-SET &R_BINARY = '"C:\Program Files\R\R-2.15.1\bin\i386\R"';
-SET &R_LIBS = 'C:\Users\Doug\Documents\R\win-library\2.15';
-SET &R_HOME = 'C:\ibi\apps\r_scripts';


Notice that if any of your folder names have spaces, you need to enclose them within double quotes. Otherwise, the procedure gets confused when it hits the blank and thinks the folder name ends there. Also notice that I store the R scripts in an application folder on the WebFOCUS Reporting Server (I will talk more about this in a later post).  

The name of the R script is a parameter from the calling program (for example, something like "r_dynamic_graph.R", including the file extension), so I dynamically set its full path as follows:

-SET &R_SCRIPT = &R_HOME || '\' || &SCRIPT;


With those four symbolic variables, I will put everything together into a single instruction call to R:

-SET &R_CALL = '&R_BINARY.EVAL HOME=&R_HOME.EVAL R_LIBS=&R_LIBS.EVAL --vanilla  < &R_SCRIPT.EVAL ';


At run-time, WebFOCUS will substitute all of my values into a string such as this:

"C:\Program Files\R\R-2.15.1\bin\i386\R" HOME=C:\ibi\apps\r_scripts R_LIBS=C:\Users\Doug\Documents\R\win-library\2.15 --vanilla  < C:\ibi\apps\r_scripts\r_dynamic_graph.R


This command string calls R and tells it to use my R libraries. Notice the start-up option "--vanilla", which tells R to start cleanly: it neither restores nor saves a workspace, and it skips the site and user profile and environment files. Also important is the less-than sign (<), which redirects the script file I named into R as its input.
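
When debugging, it can help to echo the resolved command before running it. As a small sketch, a -TYPE line placed right after the -SET for &R_CALL would display the substituted string:

```
-* Sketch only: show the fully substituted command string in the output.
-TYPE R command: &R_CALL
```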

I still need to execute this command string, which is where the WebFOCUS SYSTEM function comes into play. This will pass the command string to the underlying operating system and send back a return code showing success or failure.

The WebFOCUS SYSTEM function takes three parameters: the length of the command string, the command string itself, and the format of the return code:

SYSTEM(commandstring_length, commandstring, returncode_format);


Here is my actual WebFOCUS Dialogue Manager code using the SYSTEM function to call R: 

-SET &&RETCODE = SYSTEM(&R_CALL.LENGTH,&R_CALL,'D4');


I put the return code from SYSTEM into a global symbolic variable called &&RETCODE so that the calling procedure can verify everything worked.
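
As a hedged sketch, a calling procedure could test that value as shown below. It assumes that a return code of 0 means success, the usual operating-system convention, so confirm this on your own platform. The labels R_FAILED and R_DONE are hypothetical names, and -INCLUDE is just one assumed way to invoke the procedure.

```
-* Sketch only: assumes 0 means success; labels are hypothetical.
-SET &SCRIPT = 'r_dynamic_graph.R';
-INCLUDE call_r_dynamic
-IF &&RETCODE NE 0 THEN GOTO R_FAILED;
-TYPE R finished normally.
-GOTO R_DONE
-R_FAILED
-TYPE R call failed with return code &&RETCODE
-R_DONE
```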

In Part III of this article, I will show how to use WebFOCUS to generate and run dynamic R scripts. If you have any questions, contact me. 

About Me

I am a project-based consultant, helping data-intensive firms use agile methods and automation tools to replace legacy reporting and bring in modern BI/Analytics to leverage Social, Cloud, Mobile, Big Data, Visualizations, and Predictive Analytics. For several world-class vendors, I led services teams specializing in providing software implementation and custom application development. Based on scores of successful engagements, I have assembled proven methodologies and automated software tools.

During twenty years of technical consulting, I have been blessed to work with smart people from some of the world's most respected organizations, including: FedEx, Procter & Gamble, Nationwide, The Wendy's Company, The Kroger Co., JPMorgan Chase, MasterCard, Bank of America Merrill Lynch, Siemens, American Express, and others.

I was educated at Valparaiso University and the University of Cincinnati, graduating summa cum laude. In 1990, I joined Information Builders, the vendor of WebFOCUS BI and iWay enterprise integration products, and for over a dozen years served in branch leadership roles. For several years, I also led technical teams within Cincom Systems' ERP software product group and the custom software services arm of Xerox.

Since 2007, I have provided enterprise BI services such as: strategic advice; architecture, design, and software application development of intelligence systems (interactive dashboards and mobile); data warehousing; and automated modernization of legacy reporting.