A stumbling
block for businesses trying to modernize legacy computer applications is the sheer
volume of program files. An IT organization may own thousands of code libraries,
each with thousands of programs. Often, the whereabouts of the original developers of old applications are unknown.
I have found this to
be especially true for legacy end-user 4GL reporting tools such as FOCUS.
A
computer language developed by Information Builders in the mid-1970s, FOCUS
became the industry standard as a multi-platform report writer for business
end-user communities. With FOCUS, rather than ask the busy IT organization to develop
reports, users could build their own.
But instead of being
just a report writer, FOCUS was in reality a full application development
environment originally designed to replace COBOL. Many enterprising users took advantage
of robust features such as online screens, database maintenance, and batch processing to build very sophisticated systems.
Two or three decades later,
many IT shops now struggle to grasp what their FOCUS users developed. Trying
to assess the purpose, functionality, usage, and complexity of these legacy
applications by manually looking at each program is nearly impossible.
To assist with this type of time-consuming detective work, Partner Intelligence developed text
analytics software called the "BI Consolidator." Written in C/C++
with a web browser graphical user interface as well as a command-line batch processor, the application has two main
features: 1) automated textual discovery; and 2) automated translation into a
new BI product.
For now, let us consider only the automated textual discovery feature, called the
"scanner."
Text Scanning
Computer programs are
not completely unstructured like an e-mail message or the prose found inside a Word
document. Instead, almost all computer programs follow a particular formal
syntax which forces them to be at least semi-structured text. This simplifies textual
analytics since we know what to expect (for the most part, anyway, as there can be user
syntax errors and a fair share of junk).
Our textual analytic
scanner is smart enough to figure out the code dialect, but we provide it
with some starting instructions. For example, we can tell the application to
perform a very specific scan such as looking for FOCUS-to-WebFOCUS conversion issues, FOCUS metadata
to find data formats, SAS statistical features, JCL batch job features, HTML
legacy CGI calls, Crystal Reports features, or to parse SQL commands.
When in a
curious mood, we can perform custom ad-hoc textual searches.
While the results
pulled from the text can be just displayed on a screen, it is more useful to
save these to a database and later analyze the answer set.
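As a rough illustration of saving scan results for later analysis, here is a minimal Python sketch using an in-memory SQLite database. It is not the BI Consolidator's actual storage layer; the `scan_hits` table, its columns, and the sample library and program names are all hypothetical.

```python
import sqlite3

def save_hits(conn, hits):
    """Store (library, program, keyword, count) rows in a hypothetical scan_hits table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scan_hits ("
        "library TEXT, program TEXT, keyword TEXT, hit_count INTEGER)"
    )
    conn.executemany("INSERT INTO scan_hits VALUES (?, ?, ?, ?)", hits)
    conn.commit()

def keyword_totals(conn):
    """Total hits per keyword across all scanned programs."""
    rows = conn.execute(
        "SELECT keyword, SUM(hit_count) FROM scan_hits "
        "GROUP BY keyword ORDER BY keyword"
    )
    return dict(rows.fetchall())

# Made-up sample rows standing in for real scanner output.
conn = sqlite3.connect(":memory:")
save_hits(conn, [
    ("PROD.FOCEXEC", "SALESRPT", "TABLE", 4),
    ("PROD.FOCEXEC", "CUSTUPD", "MODIFY", 2),
    ("TEST.FOCEXEC", "SALESRPT", "TABLE", 1),
])
totals = keyword_totals(conn)
print(totals)
```

Once the hits are in a table, any follow-up question (which libraries use a feature, which programs use none of them) becomes an ordinary query rather than another pass over the source files.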
Online GUI and Batch Text Scanning
We started with a GUI front-end, but when working with a large number of libraries it quickly becomes
tedious to repeatedly point, click, and run. As a result, we modified the Scan
program so that it can also be run from the command line using a batch script.
Not only is batch mode easier to use, but the scanner also runs much faster because it no longer generates HTML to display results in the browser. On our current engagement, we scan about 200 mainframe libraries containing over 80,000 programs within 15 minutes.
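To give a feel for what a batch pass involves, here is a small Python sketch that walks a directory tree and counts keywords in every file it finds. It assumes the mainframe libraries have already been downloaded as folders of text files; the function name `scan_libraries`, the folder layout, and the sample program are hypothetical, and the plain substring count is deliberately naive.

```python
import os
import tempfile

def scan_libraries(root, keywords):
    """Walk every file under root and count keyword occurrences per file."""
    results = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            # Legacy files may contain odd bytes, so replace anything undecodable.
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read().upper()
            counts = {k: text.count(k) for k in keywords}
            results[os.path.relpath(path, root)] = counts
    return results

# Build a throwaway one-program library so the sketch runs on its own.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "LIB1"))
    with open(os.path.join(root, "LIB1", "RPT1.fex"), "w", encoding="utf-8") as f:
        f.write("TABLE FILE SALES\nPRINT AMOUNT\nEND\n")
    batch_results = scan_libraries(root, ["TABLE", "MODIFY"])

print(batch_results)
```

Because nothing is rendered for display, the cost of a run like this is essentially reading the files once, which is why a batch pass scales to large numbers of libraries.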
Keyword Frequencies
For many of the scans,
the software performs keyword frequency counts. For example, to evaluate
conversion issues related to green-screen application development, the scanner
searches the text for a variety of FOCUS keywords whose presence or absence
would be significant:
- MAINTAIN
- -WINDOW, -CRTFORM, -PROMPT, -FULLSCR
- CRTFORM, FIDEL, FI3270 (used within MODIFY)
- PFKEY, SET PF
To help with the
accuracy of the scanning, we can apply a variety of criteria on searches such
as:
- Perform case-sensitive search (or uppercase all text first)
- Perform stand-alone search (or allow the token to be embedded within a string)
- Ignore blanks between search tokens (since developers often format code using spaces between words)
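The three criteria above can be sketched with a regular expression. This is a minimal Python illustration, not the scanner's real implementation; the function `count_keyword`, its parameter names, and the sample code fragment are assumptions made for the example. Here "stand-alone" is taken to mean the match is not glued to a letter, digit, underscore, or leading hyphen, so that CRTFORM and -CRTFORM are counted separately.

```python
import re

def count_keyword(text, keyword, case_sensitive=False,
                  stand_alone=True, ignore_blanks=True):
    """Count occurrences of a keyword (possibly several tokens) in text."""
    tokens = keyword.split()
    # With ignore_blanks, any run of whitespace may separate the tokens.
    joiner = r"\s+" if ignore_blanks else " "
    pattern = joiner.join(re.escape(t) for t in tokens)
    if stand_alone:
        # Reject matches embedded in a longer word; the hyphen in the
        # lookbehind keeps CRTFORM from matching inside -CRTFORM.
        pattern = r"(?<![A-Za-z0-9_-])" + pattern + r"(?![A-Za-z0-9_])"
    flags = 0 if case_sensitive else re.IGNORECASE
    return len(re.findall(pattern, text, flags))

# Made-up fragment mixing a Dialogue Manager screen and a MODIFY screen.
sample = """-CRTFORM
"ENTER REGION <REGION"
MODIFY FILE SALES
CRTFORM
"AMOUNT <AMOUNT"
set   pf03 = END
"""

print(count_keyword(sample, "-CRTFORM"))
print(count_keyword(sample, "CRTFORM"))
print(count_keyword(sample, "SET PF", stand_alone=False))
```

The last call shows why the options matter: the fragment spells the command in lowercase with extra spaces and a key number attached, so it is only found with case folding on, blanks ignored, and embedded matches allowed.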
Pattern Recognition
Using the results of
the keyword searches, we can group specific ones together to help identify a
pattern of usage within the application. For example, if we group keywords found during a legacy FOCUS 4GL scan, we should recognize one or more of the following archetypes:
- Reporting App = high number of TABLE (report) requests but few MODIFYs (database updates)
- Online Reporting App = Reporting App with high number of -CRTFORMs (menu screens) or -PROMPTs
- Online Maintenance App = MODIFYs, CRTFORMs (transactional screens), and PFKEY usage
- Batch Maintenance App = MODIFYs with FIXFORM/FREEFORM (transactions) instead of screens
- Multi-Step Batch Job = JCL with various FOCUS and non-FOCUS steps (which implies this application may be difficult to port to a new platform)
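A simple rule-based classifier conveys the idea. The Python sketch below covers the four FOCUS-only archetypes from the list above; the thresholds `HIGH` and `FEW`, the function name `classify`, and the shape of the `counts` dictionary are assumptions chosen for illustration, and the Multi-Step Batch Job case is left out because it depends on JCL scan results.

```python
HIGH = 5  # assumed cutoff for a "high number" of occurrences
FEW = 1   # assumed cutoff for "few" occurrences

def classify(counts):
    """Return the archetype labels suggested by one program's keyword counts."""
    def n(key):
        return counts.get(key, 0)

    labels = []
    # Reporting: many TABLE requests, few MODIFY requests.
    if n("TABLE") >= HIGH and n("MODIFY") <= FEW:
        if n("-CRTFORM") + n("-PROMPT") >= HIGH:
            labels.append("Online Reporting App")
        else:
            labels.append("Reporting App")
    # Maintenance: MODIFY driven either by screens or by transaction files.
    if n("MODIFY") > 0:
        if n("CRTFORM") > 0 and n("PFKEY") > 0:
            labels.append("Online Maintenance App")
        elif n("FIXFORM") + n("FREEFORM") > 0 and n("CRTFORM") == 0:
            labels.append("Batch Maintenance App")
    return labels

print(classify({"TABLE": 12}))
print(classify({"MODIFY": 3, "CRTFORM": 3, "PFKEY": 2}))
```

A program can earn more than one label, which matches the "one or more" wording above; in practice the cutoffs would need tuning against applications whose purpose is already known.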
Textual Parsing
For some textual
analytics, we actually need to parse the semi-structured code and pull out more
than just keywords. For example, we often find SQL (structured query language)
embedded within reporting applications. Being structured, SQL follows a strict
syntax of blocks of code in a specific order of: SELECT; FROM; WHERE; GROUP BY;
HAVING; ORDER BY.
This makes it possible
to parse the syntax and extract the names of databases, tables, and columns being used in the application. We can also distinguish between the columns
shown on the report versus those being used in the selection criteria or for sorting
and aggregation.
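Because the clauses arrive in a fixed order, even a small amount of code can split a statement apart. The Python sketch below is a deliberately simplified illustration, not a full SQL parser: it assumes a single SELECT with no subqueries and no clause keywords hidden inside string literals. The names `split_clauses` and `summarize`, and the sample query, are hypothetical.

```python
import re

CLAUSES = ["SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY"]

# Matches any clause keyword, tolerating extra whitespace inside "GROUP BY".
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(c.replace(" ", r"\s+") for c in CLAUSES) + r")\b",
    re.IGNORECASE,
)

def split_clauses(sql):
    """Split a simple SELECT statement into clause bodies keyed by keyword."""
    parts = KEYWORD_RE.split(sql.strip().rstrip(";"))
    # parts = [text before first keyword, kw1, body1, kw2, body2, ...]
    clauses = {}
    for keyword, body in zip(parts[1::2], parts[2::2]):
        name = " ".join(keyword.upper().split())
        clauses[name] = " ".join(body.split())
    return clauses

def summarize(sql):
    """Pull out table names plus the columns used for filtering and sorting."""
    c = split_clauses(sql)
    return {
        "tables": [t.split()[0] for t in c.get("FROM", "").split(",") if t.strip()],
        "filter_columns": re.findall(
            r"\b([A-Za-z_]\w*)\s*(?:<>|<=|>=|=|<|>)", c.get("WHERE", "")),
        "sort_columns": [s.split()[0] for s in c.get("ORDER BY", "").split(",") if s.strip()],
    }

sample_sql = """
SELECT REGION, SUM(AMOUNT)
FROM SALES S
WHERE YEAR = 2010 AND STATUS <> 'X'
GROUP BY REGION
ORDER BY REGION;
"""

clauses = split_clauses(sample_sql)
summary = summarize(sample_sql)
print(clauses["SELECT"])
print(summary)
```

Keeping the SELECT body separate from the WHERE and ORDER BY bodies is what lets the analysis tell displayed columns apart from those used only for selection or sorting.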
Standard Content Analysis
With these textual
contents extracted and stored inside a database, we can then perform standard
reporting as well as custom queries. For example, one well-known client used
the scan results to perform a redundancy analysis of their Business Objects environment
and evaluate replacing it with a new web-based solution.
The business sponsor was completely against a
one-to-one conversion of these legacy reports. Instead, from the scanned
contents of thousands of reports and SQL files, the client was able to identify
commonalities and reporting redundancies which enabled them to categorize their
BI needs into a dozen buckets. From there, they built a roadmap for replacing
their legacy reporting environment with a collection of highly dynamic
reporting solutions.
In addition to analytics, we have standard reports that help with the operational aspects of a modernization initiative, such as parallel test plans.
Building a Textual Analytics Engine
When companies need to
modernize an application, they often view it as a one-time activity. With this
mindset, they might not invest the time and money to build this type of textual
analytics scanner and translator. Because we work with a variety of clients
with this common need, it made sense for Partner Intelligence to create a reusable tool such as the BI Consolidator.
This application has evolved over time. When we first developed it, it handled SQL-based legacy tools. After that, we enhanced it for the NOMAD and FOCUS 4GLs. Since then, we have added features for a variety of products such as SAS, QMF/SQL, Oracle Portal, and SAP Business Objects (Crystal Reports, Deski, and Webi).
In addition to the reporting tools, we have added features for handling complementary technologies such as metadata schemas, HTML web pages, and mainframe job control language (JCL).
If you are interested
in learning more, I would be happy to discuss the details of our software with
you. Contact me at DLautzenheiser at PartnerPS dot com.
You may also be interested in these articles:
- Preparing for FOCUS-to-WebFOCUS Conversions
- Converting the NOMAD 4GL to WebFOCUS
- Convert FOCUS Batch JCL Jobs for WebFOCUS
- Automatically Modernize QMF/SQL to WebFOCUS

1 comment:
NASA's Space Shuttle program still uses a large amount of 1970s-era technology. Replacement is cost-prohibitive because of the expensive requirement for flight certification; the legacy hardware currently being used has completed the expensive integration and certification requirement for flight, but any new equipment would have to go through that entire process – requiring extensive tests of the new components in their new configurations – before a single unit could be used in the Space Shuttle program. This would make any new system that started the certification process a de facto legacy system by the time of completion.
Thanks
Michael