Nathan M. Andelin December 2016

Why ILE RPG? - (Part 2)

In light of mainstream language environments that are available on IBM i platforms, Part 1 in this series addressed the question of "why ILE RPG?" from the perspective of business managers who are interested in reducing costs by reducing server footprint and increasing return on investment.

There are a number of other perspectives that should be considered:

A perspective that notes distinctions between IBM i and its native virtual machine environment vs. PASE (a subset of AIX) and the various language environments that run therein. IBM i and its native virtual machine environment offer some distinct advantages for certain types of workloads that are worth reviewing.
A perspective that considers dividing applications into parts, some of which may run in mainstream language environments, while other parts address database I/O, data validation, business rules, and database events (inserts, updates, deletes, reads).
A perspective that includes a review of functional requirements and developer productivity, including API frameworks that may be available in other language environments that are not part of the RPG language, and RPG language elements that may not be available in other language environments.
A perspective that looks to support multiple platforms, say Windows and Linux in addition to IBM i. Independent software vendors and managers of IBM i platforms may view platform agnosticism as desirable to reduce dependency on IBM and IBM i-centric solution providers.
A perspective that considers the cost, complexity, and risks associated with maintaining multiple language environments and multiple skill sets.

I'm dedicating this piece to the first perspective listed above. I hope to address the other perspectives in future articles.

There may also be perspectives that I'm not fully aware of. Post a note on LinkedIn or Midrange Lists if you'd like me to consider or review another perspective.

IBM i Runtime Environments

The IBM i platform includes a runtime environment for AIX applications known as PASE, which has made it practical as well as feasible to port mainstream language environments to Power platforms and integrate them with the IBM i native virtual machine environment. I view this in a positive light as delineated in my earlier piece. IBM has provided multiple integration points between IBM i and PASE.

Notwithstanding the above, Frank Soltis, in his book Fortress Rochester, depicts, diagrams, and describes PASE as a virtual machine environment for AIX applications that is separate from the virtual machine environment that hosts IBM i and IBM i applications. He delineated other distinctions that should be familiar:

Technology Independence.
Object-based Design.
Hardware Integration.
Software Integration.
Single-Level Store.

Hardware integration is mostly a moot point, as both IBM i and PASE run on precisely the same hardware. One relevant distinction is that IBM i runs in a CPU mode named Amazon 64-bit, while PASE applications may run in CPU modes named 32-bit and 64-bit PowerPC respectively (most PASE applications run in 32-bit mode).

PASE integration routinely results in CPUs switching modes between 32-bit PowerPC (for PASE resources/applications) and 64-bit Amazon (for native IBM i resources/applications). The overhead associated with CPU mode switching is not well documented.

Regarding technology independence, IBM acknowledges that language environments that run in PASE may need to be ported again to future hardware, while applications that run in the native virtual machine would not even require recompilation.

IBM i applications run in a "global" memory address space known as single-level store, while PASE utilizes an address space known as teraspace, where each active PASE JOB includes a "private" address space of up to 1 terabyte in size.

The bottom line is that it is appropriate to view PASE and the IBM i native virtual machine as separate virtual machine environments with a number of integration points. They are containers (platforms) in and of themselves. PASE is invoked via IBM i JOBS that hold references to PASE call-stack entries.

Language environments that run in PASE are virtual machine environments too - running within a virtual machine environment. An astute observer noted "Java is not platform independent; Java IS A PLATFORM!". The same is essentially true for all the language environments that have been ported to PASE - language environments are essentially virtual machine environments within a virtual machine environment.

Runtime Distinctions

When referring to language environments in this article that run in PASE, I'm referring to Java, PHP, Python, Ruby, Perl, and Node.js - in contrast to languages that run in the IBM i native virtual machine environment (e.g. ILE RPG, COBOL, C, CL, and SQL).

The biggest distinction between the IBM i native virtual machine environment and the virtual machine/language environments that run in PASE is the number of processes (active JOBS) serviced by the hardware. It's not uncommon to see 10,000+ active JOBS running smoothly in the native virtual machine environment even on 4-core Power Systems.

In contrast, the number of active JOBS running in PASE is comparatively small. For example, the number of JVM instances used in Java benchmarks has historically correlated with the number of CPU cores on benchmark servers. There has been an especially strong correlation on Power Systems (e.g. 4-core Power Systems might run 4 JVM JOBS in benchmark configurations). Higher numbers of JVM JOBS do not yield higher throughput.

PHP on IBM i typically runs in a modest number of persistent (FastCGI) worker JOBS that are linked to the IBM i HTTP server. Running too many instances of PHP (e.g. 200+ in some reports) has been known to overwhelm system resources and crash servers.

Similarly, Node.js and Ruby advocates tend to recommend keeping the number of runtime instances (JOBS) low.

See the distinction? The IBM i native virtual machine is an environment that smoothly supports the running of thousands or tens of thousands of light-weight processes. Contrast that with individual processes that each run an AIX virtual machine environment plus a mainstream language environment plus scripts or byte-code that run in that language environment.

Single Level Store

Frank Soltis describes one of the benefits of the IBM i single-level store in a story referenced in his book titled Fortress Rochester, which I'd like to summarize.

A team from IBM Research (that was responsible for the Power Processor) was visiting Rochester to see how OS/400 might be run on the chip. The team in Rochester explained that OS/400 needed to switch tasks at intervals of about 1,200 instructions. The team from IBM Research said that was impossible, as it typically required 1,000 instructions to perform a task switch. IBM Research asserted that no work would get done with such frequent task switching. The Rochester team explained that single-level store made it possible to perform task switching with relatively few instructions.
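The arithmetic behind the researchers' objection can be sketched roughly as follows. The only figures taken from the story are the 1,200-instruction interval and the 1,000-instruction switch cost. The function name, the assumption that the switch cost is paid on top of each 1,200-instruction slice, and the 100-instruction comparison figure are mine, chosen purely for illustration.

```python
def switch_overhead(slice_instructions, switch_instructions):
    """Fraction of all executed instructions that go to task switching.

    Assumes the switch cost is paid in addition to each slice of useful work.
    """
    total = slice_instructions + switch_instructions
    return switch_instructions / total


# Figures quoted in the story: a switch about every 1,200 instructions,
# at about 1,000 instructions per conventional task switch.
conventional = switch_overhead(1200, 1000)
print(f"{conventional:.0%} of instructions spent switching")
# prints "45% of instructions spent switching"

# Hypothetical figure, NOT from the book: a much cheaper switch.
cheaper = switch_overhead(1200, 100)
print(f"{cheaper:.0%} of instructions spent switching")
```

Under these assumptions nearly half of the processor's work would go to switching rather than to applications, which is why the cost of a task switch matters so much at this switching frequency.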

Efficient task switching is a key reason why the IBM i native virtual machine environment is able to manage large numbers of concurrent JOBS.

What sort of workload would benefit from a runtime environment that is able to efficiently manage thousands or even tens of thousands of active JOBS? What about applications that maintain database and application state for thousands of concurrent users?

Stateful Interfaces

Stateful application and database interfaces are a requirement of business systems. A record submitted by a user of an interactive DB maintenance program might need to be compared to the record currently on file, which in turn might need to be compared to the record as it stood just prior to the "edit" activity. Interactive database navigation requires persistent (stateful) database connections. Multi-page dialogs involve the collection and retention of "state" from form to form.

The question is often not whether application and database state must be maintained, but rather how to do it. The IBM i native virtual machine provides the option of launching an individual JOB for each user who may select a menu item, where the JOB automatically maintains both application and database state for that user and menu-item selection.

If you were to start a JOB that included a PASE runtime, plus a language environment, plus application code, for every user in order to maintain application and database state, you would quickly overwhelm system resources. The inability to maintain application and database state in a process (JOB) for individual users due to inordinate resource consumption is arguably a fundamental architectural flaw.

The way that PASE-based language and application environments address this problem is by maintaining pools of connections with JOBS that do run in the native virtual machine environment, which do maintain database state for individual end-users. This interface tends to be abused in practice due to developers failing to release resources pertaining to one application as users navigate to or launch another.
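The release problem described above can be illustrated with a minimal pool sketch. Everything here is hypothetical: the `ConnectionPool` class, its `borrow` method, and the string placeholders that stand in for real connections to native JOBS. It is not the API of any IBM i toolkit. The point is only that a pool stays healthy when every borrowed connection is handed back, even if the application code fails partway through.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool; strings stand in for real connections."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for n in range(size):
            self._idle.put(f"connection-{n}")  # placeholder handle

    @contextmanager
    def borrow(self, timeout=1.0):
        # Raises queue.Empty if every connection is still held by someone.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even when the caller fails.
            self._idle.put(conn)

    def idle_count(self):
        return self._idle.qsize()


pool = ConnectionPool(size=2)
with pool.borrow() as conn:
    idle_while_borrowed = pool.idle_count()  # one connection is in use
idle_after_release = pool.idle_count()       # both are available again
```

Code that takes a connection without a guaranteed release path leaves the pool one connection short each time a user wanders off to another application, which is the abuse pattern noted above.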

For interactive applications, alternatives exist for maintaining application state in PASE-based language / application environments. The most common is for developers to "save" session data after each request and to "restore" it prior to processing the next. Save and restore operations typically mean database persistence and retrieval, which is resource intensive for PASE-based language environments.
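A minimal sketch of that save/restore cycle follows. It uses an in-memory SQLite table and JSON purely as stand-ins for whatever session store an application would really use. The table layout and the function names (`restore_session`, `save_session`, `handle_request`) are assumptions made for illustration, not part of any framework mentioned here.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory SQLite table standing in for a session table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions (session_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def restore_session(conn, session_id):
    """Load the saved state for a session, or an empty dict for a new one."""
    row = conn.execute(
        "SELECT state FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


def save_session(conn, session_id, state):
    """Persist the state so any worker can pick it up on the next request."""
    conn.execute(
        "INSERT OR REPLACE INTO sessions (session_id, state) VALUES (?, ?)",
        (session_id, json.dumps(state)),
    )
    conn.commit()


def handle_request(conn, session_id, form_fields):
    """One stateless request: restore, apply the submitted form, save."""
    state = restore_session(conn, session_id)
    state.update(form_fields)
    save_session(conn, session_id, state)
    return state


conn = open_store()
handle_request(conn, "user-1", {"customer": "ACME"})        # page 1 of a dialog
final_state = handle_request(conn, "user-1", {"qty": 3})    # page 2 of the dialog
```

Note that every request pays for one read and one write of the whole session, regardless of how little of the state it actually touches.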

Database Access

A noteworthy distinction between the IBM i integrated language environment and language environments that run in PASE is that the former runs in the same address space (single-level store) and has direct access to the IBM i DBMS.

Language environments that run in PASE must use some form of inter-process communication (e.g. sockets) to communicate with JOBS that run in the IBM i native virtual machine environment in order to access the IBM i DBMS.

By "must", I mean for all practical purposes, and using the standard interfaces provided in the respective language environments. Some may argue that C programs running in PASE could (with some constraints) communicate with ILE programs by using teraspace APIs. However, using shared memory APIs is still a form of inter-process communication.

The point is that only languages that run in the native virtual machine environment have direct access to the IBM i DBMS, which is a very meaningful and distinct advantage. People who advocate dropping ILE languages altogether are offering a myopic perspective that unnecessarily handicaps IBM i development teams and the applications that they can produce.
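To make the extra hop concrete, here is a toy round trip over a socket pair. It is not the protocol used by any IBM i database interface: the JSON message format, the `database_job` function, and the made-up result row are all assumptions, and a thread plays the part of the second JOB only to keep the sketch self-contained. What it shows is that each database request has to be serialized, sent across a boundary, and deserialized on both sides.

```python
import json
import socket
import threading


def database_job(sock):
    """Stand-in for a native JOB: read one request, send one reply."""
    with sock:
        request = json.loads(sock.recv(4096).decode("utf-8"))
        reply = {"rows": [[request["key"], "ACME"]]}  # made-up result row
        sock.sendall(json.dumps(reply).encode("utf-8"))


# A connected pair of sockets: one end per side of the boundary.
client_sock, server_sock = socket.socketpair()
worker = threading.Thread(target=database_job, args=(server_sock,))
worker.start()

with client_sock:
    # The "PASE side": serialize the request, send it, wait for the reply.
    client_sock.sendall(json.dumps({"key": 42}).encode("utf-8"))
    reply = json.loads(client_sock.recv(4096).decode("utf-8"))
worker.join()
```

A program with direct access to the database skips the serialization, the copy across the boundary, and the wait for another JOB to be scheduled.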

Wrapping Up

This piece may have said little about the ILE RPG language. That was intentional. My reasons for advocating ILE RPG have less to do with language syntax and functionality, and more to do with the fact that it produces light-weight code that runs in the IBM i native virtual machine environment.

To me, the ability to run thousands or tens of thousands of JOBS in a virtual machine environment that provide stateful application and database interfaces for that many users concurrently has high value. To me, this is reassuring as well as refreshingly distinct from mainstream language and virtual machine environments.