I still have my “I Love VM” bumper sticker
🖥 🖥 🖥
I still have my “I Love VM” bumper sticker, where VM is not the VM of VMware. (No offense intended to the VMware folks.) The VM that I fell in love with is the VM that dates back to the early 1970s. (Yup, back in the previous century when I was moving from a career as an IBM Computer Operator with IBM Product Test in Kingston to becoming a programmer for IBM.) According to Wikipedia, “The first version, released in 1972, was VM/370, or officially Virtual Machine Facility/370.”
To understand the beauty of VM (better known these days as z/VM®), a bit of history on how VM evolved from its early days to its place in the current IT landscape might help.
>>> So here it is… with some poetic license! <<<
Actually, here is my version as I recall it. (Note: I served as a technical instructor/course developer for IBM Kingston Site Education [Yes Virginia, there was an IBM in Kingston for decades] and eventually Mid-Hudson Valley Education. Although I was considered an instructor assigned to the Kingston and Poughkeepsie Sites, I spent a lot of time on the road teaching VM Internals (that is, the data areas and logic of the CP or Control Program) at various IBM locations as well as at NASA.))) The extra parentheses have been inserted here just to make sure I still have your attention.
The flow of instruction execution in the IBM System/370 (as well as the preceding System/360) was controlled by something called the Program Status Word (PSW).
The PSW contained two things of interest in this discussion: the address of the Next Sequential Instruction (NSI) and the Problem State (P) bit.
In concept, as the current instruction was executing, the NSI pointed to the next instruction to be fetched and subsequently executed (unless the current instruction caused a branch that changed the flow of execution). Likewise, the P bit indicated the state of the processor: 1 for problem state and 0 for supervisor state. The setting of the P bit limited or enabled (based on being set to 1 or 0) the set of instructions that could be executed.
When the machine was in problem state, only instructions such as add, subtract, compare, branch and so on could be executed. This prevented a problem state program from performing system or supervisor-like functions, such as I/O or dispatching work. It was a means to protect the system from application programs as well as to protect one program from another. When combined with the use of storage keys, this arrangement provided a reasonable amount of security to all parties. But that is another story.
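For readers who think better in code, here is a minimal Python sketch of the idea. It is a toy model, not the real architecture: the `PSW` class, the `execute` function, the fixed 4-byte instruction length and the short list of privileged opcodes are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Illustrative subset only; not the complete System/370 privileged instruction list.
PRIVILEGED = {"LPSW", "SIO", "SSK"}


class PrivilegedOperation(Exception):
    """Stands in for the program interrupt raised by a privileged-operation exception."""


@dataclass
class PSW:
    problem_state: bool     # the P bit: True = problem state, False = supervisor state
    next_instruction: int   # address of the Next Sequential Instruction (NSI)


def execute(psw: PSW, opcode: str, length: int = 4) -> None:
    """Refuse privileged opcodes in problem state; otherwise step the NSI forward."""
    if opcode in PRIVILEGED and psw.problem_state:
        raise PrivilegedOperation(opcode)
    psw.next_instruction += length


app = PSW(problem_state=True, next_instruction=0x1000)
execute(app, "ADD")            # ordinary instruction: allowed in problem state
try:
    execute(app, "LPSW")       # privileged instruction: refused in problem state
    trapped = False
except PrivilegedOperation:
    trapped = True
```

The same `LPSW` call with `problem_state=False` would simply run, which is the whole point of the P bit.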
>>> Binary Setting of the P could not support VM states <<<
The creation or execution of virtual machines required the capacity to present a similar problem-versus-supervisor state relationship between the virtual environment and the Control Program; basically, three states were needed, which are described further below. However, since the PSW had only one P bit, a certain amount of juggling was required by the VM Control Program (CP, or Hypervisor if you prefer). From this point on, I will refer to the software that enables the concept of virtual machines in this environment simply as CP.
The CP was the first thing IPL’ed (Initial Program Load) and thus had total control of all resources. (Note: this might scare any reader who only has z/OS background. Wait a moment and we will see where your operating system fits into this environment.) Thus, the CP ran in Supervisor Mode (the P bit was set to indicate Supervisor State). In this state, CP could perform I/O, manage resources such as memory and all real devices (to which I/O would be performed) and of course CP would “dispatch” or give control to virtual machines using the LPSW (Load Program Status Word) instruction.
>>> This is where it becomes interesting! <<<
The CP could not give control to virtual machines in supervisor state, because doing so would give them (the virtual machines) total control of the REAL machine. That would not be good!
So CP gave control to the virtual machines in problem state, per the P bit. Now you might be saying to yourself (and hopefully not out loud) “Wait… how can we give control to an operating system such as z/OS running in a virtual machine in problem state, when that operating system clearly needs to perform supervisory functions in supervisor state on behalf of the applications it supports?!?” Read on; the mystery will be solved!
With the birth of CP (CP-67… VM/370), the following states were also born:
- Real Supervisor State (The state in which CP would run.)
- Virtual Supervisor State (This is the state of the Guest Operating System such as z/OS.)
- Virtual Problem State (This is the state that an application would run in within the Guest such as an application running in a z/OS address space.)
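One way to picture how three states come out of a single hardware bit is to pair the real P bit with a second, software-kept "virtual" P bit. The sketch below is my own illustration; the `State` enum and the `effective_state` function are hypothetical names, not anything taken from CP.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    REAL_SUPERVISOR = "CP"
    VIRTUAL_SUPERVISOR = "guest operating system"
    VIRTUAL_PROBLEM = "guest application"


def effective_state(real_p: int, virtual_p: Optional[int] = None) -> State:
    """Combine the hardware P bit with the P bit of the guest's saved virtual PSW."""
    if real_p == 0:
        # Only CP ever runs with the real P bit off.
        return State.REAL_SUPERVISOR
    # Every guest runs with the real P bit on; the virtual bit tells CP
    # what the guest believes about itself.
    return State.VIRTUAL_SUPERVISOR if virtual_p == 0 else State.VIRTUAL_PROBLEM


states = [effective_state(0), effective_state(1, 0), effective_state(1, 1)]
```

Note that the hardware only ever sees `real_p`; the third state exists purely in CP's bookkeeping.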
The image to the left depicts the relationship of an application (in Virtual Problem State) running in a z/OS provided address space. Likewise, z/OS is running in Virtual Supervisor State (relative to CP) and CP is running in Real Supervisor State on an IBM Mainframe.
At this point in the evolution of VM (CP), when the CP gave control to (dispatched, if you prefer) a VM (z/OS in this image), an LPSW instruction loaded a Problem State PSW (a REAL problem state PSW). But before doing so, a Virtual or Guest PSW was stored away in a control block that represented the Guest (z/OS in this case). Refer to the VMDBK image provided here. Further details of the VMDBK can be found in the IBM publication https://www.vm.ibm.com/pubs/cp630/VMDBK.HTML
That Guest PSW would be used in processing an eventual Program Interrupt that would occur as soon as the Guest attempted to execute a privileged instruction; one that requires [real] Supervisor State.
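The dispatch step just described can be sketched roughly as follows. This is a simplification under stated assumptions: `GuestBlock` is a hypothetical stand-in for the VMDBK with only two fields, `dispatch` is an invented name, and address translation between guest and real storage is ignored entirely.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PSW:
    problem_state: bool     # P bit
    next_instruction: int   # NSI


@dataclass
class GuestBlock:
    """Hypothetical, heavily reduced stand-in for the VMDBK."""
    name: str
    virtual_psw: PSW        # what the guest believes its PSW is


def dispatch(guest: GuestBlock) -> PSW:
    """Build the REAL PSW that CP would hand to LPSW to run this guest."""
    # The saved virtual PSW is left untouched; the real PSW always has the P bit on.
    return replace(guest.virtual_psw, problem_state=True)


zos = GuestBlock("ZOS1", virtual_psw=PSW(problem_state=False, next_instruction=0x5000))
real_psw = dispatch(zos)
```

After `dispatch`, the guest's own view (`zos.virtual_psw`) still says supervisor state, while the hardware is really running it in problem state.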
Once dispatched, the VM (z/OS as shown in the image above, along with the application it supports) merrily chugs along executing instructions, not knowing that it is REALLY in Problem State. Sooner or later the supervisor of the operating system gets control, perhaps because the application issued a Supervisor Call (SVC) instruction to request some supervisor service. As soon as z/OS attempts to perform a supervisory function as a result of that SVC (such as an LPSW or perhaps an Input/Output instruction), bells and whistles go off in the form of a Program Interrupt. That would be a REAL Program Interrupt!
>>> Handling a REAL Program Interrupt. <<<
When a Program Interrupt (a REAL Program Interrupt) occurs, the Current PSW [that would be happening in the hardware] is stored at the OLD PSW location corresponding to the type of interrupt that is happening. (Note: the OLD PSW location is a fixed location per the architecture.) The corresponding NEW PSW (in this case the NEW PROGRAM INTERRUPT PSW) becomes or is loaded into the CURRENT PSW and execution continues at the point of the NSI in the CURRENT PSW. At this point some CP logic in the form of a first level interrupt handler gains control. (Note to z/OS programmers, yes this is basically what happens when your system is running native and your old friend the FLIH gains control… but in this case it is the FLIH for CP.)
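Here is a small sketch of that PSW swap. Low storage is modeled as a Python dictionary, and the `CPU` class and its method name are assumptions for illustration. The two offsets used are the System/370 locations for the program old and program new PSWs as I remember them (decimal 40 and 104).

```python
from dataclasses import dataclass

PROGRAM_OLD_PSW = 0x28   # fixed low-storage location, decimal 40
PROGRAM_NEW_PSW = 0x68   # fixed low-storage location, decimal 104


@dataclass(frozen=True)
class PSW:
    problem_state: bool
    next_instruction: int


class CPU:
    def __init__(self, current: PSW, storage: dict):
        self.current = current    # the CURRENT PSW held in the hardware
        self.storage = storage    # toy model of low storage, keyed by address

    def program_interrupt(self) -> None:
        """Store the current PSW as the OLD PSW, then load the NEW PSW."""
        self.storage[PROGRAM_OLD_PSW] = self.current
        self.current = self.storage[PROGRAM_NEW_PSW]


# CP set up the program new PSW at IPL so that its FLIH runs in supervisor state.
flih_entry = PSW(problem_state=False, next_instruction=0x2000)
cpu = CPU(current=PSW(True, 0x1004), storage={PROGRAM_NEW_PSW: flih_entry})
cpu.program_interrupt()
```

After the swap, execution continues at the NSI of the new PSW (the FLIH), and the interrupted PSW sits in the old PSW slot waiting to be examined.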
Conceptually and briefly — here is what happens at and after the point in time when the CP first level interrupt handler gets control.
- The state of the interrupted VM is tucked away in a special control block (see VMDBK above)
- The perceived or Virtual Guest PSW is analyzed (the one that was stored away before dispatching the VM)
- If the Guest THOUGHT it was in Supervisor State (as described by the Guest PSW that was stored away just prior to giving the VM control), then the instruction that caused the REAL program interrupt is simulated by the CP. That means the CP does whatever it needs to do on behalf of the guest to make it seem as if it (the guest) actually executed an LPSW or some other privileged instruction.
- Note that if the guest is attempting to do I/O, then CP has to perform a lot of leg work to make that happen. This includes converting the associated CCW (Channel Command Word) chain, which contains the storage addresses where data is to be taken from or stored into, so that those addresses work with the REAL I/O. (Remember that guests are sharing DASD.) The device address that a guest uses is virtual. That virtual device needs to be mapped to a corresponding real device and location on that device. Aren’t you glad you didn’t have to figure that process out on your own?
- If the Guest THOUGHT it was in Problem State (again as described by the Guest PSW previously stored away), then the CP simulates an appropriate PSW Swap such that the guest’s Program New PSW is stored away as the guest’s Virtual Current PSW. Basically this simulates a Program Interrupt being presented to the guest when it is dispatched again. (Yes, this is the point where the z/OS FLIH would get control.)
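The decision in the steps above can be sketched as a tiny trap-and-emulate handler. Everything here is hypothetical naming (`Guest`, `simulate`, `reflect_program_interrupt`, `handle_privileged_op`), the "simulation" just records the opcode and steps past it, and saving the guest's interrupted PSW as its Program Old PSW is my assumption that the simulated swap mirrors the real one.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass
class PSW:
    problem_state: bool
    next_instruction: int


@dataclass
class Guest:
    """Hypothetical, heavily reduced stand-in for the VMDBK."""
    virtual_psw: PSW                        # stored away before dispatch
    program_new_psw: PSW                    # the guest's own program new PSW
    program_old_psw: Optional[PSW] = None   # filled in when CP reflects an interrupt
    log: List[str] = field(default_factory=list)


def simulate(guest: Guest, opcode: str) -> None:
    """CP performs the privileged operation on the guest's behalf (placeholder)."""
    guest.log.append(f"simulated {opcode}")
    guest.virtual_psw.next_instruction += 4   # assumed 4-byte instruction


def reflect_program_interrupt(guest: Guest) -> None:
    """Simulate the PSW swap so the guest's own FLIH runs when it is next dispatched."""
    guest.program_old_psw = guest.virtual_psw
    guest.virtual_psw = replace(guest.program_new_psw)   # copy, do not alias
    guest.log.append("reflected program interrupt")


def handle_privileged_op(guest: Guest, opcode: str) -> None:
    if not guest.virtual_psw.problem_state:
        simulate(guest, opcode)             # guest thought it was a supervisor
    else:
        reflect_program_interrupt(guest)    # guest application misbehaved


zos = Guest(virtual_psw=PSW(False, 0x5000), program_new_psw=PSW(False, 0x9000))
handle_privileged_op(zos, "SIO")        # guest OS issues I/O: CP simulates it
zos.virtual_psw.problem_state = True    # pretend the guest then dispatched an application
handle_privileged_op(zos, "SIO")        # application tries the same: reflected to the guest
```

The real CP did far more in both branches (the CCW translation alone was a serious piece of work), but the fork on the guest's virtual P bit is the heart of it.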
>>> The point where simulation by CP is complete <<<
After CP completed the PSW Swap simulation (and so on) on behalf of the guest, the CP dispatcher would decide which guest was to be dispatched next. All of the above was completed without guests perceiving or being concerned that anything weird had happened. They just continued to operate, somewhat oblivious to the magic, or smoke and mirrors, that made it possible for them to operate in a VM.
>>> As time and progress marched on… <<<
As time and progress marched on, much of the CP logic was incorporated as part of the hardware in the form of the Start Interpretive Execution (SIE) instruction. This allowed for a much smoother, faster, more secure, and more robust VM environment.
(Note: When I was an IBM instructor, I taught a class that was a week in length that covered ONLY the SIE instruction. So YES… there is a lot going on underneath the covers…)
With SIE, once a VM was dispatched it was able to continue execution under the auspices of the SIE instruction, minimizing the need for software simulation of guest activities. Instead, Interpretive Execution handled nearly all of it. Whenever something occurred in this interpretive environment that could not be handled by that environment, an event known as interception would occur.
Interception was essentially an interruption of the SIE instruction; but without swapping of the PSWs. Instead, the instruction path that immediately followed the SIE instruction would gain control. That sequence of instructions then had access to an area known as the SIEBK, where the current status of the intercepted VM was stored by the hardware… YES… the VMDBK mentioned above was part of the SIEBK.
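To give a feel for the shape of this, here is a toy run loop around a pretend `sie` function. It is only a sketch: `StateDescription` is a hypothetical stand-in for the SIEBK, the guest "program" is a list of opcode names, and the set of operations handled interpretively is an assumption chosen for the example, not a statement of what the hardware actually intercepts.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption for illustration: these run entirely under interpretive execution.
HANDLED_INTERPRETIVELY = {"ADD", "COMPARE", "BRANCH", "LPSW"}


@dataclass
class StateDescription:
    """Hypothetical stand-in for the SIEBK: guest status kept across interceptions."""
    program: List[str]
    position: int = 0
    intercept_reason: Optional[str] = None


def sie(state: StateDescription) -> None:
    """Run the guest until something needs the host, then simply return (no PSW swap)."""
    state.intercept_reason = None
    while state.position < len(state.program):
        opcode = state.program[state.position]
        if opcode not in HANDLED_INTERPRETIVELY:
            state.intercept_reason = opcode   # interception: status is left in the block
            return
        state.position += 1


def run_guest(state: StateDescription) -> List[str]:
    """The instructions 'following SIE': inspect the block, help the guest, re-enter."""
    handled_by_cp = []
    while state.position < len(state.program):
        sie(state)
        if state.intercept_reason is not None:
            handled_by_cp.append(state.intercept_reason)
            state.position += 1   # CP completed the operation; resume after it
    return handled_by_cp


guest = StateDescription(program=["ADD", "LPSW", "SIO", "COMPARE", "DIAGNOSE", "BRANCH"])
intercepted = run_guest(guest)
```

In this toy run only two of the six operations ever reach CP; the rest never leave the interpretive environment, which is where the speed-up came from.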
It was.. and is.. that simple. *smile*.
If this article was of interest to you, then you may want to refer to the IBM Systems Magazine article by Bob Rogers, “Virtualization’s Past Helps Explain Its Current Importance,” for further discussion of VM. (Published 02/06/2017)