General High Availability
What is a "High Availability Cluster"?
The aim of a High Availability Cluster (HAC) is to keep specific services within a system landscape available. It achieves this by reducing downtime through redundant cluster nodes. A common HAC setup consists of two cluster nodes, which is the minimum for redundancy. If one server node crashes, the second node takes over the clustered services and thereby preserves their availability. This setup is also called a "two-node cluster" or a "failover cluster." Depending on the availability requirements, the number of server nodes can be increased to reduce the risk that a node failure interrupts the service.
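The effect of adding redundant nodes can be estimated with a small calculation. The following is only a sketch under simplifying assumptions that are not part of this FAQ: node failures are independent, every node alone can run the service, and the time needed for the takeover itself is ignored. The per-node availability of 99% is a made-up example value.

```python
HOURS_PER_YEAR = 24 * 365


def cluster_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` independent nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    # The service is down only if all nodes are down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


def expected_downtime_hours(availability: float) -> float:
    """Expected hours per year during which the service is unavailable."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for n in (1, 2, 3):  # hypothetical cluster sizes
        a = cluster_availability(0.99, n)
        print(f"{n} node(s): availability {a:.6f}, "
              f"about {expected_downtime_hours(a):.2f} h downtime per year")
```

Real clusters rarely meet the independence assumption (shared power, network, or storage), so the numbers are best read as an upper bound.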
What is "switchover"?
“Switchover” refers to a planned switch from a primary server to a standby server, that is, a switch performed without a failure of the primary server. A switchover is always initiated by the system administrator.
What is "failover"?
“Failover” refers to an unplanned switch from a primary server to a standby server after the primary server node has failed. Unlike a switchover, a failover is performed automatically by the cluster software. Some cluster software, such as Microsoft Cluster Service, also provides an option for a manual failover for testing purposes.
What is "fallback"?
“Fallback” (often also called “failback”) refers to switching back from a secondary server node to the primary server node after a failover has occurred and the primary server node is available again. A fallback can be performed automatically by the cluster software or manually by the system administrator.
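The three transitions can be told apart by who triggers them and under which precondition. The sketch below models a two-node cluster as a tiny state machine. The class and method names (`TwoNodeCluster`, `node_failed`, and so on) are hypothetical and do not correspond to the API of any real cluster software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TwoNodeCluster:
    primary: str
    standby: str
    active: Optional[str] = field(init=False, default=None)
    healthy: Dict[str, bool] = field(init=False, default_factory=dict)
    log: List[str] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        self.active = self.primary
        self.healthy = {self.primary: True, self.standby: True}

    def switchover(self) -> None:
        """Planned move, started by the administrator; the primary has not failed."""
        if self.active != self.primary or not self.healthy[self.standby]:
            raise RuntimeError("switchover needs an active primary and a healthy standby")
        self.active = self.standby
        self.log.append("switchover")

    def node_failed(self, node: str) -> None:
        """Unplanned event; the cluster software reacts with an automatic failover."""
        self.healthy[node] = False
        if node != self.active:
            return  # an idle node failed, the service is not affected
        other = self.standby if node == self.primary else self.primary
        if self.healthy[other]:
            self.active = other
            self.log.append("failover")
        else:
            self.active = None  # no healthy node left, the service is down

    def node_recovered(self, node: str) -> None:
        self.healthy[node] = True

    def fallback(self) -> None:
        """Move back to the primary once it is available again."""
        if self.active != self.standby or not self.healthy[self.primary]:
            raise RuntimeError("fallback needs an active standby and a healthy primary")
        self.active = self.primary
        self.log.append("fallback")
```

For example, `node_failed("node-a")` on a cluster created with `TwoNodeCluster("node-a", "node-b")` moves the service to `node-b`, and `fallback()` only succeeds after `node_recovered("node-a")`.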
What is an "active-passive cluster"?
An active-passive cluster consists of at least two independent server nodes. The primary server node performs all operations. A secondary node acts as a so-called “standby system.”
In case of a system failure of the primary node, the cluster software automatically fails over to the standby server node, which starts the processes and resumes the work of the primary server node. A cluster group is active on only one server node at any given time.
Please note that an active-passive cluster configuration does not imply that the standby server node carries no workload. The term “active-passive” refers only to the cluster group: in an active-passive configuration, a resource can be active on only one server node at a time. However, if a cluster environment contains more than one cluster group, these groups can be distributed across the nodes of the cluster.
The SAP Central Services Instance is implemented as an active-passive cluster.
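The distinction between "one group, one active node" and "idle standby node" can be made concrete with a small model. This is an illustrative sketch only: the class `ActivePassiveCluster`, the node names, and the group names `SCS` and `DB` are assumptions chosen for the example, and the rule of moving groups to the first surviving node is a simplification.

```python
class ActivePassiveCluster:
    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.failed = set()
        # group name -> the single node on which the group is currently active
        self.owner = {}

    def add_group(self, group, node):
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.owner[group] = node

    def fail_node(self, node):
        """Move every group owned by the failed node to a surviving node."""
        self.failed.add(node)
        survivors = sorted(self.nodes - self.failed)
        if not survivors:
            raise RuntimeError("no standby node left")
        for group, owner in list(self.owner.items()):
            if owner == node:
                self.owner[group] = survivors[0]

    def groups_on(self, node):
        return sorted(g for g, n in self.owner.items() if n == node)


if __name__ == "__main__":
    cluster = ActivePassiveCluster(["node-a", "node-b"])
    cluster.add_group("SCS", "node-a")  # node-b is the standby for this group
    cluster.add_group("DB", "node-b")   # node-a is the standby for this group
    cluster.fail_node("node-a")
    print(cluster.groups_on("node-b"))
```

Before the failure both nodes carry workload, yet each group is active on exactly one node, which is what makes the configuration active-passive.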
What is a "standby system"?
A standby system is a redundant cluster node that takes over the processes if the primary cluster server fails. This takeover is called a “failover” and is performed automatically by the cluster software. There can be several standby cluster nodes; their number is limited only by the capabilities of the cluster software. For example, the Microsoft Cluster Service supports up to 8 server nodes, which means a maximum of 7 standby nodes.
“Standby systems” can be in a “hot” or a “cold” state. With a “hot standby,” the processes also run on the standby node, so in case of a failure the cluster resource is already running and does not need to be started on the standby system. With a “cold standby,” the clustered resource has to be started on the standby system during a failover, which causes a (short) downtime. The SAP Central Services Instance (SCS) is usually implemented as a “cold standby” system because SCS is a lightweight component that does not need long to start up.
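The difference between the two states shows up in the downtime of a failover. The following sketch uses a deliberately simple model that is an assumption of this example: downtime is the time needed to detect the failure, plus the startup time of the resource if the standby is cold. The timing values are invented for illustration.

```python
def failover_downtime(detection_s: float, startup_s: float, hot: bool) -> float:
    """Seconds the resource is unavailable during a failover in this simple model."""
    if detection_s < 0 or startup_s < 0:
        raise ValueError("times must not be negative")
    # Hot standby: the resource is already running on the standby node.
    # Cold standby: the resource still has to be started there.
    return detection_s if hot else detection_s + startup_s


if __name__ == "__main__":
    # Hypothetical values: 10 s to detect the failure, 20 s to start the resource.
    print("hot: ", failover_downtime(10, 20, hot=True), "s")
    print("cold:", failover_downtime(10, 20, hot=False), "s")
```

The shorter the startup time of a component, the smaller the gap between the two variants, which is the reasoning behind running a lightweight component as a cold standby.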
What is an "active-active cluster"?
An active-active cluster consists of at least two independent server nodes. The workload of a cluster resource is shared between the server nodes. If a cluster node crashes, its processes are resumed by the remaining cluster nodes. In an active-active configuration, a cluster resource is active on all cluster nodes. The aim of an active-active cluster is not only to provide a highly available system but also to distribute the workload between the cluster nodes. Applications with a very high workload, such as databases, benefit from an active-active setup. Because the SAP Central Services Instance (SCS) is a lightweight component, an active-active setup offers no benefit for it. Therefore, the SCS is implemented as an active-passive cluster resource.
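How an active-active resource shares work and survives a node crash can be sketched with a round-robin dispatcher. Round-robin is only one possible distribution strategy and is chosen here as an assumption; the class `ActiveActiveResource` and the node names are hypothetical.

```python
class ActiveActiveResource:
    def __init__(self, nodes):
        self.live = list(nodes)  # the resource is active on all of these nodes
        self._next = 0

    def dispatch(self, request):
        """Return the node that handles the request (round-robin over live nodes)."""
        if not self.live:
            raise RuntimeError("no live node available")
        node = self.live[self._next % len(self.live)]
        self._next += 1
        return node

    def fail_node(self, node):
        """After a crash the remaining nodes take over the whole workload."""
        self.live.remove(node)


if __name__ == "__main__":
    resource = ActiveActiveResource(["node-a", "node-b", "node-c"])
    print([resource.dispatch(i) for i in range(3)])
    resource.fail_node("node-b")
    print([resource.dispatch(i) for i in range(4)])
```

Unlike the active-passive case, no resource has to be moved on a failure; the failed node simply drops out of the rotation.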
[Figures: “Shared Nothing” and “Shared All” architectures]
What does "virtualization" mean?
The term “virtualization” in the context of HA refers to a kind of abstraction performed by the cluster software. The software creates a virtual host that owns a virtual hostname, a virtual disk, and so on. “Virtual” in this sense means that such resources are not bound to one physical machine but can be owned by any of the cluster nodes. The cluster software manages which node currently owns or runs a resource. Related resources are usually grouped into logical containers (for example, groups on MSCS or packages on HPSG) that can fail over independently of each other.
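The idea that clients address a virtual host while the cluster software decides which physical node stands behind it can be sketched as follows. The class `VirtualHost`, the hostname `sapscs-virt`, and the node names are invented for this illustration and do not reflect how any particular cluster product implements it.

```python
class VirtualHost:
    def __init__(self, virtual_hostname, nodes):
        self.virtual_hostname = virtual_hostname
        self.nodes = list(nodes)
        self.owner = self.nodes[0]  # physical node that currently owns the resources

    def move_to(self, node):
        """Performed by the cluster software, for example during a failover."""
        if node not in self.nodes:
            raise ValueError(f"{node} is not part of the cluster")
        self.owner = node

    def resolve(self):
        """Clients only know the virtual hostname; this returns the node behind it."""
        return self.owner


if __name__ == "__main__":
    host = VirtualHost("sapscs-virt", ["node-a", "node-b"])
    print(host.virtual_hostname, "->", host.resolve())
    host.move_to("node-b")
    print(host.virtual_hostname, "->", host.resolve())
```

The virtual hostname stays the same across the move, so clients do not have to be reconfigured after a failover.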
What do "shared nothing" and "shared all" mean?
The terms “shared nothing” and “shared all” describe types of architecture within an active-active cluster. “Shared nothing” means that every cluster node holds its own data partition. This implies that such a setup is not highly available, because the data of a failed node is no longer accessible. In a “shared all” environment, the cluster nodes that run the same service share one data partition and access the data concurrently.
These options have to be supported by the cluster software. For example, MSCS does not support the “shared all” option.
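The availability difference between the two architectures can be checked with a few lines of code. This is a minimal sketch: the partition and node names are hypothetical, and a partition is simply considered reachable as long as at least one node that can access it is still running.

```python
def reachable_data(partitions, failed):
    """Return the partitions that at least one surviving node can still access.

    `partitions` maps a partition name to the set of nodes with access to it.
    `failed` is the set of nodes that are currently down.
    """
    return {name for name, nodes in partitions.items() if nodes - failed}


# Shared nothing: every node has its own data partition.
shared_nothing = {"part-a": {"node-a"}, "part-b": {"node-b"}}
# Shared all: all nodes access one common data partition.
shared_all = {"shared": {"node-a", "node-b"}}

if __name__ == "__main__":
    print(reachable_data(shared_nothing, {"node-a"}))
    print(reachable_data(shared_all, {"node-a"}))
```

With one failed node, the shared-nothing layout loses access to that node's partition, while the shared-all layout keeps all data reachable.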
What is a "SPOF"?
A Single Point of Failure (SPOF) is any component within a system that, if it fails, causes the loss of a runtime-critical service. A SPOF can be a hardware or a software component. However, this FAQ covers only the software components that SAP identifies as SPOFs: the Message Server, the Enqueue Server, the central file system, and the database. To achieve high availability, these SPOFs must be eliminated.
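A simple way to look for SPOFs in a landscape description is to count on how many nodes each critical component can run. The sketch below assumes exactly that simplified criterion (fewer than two nodes means SPOF); the inventory is a made-up example, and the entry for application servers is added only for contrast.

```python
def find_spofs(nodes_per_component):
    """Return the components that can run on fewer than two nodes."""
    return sorted(name for name, count in nodes_per_component.items() if count < 2)


if __name__ == "__main__":
    # Hypothetical landscape without any clustering of the central components.
    landscape = {
        "Message Server": 1,
        "Enqueue Server": 1,
        "central file system": 1,
        "database": 1,
        "application server": 3,
    }
    print(find_spofs(landscape))
```

Once a component is placed in a cluster group that can fail over to a second node, its count rises to two and it no longer appears in the result.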