Sample Configuration: Basic Four-Node Cluster
include "types.cf"
cluster SGWM-demo

system Server1 (
    Capacity = 100
)

system Server2 (
    Capacity = 100
)

system Server3 (
    Capacity = 100
)

system Server4 (
    Capacity = 100
)

group G1 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 20
)

group G2 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 40
)

group G3 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 30
)

group G4 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 10
)

group G5 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 50
)

group G6 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 30
)

group G7 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 20
)

group G8 (
    SystemList = { Server1, Server2, Server3, Server4 }
    AutoStartPolicy = Load
    AutoStartList = { Server1, Server2, Server3, Server4 }
    FailOverPolicy = Load
    Load = 40
)
AutoStart Operation
In this configuration, assume that groups probe in the same order they are described, G1 through G8. Group G1 chooses the system with the highest AvailableCapacity value. All systems have the same available capacity, so G1 starts on Server1 because this server is lexically first. Groups G2 through G4 follow on Server2 through Server4. With the startup decisions made for the initial four groups, the cluster configuration resembles:
| Server  | AvailableCapacity | Online Groups |
|---------|-------------------|---------------|
| Server1 | 80                | G1            |
| Server2 | 60                | G2            |
| Server3 | 70                | G3            |
| Server4 | 90                | G4            |
As the next groups come online, group G5 starts on Server4 because this server has the highest AvailableCapacity value. Group G6 then starts on Server1 with AvailableCapacity of 80. Group G7 comes online on Server3 with AvailableCapacity of 70 and G8 comes online on Server2 with AvailableCapacity of 60.
The cluster configuration now resembles:
| Server  | AvailableCapacity | Online Groups |
|---------|-------------------|---------------|
| Server1 | 50                | G1 and G6     |
| Server2 | 20                | G2 and G8     |
| Server3 | 50                | G3 and G7     |
| Server4 | 40                | G4 and G5     |
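The AutoStart sequence described above can be replayed with a short Python sketch. This is only a model of the rules stated in this section (highest AvailableCapacity wins, ties go to the lexically first system, groups are placed one at a time); the function names `pick_target` and `autostart` are hypothetical and are not part of the product.

```python
# Simplified model of load-based AutoStart placement, not product code.
CAPACITY = {"Server1": 100, "Server2": 100, "Server3": 100, "Server4": 100}
LOAD = {"G1": 20, "G2": 40, "G3": 30, "G4": 10,
        "G5": 50, "G6": 30, "G7": 20, "G8": 40}


def pick_target(available):
    """Return the system with the highest AvailableCapacity.

    Ties are broken in favor of the lexically first system name.
    """
    return min(available, key=lambda name: (-available[name], name))


def autostart(capacity, load, probe_order):
    """Place each group in probe order and track AvailableCapacity."""
    available = dict(capacity)
    online = {name: [] for name in capacity}
    for group in probe_order:
        target = pick_target(available)
        available[target] -= load[group]
        online[target].append(group)
    return available, online


# Assumption from the text: groups probe in the order G1 through G8.
available, online = autostart(CAPACITY, LOAD, sorted(LOAD))
for name in sorted(available):
    print(name, available[name], online[name])
```

Running the sketch yields the same AvailableCapacity values and group placements as the table above.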
In this configuration, Server2 fires the loadwarning trigger after 600 seconds because it is at the default LoadWarningLevel of 80 percent.
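The arithmetic behind the warning can be sketched as follows. The sketch assumes that the load percentage is the consumed share of Capacity and that the trigger fires once that share has stayed at or above LoadWarningLevel for the stated 600 seconds; the function and parameter names are hypothetical, and the exact comparison the product uses is not given in this section.

```python
def load_percent(capacity, available_capacity):
    """Percentage of a system's Capacity that is currently consumed."""
    return 100 * (capacity - available_capacity) / capacity


def fires_loadwarning(capacity, available_capacity, seconds_at_level,
                      warning_level=80, time_threshold=600):
    """Assumed rule: warn when load has been at or above warning_level
    (percent) for at least time_threshold seconds."""
    return (load_percent(capacity, available_capacity) >= warning_level
            and seconds_at_level >= time_threshold)


# Server2: Capacity 100, AvailableCapacity 20, so the load is 80 percent.
print(load_percent(100, 20))
print(fires_loadwarning(100, 20, seconds_at_level=600))
# Server1: AvailableCapacity 50 is a 50 percent load, below the level.
print(fires_loadwarning(100, 50, seconds_at_level=600))
```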
Failure Scenario
In the first failure scenario, Server4 fails. Group G4 chooses Server1 because Server1 and Server3 have AvailableCapacity of 50 and Server1 is lexically first. Group G5 then comes online on Server3. Serializing the failover choice allows complete load-based control and adds less than one second to the total failover time.
Following the first failure, the configuration now resembles:
| Server  | AvailableCapacity | Online Groups  |
|---------|-------------------|----------------|
| Server1 | 40                | G1, G6, and G4 |
| Server2 | 20                | G2 and G8      |
| Server3 | 0                 | G3, G7, and G5 |
In this configuration, Server3 fires the loadwarning trigger to notify that the server is overloaded. An administrator can then switch group G7 to Server1 to balance the load across Server1 and Server3, leaving each with an AvailableCapacity of 20. When Server4 is repaired, it rejoins the cluster with an AvailableCapacity value of 100, making it the most eligible target for a failover group.
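As a rough illustration of the first failure and the manual rebalancing step, the following Python sketch starts from the state before Server4 fails, fails over G4 and then G5 one at a time, and then moves G7 by hand. The helper names `fail_over` and `switch_group` are hypothetical, and the model covers only the selection rule described in this section.

```python
# State of the surviving systems at the moment Server4 fails.
available = {"Server1": 50, "Server2": 20, "Server3": 50}
online = {"Server1": ["G1", "G6"], "Server2": ["G2", "G8"],
          "Server3": ["G3", "G7"]}
LOAD = {"G4": 10, "G5": 50, "G7": 20}


def fail_over(available, online, groups, load):
    """Serialize the choices: each group sees the effect of the previous one."""
    for group in groups:
        # Highest AvailableCapacity wins; ties go to the lexically first name.
        target = min(available, key=lambda name: (-available[name], name))
        available[target] -= load[group]
        online[target].append(group)


def switch_group(available, online, load, group, source, target):
    """Model an administrator moving a group between two systems."""
    online[source].remove(group)
    online[target].append(group)
    available[source] += load[group]
    available[target] -= load[group]


fail_over(available, online, ["G4", "G5"], LOAD)
after_failover = dict(available)
print(after_failover)

switch_group(available, online, LOAD, "G7", "Server3", "Server1")
print(available)
```

After the failover the sketch reports 40, 20, and 0 for Server1 through Server3; after the switch, Server1 and Server3 each have 20.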
Cascading Failure Scenario
If Server3 fails before Server4 can be repaired, group G3 chooses Server1, group G5 chooses Server2, and group G7 chooses Server1. This results in the following configuration:
| Server  | AvailableCapacity | Online Groups          |
|---------|-------------------|------------------------|
| Server1 | -10               | G1, G6, G4, G3, and G7 |
| Server2 | -30               | G2, G8, and G5         |

Server1 fires the loadwarning trigger to notify that it is overloaded.
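The cascading case can be checked the same way. This Python sketch starts from the state after the first failure (without the optional G7 switch) and assumes the groups from Server3 choose in the order G3, G5, G7, which is the order given in the text; the helper name `fail_over` is hypothetical.

```python
# Surviving systems after Server4 has failed and before Server3 fails.
available = {"Server1": 40, "Server2": 20}
online = {"Server1": ["G1", "G6", "G4"], "Server2": ["G2", "G8"]}
LOAD = {"G3": 30, "G5": 50, "G7": 20}


def fail_over(available, online, groups, load):
    """Place each failed group on the system with the most AvailableCapacity."""
    choices = {}
    for group in groups:
        # Ties would go to the lexically first system name.
        target = min(available, key=lambda name: (-available[name], name))
        available[target] -= load[group]
        online[target].append(group)
        choices[group] = target
    return choices


choices = fail_over(available, online, ["G3", "G5", "G7"], LOAD)
print(choices)
print(available)

# A negative AvailableCapacity means the system carries more Load than Capacity.
overcommitted = sorted(name for name, value in available.items() if value < 0)
print(overcommitted)
```

The values can go negative because this model, like the scenario above, still places a group when no system has enough AvailableCapacity left.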