My Modulus Obsession Part II

In a previous post I discussed using the modulus operator (%) to easily distribute one list across another, usually larger, list. For a simple one-dimensional distribution, modulus calculations enabled very compact and comprehensible code. However, not all distributions are that simple. Recently, while working on a new Exchange deployment, I needed to code a more complex distribution algorithm, which is the subject of this post.
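For reference, the one-dimensional case can be sketched in a few lines. This is a minimal illustration rather than the code from the earlier post, and the server and item names are hypothetical.

```powershell
# Hypothetical names, used only for illustration.
$Servers = @( 'Srv-A', 'Srv-B', 'Srv-C' )
$Items   = @( 'Item1', 'Item2', 'Item3', 'Item4', 'Item5', 'Item6', 'Item7' )

$Assignments =
for ( $i = 0; $i -lt $Items.Count; ++$i )
{
    [PSCustomObject]@{
        Item   = $Items[ $i ]
        # $i % $Servers.Count cycles 0, 1, 2, 0, 1, 2, ... so the index never runs past the end of $Servers.
        Server = $Servers[ $i % $Servers.Count ]
    }
}

$Assignments | Format-Table -AutoSize
```

Because the index wraps, the fourth item lands back on the first server, and the same loop works unchanged if either list grows.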

Modern Exchange designs are usually informed by the sizing calculator and will tend to host multiple DB copies on each disk. In order to maximize efficiency, redundancy and high availability, DB copies are distributed across servers and disks. However, this may result in a visually complex DB distribution as seen below.

[Image: Calculator Generated Database Distribution]

The above image was adjusted from the distribution tab of the sizing calculator. This design has 4 servers acting as an HA cooperative on the active side of an active/passive datacenter design. Servers in the secondary datacenter are to host the 3rd and 4th activation preferences. Notice, in either datacenter there are 3 different distribution patterns spreading database copies across 4 different servers. Those patterns then repeat down the list of volumes. The advantage of this is that if a single disk fails it will be reseeded from 3 other servers, restoring HA as quickly as possible.
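The reseed claim can be checked with a short sketch. It borrows the index arithmetic from the script later in this post and looks only at the primary-datacenter copies on the first volume. The Server/Partner row layout and the $Copies and $ReseedSources names are assumptions made for illustration, not part of the original scripts.

```powershell
$Servers     = @( 'EXSrv-01', 'EXSrv-02', 'EXSrv-03', 'EXSrv-04' )
$Gap         = 1, 2, 3
$DBsPerVol   = 4
$VolTurnover = $DBsPerVol + $Servers.Count

# Walk the databases that land on the first volume (loop indexes 0 through $VolTurnover - 1).
$Copies =
for ( $i = 0; $i -lt $VolTurnover; ++$i )
{
    $Offset = $Gap[ ([Math]::Floor( ($i / $DBsPerVol) )) % $Gap.Count ]
    $First  = $Servers[ $i % $Servers.Count ]
    $Second = $Servers[ ($i + $Offset) % $Servers.Count ]

    # Each database contributes two rows, one from each hosting server's point of view.
    [PSCustomObject]@{ Server = $First;  Partner = $Second }
    [PSCustomObject]@{ Server = $Second; Partner = $First  }
}

# For each server, collect the distinct servers that hold the other copy of its databases.
$ReseedSources = $Copies | Group-Object -Property Server | ForEach-Object {
    [PSCustomObject]@{
        Server   = $_.Name
        Partners = @( $_.Group.Partner | Sort-Object -Unique )
    }
}

$ReseedSources | Format-Table -AutoSize
```

With these inputs each server ends up with three distinct partners for its first volume, which is what allows a failed disk to reseed from three other servers in the same datacenter.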

While the calculator guides many design decisions, it’s common to make adjustments that do not exactly follow the calculator’s output. For example, an organization may opt to increase or decrease capacity, hosting more or fewer DBs and/or disks. In this case, the organization opted for fewer drives and DBs, hosted on much more powerful SSDs. This somewhat invalidated the various setup scripts generated by the calculator. For that reason, along with several other factors, I decided to write my own set of scripts to configure disk subsystems, mount points, mailbox DB copies, etc.

I may document the whole set of scripts in a separate post. As far as the distribution algorithm is concerned, my approach required a set of objects that would emulate the calculator’s distribution. The objects would then be used as input to other scripts to actually create and configure the resources. At first I thought the modulus approach wouldn’t be portable to the more complex pattern, but once I got to coding it was really simple to layer in a few parameters and some simple arithmetic.

$Servers   = @( 'EXSrv-01', 'EXSrv-02', 'EXSrv-03', 'EXSrv-04' )
$DRServers = @( 'EXSrv-DR01', 'EXSrv-DR02', 'EXSrv-DR03', 'EXSrv-DR04' )

$DBs = @(
    'DB001', 'DB002', 'DB003', 'DB004', 'DB005', 'DB006', 'DB007', 'DB008'
    'DB009', 'DB010', 'DB011', 'DB012', 'DB013', 'DB014', 'DB015', 'DB016'
    'DB017', 'DB018', 'DB019', 'DB020', 'DB021', 'DB022', 'DB023', 'DB024' 
)

$Vols = @(
    'Vol1', 'Vol2', 'Vol3', 'Vol4',  'Vol5',  'Vol6'
    'Vol7', 'Vol8', 'Vol9', 'Vol10', 'Vol11', 'Vol12'
)

$Gap         = 1, 2, 3
$DBsPerVol   = 4
$VolTurnover = $DBsPerVol + $Servers.Count

$DBConfigs =
For( $i = 0; $i -lt $DBs.Count; ++$i )
{    
    $Offset    = $Gap[ ([Math]::Floor( ($i / $DBsPerVol) )) % $Gap.Count ] # Offset of the secondary server; advances every $DBsPerVol DBs
    $SrvNum    = $i % $Servers.Count                                       # Reusable index for primary & tertiary servers
    $SrvNum2nd = ($i + $Offset) % $Servers.Count                           # Reusable index for secondary & quaternary servers
    $VolNum    = ([Math]::Floor( ($i / $VolTurnover) ) % $Vols.Count)      # Zero-based index into $Vols
    
    [PSCustomObject]@{
        Name             = $DBs[ $i ]               # Returns the DB name.
        Disk             = $VolNum + 1              # Returns the disk# 
        Volume           = $Vols[ $VolNum ]         # Returns the volume name
        PrimaryServer    = $Servers[ $SrvNum ]      # Returns the primary server
        SecondaryServer  = $Servers[ $SrvNum2nd ]   # Returns the secondary server
        TertiaryServer   = $DRServers[ $SrvNum ]    # Returns the tertiary server
        QuaternaryServer = $DRServers[ $SrvNum2nd ] # Returns the quaternary server
    }
}

$DBConfigs | Format-Table -AutoSize

Note: For brevity, this example is truncated. The real implementation had 96 DBs.

Before entering a typical for loop, the code defines a few variables to guide the distribution pattern:

  1. $Gap is an array that defines the location of the secondary DB copy relative to the primary.
  2. As the name implies, $DBsPerVol defines how many DBs should be on each volume. This is counted per server: in the output below, each server holds 4 DB copies on Vol1.
  3. $VolTurnover determines how many loop iterations can elapse before we start placing databases on the next volume. With the values above it is 4 + 4 = 8.

Inside the loop, several calculations are made:

  1. $Offset uses a [Math]::Floor() calculation with simple division and % to select an index from the $Gap array. Again, this determines where to place the secondary DB copy relative to the primary: 1, 2 or 3 spots away, in a rotating pattern. The offset moves to the next $Gap entry every $DBsPerVol databases.
  2. $SrvNum and $SrvNum2nd calculate which index is selected from the $Servers and $DRServers arrays. As noted in the code, this effectively defines the servers hosting the primary, secondary, tertiary and quaternary copies for a given DB.
  3. Finally, $VolNum divides the loop index by $VolTurnover, rounds down with [Math]::Floor(), and applies % $Vols.Count to select an index from the $Vols array.
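To make the arithmetic concrete, the calculations can be traced for a handful of loop indexes. This is a minimal sketch: $ServerCount and $VolCount are stand-ins for $Servers.Count and $Vols.Count, and $Trace is a hypothetical name introduced here.

```powershell
$Gap         = 1, 2, 3
$DBsPerVol   = 4
$ServerCount = 4     # stands in for $Servers.Count
$VolCount    = 12    # stands in for $Vols.Count
$VolTurnover = $DBsPerVol + $ServerCount

# Sample a few loop indexes instead of walking every database.
$Trace =
foreach ( $i in 0, 3, 4, 8, 12, 23 )
{
    $Offset = $Gap[ ([Math]::Floor( ($i / $DBsPerVol) )) % $Gap.Count ]

    [PSCustomObject]@{
        Index     = $i
        Offset    = $Offset                                            # 1, 2 or 3 spots from the primary
        SrvNum    = $i % $ServerCount                                  # primary / tertiary server index
        SrvNum2nd = ($i + $Offset) % $ServerCount                      # secondary / quaternary server index
        VolNum    = [Math]::Floor( ($i / $VolTurnover) ) % $VolCount   # zero-based volume index
    }
}

$Trace | Format-Table -AutoSize
```

Worked by hand, index 4 selects an offset of 2, index 8 selects an offset of 3 and moves to the second volume, and index 12 wraps back to an offset of 1.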

The output of the code looks like:

Name  Disk Volume PrimaryServer SecondaryServer TertiaryServer QuaternaryServer
----  ---- ------ ------------- --------------- -------------- ----------------
DB001    1 Vol1   EXSrv-01      EXSrv-02        EXSrv-DR01     EXSrv-DR02
DB002    1 Vol1   EXSrv-02      EXSrv-03        EXSrv-DR02     EXSrv-DR03
DB003    1 Vol1   EXSrv-03      EXSrv-04        EXSrv-DR03     EXSrv-DR04
DB004    1 Vol1   EXSrv-04      EXSrv-01        EXSrv-DR04     EXSrv-DR01
DB005    1 Vol1   EXSrv-01      EXSrv-03        EXSrv-DR01     EXSrv-DR03
DB006    1 Vol1   EXSrv-02      EXSrv-04        EXSrv-DR02     EXSrv-DR04
DB007    1 Vol1   EXSrv-03      EXSrv-01        EXSrv-DR03     EXSrv-DR01
DB008    1 Vol1   EXSrv-04      EXSrv-02        EXSrv-DR04     EXSrv-DR02
DB009    2 Vol2   EXSrv-01      EXSrv-04        EXSrv-DR01     EXSrv-DR04
DB010    2 Vol2   EXSrv-02      EXSrv-01        EXSrv-DR02     EXSrv-DR01
DB011    2 Vol2   EXSrv-03      EXSrv-02        EXSrv-DR03     EXSrv-DR02
DB012    2 Vol2   EXSrv-04      EXSrv-03        EXSrv-DR04     EXSrv-DR03
...

The above output table follows the same distribution pattern that was output from the calculator. A simple export to a CSV file now allows me to use the configuration objects as input to the other scripts.
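The export step might look like the sketch below. The file path is hypothetical and a single hand-written row stands in for the real $DBConfigs collection. How the downstream scripts consume the file isn't shown in this post, so the expansion into one row per copy, with activation preferences numbered 1 through 4 in column order, is an assumption.

```powershell
# Stand-in for the $DBConfigs collection built by the loop; a single row keeps the sketch short.
$DBConfigs = @(
    [PSCustomObject]@{
        Name             = 'DB001'
        Disk             = 1
        Volume           = 'Vol1'
        PrimaryServer    = 'EXSrv-01'
        SecondaryServer  = 'EXSrv-02'
        TertiaryServer   = 'EXSrv-DR01'
        QuaternaryServer = 'EXSrv-DR02'
    }
)

# Hypothetical file location; any writable path works.
$CsvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'DBConfigs.csv'

$DBConfigs | Export-Csv -Path $CsvPath -NoTypeInformation

# Import-Csv returns every property as a string, so Disk comes back as '1' rather than 1.
$Imported = @( Import-Csv -Path $CsvPath )

# Assumed consumption pattern: one object per database copy.
$Roles = 'PrimaryServer', 'SecondaryServer', 'TertiaryServer', 'QuaternaryServer'

$Copies =
foreach ( $Row in $Imported )
{
    for ( $p = 0; $p -lt $Roles.Count; ++$p )
    {
        [PSCustomObject]@{
            Database             = $Row.Name
            Server               = $Row.( $Roles[ $p ] )
            ActivationPreference = $p + 1
        }
    }
}

$Copies | Format-Table -AutoSize
```

Any script that does arithmetic on the Disk column after importing needs to cast it back to an integer first, for example with [int]$Row.Disk.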

By extending modulus calculations with some basic math, I was able to generate a rather complicated distribution pattern. I didn’t even attempt to write this without leveraging modulus, but I’d imagine having to resort to copious amounts of if/else logic. In closing, this is another concise but powerful pattern that really makes me appreciate the modulus operator, only further compounding my modulus obsession.
