
How do I?

 

 

 

Operator Quick Start

Connecting to the Messenger clouds:

1.      You must have a SecureID key fob and a Unix account set up on the Messenger machines. (Contact: mshinn or wlai)

2.      Run SecureCRT on any machine set up to connect to the public Internet.  This can be a corpnet machine that has the Remote Winsock proxy installed. (Contact wlai for a license; a 30-day trial copy is available for download from http://www.vandyke.com/download/SecureCRT/index.html )

3.      Click New on the Session List tab to create a new session.  Enter the following information:

·         Name = whatever friendly name you like

·         Protocol = SSH

·         Hostname or IP = law-l2.hotmail.com

·         Port = 22

·         Username = <leave blank>

·         Cipher = 3DES

·         Authentication = Password

·         Password = <leave blank>

·         <Click OK>

4.      Back on the Session List tab, select this newly created session, and click OK

5.      If this is your first time connecting to law-l2.hotmail.com, you will be prompted to save the identification and key for this server.  Click Accept & Save.

6.      You will be prompted to enter your UserID.  Enter the UserID that Lorrie Wood gave you.

7.      You will be prompted for the password.  Type your four-digit personal prefix followed by the 6 digits currently showing on your SecureID key fob.

8.      You are now logged into the law-l2 machine, which is located at the Hotmail facility.  From here you can telnet to any of the Messenger machines.

9.      To telnet to a Messenger machine, type “telnet msgr-ns1” at the “>” prompt, where msgr-ns1 is the machine that you would like to access.

10.  You will be prompted for a UserID and password.  Use your Unix account UserID and password for the Messenger machines.   Note that this password is not the SecureID passcode.
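The password entered at the law-l2 prompt is the personal prefix and the fob digits typed as one string. A minimal Python sketch of that rule follows; the function name `build_passcode` is hypothetical and is not part of SecureCRT or of any Messenger tool.

```python
def build_passcode(personal_prefix: str, fob_digits: str) -> str:
    """Join the four-digit personal prefix and the six fob digits."""
    if len(personal_prefix) != 4 or not personal_prefix.isdigit():
        raise ValueError("personal prefix must be exactly four digits")
    if len(fob_digits) != 6 or not fob_digits.isdigit():
        raise ValueError("fob code must be exactly six digits")
    # Prefix first, then the digits currently shown on the key fob.
    return personal_prefix + fob_digits
```

The result is always ten digits long, which is a quick sanity check if a login is rejected.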

What are the machines?

Machine Name   Machine IP         Purpose
msgr-s1        209.185.128.171    Staging Server
msgr-ns1       209.185.128.132    NS
msgr-ns2       209.185.128.133    NS
msgr-ns3       209.185.128.134    NS
msgr-ns4       209.185.128.135    NS
msgr-ns5       209.185.128.136    NS
msgr-ns6       209.185.128.137    NS
msgr-ns7       209.185.128.138    NS
msgr-ns8       209.185.128.139    NS
msgr-ns9       209.185.128.140    NS
msgr-ns10      209.185.128.141    NS
msgr-ns11      209.185.128.142    NS
msgr-ns12      209.185.128.143    NS
msgr-ns13      209.185.128.144    NS
msgr-ns14      209.185.128.145    NS
msgr-ns15      209.185.128.146    NS
msgr-ns16      209.185.128.147    NS
msgr-ns17      209.185.128.148    NS
msgr-ns18      209.185.128.149    NS
msgr-ns19      209.185.128.150    NS
msgr-ns20      209.185.128.151    NS
msgr-sb1       209.185.128.157    SB
msgr-sb2       209.185.128.158    SB
msgr-sb3       209.185.128.159    SB
msgr-sb4       209.185.128.160    SB
msgr-sb5       209.185.128.161    SB
msgr-sb6       209.185.128.177    SB
msgr-dp1       209.185.128.152    DP
msgr-dp2       209.185.128.153    DP
msgr-dp3       209.185.128.154    DP
msgr-dp4       209.185.128.155    DP
msgr-dp5       209.185.128.156    DP
msgr-gdb1      209.185.128.167    Friends M-serv
msgr-gdb2      209.185.128.168    Friends M-serv
msgr-gdb       209.185.128.169    Virtual M-serv address through the Local Director.  Not telnet-able.
msgr-u1        209.185.128.162    Friends U-store
msgr-u2        209.185.128.163    Friends U-store
msgr-u3        209.185.128.164    Friends U-store
msgr-u4        209.185.128.165    Friends U-store
ldt-bud                           Local Director
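For scripting against the machine list, the table can be held as a simple name-to-address map. The Python sketch below rebuilds the telnet-able entries from the table; the function name `build_inventory` is hypothetical, and the arithmetic offsets only restate the consecutive addresses listed above.

```python
def build_inventory() -> dict:
    """Return hostname -> IP for the telnet-able machines in the table."""
    hosts = {"msgr-s1": "209.185.128.171"}
    for n in range(1, 21):   # msgr-ns1 .. msgr-ns20 are .132 .. .151
        hosts[f"msgr-ns{n}"] = f"209.185.128.{131 + n}"
    for n in range(1, 6):    # msgr-dp1 .. msgr-dp5 are .152 .. .156
        hosts[f"msgr-dp{n}"] = f"209.185.128.{151 + n}"
    for n in range(1, 6):    # msgr-sb1 .. msgr-sb5 are .157 .. .161
        hosts[f"msgr-sb{n}"] = f"209.185.128.{156 + n}"
    hosts["msgr-sb6"] = "209.185.128.177"   # not consecutive with sb1-sb5
    for n in range(1, 5):    # msgr-u1 .. msgr-u4 are .162 .. .165
        hosts[f"msgr-u{n}"] = f"209.185.128.{161 + n}"
    hosts["msgr-gdb1"] = "209.185.128.167"
    hosts["msgr-gdb2"] = "209.185.128.168"
    # msgr-gdb (.169) is the virtual Local Director address and is left out
    # because it is not telnet-able.
    return hosts
```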

 

What are the key directories on the NS/DP/SB?

Directory

Purpose

Key Files

/home/hotmail/messenger

Main executable directories

·      ns or sb

·      server.conf

·     msgradmin

·      coreread

·      logfilter

·      whichns

/home/hotmail/messenger/conf

Configuration Files

 

(Updates Passport files and new mail templates)

·      friend_cur_machines

·      domainmap.txt

·      msgdomain.conf

·      cvr.csv

·      urllist.txt

·      not_allowed.txt

·      (3) F&F mail templates

·      ContestMail.txt (opt.)

/home/hotmail/messenger/temp

Membership directory search results

·      *.fnd files that are deleted every 15 min

/home/logs

Logs

·      ns.log or sb.log

·      ns.out or sb.out

·      ns.err or sb.err

·      ns.pid or sb.pid

·      sbstatxxxx.txt 

·   nsstatxxxx.txt

·   servmon.log or dpl

/home/hotmail/admin

Admin scripts

·      cleanup.fnd.pl (cleans up stale search results)

·      node monitor files

/tmp

Temporary directories

·      a convenient place to put temporary files

·      everyone has permission to read/write (for FTP, etc.)

·      cleaned up at reboot

How do I transfer files to and from the Messenger machines?

You have to do it in a few steps:

1.      From your PC, you can only zmodem files to the law-l2 machine.

2.      From law-l2, you can ftp to any machine in the cloud.  Note that the reverse is not true, i.e. you can’t ftp from another machine to law-l2.  So you must start ftp on law-l2, and then do a “put” or “get” from there.

3.      When putting files from law-l2 onto another machine, you normally can only put files under the /tmp directory.  You will have to mv the file from /tmp to the proper directory after you log on to that machine.

4.      Under normal circumstances, you should ftp builds and configuration files to msgr-s1, the staging server, and then prop them from there to all the machines.
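The upload path described in the steps above can be written down as an ordered checklist. The Python sketch below only builds the list of actions as text; it does not run them. The function name `plan_upload` and the bracketed location labels are hypothetical, and the `put local remote` form assumes the standard command-line ftp client.

```python
import posixpath


def plan_upload(filename: str, target_host: str, final_dir: str) -> list:
    """List the hops needed to move one file from a PC to a Messenger machine."""
    name = posixpath.basename(filename)
    staged = posixpath.join("/tmp", name)      # only /tmp is normally writable via ftp
    final = posixpath.join(final_dir, name)
    return [
        f"(PC) zmodem {name} to law-l2",
        f"(law-l2) ftp {target_host}",         # ftp must be started from law-l2
        f"(ftp) put {name} {staged}",
        f"(law-l2) telnet {target_host}",
        f"({target_host}) mv {staged} {final}",
    ]
```

For builds and configuration files the target would normally be msgr-s1, with the prop to the other machines done from there.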

About the staging server msgr-s1

The staging server is where builds and configuration files are propped from.  It contains the build tarballs as well as the latest configuration files.  The msgr-s1 staging server has the following directories:

Directory

Purpose

Key Files

/home/hotmail/messenger

where the scripts are kept

·      STOP

·      START

·      STATUS

·      UPGRADE

·      RESTART

·      msgrsrv.dat

·      server.conf

/home/hotmail/messenger/builds

where the build tarballs are kept

·      bldxxxx.tar.gz

/home/hotmail/messenger/conf

where the config files are kept

·      all configuration files except for server.conf

/home/hotmail/messenger/temp

where tarballs are untarred and unzipped

·      temporary storage

  This server also contains stat scripts that look like the following:

 

Architecture Overview

DP – Dispatch Server

When users initially contact a Messenger server cloud, a Dispatch Server (DP) handles the packet that they send.  A server cloud may have a single Dispatch Server or it may have several of them.  (When there is more than one Dispatch Server, incoming packets are evenly distributed among them by a Cisco Local Director using a round robin strategy.)  The DP selects the Notification Server (NS) that the user should be using, via a hash function that defines the partitioning strategy for Notification Servers, and tells the user to connect to that NS.  The software installed on a Dispatch Server is identical to the software installed on a Notification Server (NS), described below; the difference is in how the machine is used.  (In other words, any NS is capable of being used as a DP.  If you were to hook an NS machine up to the Cisco Local Director, or modify client software to initiate a connection to an NS machine’s IP address, then that machine would automatically start functioning as a DP.)

NS – Notification Server

Users maintain a persistent connection to a Notification Server (NS) for the duration of their session.  When there is more than one NS, users are partitioned across them, such that each user will always be sent to a particular NS.  (The partitioning scheme can be changed to handle emergencies such as hardware failures, but usually it remains static.)  When the user first connects to an NS, the NS performs authentication (by talking to an Authentication M-Serve) and then handles all messaging between the client and the server cloud except for IM sessions.  When a user requests an IM session, the NS establishes that session on a Switchboard Server (SB) and provides the client with the information needed to connect to it.  Notification Servers are listed in a configuration file (server.conf), and that file is used as part of the user-partitioning algorithm.  If you were to add a DP machine to the file, it would automatically start functioning as an NS, because the bits installed on a DP are identical to those installed on an NS.
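The partitioning idea can be illustrated with a short Python sketch. This is not the Messenger hash function, which this document does not specify; CRC-32 is used purely as a stand-in, and the name `pick_ns` and the lower-casing of the user name are assumptions made for the example.

```python
import zlib


def pick_ns(user: str, ns_list: list) -> str:
    """Map a user to one Notification Server from an ordered NS list."""
    if not ns_list:
        raise ValueError("the NS list is empty")
    # Stand-in hash (CRC-32); the real server uses its own function.
    bucket = zlib.crc32(user.lower().encode("utf-8")) % len(ns_list)
    return ns_list[bucket]
```

Because the result depends only on the user name and the ordered list, every DP that holds the same list sends a given user to the same NS. This is why the list in server.conf has to be identical on all machines.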

SB – Switchboard Server

IM sessions take place on the Switchboard Server (SB).  Switchboard Servers announce their availability by multi-casting a message that is received by the NS machines.  When a user is participating in more than one IM session, they will be connected to more than one Switchboard Server session.

Friends U – Friends U-Store

The Friends U-Store is the storage system for MSN Messenger Service.  This is where Messenger specific information is stored including the user’s friendly name, and the user’s lists (forward, reverse, allow, and block).  When we have more accounts than can be handled on a single U-Store, we spread the accounts across multiple U-Stores.  All U-Store servers provide access to their storage via XFS.

Friends M – Friends M-Serve

The M-Serve acts as an index to the U-Stores for all accounts; the Friends M-Serve is the index to the Friends U-Stores.  When there is more than one Friends M-Serve, each Friends M-Serve is a full replica of the others.  It is possible to look up a Messenger entry for any person located in any of the Friends U-Stores by searching in any of the Friends M-Serves.  In other words, while the Friends U-Stores are partitioned to each hold a portion of the Messenger accounts, each Friends M-Serve contains a full index for all of the Friends U-Stores.
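The relationship between partitioned U-Stores and replicated M-Serves can be modelled in a few lines of Python. This is a toy data-structure sketch only; the class `FriendsCloud` and its methods are hypothetical and say nothing about XFS or the real M-Serve protocol.

```python
class FriendsCloud:
    """Partitioned stores plus fully replicated indexes."""

    def __init__(self, ustore_names, mserve_count):
        # Each U-Store holds only its own share of the accounts.
        self.ustores = {name: {} for name in ustore_names}
        # Each M-Serve holds the full account -> U-Store index.
        self.mserves = [{} for _ in range(mserve_count)]

    def add_account(self, account, ustore, record):
        self.ustores[ustore][account] = record
        for index in self.mserves:        # every replica gets the entry
            index[account] = ustore

    def lookup(self, account, mserve=0):
        ustore = self.mserves[mserve][account]   # any replica will do
        return self.ustores[ustore][account]
```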

Hotmail Servers used by Messenger

Auth U – Authentication U-Store

The Authentication U-Store is the storage system for Hotmail.  This storage contains all of the information concerning a Hotmail account including the password.  MSN Messenger Service looks up this password in order to perform user authentication.  All U-Store servers provide access to their storage via XFS.

Auth M – Authentication M-Serve

The M-Serve acts as an index to the U-Stores for all accounts; the Auth M-Serve is the index to the Hotmail storage system.  When there is more than one Auth M-Serve, each Auth M-Serve is a full replica of the others.  It is possible to look up a Hotmail entry for any person located in any of the Authentication U-Stores by searching in any of the Authentication M-Serves.  In other words, while the Auth U-Stores are partitioned to each hold a portion of the Hotmail accounts, each Auth M-Serve contains a full index for all of the Auth U-Stores.

Postman

The Postman delivers incoming mail to the Auth U-Store.  When the postman places a message in the user’s inbox, it also sends a notification to the user’s Notification Server (NS) informing the NS that the user has received mail.  In turn, the NS sends a message to the client and the user sees a popup notification.

Membership Directory

The Membership Directory is used to find users by name.  When a user tries to add someone by name, the Add Wizard looks for that name in the Membership Directory.  The results of that search (there can be multiple hits) are returned to the user so that the user may choose the person (by name and location) to whom they want to send Friends & Family (F&F) mail.  We do not disclose account information or e-mail addresses from this directory because it would be a violation of the privacy of people who are in the directory.

Messenger Server Cloud

The Messenger server cloud is located in the Hotmail facilities in San Jose.  This server cloud contains the following machines:

Five (5) Sun Ultra 5 machines (located behind a Cisco Local Director) serve as the production cloud’s Dispatch Servers.  The Local Director receives messages for the Internet address “messenger.hotmail.com” and evenly distributes them among the five dispatch servers.  These machines are named MSGR-DP-1 thru MSGR-DP-5.

Eighteen (18) Sun AXMP machines are used as the production cloud’s Notification Servers.  All Messenger accounts are partitioned across these 18 Notification Servers, named MSGR-NS-1 thru MSGR-NS-18.  The five Dispatch Servers figure out (via a hash function) which of the eighteen Notification Servers each user should be using, and tell the user to connect to that NS.

Five (5) Sun AXMP machines are used as the production cloud’s Switchboard Servers.  These machines are named MSGR-SB-1 thru MSGR-SB-5.

Two (2) Sun AXMP machines (located behind a Cisco Local Director) serve as the production cloud’s Friends M-Serve machines.  Whenever one of the NS machines wants to communicate with a Friends M-Serve, it will contact the address of the Friends M-Serve Local Director.  The Local Director will evenly distribute the requests between the two Friends M-Serve machines.  These machines are MSGR-M-1 and MSGR-M-2.

Five (5) Sun E4500 machines are used as the Friends U-Store.  Each of these machines will store data for 20% of the Messenger accounts.  These machines have large RAID disk arrays and Qualstar tape drives for backup.

The production cloud does not have any Auth M-Serve or Auth U-Store machines that are separate from Hotmail’s computers.  The Messenger production server cloud will communicate directly with the Auth M-Serve and Auth U-Store machines that are maintained by Hotmail.  Since Hotmail maintains those machines, they are not described here.

After the MSN Messenger test team signs off on a build as being high-enough quality to deploy, our operations staff runs the scripts necessary to move the build from MMSDNS onto the production machines.  This requires several steps, because we’re moving code from outside the Hotmail facility onto machines that are located behind the Hotmail firewall.

Setup and Machine Preparation

DP and NS

  1. Follow the SunOS 2.6 Install for standard Hotmail Backend machines.
  2. Install the following applications:
    1. GZIP: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/gzip-1.2.4-sol26-sparc-local
    2. TCSH: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/tcsh-6.07.02-sol26-sparc-local.gz
    3. PERL5: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/perl-5.005_02-sol26-sparc-local.gz
    4. TOP: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/top-3.5beta8-sol26-sparc-local.gz
    5. TRACEROUTE: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/traceroute-1.4a5-sol26-sparc-local.gz
    6. System Patches “Generic_105181-14”: ftp://sunsite.unc.edu/pub/sun-info/sun-patches/2.6_Recommended.tar.Z 
    7. Install Qmail, configure for outbound only

3.      Create the following directories:

a.      /home/hotmail

b.      /home/hotmail/messenger

c.      /home/hotmail/messenger/temp

d.      /home/logs

4.      With root privileges, add the following lines to /etc/system:

set tcp:tcp_conn_hash_size=32768

set tcp:tcp_close_wait_interval=60000

set tcp:tcp_keepalive_interval=600000

5.      <TBD: Need to install standard admin scripts>
6.      <TBD: Need to get server build.tar.gz>
7.      <TBD: Need to get cron jobs and scripts>
8.      Verify the following script:

  

SB

The Switchboard machine setup is almost identical to that of the Dispatch/Notification machines:

  1. Follow the SunOS Install for standard Hotmail Backend machines
  2. Install the following applications:
    1. GZIP: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/gzip-1.2.4-sol26-sparc-local
    2. TCSH: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/tcsh-6.07.02-sol26-sparc-local.gz
    3. PERL5: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/perl-5.005_02-sol26-sparc-local.gz
    4. TOP: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/top-3.5beta8-sol26-sparc-local.gz
    5. TRACEROUTE: ftp://sunsite.unc.edu/pub/solaris/freeware/sparc/2.6/traceroute-1.4a5-sol26-sparc-local.gz
    6. System Patches “Generic_105181-14”: ftp://sunsite.unc.edu/pub/sun-info/sun-patches/2.6_Recommended.tar.Z 

3.      Create the directories /home/hotmail and /home/hotmail/messenger

4.      With root privileges, add the following lines to /etc/system:

set tcp:tcp_conn_hash_size=32768

set tcp:tcp_close_wait_interval=60000

set tcp:tcp_keepalive_interval=600000

5.      <TBD: Need to install standard admin scripts>
6.      <TBD: Need to get server build.tar.gz>
7.      <TBD: Need to get standard configuration files for SB>
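After preparing a DP, NS, or SB machine, it can be useful to confirm that the three /etc/system lines listed in the setup steps are in place. The Python sketch below checks the text of the file for them; the names `REQUIRED_SETTINGS` and `missing_settings` are hypothetical. It assumes that comment lines in /etc/system begin with an asterisk.

```python
REQUIRED_SETTINGS = [
    "set tcp:tcp_conn_hash_size=32768",
    "set tcp:tcp_close_wait_interval=60000",
    "set tcp:tcp_keepalive_interval=600000",
]


def _squash(line: str) -> str:
    """Drop all whitespace so 'a = b' and 'a=b' compare equal."""
    return "".join(line.split())


def missing_settings(etc_system_text: str) -> list:
    """Return the required lines that are not present in the given text."""
    present = set()
    for line in etc_system_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("*"):   # blank or comment
            continue
        present.add(_squash(stripped))
    return [s for s in REQUIRED_SETTINGS if _squash(s) not in present]
```

A typical use is `missing_settings(open("/etc/system").read())`; an empty list means all three lines were found.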

Friends M-serv

Follow standard Hotmail M-serv installation instructions

Friends U-store

Follow standard Hotmail U-store installation instructions

Q: Does the standard U-store installation include Veritas?

Q: what is the standard inode size used?

Network Setup

The following are ports that need to be enabled:

From                                         To                                                  Port #                         Requirement

The following are allowed for incoming connections from Public Internet:

Client                                       DP                                                  1863 (NSPort)                  Allow incoming connections from Internet
Client                                       NS                                                  1863 (NSPort)                  Allow incoming connections from Internet
Client                                       SB                                                  1863 (SBPort)                  Allow incoming connections from Internet

The following are allowed for outgoing connections to Public Internet:

NS                                           Public Internet                                     25? (SMTP)                     Allow Qmail to send outgoing Email

The following are used for connections to/from Hotmail or other HM-operated sites:

NS                                           Authentication Mserv (at Hotmail and other sites)   901-916 (Mserv Ports)          Queries to Authentication Mserv
NS                                           Authentication Ustore (at Hotmail and other sites)  ??? (XFS)                      XFS transaction to Authentication Ustore
Postman (or XFS 2.0) at Hotmail email sites  NS                                                  ??? (NSEmailNotificationPort)  Postman generated notifications to NS (UDP)
NS                                           Membership Directory                                10017 (MemdirPort)             For NS to query the Membership Directory

The following are used only within the Messenger sites, and should be blocked from Internet and Hotmail/HM-operated sites:

NS                                           Friends Mserv                                       901-916 (Mserv Ports)          Queries to Friends Mserv
NS                                           Friends Ustore                                      ??? (XFS)                      XFS transaction to Friends Ustore
NS                                           NS                                                  ??? (NSNSPort)                 Inter-server communication
NS                                           SB                                                  ??? (NSSBPort)                 Inter-server communication
SB                                           NS                                                  ??? (SBNSPort)                 Inter-server communication
Any Messenger server                         Any other Messenger server                          ??? (MulticastStatusPort)      System-wide status and commands (Multicast)
Servers within Messenger site                NS                                                  ??? (NSSuperUserPort)          For admin purposes
Servers within Messenger site                SB                                                  ??? (SBSuperUserPort)          For admin purposes

The following are special ports used for administrative purposes:

Terminal Servers                             Any Messenger Server                                ???                            For admin purposes
???Nodemon???

 

 

 

Other admin stuff

 

 

·   FTP for builds?

·   Rdist of files and passwd?

·   clock-sync
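The Internet-facing part of the port table reduces to a very small allow list: only client connections to port 1863 on the DP, NS, and SB machines are accepted from the public Internet. The Python sketch below encodes just those rows; the names `INTERNET_INBOUND` and `internet_inbound_allowed` are hypothetical, and the sketch is not a firewall configuration.

```python
# (server role, port) pairs that the table allows from the public Internet.
INTERNET_INBOUND = {
    ("DP", 1863),   # NSPort
    ("NS", 1863),   # NSPort
    ("SB", 1863),   # SBPort
}


def internet_inbound_allowed(role: str, port: int) -> bool:
    """True if the table permits an Internet client to reach this role/port."""
    return (role.upper(), port) in INTERNET_INBOUND
```

Everything else in the table, such as the Mserv ports 901-916, is reached only from Messenger or Hotmail machines and so returns False here.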

Configuration Files

server.conf   (updated 6/26/00)

filename         server.conf
usual location   /home/hotmail/messenger
servers          NS, SB, DP
referenced by    startup script, e.g. “ns /home/hotmail/messenger/server.conf”, and many utilities
purpose          system-wide configuration file; where NS hash table is defined; must be synchronized on all machines in system

The entries in server.conf are:

Parameter Name

Parameter Value

Description / Usage

Detach

yes | no

Whether the server process should be forked from the shell process.

Default should be yes.

AbortOnLogAssert

yes | no

Whether the process just dies when a log assert occurs, should only be yes in a test environment

 

Default should be no

ServerBackwardCompatibility

yes | no

If set to yes, means server is running in M 4.2 mode as opposed to M5 mode

LogLevel

DEBUG | SPEW | ERROR | WARNING | INFO | REFCOUNT | PROTOCOL | SOCKET

Various levels of logging.

Default should be DEBUG.

MaxLogFileSize

number of bytes

Maximum size of the application Log Files (not to be confused with sys log)

Default should be 3000000

MaxOutFileSize

number of bytes

NS reporting the current state of the threads (tool to figure out how stressed servers are)

 

Default should be 25000000

MaxNumConnections

number of sockets

Maximum number of Simultaneous Online Clients (SOC) that the server can adjust to without restarting.

 

Default should be 100000

NumConnections

number of sockets

This is the number of SOC that we use when the server is started up.  This can be adjusted through the su port.  This parameter cannot exceed the MaxNumConnections.

 

Default should be 50000

ThreadQueueSize

number of requests

The max queue size for requests to use the thread pool.

Default value should be slightly more than NumConnections

MaxCommandsToProcess

number of commands

Number of commands to process in one iteration.

AuthProxy

Yes | no

Whether to use proxy to access Passport Auth servers.

 

Reserved for testing purposes; always set to NO

RouterProxy

Yes | no

Whether to use proxy to access Notification Routers

 

Reserved for testing purposes; always set to NO

Proxy

IP address

The proxy’s IP address

 

Reserved for testing purposes; set to 0.0.0.0

ProxyPort

Port Number

The proxy’s port number

 

Reserved for testing purposes; set to 0

PassportDomainMap

file path and name

file path/location of partner.xml file obtained from passport.

EmailPassportDomain

domain name

This is the domain to use to authenticate e-mail passport holders (default is passport.com)

KeyFile

file path and location

This is the file path and location of the Messenger Key file.  This is the file that contains the passport issued key for messenger.

SiteID

numeric

This is our passport assigned Site ID for Messenger

PassportLoginTimeOut

time in sec.

This is how long the NS waits in sec before timing out in authentication requests.

MaxAuthConnectionsPerDomain

number of connections

This is the maximum number of authentication connections that can be made to each domain at once.

EmailPassportEnabled

yes | no

Setting this to NO allows M5 to be deployed before e-mail passport is released by Passport team.

RouterConfigFile

file path and name

This is the location of router.conf which is used to communicate with the notification router for paging.

UstoreConnectionsBuffer

number of requests

Number of requests that can be queued up to look at a u-store

MinConnectionsPerUstore

number of connections

Number of connections that the server keeps open to the u-store

MaxListSize

Number of contacts

Number of contacts that can be kept on your buddy list

CvrFile

file path and name

Path/name of the cvr.csv file, used by the auto-upgrade process

FFTemplateFile

file path and name

Path to the RecruitMailTable.csv, which specifies which e-mail template to use for different languages.

InvalidNameFile

file path and name

Specifies the file that contains words that are not allowed to be used in the user’s friendly name.

URLListFile

file path and name

Identifies the URLs that Messenger auto-logon will use for different domains.

(This file is used for legacy support of MSNP2 clients.)  For newer clients, the same information is stored in msgdomain.conf.

MaxAllowedCommandsPerMinute

number of commands

Total number of commands per minute that the client can send to the server

MaxAllowedSNDCommandPerMinute

number of commands

Number of friends and family mails that a client can send per minute

MaxAllowedCHGCommandPerMinute

number of commands

Number of status changes that a client can make per min

MaxAllowedPAGCommandPerMinute

number of commands

Number of mobile pages a client can send per minute.

PostmanPort

Port number

The port that is utilized to talk to the postman at hotmail (e-mail notifications)

WebTVConfigFile

file path and name

path for the config file that contains information for interoperating with webtv

StatsServiceBufferSize

size (in bytes)

Size of the stat log buffer

MaxStatFileVersion

number 

Number of versions of the stat file that can be generated, before the version number wraps around

StatFileSizeLimit

size (in bytes)

Maximum size of the stat service logs

MemdirIP

IP address

IP address of the Membership Directory Messenger uses for queries

MemdirPort

port number

Port number used to query  Membership Directory

TemporaryDirectory

directory path

Directory used to store temporary files generated during a Membership Directory query

MessengerDomainMapFile

file path and name

Path/name of the msgdomain.conf  used to identify the following information for each domain: 1) domain index number 2) the Friends Mserv 3) whether e-mail notification is supported for the domain 4) which friends u-store to use 5) the url's for autologin

CurDiskFile

file path and name

Path/name of the currdisk file on the Ustore.  The currdisk file is used to find out where on the Ustore new friends files should be created (e.g. s1\t1, etc.)

HotmailDomainMapFile

file path and name

Path/name of domainmap.txt, used to identify which domain uses which Authentication Mserv.  Only domains that have Authentication Mserv are listed in this file.

KeepAliveTimeout

time (in sec)

The time (in seconds) after which the server will start sending  “pings” to the client to see if it is still there.

CloseWaitInterval

time (in sec)

The time after which the server will clean up a closed socket.  Do not set this below 60 seconds.

TcpIpAbortInterval

time (in sec)

The time after which TCP/IP stops waiting for an ACK from the other end.

DelayedWriteInterval

time (in sec)

Amount of time to wait before committing a cached Ustore write to disk.  

MachineN (N = 1, 2, 3 …)

IP Address and Machine Type

e.g. “Machine1: IP=1.2.3.4, NS=1”

Each NS in the system must have one of these “MachineN” entries.  The purpose is to list all the NS machines that are being used in the system, so that every machine’s calculation of the hash is identical.  This is the main (but not only) reason why the server.conf file must be identical throughout the system.

Note that DPs, while they are running NS executables, are not listed here, because they do not receive and handle any hash buckets.

Note also that SBs are listed here for reference, but SBs can be added dynamically, as each SB broadcasts its availability to the system via multicast status messages.

NSPort

 

port number

Port used by the NS to communicate with the clients.

IANA has assigned 1863 as the client-server port for Messenger.

NSSuperUserPort

 

port number

Port number used by the NS for superuser commands

WebInterfacePort

port number

Port for User Watch Count and Web Add

NSNumThreads

 

number of threads

Number of threads spawned by the NS

NSPermittedAddress

 

IP address mask

<not yet implemented???>

Limits the address range an NS could be located, a security measure.

The notation is masked IP address, “/”, IP mask.  E.g. 127.0.0.1/255.255.255.255

NSLogFile

 

file path and name

The location and name of the NS log file.  The NS Log File is the chief logging mechanism of the NS, and is limited to the size of MaxLogFileSize

NSPidFile

 

file path and name

The location and name of the NS PID file, which stores the process ID of the NS process.  This file is very small.

NSOutFile

 

file path and name

The location and name of the NS out file.  This file grows very slowly or not at all.

NSErrFile

 

file path and name

The location and name of the NS err file.  This file grows very slowly or not at all.

NSEmailNotificationPort

 

port number

Port number where Postman will deliver new email notifications

MSNP2ProtocolEnabled

yes | no

whether the server supports MSNP2

MSNP3ProtocolEnabled

yes | no

whether the server supports MSNP3

MSNP4ProtocolEnabled

yes | no

whether the server supports MSNP4

SBPort

 

port number

Port used by the SB to communicate with the clients.

IANA has assigned 1863 as the client-server port for Messenger.

SBSuperUserPort

port number

Port number used by the SB for superuser commands  

SBNumThreads

 

number of threads

Number of threads spawned by the SB

SBLogFile

 

file path and name

The location and name of the SB log file.  The SB Log File is the chief logging mechanism of the SB, and is limited to the size of MaxLogFileSize

SBPidFile

 

file path and name

The location and name of the SB PID file, which stores the process ID of the SB process.  This file is very small.

SBOutFile

 

file path and name

The location and name of the SB out file.  This file grows very slowly or not at all.

SBErrFile

 

file path and name

The location and name of the SB err file.  This file grows very slowly or not at all.

SBMaxSessions

number of IM sessions

Maximum number of simultaneous switchboard connections

SBMaxSessionsPerUser

number of IM sessions

Maximum number of IM sessions that one client can be in.

SBMaxUsersPerSession

number of users

Maximum number of users that can be in an IM session

PSPort

port number

Port used by the CS to talk to the PS

NOTE: this port needs to be filtered from external access

CSSBPort

port number

Port used for CS to SB communications

NOTE: this port needs to be filtered from external access

PSNRPort

 

port number

Port used for PS to NRouter communications

NOTE: this port needs to be filtered from external access

MulticastStatusPort

 

port number

Port used by the entire system when multicasting status and other communications
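A few of the rules stated in the table lend themselves to a pre-deployment check: NumConnections must not exceed MaxNumConnections, ThreadQueueSize should be slightly more than NumConnections, and CloseWaitInterval should not go below 60 seconds. The Python sketch below checks those three rules and parses a MachineN entry. The names `parse_machine_entry` and `check_limits` are hypothetical, and the entry syntax is inferred from the single example “Machine1: IP=1.2.3.4, NS=1”, so the real parser may accept other forms.

```python
import re

# Assumed shape, based only on the example "Machine1: IP=1.2.3.4, NS=1".
MACHINE_RE = re.compile(
    r"^Machine(\d+):\s*IP=(\d+\.\d+\.\d+\.\d+),\s*([A-Z]+)=(\d+)$"
)


def parse_machine_entry(line: str) -> dict:
    """Split one MachineN entry into its number, IP, type and value."""
    match = MACHINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a MachineN entry: {line!r}")
    number, ip, kind, value = match.groups()
    return {"number": int(number), "ip": ip, "type": kind, "value": int(value)}


def check_limits(params: dict) -> list:
    """Return a list of rule violations for already-parsed integer values."""
    problems = []
    if params["NumConnections"] > params["MaxNumConnections"]:
        problems.append("NumConnections exceeds MaxNumConnections")
    if params["ThreadQueueSize"] <= params["NumConnections"]:
        problems.append("ThreadQueueSize should be slightly more than NumConnections")
    if params["CloseWaitInterval"] < 60:
        problems.append("CloseWaitInterval is below 60 seconds")
    return problems
```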

msgdomain.conf (updated 6/26/00)

filename         msgdomain.conf
usual location   /home/hotmail/messenger/conf
server           NS
referenced by    MessengerDomainMapFile in server.conf
purpose          domain map to the Messenger Friends Mserv

 

msgdomain.conf is a configuration file that contains information on how to handle these features for the Messenger-enabled domains:

·         Friends file storage (Mserv and Ustore)

·         Email Notification

·         Inbox Integration

An entry in server.conf, MessengerDomainMapFile, references the location and name of the text file used as the Messenger domain map.

msgdomain.conf is Messenger specific and is not used anywhere else in the Hotmail site.  This file needs to be created and modified by the Messenger administrators as appropriate.

The file format is very similar to that of the domainmap.txt:

#Version 1.0

#

#IMPORTANT: The pairing of DomainName and DomainIndex is permanent barring code changes.

#

#The format of each line is:

#DomainName   DomainIndex   MServIPAddress       EmailNotify   Friend U-Store       inbox folders message compose emailnotification

hotmail.com               209.185.128.169                  209.185.128.162      /cgi-bin/HoTMaiL /cgi-bin/folders /cgi-bin/getmsg /cgi-bin/compose /cgi-bin/getmsg?msg=%s&start=%s&len=%s\n

Because this is the counterpart of domainmap.txt, there are two key issues:

  1. Each domain that is listed in domainmap.txt needs to be listed in msgdomain.conf, and vice versa.
  2. With the exception of hotmail.com, every domain index used in msgdomain.conf needs to match the one used in domainmap.txt.

For example, if “fubar.com” is listed as “@99” in domainmap.txt, then fubar.com must also appear in msgdomain.conf, and must have a DomainIndex of 99.
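The two consistency rules above can be checked mechanically once both files have been reduced to domain-to-index maps. The Python sketch below assumes that reduction has already been done; the function name `check_domain_maps` is hypothetical, and a value of None stands for hotmail.com having no index in domainmap.txt.

```python
from typing import Dict, List, Optional


def check_domain_maps(
    domainmap: Dict[str, Optional[int]],
    msgdomain: Dict[str, int],
) -> List[str]:
    """Report domains missing from either file and mismatched indexes."""
    problems = []
    for domain in sorted(set(domainmap) - set(msgdomain)):
        problems.append(f"{domain} is missing from msgdomain.conf")
    for domain in sorted(set(msgdomain) - set(domainmap)):
        problems.append(f"{domain} is missing from domainmap.txt")
    for domain in sorted(set(domainmap) & set(msgdomain)):
        if domain == "hotmail.com":
            continue   # hotmail.com is exempt from the index rule
        if domainmap[domain] != msgdomain[domain]:
            problems.append(
                f"{domain} has index {domainmap[domain]} in domainmap.txt "
                f"but {msgdomain[domain]} in msgdomain.conf"
            )
    return problems
```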

The fields in msgdomain.conf are:

Parameter Name

Description / Usage

DomainName

Name of the domain.  E.g. “msn.com”

DomainIndex

The short-hand notation of the domain, as agreed to by the entire site.   If “user@fubar.com” is referred to as “user@99” in the short-hand notation, then fubar.com’s DomainIndex is 99.

A special case is hotmail.com.  Hotmail.com users are special-cased in the domainmap.txt so that “user@hotmail.com” is just referred to as “user”.  Thus hotmail.com is generally thought of as having no DomainIndex.  Nonetheless, for consistency hotmail.com is listed as DomainIndex 0 in msgdomain.conf.

MServIPAddress

The IP address of the Friends Mserv handling this domain.

EmailNotify

0 – for domains that are not email enabled, i.e. email is not based on Hotmail technology

1 – for domains that are based on Hotmail’s passport technology (e.g. domessengerlogin CGI), and are Hotmail email enabled

2 – for domains that are based on Passport’s authentication technologies, but do have Hotmail email enabled.   E.g. Mariner devices for Msn.com

Friends U-store

Indicates which Friends Ustore should store newly created Friends Files.

inbox

The URL that leads to the Inbox web page.  This is sent to the client when requested, so that the client can request this page via the Passport Auto-login feature.

folders

The URL that leads to the Folders page in the Inbox.  This is sent to the client when requested, so that the client can request this page via the Passport Auto-login feature.

compose

The URL used to go to the web page allowing the user to compose a new email.  This is sent to the client when requested, so that the client can request this page via the Passport Auto-login feature.

emailnotification

The URL used to launch to a particular email message when that message has just been received by the Inbox.  This is sent to the client when requested, so that the client can request this page via the Passport Auto-login feature.

domainmap.txt (updated 6/26/00)

filename

domainmap.txt

usual location

/home/hotmail/messenger/conf

servers

NS

referenced by

HotmailDomainMapFile in server.conf

purpose

domain map to the Hotmail Authentication Mserv

domainmap.txt is the mechanism by which Messenger finds the Authentication Mserv handling a particular domain, such as hotmail.com.  In the Passport era, domainmap.txt is no longer used for authentication, but it is still used for the following purposes:

·        UserWatchCount feature, in which the Mserv bit flags are set/unset depending on whether the UWC should be shown

·        WebTV, to serve as a way to check whether a WebTV user exists

·        In cases where the Passport authentication did not provide a Magic Cookie, to allow the Messenger server to identify which Authentication Ustore stores the user’s email

Note that only domains supported by Hotmail, plus the WebTV domain, should be listed in domainmap.txt

The domainmap.txt has the following format:

#This file will be dropped from HotMail

#Version 1.1

#The separator is "::"

#The fields are

#Domain_Name    Domain_Map_String   MServ_TCP_Address    MServ_UDP_Address

hotmail.com           ::            ::207.82.250.232     ::209.1.112.251

msn.com               ::@2          ::64.4.5.148         ::209.1.112.251

webtv.net             ::@3          ::209.185.128.50     ::209.1.112.251

Only the Domain_Name, Domain_Map_String, and MServ_TCP_Address fields are used.  The MServ_UDP_Address field is not currently used by Messenger.
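
As an illustration of the format above, here is a minimal parsing sketch.  It is not the server's own parser; the names `DomainMapEntry` and `parse_domainmap` are hypothetical, and it assumes that every data line carries exactly four "::"-separated fields, as in the sample.

```python
from typing import NamedTuple


class DomainMapEntry(NamedTuple):
    domain: str
    map_string: str  # "" for hotmail.com, otherwise e.g. "@2"
    mserv_tcp: str
    mserv_udp: str   # parsed, but not currently used by Messenger


def parse_domainmap(text):
    """Parse the contents of a domainmap.txt file into a list of entries."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = [field.strip() for field in line.split("::")]
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {raw!r}")
        entries.append(DomainMapEntry(*fields))
    return entries
```

Note that the hotmail.com line yields an empty `map_string`, which matches its special-case handling described earlier.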

cvr.csv  (updated 6/26/00)

filename

cvr.csv
usual location

/home/hotmail/messenger/conf

server

NS

referenced by

CvrFile in server.conf

purpose

client auto-upgrade matrix

cvr.csv tells the clients when there are any auto-upgrades available.  The CVR process goes as follows:

1.      When an NS is started, it loads the cvr.csv file into memory.  (Note that once loaded, any changes to the cvr.csv file won’t take effect until a refreshconfig is issued to the servers.)

2.      When a client logs on to an NS, the client identifies itself to the server.  It indicates, among other things, its client version number, the OS name, the processor type, and the language version.

3.      The NS refers to the CVR mapping it has in memory, finds the best-match entry, and returns it to the client.

4.      If there are newer versions of the client available, the client prompts the user to take the upgrade.

A cvr.csv file consists of multiple rows, with each row consisting of these columns:

Parameter Name

Description / Usage

client-lcid, os-name, os-version, processor-architecture, client-name, client-version

These are the input parameters to the CVR process, used by the NS in the table lookup.

Respectively, they identify these characteristics of the client: language ID, name of the OS, version of the OS, the processor type, name of the client, and the version of the client.

brand id

This is to be used in the future for partnering

latest-version, current-version, min-version

These three parameters are returned to the client to indicate whether any new versions with the same characteristics (e.g. Win32+x86+English) are available.

At present, latest-version and current-version mean the same thing: the latest or most current version of the client available that matches the version the user is currently using.  If the latest-version/current-version is higher than that of the user’s client, then the client will prompt the user to do an optional auto-upgrade.

The min-version is the minimum version necessary to log on to the server.  It is used to force a mandatory upgrade in the case of protocol changes, serious client bugs, or to obsolete support of old clients.

URL1

This is the primary URL the client will use to download the latest client bits.  The Messenger client will automatically retrieve the file, shut down, install the new client, and restart.

Note that it is assumed that if a user decides to upgrade, they are always upgraded to the latest version available.

URL2

In case the primary URL fails, this secondary URL is launched inside a browser, for the user to manually retrieve the latest client bits.

Note:  The easiest way to manipulate the cvr.csv file is to use Excel.  Excel will open the file as a spreadsheet, allowing easy manipulation and perusal of the file.
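
The optional/mandatory upgrade rule described for latest-version and min-version can be sketched as follows.  This is an illustration only: the function names are hypothetical, and it assumes version strings are dotted numbers (such as 1.0.0838) that compare field by field, which this guide does not confirm.

```python
def parse_version(text):
    """Turn a dotted version string such as "1.0.0838" into (1, 0, 838)."""
    return tuple(int(part) for part in text.split("."))


def upgrade_action(client_version, latest_version, min_version):
    """Classify the upgrade a client should be offered."""
    client = parse_version(client_version)
    if client < parse_version(min_version):
        return "mandatory"  # client is too old to log on
    if client < parse_version(latest_version):
        return "optional"   # a newer client exists; prompt the user
    return "none"
```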

partner.xml  (updated 6/26/00)

filename

Partner.xml

usual location

/home/hotmail/messenger/conf

server

NS

referenced by

PassportDomainMap in server.conf

purpose

Contains the URLs that Messenger uses to authenticate against each domain, and the URL to launch to Passport Member Services.

Partner.xml is issued by Passport, and is the mechanism by which Passport broadcasts information to its partner sites.   Of the vast amount of information in Partner.xml, Messenger uses only the following XML tags:

  • <DefaultSiteId>
  • <MD5Auth>
  • <MD5Silent>
  • <PassportInformationCenter>

messenger.key  (updated 6/26/00)

filename

Messenger.key

usual location

/home/hotmail/messenger/conf

server

NS

referenced by

KeyFile in server.conf

purpose

Passport assigned Messenger key

This file contains the Passport key assigned to Messenger.  There can be multiple key versions in this file.

router.conf  (updated 6/26/00)

filename

Router.conf

usual location

/home/hotmail/messenger/conf

server

NS

referenced by

RouterConfigFile in server.conf

purpose

A config file used by Notification Router, but also used by the NS to interoperate with Notification Router, for Mobile paging, etc.

This is the config file for mobile notifications that lives on the PS 

Entry Name

Parameter Value

Description / Usage

MobileEnabled

yes | no

Flag to turn on Mobile paging feature.

MobileSignupURL

URL

URL to launch when user wants to signup for Mobile capability

MobileChangeURL

URL

URL to launch when user wants to change his Mobile settings

MaxRouterConnections

Number of sockets

Maximum number of Notification Routers that can talk to one PS, for notification incoming to the PS

MobileSiteID

Number

The Site ID for Mobile

MobileTemplate

XML String

An XML template for posting notifications to the NRouter

webtv.conf  (updated 6/26/00)

filename

Webtv.conf

usual location

/home/hotmail/messenger/conf

server

NS

referenced by

WebTVConfigFile in server.conf

purpose

WebTV shared key

This file contains the shared key with WebTV.

recruitmailtable.csv  (updated 6/26/00)

filename

RecruitMailTable.csv

usual location

/home/hotmail/messenger/conf

server

NS

referenced by

FFTemplateFile in server.conf

purpose

Indicates how FF mail templates are used

This table allows the specification of these scenario variations:

·        LCID (Language and Country ID)

·        Client identification string (used for identifying different versions of the client, such as co-branding, etc.)

For each of the above scenarios, it identifies the following:

·        Email template used for the above variations

·        The preview Email web page used for the variations.

email*.eml  (updated 6/26/00)

filename

*.eml

usual location

/home/hotmail/messenger/conf

Server

NS

referenced by

CvrFile in server.conf

Purpose

Email templates for referral mail

These templates are used by the NS to send out referral email (aka Friends & Family mail).  The bulk of the text in these templates is used verbatim, while the following parameters are substituted with real user information:

Entry Name

Description / Usage

${usermail}

Sender’s fully-qualified email address

${userfriendly}

Sender’s friendly name

${contactmail}

Destination email address

${brandname}

Brand Name string, from RecruitMailTable.csv

${ProductURL}

Preview Email URL, from RecruitMailTable.csv

${currentdate}

Date string
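
The ${name} placeholders listed above happen to use the same syntax as Python's string.Template, so the substitution step can be illustrated with it.  This is a sketch of the idea, not the NS implementation; the function name and the sample template text are hypothetical.

```python
from string import Template


def render_referral_mail(template_text, values):
    """Fill ${name} placeholders; unknown placeholders are left untouched."""
    return Template(template_text).safe_substitute(values)


# Hypothetical template fragment, for illustration only.
sample_template = "From: ${userfriendly} <${usermail}>\nTo: ${contactmail}\n"
print(render_referral_mail(sample_template, {
    "userfriendly": "Pat",
    "usermail": "pat@hotmail.com",
    "contactmail": "friend@msn.com",
}))
```

Using `safe_substitute` rather than `substitute` means a template with a placeholder that has no supplied value is passed through unchanged instead of raising an error.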

urllist.txt   (updated 6/26/00)

filename

Urllist.txt

usual location

/home/hotmail/messenger/conf

server

NS

referenced by

UrlListFile in server.conf

purpose

Legacy support for MSNP2 clients: URLs to autologin to Hotmail, etc.

This file contains URLs for MSNP2 clients.  (For MSNP3 and above clients, the same information is stored in msgdomain.conf.) These URLs are used to autologin a client so that they can go to the Hotmail Inbox, etc.

Entry Name

Description / Usage

Password

Launching to the password page (???)

inbox

Launching to the Inbox page

person

Launching to the Personal Info page (???)

folders

Launching to the Hotmail Folders page

message

Launching to a particular email message in the Inbox

compose

Launching to a Compose email page

login

???

emailnotification

Launching to an email message that has just been received in the Inbox

not_allowed.txt  (updated 6/26/00)

filename

Not_allowed.txt

usual location

/home/hotmail/messenger/conf

server

NS

referenced by

InvalidNameFile in server.conf

purpose

List of words not allowed to be used in friendly names

This file is a list of words that aren’t allowed to be used in friendly names.  Highly recommended read.

 

Monitoring

Hotmail already uses a monitoring tool named “Node Monitor”.  

To see node monitor in action, set your proxy to “svlproxy: 80”, and view the page <http://simple1.hotmail.com/nodemonitor/>

Node Monitor consists of Perl scripts.  The scripts that run on each server (a.k.a. agents) collect data and make it available at specified ports (or via named pipes).  Another set of Perl scripts called the “console” collects data from the various agents, looks for exceptional conditions in that data, and then publishes the results as a set of web pages.

Node Monitor provides the following information:

1.         Real-Time Alerts: Emergency situations that require immediate operator attention.

2.         Warnings: Abnormal statuses that do not require immediate attention

3.         Status: Data that has been collected from the Node Monitor agents and that is displayed in numeric and/or alpha (OK/Not OK) format.

Node Monitor will handle the Messenger servers as a separate geographic site, and will provide data on each of the following five server types:

·         Messenger user servers are the Friends U-stores.  These servers will run the same Node Monitor agents as the Auth U-Stores deployed within the Hotmail site.

·         Messenger database servers are the Friends M-serves.  These servers will run the same Node Monitor agents as the Auth M-Serves deployed within the Hotmail site.

·         Messenger dispatch servers are the DP servers.  These DP machines are positioned behind local directors, and are fully redundant.  MSN Messenger Service will function even if only one DP is operational.  An alert will be generated only when one or more DP machines approach maximum capacity, or when all of the DP machines are down. 

·         Messenger notification servers are the NS servers.  Since there is no redundancy in these servers, any individual server failure will trigger a real-time alert.

·         Messenger switchboard servers are the SB servers.  These servers are dynamically load balanced by the Messenger architecture, and alerts will be generated only when one or more SB machines approach maximum capacity, or when one or more servers are down. 


The following Node Monitor services are being added specifically for MSN Messenger Service:

For all servers:  We’ll calculate (and display) a CRC checksum for the config file on each server.  (Note: When we have time, we’d like to add the ability to generate an alert when the checksum doesn’t match the checksum from a “reference” machine.)
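
The checksum comparison could look roughly like the sketch below.  It is an assumption-laden illustration: this guide does not say which CRC algorithm Node Monitor uses, so CRC-32 from Python's zlib module stands in for it, and both function names are hypothetical.

```python
import zlib


def file_crc32(path, chunk_size=65536):
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def matches_reference(path, reference_crc):
    """True when the local config file has the same CRC as the reference."""
    return file_crc32(path) == reference_crc
```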

srvc_ns: (available on all NS machines)

Example:

+OK ns 1.0.0838 Jun 15 1999 17:37:54 up 0d 18h 7m

SOC: 49586

SOCMin: 0

SOCMax: 49666

Logons: 61558

Logoffs: 14145

FailedAuths: 0

StatusChanges: 332469

UnreadEmails: 61544

EmailNotifys: 0

ContactsAdded: 260316

ContactsRemoved: 10452

Searches: 0

SearchHits: 0

FFMails: 0

LastReset: 929507616 = Tue Jun 15 21:33:36 1999

+OK ns

build number, date, time, and uptime

SOC:

simultaneous online connections (red when >45000)

SOCMin:

smallest SOC since last reset (always zero)

SOCMax: 

largest SOC since last reset

Logons:

count of logons since last reset

Logoffs:

count of logoffs since last reset

FailedAuths:

count of failed authentications since last reset

StatusChanges: 

count of user status changes (busy, away, etc.) since last reset

UnreadEmails:

count of emails reported to users at logon

EmailNotifys:

count of incoming emails while logged on

ContactsAdded:

count of contacts added since last reset

ContactsRemoved:

count of contacts deleted since last reset

Searches:

count of searches conducted from within the Add Wizard

SearchHits:

count of successful hits for search activity

FFMails:

count of Friends & Family mail pieces sent

LastReset:

date and time of last reset
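
A status report in the format shown above can be read into a dictionary with a few lines of code.  The sketch below is illustrative and is not part of Node Monitor; the function names are hypothetical, and the 45000 threshold is the "red" level quoted for SOC above.

```python
def parse_srvc_status(text):
    """Split a srvc_ns/srvc_sb report into its +OK banner and its counters."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("+OK"):
        raise ValueError("status report does not start with +OK")
    counters = {}
    for line in lines[1:]:
        # Split on the first colon only, so LastReset keeps its time string.
        name, _, value = line.partition(":")
        counters[name.strip()] = value.strip()
    return lines[0], counters


def soc_is_red(counters, threshold=45000):
    """True when simultaneous online connections exceed the red threshold."""
    return int(counters["SOC"]) > threshold
```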

 

srvc_sb: (available on all SB machines)

Example:

+OK sb 1.0.0833 Jun  8 1999 19:35:22 up 0d 0h 0m
SOC: 1
SOCMin: 0
SOCMax: 1
Logons: 0
Logoffs: 0
FailedAuths: 0
Sessions: 0
SessionsMin: 0
SessionsMax: 0
SessionsTotal: 0
Messages: 0
MessageBytes: 0.0
MessageLengthMin: 2000
MessageLengthMax: 0
LastReset: 928895861 = Tue Jun  8 19:37:41 1999
 

+OK sb

build number, date, time, and uptime

SOC:

simultaneous online connections

SOCMin:

smallest SOC since last reset (always zero)

SOCMax:

largest SOC since last reset

Logons:

count of logons since last reset

Logoffs:

count of logoffs since last reset

FailedAuths:

count of failed authentications since last reset

Sessions:

current number of active messaging sessions 

SessionsMin:

smallest session count since last reset

SessionsMax:

largest session count since last reset

SessionsTotal:

count of messaging sessions since last reset

Messages:

count of messages sent since last reset

MessageBytes:

count of message bytes sent since last reset

MessageLengthMin:

smallest message sent since last reset

MessageLengthMax:

largest message sent since last reset

LastReset:

date and time of last reset

 

Configuring Syslog  (updated 6/26/00)

1.      Add the following line to /etc/syslog.conf:

local4.notice                                   /home/logs/messenger.log

2.      Touch a file on each box in /home/logs called messenger.log, to create it for the first time.

3.      Restart the syslogd process for the change to take effect

kill -HUP `ps -ef | grep "syslogd" | grep -v "grep" | awk '{print $2}'`

4.      All Messenger syslog events will now go to /home/logs/messenger.log, NOT /var/adm/messages

 

Regular Maintenance

Shut Down Individual Machines

Shutting down a server is accomplished with the tool msgradmin, which sends commands to the server specified.

For shutting down an NS:

Before an NS can be shut down, users who are already on that NS need to be informed of the shutdown ahead of time.  A script will be provided [TBD] that will shut down the machine gradually:

# warn clients that NS will be down in 30 minutes

msgradmin server.conf systemcomingdown 157.55.8.11 30

sleep 900                                               

# warn clients that NS will be down in 15 minutes

msgradmin server.conf systemcomingdown 157.55.8.11 15

sleep 600

# disable further login

# warn clients that NS will be down in 5 minutes

msgradmin server.conf disablelogins 157.55.8.11

msgradmin server.conf systemcomingdown 157.55.8.11 5

sleep 240

# warn clients that NS will be down in 1 minute

msgradmin server.conf systemcomingdown 157.55.8.11 1

sleep 60

# orderly shutdown

msgradmin server.conf shutdown 157.55.8.11

A warning message is sent to the clients at 30 and 15 minutes before shutdown.  At 5 minutes before shutdown, the script disables further logins and warns again.  At 1 minute, the last warning is given, and at 0 minutes the system is shut down in an orderly way.

Note: Shutting down an NS will cause outages for a segment of the user population.  If the NS needs to be shut down for a lengthy period of time, the system should be configured to function with one less NS.  See Removing an NS for more details.
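
Since the gradual shutdown script is still marked [TBD], the sketch below only generates the sequence of commands and sleeps described above for a given target, without running anything.  The function name and constants are hypothetical; the msgradmin command forms are copied from the sample script.

```python
WARNING_MINUTES = [30, 15, 5, 1]  # when clients are warned, in minutes before shutdown
DISABLE_LOGINS_AT = 5             # logins are disabled together with this warning


def shutdown_plan(target, config="server.conf"):
    """Return the ordered shell lines for a gradual shutdown of `target`."""
    steps = []
    for index, minutes in enumerate(WARNING_MINUTES):
        if minutes == DISABLE_LOGINS_AT:
            steps.append(f"msgradmin {config} disablelogins {target}")
        steps.append(f"msgradmin {config} systemcomingdown {target} {minutes}")
        # Sleep until the next warning, or until shutdown after the last one.
        following = WARNING_MINUTES[index + 1] if index + 1 < len(WARNING_MINUTES) else 0
        steps.append(f"sleep {(minutes - following) * 60}")
    steps.append(f"msgradmin {config} shutdown {target}")
    return steps
```

Calling `shutdown_plan("157.55.8.11")` reproduces the per-NS sequence, and `shutdown_plan("all")` gives the system-wide variant.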

For shutting down a DP, the procedure is much the same, except that a DP can be shut down in one step, because clients do not stay logged on to the DP for very long:

# orderly shutdown

msgradmin server.conf shutdown 157.55.8.23

For shutting down an SB, [???is there any pre-announcement of shutdown in SB???]

Shut Down the Entire System

Shutting down the entire system is similar to shutting down an individual machine.  Because of the involvement of the NS, which has live users connected, the gradual warning/shutdown script needs to be used:

# warn clients that the system will be down in 30 minutes

msgradmin server.conf systemcomingdown all 30

sleep 900                                               

# warn clients that the system will be down in 15 minutes

msgradmin server.conf systemcomingdown all 15

sleep 600

# disable further login

# warn clients that the system will be down in 5 minutes

msgradmin server.conf disablelogins all

msgradmin server.conf systemcomingdown all 5

sleep 240

# warn clients that the system will be down in 1 minute

msgradmin server.conf systemcomingdown all 1

sleep 60

# orderly shutdown

msgradmin server.conf shutdown all

The primary difference here is the use of all in the msgradmin command.

Start Up Individual Machines

The sample scripts S98msgrDPNS and S98msgrSB are provided for use in starting the servers.  S98msgrDPNS is deployed on DP and NS machines, and S98msgrSB is deployed on SB machines.

The command line to start a server is S98msgrDPNS start or S98msgrSB start.  The appropriate configuration files are used in each case.

Note: S98msgrDPNS and S98msgrSB should be placed in /etc/rc on each machine, so that the servers come up automatically if the machine is power-cycled.

Start Up the Entire System

A sample script STARTALL is provided, which reads each machine listed in the file msgrsrv.dat and starts each machine accordingly.

Upgrading Builds

Upgrading builds will be done from a staging server dedicated to the task, with the file rdist’ed to the appropriate servers.

A sample script UPGRADE is provided, with the following syntax:

UPGRADE scope [buildnum] [configdir]

where:

·         scope is the scope of the upgrade. It can be [ALL | ALLDPNS | ALLSB | MachineName]

·         (optional) buildnum is the build number, such as 0833.  If omitted, the script will default to the build with the largest build number

·         (optional) configdir is the directory where a set of configuration files can be found.  If omitted, the script will default to reading the configuration files from the current directory.

UPGRADE takes the tarball specified by buildnum and un-tars it into a temporary directory.  These will be the binaries and executables distributed.

UPGRADE also takes a set of configuration files from configdir and uses them as the configuration files to distribute.

UPGRADE references the file msgrsrv.dat, which lists all the machines and their types in the Messenger system.  UPGRADE uses this information to determine what machines should be updated when ALL, ALLDPNS, and ALLSB are specified.
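
The default of "the build with the largest build number" can be illustrated as below.  This is a sketch rather than the UPGRADE script itself: the function name is hypothetical, and it assumes the bldNNNN.tar.gz naming that this guide uses for server tarballs.

```python
import re

# Assumed naming convention: bld<digits>.tar.gz, e.g. bld0833.tar.gz
BUILD_PATTERN = re.compile(r"^bld(\d+)\.tar\.gz$")


def latest_build(filenames):
    """Return the highest build number (as written, e.g. "0838"), or None."""
    builds = []
    for name in filenames:
        match = BUILD_PATTERN.match(name)
        if match:
            # Compare numerically, but keep the original zero-padded text.
            builds.append((int(match.group(1)), match.group(1)))
    if not builds:
        return None
    return max(builds)[1]
```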

The upgrade process is as follows:

1.                  Get the server tarball from the Messenger development team (method TBD).  The file will be named bld0833.tar.gz, where 0833 is the build number.

2.                  Get configuration files from the Messenger development team if the new build requires a new configuration file.  (Issue: We may package them into the tarball in the future)

3.                  Put the build on staging:/home/hotmail/messenger/

4.                  Place the set of configuration files in a directory, such as staging:/home/hotmail/messenger/, and edit them as necessary.

5.                  Before upgrade, perform the necessary shutdown of the machines that will be upgraded (see shutdown)

6.                  Run UPGRADE with the appropriate scope.  The script pushes the necessary binaries and configuration files to the machine(s) specified, places them where they are needed, and names the files as necessary.

7.                  After upgrade, perform the necessary steps to restart the machines upgraded (see startup)

8.                  Verify that the machine is back in service in Node Monitor (see health check)

9.                  ???Run some basic verification tests???

Rolling Back Builds

Rolling back builds is almost identical to upgrading.  When running the UPGRADE script, specify the particular directory where the corresponding set of configuration files is kept.

The rollback process is as follows:

1.                  Verify that the server tarball for the required build number is available on staging:/home/hotmail/messenger

2.                  Place the set of configuration files in a directory.  This directory need not be the same as the one used in upgrades. (Issue: We may package them into the tarball in the future)

3.                  Before rollback, perform the necessary shutdown of the machines that will be rolled back (see shutdown)

4.                  Run UPGRADE with the appropriate scope.  The script pushes the necessary binaries and configuration files to the machine(s) specified, places them where they are needed, and names the files as necessary.

5.                  After rollback, perform the necessary steps to restart the machines affected (see startup)

6.                  Verify that the machine is back in service in Node Monitor (see health check)

7.                  ???Run some basic verification tests???

Adding/Removing NS

The key to adding or removing NS from the system is to remember that NS are not redundant. 

Any NS being removed from the system means a certain segment of the user population is not being served, until the system-wide configuration is adjusted.  Adjusting the system-wide configuration means changing server.conf so that the entire system knows that there is one less NS, and restarting the entire system, including all DP’s, NS’s, and SB’s.  This is a very expensive proposition.

Adding an NS is a similarly expensive proposition, because the system-wide configuration needs to be adjusted so that every server knows about the new server.

Adding an NS requires these steps:

1.                  Make sure that the new NS is ready for receiving server build and configuration files. (See Setting up NS)

2.                  On the staging server, edit the server.conf configuration file to add an additional line in the Machine section, corresponding to the IP address of the NS being added

 

Machine1: IP=209.185.128.100, NS=1

Machine2: IP=209.185.128.101, NS=1

Machine3: IP=209.185.128.102, NS=1    #new line for the new NS

 

3.                  Perform a system wide shutdown (see shutdown the entire system)

4.                  Run UPGRADE ALL buildnum, where buildnum is the build version to use.  This will distribute the configuration files to the entire system as well as making sure that all NS have the same version of the software.

5.                  The system will now have a server.conf that has one more NS than previously.

6.                  Restart the entire system (see startup the entire system)

7.                  Verify that the system is back in service in Node Monitor (see health check)

8.                  ???Run some basic verification tests???

Removing an NS requires these steps:

1.                  On the staging server, edit the server.conf configuration file to remove the line that matches the NS being removed, and renumber the lines that follow it:

 

Machine1: IP=209.185.128.100, NS=1

Machine2: IP=209.185.128.101, NS=1     # remove this line

Machine2: IP=209.185.128.102, NS=1     # formerly Machine3, renumbered to 2

 

2.                  Perform a system wide shutdown (see shutdown the entire system)

3.                  Run UPGRADE ALL buildnum to distribute the configuration files to the entire system.  The script pushes the necessary binaries and configuration files to all the machines in the system.

4.                  The system will now have a server.conf that has one less NS than previously.

5.                  Restart the entire system (see startup the entire system)

6.                  Verify that the system is back in service in Node Monitor (see health check)

7.                  ???Run some basic verification tests???
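
The remove-and-renumber edit from step 1 of the removal procedure can be sketched as follows.  This is only an illustration: the function names are hypothetical, and it assumes the Machine lines have exactly the form shown in this guide, with no trailing comments.

```python
import re

MACHINE_LINE = re.compile(r"^Machine\d+:\s*(.+)$")


def machine_ip(settings):
    """Extract the IP value from settings such as "IP=209.185.128.100, NS=1"."""
    for item in settings.split(","):
        key, _, value = item.strip().partition("=")
        if key == "IP":
            return value
    return None


def remove_ns(machine_lines, ip):
    """Drop the Machine line with the given IP and renumber the rest from 1."""
    kept = []
    for line in machine_lines:
        match = MACHINE_LINE.match(line.strip())
        if match is None:
            raise ValueError(f"not a Machine line: {line!r}")
        settings = match.group(1).strip()
        if machine_ip(settings) != ip:
            kept.append(settings)
    if len(kept) == len(machine_lines):
        raise ValueError(f"no Machine entry with IP {ip}")
    return [f"Machine{n}: {settings}" for n, settings in enumerate(kept, start=1)]
```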

Adding/Removing DP

DPs are the servers that service the DNS name messenger.hotmail.com.  They are completely redundant, as they are virtualized behind a Local Director.  The only concerns when adding or removing a DP are:

·         that the overall DP load after the change will be at an acceptable level

·         that the DP has the same configuration file (particularly the “machine map”) as the rest of the machines in the system

For adding a DP:

1.      Make sure that the DP has the same build and has the same configuration files as the rest of the system. (See upgrade.)  Note that DP does not need to be listed in the configuration file server.conf.

2.      Start up the DP (See startup)

3.      Configure the Local Director to use this DP in servicing of messenger.hotmail.com

4.      Verify that the machine is back in service in Node Monitor (see health check)

5.      ???Run some basic verification tests???

For removing a DP:

1.                  Check that there are enough DP left to handle the load

2.                  Perform the necessary shutdown of the DP being removed (see shutdown)

3.                  (optional) If this is a permanent removal of the DP, configure the Local Director so that this DP will not be servicing messenger.hotmail.com

Adding/Removing SB

SBs are the servers that service the instant messaging sessions.  All the SBs in a system announce their availability to the rest of the system, and the overall instant messaging load is distributed across the SBs that have announced themselves as available.  Because of this dynamic load balancing architecture, SBs are redundant, and thus adding or removing an SB is fairly simple.  The only concerns when adding or removing an SB are:

·         that the overall SB load after the change will be at an acceptable level

·         that the SB has the same configuration file (particularly the “machine map”) as the rest of the machines in the system

For adding a SB:

  1. Make sure that the SB has the same build and has the same configuration files as the rest of the system. (See upgrade.)  Note that SB does not need to be listed in the configuration file server.conf.
  2. Start up the SB (See startup)
  3. Verify that the machine is back in service in Node Monitor (see health check)
  4. ???Run some basic verification tests???

For removing a SB:

1.                  Check that there are enough SB left to handle the load

2.                  Perform the necessary shutdown of the SB being removed (see shutdown)

Adding/Removing Friends M-serv

[TBD]

Adding/Removing Friends U-store

[TBD]

Stopping Additional Login

msgradmin can be used to direct one or more servers to stop processing logins:

# disable further logins

msgradmin server.conf disablelogins all

Turning off/on Access to a Friends Ustore

·         Feature being implemented, using msgradmin

·         Status of not-accessing Ustore can be seen via Nodemon

Turning off/on Access to an Authentication Ustore

·         Feature being implemented, using msgradmin

·         Status of not-accessing Ustore can be seen via Nodemon

Updating Configuration files

Certain configuration files can be “refreshed” without server down time:

Configuration File

Command

server.conf

Can not be refreshed without server reboot

domainmap.txt

can be updated by using refreshconfig

msgdomain.conf

can be updated by using refreshconfig

cvr.csv

can be updated by using refreshconfig

friend_cur_machines

can be updated by using refreshconfig

urllist.txt

Dynamic update being implemented

Friends and Family email templates

can be updated by using refreshconfig

Node Monitor will provide a check in which the current CRC of each configuration file listed above is compared against the version kept on the staging server.  A warning or alarm will be raised if there is any discrepancy.

Assisting with debug

·         [TBD] scripts will be provided to package up the appropriate log and core files for sending to development team for debug, as well as cleanup of the logs and cores

Tools

msgradmin (updated 6/26/00)

Usage:        msgradmin <config file name> <command> <target> [timer | ustoreIP | refreshoption | domain | protocol]

·         <config file name> is usually server.conf

·         <target> is either ALL or the IP address of the NS/SB/DP to be administered

·         <command> are as listed below:

<command>

Description

systemcomingdown

Sends system message to the clients connected to the NS, that the machine is coming down in [timer] minutes

e.g. msgradmin server.conf systemcomingdown all 5

disablelogins

 

Prevents additional clients from logging into the NS

e.g. msgradmin server.conf disablelogins all

enablelogins

Re-enable NS logins

e.g. msgradmin server.conf enablelogins all

listdisableddomains

lists the currently disabled domains

e.g. msgradmin server.conf listdisableddomains

shutdown

Shuts down the NSs and the SBs in the system (no DP).  Also tells the clients to log on again in [options] minutes

e.g. msgradmin server.conf shutdown all 15

refreshconfig

Refresh the configuration files; i.e. re-read and start using the configuration files on disk.  The scope of the refresh is specified in [options]:

·        passportinfo: partner.xml and messenger.key files

·        clientinfo: cvr.csv

·        domaininfo: domainmap.txt, msgdomain.conf, webtv.conf, urllist.txt, partner.xml, messenger.key

·        mailtemplates:  all the mail templates

·        misc: not_allowed.txt, etc.

·        all:  Everything

enableserverBC

Running this command will enable the M5 backward-compatibility mode, switching the machines to M4.2

e.g. msgradmin server.conf enableserverBC all

disableserverBC

Running this command will disable the M5 backward-compatibility mode, switching the machines back to the M5 mode

 

e.g. msgradmin server.conf disableserverBC all

enableclientprotocol

Enable a particular client protocol.  The currently available options are MSNP3 and MSNP2, which are specified via the [options] parameter

e.g. msgradmin server.conf enableclientprotocol all MSNP3

disableclientprotocol

Disable a particular client protocol.  The currently available options are MSNP3 and MSNP2, which are specified via the [options] parameter

e.g. msgradmin server.conf disableclientprotocol all MSNP3

disableustore

Disable NS’s access to an Authentication U-store specified in [options]

e.g. msgradmin server.conf disableustore all 123.123.123.123

enableustore

Enable NS’s access to an Authentication U-store specified in [options]

e.g. msgradmin server.conf enableustore all 123.123.123.123

listdisabledustores

List all the Authentication U-stores that have been disabled

e.g. msgradmin server.conf listdisabledustores all

enableemailpassport 

Enables the Auth/Add of e-mail passports

e.g. msgradmin server.conf enableemailpassport all

disableemailpassport 

Disables the Auth/Add of e-mail passports

e.g. msgradmin server.conf disableemailpassport all
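
As an aid for choosing a refreshconfig scope, the sketch below maps the file names listed under refreshconfig to the scopes that reload them.  It is illustrative only: the function names are hypothetical, the mailtemplates and all scopes are left out, and the argument order of the generated command (target before scope) is an assumption based on the Usage line above.

```python
# Scope -> files, as listed under refreshconfig in this guide.
REFRESH_SCOPES = {
    "passportinfo": ["partner.xml", "messenger.key"],
    "clientinfo": ["cvr.csv"],
    "domaininfo": ["domainmap.txt", "msgdomain.conf", "webtv.conf",
                   "urllist.txt", "partner.xml", "messenger.key"],
    "misc": ["not_allowed.txt"],
}


def scopes_for(filename):
    """Return every scope that reloads the given configuration file."""
    name = filename.lower()
    return sorted(scope for scope, files in REFRESH_SCOPES.items() if name in files)


def refresh_command(filename, target="all", config="server.conf"):
    """Build a msgradmin line using the narrowest matching scope, or None."""
    scopes = scopes_for(filename)
    if not scopes:
        return None  # e.g. server.conf cannot be refreshed without a restart
    narrowest = min(scopes, key=lambda scope: len(REFRESH_SCOPES[scope]))
    return f"msgradmin {config} refreshconfig {target} {narrowest}"
```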



whichns

Usage: whichns server.conf username

When executed, whichns will resolve a particular username and determine which NS that user is being partitioned to.  This tool is helpful in diagnosing support issues.

servmon

Servmon is a process that runs on msgr-dp1.  This process makes statistics available that detail the number of SOC and IM for each of the NS and SB machines. 

Usage: servmon <config file name> <report port #> [<timeout>]

The default value for timeout is 135 seconds.  If a server does not broadcast its status or respond to a status request within this interval, it is considered faulty.
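
The timeout rule can be illustrated with a few lines of Python.  This is a sketch of the rule as described above, not servmon's actual code: the function name, the data layout, and the machine names msgr-ns2 and msgr-sb1 are assumptions made for the example.

```python
DEFAULT_TIMEOUT = 135  # seconds, the default stated above


def faulty_servers(last_heard, now, timeout=DEFAULT_TIMEOUT):
    """Return the names of servers that have been silent for longer than
    timeout seconds.  last_heard maps server name -> time (in seconds) of
    the last status broadcast or status reply."""
    return sorted(name for name, seen in last_heard.items()
                  if now - seen > timeout)


# Hypothetical sample data: msgr-ns2 was last heard 140 seconds ago.
last_heard = {"msgr-ns1": 1000.0, "msgr-ns2": 860.0, "msgr-sb1": 990.0}
print(faulty_servers(last_heard, now=1000.0))  # ['msgr-ns2']
```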

When servmon fails, you can simply restart it by doing the following:

·         log on to msgr-dp1

·         go to the /home/hotmail/messenger directory

·         type:  servmon server.conf 8250

Staging Server Scripts  (updated 6/26/00)

msgrsrv.dat

·        Serves as the configuration file for all Staging Server Scripts

·        Contains a list of all Messenger server machines in the Messenger cloud (not including Ustores and Mservs)

·        Identifies the following for each machine:

·        Machine name

·        Whether they are DP, NS, or SB

·        Whether they are Spares, or In Service
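
The three fields above amount to one small record per machine.  The sketch below models that record in Python.  The actual layout of msgrsrv.dat is not described here, so the one-line "name role state" format and the SPARE / INSERVICE keywords are purely hypothetical and serve only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str         # machine name, e.g. "msgr-ns1"
    role: str         # "DP", "NS", or "SB"
    in_service: bool  # False means the machine is a spare


def parse_line(line):
    """Parse one line of the HYPOTHETICAL format 'name role state'."""
    name, role, state = line.split()
    if role not in ("DP", "NS", "SB"):
        raise ValueError(f"unknown role: {role}")
    if state not in ("SPARE", "INSERVICE"):
        raise ValueError(f"unknown state: {state}")
    return Machine(name, role, state == "INSERVICE")


print(parse_line("msgr-ns1 NS INSERVICE"))
```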

DEPLOY

·        Usage: DEPLOY [machine name | ALLDPNS | ALLSB | ALL]

·        It prepares a clean-slate machine for use with Messenger:

·        Prepare the necessary directory structure

·        Prop the S98* file into the /etc/rc2.d

·        Prop the cleanup.fnd.pl

·        Prop coreit.pl

·        At the end of DEPLOY, the admin still needs to manually add the set of cron jobs appropriate for the server type

·        For spares, the admin should manually remove the S98* file from /etc/rc2.d so that the server application doesn't auto-start on reboot

UPGRADE

·        Usage: UPGRADE [machine name | ALLDPNS | ALLSB | ALL]

·        Extract the tarball with the highest build number from the staging server’s /home/hotmail/messenger/builds directory into /home/hotmail/messenger/temp

·        Prop all the binaries from the temp directory, and configuration files from /home/hotmail/messenger and /home/hotmail/messenger/config, to the specified machines
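
Picking "the tarball with the highest build number" means comparing the build numbers as integers, not as text (as text, "98" sorts after "1021").  The sketch below shows that selection.  The file naming pattern msgr-<build>.tar is an assumption made for the example; the real tarball names in the builds directory may differ.

```python
import re


def latest_build(filenames):
    """Return the tarball name with the highest build number, or None if
    no name matches.  The 'msgr-<build>.tar' pattern is hypothetical."""
    best = None
    for name in filenames:
        match = re.fullmatch(r"msgr-(\d+)\.tar", name)
        if match is None:
            continue  # skip anything that is not a build tarball
        build = int(match.group(1))  # compare numerically, not as text
        if best is None or build > best[0]:
            best = (build, name)
    return best[1] if best else None


print(latest_build(["msgr-98.tar", "msgr-1021.tar", "msgr-204.tar", "notes.txt"]))
# msgr-1021.tar
```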

UPGRADECONF

·        Usage: UPGRADECONF [machine name | ALLDPNS | ALLSB | ALL]

·        Prop all the configuration files from /home/hotmail/messenger and /home/hotmail/messenger/config, to the specified machines

·        Usually followed by “msgradmin refreshconfig” command to have the new configuration take effect

START

·        Usage: START [machine name | ALLDPNS | ALLSB | ALL]

·        Start the appropriate server application on the specified machine

STATUS

·        Usage: STATUS [machine name | ALLDPNS | ALLSB | ALL]

·        Check the status of the server application on the specified machine

KILL

·        Use only when absolutely necessary, as a last resort

·        Usage: KILL [machine name | ALLDPNS | ALLSB | ALL]

·        Does the equivalent of “kill –9”

·        For normal shutdown, the admin should use the msgradmin tool to shut down properly (with pre-warning, etc.)

CORE

·        Usage: CORE [machine name | ALLDPNS | ALLSB | ALL]

·        Back up the core and log files for debugging to /home/coredump/xxx

CLEARCORE

·        Usage: CLEARCORE [machine name | ALLDPNS | ALLSB | ALL]

·        Remove all core files, if the cores are not interesting for debugging purposes
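
All of the staging scripts above take the same target argument: a machine name, ALLDPNS, ALLSB, or ALL.  The sketch below shows one way that argument could be expanded into a list of machines.  It assumes that ALLDPNS means every DP and NS machine and that ALLSB means every SB machine; the machine list stands in for msgrsrv.dat, and msgr-ns2 and msgr-sb1 are hypothetical names.

```python
# (name, role) pairs standing in for msgrsrv.dat; some names are hypothetical.
MACHINES = [("msgr-dp1", "DP"), ("msgr-ns1", "NS"),
            ("msgr-ns2", "NS"), ("msgr-sb1", "SB")]

# ASSUMED meaning of each group selector.
SELECTORS = {
    "ALLDPNS": {"DP", "NS"},
    "ALLSB": {"SB"},
    "ALL": {"DP", "NS", "SB"},
}


def resolve_target(target, machines=MACHINES):
    """Expand a script target into a list of machine names."""
    if target in SELECTORS:
        roles = SELECTORS[target]
        return [name for name, role in machines if role in roles]
    if target not in [name for name, _ in machines]:
        raise ValueError(f"unknown machine: {target}")
    return [target]


print(resolve_target("ALLDPNS"))  # ['msgr-dp1', 'msgr-ns1', 'msgr-ns2']
```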

SuperUserPort (updated 6/26/00)

For the NS....

 Superuser commands:

General:

configlastupdate                                                     - displays last time config files read
createthreads <n>                                                 - create N new client-server threads
dumppolling                                                           - dump client-server poll list to log
help                                                                       - display this screen
logflush                                                                  - flush logs to disk
logfile [filename [maxsize]]                                      - start a new log file
loglevel { spew_mask | socket_mask | debug_mask |
info_mask | warning_mask | error_mask |
protocol | cvr | refcount |
debug | function |
info | warning |
error }                                                                     - set the logging level
nc                                                                            - # connections
OUT                                                                        - close this super user connection
resetconfig [{ NumConnections |
MaxListSize <NS_ONLY> |
SBMaxSessions <SB_ONLY> |
SBMaxSessionsPerUser <SB_ONLY> |
SBMaxUsersPerSessions <SB_ONLY> |
MaxLogFileSize |
CoreOnAssert }
[newval]]                                                                 - reset config values
retirethreads <n>                                                     - retire N client-server threads
smstate                                                                    - display server manager state
stat {full | counters | reset}                                        - display statistical info
terminate                                                                  - shut down this server
threads                                                                     - display thread activity
unstickservers                                                           - reset server-server poll flags

NS-specific:

allthreads                                                                   - display and dump thread info for all 3 threadpools: client-server, server-server, and blocking
boot <user>                                                              - boot a client
who                                                                          - display who is logged in
examine userHandle                                                  - examine a user on any PS
xfsstate                                                                     - display the state of the XFS connection pools
protocols                                                                  - list protocol versions supported
logcvrstats statsfile [all|inputcols|outputcols]               - log cvr stats. Default = inputcols
logffstats statsfile [all|inputcols|outputcols]                  - log FF stats. Default = inputcols
moveuser TrID userHandle NewIP NewSDirNumber NewTDirNumber - migrate a user
deleteuser TrID userHandle                                       - delete a user
disableprotocol MSNP5|MSNP4|MSNP3|MSNP2  - disable a protocol
enableprotocol MSNP5|MSNP4|MSNP3|MSNP2  - enable a disabled protocol
passportstatus                                                          - display the state of the passport authentication socket pool
refcounttrace <user>                                                - get refcount on a particular user
startstats                                                                  - start statistics logging
stopstats                                                                  - stop statistics logging
flushusagecache                                                       - flush the usage cache (only applies to DP)
disablenotifications                                                   - stop processing notifications
enablenotifications                                                    - (Re)Start processing notifications

Troubleshooting Scenarios (Wlai)

DP failures

·         Just reboot; the Local Director will handle it

·         If long term outage, reconfigure the Local Director to have one less DP

·         check load, may need to replace machine

NS failures

·         Just reboot, knowing that some users are not served while the NS is down

·         If long term outage, remove NS from the system

  • If long term outage, replace the NS with another machine if possible

SB failures

·         just reboot

·         check load, may need to replace machine

Friends Mserv failures

·         reboot?

·         should be another Mserv behind local director

·         check load to make sure Mserv is not overloaded

·         may need to replace with another machine

Friends Ustore failures

·         BAD!

  • some portion of users can’t logon, or those who are logged on can’t add friends, change name, etc.
  • recover just like any other Ustore
  • data corruption will mostly be recovered over time

Unable to reach Hotmail site

·         can’t log on any more users (can’t authenticate)

  • existing logged on users are okay, except no new email notifications

Network problems

·         If inter-server communication goes down, basically no one can talk to users on a different server

System reaching capacity

Friends U-store Disk Full

Assisting with debug

In case of other emergency…

Issues Escalated by Support (wlai)

“I can’t logon”

·         ingress/egress problem

·         Check DP and Local Director to see if they are functioning

  • NS problem (get the userID and run it thru whichns to determine which machine they are assigned to)
  • Can’t access HM Auth Mserv/Ustore
  • Can’t access Friends M/U

I can’t see my friend when he comes online (or vice versa)

·         ingress/egress problem

·         interserver networking

·         interserver server links may be down or stuck

  • check NS health (use whichns to find the two NS involved)
  • The two corresponding lists may not be correctly marked (step them thru the sequence that will correct this)

I can’t IM my friend

·         ingress/egress

·         interserver server links may be down or stuck

  • interserver networking
  • check NS health (use whichns to find the two NS involved)
  • Check SB health

I can’t add a friend

·         Can’t access HM Auth Mserv (not Ustore)

  • check NS health (use whichns to find the two NS involved)
  • Can’t access Friends M/U
  • interserver networking

·         interserver server links may be down or stuck

I can’t search or send friends&family mail

·         check membership directory server

·         check connectivity to membership directory

  • check NS (use whichns to see which NS is involved)
  • check Qmail status on the NS as well
  • check for config problems to membership directory

No email notifications at login

·         Hotmail site connectivity

  • Postman/XFS on the user’s Auth Ustore having problems
  • check for config problems to Postman

No email notifications after login

·         Hotmail site connectivity

  • Postman/XFS on the user’s Auth Ustore having problems
  • check for config problems to Postman

·         could be just lost UDP packets


HOW DO I .......

Disable a login domain?

connect to the messenger cloud?

transfer files to and from the messenger cloud?

setup a DP or NS?

setup an SB?

Add/Remove an NS?

Add/Remove a DP?

Add/Remove an SB?

Turn on/off access to a Friends Ustore?

Turn on/off access to an Authentication Ustore?

Update the configuration files?