Monday, 5 November 2012

Troubleshooting Database Mail - Part II

I started out with a blog about how to set database mail up correctly.  How about when it goes wrong, or fails to work?

I finished Part I with how to send a test message.  Right click on Database Mail and select Send a Test Email.


There's an alternative if you want - use the sp_send_dbmail stored procedure.   There's no great benefit in using this at the moment, but just for completeness:



EXEC msdb..sp_send_dbmail
       @profile_name='ServerName',
       @recipients='Jack.Whittaker@companyname.com',
       @subject='Test message',
       @body='This is the body of the test message.  Mail is now working.'


Whichever way you do it, an email should arrive in your inbox after a few moments.  Press Send and Receive if nothing happens for a while.  Okay, twiddle your thumbs and go and make a cup of tea.

Still nothing?  You may have a problem.



Check the Logs


First things first - view the database mail logs.  Right click on Database Mail, and select View Database Mail Log.  You are looking for clues here.  See any?  Nope, OK, lets look at some more logs.  

-- list log events
SELECT * FROM msdb.dbo.sysmail_log
GO

-- list error messages
SELECT * FROM msdb.dbo.sysmail_event_log
WHERE mailitem_id >= 1;

Anything?  Hmm - OK, let's look at the mail.  There are several tables which contain mail items.  Just think about the difference between Failed items and Unsent items for a minute.  
--To list all items:
USE msdb
SELECT sent_status, *
FROM sysmail_allitems
GO


--To list the unsent items only:
USE msdb
SELECT sent_status, *
FROM sysmail_unsentitems
GO


--To list the failed items only:
USE msdb
SELECT sent_status, *
FROM sysmail_faileditems
GO

Is your test mail failing?  Then there is probably some sort of error message somewhere.  

Is it unsent?  This may mean that something is wrongly configured. 


Check that you haven't set up two accounts and are using the wrong one:

-- list all accounts

EXEC msdb.dbo.sysmail_help_account_sp;



Can you pass the Buck?

Before looking at the multitude of things it might be, there may be a legitimate reason for your server not being able to send email, even if it is correctly configured.

  • Check the exclusion list of your anti-virus software - if AV is active on your server, make sure that databasemail.exe is allowed to send messages.
  • Is the server behind a firewall?  If it is, you can send mail until you are blue in the face, they won't get through.  

Next thing to check: can you TELNET to your smtp server on port 25 to make sure nothing is blocked?
Open a DOS Prompt on your server and type telnet mail.domain.ext 25 
It should return something like: 220 smtp esmtp Relay 
Which means that Telnet is working - 220 is the greeting code.

In my case I was able to get a colleague from the network services team to send an email from that box using his email diagnostics tool - that worked fine, so narrowing it down to a SQL Server issue,




Further Reading

Here's a reference to Microsoft's article Troubleshooting Database Mail: General steps - the version for 2008r2 - Database Mail doesn't look much different in the other versions, but if you go to that link it'll take you to the other versions.



Things to Check

Make sure that the login SA is the owner of all the system databases, and especially msdb.
It's probably a good idea if a named individual is not the owner of the user databases - things might go awry when that person leaves.  But for the moment, we are just checking the system DBs.



select name, suser_sname(owner_sid) as Owner 
from sys.databases

Or as so often in SQL Server, there's another way to do it:

-- check database owners
EXEC sp_helpdb
     
  
And if you find any stray databases, bring them back into the fold         
-- change db owner if needed          
USE eWorkDev
GO
EXEC sp_changedbowner [SA]
GO


Are you a member of the sysadmin fixed server role or msdb db_owner role?  This is a fairly safe bet if you are a DBA.  
Otherwise, you should be a member of the DatabaseMailUserRole.



Is Database Mail enabled?

sp_CONFIGURE 'show advanced', 1
GO
RECONFIGURE
GO
sp_CONFIGURE;
GO
RECONFIGURE
GO
sp_configure 'show advanced options', 0
GO
RECONFIGURE
GO
In the results pane, the run_value for Database Mail XPs should be set to 1.


Check DB MAIL Status - Sysmail should be STARTED
EXECUTE msdb.dbo.sysmail_help_status_sp


You can try stopping and then restarting 
EXECUTE dbo.sysmail_stop_sp

EXECUTE dbo.sysmail_start_sp



Check that the Service broker is enabled
SELECT is_broker_enabled FROM sys.databases WHERE name = 'msdb';

sysmail_help_queue_sp     This stored procedure looks at the status of the queue, but unfortunately it isn't as helpful as you might think.  It shows the queue as INACTIVE most of  the time, unless it is actually sending or receiving mail right at this moment.  

EXEC msdb.dbo.sysmail_help_queue_sp @queue_type = 'Mail' ;

The other possible messages are NOTIFIED (queue has been notified receipt to occur), and RECEIVES_OCCURRING (queue is receiving).


Getting Desperate?

You can always try the old standby - switch it off and switch it on again...

Go to the services on the server - run down to MSSQLSERVERAGENT and stop it,  Then start it again,  
Getting more desperate - locate the MSSQLSERVER service - stop it, start it.  
(Beware of Restart - it occasionally doesn't do anything if the service has crashed).  


Tidying Up

If by now your email has filled up with Unsent or Failed messages, you might like to clear them out with this handy stored procedure:

--clear out unsent messages
EXEC sysmail_delete_mailitems_sp @sent_status = 'unsent'

A good reason for doing this is that eventually you are going to fix this issue, and when you do, a hundred and one test messages are going to suddenly appear in your inbox.


Check the Database Mail program

Verify that the Database Mail External Program is located in the correct directory, 
In my case it's in F:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn



Check the Service Account

Verify that the service account for SQL Server has permission to run the executable DatabaseMail.exe.  Was there a change in the Service account that SQL Server runs under? 

If someone changed the account using the Services MMC rather than SQL Server Configuration Manager, then it may not have the rights required - these are usually set by the SSCM.  Using SSCM, reset the service account to something like NetworkService and restart SQL Server and SQL Agent. Try sending a test mail now.  Then using SSCM, set it back. This should confirm that the service accounts have the necessary rights. 

In my case, it was the Service Account which was the problem.  If you want, you can read an account of the hunt here.  Apart from thanking all involved again, I would like to make the point that you can search Google all you want - it's not as good as setting the problem-solving skills of a team of DBAs to work!

Sunday, 4 November 2012

Troubleshooting Database Mail - Part I

I've been wrestling with Database Mail lately - working through a list of servers and instances, making sure that they all send emails in the event of a problem.  Most did, some didn't.  Some had never been set up; some had been set up to use an old mail server which was about to be decommissioned; and one was apparently set up perfectly - except that it didn't actually work  <sigh>.


My purpose in setting up database mail on a server is to send warning messages to myself and my team in case anything goes wrong.  If a backup job fails, for example, I want to know about it.  Some people set up a corresponding message for when a job succeeds - this is up to you, but my attitude is that things normally work, and I don't plan to worry unless they don't.  I usually run a job under supervision the first time, so I can see whether it runs or not.  


Let's start with setting Database Mail up from scratch.  This came in with SQL Server 2005, replacing the cleft stick and smoke signals used by earlier versions.  (What?  you still use SQL Server 2000?  I don't have personal experience, but this may help).  There's a perfectly good wizard for doing this, and in general I suggest that you follow the wizard.

Starting in SSMS then, expand Management by clicking on the + symbol.
Right click on Database Mail and select Configure Database Mail - up pops the wizard.


The Welcome screen is optional - if you don't want it, click on the Skip this page in the future box.  Press Next.


The default is best - Set up Database Mail by performing the following tasks...  The other options are really there for making changes and additions subsequently, although you probably could set things up from scratch this way.  But you are using the wizard because you want to do things the easy way, aren't you?  I certainly am.

Assuming this is your first time setting up Database mail, up pops this box:


Well, fairly obviously, Yes I do!  If not, you might as well give up now...

Here's the new profile dialog box:

Profile name - I usually use the server name - it's as good as any other.  And ditto for the description.  But if you are going to have several mail profiles per server, you'll need to give this a bit more thought.  

Then click on Add to add a new SMTP (outgoing mail) account,  


Again, use the server name for the account name and description, and also for the Display name.  The email address and company name need to be valid addresses, but you don't really want any emails being sent to them, hence noreply@...  My purpose in setting up database mail on a server is to send warning messages to myself and my colleagues in case anything goes wrong.  I don't then send a message back to the server - this only encourages the server gremlins.  

The bit that really matters here is the name of the outgoing mail server (SMTP).  Your network team will be able to tell you the name of the mail server, the port it uses (default is 25) and whether it requires a secure connection.  Notice the name: mail.companyname.com; not mail@... - it's a server name, not an email address.  

Click on OK and it takes you back to the New Profile screen, this time with the Account name added.  Not very exciting, but for completeness, here's what it looks like:


Now, click on Next.  This is the Manage Profile Security page:


Make the Profile public by ticking in the box.
Make it the default profile by selecting Yes from the pull down menu.  The pull-down doesn't show until you click in the box, but it's there, I promise.  
You can set up private profiles that can only be accessed by specific users.  But we don't need that here.

Click on Next and you get a chance to configure the system parameters.  Most of the time, the defaults are fine, so don't worry about it.



Click on Next and it gives you a chance to review the choices you have made:


Click on Finish and it does what you told it to do.  Look, you only have yourself to blame...  


Close, and test.  

The easiest way to test is by right clicking on Database Mail and selecting Send a Test Email.


It should have the profile name you entered before, although if you need several profiles you will need to use the pull down menu.  Put your own email address in the To: box, the subject and Body are fine (although you can change them if you want) and press Send.  Sit and twiddle your thumbs for a few moments and a test email should appear in your inbox.  

What if it doesn't?  What you want is Troubleshooting Database Mail - Part II
Stay tuned to this channel, database fans...




Tuesday, 16 October 2012

Why is Replication not Working?

First of all, what is Replication about?  You take a copy of a database - all of it, part of it, whatever you need.

  • Maybe you want to copy data to a redundant server so that it's ready to take over if the original one fails.  Today you might use mirroring or AlwaysOn, but before these technologies came along you might use Replication
  • Maybe the master database contains a price list - you can send an up to date price list to every branch every morning at 0700 ready for the working day.  


Here's a one page intro that I like.  


So why is it not working?  No doubt there are hundreds of reasons why replication isn't working.  Here's one more to add to the list.  I set up what I thought was a straightforward Publication, using the wizard that Microsoft kindly provide.



I won't go into detail - just work through the wizard, tell it what table you want to publish, etc, etc and schedule it to run at an appropriate time - every hour, let's say - that's the default, fine by me.


So the wizard creates a publication - it takes data from a database and makes it available for other databases, probably on other servers, to read.  Sorry, to subscribe to.  




OK, so much for Publication, now for the Subscription.  Again, follow the wizard:



Start by telling it the name of the Publication you just created


 Connect to the subscription server i,e, the destination:



Schedule the job, and set it off.  Easy peasy.  Actually, there's a bit more work than I have shown here, but essentially you follow the wizard and it sets up two SQL Agent jobs - one to publish the data from Location A, and one to subscribe to that data and copy it to Location B.

The problem that cropped up was that I scheduled the jobs using the wizard to run every hour - as it assumes by default.  Then I changed one item of data in the original, and sat back to watch it replicate to the destination.

It didn't.

There's a Replication Monitor (just open up Replication and right click on the Publication or Subscription) and it said it ran fine - no errors, nothing.  But the changed data did not appear in the destination.

It took me a little while to work it out, and then a bit longer to stop kicking myself. The wizard assumes that the jobs will run every hour on the hour.  So by default the Publication and the Subscription both start at the same time i.e. the subscription starts before the publication has finished.

Therefore a change to the original Article gets published okay, but it won't get across to the Subscriber for an hour.

Solution - edit the schedule of the subscriber job to run 2 minutes after the hour. (Publication might only take a few seconds to run, depending of course on how much you are publishing)

Thus:

18:17 Change is made to the original data

19:00 Article is published

19:02 Subscriber receives article

Had I been a bit more patient, it would have worked out fine, an hour later.  But I wanted to go home!


Thursday, 4 October 2012

Book Review - Brad's Guide to Maintenance Plans

Or to give it its full title:
Brad's Sure Guide to SQL Server Maintenance Plans (Dba Handbooks)
by Brad M McGehee

An exhaustive haul through the various options that Microsoft provide for maintenance.  Brad spends a lot of time talking about the Wizard, much of that time warning against it!  He is wary of the wizard, in part because it could lead the unwary to ticking every box and thus doing unnecessary and even damaging stuff. Ticking Rebuild and Reorganise, for example - it just makes your plan take twice as long. If you Rebuild, then do an unnecessary Update Statistics, the update can screw up the statistics that the rebuild created and hit performance as well as wasting time.

I've never used the wizard myself - the SSIS-style drag and drop interface in the plan designer is so easy to use that I don't see the need. Brad also recommends using T-SQL or Powershell scripts for more complex maintenance. In fact he recommends this on average once per chapter, so it tends to get a little tedious.

To my mind, this leaves a lot of middle ground where the simplicity of the graphic maintenance plan designer does the job neatly and effectively. Want to take a backup? Just pick the backup task, spend a few moments telling it what to backup and where to, and then schedule it to run at a convenient time. Job done! Why write the code when the plan can do it for you?

Here's why - you can rebuild indexes, or reorganise indexes - but best practice is probably to read the fragmentation level of each one and then decide whether to rebuild, reorganise or leave alone. Especially on large databases, especially if you have a narrow maintenance window, you don't want to do an unnecessary rebuild. For this your best bet is a custom script.

But for most stuff the maintenance plan is fine. Brad reminds me of all the things I should be doing - like the history clean up task. Have I scheduled this? Probably not everywhere - note to self - go and check. And he explains very well the point of multiple sub-plans which I hadn't grasped before.

Saturday, 22 September 2012

UK Phone Numbers

I've been wrestling with a list of predominantly UK phone numbers for the last couple of days.  The data has come from a list of all the calls made to and from an organisation in the past year.  In particular, I want to identify the dialling code so that I can identify what geographical areas people are calling from.

First port of call is usually Google - I certainly don't want to write some code if I can just copy something that someone else has already done.  But sadly every entry I could find was about US phone numbers, which are easy peasy - area dialling codes are 3 digit across North America.  How rational!  Here in the UK dialling codes may be three, four, five or six digits long.  So if you have North American phone numbers to sort out, most of this article will be irrelevant.

Let's start with a bit of research - here and here. "In the United Kingdom, area codes are two, three, four, or, rarely, five digits long (after the initial zero). Regions with shorter area codes, typically large cities, permit the allocation of more telephone numbers as the local number portion has more digits. Local customer numbers are four to eight figures long. The total number of digits is ten, but in a very few areas the total may be nine digits (after the initial zero). The "area code" is also referred to as an "STD (code)" (subscriber trunk dialling) or a "dialling code" in the UK."  So, basically, it's a tangle of different standards.  OK, so I need some code which checks the first few digits and identifies which one is which. 


Data Cleansing
But before we get on to that, let's do some data cleansing.  I have a lot of what look like perfectly good phone numbers with a 92 prefix.  Like most firms, it's 9 to dial out, so I don't know where 92 comes from.  It's easy to deal with though- just a straightforward REPLACE() function.

replace(LEFT(strddi,2), '92', '')
means when the leftmost two characters of the DDI string are 92, replace them with a blank.

-- identify Direct Dialled numbers with 92 prefix - apparent error
update  [CUSTOMER].[PhoneCalls]
Set    strddi = replace(LEFT(strddi,2), '92', '')
WHERE LEFT(strddi,2) = '92'
AND         strDDIAreaCode IS NULL
AND         LEN(strddi) > 5;

There's another issue with incoming mobile phone numbers - in the UK, these start with 07, but somehow they have been recorded as starting with 7.  Replace does the job.

-- identify mobile numbers omitting 0 prefix - apparent error
-- change to UK standard 0
update  [CUSTOMER].[PhoneCalls]
Set    strcli = replace(LEFT(strCLI,1), '7', '07')
WHERE LEFT(strCLI,1) = '7'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) = 10;


Then we have the incoming calls from UK numbers, but specifying the full international dialling code 0044.  Fine if you are based somewhere else, but really not necessary if you are in the UK and your caller is in the UK.

-- identify UK numbers with an international code 0044
-- change to UK standard 0
update  [CUSTOMER].[PhoneCalls]
Set    strcli = replace(LEFT(strcli,4), '0044', '0')
WHERE LEFT(strcli,4) = '0044'
AND         strCLIAreaCode IS NULL
AND         LEN(strcli) > 5;



Having dealt with UK codes that think they are worldwide, I can now go on and identify the legitimate international codes.  I could pick out the French calls (0033), German calls (0049) and so on, but I only have a very small percentage of non-UK calls so I'm just going to treat all these Johnny Foreigners the same and lump them together as "International" calls.  If you do want to be more accurate with your international calls, click here for a list of the codes.

-- identify  international codes 00
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'International'
WHERE LEFT(strcli,2) = '00'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;



And use the same technique to identify Mobile numbers:

-- identify mobile phone numbers 07
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Mobile'
WHERE LEFT(strcli,2) = '07'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


Freefone numbers are free only to people calling from Landlines, although I understand that there are plans to make them free to people using mobile  phones too.

-- identify Freefone  phone numbers 0800 etc
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Freefone'
WHERE LEFT(strcli,4) in ('0800', '0500', '0808')
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;
 

There is a string of special rate numbers beginning with 08 - once upon a time you could identify 0845 as local rate and 0870 as national rate, but the list has proliferated and now you can't really tell how much it is going to cost.   

-- identify special rate phone numbers 0845 0870 etc
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Special rate'
WHERE LEFT(strcli,2) = '08'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


One thing you can be sure of - a call to an 09 number is going to be outrageously expensive...


-- identify premium rate phone numbers 09
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Premium rate'
WHERE LEFT(strcli,2) = '09'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


Identifying UK Dialling Codes
So - on to the main point of this article.  UK dialling codes - area codes - whatever you want to call them, allow you to identify where in the country a caller is based.  Aberdeen is 01224, York is 01904.  But very often the dialling code and the subscriber number are held in a single field e.g. 01169158424.
Taking this example, it could in theory be divided thus:
011        6915 8424
0116      915 8424
01169    158424
011691  58424

A dialling code can be anything from 3 digits to 6 digits - a subscriber number can be anything from 4 to 8 digits.

So here's how to tackle the problem.  There are only five cities with three digit dialling codes, ten with four digits, and 12 with six - the rest have five.  So thanks to a bit of research in Wikipedia (see the links I posted earlier) I was able to construct the following CASE statement:


-- UK Dialling codes may have 3, 4 , 5 or 6 digits
-- pick appropriate code
--CLI
 update  [CUSTOMER].[PhoneCalls]
set strCLIAreaCode =
      (
            CASE
            -- 3 digit dialling codes e.g. London, Belfast
            WHEN LEFT(STRcLI,3) IN 
                 ('020', '023', '024', '028', '029' )
                  THEN LEFT(strcli,3)
                 
            -- 4 digit dialling codes e.g. Bristol, Leicester
            WHEN LEFT(STRcLI,4) IN 
                 ('0118', '0117', '0116', '0115', '0114', '0113',
                  '0121', '0131', '0141', '0151', '0161', '0191')
                  THEN LEFT(strcli,4)
                 
            -- 6 digit dialling codes e.g. Langholm, Keswick
            WHEN LEFT(STRcLI,6) IN 
                 ('013873', '015242', '015394', '015395',
                  '015396', '016973', '016974', '016977',
                  '017683', '017684', '017687', '019467' )
                  THEN LEFT(strcli,6)
                                   
            -- The remaining majority of codes are 5 digit
            ELSE  LEFT(strcli,5)
      END
      )

WHERE
       strcli <> 'WITHHELD'
       AND LEN(strcli) > 5
      AND strCLIAreaCode IS null;
       


Looking at 01169158424, it's easy to see that the first four digits match the four digit option, so the area code part of this number is 0116 - which represents Leicester

Finally, a bit of tidying up:
      
-- Remove any remaining nulls in area code fields
-- nulls remaining represent internal codes or unidentifiable
-- CLI
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode =  'N/A'
WHERE strCLIAreaCode IS NULL;


Incidentally, the code shown here is half the code I wrote - I had to do the job for both inbound (CLI - Caller Line Identification) numbers and outbound (DDI - Direct Dial In).  the code is essentially the same so I haven't troubled you with it. 

I hope you find this useful!


Sunday, 2 September 2012

Book Review - Confessions of an IT Manager


You've read Dilbert?  Then you may have thought Scott Adams was joking.  He wasn’t, and Phil fleshes out the details.  From contract programmer to IT Director, Phil spills the dirt on them all,  not to mention recruitment agencies, for whom a special place is reserved in hell.  



This book was originally published as a series of articles on the Simple-Talk website. Take IT projects now, a subject close to my heart at the moment - they have a reputation for failing, especially the big ones.  Phil thinks it’s a well-deserved reputation.  Nonetheless he suggests a few things you can do to actually succeed in getting an IT project come in on time, under budget, and actually do something vaguely resembling whatever it’s supposed to do.  

Start by doing nothing until everyone agrees what is to be done.  Then any improvement, any change, has to go through a strict change control process.  (You may see already why there are so many failures). 

Don’t be a pioneer - use only boring proven technology.  If the programmers want to play with the latest TLAs (three letter acronyms) and put them on their CVs then fine, set up a sandbox to muck about, but do the real work using last year’s release. 

Code then recode.  Sling the first version and write it again properly.  It will take half the time and effort, and run twice as fast as the first tentative fumblings. (I can see his point, but I'm not sure that I would be brave enough to sling a working program and then rewrite my code.  Nor am I sure I could get it past my manager - "Yes, I'm finished, but now I'm going to rewrite it").

Allow no virtuoso programming - if you can’t understand it, it’s no good.  He mentions elsewhere the joy of documented code.  Oh, so true.  I like to put comments in my code because I haven't a clue what I've done otherwise when I look back at it three months later.  I know some people believe that their code is so clear that they don't need to bother - to them I say "Phooey" and Phil would probably agree.  

And encourage results by rewarding deadlines met. 

Fortunately it doesn’t matter if your project is abandoned or dies a painful unpleasant death.  Here I refer you to Dilbert’s The Joy of Work.  Suppose you design jet engines.  You may only ever have designed one of them, and on its first flight it crashed in flames and destroyed a remote logging town.  What matters is that you can put “Jet Engine Designer” on your CV.