Are there best practices or recommendations for alerts in SQL Server?
I don't have MOM as an option, so I'm looking for general
recommendations on which severity levels and error messages to monitor as a baseline.
I could spend a lot of time going through the error code list to
pick these out, but is there a shorter list somewhere of the levels and
errors that catch the most critical problems, which I could then build on?
The following list should cover a significant percentage of what you want to
monitor:
1. Severe SQL Server errors (with severity level >= 17)
2. Access violation errors
3. SQL Server instance accessibility (can you connect to the SQL instance?)
4. Database accessibility (can you query each database? You may want to
exclude some databases or databases in offline or loading states.)
5. SQL Server restarts
6. TCP port bind failures
7. Number of SQL connections exceeding a threshold (you should be careful
with false positives)
8. Backup failure
9. Ownership change of the SQL cluster resource group (a failover or
somebody moved the group)
10. The SQL group is not on the preferred node
11. A cluster node is not up
12. You may also want to check whether critical support services are online
(if you use them, e.g. SQLAgent)
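
For item 1, one possible starting point is a SQL Server Agent alert per severity level from 17 through 25. The sketch below is only an illustration: it uses Python to print the msdb.dbo.sp_add_alert calls rather than running them, and the alert names and the 60-second delay are assumptions, not recommendations from this thread. Attaching operators to the alerts is a separate step that is not shown.

```python
# Sketch: emit one msdb.dbo.sp_add_alert call per severity level 17-25.
# The alert names and default delay are hypothetical; adjust to local conventions.

def severity_alert_sql(severity: int, delay_seconds: int = 60) -> str:
    """Return a T-SQL statement defining an Agent alert for one severity level."""
    if not 17 <= severity <= 25:
        raise ValueError("expected a severity between 17 and 25")
    name = f"Severity {severity} error"  # hypothetical naming scheme
    return (
        "EXEC msdb.dbo.sp_add_alert "
        f"@name = N'{name}', "
        "@message_id = 0, "  # 0 because the alert is keyed on severity, not a message number
        f"@severity = {severity}, "
        "@enabled = 1, "
        f"@delay_between_responses = {delay_seconds}, "
        "@include_event_description_in = 1;"  # 1 = include the error text in e-mail
    )


def all_severity_alerts() -> list[str]:
    """Build the statements for every severity level from 17 to 25 inclusive."""
    return [severity_alert_sql(s) for s in range(17, 26)]


if __name__ == "__main__":
    print("\n".join(all_severity_alerts()))
```

Review the generated statements before running them against msdb; the delay between responses is there to keep a burst of identical errors from flooding whoever is notified.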
Again, this list is not exhaustive, but covers most of the system exceptions
that should be monitored regardless of what monitoring tool you actually use.
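
On the false positives mentioned in item 7: one common way to reduce them is to raise the alert only after the connection count has stayed above the threshold for several consecutive samples. The sketch below assumes you already sample the count somewhere; the class name, the threshold of 500, and the three-sample window are all hypothetical.

```python
from collections import deque


class ConnectionThresholdAlert:
    """Fire only when the count exceeds the threshold for N consecutive samples."""

    def __init__(self, threshold: int, consecutive: int = 3):
        self.threshold = threshold
        # Keep only the most recent `consecutive` over/under results.
        self.recent = deque(maxlen=consecutive)

    def observe(self, connection_count: int) -> bool:
        """Record one sample; return True if the alert should fire now."""
        self.recent.append(connection_count > self.threshold)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(self.recent)


if __name__ == "__main__":
    monitor = ConnectionThresholdAlert(threshold=500, consecutive=3)
    for count in (480, 510, 520, 530, 490):
        print(count, monitor.observe(count))
```

With these sample values the alert fires only on the fourth reading (530), after three readings in a row above 500; a single spike does not trigger it.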
Linchi
"SteveM" wrote:
> Is there a best practices or recommendations for alerts in SQL Server?
> I don't have MOM as an option, so looking at generally what are
> recommendations for levels, error messages to monitor for basics?
> Seems I could spend a lot of time going through the error code list to
> catch these, but is there a shortened list somewhere that will give me
> levels, errors to catch most critical errors that I could build on?
>