Friday, February 24, 2012

Agent Deadlock Errors

I have been getting numerous deadlock errors in my Queued Reader agent and
Distribution clean up agent.
We are running transactional replication with immediate updating
subscribers-with queued updating as failover. We have one primary server
with 9 subscribers.
My Queue Agent is set to run continuously.
My Distribution clean up agent is set to run every 5 minutes.
I have also implemented a scheduled job that runs every two hours to update
the statistics in the distribution database.
Is there any way to eliminate these deadlocks?
Thanks,
David Grau
Surprise & Delight
Queued replication is not really scalable beyond 10 subscribers, you may be
hitting its upper limit. I would suggest you run the queued agents on a
schedule, every 5 minutes or so and perhaps stagger their schedules. Also
Close down Enterprise Manager as much as possible as this causes locking
with the replication agents (actually its the refresh rate which does this).
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
news:DD11621D-2E56-42C6-B1B2-7AC8B8234F5F@.microsoft.com...
>I have been getting numerous deadlock errors in my Queued Reader agent and
> Distribution clean up agent.
> We are running transactional replication with immediate updating
> subscribers-with queued updating as failover. We have one primary server
> with 9 subscribers.
> My Queue Agent is set to run continuously.
> My Distribution clean up agent is set to run every 5 minutes.
> I have also implemented a scheduled job that runs every two hours to
> update
> the statistics in the distribution database.
> Is there any way to eliminate these deadlocks?
> Thanks,
> David Grau
> --
> Surprise & Delight
|||Thanks for your answers.
Is it appropriate to regularly update the statistics on tables in the
distribution database? We have an very active environment, so I figured it
wouldn't hurt. What do you think?
By the way, I read your book on Trans. Replication from cover to cover.
Great information in there. Is your book on Merge Replication out yet?
Surprise & Delight
"Hilary Cotter" wrote:

> Queued replication is not really scalable beyond 10 subscribers, you may be
> hitting its upper limit. I would suggest you run the queued agents on a
> schedule, every 5 minutes or so and perhaps stagger their schedules. Also
> Close down Enterprise Manager as much as possible as this causes locking
> with the replication agents (actually its the refresh rate which does this).
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
> "David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
> news:DD11621D-2E56-42C6-B1B2-7AC8B8234F5F@.microsoft.com...
>
>
|||Thanks - I don't like to update statistics as it causes locking. Statistics
is necessary for updating the execution plan - but this plan should be
pretty static for replication operations.
The merge book is in the works still.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
news:FA06ADC2-0441-491A-BB8E-32B32F063DC8@.microsoft.com...[vbcol=seagreen]
> Thanks for your answers.
> Is it appropriate to regularly update the statistics on tables in the
> distribution database? We have an very active environment, so I figured
> it
> wouldn't hurt. What do you think?
> By the way, I read your book on Trans. Replication from cover to cover.
> Great information in there. Is your book on Merge Replication out yet?
> --
> Surprise & Delight
>
> "Hilary Cotter" wrote:
|||Just where in the hell are you getting THAT information? Where's your proof
that it doesn't scale beyond 10 subscribers? That would mean one of my
customers implementations is completely impossible, because they have 98
subscribers doing queued updating right now.
Mike
Mentor
Solid Quality Learning
http://www.solidqualitylearning.com
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:uxQSahuEGHA.2292@.tk2msftngp13.phx.gbl...
> Queued replication is not really scalable beyond 10 subscribers, you may
> be hitting its upper limit. I would suggest you run the queued agents on a
> schedule, every 5 minutes or so and perhaps stagger their schedules. Also
> Close down Enterprise Manager as much as possible as this causes locking
> with the replication agents (actually its the refresh rate which does
> this).
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
> "David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
> news:DD11621D-2E56-42C6-B1B2-7AC8B8234F5F@.microsoft.com...
>
|||If you use Enterprise Manager, make sure that you are NOT polling for
status. Painting that little red X in EM causes a LOT of blocking which
contributes to your deadlock problems. If you are using queued updating as
a failover, I wouldn't run it on a continuous basis. That's because all of
your transactions are going through immediate updating. If you move your
queued agent to run on a staggered schedule to the distribution clean up,
most of your deadlocks are going to disappear.
Mike
Mentor
Solid Quality Learning
http://www.solidqualitylearning.com
"David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
news:DD11621D-2E56-42C6-B1B2-7AC8B8234F5F@.microsoft.com...
>I have been getting numerous deadlock errors in my Queued Reader agent and
> Distribution clean up agent.
> We are running transactional replication with immediate updating
> subscribers-with queued updating as failover. We have one primary server
> with 9 subscribers.
> My Queue Agent is set to run continuously.
> My Distribution clean up agent is set to run every 5 minutes.
> I have also implemented a scheduled job that runs every two hours to
> update
> the statistics in the distribution database.
> Is there any way to eliminate these deadlocks?
> Thanks,
> David Grau
> --
> Surprise & Delight
|||Phil Vaughn and experience.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Michael Hotek" <mike@.solidqualitylearning.com> wrote in message
news:uWt1i2JFGHA.3984@.TK2MSFTNGP14.phx.gbl...
> Just where in the hell are you getting THAT information? Where's your
> proof that it doesn't scale beyond 10 subscribers? That would mean one of
> my customers implementations is completely impossible, because they have
> 98 subscribers doing queued updating right now.
> --
> Mike
> Mentor
> Solid Quality Learning
> http://www.solidqualitylearning.com
>
> "Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
> news:uxQSahuEGHA.2292@.tk2msftngp13.phx.gbl...
>
|||Thank you both for responding. The Agent failures have dropped in frequency,
but have not been eliminated.
Here is what I have now changed:
1. Run a scheduled job once a day, at 12:00 noon to update the statistics
in the distribution database.
2. Changed the History Cleanup Agent to run every 10 minutes, starting at
12:01.
3. Changed the Distribution Cleanup Agent to run every 5 minutes, starting
at 12:02
4. Changed the Replication Agent Checkup to run every 10 minutes, starting
at 12:03
5. Changed the Queued Reader Agent to run every 5 minutes, starting at 12:04
6. Turned off the auto-refresh of the Replication Monitor on my Enterprise
Manager and also on the server's Enterprise Manager.
I'm still getting the random deadlock failure of the Distribution Cleanup
Agent a half-dozen times a day. Even though it restarts, it is annoying to
those of us on-call folks that have to respond to the failures. Have I done
everything possible?
In our environment, the publisher and distributor are on the same server.
I'm interested in both your opinions about keeping them on the same server or
moving the distributor to a separate dedicated server. What metrics should I
be looking at to decide when it's time to separate them?
Thanks,
David Grau
Surprise & Delight
"Michael Hotek" wrote:

> If you use Enterprise Manager, make sure that you are NOT polling for
> status. Painting that little red X in EM causes a LOT of blocking which
> contributes to your deadlock problems. If you are using queued updating as
> a failover, I wouldn't run it on a continuous basis. That's because all of
> your transactions are going through immediate updating. If you move your
> queued agent to run on a staggered schedule to the distribution clean up,
> most of your deadlocks are going to disappear.
> --
> Mike
> Mentor
> Solid Quality Learning
> http://www.solidqualitylearning.com
>
> "David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
> news:DD11621D-2E56-42C6-B1B2-7AC8B8234F5F@.microsoft.com...
>
>
|||Thank you both for responding. The Agent failures have dropped in frequency,
but have not been eliminated.
Here is what I have now changed:
1. Run a scheduled job once a day, at 12:00 noon to update the statistics
in the distribution database.
2. Changed the History Cleanup Agent to run every 10 minutes, starting at
12:01.
3. Changed the Distribution Cleanup Agent to run every 5 minutes, starting
at 12:02
4. Changed the Replication Agent Checkup to run every 10 minutes, starting
at 12:03
5. Changed the Queued Reader Agent to run every 5 minutes, starting at 12:04
6. Turned off the auto-refresh of the Replication Monitor on my Enterprise
Manager and also on the server's Enterprise Manager.
I'm still getting the random deadlock failure of the Distribution Cleanup
Agent a half-dozen times a day. Even though it restarts, it is annoying to
those of us on-call folks that have to respond to the failures. Have I done
everything possible?
In our environment, the publisher and distributor are on the same server.
I'm interested in both your opinions about keeping them on the same server or
moving the distributor to a separate dedicated server. What metrics should I
be looking at to decide when it's time to separate them?
Thanks,
David Grau
Surprise & Delight
"Hilary Cotter" wrote:

> Thanks - I don't like to update statistics as it causes locking. Statistics
> is necessary for updating the execution plan - but this plan should be
> pretty static for replication operations.
> The merge book is in the works still.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
> "David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
> news:FA06ADC2-0441-491A-BB8E-32B32F063DC8@.microsoft.com...
>
>
|||I am not sure about updating the statistics in the distribution database.
sp_MSdistribution_cleanup (the proc behind the distribution clean up agent)
does this anyway. I think all this will do is cause more blocking as it is
being executed.
I would also try run profiler to determine what process the distribution
clean up agent is locking with. Note that the distribution clean up agent is
always going to be the victim in a deadlock situation. You can modify this
behavior by altering the proc and commenting out the set deadlock_priority
low statement. This might raise the visibility of the process which is doing
the locking. Sp_who2 should also reveal this.
How many commands are flowing back and forth? If it is significant I might
also bump up the min_retention from 0 to 1 hour or so and study the impact
of this.
It sounds like your monitoring software is too sensitive and the
distribution cleanup agent can be noisy. Is there any way to make the
software less sensitive to this agent failure. Note that the replication
alerts can be configured to only report every 1 day or so. However you can't
be selective about which agent you want to dampen. You could also write some
logic in the proc used by the distribution clean up agent to only report
once a day.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"David Grau" <DavidGrau@.discussions.microsoft.com> wrote in message
news:39CE2CC8-8DA4-4F5C-B37C-ACA922FE5108@.microsoft.com...[vbcol=seagreen]
> Thank you both for responding. The Agent failures have dropped in
> frequency,
> but have not been eliminated.
> Here is what I have now changed:
> 1. Run a scheduled job once a day, at 12:00 noon to update the statistics
> in the distribution database.
> 2. Changed the History Cleanup Agent to run every 10 minutes, starting at
> 12:01.
> 3. Changed the Distribution Cleanup Agent to run every 5 minutes,
> starting
> at 12:02
> 4. Changed the Replication Agent Checkup to run every 10 minutes,
> starting
> at 12:03
> 5. Changed the Queued Reader Agent to run every 5 minutes, starting at
> 12:04
> 6. Turned off the auto-refresh of the Replication Monitor on my
> Enterprise
> Manager and also on the server's Enterprise Manager.
> I'm still getting the random deadlock failure of the Distribution Cleanup
> Agent a half-dozen times a day. Even though it restarts, it is annoying
> to
> those of us on-call folks that have to respond to the failures. Have I
> done
> everything possible?
> In our environment, the publisher and distributor are on the same server.
> I'm interested in both your opinions about keeping them on the same server
> or
> moving the distributor to a separate dedicated server. What metrics
> should I
> be looking at to decide when it's time to separate them?
> Thanks,
> David Grau
> --
> Surprise & Delight
>
> "Hilary Cotter" wrote:

No comments:

Post a Comment