LabVIEW


What can I do to make a database application more resilient?

Solved!

Currently, my DB application has a little construct that looks like this:

 

 

The idea is that any stored procedure I run gets a chance to be retried on failure. The problem is that when they fail, they fail hard: they outlast the timeout, send a failure message to the webserver I check on, and bail out.

 

The application itself runs on several different computers on my home network, several instances per computer, all committing to a DB server hosted on a laptop. I suspect the laptop's HDD can't handle the constant I/O, so inserts/selects queue up and eventually die. All of which bodes poorly for me, as I would like to have about 15 instances of this application running at all times.

My initial thought is to get rid of this retry structure and, on error, close and reopen the connection to the DB, repeating this until success or 5 minutes, whichever comes first. I'm a complete novice to DB setups, so this may well be a bad idea. I'm just looking for something that doesn't crap itself at the slightest hitch.
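In non-LabVIEW terms, the loop I'm picturing is something like this (a rough Python/pyodbc sketch; the connection string, timings, and function name are made-up placeholders, not my actual application):

# Rough sketch of "close, reopen, and retry until success or 5 minutes" using pyodbc.
# Connection string, timings, and names are placeholders, not from the real app.
import time
import pyodbc

CONN_STR = ("DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=laptop;DATABASE=jobs;Trusted_Connection=yes")
RETRY_WINDOW_S = 5 * 60    # give up after 5 minutes
RETRY_PAUSE_S = 2          # small pause so a struggling server isn't hammered

def call_with_reconnect(sql, params=None):
    """Run one statement; on any ODBC error, drop the connection and rebuild it."""
    deadline = time.monotonic() + RETRY_WINDOW_S
    last_err = None
    while time.monotonic() < deadline:
        conn = None
        try:
            conn = pyodbc.connect(CONN_STR, timeout=10)
            cursor = conn.cursor()
            if params:
                cursor.execute(sql, params)
            else:
                cursor.execute(sql)
            conn.commit()
            return
        except pyodbc.Error as err:
            last_err = err
            time.sleep(RETRY_PAUSE_S)   # back off, then rebuild the connection from scratch
        finally:
            if conn is not None:
                conn.close()
    raise RuntimeError(f"gave up after {RETRY_WINDOW_S} s, last error: {last_err}")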

 

Any ideas? I'm using SQL Server 2014 Express.

0 Kudos
Message 1 of 6
(3,307 Views)

I've narrowed it down to this:

 

Invoke Node Error calling method System.Data.Odbc.OdbcCommand.ExecuteReader, (System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
        Inner Exception: System.Data.Odbc.OdbcException: ERROR [40001] [Microsoft][ODBC SQL Server Driver][SQL Server]Transaction (Process ID 69) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.) <append><b>System.Data.Odbc.OdbcException</b> in Batch Grab ExperimentID.vi->pull experiment ID.vi->pull next job.vi->2015_Controller.vi

There's a deadlock going on occasionally, and SQL Server kills whichever transaction is unfortunate enough to be deemed unworthy. This is handled by refreshing the connection. If anyone knows the actual LabVIEW error code for this, please let me know; I'd like to add it to my error handler so I can just rerun the transaction rather than refresh the connection. It's pretty rare, and like I said, it does get handled in a sense, so it's not top priority. The error stems from a System.Data.Odbc.OdbcCommand.ExecuteReader invoke node.

0 Kudos
Message 2 of 6
(3,273 Views)
Solution
Accepted by topic author ijustlovemath

Further research shows that all .NET errors come back as LabVIEW error code 1172, so I handled this one specifically by searching the error message for "deadlock" instead. Hope this helps somebody in the future!
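For anyone hitting this outside LabVIEW, the same check looks roughly like this in Python/pyodbc: test for SQLSTATE 40001 or "deadlock" in the message (which is what matching on the 1172 error text amounts to) and rerun the transaction. Function names and the retry count are just placeholders:

# Rough equivalent of the fix above: only deadlock victims get the transaction rerun;
# anything else is escalated to the normal (reconnect) error handling.
import time
import pyodbc

def is_deadlock(err):
    """SQLSTATE 40001 (or 'deadlock' in the text) means we were picked as the deadlock victim."""
    sqlstate = err.args[0] if err.args else ""
    return sqlstate == "40001" or "deadlock" in str(err).lower()

def run_transaction(conn, sql, params, max_attempts=5):
    """Execute one transaction, rerunning it when SQL Server kills it as the deadlock victim."""
    for attempt in range(1, max_attempts + 1):
        try:
            cursor = conn.cursor()
            cursor.execute(sql, params)
            conn.commit()
            return
        except pyodbc.Error as err:
            conn.rollback()              # the victim's work is already undone server-side
            if is_deadlock(err) and attempt < max_attempts:
                time.sleep(0.05)         # short pause, then rerun the transaction
                continue
            raise                        # anything else (or too many attempts): escalate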

0 Kudos
Message 3 of 6
(3,257 Views)

I am using straight SQL calls instead of stored procedures, and I haven't seen this type of problem on MS SQL databases. I am working through error handling and checking myself, and find it a big pain to handle seamlessly when dealing with DBs, especially since I have to automate things so much to keep everything as hands-off as possible. I hope to get back to this project soon, but I found there is a long delay before the error shows up, even when I disconnect a LAN cable, and I'm using the Database Connectivity Toolkit. I suspect this is a delay in the underlying OS code, but I haven't had time to look into it too far yet.

My approach is to set up a separately looped state machine for reading/writing, supplying or extracting data through a functional global variable. When an error does happen, I test for connectivity with pings, decide what else to do (email, log, etc.), then open a new connection and wait for it to connect. I'm thinking there is quite a bit more that can be done to query status from the MS SQL server, but I haven't had time to start looking at that because I'm ramping up on a new power-monitoring project to more closely calculate production costs in the factory.

I hope to jump back to my DB error/connection handling code in this project after working out the USB->RS485->Modbus RTU issues of running without the DSC module, since all of my power data will again go into an MS SQL database for data mining. Parts have been arriving all week, but the critical parts didn't come in yet, so I've been doing trivial cleanup on some outstanding items that have been bugging me for a while. I'll watch this thread with great interest.
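In rough, non-LabVIEW terms, that recovery step looks something like the following (Python/pyodbc sketch; a TCP probe on the SQL Server port stands in for the ICMP ping, and the host and connection string are placeholders):

# Sketch of the "test connectivity, then reopen and wait" recovery step described above.
# A TCP probe on port 1433 stands in for a real ping; host/connection string are placeholders.
import socket
import time
import pyodbc

def server_reachable(host, port=1433, timeout=2.0):
    """Cheap reachability test: can we open a TCP socket to the SQL Server port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def reopen_connection(conn_str, host, wait_s=5.0):
    """Keep trying to rebuild the connection; report when the host itself is unreachable."""
    while True:
        if not server_reachable(host):
            print("server unreachable - looks like a network problem")   # email/log here
            time.sleep(wait_s)
            continue
        try:
            return pyodbc.connect(conn_str, timeout=10)
        except pyodbc.Error as err:
            print(f"server reachable but connect failed: {err}")
            time.sleep(wait_s)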

Glad to answer questions. Thanks for any KUDOS or marked solutions 😉
Message 4 of 6
(3,248 Views)

Glad to see I'm not alone in this. It seems like this is limited to when you have lots of same-user logins; I have 13 instances of the same user currently logged in (can't afford a full license on a student budget), and they do something akin to:

 

  1. pull a job from the job table, depending on the job status
  2. process the job ID into usable parameters for the application (e.g. initial values, spline interpolants, etc.)
  3. log results continuously, usually ~4k rows per job, using inserts
  4. close up the job, go back to 1.
  5. exit when there are no more jobs

I know for sure that the deadlocks only occur at step 1. Each job usually takes ~40 secs to run, and with 13 going all the time, sometimes you get a collision of two application instances gunning for the same job, resulting in a deadlock. SQL Server handles this automatically by picking one transaction to terminate. You can also set deadlock priority per transaction, but that'd be pretty arbitrary in my case, as the transactions that deadlock are all the same.
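For what it's worth, a common SQL Server pattern for this kind of job queue is to claim a row atomically with READPAST/UPDLOCK locking hints, so each instance skips rows another instance has already locked instead of deadlocking on them. My code doesn't do this yet; here's just a rough Python/pyodbc sketch with made-up table and column names (jobs, job_id, status):

# One way to reduce collisions at step 1: claim a pending job atomically so each
# instance skips rows that another instance has already locked.
# Table and column names (jobs, job_id, status) are hypothetical.
import pyodbc

CLAIM_SQL = """
UPDATE TOP (1) jobs WITH (UPDLOCK, READPAST, ROWLOCK)
SET status = 'running'
OUTPUT inserted.job_id
WHERE status = 'pending';
"""

def claim_next_job(conn):
    """Return a claimed job_id, or None when no pending jobs are left."""
    cursor = conn.cursor()
    cursor.execute(CLAIM_SQL)
    row = cursor.fetchone()    # UPDATE TOP picks an arbitrary pending row
    conn.commit()
    return row.job_id if row else None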

 

Reading that article, I think splitting applications into separate users would avoid this problem entirely. It doesn't really matter that much though, as you can just wait about 50ms and run the transaction again.

 

Thanks for the offer, maybe in a few years when I work a real job 😄

0 Kudos
Message 5 of 6
(3,236 Views)

This is all pretty interesting.

My typical problems happen when I'm not running on the industrial network where I have total control.

 

 

I have my MS SQL server bridged across two networks to give me access from either side, while keeping total separation from the office network for anything that could hurt people or damage equipment.

 

On the office network, when a connection problem happens it typically shows up at the same time as some kind of alarm event in the Windows log: lost DNS, lousy backup software, or someone in the IT department trying to "fix" something on the network and cascading it into problems that knocked other systems offline. But that is life when trying to do the almost impossible on a patchwork office network where IT is understaffed and turning over too frequently.

 

I would like to someday get a master controller program going that also monitors the health of my equipment on the office network, not just the industrial network side, and notifies the IT department so they can get it repaired, as well as logging the issues to build ammunition to push for more changes. Without reporting, it is difficult to justify cost expenditures in a corporate environment...

Glad to answer questions. Thanks for any KUDOS or marked solutions 😉
0 Kudos
Message 6 of 6
(3,203 Views)