You want to investigate the root cause of this latency spike. You know that there has been no recent configuration or schema change to the database. The only thing you can think of is the new deployment a few hours ago, which might have introduced a new access pattern. You suspect there might be something wrong with the transactions that were rolled out with that deployment. But how can you find the smoking gun to prove it? How can you find out whether lock conflicts are causing the latency spike? And if lock conflicts are indeed the culprit, which table cells in the database are the transactions trying to lock?
Cloud Spanner and Locking
The good news is that Cloud Spanner’s introspection tools and, in particular, the recently added Lock Statistics feature, can help you answer these questions.
Cloud Spanner, Google’s fully managed, horizontally scalable relational database service, offers the strictest concurrency-control guarantees, so that you can focus on the logic of the transaction without worrying about data integrity. To give you this peace of mind, and to ensure consistency of multiple concurrent transactions, we use a combination of shared locks and exclusive locks at the table cell level. And, as you know, with locks comes the possibility of lock conflicts. When multiple transactions try to take a lock on the same cell, lock conflicts can occur, which can lead to a performance hit on your database.
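To make the idea of a cell-level lock conflict concrete, here is a minimal Python sketch. It is a toy model and not Cloud Spanner’s actual lock manager: it assumes only two lock modes, shared and exclusive, and the class name, the transaction names, and the Singers table are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def conflicts(held_mode, requested_mode):
    # Toy rule: two locks on the same cell can coexist only if both are shared.
    return held_mode == "exclusive" or requested_mode == "exclusive"


class CellLockTable:
    """Tracks which transactions hold which lock mode on each table cell."""

    def __init__(self):
        # cell -> list of (transaction, mode) pairs currently holding a lock
        self._held = defaultdict(list)

    def try_lock(self, txn, cell, mode):
        # Refuse the request if another transaction holds a conflicting lock.
        for holder, held_mode in self._held[cell]:
            if holder != txn and conflicts(held_mode, mode):
                return False
        self._held[cell].append((txn, mode))
        return True


locks = CellLockTable()
first_name = ("Singers", "key=1", "FirstName")  # hypothetical table cell
last_name = ("Singers", "key=1", "LastName")    # a different cell in the same row

r1 = locks.try_lock("txn_a", first_name, "shared")     # granted
r2 = locks.try_lock("txn_b", first_name, "shared")     # granted, shared locks coexist
r3 = locks.try_lock("txn_c", first_name, "exclusive")  # refused, conflicts with the readers
r4 = locks.try_lock("txn_c", last_name, "exclusive")   # granted, different cell
```

In a real database the refused transaction would wait (or be aborted) rather than simply get False back, and that waiting time is what shows up as lock wait time.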
What is Lock Statistics
Lock Statistics is another introspection tool we recently added to the collection of features that help you analyze and solve issues in your Cloud Spanner database. As its name suggests, Lock Statistics exposes information about locks, lock wait times, which table cells are involved in locks, and so on. As with the other introspection tools, this information is exposed through a set of built-in statistics tables.
Aggregated Lock Statistics Table. The aggregated table contains the total lock wait time for every 1-minute, 10-minute, and 60-minute interval. The total lock wait time can be used to monitor overall application health and to correlate with latency spikes.
Top Lock Statistics Table. The top statistics table contains sampled cells that incur the longest lock wait time, also for every 1-minute, 10-minute, and 60-minute interval. It helps you pin down the specific data cells that incur the longest lock wait and identify the transactions that contend for locks.
Each row in the top lock statistics table represents the lock conflict that caused the longest wait time in the given time interval. It shows the start key of the locked key range, the time the transactions spent waiting on that conflict, and sample columns that had conflicts, along with the lock mode.
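The difference between the two tables can be sketched in a few lines of Python. This is an illustration only, not how Cloud Spanner computes or samples its statistics: the lock-wait records, the key format, and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical lock-wait samples: (interval_end, row_range_start_key, lock_wait_seconds)
samples = [
    ("2020-11-12 23:50:00", "Singers(32)", 2.5),
    ("2020-11-12 23:50:00", "Singers(32)", 1.5),
    ("2020-11-12 23:50:00", "Albums(7)", 0.5),
    ("2020-11-13 00:00:00", "Albums(7)", 0.25),
]


def total_lock_wait(samples):
    # "Aggregated" view: one total lock wait time per interval.
    totals = defaultdict(float)
    for interval_end, _key, wait in samples:
        totals[interval_end] += wait
    return dict(totals)


def top_lock_wait(samples, n=1):
    # "Top" view: the n start keys with the longest lock wait in each interval.
    per_key = defaultdict(float)
    for interval_end, key, wait in samples:
        per_key[(interval_end, key)] += wait
    by_interval = defaultdict(list)
    for (interval_end, key), wait in per_key.items():
        by_interval[interval_end].append((key, wait))
    return {
        interval_end: sorted(rows, key=lambda row: row[1], reverse=True)[:n]
        for interval_end, rows in by_interval.items()
    }


totals = total_lock_wait(samples)
top = top_lock_wait(samples)
```

The aggregated view answers "how much lock waiting happened in this interval?", while the top view answers "where did most of it happen?".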
You can find more details in the Introspection Tools section of the documentation, which covers query, transaction, read, and lock statistics, as well as information on how to discover your oldest active queries.
Let’s now look at a pattern that combines Lock Statistics with other statistics in order to solve the problem we discussed in the example scenario.
How to use Lock Statistics
So, how can Lock Statistics help us in our quest to understand, and hopefully mitigate, the problem that led to the latency issues we discovered in our example scenario? As you’ll see, the real power of the introspection tools collection comes out when we combine information from these tools over the time period during which our problems started to occur.
Are lock conflicts the root cause of the latency spikes?
To answer this question, we can examine average latency numbers and lock wait times during the time period when our problems first occurred.
Transaction Statistics provides commit latency information, and Lock Statistics provides lock wait time information. If lock conflicts were the reason for the increase in commit latency, then you should be able to see a correlation when we join, for example, transaction statistics (TXN_STATS_TOTAL_10MINUTE) with lock statistics (LOCK_STATS_TOTAL_10MINUTE). The following sample query tells us the average commit latency and total lock wait time for every 10-minute interval.