MySQL – Random latency on App Engine Standard connections to Cloud SQL

google-cloud-sql, MySQL, performance, PHP, slow-log

I have a fairly basic app that originally ran on a local Plesk server. We migrated it to Google Cloud: App Engine (GAE), Cloud SQL for MySQL, and Cloud Storage (GCS).

Here's some background info:

The app is PHP-based and runs great on the local server. After migrating to the cloud, we noticed random but extreme latency. It's so bad that the app times out and gives a SPDY timeout error. We use Cloudflare for SPDY, so we started there, and they said it's the server. Then we went to Google. We've been going back and forth, and I am looking for other avenues of help.

I am running the app on an F2 standard GAE instance and a g1-small Cloud SQL instance (gen 2), all in the same region/zone. There is also a failover SQL instance.

There is really no pattern to it, but users of the app hit a bad timeout very frequently, and the request dies after 60 seconds (which points to a PHP timeout, right? We checked the code and it runs fine on the local server).

I don't have a lot of traffic on this app yet (maybe a few users a day), so I doubt it's traffic load. Here are some basic stats:

https://imgur.com/a/U1tk5ak

Some Google engineers said our app has trouble scaling (QPS will never get above 1):

https://imgur.com/a/XWh44bm

They also asked if we are threading. We are not. We do not use memcache yet either.

I also see a ton of these:

https://imgur.com/a/eVSNqc3

Which looks like this bug: https://github.com/GoogleCloudPlatform/cloudsql-proxy/issues/126

But I am unsure if this is all related.

We've tried going through Google's tech support. They said we have "manual locks", but our dev team doesn't agree and doesn't know what this really means. The same app framework (session handling, etc.) is used in many apps with a ton of users on them (not on GAE; they run on compute instances on AWS), so this is our first venture into GAE.

We connect using standard MySQL connection parameters and use the same framework in a lot of applications and it runs fine. We use the required proxy to connect to CloudSQL.

The slowness and constant lag shouldn't be there, and we don't know what the issue could be. My questions are:

1) Do you see any issues here? All database logs and summaries are linked above.

2) Can you help me understand what may be wrong here?

Thank you!

Best Answer

It's hard to tell where exactly the problem is because there isn't enough information. I'm not sure how your application implements connecting, so it's hard to judge if that's the problem or not. Additionally, it's impossible to separate network latency vs query latency without more information.
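
One way to start separating the two is to time the connection step and a trivial query independently. The following is only a minimal sketch: `timed()` and `probeLatency()` are hypothetical helper names, not part of any Google or PHP library, and the connection factory is whatever your framework already uses.

```php
<?php
// Hypothetical helper: run a callable and return [result, elapsed milliseconds].
function timed(callable $fn): array
{
    $start = microtime(true);                       // wall-clock seconds as a float
    $result = $fn();
    $elapsedMs = (microtime(true) - $start) * 1000; // seconds -> milliseconds
    return [$result, $elapsedMs];
}

// Hypothetical probe: time "open a connection" separately from "run a trivial query".
// $connect must return a PDO-like object exposing query().
function probeLatency(callable $connect): array
{
    list($pdo, $connectMs) = timed($connect);
    list(, $queryMs) = timed(function () use ($pdo) {
        return $pdo->query('SELECT 1');
    });
    return ['connect_ms' => $connectMs, 'query_ms' => $queryMs];
}

// Usage (assumption: $dsn, $user and $pass are your own connection settings):
// $stats = probeLatency(function () use ($dsn, $user, $pass) {
//     return new PDO($dsn, $user, $pass);
// });
// error_log(json_encode($stats));
```

If `connect_ms` is large while `query_ms` stays small, the time is being lost in establishing the connection rather than in MySQL itself. If the trivial query is fast but real requests are slow, the slow query log is the better place to look.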

Here is some advice that might help:

  1. Make sure you are using the /cloudsql/<INSTANCE_CONNECTION_NAME> unix socket to connect. You can follow the instructions here. This will likely provide the lowest latency.

  2. Make sure you are opening/closing connections correctly. You should be opening a connection, performing your queries, then closing the connection. Look up how best to do this in PHP given your stack.

  3. It looks like you are using a db-g1-small instance. This is a shared-core instance, and isn't recommended for production use (this is a particularly strange decision since you are also using a failover). Try upgrading to a machine size that is covered under the SLA.
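
As a rough illustration of points 1 and 2, here is a sketch using PDO. It assumes the app can use PDO with the MySQL driver; `cloudSqlDsn()` and `withConnection()` are hypothetical helper names, and the instance name, database name, and credentials in the usage comment are placeholders.

```php
<?php
// Build a PDO DSN that connects through the Cloud SQL Unix socket.
// $instanceConnectionName has the form "project:region:instance".
function cloudSqlDsn(string $instanceConnectionName, string $dbName): string
{
    return sprintf(
        'mysql:unix_socket=/cloudsql/%s;dbname=%s;charset=utf8mb4',
        $instanceConnectionName,
        $dbName
    );
}

// Open a connection, run the work, and release the connection afterwards,
// even if the work throws.
function withConnection(callable $connect, callable $work)
{
    $pdo = $connect();
    try {
        return $work($pdo);
    } finally {
        // PDO closes the connection once the last reference to it
        // (including any statement objects) is gone.
        $pdo = null;
    }
}

// Usage (placeholders, not real values):
// $dsn = cloudSqlDsn('<INSTANCE_CONNECTION_NAME>', '<DB_NAME>');
// $rows = withConnection(
//     function () use ($dsn) { return new PDO($dsn, '<USER>', '<PASSWORD>'); },
//     function ($pdo) { return $pdo->query('SELECT 1')->fetchAll(); }
// );
```

Passing the connection factory in as a callable keeps the open/close logic in one place, so existing framework code can supply its own way of connecting.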