背景

由于业务中有备份某个业务表的定时任务会在每天指定时间点做一次备份【使用quartz】

各位也都知道各种timeout的相关问题

socketTimeout

connectTimeOut

queryTimeOut

transactionTimeOut

等等

事实上每一个timeout都有其特殊的用处

相传不久前美团外卖挂机事件也是由于前一天晚上发布临时调整了timeout导致~

问题

我们系统中使用了mybatis的timeout 默认情况配置30s

因此开发为了防止对应sql超时设置了为500s

但是事务时间设置为180s

@Intercepts({         @Signature(type = StatementHandler.class, method = "update", args = {Statement.class}),         @Signature(type = StatementHandler.class, method = "batch", args = {Statement.class}),         @Signature(type = StatementHandler.class, method = "query", args = {Statement.class, ResultHandler.class})}) public class MybatisTransactionTimeoutInterceptor implements Interceptor {       @Override     public Object intercept(Invocation invocation) throws Throwable {         Statement stmt = (Statement) invocation.getArgs()[0];         Collection<Object> values = TransactionSynchronizationManager.getResourceMap().values();         if (values.size() > 0) {             for (Object obj : values) {                 if (obj != null && obj instanceof ConnectionHolder) {                     ConnectionHolder holder = (ConnectionHolder) obj;                     if (holder.hasTimeout()) {                         int queryTimeOut = holder.getTimeToLiveInSeconds();                         if (stmt.getQueryTimeout() != 0) {                             queryTimeOut = queryTimeOut < stmt.getQueryTimeout() ? queryTimeOut : stmt.getQueryTimeout();                         }                         stmt.setQueryTimeout(queryTimeOut);                     }                     break;                 }             }         }           return invocation.proceed();     }       @Override     public Object plugin(Object target) {         return Plugin.wrap(target, this);     }       @Override     public void setProperties(Properties properties) {       } }

那么实质上应当在180s的时候就抛出异常

但是结果均是大约在270s左右抛出异常。

service运行时间和事务时长不一致！居然额外多出了90s！

分析

首先分析该service

该service十分简单基本没有太多业务操作

考虑其设置超时时间应当正确为180s【事实上正常执行一次该sql需要约4min+】

但是基本在200s+能够返回

通过show processlist命令查看对应mysql连接处于query end 状态

面向Google编程后发现大部分用户提到磁盘空间占用不足导致出现长时间 queryend http://yuan-xulong.iteye.com/blog/2047298

和我们系统不符【继续观察系统】

发现sql语句恰好在180s后更改了状态为query end【证明query timeout 是生效的】

但是系统在200s+才收到异常的原因是mysql对应连接长时间hang在了query end 状态

继续面向谷歌编程检索到如下一句话

We're running on MySQL Percona 5.5 on InnoDB, and whenever our system has bad performance we see that there are a LOT of INSERT threads stuck in the "query end" state.

According to Google searches, this state doesn't do anything, it's just:

This state occurs after processing a query but before the freeing items state.

Does this mean that the 'freeing items' state has a queue that all of our INSERTs have to wait for, or is there something else going on during this state?

Edit: The innodb_flush_log_at_trx_commit variable is already set at 0.

都提到了该状态通常出现在insert之后我们的sql语句时大量的插入某个表【因此考虑可能导致log相当多涉及到磁盘】

查看 innodb_flush_log_at_trx_commit

Variable_name Value
innodb_flush_log_at_trx_commit 2

一、参数解释

0：log buffer将每秒一次地写入log file中，并且log file的flush(刷到磁盘)操作同时进行。该模式下在事务提交的时候，不会主动触发写入磁盘的操作。

1：每次事务提交时MySQL都会把log buffer的数据写入log file，并且flush(刷到磁盘)中去，该模式为系统默认。

2：每次事务提交时MySQL都会把log buffer的数据写入log file，但是flush(刷到磁盘)操作并不会同时进行。该模式下，MySQL会每秒执行一次 flush(刷到磁盘)操作。

二、参数修改

找到mysql配置文件mysql.ini，修改成合适的值，然后重启mysql。

三、注意事项

当设置为0，该模式速度最快，但不太安全，mysqld进程的崩溃会导致上一秒钟所有事务数据的丢失。

当设置为1，该模式是最安全的，但也是最慢的一种方式。在mysqld 服务崩溃或者服务器主机crash的情况下，binary log 只有可能丢失最多一个语句或者一个事务。。
当设置为2，该模式速度较快，也比0安全，只有在操作系统崩溃或者系统断电的情况下，上一秒钟所有事务数据才可能丢失。

考虑由于大量的插入导致需要flush log 导致sql语句hang在query end状态

措施

在大量插入的场景下可以适当放宽事务时长

本文发表于2018年01月03日 14:32
(c)注：本文转载自https://my.oschina.net/qixiaobo025/blog/1601128，转载目的在于传递更多信息，并不代表本网赞同其观点和对其真实性负责。如有侵权行为，请联系我们，我们会及时删除.

阅读 4908 讨论 0 喜欢 1

mysql的query end status

背景

问题

分析

三、注意事项

措施

Y2lvbkBjaGluYWNpb24uY24=

快捷链接
网站地图
提交友链