其次是更新表,这里的更新的时候是带主键值的(因为我是从另一个表获取数据再插入,要求主键不能变) 同样直接更新1000条数据, replace的操作要比insert on duplicate的操作低太多太多, 当insert瞬间完成(感觉)的时候,replace要7,8s, replace慢的原因我是知道的,在更新数据的时候,要先删除旧的,然后插入新的,在这个过程中,还要重新维护索引,所以速度慢,但为何insert on duplicate的更新却那么快呢。 在向老大请教后,终于知道,insert on duplicate 的更新操作虽然也会更新数据,但其对主键的索引却不会有改变,也就是说,insert on duplicate 更新对主键索引没有影响.因此对索引的维护成本就低了一些(如果更新的字段不包括主键,那就要另说了)。
mysql> explain select * from city_demo where city like 'Jinch%';
+----+-------------+-----------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | city_demo | range | city | city | 20 | NULL | 2 | Using where |
+----+-------------+-----------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
mysql>
<?phpfunctionechoTimes($msg, $max){
for ($i = 1; $i <= $max; ++$i) {
echo"$msg iteration $i\n";
yield;
}
}
functiontask(){
echoTimes('foo', 10); // print foo ten timesecho"---\n";
echoTimes('bar', 5); // print bar five timesyield; // force it to be a coroutine
}
$scheduler = new Scheduler;
$scheduler->addTask(task());
$scheduler->run();
DELETE consum_record
FROM
consum_record,
(
SELECT
min(id) id,
user_id,
monetary,
consume_time
FROM
consum_record
GROUP BY
user_id,
monetary,
consume_time
HAVING
count(*) > 1
) t2
WHERE
consum_record.user_id = t2.user_id
and consum_record.monetary = t2.monetary
and consum_record.consume_time = t2.consume_time
AND consum_record.id > t2.id;
上面这条sql语句,仔细看一下,揣摩出思路也不难,大概也分为3步来理解:
(SELECT min(id) id, user_id, monetary, consume_time FROM consum_record GROUP BY user_id, monetary, consume_time HAVING count(*) > 1 ) t2 查询出重复记录形成一个集合(临时表t2),集合里是每种重复记录的最小ID
consum_record.user_id = t2.user_id and consum_record.monetary = t2.monetary and consum_record.consume_time = t2.consume_time关联判断重复基准的字段