Yannick Pereira-Reis bio photo

Yannick Pereira-Reis

DevOps (docker swarm, haproxy, CI/CD, ELK, prometheus, grafana, ansible, automation, RabbitMQ, LVM, MySQL replication...) and fullstack web developer Symfony 2/3/4/5 + VueJs in Valence (France).

Twitter LinkedIn Github

Docker

MySQL Delete with Group By and Having clauses.

To achieve this exploit with a single simple MySQL command, we are using two useful functions:

This function returns a string result with the concatenated non-NULL values from a group. It returns NULL if there are no non-NULL values.

Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by , characters. If the first argument is a constant string and the second is a column of type SET, the FIND_IN_SET() function is optimized to use bit arithmetic. Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL. This function does not work properly if the first argument contains a comma (,) character.

Select the ids we want to delete with the GROUP BY and HAVING clauses

SELECT group_concat(id) AS ids_to_delete
FROM table
GROUP BY field_1, field_2,...
HAVING count(*) >= 5

The result will look like that:

---------------
ids_to_delete
---------------
1,34,87,8756,4657,34

Delete rows directly using this list of ids.

To do this we use the FIND_IN_SET() function.

[EDIT 1] (David Gurba)
group_concat_max_len: The maximum permitted result length in bytes for the GROUP_CONCAT() function. The default is 1024.
SET SESSION group_concat_max_len=4294967295;
SET GLOBAL group_concat_max_len=18446744073709551615;

DELETE FROM table
WHERE FIND_IN_SET(id, (
  SELECT ids_to_delete
  FROM (
     SELECT group_concat(id) AS ids_to_delete
     FROM table
     GROUP BY field_1, field_2,...
     HAVING count(*) >= 5
   ) t
))

Dry run - If you want to check what you will remove before really doing it:

DELETE FROM table => SELECT * FROM table

SELECT * FROM table
WHERE FIND_IN_SET(id, (
  SELECT ids_to_delete
  FROM (
     SELECT group_concat(id) AS ids_to_delete
     FROM table
     GROUP BY field_1, field_2,...
     HAVING count(*) >= 5
   ) t
))

That’s all ;)