The Complete Guide to Data Analysis and Statistics in MySQL: From Beginner to Expert

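Contents
  1. Basic aggregate functions
  2. Grouping and multi-table statistics
  3. Advanced window functions
  4. Subqueries and CTEs
  5. Performance optimization
  6. Time series analysis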

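This guide walks through the SQL needed for statistical analysis in MySQL, from basic aggregates to window functions. A good starting point is a per-department average salary: SELECT department, AVG(salary) AS average_salary FROM employees GROUP BY department; From there, get comfortable with the aggregate functions COUNT, SUM, AVG, MIN, and MAX, then group rows with GROUP BY and filter the resulting groups with HAVING.

More advanced techniques include window functions (available in MySQL 8.0 and later) such as ROW_NUMBER() OVER (ORDER BY sales DESC) for ranking. Much practical analysis depends on joining tables: for example, customers LEFT JOIN orders ON customers.id = orders.customer_id, followed by GROUP BY customers.region, gives revenue per region. Correlated subqueries are another tool: SELECT DISTINCT department, (SELECT AVG(salary) FROM employees e2 WHERE e2.department = e1.department) AS dept_avg FROM employees e1; (without DISTINCT the query returns one row per employee rather than one per department). For performance, inspect the query plan with EXPLAIN and create indexes where they help, for example CREATE INDEX idx_date ON sales(date);

Two practical cases recur throughout. Daily active users (DAU): SELECT DATE(login_time) AS date, COUNT(DISTINCT user_id) AS dau FROM user_logs GROUP BY DATE(login_time) ORDER BY date DESC; Month-over-month growth in percent: (revenue - LAG(revenue) OVER (ORDER BY month)) / LAG(revenue) OVER (ORDER BY month) * 100. Work through these statements step by step and you will cover the path from beginner to expert.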

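Basic aggregate functions

COUNT(*) counts all rows in a group, COUNT(column) counts only rows where the column is not NULL, and COUNT(DISTINCT column) counts distinct non-NULL values. SUM(column) adds values up and AVG(column) averages them; both skip NULLs. MIN and MAX return the smallest and largest values. Example: SELECT department, COUNT(*) AS emp_count, AVG(salary) AS avg_salary FROM employees GROUP BY department HAVING emp_count > 10; (MySQL lets HAVING refer to a column alias from the select list.)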
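
To make the NULL handling concrete, here is a minimal sketch on a made-up table. The table name employees_demo, its columns, and the sample rows are assumptions for illustration only and do not come from a real schema.

```sql
-- Hypothetical sample table; names and values are for illustration only.
CREATE TABLE employees_demo (
  id         INT PRIMARY KEY,
  department VARCHAR(32) NOT NULL,
  salary     DECIMAL(10, 2) NULL      -- NULL means the salary was not recorded
);

INSERT INTO employees_demo (id, department, salary) VALUES
  (1, 'Sales', 4000.00),
  (2, 'Sales', 6000.00),
  (3, 'Sales', NULL),
  (4, 'IT',    8000.00);

SELECT department,
       COUNT(*)      AS emp_count,       -- every row, including the NULL salary
       COUNT(salary) AS salaried_count,  -- only rows with a non-NULL salary
       SUM(salary)   AS total_salary,
       AVG(salary)   AS avg_salary,      -- average over non-NULL salaries only
       MIN(salary)   AS min_salary,
       MAX(salary)   AS max_salary
FROM employees_demo
GROUP BY department
HAVING emp_count >= 2;                   -- keep departments with at least 2 rows
```

With this sample data only Sales passes the HAVING filter: emp_count is 3, salaried_count is 2, and avg_salary is 5000 (10000 / 2), because the NULL row is left out of both the sum and the divisor.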

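Grouping and multi-table statistics

GROUP BY department, role groups by more than one column. Adding WITH ROLLUP, as in GROUP BY department WITH ROLLUP, appends subtotal and grand-total rows in which the rolled-up column is NULL. Statistics across joined tables work the same way: SELECT c.name, COUNT(o.id) AS order_count FROM customers c LEFT JOIN orders o ON c.id = o.customer_id GROUP BY c.id, c.name; Because of the LEFT JOIN, customers without orders still appear, and COUNT(o.id) gives them 0 rather than 1.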
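
The two ideas above can be sketched together as follows. The tables customers_demo and orders_demo, their columns, and the sample rows are hypothetical and exist only for this example.

```sql
-- Hypothetical sample tables; names and values are for illustration only.
CREATE TABLE customers_demo (
  id     INT PRIMARY KEY,
  name   VARCHAR(32) NOT NULL,
  region VARCHAR(32) NOT NULL
);

CREATE TABLE orders_demo (
  id          INT PRIMARY KEY,
  customer_id INT NOT NULL,
  amount      DECIMAL(10, 2) NOT NULL
);

INSERT INTO customers_demo (id, name, region) VALUES
  (1, 'Ann', 'North'),
  (2, 'Bo',  'North'),
  (3, 'Cy',  'South');

INSERT INTO orders_demo (id, customer_id, amount) VALUES
  (101, 1, 50.00),
  (102, 1, 30.00),
  (103, 3, 20.00);

-- Orders per customer. COUNT(o.id) skips the NULL produced by the LEFT JOIN,
-- so a customer with no orders gets 0.
SELECT c.id, c.name, COUNT(o.id) AS order_count
FROM customers_demo AS c
LEFT JOIN orders_demo AS o ON o.customer_id = c.id
GROUP BY c.id, c.name;

-- Revenue per region plus a grand-total row (region is NULL on that row).
SELECT c.region, COALESCE(SUM(o.amount), 0) AS revenue
FROM customers_demo AS c
LEFT JOIN orders_demo AS o ON o.customer_id = c.id
GROUP BY c.region WITH ROLLUP;
```

On this data the first query reports 2 orders for Ann, 0 for Bo, and 1 for Cy. The second reports 80 for North, 20 for South, and 100 on the rollup row.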


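Advanced window functions

To put the department average next to each row, use AVG(salary) OVER (PARTITION BY department). If the window also has an ORDER BY, the default frame ends at the current row (and its peers) and produces a running average, so add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to cover the whole partition: AVG(salary) OVER (PARTITION BY department ORDER BY salary ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING). RANK() OVER (PARTITION BY dept ORDER BY score DESC) ranks rows within each partition and leaves gaps after ties (1, 1, 3); DENSE_RANK() leaves no gaps (1, 1, 2). LAG and LEAD read values from the previous and following rows, which makes them the natural choice for period-over-period comparisons.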
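
The difference between the ranking functions is easiest to see on data with a tie. The sketch below assumes MySQL 8.0 or later; the table scores_demo and its rows are invented for the example.

```sql
-- Hypothetical sample table; names and values are for illustration only.
CREATE TABLE scores_demo (
  student VARCHAR(32) NOT NULL,
  dept    VARCHAR(32) NOT NULL,
  score   INT NOT NULL
);

INSERT INTO scores_demo (student, dept, score) VALUES
  ('Ann', 'Math', 95),
  ('Bob', 'Math', 95),
  ('Cid', 'Math', 90),
  ('Dee', 'Art',  88);

SELECT student,
       dept,
       score,
       ROW_NUMBER() OVER w AS row_num,    -- 1, 2, 3 ...; order among ties is arbitrary
       RANK()       OVER w AS rnk,        -- ties share a rank, then a gap follows
       DENSE_RANK() OVER w AS dense_rnk,  -- ties share a rank, no gap follows
       AVG(score)   OVER (PARTITION BY dept) AS dept_avg  -- whole-partition average
FROM scores_demo
WINDOW w AS (PARTITION BY dept ORDER BY score DESC);
```

In the Math partition, Ann and Bob both get rnk 1 and dense_rnk 1, while Cid gets rnk 3 but dense_rnk 2. The alias rnk is used because RANK is a reserved word in MySQL 8.0.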

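Subqueries and CTEs

WITH top_depts AS (SELECT department FROM employees GROUP BY department HAVING AVG(salary) > 50000) SELECT * FROM employees WHERE department IN (SELECT department FROM top_depts); A common table expression (CTE, MySQL 8.0 and later) gives a subquery a name, so it can be referenced more than once in the same statement and the query becomes easier to read.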
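
As a variation on the query above, the following sketch keeps the average in the CTE so the outer query can use it both as a filter and as a comparison value. The table staff_demo and its rows are hypothetical.

```sql
-- Hypothetical sample table; names and values are for illustration only.
CREATE TABLE staff_demo (
  id         INT PRIMARY KEY,
  department VARCHAR(32) NOT NULL,
  salary     DECIMAL(10, 2) NOT NULL
);

INSERT INTO staff_demo (id, department, salary) VALUES
  (1, 'IT',    70000.00),
  (2, 'IT',    50000.00),
  (3, 'Sales', 40000.00),
  (4, 'Sales', 30000.00);

-- dept_avg is computed once and then joined back to the detail rows.
WITH dept_avg AS (
  SELECT department, AVG(salary) AS avg_salary
  FROM staff_demo
  GROUP BY department
)
SELECT s.id, s.department, s.salary, d.avg_salary
FROM staff_demo AS s
JOIN dept_avg AS d ON d.department = s.department
WHERE d.avg_salary > 50000        -- only high-paying departments
  AND s.salary > d.avg_salary;    -- only staff above their department average
```

On this data IT averages 60000 and Sales 35000, so the only row returned is employee 1.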


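Performance optimization

Run EXPLAIN SELECT ... and check the type column: const, ref, and range mean an index is being used selectively, index means the whole index is scanned, and ALL means a full table scan, which is what you want to avoid on large tables. A composite index that contains every column the query reads (a covering index) lets MySQL answer from the index alone; EXPLAIN then shows Using index in the Extra column. For pagination, SELECT * FROM large_table ORDER BY id LIMIT 10000, 10; still has to read and discard the first 10000 rows, so deep pages get slower and slower. Remember the last id of the previous page instead and continue from there: WHERE id > last_id ORDER BY id LIMIT 10.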
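
A minimal sketch of these three ideas follows. The table sales_demo, the index name, and the id value 10010 are assumptions for illustration. What EXPLAIN actually reports depends on the data volume and statistics, so treat the comments as what one would hope to see on a populated table, not as guaranteed output.

```sql
-- Hypothetical sample table; names and values are for illustration only.
CREATE TABLE sales_demo (
  id        BIGINT PRIMARY KEY AUTO_INCREMENT,
  sale_date DATE NOT NULL,
  region    VARCHAR(32) NOT NULL,
  amount    DECIMAL(10, 2) NOT NULL
);

-- Composite index that contains every column the report below reads,
-- so the query can be served from the index alone (a covering index).
CREATE INDEX idx_date_region_amount ON sales_demo (sale_date, region, amount);

-- Inspect the plan. On a populated table the hope is type = range and
-- "Using index" in Extra; the optimizer may choose differently.
EXPLAIN
SELECT region, SUM(amount) AS revenue
FROM sales_demo
WHERE sale_date >= '2024-01-01' AND sale_date < '2024-02-01'
GROUP BY region;

-- Offset pagination: reads and throws away 10000 rows before returning 10.
SELECT id, sale_date, amount
FROM sales_demo
ORDER BY id
LIMIT 10000, 10;

-- Keyset pagination: seeks directly to the position after the last id seen.
SET @last_id = 10010;  -- assumed to be the last id of the previous page
SELECT id, sale_date, amount
FROM sales_demo
WHERE id > @last_id
ORDER BY id
LIMIT 10;
```

Keyset pagination only supports moving to the next or previous page, not jumping to an arbitrary page number, which is the trade-off for its constant cost per page.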


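Time series analysis

Date functions do most of the work here. DATE_FORMAT(date, '%Y-%m') turns a date into a year-month string that can be used to group by month, and DATEDIFF(end_date, start_date) returns the number of days between two dates. A simple retention-style metric is the set of users who were active on at least seven distinct days: SELECT user_id, COUNT(DISTINCT DATE(login_time)) AS active_days FROM logs GROUP BY user_id HAVING active_days >= 7;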
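
The sketch below combines these functions into daily and monthly active-user counts with month-over-month growth. It assumes MySQL 8.0 or later; the table user_logs_demo and its rows are invented for the example.

```sql
-- Hypothetical sample table; names and values are for illustration only.
CREATE TABLE user_logs_demo (
  user_id    INT NOT NULL,
  login_time DATETIME NOT NULL
);

INSERT INTO user_logs_demo (user_id, login_time) VALUES
  (1, '2024-01-01 08:00:00'),
  (1, '2024-01-01 20:00:00'),   -- same user, same day: counted once in DAU
  (2, '2024-01-01 09:30:00'),
  (1, '2024-01-02 10:00:00'),
  (1, '2024-02-10 11:00:00'),
  (2, '2024-02-10 12:00:00'),
  (3, '2024-02-10 13:00:00');

-- Daily active users.
SELECT DATE(login_time) AS login_date,
       COUNT(DISTINCT user_id) AS dau
FROM user_logs_demo
GROUP BY DATE(login_time)
ORDER BY login_date DESC;

-- Monthly active users and month-over-month growth in percent.
WITH monthly AS (
  SELECT DATE_FORMAT(login_time, '%Y-%m') AS ym,
         COUNT(DISTINCT user_id) AS mau
  FROM user_logs_demo
  GROUP BY DATE_FORMAT(login_time, '%Y-%m')
)
SELECT ym,
       mau,
       (mau - LAG(mau) OVER (ORDER BY ym))
         / NULLIF(LAG(mau) OVER (ORDER BY ym), 0) * 100 AS mom_growth_pct
FROM monthly
ORDER BY ym;

-- Days between each user's first and last login.
SELECT user_id,
       DATEDIFF(MAX(login_time), MIN(login_time)) AS span_days
FROM user_logs_demo
GROUP BY user_id;
```

On this data January has 2 active users and February has 3, so February's growth is 50 percent and January's is NULL because there is no earlier month. NULLIF guards against dividing by a previous value of zero.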

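FAQ
Q: How do I calculate year-over-year revenue growth?
A: SELECT year, revenue, (revenue - LAG(revenue) OVER (ORDER BY year)) / LAG(revenue) OVER (ORDER BY year) * 100 AS yoy_growth FROM yearly_sales; The first year has no previous row, so its yoy_growth is NULL. A frame clause such as ROWS 1 PRECEDING is not needed: LAG works by row offset, and MySQL ignores frame clauses for it.
Q: What is the difference between GROUP BY and DISTINCT?
A: GROUP BY forms groups so that aggregate functions can be computed per group. DISTINCT only removes duplicate rows, and it applies to the whole select list, not to a single column within it.
Q: Which performs better, window functions or subqueries?
A: A window function is often faster than an equivalent correlated subquery because it avoids re-evaluating the subquery for each row. This is not guaranteed, so compare both versions with EXPLAIN on your own data.
Q: How should NULL values be handled in statistics?
A: AVG, SUM, MIN, MAX, and COUNT(column) skip NULLs. If a NULL should count as zero, replace it with IFNULL(column, 0) or COALESCE(column, 0), keeping in mind that this changes the result of AVG because those rows now enter the divisor.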
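
The effect of replacing NULL with zero can be checked on three rows. The table ratings_demo and its values are hypothetical.

```sql
-- Hypothetical sample table; names and values are for illustration only.
CREATE TABLE ratings_demo (
  id    INT PRIMARY KEY,
  score INT NULL
);

INSERT INTO ratings_demo (id, score) VALUES
  (1, 10),
  (2, 20),
  (3, NULL);

SELECT COUNT(*)                AS all_rows,          -- 3
       COUNT(score)            AS non_null_rows,     -- 2
       AVG(score)              AS avg_ignoring_null, -- (10 + 20) / 2 = 15
       AVG(COALESCE(score, 0)) AS avg_null_as_zero   -- (10 + 20 + 0) / 3 = 10
FROM ratings_demo;
```

Which of the two averages is right depends on what NULL means in the data: "not measured" usually calls for the first, "measured as nothing" for the second.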