본문 바로가기
Data Analysis/SQL

6. RFM Segmentation (SQL)

by Hagrid 2022. 11. 15.
반응형

1. RFM 분석의 3가지 지표 집계하기

- Recency: 최근 구매일 - 최근에 구매하였는가?
- Frequency: 구매 횟수 - 얼마나 자주 구매하였는가?
- Monetary: 구매 금액 합계 -  얼마나 돈을 썼는가?

2. 원리는 아래의 사진과 같다

3. 순서대로 진행 !

1

  • 먼저 집계 테이블을 구하는데 위에 분류된 기준을 사용해 컬럼을 생성해준다.
select customer_id,
      count(customer_id) as frequency,
      sum(sales) as monetary,
      max(order_date) AS recent_order
from records
group by customer_id

2

  • 그다음 나이브하게 5등분으로 나누어 준다.
  • 위에서 만든 테이블은 CTE 형식으로 나타내어 주어도 좋지만 서브쿼리를 사용하여 나타내어 본다.
SELECT customer_id,
    ntile(5) OVER ( ORDER BY recent_order ) AS rfm_recency,
    ntile(5) OVER ( ORDER BY total_orders ) AS rfm_frequency,
    ntile(5) OVER ( ORDER BY total_sales ) AS rfm_monetary
from(select customer_id,
      count(customer_id) as total_orders,
      sum(sales) as total_sales,
      max(order_date) AS recent_order
from records
group by customer_id) as t
  • 결과 테이블

3.

RFM CELL 을 만들어 준다.

/* cte */
WITH preprocessing_tbl
     AS (
        /* 5등분한 테이블 */
        SELECT customer_id,
               Ntile(5)
                 OVER (
                   ORDER BY recent_order ) AS rfm_recency,
               Ntile(5)
                 OVER (
                   ORDER BY total_orders ) AS rfm_frequency,
               Ntile(5)
                 OVER (
                   ORDER BY total_sales )  AS rfm_monetary
         /* RFM 을 계산한 테이블 */
         FROM   (SELECT customer_id,
                        Count(customer_id) AS total_orders,
                        Sum(sales)         AS total_sales,
                        Max(order_date)    AS recent_order
                 FROM   records
                 GROUP  BY customer_id) AS tbl1)
/* RFM CELL 만들기 */
SELECT customer_id,
       rfm_recency,
       rfm_frequency,
       rfm_monetary,
       Concat(rfm_recency, rfm_frequency, rfm_monetary) AS rfm_cell
FROM   preprocessing_tbl
  • 서브쿼리 버전
/* 서브쿼리 */
SELECT customer_id,
       rfm_recency,
       rfm_frequency,
       rfm_monetary,
       Concat(rfm_recency, rfm_frequency, rfm_monetary) AS rfm_cell
FROM  (
      /* 5등분한 테이블 */
      SELECT customer_id,
             Ntile(5)
               OVER (
                 ORDER BY recent_order ) AS rfm_recency,
             Ntile(5)
               OVER (
                 ORDER BY total_orders ) AS rfm_frequency,
             Ntile(5)
               OVER (
                 ORDER BY total_sales )  AS rfm_monetary
       /* RFM 을 계산한 테이블 */
       FROM   (SELECT customer_id,
                      Count(customer_id) AS total_orders,
                      Sum(sales)         AS total_sales,
                      Max(order_date)    AS recent_order
               FROM   records
               GROUP  BY customer_id) AS tbl1) tbl2
  • 결과 테이블

- 구분기준

Cannot lose Active fans Promising newbies Potential churners
RFM cell values 355, 255 543, 542, 453, 452 525, 524, 515, 514 335, 334, 325, 324
Conditions for inclusion Low RHigh F&M High R&FLow M High R&MLow F (so far) Mid RLow FHigh M
Description Big spenders who haven’t purchased lately Customers who buy often & recently, but at low price points New customers with large orders High spending customers who haven’t purchased often or lately

최종버전

/* 구분 기준별 나누기  */
SELECT customer_id,
            rfm_recency,
       rfm_frequency,
       rfm_monetary,
       rfm_cell,
       CASE
         WHEN rfm_cell IN ( '355', '255' ) THEN 
                    'Cannot lose'
         WHEN rfm_cell IN ( '543', '542', '453', '452' ) THEN 
                    'Active fans'
         WHEN rfm_cell IN ( '525', '524', '515', '514' ) THEN
         'Promising newbies'
         WHEN rfm_cell IN ( '335', '334', '325', '324' ) THEN
         'Potential churners'
         ELSE 'Other'
       END         AS rfm_segment
FROM   (
       /* 서브쿼리 - RFM cell 만들기  */
       SELECT customer_id,
              rfm_recency,
              rfm_frequency,
              rfm_monetary,
              Concat(rfm_recency, rfm_frequency, rfm_monetary) AS rfm_cell
        FROM   (
               /* 5등분한 테이블 */
               SELECT customer_id,
                      Ntile(5)
                        OVER (
                          ORDER BY recent_order ) AS rfm_recency,
                      Ntile(5)
                        OVER (
                          ORDER BY total_orders ) AS rfm_frequency,
                      Ntile(5)
                        OVER (
                          ORDER BY total_sales )  AS rfm_monetary
                /* RFM 을 계산한 테이블 */
                FROM   (SELECT customer_id,
                               Count(customer_id) AS total_orders,
                               Sum(sales)         AS total_sales,
                               Max(order_date)    AS recent_order
                        FROM   records
                        GROUP  BY customer_id) AS tbl1) tbl2) AS tbl3
  • 결과 테이블 및 시각화


위와 같이 고객을 나이브하게 나누어 분류해보았다.

결론

참고링크

How to Create RFM Segments for Shopify

반응형

댓글