데이터베이스 예제코드 (MySQL+AWS) - JOIN 활용 실제 예제1

9. JOIN 활용 실제 예제1

예제1: "Comedy", "Sports", "Family" 카테고리에 포함되는 각각의 영화 수 구하기

sakila 데이터베이스에서 실습

db.close() # 이전에 연결된 데이터베이스 끊어주세요 (에러나면 이미 끊어진 것임)
host_name = "0.0.0.0"
username = "root"
password = "korea123"
database_name = "sakila"    # 데이터베이스 이름을 sakila 로 바꿔줘야 합니다.
db = pymysql.connect(
    host=host_name,  # DATABASE_HOST
    port=3306,
    user=username,  # DATABASE_USERNAME
    passwd=password,  # DATABASE_PASSWORD
    db=database_name,  # DATABASE_NAME
    charset='utf8'
)
  • "Comedy", "Sports", "Family" 각 카테고리 ID만 출력해보세요
category_df = pd.read_sql("SELECT * FROM category;", db)
category_df
Out[101]:
category_id name last_update
0 1 Action 2006-02-15 04:46:27
1 2 Animation 2006-02-15 04:46:27
2 3 Children 2006-02-15 04:46:27
3 4 Classics 2006-02-15 04:46:27
4 5 Comedy 2006-02-15 04:46:27
5 6 Documentary 2006-02-15 04:46:27
6 7 Drama 2006-02-15 04:46:27
7 8 Family 2006-02-15 04:46:27
8 9 Foreign 2006-02-15 04:46:27
9 10 Games 2006-02-15 04:46:27
10 11 Horror 2006-02-15 04:46:27
11 12 Music 2006-02-15 04:46:27
12 13 New 2006-02-15 04:46:27
13 14 Sci-Fi 2006-02-15 04:46:27
14 15 Sports 2006-02-15 04:46:27
15 16 Travel 2006-02-15 04:46:27
SQL = """
    SELECT name, category_id FROM category 
    WHERE name = 'Comedy' OR name = 'Sports' OR name = 'Family'
"""

category_df = pd.read_sql(SQL, db)
category_df
Out[107]:
name category_id
0 Comedy 5
1 Family 8
2 Sports 15
  • 영화 아이디가 2 인 영화의 카테고리 ID는?
film_category_df = pd.read_sql("SELECT * FROM film_category;", db)
film_category_df.head()
Out[109]:
film_id category_id last_update
0 1 6 2006-02-15 05:07:09
1 2 11 2006-02-15 05:07:09
2 3 6 2006-02-15 05:07:09
3 4 11 2006-02-15 05:07:09
4 5 8 2006-02-15 05:07:09
SQL = """
    SELECT category_id FROM film_category WHERE film_id = 2;
"""
film_category_df = pd.read_sql(SQL, db)
film_category_df
Out[111]:
category_id
0 11
  • 카테고리 ID별 영화 수는?
SQL_QUERY = """
    SELECT COUNT(*) FROM film_category GROUP BY category_id;
"""
category_film_count_df = pd.read_sql(SQL_QUERY, db)
category_film_count_df
Out[114]:
COUNT(*)
0 64
1 66
2 60
3 57
4 58
5 68
6 62
7 69
8 73
9 61
10 56
11 51
12 63
13 61
14 74
15 57
  • category 테이블에는 카테고리 이름과 category_id, film_category 테이블에는 category_id와 각 영화 id가 있음

JOIN 시작: 카테고리가 Comedy인 영화를 모두 찾아보자.

SQL_QUERY = """
    SELECT * 
    FROM film_category
    JOIN category
    ON category.category_id = film_category.category_id
    WHERE category.name = 'Comedy'
"""
pd.read_sql(SQL_QUERY, db)
  • 카테고리가 Comedy인 영화 수를 알아보자.
SQL_QUERY = """
    SELECT COUNT(*) 
    FROM film_category
    JOIN category
    ON category.category_id = film_category.category_id
    WHERE category.name = 'Comedy'
"""
pd.read_sql(SQL_QUERY, db)
Out[118]:
COUNT(*)
0 58
  • 카테고리가 Comedy, Sports, Family인 영화 수를 각각 알아보자.
SQL_QUERY = """
    SELECT category.name, COUNT(*) 
    FROM film_category
    JOIN category
    ON category.category_id = film_category.category_id
    WHERE category.name = 'Comedy' OR category.name = 'Sports' OR category.name = 'Family' 
    GROUP BY category.category_id
"""
pd.read_sql(SQL_QUERY, db)
Out[120]:
name COUNT(*)
0 Comedy 58
1 Family 69
2 Sports 74
  • 카테고리가 Comedy, Sports, Family인 영화 수와 카테고리명을 각각 출력해보자.
SQL_QUERY = """
    SELECT category.name, COUNT(*) 
    FROM film_category
    JOIN category
    ON category.category_id = film_category.category_id
    WHERE category.name = 'Comedy' OR category.name = 'Sports' OR category.name = 'Family' 
    GROUP BY category.category_id
"""
pd.read_sql(SQL_QUERY, db)
Out[122]:
name COUNT(*)
0 Comedy 58
1 Family 69
2 Sports 74
  • 카테고리가 Comedy, Sports, Family인 영화 수와 카테고리명을 각각 출력해보자(카테고리명을 이해하기 쉽게 category_film_count로 필드명을 출력하자)
SQL_QUERY = """
    SELECT category.name, COUNT(*) AS category_film_count
    FROM film_category
    JOIN category
    ON category.category_id = film_category.category_id
    WHERE category.name = 'Comedy' OR category.name = 'Sports' OR category.name = 'Family' 
    GROUP BY category.category_id
"""
pd.read_sql(SQL_QUERY, db)
Out[124]:
name category_film_count
0 Comedy 58
1 Family 69
2 Sports 74