Standard SQL CookBook - 2/N

複数カラムのmin, maxのとり方 - GREATEST, LEAST

公開日: 2021-09-27

  概要
目的 複数カラムのmin, maxのとり方 - GREATEST, LEAST
分類 SQL Cookbook
SQL Standard SQL
環境 BigQuery

Table of Contents

問題設定: 複数カラムに対するmin/max valueの計算方法

以下のようなデータ(X, Y, Z)を考えたとき、(X, Y, Z)のうちもっとも小さい/大きい値を抽出したいとします

X	Y	Z
10	10	60
10	20	10
20	10	10
20	20	50
20	30	50
20	40	20
30	10	10
30	20	50
30	30	100
30	40	50
30	50	10
40	20	10
40	30	30
40	40	40
40	50	20
50	30	20
50	40	20
50	50	40

Goal

10	10	60	10	60
10	20	10	10	20
20	10	10	10	20
20	20	50	20	50
20	30	50	20	50
20	40	20	20	40
30	10	10	10	30
30	20	50	20	50
30	30	100	30	100
30	40	50	30	50
30	50	10	10	50
40	20	10	10	40
40	30	30	30	40
40	40	40	40	40
40	50	20	20	50
50	30	20	20	50
50	40	20	20	50
50	50	40	40	50

SQL: LEAST/GREATEST function

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
WITH data AS(
SELECT
    X,
    COALESCE(Y[SAFE_OFFSET(offset)],  0) AS Y,
    COALESCE(Z[SAFE_OFFSET(offset)],  0) AS Z
FROM
    (
        SELECT 
            [10, 10, 20, 20, 20, 20, 30, 30, 30,  30, 30, 40, 40, 40, 40, 50, 50, 50] AS X,
            [10, 20, 10, 20, 30, 40, 10, 20, 30,  40, 50, 20, 30, 40, 50, 30, 40, 50] AS Y,
            [60, 10, 10, 50, 50, 20, 10, 50, 100, 50, 10, 10, 30, 40, 20, 20, 20, 40] AS Z
    ) AS input,
    UNNEST(X) AS X WITH OFFSET AS offset
)
SELECT
    X,
    Y,
    Z,
    LEAST(X, Y, Z) AS minimum_value,
    GREATEST(X, Y, Z) AS max_value,
FROM
    data

Python: pandas

1
2
3
4
5
6
7
8
9
10
import pandas as pd

df = pd.DataFrame(
    {"X": [10, 10, 20, 20, 20, 20, 30, 30, 30, 30, 30, 40, 40, 40, 40, 50, 50, 50],
     "Y": [10, 20, 10, 20, 30, 40, 10, 20, 30, 40, 50, 20, 30, 40, 50, 30, 40, 50],
     "Z": [60, 10, 10, 50, 50, 20, 10, 50, 100, 50, 10, 10, 30, 40, 20, 20, 20, 40]}
     )

df['minimum_value'], df['max_value'] = df.min(axis = 1), df.max(axis = 1)
print(df)

Python: numpy

1
2
3
4
5
6
7
8
9
10
11
12
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {"X": [10, 10, 20, 20, 20, 20, 30, 30, 30, 30, 30, 40, 40, 40, 40, 50, 50, 50],
     "Y": [10, 20, 10, 20, 30, 40, 10, 20, 30, 40, 50, 20, 30, 40, 50, 30, 40, 50],
     "Z": [60, 10, 10, 50, 50, 20, 10, 50, 100, 50, 10, 10, 30, 40, 20, 20, 20, 40]}
     ).values

min_array, max_array = np.amin(df, axis=1), np.amax(df, axis=1) 
result_array = np.column_stack([df, min_array, max_array])
print(result_array)


Share Buttons
Share on:

Feature Tags
Leave a Comment
(注意:GitHub Accountが必要となります)