概要 | |
---|---|
目的 | 複数カラムのmin, maxのとり方 - GREATEST, LEAST |
分類 | SQL Cookbook |
SQL | Standard SQL |
環境 | BigQuery |
Table of Contents
問題設定: 複数カラムに対するmin/max valueの計算方法
以下のようなデータ(X, Y, Z)を考えたとき、(X, Y, Z)のうちもっとも小さい/大きい値を抽出したいとします
X Y Z
10 10 60
10 20 10
20 10 10
20 20 50
20 30 50
20 40 20
30 10 10
30 20 50
30 30 100
30 40 50
30 50 10
40 20 10
40 30 30
40 40 40
40 50 20
50 30 20
50 40 20
50 50 40
Goal
10 10 60 10 60
10 20 10 10 20
20 10 10 10 20
20 20 50 20 50
20 30 50 20 50
20 40 20 20 40
30 10 10 10 30
30 20 50 20 50
30 30 100 30 100
30 40 50 30 50
30 50 10 10 50
40 20 10 10 40
40 30 30 30 40
40 40 40 40 40
40 50 20 20 50
50 30 20 20 50
50 40 20 20 50
50 50 40 40 50
SQL: LEAST/GREATEST function
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
WITH data AS(
SELECT
X,
COALESCE(Y[SAFE_OFFSET(offset)], 0) AS Y,
COALESCE(Z[SAFE_OFFSET(offset)], 0) AS Z
FROM
(
SELECT
[10, 10, 20, 20, 20, 20, 30, 30, 30, 30, 30, 40, 40, 40, 40, 50, 50, 50] AS X,
[10, 20, 10, 20, 30, 40, 10, 20, 30, 40, 50, 20, 30, 40, 50, 30, 40, 50] AS Y,
[60, 10, 10, 50, 50, 20, 10, 50, 100, 50, 10, 10, 30, 40, 20, 20, 20, 40] AS Z
) AS input,
UNNEST(X) AS X WITH OFFSET AS offset
)
SELECT
X,
Y,
Z,
LEAST(X, Y, Z) AS minimum_value,
GREATEST(X, Y, Z) AS max_value,
FROM
data
Python: pandas
1
2
3
4
5
6
7
8
9
10
import pandas as pd
df = pd.DataFrame(
{"X": [10, 10, 20, 20, 20, 20, 30, 30, 30, 30, 30, 40, 40, 40, 40, 50, 50, 50],
"Y": [10, 20, 10, 20, 30, 40, 10, 20, 30, 40, 50, 20, 30, 40, 50, 30, 40, 50],
"Z": [60, 10, 10, 50, 50, 20, 10, 50, 100, 50, 10, 10, 30, 40, 20, 20, 20, 40]}
)
df['minimum_value'], df['max_value'] = df.min(axis = 1), df.max(axis = 1)
print(df)
Python: numpy
1
2
3
4
5
6
7
8
9
10
11
12
import pandas as pd
import numpy as np
df = pd.DataFrame(
{"X": [10, 10, 20, 20, 20, 20, 30, 30, 30, 30, 30, 40, 40, 40, 40, 50, 50, 50],
"Y": [10, 20, 10, 20, 30, 40, 10, 20, 30, 40, 50, 20, 30, 40, 50, 30, 40, 50],
"Z": [60, 10, 10, 50, 50, 20, 10, 50, 100, 50, 10, 10, 30, 40, 20, 20, 20, 40]}
).values
min_array, max_array = np.amin(df, axis=1), np.amax(df, axis=1)
result_array = np.column_stack([df, min_array, max_array])
print(result_array)
統計
Python
math
Linux
Ubuntu 20.04 LTS
Shell
English
git
方法論
Ubuntu 22.04 LTS
統計検定
競技プログラミング
フーリエ解析
前処理
SQL
coding
コミュニケーション
Network
ssh
将棋
Data visualization
Docker
Econometrics
VSCode
statistical inference
GitHub Pages
apt
development
システム管理
Coffee
cloud
数値計算
素数
Book
Font
Metrics
Poetry
Ubuntu 24.04 LTS
architecture
aws
shell
systemctl
テンプレート
データ構造
ポワソン分布
会計分析
文字コード
環境構築
論文
App
Bayesian
Dynamic Programming
Keyboard
Processing
R
Steam
filesystem
quarto
regex
(注意:GitHub Accountが必要となります)