import matplotlib
import numpy as np
import matplotlib.pyplot as plot
%matplotlib inline
%%sql
SELECT
max, (max-32)*5/9 celsius, mo, da, state, stn, name
FROM (
SELECT
max, mo, da, state, stn, name
FROM
[bigquery-public-data:noaa_gsod.gsod2015] a
JOIN
[bigquery-public-data:noaa_gsod.stations] b
ON
a.stn=b.usaf
AND a.wban=b.wban
WHERE
state="WA"
AND max<1000
AND country='US' )
ORDER BY
max DESC
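The query above does two things inside SQL: it drops readings of 1000 or more (GSOD stores a missing daily maximum as 9999.9) and converts Fahrenheit to Celsius with `(max-32)*5/9`. A minimal sketch of the same logic in plain Python, handy for sanity-checking a few values by hand. The station names and readings here are made up for illustration and are not query results.

```python
# Hypothetical (name, max in degrees F) rows standing in for the query result.
readings = [
    ("STATION A", 212.0),
    ("STATION B", 9999.9),  # GSOD placeholder for a missing maximum
    ("STATION C", 32.0),
]

def fahrenheit_to_celsius(temp_f):
    """Same arithmetic as the SQL expression (max-32)*5/9."""
    return (temp_f - 32) * 5 / 9

# Mirror the WHERE max<1000 filter, then ORDER BY max DESC.
valid = sorted(
    ((name, temp_f, fahrenheit_to_celsius(temp_f))
     for name, temp_f in readings
     if temp_f < 1000),
    key=lambda row: row[1],
    reverse=True,
)

for name, temp_f, temp_c in valid:
    print(name, temp_f, round(temp_c, 1))
```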
Skagit Rgnl is in a very cool coastal community, so a maximum this high is an anomaly. Let's look at two other nearby stations.
%%sql
SELECT
usaf, name
FROM [bigquery-public-data:noaa_gsod.stations]
WHERE
name="BELLINGHAM INTL" OR name="PADILLA BAY RESERVE" OR name = "SKAGIT RGNL"
Now pull out the weather data for the year for all three stations and plot them.
q = "SELECT max AS temperature, \
TIMESTAMP(STRING(year) + '-' + STRING(mo) + '-' + STRING(da)) AS timestamp \
FROM [bigquery-public-data:noaa_gsod.gsod2015] \
WHERE stn = '%s' and max <500 \
ORDER BY year DESC, mo DESC, da DESC"
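stationlist = ['720272', '727930', '727976']
# Run the query once per station and collect one DataFrame per station.
# `bq` is assumed to be Datalab's BigQuery module, imported elsewhere in the notebook.
dflist = [bq.Query(q % station).to_dataframe() for station in stationlist]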
from pylab import rcParams
rcParams['figure.figsize'] = 20, 5
with plot.style.context('fivethirtyeight'):
    for df in dflist:
        plot.plot(df['timestamp'], df['temperature'], linewidth=2)
    plot.show()
We can clearly see that the Skagit station is not always functioning properly. There is another anomaly in March.
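Spotting these by eye works for three stations, but the same comparison can be roughed out in code. The sketch below flags days on which any station departs from the cross-station median by more than a threshold. The function name `flag_outlier_days`, the 15-degree threshold, and the sample data are all assumptions made for illustration. It expects DataFrames with the `timestamp` and `temperature` columns produced by the query above, with one row per day.

```python
import pandas as pd

def flag_outlier_days(frames, names, threshold=15.0):
    """Return per-station deviations (degrees F) from the daily
    cross-station median, for days where any deviation exceeds threshold."""
    # Build one column per station, aligned on timestamp.
    wide = pd.concat(
        [df.set_index("timestamp")["temperature"].rename(name)
         for df, name in zip(frames, names)],
        axis=1,
    ).sort_index()
    median = wide.median(axis=1)
    deviation = wide.sub(median, axis=0).abs()
    # Keep only the days where at least one station is far from the median.
    return deviation[deviation.gt(threshold).any(axis=1)]

# Made-up readings for three hypothetical stations over four days.
days = pd.date_range("2015-03-01", periods=4, freq="D")
a = pd.DataFrame({"timestamp": days, "temperature": [50.0, 52.0, 95.0, 51.0]})
b = pd.DataFrame({"timestamp": days, "temperature": [49.0, 53.0, 54.0, 50.0]})
c = pd.DataFrame({"timestamp": days, "temperature": [51.0, 51.0, 52.0, 52.0]})

flagged = flag_outlier_days([a, b, c], ["A", "B", "C"])
print(flagged)
```

In this notebook the call would presumably be `flag_outlier_days(dflist, stationlist)`. With only three stations the median is simply the middle reading, so a single misbehaving station stands out while the other two stay close to zero.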