Assignment: Pandas Groupby with Hurricane Data

Import NumPy, pandas, and Matplotlib, and set the display options.
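A minimal setup sketch is shown below. The specific option values (10 displayed rows, a 12 by 7 inch default figure) are assumptions chosen for illustration; the assignment does not prescribe them.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumed values: adjust to whatever suits your screen.
pd.set_option("display.max_rows", 10)      # show at most 10 rows of a DataFrame
plt.rcParams["figure.figsize"] = (12, 7)   # default figure size in inches
```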

Use the following code to load a CSV file of the NOAA IBTrACS hurricane dataset:

url = 'https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.ALL.list.v04r00.csv'
df = pd.read_csv(url, parse_dates=['ISO_TIME'], usecols=range(12),
                 skiprows=[1], na_values=[' ', 'NOT_NAMED'],
                 keep_default_na=False, dtype={'NAME': str})
df.head()

Basin Key: (NI - North Indian, SI - South Indian, WP - Western Pacific, SP - Southern Pacific, EP - Eastern Pacific, NA - North Atlantic, SA - South Atlantic)

How many rows does this dataset have?

How many North Atlantic hurricanes are in this dataset?

1) Get the unique values of the BASIN, SUBBASIN, and NATURE columns
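One possible approach, sketched on a made-up table rather than the real dataset: `Series.unique()` returns the distinct values of a column in order of first appearance. The `toy` frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the real table.
toy = pd.DataFrame({
    "BASIN": ["NA", "EP", "NA", "WP"],
    "NATURE": ["TS", "TS", "ET", "TS"],
})

# Print the distinct values of each column of interest.
for col in ["BASIN", "NATURE"]:
    print(col, toy[col].unique())

basins = toy["BASIN"].unique()
```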

2) Rename the WMO_WIND and WMO_PRES columns to WIND and PRES
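As a hedged illustration on a hypothetical one-row frame, `DataFrame.rename` with a `columns` mapping changes only the listed column names and returns a new frame.

```python
import pandas as pd

# Hypothetical values; only the column names mirror the real dataset.
toy = pd.DataFrame({"WMO_WIND": [50.0], "WMO_PRES": [990.0]})

# Map old names to new names; columns not in the mapping are left alone.
toy = toy.rename(columns={"WMO_WIND": "WIND", "WMO_PRES": "PRES"})
```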

3) Get the 10 rows in the dataset with the largest WIND values

You will notice some names are repeated.
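The repetition is easy to reproduce on made-up data. In this sketch (all values hypothetical), `DataFrame.nlargest` ranks individual rows, so a storm with several strong observations appears more than once.

```python
import pandas as pd

# Hypothetical observations: several rows per storm.
toy = pd.DataFrame({
    "NAME": ["ANNA", "ANNA", "BOB", "BOB", "CARL"],
    "WIND": [50.0, 80.0, 120.0, 100.0, 65.0],
})

# The three rows with the highest WIND; BOB shows up twice.
top = toy.nlargest(3, "WIND")
print(top)
```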

4) Group the data on SID and get the 10 strongest hurricanes by maximum WIND
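A sketch of one way to do this, on hypothetical data: reduce each storm to its peak wind with a grouped `max()`, then rank the storms. Grouping first means each SID appears only once in the result.

```python
import pandas as pd

# Hypothetical observations keyed by storm identifier.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "B2", "B2", "C3"],
    "WIND": [50.0, 80.0, 120.0, 100.0, 65.0],
})

# One value per storm (its peak wind), then the two largest of those.
peak = toy.groupby("SID")["WIND"].max().nlargest(2)
print(peak)
```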

5) Make a bar chart of the wind speed of the 20 strongest-wind hurricanes

Use the name on the x-axis.
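One possible way to keep the name alongside the peak wind is a named aggregation, sketched here on hypothetical data. Taking the first NAME per SID assumes a storm keeps a single name, which may not hold for every storm in the real dataset.

```python
import pandas as pd

# Hypothetical observations.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "B2", "B2", "C3"],
    "NAME": ["ANNA", "ANNA", "BOB", "BOB", "CARL"],
    "WIND": [50.0, 80.0, 120.0, 100.0, 65.0],
})

# Per storm: keep the first name seen and the maximum wind.
storms = toy.groupby("SID").agg(NAME=("NAME", "first"), WIND=("WIND", "max"))
strongest = storms.nlargest(2, "WIND")

# Bar chart with the storm name on the x-axis.
ax = strongest.plot.bar(x="NAME", y="WIND", legend=False)
ax.set_ylabel("Peak WIND")
```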

6) Plot the count of all datapoints by Basin as a bar chart

7) Plot the count of unique hurricanes by Basin as a bar chart
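The difference between counting datapoints and counting unique hurricanes can be sketched on hypothetical data: `size()` counts rows per group, while `nunique()` on SID counts distinct storms per group.

```python
import pandas as pd

# Hypothetical observations: two storms in NA, one in EP.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "B2", "B2", "C3"],
    "BASIN": ["NA", "NA", "EP", "EP", "NA"],
})

points_per_basin = toy.groupby("BASIN").size()             # rows per basin
storms_per_basin = toy.groupby("BASIN")["SID"].nunique()   # distinct storms per basin

print(points_per_basin)
print(storms_per_basin)
# Chaining .plot.bar() onto either Series would draw the bar chart.
```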

8) Make a hexbin of the location of datapoints in Latitude and Longitude
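A minimal sketch using random, made-up coordinates: `DataFrame.plot.hexbin` bins points into hexagons and colors each by its count. The `gridsize` and `cmap` values are arbitrary choices, and the column names LAT and LON are assumed to match the loaded dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical random positions, not real storm data.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "LON": rng.uniform(-100, -20, 500),
    "LAT": rng.uniform(5, 50, 500),
})

# Longitude on x, latitude on y; color shows the number of points per hexagon.
ax = toy.plot.hexbin(x="LON", y="LAT", gridsize=20, cmap="viridis")
```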

9) Find Hurricane Katrina (from 2005) and plot its track as a scatter plot

First find the SID of this hurricane.

Next get this hurricane’s group and plot its position as a scatter plot. Use wind speed to color the points.
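The two steps might look like the following sketch on hypothetical data. It assumes names are stored in upper case, that a SEASON column holds the year, and that the wind column has already been renamed to WIND; check each of these against the real table.

```python
import pandas as pd

# Hypothetical records: two different storms share the same name.
toy = pd.DataFrame({
    "SID": ["X1", "X1", "X1", "Y2"],
    "NAME": ["KATRINA", "KATRINA", "KATRINA", "KATRINA"],
    "SEASON": [2005, 2005, 2005, 1999],
    "LAT": [24.0, 26.0, 29.0, 12.0],
    "LON": [-80.0, -86.0, -89.5, -83.0],
    "WIND": [60.0, 120.0, 110.0, 35.0],
})

# Step 1: narrow by name and year to find the storm identifier.
match = toy[(toy["NAME"] == "KATRINA") & (toy["SEASON"] == 2005)]
sid = match["SID"].unique()[0]

# Step 2: pull that storm's group and plot position colored by wind.
track = toy.groupby("SID").get_group(sid)
ax = track.plot.scatter(x="LON", y="LAT", c="WIND", cmap="viridis")
```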

10) Make time the index on your dataframe

11) Plot the count of all datapoints per year as a timeseries

You should use resample.
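As a rough sketch on hypothetical timestamps: once the time column is the index, `resample` groups rows into calendar bins and `size()` counts the rows in each. The `"YS"` (year start) alias is one choice of annual frequency.

```python
import pandas as pd

# Hypothetical observations with a time column named like the real one.
toy = pd.DataFrame({
    "ISO_TIME": pd.to_datetime([
        "2004-08-01", "2004-09-01",
        "2005-08-25", "2005-08-26", "2005-08-27",
    ]),
    "WIND": [40.0, 55.0, 70.0, 120.0, 110.0],
})

# Make time the index, then count rows in each calendar year.
toy = toy.set_index("ISO_TIME")
per_year = toy.resample("YS").size()
print(per_year)
# per_year.plot() would draw the counts as a timeseries.
```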

12) Plot all tracks from the North Atlantic in 2005

You will probably have to iterate through a GroupBy object.
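Iterating over a GroupBy object yields `(key, sub-frame)` pairs, which makes it straightforward to draw one line per storm on shared axes. The sketch below uses hypothetical tracks.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical tracks for two storms.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "A1", "B2", "B2"],
    "LON": [-60.0, -65.0, -70.0, -40.0, -45.0],
    "LAT": [15.0, 18.0, 22.0, 12.0, 14.0],
})

fig, ax = plt.subplots()
# Each iteration gives the group key and the rows belonging to that storm.
for sid, track in toy.groupby("SID"):
    ax.plot(track["LON"], track["LAT"], label=sid)
ax.set_xlabel("LON")
ax.set_ylabel("LAT")
ax.legend()
```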

13) Create a filtered dataframe that contains only data since 1970 from the North Atlantic (“NA”) Basin

Use this for the rest of the assignment.
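A sketch of the filter on hypothetical data, assuming time is already the index and reading "since 1970" as including 1970: combine two boolean masks with `&`, wrapping each comparison in parentheses.

```python
import pandas as pd

# Hypothetical rows indexed by time.
toy = pd.DataFrame(
    {"BASIN": ["NA", "NA", "EP", "NA"]},
    index=pd.to_datetime(["1965-09-01", "1970-08-01", "1980-08-01", "2005-08-25"]),
)
toy.index.name = "ISO_TIME"

# Keep rows that are both in the North Atlantic and from 1970 onward.
na_recent = toy[(toy["BASIN"] == "NA") & (toy.index.year >= 1970)]
print(na_recent)
```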

14) Plot the number of datapoints per day from this filtered dataframe

Make sure your figure is big enough to actually see the plot.
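One way to get daily counts and a wide figure is sketched below on hypothetical timestamps. The `figsize` of 14 by 4 inches is an arbitrary choice; days with no observations appear as zeros.

```python
import pandas as pd

# Hypothetical observations: two on one day, none the next, one after that.
times = pd.to_datetime(["2005-08-25 00:00", "2005-08-25 06:00", "2005-08-27 00:00"])
toy = pd.DataFrame({"WIND": [70.0, 80.0, 110.0]}, index=times)

# Count rows per calendar day.
daily = toy.resample("D").size()

# A wide, short figure suits a long daily timeseries.
ax = daily.plot(figsize=(14, 4))
ax.set_ylabel("datapoints per day")
```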

15) Calculate the climatology of datapoint counts as a function of dayofyear

Plot the mean and standard deviation on a single figure
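A climatology here means a statistic computed across years for each day of the year. The sketch below uses a synthetic daily series (each day's value is simply the year minus 2000) so the result is easy to check by hand; it is not real hurricane data.

```python
import pandas as pd

# Synthetic daily series over three non-leap years: value = year - 2000.
days = pd.date_range("2001-01-01", "2003-12-31", freq="D")
daily = pd.Series((days.year - 2000).to_numpy(), index=days)

# Group by day of year, then take the mean and standard deviation across years.
clim = daily.groupby(daily.index.dayofyear).agg(["mean", "std"])
print(clim.head())
# clim.plot() would draw both columns on a single figure.
```

With values 1, 2, and 3 on every day of the year, each row of `clim` has mean 2 and sample standard deviation 1.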

16) Use transform to calculate the anomaly of daily counts from the climatology

Resample the anomaly timeseries at annual resolution and plot a line with dots as markers.
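Unlike an aggregation, `transform` returns a result with the same index as the input, so the anomaly stays a daily timeseries that can be resampled afterwards. This sketch uses a synthetic series (value = year minus 2000) and takes the annual mean of the anomaly; whether to use the mean or the sum for the annual value is a choice left open here.

```python
import pandas as pd

# Synthetic daily series over three non-leap years: value = year - 2000.
days = pd.date_range("2001-01-01", "2003-12-31", freq="D")
daily = pd.Series((days.year - 2000).to_numpy(), index=days)

# Subtract each day-of-year's mean from every value in that group.
anomaly = daily.groupby(daily.index.dayofyear).transform(lambda x: x - x.mean())

# Annual resolution; "YS" labels each bin by the start of the year.
annual = anomaly.resample("YS").mean()
print(annual)
# annual.plot(marker="o") would draw a line with dots as markers.
```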

Which years stand out as having anomalous hurricane activity?