API reference#
This page provides an auto-generated summary of pynsitu’s API. For more details and examples, refer to the relevant chapters in the main part of the documentation.
events#
Module handling helpful classes to keep track of an experimental campaign timeline
- class pynsitu.events.Campaign(file)[source]#
Campaign object, gathers deployments information from a yaml file
Methods
add_legend(ax[, labels, skip, colors]): Add legend for deployment/platforms on an axis.
get_all_deployments(): Loop over all deployments.
load(item[, toframe, ignore]): Load processed data files.
load_path(item): Load processed file path(s).
map([bathy, coastline, rivers]): Plot map. Wrapper around pynsitu.maps.plot_map, see related doc.
map_folium([width, height, tiles, ignore, ...]): Plot overview map with folium.
timeline([platforms, sensors, deployments, ...]): Plot the campaign deployment timeline.
- add_legend(ax, labels=[], skip=None, colors={}, **kwargs)[source]#
Add legend for deployment/platforms on an axis. To be used for timelines (see Campaign.timeline) as well as maps
- Parameters:
- ax: pyplot.axes
- labels: list, optional
List of labels to consider among the campaign's deployments/platforms
- skip: list, optional
List of deployments and platforms to skip
- colors: dict, optional
- **kwargs: passed to legend
- get_all_deployments()[source]#
loops over all deployments, e.g.:
- for label, deployment, platform, sensor, meta in cp.get_all_deployments():
…
- load(item, toframe=False, ignore=False)[source]#
load processed data files
- Parameters:
- item: str
Name of netcdf file
- toframe: boolean
Transform to pd.DataFrame
- ignore: boolean
Ignore non-existent files
- Returns:
- output: xr.Dataset, pd.DataFrame, dict
{‘file0’: ds0, ‘file1’: ds1, …} {‘platform0’: {‘deployment0’: data, …}}
- load_path(item)[source]#
load processed file path(s)
- Parameters:
- item: str
Name of netcdf file
- Returns:
- file_path: str, dict
- map(bathy=None, coastline=None, rivers=None, **kwargs)[source]#
Plot map. Wrapper around pynsitu.maps.plot_map, see related doc
- map_folium(width='60%', height='60%', tiles='Cartodb Positron', ignore=None, bathy=True, overwrite_contours=False, zoom=10)[source]#
Plot overview map with folium
Parameters:#
- width: str, optional
width of the plot
- height: str, optional
height of the plot
- tiles: str, optional
Tiles used, see folium.Map (default is “Cartodb Positron”), e.g.:
“OpenStreetMap”
“Mapbox Bright” (Limited levels of zoom for free tiles)
“Mapbox Control Room” (Limited levels of zoom for free tiles)
“Stamen” (Terrain, Toner, and Watercolor)
“Cloudmade” (Must pass API key)
“Mapbox” (Must pass API key)
“CartoDB” (positron and dark_matter)
- ignore: list, optional
Ignore deployment labels
- bathy: boolean, optional
Turn on/off bathymetric contours plotting
- overwrite_contours: boolean, optional
Overwrite contour file (default is False)
- zoom: int
Folium zoom level, see Folium doc zoom_start kwarg https://python-visualization.github.io/folium/quickstart.html#Getting-Started
- timeline(platforms=True, sensors=True, deployments=True, align_deployments=False, height=0.6, labels=False, ax=None, grid=True, exclude=[], figsize=None)[source]#
Plot the campaign deployment timeline
- Parameters:
- platforms: boolean, optional
Show platforms
- sensors: boolean, optional
Show sensors
- deployments: boolean, optional
Show deployments
- align_deployments: boolean, optional
Align deployments vertically
- height: float, optional
bar heights, 0.6 by default
- ax: pyplot.axes, optional
- grid: boolean, optional
Turn grid on (default is True)
- exclude: list, optional
list of platforms or deployments to exclude
- figsize: tuple, optional
enforce the size of the output figure
- class pynsitu.events.Deployment(label, start=None, end=None, meta=None, loglines=None)[source]#
A deployment describes data collection during a continuous stretch of time and is thus described by:
a label
a start event (see class Event)
an end event (see class Event)
a meta dictionary containing various pieces of information
Methods
plot_on_map(ax[, line, label, label_xyshift, s])Plot deployment on a map
converts to deployments object
- plot_on_map(ax, line=False, label=True, label_xyshift=(0.1, 0.1), s=5, **kwargs)[source]#
Plot deployment on a map
- Parameters:
- ax: matplotlib.pyplot.axes
Axis where to plot the event
- line: boolean, optional
Plot a line between start and end
- label: boolean, optional
Print label (True by default)
- label_xyshift: tuple, optional
Shifts the label in the x and y direction, (.1,.1) by default
- **kwargs: optional
Passed to pyplot plotting methods, if cartopy is used, one should at least pass transform=ccrs.PlateCarree()
- class pynsitu.events.Deployments(*args, **kwargs)[source]#
Deployment dictionary, provides shortcuts to access data in meta subdicts, e.g.:
p = Deployments(meta=dict(a=1))
p["a"] # returns 1
Methods
clear()
get(k[,d])
items()
keys()
pop(k[,d]): If key is not found, d is returned if given, otherwise KeyError is raised.
popitem(): Remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.
setdefault(k[,d])
update([E, ]**F): If E present and has a .keys() method, does: for k in E: D[k] = E[k]. If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v
values()
copy
fromkeys
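The shortcut lookup described for this class can be pictured with a small stand-alone sketch. `MetaShortcutDict` is a hypothetical name, not pynsitu's implementation. It only assumes that a lookup which fails at the top level falls back to the `meta` sub-dictionary.

```python
from collections import UserDict


class MetaShortcutDict(UserDict):
    """Hypothetical stand-in: lookups fall back to the "meta" sub-dict."""

    def __getitem__(self, key):
        # Top-level keys take precedence.
        if key in self.data:
            return self.data[key]
        # Otherwise look inside the "meta" sub-dictionary, if there is one.
        meta = self.data.get("meta", {})
        if key in meta:
            return meta[key]
        raise KeyError(key)


p = MetaShortcutDict(meta=dict(a=1))
print(p["a"])     # 1
print(p["meta"])  # {'a': 1}
```

Building on `collections.UserDict` keeps the regular mapping methods listed above (`get`, `items`, `update`, ...) available without extra code.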
- class pynsitu.events.Event(label=None, logline=None)[source]#
An event is an atom used to describe deployments. It contains four elementary pieces of information:
label, longitude, latitude, time
- class pynsitu.events.Platform(dict=None, /, **kwargs)[source]#
Platform dictionary, provides shortcuts to access data in meta, sensors and deployments subdicts, e.g.:
p = Platform(sensors=dict(a=1), deployments=dict(b=2))
p["a"] # returns 1
Methods
clear()
get(k[,d])
items()
keys()
pop(k[,d]): If key is not found, d is returned if given, otherwise KeyError is raised.
popitem(): Remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.
setdefault(k[,d])
update([E, ]**F): If E present and has a .keys() method, does: for k in E: D[k] = E[k]. If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v
values()
copy
deployments
fromkeys
sensors
maps#
Module for map generation (cartopy, folium)
- pynsitu.maps.get_projection(extent)[source]#
compute a geographical projection from extent which can either be a string (e.g. “global”) or a tuple
- pynsitu.maps.load_bathy(bathy, bounds=None, steps=None, land=False)[source]#
Load bathymetry
- Parameters:
- bathy: str
“etopo1” or filepath to bathymetric file
- bounds: list, tuple, optional
Bounds to be selected (lon_min, lon_max, lat_min, lat_max)
- steps: list, tuple, optional
subsampling steps (di_lon, di_lat)
- Returns:
- ds: xr.Dataset
Dataset containing variables elevation and depth (=-elevation)
- pynsitu.maps.plot_map(da=None, extent='global', projection=None, title=None, fig=None, figsize=None, ax=None, colorbar=True, colorbar_kwargs={}, centered_clims=False, gridlines=True, gridkwargs=None, bathy=None, bathy_levels=None, bathy_fill=False, land=False, coastline='110m', rivers=False, tile=None, **kwargs)[source]#
Plot a geographical map
- Parameters:
- da: xr.DataArray, optional
Scalar field to plot
- extent: str, list/tuple, optional
Geographical extent, “global” or [lon_min, lon_max, lat_min, lat_max]
- projection: cartopy.crs.??, optional
Cartopy projection, e.g.: projection = ccrs.Robinson()
- title: str, optional
Title
- fig: matplotlib.figure.Figure, optional
Figure handle, create one if not passed
- figsize: tuple, optional
Figure size, e.g. (10,5)
- ax: matplotlib.axes.Axes, matplotlib.gridspec.SubplotSpec, optional
Axis handle (needs to be generated with a cartopy projection) or gridspec handle as generated from GridSpec
- colorbar: boolean, optional
add colorbar (default is True)
- colorbar_kwargs: dict, optional
kwargs passed to colorbar
- centered_clims: boolean, optional
Center color limits (default is False)
- gridlines: boolean, optional
Add grid lines (default is True)
- bathy: str, optional
Plot bathymetry (default is None). The path to a bathymetry file needs to be provided (see pynsitu.maps.load_bathy)
- bathy_levels: list/tuple, optional
Levels of bathymetry to plot
- bathy_fill: boolean, optional
Fill bathymetry with colors
- land: boolean, str, optional
Add land
- coastline: str, optional
True, [“10m”, “50m”, “110m”], [“c”, “l”, “i”, “h”, “f”] or path to coast shapefile
- rivers: boolean, optional
- **kwargs:
passed to the plot of the da variable
- pynsitu.maps.store_bathy_contours(bathy, contour_file='contours.geojson', levels=[0, 100, 500, 1000, 2000, 3000], **kwargs)[source]#
!!! Needs reimplementation, see the following link for insight: metno/pyaerocom#952
Store bathymetric contours as a geojson file. The geojson file may be used for folium plots.
geo#
Module with pandas and xarray accessors for the analysis of geographically referenced data
- class pynsitu.geo.PdGeoAccessor(_obj)[source]#
Pandas DataFrame accessor in order to process geographical data
- Attributes:
- projection
projection_reference: define a reference projection if none is available
Methods
apply_xy(fun, **kwargs): apply a function that requires working with projected coordinates x/y
compute_accelerations([from_, names, ...]): compute acceleration from velocities or position
compute_dt([time, fill_startend, inplace]): compute dt
compute_lonlat([x, y]): update longitude and latitude from projected coordinates
compute_transect(ds[, vmin, dt_max]): average data along a transect of step ds
compute_velocities([time, distance, ...]): compute velocity
plot_bokeh([deployments, rule, mindec, ...]): plot time series: longitude, latitude, velocities, acceleration
plot_on_map([rule, coords]): produce a map with the trajectory; requires geoviews
project([overwrite]): add (x,y) projection to object
resample(rule[, interpolate]): temporal resampling, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html
set_projection_reference(ref[, reset]): set projection reference point, (lon, lat) tuple
trim(d): given a deployment item, trim data
- apply_xy(fun, **kwargs)[source]#
apply a function that requires working with projected coordinates x/y
- compute_accelerations(from_=('velocities', 'velocity_east', 'velocity_north'), names=None, centered_velocity=True, time='index', keep_dt=False, fill_startend=True, inplace=False)[source]#
compute acceleration from velocities or position, for a dataframe containing trajectories
- Parameters:
- from_: tuple of str, optional
(key, east_name, north_name). If key = ‘velocities’, compute acceleration from velocities; if key = ‘lonlat’, compute acceleration from lonlat time series; if key = ‘xy’, compute acceleration from xy time series
- names: tuple, optional
Column names for eastern, northern and norm acceleration, (“acceleration_east”, “acceleration_north”, “acceleration”) by default
- centered_velocity: boolean, optional
True if the velocities are centered temporally (True by default)
- time: str, optional
Column name. Default is “index”, i.e. considers the index
- keep_dt: boolean
Keeps time intervals (False by default).
- fill_startend: boolean
fill dataframe start and end (NaN values due to the derivation/centering method) (True by default).
- inplace: boolean
if True add acceleration to dataset, if False return only a dataframe with time, id (for identification) and computed acceleration
- compute_dt(time='index', fill_startend=True, inplace=False)[source]#
compute dt
- Parameters:
- time: str, optional
Column name. Default is “index”, i.e. considers the index
- fill_startend: boolean, optional
fill dataframe start and end (NaN values due to the derivation/centering method) (True by default).
- inplace: boolean, optional
if True add dt to dataset, if False return only a dataframe with time, id (for identification) and computed dt
- compute_transect(ds, vmin=None, dt_max=None)[source]#
Average data along a transect of step ds
- Parameters:
- ds: float
transect spacing in meters
- vmin: float, optional
ship minimum speed, used to compute a maximum search time for each transect cell
- dt_max: pd.Timedelta, optional
maximum search time for each transect cell
- compute_velocities(time='index', distance='geoid', centered=True, keep_dt=False, fill_startend=True, names=None, inplace=False)[source]#
compute velocity
- Parameters:
- time: str, optional
Column name. Default is “index”, i.e. considers the index
- distance: str, optional
Method to compute distances. Default is geoid (“WGS84” with pyproj). Uses projected fields otherwise (“x”, “y”)
- centered: boolean, optional
Centers velocity calculation temporally (True by default).
- keep_dt: boolean, optional
Keeps time intervals (False by default).
- fill_startend: boolean, optional
fill dataframe start and end (NaN values due to the derivation/centering method) (True by default).
- names: tuple, optional
Column names for eastern, northern and norm velocities, (“velocity_east”, “velocity_north”, “velocity”) by default
- inplace: boolean, optional
if True add velocities to dataset, if False return only a dataframe with time, id (for identification) and computed velocities
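The effect of the centered and fill_startend options can be illustrated with a plain-Python sketch on a single projected coordinate. This is not pynsitu's code: the function names are hypothetical, and the filling rule (copying the nearest valid estimate into the undefined end points) is an assumption made for illustration only.

```python
import math


def finite_difference_velocity(t, x, centered=True):
    """Velocity (m/s) from positions x (m) sampled at times t (s)."""
    n = len(t)
    nan = float("nan")
    if centered:
        # Estimate at t[i] from both neighbours; the two end points are undefined.
        inner = [(x[i + 1] - x[i - 1]) / (t[i + 1] - t[i - 1]) for i in range(1, n - 1)]
        return [nan] + inner + [nan]
    # Forward difference; only the last point is undefined.
    return [(x[i + 1] - x[i]) / (t[i + 1] - t[i]) for i in range(n - 1)] + [nan]


def fill_start_end(v):
    """Assumed behaviour: copy the nearest valid estimate into NaN end points."""
    filled = list(v)
    if math.isnan(filled[0]):
        filled[0] = filled[1]
    if math.isnan(filled[-1]):
        filled[-1] = filled[-2]
    return filled


t = [0.0, 10.0, 20.0, 30.0]
x = [0.0, 10.0, 30.0, 60.0]
centered = fill_start_end(finite_difference_velocity(t, x, centered=True))
forward = fill_start_end(finite_difference_velocity(t, x, centered=False))
print(centered)  # [1.5, 1.5, 2.5, 2.5]
print(forward)   # [1.0, 2.0, 3.0, 3.0]
```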
- plot_bokeh(deployments=None, rule=None, mindec=True, velocity=False, acceleration=False)[source]#
Plot time series: longitude, latitude, velocities, acceleration
- Parameters:
- deployments: dict-like, pynsitu.events.Deployments for instance, optional
Deployments
- rule: str, optional
resampling rule
- mindec: boolean
Plot longitude and latitude as minute/decimals
- plot_on_map(rule=None, coords='geo', **kwargs)[source]#
Produce a map with the trajectory. Requires geoviews
- Parameters:
- rule: str, optional
resampling rule
- coords: str, optional
- Controls coordinates:
“xy”: x/y space
“geo”: geographical coordinates (lon/lat)
- **kwargs: passed to hvplot
- resample(rule, interpolate=False, **kwargs)[source]#
temporal resampling https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html
- Parameters:
- rule: DateOffset, Timedelta or str
- Passed to pandas.DataFrame.resample, examples:
‘10T’: 10 minutes
‘10S’: 10 seconds
- inplace: boolean, optional
turn inplace resampling on, default is False
- interpolate: boolean, optional
turn on interpolation for upsampling
- kwargs:
passed to resample
- class pynsitu.geo.XrGeoAccessor(_obj)[source]#
Xarray Dataset accessor in order to process geographical data
- Attributes:
- projection
projection_reference: define a reference projection if none is available
Methods
compute_lonlat([x, y]): update longitude and latitude from projected coordinates
project([overwrite]): add (x,y) projection to object
set_projection_reference(ref[, reset]): set projection reference point, (lon, lat) tuple
- pynsitu.geo.azimuth_distance(lon0, lat0, lon1, lat1, ellps='WGS84')[source]#
compute azimuths and distances between two points
- Returns:
- az12 (deg), az21 (deg), dist (meters)
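To make the three returned quantities concrete, here is a sketch that computes them on a sphere with the standard library only. It is a swapped-in approximation, not the ellipsoidal (WGS84) calculation selected by the `ellps` argument. The function names and the mean Earth radius are assumptions of this example.

```python
import math

EARTH_RADIUS = 6371000.0  # mean Earth radius in meters (spherical assumption)


def initial_bearing(lon0, lat0, lon1, lat1):
    """Bearing in degrees, clockwise from north, when leaving point 0 towards point 1."""
    phi0, phi1 = math.radians(lat0), math.radians(lat1)
    dlon = math.radians(lon1 - lon0)
    y = math.sin(dlon) * math.cos(phi1)
    x = math.cos(phi0) * math.sin(phi1) - math.sin(phi0) * math.cos(phi1) * math.cos(dlon)
    return math.degrees(math.atan2(y, x))


def azimuth_distance_sphere(lon0, lat0, lon1, lat1):
    """Return (az12, az21, dist): forward azimuth, back azimuth, great-circle distance."""
    phi0, phi1 = math.radians(lat0), math.radians(lat1)
    dphi = phi1 - phi0
    dlon = math.radians(lon1 - lon0)
    # Haversine formula for the central angle between the two points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi0) * math.cos(phi1) * math.sin(dlon / 2) ** 2
    dist = 2 * EARTH_RADIUS * math.asin(math.sqrt(a))
    az12 = initial_bearing(lon0, lat0, lon1, lat1)
    az21 = initial_bearing(lon1, lat1, lon0, lat0)
    return az12, az21, dist


# One degree of latitude due north along the Greenwich meridian.
az12, az21, dist = azimuth_distance_sphere(0.0, 0.0, 0.0, 1.0)
print(az12, az21, round(dist))
```

On an ellipsoid the distance differs slightly from this spherical value, so the sketch should not be used where geodetic accuracy matters.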
- pynsitu.geo.compute_accelerations(df, from_=('velocities', 'velocity_east', 'velocity_north'), names=None, centered_velocity=True, time='index', keep_dt=False, fill_startend=True, inplace=False)[source]#
compute acceleration from velocities or position
- Parameters:
- df: dataframe
dataframe containing trajectories
- from_: tuple of str
(key, east_name, north_name). If key = ‘velocities’, compute acceleration from velocities; if key = ‘lonlat’, compute acceleration from lonlat time series; if key = ‘xy’, compute acceleration from xy time series; if key = ‘xy_spectral’, compute from velocities via spectral method; if key = ‘velocities_spectral’, compute from velocities via spectral method
- names: tuple, optional
Column names for eastern, northern and norm acceleration, (“acceleration_east”, “acceleration_north”, “acceleration”) by default
- centered_velocity: boolean
True if the velocities are centered temporally (True by default)
- time: str, optional
Column name. Default is “index”, i.e. considers the index
- keep_dt: boolean
Keeps time intervals (False by default).
- fill_startend: boolean
fill dataframe start and end (NaN values due to the derivation/centering method) (True by default).
- inplace: boolean
if True add acceleration to dataset, if False return only a dataframe with time, id (for identification) and computed acceleration
- pynsitu.geo.compute_dt(df, time, fill_startend=True, inplace=False)[source]#
core method to compute dt from a dataframe
- Parameters:
- df: dataframe
dataframe containing trajectories
- time: str
Column name corresponding to time. Can be “index”, in which case the index is used
- fill_startend: boolean
fill dataframe start and end (NaN values due to the derivation/centering method) (True by default)
- inplace: boolean, optional
if True add dt to dataset, if False return only a dataframe with time, id (for identification) and computed dt.
- pynsitu.geo.compute_velocities(df, time, names, centered, fill_startend, distance, lon_key='lon', lat_key='lat', keep_dt=False, inplace=False)[source]#
core method to compute velocity from a dataframe
- Parameters:
- df: dataframe
dataframe containing trajectories
- lon_key: str
longitude column name in dataframe
- lat_key: str
latitude column name in dataframe
- time: str
Column name corresponding to time. Can be “index”, in which case the index is used
- names: tuple
Column names for eastern, northern and norm velocities, (“velocity_east”, “velocity_north”, “velocity”) by default
- centered: boolean
Centers velocity calculation temporally
- fill_startend: boolean
fill dataframe start and end (NaN values due to the derivation/centering method) (True by default)
- distance: str
- Method to compute distances:
“geoid” is based on geodetic distance and bearing (“WGS84” with pyproj)
“spectral” is a spectral estimation (requires uniform time sampling)
“xy” is from “x” and “y” columns (projected fields)
- keep_dt: boolean, optional
Keeps time intervals (False by default).
- inplace: boolean, optional
if True add velocities to dataset, if False return only a dataframe with time, id (for identification) and computed velocities.
- class pynsitu.geo.projection(lon_ref, lat_ref)[source]#
wrapper around pyproj to easily convert to local cartesian coordinates
Methods
- pynsitu.geo.spectral_diff(x, dt, order, dx0=0.0, time=None)[source]#
Differentiate (order=1, 2) or integrate (order=-1) spectrally a pd.Series presumed uniform
- Parameters:
- x: pd.Series
time series to differentiate/integrate
- dt: array-like
time intervals used to estimate the time step and verify timeline is uniform
- order: int
order of differentiation: 1, 2, -1 (integration)
- dx0: float
initial values for integrations (order=-1)
- time: np.array
array of datetimes
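Spectral differentiation of a uniformly sampled series can be sketched as follows: transform to Fourier space, multiply each coefficient by (i·omega) raised to the order, and transform back. The code below is a plain-Python illustration with a direct O(n²) DFT, not the routine documented here. The function names are hypothetical, and integration (order=-1) and the uniformity check are left out.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(n^2), fine for short series)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n)) for j in range(n)]


def idft(spectrum):
    """Inverse of dft."""
    n = len(spectrum)
    return [sum(spectrum[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]


def spectral_derivative(x, dt, order=1):
    """Differentiate (order 1 or 2) a periodic, uniformly sampled series."""
    if order not in (1, 2):
        raise ValueError("this sketch only handles order 1 or 2")
    n = len(x)
    spectrum = dft(x)
    for j in range(n):
        # Signed wavenumber: 0, 1, 2, ... followed by the negative frequencies.
        m = j if j < (n + 1) // 2 else j - n
        # The Nyquist mode has no well-defined odd derivative; drop it.
        if order % 2 == 1 and n % 2 == 0 and j == n // 2:
            m = 0
        omega = 2 * math.pi * m / (n * dt)
        spectrum[j] *= (1j * omega) ** order
    return [value.real for value in idft(spectrum)]


# One full sine period sampled 16 times with a time step of 0.5.
n, dt = 16, 0.5
period = n * dt
signal = [math.sin(2 * math.pi * k * dt / period) for k in range(n)]
d1 = spectral_derivative(signal, dt, order=1)
exact = [2 * math.pi / period * math.cos(2 * math.pi * k * dt / period) for k in range(n)]
print(max(abs(a - b) for a, b in zip(d1, exact)))  # close to machine precision
```

The method treats the series as periodic, so a record that does not start and end at comparable values will show edge artefacts.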
tseries#
Module with pandas and xarray accessors for time series analysis
- class pynsitu.tseries.TimeSeriesAccessor(obj)[source]#
Pandas DataFrame accessor in order to edit and process timeseries-like data
- Attributes:
delta_time_unit: define a reference time interval if none is available
dt: most likely time increment
time: return time as a series
time_origin: define a reference time if none is available
- variables
Methods
get_dt(): get time intervals as an array
load_harmonics(file[, col]): load harmonics from a file
package_harmonics([col]): package harmonics for storage
resample_centered(freq): centered resampling, i.e. data at t is representative of data within [t-dt/2, t+dt/2].
resample_uniform(rule[, inplace]): resample on a uniform timeline via interpolation; this may be useful for upsampling, for instance
set_time_origin_delta([time_origin, ...]): set time reference variables
set_time_physical([inplace, overwrite]): add physical time to object
spectrum([method, unit, include, ignore, ...]): compute spectra of timeseries
tidal_analysis(col[, library]): compute a tidal analysis on one column
trim(d): given a deployment item, trim data temporally
tidal_plot_harmonics
tidal_predict
- resample_centered(freq)[source]#
centered resampling, i.e. data at t is representative of data within [t-dt/2, t+dt/2]
- Parameters:
- freq: str
Frequency of the resampling, e.g. “1H”
- Returns:
- df_rs: pandas.core.resample.DatetimeIndexResampler
The reduction step still needs to be performed, e.g. df_rs.mean() or df_rs.median()
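The idea of centered bins followed by a separate reduction step can be sketched without pandas. The helper below is hypothetical and works on numeric times. It groups samples into bins labelled t that cover [t-dt/2, t+dt/2) and leaves the reduction to the caller, mirroring the two-step usage described above.

```python
import math
from collections import defaultdict
from statistics import mean


def group_centered(times, values, dt):
    """Group samples into bins labelled t covering [t - dt/2, t + dt/2)."""
    bins = defaultdict(list)
    for t, v in zip(times, values):
        # Shifting by half a bin before flooring centers the bin on its label.
        label = math.floor((t + dt / 2) / dt) * dt
        bins[label].append(v)
    return dict(bins)


# Times in minutes, hourly bins centered on 0, 60, 120, ...
times = [0, 20, 40, 70, 100]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
groups = group_centered(times, values, dt=60)
# Reduction step, chosen by the caller (mean here, median would work as well).
means = {label: mean(vals) for label, vals in groups.items()}
print(means)  # {0: 1.5, 60: 3.5, 120: 5.0}
```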
- resample_uniform(rule, inplace=False, **kwargs)[source]#
resample on a uniform timeline via interpolation; this may be useful for upsampling, for instance
- Parameters:
- rule: str
Sets output frequency, e.g. “1T” for 1 minute or “1H” for 1 hour
- inplace: boolean
Operate in place
- **kwargs: passed to pandas interpolate method
- set_time_physical(inplace=True, overwrite=False)[source]#
add physical time to object
- Parameters:
- inplace: boolean, optional
Add physical time as an additional column, returns the variable otherwise
- overwrite: boolean, optional
Enable overwriting an existing physical time
- spectrum(method='welch', unit=None, include=None, ignore=None, complex=None, fill_limit=None, **kwargs)[source]#
compute spectra of timeseries
- Parameters:
- method: str, optional
Spectral method, e.g. welch, …
- unit: str, pd.Timedelta, optional
time unit to use for frequencies (e.g. “1T”, “1D”)
- include: str, list, optional
variables to compute the spectrum on
- ignore: str, list, optional
list of variables to exclude from the spectral calculation
- complex: tuple, optional
Specify variables for rotary spectral calculations, e.g. complex=(v0, v1) computes the spectrum of v0 + 1j*v1
- fill_limit: int, optional
maximum number of points that can be interpolated
- **kwargs: passed to the spectral method
- tidal_analysis(col, library='pytide', **kwargs)[source]#
compute a tidal analysis on one column
- Parameters:
- col: str
Column to consider
- constituents: list, optional
List of constituents
- library: str, optional
Tidal library to use, e.g. “pytide”, “utide”
- property time#
return time as a series
- class pynsitu.tseries.XrTimeSeriesAccessor(obj)[source]#
Xarray Dataset accessor in order to edit and process timeseries-like data
- Attributes:
Methods
get_dt(): get time intervals as an array
set_time_origin_delta([time_origin, ...]): set time reference variables
set_time_physical([overwrite]): add physical time to object
spectrum([method, unit, include, ignore, ...]): compute spectra of timeseries
trim(d[, inplace]): given a deployment item, trim data temporally
- spectrum(method='welch', unit=None, include=None, ignore=None, complex=None, fill_limit=None, **kwargs)[source]#
compute spectra of timeseries
- Parameters:
- method: str, optional
Spectral method, e.g. welch, …
- unit: str, pd.Timedelta, optional
time unit to use for frequencies (e.g. “1min”, “1d”)
- include: str, list, optional
variables to compute the spectrum on
- ignore: str, list, optional
list of variables to exclude from the spectral calculation
- complex: tuple, optional
Specify variables for rotary spectral calculations, e.g. complex=(v0, v1) computes the spectrum of v0 + 1j*v1
- fill_limit: int, optional
maximum number of points that can be interpolated
- **kwargs: passed to the spectral method
- property time#
return time (may have a different name)
- trim(d, inplace=False)[source]#
given a deployment item, trim data temporally
- Parameters:
- d: pynsitu.events.Deployment
- property variables#
list only time variables
- pynsitu.tseries.compute_spectrum_pd(v, method, dt, **kwargs)[source]#
Compute the spectrum of a pandas time series. Treatment of NaNs is assumed to be carried out beforehand
- Parameters:
- v: ndarray, pd.Series
Time series; the index must be time (and named “time”) if dt is not provided
- method: string
Method that will be employed for spectral calculations. Implemented methods are ‘welch’, ‘periodogram’ (not tested)
- dt: float
Time spacing
**kwargs: passed to the spectral calculation method
- See:
- pynsitu.tseries.compute_spectrum_xr(da, method, dt, time, rechunk=False, **kwargs)[source]#
Compute the spectrum of an xarray time series. Treatment of NaNs is assumed to be carried out beforehand
- Parameters:
- da: ndarray, xr.DataArray
Time series; the index must be time (and named “time”) if dt is not provided
- method: string
Method that will be employed for spectral calculations. Implemented methods are ‘welch’, ‘periodogram’ (not tested)
- dt: float
Time spacing
- time: str
Name of time dimension in xarray object
- rechunk: boolean
Automatically rechunk along time dimension
**kwargs: passed to the spectral calculation method
- See:
- pynsitu.tseries.filter_response(h, dt=0.041666666666666664)[source]#
Returns the frequency response
- Parameters:
- h: np.array
filter kernel/weights
- dt: float, optional
- Returns:
- H: np.array
frequency response function
- w: np.array
frequencies
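For a finite kernel h sampled every dt, the frequency response can be written as H(f) = sum over k of h[k]·exp(-2πi·f·k·dt). The sketch below evaluates that sum directly at caller-supplied frequencies. It is an illustration only: the function name and the explicit `freqs` argument are assumptions, and the documented function chooses its own frequency grid.

```python
import cmath
import math


def frequency_response(h, dt, freqs):
    """Evaluate H(f) = sum_k h[k] exp(-2i pi f k dt) at each frequency f."""
    return [
        sum(hk * cmath.exp(-2j * math.pi * f * k * dt) for k, hk in enumerate(h))
        for f in freqs
    ]


# 24-point running mean applied to hourly data, with dt expressed in days.
h = [1 / 24] * 24
dt = 1 / 24
H = frequency_response(h, dt, freqs=[0.0, 1.0, 2.0])  # frequencies in cycles per day
print([round(abs(value), 6) for value in H])
```

With these inputs the daily running mean passes the mean (f = 0) unchanged and removes the 1 and 2 cycles-per-day harmonics, which is the expected behaviour of a 24-hour boxcar filter.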
- pynsitu.tseries.generate_filter(band, T=10, dt=0.041666666666666664, lat=None, bandwidth=None, normalized_bandwidth=None)[source]#
Wrapper around scipy.signal.firwin
- Parameters:
- band: str, float
Frequency band (e.g. “semidiurnal”, …) or filter central frequency in cpd
- T: float
Filter length in days
- dt: float
Filter/time series time step in days
- lat: float
Latitude (for inertial band)
- bandwidth: float
Filter bandwidth in cpd
- pynsitu.tseries.load_equilibrium_constituents(c=None)[source]#
Load equilibrium tide amplitudes
- Parameters:
- c: str, list
constituent or list of constituents
- Returns:
- amplitude: amplitude of equilibrium tide in m for tidal constituent
- phase: phase of tidal constituent
- omega: angular frequency of constituent in radians
- alpha: load Love number of tidal constituent
- species: spherical harmonic dependence of quadrupole potential
- pynsitu.tseries.pytide_harmonic_analysis(time, eta, constituents=[])[source]#
Distributed harmonic analysis
- Parameters:
- time: np.array, pd.Series
timeline
- constituents: list
- tidal constituents, e.g.:
[“M2”, “S2”, “N2”, “K2”, “K1”, “O1”, “P1”, “Q1”, “S1”, “M4”]
- pynsitu.tseries.pytide_predict_tides(time, har, cplx=False)[source]#
Predict tides based on pytide outputs
v = Re ( conj(amplitude) * dsp.f * np.exp(1j*vu) )
see: https://pangeo-pytide.readthedocs.io/en/latest/pytide.html#pytide.WaveTable.harmonic_analysis
- Parameters:
- time: xr.DataArray
Target time
- har: xr.DataArray, xr.Dataset, optional
Complex amplitudes. Load constituents from a reference station otherwise
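The prediction formula quoted above, v = Re(conj(amplitude) * f * exp(1j*vu)), can be evaluated with the standard library for a handful of constituents at one instant. The function name is hypothetical, and the nodal factor f and phase vu are passed in as plain numbers. In practice they come from the tidal library.

```python
import cmath
import math


def predict_from_harmonics(amplitudes, f, vu):
    """Sum Re(conj(a) * f * exp(1j * vu)) over constituents at a single time."""
    return sum(
        (a.conjugate() * fk * cmath.exp(1j * vuk)).real
        for a, fk, vuk in zip(amplitudes, f, vu)
    )


# One constituent with modulus 2 and phase pi/3, and no nodal modulation (f = 1).
amplitude = cmath.rect(2.0, math.pi / 3)
in_phase = predict_from_harmonics([amplitude], [1.0], [math.pi / 3])
quadrature = predict_from_harmonics([amplitude], [1.0], [math.pi / 3 + math.pi / 2])
print(round(in_phase, 6), round(quadrature, 6))  # 2.0 and approximately 0.0
```

Writing the amplitude as A·exp(iφ) shows that each term equals A·f·cos(vu − φ), which is why the prediction peaks when vu matches the phase of the complex amplitude.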
- pynsitu.tseries.utide_dict2ds_scalar(coef)[source]#
transform utide scalar tidal harmonic output (dict) to xarray dataset
seawater#
Module with pandas and xarray accessors for data containing seawater information
- class pynsitu.seawater.PdSeawaterAccessor(_obj)[source]#
Pandas DataFrame accessor in order to carry and process seawater properties
- The DataFrame requires the following columns:
in situ temperature, accepted names: [“temperature”, “temp”, “t”]
- practical salinity or conductivity, accepted names are:
[“salinity”, “psal”, “s”] [“conductivity”, “cond”, “c”]
- pressure or depth, accepted names:
[“pressure”, “p”] [“depth”]
- Accepted units are:
temperature: degC
practical salinity: PSU
conductivity: mS/cm
pressure: dbar
depth: m
Longitude and latitude are treated differently and may be columns, attributes, or entries in the attrs dictionary
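The column requirements above amount to a lookup of accepted names followed by a completeness check. The sketch below illustrates that logic on a plain list of column names. The helper names are hypothetical, and exact (case-sensitive) matching is an assumption of this example, not a statement about the accessor.

```python
ACCEPTED_NAMES = {
    "temperature": ["temperature", "temp", "t"],
    "salinity": ["salinity", "psal", "s"],
    "conductivity": ["conductivity", "cond", "c"],
    "pressure": ["pressure", "p"],
    "depth": ["depth"],
}


def resolve_columns(columns):
    """Map each seawater variable to the first accepted name found in columns."""
    found = {}
    for variable, names in ACCEPTED_NAMES.items():
        for name in names:
            if name in columns:
                found[variable] = name
                break
    return found


def missing_requirements(found):
    """List the requirements that the resolved columns do not satisfy."""
    missing = []
    if "temperature" not in found:
        missing.append("temperature")
    if "salinity" not in found and "conductivity" not in found:
        missing.append("salinity or conductivity")
    if "pressure" not in found and "depth" not in found:
        missing.append("pressure or depth")
    return missing


found = resolve_columns(["time", "temp", "psal", "p"])
print(found)                        # {'temperature': 'temp', 'salinity': 'psal', 'pressure': 'p'}
print(missing_requirements(found))  # []
```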
Methods
apply_with_eos_update(fun, *args, **kwargs): Apply a function and update eos related variables. This is a helper method.
init(): simply instantiate accessor
plot_bokeh([deployments, rule, width, cross]): Bokeh plot, useful to clean data
resample(rule[, op, interpolate]): Temporal resampling, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html. This is NOT done in place.
reset([extra]): delete core seawater variables for update
set_columns(**kwargs): set accessor column names: t, s, c, p, d, lon, lat and update internal eos variables (SA, PT)
update_eos([inplace]): update eos related variables (e.g. in situ temperature, practical salinity, sigma0) based on SA (absolute salinity) and CT (conservative temperature).
compute_vertical_profile
- apply_with_eos_update(fun, *args, **kwargs)[source]#
Apply a function and update eos related variables. This is a helper method.
- plot_bokeh(deployments=None, rule=None, width=400, cross=True)[source]#
Bokeh plot, useful to clean data
- Parameters:
- deployments: dict-like, pynsitu.events.Deployments for instance, optional
Deployments
- rule: str, optional
resampling rule
- width: int, optional
Plot width in pixels
- cross: boolean, optional
…
- resample(rule, op='mean', interpolate=False, **kwargs)[source]#
Temporal resampling, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html. This is NOT done in place.
- Parameters:
- rule: DateOffset, Timedelta or str
- Passed to pandas.DataFrame.resample, examples:
‘10T’: 10 minutes
‘10S’: 10 seconds
- op: str, optional
operation to perform while resampling (“mean” by default)
- interpolate: boolean, optional
activates interpolation for upsampling (False by default)
- **kwargs:
passed to resample
- set_columns(**kwargs)[source]#
set accessor column names: t, s, c, p, d, lon, lat and update internal eos variables (SA, PT)
- update_eos(inplace=True)[source]#
update eos related variables (e.g. in situ temperature, practical salinity, sigma0) based on SA (absolute salinity) and CT (conservative temperature).
- Parameters:
- inplace: boolean, optional
modify the dataframe in place (True by default); set to False to trigger a copy of the dataframe instead of an in-place modification.
drifters#
Drifter specific data analysis
- pynsitu.drifters.advance_search(nt, time, t, i, delta_plus, delta_minus)[source]#
find the next closest neighbour by searching in positive and negative directions with respect to index i, and update delta_minus and delta_plus
- pynsitu.drifters.despike_all(df, acceleration_threshold, acc_key=None, verbose=False)[source]#
Drops isolated anomalous positions (spikes) in a position time series. Anomalous positions are first detected where acceleration exceeds the provided threshold. Speed and acceleration should have been computed beforehand with the pynsitu.geo.PdGeoAccessor, e.g.: df.geo.compute_velocities(centered=False, acceleration=True)
- Parameters:
- df: pandas.DataFrame
Input dataframe, must contain an acceleration column
- acceleration_threshold: float
Threshold used to detect anomalous values
- acc_key: tuple, optional
Keys/labels/column identifiers for x/y/absolute value of acceleration
- verbose: boolean, optional
Output the number of anomalous values detected. Default is False.
- Returns:
- df: pandas.DataFrame
Output dataframe with spikes removed.
- pynsitu.drifters.despike_isolated(df, acceleration_threshold, acc_key=None, verbose=False)[source]#
Drops isolated anomalous positions (spikes) in a position time series. Anomalous positions are first detected where the acceleration exceeds the provided threshold. Detected values are masked if they coincide with an adequate pattern of acceleration sign reversals, e.g. +-+ or -+-. Speed and acceleration should have been computed beforehand with the pynsitu.geo.GeoAccessor, e.g.: df.geo.compute_velocities(centered=False, acceleration=True)
- Parameters:
- df: `pandas.DataFrame`
Input dataframe, must contain an acceleration column
- acceleration_threshold: float
Threshold used to detect anomalous values
- acc_key: tuple, optional
Keys/labels/column identifiers for x/y/absolute value of acceleration
- verbose: boolean, optional
Output the number of anomalous values detected. Default is False.
- Returns:
- df: pandas.DataFrame
Output dataframe with spikes removed.
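The sign-reversal criterion can be sketched as follows. This is a minimal illustration with NumPy, not the library's implementation: the helper name isolated_spike_mask and the exact rule (a large value whose two neighbours both have the opposite sign) are assumptions.

```python
import numpy as np

def isolated_spike_mask(acc, threshold):
    """Flag samples where |acc| exceeds threshold and both neighbours have
    the opposite sign (+-+ or -+- pattern). Hypothetical helper."""
    acc = np.asarray(acc, dtype=float)
    sign = np.sign(acc)
    mask = np.zeros(acc.size, dtype=bool)
    for i in range(1, acc.size - 1):
        large = abs(acc[i]) > threshold
        reversal = (
            sign[i] != 0
            and sign[i - 1] == -sign[i]
            and sign[i + 1] == -sign[i]
        )
        mask[i] = large and reversal
    return mask

# A +-+ pattern centred on index 2, then a sustained acceleration (indices 6-7)
acc = np.array([0.1, 3.0, -6.0, 3.0, 0.1, 0.2, 3.0, 3.5, 0.2])
mask = isolated_spike_mask(acc, threshold=2.0)
print(np.flatnonzero(mask))
```

Only the centre of the +-+ pattern is flagged here. The sustained acceleration exceeds the threshold but shows no sign reversal, so it is kept.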
- pynsitu.drifters.despike_pm(df, acceleration_threshold, pm=1, acc_key=None, verbose=False)[source]#
Drops anomalous positions (spikes) in a position time series. Anomalous positions are first detected where the acceleration exceeds the provided threshold. The pm points before and after each detected position are also removed. Speed and acceleration should have been computed beforehand with the pynsitu.geo.GeoAccessor, e.g.: df.geo.compute_velocities(centered=False, acceleration=True)
- Parameters:
- df: `pandas.DataFrame`
Input dataframe, must contain an acceleration column
- acceleration_threshold: float
Threshold used to detect anomalous values
- pm: int, optional
Number of points before and after each spike to remove
- acc_key: tuple, optional
Keys/labels/column identifiers for x/y/absolute value of acceleration
- verbose: boolean, optional
Output the number of anomalous values detected. Default is False.
- Returns:
- df: pandas.DataFrame
Output dataframe with spikes removed.
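Removing the pm points on either side of each detection amounts to widening a boolean mask. The sketch below shows one way to do this with NumPy; the helper name dilate_mask and the sample values are assumptions, not taken from pynsitu.

```python
import numpy as np

def dilate_mask(mask, pm=1):
    """Extend each True entry of mask to its pm neighbours on both sides."""
    mask = np.asarray(mask, dtype=bool)
    out = mask.copy()
    for k in range(1, pm + 1):
        out[k:] |= mask[:-k]   # flag the k-th point after each spike
        out[:-k] |= mask[k:]   # flag the k-th point before each spike
    return out

acc = np.array([0.1, 0.2, 5.0, 0.1, 0.3, 0.2, 0.1])
spikes = np.abs(acc) > 1.0              # threshold detection
to_drop = dilate_mask(spikes, pm=1)     # spike plus one point on each side
kept = np.flatnonzero(~to_drop)
print(kept)
```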
- pynsitu.drifters.divide_blocs(df, t_target, maxgap)[source]#
Cut out gaps bigger than maxgap and return blocks
- Parameters:
- df: pandas.DataFrame
Original dataframe
- t_target: pandas.DatetimeIndex
Interpolation times
- maxgap: float
Max gap length in seconds
- Returns:
- DF: list of pandas.DataFrame
Blocks without gaps bigger than maxgap
- DF_target: list of pandas.DatetimeIndex
List of times outside gaps bigger than maxgap
- DF_target_gap: list of pandas.DatetimeIndex
List of times inside gaps bigger than maxgap
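The block splitting can be illustrated on a bare time array. The sketch below is a simplified stand-in written with NumPy: it ignores t_target and the dataframe columns, and the helper name split_at_gaps is hypothetical.

```python
import numpy as np

def split_at_gaps(time, maxgap):
    """Split a sorted time array into blocks separated by gaps > maxgap seconds."""
    time = np.asarray(time, dtype="datetime64[s]")
    dt = np.diff(time) / np.timedelta64(1, "s")   # spacing in seconds
    cuts = np.flatnonzero(dt > maxgap) + 1        # first index of each new block
    return np.split(time, cuts)

base = np.datetime64("2023-01-01T00:00:00")
offsets = np.array([0, 600, 1200, 20000, 20600]).astype("timedelta64[s]")
time = base + offsets

blocks = split_at_gaps(time, maxgap=14400)   # 4 hours
print([len(b) for b in blocks])
```

The 18800 s interval between the third and fourth samples exceeds maxgap, so the series is cut into two blocks there.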
- pynsitu.drifters.find_gap(df_gap, maxgap)[source]#
Find gaps bigger than maxgap in the dataset (returns the start and end time of each gap)
- Parameters:
- df_gap: dataframe
Original dataframe
- maxgap: float
Max gap length in seconds
- pynsitu.drifters.find_nearest_neighboors(time, t, i)[source]#
Find the 3 remaining neighbouring points. i is a starting value (closest point).
- pynsitu.drifters.gap_array(time, t_target)[source]#
Returns the time distance between each time in t_target and its nearest neighbour in time
- Parameters:
- time: ndarray of datetime
- t_target: ndarray of datetime
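A nearest-neighbour time distance can be computed with a sorted search. The following is a sketch under assumptions: the helper name nearest_time_distance is hypothetical, and returning the distance in seconds is a choice made here, since the unit is not stated above.

```python
import numpy as np

def nearest_time_distance(time, t_target):
    """Distance in seconds from each target time to its nearest sample in time."""
    time = np.sort(np.asarray(time, dtype="datetime64[s]"))
    t_target = np.asarray(t_target, dtype="datetime64[s]")
    i = np.searchsorted(time, t_target)              # insertion positions
    left = time[np.clip(i - 1, 0, time.size - 1)]    # sample just before
    right = time[np.clip(i, 0, time.size - 1)]       # sample just after
    d_left = np.abs(t_target - left) / np.timedelta64(1, "s")
    d_right = np.abs(t_target - right) / np.timedelta64(1, "s")
    return np.minimum(d_left, d_right)

base = np.datetime64("2023-01-01T00:00:00")
time = base + np.array([0, 600, 1200, 20000]).astype("timedelta64[s]")
t_target = base + np.array([100, 900, 10000, 25000]).astype("timedelta64[s]")

gaps = nearest_time_distance(time, t_target)
print(gaps)
```

Target times with a large distance lie inside a data gap and are candidates for masking rather than interpolation.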
- pynsitu.drifters.low_pass_(df, T=1, cutoff=11.5, velocities_key=('velocity_east', 'velocity_north', 'velocity'))[source]#
Apply low pass filter on velocity and reintegrate x, y
- Parameters:
- df: dataframe
must contain u, v
- cutoff: float
low pass filter cutoff frequency in cpp
- T: float
Filter length in days
- velocities_key: tuple of str
e.g. (‘velocity_east’, ‘velocity_north’, ‘velocity’) or (‘u’, ‘v’, ‘U’)
- Returns:
dataframe with x, y, u, v
- pynsitu.drifters.lowess(time, x, time_target, nb=4, degree=2)[source]#
perform a lowess interpolation
- Parameters:
- time: np.array
time array, assumed to be sorted in time, should be floats
- x: np.array
positions
- time_target: np.array
target timeline
- nb: int
number of closest neighbours to consider
- degree: int
degree of the polynomial, 2 or 3
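The idea of fitting a low-degree polynomial to the nb closest samples can be sketched as below. This is an unweighted local polynomial fit using numpy.polyfit, a simplification: classical LOWESS also applies distance-based weights, and whether pynsitu does so is not stated here. The helper name local_polyfit is hypothetical.

```python
import numpy as np

def local_polyfit(time, x, time_target, nb=4, degree=2):
    """Evaluate, at each target time, a polynomial fitted to the nb closest samples."""
    time = np.asarray(time, dtype=float)
    x = np.asarray(x, dtype=float)
    out = np.empty(len(time_target))
    for k, t0 in enumerate(time_target):
        idx = np.argsort(np.abs(time - t0))[:nb]      # nb nearest neighbours
        # Centre on t0 so the constant coefficient is the fitted value at t0
        coeffs = np.polyfit(time[idx] - t0, x[idx], degree)
        out[k] = coeffs[-1]
    return out

time = np.arange(0.0, 10.0, 1.0)
x = 1.0 + 2.0 * time + 0.5 * time**2      # exactly quadratic positions
time_target = np.array([2.5, 7.25])

x_target = local_polyfit(time, x, time_target, nb=4, degree=2)
print(x_target)
```

Because the synthetic positions are exactly quadratic, a degree-2 fit reproduces them at the target times, which makes the example easy to check.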
- pynsitu.drifters.lowess_smooth(df, t_target, degree=2, iteration=3, nb=4, T_low_pass=None, cutoff_low_pass=None, velocities_key=('velocity_east', 'velocity_north', 'velocity'), accelerations_key=('acceleration_east', 'acceleration_north', 'acceleration'), import_columns=None, spectral_diff=False, geo=False)[source]#
Perform a lowess interpolation with an optional a posteriori low pass filter
- Parameters:
- df: dataframe, must contain x, y
- t_target: `pandas.core.indexes.datetimes.DatetimeIndex` or str
Output time series, as typically given by pd.date_range, or the time step of the output time series as a str. In the latter case, t_target is recomputed from the start and end of the input trajectory and the given time step.
- degree: int
degree of the polynomial for the lowess method, 2 or 3
- iteration: int
number of times LOWESS is applied (interpolation on t_target at the last iteration)
- nb: int
number of closest neighbours to consider
- T_low_pass: float
Filter length in days; if None (default), no filter is applied
- cutoff_low_pass: float
low pass filter cutoff frequency in cpp
- velocities_key: tuple of str
e.g. (‘velocity_east’, ‘velocity_north’, ‘velocity’) or (‘u’, ‘v’, ‘U’)
- accelerations_key: tuple of str
e.g. (‘acceleration_east’, ‘acceleration_north’, ‘acceleration’) or (‘ax’, ‘ay’, ‘Axy’) or (‘au’, ‘av’, ‘Auv’)
- import_columns: list of str
list of constant columns of df to import (e.g. id, platform)
- spectral_diff: boolean
if True, use spectral differentiation instead of central differentiation
- geo: boolean, optional
set if df is a geo object with a projection
- Returns:
dataframe with x, y, u, v, ax-ay computed from xy, au-av computed from u-v, and ae-an computed via lowess if degree = 3, plus norms, id, platform
- pynsitu.drifters.nan_in_gap(df, df_gap, dtmax, inplace=False)[source]#
Fill gaps bigger than dtmax with nan
- Parameters:
- df: dataframe
dataframe in which gaps are filled with NaN
- df_gap: dataframe
original dataframe
- dtmax: float
max gap length in seconds
- inplace: boolean
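The masking step can be sketched with pandas as follows. The helper name nan_in_gaps, its time_obs argument (the original observation times, standing in for df_gap) and the data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def nan_in_gaps(df, time_obs, dtmax, inplace=False):
    """Set rows of df to NaN when they fall inside an observation gap > dtmax seconds."""
    out = df if inplace else df.copy()
    time_obs = pd.DatetimeIndex(time_obs).sort_values()
    dt = np.diff(time_obs.values) / np.timedelta64(1, "s")
    for i in np.flatnonzero(dt > dtmax):
        in_gap = (out.index > time_obs[i]) & (out.index < time_obs[i + 1])
        out.loc[in_gap, :] = np.nan
    return out

# Original observations, with a gap of almost 6 hours
time_obs = pd.DatetimeIndex(
    ["2023-01-01 00:00", "2023-01-01 00:10", "2023-01-01 06:00", "2023-01-01 06:10"]
)
# Interpolated series on an hourly grid
index = pd.date_range("2023-01-01 00:00", periods=7, freq="60min")
df = pd.DataFrame({"x": np.arange(7.0)}, index=index)

masked = nan_in_gaps(df, time_obs, dtmax=14400)
print(masked["x"].isna().sum())
```

With inplace=False the input dataframe is left untouched and a masked copy is returned.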
- pynsitu.drifters.posteriori_low_pass_uv(df, T=20, cutoff=4, velocities_key=('velocity_east', 'velocity_north', 'velocity'), accelerations_key=('acceleration_east', 'acceleration_north', 'acceleration'), import_columns=['id'])[source]#
Apply low pass filter to a smoothed trajectory a posteriori
- Parameters:
- df: dataframe, must contain x, y
- T: float
Filter length in days
- cutoff: float
low pass filter cutoff frequency in cpp
- import_columns: list of str
list of constant columns of df to import (e.g. id, platform)
- Returns:
dataframe with x, y, u, v, ax-ay computed from xy, au-av computed from u-v, accelerations, plus norms, id, platform
- pynsitu.drifters.posteriori_low_pass_xy(df, T=1, cutoff=13, velocities_key=('velocity_east', 'velocity_north', 'velocity'), accelerations_key=('acceleration_east', 'acceleration_north', 'acceleration'), import_columns=['id'])[source]#
Apply low pass filter to a smoothed trajectory a posteriori
- Parameters:
- df: dataframe, must contain x, y
- T: float
Filter length in days
- cutoff: float
low pass filter cutoff frequency in cpp
- import_columns: list of str
list of constant columns of df to import (e.g. id, platform)
- velocities_key: tuple of str
e.g. (‘velocity_east’, ‘velocity_north’, ‘velocity’) or (‘u’, ‘v’, ‘U’)
- accelerations_key: tuple of str
e.g. (‘acceleration_east’, ‘acceleration_north’, ‘acceleration’) or (‘ax’, ‘ay’, ‘Axy’) or (‘au’, ‘av’, ‘Auv’)
- Returns:
dataframe with x, y, u, v, ax-ay computed from xy, au-av computed from u-v, accelerations, plus norms, id, platform
- pynsitu.drifters.smooth(df, method, t_target, maxgap=14400, parameters={}, velocities_key=('velocity_east', 'velocity_north', 'velocity'), accelerations_key=('acceleration_east', 'acceleration_north', 'acceleration'), import_columns=['id'], spectral_diff=False, geo=True)[source]#
Smooth and interpolate a trajectory
- Parameters:
- df: dataframe
raw trajectory, must contain ‘time’, ‘x’, ‘y’, ‘u’, ‘v’, ‘dt’
- method: str
smoothing method among: ‘spydell’, ‘variational’ or ‘lowess’
- t_target: pandas.core.indexes.datetimes.DatetimeIndex or str
Output time series, as typically given by pd.date_range, or the time step of the output time series as a str. In the latter case, t_target is recomputed from the start and end of the input trajectory and the given time step.
- maxgap: float
max gap tolerated, in seconds
- parameters: dict
parameters passed to the selected method:
variational: dict(acc_cut=, position_error=, acceleration_amplitude=, acceleration_T=, time_chunk=)
lowess: dict(degree=)
spydell: dict(acc_cut=, nb_pt_mean=)
- velocities_key: tuple of str
e.g. (‘velocity_east’, ‘velocity_north’, ‘velocity’) or (‘u’, ‘v’, ‘U’)
- accelerations_key: tuple of str
e.g. (‘acceleration_east’, ‘acceleration_north’, ‘acceleration’) or (‘ax’, ‘ay’, ‘Axy’) or (‘au’, ‘av’, ‘Auv’)
- import_columns: list of str
list of constant columns of df to import (e.g. id, platform)
- geo: boolean, optional
set if df is a geo object with a projection
- acc: boolean, optional
compute acceleration
- Returns:
interpolated dataframe, indexed by time, with x, y, u, v, ax-ay computed from xy, au-av computed from u-v, plus norms, id, platform
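The two accepted forms of t_target and the layout of the parameters dict can be illustrated with pandas alone. In the sketch below, build_t_target is a hypothetical helper, and spanning the first to the last observation in the string case is an assumption about how the start and end are used.

```python
import pandas as pd

def build_t_target(time, t_target):
    """Return a DatetimeIndex from a ready-made index or from a time step string."""
    if isinstance(t_target, str):
        # Assumption: regular grid from the first to the last observation
        return pd.date_range(time.min(), time.max(), freq=t_target)
    return pd.DatetimeIndex(t_target)

# Irregular observation times (synthetic)
time = pd.DatetimeIndex(
    ["2023-01-01 00:07", "2023-01-01 01:15", "2023-01-01 02:41"]
)
t_target = build_t_target(time, "30min")

# Method-specific parameters, following the dict layout listed above
parameters = dict(degree=2)

# With pynsitu installed and df a raw trajectory dataframe, the call would
# presumably read:
# df_smooth = pynsitu.drifters.smooth(df, "lowess", t_target, parameters=parameters)
print(t_target[0], t_target[-1], len(t_target))
```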
- pynsitu.drifters.smooth_all(df, method, t_target, maxgap=14400, parameters={}, velocities_key=('velocity_east', 'velocity_north', 'velocity'), accelerations_key=('acceleration_east', 'acceleration_north', 'acceleration'), import_columns=['id'], spectral_diff=True, geo=True)[source]#
Smooth and interpolate all trajectories
- Parameters:
- df: dataframe
raw trajectories, must contain ‘time’, ‘velocity_east’, ‘velocity_north’
- method: str
smoothing method among: ‘spydell’, ‘variational’ or ‘lowess’
- t_target: pandas.core.indexes.datetimes.DatetimeIndex or str
Output time series, as typically given by pd.date_range, or the time step of the output time series as a str. In the latter case, t_target is recomputed from the start and end of the input trajectory and the given time step.
- maxgap: float
max gap tolerated, in seconds
- parameters: dict
parameters passed to the selected method:
variational: dict(acc_cut=, position_error=, acceleration_amplitude=, acceleration_T=, time_chunk=)
lowess: dict(degree=)
spydell: dict(acc_cut=, nb_pt_mean=)
- velocities_key: tuple of str
e.g. (‘velocity_east’, ‘velocity_north’, ‘velocity’) or (‘u’, ‘v’, ‘U’)
- accelerations_key: tuple of str
e.g. (‘acceleration_east’, ‘acceleration_north’, ‘acceleration’) or (‘ax’, ‘ay’, ‘Axy’) or (‘au’, ‘av’, ‘Auv’)
- import_columns: list of str
list of constant columns of df to import (e.g. id, platform)
- geo: boolean, optional
set if df is a geo object with a projection
- acc: boolean, optional
compute acceleration
- Returns:
interpolated dataframe, indexed by time, with x, y, u, v, ax-ay computed from xy, au-av computed from u-v, plus norms, id, platform
- pynsitu.drifters.spydell_smooth(df, t_target, acc_cut=0.001, nb_pt_mean=5, velocities_key=('velocity_east', 'velocity_north', 'velocity'), accelerations_key=('acceleration_east', 'acceleration_north', 'acceleration'), import_columns=['id'], spectral_diff=True, geo=True)[source]#
Smooth and interpolate a trajectory with the method described in Spydell et al. 2021.
- pynsitu.drifters.time_window_processing(df, myfun, T, overlap=0.5, id_label='id', dt=None, limit=None, geo=None, xy=None, **myfun_kwargs)[source]#
Break each drifter time series into time windows and process each window.
myfun signature must be myfun(df, **kwargs) and it must return a pandas Series. Drop duplicates if a date column is present.
- Parameters:
- df: Dataframe
This dataframe represents a drifter time series
- myfun: callable
Method that will be applied to each window
- T: float, pd.Timedelta
Length of the time windows, must have the same dtype and units as the “time” column
- overlap: float
Amount of overlap between temporal windows. Should be between 0 and 1. Default is 0.5
- id_label: str, optional
Label used to identify drifters
- dt: float, str
Conform time series to some time step, if string must conform to rule option of pandas resample method
- geo: boolean
Turns on geographic processing of spatial coordinates
- xy: tuple
specify x, y spatial coordinates if not geographic
- **myfun_kwargs
Keyword arguments for myfun
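The windowing logic can be sketched for a single drifter as below. This is not the library's implementation: process_windows and mean_speed are hypothetical names, time is taken as a float column, and the handling of id_label, dt, geo and xy is left out.

```python
import numpy as np
import pandas as pd

def process_windows(df, myfun, T, overlap=0.5, **myfun_kwargs):
    """Apply myfun to overlapping windows of length T; one output row per window."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must satisfy 0 <= overlap < 1")
    step = T * (1 - overlap)
    t, t_end = df["time"].min(), df["time"].max()
    rows = {}
    while t + T <= t_end:
        window = df[(df["time"] >= t) & (df["time"] < t + T)]
        rows[t + T / 2] = myfun(window, **myfun_kwargs)   # keyed by window centre
        t += step
    return pd.DataFrame(rows).T

def mean_speed(df, column="u"):
    # myfun must accept a dataframe and return a pandas Series
    return pd.Series({"mean": df[column].mean(), "count": len(df)})

df = pd.DataFrame({"time": np.arange(0, 10, 0.5), "u": np.arange(20.0)})
out = process_windows(df, mean_speed, T=4.0, overlap=0.5, column="u")
print(out)
```

With T=4 and overlap=0.5, consecutive windows start 2 time units apart, so each sample contributes to up to two windows.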
- pynsitu.drifters.variational_smooth(df, t_target, acc_cut, position_error, acceleration_amplitude, acceleration_T, time_chunk=2, velocities_key=('velocity_east', 'velocity_north', 'velocity'), accelerations_key=('acceleration_east', 'acceleration_north', 'acceleration'), import_columns=['id'], spectral_diff=True, geo=True)[source]#
Smooth and resample a drifter position time series. The smoothing balances position information, weighted by the specified position error, against the smoothness of the output time series, which is controlled by a typical acceleration amplitude and decorrelation timescale (assuming exponential decorrelation). The output trajectory x minimizes:
|| I(x) - x_obs ||^2 / e_x^2 + (D2 x)^T R^{-1} (D2 x)
where e_x is the position error, I the time interpolation operator, R the acceleration autocorrelation, D2 the second order derivative.
Closest reference (but no temporal autocorrelation of acceleration considered): Yaremchuk and Coelho 2015. Filtering Drifter Trajectories Sampled at Submesoscale Resolution. IEEE Journal of Oceanic Engineering
- Parameters:
- df: `pandas.DataFrame`
Input drifter time series, must contain projected positions (x and y)
- t_target: pandas.core.indexes.datetimes.DatetimeIndex
Output time series, as typically given by pd.date_range. Note that the problem seems ill-posed in the downsampling case … needs to be fixed
- acc_cut: float
acceleration spike cut
- position_error: float
Position error in meters
- acceleration_amplitude: float
Acceleration typical amplitude
- acceleration_T: float
Acceleration decorrelation timescale in seconds
- time_chunk: int/float, optional
Maximum time chunk (in days) to process at once. Data is processed by chunks and patched together.
- velocities_key: tuple of str
e.g. (‘velocity_east’, ‘velocity_north’, ‘velocity’) or (‘u’, ‘v’, ‘U’)
- accelerations_key: tuple of str
e.g. (‘acceleration_east’, ‘acceleration_north’, ‘acceleration’) or (‘ax’, ‘ay’, ‘Axy’) or (‘au’, ‘av’, ‘Auv’)
- spectral_diff: boolean
whether velocities and accelerations are computed with spectral differentiation
- import_columns: list of str
list of constant columns of df to import (e.g. id, platform)
- geo: boolean, optional
set if df is a geo object with a projection
- Returns:
interpolated dataframe, indexed by time, with x, y, u, v, ax-ay computed from xy, au-av computed from u-v, plus norms and import_columns
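The minimization stated above can be illustrated in one dimension under two simplifying assumptions that are not those of the library: the output grid equals the observation grid (I is the identity), and accelerations are uncorrelated in time (R = a^2 times the identity, with a the acceleration amplitude). The cost is then quadratic and its minimizer solves a linear system. The function name variational_smooth_1d is hypothetical.

```python
import numpy as np

def variational_smooth_1d(x_obs, dt, position_error, acceleration_amplitude):
    """Minimize ||x - x_obs||^2 / e_x^2 + ||D2 x||^2 / a^2 for a 1-D position series."""
    x_obs = np.asarray(x_obs, dtype=float)
    n = x_obs.size
    # Second-order finite-difference operator D2, shape (n - 2, n)
    D2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        D2[i, i:i + 3] = np.array([1.0, -2.0, 1.0]) / dt**2
    # Normal equations of the quadratic cost function
    A = np.eye(n) / position_error**2 + D2.T @ D2 / acceleration_amplitude**2
    b = x_obs / position_error**2
    return np.linalg.solve(A, b)

def roughness(x):
    """Sum of squared second differences, a simple smoothness measure."""
    return float(np.sum(np.diff(x, 2) ** 2))

rng = np.random.default_rng(0)
dt = 600.0                                  # 10-minute sampling, in seconds
t = np.arange(50) * dt
x_true = 0.2 * t                            # steady 0.2 m/s drift
x_obs = x_true + rng.normal(scale=10.0, size=t.size)

x_smooth = variational_smooth_1d(
    x_obs, dt, position_error=10.0, acceleration_amplitude=1e-5
)
print(roughness(x_obs), roughness(x_smooth))
```

A smaller acceleration amplitude penalizes curvature more strongly, while a smaller position error pulls the solution closer to the observations.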