Series¶
Constructor¶
|
pandas-on-Spark Series that corresponds to pandas Series logically. |
Attributes¶
The index (axis labels) Column of the Series. |
|
Return the dtype object of the underlying data. |
|
Return the dtype object of the underlying data. |
|
Return an int representing the number of array dimensions. |
|
Return name of the Series. |
|
Return a tuple of the shape of the underlying data. |
|
Return a list of the row axis labels. |
|
Return an int representing the number of elements in this object. |
|
Returns true if the current object is empty. |
|
Return the transpose, which is by definition self. |
|
Return True if it has any missing values. |
|
Return a Numpy representation of the DataFrame or the Series. |
Conversion¶
|
Cast a pandas-on-Spark object to a specified dtype. |
|
Make a copy of this object’s indices and data. |
Return the bool of a single element in the current object. |
Indexing, iteration¶
Access a single value for a row/column label pair. |
|
Access a single value for a row/column pair by integer position. |
|
Access a group of rows and columns by label(s) or a boolean Series. |
|
Purely integer-location based indexing for selection by position. |
|
Return alias for index. |
|
|
Return item and drop from series. |
This is an alias of |
|
Lazily iterate over (index, value) tuples. |
|
Return the first element of the underlying data as a Python scalar. |
|
|
Return cross-section from the Series. |
|
Get item from object for given key (DataFrame column, Panel slice, etc.). |
Binary operator functions¶
|
Return Addition of series and other, element-wise (binary operator +). |
|
Return Floating division of series and other, element-wise (binary operator /). |
|
Return Multiplication of series and other, element-wise (binary operator *). |
|
Return Reverse Addition of series and other, element-wise (binary operator +). |
|
Return Reverse Floating division of series and other, element-wise (binary operator /). |
|
Return Reverse Multiplication of series and other, element-wise (binary operator *). |
|
Return Reverse Subtraction of series and other, element-wise (binary operator -). |
|
Return Reverse Floating division of series and other, element-wise (binary operator /). |
|
Return Subtraction of series and other, element-wise (binary operator -). |
|
Return Floating division of series and other, element-wise (binary operator /). |
|
Return Exponential power of series and other, element-wise (binary operator **). |
|
Return Reverse Exponential power of series and other, element-wise (binary operator **). |
|
Return Modulo of series and other, element-wise (binary operator %). |
|
Return Reverse Modulo of series and other, element-wise (binary operator %). |
|
Return Integer division of series and other, element-wise (binary operator //). |
|
Return Reverse Integer division of series and other, element-wise (binary operator //). |
|
Return Integer division and modulo of series and other, element-wise (binary operator divmod). |
|
Return Integer division and modulo of series and other, element-wise (binary operator rdivmod). |
|
Combine Series values, choosing the calling Series’s values first. |
|
Compare if the current value is less than the other. |
|
Compare if the current value is greater than the other. |
|
Compare if the current value is less than or equal to the other. |
|
Compare if the current value is greater than or equal to the other. |
|
Compare if the current value is not equal to the other. |
|
Compare if the current value is equal to the other. |
|
Return the product of the values. |
|
Compute the dot product between the Series and the columns of other. |
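The difference between an operator method and its "Reverse" counterpart is only which operand comes first. The sketch below illustrates this with plain pandas so that it runs without a Spark session. It assumes the pandas-on-Spark methods mirror the pandas methods of the same name (`sub`, `rsub`, `rtruediv`), which is worth verifying against your Spark version.

```python
import pandas as pd

# Plain pandas is used here as a stand-in (assumption: pandas-on-Spark
# exposes the same method names with the same element-wise semantics).
s = pd.Series([1.0, 2.0, 4.0])

# Forward form: series - other.
forward = s.sub(10)

# Reverse form: other - series, i.e. the operands are swapped.
reverse = s.rsub(10)

# Reverse floating division: other / series.
reverse_div = s.rtruediv(8)

print(forward.tolist())
print(reverse.tolist())
print(reverse_div.tolist())
```

The reverse forms are what Python falls back to when the Series is the right-hand operand, as in `10 - s`.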
Function application, GroupBy & Window¶
|
Invoke function on values of Series. |
|
Aggregate using one or more operations over the specified axis. |
|
Aggregate using one or more operations over the specified axis. |
|
Call |
|
Map values of Series according to input correspondence. |
|
Group DataFrame or Series using a Series of columns. |
|
Provide rolling transformations. |
|
Provide expanding transformations. |
|
Apply func(self, *args, **kwargs). |
Computations / Descriptive Stats¶
Return a Series/DataFrame with absolute numeric value of each element. |
|
|
Return whether all elements are True. |
|
Return whether any element is True. |
|
Compute the lag-N autocorrelation. |
|
Return boolean Series equivalent to left <= series <= right. |
|
Trim values at input threshold(s). |
|
Compute correlation with other Series, excluding missing values. |
|
Count non-NA cells for each column. |
|
Compute covariance with Series, excluding missing values. |
|
Return cumulative maximum over a DataFrame or Series axis. |
|
Return cumulative minimum over a DataFrame or Series axis. |
|
Return cumulative sum over a DataFrame or Series axis. |
|
Return cumulative product over a DataFrame or Series axis. |
|
Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding |
|
Subset rows or columns of dataframe according to labels in the specified index. |
|
Return unbiased kurtosis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). |
Return the mean absolute deviation of values. |
|
|
Return the maximum of the values. |
|
Return the mean of the values. |
|
Return the minimum of the values. |
|
Return the mode(s) of the dataset. |
|
Return the largest n elements. |
|
Return the smallest n elements. |
|
Percentage change between the current and a prior element. |
|
Return the product of the values. |
|
Return number of unique elements in the object. |
Return boolean if values in the object are unique. |
|
|
Return value at the given quantile. |
|
Compute numerical data ranks (1 through n) along axis. |
|
Return unbiased standard error of the mean over requested axis. |
|
Return unbiased skew normalized by N-1. |
|
Return sample standard deviation. |
|
Return the sum of the values. |
|
Return the median of the values for the requested axis. |
|
Return unbiased variance. |
|
Return unbiased kurtosis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). |
Return unique values of Series object. |
|
|
Return a Series containing counts of unique values. |
|
Round each value in a Series to the given number of decimals. |
|
First discrete difference of element. |
Return boolean if values in the object are monotonically increasing. |
|
Return boolean if values in the object are monotonically increasing. |
|
Return boolean if values in the object are monotonically decreasing. |
Reindexing / Selection / Label manipulation¶
|
Align two objects on their axes with the specified join method. |
|
Return Series with specified index labels removed. |
|
Return Series with requested index level(s) removed. |
|
Return Series with duplicate values removed. |
|
Indicate duplicate Series values. |
|
Compare if the current value is equal to the other. |
|
Prefix labels with string prefix. |
|
Suffix labels with string suffix. |
|
Select first periods of time series data based on a date offset. |
|
Return the first n rows. |
|
Return the row label of the maximum value. |
|
Return the row label of the minimum value. |
|
Check whether values are contained in Series or Index. |
|
Select final periods of time series data based on a date offset. |
|
Alter Series index labels or name. |
|
Set the name of the axis for the index or columns. |
|
Conform Series to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. |
|
Return a Series with matching indices as other object. |
|
Generate a new DataFrame or Series with the index reset. |
|
Return a random sample of items from an axis of object. |
|
Swap levels i and j in a MultiIndex. |
|
Interchange axes and swap values axes appropriately. |
|
Return the elements in the given positional indices along an axis. |
|
Return the last n rows. |
|
Replace values where the condition is False. |
|
Replace values where the condition is True. |
|
Truncate a Series or DataFrame before and after some index value. |
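The two conditional-replacement entries above ("Replace values where the condition is False" and "Replace values where the condition is True") are easy to confuse. A minimal sketch, using plain pandas as a stand-in and assuming the corresponding pandas-on-Spark methods are named `where` and `mask` as in pandas:

```python
import pandas as pd

# Plain pandas stand-in; the method names are assumed to match pandas-on-Spark.
s = pd.Series([1, 2, 3, 4])
cond = s > 2

# where: keep values where cond is True, replace the rest with 0.
kept_high = s.where(cond, 0)

# mask: the inverse, replace values where cond is True with 0.
kept_low = s.mask(cond, 0)

print(kept_high.tolist())
print(kept_low.tolist())
```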
Missing data handling¶
|
Synonym for DataFrame.fillna() or Series.fillna() with |
|
Synonym for DataFrame.fillna() or Series.fillna() with |
Detect missing values. |
|
Detect missing values. |
|
Detect existing (non-missing) values. |
|
Detect existing (non-missing) values. |
|
|
Synonym for DataFrame.fillna() or Series.fillna() with |
|
Return a new Series with missing values removed. |
|
Fill NA/NaN values. |
|
Fill NaN values using an interpolation method. |
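As a rough illustration of how detection, filling, and removal of missing values relate, the following sketch uses plain pandas. The method names (`isna`, `notna`, `fillna`, `dropna`) are the pandas ones; that pandas-on-Spark exposes them identically is an assumption here, and the sample values are made up.

```python
import pandas as pd

# Plain pandas stand-in with one missing value in the middle.
s = pd.Series([1.0, None, 3.0])

missing = s.isna()      # True only where the value is missing
present = s.notna()     # element-wise negation of isna()
filled = s.fillna(0.0)  # same length, missing value replaced
dropped = s.dropna()    # shorter Series, original labels kept

print(missing.tolist())
print(filled.tolist())
print(dropped.index.tolist())
```

Note that `dropna` keeps the original index labels, so the result is no longer labeled consecutively.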
Reshaping, sorting, transposing¶
Return the integer indices that would sort the Series values. |
|
Return int position of the smallest value in the Series. |
|
|
Return int position of the largest value in the Series. |
|
Sort object by labels (along an axis). |
|
Sort by the values. |
|
Unstack, a.k.a. |
Transform each element of a list-like to a row. |
|
|
Repeat elements of a Series. |
|
Squeeze 1 dimensional axis objects into scalars. |
|
Encode the object as an enumerated type or categorical variable. |
Combining / joining / merging¶
|
Concatenate two or more Series. |
|
Compare to another Series and show the differences. |
|
Replace values given in to_replace with value. |
|
Modify Series in place using non-NA values from passed Series. |
Accessors¶
Pandas API on Spark provides dtype-specific methods under various accessors.
These are separate namespaces within Series that only apply to specific data types.
Data Type | Accessor
---|---
Datetime | dt
String | str
Categorical | cat
Date Time Handling¶
Series.dt can be used to access the values of the series as datetimelike and return several properties.
These can be accessed like Series.dt.<property>.
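A short sketch of the `Series.dt.<property>` pattern follows. It uses plain pandas so it can run without Spark, on the assumption that the pandas-on-Spark accessor returns the same values for these properties. The dates are arbitrary sample data.

```python
import pandas as pd

# Plain pandas stand-in for a datetime-typed Series.
s = pd.Series(pd.to_datetime(["2024-01-31", "2024-02-29", "2024-03-15"]))

years = s.dt.year
months = s.dt.month
weekdays = s.dt.dayofweek        # Monday=0, Sunday=6
month_end = s.dt.is_month_end
leap = s.dt.is_leap_year

# Methods live under the same accessor as the properties.
formatted = s.dt.strftime("%Y/%m/%d")

print(weekdays.tolist())
print(month_end.tolist())
print(formatted.tolist())
```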
Datetime Properties¶
Returns a Series of python datetime.date objects (namely, the date part of Timestamps without timezone information). |
|
The year of the datetime. |
|
The month of the timestamp as January = 1, December = 12. |
|
The days of the datetime. |
|
The hours of the datetime. |
|
The minutes of the datetime. |
|
The seconds of the datetime. |
|
The microseconds of the datetime. |
|
The week ordinal of the year. |
|
The week ordinal of the year. |
|
The day of the week with Monday=0, Sunday=6. |
|
The day of the week with Monday=0, Sunday=6. |
|
The ordinal day of the year. |
|
The quarter of the date. |
|
Indicates whether the date is the first day of the month. |
|
Indicates whether the date is the last day of the month. |
|
Indicator for whether the date is the first day of a quarter. |
|
Indicator for whether the date is the last day of a quarter. |
|
Indicate whether the date is the first day of a year. |
|
Indicate whether the date is the last day of the year. |
|
Boolean indicator if the date belongs to a leap year. |
|
The number of days in the month. |
|
The number of days in the month. |
Datetime Methods¶
Convert times to midnight. |
|
|
Convert to a string Series using specified date_format. |
|
Perform round operation on the data to the specified freq. |
|
Perform floor operation on the data to the specified freq. |
|
Perform ceil operation on the data to the specified freq. |
|
Return the month names of the series with specified locale. |
|
Return the day names of the series with specified locale. |
String Handling¶
Series.str can be used to access the values of the series as strings and apply several methods to it.
These can be accessed like Series.str.<function/property>.
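The `Series.str.<function>` pattern can be sketched as below. Plain pandas is used as a stand-in, and the example sticks to methods whose descriptions appear on this page (`upper`, `strip`, `contains`, `len`); identical behavior in pandas-on-Spark is assumed rather than shown.

```python
import pandas as pd

# Plain pandas stand-in; sample strings are arbitrary.
s = pd.Series(["spark", "Pandas API", " series "])

upper = s.str.upper()         # all uppercase
stripped = s.str.strip()      # leading and trailing whitespace removed
has_a = s.str.contains("a")   # pattern test, case-sensitive by default
lengths = s.str.len()         # length of each element

print(upper.tolist())
print(has_a.tolist())
print(lengths.tolist())
```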
Convert strings in the Series to be capitalized. |
|
|
Not supported. |
|
Fill the left and right sides of strings in the Series/Index with an additional character. |
|
Test if pattern or regex is contained within a string of a Series. |
|
Count occurrences of pattern in each string of the Series. |
|
Not supported. |
|
Not supported. |
|
Test if the end of each string element matches a pattern. |
|
Not supported. |
|
Not supported. |
|
Return lowest indexes in each string in the Series where the substring is fully contained between [start:end]. |
|
Find all occurrences of pattern or regular expression in the Series. |
Extract element from each string or string list/tuple in the Series at the specified position. |
|
|
Not supported. |
|
Return lowest indexes in each string where the substring is fully contained between [start:end]. |
Check whether all characters in each string are alphanumeric. |
|
Check whether all characters in each string are alphabetic. |
|
Check whether all characters in each string are digits. |
|
Check whether all characters in each string are whitespace. |
|
Check whether all characters in each string are lowercase. |
|
Check whether all characters in each string are uppercase. |
|
Check whether all characters in each string are titlecase. |
|
Check whether all characters in each string are numeric. |
|
Check whether all characters in each string are decimals. |
|
|
Join lists contained as elements in the Series with passed delimiter. |
Computes the length of each element in the Series. |
|
|
Fill the right side of strings in the Series with an additional character. |
Convert strings in the Series/Index to all lowercase. |
|
|
Remove leading characters. |
|
Determine if each string matches a regular expression. |
|
Return the Unicode normal form for the strings in the Series. |
|
Pad strings in the Series up to width. |
|
Not supported. |
|
Duplicate each string in the Series. |
|
Replace occurrences of pattern/regex in the Series with some other string. |
|
Return highest indexes in each string in the Series where the substring is fully contained between [start:end]. |
|
Return highest indexes in each string where the substring is fully contained between [start:end]. |
|
Fill the left side of strings in the Series with an additional character. |
|
Not supported. |
|
Split strings around given separator/delimiter. |
|
Remove trailing characters. |
|
Slice substrings from each element in the Series. |
|
Slice substrings from each element in the Series. |
|
Split strings around given separator/delimiter. |
|
Test if the start of each string element matches a pattern. |
|
Remove leading and trailing characters. |
Convert strings in the Series/Index to be swapcased. |
|
Convert strings in the Series to be titlecase. |
|
|
Map all characters in the string through the given mapping table. |
Convert strings in the Series/Index to all uppercase. |
|
|
Wrap long strings in the Series to be formatted in paragraphs with length less than a given width. |
|
Pad strings in the Series by prepending ‘0’ characters. |
Categorical accessor¶
Categorical-dtype specific methods and attributes are available under the Series.cat accessor.
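To make the relationship between categories, codes, and ordering concrete, here is a small sketch in plain pandas. It assumes the pandas-on-Spark `cat` accessor follows the pandas names used here (`categories`, `codes`, `ordered`, `reorder_categories`); the labels are invented sample data.

```python
import pandas as pd

# Plain pandas stand-in for a categorical Series.
s = pd.Series(["low", "high", "low", "mid"], dtype="category")

categories = list(s.cat.categories)  # inferred categories, sorted lexically
codes = s.cat.codes.tolist()         # position of each value in categories
ordered = s.cat.ordered              # False unless an order is requested

# Reordering changes the codes but not the values themselves.
reordered = s.cat.reorder_categories(["low", "mid", "high"], ordered=True)

print(categories)
print(codes)
print(reordered.cat.codes.tolist())
```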
The categories of this categorical. |
|
Whether the categories have an ordered relationship. |
|
Return Series of codes as well as the index. |
|
|
Rename categories. |
|
Reorder categories as specified in new_categories. |
|
Add new categories. |
|
Remove the specified categories. |
|
Remove categories which are not used. |
|
Set the categories to the specified new_categories. |
|
Set the Categorical to be ordered. |
|
Set the Categorical to be unordered. |
Plotting¶
Series.plot is both a callable method and a namespace attribute for specific plotting methods of the form Series.plot.<kind>.
alias of |
|
|
Draw a stacked area plot. |
|
Vertical bar plot. |
|
Make a horizontal bar plot. |
|
Make a box plot of the Series columns. |
|
Generate Kernel Density Estimate plot using Gaussian kernels. |
|
Draw one histogram of the DataFrame’s columns. |
|
Plot DataFrame/Series as lines. |
|
Generate a pie plot. |
|
Generate Kernel Density Estimate plot using Gaussian kernels. |
|
Draw one histogram of the DataFrame’s columns. |
Serialization / IO / Conversion¶
Return a pandas Series. |
|
A NumPy ndarray representing the values in this DataFrame or Series. |
|
Return a list of the values. |
|
|
Render a string representation of the Series. |
|
Convert Series to {label -> value} dict or dict-like object. |
|
Copy object to the system clipboard. |
|
Render an object to a LaTeX tabular environment table. |
|
Print Series or DataFrame in Markdown-friendly format. |
|
Convert the object to a JSON string. |
|
Write object to a comma-separated values (csv) file. |
|
Write object to an Excel sheet. |
|
Convert Series to DataFrame. |
Pandas-on-Spark specific¶
Series.pandas_on_spark provides pandas-on-Spark specific features that exist only in pandas API on Spark.
These can be accessed by Series.pandas_on_spark.<function/property>.
Transform the data with the function that takes pandas Series and outputs pandas Series. |
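The function passed to such a transform receives a pandas Series and must return a pandas Series. The sketch below only simulates that contract in plain pandas by feeding a hypothetical function, `plus_one`, two hand-made chunks; the chunk boundaries are an illustration, not how Spark actually partitions data. The commented call at the end shows how the function might be passed to `Series.pandas_on_spark.transform_batch`, and is not executed here.

```python
import pandas as pd

def plus_one(batch):
    """Hypothetical batch function: pandas Series in, pandas Series out."""
    return batch + 1

# Simulate the engine handing the function separate chunks of the data.
data = pd.Series([1, 2, 3, 4, 5])
chunks = [data.iloc[:2], data.iloc[2:]]
result = pd.concat([plus_one(chunk) for chunk in chunks])

print(result.tolist())

# With pandas-on-Spark available, the corresponding call would look like
# this (assumption, not run in this sketch):
#   import pyspark.pandas as ps
#   ps.Series([1, 2, 3, 4, 5]).pandas_on_spark.transform_batch(plus_one)
```

Because the function may see only part of the data at a time, it is safest to keep it element-wise, as here, rather than relying on whole-Series statistics such as a global mean.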