pyspark.pandas.Series.dot

Series.dot(other: Union[Series, pyspark.pandas.frame.DataFrame]) → Union[int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None, pyspark.pandas.series.Series]

Compute the dot product between the Series and the columns of other.
This method computes the dot product between the Series and another Series, or between the Series and each column of a DataFrame.
It can also be called using self @ other in Python >= 3.5.
Note
This API differs slightly from pandas when the indexes of the two Series are not aligned and the config ‘compute.eager_check’ is False: pandas raises an exception, whereas pandas-on-Spark proceeds, permissively ignoring the mismatched index entries.
>>> pdf1 = pd.Series([1, 2, 3], index=[0, 1, 2])
>>> pdf2 = pd.Series([1, 2, 3], index=[0, 1, 3])
>>> pdf1.dot(pdf2)
...
ValueError: matrices are not aligned
>>> psdf1 = ps.Series([1, 2, 3], index=[0, 1, 2])
>>> psdf2 = ps.Series([1, 2, 3], index=[0, 1, 3])
>>> with ps.option_context("compute.eager_check", False):
...     psdf1.dot(psdf2)
...
5
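A plain-pandas sketch of what "ignoring mismatches" means here: align the two Series on their shared index labels, drop the rest, and take an ordinary dot product. This uses only pandas (no Spark) and is an illustration of the semantics, not pandas-on-Spark's actual implementation.

```python
import pandas as pd

# Same data as the pandas-on-Spark example above.
s1 = pd.Series([1, 2, 3], index=[0, 1, 2])
s2 = pd.Series([1, 2, 3], index=[0, 1, 3])

# Elementwise multiplication aligns on the index union and yields NaN
# for labels present in only one Series (2 and 3 here); dropping those
# leaves the shared labels 0 and 1: 1*1 + 2*2 = 5.
aligned = (s1 * s2).dropna()
print(int(aligned.sum()))  # 5, matching the compute.eager_check=False result
```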
Parameters
    other : Series, DataFrame
        The other object whose columns to compute the dot product with.
Returns
    scalar, Series
        The dot product of the Series and other if other is a Series; if other is a DataFrame, a Series holding the dot product of the Series with each column of other.
Notes
The Series and other must share the same index if other is a Series or a DataFrame.
Examples
>>> s = ps.Series([0, 1, 2, 3])
>>> s.dot(s)
14

>>> s @ s
14
>>> psdf = ps.DataFrame({'x': [0, 1, 2, 3], 'y': [0, -1, -2, -3]})
>>> psdf
   x  y
0  0  0
1  1 -1
2  2 -2
3  3 -3
>>> with ps.option_context("compute.ops_on_diff_frames", True):
...     s.dot(psdf)
...
x    14
y   -14
dtype: int64
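For reference, the same computations in plain pandas (no Spark required); pandas-on-Spark mirrors these results when the indexes are aligned. This is a sketch of the arithmetic, not the distributed implementation.

```python
import pandas as pd

s = pd.Series([0, 1, 2, 3])

# Series . Series is the sum of elementwise products:
# 0*0 + 1*1 + 2*2 + 3*3 = 14. The @ operator is equivalent.
assert s.dot(s) == 14
assert (s @ s) == 14

# Series . DataFrame yields one dot product per column of the DataFrame,
# indexed by the DataFrame's column labels.
pdf = pd.DataFrame({'x': [0, 1, 2, 3], 'y': [0, -1, -2, -3]})
result = s.dot(pdf)
print(result)  # x: 14, y: -14
```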