DenseVector

class pyspark.ml.linalg.DenseVector(ar: Union[bytes, numpy.ndarray, Iterable[float]])

A dense vector represented by a value array. Values are stored in a NumPy array, and arithmetic is delegated to that underlying array.

Examples

>>> from pyspark.ml.linalg import Vectors
>>> v = Vectors.dense([1.0, 2.0])
>>> u = Vectors.dense([3.0, 4.0])
>>> v + u
DenseVector([4.0, 6.0])
>>> 2 - v
DenseVector([1.0, 0.0])
>>> v / 2
DenseVector([0.5, 1.0])
>>> v * u
DenseVector([3.0, 8.0])
>>> u / v
DenseVector([3.0, 2.0])
>>> u % 2
DenseVector([1.0, 0.0])
>>> -v
DenseVector([-1.0, -2.0])

Methods

dot(other)

Compute the dot product of two Vectors.

norm(p)

Calculates the norm of a DenseVector.

numNonzeros()

Number of nonzero elements.

squared_distance(other)

Squared distance of two Vectors.

toArray()

Returns the underlying numpy.ndarray.

Attributes

values

Returns the underlying numpy.ndarray.

Methods Documentation

dot(other: Iterable[float]) → numpy.float64

Compute the dot product of two Vectors. The argument may be a NumPy array, a list, a SparseVector, or a SciPy sparse type; a NumPy array argument may be either 1- or 2-dimensional. Equivalent to calling numpy.dot on the two vectors.
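
As a rough illustration of the 1- and 2-dimensional cases, here is a plain-Python sketch of the quantity being computed. It is not PySpark's implementation: the function below is a hypothetical stand-in that uses lists, and it raises ValueError where the documented examples show an AssertionError.

```python
def dot(vector, other):
    """Dot product of a 1-D vector with a 1-D sequence or a 2-D list of rows.

    Illustrative stand-in only; lists replace DenseVector and NumPy arrays.
    """
    other = list(other)
    if len(other) != len(vector):
        # PySpark's examples report this case as an AssertionError.
        raise ValueError("dimension mismatch")
    if other and isinstance(other[0], (list, tuple)):
        # 2-D case: one dot product per column of the matrix.
        n_cols = len(other[0])
        return [
            float(sum(vector[i] * other[i][j] for i in range(len(vector))))
            for j in range(n_cols)
        ]
    return float(sum(a * b for a, b in zip(vector, other)))


dense = [1.0, 2.0]
self_dot = dot(dense, dense)
range_dot = dot(dense, range(1, 3))

# Same layout as np.reshape([1., 2., 3., 4.], (2, 2), order='F').
matrix = [[1.0, 3.0], [2.0, 4.0]]
matrix_dot = dot(dense, matrix)
```

In the 2-dimensional case the first axis of the matrix must match the vector length, and the result has one entry per column.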

Examples

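>>> import array
>>> import numpy as np
>>> from pyspark.ml.linalg import DenseVector, SparseVector
>>> dense = DenseVector(array.array('d', [1., 2.]))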
>>> dense.dot(dense)
5.0
>>> dense.dot(SparseVector(2, [0, 1], [2., 1.]))
4.0
>>> dense.dot(range(1, 3))
5.0
>>> dense.dot(np.array(range(1, 3)))
5.0
>>> dense.dot([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense.dot(np.reshape([1., 2., 3., 4.], (2, 2), order='F'))
array([  5.,  11.])
>>> dense.dot(np.reshape([1., 2., 3.], (3, 1), order='F'))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch

norm(p: NormType) → numpy.float64

Calculates the norm of a DenseVector.
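
For finite p >= 1, the p-norm is the p-th root of the sum of the absolute values raised to the power p. The sketch below shows that formula in plain Python. It is an illustration under that assumption, not PySpark's implementation, and the name `p_norm` is hypothetical.

```python
def p_norm(values, p):
    """p-norm for finite p >= 1: (sum |x|^p)^(1/p). Illustrative only."""
    return float(sum(abs(x) ** p for x in values) ** (1.0 / p))


a = [0, -1, 2, -3]
l2 = p_norm(a, 2)  # square root of 0 + 1 + 4 + 9
l1 = p_norm(a, 1)  # sum of absolute values
```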

Examples

>>> a = DenseVector([0, -1, 2, -3])
>>> a.norm(2)
3.7...
>>> a.norm(1)
6.0

numNonzeros() → int

Number of nonzero elements. This scans all active values and counts the nonzeros.
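
A minimal plain-Python sketch of the count described here, using a list in place of the vector's values. The function name and the sample data are made up for illustration.

```python
def num_nonzeros(values):
    """Count the entries that are not equal to zero. Illustrative only."""
    return sum(1 for x in values if x != 0)


count = num_nonzeros([0.0, -1.0, 2.0, 0.0, -3.0])
```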

squared_distance(other: Iterable[float]) → numpy.float64

Squared distance of two Vectors.
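
The squared distance is the sum of squared elementwise differences, that is, the squared Euclidean distance. The following plain-Python sketch illustrates it under that definition. It is not PySpark's implementation: `densify` is a hypothetical helper that expands the (size, indices, values) layout used to construct a SparseVector, and the sketch raises ValueError where the documented examples show an AssertionError.

```python
def squared_distance(left, right):
    """Sum of squared elementwise differences. Illustrative only."""
    left, right = list(left), list(right)
    if len(left) != len(right):
        # PySpark's examples report this case as an AssertionError.
        raise ValueError("dimension mismatch")
    return float(sum((a - b) ** 2 for a, b in zip(left, right)))


def densify(size, indices, values):
    """Expand a (size, indices, values) sparse layout into a dense list."""
    out = [0.0] * size
    for i, x in zip(indices, values):
        out[i] = x
    return out


dense1 = [1.0, 2.0]
to_self = squared_distance(dense1, dense1)
to_list = squared_distance(dense1, [2.0, 1.0])
to_sparse = squared_distance(dense1, densify(2, [0, 1], [2.0, 1.0]))
```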

Examples

>>> dense1 = DenseVector(array.array('d', [1., 2.]))
>>> dense1.squared_distance(dense1)
0.0
>>> dense2 = np.array([2., 1.])
>>> dense1.squared_distance(dense2)
2.0
>>> dense3 = [2., 1.]
>>> dense1.squared_distance(dense3)
2.0
>>> sparse1 = SparseVector(2, [0, 1], [2., 1.])
>>> dense1.squared_distance(sparse1)
2.0
>>> dense1.squared_distance([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense1.squared_distance(SparseVector(1, [0,], [1.,]))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch

toArray() → numpy.ndarray

Returns the underlying numpy.ndarray.

Attributes Documentation

values

Returns the underlying numpy.ndarray.