org.tresamigos.smv.edd

Edd

class Edd extends AnyRef

Implements the edd method of DFHelper.

Provides the summary and histogram methods.

Linear Supertypes

AnyRef, Any

Instance Constructors

new Edd(df: DataFrame, keys: Seq[String] = Seq())

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
val df: DataFrame
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
def histogram(colName: String, colNames: String*): EddResultFunctions

Perform histogram calculation on a given group of column names with default parameters
Default binSize: 100.0
Default sortByFreq: false (so sort by key)
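The defaults above can be illustrated with a small standalone sketch in plain Scala, without Spark or SMV. It is not SMV's implementation: the object `HistogramSketch`, its `histogram` function, the rule that a value belongs to the bin starting at floor(value / binSize) * binSize, and the descending order used when sorting by frequency are all assumptions made for illustration.

```scala
// Hypothetical sketch of numeric binning; not taken from the SMV source.
object HistogramSketch {
  // Returns (bin lower edge, count) pairs.
  // Assumption: a value v falls in the bin with index floor(v / binSize).
  def histogram(
      values: Seq[Double],
      binSize: Double = 100.0,     // default stated in the documentation above
      sortByFreq: Boolean = false  // default stated in the documentation above
  ): Seq[(Double, Int)] = {
    // Group by integer bin index and count the members of each bin.
    val counts: Seq[(Long, Int)] = values
      .groupBy(v => math.floor(v / binSize).toLong)
      .toSeq
      .map { case (idx, vs) => (idx, vs.size) }

    // Sort by key (bin index) by default.
    // Assumption: frequency sort is descending, with ties broken by key.
    val sorted =
      if (sortByFreq) counts.sortBy { case (idx, n) => (-n, idx) }
      else counts.sortBy { case (idx, _) => idx }

    // Convert each bin index back to the lower edge of its bin.
    sorted.map { case (idx, n) => (idx * binSize, n) }
  }
}
```

Under these assumptions, `HistogramSketch.histogram(Seq(5.0, 150.0, 199.0, 250.0))` groups the values into the bins starting at 0.0, 100.0 and 200.0, listed in key order; passing `sortByFreq = true` moves the 100.0 bin, which holds two values, to the front.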
def histogram(histCols: HistColumn*): EddResultFunctions

Perform histogram calculation on a given set of HistColumns
```
scala> import org.tresamigos.smv.edd._
scala> df.histogram(Hist("v", binSize = 1000), Hist("s", sortByFreq = true)).eddShow
```
returns
org.tresamigos.smv.edd.EddResultFunctions
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val keys: Seq[String]
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def nullRate(colNames: String*): EddResultFunctions
def persistBesideData(dataPath: String): Unit
def summary(colNames: String*): EddResultFunctions

Run a group of statistics on every column named in the parameters. The statistics depend on the column type:
NumericType => count, average, standard deviation, min, max
BooleanType => histogram
TimestampType => min, max, year-hist, month-hist, day-of-week hist, hour hist
StringType => count, min of length, max of length, approx distinct count
If the parameter list is empty, the summary will run on all the columns.
```
scala> df.summary().eddShow
```
returns
org.tresamigos.smv.edd.EddResultFunctions
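To make the per-type statistics listed for summary concrete, here is a minimal sketch in plain Scala, without Spark or SMV, of the numeric and string cases. It is not SMV's implementation: `SummarySketch`, `NumericSummary`, `StringSummary` and both functions are hypothetical names. The documentation does not say whether the standard deviation is the population or the sample variant, so the population form is assumed here, and an exact distinct count stands in for the approximate one.

```scala
// Hypothetical sketch of the numeric and string statistics; not taken from the SMV source.
object SummarySketch {
  final case class NumericSummary(count: Long, average: Double, stdDev: Double, min: Double, max: Double)
  final case class StringSummary(count: Long, minLength: Int, maxLength: Int, distinctCount: Long)

  // NumericType => count, average, standard deviation, min, max
  def summarizeNumeric(values: Seq[Double]): Option[NumericSummary] =
    if (values.isEmpty) None
    else {
      val n = values.size
      val mean = values.sum / n
      // Assumption: population standard deviation (divide by n, not n - 1).
      val variance = values.map(v => (v - mean) * (v - mean)).sum / n
      Some(NumericSummary(
        count = n.toLong,
        average = mean,
        stdDev = math.sqrt(variance),
        min = values.reduce((a, b) => math.min(a, b)),
        max = values.reduce((a, b) => math.max(a, b))
      ))
    }

  // StringType => count, min of length, max of length, distinct count
  // Assumption: an exact distinct count replaces the approximate one.
  def summarizeStrings(values: Seq[String]): Option[StringSummary] =
    if (values.isEmpty) None
    else {
      val lengths = values.map(_.length)
      Some(StringSummary(values.size.toLong, lengths.min, lengths.max, values.distinct.size.toLong))
    }
}
```

For example, `SummarySketch.summarizeNumeric(Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0))` yields a count of 8, an average of 5.0, a standard deviation of 2.0, a min of 2.0 and a max of 9.0 under the population-variance assumption.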
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
