Specify an Amount binned histogram. Pre-defined bins for amount-like columns, chosen to best report on log-normal distributed amount fields.
Bins:
Implement the edd method of DFHelper
Provides summary and histogram methods
Implement methods on Edd results
scala> import org.tresamigos.smv.edd._
scala> df.summary().eddShow
scala> df.summary().saveReport("file/path")
scala> val eddResult: DataFrame = df.summary()
With the edd package imported, an EddResultFunctions value can be implicitly converted to a DataFrame.
Define histogram parameters for the specified column
column name as a String
bin size for a numeric column, default 100.0
whether to sort the histogram result by frequency, default false (sort by key)
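The three parameters above can be illustrated with a minimal, Spark-free sketch. `HistSketch`, `binKey`, `histogram`, and the parameter names `binSize` and `sortByFreq` are hypothetical names chosen for this illustration; they are not the edd package's API, and the descending order used for the frequency sort is an assumption.

```scala
// Hypothetical helper, not part of the edd package: bins numeric values and counts them.
object HistSketch {
  // Lower edge of the bin of width binSize that contains v.
  def binKey(v: Double, binSize: Double): Double =
    math.floor(v / binSize) * binSize

  // Returns (bin lower edge, count) pairs.
  // sortByFreq = false: sort by bin key, ascending.
  // sortByFreq = true: sort by count, descending (assumed direction), ties broken by key.
  def histogram(values: Seq[Double],
                binSize: Double = 100.0,
                sortByFreq: Boolean = false): Seq[(Double, Long)] = {
    val counts: Seq[(Double, Long)] = values
      .groupBy(v => binKey(v, binSize))
      .map { case (key, vs) => (key, vs.size.toLong) }
      .toSeq
    if (sortByFreq)
      counts.sortWith((a, b) => a._2 > b._2 || (a._2 == b._2 && a._1 < b._1))
    else
      counts.sortWith((a, b) => a._1 < b._1)
  }
}
```

With the default bin size, the values 5.0, 120.0, 150.0, 199.0 and 250.0 fall into the bins starting at 0.0, 100.0 and 200.0. Sorting by key lists the bins in that order, while sorting by frequency puts the 100.0 bin first.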
Provides Extended Data Dictionary functions for ad hoc data analysis
Depending on the data types of the columns, the Edd summary method computes different statistics.
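The idea of choosing statistics by column type can be sketched in plain Scala as follows. The types `Column`, `NumericColumn`, `StringColumn` and the particular statistics are assumptions made for this illustration only; the statistics the edd package actually reports for each type are not listed here.

```scala
// Hypothetical column model, not the edd package's types.
sealed trait Column
final case class NumericColumn(values: Seq[Double]) extends Column
final case class StringColumn(values: Seq[String]) extends Column

object SummarySketch {
  // Dispatch on the column type and compute statistics that make sense for it.
  def summarize(col: Column): Map[String, Double] = col match {
    case NumericColumn(vs) if vs.nonEmpty =>
      Map(
        "count" -> vs.size.toDouble,
        "min"   -> vs.reduce((a, b) => math.min(a, b)),
        "max"   -> vs.reduce((a, b) => math.max(a, b)),
        "avg"   -> vs.sum / vs.size
      )
    case StringColumn(vs) if vs.nonEmpty =>
      val lengths = vs.map(_.length)
      Map(
        "count"     -> vs.size.toDouble,
        "distinct"  -> vs.distinct.size.toDouble,
        "minLength" -> lengths.min.toDouble,
        "maxLength" -> lengths.max.toDouble
      )
    case _ =>
      // Empty column of either type: only the count is meaningful.
      Map("count" -> 0.0)
  }
}
```

A numeric column gets range and average statistics, while a string column gets cardinality and length statistics, so one entry point serves both kinds of column.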
The histogram method takes a group of HistColumn as parameters. Alternatively, when a group of String is given as the column names, it uses the default HistColumn parameters. Two types of HistColumns are supported.
The eddShow method prints the report to the console. saveReport saves the report as RDD[String], where the strings are JSON strings.