org.tresamigos.smv

SmvApp

class SmvApp extends AnyRef

Driver for SMV applications. Most apps do not need to override this class and should just be launched using the SmvApp object.

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new SmvApp(cmdLineArgs: Seq[String], _spark: Option[SparkSession] = None)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. lazy val allDataSets: Seq[SmvDataSet]

  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. def createDF(schemaStr: String, data: String = null, isPersistValidateResult: Boolean = false): DataFrame

    Creates a DataFrame from a string for temporary use (in a test or the shell). By default, the validation result is not persisted.

    Passing null for data creates an empty DataFrame with the specified schema.

  8. def dependencyGraphDotString(stageNames: Seq[String] = stages): String

    Returns the app-level dependency graph as a DOT string.
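
    To illustrate what a dependency graph "as a dot string" looks like, here is a minimal sketch that renders a module dependency map as Graphviz DOT text. The DependencyDotSketch object, its toDot helper, and the Map input are hypothetical stand-ins for illustration; SMV's actual output (node labels, styling, grouping by stage) may well differ.

```scala
object DependencyDotSketch {
  // Hypothetical input: each module name mapped to the modules it depends on.
  // This is NOT SMV's internal representation, only a stand-in for illustration.
  def toDot(deps: Map[String, Seq[String]]): String = {
    // Emit one edge per (dependency -> dependent) pair, sorted for stable output.
    val edges = for {
      (module, requires) <- deps.toSeq.sortBy(_._1)
      dep <- requires.sorted
    } yield "  \"" + dep + "\" -> \"" + module + "\";"
    (Seq("digraph G {") ++ edges ++ Seq("}")).mkString("\n")
  }
}
```

    Calling DependencyDotSketch.toDot(Map("B" -> Seq("A"))) produces a digraph containing a single edge from "A" to "B", which Graphviz can render directly.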

  9. def dependencyGraphJsonString(stageNames: Seq[String] = stages): String

    Returns the app-level dependency graph as a JSON string.

  10. var dfCache: Map[String, DataFrame]

    Cache of the DataFrame associated with each data set. The DataFrame plan (not the data) is cached in dfCache to ensure that only a single DataFrame exists for a given data set (file/module). Note: the cache is keyed by the "versioned" dataset FQN.
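
    The caching idea described here can be pictured with a small generic sketch. PlanCache, versionedKey, and getOrBuild are hypothetical names, the key format is an assumption, and a mutable map is used for brevity even though dfCache itself is declared as a var holding a Map; the type parameter P stands in for a DataFrame plan.

```scala
import scala.collection.mutable

// Generic stand-in for the caching pattern; P plays the role of a DataFrame plan.
final class PlanCache[P] {
  private val cache = mutable.Map.empty[String, P]

  // Hypothetical key format: SMV's real "versioned" FQN format is not shown in this page.
  def versionedKey(fqn: String, version: String): String = s"${fqn}_$version"

  // Build the plan only on the first request for a given versioned key;
  // later requests for the same key return the cached instance.
  def getOrBuild(fqn: String, version: String)(build: => P): P =
    cache.getOrElseUpdate(versionedKey(fqn, version), build)

  def size: Int = cache.size
}
```

    Because the key includes the version, a changed module version would map to a new entry in this sketch rather than reusing a stale plan.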

  11. val dsm: DataSetMgr

  12. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  13. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  14. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. val genEdd: Boolean

  16. def generateAllGraphJSON(): String

    Zero-parameter wrapper around dependencyGraphJsonString that can be called directly from Python. TODO: remove this once we pass args to dependencyGraphJsonString

  17. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  18. def getFileNamesByType(dirName: String, suffix: String): List[String]

    Lists all the files with the given suffix in the given directory.
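
    A suffix-filtered directory listing of this kind can be sketched with java.io.File. This is only an approximation under stated assumptions: it reads the local filesystem, returns bare file names in sorted order, and returns an empty list for a missing directory. Whether SMV's method behaves the same way on each of these points is not stated here, and FileListingSketch and fileNamesByType are hypothetical names.

```scala
import java.io.File

object FileListingSketch {
  // Lists the names of regular files in dirName whose names end with suffix.
  // Assumption: local filesystem only; a missing or unreadable directory yields Nil.
  def fileNamesByType(dirName: String, suffix: String): List[String] = {
    // listFiles() returns null when dirName is not a readable directory.
    val entries = Option(new File(dirName).listFiles()).getOrElse(Array.empty[File])
    entries
      .filter(f => f.isFile && f.getName.endsWith(suffix))
      .map(_.getName)
      .sorted
      .toList
  }
}
```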

  19. def getMetadataJson(urn: URN): String


    Returns metadata for a given urn

  20. def getRunInfo(ds: SmvDataSet, coll: SmvRunInfoCollector = new SmvRunInfoCollector()): SmvRunInfoCollector


    Returns the run information for a given dataset and all its dependencies (including transitive dependencies), from the last run

  21. def getRunInfo(urn: URN): SmvRunInfoCollector

  22. def getRunInfo(partialName: String): SmvRunInfoCollector

  23. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  24. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  25. lazy val modulesToRun: Seq[SmvDataSet]

    Sequence of SmvModules to run based on the command line arguments. Returns the union of the modules selected by the -a/-m/-s command line flags.

  26. lazy val modulesToRunWithAncestors: Seq[SmvDataSet]


    Sequence of SmvModules to run + all of their ancestors
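
    The relationship between modulesToRun and modulesToRunWithAncestors can be pictured with a small model that uses module names instead of SmvDataSet instances. Everything in it is hypothetical (RunSetSketch, the Deps map, the helper names), and the ordering of the result is an assumption; the sketch only shows "union of selections" followed by "add all transitive dependencies".

```scala
object RunSetSketch {
  // Hypothetical model: module name -> names of its direct dependencies.
  type Deps = Map[String, Seq[String]]

  // Union of the per-flag selections, keeping first-seen order.
  def modulesToRun(selections: Seq[String]*): Seq[String] =
    selections.flatten.distinct

  // The selected modules plus everything they depend on, directly or transitively.
  def withAncestors(selected: Seq[String], deps: Deps): Seq[String] = {
    @annotation.tailrec
    def loop(todo: List[String], seen: Vector[String]): Vector[String] = todo match {
      case Nil                           => seen
      case m :: rest if seen.contains(m) => loop(rest, seen) // already visited
      case m :: rest                     => loop(deps.getOrElse(m, Nil).toList ++ rest, seen :+ m)
    }
    loop(selected.toList, Vector.empty)
  }
}
```

    With deps = Map("C" -> Seq("B"), "B" -> Seq("A")), selecting only "C" brings in "B" and "A" as ancestors, and each module appears once even when it is reachable along several paths.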

  27. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  28. final def notify(): Unit

    Definition Classes
    AnyRef
  29. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  30. def printDeadModules: Boolean

  31. val publishHive: Boolean

  32. val publishJDBC: Boolean

  33. def publishModulesToHive(collector: SmvRunInfoCollector): Boolean

    If the publish-to-Hive flag is set, publishes the modules to Hive.

  34. def publishOutputModulesLocally(collector: SmvRunInfoCollector): Boolean

    If the export-csv option is specified, publishes the output modules locally.

  35. def publishOutputModulesThroughJDBC(collector: SmvRunInfoCollector): Boolean


    Publish through JDBC if the --publish-jdbc flag is set

  36. def registerRepoFactory(factory: DataSetRepoFactory): Unit

  37. def run(): Boolean

    The main entry point into the app. This will parse the command line arguments to determine which modules should be run/graphed/etc.

  38. def runDS(ds: SmvDataSet, forceRun: Boolean, version: Option[String], runConfig: Map[String, String] = Map.empty, collector: SmvRunInfoCollector): DataFrame

    Proceeds with the execution of an SmvDataSet passed from runModule or runModuleByName. TODO: the name of this function should make its distinction from runModule clear (this is an implementation)

  39. def runModule(urn: URN, forceRun: Boolean = false, version: Option[String] = None, runConfig: Map[String, String] = Map.empty, collector: SmvRunInfoCollector = new SmvRunInfoCollector): DataFrame

    Run a module by its fully qualified name in its respective language environment. If forceRun is true, any existing persisted results will be deleted and the module's DataFrame cache will be ignored, forcing the module to run again. If a version is specified, try to read the module from the published data for the given version. If dynamic runtime configuration (runConfig) is specified, run the module with the configuration provided.

  40. def runModuleByName(modName: String, forceRun: Boolean = false, version: Option[String] = None, runConfig: Map[String, String] = Map.empty, collector: SmvRunInfoCollector = new SmvRunInfoCollector): DataFrame

    Run a module based on the end of its name (must be unique). If forceRun is true, any existing persisted results will be deleted and the module's DataFrame cache will be ignored, forcing the module to run again. If a version is specified, try to read the module from the published data for the given version.
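
    The "end of its name (must be unique)" rule can be sketched as a lookup that succeeds only when exactly one known name matches. NameResolutionSketch and resolve are hypothetical, the module names in the usage note are made up, and the plain endsWith check is an assumption; SMV's actual matching rule and error reporting may be stricter or differ in detail.

```scala
object NameResolutionSketch {
  // Resolve a partial name against the known fully qualified names.
  // Assumption: "end of its name" is modeled as a plain String.endsWith check.
  def resolve(partial: String, fqns: Seq[String]): Either[String, String] =
    fqns.filter(_.endsWith(partial)) match {
      case Seq(only) => Right(only) // exactly one match: unambiguous
      case Seq()     => Left(s"no module matches '$partial'")
      case many      => Left(s"'$partial' is ambiguous: ${many.mkString(", ")}")
    }
}
```

    For example, if both com.example.stage1.Employment and com.example.stage2.Employment exist, resolving "Employment" fails as ambiguous in this sketch, while "stage1.Employment" resolves to a single module.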

  41. val sc: SparkContext

  42. val smvConfig: SmvConfig

  43. val sparkConf: SparkConf

  44. val sparkSession: SparkSession

    Note on registering Kryo classes: since none of the SMV classes will be put in an RDD, registering them or not does not make a significant performance difference.

    val allSerializables = SmvReflection.objectsInPackage[Serializable]("org.tresamigos.smv")
    sparkConf.registerKryoClasses(allSerializables.map{_.getClass}.toArray)

  45. val sqlContext: SQLContext

  46. val stages: Seq[String]

  47. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  48. def toString(): String

    Definition Classes
    AnyRef → Any
  49. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  50. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  51. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
