smv.iomod package
Submodules
smv.iomod.base module
class smv.iomod.base.AsFile(smvApp)
Bases: smv.iomod.base.SmvIoModule
Mixin to ensure a fileName method.
class smv.iomod.base.SmvInput(smvApp)
Bases: smv.iomod.base.SmvIoModule
Base class for all input modules.
Subclasses need to implement:
- connectionType
- _get_input_data
Users need to implement:
- connectionName
requiresDS()
User-specified list of dependencies.
Override this method to specify the SmvGenericModules needed as inputs.
Returns: a list of dependencies
Return type: list(SmvGenericModule)
class smv.iomod.base.SmvIoModule(smvApp)
Bases: smv.smvgenericmodule.SmvGenericModule
Base class for input and output modules.
Has two subclasses:
- SmvInput: no dependency module, single output data
- SmvOutput: single dependency module, no output data
class smv.iomod.base.SmvOutput(smvApp)
Bases: smv.iomod.base.SmvIoModule
Base class for all output modules.
Subclasses need to implement:
- connectionType
- doRun
Within doRun, assert_single_input should be called.
Users need to implement:
- connectionName
IsSmvOutput = True
class smv.iomod.base.SmvSparkDfOutput(smvApp)
Bases: smv.iomod.base.SmvOutput
SmvOutput that writes out a Spark DataFrame.
smv.iomod.inputs module
class smv.iomod.inputs.SmvJdbcInputTable(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.base.SmvInput, smv.iomod.base.AsTable
Users need to implement:
- connectionName
- tableName
class smv.iomod.inputs.SmvHiveInputTable(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.base.SmvInput, smv.iomod.base.AsTable
Users need to implement:
- connectionName
- tableName
class smv.iomod.inputs.SmvXmlInputFile(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.InputFileWithSchema
Input from a file in XML format.
Users need to implement:
- rowTag: required
- connectionName: required
- fileName: required
- schemaConnectionName: optional
- schemaFileName: optional
- userSchema: optional
class smv.iomod.inputs.SmvCsvInputFile(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.WithSmvSchema, smv.iomod.inputs.WithCsvParser
CSV file input.
Users need to implement:
- connectionName: required
- fileName: required
- schemaConnectionName: optional
- schemaFileName: optional
- userSchema: optional
- csvAttr: optional
- failAtParsingError: optional, default True
- dqm: optional, default SmvDQM()
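As a rough illustration of the list above, a CSV input module might be declared along the following lines. This is a minimal sketch, not SMV's own example: the class name `Employment`, the connection name `my_input_conn`, and the file path are hypothetical, and the fallback base class exists only so the snippet can run without SMV installed.

```python
try:
    from smv.iomod.inputs import SmvCsvInputFile
except ImportError:
    # Stand-in so the sketch runs without SMV installed; NOT the real class.
    class SmvCsvInputFile(object):
        def __init__(self, smvApp=None):
            self.smvApp = smvApp


class Employment(SmvCsvInputFile):
    """Hypothetical CSV input module."""

    def connectionName(self):
        # Required: name of a connection assumed to be defined in the
        # application configuration (hypothetical name).
        return "my_input_conn"

    def fileName(self):
        # Required: the CSV file to read (hypothetical path, assumed to be
        # resolved against the connection).
        return "employment/data.csv"

    def failAtParsingError(self):
        # Optional: the documented default is True. Returning False is
        # assumed here to tolerate malformed rows rather than fail the run.
        return False
```

The optional methods that are not overridden (schemaConnectionName, schemaFileName, userSchema, csvAttr, dqm) keep their defaults.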
class smv.iomod.inputs.SmvMultiCsvInputFiles(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.WithSmvSchema, smv.iomod.inputs.WithCsvParser
Input from multiple CSV files under the same directory.
Users need to implement:
- connectionName: required
- dirName: required
- schemaConnectionName: optional
- schemaFileName: optional
- userSchema: optional
- csvAttr: optional
- failAtParsingError: optional, default True
- dqm: optional, default SmvDQM()
class smv.iomod.inputs.SmvCsvStringInputData(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.WithCsvParser
Input data defined by a schema string and a data string.
Users need to implement:
- schemaStr(): required
- dataStr(): required
- failAtParsingError(): optional
- dqm(): optional
dataStr()
Smv data string.
E.g. “212,2016-10-03;119,2015-01-07”
Returns: data
Return type: str
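A small sketch of a string-defined input may help here. Judging from the example above, rows appear to be separated by `;` and fields by `,`. The schema string format (`name:Type` pairs joined by `;`) is an assumption, as are the class name `SampleData` and the helper `split_data_str`, which is not part of SMV and only illustrates the row and field layout. The fallback base class is a stand-in so the snippet runs without SMV installed.

```python
try:
    from smv.iomod.inputs import SmvCsvStringInputData
except ImportError:
    # Stand-in so the sketch runs without SMV installed; NOT the real class.
    class SmvCsvStringInputData(object):
        def __init__(self, smvApp=None):
            self.smvApp = smvApp


class SampleData(SmvCsvStringInputData):
    """Hypothetical in-line test data module."""

    def schemaStr(self):
        # Assumed schema-string format: "name:Type" pairs separated by ";".
        return "id:Integer;dt:String"

    def dataStr(self):
        # Same example string as in the dataStr documentation.
        return "212,2016-10-03;119,2015-01-07"


def split_data_str(data):
    """Illustrative helper (not part of SMV): split rows on ';' and fields on ','."""
    return [row.split(",") for row in data.split(";")]
```

Applied to the documented example string, `split_data_str` yields two rows of two fields each, matching the two columns declared in the assumed schema string.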
smv.iomod.outputs module
class smv.iomod.outputs.SmvJdbcOutputTable(smvApp)
Bases: smv.iomod.base.SmvSparkDfOutput, smv.iomod.outputs.WithSparkDfWriter, smv.iomod.base.AsTable
Users need to implement:
- requiresDS
- connectionName
- tableName
- writeMode: optional, default “errorifexists”
class smv.iomod.outputs.SmvHiveOutputTable(smvApp)
Bases: smv.iomod.base.SmvSparkDfOutput, smv.iomod.outputs.WithSparkDfWriter, smv.iomod.base.AsTable
Users need to implement:
- requiresDS
- connectionName
- tableName
- writeMode: optional, default “errorifexists”
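The four methods listed above could be filled in roughly as follows. This is a hedged sketch: `EmploymentByState`, `my_hive_conn`, and the table name are hypothetical, and the write mode is assumed to be handed to Spark's DataFrame writer, whose save modes include "append", "overwrite", "ignore", and "errorifexists". The fallback base class is a stand-in so the snippet runs without SMV installed.

```python
try:
    from smv.iomod.outputs import SmvHiveOutputTable
except ImportError:
    # Stand-in so the sketch runs without SMV installed; NOT the real class.
    class SmvHiveOutputTable(object):
        def __init__(self, smvApp=None):
            self.smvApp = smvApp


class EmploymentByState(object):
    """Hypothetical upstream module; a real app would use an SMV module here."""


class EmploymentToHive(SmvHiveOutputTable):
    """Hypothetical Hive output module."""

    def requiresDS(self):
        # An output module takes a single dependency: the module to publish.
        return [EmploymentByState]

    def connectionName(self):
        # Name of a connection assumed to be defined in the app configuration.
        return "my_hive_conn"

    def tableName(self):
        # Target table (hypothetical name).
        return "employment_by_state"

    def writeMode(self):
        # Optional: replaces the documented default "errorifexists".
        return "overwrite"
```

SmvJdbcOutputTable lists the same four methods, so a JDBC output would presumably follow the same shape with a JDBC connection name.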
class smv.iomod.outputs.SmvCsvOutputFile(smvApp)
Bases: smv.iomod.base.SmvSparkDfOutput, smv.iomod.base.AsFile
Users need to implement:
- requiresDS
- connectionName
- fileName
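For a file-based output, the same pattern applies with fileName in place of tableName. The sketch below is illustrative only: `CleanedOrders`, `my_export_conn`, and the output path are hypothetical names, and the fallback base class is a stand-in so the snippet runs without SMV installed.

```python
try:
    from smv.iomod.outputs import SmvCsvOutputFile
except ImportError:
    # Stand-in so the sketch runs without SMV installed; NOT the real class.
    class SmvCsvOutputFile(object):
        def __init__(self, smvApp=None):
            self.smvApp = smvApp


class CleanedOrders(object):
    """Hypothetical upstream module; a real app would use an SMV module here."""


class CleanedOrdersCsv(SmvCsvOutputFile):
    """Hypothetical CSV output module."""

    def requiresDS(self):
        # Single dependency: the module whose result is written out.
        return [CleanedOrders]

    def connectionName(self):
        # Name of a connection assumed to be defined in the app configuration.
        return "my_export_conn"

    def fileName(self):
        # Output file (hypothetical path, assumed to be resolved against
        # the connection).
        return "exports/cleaned_orders.csv"
```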