smv.iomod package
Submodules
smv.iomod.base module
class smv.iomod.base.AsFile(smvApp)
Bases: smv.iomod.base.SmvIoModule
Mixin to ensure a fileName method.
class smv.iomod.base.SmvInput(smvApp)
Bases: smv.iomod.base.SmvIoModule
Base class for all input modules.
Subclasses need to implement:
- connectionType
- _get_input_data
Users need to implement:
- connectionName
requiresDS()
User-specified list of dependencies.
Override this method to specify the SmvGenericModule dependencies needed as inputs.
Returns: a list of dependencies
Return type: list(SmvGenericModule)
class smv.iomod.base.SmvIoModule(smvApp)
Bases: smv.smvgenericmodule.SmvGenericModule
Base class for input and output modules.
Has two subclasses:
- SmvInput: no dependency module, single output data
- SmvOutput: single dependency module, no output data
class smv.iomod.base.SmvOutput(smvApp)
Bases: smv.iomod.base.SmvIoModule
Base class for all output modules.
Subclasses need to implement:
- connectionType
- doRun
Within doRun, assert_single_input should be called.
Users need to implement:
- connectionName
IsSmvOutput = True
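The split of responsibilities described for SmvOutput can be sketched in plain Python. This is only an illustration, not SMV's implementation: `SketchOutput`, `SketchListOutput`, and `UserReport` are hypothetical stand-ins, the body of `assert_single_input` is an assumed check, and it is an assumption that `doRun` receives a mapping from each dependency to its data. Dependencies are plain strings here, whereas SMV expects SmvGenericModule entries.

```python
from abc import ABC, abstractmethod


class SketchOutput(ABC):
    """Hypothetical stand-in for an output base class; not SMV code."""

    IsSmvOutput = True

    @abstractmethod
    def connectionType(self):
        """Supplied by the connector-specific subclass."""

    @abstractmethod
    def doRun(self, known):
        """Supplied by the connector-specific subclass."""

    @abstractmethod
    def connectionName(self):
        """Supplied by the user."""

    def requiresDS(self):
        # Users override this to list the upstream dependencies.
        return []

    def assert_single_input(self):
        # Assumed check: an output module has exactly one dependency.
        deps = self.requiresDS()
        if len(deps) != 1:
            raise ValueError("expected exactly one input, got %d" % len(deps))


class SketchListOutput(SketchOutput):
    """Connector-level subclass: supplies connectionType and doRun."""

    written = None

    def connectionType(self):
        return "memory"  # hypothetical connection type

    def doRun(self, known):
        self.assert_single_input()
        dep = self.requiresDS()[0]
        # "Write" the single input by copying it into an attribute.
        self.written = list(known[dep])
        return None  # output modules produce no output data


class UserReport(SketchListOutput):
    """User-level subclass: supplies connectionName and requiresDS."""

    def connectionName(self):
        return "my_conn"  # hypothetical connection name

    def requiresDS(self):
        return ["upstream_module"]  # a string stands in for a module here


out = UserReport()
result = out.doRun({"upstream_module": [1, 2]})
```

Calling `assert_single_input` at the top of `doRun` means a module declared with zero or several dependencies fails before anything is written.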
class smv.iomod.base.SmvSparkDfOutput(smvApp)
Bases: smv.iomod.base.SmvOutput
SmvOutput that writes out a Spark DataFrame.
smv.iomod.inputs module
class smv.iomod.inputs.SmvJdbcInputTable(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.base.SmvInput, smv.iomod.base.AsTable
Users need to implement:
- connectionName
- tableName
class smv.iomod.inputs.SmvHiveInputTable(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.base.SmvInput, smv.iomod.base.AsTable
Users need to implement:
- connectionName
- tableName
class smv.iomod.inputs.SmvXmlInputFile(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.InputFileWithSchema
Input from a file in XML format.
Users need to implement:
- rowTag: required
- connectionName: required
- fileName: required
- schemaConnectionName: optional
- schemaFileName: optional
- userSchema: optional
class smv.iomod.inputs.SmvCsvInputFile(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.WithSmvSchema, smv.iomod.inputs.WithCsvParser
CSV file input.
Users need to implement:
- connectionName: required
- fileName: required
- schemaConnectionName: optional
- schemaFileName: optional
- userSchema: optional
- csvAttr: optional
- failAtParsingError: optional, default True
- dqm: optional, default SmvDQM()
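As a rough illustration of the required and optional items listed above, a user-defined CSV input might look like the sketch below. `CsvInputStandIn` is a hypothetical stand-in so the example runs without SMV or Spark; in a project the class would instead derive from smv.iomod.inputs.SmvCsvInputFile. The connection name and file name are made up, and it is assumed that each item is implemented as a method.

```python
class CsvInputStandIn(object):
    """Hypothetical stand-in for the real base class; not SMV code."""

    def __init__(self, smvApp):
        self.smvApp = smvApp

    def failAtParsingError(self):
        # Mirrors the documented default of True.
        return True


class EmploymentInput(CsvInputStandIn):
    """User module: only the two required items are overridden."""

    def connectionName(self):
        return "my_data_dir"  # hypothetical connection name

    def fileName(self):
        return "employment.csv"  # hypothetical file name


mod = EmploymentInput(None)  # None stands in for the smvApp argument
```

The optional items are left untouched here, so the module keeps the defaults listed above.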
class smv.iomod.inputs.SmvMultiCsvInputFiles(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.WithSmvSchema, smv.iomod.inputs.WithCsvParser
Input from multiple CSV files under the same directory.
Users need to implement:
- connectionName: required
- dirName: required
- schemaConnectionName: optional
- schemaFileName: optional
- userSchema: optional
- csvAttr: optional
- failAtParsingError: optional, default True
- dqm: optional, default SmvDQM()
class smv.iomod.inputs.SmvCsvStringInputData(smvApp)
Bases: smv.smvmodule.SparkDfGenMod, smv.iomod.inputs.WithCsvParser
Input data defined by a schema string and a data string.
Users need to implement:
- schemaStr(): required
- dataStr(): required
- failAtParsingError(): optional
- dqm(): optional
dataStr()
SMV data string.
E.g. "212,2016-10-03;119,2015-01-07"
Returns: data
Return type: str
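To make the example string concrete, the sketch below splits it into rows and fields. It assumes, based only on the example above, that `;` separates rows and `,` separates fields. `split_data_str` is a hypothetical helper, not part of SMV, and it ignores quoting and escaping.

```python
def split_data_str(data_str):
    """Split a data string into a list of rows, each a list of raw fields."""
    rows = []
    for raw_row in data_str.split(";"):
        raw_row = raw_row.strip()
        if raw_row:  # skip empty segments such as a trailing ";"
            rows.append([field.strip() for field in raw_row.split(",")])
    return rows


rows = split_data_str("212,2016-10-03;119,2015-01-07")
print(rows)  # [['212', '2016-10-03'], ['119', '2015-01-07']]
```

The fields stay as strings in this sketch; converting them to typed values is the job of the schema given by schemaStr().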
smv.iomod.outputs module
class smv.iomod.outputs.SmvJdbcOutputTable(smvApp)
Bases: smv.iomod.base.SmvSparkDfOutput, smv.iomod.outputs.WithSparkDfWriter, smv.iomod.base.AsTable
Users need to implement:
- requiresDS
- connectionName
- tableName
- writeMode: optional, default “errorifexists”
class smv.iomod.outputs.SmvHiveOutputTable(smvApp)
Bases: smv.iomod.base.SmvSparkDfOutput, smv.iomod.outputs.WithSparkDfWriter, smv.iomod.base.AsTable
Users need to implement:
- requiresDS
- connectionName
- tableName
- writeMode: optional, default “errorifexists”
class smv.iomod.outputs.SmvCsvOutputFile(smvApp)
Bases: smv.iomod.base.SmvSparkDfOutput, smv.iomod.base.AsFile
Users need to implement:
- requiresDS
- connectionName
- fileName