Meta Integration® Model Bridge (MIMB)
"Metadata Integration" Solution

MIMB Bridge Documentation

MIMB Import Bridge from Apache Spark (with Python or Scala)

Bridge Specifications

Vendor: Apache
Tool Name: Spark (with Python or Scala)
Tool Version: 2.x
Tool Web Site: http://spark.apache.org/
Supported Methodology: [Data Integration] Multi-Model, Data Store (Physical Data Model), (Source and Target Data Stores, Transformation Lineage, Expression Parsing) via Spark with Python or Scala File

BRIDGE INFORMATION
Import tool: Apache Spark (with Python or Scala) 2.x (http://spark.apache.org/)
Import interface: [Data Integration] Multi-Model, Data Store (Physical Data Model), (Source and Target Data Stores, Transformation Lineage, Expression Parsing) via Spark with Python or Scala File from Apache Spark (with Python or Scala)
Import bridge: 'ApacheSpark' 10.1.0

BRIDGE DOCUMENTATION
The purpose of this Apache Spark import bridge is to detect and parse all the Spark statements from the Python or Scala scripts
in order to generate the exact scope (data models) of the involved source and target data stores,
as well as the data flow lineage and impact analysis (data integration ETL/ELT model) between them.


Bridge Parameters

Parameter Name: Directory
Description: Select a directory with the text files that contain the code to import.
Type: DIRECTORY
Scope: Mandatory

Parameter Name: Code Language
Description: Select the language.
Type: ENUMERATED
Values: Python, Scala
Default: Python

Parameter Name: Directory Filter
Description: Specify a search filter for the subdirectories. Use regular expressions in Java format if needed (e.g. '.*_script'). Multiple conditions can be defined by using a space as a separator (e.g. 'directory1 directory2'). A condition that contains spaces must be enclosed in double quotes (e.g. "my directory"). Negation can be defined with a preceding dash character (e.g. '-bin').
Type: STRING

Parameter Name: File Filter
Description: Specify a search filter for files. Use regular expressions in Java format if needed (e.g. '.*\.py'). Multiple conditions can be defined by using a space as a separator (e.g. 'file1 file2'). A condition that contains spaces must be enclosed in double quotes (e.g. "my file.py"). Negation can be defined with a preceding dash character (e.g. '-\.tar\.gz').
Type: STRING

Parameter Name: Miscellaneous
Description: Specify miscellaneous options, each identified by a dash-prefixed option name, followed by its value where one applies.

For example, -e UTF-16

-e: encoding. This value is used to load text from the specified script files. By default, UTF-8 is used. Other possible values include UTF-16, UTF-16BE, and US-ASCII.
-p: parameters. Full path to the YAML file that defines all the entry points for the scripts to parse, as well as their input parameters. A new template is generated automatically if the file doesn't exist. Enclose the path in double quotes if it contains spaces.
-pppd: enables the DI/ETL post-processor, which processes DI/ETL designs in order to create the design connections and connection data sets.
Type: STRING
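The Directory Filter and File Filter rules described above (space-separated conditions, double quotes around conditions that contain spaces, and a leading dash for negation) can be illustrated with a small Python sketch. This is not the bridge's implementation. The function names parse_filter and accepts are hypothetical, Python's re module stands in for Java regular expressions (the two dialects differ in some details), and two behaviors are assumptions that the documentation does not state: a condition matches when it is found anywhere in the name, and a name is accepted when it matches at least one positive condition (or there are none) and no negated condition.

```python
import re


def parse_filter(filter_text):
    """Split a filter string into (include, exclude) lists of regex strings."""
    include, exclude = [], []
    # A token is either an optionally dash-prefixed quoted string
    # or a run of non-space characters.
    for token in re.findall(r'-?"[^"]*"|\S+', filter_text):
        negated = token.startswith("-")
        if negated:
            token = token[1:]  # drop the negation dash
        if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
            token = token[1:-1]  # drop the surrounding double quotes
        (exclude if negated else include).append(token)
    return include, exclude


def accepts(name, filter_text):
    """Return True if name passes the filter (assumed semantics, see lead-in)."""
    include, exclude = parse_filter(filter_text)
    # Assumption: any matching negated condition rejects the name.
    if any(re.search(pattern, name) for pattern in exclude):
        return False
    # Assumption: with no positive conditions, everything else is accepted.
    return not include or any(re.search(pattern, name) for pattern in include)


if __name__ == "__main__":
    for candidate in ["etl_job.py", "notes.txt", "archive.tar.gz"]:
        print(candidate, accepts(candidate, r".*\.py -\.tar\.gz"))
```

Under these assumptions, a filter such as '.*\.py -\.tar\.gz' keeps Python scripts and rejects compressed archives. Whether the bridge matches a whole name or only part of it should be confirmed against the actual import log.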

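The Miscellaneous options listed above can be assembled into a single parameter value. The following Python sketch shows one way to build such a string. The helper build_miscellaneous and the example path are hypothetical, and the sketch assumes that several options may be combined in one value separated by spaces and that their order does not matter, which the documentation implies but does not state explicitly.

```python
def build_miscellaneous(encoding=None, parameters_file=None, post_process=False):
    """Build a value for the Miscellaneous parameter (hypothetical helper)."""
    parts = []
    if encoding:
        # -e: encoding used to load the script files (UTF-8 when omitted).
        parts += ["-e", encoding]
    if parameters_file:
        # -p: path to the YAML parameters file; quote it if it contains spaces.
        if " " in parameters_file:
            parameters_file = f'"{parameters_file}"'
        parts += ["-p", parameters_file]
    if post_process:
        # -pppd: enable the DI/ETL post-processor (takes no value).
        parts.append("-pppd")
    return " ".join(parts)


if __name__ == "__main__":
    # The path below is an illustrative placeholder, not a required location.
    print(build_miscellaneous("UTF-16BE", "/opt/spark jobs/params.yaml", True))
```

Calling build_miscellaneous(encoding="UTF-16") reproduces the documented example value, -e UTF-16.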
 

Bridge Mapping

Mapping information is not available

Last updated on Thu, 7 Nov 2019 17:33:24

Copyright © Meta Integration Technology, Inc. 1997-2019 All Rights Reserved.

Meta Integration® is a registered trademark of Meta Integration Technology, Inc.
All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.