Meta Integration® Model Bridge (MIMB)
"Metadata Integration" Solution

MIMB Bridge Documentation

MIMB Import Bridge from W3C XML

Bridge Specifications

Vendor: World Wide Web Consortium
Tool Name: XML
Tool Version: 1.0
Tool Web Site: http://www.w3.org/TR/2000/REC-xml-20001006
Supported Methodology: [File System] Data Store (NoSQL / Hierarchical, Physical Data Model) via XML File

BRIDGE INFORMATION
Import tool: World Wide Web Consortium XML 1.0 (http://www.w3.org/TR/2000/REC-xml-20001006)
Import interface: [File System] Data Store (NoSQL / Hierarchical, Physical Data Model) via XML File from W3C XML
Import bridge: 'W3cXml' 10.1.0

BRIDGE DOCUMENTATION
This W3C XML import bridge is used in conjunction with other file import bridges (e.g. CSV, XLSX, JSON, Avro, Parquet) by all data lake / file crawler import bridges (e.g. File systems, Amazon S3, Hadoop HDFS).

The purpose of this XML import bridge is to reverse engineer a model/schema from the content of an XML file when that XML was not formally defined by a schema (XSD or DTD).
Such XML files are commonly produced by IoT devices and uploaded into a data lake.
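
As an illustration of what reverse engineering a schema from content can mean, the following is a minimal Python sketch that walks an XML document and records every element path together with the attribute names seen on it. It is not the bridge's actual algorithm; the function name infer_schema and the sample IoT payload are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_schema(xml_bytes):
    """Return {element path: sorted attribute names} observed in the document."""
    root = ET.fromstring(xml_bytes)
    schema = defaultdict(set)

    def walk(element, parent_path):
        path = f"{parent_path}/{element.tag}"
        # Record the attribute names seen on this path (creates the entry if new).
        schema[path].update(element.attrib)
        for child in element:
            walk(child, path)

    walk(root, "")
    return {path: sorted(attrs) for path, attrs in schema.items()}


# Hypothetical sensor payload with no XSD or DTD behind it.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<readings device="sensor-7">
  <reading ts="2019-11-07T17:33:24Z"><temp unit="C">21.5</temp></reading>
  <reading ts="2019-11-07T17:34:24Z"><temp>21.7</temp><humidity>40</humidity></reading>
</readings>"""

for path, attrs in infer_schema(sample).items():
    print(path, attrs)
```

Elements that appear in only some records (humidity above) still end up in the inferred structure, which is the main reason to derive the model from content rather than from a single sample record.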

Nevertheless, such XML files are expected to be fully W3C compliant, especially with respect to the XML text declaration, well-formed parsed entities, and character encoding of entities.
See W3C standards for more details:
https://www.w3.org/TR/xml/#sec-TextDecl
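
A file can be checked for well-formedness before it is handed to the bridge. The sketch below uses Python's standard xml.etree.ElementTree parser and is only a pre-check under that assumption: it does not reproduce the bridge's own validation, and the function name check_well_formed is hypothetical. The input is passed as bytes so that the parser reads the encoding from the XML declaration.

```python
import xml.etree.ElementTree as ET


def check_well_formed(data):
    """Return (True, None) if the bytes parse as XML, else (False, error message)."""
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, None


good = b'<?xml version="1.0" encoding="UTF-8"?><a><b/></a>'
bad = b'<?xml version="1.0" encoding="UTF-8"?><a><b></a>'  # <b> is never closed

print(check_well_formed(good))
print(check_well_formed(bad))
```

For a file on disk, the same function can be called with the result of reading the file in binary mode, e.g. open(path, "rb").read().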

Warning: you must use the dedicated XML-based import bridges for all other needs, such as:
- other standard W3C XML import bridges (e.g. DTD, XSD, WSDL, OWL/RDF)
- tool specific XML import bridges (e.g. Erwin Data Modeler XML, Informatica PowerCenter XML)


Bridge Parameters

Parameter Name: File
Description: The bridge uses the XML file as input.
Type: FILE
Values: *.xml
Scope: Mandatory

Parameter Name: Miscellaneous
Description: Specify miscellaneous options, each identified by a leading dash (-) followed by the option name and, where applicable, a value.

For example, -m 4G -f 100 -j -Dname=value -Xms1G

-m the maximum Java memory size, as a whole number (e.g. -m 4G or -m 2500M).
-v set environment variable(s) (e.g. -v var1=value -v var2="value with spaces").
-j the last option, followed by Java command line options (e.g. -j -Dname=value -Xms1G).
-hadoop key1=val1;key2=val2 to manually set Hadoop configuration options.
-tps 10 maximum thread pool size.
-tl 3600s processing time limit, with a unit suffix of s (seconds), m (minutes), or h (hours).
-fl 1000 limit on the number of files processed.
-delimited.top_rows_skip 1 number of rows to skip while processing CSV files.
-delimited.extra_separators ~,||,|~ comma-separated list of extra delimiters, each of which is used while processing CSV files.
-delimited.no_header by default, the bridge automatically tries to detect headers while processing CSV files (based on header column types); use this option to disable header import (e.g. to hide sensitive data).
-fresh.partition.models import the most recently modified files when processing partitions defined in the Partitioned directories parameter.
-subst K: C:/test associate a root path part with a drive or another path.
-skip.download disable dependency downloading and use only the download cache.
-prescript [cmd] runs a script command before bridge execution. Example: -prescript "script.bat"
The script must be located in the bin directory and have a .bat or .sh extension.
The script path must not include any parent directory symbol (..).
The script should return exit code 0 to indicate success, or another value to indicate failure.
-disable.partitions.autodetection disable automatic partition detection (applies when the "Partitioned directories" option is empty).
Type: STRING
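
The Miscellaneous value is a single string, so it can help to assemble it programmatically. The following Python sketch is one possible way to do that, under stated assumptions: the helper names build_misc_options and prescript_problems are hypothetical and not part of the bridge. The first helper places -j and its Java options at the end, as the option list requires; the second checks a -prescript path against the two documented path rules (no parent directory symbol, .bat or .sh extension). Values that contain spaces are expected to be quoted by the caller.

```python
def build_misc_options(options, java_options=()):
    """Join bridge options into one string, placing -j and its Java options last."""
    parts = []
    for name, value in options:
        parts.append(f"-{name}")
        if value is not None:  # flags such as -skip.download take no value
            parts.append(str(value))
    if java_options:
        parts.append("-j")
        parts.extend(java_options)
    return " ".join(parts)


def prescript_problems(script):
    """List violations of the documented -prescript path rules."""
    problems = []
    # Accept either path separator when looking for a ".." component.
    if ".." in script.replace("\\", "/").split("/"):
        problems.append("path contains a parent directory symbol (..)")
    if not script.endswith((".bat", ".sh")):
        problems.append("extension is not .bat or .sh")
    return problems


misc = build_misc_options(
    [("m", "4G"), ("tps", 10), ("tl", "3600s"), ("skip.download", None)],
    java_options=["-Dname=value", "-Xms1G"],
)
print(misc)
print(prescript_problems("script.bat"))
print(prescript_problems("../cleanup.py"))
```

Keeping the Java options in a separate argument makes it hard to place another bridge option after -j by accident.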

 

Bridge Mapping

Mapping information is not available

Last updated on Thu, 7 Nov 2019 17:33:24

Copyright © Meta Integration Technology, Inc. 1997-2019 All Rights Reserved.

Meta Integration® is a registered trademark of Meta Integration Technology, Inc.
All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.