Meta Integration® Model Bridge (MIMB)
"Metadata Integration" Solution

MIMB Bridge Documentation

MIMB Import Bridge from Microsoft Azure Blob Storage

Bridge Specifications

Vendor Microsoft
Tool Name Azure Blob Storage
Tool Version 1.0
Tool Web Site https://azure.microsoft.com/en-us/services/storage/blobs/
Supported Methodology [Database] Multi-Model via Java API

Import tool: Microsoft Azure Blob Storage 1.0 (https://azure.microsoft.com/en-us/services/storage/blobs/)
Import interface: [Database] Multi-Model via Java API from Microsoft Azure Blob Storage
Import bridge: 'MicrosoftAzureBlobStorage' 10.0.1

IMPORTING FROM Microsoft Azure Blob Storage Service.

This bridge establishes a connection with a chosen bucket in order to extract the physical metadata. The parameters must be filled in correctly to satisfy the local connection requirements of the client workstation that runs the bridge.
This bridge supports the following file formats:
- Flat File (CSV)
- Open Office Excel (XLSX)
- COBOL Copybook
- JSON (JavaScript Object Notation)
- Apache Avro
- Apache Parquet
- Apache ORC
- W3C XML

as well as the compressed versions of the above formats:
- ZIP (as a compression format, not as archive format)
- BZIP
- GZIP
- LZ4
- Snappy (as standard Snappy format, not as Hadoop native Snappy format)

Please refer to the individual parameters' tool tips for more detailed examples.


Bridge Parameters

Parameter Name Description Type Values Default Scope
Storage account An Azure storage account provides a unique namespace in the cloud to store and access your data objects in Azure Storage. STRING      
Storage access key A string that represents the Base-64-encoded 512-bit storage account access key, which is used for authentication when the storage is accessed. PASSWORD      
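As a quick sanity check before running the bridge, the key format described above (Base-64 text that decodes to 512 bits, i.e. 64 bytes) can be verified locally. This is a minimal sketch; the function name `looks_like_storage_key` is hypothetical and is not part of the bridge.

```python
import base64
import binascii


def looks_like_storage_key(key: str) -> bool:
    """Return True if `key` is valid Base-64 that decodes to 512 bits."""
    try:
        # validate=True rejects characters outside the Base-64 alphabet.
        raw = base64.b64decode(key, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(raw) == 64  # 512 bits = 64 bytes
```

The check only confirms the shape of the value; it cannot tell whether the key is actually valid for the storage account.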
Root directory The directory that contains the metadata files. Enter the path or select it with the browsing tool, which browses up to 3 levels deep. Specify the 'Region' parameter before using the browsing tool.

The bridge uses only the s3a protocol to load files.
For example: s3a://bucket/dir1/dir2
REPOSITORY_SUBSET     Mandatory
Include filter The include folder and file filter pattern relative to the root directory.
The pattern uses case-sensitive extended Unix glob expression syntax.
Here are some common examples:
*.* - include any file at the root level
*.csv - include only csv files at the root level
**.csv - include only csv files at any level
*.{csv,gz} - include only csv or gz files at the root level
dir\*.csv - include only csv files in the 'dir' folder
dir\**.csv - include only csv files under 'dir' folder at any level
dir\**.* - include any file under 'dir' folder at any level
f.csv - include only f.csv under root level
**\f.csv - include only f.csv at any level
**dir\** - include all files under any 'dir' folder at any level
**dir1\dir2\** - include all files under any 'dir2' folder under any 'dir1' folder at any level
STRING      
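The pattern examples above can be explored with a small matcher. This is a minimal sketch, not the bridge's implementation: it assumes, based only on the examples listed, that `*` stays within one folder level, `**` spans any number of levels, `{a,b}` lists alternatives, and `\` and `/` are interchangeable separators. The names `glob_to_regex` and `matches` are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate an extended glob into a compiled, case-sensitive regex."""
    pattern = pattern.replace("\\", "/")  # treat both separators alike
    out = []
    in_group = False
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "*":
            if pattern[i:i + 2] == "**":
                out.append(".*")      # '**' crosses folder levels
                i += 2
                continue
            out.append("[^/]*")       # '*' stays within one level
        elif c == "?":
            out.append("[^/]")
        elif c == "{":
            out.append("(?:")
            in_group = True
        elif c == "}" and in_group:
            out.append(")")
            in_group = False
        elif c == "," and in_group:
            out.append("|")
        else:
            out.append(re.escape(c))
        i += 1
    return re.compile("".join(out))


def matches(pattern: str, path: str) -> bool:
    """Check a path (relative to the root directory) against a pattern."""
    return glob_to_regex(pattern).fullmatch(path.replace("\\", "/")) is not None
```

For instance, `matches("dir\\**.csv", "dir/x/y/a.csv")` is true under these assumptions, while `matches("*.csv", "dir/a.csv")` is false because a single `*` does not cross a folder boundary.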
Exclude filter The exclude folder and file filter pattern relative to the root directory.
The pattern uses the same syntax as the Include filter. See that parameter for syntax details and examples.
Files that match the exclude filter are skipped.
When both the include and exclude filters are empty, all folders and files under the Root directory are included.
When the include filter is empty and the exclude filter is not, all folders and files under the Root directory are included except those matching the exclude filter.
STRING      
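The way the two filters combine can be sketched as follows. This is an illustration only: the function `select_files` is hypothetical, and Python's `fnmatch.fnmatchcase` is used as a simplified stand-in for the bridge's extended glob syntax (unlike the bridge's documented patterns, its `*` also crosses folder separators).

```python
from fnmatch import fnmatchcase


def select_files(paths, include="", exclude=""):
    """Apply the include filter first, then drop anything the exclude filter matches."""
    selected = []
    for path in paths:
        # An empty include filter means "include everything".
        if include and not fnmatchcase(path, include):
            continue
        # Files that match the exclude filter are skipped.
        if exclude and fnmatchcase(path, exclude):
            continue
        selected.append(path)
    return selected
```

With both filters empty the input list is returned unchanged; with only `exclude` set, everything except the matching files is kept.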
Partition directories Files-based partition directories' paths.
The bridge tries to detect partitions automatically. It can take a long time when partitions have a lot of files.
You can shortcut the detection process for a partition by specifying it in this parameter.
Specify the partition directory path relative to the Root directory.
Use . to specify the root directory as the partitioned directory.

Separate multiple paths with the , (or ;) character.
For example: dir1/dir2,dir3/dir4,dir5
STRING      
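The value format described above (paths relative to the Root directory, separated by `,` or `;`, with `.` standing for the root itself) can be illustrated with a short parser. This is a sketch under those stated rules; the function name `parse_partition_dirs` and the root path used in the usage note are hypothetical.

```python
import re


def parse_partition_dirs(value: str, root: str) -> list:
    """Split a 'Partition directories' value and resolve each path against root."""
    root = root.rstrip("/")
    resolved = []
    for part in re.split(r"[,;]", value):
        part = part.strip().strip("/")
        if not part:
            continue  # ignore empty entries such as a trailing separator
        # '.' designates the root directory itself as the partitioned directory.
        resolved.append(root if part == "." else f"{root}/{part}")
    return resolved
```

For example, `parse_partition_dirs("dir1/dir2,dir3/dir4,dir5", "s3a://bucket/data")` yields three paths under `s3a://bucket/data`.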
Sample size Number of files to scan during the analysis of data-partitioning directories. NUMERIC      
Miscellaneous Specify miscellaneous options identified with a -letter and value.

For example, -m 4G -f 100 -j -Dname=value -Xms1G

-m the maximum Java memory size as a whole number (e.g. -m 4G or -m 2500M).
-v set environment variable(s) (e.g. -v var1=value -v var2="value with spaces").
-j the last option, which is followed by Java command line options (e.g. -j -Dname=value -Xms1G).
-hadoop key1=val1;key2=val2 manually set Hadoop configuration options.
-tps 10 the maximum thread pool size.
-tl 3600s the processing time limit, in seconds (s), minutes (m), or hours (h).
-fl 1000 the limit on the number of files processed.
STRING      
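To make the option layout above concrete, the following sketch splits a Miscellaneous string into its parts. It is an illustration of the documented format only, not the bridge's own parser: the names `parse_misc` and `time_limit_seconds` are hypothetical, and the sketch assumes every option other than -j takes exactly one value.

```python
import shlex


def parse_misc(options: str) -> dict:
    """Split a Miscellaneous string into flag values, env variables and Java options."""
    tokens = shlex.split(options)  # honours quotes, e.g. var2="value with spaces"
    parsed = {"env": {}, "java": []}
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        if flag == "-j":
            # -j is the last option; everything after it goes to Java.
            parsed["java"] = tokens[i + 1:]
            break
        value = tokens[i + 1] if i + 1 < len(tokens) else ""
        if flag == "-v":
            name, _, val = value.partition("=")
            parsed["env"][name] = val
        else:
            parsed[flag.lstrip("-")] = value
        i += 2
    return parsed


def time_limit_seconds(value: str) -> int:
    """Convert a -tl value such as '3600s', '90m' or '2h' to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]
```

For example, `parse_misc('-m 4G -tl 2h -j -Dname=value -Xms1G')` collects `-Dname=value` and `-Xms1G` under `"java"`, and `time_limit_seconds("2h")` converts the limit to seconds.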

 

Bridge Mapping

Mapping information is not available

Last updated on Fri, 21 Sep 2018 16:15:06

Copyright © Meta Integration Technology, Inc. 1997-2018 All Rights Reserved.

Meta Integration® is a registered trademark of Meta Integration Technology, Inc.
All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.