Monday, December 1, 2014

Ab Initio - Multifiles




A multifile system is a specially created set of directories, possibly on different machines, that share an identical substructure.
Each directory is a partition of the multifile system. When a multifile is placed in a multifile system, its partitions are files within each of the partitions of the multifile system.
A multifile system can perform better than a flat file system because it divides your data among multiple disks or CPUs.
Typically (an SMP machine is the exception), a multifile system is created with the control partition on one node and the data partitions on other nodes, to distribute the work and improve performance.
To do this, use full internet URLs that specify the file and directory names and their locations on remote machines.
An Ab Initio multifile organizes all partitions of a multifile into a single virtual file that you can reference as one entity.
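
To make the layout concrete, here is a minimal Python sketch that imitates the idea with ordinary local directories. It is not how Ab Initio implements multifiles: the function names, the round-robin distribution, the directory names (control, part0, part1, ...) and the format of the control entry are all assumptions chosen for illustration.

```python
import os
import tempfile


def make_layout(root, n_partitions):
    """Create one control directory and n data directories (illustrative only)."""
    control = os.path.join(root, "control")
    os.makedirs(control, exist_ok=True)
    partitions = []
    for i in range(n_partitions):
        part = os.path.join(root, f"part{i}")
        os.makedirs(part, exist_ok=True)
        partitions.append(part)
    return control, partitions


def write_multifile(control, partitions, name, records):
    """Spread records round-robin over the partitions and note them in the control dir."""
    buckets = [[] for _ in partitions]
    for i, rec in enumerate(records):
        buckets[i % len(partitions)].append(rec)
    # Every partition gets a file with the same name: identical substructure.
    for part, bucket in zip(partitions, buckets):
        with open(os.path.join(part, name), "w") as f:
            f.writelines(r + "\n" for r in bucket)
    # The control entry holds no data, only the locations of the data partitions.
    with open(os.path.join(control, name), "w") as f:
        f.writelines(os.path.join(p, name) + "\n" for p in partitions)


def read_multifile(control, name):
    """Follow the control entry and return all records as one logical file."""
    with open(os.path.join(control, name)) as f:
        paths = [line.strip() for line in f]
    records = []
    for path in paths:
        with open(path) as f:
            records.extend(line.rstrip("\n") for line in f)
    return records


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    control, parts = make_layout(root, 3)
    write_multifile(control, parts, "customers.dat", ["a", "b", "c", "d", "e"])
    print(read_multifile(control, "customers.dat"))
```

The caller only ever names the control entry; the per-partition files stay an internal detail, which is the "single virtual file" idea described above.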

Follow these steps to create a multifile directory for an application:

  • Create the data and partition directories that will hold the data and control files.
  • Run the m_mkfs command from the data directory to create the multifile system.


Multifile Commands
·         m_mkfs
·         m_mkdir
·         m_ls
·         m_expand
·         m_dump
·         m_cp
·         m_mv
·         m_rm
·         m_touch
·         m_rollback
·         m_kill
·         m_env
·         m_env -ev
·         m_eval

m_mkfs command
m_mkfs creates a multifile system (MFS). A multifile system consists of a control file directory together with the partition directories. The control file resides in the $APPL_DATA/mdata directory, and the data files reside in the partition directories. The number of partitions you specify determines how many ways the multifile system is partitioned.
Syntax: 
m_mkfs <name of the control partition> <URL of partition 1> <URL of partition 2> <URL of partition 3> ...

For example:
m_mkfs controlpartion c:abinitopartion1 c:abinitopartion2 c:abinitopartion3

This creates controlpartion under the root directory and the three partitions at their respective locations.
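
If you generate the m_mkfs call from a script, the argument order from the syntax line above (control partition first, then the data partitions) can be captured in a small helper. The following Python sketch only builds and prints the command line; it does not run it. The helper name build_mkfs_command and the host and directory names in the URLs are hypothetical, made up for this example.

```python
import shlex


def build_mkfs_command(control_url, partition_urls):
    """Return the m_mkfs argument list: control partition first, then data partitions."""
    return ["m_mkfs", control_url, *partition_urls]


# Hypothetical URLs, made up for this example.
args = build_mkfs_command(
    "//host0/data/mfs_3way",
    ["//host1/data/p0", "//host2/data/p1", "//host3/data/p2"],
)

# Print the command for review instead of executing it.
print(shlex.join(args))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later handed to subprocess.run.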

m_mkdir command
m_mkdir creates a multidirectory within a multifile system. A multidirectory can span several different disks or nodes and contains various items of metadata that are created and maintained by the Ab Initio commands.

Syntax:
m_mkdir url
url must refer to a pathname within an existing multifile system.

m_ls command
m_ls lists files and directories.
Syntax:
m_ls [opts...] url [url...]


m_expand command
m_expand shows the partitions used by a multifile.
Syntax:
m_expand [opts...] path

m_expand takes a path rather than a file name and displays the partition information for that path.

m_dump command
This command displays the contents of files or multifiles, or selected records from them.
Syntax:
m_dump metadata [path] [opts ...]

m_cp command
This command copies files or multifiles that have the same degree of parallelism. Behind the scenes, m_cp actually builds and runs a small graph, so it may copy from one machine to another where Ab Initio is installed.
Syntax:
m_cp source [...] directory
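
The "same degree of parallelism" requirement means that source and target must have the same number of partitions, so that each source partition has exactly one counterpart. The Python sketch below illustrates that rule only; it says nothing about how m_cp works internally. The function name plan_copy and the partition names are hypothetical.

```python
def plan_copy(source_partitions, target_partitions):
    """Pair partitions one-to-one, or refuse if the degrees of parallelism differ."""
    if len(source_partitions) != len(target_partitions):
        raise ValueError(
            f"degree of parallelism differs: "
            f"{len(source_partitions)} source vs {len(target_partitions)} target"
        )
    return list(zip(source_partitions, target_partitions))


# Hypothetical partition names: a 2-way multifile copied into a 2-way directory.
pairs = plan_copy(["src/p0/a.dat", "src/p1/a.dat"], ["dst/p0/a.dat", "dst/p1/a.dat"])
for src, dst in pairs:
    print(src, "->", dst)
```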

m_mv command
This command moves a single file, multifile, directory, or multidirectory from one path to another on the same host. It works by renaming, so no data is actually moved.
Syntax:
m_mv oldpath newpath

m_rm command
This command removes a file or multifile and all its associated data partitions.
Syntax:
m_rm [options] path [...]

m_touch command
This command creates an empty file or multifile in the specified location.
Syntax:
m_touch path

m_rollback command
m_rollback rolls back an interrupted Ab Initio graph. Find the graph's recovery file in the execution directory and pass its name to m_rollback.
Syntax:
m_rollback <recovery file name>

m_kill command
m_kill stops a graph and all of its associated processes. Use it in preference to the Unix kill command: kill halts only the single process it can see for that graph, whereas m_kill halts everything associated with the graph. To halt a running graph and then delete all the work files and changes it has made up to that point, run the following two commands in order.
Syntax:
m_kill -9 <recovery file name>
m_rollback -d <recovery file name>
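
When this stop-and-clean-up sequence is scripted, the order matters: m_kill first, then m_rollback -d. The Python sketch below wraps the two commands shown above. The function names and the recovery file name my_graph.rec are hypothetical, and stopping at the first failing command (check=True) is a design choice of this sketch, not documented Ab Initio behaviour.

```python
import subprocess


def stop_and_clean_commands(recovery_file):
    """Return the two commands from the text, in the order they should run."""
    return [
        ["m_kill", "-9", recovery_file],
        ["m_rollback", "-d", recovery_file],
    ]


def stop_and_clean(recovery_file, run=subprocess.run):
    """Run m_kill, then m_rollback -d; check=True aborts on the first failure."""
    for cmd in stop_and_clean_commands(recovery_file):
        run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in stop_and_clean_commands("my_graph.rec"):
        print(" ".join(cmd))
```

Passing run as a parameter lets you swap in a logging or dry-run function on machines where the Co>Operating System is not installed.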

m_env command
m_env dumps the current Ab Initio environment settings to stdout.

m_env -ev command
m_env -ev displays the currently running version of the Co>Operating System.

m_eval command
This command evaluates Ab Initio functions and expressions from the Unix prompt.
It is useful for quickly testing or debugging a complex expression.

Syntax:
m_eval expression

