split command

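In Linux you can use the split command to break a large file into smaller files, and the cat command to join the pieces back into one large file. (The join command, despite its name, merges the lines of two files on a common field; it does not concatenate files.) These operations are often necessary when you are dealing with large quantities of data.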


By default, split cuts a large file into pieces of 1,000 lines each and writes them to new files named xaa, xab, and so on.

$ split largefile.txt

$ ls
largefile.txt  xaa xab  xac  xad

$ wc -l *
3285  largefile.txt
1000  xaa
1000  xab
1000  xac
 285  xad
6570  total

You can also set the number of lines you want in each file with the -l option.
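As a minimal sketch of setting the line count, the commands below split a sample file into 500-line pieces and then join them again with cat. The scratch directory, the generated sample file, and the part_ prefix are assumptions made for illustration only.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 3285-line sample file (a stand-in for largefile.txt).
seq 1 3285 > largefile.txt

# -l sets the number of lines per piece; "part_" replaces the default "x" prefix.
split -l 500 largefile.txt part_

# 3285 lines at 500 per piece gives 7 pieces: part_aa through part_ag.
piece_count=$(ls part_* | wc -l | tr -d ' ')
echo "created $piece_count pieces"

# Concatenating the pieces in name order restores the original file.
cat part_* > rejoined.txt
cmp largefile.txt rejoined.txt && echo "rejoined file matches the original"
```

The last piece holds the 285 lines left over. Because the shell expands part_* in alphabetical order, cat reassembles the pieces in the order split wrote them.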

Checking disk space in Linux

To check disk space, you can use one of the following commands:

$ df

This command lists the amount of space used and available on each mounted partition of the system.

$ du 

The du command lists disk usage directory by directory; add -a to include individual files as well. Read the du man page before using it.

$ vmstat

vmstat reports virtual memory statistics.

$ ls -l

Outputs file names and sizes.
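The following sketch shows df, du, and ls -l working together on a small test tree. The scratch directory and the 1 MiB file big.bin are hypothetical and exist only so there is something to measure.

```shell
# Human-readable used and free space for the filesystem holding the current directory.
df -h .

# Create a small scratch tree with one 1 MiB file to measure.
scratch=$(mktemp -d)
mkdir "$scratch/logs"
dd if=/dev/zero of="$scratch/logs/big.bin" bs=1024 count=1024 2>/dev/null

# -s prints a single summary line for the directory; -k reports sizes in 1 KiB blocks.
du -sk "$scratch"

# ls -l shows the exact size in bytes in the fifth column.
ls -l "$scratch/logs/big.bin"

# wc -c gives the same byte count in a form that is easy to use in scripts.
size_bytes=$(wc -c < "$scratch/logs/big.bin" | tr -d ' ')
echo "big.bin is $size_bytes bytes"
```

Note that du reports the space a file occupies on disk, which can differ from the byte count that ls -l shows.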

rsync command

rsync is a very useful alternative to rcp and scp. This tool lets you copy files and directories between a local host and a remote host. Its main advantages are that it can use SSH as a secure channel, that it sends and receives only the parts of files that have changed since the last replication, and that it can remove files on the destination host when they have been deleted on the source host, keeping both hosts in sync.

rsync -avz -e ssh ~/

This copies mydir and its contents from user1 to local

Comparing files using diff and cmp

The diff command gives details of differences between two files. The cmp command simply tells you whether two files are the same or different.

$ diff one.txt two.txt

The results would be something like the following:

14c14
< txt txt txt
---
> text text text

The first line gives the line number(s) that differ between the files. The letter c indicates that line 14 must be changed for the files to match. Lines beginning with the < symbol come from the first file; lines beginning with the > symbol come from the second file.
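To see the two commands side by side, here is a small sketch that builds two three-line files differing only on line 2. The file contents and the scratch directory are assumptions made for illustration.

```shell
tmp=$(mktemp -d)
printf 'alpha\ntxt txt txt\nomega\n' > "$tmp/one.txt"
printf 'alpha\ntext text text\nomega\n' > "$tmp/two.txt"

# cmp -s prints nothing and reports only through its exit status:
# 0 means identical, 1 means different.
if cmp -s "$tmp/one.txt" "$tmp/two.txt"; then
    echo "files are identical"
else
    echo "files differ"
fi

# diff shows what differs; it also exits with status 1 when it finds differences.
diff_status=0
diff_output=$(diff "$tmp/one.txt" "$tmp/two.txt") || diff_status=$?
printf '%s\n' "$diff_output"
echo "diff exit status: $diff_status"
```

Here diff reports 2c2, since only the second line differs. The exit status makes both commands convenient in scripts, where you usually care about whether files match rather than how they differ.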

alias command

The alias command assigns a command or set of commands to a string. Aliases are generally used to simplify typing a long command or to run a command with a particular option by default.
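Both uses can be sketched as follows. The alias names ll and rm are only examples, not anything the system requires.

```shell
# Shorten a long command: typing ll now runs ls -l.
alias ll='ls -l'

# Make an option the default: rm now asks for confirmation before each removal.
alias rm='rm -i'

# Show one definition; plain `alias` with no arguments lists them all.
alias ll

# Remove a definition when it is no longer wanted.
unalias rm
```

An alias defined at the prompt lasts only for the current session, so to keep it, add the line to a startup file such as ~/.bashrc. Also note that bash does not expand aliases inside non-interactive scripts by default.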

yum tutorial

yum stands for 'Yellowdog Updater, Modified'. Yellow Dog Linux is an RPM-based distribution of Linux created for the PowerPC (ppc) architecture. Fedora adopted yum as its package manager starting with Fedora Core 1.

Finding files on Linux

find is an extremely useful command for finding files on Unix and Linux. It searches a specified directory for files that match a specified condition, and it descends into all subdirectories of that directory. You normally specify both a directory and a condition. find offers many powerful options that let you define your search criteria precisely. Following are the most useful ones:
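As a minimal sketch, the commands below try a few commonly used conditions (-name, -type, -size, and -mtime) against a small test tree. The directory layout and file names are hypothetical and created only for this example.

```shell
# Build a small scratch tree to search.
root=$(mktemp -d)
mkdir -p "$root/docs/old" "$root/src"
touch "$root/docs/readme.txt" "$root/docs/old/notes.txt" "$root/src/main.c"
dd if=/dev/zero of="$root/src/big.bin" bs=1024 count=2048 2>/dev/null

# -name matches the file name against a pattern; quote the pattern so the
# shell passes it to find instead of expanding it.
find "$root" -name '*.txt'

# -type d lists only directories; -type f lists only regular files.
find "$root" -type d

# -size +1048576c selects files larger than 1048576 bytes (1 MiB).
find "$root" -type f -size +1048576c

# -mtime -1 selects files modified within the last 24 hours.
find "$root" -type f -mtime -1
```

Conditions placed one after another are combined with a logical AND, so the third command prints only regular files that are also larger than 1 MiB.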

Compressing and Archiving

tar command

The tar command is used to collate collections of files into one larger file.

Creating archives

Suppose we have a directory xdir containing many files. We use the following command to archive them into one file.

tar -cvf x.tar xdir

This command archives the directory xdir and its contents into a file called x.tar. The extension '.tar' is used by convention to indicate tar files.
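The round trip of creating, listing, and extracting an archive might look like the sketch below. The scratch directory, the two sample files, and the restored directory are assumptions made for illustration.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# A small xdir with two files to archive.
mkdir xdir
echo "first" > xdir/a.txt
echo "second" > xdir/b.txt

# c = create, v = print each file as it is added, f = write to the named archive.
tar -cvf x.tar xdir

# t = list the archive's contents without extracting anything.
tar -tf x.tar

# x = extract; -C unpacks into another directory instead of the current one.
mkdir restored
tar -xf x.tar -C restored
ls restored/xdir
```

Listing an archive with -t before extracting it is a good habit, because it shows where the files will land.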

To create an archive and apply gzip to compress the archive,