Appendix: Introduction to the NIG supercomputer

About the NIG supercomputer

The National Institute of Genetics (NIG) is an inter-university research institute for advanced genetics research and education in Japan. In addition to the DNA Data Bank, it offers supercomputing services for analysing genomic data produced by next-generation sequencers. The NIG supercomputer is a key piece of infrastructure supporting Japan’s DNA research. It has almost 2,500 registered researchers, who can access the combined International Nucleotide Sequence Database and upload their own sequencing data. (From “International DNA database drives genetics research”, an article in “In the field”.)

See: NIG supercomputer website

    • System configuration:
        • Hardware configuration
        • Software configuration – The supercomputer system runs Linux, and the Thin compute nodes use the Red Hat Enterprise Linux 7 distribution. This is common to all compute nodes.
    • How to use the system – The system uses the Univa Grid Engine (UGE) job management system so that multiple users can share it efficiently.
    • How to transfer files – Files can be transferred to the supercomputer system in two ways: 1. with sftp, or 2. with Aspera.
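
Because jobs are run through UGE, work is normally written as a job script and handed to the scheduler with qsub. The following is only a minimal sketch: the job name, the file reads.txt and the workload are hypothetical, and any resource options specific to the NIG system are deliberately left out (see the NIG supercomputer website for those). Lines beginning with "#$" are read by Grid Engine as qsub options, while an ordinary shell treats them as comments, so the script can also be tried out locally.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N example_job
# -S sets the shell that runs the job, -cwd runs it in the directory
# it was submitted from, and -N gives the job a name (hypothetical here).

# Hypothetical workload: count the lines of a small test file.
workdir=$(mktemp -d)
printf 'ACGT\nTTGA\nCCAT\n' > "$workdir/reads.txt"
line_count=$(wc -l < "$workdir/reads.txt" | tr -d ' ')
echo "reads.txt has $line_count lines"
```

Saved as, for example, job.sh, such a script would be submitted with "qsub job.sh".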

Create your login account.

  • Criteria for issuing user login accounts – User login accounts for the NIG supercomputer are generally issued to the following persons:
    1. Academic staff members of a national or public university or research institute who are resident in Japan, as defined by the Foreign Exchange and Foreign Trade Law (FEFTL)
    2. Any researcher, student, or contractor working on joint research with, or under the guidance of, a person covered by item 1 above (including exchange students, persons posted abroad, and foreign researchers)
  • New Registration – Account application form
  • Registering SSH public keys – For security, users are requested to register their public keys.

Training

Basic Unix commands

Redirection / directory symbols

> send the output of a command into a file
< feed the contents of a file into a command
| (pipe) send the output of one process into the input of another process
. the current directory (where you are)
.. the parent directory (one level above where you are)
~ or $HOME your home directory
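
The symbols above can be tried out together in a scratch directory. This is a minimal sketch; the file and directory names (fruits.txt, sub, copy.txt) are made up for illustration.

```shell
# Work in a throwaway directory so nothing in your home directory is touched
workdir=$(mktemp -d)
cd "$workdir"

# ">" writes the output of a command into a file (overwriting it)
printf 'banana\napple\ncherry\n' > fruits.txt

# "<" feeds a file to a command as its standard input
sorted=$(sort < fruits.txt)

# "|" connects the output of one command to the input of the next
first_sorted=$(sort fruits.txt | head -n 1)

# "." is the current directory and ".." is its parent
mkdir sub
cd sub
cp ../fruits.txt ./copy.txt

echo "$first_sorted"
```

Here sort orders the three lines alphabetically and head -n 1 keeps only the first of them.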

Directory and file browsing

ls (LiSt) list file names
cat (conCATenate) concatenate and/or display files
head display the first lines of a file
less display a file page by page [Space: next page, b: previous page, q: quit]
$ ls
ddbj_database bioinformatics.tar lcl
$ ls -l # More information displayed with the -l option
total 74744
drwxr-xr-x 3 tafujisa yn-nig 4096 Dec 27 11:48 2012 ddbj_database
-rw-r--r-- 1 yanakamu yn-nig 76523520 Dec 15 17:48 2016 bioinformatics.tar
drwxr-xr-x 11 yanakamu yn-nig 4096 Dec 15 13:35 2016 lcl
$ ls -a # The -a option also shows hidden files and directories (names beginning with .)
. .bash_profile .emacs.d .login .pyenv .viminfo
.. .bashrc .gem .matplotlib .rbenv .zprofile
.RepeatMaskerCache .cache .gnome2 .mc .screenrc .zshrc
.Xauthority .ddbjing34 .gnuplot_history .mozilla .ssh ddbj_database
.aspera .ddbjing34_deleteme .history .mysql_history .subversion bioinformatics.tar
.bash_history .emacs .lesshst .pki .tcshrc lcl

Create a file from the output of “ls”, then browse it with “cat”.

$ ls > test # dump the output of ls into the file "test"
$ ls
ddbj_database bioinformatics.tar lcl test
$ cat test # Display contents of the file test
ddbj_database
bioinformatics.tar
lcl
test
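
head is listed above but does not appear in the session. As a minimal sketch, assuming a hypothetical five-line file called lines.txt created in a scratch directory, the -n option selects how many lines are shown from the top of the file:

```shell
# Create a small test file in a throwaway directory
workdir=$(mktemp -d)
cd "$workdir"
for i in 1 2 3 4 5; do
    echo "line $i"
done > lines.txt

# Show only the first two lines of the file
first_two=$(head -n 2 lines.txt)
echo "$first_two"
```

For longer files, "less lines.txt" shows the same content one page at a time.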

File & directory (folder) operation

cp (CoPy) copy files
mv (MoVe) move and/or rename files
rm (ReMove) delete files
$ ls
ddbj_database bioinformatics.tar lcl test
$ cp test test2 # Copy file test to test2
$ cat test2 # display test2
ddbj_database
bioinformatics.tar
lcl
test
$ cat test test2 > test3 # Combine the contents of test and test2 and write them to test3
$ cat test3 # display test3
ddbj_database
bioinformatics.tar
lcl
test
ddbj_database
bioinformatics.tar
lcl
test
$ mv test3 test4 # Rename test3 to test4
$ ls
ddbj_database bioinformatics.tar lcl test test2 test4
$ rm test4 # Delete test4
$ ls
ddbj_database bioinformatics.tar lcl test test2

Creating and moving between directories (folders)

mkdir / rmdir (Make Directory / ReMove Directory) Create / delete directory
pwd (Print Working Directory) Show where you are now
cd (Change Directory) Change where you are
$ mkdir testdir # Create a directory (folder) called testdir
$ ls
ddbj_database bioinformatics.tar lcl test test2 testdir
$ cd testdir # move into testdir
$ pwd # Display the current directory
/home/yanakamu/testdir
$ ls
$ mv ../test ./ # Move the test file from the parent directory (..) to the current directory (.)
$ ls
test
$ cd .. # Move up to the parent directory
$ pwd # Display the current directory
/home/yanakamu
$ ls # Since "test" moved to "testdir", it has disappeared
ddbj_database bioinformatics.tar lcl test2 testdir
$ rmdir testdir # try to erase testdir, but it is not empty so it cannot be erased
rmdir: failed to remove `testdir'
$ rm testdir/test # Erase "test" file in "testdir"
$ ls testdir # List the contents of testdir; it is now empty
$ rmdir testdir # Now you can delete "testdir"
$ ls
ddbj_database bioinformatics.tar lcl test2
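
The session above empties a directory by hand before rmdir will remove it. As a minimal sketch with hypothetical names (project, project_backup, reads.txt), the -p and -r options handle nested and non-empty directories in one step:

```shell
# Work in a throwaway directory
workdir=$(mktemp -d)
cd "$workdir"

# -p creates any missing parent directories along the way
mkdir -p project/data/raw
echo "sample" > project/data/raw/reads.txt

# -r copies a directory together with everything inside it
cp -r project project_backup

# -r deletes a directory together with its contents; there is no undo
rm -r project

ls
```

Because rm -r deletes without asking and cannot be undone, it is worth checking the path with ls first.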

File archive

tar (Tape ARchive) archive command
zip / unzip file compression / expansion
gzip / gunzip GNU’s file compression / expansion command
$ cp test2 test3
$ ls
ddbj_database bioinformatics.tar lcl test2 test3
$ tar -cf test.tar test2 test3 # Bundle test2 and test3 into an archive named test.tar
$ ls
ddbj_database bioinformatics.tar lcl test.tar test2 test3
$ tar -tf test.tar
test2
test3
$ rm test2 test3 # remove test2 and test3
$ ls
ddbj_database bioinformatics.tar lcl test.tar
$ ls -l
total 74756
drwxr-xr-x 3 tafujisa yn-nig 4096 Dec 27 11:48 2012 ddbj_database
-rw-r--r-- 1 yanakamu yn-nig 76523520 Dec 15 17:48 2016 bioinformatics.tar
drwxr-xr-x 11 yanakamu yn-nig 4096 Dec 15 13:35 2016 lcl
-rw-r--r-- 1 yanakamu yn-nig 10240 Dec 16 13:48 2016 test.tar
$ gzip test.tar # Compress test.tar
$ ls -l
total 74744
drwxr-xr-x 3 tafujisa yn-nig 4096 Dec 27 11:48 2012 ddbj_database
-rw-r--r-- 1 yanakamu yn-nig 76523520 Dec 15 17:48 2016 bioinformatics.tar
drwxr-xr-x 11 yanakamu yn-nig 4096 Dec 15 13:35 2016 lcl
-rw-r--r-- 1 yanakamu yn-nig 192 Dec 16 13:48 2016 test.tar.gz
$ gunzip test.tar.gz
$ ls
ddbj_database bioinformatics.tar lcl test.tar
$ tar -xf test.tar # Retrieve the contents of test.tar (test2 and test3)
$ ls
ddbj_database bioinformatics.tar lcl test.tar test2 test3
$ cat test2 test3
ddbj_database
bioinformatics.tar
lcl
test
ddbj_database
bioinformatics.tar
lcl
test
$ rm test.tar test2 test3 # Erase the files
$ ls
ddbj_database bioinformatics.tar lcl # Only the files that were there at the start remain
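
The session above archives with tar and compresses with gzip as two separate steps. tar can also do both at once with the -z option. A minimal sketch, using hypothetical files a.txt and b.txt in a scratch directory:

```shell
# Work in a throwaway directory
workdir=$(mktemp -d)
cd "$workdir"
echo "first" > a.txt
echo "second" > b.txt

# -c create an archive, -z compress it with gzip, -f name of the archive file
tar -czf bundle.tar.gz a.txt b.txt

# -t lists the contents without extracting anything
listing=$(tar -tzf bundle.tar.gz)

# Remove the originals, then restore them from the archive with -x
rm a.txt b.txt
tar -xzf bundle.tar.gz
restored=$(cat a.txt b.txt)

echo "$listing"
```

The result is the same kind of .tar.gz file that "tar -cf" followed by "gzip" produces.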