8.2. About File Operations on FEFS/LLIO

8.2.1. When writing from one process to one file

When one process writes to one file, both FEFS and LLIO perform write processing.

[Figure: writing from one process to one file]

[Example code to write 1 MB of data from 1 process to 1 file]

int fd;
fd = open("output.txt", O_WRONLY | O_CREAT, 0644);  /* create the file if it does not exist */
write(fd, data, 1000000);                            /* write 1 MB (1,000,000 bytes) */
close(fd);

Attention

Although omitted from the code example, errors from calls such as open and write should always be handled.

8.2.2. When multiple processes write to a file

This section describes how multiple processes write to a single file, first using MPI-IO and then using the write system call.

[Figure: writing from multiple processes to one file]

8.2.2.1. How to Write Files Using MPI-IO

Both FEFS and LLIO allow multiple processes to write to a file using MPI-IO.

[Example code to write 1 MB of data from multiple processes to 1 newly created file]

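MPI_File fh;
/* All processes in MPI_COMM_WORLD open the file collectively. */
MPI_File_open(MPI_COMM_WORLD, "output.txt", MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
/* RANK is the rank of this process; widen before multiplying to avoid int overflow. */
MPI_Offset offset = (MPI_Offset)RANK * 1000000;
MPI_File_write_at(fh, offset, data, 1000000, MPI_CHAR, MPI_STATUS_IGNORE);
MPI_File_close(&fh);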

Attention

Although omitted from the code example, errors from calls such as MPI_File_open and MPI_File_write_at should always be handled.

The following example opens the file in append mode (MPI_MODE_APPEND).

[Example code for writing 1 MB of data from multiple processes to 1 file opened in append mode]

MPI_File fh;
/* MPI_MODE_APPEND sets the initial position of the file pointers to the end of the file. */
MPI_File_open(MPI_COMM_WORLD, "output.txt", MPI_MODE_APPEND | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
MPI_File_write(fh, data, 1000000, MPI_CHAR, MPI_STATUS_IGNORE);
MPI_File_close(&fh);

Attention

Although omitted from the code example, errors from calls such as MPI_File_open and MPI_File_write should always be handled.

8.2.2.2. How to Write Files Without MPI-IO

When files are written without MPI-IO, FEFS and LLIO handle the writes differently.

  • FEFS

    Because FEFS maintains consistency at the file system level, write operations can be performed without taking a lock.

[Example code to write 1 MB of data from multiple processes to 1 newly created file]

int fd;
fd = open("output.txt", O_WRONLY | O_CREAT, 0644);
/* RANK is the rank of this process; widen before multiplying to avoid int overflow. */
lseek(fd, (off_t)RANK * 1000000, SEEK_SET);
write(fd, data, 1000000);
close(fd);

Attention

Although omitted from the code example, errors from calls such as open and write should always be handled.

The following example opens the file in append mode (O_APPEND).

[Example code for writing 1 MB of data from multiple processes to 1 file opened in append mode]

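int fd;
/* The file must already exist; no mode argument is needed without O_CREAT. */
fd = open("output.txt", O_WRONLY | O_APPEND);
write(fd, data, 1000000);  /* O_APPEND places each write at the end of the file */
close(fd);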

Attention

Although omitted from the code example, errors from calls such as open and write should always be handled.

  • LLIO

    With LLIO, the file system does not control consistency. Therefore, either acquire a lock before writing, or write using Direct I/O. The locks available with LLIO are flock(2) and fcntl(2).

    The following example uses fcntl(2) to acquire a lock and then perform a write operation.

    To control when data in the cache area of second-layer storage is written to second-layer storage, see "Writing timing to second-layer storage".

    [Example code for writing 1 MB of data from multiple processes to 1 file]

    int fd;
    fd = open("output.txt", O_WRONLY | O_CREAT, 0644);
    lseek(fd, (off_t)RANK * 1000000, SEEK_SET);
    /* Lock only the 1 MB region that this process writes. */
    struct flock lk0 = {.l_type=F_WRLCK, .l_whence=SEEK_SET, .l_start=(off_t)RANK * 1000000, .l_len=1000000};
    fcntl(fd, F_SETLK, &lk0);
    write(fd, data, 1000000);
    /* Release the lock on the same region. */
    struct flock lk1 = {.l_type=F_UNLCK, .l_whence=SEEK_SET, .l_start=(off_t)RANK * 1000000, .l_len=1000000};
    fcntl(fd, F_SETLK, &lk1);
    close(fd);
    

    Attention

    Although omitted from the code example, errors from calls such as open, fcntl, and write should always be handled.

    The following example opens the file in append mode (O_APPEND) and uses flock(2) to acquire the lock.

    [Example code for writing 1 MB of data from multiple processes to 1 file opened in append mode]

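    int fd;
    /* The file must already exist; no mode argument is needed without O_CREAT. */
    fd = open("output.txt", O_WRONLY | O_APPEND);
    /* Under POSIX, O_APPEND writes go to the end of the file regardless of this offset. */
    lseek(fd, (off_t)RANK * 1000000, SEEK_SET);
    flock(fd, LOCK_EX);   /* exclusive lock on the whole file */
    write(fd, data, 1000000);
    flock(fd, LOCK_UN);
    close(fd);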
    

    Attention

    Although omitted from the code example, errors from calls such as open, flock, and write should always be handled.
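The lock, write, unlock sequence used with fcntl(2) can be sketched with error checks as follows. The helper name locked_write_at is hypothetical. As a deliberate difference from the example, the sketch acquires the lock with F_SETLKW, which waits for a conflicting lock to be released, whereas F_SETLK fails immediately. Whether waiting is appropriate on LLIO is an assumption that this guide does not confirm.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: lock the byte range [offset, offset + len),
 * write buf there, then unlock. Returns 0 on success, -1 on failure. */
int locked_write_at(int fd, off_t offset, const char *buf, size_t len)
{
    struct flock lk = {
        .l_type = F_WRLCK, .l_whence = SEEK_SET,
        .l_start = offset, .l_len = (off_t)len
    };
    /* F_SETLKW blocks until the lock is granted (assumption: waiting
     * is acceptable; the guide's example uses the non-blocking F_SETLK). */
    if (fcntl(fd, F_SETLKW, &lk) == -1)
        return -1;

    int rc = 0;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        rc = -1;
    } else {
        size_t done = 0;
        while (done < len) {           /* retry short writes */
            ssize_t n = write(fd, buf + done, len - done);
            if (n < 0) {
                rc = -1;
                break;
            }
            done += (size_t)n;
        }
    }

    lk.l_type = F_UNLCK;               /* release the same byte range */
    if (fcntl(fd, F_SETLK, &lk) == -1)
        rc = -1;
    return rc;
}
```

With the values from the example, each process would call locked_write_at(fd, (off_t)RANK * 1000000, data, 1000000). The lock is released even when the write fails, so other processes are not left waiting.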

The following code example shows a write operation using Direct I/O. The write alignment and write size must take into account the page cache size of the compute node, which is 64 KiB.

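[Example code for writing 1 MiB of data from multiple processes to 1 file]

size_t size = 1048576;  /* 1 MiB, a multiple of the 64 KiB page cache size */
/* posix_memalign requires a power-of-two alignment; 64 KiB is used here. */
posix_memalign((void **)&data, 65536, size);
int fd;
fd = open("output.txt", O_DIRECT | O_SYNC | O_WRONLY | O_CREAT, 0644);
lseek(fd, (off_t)RANK * size, SEEK_SET);  /* the offset is also a multiple of 64 KiB */
write(fd, data, size);
close(fd);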

Attention

Although omitted from the code example, errors from calls such as posix_memalign, open, and write should always be handled.
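The alignment arithmetic for Direct I/O can be sketched as follows. The sketch assumes that both the buffer address and the transfer size should be multiples of the 64 KiB page cache size; the text says only that this size must be considered, so the exact requirement is an assumption. The names PAGE_CACHE_SIZE, align_up, and alloc_direct_io_buffer are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define PAGE_CACHE_SIZE 65536   /* 64 KiB, the value stated in the text */

/* Round n up to the next multiple of PAGE_CACHE_SIZE. */
size_t align_up(size_t n)
{
    return (n + PAGE_CACHE_SIZE - 1) / PAGE_CACHE_SIZE * PAGE_CACHE_SIZE;
}

/* Hypothetical helper: allocate a zero-filled buffer that can hold
 * payload_len bytes, with its address and size aligned to 64 KiB.
 * The padded size is stored in *padded_len. Returns NULL on failure. */
void *alloc_direct_io_buffer(size_t payload_len, size_t *padded_len)
{
    void *buf = NULL;
    size_t size = align_up(payload_len);
    if (size == 0)
        size = PAGE_CACHE_SIZE;     /* never request a zero-byte buffer */

    /* posix_memalign needs a power-of-two alignment; 65536 qualifies. */
    if (posix_memalign(&buf, PAGE_CACHE_SIZE, size) != 0)
        return NULL;
    memset(buf, 0, size);
    *padded_len = size;
    return buf;
}
```

For a 1,000,000-byte payload, align_up returns 1,048,576 (16 pages of 64 KiB). Writing the padded size stores the zero padding in the file as well, so a caller that needs an exact length has to account for that separately.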

8.2.3. Rename File

When renaming a file, both FEFS and LLIO perform a rename operation.

However, in the cache area of second-layer storage or in the shared temporary area, if file_A is deleted or renamed on compute node_A and then recreated there, open(2) for file_A on compute node_B may fail, or may open the file that was deleted or renamed on compute node_A.

This behavior can be worked around as follows.

  • Run the ls command on compute node_B against the parent directory of file_A before opening file_A.

  • Wait 60 seconds after recreating file_A on compute node_A before opening file_A on compute node_B.
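The first workaround can be sketched in C for programs that open the file themselves. The sketch assumes that reading the parent directory with opendir(3) and readdir(3) has the same effect as running the ls command; this guide does not confirm that, so treat it as an assumption to verify on the target system. The helper names list_directory and open_after_listing are hypothetical.

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Read every entry of dir_path, as ls would (assumption: this has the
 * same effect as the ls command). Returns 0 on success, -1 on error. */
int list_directory(const char *dir_path)
{
    DIR *d = opendir(dir_path);
    if (d == NULL)
        return -1;

    errno = 0;
    while (readdir(d) != NULL)
        ;                           /* entries are read and discarded */
    int read_error = errno;         /* non-zero if readdir failed */

    if (closedir(d) != 0 || read_error != 0)
        return -1;
    return 0;
}

/* Hypothetical helper: list the parent directory, then open the file.
 * Returns a file descriptor on success, -1 on failure. */
int open_after_listing(const char *dir_path, const char *file_path, int flags)
{
    if (list_directory(dir_path) != 0)
        return -1;
    return open(file_path, flags);
}
```

On compute node_B, a call such as open_after_listing(parent_dir, path_to_file_A, O_RDONLY) would replace a plain open(2), where parent_dir and path_to_file_A are placeholders for the actual paths.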