8. Migration from K

This chapter explains how to migrate from Supercomputer K.

8.1. Overview

This section describes the main differences between the language environment of Supercomputer K and that of Supercomputer Fugaku.

8.2. Language environment correspondence

Like Supercomputer K, Supercomputer Fugaku supports Fortran, C, C++, MPI, and OpenMP.

The supported language standards have changed as follows. Take note if your program is written in a way that is incompatible between the two.

[Supported language standards]

Section   K                                     Fugaku
Fortran   ISO/IEC 1539-1:2004 (Fortran 2003)    ISO/IEC 1539-1:2010 (Fortran 2008)
C         ISO/IEC 9899:1999 (C99 standard)      ISO/IEC 9899:2011 (C11 standard)
C++       ISO/IEC 14882:2014 (C++14 standard)   ISO/IEC 14882:2017 (C++17 standard)
MPI       2.2                                   3.1
OpenMP    3.1                                   4.0

See also

For details about Supercomputer Fugaku, refer to Language specification.
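As an illustration of the C row in the table above, a program can report which C standard level it was compiled under by inspecting the standard macro __STDC_VERSION__ (199901L for C99, 201112L for C11). The following is a minimal sketch only; the helper names c_standard_year and compiled_c_standard_year are hypothetical and are not part of any Fujitsu interface.

```c
/* Hypothetical helper: map a __STDC_VERSION__ value to the year of a
   C standard. Values newer than C11 are reported as 2011 as well. */
int c_standard_year(long version)
{
    if (version >= 201112L) return 2011;   /* C11 or later */
    if (version >= 199901L) return 1999;   /* C99 */
    return 1990;                           /* C90/C95 */
}

/* Report the standard level this translation unit was compiled under. */
int compiled_c_standard_year(void)
{
#ifdef __STDC_VERSION__
    return c_standard_year(__STDC_VERSION__);
#else
    return c_standard_year(0L);            /* macro may be absent in C89 mode */
#endif
}
```

Calling compiled_c_standard_year() from a test program built with and without an explicit -std option is one way to see which level is actually in effect.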

8.3. Version check method

This section explains how to check the version.

To check the version of the language software, specify the -V option when compiling.

The following examples show that the version is 4.8.1.

  • An example of Fortran

[_LNlogin]$ mpifrtpx -V program.f
frtpx: Fujitsu Fortran Compiler 4.8.1 tcsds-1.2.36
jwd_fortpx: Fujitsu Fortran Compiler 4.8.1 (Jun 30 2022 15:02:42)

  • An example of C/C++

[_LNlogin]$ mpifccpx -V main.c
fccpx: Fujitsu C/C++ Compiler 4.8.1 tcsds-1.2.36
simulating gcc version 6.1
ccpcompx: Fujitsu C/C++ Compiler 4.8.1 (Jun 16 2022 14:47:00)

8.4. Incompatible items

This section describes the main incompatibilities in compile options and behavior.

8.4.1. Fortran

For details on each feature described here, see the “Fortran User’s Guide”.

  1. The -Kauto, -Ktemparraystack, and -Kautoobjstack options are now enabled by default

    Stack usage may increase compared with before.

  2. Return value of IOSTAT specifier at execution of non-stop output statement

    The value returned to the IOSTAT specifier when the length of a Fortran record determined by a format specification exceeds the logical record length in a non-persistent output statement has been changed. “134” is returned to the IOSTAT specifier. Previously, “-2” was returned.

    A program that branches on the value of the IOSTAT specifier needs to be changed.

  3. Specification change of the runtime information output function
    Some information that could previously be acquired with the runtime information output function (part of the cost information, parallelization information, input/output information, and hardware monitor information) can no longer be acquired.
    In addition, information can no longer be obtained by range specification (start_collection / stop_collection). If the range specification (start_collection / stop_collection) and its definition (fjcoll_lib) exist in the source program, an error occurs during compilation.

    To obtain parallelization information or hardware monitor information, use a CPU performance analysis tool.

  4. Adoption of the LLVM OpenMP library
    The parallelization function uses the LLVM OpenMP library by default.
    To use the original Fujitsu OpenMP library, specify the -Nfjomplib option.
  5. Change in the values of a macro and a named constant due to support for the OpenMP API version 4.0 specification
    When the compile option -Kopenmp is specified, -D_OPENMP=201307 is enabled.
    The value of the named constant openmp_version is 201307.

    If your program uses the value of the macro _OPENMP or the named constant openmp_version, modify your program to correspond to the changed value.

  6. Removal of the compile option -K{openmp_tls|openmp_notls}

    If the -K{openmp_tls|openmp_notls} option is specified, the following warning message is output at compile time.

    Alart : -Kopenmp_[no]tls  option is abolished.
    
  7. Removal of the -Kvppocl option

    When the -Kvppocl and -Kocl options were both enabled, the optimization directive NOVREC was treated the same as the optimization directive NORECURRENCE. Because the -Kvppocl option has been removed, NOVREC can no longer be used as an equivalent of NORECURRENCE.

  8. Change in the options available on the compiler directive line “!options”
    The -O[1-3] options can be specified as compiler options.
    To enable the compile options that are specified on the compiler directive line “!options”, take the following step.
    • Move the procedure that contains the !options line into a separate file, and compile that file with the options specified on the !options line.

  9. Removal of the -Nuse_rodata option and specification change
    Executing a program that assigns to a constant other than a constant argument may cause abnormal termination at runtime.
    Change the program so that it does not assign to constants.
  10. Specification change of the -Kswp and -Kswp_strong options

    If the -Kswp, -O2, -O3, or -Kfast option is specified after the -Kswp_strong option, the -Kswp_strong option is disabled and the -Kswp option is enabled.

    To enable the -Kswp_strong option, specify it after the -Kswp, -O2, -O3, or -Kfast option.

  11. Removal of the -Kdalign option

    The compile option -Kdalign is disabled, and the following warning message is output.

    Alart : Sub option dalign, specified to -K is incorrect.
    

    If you specified -Kdalign to align the members of common blocks, specify -Kalign_commons instead.

  12. -Kprefetch_strong is now the default
    By default, the L1 prefetch instructions generated by -Kprefetch_sequential, -Kprefetch_stride, or -Kprefetch_indirect operate as strong prefetches.
    To generate prefetch instructions that operate as weak prefetches, specify -Kprefetch_nostrong.
  13. -Khpctag is required to enable -Kprefetch_nostrong and -Kprefetch_nostrong_L2

    To enable -Kprefetch_nostrong and -Kprefetch_nostrong_L2, -Khpctag is required. -Khpctag is enabled by default.

  14. Behavior change when an unrecognized option is specified
    If an unrecognized option is specified, a warning message is output and the option is ignored.
    Previously, a warning message was output and the option was passed to the linker.
    To pass an option to the linker, use the -Wl option.
  15. Change in how the -NRtrap option is specified

    To enable the -NRtrap option, it must be specified at both compilation and linking.

8.4.2. C language

For details on each feature described here, see the “C User’s Guide”.

  1. The -Klib option is now enabled by default

    When the option is omitted, -Knolib is applied if the -O0 option is enabled, and -Klib is applied if -O1 or higher is enabled.

    If -Klib is enabled and the user defines a function with the same name as a standard library function, the result intended by the user may not be obtained.

  2. Specification change of the runtime information output function

    Some information that could previously be acquired with the runtime information output function (part of the cost information, parallelization information, input/output information, and hardware monitor information) can no longer be acquired. In addition, information can no longer be obtained by range specification (start_collection / stop_collection).

    If the range specification (start_collection / stop_collection) and its definition (fjcoll_lib) exist in the source program, an error occurs during compilation. To obtain parallelization information or hardware monitor information, use a CPU performance analysis tool.

  3. Adoption of the LLVM OpenMP library
    The parallelization function uses the LLVM OpenMP library by default.
    To use the original Fujitsu OpenMP library, specify the -Nfjomplib option.
  4. Removal of the compile option -noansi

    To specify the language standard level, use the -std option.

  5. Change in the default of the compile option -f{signed-char|unsigned-char}

    When the -f{signed-char|unsigned-char} option is omitted, the -funsigned-char option is applied and the predefined macro __SIGNED_CHARS__ is not defined.

    If the source program declares variables simply as char, their signedness may change, and the program may behave differently from before.

  6. Removal of the compile option -X (the language specification mode is GNU compatible mode only)

    The GNU C extensions are always enabled.

  7. Change in the GNU C compatible version

    The value of each macro is as follows.

    Macro                 Value
    __GNUC__              6
    __GNUC_MINOR__        1
    __GNUC_PATCHLEVEL__   0

    If each macro value is used in the program, modify the program to correspond to the above values.

  8. Change in the default of the compile option -std

    If the -std option is omitted, -std=gnu11 is applied. To specify the language standard level, use the -std option.

  9. Definition change of the macros __STRICT_ANSI__, linux, unix, and __STDC_VERSION__

    Macro              Change
    __STRICT_ANSI__    Defined if the -ansi option or the -std={c89|c99} option is enabled.
    linux              Defined if the -std={gnu89|gnu99} option is enabled.
    unix               Defined if the -std={gnu89|gnu99} option is enabled.
    __STDC_VERSION__   Not defined if the -std={c89|gnu89} option is enabled.

    If each macro is used in the program, modify the program to correspond to the above definition.

  10. Behavior change when the compile options -Dname and -Uname are specified

    If the same name is specified for both the -D option and the -U option, the last specification takes effect.

  11. Change in the default of the compile option -N{line|noline}

    When the -N{line|noline} option is omitted, the -Nline option is applied.

  12. Removal of the compile option -K{openmp_tls|openmp_notls}

    If the -K{openmp_tls|openmp_notls} option is specified, the following warning message is output at compile time.

    Alart : -Kopenmp_[no]tls  option is abolished.
    
  13. Specification change of the -Kswp and -Kswp_strong options

    If the -Kswp, -O2, -O3, or -Kfast option is specified after the -Kswp_strong option, the -Kswp_strong option is disabled and the -Kswp option is enabled.

    To enable the -Kswp_strong option, specify it after the -Kswp, -O2, -O3, or -Kfast option.

  14. Removal of the -Kdalign option

    The compile option -Kdalign is disabled, and the following warning message is output.

    Alart : Sub option dalign, specified to -K is incorrect.
    
  15. Specification changes for the built-in functions __sync_fetch_and_nand and __sync_nand_and_fetch

    The behavior has changed as follows.

    Built-in function                                         Operation
    type __sync_fetch_and_nand (type *ptr, type value, …)     tmp = *ptr;
                                                              *ptr = ~(tmp & value);
                                                              return tmp;
    type __sync_nand_and_fetch (type *ptr, type value, …)     *ptr = ~(*ptr & value);
                                                              return *ptr;
  16. -Kprefetch_strong is now the default

    By default, the L1 prefetch instructions generated by -Kprefetch_sequential, -Kprefetch_stride, or -Kprefetch_indirect operate as strong prefetches.

    To generate prefetch instructions that operate as weak prefetches, specify -Kprefetch_nostrong.

  17. -Khpctag is required to enable -Kprefetch_nostrong and -Kprefetch_nostrong_L2

    To enable -Kprefetch_nostrong and -Kprefetch_nostrong_L2, -Khpctag is required. -Khpctag is enabled by default.

  18. Behavior change when an unrecognized option is specified
    If an unrecognized option is specified, a warning message is output and the option is ignored.
    Previously, a warning message was output and the option was passed to the linker.
    To pass an option to the linker, use the -Wl option.
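Because plain char becomes unsigned when -f{signed-char|unsigned-char} is omitted (item 5 above), code that widens a char to int can change behavior. The following is a minimal sketch of one way to make such code independent of the default by naming the signedness explicitly. The function names are hypothetical, and converting an out-of-range value to signed char is implementation-defined in C, although it wraps on common two's-complement targets.

```c
#include <limits.h>

/* Returns 1 if plain char is signed in this build, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widen a byte as a signed quantity regardless of the plain-char default. */
int byte_as_signed(char c)
{
    return (int)(signed char)c;
}

/* Widen a byte as an unsigned quantity regardless of the plain-char default. */
int byte_as_unsigned(char c)
{
    return (int)(unsigned char)c;
}
```

Code that previously relied on plain char being signed can either use explicit conversions of this kind or be compiled with -fsigned-char.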

8.4.3. C++

For details on each feature described here, see the “C++ User’s Guide”.

  1. The -Klib option is now enabled by default

    When the option is omitted, -Knolib is applied if the -O0 option is enabled, and -Klib is applied if -O1 or higher is enabled.

    If -Klib is enabled and the user defines a function with the same name as a standard library function, the result intended by the user may not be obtained.

  2. Specification change of the runtime information output function

    Some information that could previously be acquired with the runtime information output function (part of the cost information, parallelization information, input/output information, and hardware monitor information) can no longer be acquired. In addition, information can no longer be obtained by range specification (start_collection / stop_collection).

    If the range specification (start_collection / stop_collection) and its definition (fjcoll_lib) exist in the source program, an error occurs during compilation. To obtain parallelization information or hardware monitor information, use a CPU performance analysis tool.

  3. Adoption of the LLVM OpenMP library
    The parallelization function uses the LLVM OpenMP library by default.
    To use the original Fujitsu OpenMP library, specify the -Nfjomplib option.
  4. Change of the standard template library (STL) for the C++03 and C++11 standards

    For the C++03 and C++11 standards, libc++ is used as the STL. On K, STLport was used.

  5. Change in the default of the compile option -f{signed-char|unsigned-char}

    When the -f{signed-char|unsigned-char} option is omitted, the -funsigned-char option is applied and the predefined macro __SIGNED_CHARS__ is not defined.

    If the source program declares variables simply as char, their signedness may change, and the program may behave differently from before.

  6. Removal of the compile option -X (the language specification mode is GNU compatible mode only)

    The GNU C++ extensions are always enabled.

  7. Change in the GNU C++ compatible version

    The value of each macro is as follows.

    Macro                 Value
    __GNUC__              6
    __GNUC_MINOR__        1
    __GNUC_PATCHLEVEL__   0
    __GNUG__              6

  8. Change in the default of the compile option -std

    If the -std option is omitted, -std=gnu++14 is applied.

  9. Definition change of the macros __STRICT_ANSI__, linux, unix, and __cplusplus

    Macro             Change
    __STRICT_ANSI__   Defined if the -std={c++98|c++03|c++11|c++14} option is enabled.
    linux             Defined if the -std={gnu++98|gnu++03|gnu++11|gnu++14} option is enabled.
    unix              Defined if the -std={gnu++98|gnu++03|gnu++11|gnu++14} option is enabled.
    __cplusplus       The value is 199711L if the -std={c++98|c++03|gnu++98|gnu++03} option is enabled.

  10. Behavior change when the compile options -Dname and -Uname are specified at the same time

    If the same name is specified for both the -D option and the -U option, the last specification takes effect.

  11. Change in the default of the compile option -N{line|noline}

    When the -N{line|noline} option is omitted, the -Nline option is applied.

  12. Removal of the compile option -K{openmp_tls|openmp_notls}

    If the -K{openmp_tls|openmp_notls} option is specified, the following warning message is output at compile time.

    Alart : -Kopenmp_[no]tls  option is abolished.
    
  13. Specification change of the -Kswp and -Kswp_strong options

    If the -Kswp, -O2, -O3, or -Kfast option is specified after the -Kswp_strong option, the -Kswp_strong option is disabled and the -Kswp option is enabled.

    To enable the -Kswp_strong option, specify it after the -Kswp, -O2, -O3, or -Kfast option.

  14. Removal of the -Kdalign option

    The compile option -Kdalign is disabled, and the following warning message is output.

    Alart : Sub option dalign, specified to -K is incorrect.
    
  15. Specification changes for the built-in functions __sync_fetch_and_nand and __sync_nand_and_fetch

    The behavior has changed as follows.

    Built-in function                                         Operation
    type __sync_fetch_and_nand (type *ptr, type value, …)     tmp = *ptr;
                                                              *ptr = ~(tmp & value);
                                                              return tmp;
    type __sync_nand_and_fetch (type *ptr, type value, …)     *ptr = ~(*ptr & value);
                                                              return *ptr;
  16. -Kprefetch_strong is now the default

    By default, the L1 prefetch instructions generated by -Kprefetch_sequential, -Kprefetch_stride, or -Kprefetch_indirect operate as strong prefetches.

    To generate prefetch instructions that operate as weak prefetches, specify -Kprefetch_nostrong.

  17. -Khpctag is required to enable -Kprefetch_nostrong and -Kprefetch_nostrong_L2

    To enable -Kprefetch_nostrong and -Kprefetch_nostrong_L2, -Khpctag is required. -Khpctag is enabled by default.

  18. Behavior change when an unrecognized option is specified
    If an unrecognized option is specified, a warning message is output and the option is ignored.
    Previously, a warning message was output and the option was passed to the linker.
    To pass an option to the linker, use the -Wl option.
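The operations listed for the built-in functions in item 15 can be checked with a small experiment. The sketch below is written in C, where the built-ins take the same form, and assumes a GCC-compatible compiler that provides the __sync built-ins with the semantics listed above. The wrapper names are hypothetical.

```c
/* Stores ~(*ptr & value) into *ptr and returns the value *ptr held before. */
unsigned fetch_then_nand(unsigned *ptr, unsigned value)
{
    return __sync_fetch_and_nand(ptr, value);
}

/* Stores ~(*ptr & value) into *ptr and returns the newly stored value. */
unsigned nand_then_fetch(unsigned *ptr, unsigned value)
{
    return __sync_nand_and_fetch(ptr, value);
}
```

For example, starting from 0xF0 with an operand of 0x3C, fetch_then_nand returns 0xF0 and leaves ~(0xF0 & 0x3C) in the variable, while nand_then_fetch returns that new value directly. A program that depended on a different result needs to be revised.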

8.4.4. MPI

For details on each feature described here, see the “MPI User’s Guide”.

  1. Removal of the Hasty Rendezvous communication function

    The MCA parameter pml_ob1_use_hasty_rendezvous cannot be used.

  2. Removal of the MCA parameter dpm_ple_socket_timeout

    It is not possible to set the communication wait time between two MPI process groups that do not share a communicator such as dynamic processes.

  3. Removal of the extended RDMA interface

    Revise the program to use the uTofu interface.

  4. Change of the rank query interface

    The functions have been consolidated into the following routines so that information on the node to which the child process is assigned can be obtained and the conventional routine functions can be used. For the specification, refer to “Rank Query Interface” in the manual “MPI User’s Guide”.

    • FJMPI_TOPOLOGY_GET_COORDS

    • FJMPI_TOPOLOGY_GET_RANKS

  5. C language MPI functions are not hooked by using Fortran profiling interface

    When using the profiling interface in a Fortran program, you cannot hook using the C language interface. When using the profiling interface in Fortran programs, hook using the Fortran interface.

  6. Change of the MCA parameter orte_abort_print_stack to opal_abort_print_stack

    The MCA parameter orte_abort_print_stack for outputting stack trace information cannot be used. Change the MCA parameter to opal_abort_print_stack.

  7. Name change of the MCA parameter dpm_ple_no_establish_connection

    Change the MCA parameter to mpi_no_establish_communication.

  8. Change in the algorithm of reduction operations used in collective communication

    The order of the reduction operation steps differs between execution on K and execution on Fugaku.

    For this reason, precision errors may occur in MPI_SUM reduction operations on floating-point or complex data, and the calculation results may differ.

  9. Change in the default value of the MCA parameter coll_tbi_use_on_max_min

    The default value of the MCA parameter coll_tbi_use_on_max_min has changed from 0 to 1.

    As a result of this change, barrier communication is applied, and the calculation result may differ from the result without barrier communication.

    • If the values used in the calculation include NaN, note the following.
      • When barrier communication is applied, the result is the comparison of the values other than NaN. However, if all values are NaN, the result is one of the NaN values.

      • If barrier communication is not applied, the result is either the result of comparing the non-NaN values or one of the NaN values, depending on the execution conditions.

    • If the values used in the calculation include both +0.0 and -0.0, note the following.
      • When barrier communication is applied, the comparison takes the sign of 0 into account (+0.0 > -0.0).

      • If barrier communication is not applied, the sign of 0 is not considered. Which of +0.0 and -0.0 is selected in the magnitude comparison depends on the conditions at runtime.

  10. Removal of the MCA parameters coll_tuned_use_6d_algorithm and coll_tuned_scatterv_use_linear_sync

    If you specify a removed MCA parameter, the specification is ignored. For this reason, the intended algorithm may not be selected and the expected performance may not be achieved.

    As an alternative to these removed MCA parameters, use the algorithm selection function for blocking collective communication. For details on this function, refer to “Tuning by Algorithm Selection” in the “MPI User’s Guide”.

  11. Name change of the MCA parameters for the communication timeout setting function

    The MCA parameter that specifies how long to wait for communication before timing out is opal_progress_timeout.

    The MCA parameter that specifies the delay before the communication timeout setting function terminates the program is opal_abort_delay.

  12. Removal of the MCA parameters mpi_deadlock_timeout and mpi_deadlock_timeout_delay

    The MCA parameters mpi_deadlock_timeout and mpi_deadlock_timeout_delay cannot be used. If these MCA parameters are specified, the communication timeout setting function does not work.

  13. Change the default value of the maximum number of processes that can communicate in high-speed communication mode

    The default value of the maximum number of processes that can communicate in the high-speed communication mode is 256. On K, it was 1024.

    For an MPI program in which the number of inter-node communication partner processes per process exceeds 256, the communication performance may decrease. If there is enough available memory when executing the MPI program, specify a value greater than 256 for the MCA parameter common_tofu_max_fastmode_procs and execute the MPI program.
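Item 8 above notes that a different reduction order can change MPI_SUM results on floating-point data. The underlying reason is that floating-point addition is not associative. The following minimal sketch shows this without MPI by summing values strictly in the order given; the function name and the sample values are illustrative assumptions and are not taken from the MPI library.

```c
#include <stddef.h>

/* Sum the values strictly left to right, as one possible reduction order. */
double sum_in_order(const double *values, size_t count)
{
    double total = 0.0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

With double-precision arithmetic, summing {1e20, 1.0, -1e20} in that order gives 0.0 because the 1.0 is absorbed by the large intermediate value, whereas summing {1e20, -1e20, 1.0} gives 1.0. Programs that compare results between K and Fugaku may therefore need a tolerance rather than an exact comparison.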

8.4.5. Tool

The incompatible items related to the tools are as follows.

  1. Removal of the interactive debugger

    The GUI-based interactive debugger cannot be used.

  2. Removal of the debugger engine (fdb)

    Use gdb as the debugger.

  3. Removal of the tracer

    The tracer cannot be used.

  4. Removal of Tofu PA

    Tofu PA cannot be used.

  5. Removal of Instant Performance Profiler information (GUI format)

    Use the text format.

  6. Removal of Advanced Performance Profiler information (GUI format)

    Use the text format, CSV format, XML format, or the CPU performance analysis report.

  7. The precision PA visualization function (Excel format) of Advanced Performance Profiler has been replaced by the CPU performance analysis report

    The “Advanced Performance Profiler precision PA visualization function (Excel format)” has been replaced by the “CPU performance analysis report”.

    Notes on the CPU performance analysis report are as follows.

    • Measurement with Advanced Performance Profiler and file output must be performed 5, 11, or 17 times.

    • The Excel format report file is provided with English headings only.

    • The CPU performance analysis report (Excel format) outputs information per CMG. Information for up to 12 threads is output to one Excel file. Therefore, if the number of threads in one process is less than 12, the information of another process in the same CMG may be displayed together.

    • Because the CPU has changed, the event information that can be obtained differs. Accordingly, the information that is calculated has changed.

    • The following precision PA items are no longer output.
      • SIMD calculation related information

      • Concatenation shift instruction rate

      • Permutation instruction rate

      • L2 throughput

      • MINOPS

      • Integer operation

      • Load balance and instruction balance

      • Rate on program execution time

      • Number of passing in measurement section and average execution time

  8. Core binding when using Advanced Performance Profiler

    When using Advanced Performance Profiler, core binding must now be specified.

  9. Removal of the start_collection and stop_collection functions

    If the source code calls the start_collection and stop_collection functions, change them to the fapp_start and fapp_stop functions.

  10. Behavior change when the -H option and the -I{cpupa|nocpupa} option are specified together for the fipp and fapp commands

    If the -H option and the -Inocpupa option are specified together for the fipp or fapp command, the -I{cpupa|nocpupa} option always takes effect, regardless of the order in which they are specified. Also, if the -H option is specified without the -I{cpupa|nocpupa} option, the -Icpupa option is enabled.

    To enable the -H option, do not specify the -Inocpupa option.
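Item 9 above replaces start_collection / stop_collection with fapp_start / fapp_stop. The following is a minimal sketch of what the change might look like in C. The header path fj_tool/fapp.h and the three-argument form (region name, number, level) are assumptions about the Advanced Performance Profiler interface and should be checked against the profiler manual. The macro USE_FAPP and the function profiled_sum are hypothetical, and the no-op stand-ins exist only so that the sketch also builds where the profiler is not installed.

```c
#ifdef USE_FAPP
#include "fj_tool/fapp.h"   /* assumed header location; verify in the manual */
#else
/* Hypothetical no-op stand-ins with the assumed signature. */
static void fapp_start(const char *name, int number, int level)
{
    (void)name; (void)number; (void)level;
}
static void fapp_stop(const char *name, int number, int level)
{
    (void)name; (void)number; (void)level;
}
#endif

/* Sum 1..n inside a named measurement region. */
long profiled_sum(long n)
{
    long total = 0;
    fapp_start("sum_region", 1, 0);   /* takes the place of start_collection */
    for (long i = 1; i <= n; ++i) {
        total += i;
    }
    fapp_stop("sum_region", 1, 0);    /* takes the place of stop_collection */
    return total;
}
```

Keeping the start and stop calls paired with the same region name and number is the main point of the sketch; the arguments shown are placeholders.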