Open
Description
- I am reporting a bug others will be able to reproduce and not asking a question or requesting a new feature.
System information including:
- OpenCoarrays Version: 2.10.1
- Fortran Compiler: gfortran 12.2.0
- C compiler used for building lib: gcc 12.2.0
- Installation method: ./install.sh
- All flags & options passed to the installer: -M /usr
- Output of
uname -a
:Linux stray 6.0.11-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 02 Dec 2022 17:25:31 +0000 x86_64 GNU/Linux
- MPI library being used: openmpi 4.1.4-4
- Machine architecture and number of physical cores: Intel x86, 4 core
- Version of CMake: 3.25.1 installed via package manager
To help us debug your issue please explain:
What you were trying to do (and why)
Use derived type with allocatable component in coarray, Example
module payload_m
implicit none
private
public :: payload_t, empty_payload
type :: payload_t
!! A raw buffer to facilitate data transfer between images
!!
!! Facilitates view of the data as either a string or raw bytes.
!! Typical usage will be either to
!! * produce a string representation of the data, and then parse that string to recover the original data
!! * use the `transfer` function to copy the raw bytes of the data
private
integer, allocatable, public :: payload_(:)
contains
private
procedure, public :: raw_payload
procedure, public :: string_payload
end type
interface payload_t
pure module function from_raw(payload) result(new_payload)
implicit none
integer, intent(in) :: payload(:)
type(payload_t) :: new_payload
end function
pure module function from_string(payload) result(new_payload)
implicit none
character(len=*), intent(in) :: payload
type(payload_t) :: new_payload
end function
module procedure empty_payload
end interface
interface
pure module function empty_payload()
implicit none
type(payload_t) :: empty_payload
end function
pure module function raw_payload(self)
implicit none
class(payload_t), intent(in) :: self
integer, allocatable :: raw_payload(:)
end function
pure module function string_payload(self)
implicit none
class(payload_t), intent(in) :: self
character(len=:), allocatable :: string_payload
end function
end interface
end module
submodule(payload_m) payload_s
implicit none
contains
module procedure from_raw
new_payload%payload_ = payload
end procedure
module procedure from_string
new_payload = payload_t([len(payload), transfer(payload,[integer::])])
end procedure
module procedure empty_payload
empty_payload%payload_ = [integer::]
end procedure
module procedure raw_payload
if (allocated(self%payload_)) then
raw_payload = self%payload_
else
raw_payload = [integer::]
end if
end procedure
module procedure string_payload
if (allocated(self%payload_)) then
if (size(self%payload_) > 0) then
allocate(character(len=self%payload_(1)) :: string_payload)
if (len(string_payload) > 0) &
string_payload = transfer(self%payload_(2:),string_payload)
else
allocate(character(len=0) :: string_payload)
end if
else
allocate(character(len=0) :: string_payload)
end if
end procedure
end submodule
program example
use payload_m, only: payload_t
character(len=*), parameter :: MESSAGE = "Hello, World!"
type(payload_t) :: mailbox[*]
if (this_image() == 1) then
mailbox = payload_t(MESSAGE)
end if
sync all
if (this_image() /= 1) then
mailbox = mailbox[1]
end if
print *, mailbox%string_payload(), " from image: ", this_image()
end program
What happened (include command output, screenshots, logs, etc.)
$ caf main.f90
$ cafrun -n 2 ./a.out
Hello, World! from image: 1
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7fb8c68519ff in ???
#1 0x55dfaf068c12 in ???
#2 0x55dfaf06a3b7 in ???
#3 0x55dfaf06a4ad in ???
#4 0x7fb8c683c28f in ???
#5 0x7fb8c683c349 in ???
#6 0x55dfaf068564 in _start
at ../sysdeps/x86_64/start.S:115
#7 0xffffffffffffffff in ???
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 0 on node stray exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Error: Command:
`/usr/bin/mpiexec -n 2 ./a.out`
failed to run.
with a more complex example (specifically the FEATS quadratic_solver example which can be found here in the examples folder) the error message is
[stray:40367] *** An error occurred in MPI_Get
[stray:40367] *** reported by process [2800353281,1]
[stray:40367] *** on win rdma window 5
[stray:40367] *** MPI_ERR_RMA_RANGE: invalid RMA address range
[stray:40367] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
[stray:40367] *** and potentially your MPI job)
Error: Command:
`/usr/bin/mpiexec -n 2 build/caf_3E83561A255DFE60/example/quadratic_solver`
failed to run.
What you expected to happen
The above example should print "Hello, World! from image: 1" for each image.
Step-by-step reproduction instructions to reproduce the error/bug
Simply compile the provided example as shown above. This was originally reported as follow up in #739, but was worthy of its own issue.
Metadata
Assignees
Labels
No labels
Activity