Fortran MPI allgatherv with derived type for 2D array

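I need help with this Fortran MPI problem. I am trying to gather data from different columns of a 2D array. The difficulty is that only part of the data in each column is used, and the number of columns assigned to each process is not the same. All processes start with an equivalent global view of the data, each process performs work on specific columns, and finally the processes exchange information so that they all share a common view again. The problem is similar to "MPI partition and gather 2D array in Fortran" and "Sending 2D arrays in Fortran with MPI_Gather".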

As an example, consider data(8,4) distributed over 3 MPI processes:

--------------------- 
| a1 | b1 | c1 | d1 | 
| a2 | b2 | c2 | d2 | 
| a3 | b3 | c3 | d3 | 
| a4 | b4 | c4 | d4 | 
| a5 | b5 | c5 | d5 | 
| a6 | b6 | c6 | d6 | 
| a7 | b7 | c7 | d7 | 
| a8 | b8 | c8 | d8 | 
---------------------

Process 1 works on 2 columns, process 2 gets 1 column, and process 3 gets 1 column:

----------- ------ ------ 
| a1 | b1 | | c1 | | d1 | 
| a2 | b2 | | c2 | | d2 | 
| a3 | b3 | | c3 | | d3 | 
| a4 | b4 | | c4 | | d4 | 
| a5 | b5 | | c5 | | d5 | 
| a6 | b6 | | c6 | | d6 | 
| a7 | b7 | | c7 | | d7 | 
| a8 | b8 | | c8 | | d8 | 
----------- ------ ------

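In the real problem, the actual size is data(200000,59). This is a pre-allocated chunk of memory, and I use only part of each column (always starting at index 1). For example, suppose I need only the first 3 values of each column: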

----------- ------ ------ 
| a1 | b1 | | c1 | | d1 | 
| a2 | b2 | | c2 | | d2 | 
| a3 | b3 | | c3 | | d3 | 
| == | == | | == | | == | 
| a4 | b4 | | c4 | | d4 | 
| a5 | b5 | | c5 | | d5 | 
| a6 | b6 | | c6 | | d6 | 
| a7 | b7 | | c7 | | d7 | 
| a8 | b8 | | c8 | | d8 | 
----------- ------ ------

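I am trying to create send and receive datatypes that can be used to accomplish this. So far I have been using MPI_TYPE_VECTOR:

MPI_TYPE_VECTOR(COUNT, BLOCKLENGTH, STRIDE, OLDTYPE, NEWTYPE, IERROR)

For this example I use MPI_TYPE_VECTOR(1, 3, 8, MPI_DOUBLE, newtype, ierr). This should let each process send the minimum amount of information. With that, I thought I should be able to exchange the information with ALLGATHERV.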
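The following is a minimal sketch (not from the original post) for inspecting the type built above. With COUNT = 1 the stride separates no blocks, so the extent of the type is expected to equal its size (3 doubles, i.e. 24 bytes with 8-byte double precision) rather than a full column of 8 doubles. MPI_ALLGATHERV measures DISPLS in units of the extent of RECVTYPE, so this matters for where received columns land. The program name and the use of MPI_DOUBLE_PRECISION (the Fortran name for the datatype) are choices made for this illustration.

```fortran
program vector_extent
  use mpi
  implicit none
  integer, parameter :: num_rows = 8, used_rows = 3
  integer :: ierr, rank, newtype, nbytes
  integer(kind=MPI_ADDRESS_KIND) :: lb, extent

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  ! Same arguments as in the question: one block of used_rows doubles,
  ! with a stride of num_rows elements between (non-existent) further blocks.
  call MPI_TYPE_VECTOR(1, used_rows, num_rows, MPI_DOUBLE_PRECISION, &
                       newtype, ierr)
  call MPI_TYPE_COMMIT(newtype, ierr)

  ! Size = bytes of actual data; extent = span used to place consecutive items.
  call MPI_TYPE_SIZE(newtype, nbytes, ierr)
  call MPI_TYPE_GET_EXTENT(newtype, lb, extent, ierr)

  if (rank == 0) then
    write(*,*) 'size in bytes   =', nbytes
    write(*,*) 'extent in bytes =', extent
  end if

  call MPI_TYPE_FREE(newtype, ierr)
  call MPI_FINALIZE(ierr)
end program vector_extent
```

If the two numbers agree, a displacement of "one column" expressed in units of newtype advances by only used_rows elements, not by num_rows.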

MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT, DISPLS, RECVTYPE, COMM, IERROR)

where I use MPI_ALLGATHERV(data(1,my_first_col), num_cols_to_be_sent, newtype, data, RECVCOUNT[], DISPLS[], newtype, COMM, IERROR)

As far as I can tell, this is the information that each process needs to send.

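All the examples I have seen either use entire columns of data, or have displacements that are naturally a multiple of the needed subarray. I cannot get the data to unpack into the correct columns. Shouldn't this be possible, since the receiving side knows the size/extent of the type? I admit I am very confused about the whole extent concept. Any help would be appreciated. My real code is at work, but here is a quick recreation for viewing and comment (it may not compile; it was put together quickly).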

MODULE PARALLEL 
    INTEGER iproc, nproc, rank, ierr 
    INTEGER mylow, myhigh, mysize, ichunk, irem 
    INTEGER, ALLOCATABLE :: isize(:), idisp(:), ilow(:), ihigh(:) 
    DOUBLE PRECISION, ALLOCATABLE :: glob_val(:,:) 
    INTEGER newtype 
    END MODULE 


    PROGRAM MAIN 
    USE PARALLEL 
    IMPLICIT NONE 
    INCLUDE 'mpif.h' 

c **temp variables 
    integer i, j 
    integer num_rows,num_cols 
    integer used_rows 

c ----setup MPI---- 
    call MPI_INIT(ierr) 
    call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr) 
    call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr) 
    iproc = rank+1 !rank is base 0, rest of fortran base 1 

c ----setup initial data  
    num_rows=20 !contiguous in memory over rows (ie single column) 
    num_cols=11 !noncontiguous memory between different columns 

    ALLOCATE (isize(nproc)) 
    ALLOCATE (idisp(nproc)) 
    ALLOCATE (ilow(nproc)) 
    ALLOCATE (ihigh(nproc))  
    ALLOCATE (glob_val(num_rows,num_cols)) 

    glob_val = 1.0*iproc !set all glob values to process id 
    do i=1,num_cols 
    do j=1,used_rows 
     glob_val(j,i) = iproc+.01*j !add reference index to used data 
    end do 
    end do 

c ---setup exchange information 
    ichunk = num_cols/nproc 
    irem = num_cols -(ichunk*nproc) 
    mysize=ichunk 
    if(iproc.le.irem) mysize=mysize+1 

    mylow=0 
    myhigh=0 

    do i=1,nproc !establish global understanding of processes 
    mylow=myhigh+1 
    myhigh=mylow+ichunk 
    if(i.le.irem) myhigh=myhigh+1 

    isize(i)=myhigh-mylow+1 
    idisp(i)=(mylow-1) !based on receiving type size/extent 
    ilow(i)=mylow 
    ihigh(i)=myhigh 
    end do 
    mylow=ilow(iproc) 
    myhigh=ihigh(iproc) 

    call MPI_TYPE_VECTOR(1,used_rows,num_rows,MPI_DOUBLE, 
&      newtype,ierr) 
    call MPI_TYPE_COMMIT(newtype,ierr) 

c --- perform exchange based on 'newtype'  
     !MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE, 
     !    RECVBUF, RECVCOUNT, DISPLS, RECVTYPE, 
     !    COMM, IERROR) 
    call MPI_ALLGATHERV(glob_val(1,mylow),mysize,newtype, 
&     glob_val,isize,idisp,newtype, 
&     MPI_COMM_WORLD,ierr)  

c ---print out global results of process 2 
    if(iproc.eq.2) then  
    do i=1,num_rows 
     write(*,*) (glob_val(i,j),j=1,num_cols) 
    end do 
    end if 

    END program

2017-03-20 Curtis

Isn't the data "double precision"? Why create a custom MPI type? – Ross

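The data is double precision. The custom type is meant to create a single type that contains only the needed part of the array. In the example, 3 of the 8 values in an array column would become a single entity. Each process would then send multiple columns. – Curtis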

OK, I got this working as follows:

1) Use myhigh = mylow + ichunk - 1, not myhigh = mylow + ichunk.

2) used_rows must be set before the assignment loop.

3) Define the actual buffers more explicitly; try:

call MPI_ALLGATHERV(glob_val(:,mylow:myhigh), mysize, newtype, & 
        glob_val(1:used_rows,:), isize, idisp, newtype, & 
        MPI_COMM_WORLD, ierr)
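The first fix in the list above can be checked without MPI. The sketch below replays the corrected partition loop for the 4-column, 3-process example from the question; the program name and the fixed nproc are assumptions made for illustration. With the original myhigh = mylow + ichunk, the same loop would assign columns 1-3, 4-5 and 6-7, which runs past the 4 available columns.

```fortran
program column_partition
  implicit none
  integer, parameter :: num_cols = 4, nproc = 3
  integer :: ilow(nproc), ihigh(nproc), isize(nproc), idisp(nproc)
  integer :: i, ichunk, irem, mylow, myhigh

  ichunk = num_cols / nproc            ! columns every process gets
  irem   = num_cols - ichunk*nproc     ! leftover columns, one each to the first irem

  myhigh = 0
  do i = 1, nproc
    mylow  = myhigh + 1
    myhigh = mylow + ichunk - 1        ! corrected upper bound
    if (i <= irem) myhigh = myhigh + 1
    ilow(i)  = mylow
    ihigh(i) = myhigh
    isize(i) = myhigh - mylow + 1      ! number of columns owned by process i
    idisp(i) = mylow - 1               ! zero-based column offset
    write(*,'(a,i0,a,i0,a,i0,a,i0,a,i0)') 'process ', i, ': columns ', &
         mylow, '-', myhigh, ', count ', isize(i), ', displacement ', idisp(i)
  end do

  ! Sanity check: the ranges must cover every column exactly once.
  if (sum(isize) /= num_cols .or. ihigh(nproc) /= num_cols) then
    error stop 'partition does not cover all columns'
  end if
end program column_partition
```

For this input the loop yields columns 1-2 for process 1, column 3 for process 2 and column 4 for process 3, with counts 2, 1, 1 and displacements 0, 2, 3.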

Full code, using gfortran and openmpi:

MODULE PARALLEL 
    INTEGER iproc, nproc, rank, ierr 
    INTEGER mylow, myhigh, mysize, ichunk, irem 
    INTEGER, ALLOCATABLE :: isize(:), idisp(:), ilow(:), ihigh(:) 
    DOUBLE PRECISION, ALLOCATABLE :: glob_val(:,:) 
    INTEGER newtype 
    END MODULE 


    PROGRAM MAIN 
    USE PARALLEL 
    use mpi 
    IMPLICIT NONE 
    ! INCLUDE 'mpif.h' 

! **temp variables 
    integer i, j 
    integer num_rows,num_cols 
    integer used_rows 

! ----setup MPI---- 
    call MPI_INIT(ierr) 
    call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr) 
    call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr) 
    iproc = rank+1 !rank is base 0, rest of fortran base 1 

! ----setup initial data  
    num_rows=8 !contiguous in memory over rows (ie single column) 
    num_cols=4 !noncontiguous memory between different columns 
    used_rows = 3 

    ALLOCATE (isize(nproc)) 
    ALLOCATE (idisp(nproc)) 
    ALLOCATE (ilow(nproc)) 
    ALLOCATE (ihigh(nproc))  
    ALLOCATE (glob_val(num_rows,num_cols)) 

! glob_val = 1.0*iproc !set all glob values to process id 
    glob_val = -1.0 * iproc 
    do i=1,num_cols 
    do j=1,used_rows 
     glob_val(j,i) = (1.0*iproc)+(.01*j) !add reference index to used data 
    end do 
    end do 

! ---setup exchange information 
    ichunk = num_cols/nproc 
    irem = num_cols -(ichunk*nproc) 
    mysize=ichunk 
    if(iproc.le.irem) mysize=mysize+1 

    mylow=0 
    myhigh=0 

    do i=1,nproc !establish global understanding of processes 
    mylow=myhigh+1 
    myhigh=mylow+ichunk-1 
    if(i.le.irem) myhigh=myhigh+1 

    isize(i)=myhigh-mylow+1 
    idisp(i)=(mylow-1) !based on receiving type size/extent 
    ilow(i)=mylow 
    ihigh(i)=myhigh 
    end do 
    mylow=ilow(iproc) 
    myhigh=ihigh(iproc) 

    call MPI_TYPE_VECTOR(1,used_rows,num_rows,MPI_DOUBLE, & 
         newtype,ierr) 
    call MPI_TYPE_COMMIT(newtype,ierr) 

    write(*,*) rank, idisp 
    write(*,*) rank, isize 
! --- perform exchange based on 'newtype'  
     !MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE, 
     !    RECVBUF, RECVCOUNT, DISPLS, RECVTYPE, 
     !    COMM, IERROR) 
    call MPI_ALLGATHERV(glob_val(:,mylow:myhigh),mysize,newtype, & 
        glob_val(1:used_rows,:),isize,idisp,newtype, & 
        MPI_COMM_WORLD,ierr)  

! ---print out global results of process 2 
    if(iproc.eq.2) then  
    do i=1,num_rows 
     write(*,*) (glob_val(i,j),j=1,num_cols) 
    end do 
    end if 

    call MPI_Finalize(ierr) 

    END program

2017-03-21 21:51:11

Sorry, it took me a while to get back to this problem. I ran your code and I see what you are trying to do with it, but I am not getting the correct result. To test the results better, I changed the data initialization line to `glob_val(j,i) = (100.0*iproc)+(.01*j)+i`. The result is: 101.01 -1.00 203.01 -2.00 101.02 -1.00 -2.00 203.02 101.03 -1.00 203.03 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 -2.00 – Curtis

I got it! I changed from MPI_TYPE_VECTOR to MPI_TYPE_CONTIGUOUS(used_rows, MPI_DOUBLE, newtype, ierr), and changed the ALLGATHERV call to (glob_val(1:used_rows,mylow:myhigh), ...). I could not have done it without your help! Thanks, Dr.Tower. @Dr.Tower – Curtis

Glad to help :)
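For readers who want to avoid array sections as buffers, here is a hedged alternative sketch that is not from the thread. It builds a contiguous type of used_rows doubles and then uses MPI_TYPE_CREATE_RESIZED to give it the extent of a full column, so that counts and displacements can be expressed in whole columns. It also passes MPI_IN_PLACE as the send buffer, since the MPI standard does not allow the send and receive buffers to overlap. The names coltype and resized, the test values, and the small array sizes are assumptions chosen for illustration.

```fortran
program gather_columns
  use mpi
  implicit none
  integer, parameter :: num_rows = 8, num_cols = 4, used_rows = 3
  integer :: ierr, rank, nproc, i, j, ichunk, irem, mylow, mysize
  integer :: coltype, resized, dblsize
  integer(kind=MPI_ADDRESS_KIND) :: lb, extent
  integer, allocatable :: isize(:), idisp(:)
  double precision :: glob_val(num_rows, num_cols)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)

  ! Column counts and zero-based column displacements for every process.
  allocate(isize(nproc), idisp(nproc))
  ichunk = num_cols / nproc
  irem   = mod(num_cols, nproc)
  do i = 1, nproc
    isize(i) = ichunk
    if (i <= irem) isize(i) = isize(i) + 1
  end do
  idisp(1) = 0
  do i = 2, nproc
    idisp(i) = idisp(i-1) + isize(i-1)
  end do
  mysize = isize(rank+1)
  mylow  = idisp(rank+1) + 1

  ! Fill only the used rows of the columns this process owns.
  glob_val = -1.0d0
  do j = mylow, mylow + mysize - 1
    do i = 1, used_rows
      glob_val(i,j) = 100.0d0*(rank+1) + j + 0.01d0*i
    end do
  end do

  ! used_rows doubles of data, but an extent of one full column (num_rows doubles).
  call MPI_TYPE_CONTIGUOUS(used_rows, MPI_DOUBLE_PRECISION, coltype, ierr)
  call MPI_TYPE_SIZE(MPI_DOUBLE_PRECISION, dblsize, ierr)
  lb     = 0
  extent = int(num_rows, MPI_ADDRESS_KIND) * dblsize
  call MPI_TYPE_CREATE_RESIZED(coltype, lb, extent, resized, ierr)
  call MPI_TYPE_COMMIT(resized, ierr)

  ! In-place gather: each process's contribution already sits in its own columns;
  ! the send count and send type are ignored when MPI_IN_PLACE is used.
  call MPI_ALLGATHERV(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
                      glob_val, isize, idisp, resized, MPI_COMM_WORLD, ierr)

  if (rank == 0) then
    do i = 1, num_rows
      write(*,'(*(f9.2))') glob_val(i,:)
    end do
  end if

  call MPI_TYPE_FREE(resized, ierr)
  call MPI_TYPE_FREE(coltype, ierr)
  call MPI_FINALIZE(ierr)
end program gather_columns
```

Run with 3 processes (for example `mpirun -np 3 ./a.out`), the first printed row should read approximately 101.01 102.01 203.01 304.01, and rows 4 to 8 should keep the local fill value, because only the first used_rows entries of each column are exchanged.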
