(https://jira.hdfgroup.org/browse/HDFFV-10055).
Briefly, in H5C_collective_write() in H5Cmpio.c,
the metadata cache attempts to perform a collective
write of metadata cache entries.
This worked fine as long as all processes had at
least one entry to write.
However, when a process has no entries, the
function tries to participate in the collective write
by calling MPI_File_set_view(),
MPI_File_write_all() and then MPI_File_set_view()
again, to match the calls in H5FD_mpio_write().
After pull request 183, the CGNS test benchmark_hdf5
started failing. On investigation, I determined that
the failure occurred in the first call to MPI_File_set_view()
in the "no data to write" path through H5C_collective_write().
Note that pull request 183 did not create the problem;
it only exposed it. The bug can be observed after pull
request 182 if one executes the CGNS program
src/ptests/benchmark_hdf5 with 90 processes.
The problem appears to have been that the calls to
MPI_File_set_view() in H5C_collective_write() and
H5FD_mpio_write() were using different values for the
info parameter. I patched the problem by adding an
MPI-specific VFD call allowing me to get the MPI_Info
used in H5FD_mpio_write() for use in
MPI_File_set_view() calls in H5C_collective_write().
Tested serial & parallel, debug & production on
Jelly.
* commit 'b56fb149c9a3c9dca11b406b7a2488f0c93ee187':
Updated the H5L.c error message after additional thought. Fix for HDFFV-10141.
Updated an error message in H5L.c to be more helpful. Fixes HDFFV-10141.
* commit 'd6ea49f5cbcaa852cd0caf34278ec61108667bc3':
Switch to using a flag in the signal handler to trigger dropping out of the main loop and shutting down cleanly, instead of calling leave() from the signal handler.
Fix HDFFV-8089 Description: Some code within an "ifdef H5D_CHUNK_DEBUG" block was using an outdated data structure, but this was not caught because the case of H5D_CHUNK_DEBUG being defined was never tested. It was commented out. I defined H5D_CHUNK_DEBUG, tested, and commented it out again. Platforms tested: Linux/32 2.6 (jam) Linux/64 (platypus) Darwin (osx1010test)
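
The signal-handler change listed above follows a common C pattern,
sketched below under assumptions. The names shutdown_requested,
on_signal(), install_handlers() and run_main_loop() are hypothetical
and do not come from the HDF5 sources; the raise() call only simulates
a signal arriving so the sketch can be exercised. The handler does
nothing except set a flag, and the main loop polls that flag and exits
normally, so cleanup runs outside signal context.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Written by the handler, read by the main loop. volatile sig_atomic_t
 * is the object type the C standard allows a handler to assign to. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Hypothetical handler: record the request and return immediately. */
static void
on_signal(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Hypothetical setup: returns 0 on success, -1 on failure. */
static int
install_handlers(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_signal) == SIG_ERR)
        return -1;
    return 0;
}

/* Hypothetical main loop: runs up to max_iters iterations and stops
 * early once the flag is set. raise_at simulates a signal arriving
 * during that iteration (pass -1 for none). Returns the number of
 * iterations completed. */
static int
run_main_loop(int max_iters, int raise_at)
{
    int done = 0;

    assert(max_iters >= 0);
    while (!shutdown_requested && done < max_iters) {
        if (done == raise_at)
            raise(SIGINT); /* handler runs, sets the flag, returns */
        done++;            /* the current iteration still finishes */
    }

    /* Cleanup (the work a leave()-style routine would do) belongs
     * here, in normal context rather than inside the handler. */
    return done;
}
```

The design choice is that the iteration in progress always completes
before the loop notices the flag, so shutdown never interrupts work
halfway through.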
* commit 'd522632b9e1f1d88db2117e89f3caba0dc4cf38b':
Switch to using a flag in the signal handler to trigger dropping out of the main loop and shutting down cleanly, instead of calling leave() from the signal handler.
Fix HDFFV-8089 Description: Some code within an "ifdef H5D_CHUNK_DEBUG" block was using an outdated data structure, but this was not caught because the case of H5D_CHUNK_DEBUG being defined was never tested. It was commented out. I defined H5D_CHUNK_DEBUG, tested, and commented it out again. Platforms tested: Linux/32 2.6 (jam) Linux/64 (platypus) Darwin (osx1010test)
* commit '6387f7099d22c66dab415c57f9fd547eb86e4ad5':
Small corrections to COPYING file.
Add new file COPYING_LBNL_HDF5.
Revert "Clear hdf5 1.10 entries from RELEASE.txt in the develop branch. Entries"
Add LBNL license file and modify COPYING file accordingly.
Omnibus checkin for several relatively minor modifications:
* commit '2412158ed8326a3f3d62fbd947e470667d0b5951':
Add new file COPYING_LBNL_HDF5.
Revert "Clear hdf5 1.10 entries from RELEASE.txt in the develop branch. Entries"
Add LBNL license file and modify COPYING file accordingly.
Omnibus checkin for several relatively minor modifications:
Clear hdf5 1.10 entries from RELEASE.txt in the develop branch. Entries in this branch version of RELEASE.txt should be intended for the future 1.12.0 release.
Fix HDFFV-8089 Description: Some code within an "ifdef H5D_CHUNK_DEBUG" block was using an outdated data structure, but this was not caught because the case of H5D_CHUNK_DEBUG being defined was never tested. It was commented out. I defined H5D_CHUNK_DEBUG, tested, and commented it out again. Platforms tested: Linux/32 2.6 (jam) Linux/64 (platypus) Darwin (osx1010test)
Fix HDFFV-8089
* commit '52f8c2ed494ea1b89374981ecc6901abe8fd5fed':
Fix HDFFV-8089 Description: Some code within an "ifdef H5D_CHUNK_DEBUG" block was using an outdated data structure, but this was not caught because the case of H5D_CHUNK_DEBUG being defined was never tested. It was commented out. I defined H5D_CHUNK_DEBUG, tested, and commented it out again. Platforms tested: Linux/32 2.6 (jam) Linux/64 (platypus) Darwin (osx1010test)
1) Added code to test/page_buffer.c to verify that page buffering is
disabled in parallel builds.
2) Added code to test/cache_image.c to verify correct interaction
between evict on close and cache image -- in particular management
of a file containing a cache image containing dirty metadata that
has been opened R/O. Also fixed the bug this exposed.
3) Added code to testpar/t_cache_image.c to verify expected procedure
for reading cache images, and also supporting stats collection code
needed for the test.
4) Repair of an overactive sanity check in H5C__reconstruct_cache_contents().
5) Other minor tidies in passing.
Tested serial and parallel, debug and production on Jelly.
Description:
Some code within an "ifdef H5D_CHUNK_DEBUG" block was using an outdated
data structure, but this was not caught because the case of
H5D_CHUNK_DEBUG being defined was never tested. It was commented out.
I defined H5D_CHUNK_DEBUG, tested, and commented it out again.
Platforms tested:
Linux/32 2.6 (jam)
Linux/64 (platypus)
Darwin (osx1010test)
* commit '46c9ab600de491657520897322b75659c3bdfb5f':
Minor style cleanups
Revert "Switch h5clear for cache images to use existing H5Pget_cache_image_config()"