This is a temporary solution, as not all POSIX platforms will have
clock_gettime(). Further code fixes will be required.
With this change, the main C library should compile on Windows.
(2) Revise routine to setup fapl for VFD SWMR legacy and other integration tests: vfd_swmr_create_fapl()
(3) Update all VFD SWMR integration tests to use the above two routines
(4) Clean up VFD SWMR legacy tests: turn on compression in test script, remove #ifdef OUT H5Fflush(), message file name
- CMake files were updated to build new files in src and test.
- As with legacy SWMR, the test programs that are run via shell
scripts are built but not run.
- Updated whitespace in the links_env output file. It's unclear why
this changed, but CMake does an exact diff on the file whereas
the autotools do not.
* commit '47b1c667c64677d2e28c4128fc027cfdfe60b354':
More minor edits to users guide.
Updated users guide for location of alpha 1 release & repaired several typos.
Added .pdf and .docx versions of the most recent version of the VFD SWMR RFC. Updates on the RFC are cursory, and mostly in the introduction. The design section is dated -- added a warning to this effect.
VFD SWMR RFC. Updates on the RFC are cursory, and mostly in the
introduction. The design section is dated -- added a warning to
this effect.
Reviewed and editied the users guide -- specifically:
Verified instructions for build and test, and also for demos.
Removed mention of call to autogen.sh, as that will already be
run on the friendly user release version.
Re-worked the introduction to better discribe VFD SWMR, and
to outline current limitations.
Changed references to "shadow file" to "metadata file" to match
field names.
Still need to update pointer to the bit-bucket repository for the
friendly user release.
Also, must verify that the demo code mentioned in the user guide
is publicly available.
* commit 'aebc84053f9a592aac765fbd3b26f2941015d2e5':
Limit the repeat rate for duplicate zoo warnings to once every five seconds.
Move below_speed_limit() from vfd_swmr_bigset_writer.c to vfd_swmr_common.c, document it, and fix a bug.
Document some of the functions in here. Update the comment at the top of the file. NFCI.
In the `vfd_swmr_create_fapl()` dissection, change the /** **/
comments in the literal code to plain markdown paragraphs.
Slightly change wording and markdown elsewhere.
H5Dopen fails, rapidly retry up to 9,999 times. Log H5Dopen failures,
but log no more than once every five seconds to avoid spamming the
terminal.
With these changes, it's easier for the reader to open the last dataset
before the writer created it, but the reader recovers instead of
quitting with an error. It should only be necessary to retry opening
the *last* dataset; all previous datasets should open on one try if the
last is open.
call H5F__vfd_swmr_writer_md_test(), actually sleep for more than `max_lag`
ticks so that the deferred-frees queue is actually flushed before we force new
items onto the queue and count them.
In H5F__vfd_swmr_writer_md_test(), check that the number of deferred
shadow-space frees is *precisely* the number `nshadow_defrees`, since that
seems to be what the tests that call this routine intend.
using the new routine H5FDdeduplicate().
Simplify H5F_open() a bit, pushing some of the configuration checks into the
SMWR VFD. For example, check that page buffering is enabled in
H5FD_vfd_swmr_open() instead of in H5F_open(). Compare VFD SWMR configurations
in H5FD_vfd_swmr_dedup() instead of in H5F_open().
Clone the default file-access property list at a new file-access property list
ID, H5P_FILE_ACCESS_ANY_VFD. The new ID is used to indicate that if the file
that's being opened is already open under an existing virtual file, and if that
virtual file would not ordinarily be opened with the default FAPL, then it's ok
to use that virtual file.
Add a new optional method, `dedup`, to H5FD_class_t, and use it to customize a
VFD's deduplication.
Customize the SWMR VFD's deduplication. Make it honor H5P_FILE_ACCESS_ANY_VFD
Embed the VFD SWMR configuration in the H5FD_vfd_swmr_t to facilitate
comparison of configuration between new and old SWMR virtual files.
In H5F__sfile_search(), match using a pointer comparison instead of H5FD_cmp(),
because we will only ever enter H5F__sfile_search() with a deduplicated
H5FD_t *.
multiple referring H5F_t on the EOT queue. This stops the VFD SWMR writer from
advancing the shadow file's tick number too fast when we're using virtual
datasets (VDS) with VFD SWMR.
if H5FD__vfd_swmr_header_deserialize() succeeds, then the header that was
passed in was actually initialized. This squashes a used-before-initialized
warning from GCC.
multiple opens of the same file with VFD SWMR---i.e., twice for writing, or for
reading and for writing. In the long run, this will help me encapsulate more
of the SWMR functionality in the VFD, too.
the lower virtual file's `exc_owner` field. While I'm here, remove a
gratuitous assertion.
This is part of a changeset that helps us avoid creating multiple H5F_shared_t
for one file when virtual datasets are used with VFD SWMR. The old code for
deduplicating VFD SWMR H5F_shared_t instances did not work correctly with VFD
SWMR, so we'd end up with multiple H5F_shared_t all active on the same file.
member to all virtual files. Add a routine, H5FD_has_conflict(), that returns
true if a new virtual file is identical to an existing virtual file that has an
exclusive owner. Establish an exclusive owner for a VFD SWMR virtual file's
lower virtual file.
Rename bsdqueue.h to H5queue.h and install it, since it's used by H5FDpublic.h.
This is part of a changeset that helps us avoid creating multiple H5F_shared_t
for one file when virtual datasets are used with VFD SWMR. The old code for
deduplicating VFD SWMR H5F_shared_t instances did not work correctly with VFD
SWMR, so we'd end up with multiple H5F_shared_t all active on the same file.
Assert index entries are in sorted order earlier in the loop over
old and new indices.
When looping over the remaining new index entries, just do
`entries_added++` to match the other loops.
In the final log entry in H5F_vfd_swmr_reader_end_of_tick(), mention
whether the call will exit with success or failure.
apply it individually to each dataset instead of setting the chunk-cache
parameters on the file. Alas, it didn't make any difference, but I'll keep the
change.
H5F_vfd_swmr_reader_end_of_tick: delete superfluous assertions and
extract a com mon subexpression into a H5FD_t * variable.
Carry on with HGOTO_ERROR() cleanup. Delete superfluous parentheses to
reduce visual clutter. Delete superfluous casts. Delete out-of-date
comment: the index size is not fixed any longer.
statement-ify, changing
HGOTO_ERROR(..., \
)
to
HGOTO_ERROR(...,
);
Remove blank lines between if-clause and HGOTO_ERROR. Add some curly
braces to if-statements where that clarifies things.
NFCI.
that makes assertions fail.
Add an optional `close` method to the `H5D_chunk_ops_t`, and use that to
release "holds" on metadata cache (MDC) entries.
For extensible arrays and v2 B-trees, use the existing `dest`(roy)
method to implement `close`. For v1 B-trees and other chunk indices,
don't provide `close`: we cannot safely close the v1 B-tree index, and
the other indices don't have a meaningful presence in the MDC.
Revert my first attempt at making v1 B-tree chunk indices closeable
with `dest`.
Put my comment about the stopgap fix for VFD SWMR at the right place
in src/H5Dchunk.c.
the entries in a hash bucket after evicting tagged entries. Evicting
tagged entries can can affect both entries before and after the current
entry in the bucket's linked list, so we cannot be sure that either
`entry_ptr` or `follow_ptr` is valid.
This stops the assertion (entry_ptr->page != page) ||
(entry_ptr->refreshed_in_tick == tick) from failing in the test
`testvfdswmr.sh many_small`.
H5D__chunk_index_close(), and call it in H5D__chunk_read() after reading
a chunked dataset. In this way, indices based on extensible arrays and
v2 B-trees do not leave pinned/tagged entries in the metadata cache that
we cannot evict/refresh when we load changes from the shadow file.
Make some changes to the v1 B-tree code that set the pointer to the
closed B-tree to NULL and, further, tolerate a NULL pointer where
previously that was impossible.
below than 80 columns: use the malloc idiom,
```
type *p;
p = malloc(sizeof(*p));
```
instead of
```
type *p;
p = (type *)malloc(sizeof(type));
```
Make a similar change to some `memset` calls. NFCI.
HGOTO_ERROR() instead of HDONE_ERROR() so that we jump to the `done`
label right away. This ought to fix the problem Vailin was seeing,
where the library left H5F_vfd_swmr_reader_end_of_tick() prematurely for
seemingly no reason.
each to appear.
Add but do not yet perform tests on many small extensible datasets and a few
big extensible datasets. Vailin is working on a bug that causes both tests to
fail virtually always.
each dataset every `steps` steps.
Update usage message.
Add a cast to `time_t` to quiet a compiler warning.
Replace two occurrences of a debug statement in `verify_extensible_dset()` with
one occurrence in `verify_chunk()`.
Replace the anonymous constant `2` with `hang_back` and increase `hang_back` to
3. XXX Now that I've fixed a bug, reduce `hang_back` to 2, again.
Verify datasets in the reverse of the order they are written so that we spend
less time re-verifying datasets written in the same step.
by the added entries. This fixes a bug in the vfd_swmr_bigset_writer/_reader
test, where the reader would read metadata with a bad checksum. This may also
fix an error that Vailin is seeing.
when handling an I/O request on a metadata entry that has been sub-allocated
from a larger file space allocation (i.e. fixed and extensible array), and
that crosses at least one page boundary. .
This required modifying the metadata cache to provide the type of the
metadata cache entry in the current I/O request. For now, this is done
with a function call. Once we are sure this works, it may be appropriate
to convert this to a macro, or to add a flags parameter to the H5F block
read/write calls.
Also updated the metadata cache to report whether a read request is
speculative -- again via a function call. This allowed me to remove
the last address static variable in the H5PB_read() call, which is
necessary to support multiple files opened in VFD SWMR mode.
Also re-wrote the H5PB_remove_entries() call to handle release
of large metadata file space allocations that have been sub-allocated
into multiple metadata entries. Also modified the call to
H5PB_remove_entries() in H5MF__xfree_impl() to invoke it whenever
the page buffer is enabled and the size of the space to be freed is
of page size or larger.
Tested serial / debug on charis and Jelly.
Found a bug in H5MF_xfree_impl(), in which the call to H5PB_remove_entries()
is skipped due to HGOTO_DONE calls earlier in the function. While the
obvious action is to move the call earlier in the function, best to
consult with Vailin first, as there is much going on and it would be
best to avoid making the situation worse. If nothing else, there are
some error management issues.
to integer of different size [-Werror=pointer-to-int-cast]` and
`test/snapshots-hdf5/current/test/swmr_sparse_reader.c:129:100:
error: cast from pointer to integer of different size
[-Werror=pointer-to-int-cast]`.
g++ (older than gcc/g++ 4.2).
Correct gnu-cxxflags to determine warnings flags to be added based on
C++ compiler version instead of C compiler version.
* Rename server-stop utility to mirror_server_stop.
* Remove external dependency on bzero().
* Modify test/use_common to use only the public API.
* Rename internal bitswap macro to follow convention.
* "Simultaneous and equivalent" Read-Write and Write-Only channels for
file I/O.
* Only supports drivers with the H5FD_FEAT_DEFAULT_VFD_COMPATIBLE flag for
now, preventing issues with multi-file drivers.
Add Mirror VFD to library.
* Write-only operations over a network.
* Uses TCP/IP sockets.
* Server and auxiliary server-shutdown programs provided in a new directory,
`utils/mirror_vfd`.
* Automated testing via loopback ("remote" of localhost).
commit 8963c3bf756f8f8ec21beea9bd29a767e77675a8
Author: Larry Knox <lrknox@hdfgroup.org>
Date: Wed Apr 8 16:52:27 2020 -0500
Commit changes to gnu-cxxflags to remove unmatched " and to gnu-fflags
to not add C warnings flags to H5_FCFLAGS.
This is triggered by the tests for revised references when the libver bounds setting does not allow version
4 datatype message to be created. The test failure is abort core dumped.
This is due to the datatype initialization fails before the datatype ID is registered.
The datatype cleanup code should provide for the above situation.
The code to fix the problem is the same as what is done in H5D__open_oid().
Delete the comment questioning whether pthread_mutex_lock is allowed
in a key destructor, since pthread_key_create(3) provides the answer:
There is no notion of a destructor-safe function. If an application
does not call pthread_exit() from a signal handler, or if it blocks any
signal whose handler may call pthread_exit() while calling async-unsafe
functions, all functions may be safely called from destructors.
Delete redundant comment.
Modify h5repack to integrate with VOL connectors
Update tools library to accomodate VOL connectors
Update logic in h5tools_fopen for VOL connectors
Add command-line options to h5repack for specifying in/out VOL
connectors
Implement h5tools_set_vol_fapl
Fix library shutdown issue
Integrate ROS3 and HDFS VFDs into new h5tools_get_fapl() scheme
Avoid H5Ocopy in h5repack when using different VOL connectors
Update h5tools_test_utils.c for ROS3 and HDFS integration
there's no need to check if the metadata cache is flushing if we already
know the file is closing, because the condition we rely on is "closing
OR flushing." Further, the cache may have already gone away, so
sometimes calling into the cache to see if it's flushing will crash the
program.
could block at the barrier, wake and exit the barrier, re-acquire the barrier
lock and increase `nentered` before the other blocked threads woke and checked
`nentered % count == 0`. Then the other blocked threads would check `nentered
% count == 0` and, finding it false, go back to sleep in the barrier. This new
implementation waits for a looser condition to obtain so that threads don't go
back to sleep in the barrier.
all configurations.
Previously it was neither declared nor defined in --disable-threadsafety
builds. The compiler's warning got lost in the noise---I first saw the issue
because my -Werror branch stopped compiling cold---and the tests still linked
and ran.
Reduce gratuitous casts---e.g., (size_t)1.
Use the right format string for a pointer.
In the H5C sanity checks, change a "size increase" variable from ssize_t
(too narrow) to int64_t (wide enough).
Parenthesize every appearance of `storage` in the macro
`H5D_CHUNK_STORAGE_INDEX_CHK(storage)` so that you can pass in an
expression like &sc and it works properly.
Disallow re-assignment of the `dset` parameter to H5D__chunk_init()
because it helped assure me that it's safe to replace the repeating
expression `&dset->shared->layout.storage.u.chunk` with `sc` throughout.
Replace lengthy expressions such as
`&dset->shared->layout.storage.u.chunk` with `sc` throughout several
functions in H5Dchunk.c ISTR that the compiler warned that `sc` was
declared but unused in a couple of functions, and then I found that `sc`
could be used in many places. Maybe the disused `sc` appeared because a
bunch of code was copied and pasted, I don't know. Anyway, it's a lot
tighter code now that I use `sc`.
In H5D__chunk_update_old_edge_chunks() and H5D__chunk_delete()
I actually expand `sc` and another temporary variable, `pline`,
because they're used only in !defined(NDEBUG) code. This squashes
unused-variable warnings in the defined(NDEBUG) configuration.
Don't drop the `volatile` qualification with a cast in
tools/src/h5import/h5import.c.
previous warnings PR. An array element was assigned to
itself---shape[2]Â =Â shape[2];---instead of being assigned to
chunk[2].
fortran/src/H5Pf.c: move conditional compilation controlled by
H5_NO_DEPRECATED_SYMBOLS outside of a function for readability.
fortran/src/H5match_types.c: put a variable's declaration under the same
conditional compilation (H5_FORTRAN_HAVE_C_LONG_DOUBLE) as its
use.
For now, skip compilation of some unused debug dump routines in the JNI.
While I'm in the JNI, delete a set-but-unused variable.
src/H5Z.c: condition a variable declaration on H5_NO_DEPRECATED_SYMBOLS
so that it's not declared but unused or vice versa.
test/cache_common.h: add an #include in to get some symbols we need to
avoid implicit declaration warnings.
test/dsets.c: use a more conventional conditional-compilation syntax.
test/dt_arith.c, test/fillval.c: initialize a bunch of uninitialized
variables before use.
test/vfd.c: pass the expected type of `void **` to posix_memalign(3)
instead of `int **`.
testpar/t_bigio.c: explicitly compare with 0 instead of using ! when
"equal to 0?" is the question not "is false?" Repair some
indentation while I'm here.
testpar/testpar.h: repair misaligned line-continuation backslashes in a
macro that probably should be a function so that we don't have
to fiddle with the line continuation to begin with.
tools/src/h5repack/h5repack_main.c: fix some compiler fussing about
enums.
tools/test/perform/pio_engine.c: the compiler fusses if you cast a
function call returning double directly to off_t. It's ok if
you cast a variable that's a double to off_t, however. Write
and use a new function, sqrto(), to avoid the cast warnings.
of the va_list, so it's at least possible for another connector to know what
the operation is and decide whether to implement it or not.
Added a new VOL sub-class called "introspect" where callbacks that report
information about the connector or container can be placed. Added an
'opt_query' callback to this sub-class, for a connector to report back
to the library whether a particular optional callback operation is supported.
Also added a 'get_conn_cls' introspection callback, to retrieve the H5VL_class_t
of a connector (either the "current" connector, H5VL_GET_CONN_LVL_CURR, or
the terminal connector, H5VL_GET_CONN_LVL_TERM).
Moved the "post open" operation from a file 'specific' operation to a file
'optional' operation, now that it's possible to detect (with the 'opt_query'
introspection callback) whether a VOL connector implements an optional
operation, without just returning an error.
Added new internal VOL helper routines: H5VL_object_is_native, to determine
if an object is in (or is a) native file, and H5VL_file_is_same, to determine
if two objects are in (or are) the same terminal VOL connector's container.
(And moved the special handling for FILE_IS_EQUAL operation out of internal VOL
callback routine into H5VL_file_is_same)
Made new dataset 'get' operation for H5Dvlen_get_buf_size, aligning it better
with other 'get' operations in API.
Fixed several issues with pass-through connectors, which are now passing the
'make check-passthrough-vol' tests again.
A bunch of warning and style cleanups as well.
structs containing those arrays. Encapsulating the arrays in this way
makes it easier to write and think about pointers to these types, casts
to/from these types, etc.
An interesting side-effect that we probably should *not* rely on is
that the struct-encapsulation changes the alignment so that some GCC
warnings about casts that increase the alignment requirement of the
operand go away. Warnings like that have to be taken seriously: I will
add -Werror=cast-align to the default compiler flags so that they stop
the build quickly.
GCC warnings led me to some surprising casts in test/trefer.c. I found
that it was possible to make many simplifications after introducing the
struct-encapsulation that I described, above.
In test objcopy_ref `same_file` is assigned but never used. Delete it.
procedure for checking it consistent across selection types and between
_s and _u, ensures it is always is performed even when called within the H5S package, and removes the redundant check that would occur when callins H5S_select_adjust_s() from outside the H5S package.
structs containing those arrays. Encapsulating the arrays in this way
makes it easier to write and think about pointers to these types, casts
to/from these types, etc.
An interesting side-effect that we probably should *not* rely on is
that the struct-encapsulation changes the alignment so that some GCC
warnings about casts that increase the alignment requirement of the
operand go away. Warnings like that have to be taken seriously: I will
add -Werror=cast-align to the default compiler flags so that they stop
the build quickly.
GCC warnings led me to some surprising casts in test/trefer.c. I found
that it was possible to make many simplifications after introducing the
struct-encapsulation that I described, above.
of `hsize_t`, `start`, to `long long`, but I think the way that I have
rewritten it, it probably produces a more useful result? As a bonus,
GCC has stopped warning about it.
algorithm to (optionally) avoid sharing selection data structures.
Tested internal code (including with valgrind) by setting VDS code to
avoid sharing selection, has since been changed to share selection for
performance, so this code is not yet tested in regression tests. API
has not been tested.
"obsolete" / POSIX "basic" regular expressions. Also, not every version of
`sed` out there supports the `-E` option. So delete the -E flag and use
the regex `[^/][^/]*` instead of `[^/]+`.
Add config/netbsd to the MANIFEST.
location set to be in a file. Only meant to be used by VOL connectors.
Implement H5VLpeek_connector_id() to support connectors querying their
own IDs. Fix app_ref with connector IDs in a couple places (external
VOLs registered as default through ENV should be visible to the
application). Modify vlen and reference interfaces to work with
arbitrary VOL connectors. Implement file "post open" specific
callback, to enable connectors to update their file structs after a
wrap context has been set.
original installation prefix from the examples prefix. Use that
relative path to locate the current installation prefix, always. Fall
back to an absolute installation prefix if the relative path cannot be
derived.
original installation prefix from the examples prefix. Use that
relative path to locate the current installation prefix, always. Fall
back to an absolute installation prefix if the relative path cannot be
derived.
This is handy for NetBSD where HDF5 examples are installed
by convention in $prefix/share/examples/hdf5/ rather than in
${prefix}/share/hdf5_examples/, which is the HDF5 default.
Place hdf5_examples/ under ${datarootdir} which on most systems will be
${prefix}/share/, anyway.
columns. If the number of datasets is greater than the number of steps,
then only pause between steps, do not pause between individual datasets
written/verified. Otherwise, pause between each dataset written/verified.
that are longer than the buffer that the caller supplied: the checksum
usually will fail, but that's not actually a fatal condition, and
usually we will have another opportunity to verify the checksum.
In H5FD_vfd_swmr_read(), remove a bunch of disused code.
In H5FD_vfd_swmr_read(), do not re-read a shadow image that has a
bad checksum, because a bad checksum indicates a serious problem
(writer outran reader, OS defect, hardware failure) from which
H5FD_vfd_swmr_read() cannot recover.
Rationale: the writer write(2)s new shadow images before the new index,
and the new index before the new header. In H5FD_vfd_swmr_read(),
the reader has read(2) both the index and the header in full. POSIX
semantics indicate that in these circumstances, the last shadow image
write(2) MUST be completely visible when we read(2). That is, the index
write(2) & read(2) and the header write(2) & read(2) pair cannot
divide a preceding shadow-image write(2).
The reader may see a "torn" image at this juncture if, for example,
the writer got max_lag ticks ahead of it and reused the storage for
this shadow image. Even if the reader "recovers" by re-reading the
image until its checksum is correct, it cannot be sure that the
image thus read is the right one for the HDF5 address passed to
H5FD_vfd_swmr_read(), and it cannot be sure that the image thus read is
not stale, because it's operating with an out-of-date shadow index.
Add log outlets swmr_read, swmr_read_exception, and swmr_read_err.
Log to `swmr_read` on entry to H5FD_vfd_swmr_read(), log to
`swmr_read_exception` when checksums are skipped for exceptional
conditions (page buffer not configured, buffer shorter than shadow
image), and log to `swmr_read_err` when the checksum fails.
VFD_SWMR_LEAVE() for use by FUNC_ENTER_API/_LEAVE_API macros. In the
macros, don't HGOTO_ERROR(), since that will jump back to the `out`
label, but HDONE_ERROR() on error, instead.
output streams using STDIN_PATH and STDOUT_PATH. I will use that for
the zoo reader and writer. Move redirection of standard error output to
the standard output stream outside of the curly braces, since usually
I want to save the `echo` and `wait` output, too, and it makes the
redirection of the supervised program a little easier to follow, I
think.
compact datasets.
Bundle the zoo-test configuration into a new type, zoo_config_t.
Add a couple new "zoo" test phases, "delete" and "validate-deletion", to
the existing "create" and "verify" phases. Give names and numbers to
all phases with the new `enum`, `phase_t`, and refactor so that tend_zoo
runs a selection of phases at each step.
Stub the "delete" and "validate-deletion" phases for most test steps.
Actually implement for compact dataset (ds_cpt_i) test.
In tend_zoo(), delay for 50 milliseconds after running all steps.
Really, this should delay after each step....
Implement vfd_swmr_writer_may_increase_tick_to() and
vfd_swmr_reader_did_increase_tick_to() with a file that reader and
writer share. The reader saves its current tick number in the shared
file. The writer does not advance its tick number past the reader's.
Collect some statistics in vfd_swmr_writer_may_increase_tick_to() and
print them before the writer exits.
Add option flags for skipping compact dataset tests (-C) and for
printing error stacks (-e). Update the usage message, which was
stale before the new options were added.
Delete some dead code.
Add #if 0'd-out code for the reader to wait for the writer before
running "delete" and "validate-deletion" steps.
tells whether the call may wait for the reader tick to catch up.
Add stub routines vfd_swmr_writer_may_increase_tick_to() and
vfd_swmr_reader_did_increase_tick_to() for tests---e.g.,
vfd_swmr_zoo_writer/_reader---to use to coordinate their tick numbers.
vfd_swmr_writer_may_increase_tick_to(new_tick, wait_for_reader) returns
true if the writer may increase its tick number to `new_tick` without
overrunning the reader.
A reader uses vfd_swmr_reader_did_increase_tick_to() to tell a writer
that its tick number has increased.
`mdc_invalidation`, and use it to log a message when
H5C_evict_or_refresh_all_entries_in_page() does not find any affected
entries.
Pass a page length to H5C_evict_or_refresh_all_entries_in_page() so that
it can assert if a multipage eviction overlaps a single-page entry,
which had better not happen.
Fix a bug in H5F_vfd_swmr_reader_end_of_tick() and heavily rework it:
remove page-table entries and evict/refresh MDC entries that overlap
*added* shadow-index entries. Because we didn't do that before, in the
zoo test, the reader didn't see all of the changes made by the writer
until the writer closed the file: MDC entries covered the new content
in the shadow file.
In H5F_vfd_swmr_reader_end_of_tick(), log changes to the shadow index
with the new outlet, `shadow_index_update`.
Convert a some of John's disused diagnostic printfs to an
`hlog_fast(eot, ...)` call.
end-of-tick processing macro. H5F_vfd_swmr_process_eot_queue() looks
for files due for end-of-tick processing and calls either the reader or
writer EOT routine.
Always call the reader/writer EOT routines with an actual H5F_t instead
of NULL.
pagebuffer entry size. If the sizes are different, then release the old
shadow space (using the correct size) and set the shadow page number to
0 so that H5F_update_vfd_swmr_metadata_file() will allocate new shadow
space with the right size.
Vailin says that this fixes the bug she found, where a 4096-byte buffer
allocated by H5MV_alloc() is released with H5MV_free() as if it was 8192
bytes long.
that don't count as VFD SWMR test failures.
The variable-length string tests are nondeterministic, so we expect them to
fail, and they do fail quite often on `jelly`, for example.
my investigation of shadow-file freespace leaks.
Copy the tick number from the H5F_shared_t to a temporary variable
and use the temporary instead of spelling out `shared->tick_num`.
Assert that every entry on the tick list is dirty, since clean
entries will not appear on the tick list. Mark each corresponding
shadow-index entry dirty, and set its "tick of last change" to the
current tick and "tick of last flush" to 0. This makes the scan
for entries that do not appear on the tick list more believable.
Lower staircase in the loop that scans for shadow-index entries that
did not appear on the tick list. Log indicies that are marked for
reclamation.
eventually reclaimed.
Defer reclamation of raw data filespace, only.
XXX Deferring only raw-data reclamation isn't *quite* sufficient because
XXX a writer could conceivably reuse metadata space as raw-data space
XXX before max_lag ticks have elapsed. Readers could see metadata
XXX corrupted by raw data.
Once a file starts to close, stop deferring reclamation.
In H5MF_free_aggrs(), perform all deferred reclamations if the file is
closing.
validate_zoo()/check_zoo(), instead of checking/modifying the global
variable `pass`, just return `false` on failure, `true` on success.
Update test `cache_image` to match.
available to other files. Make the tick number a parameter instead of
using the f->shared->tick_num, so that we can pass a maximal tick number
and that way reclaim all of the deferred frees.
the first portion of a metadata cache entry in speculative reads.
This is necessary for VFD SWMR as it presumes that metadata entries
are read and written atomically. See comments in H5C.c / H5C_load_entry()
for further details.
objects as the cache_image test does. The zoo writer is a work in
progress.
This version is useful as a reproducer for the hang in the global heap
that I stumbled over, yesterday. I run this to reproduce,
env HLOG="pbio=on" ./vfd_swmr_zoo_writer -W -a
the tick number in H5F_vfd_swmr_close_or_flush() when just flushing,
because that made the assertion in H5PB_vfd_swmr__set_tick(), that the
tick number had not increased by more than one, fail.
with multiple pages. Update statistics to maintain consistency.
Refactor a bit: in H5PB__write_meta(), move code that's in both the if-
and else- branch to either before the if-else or after and de-duplicate.
In H5PB_vfd_swmr__update_index(), always update the length of a
shadow-index entry to the current size of its corresponding page-table
entry.
In H5PB_vfd_swmr__update_index(), disregard shadow-index entries that
await garbage collection.
page buffer. Now variable-length (VL) data such as VL strings work with VFD
SWMR. This change also makes the library more consistent in its treatment of
global-heap storage, since it's always been allocated as metadata, not raw
data.
unsigned integer value that tells at which index to inject a failure
for testing purposes. I used this to establish that I can start the
shadow-index lookup test with a previous seed and see the same tests run
again.
time in H5PB_dest(). While we're in H5PB_dest(), mark deleted shadow-index
entries as "garbage" and skip the O(n) shadow index-entries copy.
Rename shadow index-entry member `moved_to_hdf5_file` to `moved_to_lower_file`
while I'm in here---NFCI.
messages to them.
Move and update a comment about removing (all) items from the deferred queue
before processing them.
Bug fix: don't leak file space, add back to the deferred queue all items that
were not disposed of.
mysteriously printing "ding ding!" when the shadow index is enlarged.
While I'm here, place the new index into place regardless of whether we succeed
at the deferred free of the old index's shadow-file space.
H5F_shared_t's queue before processing any, instead of removing just one,
processing it, removing another, processing it, and so on. While we processed
the first entry on the queue, we often called H5MF_xfree() again, which called
process_deferred_frees() again, which processed the first entry, calling
H5MF_xfree() again, and on and on, until the deferred frees list was exhausted.
This deep recursion showed up as a wide, tall stack in my flame graphs. Taking
all deferred entries off of the queue to start definitely breaks the recursion
and saves processing time.
implementation, rename hlog_log() to hlog(), hlog_vlog() to vhlog(), et cetera.
Rename hlog_lazy() to hlog_fast().
Define some log sinks and use them in the page buffer and in VFD SWMR.
syntax highlighting colors some (but not all) unescaped underscores red like
there is some problem. It's possible there is some problem, since underscores
are used to indicate some kind of emphasis---probably underlining.
shadow pages. Reduce casts by choosing correct format strings and compatible
variable types.
Poison writes to addr by making it const. Don't increase addr in the read(2)
loop because it's never used afterward.
Delete some more dead code.
Rename read_ptr as p and declare it much closer to its use. Change its type to
`char *` so that no casts are necessary to increase it.
for them did not help me keep track of what they were for.
For brevity, I will call a deferred free record a "defree" in the code.
The deferred_free_queue_t becomes a lower_defree_queue_t, and each record on
the queue becomes a lower_defree_t. A lower_defree_t tracks one deferred free
on the lower VFD---that is, the one under the SWMR VFD.
The old_image_queue_t becomes a shadow_defree_queue_t, and a record therein is
a shadow_defree_t. A shadow_defree_t tracks one deferred free on the shadow
file.
Add to the H5F_shared_t (!) a new member that tells the index in
the shadow file where the index should be written.
Allocate shadow filespace for the header and the index separately so
that the index can float. Update tests to match the expected original
location of the index.
Introduce vfd_swmr_enlarge_shadow_index(), a routine that allocates space in
the shadow file for a new index that has (up to) twice as many entries as the
old index, allocates a new in-core index of the same size, and copies the old
in-core index to the new. Call vfd_swmr_enlarge_shadow_index() in
H5PB_vfd_swmr__update_index() when the in-core index has too few slots.
In the comment at the top of H5FD__vfd_swmr_load_hdr_and_idx(), describe the
protocol that it follows, now, when it reads the shadow header and index.
Delete some dead code in the function and add a bit of diagnostic code.
TBD quiet the diagnostic code.
In H5F_vfd_swmr_init(), follow the protocol: write the index, first, then
the header.
Modify property-list checks and tests to reserve no fewer than two pages at the
front of the shadow file for the header and index.
an H5F_shared_t because the new routine that will relocate the index (which
will be in a future commit) has to pass an H5F_t to the filespace allocator.
the way that the shadow header and shadow index are loaded.
In H5FD__vfd_swmr_load_hdr_and_idx(), adopt a new protocol for reading
the shadow file:
0 If the maximum number of retries have been attempted, then exit
with an error.
1 Try to read the shadow file *header*. If successful, continue to 2.
If there is a hard failure, then return an error. If there is a failure
that may be transient, then sleep and retry at 0.
2 If the tick number in the header is less than the tick last read by the VFD,
then return an error.
3 If the tick number in the header is equal to the last tick read by the
VFD, then exit without doing anything.
4 Try to read the shadow file *index*. If successful, continue to 5.
If there is a hard failure, then return an error. If there is a failure
that may be transient, then sleep and retry at 0.
5 If a different tick number was read from the index than from the index,
then continue at 0.
6 Try to *re-read* the shadow file *header*. If successful, continue to 7.
If there is a hard failure, then return an error. If there is a failure
that may be transient, then sleep and retry at 0.
7 Compare the header that was read previously with the new header. If
the new header is different than the old, then we may not have read
the index at the right shadow-file offset, or the index may have been
read in an inconsistent state, so sleep and retry at 0. Otherwise,
return success.
Simplify H5FD__vfd_swmr_header_deserialize() and
H5FD__vfd_swmr_index_deserialize(). Remove their retry loops. Make
each return TRUE on success, FALSE on an error that may be transient,
and FAIL on an irrecoverable error.
In H5FD__vfd_swmr_header_deserialize(), do not check the size of the
shadow file with fstat(2), since the read(2) will fail if the file is
too small. This saves us a system call.
Lightly consti-ify H5FD__vfd_swmr_index_deserialize() arguments.
In H5FD__vfd_swmr_load_hdr_and_idx():
Consolidate all of the retry-looping. Increase the initial
retry delay from 1ns to 1/10s. Delete the disused maximum-retry
constants.
Use #if 0 to disable some error-checking code that ought to be
unnecessary under the new protocol.
Don't memset() the header and index header, but make sure
they're fully initialized with real content, instead.
simplify H5FD__vfd_swmr_index_deserialize(): reuse
h5_retry_init()/h5_retry_next() for retry loops.
Don't wait for the fstat(2) to read the correct size, because the
read(2) will return short if the file isn't long enough. (This change
should save at least one system call, always.)
Leave a bunch of comments about the changes that I will have to make so
that the shadow index will float.
NFCI: do not cast H5MM_malloc() return values, this is not C++.
compares both new and old shadow indices and calls H5PB_remove_entry()
on each entry that was in the old index but is not in the new.
Ever since H5PB_remove_entry() started removing shadow index
entries, it has been possible for H5F_vfd_swmr_reader_end_of_tick()
to walk past the end of the new shadow index or even to skip entries
in the new index. Sometimes an assertion failed when that happened.
I have restructured the code in H5F_vfd_swmr_reader_end_of_tick()
so that it compares the old and new indices, gathering a list of
removed pages, in one step. In the next step, it processes the
list of removed pages, calling H5PB_remove_entry() on each page.
In the step after that, it notifies the metadata cache of each
removed page. This fixes the bug I described, above.
`if ((p = allocate(...)) == NULL) { }` into two statements, `p =
allocate(...); if (p == NULL) { }`, put a semicolon at the end of an
HGOTO_ERROR(), remove comments /* end if */, /* end for */ after closing
curly braces.
pointer to the metadata index instead of copying the index itself. Use
struct assignment instead of copying individual struct members. Lower a
staircase.
names. While I am here, do not copy the last element of the index over
the element that's being deleted, because in the very next step I'm
shifting all elements over by one.
`uint64_t` to `size_t` because it describes the size of an in-core
structure as well as an on-disk one, and `size_t` is wide enough
to store the size of any in-core structure, while `uint64_t` may
be much too wide. Check that `index_length` is no more than SIZE_MAX
after we read it.
Delete the little-used free-list length, dl_len, and just count up the list
entries when diagnostic code needs the length.
Extract the code for deferring shadow-image free into a new subroutine,
`vfd_swmr_idx_entry_defer_free()`.
Rename type `deferred_free_head_t` as `deferred_free_queue_t`.
Remove the disused H5F__LL_{REMOVE,PREPEND} macros.
Add some diagnostic code and #if 0'd assertions.
Change `qsort(ptr, n, sizeof(type), cmp)` to
`qsort(ptr, n, sizeof(*ptr), cmp)`.
Use a `continue` statement to lower a staircase in
H5F_update_vfd_swmr_metadata_file().
Add vfd_swmr_mdf_idx_entry_remove() to delete a shadow index entry and add the
image at that entry to a deferred-free list. Call it whenever a page is
evicted.
Update the comment in H5PB_remove_entry() that asks if we need to remove shadow
index entries: now we *do* remove them. Remove shadow index entries in
H5PB__evict_entry(). Also mention in the comment that the index-entry removal
performed by H5PB__evict_entry() ought to be sufficient.
alignment. The VFD SWMR code had always assumed that the regions were aligned
to page size. It would blithely round the start addresses of regions to the
next lower page. When the region was freed, the freespace manager (H5MV) would
suffer an assertion or corruption.
HDF5 file and in the shadow file. I had added assertions that the page numbers
were unique, and this caused those assertions to fail. I don't know if I'll
keep the assertions, but this is an inexpensive change that makes the test more
realistic.
including the merge of `hdffv/hdf5/develop`, back to the branch that Vailin and
I share.
Now I need to put this branch on a fork with a less confusing name than
vchoi_fork!
Check for smaller or larger section size after merging and shrinking a section,
for this case is the section that is smaller than threshold (see H5MF_xfree() in H5MF.c).
It is possible for the section to be smaller after merging/shrinking (see H5MF__sect_large_shrink()
in H5MFsection.c).
Given that the VFD SWMR configuration FAPL property is set, the writer field must
be consistent with the flags passed in the H5Fopen() (either H5F_ACC_RDWR for the
VFD SWMR writer, or H5F_ACC_RDONLY for the VFD SWMR readers).
The test "driver_addr != sblock->driver_addr" is failing for superblock version 2 & 3.
Fix: there is no driver_addr in superblock version 2 & 3.
It should decode the root group object header address (root_addr) and verify accordingly.
(A) #5: Add the "pb_expansion_threshold" field to the "H5F_vfd_swmr_config_t" structure
and update H5Pset_vfd_swmr_config() and H5Pget_vfd_swmr_config() accordingly
(B) #13 bullet 2: Comment H5F_vfd_swmr_config_t in H5Fpublic.h properly
(copied from John's description in the RFC)
(C) Change the field name "vfd_swmr_writer" to "writer" in "struct H5F_vfd_swmr_config_t"
(as indicated on page 11 in the RFC) and all references to it
had finished its work and closed the .h5 file, thus removing the shadow file.
Make the sparse writer wait to close the .h5 file for a signal from
testvfdswmr.sh. In testvfdswmr.sh, send the signal when the readers have all
finished.
This lets test/testflushrefresh.sh pass again. It was timing out while it
waited for expected failures to occur because the retry loop ran for way too
long.
cells in a matrix in an arbitrary order, first it chooses a random
starting `offset` in [0, rows * columns - 1]. Then it chooses a
random `increment` that's relatively prime to `rows * columns`.
Then it visits every cell in `rows * columns` steps:
for (i = 0; i < rows * columns; i++) {
visit(cell[offset / columns][offset % columns]);
offset = (increment + offset) % (rows * columns);
}
By moving the HDrandom() calls outside of the main loop and visiting
each cell only once, this probably speeds things up quite a bit. It's
also more resilient to a crummy random sequence. The new code visits
cells in an order that's probably arbitrary enough for testing purposes.
to end with whitespace padding rather than newlines. My introduction of
variadic TESTING() got rid of the padding. I have straightened this out
by newline-terminating the stdout lines in the test program and in its
expected out. I also add some newlines to the program's standard error
output so that the expected error output still matches.
comment to myself that I need to reduce code duplication with the MPMDE
test.
In vfd_read_each_equals(), print the correct expected value when there
is a discrepancy.
No functional change intended: correct a comment in
vfd_read_each_equals. Fix indentation in the test_raw_data_handling()
header comment.
buffer's treatment of multipage metadata entries (MPMDEs). Mention why
an H5PB_flush() is not necessary for MPMDEs to reach visibility at the
VFD layer.
macro magic. Use namebases and namebase, instead.
Extract a bunch of copy-and-paste VFD SWMR setup into a new subroutine,
swmr_fapl_augment().
Make sure that the metadata reads all-0s until it reads all -1s.
Extract a subroutine, vfd_read_each_equals(), that reads and compares a
region with one of its arguments.
Rename from test_basic_metadata_handling() to
test_metadata_delay_basic(), since that gets at what we're testing
better.
Don't perform an H5PB_flush(), it's not necessary for this test because
H5Fvfd_swmr_end_tick() has done essentially the same thing.
anticipate comparing the written buffer with the read buffer.
Don't initialize variables prematurely so that the compiler has a chance
to warn about variables read before they are written.
Repeatedly flush the page buffer, once each time we end the tick.
Write errors to stdout instead of stderr.
gettimeofday() alternate. Perform nanoseconds arithmetic using uint64_t
instead of long to avoid unwanted overflows on 32-bit systems like
my i386 (!) development box.
Combine the VFD SWMR and non-VFD SWMR raw-data test into one
routine that takes a bool parameter to switch on VFD SMWR.
Update my description of the to-be-written metadata test for VFD SWMR.
an assertion from firing:
"entry_ptr->delay_write_until == 0" failed: file "../../../vchoi_fork/src/H5PB.c", line 4093, function "H5PB__write_raw"
In a comment, mention a change that has to be made to accommodate parallel
mode.
output, then it tested $? for an error exit. $? told the error status of
`tee`, though, not the test programs! So no test failures were counted, even
when some tests clearly failed. I changed the test script to use a shell
subroutine, `catch_out_err_and_rc`, to catch test programs' output and result
code.
H5FDvfd_swmr_private.h.
Perform tick processing in FUNC_ENTER_API_NOCLEAR, where it was missing.
Track the number of times the HDF5 library has been entered/exited through its
public API. Only perform tick processing on the first entry and last exit.
This stops us from performing tick processing in API calls invoked by
application callbacks. Performing tick processing in nested API calls led to
crashes.
Note well: FUNC_LEAVE_API now performs tick processing even on an error exit!
Previously, it did not. I'm not sure if the change is ok.
(1) Increase the # of records to write (Nrecords) in testvfdswmr.sh.in so as to ensure the writer
will not exit before the reader
(2) Use H5E_BEGIN_TRY/END_TRY when H5Fopen() the test file in reader tests
(3) Add "READER" or "WRITER" to debugging messages
(4) Misc cleanup
open in the VFD SWMR reader case.
Note that the following failures in testvfdswmr.sh:
1) Unable to find metadata file on VFD SWMR reader open.
2) Occasional sanity check failures in the page buffer on raw data write.
3) Filter failures on raw data read in VFD SWMR readers when compression
is enabled.
4) Unexpected data errors in VFD SWMR readers when compression is
disabled.
Note that I expect that items 3 & 4 two aspects of the same issues -- the
fact that we don't guarantee that raw data is consistent with metadata.
Item 2) must be addressed, but it is so infrequent that it isn't doesn't
affect the conclusion VFD SWMR seems to work, and thus it can wait until
phase 2.
I am given to understand that Vailin has largely addressed item 1),
and will be checking in her solution to this soon.
Tested on Charis and Jelly.
in un-related tests (i.e. earray, fheap, etc.).
On jelly and charis, vfd_swmr now passes.
testvfdswmr.sh displas the following failures:
1) occasionally ccan't open the metadata file. This shows
up more on jelly than charis.
2) occasional complaints about incorrect raw data
3) occasional complaints from Quincey's evict tagged
entries code that it can't evict all the tagged entries.
4) Numerous filter failures. At a guess, this is an artifact of
raw data not making it to file in sync with the metadata.
I didn't see this on charis, as I don't have compression
configured.
5) An assertion failure in the page buffer in which a sanity check
is failing in the code to update the replacement policy.
This is worrying --- I'll need to look into it on my return.
the memory manager -- details shown below.
Note that there are other issues as well -- this is not a
working version.
[mainzer@jelly test]$ ./vfd_swmr
Testing Configure VFD SWMR with fapl PASSED
Testing VFD SWMR configuration for the file and fapl PASSED
Testing H5Fvfd_swmr_end_tick() for VFD SWMR PASSED
Testing Create/Open/Flush an HDF5 file for VFD SWMR PASSED
Testing Verify the metadata file for VFD SWMR writer vfd_swmr: H5MVsection.c:233: H5MV__sect_can_merge: Assertion `((sect1->sect_info.addr)!=((haddr_t)(long)(-1)) && (sect2->sect_info.addr)!=((haddr_t)(long)(-1)) && (sect1->sect_info.addr)<(sect2->sect_info.addr))' failed.
Abort (core dumped)
[mainzer@jelly test]$
(1) Assertion failure in the vfd_swmr test
(2) Reader error in the vfd swmr concurrent tests
Also fixes for:
(a) Use H5MV_alloc() to allocate space for md_pages_reserved when creating the metadata file in H5F__vfd_swmr_init()
(b) Remove a multi-page (when vfd_swmr_writer is true) from the page buffer in H5MF_xfree()
(2) Test files for encoding/decoding property lists
(3) Fix test failures for PB statistics in test/page_buffer.c
(Will double check with John later about PB statistics collection)
--src/H5PB.c: checks for size >= page size
--src/H5MF.c: disable/enable page buffering in H5MF_tidy_self_referential_fsm_hack()
--src/H5MFsection.c: call H5PB_remove_entry() for both raw/metadata pages in H5MF__sect_small_merge()
(B) Port and modify existing concurrent swmr tests to VFD SWMR. Also modify the following:
--remove flushes from VFD SWMR writer tests
--set Nreaders to 0 in test/testvfdswmr.sh.in to test for writers only
Please enter the commit message for your changes. Lines starting
Tested serial / debug on Charis and Jelly.
Two known issues:
1) New page buffer seems to expose issues in the accumulator code.
For whatever reason, fheap with the new page buffer exposes corruption
issues if the page buffer uses H5F__accum_read/write(), but the problems
go away if the page buffers uses H5FD_read/write() instead.
Need to either chase this or dis-able page bufffer in combination with
the accumulator.
2) Encountered a number of assertion failures that are explainable by the
free space manager code not telling the page buffer to discard pages
when they are freed.
Wrote code to handle this -- once the free space manager is modified,
this code should be removed and the original assertions restored.
1) Free space manager for the metadata file
2) Delayed free space release linked list
3) H5F_update_vfd_swmr_metadata_file()
3) VFD SWMR driver: read callback
4) Flushing for VFD SWMR
5) Port one concurrent test from swmr test set
6) Bug fixes and refactoring
1) Define driver for the VFD SWMR reader
2) Implement VFD SWMR open callback
3) Implement H5FD_vfd_swmr_get_tick_and_idx()
4) Load and decode metadata file header and index
4) Closing for VFD SWMR
* commit 'c834d9f99d45e5b9752e8525fe8761ea5592bf2c': (41 commits)
HDFFV-10568 fix hdf5_java library dependency
Remove another extra path var
Move muti-config dir setting to root process
Fix one more intermediate location
Use different variable
Cleanup and add intermediate dir for java
Java must use shared libs to allow dlopen calls
Correct names
Call new function
Correct default API version for develop to 112.
Fix typo
HD prefix and whitespace
Update RELEASE.txt with suggested changes
Update MANIFEST file for new t_coll_md_read.c file
Remove now-unused local variable
Add fix for HDFFV-10501
Revert testfile FILE change
change FILE path
Add testfiles to data copy
Same changes needed for examples as test
...
1) Public routines: H5Pget/set_vfd_swmr_config
2) Public routine: H5Fvfd_swmr_end_tick
3) Initialization when opening a file with VFD SWMR writer
4) Tests in test/vfd_swmr.c for VFD SWMR
5) Fix a bug in src/H5Fint.c: when error is encountered after the root group is created
Note the following:
--This is WORK IN PROGRESS and will subject to change as implementation goes.
--There is test failure form enc_dec_plist.c: I will fix this when changes to the property list are settled.
--The branch is updated with the latest from develop as of 8/14/2018
* commit '5647dea421be9dc8429f08632aa72a8a22904292': (47 commits)
Rearrange issues by date order
RELEASE.txt changes for MPI updates
Update Drop Site options and Coverage settings
Reorder bugfix release notes from latest to earliest, and miscellaneous format cleanup.
Add RELEASE.txt entry for HDFFV-10475
HDFFV-10544 Add more descriptive text
HDFFV-10544 Correct var name
HDFFV-10544 remove native from class function
HDFFV-10544 correct typo
HDFFV-10544 add release note
HDFFV-10544 add class name to error text
HDFFV-10544 exception variable as local class
Improve error handling of exceptions
Typo fix
Set CMAKE_REQUIRED_INCLUDES instead of using path in call
Add Autotools and CMake checks for big I/O MPI-3 functions
Add hdf5settings section for parallel compression status in CMake builds
HDFFV-10508 rework sentence
HDFFV-10508 clarify library differences
HDFFV-10508 spelling and grammer
...
2018-08-14 12:10:04 -05:00
1894 changed files with 549590 additions and 322381 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.