Commit Graph

189 Commits

Author SHA1 Message Date
Quincey Koziol
138bc92ebd [svn-r8126] Purpose:
Bug fix/optimization

Description:
    Address slowdown in MPI-I/O file metadata operations that was introduced
mid-stream.  We now _require_ a POSIX compliant parallel file system for the
MPI-I/O file driver (as well as for the MPI-POSIX file driver).
    Also optimized file open operation when the file is being created by
reducing the number of collective & syncronizing calls.
    Additionally, refactor the MPI routines into a common place, eliminating
duplicated code.

Platforms tested:
    FreeBSD 4.9 (sleipnir) w/parallel
    h5committest
2004-01-30 20:38:44 -05:00
Quincey Koziol
1619a308cb [svn-r7860] Purpose:
Bug fix

Description:
    Our previous "optimization" of metadata writing which only wrote metadata
from one process was abusing MPI-I/O and after some consultation with Rob Ross
and Rajeev Thakur, Albert & I have come up with a solution...

Solution:
    Instead of only writing from one process, issue a collective write
operation with all processes, for metadata writes.

Platforms tested:
    FreeBSD 4.9 (sleipnir)
    h5committest
2003-11-20 09:36:59 -05:00
Raymond Lu
d1f7c81a46 [svn-r7784] *** empty log message *** 2003-10-29 12:04:58 -05:00
Quincey Koziol
a3888a3b00 [svn-r7426] Purpose:
Bug fix

Description:
    When datasets are deleted from a file, they are removed from the sieve
buffer, but instead of invalidating only the part of the sieve buffer affected,
the sieve buffer code would throw away the entire sieve buffer, potentially
including other raw data in the buffer that hadn't been written to disk yet.

Solution:
    Improve the sieve buffer clearing code to handle partial invalidations.

Platforms tested:
    FreeBSD 4.8 (sleipnir)
    h5committest
2003-08-28 11:02:21 -05:00
Quincey Koziol
9881fa7e34 [svn-r7381] Purpose:
Code cleanup

Description:
    Various cleanups resulting from running lint tool over H5F.c source module

Platforms tested:
    FreeBSD 4.8 (sleipnir)
    too minor to require h5committest
2003-08-18 11:34:27 -05:00
Quincey Koziol
e9cc951e03 [svn-r7232] Purpose:
Bug fix

Description:
    When a non-default indexed storage B-tree internal 'K' value is set by the
user, the chunked datasets created in that file (until it is closed) use the
user's 'K' value and the data can be accessed correctly, but the 'K' value is
not stored in the file.
    However, once the file is closed and re-opened, the non-default 'K' value
is lost and the data in the chunked datasets will not be able to be accessed
correctly.

Solution:
    Store the indexed storage B-tree internal 'K' value in the superblock.

Platforms tested:
    FreeBSD 4.8 (sleipnir)
    h5committest
2003-07-16 09:56:58 -05:00
Quincey Koziol
932101bb80 [svn-r7201] Purpose:
Code cleanup

Description:
    Finish converting the B-tree 'K' values to use unsigned integers, rather
than signed ones, since negative amounts of entries in a B-tree node aren't
meaningful.

Platforms tested:
    FreeBSD 4.8 (sleipnir)
    h5committest
2003-07-10 14:39:04 -05:00
Quincey Koziol
839de1e367 [svn-r7189] Purpose:
Code cleanup

Description:
    Break some of the "debugging" routines into their own module, so they
aren't pulled into every executable, which certainly isn't going to use them.

Platforms tested:
    h5committested
2003-07-09 13:00:43 -05:00
Quincey Koziol
6028d7f9c9 [svn-r7172] Purpose:
Code cleanup

Description:
    Remove "UNUSED" keyword from function prototypes in header files.

Platforms tested:
    IA64 Linux cluster (titan)
    too small to need commit test
2003-07-07 11:20:43 -05:00
Quincey Koziol
87a5f6480c [svn-r7122] Purpose:
Code cleanup/Feature removal

Description:
    Remove "round robin" writing of metadata in MPI VFDs (and a couple of other
places).  This never really worked the way I hoped it would and it causes
problems on certain machines.

Platforms tested:
    FreeBSD 4.8 (sleipnir) w/parallel
    Too small for triple check.
2003-06-30 10:53:02 -05:00
Quincey Koziol
b496ac1482 [svn-r6878] Purpose:
Code cleanup

Description:
    Limit the scope on more function prototypes/macros/typedefs.

Platforms tested:
    FreeBSD 4.8 (sleipnir)
    h5committest not necessary.
2003-05-15 14:22:33 -05:00
Quincey Koziol
7d9c86097a [svn-r6843] Purpose:
Code cleanup

Description:
    Clean up warnings exposed by compiling on O2K.  Also, revert some of Bill
and my changes to the H5S_mpi_opt_types_g, etc. and settle them back into their
original location.

Platforms tested:
    h5committested.
2003-05-09 13:18:21 -05:00
Quincey Koziol
43e3b45021 [svn-r6825] Purpose:
New feature/enhancement

Description:
    Chunked datasets are handled poorly in several circumstances involving
certain selections and chunks that are too large for the chunk cache and/or
chunks with filters, causing the chunk to be read from disk multiple times.

Solution:
    Rearrange raw data I/O infrastructure to handle chunked datasets in a much
more friendly way by creating a selection in memory and on disk for each chunk
in a chunked dataset and performing all of the I/O on that chunk at one time.

    There are still some scalability (the current code attempts to
create a selection for all the chunks in the dataset, instead of just the
chunks that are accessed, requiring portions of the istore.c and fillval.c
tests to be commented out) and performance issues, but checking this in will
allow the changes to be tested by a much wider audience while I address the
remaining issues.


Platforms tested:
    h5committested, FreeBSD 4.8 (sleipnir) serial & parallel, Linux 2.4 (eirene)
2003-05-07 16:52:24 -05:00
Bill Wendling
b5d7fa02a9 [svn-r6546] Purpose:
Update

Description:
    Updated copyright statement in files which hadn't been updated yet.

Platforms tested:
    Linux (Only comment change)

Misc. update:
2003-03-31 13:30:57 -05:00
Quincey Koziol
a52914987e [svn-r6497] Purpose:
Finish code cleanup

Description:
    Wrap up the conversion of H5F_flush's multiple boolean flags into a single
bitfield of flags by pushing the flags down into the H5AC_flush and
H5F_istore_flush routines.
    Also, changed the flags from H5_FLUSH_<foo> to H5F_FLUSH_<foo> to be more
consistent with rest of library.
    And reverted the changes to H5FDflush and H5FD_flush routines.

Platforms tested:
    FreeBSD 4.7 (sleipnir)
    Solaris 5.8 (sol)
    IRIX64 6.5 (modi4) w/parallel

Misc. update:
2003-03-19 13:58:54 -05:00
Quincey Koziol
24d8506dd5 [svn-r6387] Purpose:
Bug Fix

Description:
    Metadata cache in parallel I/O can cause hangs in applications which
    perform independent I/O on chunked datasets, because the metadata cache
    can attempt to flush out dirty metadata from only a single process, instead
    of collectively from all processes.

Solution:
    Pass a dataset transfer property list down from every API function which
    could possibly trigger metadata I/O.

    Then, split the metadata cache into two sets of entries to allow dirty
    metadata to be set aside when a hash table collision occurs during
    independent I/O.

Platforms tested:
    Tested h5committest {arabica (fortran), eirene (fortran, C++)
        modi4 (parallel, fortran)}

    FreeBSD 4.7 (sleipnir) serial & parallel

Misc. update:
    Updated release_docs/RELEASE
2003-02-10 12:26:09 -05:00
Quincey Koziol
9a433b99a5 [svn-r6252] Purpose:
Lots of performance improvements & a couple new internal API interfaces.

Description:
    Performance Improvements:
        - Cached file offset & length sizes in shared file struct, to avoid
            constantly looking them up in the FCPL.
        - Generic property improvements:
            - Added "revision" number to generic property classes to speed
                up comparisons.
            - Changed method of storing properties from using a hash-table
                to the TBBT routines in the library.
            - Share the propery names between classes and the lists derived
                from them.
            - Removed redundant 'def_value' buffer from each property.
            - Switching code to use a "copy on write" strategy for
                properties in each list, where the properties in each list
                are shared with the properties in the class, until a
                property's value is changed in a list.
        - Fixed error in layout code which was allocating too many buffers.
        - Redefined public macros of the form (H5open()/H5check, <variable>)
            internally to only be (<variable>), avoiding innumerable useless
            calls to H5open() and H5check_version().
        - Reuse already zeroed buffers in H5F_contig_fill instead of
            constantly re-zeroing them.
        - Don't write fill values if writing entire dataset.
        - Use gettimeofday() system call instead of time() system when
            checking the modification time of a dataset.
        - Added reference counted string API and use it for tracking the
            names of objects opening in a file (for the ID->name code).
        - Removed redundant H5P_get() calls in B-tree routines.
        - Redefine H5T datatype macros internally to the library, to avoid
            calling H5check redundantly.
        - Keep dataspace information for dataset locally instead of reading
            from disk each time.  Added new module to track open objects
            in a file, to allow this (which will be useful eventually for
            some FPH5 metadata caching issues).
        - Remove H5AC_find macro which was inlining metadata cache lookups,
            and call function instead.
        - Remove redundant memset() calls from H5G_namei() routine.
        - Remove redundant checking of object type when locating objects
            in metadata cache and rely on the address only.
        - Create default dataset object to use when default dataset creation
            property list is used to create datasets, bypassing querying
            for all the property list values.
        - Use default I/O vector size when performing raw data with the
            default dataset transfer property list, instead of querying for
            I/O vector size.
        - Remove H5P_DEFAULT internally to the library, replacing it with
            more specific default property list based on the type of
            property list needed.
        - Remove redundant memset() calls in object header message (H5O*)
            routines.
        - Remove redunant memset() calls in data I/O routines.
        - Split free-list allocation routines into malloc() and calloc()-
            like routines, instead of one combined routine.
        - Remove lots of indirection in H5O*() routines.
        - Simplify metadata cache entry comparison routine (used when
            flushing entire cache out).
        - Only enable metadata cache statistics when H5AC_DEBUG is turned
            on, instead of always tracking them.
        - Simplify address comparison macro (H5F_addr_eq).
        - Remove redundant metadata cache entry protections during dataset
            creation by protecting the object header once and making all
            the modifications necessary for the dataset creation before
            unprotecting it.
        - Reduce # of "number of element in extent" computations performed
            by computing and storing the value during dataspace creation.
        - Simplify checking for group location's file information, when file
            has not been involving in file-mounting operations.
        - Use binary encoding for modification time, instead of ASCII.
        - Hoist H5HL_peek calls (to get information in a local heap)
            out of loops in many group routine.
        - Use static variable for iterators of selections, instead of
            dynamically allocation them each time.
        - Lookup & insert new entries in one step, avoiding traversing
            group's B-tree twice.
        - Fixed memory leak in H5Gget_objname_idx() routine (tangential to
            performance improvements, but fixed along the way).
        - Use free-list for reference counted strings.
        - Don't bother copying object names into cached group entries,
            since they are re-created when an object is opened.

        The benchmark I used to measure these results created several thousand
        small (2K) datasets in a file and wrote out the data for them.  This is
        Elena's "regular.c" benchmark.

        These changes resulted in approximately ~4.3x speedup of the
        development branch when compared to the previous code in the
        development branch and ~1.4x speedup compared to the release
        branch.

        Additionally, these changes reduce the total memory used (code and
        data) by the development branch by ~800KB, bringing the development
        branch back into the same ballpark as the release branch.

        I'll send out a more detailed description of the benchmark results
        as a followup note.

    New internal API routines:
        Added "reference counted strings" API for tracking strings that get
            used by multiple owners without duplicating the strings.
        Added "ternary search tree" API for text->object mappings.

Platforms tested:
    Tested h5committest {arabica (fortran), eirene (fortran, C++)
	modi4 (parallel, fortran)}
    Other platforms/configurations tested?
        FreeBSD 4.7 (sleipnir) serial & parallel
        Solaris 2.6 (baldric) serial
2003-01-09 12:20:03 -05:00
MuQun Yang
e5b28ef37b [svn-r5931]
Purpose:
__DLL__ is a keyword in some platforms and __DLL__ is also defined as a macro for windows DLL applications.
That causes problems.
Description:
Solution:
Use H5_DLL*** to replace __DLL***__ at all header files.
Change the macro defination at H5api_adpt.h.
Platforms tested:
linux2.2.18smp, irix64, solaris 2.7 and windows 2000
2002-09-20 15:36:09 -05:00
Quincey Koziol
32b58cef08 [svn-r5894] Purpose:
Bug fix/Code cleanup/New Feature

Description:
    Correct problems with writing fill-values to external storage and allocate
    the data storage at the correct times.

    Also, mostly straighten out the strange code which allocates and fills
    raw data storage for datasets.  Things are still a bit odd in that the
    fill-values for chunked datasets are written when the space is allocated,
    instead of in a separate routine, but there are two reasons for this:
    it's inefficient (especially in parallel) to iterate through all the chunks
    twice, and (more importantly) the space needed to store compressed chunks
    isn't known until we've got a buffer of compressed fill-values ready to
    write to the chunk.

    Additionally, add in the H5D_SPACE_ALLOC_INCR and H5D_SPACE_ALLOC_DEFAULT
    setting for the "space time", which incorporate the previous behavior of
    the space allocation for chunked datasets.

    The default settings for the different types of dataset storage are now
    as follows:
        Contiguous - Late
        Chunked    - Incremental
        Compact    - Early

    This checkin also incorporates a change to the behavior of external data
    storage in two ways - fill-values are _never_ written to external storage
    (under the assumption that writing fill-values is triggered by allocating
    space in an HDF5 file, and since space is not allocated in the file, the
    fill-values should not be written) and external data files are now created
    if they don't exist when data is written to them.  The fill-value will
    probably need to be revisited at some time in the future, this just seemed
    like the safer course currently.

    I think I cleaned up some compiler errors also, before getting bogged down
    in the fixes for the space allocation and fill-values.

Platforms tested:
    FreeBSD 4.6 (sleipnir) w/serial & parallel.  Will be testing on IRIX64
    6.5 (modi4) in serial & parallel shortly.
2002-08-27 08:41:32 -05:00
Raymond Lu
29da4951f8 [svn-r5879]
Purpose:
    Design for compact dataset
Description:
    Compact dataset is stored in the header message for dataset layout.
Platforms tested:
    arabica, eirene.
2002-08-20 11:18:02 -05:00
Quincey Koziol
363ec52b7c [svn-r5799] Purpose:
New feature.

Description:
    Added MPI-posix VFL driver.  This driver uses MPI to coordinate actions, but
    performs I/O directly with posix sec(2) I/O functions.  This driver should
    _NOT_ be used if the file accessed is not on a parallel filesystem.

Platforms tested:
    FreeBSD 4.6 (sleipnir) w/parallel & IRIX64 6.5 (modi4) w/parallel
2002-07-15 10:21:32 -05:00
Quincey Koziol
aefc39ac32 [svn-r5667] Purpose:
Code cleanup

Description:
    Turn on more warnings in the IRIX builds and clean them up.

Platforms tested:
    IRIX64 6.5 (modi4) w/parallel
2002-06-19 07:54:53 -05:00
Quincey Koziol
b803308acd [svn-r5538] Purpose:
Code improvement.

Description:
    Added boot block and driver info block checksumming feature to the shared
    file information.  This prevents these blocks from being written out
    multiple times when they haven't changed.

    This reduces the number of I/O operations which hit the disk for my test
    program from 15 to 14 (i.e. from 393 to 14, overall).

Platforms tested:
    Solaris 2.7 (arabica) w/FORTRAN and FreeBSD 4.5 (sleipnir) w/C++
2002-06-05 13:18:56 -05:00
Quincey Koziol
a6b4cba798 [svn-r5429] Purpose:
Bug fix/Code improvement.

Description:
    Currently, the chunk data allocation routine invoked to allocate space for
    the entire dataset is inefficient.  It writes out each chunk in the dataset,
    whether it is already allocated or not.  Additionally, this happens not
    only when it is created, but also anytime it is opened for writing, or the
    dataset is extended.  Worse, there's too much parallel I/O syncronization,
    which slows things down even more.

Solution:
    Only attempt to write out chunks that don't already exist.  Additionally,
    share the I/O writing between all the nodes, instead of writing everything
    with process 0.  Then, only block with MPI_Barrier if chunks were actually
    created.

Platforms tested:
    IRIX64 6.5 (modi4)
2002-05-17 07:53:46 -05:00
MuQun Yang
7971d2873b [svn-r5325]
Purpose:
   minor bugs fixed
Description:
    Typos in H5Fpkg.h and H5TB.c
Solution:
Platforms tested:
linux 2.2.18
2002-05-02 13:05:11 -05:00
Albert Cheng
73683e4380 [svn-r5278] Purpose:
Migrate from configure macros of XYZ_ABC to H5_XYZ_ABC
Description:
    configure generates many macros definitions on the fly and
    were stored in src/H5config.h which is included by H5public.h.
    But other software that uses hdf5 may also run their own configure.
    There can be a clash in macro name space.  We decided awhile ago
    to prepend all generated macros with "H5_" to avoid conflicts.
    The process has started and this commit completes it (at least attempt
    to).
Solution:
    Many macros symbols (e.g. SIZEOF_xxx and HAVE_xxx were changed to
    H5_SIZEOF_xxx and H5_HAVE_xxx).  Then H5private.h no longer includes
    H5config.h.  This cuts H5config.h away from HDF5 source code.
Pending issues:
    The module of fortran and pablo are to be resolved in a different
    commit.
Platforms tested:
    eirene (parallel), arabica (solaris 7 --enable-fortran, --enable-cxx)
2002-04-28 03:34:17 -05:00
Quincey Koziol
d33f7d93a3 [svn-r5259] Purpose:
Code cleanup

Description:
    Previously, the I/O pipeline (pline), external file list (efl) and fill-
    value (fill) structs were passed down the raw data function call chain,
    even into and/or through functions which didn't use them.  Since all three
    of these pieces of information are available from the dataset creation
    property list, just pass the dataset creation property list down the
    function call chain and query for the information needed in a particular
    function.

Platforms tested:
    FreeBSD 4.5 (sleipnir)
2002-04-25 12:56:56 -05:00
Quincey Koziol
d2232a345f [svn-r5130] Purpose:
Bug Fix & Feature

Description:
    The selection offset was being ignored for optimized hyperslab selection
    I/O operations.

    Additionally, I've found that the restrictions on optimized selection
    I/O operations were too strict and found a way to allow more hyperslabs
    to use the optimized I/O routines.

Solution:
    Incorporate the selection offset into the selection location when performing
    optimized I/O operations.

    Allow optimized I/O on any single hyperslab selection and also allow
    hyperslab operations on chunked datasets.

Platforms tested:
    FreeBSD 4.5 (sleipnir)
2002-04-02 15:51:41 -05:00
Raymond Lu
d28fd08f4a [svn-r4696]
Purpose:
    Modify H5Fclose behavior
Description:
    The HDF5 actual file close behaves in several ways in terms of if there
    are still objects(dataset, group, datatype) opened in file.
Solution:
    Added a new file access property, file close degree.  It has four values,
	H5F_CLOSE_DEFAULT
	H5F_CLOSE_WEAK
	H5F_CLOSE_SEMI
	H5F_CLOSE_STRONG
    The way a file is closed is decided by these values.
Platforms tested:
    IRIX64 6.5, SunOS 5.6, FreeBSD 4.4
2001-12-11 14:53:44 -05:00
Quincey Koziol
d456c2bb82 [svn-r4643] Purpose:
Code cleanup
Description:
    Windows is generating hundreds of warnings from some of the practices in
    the library.  Mostly, they are because size_t is 32-bit and hsize_t is
    64-bit on Windows and we were carelessly casting the larger values down to
    the smaller ones without checking for overflow.

    Also, some other small code cleanups,etc.

Solution:
    Re-worked some algorithms to eliminate the casts and also added more
    overflow checking for assignments and function parameters which needed
    casts.

    Kent did most of the work, I just went over his changes and fit them into
    the the library code a bit better.

Platforms tested:
    FreeBSD 4.4 (hawkwind)
2001-11-27 11:29:13 -05:00
Raymond Lu
b6da4ea427 [svn-r4569]
Purpose:
    Generic Property List Change
Description:
    Changed file access list to the new generic list.
Platforms tested:
    IRIX64, SunOS5.7, FreeBSD
2001-10-24 13:02:27 -05:00
Raymond Lu
fe76b00dc6 [svn-r4543]
Purpose:
    Changed the file creation property list to the new generic property list.
Platform tested:
    IRIX64, SunOS5.7, FreeBSD
2001-10-15 14:36:48 -05:00
Quincey Koziol
e87fc517b8 [svn-r4355] Purpose:
Code cleanup (sorta)

Description:
    When the first versions of the HDF5 library were designed, I remembered
    vividly the difficulties of porting code from a 32-bit platform to a 16-bit
    platform and asked that people use intn & uintn instead of int & unsigned
    int, respectively.  However, in hindsight, this was overkill and
    unnecessary since we weren't going to be porting the HDF5 library to
    16-bit architectures.

    Currently, the extra uintn & intn typedefs are causing problems for users
    who'd like to include both the HDF5 and HDF4 header files in one source
    module (like Kent's h4toh5 library).

Solution:
    Changed the uintn & intn's to unsigned and int's respectively.

Platforms tested:
    FreeBSD 4.4 (hawkwind)
2001-08-14 17:09:56 -05:00
Quincey Koziol
990fadfbe5 [svn-r4181] Purpose:
Bug Fix, Code Cleanup, Code Optimization, etc.
Description:
    Fold in the hyperslab speedups, clean up compile warnings and change a
    few things from using 'unsigned' or 'hsize_t' to use 'size_t' instead.
Platforms tested:
    FreeBSD 4.3 (hawkwind), Solaris 2.7 (arabica), Irix64 6.5 (modi4)
2001-07-10 16:19:18 -05:00
Quincey Koziol
076efa3382 [svn-r3885] Purpose:
Document bug fix
Description:
    IMPORTANT! IMPORTANT! IMPORTANT!
    A case where metadata in a file could get corrupted in certain unusual
    sitations was detected and fixed.

    In certain circumstances, metadata could get cached in the raw data cache,
    and if that particular piece of metadata was updated on disk while
    incorrectly cached, the new metadata would get overwritten with the stale
    metadata from the raw data cache when it was flushed out.

    Additionally, I've patched up the raw data cache to be smarter about how
    much it caches and how much I/O it triggers, leading to some speedups.

Solution:
    Changed the raw data I/O routines which perform caching to require a
    parameter with the size of the dataset being accessed and limited the
    cache to no more than that many bytes.

Platforms tested:
    FreeBSD 4.3 (hawkwind)
2001-05-02 10:04:55 -05:00
Bill Wendling
5e483d0184 [svn-r3781] Purpose:
Update
Description:
    Changed

        #include <hdf_file.h>

    construct to

        #include "hdf_file.h"

    so that the GNU compiler can more easily pick up the dependencies
    which it places in the .depend and Dependencies files. Also
    regenerated the Dependencies to go along with this.
Platforms tested:
    Linux
2001-04-05 12:29:14 -05:00
Quincey Koziol
35bc545296 [svn-r3252] Purpose:
Code cleanup.
Description:
    Fixed _lots_ (I mean _tons_) of warnings spit out by the gcc with the
    extra warnings.  Including a few show-stoppers for compression on IRIX
    machines.
Solution:
    Changed lots of variables' types to more sensible and consistent types,
    more range-checking, more variable typecasts, etc.
Platforms tested:
    FreeBSD 4.2 (hawkwind), IRIX64-64 (modi4)
2001-01-09 16:22:30 -05:00
Quincey Koziol
6aa0dd1620 [svn-r2722] Purpose:
Feature symmetry
Description:
    A while ago I needed to get the 'type' of data being accessed during writes
    to the VFL driver, so I put in code to get the information down there.
    Albert asked for the same information during reads, so I've added that in.
Tested:
    FreeBSD 4.1.1 (hawkwind)
2000-10-24 13:18:09 -05:00
Quincey Koziol
ba28c64ba7 [svn-r2652] Purpose:
Maintainance & performance enhancements
Description:
    Re-arranged header files to protect private symbols better.

    Changed optimized regular hyperslab I/O to compute the offsets more
    efficiently from previous method of using matrix operations.

    Added sequential I/O operations at a more abstract level (at the same level
    as H5F_arr_read/write), to support the optimized hyperslab I/O.

Platforms tested:
    Solaris 2.6 (baldric) & FreeBSD 4.1.1 (hawkwind)
2000-10-10 02:43:38 -05:00