[svn-r2208] Big.html --> BigDataSmMach.html

Coding.html --> NamingScheme.html CodeReview.html ExternalFiles.html compat.html --> H4-H5Compat.html heap.txt --> HeapMgmt.html IOPipe.html Lib_Maint.html --> LibMaint.html MemoryManagement.html move.html --> MoveDStruct.html ObjectHeader.txt storage.html --> RawDStorage.html symtab --> SymbolTables.html Version.html Above files moved from doc/html/ to doc/html/TechNotes/ for inclusion in the new "HDF5 Technical Notes" document. Filenames changed as indicated.
2000-05-01 16:31:11 -05:00
parent 74f1fc208d
commit 7749127d80
14 changed files with 2964 additions and 0 deletions
--- a/doc/html/TechNotes/BigDataSmMach.html
+++ b/doc/html/TechNotes/BigDataSmMach.html
@@ -0,0 +1,122 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Big Datasets on Small Machines</title>
+  </head>
+
+  <body>
+    <h1>Big Datasets on Small Machines</h1>
+
+    <h2>1. Introduction</h2>
+
+    <p>The HDF5 library is able to handle files larger than the
+      maximum file size, and datasets larger than the maximum memory
+      size.  For instance, a machine where <code>sizeof(off_t)</code>
+      and <code>sizeof(size_t)</code> are both four bytes can handle
+      datasets and files as large as 18x10^18 bytes.  However, most
+      Unix systems limit the number of concurrently open files, so a
+      practical file size limit is closer to 512GB or 1TB.
+
+    <p>Two "tricks" must be employed on these small systems in order
+      to store large datasets.  The first trick circumvents the
+      <code>off_t</code> file size limit and the second circumvents
+      the <code>size_t</code> main memory limit.
+
+    <h2>2. File Size Limits</h2>
+
+    <p>Systems that have 64-bit file addresses will be able to access
+      those files automatically.  One should see the following output
+      from configure:
+
+    <p><code><pre>
+checking size of off_t... 8
+    </pre></code>
+
+    <p>Also, some 32-bit operating systems have special file systems
+      that can support large (&gt;2GB) files and HDF5 will detect
+      these and use them automatically.  If this is the case, the
+      output from configure will show:
+
+    <p><code><pre>
+checking for lseek64... yes
+checking for fseek64... yes
+    </pre></code>
+
+    <p>Otherwise one must use an HDF5 file family.  Such a family is
+      created by setting file family properties in a file access
+      property list and then supplying a file name that includes a
+      <code>printf</code>-style integer format.  For instance:
+
+    <p><code><pre>
+hid_t plist, file;
+plist = H5Pcreate (H5P_FILE_ACCESS);
+H5Pset_family (plist, 1&lt;&lt;30, H5P_DEFAULT);
+file = H5Fcreate ("big%03d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, plist);
+    </pre></code>
+
+    <p>The second argument (<code>1&lt;&lt;30</code>) to
+      <code>H5Pset_family()</code> indicates that the family members
+      are to be 2^30 bytes (1GB) each, although we could have used any
+      reasonably large value.  In general, family members cannot be
+      2GB because writes to byte number 2,147,483,647 will fail, so
+      the largest safe size for a family member is 2,147,483,647 bytes.
+      HDF5 will create family members on demand as the HDF5 address
+      space increases, but since most Unix systems limit the number of
+      concurrently open files the effective maximum size of the HDF5
+      address space will be limited (the system on which this was
+      developed allows 1024 open files, so if each family member is
+      approx 2GB then the largest HDF5 file is approx 2TB).
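+
+    <p>The arithmetic in the preceding paragraph can be sketched in
+      plain C.  This is only an illustration and not the library's
+      family driver: <code>sketch_family_locate</code> and
+      <code>sketch_family_capacity</code> are hypothetical names, and
+      the member name pattern is simply the one from the example above.
+

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: map an address in the family's linear address
 * space to a member file name and an offset within that member. */
int sketch_family_locate(uint64_t addr, uint64_t member_size,
                         char *name, size_t name_len, uint64_t *offset)
{
    uint64_t member;
    int n;

    assert(member_size > 0 && name && offset);
    member = addr / member_size;
    *offset = addr % member_size;
    n = snprintf(name, name_len, "big%03d.h5", (int)member);
    return (n < 0 || (size_t)n >= name_len) ? -1 : 0;
}

/* Hypothetical helper: the largest address space reachable when at most
 * max_open member files can be open at the same time. */
uint64_t sketch_family_capacity(uint64_t member_size, unsigned max_open)
{
    return member_size * (uint64_t)max_open;
}
```
+
+    <p>With 1GB members, the address 3*2^30+5 falls at offset 5 of the
+      fourth member, and 1024 members of just under 2^31 bytes each
+      give an address space of just under 2TB.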
+
+    <p>If the effective HDF5 address space is limited then one may be
+      able to store datasets as external datasets each spanning
+      multiple files of any length since HDF5 opens external dataset
+      files one at a time.  To arrange storage for a 5TB dataset split
+      among 1GB files one could say:
+
+    <p><code><pre>
+hid_t plist = H5Pcreate (H5P_DATASET_CREATE);
+char name[32];
+int i;
+
+/* 5*1024 external files of 1GB each hold 5TB in total */
+for (i=0; i&lt;5*1024; i++) {
+   sprintf (name, "velocity-%04d.raw", i);
+   H5Pset_external (plist, name, 0, (size_t)1&lt;&lt;30);
+}
+    </pre></code>
+
+    <h2>3. Dataset Size Limits</h2>
+
+    <p>The second limit which must be overcome is that of
+      <code>sizeof(size_t)</code>.  HDF5 defines a data type called
+      <code>hsize_t</code> which is used for sizes of datasets and is,
+      by default, defined as <code>unsigned long long</code>.
+
+    <p>To create a dataset with 8*2^30 4-byte integers for a total of
+      32GB one first creates the dataspace.  We give two examples
+      here: a 4-dimensional dataset whose dimension sizes are smaller
+      than the maximum value of a <code>size_t</code>, and a
+      1-dimensional dataset whose dimension size is too large to fit
+      in a <code>size_t</code>.
+
+    <p><code><pre>
+hsize_t size1[4] = {8, 1024, 1024, 1024};
+hid_t space1 = H5Screate_simple (4, size1, size1);
+
+hsize_t size2[1] = {8589934592LL};
+hid_t space2 = H5Screate_simple (1, size2, size2);
+    </pre></code>
+
+    <p>However, the <code>LL</code> suffix is not portable, so it may
+      be better to replace the number with
+      <code>(hsize_t)8*1024*1024*1024</code>.
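+
+    <p>The byte count from the example can be checked with a small
+      sketch that does all of its arithmetic in the wide type.  It
+      assumes <code>hsize_t</code> is <code>unsigned long long</code>,
+      as stated above, and uses a stand-in typedef and a hypothetical
+      helper so that it compiles without the HDF5 headers.
+

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for hsize_t; the text says it defaults to unsigned long long. */
typedef unsigned long long sketch_hsize_t;

/* Total size in bytes of a dataset with the given dimensions.  Every
 * multiplication is carried out in the wide type, so no intermediate
 * result is truncated to int or size_t. */
sketch_hsize_t sketch_dataset_bytes(const sketch_hsize_t *dims, int rank,
                                    size_t elem_size)
{
    sketch_hsize_t total = (sketch_hsize_t)elem_size;
    int i;

    assert(dims && rank > 0);
    for (i = 0; i < rank; i++) {
        total *= dims[i];
    }
    return total;
}
```
+
+    <p>Written without the leading cast, <code>8*1024*1024*1024</code>
+      would be evaluated as <code>int</code> and overflow on machines
+      where <code>int</code> is 32 bits.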
+
+    <p>For compilers that don't support <code>long long</code>, large
+      datasets will not be possible.  The library performs too much
+      arithmetic on <code>hsize_t</code> types to make the use of a
+      struct feasible.
+
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Fri Apr 10 13:26:04 EDT 1998 -->
+<!-- hhmts start -->
+Last modified: Sun Jul 19 11:37:25 EDT 1998
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/CodeReview.html
+++ b/doc/html/TechNotes/CodeReview.html
@@ -0,0 +1,300 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Code Review</title>
+  </head>
+  <body>
+    <center><h1>Code Review 1</h1></center>
+
+    <h3>Some background...</h3>
+    <p>This is one of the functions exported from the
+      <code>H5B.c</code> file that implements a B-link-tree class
+      without worrying about concurrency yet (thus the `Note:' in the
+      function prologue). The <code>H5B.c</code> file provides the
+      basic machinery for operating on generic B-trees, but it isn't
+      much use by itself. Various subclasses of the B-tree (like
+      symbol tables or indirect storage) provide their own interface
+      and back end to this function.  For instance,
+      <code>H5G_stab_find()</code> takes a symbol table OID and a name
+      and calls <code>H5B_find()</code> with an appropriate
+      <code>udata</code> argument that eventually gets passed to the
+      <code>H5G_stab_find()</code> function.
+
+    <p><code><pre>
+ 1 /*-------------------------------------------------------------------------
+ 2  * Function:    H5B_find
+ 3  *
+ 4  * Purpose:     Locate the specified information in a B-tree and return
+ 5  *              that information by filling in fields of the caller-supplied
+ 6  *              UDATA pointer depending on the type of leaf node
+ 7  *              requested.  The UDATA can point to additional data passed
+ 8  *              to the key comparison function.
+ 9  *
+10  * Note:        This function does not follow the left/right sibling
+11  *              pointers since it assumes that all nodes can be reached
+12  *              from the parent node.
+13  *
+14  * Return:      Success:        SUCCEED if found, values returned through the
+15  *                              UDATA argument.
+16  *
+17  *              Failure:        FAIL if not found, UDATA is undefined.
+18  *
+19  * Programmer:  Robb Matzke
+20  *              matzke@llnl.gov
+21  *              Jun 23 1997
+22  *
+23  * Modifications:
+24  *
+25  *-------------------------------------------------------------------------
+26  */
+27 herr_t
+28 H5B_find (H5F_t *f, const H5B_class_t *type, const haddr_t *addr, void *udata)
+29 {
+30    H5B_t        *bt=NULL;
+31    intn         idx=-1, lt=0, rt, cmp=1;
+32    int          ret_value = FAIL;
+    </pre></code>
+
+    <p>All pointer arguments are initialized when defined. I don't
+      worry much about non-pointers because it's usually obvious when
+      the value isn't initialized.
+
+    <p><code><pre>
+33 
+34    FUNC_ENTER (H5B_find, NULL, FAIL);
+35 
+36    /*
+37     * Check arguments.
+38     */
+39    assert (f);
+40    assert (type);
+41    assert (type->decode);
+42    assert (type->cmp3);
+43    assert (type->found);
+44    assert (addr && H5F_addr_defined (addr));
+    </pre></code>
+
+    <p>I use <code>assert</code> to check invariant conditions. At
+      this level of the library, none of these assertions should fail
+      unless something is majorly wrong.  The arguments should have
+      already been checked by higher layers.  It also provides
+      documentation about what arguments might be optional.
+
+    <p><code><pre>
+45    
+46    /*
+47     * Perform a binary search to locate the child which contains
+48     * the thing for which we're searching.
+49     */
+50    if (NULL==(bt=H5AC_protect (f, H5AC_BT, addr, type, udata))) {
+51       HGOTO_ERROR (H5E_BTREE, H5E_CANTLOAD, FAIL);
+52    }
+    </pre></code>
+
+    <p>You'll see this quite often in the low-level stuff and it's
+      documented in the <code>H5AC.c</code> file.  The
+      <code>H5AC_protect</code> ensures that the B-tree node (which
+      inherits from the H5AC package) whose OID is <code>addr</code>
+      is locked into memory for the duration of this function (see the
+      <code>H5AC_unprotect</code> on line 90).  Most likely, if this
+      node has been accessed in the not-too-distant past, it will still
+      be in memory and the <code>H5AC_protect</code> is almost a
+      no-op. If cache debugging is compiled in, then the protect also
+      prevents other parts of the library from accessing the node
+      while this function is protecting it, so this function can allow
+      the node to be in an inconsistent state while calling other
+      parts of the library.
+
+    <p>The alternative is to call the slightly cheaper
+      <code>H5AC_find</code> and assume that the pointer it returns is
+      valid only until some other library function is called, but
+      since we're accessing the pointer throughout this function, I
+      chose to use the simpler protect scheme. All protected objects
+      <em>must be unprotected</em> before the file is closed, thus the
+      use of <code>HGOTO_ERROR</code> instead of
+      <code>HRETURN_ERROR</code>.
+
+    <p><code><pre>
+53    rt = bt->nchildren;
+54 
+55    while (lt&lt;rt && cmp) {
+56       idx = (lt + rt) / 2;
+57       if (H5B_decode_keys (f, bt, idx)&lt;0) {
+58          HGOTO_ERROR (H5E_BTREE, H5E_CANTDECODE, FAIL);
+59       }
+60 
+61       /* compare */
+62       if ((cmp=(type-&gt;cmp3)(f, bt->key[idx].nkey, udata,
+63                             bt->key[idx+1].nkey))&lt;0) {
+64          rt = idx;
+65       } else {
+66          lt = idx+1;
+67       }
+68    }
+69    if (cmp) {
+70       HGOTO_ERROR (H5E_BTREE, H5E_NOTFOUND, FAIL);
+71    }
+    </pre></code>
+
+    <p>Code is arranged in paragraphs with a comment starting each
+    paragraph. The previous paragraph is a standard binary search
+    algorithm. The <code>(type-&gt;cmp3)()</code> is an indirect
+    function call into the subclass of the B-tree.  All indirect
+    function calls have the function part in parentheses to document
+    that it's indirect (quite obvious here, but not so obvious when
+    the function is a variable).
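+
+    <p>The search loop can be tried outside the library with a
+      stand-alone sketch.  Integer keys and the names
+      <code>sketch_find_child</code> and <code>sketch_int_cmp3</code>
+      are assumptions made for illustration; in the library the
+      comparison is supplied by the B-tree subclass.
+

```c
#include <assert.h>

/* Three-way comparison supplied by a "subclass": negative if target lies
 * left of [lo, hi), zero if inside it, positive if at or beyond hi. */
typedef int (*sketch_cmp3_t)(int lo, int target, int hi);

int sketch_int_cmp3(int lo, int target, int hi)
{
    if (target < lo) return -1;
    if (target >= hi) return 1;
    return 0;
}

/* keys[i] and keys[i+1] bound child i, so there are nchildren+1 keys.
 * Returns the index of the child that may hold target, or -1 if none. */
int sketch_find_child(const int *keys, int nchildren, int target,
                      sketch_cmp3_t cmp3)
{
    int idx = -1, lt = 0, rt = nchildren, cmp = 1;

    assert(keys && cmp3 && nchildren >= 0);
    while (lt < rt && cmp) {
        idx = (lt + rt) / 2;
        /* indirect call written with parentheses, as in the review */
        if ((cmp = (cmp3)(keys[idx], target, keys[idx + 1])) < 0) {
            rt = idx;
        } else {
            lt = idx + 1;
        }
    }
    return cmp ? -1 : idx;
}
```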
+
+    <p>It's also my standard practice to have side effects in
+      conditional expressions because I can write code faster and it's
+      more apparent to me what the condition is testing.  But if I
+      have an assignment in a conditional expr, then I use an extra
+      set of parens even if they're not required (usually they are, as
+      in this case) so it's clear that I meant <code>=</code> instead
+      of <code>==</code>.
+
+    <p><code><pre>
+72 
+73    /*
+74     * Follow the link to the subtree or to the data node.
+75     */
+76    assert (idx&gt;=0 && idx&lt;bt->nchildren);
+77    if (bt->level > 0) {
+78       if ((ret_value = H5B_find (f, type, bt->child+idx, udata))&lt;0) {
+79          HGOTO_ERROR (H5E_BTREE, H5E_NOTFOUND, FAIL);
+80       }
+81    } else {
+82       ret_value = (type-&gt;found)(f, bt->child+idx, bt->key[idx].nkey,
+83                                 udata, bt->key[idx+1].nkey);
+84       if (ret_value&lt;0) {
+85          HGOTO_ERROR (H5E_BTREE, H5E_NOTFOUND, FAIL);
+86       }
+87    }
+    </pre></code>
+
+    <p>Here I broke the "side effect in conditional" rule, which I
+      sometimes do if the expression is so long that the
+      <code>&lt;0</code> gets lost at the end.  Another thing to note is
+      that success/failure is always determined by comparing with zero
+      instead of <code>SUCCEED</code> or <code>FAIL</code>. I do this
+      because occasionally one might want to return other meaningful
+      values (always non-negative) or distinguish between various types of
+      failure (always negative).
+
+    <p><code><pre>
+88 
+89 done:
+90    if (bt && H5AC_unprotect (f, H5AC_BT, addr, bt)&lt;0) {
+91       HRETURN_ERROR (H5E_BTREE, H5E_PROTECT, FAIL);
+92    }
+93    FUNC_LEAVE (ret_value);
+94 }
+    </pre></code>
+
+    <p>For lack of a better way to handle errors during error cleanup,
+      I just call the <code>HRETURN_ERROR</code> macro even though it
+      will make the error stack not quite right.  I also use short
+      circuiting boolean operators instead of nested <code>if</code>
+      statements since that's standard C practice.
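+
+    <p>The cleanup convention can be shown without the library's
+      macros.  In the sketch below a plain <code>goto</code> stands in
+      for <code>HGOTO_ERROR</code> and a counter stands in for the
+      cache; all names are hypothetical.
+

```c
#include <assert.h>
#include <stddef.h>

static int sketch_protect_count = 0;   /* hypothetical stand-in for cache state */

static int *sketch_protect(int *obj)
{
    if (obj) sketch_protect_count++;
    return obj;
}

static int sketch_unprotect(int *obj)
{
    assert(obj);
    sketch_protect_count--;
    return 0;
}

/* Returns a non-negative value on success and a negative one on failure. */
int sketch_lookup(int *node, int want)
{
    int *p = NULL;
    int ret_value = -1;

    if (NULL == (p = sketch_protect(node))) goto done;
    if (*p != want) goto done;          /* not found */
    ret_value = 0;

done:
    /* Short-circuit: unprotect only what was actually protected. */
    if (p && sketch_unprotect(p) < 0) return -1;
    return ret_value;
}
```
+
+    <p>Whichever path is taken, the protect count is back to zero when
+      the function returns, and callers test the result against zero
+      rather than against a particular constant.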
+
+      <center><h1>Code Review 2</h1></center>
+
+
+    <p>The following code is an API function from the H5F package...
+
+    <p><code><pre>
+ 1 /*--------------------------------------------------------------------------
+ 2  NAME
+ 3     H5Fflush
+ 4 
+ 5  PURPOSE
+ 6     Flush all cached data to disk and optionally invalidates all cached
+ 7     data.
+ 8 
+ 9  USAGE
+10     herr_t H5Fflush(fid, invalidate)
+11         hid_t fid;              IN: File ID of file to close.
+12         hbool_t invalidate;     IN: Invalidate all of the cache?
+13 
+14  ERRORS
+15     ARGS      BADTYPE       Not a file atom. 
+16     ATOM      BADATOM       Can't get file struct. 
+17     CACHE     CANTFLUSH     Flush failed. 
+18 
+19  RETURNS
+20     SUCCEED/FAIL
+21 
+22  DESCRIPTION
+23         This function flushes all cached data to disk and, if INVALIDATE
+24     is non-zero, removes cached objects from the cache so they must be
+25     re-read from the file on the next access to the object.
+26 
+27  MODIFICATIONS:
+28 --------------------------------------------------------------------------*/
+    </pre></code>
+
+    <p>An API prologue is used for each API function instead of my
+      normal function prologue. I use the prologue from Code Review 1
+      for non-API functions because it's more suited to C programmers,
+      it requires less work to keep it synchronized with the code, and
+      I have better editing tools for it.
+
+    <p><code><pre>
+29 herr_t
+30 H5Fflush (hid_t fid, hbool_t invalidate)
+31 {
+32    H5F_t        *file = NULL;
+33 
+34    FUNC_ENTER (H5Fflush, H5F_init_interface, FAIL);
+35    H5ECLEAR;
+    </pre></code>
+
+    <p>API functions are never called internally, therefore I always
+      clear the error stack before doing anything.
+
+    <p><code><pre>
+36 
+37    /* check arguments */
+38    if (H5_FILE!=H5Aatom_group (fid)) {
+39       HRETURN_ERROR (H5E_ARGS, H5E_BADTYPE, FAIL); /*not a file atom*/
+40    }
+41    if (NULL==(file=H5Aatom_object (fid))) {
+42       HRETURN_ERROR (H5E_ATOM, H5E_BADATOM, FAIL); /*can't get file struct*/
+43    }
+    </pre></code>
+
+    <p>If something is wrong with the arguments then we raise an
+      error.  We never <code>assert</code> arguments at this level.
+      We also convert atoms to pointers since atoms are really just a
+      pointer-hiding mechanism.  Functions that can be called
+      internally always have pointer arguments instead of atoms
+      because (1) then they don't have to always convert atoms to
+      pointers, and (2) the various pointer data types provide more
+      documentation and type checking than just an <code>hid_t</code>
+      type.
+
+    <p><code><pre>
+44 
+45    /* do work */
+46    if (H5F_flush (file, invalidate)&lt;0) {
+47       HRETURN_ERROR (H5E_CACHE, H5E_CANTFLUSH, FAIL); /*flush failed*/
+48    }
+    </pre></code>
+
+    <p>An internal version of the function does the real work.  That
+      internal version calls <code>assert</code> to check/document
+      its arguments and can be called from other library functions.
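+
+    <p>The split between an API function and its internal counterpart
+      might look like this in miniature.  The handle table and every
+      name here are hypothetical stand-ins, not the H5A atom interface.
+

```c
#include <assert.h>

typedef struct { int dirty; } sketch_file_t;    /* hypothetical file struct */

static sketch_file_t sketch_files[4];            /* handle -> object table */

/* Internal version: takes a pointer and asserts its invariants. */
int sketch_flush_internal(sketch_file_t *file)
{
    assert(file);
    file->dirty = 0;
    return 0;
}

/* API version: takes an opaque handle and reports bad arguments as
 * errors instead of asserting. */
int sketch_flush_api(int handle)
{
    if (handle < 0 || handle >= 4) {
        return -1;                               /* not a file handle */
    }
    return sketch_flush_internal(&sketch_files[handle]);
}
```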
+
+    <p><code><pre>
+49 
+50    FUNC_LEAVE (SUCCEED);
+51 }
+    </pre></code>
+    
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Sat Nov  8 17:09:33 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Mon Nov 10 15:33:33 EST 1997
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/ExternalFiles.html
+++ b/doc/html/TechNotes/ExternalFiles.html
@@ -0,0 +1,279 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>External Files in HDF5</title>
+  </head>
+
+  <body>
+    <center><h1>External Files in HDF5</h1></center>
+
+    <h3>Overview of Layers</h3>
+
+    <p>This table shows some of the layers of HDF5.  Each layer calls
+      functions at the same or lower layers and never functions at
+      higher layers.  An object identifier (OID) takes various forms
+      at the various layers: at layer 0 an OID is an absolute physical
+      file address; at layers 1 and 2 it's an absolute virtual file
+      address. At layers 3 through 6 it's a relative address, and at
+      layers 7 and above it's an object handle.
+
+    <p><center>
+	<table border cellpadding=4 width="60%">
+	  <tr align=center>
+	    <td>Layer-7</td>
+	    <td>Groups</td>
+	    <td>Datasets</td>
+	  </tr>
+	  <tr align=center>
+	    <td>Layer-6</td>
+	    <td>Indirect Storage</td>
+	    <td>Symbol Tables</td>
+	  </tr>
+	  <tr align=center>
+	    <td>Layer-5</td>
+	    <td>B-trees</td>
+	    <td>Object Hdrs</td>
+	    <td>Heaps</td>
+	  </tr>
+	  <tr align=center>
+	    <td>Layer-4</td>
+	    <td>Caching</td>
+	  </tr>
+	  <tr align=center>
+	    <td>Layer-3</td>
+	    <td>H5F chunk I/O</td>
+	  </tr>
+	  <tr align=center>
+	    <td>Layer-2</td>
+	    <td>H5F low</td>
+	  </tr>
+	  <tr align=center>
+	    <td>Layer-1</td>
+	    <td>File Family</td>
+	    <td>Split Meta/Raw</td>
+	  </tr>
+	  <tr align=center>
+	    <td>Layer-0</td>
+	    <td>Section-2 I/O</td>
+	    <td>Standard I/O</td>
+	    <td>Malloc/Free</td>
+	  </tr>
+	</table>
+      </center>
+
+    <h3>Single Address Space</h3>
+
+    <p>The simplest form of hdf5 file is a single file containing only
+      hdf5 data. The file begins with the boot block, which is
+      followed until the end of the file by hdf5 data.  The next most
+      complicated file allows non-hdf5 data (user defined data or
+      internal wrappers) to appear before the boot block and after the
+      end of the hdf5 data.  The hdf5 data is treated as a single
+      linear address space in both cases.
+
+    <p>The next level of complexity comes when non-hdf5 data is
+      interspersed with the hdf5 data.  We handle that by including
+      the non-hdf5 interspersed data in the hdf5 address space and
+      simply not referencing it (eventually we might add those
+      addresses to a "do-not-disturb" list using the same mechanism as
+      the hdf5 free list, but it's not absolutely necessary).  This is
+      implemented except for the "do-not-disturb" list.
+
+    <p>The most complicated single address space hdf5 file is when we
+      allow the address space to be split among multiple physical
+      files. For instance, a >2GB file can be split into smaller
+      chunks and transferred to a 32 bit machine, then accessed as a
+      single logical hdf5 file.  The library already supports >32 bit
+      addresses, so at layer 1 we split a 64-bit address into a 32-bit
+      file number and a 32-bit offset (the 64 and 32 are
+      arbitrary). The rest of the library still operates with a linear
+      address space.
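+
+    <p>As a sketch of the layer 1 split, assuming the 32/32 division
+      mentioned above (which, as noted, is arbitrary) and hypothetical
+      names throughout:
+

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t member;   /* which physical file */
    uint32_t offset;   /* byte offset within that file */
} sketch_member_addr_t;

/* Split a linear 64-bit address into file number and offset. */
sketch_member_addr_t sketch_split_addr(uint64_t addr)
{
    sketch_member_addr_t out;

    out.member = (uint32_t)(addr >> 32);
    out.offset = (uint32_t)(addr & 0xFFFFFFFFu);
    return out;
}

/* Inverse of the split: rebuild the linear address. */
uint64_t sketch_join_addr(sketch_member_addr_t a)
{
    return ((uint64_t)a.member << 32) | a.offset;
}
```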
+
+    <p>Another variation might be a family of two files where all the
+      meta data is stored in one file and all the raw data is stored
+      in another file to allow the HDF5 wrapper to be easily replaced
+      with some other wrapper.
+
+    <p>The <code>H5Fcreate</code> and <code>H5Fopen</code> functions
+      would need to be modified to pass file-type info down to layer 2
+      so the correct drivers can be called and parameters passed to
+      the drivers to initialize them.
+      
+    <h4>Implementation</h4>
+
+    <p>I've implemented fixed-size family members.  The entire hdf5
+      file is partitioned into members where each member is the same
+      size.  The family scheme is used if one passes a name to
+      <code>H5F_open</code> (which is called by <code>H5Fopen()</code>
+      and <code>H5Fcreate</code>) that contains a
+      <code>printf(3c)</code>-style integer format specifier.
+      Currently, the default low-level file driver is used for all
+      family members (H5F_LOW_DFLT, usually set to be Section 2 I/O or
+      Section 3 stdio), but we'll probably eventually want to pass
+      that as a parameter of the file access property list, which
+      hasn't been implemented yet.  When creating a family, a default
+      family member size is used (defined at the top of H5Ffamily.c,
+      currently 64MB) but that also should be settable in the file
+      access property list. When opening an existing family, the size
+      of the first member is used to determine the member size
+      (flushing/closing a family ensures that the first member is the
+      correct size) but the other family members don't have to be that
+      large (the local address space, however, is logically the same
+      size for all members).
+
+    <p>I haven't implemented a split meta/raw family yet but am rather
+      curious to see how it would perform. I was planning to use the
+      `.h5' extension for the meta data file and `.raw' for the raw
+      data file.  The high-order bit in the address would determine
+      whether the address refers to meta data or raw data. If the user
+      passes a name that ends with `.raw' to <code>H5F_open</code>
+      then we'll choose the split family and use the default low level
+      driver for each of the two family members.  Eventually we'll
+      want to pass these kinds of things through the file access
+      property list instead of relying on naming convention.
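+
+    <p>A sketch of the planned address and name tests follows.  It
+      assumes a 64-bit address whose top bit is the flag; that bit
+      position and all of the names are assumptions, since the split
+      family is not implemented.
+

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RAW_BIT ((uint64_t)1 << 63)   /* assumed position of the flag */

/* Nonzero if the address refers to the raw data member. */
int sketch_addr_is_raw(uint64_t addr)
{
    return (addr & SKETCH_RAW_BIT) != 0;
}

/* Offset within whichever member the address selects. */
uint64_t sketch_addr_offset(uint64_t addr)
{
    return addr & ~SKETCH_RAW_BIT;
}

/* Nonzero if name ends with ".raw", which would select the split family. */
int sketch_name_is_split(const char *name)
{
    size_t n;

    assert(name);
    n = strlen(name);
    return n >= 4 && strcmp(name + n - 4, ".raw") == 0;
}
```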
+
+    <h3>External Raw Data</h3>
+
+    <p>We also need the ability to point to raw data that isn't in the
+      HDF5 linear address space.  For instance, a dataset might be
+      striped across several raw data files.
+
+    <p>Fortunately, the only two packages that need to be aware of
+      this are the packages for reading/writing contiguous raw data
+      and discontiguous raw data.  Since contiguous raw data is a
+      special case, I'll discuss how to implement external raw data in
+      the discontiguous case.
+
+    <p>Discontiguous data is stored as a B-tree whose keys are the
+      chunk indices and whose leaf nodes point to the raw data by
+      storing a file address. So what we need is some way to name the
+      external files, and a way to efficiently store the external file
+      name for each chunk.
+
+    <p>I propose adding to the object header an <em>External File
+	List</em> message that is a 1-origin array of file names.
+      Then, in the B-tree, each key has an index into the External
+      File List (or zero for the HDF5 file) for the file where the
+      chunk can be found. The external file index is only used at
+      the leaf nodes to get to the raw data (the entire B-tree is in
+      the HDF5 file) but because of the way keys are copied among
+      the B-tree nodes, it's much easier to store the index with
+      every key.
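+
+    <p>The proposed indexing could be sketched like this.  The
+      structure and function names are hypothetical; only the 1-origin
+      convention, with zero reserved for the HDF5 file, comes from the
+      proposal above.
+

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char **names;   /* names[0] is entry 1 of the list */
    size_t       count;   /* number of external files */
} sketch_efl_t;

/* Index 0 means "the HDF5 file itself"; indices 1..count select an
 * external file.  Returns NULL if the index is out of range. */
const char *sketch_efl_resolve(const sketch_efl_t *efl, size_t index,
                               const char *hdf5_name)
{
    assert(efl && hdf5_name);
    if (index == 0) return hdf5_name;
    if (index > efl->count) return NULL;
    return efl->names[index - 1];
}
```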
+
+    <h3>Multiple HDF5 Files</h3>
+
+    <p>One might also want to combine two or more HDF5 files in a
+      manner similar to mounting file systems in Unix.  That is, the
+      group structure and meta data from one file appear as though
+      they exist in the first file.  One opens File-A, and then
+      <em>mounts</em> File-B at some point in File-A, the <em>mount
+      point</em>, so that traversing into the mount point actually
+      causes one to enter the root object of File-B.  File-A and
+      File-B are each complete HDF5 files and can be accessed
+      individually without mounting them.
+
+    <p>We need a couple additional pieces of machinery to make this
+      work.  First, an haddr_t type (a file address) doesn't contain
+      any info about which HDF5 file's address space the address
+      belongs to.  But since haddr_t is an opaque type except at
+      layers 2 and below, it should be quite easy to add a pointer to
+      the HDF5 file.  This would also remove the H5F_t argument from
+      most of the low-level functions since it would be part of the
+      OID.
+
+    <p>The other thing we need is a table of mount points and some
+      functions that understand them.  We would add the following
+      table to each H5F_t struct:
+
+    <p><code><pre>
+struct H5F_mount_t {
+   H5F_t *parent;         /* Parent HDF5 file if any */
+   struct {
+      H5F_t *f;           /* File which is mounted */
+      haddr_t where;      /* Address of mount point */
+   } *mount;              /* Array sorted by mount point */
+   intn nmounts;          /* Number of mounted files */
+   intn alloc;            /* Size of mount table */
+};
+    </pre></code>
+
+    <p>The <code>H5Fmount</code> function takes the ID of an open
+      file or group, the name of a to-be-mounted file, the name of the mount
+      point, and a file access property list (like <code>H5Fopen</code>).
+      It opens the new file and adds a record to the parent's mount
+      table.  The <code>H5Funmount</code> function takes the parent
+      file or group ID and the name of the mount point and disassociates
+      the mounted file from the mount point.  It does not close the 
+      mounted file.  The <code>H5Fclose</code>
+      function closes/unmounts files recursively.
+
+    <p>The <code>H5G_iname</code> function which translates a name to
+      a file address (<code>haddr_t</code>) looks at the mount table
+      at each step in the translation and switches files where
+      appropriate.  All name-to-address translations occur through
+      this function.
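+
+    <p>One way the mount table lookup could work is sketched below.
+      It assumes the table is kept sorted by mount point address, as
+      the comment in the struct says, and uses the standard
+      <code>bsearch</code>.  An integer stands in for the
+      <code>H5F_t</code> pointer, <code>uint64_t</code> stands in for
+      <code>haddr_t</code>, and the names are hypothetical.
+

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t where;     /* address of the mount point (stand-in for haddr_t) */
    int      file_id;   /* stand-in for the mounted H5F_t pointer */
} sketch_mount_t;

/* bsearch callback: compare a target address with a table entry. */
static int sketch_mount_cmp(const void *key, const void *elem)
{
    uint64_t a = *(const uint64_t *)key;
    uint64_t b = ((const sketch_mount_t *)elem)->where;

    return (a > b) - (a < b);
}

/* table must be sorted by where.  Returns the mounted file's id if addr
 * is a mount point, otherwise current_file (no switch takes place). */
int sketch_cross_mount(const sketch_mount_t *table, size_t nmounts,
                       uint64_t addr, int current_file)
{
    const sketch_mount_t *hit;

    if (nmounts == 0) return current_file;
    assert(table);
    hit = (const sketch_mount_t *)bsearch(&addr, table, nmounts,
                                          sizeof *table, sketch_mount_cmp);
    return hit ? hit->file_id : current_file;
}
```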
+
+    <h3>How Long?</h3>
+
+    <p>I'm expecting to be able to implement the two new flavors of
+      single linear address space in about two days. It took two hours
+      to implement the malloc/free file driver at level zero and I
+      don't expect this to be much more work.
+
+    <p>I'm expecting three days to implement the external raw data for
+      discontiguous arrays.  Adding the file index to the B-tree is
+      quite trivial; adding the external file list message shouldn't
+      be too hard since the object header message class from which this
+      message derives is fully implemented; and changing
+      <code>H5F_istore_read</code> should be trivial.  Most of the
+      time will be spent designing a way to cache Unix file
+      descriptors efficiently since the total number of open files
+      allowed per process could be much smaller than the total number
+      of HDF5 files and external raw data files.
+
+    <p>I'm expecting four days to implement being able to mount one
+      HDF5 file on another.  I was originally planning a lot more, but
+      making <code>haddr_t</code> opaque turned out to be much easier
+      than I planned (I did it last Fri).  Most of the work will
+      probably be removing the redundant H5F_t arguments for lots of
+      functions.
+
+    <h3>Conclusion</h3>
+
+    <p>The external raw data could be implemented as a single linear
+      address space, but doing so would require one to allocate large
+      enough file addresses throughout the file (>32bits) before the
+      file was created.  It would make mixing an HDF5 file family with
+      external raw data, or an external HDF5 wrapper around an HDF4 file
+      a more difficult process. So I consider the implementation of
+      external raw data files as a single HDF5 linear address space a
+      kludge.
+
+    <p>The ability to mount one HDF5 file on another might not be a
+      very important feature especially since each HDF5 file must be a
+      complete file by itself.  It's not possible to stripe an array
+      over multiple HDF5 files because the B-tree wouldn't be complete
+      in any one file, so the only choice is to stripe the array
+      across multiple raw data files and store the B-tree in the HDF5
+      file.  On the other hand, it might be useful if one file
+      contains some public data which can be mounted by other files
+      (e.g., a mesh topology shared among collaborators and mounted by
+      files that contain other fields defined on the mesh).  Of course
+      the applications can open the two files separately, but it might
+      be more portable if we support it in the library.
+
+    <p>So we're looking at about two weeks to implement all three
+      versions.  I didn't get a chance to do any of them in AIO
+      although we had long-term plans for the first two with a
+      possibility of the third. They'll be much easier to implement in
+      HDF5 than AIO since I've been keeping these in mind from the
+      start.
+
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Sat Nov  8 18:08:52 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Tue Sep  8 14:43:32 EDT 1998
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/H4-H5Compat.html
+++ b/doc/html/TechNotes/H4-H5Compat.html
@@ -0,0 +1,271 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Backward/Forward Compatibility</title>
+  </head>
+
+  <body>
+    <h1>Backward/Forward Compatibility</h1>
+
+    <p>The HDF5 development must proceed in such a manner as to
+      satisfy the following conditions:
+
+    <ol type=A>
+      <li>HDF5 applications can produce data that HDF5
+	applications can read and write and HDF4 applications can produce
+	data that HDF4 applications can read and write. The situation
+	that demands this condition is obvious.</li>
+
+      <li>HDF5 applications are able to produce data that HDF4 applications
+	can read and HDF4 applications can subsequently modify the
+	file subject to certain constraints depending on the
+	implementation. This condition is for the temporary
+	situation where a consumer has neither been relinked with a new
+	HDF4 API built on top of the HDF5 API nor recompiled with the
+	HDF5 API.</li>
+
+      <li>HDF5 applications can read existing HDF4 files and subsequently
+	modify the file subject to certain constraints depending on
+	the implementation. This condition is for the temporary
+	situation in which the producer has neither been relinked with a
+	new HDF4 API built on top of the HDF5 API nor recompiled with
+	the HDF5 API, or the permanent situation of HDF5 consumers
+	reading archived HDF4 files.</li>
+    </ol>
+
+    <p>There's at least one invariant: new object features introduced
+      in the HDF5 file format (like 2-d arrays of structs) might be
+      impossible to "translate" to a format that an old HDF4
+      application can understand either because the HDF4 file format
+      or the HDF4 API has no mechanism to describe the object.
+
+    <p>What follows is one possible implementation based on how
+      Condition B was solved in the AIO/PDB world.  It also attempts
+      to satisfy these goals:
+
+    <ol type=1>
+      <li>The main HDF5 library contains as little extra baggage as
+	possible by either relying on external programs to take care
+	of compatibility issues or by incorporating the logic of such
+	programs as optional modules in the HDF5 library.  Conditions B
+	and C are separate programs/modules.</li>
+
+      <li>No extra baggage not only means the library proper is small,
+	but also means it can be implemented (rather than migrated
+	from HDF4 source) from the ground up with minimal regard for
+	HDF4, thus keeping the logic straightforward.</li>
+
+      <li>Compatibility issues are handled behind the scenes when
+	necessary (and possible) but can be carried out explicitly
+	during things like data migration.</li>
+    </ol>
+
+    <hr>
+    <h2>Wrappers</h2>
+
+    <p>The proposed implementation uses <i>wrappers</i> to handle
+      compatibility issues.  A Format-X file is <i>wrapped</i> in a
+      Format-Y file by creating a Format-Y skeleton that replicates
+      the Format-X meta data.  The Format-Y skeleton points to the raw
+      data stored in Format-X without moving the raw data.  The
+      restriction is that the raw data storage methods in Format-Y are
+      a superset of raw data storage methods in Format-X (otherwise the
+      raw data must be copied to Format-Y).  We're assuming that meta
+      data is small with respect to the entire file.
+
+    <p>The wrapper can be a separate file that has pointers into the
+      first file or it can be contained within the first file.  If
+      contained in a single file, the file can appear as a Format-Y
+      file or simultaneously a Format-Y and Format-X file.
+
+    <p>The Format-X meta-data can be thought of as the original
+      wrapper around raw data and Format-Y is a second wrapper around
+      the same data.  The wrappers are independent of one another;
+      modifying the meta-data in one wrapper causes the other to
+      become out of date.  Modification of raw data doesn't invalidate
+      either view as long as the meta data that describes its storage
+      isn't modified. For instance, an array element can change values
+      if storage is already allocated for the element, but if storage
+      isn't allocated then the meta data describing the storage must
+      change, invalidating all wrappers but one.
+
+    <p>It's perfectly legal to modify the meta data of one wrapper
+      without modifying the meta data in the other wrapper(s).  The
+      illegal part is accessing the raw data through a wrapper which
+      is out of date.
+
+    <p>If raw data is wrapped by more than one internal wrapper
+      (<i>internal</i> means that the wrapper is in the same file as
+      the raw data) then access to that file must assume that
+      unreferenced parts of that file contain meta data for another
+      wrapper and cannot be reclaimed as free memory.
+
+    <hr>
+    <h2>Implementation of Condition B</h2>
+
+    <p>Since this is a temporary situation which can't be
+      automatically detected by the HDF5 library, we must rely
+      on the application to notify the HDF5 library whether or not it
+      must satisfy Condition B. (Even if we don't rely on the
+      application, at some point someone is going to remove the
+      Condition B constraint from the library.)  So the module that
+      handles Condition B is conditionally compiled and then enabled
+      on a per-file basis.
+
+    <p>If the application desires to produce an HDF4 file (determined
+      by arguments to <code>H5Fopen</code>), and the Condition B
+      module is compiled into the library, then <code>H5Fclose</code>
+      calls the module to traverse the HDF5 wrapper and generate an
+      additional internal or external HDF4 wrapper (wrapper specifics
+      are described below).  If Condition B is implemented as a module
+      then it can benefit from the metadata already cached by the main
+      library.
+
+    <p>An internal HDF4 wrapper would be used if the HDF5 file is
+      writable and the user doesn't mind that the HDF5 file is
+      modified.  An external wrapper would be used if the file isn't
+      writable or if the user wants the data file to be primarily HDF5
+      but a few applications need an HDF4 view of the data.
+
+    <p>Modifying through the HDF5 library an HDF5 file that has an
+      internal HDF4 wrapper should invalidate the HDF4 wrapper (and
+      optionally regenerate it when <code>H5Fclose</code> is
+      called). The HDF5 library must understand how wrappers work, but
+      not necessarily anything about the HDF4 file format.
+
+    <p>Modifying through the HDF5 library an HDF5 file that has an
+      external HDF4 wrapper will cause the HDF4 wrapper to become out
+      of date (but possibly regenerated during <code>H5Fclose</code>).
+      <b>Note:  Perhaps the next release of the HDF4 library should
+      ensure that the HDF4 wrapper file has a more recent modification
+      time than the raw data file (the HDF5 file) to which it
+      points(?)</b>
+
+    <p>Modifying through the HDF4 library an HDF5 file that has an
+      internal or external HDF4 wrapper will cause the HDF5 wrapper to
+      become out of date. However, there is no way for the old HDF4
+      library to notify the HDF5 wrapper that it's out of date.
+      Therefore the HDF5 library must be able to detect when the HDF5
+      wrapper is out of date and be able to fix it. If the HDF4
+      wrapper is complete then the easy way is to ignore the original
+      HDF5 wrapper and generate a new one from the HDF4 wrapper. The
+      other approach is to compare the HDF4 and HDF5 wrappers and
+      assume that if they differ HDF4 is the right one, if HDF4 omits
+      data then it was because HDF4 is a partial wrapper (rather than
+      assume HDF4 deleted the data), and if HDF4 has new data then
+      copy the new meta data to the HDF5 wrapper. On the other hand,
+      perhaps we don't need to allow these situations (modifying an
+      HDF5 file with the old HDF4 library and then accessing it with
+      the HDF5 library is either disallowed or causes HDF5 objects
+      that can't be described by HDF4 to be lost).
+
+    <p>To convert an HDF5 file to an HDF4 file on demand, one simply
+      opens the file with the HDF4 flag and closes it. This is also
+      how AIO implemented backward compatibility with PDB in its file
+      format.
+
+    <hr>
+    <h2>Implementation of Condition C</h2>
+
+    <p>This condition must be satisfied for all time because there
+      will always be archived HDF4 files. If a pure HDF4 file (that
+      is, one without HDF5 meta data) is opened with an HDF5 library,
+      then <code>H5Fopen</code> builds an internal or external HDF5
+      wrapper and then accesses the raw data through that wrapper. If
+      the HDF5 library modifies the file then the HDF4 wrapper becomes
+      out of date.  However, since the HDF5 library hasn't been
+      released, we can at least implement it to disable and/or reclaim
+      the HDF4 wrapper.
+
+    <p>If an external and temporary HDF5 wrapper is desired, the
+      wrapper is created through the cache like all other HDF5 files.
+      The data appears on disk only if a particular cached datum is
+      preempted. Instead of calling <code>H5Fclose</code> on the HDF5
+      wrapper file we call <code>H5Fabort</code> which immediately
+      releases all file resources without updating the file, and then
+      we unlink the file from Unix.
+
+    <hr>
+    <h2>What do wrappers look like?</h2>
+
+    <p>External wrappers are quite obvious: they contain only things
+      from the format specs for the wrapper and nothing from the
+      format specs of the format which they wrap.
+
+    <p>An internal HDF4 wrapper is added to an HDF5 file in such a way
+      that the file appears to be both an HDF4 file and an HDF5
+      file. HDF4 requires an HDF4 file header at file offset zero. If
+      a user block is present then we just move the user block down a
+      bit (and truncate it) and insert the minimum HDF4 signature.
+      The HDF4 <code>dd</code> list and any other data it needs are
+      appended to the end of the file and the HDF5 signature uses the
+      logical file length field to determine the beginning of the
+      trailing part of the wrapper.
+
+    <p>
+      <center>
+	<table border width="60%">
+	  <tr>
+	    <td>HDF4 minimal file header. Its main job is to point to
+	      the <code>dd</code> list at the end of the file.</td>
+	  </tr>
+	  <tr>
+	    <td>User-defined block which is truncated by the size of the
+	      HDF4 file header so that the HDF5 boot block file address
+	      doesn't change.</td>
+	  </tr>
+	  <tr>
+	    <td>The HDF5 boot block and data, unmodified by adding the
+	      HDF4 wrapper.</td>
+	  </tr>
+	  <tr>
+	    <td>The main part of the HDF4 wrapper.  The <code>dd</code>
+	      list will have entries for all parts of the file so
+	      hdpack(?) doesn't (re)move anything.</td>
+	  </tr>
+	</table>
+      </center>
+    
+    <p>When such a file is opened by the HDF5 library for
+      modification it shifts the user block back down to address zero
+      and fills with zeros, then truncates the file at the end of the
+      HDF5 data or adds the trailing HDF4 wrapper to the free
+      list. This prevents HDF4 applications from reading the file with
+      an out of date wrapper.
+
+    <p>If there is no user block then we have a problem.  The HDF5
+      boot block must be moved to make room for the HDF4 file header.
+      But moving just the boot block causes problems because all file
+      addresses stored in the file are relative to the boot block
+      address.  The only option is to shift the entire file contents
+      by 512 bytes to open up a user block (too bad we don't have
+      hooks into the Unix i-node stuff so we could shift the entire
+      file contents by the size of a file system page without ever
+      performing I/O on the file :-)
+
+    <p>Is it possible to place an HDF5 wrapper in an HDF4 file?  I
+      don't know enough about the HDF4 format, but I would suspect it
+      might be possible to open a hole at file address 512 (and
+      possibly before) by moving some things to the end of the file
+      to make room for the HDF5 signature.  The remainder of the HDF5
+      wrapper goes at the end of the file and entries are added to the
+      HDF4 <code>dd</code> list to mark the location(s) of the HDF5
+      wrapper.
+
+    <hr>
+    <h2>Other Thoughts</h2>
+
+    <p>Conversion programs that copy an entire HDF4 file to a separate,
+      self-contained HDF5 file and vice versa might be useful.
+
+
+
+
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Fri Oct  3 11:52:31 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Wed Oct  8 12:34:42 EST 1997
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/HeapMgmt.html
+++ b/doc/html/TechNotes/HeapMgmt.html
@@ -0,0 +1,79 @@
+<html>
+<body>
+
+<h1>Heap Management in HDF5</h1>
+
+<pre>
+
+Heap functions are in the H5H package.
+
+
+off_t
+H5H_new (hdf5_file_t *f, size_t size_hint, size_t realloc_hint);
+
+	Creates a new heap in the specified file which can efficiently
+	store at least SIZE_HINT bytes.  The heap can store more than
+	that, but doing so may cause the heap to become less efficient
+	(for instance, a heap implemented as a B-tree might become
+	discontiguous).  The REALLOC_HINT is the minimum number of bytes
+	by which the heap will grow when it must be resized. The hints
+	may be zero in which case reasonable (but probably not
+	optimal) values will be chosen.
+
+	The return value is the address of the new heap relative to
+	the beginning of the file boot block.
+
+off_t
+H5H_insert (hdf5_file_t *f, off_t addr, size_t size, const void *buf);
+
+	Copies SIZE bytes of data from BUF into the heap whose address
+	is ADDR in file F.  BUF must be the _entire_ heap object.  The
+	return value is the byte offset of the new data in the heap.
+
+void *
+H5H_read (hdf5_file_t *f, off_t addr, off_t offset, size_t size, void *buf);
+
+	Copies SIZE bytes of data from the heap whose address is ADDR
+	in file F into BUF and then returns the address of BUF.  If
+	BUF is the null pointer then a new buffer will be malloc'd by
+	this function and its address is returned.
+
+	Returns buffer address or null.
+
+const void *
+H5H_peek (hdf5_file_t *f, off_t addr, off_t offset);
+
+	A more efficient version of H5H_read that returns a pointer
+	directly into the cache; the data is not copied from the cache
+	to a buffer.  The pointer is valid until the next call to an
+	H5AC function directly or indirectly.
+
+	Returns a pointer or null.  Do not free the pointer.
+
+void *
+H5H_write (hdf5_file_t *f, off_t addr, off_t offset, size_t size,
+           const void *buf);
+
+	Modifies (part of) an object in the heap at address ADDR of
+	file F by copying SIZE bytes from the beginning of BUF to the
+	file.  OFFSET is the address within the heap where the output
+	is to occur.
+
+	This function can fail if the combination of OFFSET and SIZE
+	would write over a boundary between two heap objects.
+
+herr_t
+H5H_remove (hdf5_file_t *f, off_t addr, off_t offset, size_t size);
+
+	Removes an object or part of an object which begins at byte
+	OFFSET within a heap whose address is ADDR in file F.  SIZE
+	bytes are returned to the free list.  Removing the middle of
+	an object has the side effect that one object is now split
+	into two objects.
+
+	Returns success or failure.
+
+</pre>
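+
+<p>The contract described above (insert returns a byte offset, read
+copies into a caller-supplied or newly malloc'd buffer, peek returns a
+pointer that must not be freed) can be illustrated with a minimal
+in-memory sketch.  This is not the H5H implementation: the
+<code>toy_heap</code> type, the <code>toy_heap_*</code> functions, and
+the default size of 64 bytes are all hypothetical names and values
+chosen for illustration, and no file or cache is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one heap; not the H5H package. */
typedef struct toy_heap {
    unsigned char *data;   /* heap storage */
    size_t used;           /* bytes occupied by objects */
    size_t size;           /* bytes currently allocated */
    size_t realloc_hint;   /* minimum number of bytes to grow by */
} toy_heap;

/* Like H5H_new: a zero hint means "pick a reasonable default". */
int toy_heap_new(toy_heap *h, size_t size_hint, size_t realloc_hint)
{
    assert(h != NULL);
    h->size = size_hint ? size_hint : 64;
    h->realloc_hint = realloc_hint ? realloc_hint : 64;
    h->used = 0;
    h->data = malloc(h->size);
    return h->data ? 0 : -1;
}

/* Like H5H_insert: copy an entire object in and return its byte offset
 * within the heap, or (size_t)-1 on failure. */
size_t toy_heap_insert(toy_heap *h, size_t size, const void *buf)
{
    assert(h != NULL && buf != NULL);
    if (size > h->size - h->used) {
        size_t need = size - (h->size - h->used);
        size_t grow = need > h->realloc_hint ? need : h->realloc_hint;
        unsigned char *p = realloc(h->data, h->size + grow);
        if (p == NULL)
            return (size_t)-1;
        h->data = p;
        h->size += grow;
    }
    size_t offset = h->used;
    memcpy(h->data + offset, buf, size);
    h->used += size;
    return offset;
}

/* Like H5H_peek: return a pointer directly into the storage.  The
 * pointer is invalidated by the next insert and must not be freed. */
const void *toy_heap_peek(const toy_heap *h, size_t offset)
{
    assert(h != NULL);
    return offset < h->used ? h->data + offset : NULL;
}

/* Like H5H_read: copy SIZE bytes into BUF; if BUF is null a new buffer
 * is malloc'd.  Returns the buffer address or null. */
void *toy_heap_read(const toy_heap *h, size_t offset, size_t size, void *buf)
{
    assert(h != NULL);
    if (offset > h->used || size > h->used - offset)
        return NULL;
    if (buf == NULL) {
        buf = malloc(size ? size : 1);
        if (buf == NULL)
            return NULL;
    }
    memcpy(buf, h->data + offset, size);
    return buf;
}

void toy_heap_free(toy_heap *h)
{
    assert(h != NULL);
    free(h->data);
    h->data = NULL;
    h->used = h->size = 0;
}
```

+<p>The sketch keeps the distinction the interface draws between
+copying (read) and borrowing (peek): a borrowed pointer is cheap but
+only valid until the storage is next touched.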
+
+</body>
+</html>
--- a/doc/html/TechNotes/IOPipe.html
+++ b/doc/html/TechNotes/IOPipe.html
@@ -0,0 +1,114 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>The Raw Data I/O Pipeline</title>
+  </head>
+
+  <body>
+    <h1>The Raw Data I/O Pipeline</h1>
+
+    <p>The HDF5 raw data pipeline is a complicated beast that handles
+      all aspects of raw data storage and transfer of that data
+      between the file and the application.  Data can be stored
+      contiguously (internal or external), in variable size external
+      segments, or regularly chunked; it can be sparse, extendible,
+      and/or compressible. Data transfers must be able to convert from
+      one data space to another, convert from one number type to
+      another, and perform partial I/O operations. Furthermore,
+      applications will expect their common usage of the pipeline to
+      perform well.
+
+    <p>To accomplish these goals, the pipeline has been designed in a
+      modular way so no single subroutine is overly complicated and so
+      functionality can be inserted easily at the appropriate
+      locations in the pipeline.  A general pipeline was developed and
+      then certain paths through the pipeline were optimized for
+      performance.
+
+    <p>We describe only the file-to-memory side of the pipeline since
+      the memory-to-file side is a mirror image. We also assume that a
+      proper hyperslab of a simple data space is being read from the
+      file into a proper hyperslab of a simple data space in memory,
+      and that the data type is a compound type which may require
+      various number conversions on its members.
+
+      <img alt="Figure 1" src="pipe1.gif">
+
+    <p>The diagrams should be read from the top down. Line A
+      in the figure above shows that <code>H5Dread()</code> copies
+      data from a hyperslab of a file dataset to a hyperslab of an
+      application buffer by calling <code>H5D_read()</code>. And
+      <code>H5D_read()</code> calls, in a loop,
+      <code>H5S_simp_fgath()</code>, <code>H5T_conv_struct()</code>,
+      and <code>H5S_simp_mscat()</code>. A temporary buffer, TCONV, is
+      loaded with data points from the file, then data type conversion
+      is performed on the temporary buffer, and finally data points
+      are scattered out to application memory. Thus, data type
+      conversion is an in-place operation and data space conversion
+      consists of two steps. An additional temporary buffer, BKG, is
+      large enough to hold <em>N</em> instances of the destination
+      data type where <em>N</em> is the same number of data points
+      that can be held by the TCONV buffer (which is large enough to
+      hold either source or destination data points).
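+
+    <p>The gather, convert, scatter loop described above can be
+      sketched in a few lines of C.  This is a simplified
+      illustration, not the library's code: the functions
+      <code>gather()</code>, <code>convert()</code>,
+      <code>scatter()</code> and <code>read_strided()</code> are
+      hypothetical, the "file" is a plain array, the hyperslabs are
+      one-dimensional strides, the conversion is fixed to
+      <code>int32_t</code> to <code>double</code>, and the TCONV
+      buffer is deliberately tiny so that the loop runs more than
+      once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TCONV_BYTES = 64 };  /* tiny on purpose; forces several passes */

/* Gather N strided int32_t points from the "file" into packed TCONV. */
void gather(const int32_t *file, size_t start, size_t stride,
            size_t n, void *tconv)
{
    for (size_t i = 0; i < n; i++)
        memcpy((char *)tconv + i * sizeof(int32_t),
               &file[start + i * stride], sizeof(int32_t));
}

/* Convert int32_t to double in place.  The destination type is wider,
 * so walk from the last point to the first; that way no source point
 * is overwritten before it has been read. */
void convert(void *tconv, size_t n)
{
    for (size_t i = n; i-- > 0;) {
        int32_t s;
        double d;
        memcpy(&s, (char *)tconv + i * sizeof s, sizeof s);
        d = (double)s;
        memcpy((char *)tconv + i * sizeof d, &d, sizeof d);
    }
}

/* Scatter N packed doubles from TCONV out to strided memory. */
void scatter(const void *tconv, size_t n,
             double *mem, size_t start, size_t stride)
{
    for (size_t i = 0; i < n; i++)
        memcpy(&mem[start + i * stride],
               (const char *)tconv + i * sizeof(double), sizeof(double));
}

/* The loop itself: TCONV holds N points of the larger of the two
 * types, so each pass moves at most N points. */
void read_strided(const int32_t *file, size_t fstart, size_t fstride,
                  double *mem, size_t mstart, size_t mstride, size_t total)
{
    double tconv[TCONV_BYTES / sizeof(double)];
    const size_t cap = sizeof tconv / sizeof tconv[0];

    for (size_t done = 0; done < total;) {
        size_t n = total - done < cap ? total - done : cap;
        gather(file, fstart + done * fstride, fstride, n, tconv);
        convert(tconv, n);
        scatter(tconv, n, mem, mstart + done * mstride, mstride);
        done += n;
    }
}
```

+    <p>Because conversion happens inside the one temporary buffer, the
+      buffer has to be sized for whichever of the source and
+      destination points is larger, which is the sizing rule stated
+      above.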
+
+    <p>The application sets an upper limit for the size of the TCONV
+      buffer and optionally supplies a buffer. If no buffer is
+      supplied then one will be created by calling
+      <code>malloc()</code> when the pipeline is executed (when
+      necessary) and freed when the pipeline exits.  The size of the
+      BKG buffer depends on the size of the TCONV buffer and if the
+      application supplies a BKG buffer it should be at least as large
+      as the TCONV buffer.  The default size for these buffers is one
+      megabyte but the buffer might not be used to full capacity if
+      the buffer size is not an integer multiple of the source or
+      destination data point size (whichever is larger, but only
+      destination for the BKG buffer).
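+
+    <p>As a rough worked example of the capacity rule, assume a one
+      megabyte (1048576 byte) TCONV buffer, an 8-byte source point and
+      a 12-byte destination point.  The helper names below are
+      hypothetical and only restate the arithmetic from the paragraph
+      above.

```c
#include <assert.h>
#include <stddef.h>

/* Number of data points that fit in a TCONV buffer: the buffer must
 * hold either source or destination points, so the larger size wins. */
size_t tconv_points(size_t buf_size, size_t src_size, size_t dst_size)
{
    size_t point = src_size > dst_size ? src_size : dst_size;
    assert(point > 0);
    return buf_size / point;
}

/* Bytes of the TCONV buffer left unused because the buffer size is not
 * an integer multiple of the point size. */
size_t tconv_slack(size_t buf_size, size_t src_size, size_t dst_size)
{
    size_t point = src_size > dst_size ? src_size : dst_size;
    return buf_size - tconv_points(buf_size, src_size, dst_size) * point;
}

/* Bytes the BKG buffer needs in order to hold the same number of
 * points, counted in destination-type instances only. */
size_t bkg_bytes(size_t npoints, size_t dst_size)
{
    return npoints * dst_size;
}
```

+    <p>With these numbers the buffer holds 87381 points and the last 4
+      bytes go unused.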
+
+
+
+    <p>Occasionally the destination data points will be partially
+      initialized and the <code>H5Dread()</code> operation should not
+      clobber those values.  For instance, the destination type might
+      be a struct with members <code>a</code> and <code>b</code> where
+      <code>a</code> is already initialized and we're reading
+      <code>b</code> from the file.  An extra line, G, is added to the
+      pipeline to provide the type conversion functions with the
+      existing data.
+
+      <img alt="Figure 2" src="pipe2.gif">
+
+    <p>It will most likely be quite common that no data type
+      conversion is necessary.  In such cases a temporary buffer for
+      data type conversion is not needed and data space conversion
+      can happen in a single step. In fact, when the source and
+      destination data are both contiguous (they aren't in the
+      picture) the loop degenerates to a single iteration.
+
+
+      <img alt="Figure 3" src="pipe3.gif">
+
+    <p>So far we've looked only at internal contiguous storage, but by
+      replacing Line B in Figures 1 and 2 and Line A in Figure 3 with
+      Figure 4 the pipeline is able to handle regularly chunked
+      objects. Line B of Figure 4 is executed once for each chunk
+      which contains data to be read and the chunk address is found by
+      looking at a multi-dimensional key in a chunk B-tree which has
+      one entry per chunk.
+
+      <img alt="Figure 4" src="pipe4.gif">
+
+    <p>If a single chunk is requested and the destination buffer is
+      the same size/shape as the chunk, then the CHUNK buffer is
+      bypassed and the destination buffer is used instead as shown in
+      Figure 5.
+
+      <img alt="Figure 5" src="pipe5.gif">
+
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Tue Mar 17 11:13:35 EST 1998 -->
+<!-- hhmts start -->
+Last modified: Wed Mar 18 10:38:30 EST 1998
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/LibMaint.html
+++ b/doc/html/TechNotes/LibMaint.html
@@ -0,0 +1,122 @@
+<html>
+<body>
+
+
+<h1>Information for HDF5 Maintainers</h1>
+
+<pre>
+
+* You can run make from any directory.  Running in a subdirectory
+  only builds things in that directory and below, but all makefiles
+  know when their target depends on something outside the local
+  directory tree:
+
+	$ cd test
+	$ make
+	make: *** No rule to make target ../src/libhdf5.a
+
+* All Makefiles understand the following targets:
+
+        all              -- build locally.
+        install          -- install libs, headers, progs.
+        uninstall        -- remove installed files.
+        mostlyclean      -- remove temp files (eg, *.o but not *.a).
+        clean            -- mostlyclean plus libs and progs.
+        distclean        -- all non-distributed files.
+        maintainer-clean -- all derived files but H5config.h.in and configure.
+
+* Most Makefiles also understand:
+
+	TAGS		-- build a tags table
+	dep, depend	-- recalculate source dependencies
+	lib		-- build just the libraries w/o programs
+
+* If you have personal preferences for which make, compiler, compiler
+  flags, preprocessor flags, etc., that you use and you don't want to
+  set environment variables, then use a site configuration file.
+
+  When configure starts, it looks in the config directory for files
+  whose name is some combination of the CPU name, vendor, and
+  operating system in this order:
+
+	CPU-VENDOR-OS
+	VENDOR-OS
+	CPU-VENDOR
+	OS
+	VENDOR
+	CPU
+
+  The first file which is found is sourced and can therefore affect
+  the behavior of the rest of configure. See config/BlankForm for the
+  template.
+
+* If you use GNU make along with gcc the Makefile will contain targets
+  that automatically maintain a list of source interdependencies; you
+  seldom have to say `make clean'.  I say `seldom' because if you
+  change how one `*.h' file includes other `*.h' files you'll have
+  to force an update.
+
+  To force an update of all dependency information remove the
+  `.depend' file from each directory and type `make'.  For
+  instance:
+
+	$ cd $HDF5_HOME
+	$ find . -name .depend -exec rm {} \;
+	$ make
+
+  If you're not using GNU make and gcc then dependencies come from
+  ".distdep" files in each directory.  Those files are generated on
+  GNU systems and inserted into the Makefiles by running
+  config.status (which happens near the end of configure).
+
+* If you use GNU make along with gcc then the Perl script `trace' is
+  run just before dependencies are calculated to update any H5TRACE()
+  calls that might appear in the file.  Otherwise, after changing the
+  type of a function (return type or argument types) one should run
+  `trace' manually on those source files (e.g., ../bin/trace *.c).
+
+* Object files stay in the directory and are added to the library as a
+  final step instead of placing the file in the library immediately
+  and removing it from the directory.  The reason is three-fold:
+
+	1.  Most versions of make don't allow `$(LIB)($(SRC:.c=.o))'
+	    which makes it necessary to have two lists of files, one
+	    that ends with `.c' and the other that has the library
+	    name wrapped around each `.o' file.
+
+	2.  Some versions of make/ar have problems with modification
+	    times of archive members.
+
+	3.  Adding object files immediately causes problems on SMP
+	    machines where make is doing more than one thing at a
+	    time.
+
+* When using GNU make on an SMP you can cause it to compile more than
+  one thing at a time.  At the top of the source tree invoke make as
+
+	$ make -j -l6
+
+  which causes make to fork as many children as possible as long as
+  the load average doesn't go above 6.  In subdirectories one can say
+
+	$ make -j2
+
+  which limits the number of children to two (this doesn't work at the
+  top level because the `-j2' is not passed to recursive makes).
+
+* To create a release tarball go to the top-level directory and run
+  ./bin/release.  You can optionally supply one or more of the words
+  `tar', `gzip', `bzip2' or `compress' on the command line.  The
+  result will be a (compressed) tar file(s) in the `releases'
+  directory.  The README file is updated to contain the release date
+  and version number.
+
+* To create a tarball of all the files which are part of HDF5 go to
+  the top-level directory and type:
+
+      tar cvf foo.tar `grep '^\.' MANIFEST |unexpand |cut -f1`
+
+</pre>
+
+</body>
+</html>
--- a/doc/html/TechNotes/MemoryMgmt.html
+++ b/doc/html/TechNotes/MemoryMgmt.html
@@ -0,0 +1,510 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Memory Management in HDF5</title>
+  </head>
+
+  <body>
+      <h1>Memory Management in HDF5</h1>
+
+      <!-- ---------------------------------------------------------------- -->
+      <h2>Is a Memory Manager Necessary?</h2>
+
+      <p>Some form of memory management may be necessary in HDF5 when
+	the various deletion operators are implemented so that the
+	file memory is not permanently orphaned.  However, since an
+	HDF5 file was designed with persistent data in mind, the
+	importance of a memory manager is questionable.
+
+      <p>On the other hand, when certain meta data containers (file glue)
+	grow, they may need to be relocated in order to keep the
+	container contiguous.
+
+	<blockquote>
+	  <b>Example:</b> An object header consists of up to two
+	  chunks of contiguous memory.  The first chunk is a fixed
+	  size at a fixed location when the header link count is
+	  greater than one.  Thus, inserting additional items into an
+	  object header may require the second chunk to expand.  When
+	  this occurs, the second chunk may need to move to another
+	  location in the file, freeing the file memory which that
+	  chunk originally occupied.
+      </blockquote>
+
+      <p>The relocation of meta data containers could potentially
+	orphan a significant amount of file memory if the application
+	has made poor estimates for preallocation sizes.
+
+      <!-- ---------------------------------------------------------------- -->
+      <h2>Levels of Memory Management</h2>
+
+      <p>Memory management by the library can be independent of memory
+	management support by the file format.  The file format can
+	support no memory management, some memory management, or full
+	memory management.  The same holds for the library.
+
+      <h3>Support in the Library</h3>
+
+      <dl>
+	<dt><b>No Support: I</b>
+	<dd>When memory is deallocated it simply becomes unreferenced
+	  (orphaned) in the file.  Memory allocation requests are
+	  satisfied by extending the file.
+	  
+	<dd>A separate off-line utility can be used to detect the
+	  unreferenced bytes of a file and "bubble" them up to the end
+	  of the file and then truncate the file.
+
+	<dt><b>Some Support: II</b>
+	<dd>The library could support partial memory management all
+	  the time, or full memory management some of the time.
+	  Orphaning free blocks instead of adding them to a free list
+	  should not affect the file integrity, nor should fulfilling
+	  new requests by extending the file instead of using the free
+	  list.
+
+	<dt><b>Full Support: III</b>
+	<dd>The library supports space-efficient memory management by
+	  always fulfilling allocation requests from the free list when
+	  possible, and by coalescing adjacent free blocks into a
+	  single larger free block.
+      </dl>
+
+      <h3>Support in the File Format</h3>
+      
+      <dl>
+	<dt><b>No Support: A</b>
+	<dd>The file format does not support memory management; any
+	  unreferenced block in the file is assumed to be free.  If
+	  the library supports full memory management then it will
+	  have to traverse the entire file to determine which blocks
+	  are unreferenced.
+
+	<dt><b>Some Support: B</b>
+	<dd>Assuming that unreferenced blocks are free can be
+	  dangerous in a situation where the file is not consistent.
+	  For instance, if a directory tree becomes detached from the
+	  main directory hierarchy, then the detached directory and
+	  everything that is referenced only through the detached
+	  directory become unreferenced.  File repair utilities will
+	  be unable to determine which unreferenced blocks need to be
+	  linked back into the file hierarchy.
+
+	<dd>Therefore, it might be useful to keep an unsorted,
+	  doubly-linked list of free blocks in the file.  The library
+	  can add and remove blocks from the list in constant time,
+	  and can generate its own internal free-block data structure
+	  in time proportional to the number of free blocks instead of
+	  the size of the file.  Additionally, a library can use a
+	  subset of the free blocks, an alternative which is not
+	  feasible if the file format doesn't support any form of
+	  memory management.
+
+	<dt><b>Full Support: C</b>
+	<dd>The file format can mirror library data structures for
+	  space-efficient memory management.  The free blocks are
+	  linked in unsorted, doubly-linked lists with one list per
+	  free block size.  The heads of the lists are pointed to by a
+	  B-tree whose nodes are sorted by free block size.  At the
+	  same time, all free blocks are the leaf nodes of another
+	  B-tree sorted by starting and ending address.  When the
+	  trees are used in combination we can deallocate and allocate
+	  memory in O(log <em>N</em>) time where <em>N</em> is the
+	  number of free blocks.
+      </dl>
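+
+      <p>The unsorted, doubly-linked free list of level B can be
+	sketched as follows.  The layout is an assumption made for
+	illustration and is not taken from the HDF5 file format: each
+	free block is imagined to begin with a hypothetical
+	<code>free_hdr</code> holding the file addresses of its
+	neighbors, the "file" is a byte array, and <code>NIL</code>
+	stands for "no block".  Adding and removing a block each touch
+	a constant number of headers, and walking the list costs time
+	proportional to the number of free blocks rather than the size
+	of the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NIL UINT64_MAX   /* hypothetical "no block" address */

/* Hypothetical header stored in the first bytes of every free block. */
typedef struct free_hdr {
    uint64_t prev;   /* address of previous free block, or NIL */
    uint64_t next;   /* address of next free block, or NIL */
    uint64_t size;   /* size of this free block in bytes */
} free_hdr;

/* A byte array standing in for the file, plus the list head that a
 * real format would keep somewhere in its own meta data. */
typedef struct toy_file {
    unsigned char bytes[4096];
    uint64_t free_head;
} toy_file;

static free_hdr load(const toy_file *f, uint64_t addr)
{
    free_hdr h;
    memcpy(&h, f->bytes + addr, sizeof h);
    return h;
}

static void store(toy_file *f, uint64_t addr, free_hdr h)
{
    memcpy(f->bytes + addr, &h, sizeof h);
}

/* Push a block onto the front of the list: constant time. */
void free_list_add(toy_file *f, uint64_t addr, uint64_t size)
{
    assert(size >= sizeof(free_hdr) && addr + size <= sizeof f->bytes);
    free_hdr h = { NIL, f->free_head, size };
    if (f->free_head != NIL) {
        free_hdr old = load(f, f->free_head);
        old.prev = addr;
        store(f, f->free_head, old);
    }
    store(f, addr, h);
    f->free_head = addr;
}

/* Unlink a block given only its address: constant time, which is what
 * the back links are for. */
void free_list_remove(toy_file *f, uint64_t addr)
{
    free_hdr h = load(f, addr);
    if (h.prev != NIL) {
        free_hdr p = load(f, h.prev);
        p.next = h.next;
        store(f, h.prev, p);
    } else {
        f->free_head = h.next;
    }
    if (h.next != NIL) {
        free_hdr n = load(f, h.next);
        n.prev = h.prev;
        store(f, h.next, n);
    }
}

/* Walk the list: time proportional to the number of free blocks. */
size_t free_list_count(const toy_file *f)
{
    size_t n = 0;
    for (uint64_t a = f->free_head; a != NIL; a = load(f, a).next)
        n++;
    return n;
}
```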
+
+      <h3>Combinations of Library and File Format Support</h3>
+
+      <p>We now evaluate each combination of library support with file
+	support:
+
+      <dl>
+	<dt><b>I-A</b>
+	<dd>If neither the library nor the file supports memory
+	  management, then each allocation request will come from the
+	  end of the file and each deallocation request is a no-op
+	  that simply leaves the free block unreferenced.
+
+	  <ul>
+	    <li>Advantages
+	      <ul>
+		<li>No file overhead for allocation or deallocation.
+		<li>No library overhead for allocation or
+		  deallocation.
+		<li>No file traversal required at time of open.
+		<li>No data needs to be written back to the file when
+		  it's closed.
+		<li>Trivial to implement (already implemented).
+	      </ul>
+
+	    <li>Disadvantages
+	      <ul>
+		<li>Inefficient use of file space.
+		<li>A file repair utility must reclaim lost file space.
+		<li>Difficulties for file repair utilities. (Is an
+		  unreferenced block a free block or orphaned data?)
+	      </ul>
+	  </ul>
+
+	<dt><b>II-A</b>
+	<dd>In order for the library to support memory management, it
+	  will be required to build the internal free block
+	  representation by traversing the entire file looking for
+	  unreferenced blocks.
+
+	  <ul>
+	    <li>Advantages
+	      <ul>
+		<li>No file overhead for allocation or deallocation.
+		<li>Variable amount of library overhead for allocation
+		  and deallocation depending on how much work the
+		  library wants to do.
+		<li>No data needs to be written back to the file when
+		  it's closed.
+		<li>Might use file space efficiently.
+	      </ul>
+	    <li>Disadvantages
+	      <ul>
+		<li>Might use file space inefficiently.
+		<li>File traversal required at time of open.
+		<li>A file repair utility must reclaim lost file space.
+		<li>Difficulties for file repair utilities.
+		<li>Sharing of the free list between processes falls
+		  outside the HDF5 file format documentation.
+	      </ul>
+	  </ul>
+
+	<dt><b>III-A</b>
+	<dd>In order for the library to support full memory
+	  management, it will be required to build the internal free
+	  block representation by traversing the entire file looking
+	  for unreferenced blocks.
+
+	  <ul>
+	    <li>Advantages
+	      <ul>
+		<li>No file overhead for allocation or deallocation.
+		<li>Efficient use of file space.
+		<li>No data needs to be written back to the file when
+		  it's closed.
+	      </ul>
+	    <li>Disadvantages
+	      <ul>
+		<li>Moderate amount of library overhead for allocation
+		  and deallocation.
+		<li>File traversal required at time of open.
+		<li>A file repair utility must reclaim lost file space.
+		<li>Difficulties for file repair utilities.
+		<li>Sharing of the free list between processes falls
+		  outside the HDF5 file format documentation.
+	      </ul>
+	  </ul>
+
+	<dt><b>I-B</b>
+	<dd>If the library doesn't support memory management but the
+	  file format supports some level of management, then a file
+	  repair utility will have to be run occasionally to reclaim
+	  unreferenced blocks.
+
+	  <ul>
+	    <li>Advantages
+	      <ul>
+		<li>No file overhead for allocation or deallocation.
+		<li>No library overhead for allocation or
+		  deallocation.
+		<li>No file traversal required at time of open.
+		<li>No data needs to be written back to the file when
+		  it's closed.
+	      </ul>
+	    <li>Disadvantages
+	      <ul>
+		<li>A file repair utility must reclaim lost file space.
+		<li>Difficulties for file repair utilities.
+	      </ul>
+	  </ul>
+
+	<dt><b>II-B</b>
+	<dd>Both the library and the file format support some level
+	  of memory management.
+
+	  <ul>
+	    <li>Advantages
+	      <ul>
+		<li>Constant file overhead per allocation or
+		  deallocation.
+		<li>Variable library overhead per allocation or
+		  deallocation depending on how much work the library
+		  wants to do.
+		<li>Traversal at file open time is on the order of the
+		  free list size instead of the file size.
+		<li>The library has the option of reading only part of
+		  the free list.
+		<li>No data needs to be written at file close time if
+		  it has been amortized into the cost of allocation
+		  and deallocation.
+		<li>File repair utilities don't have to be run to
+		  reclaim memory.
+		<li>File repair utilities can detect whether an
+		  unreferenced block is a free block or orphaned data.
+		<li>Sharing of the free list between processes might
+		  be easier.
+		<li>Possible efficient use of file space.
+	      </ul>
+	    <li>Disadvantages
+	      <ul>
+		<li>Possible inefficient use of file space.
+	      </ul>
+	  </ul>
+
+	<dt><b>III-B</b>
+	<dd>The library provides space-efficient memory management but
+	  the file format only supports an unsorted list of free
+	  blocks.
+
+	  <ul>
+	    <li>Advantages
+	      <ul>
+		<li>Constant time file overhead per allocation or
+		  deallocation.
+		<li>No data needs to be written at file close time if
+		  it has been amortized into the cost of allocation
+		  and deallocation.
+		<li>File repair utilities don't have to be run to
+		  reclaim memory.
+		<li>File repair utilities can detect whether an
+		  unreferenced block is a free block or orphaned data.
+		<li>Sharing of the free list between processes might
+		  be easier.
+		<li>Efficient use of file space.
+	      </ul>
+	    <li>Disadvantages
+	      <ul>
+		<li>O(log <em>N</em>) library overhead per allocation or
+		  deallocation where <em>N</em> is the total number of
+		  free blocks.
+		<li>O(<em>N</em>) time to open a file since the entire
+		  free list must be read to construct the in-core
+		  trees used by the library.
+		<li>Library is more complicated.
+	      </ul>
+	  </ul>
+
+	<dt><b>I-C</b>
+	<dd>This has the same advantages and disadvantages as I-A with
+	  the added disadvantage that the file format is much more
+	  complicated.
+
+	<dt><b>II-C</b>
+	<dd>If the library only provides partial memory management but
+	  the file requires full memory management, then this method
+	  degenerates to the same as II-A with the added disadvantage
+	  that the file format is much more complicated.
+
+	<dt><b>III-C</b>
+	<dd>The library and file format both provide complete data
+	  structures for space-efficient memory management.
+
+	  <ul>
+	    <li>Advantages
+	      <ul>
+		<li>Files can be opened in constant time since the
+		  free list is read on demand and amortized into the
+		  allocation and deallocation requests.
+		<li>No data needs to be written back to the file when
+		  it's closed.
+		<li>File repair utilities don't have to be run to
+		  reclaim memory.
+		<li>File repair utilities can detect whether an
+		  unreferenced block is a free block or orphaned data.
+		<li>Sharing the free list between processes is easy.
+		<li>Efficient use of file space.
+	      </ul>
+	    <li>Disadvantages
+	      <ul>
+		<li>O(log <em>N</em>) file allocation and deallocation
+		  cost where <em>N</em> is the total number of free
+		  blocks.
+		<li>O(log <em>N</em>) library allocation and
+		  deallocation cost.
+		<li>Much more complicated file format.
+		<li>More complicated library.
+	      </ul>
+	  </ul>
+
+      </dl>
+
+      <!-- ---------------------------------------------------------------- -->
+      <h2>The Algorithm for II-B</h2>
+
+      <p>The file contains an unsorted, doubly-linked list of free
+	blocks.  The address of the head of the list appears in the
+	boot block.  Each free block contains the following fields:
+
+      <center>
+      <table border cellpadding=4 width="60%">
+	<tr align=center>
+	  <th width="25%">byte</th>
+	  <th width="25%">byte</th>
+	  <th width="25%">byte</th>
+	  <th width="25%">byte</th>
+
+	<tr align=center>
+	  <th colspan=4>Free Block Signature</th>
+
+	<tr align=center>
+	  <th colspan=4>Total Free Block Size</th>
+
+	<tr align=center>
+	  <th colspan=4>Address of Left Sibling</th>
+
+	<tr align=center>
+	  <th colspan=4>Address of Right Sibling</th>
+
+	<tr align=center>
+	  <th colspan=4><br><br>Remainder of Free Block<br><br><br></th>
+      </table>
+      </center>
+      
+      <p>The library reads as much of the free list as is convenient,
+	whenever it is convenient, and pushes those entries onto stacks.  This can
+	occur when a file is opened or any time during the life of the
+	file. There is one stack for each free block size and the
+	stacks are sorted by size in a balanced tree in memory.
+
+      <p>Deallocation involves finding the correct stack or creating
+	a new one (an O(log <em>K</em>) operation where <em>K</em> is
+	the number of stacks), pushing the free block info onto the
+	stack (a constant-time operation), and inserting the free
+	block into the file free block list (a constant-time operation
+	which doesn't necessarily involve any I/O since the free blocks
+	can be cached like other objects).  No attempt is made to
+	coalesce adjacent free blocks into larger blocks.
+
+      <p>Allocation involves finding the correct stack (an O(log
+	<em>K</em>) operation), removing the top item from the stack
+	(a constant-time operation), and removing the block from the
+	file free block list (a constant-time operation).  If there is
+	no free block of the requested size or larger, then the file
+	is extended.
+
+      <p>To provide sharability of the free list between processes,
+	the last step of an allocation will check for the free block
+	signature and if it doesn't find one will repeat the process.
+	Alternatively, a process can temporarily remove free blocks
+	from the file and hold them in its own private pool.
+
+      <p>To summarize...
+	<dl>
+	<dt>File opening
+	<dd>O(<em>N</em>) amortized over the time the file is open,
+	  where <em>N</em> is the number of free blocks.  The library
+	  can still function without reading any of the file free
+	  block list.
+
+	<dt>Deallocation
+	<dd>O(log <em>K</em>) where <em>K</em> is the number of unique
+	  sizes of free blocks.  File access is constant.
+
+	<dt>Allocation
+	<dd>O(log <em>K</em>).  File access is constant.
+
+	<dt>File closing
+	<dd>O(1) even if the library temporarily removes free
+	  blocks from the file to hold them in a private pool since
+	  the pool can still be a linked list on disk.
+      </dl>
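+      <p>The in-core side of the II-B scheme can be sketched as follows.
+	This is a minimal illustration under stated assumptions, not the
+	library's implementation: the names <code>fl_t</code>,
+	<code>fl_free</code> and <code>fl_alloc</code> are hypothetical, a
+	small array sorted by size stands in for the balanced tree (so
+	creating a new stack is linear here rather than O(log
+	<em>K</em>)), and the on-disk doubly-linked list is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define FL_MAX_SIZES 16   /* illustrative limit on distinct sizes (K) */
#define FL_MAX_DEPTH 32   /* illustrative limit on blocks per stack   */

/* One stack of free-block addresses, all of the same size. */
typedef struct {
    unsigned long size;
    unsigned long addr[FL_MAX_DEPTH];
    int           depth;
} fl_stack_t;

/* Hypothetical in-core free list: stacks kept sorted by ascending size. */
typedef struct {
    fl_stack_t    stack[FL_MAX_SIZES];
    int           nstacks;
    unsigned long eof;    /* current end of file */
} fl_t;

/* Binary search: index of the first stack whose size is >= SIZE. */
int fl_lower_bound(const fl_t *fl, unsigned long size)
{
    int lo = 0, hi = fl->nstacks;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (fl->stack[mid].size < size)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Deallocation: find or create the stack for SIZE, then push ADDR. */
void fl_free(fl_t *fl, unsigned long addr, unsigned long size)
{
    int i = fl_lower_bound(fl, size);
    int j;

    if (i == fl->nstacks || fl->stack[i].size != size) {
        assert(fl->nstacks < FL_MAX_SIZES);
        for (j = fl->nstacks; j > i; j--)
            fl->stack[j] = fl->stack[j - 1];
        fl->stack[i].size = size;
        fl->stack[i].depth = 0;
        fl->nstacks++;
    }
    assert(fl->stack[i].depth < FL_MAX_DEPTH);
    fl->stack[i].addr[fl->stack[i].depth++] = addr;
}

/* Allocation: pop from the smallest non-empty stack whose size is
 * >= SIZE; if there is none, extend the file.  No coalescing or
 * splitting is attempted, matching the description above. */
unsigned long fl_alloc(fl_t *fl, unsigned long size)
{
    unsigned long addr;
    int i = fl_lower_bound(fl, size);

    while (i < fl->nstacks && fl->stack[i].depth == 0)
        i++;                       /* skip stacks that have been emptied */
    if (i < fl->nstacks)
        return fl->stack[i].addr[--fl->stack[i].depth];

    addr = fl->eof;                /* nothing suitable: grow the file */
    fl->eof += size;
    return addr;
}
```

+      <p>A request that is satisfied by a larger block wastes the
+	difference in this sketch, which is one way the "possible
+	inefficient use of file space" listed for II-B can arise.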
+
+      <!-- ---------------------------------------------------------------- -->
+      <h2>The Algorithm for III-C</h2>
+
+      <p>The HDF5 file format supports a general B-tree mechanism
+	for storing data with keys.  If we use a B-tree to represent
+	all parts of the file that are free and the B-tree is indexed
+	so that a free file chunk can be found if we know the starting
+	or ending address, then we can efficiently determine whether a
+	free chunk begins or ends at the specified address.  Call this
+	the <em>Address B-Tree</em>.
+
+      <p>If a second B-tree points to a set of stacks where the
+	members of a particular stack are all free chunks of the same
+	size, and the tree is indexed by chunk size, then we can
+	efficiently find the best-fit chunk size for a memory request.
+	Call this the <em>Size B-Tree</em>.
+
+      <p>All free blocks of a particular size can be linked together
+	with an unsorted, doubly-linked, circular list and the left
+	and right sibling addresses can be stored within the free
+	chunk, allowing us to remove or insert items from the list in
+	constant time.
+
+      <p>Deallocation of a block of file memory consists of:
+
+	<ol type="I">
+	<li>Add the new free block whose address is <em>ADDR</em> to the
+	  address B-tree.
+
+	  <ol type="A">
+	    <li>If the address B-tree contains an entry for a free
+	      block that ends at <em>ADDR</em>-1 then remove that
+	      block from the B-tree and from the linked list (if the
+	      block was the first on the list then the size B-tree
+	      must be updated).  Adjust the size and address of the
+	      block being freed to include the block just removed from
+	      the free list.  The time required to search for and
+	      possibly remove the left block is O(log <em>N</em>)
+	      where <em>N</em> is the number of free blocks.
+
+	    <li>If the address B-tree contains an entry for the free
+	      block that begins at <em>ADDR</em>+<em>LENGTH</em> then
+	      remove that block from the B-tree and from the linked
+	      list (if the block was the first on the list then the
+	      size B-tree must be updated).  Adjust the size of the
+	      block being freed to include the block just removed from
+	      the free list.  The time required to search for and
+	      possibly remove the right block is O(log <em>N</em>).
+
+	    <li>Add the new (adjusted) block to the address B-tree.
+	      The time for this operation is O(log <em>N</em>).
+	  </ol>
+
+	<li>Add the new block to the size B-tree and linked list.
+
+	  <ol type="A">
+	    <li>If the size B-tree has an entry for this particular
+	      size, then add the chunk to the tail of the list. This
+	      is an O(log <em>K</em>) operation where <em>K</em> is
+	      the number of unique free block sizes.
+
+	    <li>Otherwise make a new entry in the B-tree for chunks of
+	      this size.  This is also O(log <em>K</em>).
+	  </ol>
+      </ol>
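+      <p>Step I of the deallocation procedure (coalescing with the left
+	and right neighbors) can be sketched as follows.  This is an
+	illustration only: an array sorted by address stands in for the
+	address B-tree, the size B-tree and the per-size linked lists are
+	omitted, and the names <code>ab_index_t</code> and
+	<code>ab_free</code> are hypothetical.

```c
#include <assert.h>

#define AB_MAX 64   /* illustrative capacity of the toy index */

typedef struct {
    unsigned long addr;   /* first byte of the free block */
    unsigned long len;    /* length of the free block     */
} ab_block_t;

/* Stand-in for the address B-tree: blocks sorted by ascending address. */
typedef struct {
    ab_block_t blk[AB_MAX];
    int        n;
} ab_index_t;

/* Index of the first block whose address is >= ADDR. */
int ab_lower_bound(const ab_index_t *ix, unsigned long addr)
{
    int lo = 0, hi = ix->n;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (ix->blk[mid].addr < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Remove entry I, closing the gap. */
void ab_remove(ab_index_t *ix, int i)
{
    int j;
    for (j = i; j + 1 < ix->n; j++)
        ix->blk[j] = ix->blk[j + 1];
    ix->n--;
}

/* Free the block [ADDR, ADDR+LEN) and return the (possibly merged)
 * block that ends up in the index. */
ab_block_t ab_free(ab_index_t *ix, unsigned long addr, unsigned long len)
{
    ab_block_t merged;
    int i = ab_lower_bound(ix, addr);
    int j;

    merged.addr = addr;
    merged.len = len;

    /* I.A: a free block ending at ADDR-1 is absorbed on the left. */
    if (i > 0 && ix->blk[i - 1].addr + ix->blk[i - 1].len == addr) {
        merged.addr = ix->blk[i - 1].addr;
        merged.len += ix->blk[i - 1].len;
        ab_remove(ix, i - 1);
        i--;
    }

    /* I.B: a free block beginning at ADDR+LENGTH is absorbed on the right. */
    if (i < ix->n && ix->blk[i].addr == addr + len) {
        merged.len += ix->blk[i].len;
        ab_remove(ix, i);
    }

    /* I.C: insert the adjusted block at its sorted position. */
    assert(ix->n < AB_MAX);
    for (j = ix->n; j > i; j--)
        ix->blk[j] = ix->blk[j - 1];
    ix->blk[i] = merged;
    ix->n++;
    return merged;
}
```

+      <p>In a real B-tree each search, removal, and insertion would be
+	O(log <em>N</em>); the array shifts in this sketch are linear and
+	are used only to keep the example short.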
+
+      <p>Allocation is similar to deallocation.
+
+      <p>To summarize...
+
+      <dl>
+	<dt>File opening
+	<dd>O(1)
+
+	<dt>Deallocation
+	<dd>O(log <em>N</em>) where <em>N</em> is the total number of
+	  free blocks.  File access time is O(log <em>N</em>).
+
+	<dt>Allocation
+	<dd>O(log <em>N</em>).  File access time is O(log <em>N</em>).
+
+	<dt>File closing
+	<dd>O(1).
+      </dl>
+
+
+      <hr>
+      <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Thu Jul 24 15:16:40 PDT 1997 -->
+<!-- hhmts start -->
+Last modified: Thu Jul 31 14:41:01 EST 
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/MoveDStruct.html
+++ b/doc/html/TechNotes/MoveDStruct.html
@@ -0,0 +1,66 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Relocating a File Data Structure</title>
+  </head>
+
+  <body>
+      <h1>Relocating a File Data Structure</h1>
+
+      <p>Since file data structures can be cached in memory by the H5AC
+	package it becomes problematic to move such a data structure in
+	the file. One cannot just copy a portion of the file from one
+	location to another because:
+
+      <ol>
+	<li>the file might not contain the latest information, and</li>
+	<li>the H5AC package might not realize that the object's
+	  address has changed and attempt to write the object to disk
+	  at the old address.</li>
+      </ol>
+      
+      <p>Here's a correct method to move data from one location to
+	another.  The example code assumes that one is moving a B-link
+	tree node from <code>old_addr</code> to <code>new_addr</code>.
+	
+      <ol>
+	<li>Make sure the disk is up-to-date with respect to the
+	  cache.  There is no need to remove the item from the cache,
+	  hence the final argument to <code>H5AC_flush</code> is
+	  <code>FALSE</code>.
+	  <br><br>
+	  <code>
+	    H5AC_flush (f, H5AC_BT, old_addr, FALSE);<br>
+	  </code>
+	  <br>
+	</li>
+	
+	<li>Read the data from the old address and write it to the new
+	  address.
+	  <br><br>
+	  <code>
+	    H5F_block_read (f, old_addr, size, buf);<br>
+	    H5F_block_write (f, new_addr, size, buf);<br>
+	  </code>
+	  <br>
+	</li>
+	  
+	<li>Notify the cache that the address of the object changed.
+	  <br><br>
+	  <code>
+	    H5AC_rename (f, H5AC_BT, old_addr, new_addr);<br>
+	  </code>
+	  <br>
+	</li>
+      </ol>
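+      <p>The reason for the ordering of the three steps can be seen in a
+	small self-contained model.  The sketch below does not use the
+	real H5AC or H5F interfaces: <code>toy_file_t</code>,
+	<code>toy_flush</code>, <code>toy_rename</code> and
+	<code>toy_move</code> are hypothetical stand-ins for a file with a
+	one-entry write-back cache.

```c
#include <assert.h>
#include <string.h>

#define TOY_FILE_SIZE 256   /* illustrative file size in bytes   */
#define TOY_OBJ_SIZE  8     /* illustrative object size in bytes */

/* A toy file plus a one-entry write-back cache (not the H5AC API). */
typedef struct {
    unsigned char disk[TOY_FILE_SIZE];
    unsigned char cached[TOY_OBJ_SIZE];  /* in-memory copy of the object */
    unsigned long cached_addr;           /* file address of that object  */
    int           valid;                 /* cache entry in use?          */
    int           dirty;                 /* cache newer than disk?       */
} toy_file_t;

/* Write the cached object back to disk if it is dirty; keep it cached. */
void toy_flush(toy_file_t *f, unsigned long addr)
{
    if (f->valid && f->dirty && f->cached_addr == addr) {
        memcpy(f->disk + addr, f->cached, TOY_OBJ_SIZE);
        f->dirty = 0;
    }
}

/* Tell the cache that the object now lives at NEW_ADDR. */
void toy_rename(toy_file_t *f, unsigned long old_addr, unsigned long new_addr)
{
    if (f->valid && f->cached_addr == old_addr)
        f->cached_addr = new_addr;
}

/* Move an object using the three steps from the text. */
void toy_move(toy_file_t *f, unsigned long old_addr, unsigned long new_addr)
{
    unsigned char buf[TOY_OBJ_SIZE];

    assert(old_addr + TOY_OBJ_SIZE <= TOY_FILE_SIZE);
    assert(new_addr + TOY_OBJ_SIZE <= TOY_FILE_SIZE);

    toy_flush(f, old_addr);                          /* 1: disk is current */
    memcpy(buf, f->disk + old_addr, TOY_OBJ_SIZE);   /* 2: read old ...    */
    memcpy(f->disk + new_addr, buf, TOY_OBJ_SIZE);   /*    ... write new   */
    toy_rename(f, old_addr, new_addr);               /* 3: update cache    */
}
```

+      <p>Skipping the flush would copy stale bytes, and skipping the
+	rename would let a later flush write the object back to the old
+	address.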
+	  
+
+
+      <hr>
+      <address><a href="mailto:robb@maya.nuance.com">Robb Matzke</a></address>
+<!-- Created: Mon Jul 14 15:09:06 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Mon Jul 14 15:38:29 EST 
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/NamingScheme.html
+++ b/doc/html/TechNotes/NamingScheme.html
@@ -0,0 +1,300 @@
+<HTML>
+<HEAD><TITLE>
+             HDF5 Naming Scheme
+      </TITLE> </HEAD>
+
+<BODY bgcolor="#ffffff">
+  
+
+<H1>
+<FONT color="#c80028">
+ <I> <B> <CENTER>  HDF5 Naming Scheme for </CENTER> </B> </I>
+</FONT> </H1>
+<P>
+<UL>
+
+<LI>       <A HREF = "#01"><I>  FILES </I> </A>
+<LI>       <A HREF = "#02"><I>  PACKAGES </I> </A>
+<LI>       <A HREF = "#03"><I>  PUBLIC vs PRIVATE </I> </A>
+<LI>       <A HREF = "#04"><I>  INTEGRAL TYPES </I> </A>
+<LI>       <A HREF = "#05"><I>  OTHER TYPES </I> </A>
+<LI>       <A HREF = "#06"><I>  GLOBAL VARIABLES </I> </A>
+<LI>       <A HREF = "#07"><I>  MACROS, PREPROCESSOR CONSTANTS, AND ENUM MEMBERS </I> </A>
+
+</UL>
+<P>
+<center>
+	 Authors: <A HREF = "mailto:koziol@ncsa.uiuc.edu">
+        <I>Quincey Koziol</I> </A> and 
+                  <A HREF = "mailto:matzke@llnl.gov">   
+        <I>		  Robb Matzke </I> </A>   
+
+</center>
+<UL>
+
+<FONT color="#c80028">
+<LI> <A NAME="01">  <B> <I> FILES </I> </B>  </A>
+</FONT>
+
+<UL>
+
+  <LI>  Source files are named according to the package they contain (see
+    below).  All files will begin with `H5' so we can stuff our
+    object files into someone else's library and not worry about file
+    name conflicts.
+  <P>For Example:
+<i><b>
+<dd>	H5.c		-- "Generic" library functions 
+   <br>
+<dd>	H5B.c		-- B-link tree functions
+</b></i>
+  <p>
+   <LI> If a package is in more than one file, then another name is tacked
+    on.  It's all lower case with no underscores or hyphens.
+   <P>For Example:
+<i><b>
+<dd>	H5F.c		-- the file for this package
+   <br>
+<dd>	H5Fstdio.c	-- stdio functions (just an example)
+   <br>
+<dd>	H5Ffcntl.c	-- fcntl functions (just an example)
+</b></i>
+   <p>
+  <LI> Each package file has a header file of API stuff (unless there is
+    no API component to the package)
+   <P>For Example:
+<i><b>
+<dd>	H5F.h		-- things an application would see. </b> </i>
+   <P>
+    and a header file of private stuff
+<i><b>
+   <p>
+<dd>	H5Fprivate.h	-- things an application wouldn't see. The
+                    	   private header includes the public header.
+</b></i>
+    <p>
+    and a header for private prototypes
+<i><b>
+   <p>
+<dd>	H5Fproto.h	-- prototypes for internal functions.
+</b></i>
+    <P>
+    By splitting the prototypes into separate include files we don't
+    have to recompile everything when just one function prototype
+    changes.
+
+   <LI> The main API header file is `hdf5.h' and it includes each of the
+    public header files but none of the private header files.  Or the
+    application can include just the public header files it needs.
+
+   <LI> There is no main private or prototype header file because it
+    prevents make from being efficient.  Instead, each source file
+    includes only the private header and prototype files it needs
+    (first all the private headers, then all the private prototypes).
+
+   <LI> Header files should include everything they need and nothing more.
+
+</UL>
+<P>
+
+<FONT color="#c80028">
+<LI> <A NAME="02">  <B> <I> PACKAGES </I> </B>  </A>
+</FONT>
+
+<P>
+Names exported beyond function scope begin with `H5' followed by zero,
+one, or two upper-case letters that describe the class of object.
+This prefix is the package name.  The implementation of packages
+doesn't necessarily have to map 1:1 to the source files.
+<P>
+<i><b>
+<dd>	H5	-- library functions
+<br>
+<dd>	H5A	-- atoms
+<br>
+<dd>	H5AC	-- cache
+<br>
+<dd>	H5B	-- B-link trees
+<br>
+<dd>	H5D	-- datasets
+<br>
+<dd>	H5E	-- error handling
+<br>
+<dd>	H5F	-- files
+<br>
+<dd>	H5G	-- groups
+<br>
+<dd>	H5M	-- meta data
+<br>
+<dd>	H5MM	-- core memory management
+<br>
+<dd>	H5MF	-- file memory management
+<br>
+<dd>	H5O	-- object headers
+<br>
+<dd>	H5P	-- Property Lists
+<br>
+<dd>	H5S	-- dataspaces
+<br>
+<dd>	H5R	-- relationships
+<br>
+<dd>	H5T	-- datatype
+</b></i>
+<p>
+Each package implements a single main class of object (e.g., the H5B
+package implements B-link trees).  The main data type of a package is
+the package name followed by `_t'.
+<p>
+<i><b>
+<dd>	H5F_t		-- HDF5 file type
+<br>
+<dd>	H5B_t		-- B-link tree data type
+</b></i>
+<p>
+
+Not all packages implement a data type (H5, H5MF) and some
+packages provide access to a preexisting data type (H5MM, H5S).
+<p>
+
+
+<FONT color="#c80028">
+<LI> <A NAME="03">  <B> <I> PUBLIC vs PRIVATE </I> </B>  </A>
+</FONT>
+<p>
+If the symbol is for internal use only, then the package name is
+followed by an underscore and the rest of the name.  Otherwise, the
+symbol is part of the API and there is no underscore between the
+package name and the rest of the name.
+<p>
+<i><b>
+<dd>	H5Fopen		-- an API function.
+<br>
+<dd>	H5B_find	-- an internal function.
+</b></i>
+<p>
+For functions, this is important because the API functions never pass
+pointers around (they use atoms instead for hiding the implementation)
+and they perform stringent checks on their arguments.  Internal
+functions, on the other hand, check arguments with assert().
+<p>
+Data types like H5B_t carry no information about whether the type is
+public or private since it doesn't matter.
+
+<p>
+
+
+<FONT color="#c80028">
+<LI> <A NAME="04"> <B> <I> INTEGRAL TYPES </I> </B>  </A>
+</FONT>
+<p>
+Integral fixed-point type names are an optional `u' followed by `int'
+followed by the size in bits (8, 16,
+32, or 64).  There is no trailing `_t' because these are common
+enough and follow their own naming convention.
+<p>
+<pre><H4>
+<dd>	hbool_t     -- boolean values (BTRUE, BFALSE, BFAIL)
+<br>
+<dd>	int8		-- signed 8-bit integers
+<br>
+<dd>	uint8       -- unsigned 8-bit integers
+<br>
+<dd>	int16       -- signed 16-bit integers
+<br>
+<dd>	uint16      -- unsigned 16-bit integers
+<br>
+<dd>	int32       -- signed 32-bit integers
+<br>
+<dd>	uint32      -- unsigned 32-bit integers
+<br>
+<dd>	int64       -- signed 64-bit integers
+<br>
+<dd>	uint64      -- unsigned 64-bit integers
+<br>
+<dd>	intn		-- "native" integers
+<br>
+<dd>	uintn		-- "native" unsigned integers
+
+</H4></pre>
+<p>
+
+<FONT color="#c80028">
+<LI> <A NAME="05"> <B> <I> OTHER TYPES </I> </B> </A>
+</FONT>
+
+<p>
+
+Other data types are always followed by `_t'.
+<p>
+<pre><H4>
+<dd>	H5B_key_t	-- additional data type used by H5B package.
+</H4></pre>
+<p>
+
+However, if the name is so common that it's used almost everywhere,
+then we make an alias for it by removing the package name and leading
+underscore and replacing it with an `h'  (the main datatype for a
+package already has a short enough name, so we don't have aliases for
+them).
+<P>
+<pre><H4>
+<dd>	typedef H5E_err_t herr_t;
+</H4> </pre>
+<p>
+
+<FONT color="#c80028">
+<LI> <A NAME="06">  <B> <I> GLOBAL VARIABLES </I> </B>  </A>
+</FONT>
+<p>
+Global variables include the package name and end with `_g'.
+<p>
+<pre><H4>
+<dd>	H5AC_methods_g	-- global variable in the H5AC package.
+</H4> </pre>
+<p>
+
+
+<FONT color="#c80028">
+<LI> <A NAME="07">   
+<I> <B>
+MACROS, PREPROCESSOR CONSTANTS, AND ENUM MEMBERS
+  </I> </B>  </A>
+</FONT>
+<p>
+Same rules as other symbols except the name is all upper case.  There
+are a few exceptions: <br>
+<ul>
+<li>	Constants and macros defined on a system that is deficient: 
+       <p><pre><H4> 
+<dd>		MIN(x,y), MAX(x,y) and their relatives
+        </H4></pre>
+
+<li>	Platform constants :
+       <P> 
+		No naming scheme; determined by OS and compiler.<br>
+		These appear only in one header file anyway.
+        <p>
+<li>	Feature test constants (?)<br>
+		Always start with `HDF5_HAVE_' like HDF5_HAVE_STDARG_H for a
+		header file, or HDF5_HAVE_DEV_T for a data type, or
+		HDF5_HAVE_DIV for a function.
+</UL>
+<p>
+
+</UL>
+<p>
+<H6>
+<center>
+	 This file /hdf3/web/hdf/internal/HDF_standard/HDF5.coding_standard.html is
+	 maintained by Elena Pourmal <A HREF = "mailto:epourmal@ncsa.uiuc.edu">
+         <I>epourmal@ncsa.uiuc.edu</I> </A>.
+</center>
+<p>
+<center>
+          Last modified August 5, 1997
+</center>
+
+</H6>
+</BODY>
+</HTML>
+
--- a/doc/html/TechNotes/ObjectHeader.html
+++ b/doc/html/TechNotes/ObjectHeader.html
@@ -0,0 +1,67 @@
+<html>
+<body>
+
+<h1>Object Headers</h1>
+
+<pre>
+
+haddr_t
+H5O_new (hdf5_file_t *f, intn nrefs, size_t size_hint)
+
+	Creates a new empty object header and returns its address.
+	The SIZE_HINT is the initial size of the data portion of the
+	object header and NREFS is the number of symbol table entries
+	that reference this object header (normally one).
+
+	If SIZE_HINT is too small, then at least some default amount
+	of space is allocated for the object header.
+
+intn				        /*num remaining links		*/
+H5O_link (hdf5_file_t *f,		/*file containing header	*/
+	  haddr_t addr,			/*header file address		*/
+	  intn adjust)			/*link adjustment amount	*/
+
+
+size_t
+H5O_sizeof (hdf5_file_t *f,		/*file containing header	*/
+	    haddr_t addr,		/*header file address		*/
+            H5O_class_t *type,		/*message type or H5O_ANY	*/
+	    intn sequence)		/*sequence number, usually zero	*/
+		
+	Returns the size of a particular instance of a message in an
+	object header.  When an object header has more than one
+	instance of a particular message type, then SEQUENCE indicates
+	which instance to return.
+
+void *
+H5O_read (hdf5_file_t *f,		/*file containing header	*/
+	  haddr_t addr,			/*header file address		*/
+	  H5G_entry_t *ent,		/*optional symbol table entry	*/
+	  H5O_class_t *type,		/*message type or H5O_ANY	*/
+	  intn sequence,		/*sequence number, usually zero	*/
+	  size_t size,			/*size of output message	*/
+	  void *mesg)			/*output buffer			*/
+
+	Reads a message from the object header into memory.
+
+const void *
+H5O_peek (hdf5_file_t *f,		/*file containing header	*/
+          haddr_t addr,			/*header file address		*/
+	  H5G_entry_t *ent,		/*optional symbol table entry	*/
+	  H5O_class_t *type,		/*type of message or H5O_ANY	*/
+	  intn sequence)		/*sequence number, usually zero	*/
+
+haddr_t					/*new heap address		*/
+H5O_modify (hdf5_file_t *f,		/*file containing header	*/
+            haddr_t addr,		/*header file address		*/
+	    H5G_entry_t *ent,		/*optional symbol table entry	*/
+	    hbool_t *ent_modified,	/*entry modification flag	*/
+	    H5O_class_t *type,		/*message type			*/
+	    intn overwrite,		/*sequence number or -1		*/
+	    void *mesg)			/*the message			*/  
+	  
+
+</pre>
+
+</body>
+</html>
--- a/doc/html/TechNotes/RawDStorage.html
+++ b/doc/html/TechNotes/RawDStorage.html
@@ -0,0 +1,274 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Raw Data Storage in HDF5</title>
+  </head>
+
+  <body>
+    <h1>Raw Data Storage in HDF5</h1>
+
+    <p>This document describes the various ways that raw data is
+      stored in an HDF5 file and the object header messages which
+      contain the parameters for the storage.
+
+    <p>Raw data storage has three components: the mapping from some
+      logical multi-dimensional element space to the linear address
+      space of a file, compression of the raw data on disk, and
+      striping of raw data across multiple files.  These components
+      are orthogonal.
+      
+    <p>Some goals of the storage mechanism are to be able to
+      efficiently store data which is:
+
+    <dl>
+      <dt>Small
+      <dd>Small pieces of raw data can be treated as meta data and
+	stored in the object header.  This will be achieved by storing
+	the raw data in the object header with message 0x0006.
+	Compression and striping are not supported in this case.
+
+      <dt>Complete Large
+      <dd>The library should be able to store large arrays
+	contiguously in the file provided the user knows the final
+	array size a priori.  The array can then be read/written in a
+	single I/O request.  This is accomplished by describing the
+	storage with object header message 0x0005. Compression and
+	striping are not supported in this case.
+
+      <dt>Sparse Large
+      <dd>A large sparse raw data array should be stored in a manner
+	that is space-efficient but one in which any element can still
+	be accessed in a reasonable amount of time. Implementation
+	details are below.
+	
+      <dt>Dynamic Size
+      <dd>One often doesn't have prior knowledge of the size of an
+	array. It would be nice to allow arrays to grow dynamically in
+	any dimension. It might also be nice to allow the array to
+	grow in the negative dimension directions if convenient to
+	implement. Implementation details are below.
+
+      <dt>Subslab Access
+      <dd>Some multi-dimensional arrays are almost always accessed by
+	subslabs. For instance, a 2-d array of pixels might always be
+	accessed as smaller 1k-by-1k 2-d arrays always aligned on 1k
+	index values.  We should be able to store the array in such a
+	way that striding through the entire array is not necessary.
+	Subslab access might also be useful with compression
+	algorithms where each storage slab can be compressed
+	independently of the others. Implementation details are below.
+
+      <dt>Compressed
+      <dd>Various compression algorithms can be applied to the entire
+	array. We're not planning to support separate algorithms (or a
+	single algorithm with separate parameters) for each chunk
+	although it would be possible to implement that in a manner
+	similar to the way striping across files is
+	implemented.
+
+      <dt>Striped Across Files
+      <dd>The array access functions should support arrays stored
+	discontiguously across a set of files.
+    </dl>
+
+    <h1>Implementation of Indexed Storage</h1>
+
+    <p>The Sparse Large, Dynamic Size, and Subslab Access methods
+      share so much code that they can be described with a single
+      message.  The new Indexed Storage Message (<code>0x0008</code>)
+      will replace the old Chunked Object (<code>0x0009</code>) and
+      Sparse Object (<code>0x000A</code>) Messages.
+
+    <p>
+      <center>
+	<table border cellpadding=4 width="60%">
+	  <caption align=bottom>
+	    <b>The Format of the Indexed Storage Message</b>
+	  </caption>
+	  <tr align=center>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	  </tr>
+
+	  <tr align=center>
+	    <td colspan=4><br>Address of B-tree<br><br></td>
+	  </tr>
+	  <tr align=center>
+	    <td>Number of Dimensions</td>
+	    <td>Reserved</td>
+	    <td>Reserved</td>
+	    <td>Reserved</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Reserved (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Alignment for Dimension 0 (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Alignment for Dimension 1 (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>...</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Alignment for Dimension N (4 bytes)</td>
+	  </tr>
+	</table>
+      </center>
+
+    <p>The alignment fields indicate the alignment in logical space to
+      use when allocating new storage areas on disk.  For instance,
+      writing every other element of a 100-element one-dimensional
+      array (using one HDF5 I/O partial write operation per element)
+      that has unit storage alignment would result in 50
+      single-element, discontiguous storage segments.  However, using
+      an alignment of 25 would result in only four discontiguous
+      segments.  The size of the message varies with the number of
+      dimensions.
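+    <p>The arithmetic in this example can be checked with a short
+      sketch.  The function name <code>count_segments</code> is
+      hypothetical; it assumes storage is allocated in aligned units of
+      <code>align</code> elements and counts every unit touched by the
+      writes as one segment.

```c
#include <assert.h>

/* Count the storage segments allocated when elements 0, stride,
 * 2*stride, ... of an N-element one-dimensional array are written and
 * storage is allocated in aligned units of ALIGN elements. */
unsigned long count_segments(unsigned long n, unsigned long stride,
                             unsigned long align)
{
    unsigned long i, count = 0, last_unit = 0;
    int have_last = 0;

    assert(stride > 0 && align > 0);

    for (i = 0; i < n; i += stride) {
        unsigned long unit = i / align;   /* aligned unit holding element i */
        if (!have_last || unit != last_unit) {
            count++;                      /* first write into this unit */
            last_unit = unit;
            have_last = 1;
        }
    }
    return count;
}
```

+    <p>With <code>n</code> = 100 and <code>stride</code> = 2, an
+      alignment of 1 gives 50 segments and an alignment of 25 gives 4,
+      matching the figures above.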
+
+    <p>A B-tree is used to point to the discontiguous portions of
+      storage that have been allocated for the object.  All keys of a
+      particular B-tree are the same size and are a function of the
+      number of dimensions. It is therefore not possible to change the
+      dimensionality of an indexed storage array after its B-tree is
+      created.
+
+    <p>
+      <center>
+	<table border cellpadding=4 width="60%">
+	  <caption align=bottom>
+	    <b>The Format of a B-Tree Key</b>
+	  </caption>
+	  <tr align=center>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	  </tr>
+
+	  <tr align=center>
+	    <td colspan=4>External File Number or Zero (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Chunk Offset in Dimension 0 (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Chunk Offset in Dimension 1 (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>...</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Chunk Offset in Dimension N (4 bytes)</td>
+	  </tr>
+	</table>
+      </center>
+
+    <p>The keys within a B-tree obey an ordering based on the chunk
+      offsets.  If the offsets in dimension-0 are equal, then
+      dimension-1 is used, etc. The External File Number field
+      contains a 1-origin offset into the External File List message
+      which contains the name of the external file in which that chunk
+      is stored.
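+    <p>The key ordering described above amounts to a lexicographic
+      comparison of the chunk offsets.  A minimal sketch, assuming a
+      hypothetical in-memory key type <code>chunk_key_t</code> with a
+      fixed upper bound on the number of dimensions (the on-disk key has
+      no such bound):

```c
#include <assert.h>
#include <stdint.h>

#define KEY_MAX_DIMS 8   /* illustrative limit, not part of the format */

/* Hypothetical in-memory form of an indexed-storage B-tree key. */
typedef struct {
    uint32_t file_no;                /* External File Number or zero  */
    uint32_t offset[KEY_MAX_DIMS];   /* chunk offset in each dimension */
} chunk_key_t;

/* Compare two keys by chunk offset, dimension 0 first.  Returns a
 * negative value, zero, or a positive value, like strcmp().  The
 * external file number does not take part in the ordering. */
int chunk_key_cmp(const chunk_key_t *a, const chunk_key_t *b, int ndims)
{
    int i;

    assert(a && b);
    assert(ndims > 0 && ndims <= KEY_MAX_DIMS);

    for (i = 0; i < ndims; i++) {
        if (a->offset[i] < b->offset[i])
            return -1;
        if (a->offset[i] > b->offset[i])
            return 1;
    }
    return 0;
}
```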
+
+    <h1>Implementation of Striping</h1>
+
+    <p>The indexed storage will support arbitrary striping at the
+      chunk level; each chunk can be stored in any file.  This is
+      accomplished by using the External File Number field of an
+      indexed storage B-tree key as a 1-origin offset into an External
+      File List Message (0x0009) which takes the form:
+
+    <p>
+      <center>
+	<table border cellpadding=4 width="60%">
+	  <caption align=bottom>
+	    <b>The Format of the External File List Message</b>
+	  </caption>
+	  <tr align=center>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	    <th width="25%">byte</th>
+	  </tr>
+
+	  <tr align=center>
+	    <td colspan=4><br>Name Heap Address<br><br></td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Number of Slots Allocated (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Number of File Names (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Byte Offset of Name 1 in Heap (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>Byte Offset of Name 2 in Heap (4 bytes)</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4>...</td>
+	  </tr>
+	  <tr align=center>
+	    <td colspan=4><br>Unused Slot(s)<br><br></td>
+	  </tr>
+	</table>
+      </center>
+
+    <p>Each indexed storage array that has all or part of its data
+      stored in external files will contain a single external file
+      list message.  The size of the message is determined when the
+      message is created, but it may be possible to enlarge the
+      message on demand by moving it.  At this time, it's not possible
+      for multiple arrays to share a single external file list
+      message.
+
+    <dl>
+      <dt><code>
+	  H5O_efl_t *H5O_efl_new (H5G_entry_t *object, intn
+	  nslots_hint, intn heap_size_hint)
+	</code>
+      <dd>Adds a new, empty external file list message to an object
+	header and returns a pointer to that message.  The message
+	acts as a cache for file descriptors of external files that
+	are open.
+
+      <p><dt><code>
+	  intn H5O_efl_index (H5O_efl_t *efl, const char *filename)
+	</code>
+      <dd>Gets the external file index number for a particular file name.
+	If the name isn't in the external file list then it's added to
+	the H5O_efl_t struct and immediately written to the object
+	header to which the external file list message belongs. Name
+	comparison is textual.  Each name should be relative to the
+	directory which contains the HDF5 file.
+
+      <p><dt><code>
+	  H5F_low_t *H5O_efl_open (H5O_efl_t *efl, intn index, uintn mode)
+	</code>
+      <dd>Gets a low-level file descriptor for an external file.  The
+	external file list caches file descriptors because we might
+	have many more external files than there are file descriptors
+	available to this process.  The caller should not close this file.
+
+      <p><dt><code>
+	  herr_t H5O_efl_release (H5O_efl_t *efl)
+	</code>
+      <dd>Releases an external file list, closes all files
+	associated with that list, and if the list has been modified
+	since the call to <code>H5O_efl_new</code> flushes the message
+	to disk.
+    </dl>
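+
+    <p>The lookup-or-add behavior described for
+      <code>H5O_efl_index</code> can be illustrated with a minimal
+      sketch.  The <code>toy_efl_*</code> names and the struct layout
+      are hypothetical and are not the library's
+      <code>H5O_efl_t</code>.  Returning a 1-origin index is an
+      assumption that mirrors the 1-origin B-tree key, and the sketch
+      keeps names in memory only instead of writing them to an object
+      header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for an external file list.
 * This is NOT the library's H5O_efl_t. */
typedef struct {
    int    nalloc;  /* number of slots allocated */
    int    nused;   /* number of file names stored */
    char **name;    /* names, relative to the HDF5 file's directory */
} toy_efl_t;

/* Create an empty list with room for nslots_hint names. */
toy_efl_t *toy_efl_new(int nslots_hint)
{
    toy_efl_t *efl = malloc(sizeof *efl);
    if (!efl) return NULL;
    efl->nalloc = nslots_hint > 0 ? nslots_hint : 1;
    efl->nused = 0;
    efl->name = calloc((size_t)efl->nalloc, sizeof *efl->name);
    if (!efl->name) { free(efl); return NULL; }
    return efl;
}

/* Return the 1-origin index of filename, adding it if absent.
 * Comparison is textual, so "a.dat" and "./a.dat" are different names.
 * Returns -1 if memory cannot be allocated. */
int toy_efl_index(toy_efl_t *efl, const char *filename)
{
    assert(efl && filename);
    for (int i = 0; i < efl->nused; i++)
        if (strcmp(efl->name[i], filename) == 0)
            return i + 1;

    if (efl->nused == efl->nalloc) {        /* no unused slot: enlarge */
        int    n = efl->nalloc * 2;
        char **p = realloc(efl->name, (size_t)n * sizeof *p);
        if (!p) return -1;
        efl->name = p;
        efl->nalloc = n;
    }

    size_t len = strlen(filename) + 1;
    char  *copy = malloc(len);
    if (!copy) return -1;
    memcpy(copy, filename, len);
    efl->name[efl->nused] = copy;
    return ++efl->nused;
}

/* Free the list and every name it holds. */
void toy_efl_release(toy_efl_t *efl)
{
    if (!efl) return;
    for (int i = 0; i < efl->nused; i++)
        free(efl->name[i]);
    free(efl->name);
    free(efl);
}
```

+    <p>Looking up the same name twice yields the same index, while a
+      textually different spelling of the same path gets a new slot.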
+
+    <hr>
+    <address><a href="mailto:robb@arborea.spizella.com">Robb Matzke</a></address>
+<!-- Created: Fri Oct  3 09:52:32 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Tue Nov 25 12:36:50 EST 1997
+<!-- hhmts end -->
+  </body>
+</html>
--- a/doc/html/TechNotes/SymbolTables.html
+++ b/doc/html/TechNotes/SymbolTables.html
@@ -0,0 +1,323 @@
+<html>
+<body>
+
+<h1>Symbol Table Caching Issues</h1>
+
+<pre>
+
+A number of issues involving caching of object header messages in
+symbol table entries must be resolved.
+
+What is the motivation for these changes?
+
+   If we make objects completely independent of object name it allows
+   us to refer to one object by multiple names (a concept called hard
+   links in Unix file systems), which in turn provides an easy way to
+   share data between datasets.
+
+   Every object in an HDF5 file has a unique, constant object header
+   address which serves as a handle (or OID) for the object.  The
+   object header contains messages which describe the object.
+
+   HDF5 allows some of the object header messages to be cached in
+   symbol table entries so that the object header doesn't have to be
+   read from disk.  For instance, an entry for a directory caches the
+   directory disk addresses required to access that directory, so the
+   object header for that directory is seldom read.
+
+   If an object has multiple names (that is, a link count greater than
+   one), then it has multiple symbol table entries which point to it.
+   All symbol table entries must agree on header messages.  The
+   current mechanism is to turn off the caching of header messages in
+   symbol table entries when the header link count is more than one,
+   and to allow caching once the link count returns to one.
+
+   However, in the current implementation, a package is allowed to
+   copy a symbol table entry and use it as a private cache for the
+   object header.  This doesn't work for a number of reasons (all but
+   one require a `delete symbol entry' operation).
+
+      1. If two packages hold copies of the same symbol table entry,
+         they don't notify each other of changes to the symbol table
+         entry. Eventually, one package reads a cached message and
+         gets the wrong value because the other package changed the
+         message in the object header.
+
+      2. If one package holds a copy of the symbol table entry and
+         some other part of HDF5 removes the object and replaces it
+         with some other object, then the original package will
+         continue to access the non-existent object using the new
+         object header.
+
+      3. If one package holds a copy of the symbol table entry and
+         some other part of HDF5 (re)moves the directory which
+         contains the object, then the package will be unable to
+         update the symbol table entry with the new cached
+         data. Packages that refer to the object by the new name will
+         use old cached data.
+
+
+The basic problem is that there may be multiple copies of the object
+symbol table entry floating around in the code when there should
+really be at most one per hard link.
+
+   Level 0: A copy may exist on disk as part of a symbol table node, which
+            is a small 1d array of symbol table entries.
+
+   Level 1: A copy may be cached in memory as part of a symbol table node
+	    in the H5Gnode.c file by the H5AC layer.
+
+   Level 2a: Another package may be holding a copy so it can perform
+   	     fast lookup of any header messages that might be cached in
+   	     the symbol table entry.  It can't point directly to the
+             cached symbol table node because that node can disappear
+   	     at any time.
+
+   Level 2b: Packages may hold more than one copy of a symbol table
+             entry.  For instance, if H5D_open() is called twice for
+             the same name, then two copies of the symbol table entry
+             for the dataset exist in the H5D package.
+
+How can level 2a and 2b be combined?
+
+   If package data structures contained pointers to symbol table
+   entries instead of copies of symbol table entries and if H5G
+   allocated one symbol table entry per hard link, then it's trivial
+   for Level 2a and 2b to benefit from one another's actions since
+   they share the same cache.
+
+How does this work conceptually?
+
+   Level 2a and 2b must notify Level 1 of their intent to use (or stop
+   using) a symbol table entry to access an object header.  The
+   notification of the intent to access an object header is called
+   `opening' the object and releasing the access is `closing' the
+   object.
+
+   Opening an object requires an object name which is used to locate
+   the symbol table entry to use for caching of object header
+   messages.  The return value is a handle for the object.  Figure 1
+   shows the state after Dataset1 opens Object with a name that maps
+   through Entry1.  The open request created a copy of Entry1 called
+   Shadow1 which exists even if SymNode1 is preempted from the H5AC
+   layer.
+
+                                                     ______
+                                            Object  /      \
+	     SymNode1                     +--------+        |
+	    +--------+            _____\  | Header |        |
+	    |        |           /     /  +--------+        |
+	    +--------+ +---------+                  \______/
+	    | Entry1 | | Shadow1 | /____
+	    +--------+ +---------+ \    \
+	    :        :                   \
+	    +--------+                    +----------+
+					  | Dataset1 |
+					  +----------+
+			     FIGURE 1
+
+
+
+  The SymNode1 can appear and disappear from the H5AC layer at any
+  time without affecting the Object Header data cached in the Shadow.
+  The rules are:
+
+  * If the SymNode1 is present and is about to disappear and the
+    Shadow1 dirty bit is set, then Shadow1 is copied over Entry1, the
+    Entry1 dirty bit is set, and the Shadow1 dirty bit is cleared.
+
+  * If something requests a copy of Entry1 (for a read-only peek
+    request), and Shadow1 exists, then a copy (not pointer) of Shadow1
+    is returned instead.
+
+  * Entry1 cannot be deleted while Shadow1 exists.
+
+  * Entry1 cannot change directly if Shadow1 exists since this means
+    that some other package has opened the object and may be modifying
+    it.  I haven't decided if it's useful to ever change Entry1
+    directly (except of course within the H5G layer itself).
+
+  * Shadow1 is created when Dataset1 `opens' the object through
+    Entry1. Dataset1 is given a pointer to Shadow1 and Shadow1's
+    reference count is incremented.
+
+  * When Dataset1 `closes' the Object the Shadow1 reference count is
+    decremented.  When the reference count reaches zero, if the
+    Shadow1 dirty bit is set, then Shadow1's contents are copied to
+    Entry1, and the Entry1 dirty bit is set. Shadow1 is then deleted
+    if its reference count is zero.  This may require reading SymNode1
+    back into the H5AC layer.
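+
+  The open/close and peek rules above can be sketched as follows.
+  This is a simplified illustration, not the H5G implementation: the
+  toy_* names are hypothetical, a single int stands in for the cached
+  header messages, and the entry is assumed to stay in memory (so the
+  case where SymNode1 must be read back in is not modelled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct toy_shadow toy_shadow_t;

/* Hypothetical stand-in for a symbol table entry (Entry1). */
typedef struct {
    int           cache;   /* stands in for cached header messages */
    bool          dirty;
    toy_shadow_t *shadow;  /* NULL when the object is not open */
} toy_entry_t;

/* Hypothetical stand-in for a shadow (Shadow1). */
struct toy_shadow {
    int          cache;
    bool         dirty;
    int          nrefs;    /* number of packages holding the shadow */
    toy_entry_t *main;     /* the entry this shadow was copied from */
};

/* `Open' the object: create the shadow on first use, then count the ref. */
toy_shadow_t *toy_open(toy_entry_t *ent)
{
    assert(ent);
    if (!ent->shadow) {
        toy_shadow_t *sh = malloc(sizeof *sh);
        if (!sh) return NULL;
        sh->cache = ent->cache;
        sh->dirty = false;
        sh->nrefs = 0;
        sh->main = ent;
        ent->shadow = sh;
    }
    ent->shadow->nrefs++;
    return ent->shadow;
}

/* `Close' the object: on the last close, copy a dirty shadow back to
 * the entry, mark the entry dirty, and delete the shadow. */
void toy_close(toy_shadow_t *sh)
{
    assert(sh && sh->nrefs > 0);
    if (--sh->nrefs > 0) return;
    if (sh->dirty) {
        sh->main->cache = sh->cache;
        sh->main->dirty = true;
    }
    sh->main->shadow = NULL;
    free(sh);
}

/* Read-only peek: return a copy (not a pointer) that reflects the
 * shadow's contents when a shadow exists. */
toy_entry_t toy_peek(const toy_entry_t *ent)
{
    assert(ent);
    toy_entry_t copy = *ent;
    if (ent->shadow) {
        copy.cache = ent->shadow->cache;
        copy.dirty = ent->shadow->dirty;
    }
    copy.shadow = NULL;
    return copy;
}
```

+  Two opens through the same entry return the same shadow pointer, and
+  the entry itself is only updated when the last holder closes.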
+
+What happens when another Dataset opens the Object through Entry1?
+
+  If the current state is represented by the top part of Figure 2,
+  then Dataset2 will be given a pointer to Shadow1 and the Shadow1
+  reference count will be incremented to two.  The Object header link
+  count remains at one so Object Header messages continue to be cached
+  by Shadow1. Dataset1 and Dataset2 benefit from one another's
+  actions. The resulting state is represented by all of Figure 2.
+
+                                                     _____
+             SymNode1                       Object  /     \
+            +--------+            _____\  +--------+       |
+            |        |           /     /  | Header |       |
+            +--------+ +---------+        +--------+       |
+            | Entry1 | | Shadow1 | /____            \_____/
+            +--------+ +---------+ \    \
+            :        :        _          \
+            +--------+       |\           +----------+
+                               \          | Dataset1 |
+                                \________ +----------+
+                                         \              \
+                                          +----------+   |
+                                          | Dataset2 |   |- New Dataset
+                                          +----------+   |
+                                                        /
+			     FIGURE 2
+
+
+What happens when the link count for Object increases while Dataset
+has the Object open?
+
+                                                     SymNode2
+                                                    +--------+
+    SymNode1                       Object           |        |
+   +--------+             ____\  +--------+ /______ +--------+
+   |        |            /    /  | header | \      `| Entry2 |
+   +--------+ +---------+        +--------+         +--------+
+   | Entry1 | | Shadow1 | /____                     :        :
+   +--------+ +---------+ \    \                    +--------+
+   :        :                   \
+   +--------+                    +----------+   \________________/
+                                 | Dataset1 |            |
+                                 +----------+         New Link
+
+			     FIGURE 3
+
+  The current state is represented by the left part of Figure 3.  To
+  create a new link the Object Header had to be located by traversing
+  through Entry1/Shadow1.  On the way through, the Entry1/Shadow1 
+  cache is invalidated and the Object Header link count is
+  incremented. Entry2 is then added to SymNode2.
+
+  Since the Object Header link count is greater than one, Object
+  header data will not be cached in Entry1/Shadow1.
+
+  If the initial state had been all of Figure 3 and a third link is
+  being added and Object is open by Entry1 and Entry2, then creation
+  of the third link will invalidate the cache in Entry1 or Entry2.  It
+  doesn't matter which since both caches are already invalidated
+  anyway.
+
+What happens if another Dataset opens the same object by another name?
+
+  If the current state is represented by Figure 3, then a Shadow2 is
+  created and associated with Entry2.  However, since the Object
+  Header link count is more than one, nothing gets cached in Shadow2
+  (or Shadow1).
+
+What happens if the link count decreases?
+
+  If the current state is represented by all of Figure 3 then it isn't
+  possible to delete Entry1 because the object is currently open
+  through that entry.  Therefore, the link count must have
+  decreased because Entry2 was removed.
+
+  As Dataset1 reads/writes messages in the Object header they will
+  begin to be cached in Shadow1 again because the Object header link
+  count is one.
+
+What happens if the object is removed while it's open?
+
+  That operation is not allowed.
+
+What happens if the directory containing the object is deleted?
+
+  That operation is not allowed since deleting the directory requires
+  that the directory be empty.  The directory cannot be emptied
+  because the open object cannot be removed from the directory.
+
+What happens if the object is moved?
+
+  Moving an object is a process consisting of creating a new
+  hard-link with the new name and then deleting the old name.
+  This will fail if the object is open.
+
+What happens if the directory containing the entry is moved?
+
+  The entry and the shadow still exist and are associated with one
+  another.
+
+What if a file is flushed or closed when objects are open?
+
+  Flushing a symbol table with open objects writes correct information
+  to the file since Shadow is copied to Entry before the table is
+  flushed.
+
+  Closing a file with open objects will create a valid file but will
+  return failure.
+
+How is the Shadow associated with the Entry?
+
+  A symbol table is composed of one or more symbol nodes.  A node is a
+  small 1-d array of symbol table entries.  The entries can move
+  around within a node and from node-to-node as entries are added or
+  removed from the symbol table and nodes can move around within a
+  symbol table, being created and destroyed as necessary.
+
+  Since a symbol table has an object header with a unique and constant
+  file offset, and since H5G contains code to efficiently locate a
+  symbol table entry given its name, we use these two values as a key
+  within a shadow to associate the shadow with the symbol table
+  entry.
+
+        struct H5G_shadow_t {
+           haddr_t              stab_addr;   /*symbol table header address*/
+           char                *name;        /*entry name wrt symbol table*/
+           hbool_t              dirty;       /*out-of-date wrt stab entry?*/
+           H5G_entry_t          ent;         /*my copy of stab entry      */
+           H5G_entry_t         *main;        /*the level 1 entry or null  */
+           struct H5G_shadow_t *next, *prev; /*other shadows for this stab*/
+        };
+
+  The set of shadows will be organized in a hash table of linked
+  lists.  Each linked list will contain the shadows associated with a
+  particular symbol table header address and the list will be sorted
+  lexicographically.
+
+  Also, each Entry will have a pointer to the corresponding Shadow or
+  null if there is no shadow.
+
+  When a symbol table node is loaded into the main cache, we look up
+  the linked list of shadows in the shadow hash table based on the
+  address of the symbol table object header.  We then traverse that
+  list matching shadows with symbol table entries.
+
+  We assume that opening/closing objects will be a relatively
+  infrequent event compared with loading/flushing symbol table
+  nodes. Therefore, if we keep the linked list of shadows sorted it
+  costs O(N) to open and close objects where N is the number of open
+  objects in that symbol table (instead of O(1)) but it costs only
+  O(N) to load a symbol table node (instead of O(N^2)).
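+
+  The cost trade-off can be sketched with a sorted, singly linked list.
+  The sketch is an illustration only: the toy_* names are hypothetical,
+  and it assumes that the entry names of a loaded node are also
+  available in sorted order, which is what makes a single merge-style
+  pass possible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified shadow: only the fields needed for matching. */
typedef struct toy_shadow {
    const char        *name;     /* entry name wrt symbol table */
    struct toy_shadow *next;     /* next shadow for this symbol table */
    int                matched;  /* index of matching entry, or -1 */
} toy_shadow_t;

/* Insert sh into the list at *head, keeping it sorted by name.
 * This is the O(N) cost paid when an object is opened. */
void toy_shadow_insert(toy_shadow_t **head, toy_shadow_t *sh)
{
    assert(head && sh && sh->name);
    while (*head && strcmp((*head)->name, sh->name) < 0)
        head = &(*head)->next;
    sh->next = *head;
    *head = sh;
}

/* Match shadows with a node's entry names in one pass over both
 * sorted sequences.  Returns the number of shadows matched. */
int toy_shadow_match(toy_shadow_t *head, const char *const *entry,
                     int nentries)
{
    int nmatched = 0;
    int i = 0;

    for (toy_shadow_t *sh = head; sh; sh = sh->next)
        sh->matched = -1;

    toy_shadow_t *sh = head;
    while (sh && i < nentries) {
        int cmp = strcmp(sh->name, entry[i]);
        if (cmp == 0) {          /* shadow belongs to this entry */
            sh->matched = i++;
            sh = sh->next;
            nmatched++;
        } else if (cmp < 0) {    /* shadow's entry is not in this node */
            sh = sh->next;
        } else {                 /* entry has no shadow */
            i++;
        }
    }
    return nmatched;
}
```

+  With an unsorted list each entry would have to be compared against
+  every shadow, which is where the O(N^2) figure comes from.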
+
+What about the root symbol entry?
+
+  Level 1 storage for the root symbol entry is always available since
+  it's stored in the hdf5_file_t struct instead of a symbol table
+  node.  However, the contents of that entry can move from the file
+  handle to a symbol table node by H5G_mkroot().  Therefore, if the
+  root object is opened, we keep a shadow entry for it whose
+  `stab_addr' field is zero and whose `name' is null.
+
+  For this reason, the root object should always be read through the
+  H5G interface.
+
+One more key invariant:  The H5O_STAB message in a symbol table header
+never changes.  This allows symbol table entries to cache the H5O_STAB
+message for the symbol table to which it points without worrying about
+whether the cache will ever be invalidated.
+
+</pre>
+  
+</body>
+</html>
--- a/doc/html/TechNotes/Version.html
+++ b/doc/html/TechNotes/Version.html
@@ -0,0 +1,137 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Version Numbers</title>
+  </head>
+
+  <body>
+    <h1>HDF5 Release Version Numbers</h1>
+
+    <h2>1. Introduction</h2>
+
+    <p>The HDF5 version number is a set of three integer values
+      written as either <code>hdf5-1.2.3</code> or <code>hdf5 version
+      1.2 release 3</code>.
+
+    <p>The <code>5</code> is part of the library name and will only
+      change if the entire file format and library are redesigned on
+      a scale similar to the change from HDF4 to HDF5.
+
+    <p>The <code>1</code> is the <em>major version number</em> and
+      changes when there is an extensive change to the file format or
+      library API.  Such a change will likely require files to be
+      translated and applications to be modified.  This number is not
+      expected to change frequently.
+
+    <p>The <code>2</code> is the <em>minor version number</em> and is
+      incremented by each public release that presents new features.
+      Even numbers are reserved for stable public versions of the
+      library while odd numbers are reserved for development
+      versions.  See the diagram below for examples.
+
+    <p>The <code>3</code> is the <em>release number</em>.  For public
+      versions of the library, the release number is incremented each
+      time a bug is fixed and the fix is made available to the public.
+      For development versions, the release number is incremented more 
+      often (perhaps almost daily).
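+
+    <p>As an illustration of the numbering scheme, the following
+      sketch splits a string of the form <code>hdf5-1.2.3</code> into
+      its three numbers and applies the even/odd rule to the minor
+      number.  The <code>toy_*</code> names are hypothetical and are
+      not part of the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the three version numbers. */
typedef struct {
    unsigned major;    /* changes with extensive format or API changes */
    unsigned minor;    /* even: stable public, odd: development */
    unsigned release;  /* incremented as fixes are released */
} toy_version_t;

/* Parse "hdf5-MAJOR.MINOR.RELEASE".  Returns true only if exactly
 * three numbers were read and nothing follows them. */
bool toy_parse_version(const char *s, toy_version_t *v)
{
    char tail;
    assert(s && v);
    return sscanf(s, "hdf5-%u.%u.%u%c",
                  &v->major, &v->minor, &v->release, &tail) == 3;
}

/* Even minor numbers are reserved for stable public versions. */
bool toy_is_stable(const toy_version_t *v)
{
    assert(v);
    return v->minor % 2 == 0;
}
```

+    <p>Under this rule <code>hdf5-1.2.3</code> is classified as a
+      stable public version and <code>hdf5-1.1.0</code> as a
+      development version.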
+
+    <h2>2. Abbreviated Versions</h2>
+
+    <p>It's often convenient to drop the release number when referring
+      to a version of the library, like saying version 1.2 of HDF5.
+      The release number can be any value in this case.
+
+    <h2>3. Special Versions</h2>
+
+    <p>Version 1.0.0 was released for alpha testing the first week of
+      March 1998.  The development version number was incremented to
+      1.0.1 and remained constant until the last week of April,
+      when the release number started to increase and development
+      versions were made available to people outside the core HDF5
+      development team.
+
+    <p>Version 1.0.23 was released mid-July as a second alpha
+      version.
+
+    <p>Version 1.1.0 will be the first official beta release but the
+      1.1 branch will also serve as a development branch since we're
+      not concerned about providing bug fixes separate from normal
+      development for the beta version.
+
+    <p>After the beta release we rolled back the version number so the
+      first release is version 1.0 and development will continue on
+      version 1.1. We felt that an initial version of 1.0 was more
+      important than continuing to increment the pre-release version
+      numbers.
+
+    <h2>4. Public versus Development</h2>
+
+    <p>The motivation for separate public and development versions is
+      that the public version will receive only bug fixes while the
+      development version will receive new features.  This also allows 
+      us to release bug fixes expediently without waiting for the
+      development version to reach a stable state.
+
+    <p>Eventually, the development version will near completion and a
+      new development branch will fork while the original one enters a 
+      feature freeze state.  When the original development branch is
+      ready for release the minor version number will be incremented
+      to an even value.
+
+    <p>
+      <center>
+	<img alt="Version Example" src="version.gif">
+	<br><b>Fig 1: Version Example</b>
+      </center>
+
+    <h2>5. Version Support from the Library</h2>
+
+    <p>The library provides a set of macros and functions to query and 
+      check version numbers.
+
+    <dl>
+      <dt><code>H5_VERS_MAJOR</code>
+      <dt><code>H5_VERS_MINOR</code>
+      <dt><code>H5_VERS_RELEASE</code>
+      <dd>These preprocessor constants are defined in the public
+	include file and determine the version of the include files.
+
+	<br><br>
+      <dt><code>herr_t H5get_libversion (unsigned *<em>majnum</em>, unsigned
+	  *<em>minnum</em>, unsigned *<em>relnum</em>)</code>
+      <dd>This function returns through its arguments the version
+	numbers for the library to which the application is linked.
+
+	<br><br>
+      <dt><code>void H5check(void)</code>
+      <dd>This is a macro that verifies that the version number of the 
+	HDF5 include file used to compile the application matches the
+	version number of the library to which the application is
+	linked.  This check occurs automatically when the first HDF5
+	file is created or opened and is important because a mismatch
+	between the include files and the library is likely to result
+	in corrupted data and/or segmentation faults.  If a mismatch
+	is detected the library issues an error message on the
+	standard error stream and aborts with a core dump.
+
+	<br><br>
+      <dt><code>herr_t H5check_version (unsigned <em>majnum</em>,
+	  unsigned <em>minnum</em>, unsigned <em>relnum</em>)</code>
+      <dd>This function is called by the <code>H5check()</code> macro
+	with the include file version constants.  The function
+	compares its arguments to the result returned by
+	<code>H5get_libversion()</code> and if a mismatch is detected prints
+	an error message on the standard error stream and aborts.
+    </dl>
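+
+    <p>The relationship between the compile-time constants, the
+      query function, and the check can be sketched without linking
+      against the library.  Everything named <code>toy_*</code> or
+      <code>TOY_*</code> below is a hypothetical stand-in, and the
+      version numbers are made up for the example.  Unlike the check
+      described above, this sketch returns an error code instead of
+      aborting so that the mismatch case can be exercised.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the constants in the public include file. */
#define TOY_VERS_MAJOR   1
#define TOY_VERS_MINOR   2
#define TOY_VERS_RELEASE 3

/* Hypothetical stand-in for the version built into the library binary. */
static const unsigned toy_lib_vers[3] = { 1, 2, 3 };

/* Return the library's version numbers through the arguments. */
int toy_get_libversion(unsigned *majnum, unsigned *minnum, unsigned *relnum)
{
    assert(majnum && minnum && relnum);
    *majnum = toy_lib_vers[0];
    *minnum = toy_lib_vers[1];
    *relnum = toy_lib_vers[2];
    return 0;
}

/* Compare the caller's (include file) version with the library's.
 * Returns 0 on a match; prints a message and returns -1 on a mismatch. */
int toy_check_version(unsigned majnum, unsigned minnum, unsigned relnum)
{
    unsigned lmaj, lmin, lrel;

    if (toy_get_libversion(&lmaj, &lmin, &lrel) < 0)
        return -1;
    if (majnum != lmaj || minnum != lmin || relnum != lrel) {
        fprintf(stderr,
                "version mismatch: headers %u.%u.%u, library %u.%u.%u\n",
                majnum, minnum, relnum, lmaj, lmin, lrel);
        return -1;
    }
    return 0;
}

/* The macro passes the compile-time constants to the run-time check. */
#define TOY_CHECK() \
    toy_check_version(TOY_VERS_MAJOR, TOY_VERS_MINOR, TOY_VERS_RELEASE)
```

+    <p>Because the macro expands in the application's translation
+      unit, it captures the include file version seen at compile
+      time, while the function body reports the version of whatever
+      library is linked.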
+
+<hr>
+<address><a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a></address>
+<br>
+
+<!-- Created: Wed Apr 22 11:24:40 EDT 1998 -->
+<!-- hhmts start -->
+Last modified: Fri Oct 30 10:32:50 EST 1998
+<!-- hhmts end -->
+
+  </body>
+</html>