Accumulating related XML nodes using HXT

30 Oct 2006

Hello.

I have some HTML from which I want to extract records.
Each record is spread across several <tr> nodes, and all the records' <tr>
nodes share the same parent node.

Everything I've tried so far gives me the Cartesian product of the record
fields, so for the HTML fragment included below I end up with:

[ Prod "Television" 17 "/prod17" "A very nice telly."
, Prod "Television" 17 "/prod17" "Mind your fillings."
, Prod "Cyclotron" 24 "/prod24" "A very nice telly."
, Prod "Cyclotron" 24 "/prod24" "Mind your fillings."
]

instead of:

[ Prod "Television" 17 "/prod17" "A very nice telly."
, Prod "Cyclotron" 24 "/prod24" "Mind your fillings."
]

How should I go about accumulating related <tr> nodes into individual records?

Thanks
Daniel

HTML fragment follows:

...
<tr>
  <tr>
    <td><strong>Product:</strong></td>
    <td><strong><a href="/prod17">Television</a></strong> (code: 17)</td>
  </tr>
  <tr>
    <td><strong>Description:</strong></td>
    <td>A very nice telly.</td>
  </tr>

  <tr>
    <td><hr color="#00000"></td>
  </tr>

  <tr>
    <td><strong>Product:</strong></td>
    <td><strong><a href="/prod24">Cyclotron</a></strong> (code: 24)</td>
  </tr>
  <tr>
    <td><strong>Description:</strong></td>
    <td>Mind your fillings.</td>
  </tr>

  <tr>
    <td><hr color="#00000"></td>
  </tr>
</tr>
...

Daniel McAllansmith
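
A minimal sketch of one way to approach the question above, written in plain Haskell rather than HXT. It assumes the <tr> rows have already been collected, in document order, into a single list (in HXT, `listA` from Control.Arrow.ArrowList turns an arrow's results into one list). `Row`, `chunks`, `toProd`, `records` and `crossProduct` are hypothetical names used only for illustration, not part of HXT. The idea is to cut the row list at the <hr> separator rows first and then build one record per chunk, rather than selecting names and descriptions independently, which pairs every product with every description.

```haskell
module Main where

import Data.Maybe (mapMaybe)

-- Hypothetical stand-in for the <tr> rows of the fragment; not an HXT type.
data Row
  = ProductRow String Int String  -- name, code, href
  | DescriptionRow String
  | SeparatorRow                  -- the <tr> that holds only an <hr>
  deriving (Show, Eq)

data Prod = Prod String Int String String
  deriving (Show, Eq)

-- Selecting product rows and description rows independently pairs
-- every product with every description: the Cartesian product.
crossProduct :: [Row] -> [Prod]
crossProduct rows =
  [ Prod n c h d | ProductRow n c h <- rows, DescriptionRow d <- rows ]

-- Split the flat row list into one chunk per record, cutting at separators.
chunks :: [Row] -> [[Row]]
chunks rows = case break (== SeparatorRow) rows of
  ([], [])      -> []
  (chunk, rest) -> [chunk | not (null chunk)] ++ chunks (drop 1 rest)

-- Build one record from one chunk; Nothing if the chunk has another shape.
toProd :: [Row] -> Maybe Prod
toProd [ProductRow n c h, DescriptionRow d] = Just (Prod n c h d)
toProd _                                    = Nothing

-- Group first, then extract: one record per chunk.
records :: [Row] -> [Prod]
records = mapMaybe toProd . chunks

-- The rows of the HTML fragment, transcribed by hand.
sampleRows :: [Row]
sampleRows =
  [ ProductRow "Television" 17 "/prod17"
  , DescriptionRow "A very nice telly."
  , SeparatorRow
  , ProductRow "Cyclotron" 24 "/prod24"
  , DescriptionRow "Mind your fillings."
  , SeparatorRow
  ]

main :: IO ()
main = do
  mapM_ print (crossProduct sampleRows)  -- four records, the unwanted result
  mapM_ print (records sampleRows)       -- two records, one per product
```

The same shape should carry over to HXT: gather the rows into a list, group them in an ordinary function, and only then turn each group into a record.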
