HTML::Element::traverse man page on HP-UX

Man page or keyword search:  
man Server   10987 pages
apropos Keyword Search (all sections)
Output format
HP-UX logo
[printable version]

HTML::Element::traversUser Contributed Perl DocumentHTML::Element::traverse(3)

       HTML::Element::traverse - discussion of HTML::Element's traverse method

	 # $element->traverse is unnecessary and obscure.
	 #   Don't use it in new code.

       "HTML::Element" provides a method "traverse" that traverses the tree
       and calls user-specified callbacks for each node, in pre- or
       post-order.  However, use of the method is quite superfluous: if you
       want to recursively visit every node in the tree, it's almost always
       simpler to write a subroutine does just that, than it is to bundle up
       the pre- and/or post-order code in callbacks for the "traverse" method.

       Suppose you want to traverse at/under a node $tree and give elements an
       'id' attribute unless they already have one.

       You can use the "traverse" method:

	   my $counter = 'x0000';
	     [ # Callbacks;
	       # pre-order callback:
	       sub {
		 my $x = $_[0];
		 $x->attr('id', $counter++) unless defined $x->attr('id');
		 return HTML::Element::OK; # keep traversing
	       # post-order callback:
	     1, # don't call the callbacks for text nodes

       or you can just be simple and clear (and not have to understand the
       calling format for "traverse") by writing a sub that traverses the tree
       by just calling itself:

	   my $counter = 'x0000';
	   sub give_id {
	     my $x = $_[0];
	     $x->attr('id', $counter++) unless defined $x->attr('id');
	     foreach my $c ($x->content_list) {
	       give_id($c) if ref $c; # ignore text nodes

       See, isn't that nice and clear?

       But, if you really need to know:

       The "traverse()" method is a general object-method for traversing a
       tree or subtree and calling user-specified callbacks.  It accepts the
       following syntaxes:

       or $h->traverse(\&callback, $ignore_text)
       or $h->traverse( [\&pre_callback,\&post_callback] , $ignore_text)

       These all mean to traverse the element and all of its children.	That
       is, this method starts at node $h, "pre-order visits" $h, traverses its
       children, and then will "post-order visit" $h.  "Visiting" means that
       the callback routine is called, with these arguments:

	   $_[0] : the node (element or text segment),
	   $_[1] : a startflag, and
	   $_[2] : the depth

       If the $ignore_text parameter is given and true, then the pre-order
       call will not be happen for text content.

       The startflag is 1 when we enter a node (i.e., in pre-order calls) and
       0 when we leave the node (in post-order calls).

       Note, however, that post-order calls don't happen for nodes that are
       text segments or are elements that are prototypically empty (like "br",
       "hr", etc.).

       If we visit text nodes (i.e., unless $ignore_text is given and true),
       then when text nodes are visited, we will also pass two extra arguments
       to the callback:

	   $_[3] : the element that's the parent
		    of this text node
	   $_[4] : the index of this text node
		    in its parent's content list

       Note that you can specify that the pre-order routine can be a different
       routine from the post-order one:

	   $h->traverse( [\&pre_callback,\&post_callback], ...);

       You can also specify that no post-order calls are to be made, by pro‐
       viding a false value as the post-order routine:

	   $h->traverse([ \&pre_callback,0 ], ...);

       And similarly for suppressing pre-order callbacks:

	   $h->traverse([ 0,\&post_callback ], ...);

       Note that these two syntaxes specify the same operation:

	   $h->traverse([\&foo,\&foo], ...);
	   $h->traverse( \&foo	     , ...);

       The return values from calls to your pre- or post-order routines are
       significant, and are used to control recursion into the tree.

       These are the values you can return, listed in descending order of my
       estimation of their usefulness:

       HTML::Element::OK, 1, or any other true value keep on traversing.

	   Note that "HTML::Element::OK" et al are constants.  So if you're
	   running under "use strict" (as I hope you are), and you say:
	   "return HTML::Element::PRUEN" the compiler will flag this as an
	   error (an unallowable bareword, specifically), whereas if you spell
	   PRUNE correctly, the compiler will not complain.

       undef, 0, '0', '', or HTML::Element::PRUNE block traversing under the current element's content.	(This
	   is ignored if received from a post-order callback, since by then
	   the recursion has already happened.)	 If this is returned by a pre-
	   order callback, no post-order callback for the current node will
	   happen.  (Recall that if your callback exits with just "return;",
	   it is returning undef -- at least in scalar context, and "traverse"
	   always calls your callbacks in scalar context.)

       HTML::Element::ABORT abort the whole traversal immediately.	 This is often useful
	   when you're looking for just the first node in the tree that meets
	   some criterion of yours.

       HTML::Element::PRUNE_UP abort continued traversal into this node and its parent node.
	   No post-order callback for the current or parent node will happen.

	   Like PRUNE, except that the post-order call for the current node is
	   not blocked.

       Almost every task to do with extracting information from a tree can be
       expressed in terms of traverse operations (usually in only one pass,
       and usually paying attention to only pre-order, or to only post-order),
       or operations based on traversing. (In fact, many of the other methods
       in this class are basically calls to traverse() with particular argu‐

       The source code for HTML::Element and HTML::TreeBuilder contain several
       examples of the use of the "traverse" method to gather information
       about the content of trees and subtrees.

       (Note: you should not change the structure of a tree while you are
       traversing it.)

       [End of documentation for the "traverse()" method]

       Traversing with Recursive Anonymous Routines

       Now, if you've been reading Structure and Interpretation of Computer
       Programs too much, maybe you even want a recursive lambda.  Go ahead:

	   my $counter = 'x0000';
	   my $give_id;
	   $give_id = sub {
	     my $x = $_[0];
	     $x->attr('id', $counter++) unless defined $x->attr('id');
	     foreach my $c ($x->content_list) {
	       $give_id->($c) if ref $c; # ignore text nodes
	   undef $give_id;

       It's a bit nutty, and it's still more concise than a call to the "tra‐
       verse" method!

       It is left as an exercise to the reader to figure out how to do the
       same thing without using a $give_id symbol at all.

       It is also left as an exercise to the reader to figure out why I unde‐
       fine $give_id, above; and why I could achieved the same effect with any

	   $give_id = 'I like pie!';
	  # or...
	   $give_id = [];
	  # or even;
	   $give_id = sub { print "Mmmm pie!\n" };

       But not:

	   $give_id = sub { print "I'm $give_id and I like pie!\n" };
	  # nor...
	   $give_id = \$give_id;
	  # nor...
	   $give_id = { 'pie' => \$give_id, 'mode' => 'a la' };

       Doing Recursive Things Iteratively

       Note that you may at times see an iterative implementation of pre-order
       traversal, like so:

	    my @to_do = ($tree); # start-node
	    while(@to_do) {
	      my $this = shift @to_do;

	      # "Visit" the node:
	      $this->attr('id', $counter++)
	       unless defined $this->attr('id');

	      unshift @to_do, grep ref $_, $this->content_list;
	       # Put children on the stack -- they'll be visited next

       This can under certain circumstances be more efficient than just a nor‐
       mal recursive routine, but at the cost of being rather obscure.	It
       gains efficiency by avoiding the overhead of function-calling, but
       since there are several method dispatches however you do it (to "attr"
       and "content_list"), the overhead for a simple function call is

       Pruning and Whatnot

       The "traverse" method does have the fairly neat features of the
       "ABORT", "PRUNE_UP" and "PRUNE_SOFTLY" signals.	None of these can be
       implemented totally straightforwardly with recursive routines, but it
       is quite possible.  "ABORT"-like behavior can be implemented either
       with using non-local returning with "eval"/"die":

	 my $died_on; # if you need to know where...
	 sub thing {
	   ... visits $_[0]...
	   ... maybe set $died_on to $_[0] and die "ABORT_TRAV" ...
	   ... else call thing($child) for each child...
	   ...any post-order visiting $_[0]...
	 eval { thing($node) };
	 if($@) {
	   if($@ =~ m<^ABORT_TRAV>) { died (aborted) on $died_on...
	   } else {
	     die $@; # some REAL error happened

       or you can just do it with flags:

	 my($abort_flag, $died_on);
	 sub thing {
	   ... visits $_[0]...
	   ... maybe set $abort_flag = 1; $died_on = $_[0]; return;
	   foreach my $c ($_[0]->content_list) {
	     return if $abort_flag;
	   ...any post-order visiting $_[0]...

	 $abort_flag = $died_on = undef;
	 ...if defined $abort_flag, it died on $died_on


       Copyright 2000,2001 Sean M. Burke

       Sean M. Burke, <>

perl v5.8.8			  2006-08-04	    HTML::Element::traverse(3)

List of man pages available for HP-UX

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
Vote for polarhome
Free Shell Accounts :: the biggest list on the net