[Reddes.bvs-tech] Indexing HTML text

spinaker spinaker at adinet.com.uy
Wed Apr 4 16:21:40 BRST 2012


Dear all,

I have a database with a field that holds the full text of an XHTML page.
I want to index it with technique 4/8, that is, word by word,
but I do NOT want to index the content of the HTML tags.
For example:
------------------------------------------------------------------------
<HTML>
<HEAD>
<TITLE>National water development report for Ethiopia; A WWAP case study
prepared for the 2nd UN world water development report: Water, a shared
responsibility (2006); 2006</TITLE>
...
The World Water Assessment Program was initiated in 2000 as a global mechanism
for <br>measuring and reporting Progress with achievement of international
objectives in the water <br>sector as part of the international sustainable
development agenda, formalized in 1992, and <br>re-evaluated and focused on
developing countries in 2002 in Johannesburg.  The first <br>World Water
Development Report was produced in 2003 and released at the 3rd World <br>Water
Forum in Kyoto.
<br>
..
Is there a procedure for excluding from indexing the text that sits between
the < > delimiters?
------------------------------------------------------------------------
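
One possible approach, outside of CISIS itself, is to strip the markup before
the field is loaded and indexed. The Python sketch below is only an
illustration of that idea and is not an mx or FST feature; the names
`TextOnly` and `words_without_tags` are made up for this example, and it
assumes the page text can be preprocessed before it reaches the database.

```python
import re
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data and drop everything written inside < >."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # A tag acts as a word boundary, so "water<br>sector" stays two words.
        self.parts.append(" ")

    def handle_endtag(self, tag):
        self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def words_without_tags(markup):
    """Return the words of `markup`, ignoring tag names and attributes."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return re.findall(r"\w+", "".join(parser.parts))


sample = "objectives in the water <br>sector as part of the agenda"
print(words_without_tags(sample))
```

The resulting word list (or the cleaned text) could then be stored in a
separate field and indexed word by word, so that tag names such as `br` never
reach the inverted file.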

Another question:
I am trying to use proc=Gdump[/<tag>][/nonl][/xml][=<file>]

but I am having trouble with some of the parameters and cannot tell what I am
doing wrong. If I use
>mx pepe1 "proc='Gdump/1=xxx.txt'"
it works fine,

but if I use any of the other parameters:

c:\temp> mx pepe1 "proc='Gdump/1/xml=xxx.txt'"
fatal: fldupdat/procx/Gdump/option

c:\temp> mx pepe1 "proc='Gdump/1/nonl=xxx.txt'"
fatal: fldupdat/procx/Gdump/option

c:\temp> mx pepe1 "proc='Gdump/nonl=xxx.txt'"
fatal: fldupdat/procx/Gdump/option

Regards,
Ernesto Spinak

-- 
   .^.                                .^.
   ( )                                ( )
   ===                                ===
  =[=]================================[=]=
   | |  Ernesto Spinak                | |
   | |  spinaker at adinet.com.uy        | |
   | |  Montevideo, Uruguay           | |
   | |  tel/fax  (598) 2622-3352      | |
   | |  mobile   (598) 99612238       | |
  =[=]================================[=]=
   ===                                ===
   ( )                                ( )
    V                                  V
