* Stripping html from a string
From: wave @ 2004-02-22 0:13 UTC (permalink / raw)
Hello, I was wondering if anybody knew of a function lying around that
would return a given string with any HTML tags in it stripped.
I've had a look at GNAT.Regexp, but for some reason it's not liking
my regular expressions at all, which 'should' strip the HTML.
Here is some of my example code:
with Ada.Text_Io, Gnat.Regexp;
use Ada.Text_Io, Gnat.Regexp;

procedure Regex is

   procedure Testmatch (
         Re : Regexp;
         S  : String ) is
   begin
      if Match( S, Re ) then
         Put_Line( S & " matches the expression" );
      else
         Put_Line( S & " doesn't match the expression" );
      end if;
   end Testmatch;

   Criteria : Regexp;

begin
   Put_Line( "This program demonstrates GNAT's regular expression" );
   Put_Line( "capabilities. These are used to find text that match" );
   Put_Line( "a certain pattern." );
   New_Line;
   Criteria := Compile("<([A-Z][A-Z0-9]*)[^>]*></\1>", False, True);
   Testmatch( Criteria, "hello world" );
   Testmatch( Criteria, "<a
href=""http://www.helloworld.org/"">hello</a>" );
   Testmatch( Criteria, "<b>hello, world</b>" );
   Testmatch( Criteria, "some random text" );
end Regex;
Any input in this matter would be greatly appreciated.
Mut.
* Re: Stripping html from a string
From: Georg Bauhaus @ 2004-02-22 0:45 UTC (permalink / raw)
wave <mutilation@bonbon.net> wrote:
:
: Criteria := Compile("<([A-Z][A-Z0-9]*)[^>]*></\1>", False, True);
What "output" do you expect? AFAIKS, the input will have to be rather
mute, so to speak, in order to match.
-- Georg
* Re: Stripping html from a string
From: wave @ 2004-02-22 11:07 UTC (permalink / raw)
Georg Bauhaus <sb463ba@l1-hrz.uni-duisburg.de> wrote in message news:<c18u2p$gls$1@a1-hrz.uni-duisburg.de>...
> wave <mutilation@bonbon.net> wrote:
> :
> : Criteria := Compile("<([A-Z][A-Z0-9]*)[^>]*></\1>", False, True);
>
> What "output" do you expect? AFAIKS, the input will have to be rather
> mute, so to speak, in order to match.
>
>
> -- Georg
Oh, sorry, the code I gave was just to test the regular expression
pattern. GNAT is just throwing back an error on the pattern; if I
could get it working correctly, then I could arrange the stripping of
the string.
Cheers,
Mut.
* Re: Stripping html from a string
From: Georg Bauhaus @ 2004-02-22 16:17 UTC (permalink / raw)
wave <mutilation@bonbon.net> wrote:
: GNAT is just throwing back an error on the pattern; if I
: could get it working correctly, then I could arrange the stripping of
: the string.
Other than the line break that the news reader shows in your
source, I only noticed the backreference \1. I think this is
not provided in the simple regex matching package.
-- Georg
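
Since the backreference appears to be the sticking point, one way to do the stripping step without regular expressions at all is a single pass that skips everything between '<' and the next '>'. The following is only a minimal sketch of that idea, not something proposed in the thread: Strip_Demo and Strip_Tags are hypothetical names, and the scan is deliberately naive (a '>' inside an attribute value, comments, scripts, and character entities are not handled).

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Strip_Demo is

   --  Hypothetical helper (not part of GNAT): returns S with every
   --  "<...>" span removed. Assumes tags are well formed and that no
   --  '>' occurs inside an attribute value.
   function Strip_Tags (S : String) return String is
      Result : Unbounded_String;
      In_Tag : Boolean := False;
   begin
      for I in S'Range loop
         if S (I) = '<' then
            --  Start of a tag: stop copying characters.
            In_Tag := True;
         elsif S (I) = '>' and then In_Tag then
            --  End of the tag: resume copying after this character.
            In_Tag := False;
         elsif not In_Tag then
            --  Ordinary text outside any tag is kept as-is.
            Append (Result, S (I));
         end if;
      end loop;
      return To_String (Result);
   end Strip_Tags;

begin
   Ada.Text_IO.Put_Line (Strip_Tags ("<b>hello, world</b>"));
   Ada.Text_IO.Put_Line
     (Strip_Tags ("<a href=""http://www.helloworld.org/"">hello</a>"));
   Ada.Text_IO.Put_Line (Strip_Tags ("some random text"));
end Strip_Demo;
```

With the three sample strings from the original post, this should print "hello, world", "hello", and "some random text" respectively. For real-world HTML a proper parser would be the safer choice.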