Discussion:
Exposing String Escape Functions
Volkan YAZICI
2007-07-16 11:13:27 UTC
Hi,

In a parser I'm working on, I'm trying to convert hand-written
documents into XHTML form, and for this purpose I'm using CL-WHO
integrated with META-SEXP. Because the parsing is done character by
character, I need to escape unrecognized atoms on the fly. At the
moment, I'm using the method below.

(elt
 (cl-who:escape-string
  (make-string 1 :initial-element character-needs-escaping))
 0)

Yep, that's quite a nasty piece of code for escaping a single
character. Therefore, I'd like to ask whether it would be possible to
expose the character escaping routines. (If you approve the proposal,
I volunteer to send a patch.)


Regards.
Edi Weitz
2007-07-16 11:24:33 UTC
Post by Volkan YAZICI
In a parser I'm working on, I'm trying to convert hand-written
documents into XHTML form, and for this purpose I'm using CL-WHO
integrated with META-SEXP. Because the parsing is done character by
character, I need to escape unrecognized atoms on the fly. At the
moment, I'm using the method below.
(elt
 (cl-who:escape-string
  (make-string 1 :initial-element character-needs-escaping))
 0)
Hmm, I don't think I understand that. Grabbing just the first
character of the escaped string will usually just give you the
ampersand:

CL-USER 1 > (let ((character-needs-escaping #\>))
              (elt
               (cl-who:escape-string
                (make-string 1 :initial-element character-needs-escaping))
               0))
#\&

Is that really what you want?
Volkan YAZICI
2007-07-16 11:29:26 UTC
Post by Edi Weitz
Hmm, I don't think I understand that. Grabbing just the first
CL-USER 1 > (let ((character-needs-escaping #\>))
              (elt
               (cl-who:escape-string
                (make-string 1 :initial-element character-needs-escaping))
               0))
#\&
Is that really what you want?
Excuse me, that's my fault. I realized the mistake after I pressed C-c
C-c. Here's a small snippet from the real code:

(write-string
 (cl-who:escape-string (make-string 1 :initial-element c))
 some-output-stream)


Regards.
Edi Weitz
2007-07-16 11:56:07 UTC
Post by Volkan YAZICI
Excuse me, that's my fault. I realized the mistake after I pressed C-c
(write-string
 (cl-who:escape-string (make-string 1 :initial-element c))
 some-output-stream)
OK, I see.

It's fine with me if you want to isolate the corresponding code and
export a function which works on characters, as long as your patch
adheres to these guidelines:

http://weitz.de/patches.html

Go wild,
Edi.
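
For readers following along, a character-level escaping function of the kind discussed above might look like the minimal sketch below. The name SKETCH-ESCAPE-CHAR is hypothetical and is not CL-WHO's API. It covers only the five markup-significant characters and does nothing special for non-ASCII characters.

```lisp
;; Hypothetical sketch; SKETCH-ESCAPE-CHAR is an illustrative name,
;; not a function exported by CL-WHO.
(defun sketch-escape-char (char)
  "Return the escaped form of CHAR as a string, or CHAR itself as a
one-character string if it needs no escaping."
  (case char
    (#\< "&lt;")
    (#\> "&gt;")
    (#\& "&amp;")
    (#\" "&quot;")
    (#\' "&#039;")
    (t (string char))))

;; Usage while walking the input character by character: each
;; character is escaped and written without building a temporary
;; one-element string via MAKE-STRING first.
(defun sketch-escape-to-string (input)
  (with-output-to-string (out)
    (loop for c across input
          do (write-string (sketch-escape-char c) out))))
```

With this in place, the caller no longer needs the MAKE-STRING detour from the snippet above; it can pass each character straight to the escaping function.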
Volkan YAZICI
2007-07-16 13:35:40 UTC
Post by Edi Weitz
Post by Volkan YAZICI
(write-string
 (cl-who:escape-string (make-string 1 :initial-element c))
 some-output-stream)
OK, I see.
It's fine with me if you want to isolate the corresponding code and
export a function which works on characters as long as your patch
http://weitz.de/patches.html
Edi Weitz
2007-07-19 21:44:19 UTC
Post by Volkan YAZICI
Sorry for the delay. Busy...
I attached the related patch to this post.
Thanks. That's OK with me, except that I'd use FLET instead of LET for
the test functions. But the patch for the HTML documentation is
missing.

Post by Volkan YAZICI
But if you ask for my opinion, the escaping functions are just
polluting the function namespace.
I don't think that's a big issue because we have packages. CL-WHO
only exports two dozen symbols or so.

Post by Volkan YAZICI
IMHO, it would be better to collect them under a single generic
function.
Yeah, but it'd be "harder" to use. Again, I think this is not a big
issue and mainly a matter of taste.

Post by Volkan YAZICI
By the way, the (eq *html-mode* :xml) checks in the code make FORMAT
optimization impossible for the character escaping routines. I didn't
test the performance impact of this, but how many clients are there
that don't support hexadecimals in the escaped entities? (Maybe make
that check a compile-time parameter?)
I agree that it'd be nicer to make this a compile-time decision. I'm
not so much concerned about performance, but it'd be good for
consistency.

Thanks,
Edi.
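
To illustrate the run-time versus compile-time point, here is a rough sketch under stated assumptions. All names in it are hypothetical and none of it is CL-WHO's code. The first function picks the FORMAT control string at run time, so the compiler sees no constant control string to pre-process. The second goes through a macro that expands into a FORMAT call with a literal control string, chosen when the macro is expanded.

```lisp
;; Hypothetical names throughout; this is not CL-WHO's code.
(defvar *sketch-mode* :xml
  "Run-time switch, standing in for the mode check mentioned above.")

(defun numeric-entity/run-time (char)
  ;; The control string is selected on every call, so FORMAT receives
  ;; a non-constant first argument.
  (format nil (if (eq *sketch-mode* :xml) "&#x~x;" "&#~d;")
          (char-code char)))

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defparameter *sketch-use-hex-entities* t
    "Compile-time switch: T for hexadecimal entities, NIL for decimal."))

(defmacro numeric-entity-format (code)
  ;; The switch is read at macroexpansion time; the expansion contains
  ;; a literal control string.
  (if *sketch-use-hex-entities*
      `(format nil "&#x~x;" ,code)
      `(format nil "&#~d;" ,code)))

(defun numeric-entity/compile-time (char)
  (numeric-entity-format (char-code char)))
```

The trade-off is the one raised above: the compile-time variant fixes the entity style when the code is built, which gives consistent output but means changing the switch requires recompiling the functions that use the macro.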
Volkan YAZICI
2007-07-19 21:54:37 UTC
Post by Edi Weitz
But the patch for the HTML documentation is missing.
I completely forgot to diff index.html. Here it is.


Regards.
Edi Weitz
2007-07-24 21:57:44 UTC
Post by Volkan YAZICI
I attached the related patch to this post.
The patch contained TAB characters and wrong links, among other
things, so I had to go through it manually anyway. That's why it took
me a while. Here's the new release.

Thanks,
Edi.
