Re:How to convert rtf format to html format
On Fri, 10 Jan 2003 02:57:41 +0800, "Jianfeng Zou" <zjf1...@21cn.com>
wrote:
Quote
>I know the html help workshop can convert rtf file to html file and can get
>the image from rtf file. How I programe to convert rtf format to html
>format? Thank you!
This is an interesting exercise (I hope it is an exercise, as you
can download a converter without having to write one).
Basically you need to write your own RTF interpreter. This is a matter
of opening the RTF file as if it were plain text, parsing the RTF
markup from the text, and using some sort of lookup mechanism to
convert this to HTML.
So
\b Hello, World!\b0
would become
<B>Hello, World!</B>
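To make the lookup idea concrete, here is a minimal sketch in Python (my choice of language, nothing in the question dictates it). The RTF_TO_HTML table and the rtf_fragment_to_html function are hypothetical names made up for illustration. The sketch assumes control words are lowercase letters with an optional number, followed by one optional space that acts as a delimiter. It drops braces, escaped characters, and anything it has no entry for, so treat it as a starting point rather than a converter.

```python
import re

# Hypothetical lookup table: RTF control word (plus parameter) -> HTML tag.
RTF_TO_HTML = {
    "b": "<B>", "b0": "</B>",
    "i": "<I>", "i0": "</I>",
    "par": "<BR>",
}

# Either a control word (backslash, letters, optional number, and one
# optional delimiter space) or a run of plain text.
TOKEN = re.compile(r"\\([a-z]+)(-?\d+)? ?|([^\\{}]+)")

def rtf_fragment_to_html(fragment):
    out = []
    for match in TOKEN.finditer(fragment):
        word, param, text = match.groups()
        if text is not None:
            out.append(text)          # plain text passes straight through
        else:
            key = word + (param or "")
            out.append(RTF_TO_HTML.get(key, ""))  # unknown words emit nothing
    return "".join(out)

print(rtf_fragment_to_html(r"\b Hello, World!\b0"))
```

Adding support for another simple style is then just one more entry in the table.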
It's not always a straight one-to-one conversion, as an RTF file lists
all the fonts and colours that it uses at the top, a bit like a style
sheet. So whereas in HTML you have <font face = "Verdana">, in RTF
you'll have \f0, where "f0" is the identifier for "Verdana" in the
font table.
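As a rough illustration of the font lookup, the Python sketch below reads a simple font table into a dictionary and then turns a font number into a <font> tag. The names parse_font_table and font_tag are hypothetical, and the sample header is one I typed by hand, not the output of any particular program. It assumes each entry looks like {\f0\fswiss Verdana;} and will miss entries that carry nested groups.

```python
import html
import re

# Matches one simple font-table entry such as {\f0\fswiss Verdana;}
FONT_ENTRY = re.compile(r"\{\\f(\d+)(?:\\[a-z]+-?\d*)* ?([^;{}\\]+);\}")

def parse_font_table(header):
    """Map font numbers to font names, e.g. {0: 'Verdana'}."""
    return {int(num): name.strip() for num, name in FONT_ENTRY.findall(header)}

def font_tag(font_number, fonts):
    """Return the HTML tag for \\fN, or nothing if N is not in the table."""
    name = fonts.get(font_number)
    if name is None:
        return ""
    return '<font face="%s">' % html.escape(name, quote=True)

header = r"{\fonttbl{\f0\fswiss Verdana;}{\f1\froman\fcharset0 Times New Roman;}}"
fonts = parse_font_table(header)
print(font_tag(0, fonts))
```

The colour table can be handled the same way: read it once into a lookup, then resolve each reference as you meet it in the body.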
You can download the RTF spec, along with some info about creating an
RTF reader here:
http://www.biblioscape.com/rtf15_spec.htm
A good RTF reader will politely ignore anything it doesn't understand.
This is a Good Thing, as it will enable you to get a basic converter
up and running quite quickly, and there is a *lot* of non-standard RTF
out there - in fact almost anything created in MS Word and saved as
RTF will contain a whole load of markup that isn't standard.
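A sketch of what "politely ignore" could look like in Python follows. The KNOWN table and the convert function are hypothetical names. Control words with no entry in the table produce no output, and a group that starts with \* (which the spec uses to mark destinations a reader may skip) is dropped whole by tracking brace depth. Header groups that lack the \* marker, such as the font table, are not handled here and would need their own code.

```python
import re

# Hypothetical table of the few control words this reader understands.
KNOWN = {"b": "<B>", "b0": "</B>", "par": "<BR>"}

# \* marker, control word, brace, or a run of text (raw line breaks dropped).
TOKEN = re.compile(r"\\\*|\\([a-z]+)(-?\d+)? ?|([{}])|([^\\{}\r\n]+)")

def convert(rtf):
    out = []
    depth = 0
    skip_depth = None  # depth of the ignorable group currently being skipped
    for m in TOKEN.finditer(rtf):
        word, param, brace, text = m.groups()
        if brace == "{":
            depth += 1
        elif brace == "}":
            if skip_depth is not None and depth == skip_depth:
                skip_depth = None      # the ignorable group has ended
            depth -= 1
        elif skip_depth is not None:
            continue                   # inside an ignorable group
        elif m.group(0) == "\\*":
            skip_depth = depth         # skip the rest of this group
        elif word is not None:
            out.append(KNOWN.get(word + (param or ""), ""))  # unknown -> ""
        elif text is not None:
            out.append(text)
    return "".join(out)

print(convert(r"{\rtf1{\*\generator Riched20;}\b Hello\b0\madeup7  there}"))
```

Because unknown markup falls through to an empty string, you can grow the KNOWN table one control word at a time and always have something that runs.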
To get started, I recommend using WordPad to create a very basic RTF
file to work on - just a couple of sentences with different styles is
enough.
Anyway, have fun.
--
jc