How to convert rtf format to html format

I know that HTML Help Workshop can convert an RTF file to an HTML file and can
extract the images from the RTF file. How can I program the conversion from
RTF to HTML myself? Thank you!
 

Re:How to convert rtf format to html format


On Fri, 10 Jan 2003 02:57:41 +0800, "Jianfeng Zou" <zjf1...@21cn.com>
wrote:

Quote
>I know the html help workshop can convert rtf file to html file and can get
>the image from rtf file. How I programe to convert rtf format to html
>format? Thank you!

This is an interesting exercise (I hope it is an exercise, as you
can download a converter without having to write one).

Basically you need to write your own RTF interpreter. This is a matter
of opening the RTF file as if it were plain text, parsing the RTF
markup from the text, and using some sort of lookup mechanism to
convert this to HTML.

So
  \b Hello, World!\b0
would become
  <B>Hello, World!</B>
(the space after \b just ends the control word; it is not part of the
text).
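To make the "parse the markup, then look it up" idea concrete, here is a rough sketch in Python (not Delphi, and nowhere near a full RTF reader). The names TOKEN, TAGS and rtf_to_html are made up for this illustration, and it only knows bold and italic:

```python
import html
import re

# One token per match: a control word (lowercase letters, optional numeric
# parameter, optional single space delimiter), an escaped literal, a group
# brace, or a run of plain text.
TOKEN = re.compile(r"\\([a-z]+)(-?\d+)? ?|\\([\\{}])|([{}])|([^\\{}]+)")

# Hypothetical lookup table: RTF control word -> HTML tag name.
TAGS = {"b": "b", "i": "i"}


def rtf_to_html(rtf: str) -> str:
    out = []
    for match in TOKEN.finditer(rtf):
        word, param, escaped, brace, text = match.groups()
        if word is not None:
            tag = TAGS.get(word)
            if tag is not None:
                # A parameter of 0 switches the style off, anything else on.
                out.append(f"</{tag}>" if param == "0" else f"<{tag}>")
        elif escaped is not None:
            out.append(html.escape(escaped))
        elif text is not None:
            out.append(html.escape(text))
        # Group braces and unknown control words are skipped in this sketch.
    return "".join(out)


print(rtf_to_html(r"\b Hello, World!\b0"))
```

Run on the example above, it prints <b>Hello, World!</b> (lowercase tags). A real converter would also have to track group scope, since a closing brace ends any formatting that was switched on inside the group.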

It's not always a straight one-to-one conversion, as an RTF file lists
all the fonts and colours that it uses at the top, a bit like a style
sheet. So whereas in HTML you have <font face = "Verdana">, in RTF
you'll have \f0, where "f0" is the identifier for "Verdana" in the
font table.
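A small Python sketch of that font lookup, assuming font table entries of the simple form WordPad tends to write, such as {\f0\fswiss Verdana;}. The helper names parse_font_table and font_tag are invented for this example, and the regular expression will not cope with every font table found in the wild:

```python
import html
import re

# Matches one font table entry: "{\fN", any further control words, a space,
# the font name, then ";}". Assumes the name follows the first space.
FONT_ENTRY = re.compile(r"\{\\f(\d+)[^{};]*? ([^;{}]+);\}")


def parse_font_table(rtf: str) -> dict[int, str]:
    """Map font numbers to font names, e.g. {0: 'Verdana'}."""
    return {int(num): name.strip() for num, name in FONT_ENTRY.findall(rtf)}


def font_tag(control_word: str, fonts: dict[int, str]) -> str:
    """Turn a control word such as 'f0' into an opening <font> tag."""
    face = fonts[int(control_word[1:])]
    return f'<font face="{html.escape(face)}">'


sample = (
    r"{\rtf1\ansi{\fonttbl{\f0\fswiss Verdana;}"
    r"{\f1\froman\fcharset0 Times New Roman;}}\f0 Hello}"
)
fonts = parse_font_table(sample)
print(fonts)
print(font_tag("f0", fonts))
```

The colour table could be handled the same way: read it once from the header, then resolve each colour index while converting the body.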

You can download the RTF spec, along with some info about creating an
RTF reader here:

  http://www.biblioscape.com/rtf15_spec.htm

A good RTF reader will politely ignore anything it doesn't understand.
This is a Good Thing, as it will enable you to get a basic converter
up and running quite quickly, and there is a *lot* of non-standard RTF
out there - in fact almost anything created in MS Word and saved as
RTF will contain a whole load of markup that isn't standard.
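One way to be that polite, sketched in Python under two assumptions: groups that start with {\* are optional destinations a reader may skip if it does not recognise them (this is how the spec marks them), and everything else unknown is a plain control word that can simply be dropped. The function names and the KNOWN set are hypothetical:

```python
import re

# Hypothetical set of control words this reader understands.
KNOWN = {"b", "i", "par"}

# Either an escaped literal (kept as-is) or a control word with an optional
# numeric parameter and optional space delimiter.
TOKEN = re.compile(r"\\[\\{}]|\\([a-z]+)(-?\d+)? ?")


def strip_ignorable_groups(rtf: str) -> str:
    """Remove every {\\* ...} group, including anything nested inside it."""
    out = []
    i = 0
    while i < len(rtf):
        if rtf.startswith("{\\*", i):
            depth = 0
            while i < len(rtf):
                ch = rtf[i]
                if ch == "\\" and i + 1 < len(rtf):
                    i += 2  # skip a backslash and the character after it
                    continue
                if ch == "{":
                    depth += 1
                elif ch == "}":
                    depth -= 1
                    if depth == 0:
                        i += 1
                        break
                i += 1
        elif rtf[i] == "\\" and i + 1 < len(rtf):
            out.append(rtf[i:i + 2])  # copy escapes and control word starts
            i += 2
        else:
            out.append(rtf[i])
            i += 1
    return "".join(out)


def drop_unknown_words(rtf: str, known: set[str]) -> str:
    """Delete control words that are not in `known`; keep everything else."""
    def keep(match: re.Match) -> str:
        word = match.group(1)
        return match.group(0) if word is None or word in known else ""
    return TOKEN.sub(keep, rtf)


sample = r"{\rtf1{\*\generator Msftedit 5.41;}\viewkind4\b Hi\b0\par}"
cleaned = drop_unknown_words(strip_ignorable_groups(sample), KNOWN)
print(cleaned)
```

What is left contains only markup the converter has a mapping for, so the set of known words can grow one control word at a time.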

To get started, I recommend using WordPad to create a very basic RTF
file to work on - just a couple of sentences with different styles is
enough.

Anyway, have fun.

--
jc

Re:How to convert rtf format to html format


On Fri, 10 Jan 2003 02:57:41 +0800, "Jianfeng Zou" <zjf1...@21cn.com>
wrote:

Quote
>I know the html help workshop can convert rtf file to html file and can get
>the image from rtf file. How I programe to convert rtf format to html
>format? Thank you!

I did this in C# by putting two viewers on a form: one Rich Text box and
one HTML viewer. I then wrote a program to load the file into the Rich Text
box, set focus to that control, send Ctrl+A and Ctrl+C, switch focus to the
HTML viewer, send Ctrl+V, and save the file.
It was messy but saved a lot of work.

--
Jeff Gaines Damerham Hampshire UK
j...@jgaines.co.uk

Re:How to convert rtf format to html format


I have the RTF spec and have found some converters. But these converters have
a problem: they can't convert Chinese or Japanese text to correct HTML. I
have found that HTML Help Workshop converts it correctly. It must depend on
the hha.dll that is included with the software. I only know the declaration
of the HHA_Compile function, and I don't know which function converts RTF to
HTML.
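One possible cause of the garbled Chinese or Japanese output, offered as a guess rather than a diagnosis: RTF writers commonly store such text as \'hh hex escapes whose bytes belong to the code page named by \ansicpg in the header, and a converter that does not decode them with that code page produces wrong HTML. A minimal Python sketch under that assumption (the function names are invented here, \uN escapes and per-font charsets are not handled, and this says nothing about how hha.dll works internally):

```python
import re

CODEPAGE = re.compile(r"\\ansicpg(\d+)")
# A run of consecutive \'hh escapes, e.g. \'c4\'e3\'ba\'c3
HEX_RUN = re.compile(r"(?:\\'[0-9a-fA-F]{2})+")


def decode_hex_escapes(rtf: str) -> str:
    """Replace runs of \\'hh escapes with the characters they encode."""
    found = CODEPAGE.search(rtf)
    # Assumption: fall back to Windows-1252 when no \ansicpg is present.
    codec = f"cp{found.group(1)}" if found else "cp1252"

    def decode(run: re.Match) -> str:
        raw = bytes.fromhex(run.group(0).replace("\\'", ""))
        return raw.decode(codec, errors="replace")

    return HEX_RUN.sub(decode, rtf)


def to_html_entities(text: str) -> str:
    """Write non-ASCII characters as numeric character references."""
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in text)


sample = r"{\rtf1\ansi\ansicpg936 \'c4\'e3\'ba\'c3}"
decoded = decode_hex_escapes(sample)
print(decoded)
print(to_html_entities(decoded))
```

A caveat: in double-byte code pages the second byte of a character can be a plain ASCII character that is written literally instead of as \'hh, so decoding runs of escapes like this is only an approximation.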

Quote
"Jeremy Collins" <jd.coll...@ntlworld.com> wrote in message
news:9m3t1vcnecbhnog6uo0gj4iho13a5l9f7j@4ax.com...
