A quick glance over the code shows that it contains a couple of tables that
also appear in the developer's kit. These tables could have been
reverse-engineered by examining clients and the data stream, so it's possible
that the kit was not used; still, it remains a possibility that it was.
It does _not_ include the full text of the copyright messages listed in
the developer's kit, only a source. The copyright code is ignored, which
may make it technically illegal to run this program; if you are going to
run it, be aware that you are stripping copyright notices from messages
that contain them.
The absence of the copyright messages may imply laziness, or a desire not to
have copyright messages, or it may imply that the author did not have them
available (in turn implying that he did not have a developer's kit).
>(3) It spits out rather nicely formatted netnews articles, but it doesn't
> categorize them as deeply as the X*Press PC software that I have.
> For example, all Weather articles are put into "xpress.weather"
> and not "xpress.weather.<locale>".
This actually looks like a design decision. If you give the -w command-line
switch to put it into weather mode, it _does_ decode the locale, and puts
the report into weather/<locale>. If you don't give the -w option, it
puts an "X-Weather:" header line in the article.
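The two modes described above amount to something like the following sketch.
The function and variable names here are mine, not the program's; this is
only an illustration of the routing, reconstructed from reading the code.

```c
/* Sketch of the -w behavior: in weather mode the report is appended to a
 * per-locale file; otherwise it is emitted as a news article carrying an
 * X-Weather: header.  Names are hypothetical, not the program's own. */
#include <stdio.h>

void emit_weather(FILE *out, const char *locale, const char *report,
                  int weather_mode)
{
    if (weather_mode) {
        /* -w given: decode the locale and file the report under it. */
        char path[256];
        snprintf(path, sizeof path, "weather/%s", locale);
        FILE *fp = fopen(path, "a");
        if (fp != NULL) {
            fprintf(fp, "%s\n", report);
            fclose(fp);
        }
    } else {
        /* No -w: tag the article with an X-Weather: header instead. */
        fprintf(out, "X-Weather: %s\n\n%s\n", locale, report);
    }
}
```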
>(6) I haven't quite figured out *what* it does with the stock quotes.
> They seem to be getting recognized, but I am not sure where they
> are going just yet.
They go to the file "stock/<code>" where <code> is the stock name, iff that
file already exists. Or, if you give it the "-s" flag, they all go to the
file "stock_data" in the current directory.
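That "iff the file already exists" check is the interesting part: you opt a
stock in by creating its file ahead of time. A sketch of the logic, with
hypothetical names of my own choosing (the real code may differ):

```c
/* Sketch of the stock-quote routing: append to "stock/<code>" only if
 * that file already exists, or to "stock_data" when the -s flag was
 * given.  Returns 1 if written, 0 if dropped, -1 on error. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int route_quote(const char *code, const char *quote, int all_stocks)
{
    char path[256];
    FILE *fp;

    if (all_stocks) {
        strcpy(path, "stock_data");        /* -s: everything in one file */
    } else {
        snprintf(path, sizeof path, "stock/%s", code);
        if (access(path, F_OK) != 0)
            return 0;                      /* no such file: quote dropped */
    }
    fp = fopen(path, "a");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%s\n", quote);
    fclose(fp);
    return 1;
}
```

So `touch stock/IBM` (or whatever code you care about) before a run, and
only those quotes get filed.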
>(7) Beware of your disk space limitations before you run this all night.
> A 20 minute run produced roughly 10 articles per minute at about
> 1.5KB per article. Of course, duplicates should get rejected
> somewhere along the way and so one wouldn't expect that rate
> all day long, but still...
I'm not sure that his duplicate-rejection scheme, which uses the timestamp
from the article header as the Message-ID:, is sufficient; distinct articles
could carry the same timestamp. I don't think I can say much more right now.
My first optimization would be, if you're running C News, to popen() relaynews
directly, instead of writing to the "news" file, and to have time_stamp()
pclose() and popen() it again. That should get you significantly better
xpress->news time, and it will only write the data to disk at its final
destination, instead of keeping a temporary copy on disk. Plus, there's no
need to run the other script periodically.
(You probably want "relaynews -r" so that the relaynews output goes to the
right place. Or popen("relaynews > xnewslog 2> xnewserr","w"), or something.)
Disclaimer: All of my comments are from examining the code. I do not have
an xpress feed available to me at my current job, so I cannot actually test
any of this.
-- Bill Fenner fenner@jazz.psu.edu ..psuvax1!hogbbs!wcfpc!wcf wcf@hogbbs.scol.pa.us (+1 814 238-9633 v.32bis)