XML indexer, experimental, patch+python script

From: Bernhard Reiter <bernhard_at_uwm.edu_at_hypermail-project.org>
Date: Tue, 30 Nov 1999 02:18:32 -0600
Message-ID: <19991130021832.A3401_at_climate2.geog.uwm.edu>


Hello hypermailers,

I know I shouldn't have done this,
but I learned about Python's XML handling in the process. This is my first attempt at mangling XML files, so bear with me.

I was always talking and thinking about how to get monthly archives done, and not only done, but with the index files created, too:

		The problem:
		------------

Hypermail, in combination with the archive scripts, creates a bunch of directories, one for each year and each month, as mails come in.

  1. How do you make a top page, linking all the scattered index files?
  2. Subproblem: If one mail comes in, do you really want to rebuild the complete index overview files? Of course not.
  3. What if I want my top index page to have the number of mails grouped by week or so? :) (Hi egroups.)
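
For item 3, grouping by week is cheap once the message dates are at hand. A small sketch, assuming the dates have already been pulled out of the archive as `datetime.date` objects; `count_by_week` is a made-up name, and ISO week numbering is just one possible choice, not something hypermail prescribes.

```python
from collections import Counter
from datetime import date


def count_by_week(dates):
    """Count messages per (ISO year, ISO week) pair."""
    counts = Counter()
    for d in dates:
        # isocalendar() yields (ISO year, ISO week number, ISO weekday).
        iso_year, iso_week, _weekday = d.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return counts
```

The resulting counter can be sorted by key to print one line per week on the top page.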

Solution: A Python script strangles the problem.

Part 1: I patched hypermail so that it creates an archive overview file
        complying with the haof.dtd in each directory it operates in.

Oh, back to Part 0:
        Wrote a DTD for the Hypermail Archive Overview Format (haof).

Part 2a: Wrote a little Python module, which creates an HTML snippet from this
	overview file and leaves it in the directory above. But only
	if the overview file exists and is newer than the snippet.
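
The "only if newer" rule in Part 2a comes down to comparing modification times. A minimal sketch, not the attached module: the name `needs_rebuild` is made up for illustration, and building when no snippet exists yet is an assumption about the intended behaviour.

```python
import os


def needs_rebuild(overview_path, snippet_path):
    """Return True if the snippet should be (re)generated from the overview file."""
    # No overview file: nothing to build from.
    if not os.path.exists(overview_path):
        return False
    # Assumption: a missing snippet counts as out of date.
    if not os.path.exists(snippet_path):
        return True
    # Rebuild only when the overview file is strictly newer than the snippet.
    return os.path.getmtime(overview_path) > os.path.getmtime(snippet_path)
```

The caller would pass the overview file inside a month directory and a snippet path in the directory above it.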

Part 2b: Wrote another Python script, which runs through a directory,
	checks each year and month, and runs the module from 2a
	for each of them.
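
The walk in Part 2b could look roughly like this. It is a sketch under assumptions: a year/month layout with purely numeric directory names, an overview file name passed in by the caller, and a `build` callback standing in for the module from 2a. None of these names come from the attached script.

```python
import os


def walk_archive(root, overview_name, build):
    """Call build(overview_path, month_dir) for each year/month directory under root.

    Returns the list of month directories that contained an overview file.
    """
    visited = []
    for year in sorted(os.listdir(root)):
        year_dir = os.path.join(root, year)
        # Assumption: year directories are named with four digits, e.g. "1999".
        if not (os.path.isdir(year_dir) and year.isdigit() and len(year) == 4):
            continue
        for month in sorted(os.listdir(year_dir)):
            month_dir = os.path.join(year_dir, month)
            # Assumption: month directories are numeric, e.g. "11".
            if not (os.path.isdir(month_dir) and month.isdigit()):
                continue
            overview_path = os.path.join(month_dir, overview_name)
            # Months without an overview file are skipped silently.
            if os.path.exists(overview_path):
                build(overview_path, month_dir)
                visited.append(month_dir)
    return visited
```

Passing the build step in as a callback keeps the walk itself free of any XML or HTML details.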

Results attached.

Left for the interested reader: Beautify the output.

Interesting research topics:
* Only the mail references are missing in the haof; otherwise threading could be done on that level.
* Well, we could write this data into a little database. http://www.dbxml.org/ ? Or Postgres or MySql-GPL?

Enjoy,

        Bernhard
PS: This contribution to hypermail shall be free software under the GPL.

Received on Tue 30 Nov 1999 03:53:45 PM GMT

This archive was generated by hypermail 2.2.0 : Thu 22 Feb 2007 07:33:52 PM GMT