Listserv to mbox converter (ls2mail)

From: David D. Kilzer <ddkilzer_at_ti.com_at_hypermail-project.org>
Date: Wed, 03 Mar 1999 15:28:45 -0600
Message-Id: <199903032128.PAA26065_at_elbonia.rsc.raytheon.com>


Below please find a Perl script I helped to write for archiving the ADV-HTML mailing list with Hypermail. I originally wrote the script for Patrick Douglas Crispen <crispen_at_netsquirrel.com>, but he said it would be okay to include the script with Hypermail (if so desired).

This script appears to function similarly to the n2folder Perl script that Peter Murray <pem_at_po.cwru.edu> recently sent, and they may even operate on the same type of listserv archive.

One unique feature that ls2mail may have is that it generates a "Message-ID" for each message when converting the listserv archive to an mbox archive. While this won't help when linking message threads, it did help Hypermail (possibly required by the old 1.x versions?) to create its output.
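As a rough sketch of the Message-ID step in isolation: the helper name make_message_id and the sample date and address below are made up for illustration and are not part of ls2mail itself. The ID is simply the pieces of the "Date:" value and the sender address joined with dots, upper-cased, and stripped of anything other than letters, digits, dots, and "@".

```perl
use strict;
use warnings;

# Hypothetical helper (not in ls2mail): build an ID from the
# whitespace-separated pieces of a "Date:" value plus an address.
sub make_message_id {
    my ($date_value, $from_address) = @_;

    my @date = split ' ', $date_value;                # split date into fields
    my $id   = uc(join('.', @date, $from_address));  # join with dots, upper-case
    $id =~ tr/A-Z0-9.@//cd;    # delete everything except A-Z, 0-9, "." and "@"

    return $id;
}

# Sample values are invented for illustration only.
print make_message_id('Tue, 24 Mar 1998 10:15:00 -0600',
                      'someone@example.com'), "\n";
# prints TUE.24.MAR.1998.101500.0600.SOMEONE@EXAMPLE.COM
```

Two messages from the same address with the same "Date:" value would get the same ID, so this is only as unique as the date/sender pair.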

Dave

#!/usr/local/bin/perl
#
# ls2mail -- converts listserv formatted digests to UNIX "mail" format
#
# Usage:
# ./ls2mail < infile > outfile
# cat infile | ./ls2mail > outfile
#
# Written by David Kilzer <ddkilzer_at_ti.com>
# Tue, Mar 24, 1998
#

use strict;

my $first_time = 1;	# marks first time through script
my $line;		# stores one input line
my $header;		# stores mail message header lines
my $from_address;	# stores "From:" address
my @date;		# stores "Date:" information
my $message_id;		# stores new message ID info


while ($line = <>)	# use '<>' operator so we act like a UNIX filter
{
  chomp ($line);	# remove extra newlines

  if ($line !~ m/^={73}$/)	# Separator line?
  {
    print $line, "\n";		# Not separator, just print
  }
  else				# Found separator line, process
  {
    $header = "";	# clear variable
    $from_address = "";	# clear variable
    @date = ();		# clear variable
    $message_id = "";	# clear variable

    # Read in email header lines

    while ($line = <>)
    {

      last if ($line =~ m/^\s*$/);  # message header ends with "blank" line
      $header .= $line;		# add $line to $header
    }

    $header =~ s/^Sender:\s/To: /mi; # change "Sender:" to "To:"

    $header =~ s/^([^\s:]+:)\s+/$1 /mg; # remove extra space from all lines

    $header =~ s/\n\s+/\n /mg; # collapse continued-line indent (8 spaces) to 1 space

    # Find "From" address to use
    if ($header =~ m/^Reply-To:\s.*<([^>]+)>/mi)     {
      $from_address = $1;
    }
    elsif ($header =~ m/^Reply-To:\s.*\n\s.*<([^>]+)>/mi)     {
      $from_address = $1;
    }

    if ($header =~ m/^Date:\s(.*)$/mi)	# find "Date" header
    {
      @date = split (' ', $1);		# split date into an array
    }					# (no match: $1 would be stale, so skip)
    $date[0] =~ tr/,//d;		# remove commas from first date element
    $date[1] = " " . $date[1] if (length($date[1]) == 1);
    					# add space to single days

    $message_id = uc (join ('.', @date, $from_address)); # create message ID
    $message_id =~ tr/A-Z0-9.@//cd;	# remove bad characters (tr takes no [] classes)

    # Print new UNIX mail header
    print "\n" if (! $first_time);
    $first_time &&= 0;
    print "From $from_address $date[0] $date[2] $date[1] $date[4] $date[3]\n";
    print "Message-Id: <$message_id>\n";
    print $header, "\n";
  }
}

exit 0;

__END__

Received on Wed 03 Mar 1999 11:34:17 PM GMT
