[tpop3d-discuss] tpop3d, maildir and postfix (Was: UIDL shows messages with same ID?)

Sun, 4 Nov 2001 10:04:02 +0000

[ Cc:ed to list as other users may wish to be aware of
  this. ]

On Sun, Nov 04, 2001 at 04:58:49AM +0100, Henrik Larsson wrote:

> I use postfix, which delivers to Maildirs.
> 
> But when i authenticate up against tpop3d sometimes the server gives me 
> several messages with same UIDL:
> +OK Welcome aboard! You have 3 messages.
> stat
> +OK 3 2130
> UIDL
> +OK ID list follows:
> 1 313030343834393135362e3134333630
> 2 313030343834393135362e3134333630
> 3 313030343834393135382e3134333633
> 
> The problem is that the mail-client only takes unique messages, so it will 
> only take 2 this time.
> Then i can login again with the client, and it will get the last message 
> this time.
> 
> Is this an error in tpop3d or postfix?

I'm not certain. Could you tell me what the filenames of
the messages are within the maildir?

For BSD mailspools, tpop3d uses the MD5 digest of the
first part of the message. For maildir, it uses the first
16 characters of the filename, encoded in hex. Obviously
this means that messages whose filenames differ only in
the 17th and later characters end up with the same UID.
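
To make the truncation concrete, here is a minimal sketch of
the maildir scheme as described above. It is not the tpop3d
source: the function name uid_from_prefix, the UID_BYTES
constant and the zero-padding of short names are assumptions
made for illustration.

```c
#include <stdio.h>
#include <string.h>

#define UID_BYTES 16   /* number of filename characters the UID keeps */

/* Hex-encode the first UID_BYTES characters of a maildir filename.
 * `out` must have room for 2 * UID_BYTES + 1 bytes. Names shorter than
 * UID_BYTES are zero-padded; that detail is an assumption of this sketch. */
void uid_from_prefix(const char *name, char out[2 * UID_BYTES + 1])
{
    unsigned char raw[UID_BYTES];
    size_t n = strlen(name);
    int i;

    memset(raw, 0, sizeof raw);
    memcpy(raw, name, n < UID_BYTES ? n : UID_BYTES);

    /* Two hex digits per byte; sprintf terminates the string each time. */
    for (i = 0; i < UID_BYTES; ++i)
        sprintf(out + 2 * i, "%02x", (unsigned int) raw[i]);
}
```

Called on two hypothetical names that differ only from the
17th character on, 1004849156.14360_0.mail.example.com and
1004849156.14360_1.mail.example.com, it yields
313030343834393135362e3134333630 both times, which is the
UID reported for messages 1 and 2 in the session quoted
above.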

A brief look at the Postfix source suggests that it uses
names of the form

    <time> . <PID> _ <count> . <hostname>

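<time> is now 10 digits, so together with the `.' it uses
up 11 of the 16 characters, and only the first five
characters of <PID> (and none of <count>) reach the UID.
<PID> will vary and ought to change for each message, but
I can imagine multiple messages being delivered during the
same second by PIDs 100000, 100001, ... (or possibly by
one PID with different <count> values) ending up with the
same ID.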

So: Postfix is not doing anything wrong, though it's
hardly being very helpful. The bug is in tpop3d.

I'm not sure what the most portable fix is that would
also preserve your old message IDs. If you do not care
about that (i.e. don't think your users will complain too
much when they receive new copies of old messages), then
you could swap, on or around line 48 of maildir.c, the
lines

    /* XXX this could break uniqueness, though in practice this is unlikely. */
    strncpy(m->hash, filename+4, sizeof(m->hash));    /* +4: skip cur/ or new/ subdir */

for

    /* XXX not tested but should work. */
    for (i = 4; filename[i]; ++i) m->hash[(i - 4) & 0xf] += filename[i];

Also add `int i;' at the top of the function. The above
has the property that messages in files which have names
of 16 characters or fewer have their UIDs preserved
(assuming m->hash starts out zeroed), which doesn't help
much here.
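
As a standalone illustration of the folding loop, the sketch
below applies the same idea to a bare filename (the part
after cur/ or new/). The name fold_name and the HASH_BYTES
constant are hypothetical. The buffer is zeroed explicitly
here because this post does not say whether m->hash is
already zero at that point in maildir.c.

```c
#include <string.h>

#define HASH_BYTES 16   /* size of the UID buffer, as in the sketch above */

/* Fold every character of `name` into a HASH_BYTES-byte buffer by adding
 * character i into slot (i mod 16). Characters beyond the 16th therefore
 * still influence the result instead of being dropped. */
void fold_name(const char *name, unsigned char hash[HASH_BYTES])
{
    size_t i;

    memset(hash, 0, HASH_BYTES);   /* start from zero so short names map to themselves */
    for (i = 0; name[i]; ++i)
        hash[i & 0xf] = (unsigned char) (hash[i & 0xf] + (unsigned char) name[i]);
}
```

A name of 16 characters or fewer comes out as its own
zero-padded bytes, and two names that differ in a single
later character now give different buffers. It is still
only an additive fold, though: swapping two characters
that sit exactly 16 positions apart leaves the sums
unchanged, so collisions remain possible.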

If preserving existing UIDs is important, a more
sophisticated strategy is required. Obviously all but the
first of the messages which share an old UID need to have
a UID obtained by a better hash function. Clearly this
needs to be fixed, and I will do a release to fix it
reasonably shortly (not instantly as I'm rather busy at
the moment).

It would probably be more sensible to form an MD5 digest
of the filename, since that's more likely to be unique
than the above.

Apologies for this SNAFU.

-- 
 I'd like to see anyone-- prophet, king or god--
 who could get a thousand cats to agree on anything