[tpop3d-discuss] Memory leak?

Marc Lewis marc at blarg.net
Wed, 8 May 2002 13:07:34 -0700


On Wed, May 08, 2002 at 06:38:07PM +0100, Chris Lightfoot wrote:
> On Wed, May 08, 2002 at 10:12:52AM -0700, Marc Lewis wrote:
> > On Wed, May 08, 2002 at 11:40:56AM +0100, Chris Lightfoot wrote:
>     [...]
> > > I've attached a trivial test of the PAM code, pamtest.c.
> > > Could you replace `user' and `password' at the end with a
> > > valid username and password, compile it with
> > >     cc -o pamtest -lpam -ldl pamtest.c
> > > and see whether it leaks memory when you run it?
> > 
> > Yes, it does.  When first started:
> > 
> > # ps auxw | grep pamtest
> > root     27063  8.5  0.1  4412 1904 pts/1    R    09:28   0:00 ./pamtest
> > 
> > After about a minute of running:
> > 
> > # ps auxw | grep pamtest
> > root     27810 12.5  0.7  9748 7236 pts/1    S    09:28   0:07 ./pamtest
> > 
> 
> OK. This is, I guess, about the same rate of growth that
> the tpop3d process is seeing.

Roughly, yes.  The tpop3d process is on a live server, so users are hitting
it at anywhere between 30 and 200 times per minute depending on the system
load.

> > I may try configuring it so it uses auth-ldap and bypasses PAM, but that
> > doesn't seem like a good long term fix since we're using PAM to keep things
> > uniform.
> 
> Agree. (But see my notes about PAM below.) I am fairly
> sure that the code I have sent you is a correct PAM
> program in the sense that it ought not leak memory. I
> don't have a recent machine to hand with which to test
> this, and all the older PAM implementations leak memory
> even in simple modules like pam_unix.

I've been running some tests using the auth-ldap module, and it appears
that the leak is still there.

When first starting up the daemon:

root     10829  0.0  0.1  4788 1900 pts/1    S    12:18   0:00 /usr/src/tpop3d-1.4.1/tpop3d -f /etc/tpop3d-test.conf -d -v

After about a hundred or so mail checks (on a different port):

root     10829  0.0  0.1  4836 1948 pts/1    S    12:18   0:00 /usr/src/tpop3d-1.4.1/tpop3d -f /etc/tpop3d-test.conf -d -v

Looks like it has grown by about 48K.  So, it appears to be a small leak,
but it is there nonetheless.
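
For anyone who wants to take the same measurement from inside a test program
rather than by re-running ps, here is a minimal sketch in C.  It assumes a
Linux /proc filesystem; rss_kib(), rss_growth_kib() and leaky_op() are
hypothetical names invented for this example, and leaky_op() is only a
stand-in for whatever call is under suspicion (an authentication attempt, for
instance).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Keeps leaked pointers observable so the compiler cannot drop the leak. */
static void *volatile leak_sink;

/* Resident set size of this process in KiB (the RSS column that ps shows),
 * or -1 if /proc/self/statm cannot be read.  Linux-specific. */
long rss_kib(void)
{
    long size_pages, rss_pages;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;
    int fields = fscanf(f, "%ld %ld", &size_pages, &rss_pages);
    fclose(f);
    if (fields != 2)
        return -1;
    return rss_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

/* Call op() `iterations` times and store the RSS change in *growth.
 * Returns 0 on success, -1 if RSS could not be read. */
int rss_growth_kib(void (*op)(void), int iterations, long *growth)
{
    long before = rss_kib();
    if (before < 0)
        return -1;
    for (int i = 0; i < iterations; i++)
        op();
    long after = rss_kib();
    if (after < 0)
        return -1;
    *growth = after - before;
    return 0;
}

/* Hypothetical stand-in for the suspect call: leaks 64 KiB per invocation. */
void leaky_op(void)
{
    char *p = malloc(64 * 1024);
    if (p != NULL) {
        memset(p, 1, 64 * 1024);  /* touch the pages so they become resident */
        leak_sink = p;
    }
}
```

Calling rss_growth_kib(leaky_op, 100, &growth) should report growth of several
megabytes, whereas a call that frees everything it allocates should stay close
to zero once the first few iterations have warmed up the allocator.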

I know this is hackish sounding, but what is the motivation for doing all
of the auth and such before forking?  When I've written simple daemons and
such in the past, I use the parent process to only accept the connection and
then fork and let the child process do the authentication.  That way if I
missed something, when the child dies, any memory that it allocates is
freed without affecting the parent process.  The other bonus is that if the
child process dies due to a programming error or anything else, the parent
will keep running and the services stay going.
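
To make the idea concrete, here is a minimal sketch in C of authentication
running in a forked child.  It is not how tpop3d is structured; it only
illustrates the isolation argument.  authenticate() is a hypothetical
stand-in for the real PAM or LDAP check and leaks on purpose, and the
"alice"/"secret" credentials are made up.  In a real daemon the fork would
come right after accept() and the child would carry the whole session; the
sketch collapses that to a single call.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real auth check; it leaks on purpose. */
static int authenticate(const char *user, const char *pass)
{
    char *scratch = malloc(4096);  /* never freed: simulates a leaky module */
    if (scratch == NULL)
        return 0;
    snprintf(scratch, 4096, "%s:%s", user, pass);
    return strcmp(scratch, "alice:secret") == 0;
}

/* Run authenticate() in a child process.
 * Returns 1 on success, 0 on failed login, -1 if fork/wait failed. */
int authenticate_in_child(const char *user, const char *pass)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: whatever is leaked here is reclaimed when the child exits. */
        _exit(authenticate(user, pass) ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    /* A child killed by a signal (a crash) is treated as a failed login;
     * the parent itself carries on either way. */
    if (!WIFEXITED(status))
        return 0;
    return WEXITSTATUS(status) == 0;
}
```

The parent's memory footprint is unaffected by the 4096 bytes leaked per
attempt, because the leak happens in the child's copy of the address space.
The cost is one fork() per login, which matters at a few hundred logins per
minute far less than it would at thousands per second.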

I will probably end up hacking in a new option to tpop3d to make it not
detach from the controlling TTY and not send out debugging information to
stderr.  This is how we ran our old pop3 server (cucipop) before switching
to Maildir format.  Dropping it into inittab made sure that even if the
parent died, it would be restarted.

> > > Also, can you tell me more about the crashes you've
> > > experienced -- in particular, is there any useful
> > > information in the logs?
> > 
> > Nothing.  Things just stop.
> 
> Hmm. That could just be an out-of-memory issue, I suppose.
> Or possibly something timing out connecting to the LDAP
> server?

We hadn't seen any timeouts like this when we were using Courier POP, but
it often corrupted attachments over 500K (the first reason we switched to
tpop3d).  We also have a lot of IMAP usage for our Webmail services and
haven't seen any timeouts at all (big LDAP in-memory caches are set up).

> > Also, I don't know if anyone else has seen
> > this, but it is a bit bizarre.  After first starting up tpop3d, it logs
> > things to /var/log/maillog as one would expect.  After running for a while,
> > though, suddenly it will start logging to /var/log/messages and the mail log
> > entries will stop.  It is very, very strange and the only application that
> > shifts from one log to another.  It could be a symptom of the other
> > problem, but I thought I would mention it anyway.
> 
> Yep-- I've seen this before. It's a bug in one of the PAM
> authentication modules, which is evidently calling
> openlog(3) and changing which log file the thing is
> writing to. Sigh. Try the following patch:
[snip]
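
This is not the patch snipped above, only a sketch in C of the general
workaround for a library that calls openlog(3) behind the caller's back:
reassert your own syslog settings after every call into it.  The wrapper name,
the "tpop3d" ident, the LOG_PID | LOG_NDELAY options and the LOG_MAIL facility
are assumptions made for illustration (LOG_MAIL is a guess based on the
entries normally landing in /var/log/maillog), and misbehaving_auth() is a
hypothetical stand-in for a module with this bug.

```c
#include <stddef.h>
#include <syslog.h>

/* Hypothetical stand-in for an auth module that hijacks the syslog settings. */
int misbehaving_auth(void *arg)
{
    (void)arg;
    openlog("pam_example", 0, LOG_AUTH);  /* changes ident and facility */
    syslog(LOG_INFO, "authentication attempted");
    return 1;                             /* pretend the login succeeded */
}

/* Call auth_call(arg), then put the process's own syslog settings back.
 * The ident, options and facility used here are assumptions for this sketch. */
int with_syslog_restored(int (*auth_call)(void *), void *arg)
{
    int result = auth_call(arg);

    /* The callee may have called openlog() or closelog(); reset both. */
    closelog();
    openlog("tpop3d", LOG_PID | LOG_NDELAY, LOG_MAIL);

    return result;
}
```

After with_syslog_restored(misbehaving_auth, NULL) returns, subsequent
syslog() calls go out under the mail facility again rather than whatever the
module selected.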

Thanks, I'll give the patch a try and see if it makes a difference.  If it
is a PAM bug, this may be a moot patch anyway if we end up going with
straight LDAP authentication.

> (You probably don't all want to hear me rant about PAM.
> Let me just say that, for a security-critical component of
> the operating system, it is astonishingly shoddily
> implemented. Coupled with a lousy design and poor
> documentation, I'm surprised that it works at all.)

Actually, I feel similarly about PAM; this was our first heavy usage of it
beyond what RedHat provides in their stock distributions.  But, because of
the way we are rolling out new servers, PAM + LDAP is a good way for us to
keep a single configuration and secure authentication mechanism in place
for a whole network of servers that all need to have the same user
information for POP, IMAP, Apache userdir and auth, SASL, FTP, SSH, etc.

Thanks.

 - Marc

-- 
Marc Lewis
Network Administrator
Blarg! Online Services, Inc.
http://www.blarg.net/~marc