When I started using a Power Mac G4 running Mac OS X as my desktop machine a year ago, I decided to use NIS and NFS on it. That would let me share data easily and keep my personal files on my server, which uses RAID and gets backed up.
Getting NIS and NFS to work wasn’t very difficult using Marcel Bresink’s excellent instructions. The first problem I encountered was poor NFS performance: about 2MB/sec over Gigabit Ethernet. Following the advice of a fellow NetBSD developer, I tried using NFS over UDP instead of TCP. While this is usually slower and less reliable, it fixed the problem in this case. Reading a large file via NFS now runs at 30MB/sec. The only remaining problem was that I could occasionally not log in after booting up the machine. This happened about once a week, and restarting the machine via the login window usually fixed it.
Unfortunately the problem got a lot worse when I upgraded the hardware to a Power Mac G5. I wasn’t able to log in after one out of three (re)boots. On at least one occasion it took half a dozen reboots before I could finally use the machine. I also experienced a new problem where my account would work but the home directory couldn’t be mounted. This error required logging in as a local user and removing the bogus home directory, which got created because NFS didn’t work. Otherwise the automounter would not mount my home directory even if NIS worked fine.
The situation became unbearable and I began to analyze the problem. I tried modifying the NIS startup script, with little success. After a while I realized that lookupd was causing the problems with NIS. It sometimes failed to talk to the NIS server for no apparent reason. The result was that either the NIS accounts were not available, or the automounter couldn’t load the NIS mount map and the home directories weren’t accessible. I finally figured out the sequence of steps to get my Mac working when it was in that dodgy state:
- Log in using a local account.
- Open a Terminal window and use sudo zsh to get system administrator privileges.
- Force a restart of lookupd with killall lookupd.
- Wait a moment and tell the automounter to reload its configuration via killall -HUP automount.
Of course I became tired of doing that manually and finally wrote a shell script which does the job automatically. The script gets started from /etc/rc.local like this:
nohup /usr/local/sbin/fix-nis 25 >/tmp/fix-nis.log 2>&1 &
That brute force approach fixed the problem. If I can’t log in after booting the machine, I just wait a few seconds until the script teaches lookupd a lesson, and then I can finally log in and access my home directory.
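For readers who want to try something similar, here is a minimal sketch of what such a helper could look like. It is not my actual fix-nis script: the function names, the DRY_RUN switch and the second "settle" argument are assumptions made for illustration. It simply waits for the given delay, restarts lookupd, pauses briefly and then sends automount a HUP, which are the manual steps listed above.

```shell
#!/bin/sh
# Hypothetical sketch of a fix-nis style helper, NOT the original script.
# DRY_RUN defaults to 1, so the commands are only printed; set DRY_RUN=0
# (and run as root) to really send the signals.
DRY_RUN=${DRY_RUN:-1}

# Run a command, or just print it when in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# fix_nis [DELAY] [SETTLE]
#   DELAY  - seconds to wait after boot before intervening (default 25)
#   SETTLE - assumed pause between the two steps (default 5)
fix_nis() {
    delay=${1:-25}
    settle=${2:-5}

    run sleep "$delay"
    run killall lookupd          # force a restart of lookupd
    run sleep "$settle"
    run killall -HUP automount   # make the automounter reload its configuration
}
```

A standalone script built from this sketch would end with a line such as fix_nis "$@" so that the delay can be passed on the command line, as in the rc.local entry above.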
I nevertheless wanted to know what causes those problems and posted an article in a German newsgroup about Mac OS X networking. In the resulting discussion somebody pointed out that Marcel Bresink had added a section about Mac OS X Tiger related NIS bugs to his instructions. It seems that Apple introduced quite a lot of bugs with the integration of launchd into Mac OS X Tiger. I remember that Solaris 10 on my company laptop at a previous job had similar problems, because Sun had also introduced a parallelised system startup with that operating system release.
So the good news is that my NIS setup at home isn’t broken. But the bad news is that there is no better solution than my brute force shell script. Let’s hope that Apple fixes these problems in Mac OS X Leopard.
When I told Stephen Borrill about the NFS over TCP performance problem during last Friday’s NetBSD meeting in Cambridge he told me that similar performance problems occur with (NetBSD) Samba servers. Tuning the TCP stack under Mac OS X by setting the kernel parameter “net.inet.tcp.delayed_ack” to zero fixes this problem.
After making that change on my Power Mac G5 the performance of NFS over TCP went up to 25MB/sec.
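The parameter can be changed at runtime with sysctl, but that setting is lost on reboot. The sketch below shows one possible way to make it stick. The helper name persist_sysctl is made up for this example, and it assumes that your Mac OS X release reads /etc/sysctl.conf at boot, which you should verify before relying on it.

```shell
#!/bin/sh
# Change the value immediately (needs root, lost on reboot):
#   sysctl -w net.inet.tcp.delayed_ack=0

# persist_sysctl KEY VALUE [FILE]
# Hypothetical helper: records KEY=VALUE in FILE exactly once.
# FILE defaults to /etc/sysctl.conf (assumed to be read at boot).
persist_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}

    touch "$file" || return 1
    # Keep every line whose key differs, then append the new setting.
    awk -F= -v k="$key" '$1 != k' "$file" > "$file.tmp" || return 1
    echo "${key}=${value}" >> "$file.tmp"
    mv "$file.tmp" "$file"
}
```

Calling persist_sysctl net.inet.tcp.delayed_ack 0 as root would then record the setting. Running it a second time does not add a duplicate line.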
You should probably do The Right Thing and create a launchd rule for your script instead of rc.local! At least until Apple come up with ANOTHER reinvention of service management.
I’m using NIS and NFS and I have the same problem. My login window kept on freezing.. a lot.. I tried not to let my Mac go into sleep mode, and that helped, but now I’ve got another problem. Sometimes the login window just shakes when I try to log in, as if I had entered a wrong password, but I’m 100% sure that it’s the right password (sometimes both the local admin account and the “Other” account freeze and sometimes just one of them does). I have also installed a program called unlockudp.
I took a look in the system log and this message is printed (both in lookupd and DirectoryService): NetInfo connection failed for server 127.0.0.1/local
That sounds exactly like the problem that my script fixes for me. Could you perhaps give it a try?
Yes I will. But I can’t login as admin atm 😛
Well, for better or for worse this will not be a problem in OS X Leopard. Apple has removed lookupd entirely from 10.5.
I have 20 machines I run NIS/NFS on and was hopeful that this script would do the trick….Alas, no. I run it exactly as stated, via /etc/rc.local with the script running out of /usr/local/sbin/fix-nis.
1/3 of the systems still freeze at the login screen
Please try increasing the delay (the “25” argument) and see whether that helps. I’m not claiming the script is perfect. But it is able to fix the problem most of the time on my Mac.
Please know how appreciative I am of your script! It is the closest thing I have found to a solution. I have increased from 25 to 45.
I have found a solution in the meantime: using LDAP instead of NIS.
I have written a script (which I will publish after some cleanup) that automatically replicates the relevant NIS maps into the database of an OpenLDAP server. Since then my Mac has been using LDAP as its directory service, and it works like a charm.