[rescue] Solaris 9 NFS client with Linux server

Wed Aug 8 07:14:47 CDT 2007

On Wed, Aug 08, 2007 at 07:45:59AM -0400, Steve Sandau wrote:

> Well, after considerable experimentation (i.e. a ratio of trial and 
> error close to 1:1), it seems that I apparently finally got this to work 
> using "mount -o proto=tcp -o vers=2 mrsnorton:/raid /import" to mount 
> the share. It appears to be slightly more responsive using "proto=udp" 
> instead.
> 
> I thought I had tried forcing this to version 2 before, but that did it 
> unless something else changed between last night and this morning.
>
> Maybe someone else can make some sense of that, but I'd think there were 
> network problems or the NIC in the server was a problem if scp from this 
> box and NFS from other boxes didn't work fine.

I think you are missing the point here. UDP is little more than raw
IP: nothing retransmits a lost packet for you. scp runs over SSH on
top of TCP, so there are several layers of error checking, and if a
packet is lost or damaged it gets "fixed" without you ever noticing it.

> Now I just unmounted it, tried some other options (version 2 is what 
> makes it work) and seem to have introduced some slowness just by 
> remounting it. When it is slow, or when it pauses, ethereal shows a 
> packet labeled as "fragmented IP protocol", "RPC retransmission", "RPC 
> duplicate" packets and a single ICMP packet labeled "Time to live 
> exceeded (fragment reassembly time exceeded)".

Version 2 is a slower protocol than version 3. Since you have gone to
TCP instead of UDP, you are also adding TCP's error checking and
retransmission underneath the NFS protocol itself. I'm not really
familiar with it anymore, but isn't version 2 the one that requires
synchronous writes? (Version 3 added asynchronous writes with a
separate COMMIT.)

This is where a block is sent to the server and, when it has finished
being written, the server acks it and the client sends the next block.
On reads, this happens when the block(s) are read by the client
application. It slows things down in exchange for reliability.

Without it, a server could be receiving data from a client and
lose that data because of a connection glitch or failure.
The client would not know it and would attempt to resume
where it left off, leaving an impossible situation
where the data was no longer held in the client and never
made it to the server.

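Since you seem to have problems with your network, make sure that
the data packets are less than an MTU in size (the rsize and wsize
mount options set the NFS transfer size). It seems like UDP
packets are being lost along the way; the "fragment reassembly time
exceeded" message means some fragments of a datagram never arrived.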
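
To put rough numbers on the MTU point, here is a small sketch of how many
IP fragments one NFS-over-UDP transfer turns into, and how quickly a small
per-fragment loss rate adds up. The transfer sizes (1024, 8192, 32768) and
the 1% loss rate are assumptions picked for illustration, not values taken
from this setup, and the RPC/NFS header overhead is ignored, so real
datagrams are slightly larger than shown.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8   # bytes


def fragments_needed(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    ip_payload = udp_payload + UDP_HEADER
    return max(1, math.ceil(ip_payload / per_fragment))


def datagram_loss(fragment_loss: float, n_fragments: int) -> float:
    """Chance that at least one fragment is lost.

    Losing any single fragment means the whole datagram is discarded
    once the reassembly timer expires. Losses are assumed independent.
    """
    return 1.0 - (1.0 - fragment_loss) ** n_fragments


if __name__ == "__main__":
    # Illustrative transfer sizes only (assumed, not measured).
    for size in (1024, 8192, 32768):
        n = fragments_needed(size)
        print(size, n, round(datagram_loss(0.01, n), 3))
```

A transfer that fits in a single packet is only as unreliable as one
packet; a transfer split into many fragments is lost if any one of them
goes missing, which then shows up as an RPC retransmission.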

This looks like a network problem to me. I'd first check out
the hub/switch. Are you using one interface at 10 Mbps and the
other at 100? Do you see any lost packets, overruns, or framing
errors?
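
On the Linux server, those counters can be read from /proc/net/dev. A
minimal sketch that lists any non-zero error-type counters per interface
follows; the function names and the choice of which counters to flag are
my own, and it assumes the usual two header lines followed by sixteen
counters per interface.

```python
import os

# Column order of /proc/net/dev: eight receive counters, then eight transmit.
RX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed")
# Counters that should normally stay at zero on a healthy link (assumed list).
WATCH = ("errs", "drop", "fifo", "frame", "colls", "carrier")


def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: {"rx": {...}, "tx": {...}}}."""
    stats = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        values = [int(v) for v in counters.split()]
        if len(values) < 16:
            continue                        # unexpected layout, skip it
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:16])),
        }
    return stats


def problem_counters(stats):
    """Return only the watched counters that are non-zero, per interface."""
    problems = {}
    for name, directions in stats.items():
        bad = {
            f"{direction} {field}": value
            for direction, counters in directions.items()
            for field, value in counters.items()
            if field in WATCH and value > 0
        }
        if bad:
            problems[name] = bad
    return problems


if __name__ == "__main__" and os.path.exists("/proc/net/dev"):
    with open("/proc/net/dev") as f:
        print(problem_counters(parse_net_dev(f.read())))
```

Counters that keep climbing while the NFS mount is stalling would point
at the card, the cable, or the switch port rather than at NFS itself.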

What network card do you have on the Linux server and what driver
are you using? 

If all else fails, try a new network card on the Linux server.

I had similar problems and they turned out to be capacitor failure
in an 8 port dumb switch.

Geoff.

-- 
Geoffrey S. Mendelson, Jerusalem, Israel gsm at mendelson.com  N3OWJ/4X1GM
IL Voice: (07)-7424-1667 U.S. Voice: 1-215-821-1838 
Visit my 'blog at http://geoffstechno.livejournal.com/