[geeks] script language advice

Shannon Hendrix shannon at widomaker.com
Fri Feb 1 22:18:52 CST 2008


On Feb 1, 2008, at 10:53 PM, Nadine Miller wrote:

> What language would the collective brain recommend for a script to parse
> lines of up to 7500 chars in length?  I'm leaning towards shell or php
> since I've been doing a lot of tinkering with those of late, and my perl
> is very weak.

PHP is the Microsoft of programming languages, but it's your brain... :)

Shell is very powerful, but also very slow, and it looks like line noise.

Do you care what the names of the duplicate file listings are?

The basic algorithm:

	sort the list of files
	read line
	cnt=0
	dupe=0
	while read newline
		if line == newline
			cnt++
			dupe=1
		else
			if dupe
				write line to filename.$cnt
			cnt=0
			dupe=0
		line = newline
	if dupe
		write line to filename.$cnt

Or something like that.

It's not hard if you just think about it a bit.

That's probably how I'd do it from the hip.
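
For what it's worth, here is a minimal Python sketch of the algorithm above. Python is my substitution, not something from the original question, and the function name find_duplicates and the sample paths are made up for illustration. It assumes the whole listing fits in memory and that a "duplicate" means a byte-for-byte identical line. It uses itertools.groupby to collect runs of equal adjacent lines after sorting, and it reports the number of extra copies (the cnt above) rather than writing to filename.$cnt.

```python
from itertools import groupby


def find_duplicates(lines):
    """Return (line, extra_copies) for every line that occurs more than once.

    extra_copies plays the role of cnt in the pseudocode: the number of
    repeats beyond the first occurrence.
    """
    # Strip only the trailing newline so spaces and odd characters
    # such as "?" inside a path are preserved exactly.
    cleaned = sorted(line.rstrip("\n") for line in lines)
    result = []
    # groupby yields one group per run of equal adjacent items,
    # which is why the input has to be sorted first.
    for value, run in groupby(cleaned):
        extra = sum(1 for _ in run) - 1
        if extra > 0:
            result.append((value, extra))
    return result


if __name__ == "__main__":
    # Hypothetical sample listing, including a space and a "?" in the paths.
    sample = [
        "b/file two.txt\n",
        "a/what?.txt\n",
        "b/file two.txt\n",
        "b/file two.txt\n",
    ]
    for path, extra in find_duplicates(sample):
        print(f"{extra}\t{path}")
```

Since the lines are treated as opaque strings, their length (7500 chars or otherwise) and any unusual characters in the paths should not matter to the comparison.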

> Aside from the line lengths, the biggest bear is that the filesystems
> are fat32, so there's a lot of unusual characters (rsync choked on "?"
> for example) and spaces in the file paths.

How did you get such long lengths from fat32?

I thought it had a 256 character total limit?

-- 
"Where some they sell their dreams for small desires."


