Date: Sat, 11 Oct 97 20:39:50 EST
From: Dwight McKay (The Moderator)
Reply-To: Suns-at-Home@net-kitchen.com
Subject: Suns-at-Home Digest V10 #35
To: Suns-at-Home-List

Suns-at-Home Digest         Sat, 11 Oct 97         Volume 10 : Issue 35

Today's Topics:
    Are they worth it?
    Formatting blues
    Help with monitors
    Just got 3/50
    OS Tape Copy Program [was: Suns-at-Home Digest V10 #34]
    Question about a cable
    Sparc 2 info and OS recommendation wanted (2 msgs)
    Suns-at-Home Digest V10 #34 (3 msgs)
    very strange hardware problems with my big old sun-3/280 server....

+--------------------------------------------------------------------+
| Submissions:  suns-at-home          \                              |
| Requests:     suns-at-home-request   > @net-kitchen.com            |
| Archives:     suns-at-home-archives /                              |
| WWW Archive access: http://www.net-kitchen.com/~sah                |
+--------------------------------------------------------------------+

----------------------------------------------------------------------

Date: Mon, 6 Oct 1997 15:05:49 -0700 (PDT)
From: James Lockwood
Subject: Are they worth it?
To: Suns-at-Home@tigger.net-kitchen.com

> It's a matter of tradeoffs. How important are the following to you?
>
> o A usable keyboard with shift, ctrl, esc, and ~ keys in sane places

Agreed, the type 4 and type 5 keyboards are some of the best around
IMHO.

> o Having to screw around with IRQ's

Amen, brother!

> o Being able to reliably use more than 128M of memory
> o Searching the universe for a device driver for each piece of hardware

True, but you do have to deal with Sun EOL'ing h/w (like my GT and the
SPARCprinter). I just have my GT running off of an old SS1+ w/64mb, but
it would be great to have it on my SS10 running 2.6.

> >I just saw someone advertise a 24-bit VME frame-buffer for a Sun-3 at
> >$100 (USA).
>
> OS support for it? Chances are that one would be stuck either running
> SunOS 4.1.1x to get support for it, or maybe using it in 8-bit mode
> under *BSD. I can't say for sure without specifics, but those are my
> suspicions.

Yep.
Sounds like a cg8/cg9, and support for those under X is spotty at
best.

> >You can pick up a 3/260 or similar that would drive it for
> >probably less than $200
>
> SS2's without a monitor are in the $250 range these days, though.

I see them bare for $175 all the time; add on some memory and disk and
you're still only around $300.

> >but if your interest is in graphics the more
> >modern SGI platforms probably come with better and more interesting
> >graphics software,
>
> Irix remains hostile to getting freeware to compile & run, though.

Very true. This promises to change soon, but most bargain SGI machines
are not going to be running IRIX 6 anytime soon.

> >Solaris 2.6 media is now available for only $100
>
> Only for educational customers under ScholarPAC, I thought. The rest of
> us pay at least $380 for a desktop. Those prices at
> http://www.execpc.com/~tkeidl/ seem suspicious to me. Maybe that
> doesn't include the license.

Keep in mind that _only Sun_ (and Sun-designated agencies) can legally
sell you an RTU under their current licensing agreements. Very few
people actually pay attention to this.

(from the Solaris 2.6 binary code license):

  1. LICENSE TO USE. Customer is granted a non-exclusive and
  non-transferable license ("License") for the use of the accompanying
  binary software in machine-readable form, together with accompanying
  documentation ("Software"), by the number of users and the class of
  computer hardware for which the corresponding fee has been paid.

My (unofficial) answer from someone at Sun was "don't ask, don't tell"
with regard to RTU-less copies of Solaris and copies obtained from a
reseller. Obviously this is not sufficient for a business, but I
suspect that a large number of home users overlook the non-transferable
clause.

In any event, you can't go too far wrong with Sun TeleSales.
1-800-786-0404.

> >As far as Applix goes, it's a nice package but too pricey
>
> As of 1994 I found it to be pretty horrid.
> One might speculate that
> they changed the name from Asterix to try to leave behind its
> reputation. I don't know what it's like now, but then it was entirely
> OpenLook (via XView, I think). Bleah.

It's been given a major facelift; it looks extremely Motif-like but
really only uses Xlib. It's quite fast even on my SS2. Some of us like
OpenLook; it was quite a dramatic improvement over SunView.

> >As far as I understand it, 4.1.3_U1 and 4.1.4 are basically the same except
> >4.1.4 has a lot of the patches already applied
>
> I ran a 4.1.4 machine for a few months, and had major problems trying to
> get AMD to be stable. Heck, the machine itself would reboot frequently.
> The same hardware ran fine under 4.1.3_U1.

4.1.4 changed a lot of things including some subtle stuff in csh and
sh.

> >I would not recommend a PC/pentium (mainly because I am a Sun bigot ;) )
> >over a Sun, unless you want to run the Gates Virus or Linux.
>
> Modulo the hassles of messing with the hardware, on x86 hardware I'd go
> with SunOS 5.6. It's not all that expensive, and it sure beats fighting
> with the likes of BSD/OS.

Agreed. I still haven't gotten Solaris x86 to boot with my Mirage Z-128
card installed (even in plain text mode) but if you've got supported
hardware it's 100% worth it.

-James

=============================================================================
James D. Lockwood                    The Getty Information Institute
System Administrator                 1200 Getty Center Drive, Suite 300
james@gii.getty.edu                  Los Angeles, CA 90049-1680

- ------------------------------

Date: Mon, 6 Oct 1997 09:11:06 -0400 (EDT)
From: raub@kushana.aero.ufl.edu (Mauricio Tavares)
Subject: Formatting blues
To: suns-at-home@tigger.net-kitchen.com

I feel so stupid asking this, but anyway:

Ok, I haven't added a new drive to my Sparc 1+ running SunOS 4.1.4 in
ages. Now, I want to replace its 1GB drive with a 2GB one.
As was to be expected, the drive I have does not have an entry in the
/etc/format.dat file (I think that is the name). "No Sweat," I thought.
"All I have to do is run format and be on my merry way." Right now the
machine complains that this drive does not have the right magic number,
something whose function I don't even remember anymore.

Well, things are not that simple. First of all, I found that if I
create an entry in /etc/format.dat, it won't show up when I run format,
and vice-versa. Is it me, or does that go against the man page? Anyway,
here are the specs for the drive:

    Samsung WN-321010S
    2160MB (formatted, according to the docs and web page)
    3 Disks
    6 heads
    5588 cylinders
    512 bytes/sector
    5400 RPM

All that is nice and peachy, but how do I enter that in format? Well,
actually I did get to enter that, but I do not like that I can't find
the entry in /etc/format.dat. I wonder where it hides it.

Back to my problem: I then started partitioning the HD. Can anyone
help me and explain what each field does? What I mean is the line that
goes like this:

    /dev/sd2f: Start sector: AAA BBBBBBBB (CCCC/D/E)

How are AAA and CCCC related? If I am not mistaken, CCCC is the size of
the partition in the same units as AAA. So, to get the AAA for the next
partition you would add the current AAA to CCCC. Am I right here?

--
===========================+==========================================
| Mauricio Tavares          | "We will attack...                     |
| raub@kushana.aero.ufl.edu | ...under the cover of daylight!" Rimmer |
===========================+==========================================

- ------------------------------

Date: Sat, 11 Oct 1997 16:00:47 -0500
From: Jeff Fredrickson
Subject: Help with monitors
To: Suns-at-Home@tigger.net-kitchen.com

I'm pretty new to Suns, and I have a SPARCclassic without a monitor. I
don't know anything about Sun monitors, so I'm wondering if anyone has
any suggestions when comparing prices/quality/etc. I'm looking for
something as inexpensive as possible.
How good is the GDM-1604 16"/17" monitor? There's one for sale for $60
but I don't know if it's worth getting.

- ------------------------------

Date: Tue, 7 Oct 97 02:45:00 UT
From: "Richard A. Cini, Jr."
Subject: Just got 3/50
To: "SAHList"

Hi!

I'm new to the list because I just adopted my first 3/50. Mostly I
collect "classic" (pre-1980) computers, but I just couldn't pass this
one up.

Anyway, this one has a monitor problem. It's the 19" Philips model and
it squeals when turned on. There is no picture at all, and it almost
sounds like a transformer is trying to oscillate, but just isn't
getting there.

I looked at the three boards inside the case. The power supply board
had no obvious problems (burn marks, blown caps, etc.). The video board
had one broken wire from the DB9 (repaired; no effect). The driver
board had one unsoldered cap and a few bad solder joints (repaired; no
effect).

Before I ship this off to a TV-repair place (I've seen TV places repair
monitors), what typically goes wrong with this monitor model? I've seen
some vague references in the archives, but none seem to fit.

TIA for any clues.

Rich Cini/WUGNET
rcini@msn.com
ClubWin! charter member (6)
MCP Windows 95 and Windows Networking

- ------------------------------

Date: Sun, 5 Oct 1997 22:49:38 -0400 (EDT)
From: woods@most.weird.com (Greg A. Woods)
Subject: OS Tape Copy Program [was: Suns-at-Home Digest V10 #34]
To: Suns-at-Home@tigger.net-kitchen.com

> Date: Sat, 27 Sep 1997 12:41:19 -0700
> From: Kevin Cosgrove
> Subject: OS Tape Copy Program [was: Suns-at-Home Digest V10 #33]
> To: Suns-at-Home@tigger.net-kitchen.com
>
> Below my signature are two scripts that will allow you to copy
> tapes, boot tapes or otherwise. Feed each of the two
> "begin...end" portions to uudecode and you have these two scripts.

I wrote some similar scripts after fighting to try and make reliable
copies of multi-file tapes and trying to move from say 60MB tapes to
150MB tapes.
They've quite a few more options than Kevin's scripts, including a
verify feature. They can be found at the following URL, or I can e-mail
them to anyone who doesn't have FTP access:

    ftp://ftp.planix.com/pub/Planix/tapestuff.shar

I've reliably made copies of old SunOS distribution tapes (after
several attempts to read almost unreadable tapes), and have even
re-installed from copies made to 150MB tapes, so I know they work
flawlessly.

--
Greg A. Woods

+1 416 443-1734 VE3TCP
Planix, Inc. ; Secrets of the Weird

- ------------------------------

Date: Tue, 7 Oct 1997 16:10:02 +0000 (GMT)
From: Christofer Karatzinis
Subject: Question about a cable
To: Suns-at-Home@tigger.net-kitchen.com

Hi all,

I just got my new Sun (an old IPC, discarded from my university). The
problem is that I don't have a monitor. I think I can use my PC's
monitor, but I need a special cable to connect the Sun's monitor card
to my PC's SVGA monitor. Does anyone know how to make one of these
cables?

Thanx in advance,
Christofer

- ------------------------------

Date: Sun, 5 Oct 1997 13:23:30 -0400 (EDT)
From: Mike Frisch
Subject: Sparc 2 info and OS recommendation wanted
To: John Ruschmeyer

[There were several responses to John's question. I'm HEAVILY editing these]
[to cut down the repetition. --ddm]

On Fri, 26 Sep 1997, John Ruschmeyer wrote:

> 1) The system has 32MB of RAM. Am I correct in my reading of the hardware
> FAQ in that I can upgrade this with standard 30-pin, 9-chip SIMMS?

Yes.

> - What is the minimum speed I need?

80ns or better for an SS2.

> - Can I add SIMMs 4 at a time (as opposed to 8)?

Yes, a bank is 4 SIMMs. You do not have to add 8 at a time.

> - Can I use 1MB SIMMS to go to 36 or 40 MB?

Yes.

> - If I can use 1MB SIMMS, which banks must the 4MB SIMMS be in?

Not quite sure if there are any special arrangements on the SS2.
Typically, put the 4MB SIMMs first and the 1MB SIMMs last.

> 2) What is the recommended OS for this system?
I run Solaris 2.5.1 (and soon, 2.6 when I get it) on mine. Performance
is reasonable and quick enough for most tasks I do.

> installed) or Solaris 2.2. I've tentatively ruled out Solaris 1.1 because of
> its maturity and Sparc Linux because of its immaturity. (Does OpenBSD
> have the emulation code?)

You'll typically find many people still running SunOS 4.x, and can
therefore still get decent support from the 'net. IMHO, SunOS 4.x will
give you the best performance on this hardware.

> Unix/X hacking, and acting as a SAMBA and Appletalk server. Initially,
> it will be connected to the net via PPP, but may ultimately have a
> cable modem connection.

Ick... If you want to run the port at 57.6k or better, you definitely
should look into an aftermarket serial port. Sun serial ports are
terrible.

Mike.

======================================================================
Mike Frisch                       Email: mfrisch@saturn.tlug.org
Northstar Technologies            WWW: http://saturn.tlug.org/~mfrisch
Newmarket, Ontario, CANADA
======================================================================

- ------------------------------

Date: Sun, 5 Oct 1997 12:42:17 -0700 (PDT)
From: James Lockwood
Subject: Sparc 2 info and OS recommendation wanted
To: Suns-at-Home-List@tigger.net-kitchen.com

> - Can I use 1MB SIMMS to go to 36 or 40 MB?

Sort-of. It's not officially endorsed by Sun, and I've had a few
problems when trying it.

> - If I can use 1MB SIMMS, which banks must the 4MB SIMMS be in?

The first banks. Get the Sun Hardware Reference at ftp.picarefy.com
(IIRC); it has the information you need and much more.

> 2) What is the recommended OS for this system?

First of all, Solaris 2.x where x < 4 is what gave Solaris its
legendary bad (and slow) reputation. Ditch it immediately. I'm not
kidding.

If you want to go the Solaris route (which is what I recommend) then go
out and find a copy of 2.5, 2.5.1, or 2.6 Desktop. It's fairly cheap on
the used market (<$100) and you'll be glad you did.
I run 2.6 on my SS2 and it's actually quite fast (and far less buggy
than 2.2!).

Solaris 1.1 (aka SunOS 4.1.3) is an extremely solid OS. True, it hasn't
changed much in the last few years, but it's by far the most stable of
the OS's you've listed, and runs just dandy on an SS2. I've run
machines under 4.1.3U1 with multi-_year_ uptimes under heavy load.

NetBSD is fairly mature (but nowhere near as much as SunOS 4.1.3U1).

I'd order your list like:

    Solaris 2.5 or higher (but this is my personal preference)
    Solaris 1.1
    NetBSD
    OpenBSD
    Sparc Linux

and would take Solaris 2.2 completely out of the picture. Just not
worth messing with.

> Am I correct in assuming that much of the decision is really one of
> BSD vs. SVR4? Given what I want to do with the system, are there any
> showstoppers along either path?

Not really. Any of the above operating systems will serve, although
Solaris gives you a lot of nice freebies (motif etc) and is quite fast
with enough memory.

-James

=============================================================================
James D. Lockwood                    The Getty Information Institute
System Administrator                 1200 Getty Center Drive, Suite 300
james@gii.getty.edu                  Los Angeles, CA 90049-1680

- ------------------------------

Date: Sat, 4 Oct 1997 20:41:20 -0700 (PDT)
From: Curt Sampson
Subject: Suns-at-Home Digest V10 #34
To: Suns-at-Home-List@tigger.net-kitchen.com

Oh boy, another multi-message reply here.

> From: Anthony Talltree
> Subject: Are they worth it?
>
> Modulo the hassles of messing with the hardware, on x86 hardware I'd go
> with SunOS 5.6. It's not all that expensive, and it sure beats fighting
> with the likes of BSD/OS.

You still have to deal with the vagaries of PC hardware, of course.
Given the extra performance you get for the money, it may well be worth
the tradeoff. It's just something to keep in mind.
> From: James Lockwood
> Subject: Buying a Sparc for Home
>
> > ...and because I really like
> > the Sun 3 keyboards (which I use on my Sparc systems, too).
>
> Ergh, the old type-3 purple-printed keyboards? With the funky slopedown
> in front? Why?

You bet, the type 3. They have lots of function keys (very useful when
you use a lot of virtual screens and also have various window
manipulation functions bound to the keys), they're not as wide as the
more modern keyboards, all the keys are in the right place, backspace
and delete are both easily available for someone with his hands on the
home row, and the feel is great. I had to build a little adapter to
hook it up to more modern workstations, but even on my SS5 it works
just fine.

> Sun is finally getting committed to producing machines that are price
> competitive with the PC marketplace.

Yes, I've started to notice this with some recent price cuts. This is
good news.

> To put it bluntly, They Just Work. I plug in a Sun, put in any expansion
> cards, turn it on, and everything works. This is not nearly as big of a
> deal with a single machine at home, where the owner can play with it for
> hours getting everything just right, but when you maintain dozens or hundreds
> of machines it makes a tremendous difference.

This is true, although if you standardise on some fairly good and
commonly available PC hardware you only have to go through the problems
once, and with dozens or hundreds of machines, you're looking at very
significant cost savings by going with a PC. These savings may even be
enough to hire someone to just take care of the darn things.

> From: John Ruschmeyer
> Subject: Sparc 2 info and OS recommendation wanted
>
> 1) The system has 32MB of RAM. Am I correct in my reading of the hardware
> FAQ in that I can upgrade this with standard 30-pin, 9-chip SIMMS?

Yes. I believe that 80 ns SIMMs are fine, though you may need 70.
Regardless, almost anything you buy that was manufactured recently will
be a 60.
You add them in banks of four, and you can use 1 MB SIMMs in a bank
instead, if you like. I've never tried putting my 4 MB SIMMs in
anything but the lowest banks when putting both 4 MB and 1 MB SIMMs in
a box.

> 2) What is the recommended OS for this system?

Well, you probably want to avoid anything less than Solaris 2.4 on
these low-end Sparcs, because the earlier ones had pretty poor
performance, I understand. That basically leaves you with NetBSD. :-)

(Ok, ok, so I'm biased. I'm a NetBSD developer, and currently running
it on all five of my Sparcs. :-) Do note that NetBSD-1.3 will be out in
December, and that squishes a lot of problems. Even 1.2.1 is pretty
reliable on the sparcs, though.)

Oh, and I'm running all sorts of stuff with the SunOS shared libs and
Openwindows libraries. I've never had a problem with the emulation.

> My primary use for the system will be general home Net surfing (possibly
> also acting as a proxy or firewall to mine and my wife's PCs)...

1.3 will have full packet filtering and network address translation.
The latest NetBSD-current snapshot does as well. I've been using
filtering to prevent certain nasty stuff from getting on to my home
network for some time.

cjs

Curt Sampson    cjs@portal.ca     Info at http://www.portal.ca/
Internet Portal Services, Inc.    Through infinite myst, software reverberates
Vancouver, BC   (604) 257-9400    In code possess'd of invisible folly.

- ------------------------------

Date: Sun, 05 Oct 1997 13:41:09 +0200 (MET DST)
From: "Jan G. Timm"
Subject: Suns-at-Home Digest V10 #34
To: Suns-at-Home@tigger.net-kitchen.com, jruschme@exit109.com

> Am I correct in assuming that much of the decision is really one of
> BSD vs. SVR4? Given what I want to do with the system, are there any
> showstoppers along either path?

I guess you will be able to do all that with any of the above listed
OS's, just a matter of how much work you're willing to invest.
If you come across weird hardware you might want to add to your system,
you're better off with Solaris 2.[456]. I'm using an HP scanner, an old
Archive DAT drive, a SunPC sbus accelerator card and a 4x RS232 card
from PTI with my IPX. You'd probably have problems getting drivers for
those on Net/OpenBSD or Linux.

Cheers,
Jan

Jan-G. Timm                 | email: timm@coli.uni-sb.de (private)
                            |        timm@mpi-sb.mpg.de
                            |        bofh@mindless.com
66111 Saarbruecken, Germany |        timm@dfki.de (work)

- ------------------------------

Date: Sun, 5 Oct 1997 22:37:20 -0400 (EDT)
From: woods@most.weird.com (Greg A. Woods)
Subject: Suns-at-Home Digest V10 #34
To: Suns-at-Home@tigger.net-kitchen.com

> Date: Sat, 27 Sep 1997 12:44:57 -0700 (PDT)
> From: Anthony Talltree
> Subject: Are they worth it?
> To: suns-at-home@tigger.net-kitchen.com
>
> >I just saw someone advertise a 24-bit VME frame-buffer for a Sun-3 at
> >$100 (USA).
>
> OS support for it? Chances are that one would be stuck either running
> SunOS 4.1.1x to get support for it, or maybe using it in 8-bit mode
> under *BSD. I can't say for sure without specifics, but those are my
> suspicions.

If you mean a basic device driver then yes, 4.1.1_U1 on the sun3 has a
"cgnine" driver for the VME and a cgeight for the P4 24bit cards. There
are cgeight and cgfourteen drivers still in 5.5.1.

There's nothing in X11R5 for the CG9. However, X11R6 has a sunCfb24
driver intended for the CG8, but it may work on the CG9 or be easily
tweaked/ported. R6 also includes source for the GX support, but the CG9
only works with the GP2 and there's no GP2 support in X11 so far as I
know, though supposedly 4.x and later have built-in GP2 support.

Yes, if you put a CG9 and a GP2 in your VME machine you'll need an air
conditioner and a "Sugar Daddy" to pay the power bill! ;-)

> These days the 24-bit bargains are PC hardware, but one has to work hard
> to find stuff that does 24 bits over a decent number of pixels at a
> decent speed.

Depends.
Of course you pay "new" prices for such gear and with a good monitor
it's sometimes even more expensive than equivalent workstation gear,
never mind used workstations. The new Matrox cards are quite nice and
the new XFree86 has good accelerated drivers for them.

The nice thing about workstations and X terminals is that they
generally use monitors that have relatively few (if any more than one)
frequency ranges, which makes them far more stable and often a whole
lot cheaper (at least in the 19" through 21" range). Such monitors are
available used these days for amazingly low prices. Used versions of
big multi-sync monitors can often be a lot of trouble as they may have
buggy firmware, etc. The brand new versions seem lots better, but are
lots of $$$$ too.

Though I don't often use or want to use a colour screen, I find that
multi-sync monitors are never as stable or sharp as mono-sync monitors.
Of course, that said, I have an NCDhmx X terminal now sitting beside my
trusty 3/60 hires display for those times I do want colour, and it
takes an essentially standard SVGA high-end multi-sync monitor to
deliver 1280x1024 8-bit colour.

> Irix remains hostile to getting freeware to compile & run, though.

Depends on your porting experience and abilities. I find IRIX-6 to be a
relatively rich environment for porting. Maybe stuff won't work
out-of-the-box, but that's not what freeware is necessarily all about.

I've noted that SGI often distribute tons of good ready-to-run freeware
binaries.

In that sense SunOS-5.x and SunOS-4.x are no longer the platforms of
choice if you want ready-to-run applications. FreeBSD and Linux are
much less "hostile" for that purpose.

> I ran a 4.1.4 machine for a few months, and had major problems trying to
> get AMD to be stable. Heck, the machine itself would reboot frequently.
> The same hardware ran fine under 4.1.3_U1.
Not that I've ever used amd before, but I know for a fact that 4.1.4
plus the recent recommended patches for it is one heck of a lot more
stable than equivalently patched 4.1.3_U1 on Sparc 20's and clones.

> Modulo the hassles of messing with the hardware, on x86 hardware I'd go
> with SunOS 5.6. It's not all that expensive, and it sure beats fighting
> with the likes of BSD/OS.

Hmm... that seems a bit twisted to me. BSD/OS on ix86 hardware is quite
nice. FreeBSD is a bit better, but not by much. Unless you have
thousands of identical machines to manage I'd choose any BSD over
SunOS-5 any day. SunOS 5.6 is a whole new can of worms, and of course
5.5.1 is chock full of holes, so if you go that route you're damned if
you do and damned if you don't (upgrade that is).

Running NetBSD (or I suppose Linux) on more modern Sun hardware isn't
much different than running it on most other hardware. I find
NetBSD-1.2 on the sun3 to be unacceptably slower than SunOS-4.1 in the
disk I/O department, but it may be improved now (with 1.3 near the
door) from what it was. On the other hand NetBSD/1.1 on a Sparc IPC was
somewhat faster in some areas than SunOS-4.1.4 was and otherwise nearly
equiv.

--
Greg A. Woods

+1 416 443-1734 VE3TCP
Planix, Inc. ; Secrets of the Weird

- ------------------------------

Date: Mon, 6 Oct 1997 01:11:45 -0400 (EDT)
From: woods@most.weird.com (Greg A. Woods)
Subject: very strange hardware problems with my big old sun-3/280 server....
To: Suns-at-Home@tigger.net-kitchen.com

Hi Suns-at-home folks!

Are any of you intimately familiar with the hardware design of the
sun-3 machines? ;-)

Recently my big old custom-built sun-3/280 server has begun to suffer
more and more of what appear to be memory errors. This rather long and
detailed message describes my system and the problems as well as the
various things I've done to date to try and rectify the problem.
Essentially I'm looking for advice and a copy of the Sun Diagnostic
Executive tape for Sun-3 (and hopefully the manual for it too). So far
I've failed to find the latter locally, and all advice to date hasn't
really helped any. ;-)

For those who've not read a description of it before it is a:

    19" enclosed rack with 2x ~110CFM filtered fans in the top
    digital power controller with EMI filters
    Sun 3/280 chassis (ps, backplane, and frame)
    Sun 3200 CPU with 3.0 PROM
    four Sun 501-1451 32MB memory boards (yes! 128MB! ;-)
    Sun SCSI 3

All the boards are properly slotted, jumpered, terminated, etc. It's
running SunOS-4.1.1_U1 and has one each 1.2GB, 2.4GB, 380MB disk
drives, 150MB and 525MB QIC tape drives, and a CD-ROM drive.

The only thing really "different" about this machine (other than the
extraordinary amount of RAM) is there's also a Sun 3/60 C/G machine
plugged into slots 11+12 (i.e. sharing the power bus).

Recently it has begun to suffer ever more frequent ECC writeback panics
(from one every couple of weeks to as many as one every few hours) like
this one:

    Oct 2 01:58:06 most vmunix: ECC Error Register d4
    Oct 2 01:58:06 most vmunix: DVMA = 0, context = 7, virtual address = dcd8b50
    Oct 2 01:58:06 most vmunix: pme = 0, physical address = b50
    Oct 2 01:58:06 most vmunix: panic: writeback error

Unfortunately, most crashes don't go so well that the panic message can
be picked up by syslog after the reboot. There are never any error LEDs
lit on the memory boards when I do get a chance to look, such as after
these other really hard crashes:

Usually the firmware prints a second similar error at a slightly
different physical address that looks something like this:

    MEMORY ERROR status C4 DVMA-BIT 0 context 7 vaddr=DCD8BF4 paddr=BF4 type 0 at 0x0E067BD8

At this point the only reliable way to recover is a power cycle or,
curiously enough, entering the PROM extended diagnostics (x) and
exiting (q) before doing a PROM system reset command (k2).
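
A minimal sketch of how the fields of a writeback panic like the one logged above could be pulled out and compared from crash to crash (for instance, whether pme and the DVMA bit are always 0 and the physical address stays low). It assumes every value after a field name is hexadecimal, which the log itself does not state; `parse_panic`, `matches_pattern` and the 0x2000 default are illustrative choices, not part of SunOS or any Sun tool.

```python
import re

# Sample lines copied from the syslog excerpt above.
SAMPLE = """\
Oct 2 01:58:06 most vmunix: ECC Error Register d4
Oct 2 01:58:06 most vmunix: DVMA = 0, context = 7, virtual address = dcd8b50
Oct 2 01:58:06 most vmunix: pme = 0, physical address = b50
Oct 2 01:58:06 most vmunix: panic: writeback error
"""

# Field name, optional "=", then a value (assumed to be hex).
FIELD_RE = re.compile(
    r"(ECC Error Register|DVMA|context|virtual address|pme|physical address)"
    r"\s*=?\s*([0-9a-fA-F]+)"
)


def parse_panic(text):
    """Collect the named fields of one panic report into a dict of ints."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(text)}


def matches_pattern(fields, limit=0x2000):
    """True if pme and DVMA are 0 and the physical address is below limit."""
    return (
        fields.get("pme") == 0
        and fields.get("DVMA") == 0
        and fields.get("physical address", limit) < limit
    )


if __name__ == "__main__":
    fields = parse_panic(SAMPLE)
    for name, value in fields.items():
        print(f"{name}: {value:#x}")
    print("low-address writeback pattern:", matches_pattern(fields))
```

Feeding each saved panic through `parse_panic` and tabulating the results would make it easier to see whether a hardware swap changed anything.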
Even hitting the reset button doesn't always help. When it gets stuck
it either hangs during the initial ECC memory initialization (though
never with the diagnostics switch on!) where the CPU LED on board 0
stays on steady, or it hangs just before or just after firing up the
init process (i.e. after printing the root/dump partition info).

Often the crashes are immediately preceded by various processes dumping
core from SIGBUS (usually) or SIGSEGV (rarely) crashes that are *NOT*
due to software bugs. Usually it's a constantly or often running
process such as xload, xmailwatcher, xclock, inetd, etc. I've taken to
sweeping my mouse cursor over the xload window occasionally to be sure
it's still running full speed ahead.

I've run the /usr/diag/sundiag/pmem diagnostics many times and many
ways but it has never reported any problem. Every so often, if I catch
processes dying unexpectedly soon enough and I run pmem, the system
will survive and go back to normal operation.

Luckily so far I've not suffered any major filesystem damage (never
anything more than the normal scrambling of currently open files as is
common with SunOS-4.1). The panics always manage to sync the disks,
though given the nature of the problem this could be more dangerous
than not syncing them! ;-)

If anyone can describe the Sun ECC memory design to me and the exact
meaning of the WBACKERR bit, etc., I'd much appreciate it. Reading
/usr/kvm/sys/sun3/memerr.h & eccreg.h and the existing NetBSD memerr.c
module hasn't told me much more than how the addresses are calculated
and how under SunOS the chip location is computed.

I've done a number of memory board swaps, a memory board exchange, and
a couple of cpu board exchanges (though the last CPU had ie0 problems
that took it out of the running). Nothing I change seems to vary the
pattern (i.e. pme #0, phys addr < 0x2000, no hardware errors, DVMA bit
0, etc.), though I finally seem to have made it more stable (i.e.
it's lasted an average of 24 hours between the three crashes since the
last hardware change). Some of the exchanges held promise that never
came through, such as finding a bent ground pin on one memory board,
updating to a much newer rev of the CPU board, etc.

Of course Sun never intended the 501-1451 memory boards to run with the
Sun 3200 series CPUs (only the 3400). On the other hand I know of
several other similar machines running supposedly without problem, and
there's nothing in this which would explain why the errors would
increase in frequency, why they would always happen to board #0, or why
swapping everything but the backplane and power supply wouldn't help or
hurt.

Unfortunately neither the writeback error panic nor the firmware ECC
error message gives any indication of which chip is implicated, unlike
the soft ecc interrupt handler:

    Jun 4 17:11:36 most vmunix: mem3: soft ecc addr 18618d8 syn 8a 62 U2062

(my understanding is this means location U2062 on board #3)

There have been no single-bit errors recently either, which sort of
discounts a strong Alpha particle source in the new chassis or cabinet.
There had been an average of one per month when the machine was running
an average of two months between crashes, which seems high but didn't
seem to cause any troubles.

It seems that if I keep the machine under heavy load for extended
periods of time (i.e. hours at >2.0 load average) the chance of a crash
is much higher. I can't prove this yet though, as it ran through the
nightly filesystem scans while I was compiling stuff last night and
didn't crash until 7:30am when it should have been under light load.

A bit of history:

Prior to being a 3/280 this machine was in a 3/260 chassis. It had been
experiencing similar crashes on an infrequent basis -- perhaps one
every couple of weeks or month.
Even before adding the 3/60 to the power bus, and even before upgrading
to the four 32MB boards (was 3x16MB and then 4x16 for a while) there
were occasional similar crashes. Sometimes error lights on the memory
boards indicated multi-bit errors or other problems for those cases
though.

However, due to some physical and power issues, because the power
supply fan in the 260 chassis needed replacing, and because a 260
chassis isn't really as cool as it could be (esp. since it can suck in
a lot of dust and cat hair), I wanted to get the machine into a full
rack. I thought that might help those memory errors too, which at the
time were beginning to happen every other day or so. After moving to
the new cabinet it had run for as long as a month.

The only seemingly common denominator has been that since inserting the
3/60 to share the power supply the memory writeback panics have
increased in frequency at times, and recently to an intolerable level.

My current plan of action (until I can completely upgrade to a new
server, something running NetBSD, probably either a mid-sized Alpha or
a big Pentium II), is to:

  1. continue to search for the diagnostics tape

  2. measure & watch the power supply outputs on a scope for a bit.

  3. take the 3/60 out of the chassis on the next crash and try for a
     couple of hours or so to cause a crash by running lots of memory
     and CPU intensive jobs.

  4. downgrade to a pre-2.8 PROM that should have more and better
     extended diagnostics and run the memory test for an hour or so.

  5. try further filtering the power through an old Alpha UPS (that's
     unfortunately not computer grade, but will soak up all the
     brownouts and spikes) [though there's now more filtering on the
     power supplies than a dozen PCs normally enjoy!]

  6. try pulling one of the memory boards [:-(] (I almost always use
     more than 75 MB at any one time!)

  7. move to a newer, faster, more reliable server.

  8. install NetBSD/sun3 on that machine and learn more about the
     memory architecture to write my own diags! ;-)

If anyone has any other better suggestions, or diagnostic tapes, please
let me know.

Well, over 12 hours and still ticking since the last crash!

--
Greg A. Woods

+1 416 443-1734 VE3TCP
Planix, Inc. ; Secrets of the Weird

- ------------------------------

End of Suns-at-Home Digest
******************************