Thursday, April 3, 2008

Sun X4150 - OS install - OK Now What...

I just received our first Sun X4150 Server from Sun. Nice machine, seems to be pretty heavy on the hardware and the Toolless rail kits are simply a "snap". I'll be looking for these rail kits now on all of our Sun Hardware - talk about easy to install. I could go on and on about how solid and such this hardware looks but I need it to run...

Day 1

I want to get Windows 2003 Server R2 on this device. I have others on order that will run Solaris x86, so this one is my guinea pig.

Knowing that the disks shipped with systems nowadays are not the most recent, I go looking for all the new software/firmware/etc. first.

First I notice that there is a newer version of the "Tools and Drivers CD" at http://www.sun.com/servers/x64/x4150/downloads.jsp. The version there is 1.1; the version that ships with the system is 1.0. For those running the newer Harpertown CPUs this CD is a life saver. I'll explain why a bit later. Download and burn the X4150_11.iso before you do ANYTHING, as this will save you some hair and running around later.

We ordered this system to run Windows 2003 Server. No problem, I thought: we have a site license for Windows 2003 Server, so I did not order a copy from Sun - no need to pay Microsoft twice.

So following along in the documentation "Sun Fire X4150 Server Operating System Installation Guide" (Part No. 820-1853-11 - which can be found at http://docs.sun.com/app/docs/coll/x4150)

On page 42 of this manual you will find a section "To Create a Windows Server 2003 CD with the Sun StorageTek or LSI Drivers and Install the OS". Excellent, this I have got to see, and I am ecstatic that Sun has thought of how to do this for us. This procedure describes the usage of a shell script called 2003Reburn which, if it worked as the documentation says, would be an excellent little tool that creates a new bootable Windows 2003 Server CD with all of the needed drivers incorporated into the image. I wish other manufacturers would do this, as it would save on finding a working floppy disk drive or buying those USB floppy disk drives for systems that no longer come with them. 1.44MB, such storage space!

The big problem is that this tool/script does not work quite as you would expect. The 2003Reburn script is supposed to read your copy of MS Windows 2003 Server R2 and merge in the Sun drivers and files needed to support the X4150 hardware, namely the StorageTek RAID controller or the LSI controller, so that you can access the disks in the system. The documentation that I referred to above says that this script works on "Solaris x86, Solaris OS (SPARC Platform Edition), Red Hat Enterprise Linux 5, or SLES 10". I've got two of those systems so I continued on. On my Solaris 10u4 SPARC Sun Blade 1500 I had the best luck, but it still did not work completely. On my Solaris 10u4 x86 system the script just plain bombed out somewhere in lines 600-609 of the script. I suspect an issue here for which a patch is needed.

Let's go back to my SPARC system as it seems that it may work with a bit of hammering...

On my Sun Blade 1500 I tried putting the script on my new ZFS /usr/local/ file system, as I had more room there to play with and figured any file system would do. On the contrary, it seems the script requires a "ufs" file system (see script lines 270-279). So I moved it to my /opt file system, which is "ufs". Why this script requires ufs I have no idea. I believe I read somewhere that it was due to performance, but I cannot find where I ran across that (sorry - misplaced that tidbit of info). If performance is an issue, maybe passing a speed parameter to 'cdrw -i' on line 1423 of the script would work. But I wonder if it was because the developer of the script wasn't aware of ZFS (scratch head here). Isn't ZFS one of Sun's most touted recent innovations? You would think the script would work on it.
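A quick pre-flight check would catch this before the script does. The sketch below assumes the Solaris `df -n` output format of `mountpoint : fstype` on one line; the function names `fstype_of_df_line` and `is_on_ufs` are made up for illustration and are not part of 2003Reburn. It needs ksh or bash rather than the old Solaris /bin/sh, because of the `$(...)` syntax.

```shell
# Pull the file system type out of one line of Solaris `df -n` output,
# which looks like "/opt               : ufs".
fstype_of_df_line() {
    sed -e 's/^.*: *//' -e 's/ *$//'
}

# Succeed only if the given directory lives on a ufs file system.
# Assumption: a Solaris df that understands -n.
is_on_ufs() {
    [ "$(df -n "$1" | fstype_of_df_line)" = "ufs" ]
}
```

Something like `is_on_ufs /opt && echo "OK to unpack 2003Reburn here"` would then tell you up front whether the directory is usable.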

Anyway, I finally was able to run "2003Reburn -b" as the manual says to, and at first it looks like it's going to work (files are being extracted, etc.), but then it bombs out somewhere in the script. It looks like this is due to my /cdrom directory having leftover ghost directories from prior CD mounts: I see references in /cdrom to disks that are not what I have mounted, so I clear out my /cdrom directory and start again. (I'm sure there is a reason for the ghost directories, but I don't know what it is; if you do, I'd like to hear it.) Finally the script seems to succeed: it creates the directory structure, modifies files, etc., asks for a new CD, starts writing the CD, and eventually ejects the CD and exits. So I should now have in my hands a new Windows 2003 Server R2 Disk#1 with all of the drivers on it to boot off of and install Windows 2003 Server, right? Wrong.

Although you can put the disk in a Windows system and browse contents that look exactly as you would expect, the 2003Reburn script fails to produce an actual bootable CD: the X4150 just sits there and looks at you with a blinking underscore about 4 lines down on the left side of the screen. So now what? After a full day of working on this I'm running out of steam and getting a bit tired of looking at a script that seems not to have been reviewed much by quality control, or run on more than one device to see if it actually worked.

Day 2

So I have a CD with what looks like everything on it that I need to boot and install Windows 2003 Server R2. After a bit of talking with a Windows Admin (which I am not), he fires up Nero Burning ROM, creates a CD compilation that includes "bootable" in the template name, copies all of the data from my created CD into the compilation, and burns it to a new "bootable" CD. This new CD works perfectly, and I'm finally able to boot and install Windows 2003 Server R2 on the X4150 with no issues at all. The script did its job except for the bootable CD part. I'm sure there are tutorials out there for Nero and making a bootable CD, but I'll leave that for you to find, as I'm in the same boat as you. I couldn't redo what he did to save my life or tell you how to do it. I guess just trust that it can be done.

Note that I had to go around Sun's 2003Reburn script to get this to actually work in the end. In the script, lines 890-905 try to grab the boot loader off of the Win2003 CD and copy it to the file system, but due to a bug (http://sunsolve.sun.com/search/document.do?assetkey=1-1-6638198-1 Document ID:6638198 ) which seems to be based upon byte-ordering by 'od' in which there is no Suggested Fix or Work Around, it fails and you are left with a non bootable CD. If Sun could fix this bit of code or supply a how to or a work around for this I'm sure it would be VERY well received.
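If you have the result as an .iso image, you can at least check whether it looks bootable before burning another blank. PC BIOSes boot CDs through the El Torito extension, which places a boot record volume descriptor in sector 17 of the image (2048-byte sectors) carrying the identifier "EL TORITO SPECIFICATION". The sketch below tests only for that identifier. The function name `has_eltorito_record` is made up, nothing like it exists in 2003Reburn, and I am assuming (not confirming) that the missing boot record is what the bug above leaves you without.

```shell
# Return success if the image carries an El Torito boot record.
# The boot system identifier sits 7 bytes into sector 17 and is
# 23 characters long; the rest of the field is zero padding.
has_eltorito_record() {
    iso=$1
    offset=$((17 * 2048 + 7))
    id=$(dd if="$iso" bs=1 skip="$offset" count=23 2>/dev/null | tr -d '\000')
    [ "$id" = "EL TORITO SPECIFICATION" ]
}
```

A passing check does not prove the boot image itself is good, only that the disc advertises one. A failing check does match the blinking-cursor symptom described above.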

In addition, I have found that on the Tools and Drivers CD 1.1, under utilities/2003reburn where you find the 2003Reburn_1.1.zip file, there is a readme.txt file. This file contradicts the documentation by saying that you need to run "Solaris 10 x86, or an equivalent Linux distro". Note that it does not mention SPARC. The funny thing is that, as I stated above, the script will not run at all under Solaris 10 x86 update 4 but so far runs best under SPARC (scratch head again).

Enough with Windows, let's try Solaris x86...
I need to preface this a bit: I wanted a hardware mirrored boot disk, which means wiping the volumes that the RAID controller shipped with and creating new ones. Doing so wipes the drives, which is why I did not use the "pre-installed" Solaris x86. I'm sure someone would have commented on just using the pre-installed OS. I like to re-install all pre-installed OSes anyway, so that I can put what I want on the machine, not what the vendor wants to put on my machine. ;-)

So trudging on, I continue to see if the included Solaris Recovery DVD that came with the system will work. I boot off the DVD and install the OS (although it is Solaris 10 x86 11/06 - what's up with that? Isn't 8/07 out?). I click the "Reboot Now" button and the system reboots, lights blink, fans blow, the screen comes back to life, and GRUB comes up. I select the installed Solaris 10 system, GRUB clears, and then the system reboots. This happens a few times - what's this now?

Seems that there is another issue - the new hardware (Intel Harpertown CPUs). No problem I figure, I grab my latest copy of Solaris 10 x86 8/07 - that should do the trick. Install, reboot, same problem. Grrrr. A quick bit of Google searching on "X4150 reboot" turned up this on William Hathaway's blog (http://williamhathaway.com/?cat=8). Excellent writeup William! This little tidbit of info had me looking high and low for more information on why the latest Recommended Patch cluster made this work, and I eventually found a patches directory on the Tools and Drivers 1.1 CD at /drivers/sx86/patches. This directory has two patches and a Patch_Installation.txt file that seems to indicate that they need to be installed, but it doesn't seem to mention why. You should follow the Patch_Installation.txt file, especially the part about updating the firmware (CPLD, BIOS and SP) first. You can boot off the Tools and Drivers CD and it will give you a chance to do this. See the README.TXT file in the root of the Tools and Drivers 1.1 CD.

In the patches directory there are two patches:

Patch# 125370-06 is a Fault Manager Patch (for those with a Sun contract - http://sunsolve.sun.com/search/document.do?assetkey=1-21-125370-06-1 )

Patch# 127112-05 is a kernel Patch (for those with a Sun contract - http://sunsolve.sun.com/search/document.do?assetkey=1-21-127112-05-1 )

So I try again - Solaris 10 x86 update 4 install, DO NOT reboot. If you do reboot, follow William Hathaway's GRUB editing trick and apply the patches like normal; otherwise follow along...

1.) unmount S10u4x86 DVD, (umount /dev/dsk/c1t0d0p0 - I think.)
2.) eject Solaris 10x86 update 4 DVD,
3.) insert Tools and Drivers 1.1 CD and mount it (mount /dev/dsk/c1t0d0s0 /cdrom - I think.)
4.) cd /cdrom/drivers/sx86/patches
5.) patchadd -R /a 125370-06
6.) patchadd -R /a 127112-05
7.) unmount CD, eject CD,
8.) NOW reboot.
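
Steps 3 through 7 can be sketched as a small script. This is a sketch under assumptions, not something from the Patch_Installation.txt file: the device path is the same guess as in step 3, `-F hsfs -o ro` is the usual Solaris way to mount a CD read-only, and the helper names `run` and `apply_x4150_patches` are made up. Setting DRY_RUN=1 only prints the commands, so you can review them first.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount the Tools and Drivers 1.1 CD, add both patches to the freshly
# installed system under /a, then unmount and eject the CD.
apply_x4150_patches() {
    media_dev=$1                              # e.g. /dev/dsk/c1t0d0s0 (a guess)
    patch_dir=/cdrom/drivers/sx86/patches     # path on the 1.1 CD
    run mount -F hsfs -o ro "$media_dev" /cdrom || return 1
    for patch in 125370-06 127112-05; do
        run patchadd -R /a "$patch_dir/$patch" || return 1
    done
    run umount /cdrom
    run eject cdrom
}
```

Running `DRY_RUN=1 apply_x4150_patches /dev/dsk/c1t0d0s0` lists the five commands without touching anything. Drop the DRY_RUN setting to actually apply the patches, and only then reboot.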

The system finally comes up running Solaris 10 x86 u4 with a 64-bit kernel.

So let's see...
Windows 2003 Server R2 - Yes - but with a little outside (of Sun) help, as 2003Reburn has issues. Alternatively you can buy a USB floppy drive and go that route, but honestly, why go back to floppy disks? You can also order the system with Windows 2003 Server R2 installed on it, in which case you would receive a 2003 Recovery CD, but since we already have the media and license it made no sense to order yet another.

Solaris 10 x86 update 4 - Yes - but with 2 Patches that are hidden on the Tools and Drivers CD 1.1 and are not really mentioned elsewhere.

Neither of these workarounds is optimal or documented, and they leave you pulling your hair out, but it can be done. I'm assuming the next release of Solaris 10 x86 will already have the existing patches included.

Additional Notes...

1.) Parameters
The 2003Reburn script has parameters that you can pass to it. You can find these in lines #193-230 of the script. They are:
-d - dryrun
-b - autoburn
-v - verbose
-cdrom {cdrom} - you can pass where the source is located at
-cdrdev {cdrom} - you can pass where your burner is.

I found that leaving off the "-b" makes the script create the .iso image file without burning it. This comes in useful if you wish to keep a copy of the .iso on a file server for later, or if you wish to copy it to a Windows device for burning (for those without burners on their Solaris systems).
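
To make the two modes concrete, here is a small helper that assembles the command line. It uses only the flags listed above; the function name `reburn_cmd` and its "burn"/"iso" mode words are made up for illustration, and it only prints the command rather than running it.

```shell
# Print a 2003Reburn command line.
#   $1: "burn" to write the CD directly (-b), anything else for ISO only
#   $2: optional source drive to pass via -cdrom (e.g. cdrom1)
reburn_cmd() {
    mode=$1
    src=$2
    set -- ./2003Reburn -v
    if [ "$mode" = "burn" ]; then
        set -- "$@" -b
    fi
    if [ -n "$src" ]; then
        set -- "$@" -cdrom "$src"
    fi
    echo "$*"
}
```

For example, `reburn_cmd iso cdrom1` prints `./2003Reburn -v -cdrom cdrom1`, which leaves you with the image to archive or to burn elsewhere.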

2.) CD Eject
In lines #1378-1386 of the script where it is supposed to eject the Windows 2003 R2 CD, if you pass in say "-cdrom cdrom1" the script will eject cdrom0 as it doesn't take the passed in parameter into account and you will have to eject the cdrom1 by hand.
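
The fix would presumably be a one-liner. The sketch below shows the idea, not the script's actual code: the function name `eject_target` is made up, and I am assuming the value given to -cdrom is available in a variable at that point in the script.

```shell
# Choose which drive to eject: the one passed via -cdrom if there was
# one, otherwise cdrom0, which is what the script ejects today.
eject_target() {
    echo "${1:-cdrom0}"
}
```

The eject step would then become something like `eject "$(eject_target "$cdrom_arg")"`, where `cdrom_arg` is a hypothetical name for wherever the script keeps the -cdrom value.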

3.) RAID Configuration/"OK" LED goes out
When inspecting the hardware for the X4150 I saw that there were two "channels" coming off the Adaptec SAS RAID controller. One channel went to drives 0-3 and the second channel went to drives 4-7. It would make sense that if I wanted to mirror the disks I would choose one drive on, say, channel 'A' and one drive on channel 'B'. I know that this is on the same RAID card but humor me. So I set up my 4 disks in bays 2,3,4,5 and created a mirror between drives 2&4 and another between 3&5, putting half the drives on one channel and half on the other - in case one of the two channels fails.

The funny thing is that in this configuration drives 2&4 no longer light their "OK" LEDs. The RAID card sees the two volumes and the OSes see both volumes, but I guess something doesn't like the setup, as the two "OK" lights go out.

4.) Install the StorageTek RAID controller utilities
After you install Windows 2003 Server R2 from your media you may wish to install the RAID utilities so that you can view the RAID configuration. The "Sun Fire X4150 Server Operating System Installation Guide" appears to be written for Tools and Drivers CD 1.0, because on page 48 it references a directory called "RAIDmgmt" which does not exist on CD 1.1.
For Sun StorageTek RAID cards the software is located at:
/drivers/windows/RAID/StorageTek/ASM/{2003_32 or 2003_64}/setup_sstrm_x{86 or 64}.exe

For LSI RAID Cards the software is located similarly at:
/drivers/windows/RAID/LSI/MSM_Windows_21800.zip


I wonder how Jumpstart works with this... (Maybe later).
