Issue 14 | July 2003
Windows NT 4.0 became officially “Unsupported” by Microsoft at the end of June.
Microsoft stopped releasing bug fixes and patches for Windows NT 4.0 at that
time. In a last-minute change in May, Microsoft also announced that it would
not be releasing Security Rollup Packages for Windows NT. Individual security
patches will continue to be available for a while, but even that will
eventually end.
As
a result of these changes, all JLab computer systems running Windows NT 4.0
need to be upgraded by
To
assist users in the upgrade process, the
The
Windows XP Pro Minimum System
Requirements
Bear in mind that these are minimum requirements. Additionally, your hardware
needs to be supported by Windows XP. You can check using Microsoft’s Upgrade
Advisor tool, available on the web at:
http://www.microsoft.com/windowsxp/pro/howtobuy/upgrading/advisor.asp
Due to the large
number of Linux distributions and kernel versions the
The site's central Calendar Server was upgraded in early June 2003. For
information and instructions, please see the calendar web page at
http://cc.jlab.org/services/calendar. The upgrade was transparent to most
users; however, the following changes occurred:
1) WebCal - The web-based calendar client was upgraded. The new version has
many cosmetic and feature enhancements. Users who bookmarked the old WebCal
location need to update their bookmarks: go to the new location at
webcal.jlab.org, and then bookmark the page that is displayed.
2) Linux - Linux users have been using the command "ctime" to start the Linux
calendar client. That version had been unstable, and as part of the upgrade a
new, stable version has been installed. Linux users can start the new program
with the command "ctime5" from any Linux system that mounts the Linux CUE apps
directory. During the August maintenance period the old version will be
removed from production and the command "ctime" will launch the new version of
the Linux calendar client.
3) Sun – As part of the upgrade, a version of the calendar client has been
installed and can be launched from the command prompt by executing “ctime5”.
This too will be renamed to “ctime” during the
4) Macintosh – There is now a client that runs directly on Macintosh systems,
which is available for download from the calendar web page link provided above.
5) Palm Pilot and Pocket PC – Although the Computer Center does not provide
technical support for Palm Pilots or Pocket PCs, calendar client software is
now available that can be installed on your PC or Mac to interface these
devices with our calendar server. It can be downloaded from the calendar
server web page provided above.
The
Upgrades
A total of ten 9940B tape drives are now
installed in the SILO. They went into
production after the July 4th holiday. All writes to the tape SILO are now being
made to 9940B tape drives. The six 9840 and fifteen 9940A tape drives are now
being used as read only devices. The 9940B tape drives have a capacity of 200GB
per tape and a transfer rate of 30MB/sec. This represents a 233% gain in
capacity and a 200% gain in transfer rate when compared to the 9940A tape
drives. The table below lists the type and quantity of drives we currently have
in production and how they are being used.
Type | Quantity | Capacity | I/O Rate | Use
9840 | 6 | 20 GBytes | 10 MBytes/sec | Read Only
9940A | 15 | 60 GBytes | 10 MBytes/sec | Read Only
9940B | 10 | 200 GBytes | 30 MBytes/sec | Read and Write
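The percentage gains quoted above follow directly from the capacities and transfer rates in the table. The short sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative only.

```python
# Capacity (GB) and transfer rate (MB/sec) as listed in the drive table.
DRIVES = {
    "9840": {"capacity_gb": 20, "rate_mb_s": 10},
    "9940A": {"capacity_gb": 60, "rate_mb_s": 10},
    "9940B": {"capacity_gb": 200, "rate_mb_s": 30},
}


def percent_gain(new, old):
    """Return the percentage increase of `new` over `old`."""
    return (new - old) / old * 100.0


capacity_gain = percent_gain(DRIVES["9940B"]["capacity_gb"],
                             DRIVES["9940A"]["capacity_gb"])
rate_gain = percent_gain(DRIVES["9940B"]["rate_mb_s"],
                         DRIVES["9940A"]["rate_mb_s"])

print(f"capacity gain: {capacity_gain:.0f}%")       # 233%
print(f"transfer rate gain: {rate_gain:.0f}%")      # 200%
```

Note that a 233% gain means a 9940B tape holds about 3.3 times as much as a 9940A tape, and a 200% gain means it transfers data 3 times as fast.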
There are plans to migrate the data stored on the 9840 and 9940A tapes to
9940B tapes. Once the data is migrated, the older tape drives will be removed
from production use. Migrating the data from the older tape formats to the new
9940B format will increase the storage capacity of the existing tape SILO. It
is projected that this migration will postpone the need for a third SILO until
the summer of 2005.
The following table shows the usage of the
tape SILO over the past 3 months.
Month, Year | Terabytes | Files Requested | Failures | Percent Success
March, 2003 | 91.40 | 161,374 | 8,653 | 94.64
April, 2003 | 93.02 | 185,071 | 21,261 | 88.51
May, 2003 | 59.01 | 104,915 | 7,390 | 92.95
In March, 70% of the failures were jgets that were either killed by the user or run on machines that crashed before the transfer could complete. In April, such interrupted jgets accounted for only 5% of the failures, while 80% were due to cache servers crashing during data transfers. In May, interrupted jgets accounted for 49% of the failures, cache server crashes during data transfers for 3%, and bad tape drives or tapes for 5%.
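The Percent Success column can be derived from the Files Requested and Failures columns. The sketch below shows the calculation, assuming success is simply the share of requested files that did not fail; the function name is illustrative.

```python
# (files requested, failures) per month, copied from the usage table.
USAGE = {
    "March 2003": (161_374, 8_653),
    "April 2003": (185_071, 21_261),
    "May 2003": (104_915, 7_390),
}


def percent_success(requested, failures):
    """Share of requested files that were delivered, as a percentage."""
    return (requested - failures) / requested * 100.0


for month, (requested, failures) in USAGE.items():
    print(f"{month}: {percent_success(requested, failures):.2f}% success")
```

The computed values agree with the table to within 0.01 percentage points.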
The oldest 25 farm nodes, dual Intel Pentium III 450MHz systems, have been
decommissioned. They have been replaced by 25 dual Intel P4 Xeon 2.4GHz
systems, each with 1GB of RAM and 120GB of disk space for user jobs. The
Hyper-Threading feature of the P4 Xeon processor makes each system appear as a
quad-processor system, which allows it to run 5 simulation jobs instead of the
traditional 3.
A second order of 25 farm systems will be placed later in the year to replace
the 25 Pentium III 500MHz systems in the farm. Two purchases of farm nodes are
being made this year to replace the older farm systems, which have started to
show their age with hardware failures.
Ifarml1, a 4-year-old Intel 500MHz Xeon
system which was our first quad processor Linux system, will be replaced sometime
this summer. Its replacement will be a quad Intel P4 Xeon 1.9GHz system similar
to ifarml3.
Over the July 4th weekend,
a number of changes were made to Jasmine, the mass storage system that manages
the cache disks and tape storage. This is a quick summary of those items that
are most likely to affect you.
To increase
security and improve accounting, each user must now have a certificate to use
Jasmine or Auger (farm job server). These certificates are stored in your CUE
home directory in a file named ~/.jlab.scicomp/jlab.scicomp.keystore. This file should be treated just
like an ssh key and stored with strict permissions (we recommend the default
mode 0400). If you want to use jasmine or auger on a personal machine that does
not
scp
-pr ~/.jlab.scicomp ~
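The permission recommendation above can be checked with a few lines of Python. This is a minimal sketch for a POSIX system: the keystore path and the 0400 mode come from the text, while the function names are made up for illustration and are not part of Jasmine or Auger.

```python
import os
import stat
from pathlib import Path

# Keystore location and recommended mode as given in the text.
KEYSTORE = Path.home() / ".jlab.scicomp" / "jlab.scicomp.keystore"
RECOMMENDED_MODE = 0o400


def keystore_mode_ok(path, expected=RECOMMENDED_MODE):
    """Return True if the file's permission bits equal the expected mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == expected


def check_keystore(path=KEYSTORE):
    """Return a one-line status message for the keystore file."""
    if not path.exists():
        return f"{path}: not found"
    if keystore_mode_ok(path):
        return f"{path}: permissions OK"
    return f"{path}: permissions are not 0400; consider 'chmod 0400' on it"


if __name__ == "__main__":
    print(check_keystore())
```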
Until now we have encouraged users to request many files at once, and to list
them in the order in which they are stored on tape, to improve the efficiency
of the system. This is no longer necessary. Jasmine will now automatically
group similar requests and process them together to save tape mounts and seek
time. There is still a reason to group files into a request: it tells Jasmine
that you need those files at the same time, so they will all be processed at
close to the same time.
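To illustrate why grouping saves tape mounts and seek time, here is a small sketch of the general idea. It is not Jasmine's actual algorithm, which is not described here: the `FileRequest` fields, tape labels, positions and file paths are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class FileRequest(NamedTuple):
    path: str      # stub path under /mss (example paths are made up)
    tape: str      # hypothetical tape volume label
    position: int  # hypothetical position of the file on that tape


def group_by_tape(requests):
    """Group requests per tape and sort each group by position on tape,
    so each tape is mounted once and read in a single forward pass."""
    groups = defaultdict(list)
    for req in requests:
        groups[req.tape].append(req)
    return {tape: sorted(reqs, key=lambda r: r.position)
            for tape, reqs in sorted(groups.items())}


requests = [
    FileRequest("/mss/example/run2.dat", "T002", 40),
    FileRequest("/mss/example/run1.dat", "T001", 15),
    FileRequest("/mss/example/run3.dat", "T001", 3),
]

plan = group_by_tape(requests)
for tape, reqs in plan.items():
    print(tape, [r.path for r in reqs])
```

In this example three requests arriving in arbitrary order need only two tape mounts, and the two files on the same tape are read in on-tape order.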
StorageTek 9940B tape drives are now used to
write all new files. These drives use a fibre channel interface and write to
200GB cartridges at 30MB/sec. Because of the significant per-cartridge capacity
gain, we are in the process of duplicating over 4,000 of the existing 9840 data
cartridges to make additional space in the two silos. When this process is
done, the 9840 tape drives will be decommissioned, and repacking of the 9940A
tapes to 9940B will begin.
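A rough estimate shows how much space this duplication can free. The sketch below assumes every 9840 cartridge is full to its nominal 20GB (the capacity listed for 9840 media elsewhere in this issue), that 9940B cartridges are filled to their nominal 200GB, and that each cartridge occupies one silo slot. Real figures will differ.

```python
import math

CARTRIDGES_9840 = 4000    # "over 4,000" in the text; treated as a lower bound
CAPACITY_9840_GB = 20     # nominal capacity, assumed fully used
CAPACITY_9940B_GB = 200   # nominal capacity, assumed fully used

data_gb = CARTRIDGES_9840 * CAPACITY_9840_GB
cartridges_needed = math.ceil(data_gb / CAPACITY_9940B_GB)
slots_freed = CARTRIDGES_9840 - cartridges_needed

print(f"{data_gb / 1000:.0f} TB on 9840 media fits on "
      f"{cartridges_needed} 9940B cartridges, "
      f"freeing about {slots_freed} slots")
```

Under these assumptions, roughly 80 TB of data moves onto 400 cartridges, a tenfold reduction in the number of slots used.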
Below is a quick summary of jasmine commands
and their uses. For more complete documentation see the web pages at http://cc.jlab.org/scicomp/userdocs.
jget | Retrieve a file from /mss and store it on your disk
jcache | Retrieve a file from /mss and store it on a cache disk
jput | Store a file on tape using Jasmine. The destination stub file must begin with /mss
jremove | Remove a stub file from the /mss tree. The file is not removed from tape, but is no longer accessible. It is renamed to a subdirectory with “.storeattic” in the path.
jrestore | Restore a previously removed file. The name must be in a “.storeattic” directory.
jrename | Rename a stub file in the /mss tree.
jcancel | Cancel the specified request id(s), unless the files are already being processed.
We are interested in your feedback about Jasmine. Please direct any comments
to farm@jlab.org.
Auger is the name of the new job submission tool, which was put in place in
May 2003. While it essentially duplicates functionality that already existed
in the "JOBS" job server, it adds some important new functionality and, more
importantly, is structured to support needed changes in the future.
The current improvements include a certificate based
authentication mechanism and a database for tracking the status of submitted requests.
This use of certificates is a large step toward improving security. The
database allows for more sophisticated web pages and command line utilities for
monitoring the status of requests. It
also allows system administrators to better track the farm utilization.
Auger supports two forms of describing job submissions, the
existing format, and a new XML format.
The new XML format will be published shortly and users will be
encouraged to use it. The new XML-based job description language supports
additional functionality plus some advanced features. Since this new format allows for a much richer
description of jobs, the old format will slowly be phased out of use over the
next year.
These new features include the ability to more completely
describe multiple jobs in a single request. This will include situations where you
want a single request to consist of multiple jobs, each of which requires
multiple files from tape for processing. It also supports the ability to
specify different command-line options to your program and the environment
variables that should be set before your program is run. Probably the most
important change that is provided is the ability to specify resources that your
program needs to run to limit which farm nodes a job goes to. For example, you will be able to specify the
minimum amount of hard drive space that is required, or the minimum amount of
memory required to run your job. This will allow for a more efficient use of
the heterogeneous nature of the farm.
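The resource matching described above can be pictured as a simple filter over the farm nodes. The sketch below is only an illustration of that idea, not Auger's implementation or its XML format (which has not been published yet). The class names, field names, node names and the specs of the older node are hypothetical; the 1GB of RAM and 120GB of disk match the new Xeon nodes described in this issue.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    memory_mb: int
    disk_gb: int


@dataclass
class JobRequirements:
    min_memory_mb: int = 0
    min_disk_gb: int = 0


def eligible_nodes(nodes, req):
    """Return the nodes that satisfy every minimum in the job's requirements."""
    return [n for n in nodes
            if n.memory_mb >= req.min_memory_mb
            and n.disk_gb >= req.min_disk_gb]


farm = [
    Node("old-node", memory_mb=512, disk_gb=18),     # hypothetical older node
    Node("new-node", memory_mb=1024, disk_gb=120),   # specs of the new Xeon nodes
]
job = JobRequirements(min_memory_mb=768, min_disk_gb=50)

matches = [n.name for n in eligible_nodes(farm, job)]
print(matches)
```

A job that declares no requirements matches every node, so existing submissions would be unaffected in this model.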
Additional functionality that will be added to Auger will include better
Jasmine integration. With both Auger and Jasmine being upgraded, tighter
integration between the two applications becomes possible, which will allow
more efficient use of the cache disks for files needed by farm jobs.
One of the main long-term driving factors behind the job submission rewrite
was to support the seamless submission of jobs that can use not only the
resources at Jefferson Lab, but also the mass storage systems and farms set up
by collaborators. The first step toward this will be extending Auger to also
support submissions to the High Performance Computing group’s job server. No
specific dates have been set for these upgrades.
Information on the current status of auger and links to the
status web pages can all be found at http://auger.jlab.org.
Need
a Linux workstation but don't have $1200 to spend? Maybe you need only
occasional access to a Unix desktop, or perhaps you need to set up a computer
in a common area but don't want to spend a lot of time managing it? Or maybe
you just have some old X terminals to replace? The
A diskless Linux workstation is very similar
to a full Linux computer. All programs run directly on the workstation, not on
a different computer somewhere else on the network. The workstation simply has no local hard
drive. All files, including the operating system, are stored on our central
file servers where the
These diskless workstations are no slouch in
the software department, either. Each is fully integrated into the JLab CUE
environment, so users have access to their central home directories, /apps,
/work and /scratch. In short, for most
normal uses these diskless machines are indistinguishable from full-blown Linux
workstations. Some early users have described it as being "like having
ifarml1 on your desktop."
These diskless workstations will soon be
available on the new, improved online PC purchasing page, but you don't have to
wait to order them. For ordering
information (or other details), please contact David Bianco at x5268, or via
email at <bianco@jlab.org>. With prices
starting at just $600 (excluding monitor), these systems are well suited to a
variety of uses.
As
a part of the
Most
recently purchased systems contain a network adapter that supports “PXE”, the
“Preboot eXecution Environment”. This facility and others allow a system to
use the network to find a server from which to load and begin running an
operating system. For older systems that don’t have PXE built in, a floppy
disk is available that provides this PXE functionality. As the system starts
up, it transmits identifying information on the network. The RIS server
responds with the information needed to load and run the setup tools that
install Windows.
The
user is presented with a screen asking for a username and password on the JLab
domain. Once the user is authenticated, the configuration of Windows to be
installed is selected from a list of those available on the RIS server. The
installable copies of Windows include recent service packs, hotfixes, etc. Once
a selection is made, the system downloads and runs the normal Windows Setup
process. This process includes all of the steps to identify hardware, install
drivers, etc. At the end of this process, the system starts a script that
installs the common utilities found on all CUE systems, such as Acrobat
Reader, WinZIP and PuTTY.
Overall,
the process takes about 1 hour to complete and leaves the system properly
configured and ready to run. All JLab/CUE Windows systems (except for a few,
specially excluded systems involved in accelerator or experimental control or
production systems) get Norton Antivirus and Microsoft SMS Client installed
automatically after they are built and online, but everything else is pretty
much ready to go.
A
few additional applications that are individually licensed like Microsoft
Office, and others that are just not necessary for everyone, must be separately
installed. Install packages for these are available on the network. Links to
these packages are provided on the web. To install a package, users just need
to navigate to the web page, and click on the appropriate link. This method
currently relies on the availability of CUE, so it can only be used to install
software to systems on site. A full, web-based approach will soon be available.
This will allow for installation over the web to home systems, or to systems
while people are on call or on travel. This method, running on our local
network on a fairly fast system, has installed the entire MS Office suite (no
clip art) in approximately 3 minutes!
Since
Windows NT is being phased out (see “Windows NT Systems Must Be Upgraded by
2/1/04” in this CC Newsletter), it is expected that RIS will be used by lots of
people performing upgrades of their desktop NT systems to Windows XP. For a
typical system, the upgrade procedure will be something like:
Start
making your plans to upgrade now. The RIS server, software install server, etc.
all should be ready and available by mid August. When you’re ready, just check
the
Automated Windows Patch Delivery and Installation
Computer security is an integral part of all our lives: leaving systems unprotected from
hackers and viruses significantly increases the risk of loss of our work. At
JLab, the
Since
the beginning of 2000, Microsoft alone has issued more than 230 security
bulletins, most of them requiring some sort of action by the end user to close
potential security holes in Microsoft products. The numbers speak for
themselves: Windows patch management in any type of enterprise environment is
a nightmare. In the past,
SMS performs inventories of hardware and software of all machines within the domain and saves that information in a data