News
October 6, 2020 • 2 min read • News
LOTUS/SLURM issues update
Dear all,
As you know, we have recently seen some problems with the SLURM scheduler for the LOTUS batch processing cluster, with jobs remaining in the pending state for much longer than normal. Resolving this issue has been, and remains, a top priority for the JASMIN team. Here’s the current state of affairs:
• Today (Monday 5 Oct)
o The fair-share configuration has been adjusted in an attempt to lower the priority of very large or long-running jobs, which were causing sections of the cluster to become blocked.
o An even mix of user IDs is now seen among running and pending jobs, implying that jobs are starting to flow more normally, without being dominated by any one user.
o A storage client fix has been rolled out to some parts of JASMIN, resolving a write issue for some users (NB: not necessarily related to the scheduling issues).
...
September 16, 2020 • 2 min read • News
JASMIN Migration to CentOS7 & LSF replacement with SLURM UPDATE 11
Dear JASMIN users,
This message includes information about the following:
• New high-memory CentOS7 sci machines
• Retiring RHEL6 high-mem sci machines on Friday 25th September
• Hpxfer servers
Details of the update
The new high-memory (1TB) CentOS7 scientific analysis servers with SLURM enabled, sci3.jasmin.ac.uk, sci6.jasmin.ac.uk and sci8.jasmin.ac.uk, are available for users to use
...
September 14, 2020 • 1 min read • News
System maintenance Tues 15 Sept 2020
Scheduled maintenance is planned for Tuesday 15th September, which may cause some disruption to JASMIN and CEDA services.
On a regular (roughly quarterly) basis, important updates are applied to systems within the JASMIN infrastructure (which also hosts the CEDA Archive and associated services) in order to keep them up to date and secure. Some servers may need to be rebooted in order for these changes to take effect, so there may be an interruption to some JASMIN and CEDA services on this date. The maintenance work will also include a network change which should help prevent recent problems experienced with the virtualization cluster.
...