FAS Research Computing - Notice history

Globus Data Transfer experiencing partial outage

Status page for the Harvard FAS Research Computing cluster and other resources.

Cluster Utilization (VPN and FASRC login required): Cannon | FASSE | Academic


Please scroll down to see details on any Incidents or maintenance notices.
Monthly maintenance occurs on the first Monday of the month (except holidays).

GETTING HELP
https://docs.rc.fas.harvard.edu | https://portal.rc.fas.harvard.edu | Email: rchelp@rc.fas.harvard.edu




SLURM Scheduler - Cannon - Operational

Cannon Compute Cluster (Holyoke) - Operational

Boston Compute Nodes - Operational

GPU nodes (Holyoke) - Operational

seas_compute - Operational


SLURM Scheduler - FASSE - Operational

FASSE Compute Cluster (Holyoke) - Operational


Kempner Cluster CPU - Operational

Kempner Cluster GPU - Operational


Login Nodes - Boston - Operational

Login Nodes - Holyoke - Operational

FASSE login nodes - Operational


Cannon Open OnDemand/VDI - Operational

FASSE Open OnDemand/VDI - Operational


Netscratch (Global Scratch) - Operational

Holyscratch01 (Pending Retirement) - Operational

Home Directory Storage - Boston - Operational

HolyLFS06 (Tier 0) - Operational

HolyLFS04 (Tier 0) - Operational

HolyLFS05 (Tier 0) - Operational

Holystore01 (Tier 0) - Operational

Holylabs - Operational

BosLFS02 (Tier 0) - Operational

Isilon Storage Boston (Tier 1) - Operational

Isilon Storage Holyoke (Tier 1) - Operational

CEPH Storage Boston (Tier 2) - Operational

Tape (Tier 3) - Operational

Boston Specialty Storage - Operational

Holyoke Specialty Storage - Operational

Samba Cluster - Operational

Globus Data Transfer - Partial outage

bosECS - Operational

holECS - Operational

Notice history

Aug 2024

Starfish upgrade
  • Completed
    August 27, 2024 at 12:00 PM

    Starfish is back up

  • Update
    August 26, 2024 at 2:35 PM

    Starfish maintenance is still ongoing; no ETA at this time.

  • In progress
    August 24, 2024 at 12:00 AM

    Maintenance is now in progress.

  • Planned
    August 24, 2024 at 12:00 AM

    The Starfish Zones Dashboard will be undergoing upgrades and maintenance this weekend, from Friday, August 23rd at 8 AM until Monday, August 26th at 8 AM. The dashboard will not be accessible during this time. Further details will be provided if needed. Please email rchelp@rc.fas.harvard.edu if you have any questions or concerns.

Jul 2024

Authentication issues - Related to global CrowdStrike incident
  • Resolved

    All CrowdStrike-related resources are back up and operational.

  • Update

    For FASRC resources affected by the CrowdStrike issue, most are back in full service. A few remaining issues involving the following may not be resolved until Monday:
    - waywiser2
    - proteomics2
    - tmsdb3
    - lic3

  • Update

    Please see HUIT Status (harvard.edu) for additional information on the global issue caused by the CrowdStrike security software that Harvard relies on. This is an ongoing issue university-wide.

    Few FASRC systems remain affected, but some Windows-based systems managed by or connected to FASRC may still be impacted.

  • Monitoring

    Authentication is back up and running. Windows machines are still in a bad state and will need remedial work to get them back in service.

  • Identified

    Authentication is back up and running. Windows machines are still in a bad state and will need remedial work to get them back in service.

  • Investigating

    Authentication is back up and running. Windows machines are still in a bad state and will need remedial work to get them back in service.

Jun 2024

FASRC websites unavailable
  • Resolved

    This incident has been resolved. Both sites are working normally.

  • Investigating

    https://www.rc.fas.harvard.edu/ and https://docs.rc.fas.harvard.edu/ are offline.

    We are currently investigating this issue.

FASRC websites - Unplanned maintenance (www.rc.fas.harvard.edu and docs.rc.fas.harvard.edu)
  • Completed
    June 26, 2024 at 4:29 AM

    This update was completed successfully.

  • Planned
    June 26, 2024 at 4:17 AM

    Unplanned maintenance on www.rc.fas.harvard.edu and docs.rc.fas.harvard.edu is required.

    ETA is approximately 1 hour. We apologize for any inconvenience.

MGHPCC Pod 8A Power Upgrade June 24 will idle some Cannon nodes
  • Completed
    June 25, 2024 at 4:00 AM

    Maintenance has completed successfully.

  • In progress
    June 24, 2024 at 4:01 PM

    Maintenance is now in progress.

  • Planned
    June 24, 2024 at 4:01 AM

    MGHPCC will be performing power upgrades on Pod 8A to increase density and allow more nodes to be added to that pod's rows. As with the May 13th work, this means we will be idling half of the nodes in 8A on two dates: June 17th and June 24th.

    These are all-day events, meaning the nodes in question will be unavailable for the full 24 hours of each date. This is being accomplished via reservations: no jobs will be canceled, but the nodes will be drained, and users may notice that their jobs pend longer than normal as the scheduler idles these nodes.

    Where possible, please use or include other partitions in your job scripts (see the example batch script after the partition list below) and plan accordingly for any new or long-running jobs during that period: https://docs.rc.fas.harvard.edu/kb/running-jobs/#Slurm_partitions

    This affects the Cannon cluster. FASSE is not affected.

    Impacted partitions are:

    arguelles_delgado_gpu
    bigmem_intermediate
    bigmem
    blackhole_gpu
    eddy
    enos
    gershman
    gpu
    hejazi
    hernquist_ice
    hoekstra
    hsph
    huce_ice
    iaifi_gpu
    iaifi_gpu_priority
    iaifi_gpu_requeue
    intermediate
    itc_gpu
    itc_gpu_requeue
    joonholee
    jshapiro
    jshapiro_priority
    jshapiro_sapphire
    kempner
    kempner_dev
    kempner_h100
    kempner_requeue
    kempner_reservation
    kovac
    kozinsky
    kozinsky_gpu
    kozinsky_priority
    kozinsky_requeue
    murphy_ice
    ortegahernandez_ice
    sapphire
    seas_compute
    seas_gpu
    siag
    siag_combo
    siag_gpu
    sur
    test
    yao
    yao_priority
    zhuang
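
    Example batch script, as referenced above. This is a sketch rather than an official FASRC template: the job name, resource requests, and executable are placeholders, and the partition pairing (sapphire from the impacted list plus shared as an example alternative) should be replaced with partitions your group actually has access to. Slurm accepts a comma-separated list for --partition and will run the job on whichever listed partition can start it first, which can reduce pending time while the Pod 8A nodes are idled:

    #!/bin/bash
    #SBATCH --job-name=example_job          # placeholder job name
    #SBATCH --partition=sapphire,shared     # impacted partition plus an alternative; Slurm uses whichever starts the job sooner
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4
    #SBATCH --mem=8G
    #SBATCH --time=08:00:00                 # keep requests within the limits of every listed partition
    #SBATCH --output=%x_%j.out              # log file named after job name and job ID

    # Replace with your actual application command
    srun ./my_program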
