Give us back lightning-fast Updates
Consider running the entire Component-Based Servicing (CBS) infrastructure from RAM, where available and as far as possible.
I've done extensive analysis and testing of what CSI is actually doing. While it is no doubt much more secure and reliable than, for example, WinXP updates were, it has a significant drawback, one that hurts even more in virtualization.

Let me ask you: what do you commonly have enough of in virtualization? CPU? Yes. Memory? Usually, also yes. Network? Usually, yes. Disk I/O? NO. Disk I/O is by far the most common bottleneck in the virtualization environments I've come across. So while it would be easy to scale up the number of VMs per host based on CPU/memory/network, you usually get bottlenecked by the limited disk I/O available. Not even so much under normal running conditions, but especially when doing servicing, i.e. Windows Update/DISM.

The current CSI does a very bad job here, simply because everything seems to be completely disk-based. Every stage (detection, download, staging, comparing, installing, pre-shutdown and post-startup "Working on updates", etc.) is disk-based, and as such heavily bound by the total disk I/O available on a host (or SOFS, etc.). This greatly impairs the otherwise possible upscaling of the number of VMs per host. I've calculated that deploying 1 GB of updates generates over 11 GB of total disk churn, and that is for 1 (one!) VM. Even excluding the downloads you're still, generally speaking, talking about a factor of 5-6 in disk churn! This could largely be avoided if most of CSI ran from RAM.

I've developed a script which uses a dynamic ramdisk in combination with a lot of folder redirection (reparse points) to get most of CSI running from RAM, loading state into the ramdisk at startup and persisting it back to disk at shutdown. With that I have been able to reduce disk churn by a factor of 3 and deployment time from 3 hours to 2 hours for 204 updates so far!
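To put the scaling problem in numbers: here is a minimal sketch of the aggregate churn, using the per-VM figures estimated above. The function name and the linear-scaling assumption are mine for illustration; real contention on a shared disk subsystem would likely make things worse than linear.

```python
# Rough model of aggregate disk churn when servicing many VMs at once.
# churn_factor=11 is the post's estimate of total disk churn per GB of
# updates for a single VM (all CSI stages included).

def total_churn_gb(update_size_gb, vm_count, churn_factor=11):
    """Total GB of disk traffic hitting the host's disk subsystem,
    assuming churn simply scales linearly with the number of VMs."""
    return update_size_gb * churn_factor * vm_count

print(total_churn_gb(1, 1))    # 1 VM:    11 GB of churn for 1 GB of updates
print(total_churn_gb(1, 100))  # 100 VMs: 1100 GB hitting the same disks
```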
The script is an ongoing experiment, as I keep finding more things I can redirect to the ramdisk to speed up the whole process, reducing disk churn along the way. Since it's a dynamic ramdisk, using it in combination with Dynamic Memory means a VM will grow by around 4 GB during the process; once it's done, everything is returned to the guest OS by the dynamic ramdisk, which in turn returns it to the host about 5 minutes later for use elsewhere.

The current new way of doing major upgrades is, in my opinion, also just a big miss. How about "servicing" 50, 100 or even more systems with that simultaneously? What kind of disk subsystem would you need just to keep up with that? In my testing so far I've seen major improvements just by doing all these stages from RAM (a ramdisk). Yes, that increases the chance of corruption, no doubt, but then why not make the whole thing one huge "major" transaction? If it fails somewhere, the whole thing gets undone/ignored/discarded and has to be started from scratch (in RAM); but if it succeeds, it would enormously speed up everything involved, including installing roles/features (DISM), not to mention the IOPS needed and, on tiered storage or plain SSDs, a lot of unnecessary wear.

Running a Replica? Calculate your savings from this times 2. Oh, and what about backup? It certainly is a best practice, right? Calculate times 3. So, roughly, with Replica and backups, installing 1 GB of updates on 1 VM costs you 1 * (factor of 5) * 3 = 15 GB. Using CBT, are you? Okay, then 1 * (factor of 5) * 2 + 1 = 11 GB. Doing most of it in RAM would mean deploying 1 GB of updates only writes the 1 GB end result, so 1 * 3 = 3 GB. Per VM and per large update run you thus save roughly 12 GB (8 GB with CBT) of wear and tear on your machines (the Replica network gets considerably less traffic as well, of course). Now please recount with 100 VMs to be serviced/replicated/backed up!
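The savings arithmetic above can be sketched as follows. This is a rough model built only from the factors quoted in the post (local churn factor ~5 excluding downloads, Replica doubling the writes, backup adding another full copy unless CBT copies only the net change); the function and parameter names are my own illustration, not part of any real servicing API.

```python
# Per-VM disk churn for one update run, per the post's rough factors.

def churn_gb(update_gb, local_factor, replica=True, backup=True, cbt=False):
    local = update_gb * local_factor        # churn on the primary host
    total = local
    if replica:
        total += local                      # Replica re-writes the churn remotely
    if backup:
        # A CBT-aware backup copies only the net change (~update size);
        # a plain backup copies the full churn again.
        total += update_gb if cbt else local
    return total

print(churn_gb(1, 5))            # disk-based, Replica + full backup: 15 GB
print(churn_gb(1, 5, cbt=True))  # same but with a CBT backup:        11 GB
print(churn_gb(1, 1))            # RAM-based, only end result written: 3 GB
```

Scaling any of these per-VM numbers by 100 VMs makes the gap between the disk-based and RAM-based approaches rather stark.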
Karl Wester-Ebbinghaus (@tweet_alqamar) commented
I see your point, but we cannot compare the CBS/DISM-based patching we use today, which usually allows rolling back any update without caring about dependencies, with the plain "extract and replace all files" updates we had in Windows XP.
I agree, however, that DISM / the TrustedInstaller service is comparably slow even in my small test environment using 2 EVO 960 NVMe drives in RAID 0.
The problem, IMHO, is that TrustedInstaller is not optimized for virtualized environments: it causes very high CPU load but is very limited in parallel I/O.
If the DISM process could be optimized for better multi-core support and better disk usage, the update process might well be faster overall.