Calamari's WindowsService_BeforePostDeploy.ps1 script freezing on Get-WMIObject() call
Hey guys,
We're seeing the issue in the attached screenshot happen a lot on our deployments lately. Basically, the deployment freezes up on any step that's using the built-in Windows service install feature, right between configuring and starting the service. It can take anywhere from 30 mins to an hour or more to complete -- normally the entire deployment takes <5 mins!
Looking at the source for the script (https://github.com/OctopusDeploy/Calamari/blob/master/source/Calamari/Scripts/Octopus.Features.WindowsService_BeforePostDeploy.ps1#L145-L146), there are only two lines that run between those two outputs:
$wmiServiceName = $serviceName -replace "'", "\'"
$status = Get-WMIObject win32_service -filter ("name='" + $wmiServiceName + "'") -computer "." | select -expand startMode
Googled around a bit, and it seems that these kinds of WMI queries can freeze up like this quite often (for whatever reason). The recommended approach is usually to wrap the call in a task with a timeout (since that query doesn't support a timeout directly).
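For illustration, the timeout approach could look roughly like this in Windows PowerShell. This is only a sketch, not the Calamari implementation: the function name Get-ServiceStartModeWithTimeout and the 30-second default are made up for the example, and it uses a background job (Start-Job / Wait-Job -Timeout) around the same two lines quoted above.

```powershell
# Hypothetical helper, not part of Calamari: runs the start-mode WMI query
# in a background job and gives up after $TimeoutSeconds (assumed default).
function Get-ServiceStartModeWithTimeout {
    param(
        [Parameter(Mandatory = $true)][string]$ServiceName,
        [int]$TimeoutSeconds = 30
    )

    $job = Start-Job -ArgumentList $ServiceName -ScriptBlock {
        param($name)
        # Same escaping and query as the built-in script.
        $wmiName = $name -replace "'", "\'"
        Get-WmiObject win32_service -Filter "name='$wmiName'" |
            Select-Object -ExpandProperty StartMode
    }

    try {
        # Wait-Job returns nothing if the job is still running at the timeout.
        if (-not (Wait-Job -Job $job -Timeout $TimeoutSeconds)) {
            throw "WMI query for '$ServiceName' did not finish within $TimeoutSeconds seconds."
        }
        Receive-Job -Job $job
    }
    finally {
        # -Force also removes a job that is still running.
        Remove-Job -Job $job -Force
    }
}
```

Usage would be something like `$status = Get-ServiceStartModeWithTimeout -ServiceName $serviceName`. Each call starts a separate PowerShell process, so it adds a second or so of overhead, and a job stuck inside WMI may not always shut down cleanly.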
Has anyone experienced similar issues, or have any recommendations for fixing this? Right now it's causing major heartburn getting bugfixes from DEV --> QA, so our devops team is looking to create a custom script to replace the built-in script... but we'd prefer to keep using the built-in features as much as possible (they're very easy to configure!).
Thanks,
-Adam
Attachment: Octopus_freeze.png (46.7 KB)
Support Staff 1 Posted by Robert Wagner on 20 Apr, 2017 05:12 AM
Hi Adam,
Thanks for getting in touch. When this happens again, would you be able to log onto the machine and run
Get-WMIObject win32_service
and see whether that hangs too (to determine whether the error is random or tied to a point in time)? If it does hang, could you also run
Get-Service
Perhaps the output of those will show where it is hanging (i.e. it will print everything up to the service that is causing the hang).
I'm happy to look into adding a timeout to that call. I'm thinking 5 minutes?
In the meantime, you can update the script yourself, name it
Octopus.Features.WindowsService_BeforePostDeploy.ps1
and add it to the root of the package. Octopus will then use that script instead of the built-in script.
Robert W
2 Posted by adam_schueller on 21 Apr, 2017 04:07 PM
Thanks Robert! Following your advice, it seems to be a systemic problem on that box -- every time I run "Get-WMIObject win32_service", it hangs right after the FDResPub service. The command finishes in ~1s on other machines.
Looking at the list of services using Get-Service shows that the next service in line is "FontCache" (though I did notice the WMI query doesn't match the output of Get-Service exactly). Not sure if that means anything -- I checked the WMI repository with "winmgmt /verifyrepository" and it comes back "WMI repository is consistent".
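One way to narrow this down further might be to query each service individually through WMI with a short time limit and note which ones never return. The sketch below is an assumption-laden diagnostic, not something from Octopus: the name Find-HangingWmiService and the 10-second limit are invented for the example, and it is not certain that a filtered per-service query hangs in the same place as the unfiltered one.

```powershell
# Hypothetical diagnostic helper; not part of Octopus or Calamari.
# Queries every service name reported by Get-Service through WMI, one
# background job per service, and reports which queries timed out.
function Find-HangingWmiService {
    param([int]$TimeoutSeconds = 10)  # assumed per-service limit

    foreach ($svc in Get-Service) {
        $job = Start-Job -ArgumentList $svc.Name -ScriptBlock {
            param($name)
            $wmiName = $name -replace "'", "\'"
            Get-WmiObject win32_service -Filter "name='$wmiName'" | Out-Null
        }
        try {
            # $finished is empty when the job is still running at the timeout.
            $finished = Wait-Job -Job $job -Timeout $TimeoutSeconds
            [pscustomobject]@{
                Service  = $svc.Name
                TimedOut = -not $finished
            }
        }
        finally {
            Remove-Job -Job $job -Force
        }
    }
}
```

Running `Find-HangingWmiService | Where-Object { $_.TimedOut }` would then list only the services whose query did not come back. Starting one job per service is slow, so expect a full pass to take a few minutes.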
It hangs pretty much forever (I've been waiting for 4+ hours, still stuck). Seems to be same problem as this post: http://www.muscetta.com/2009/05/27/get-wmicustom/
That's from 2009, so clearly not a new issue with WMI in general.
I ran WMIDiag and the output was all good: "(0) ** SUCCESS: WMIDiag determined that WMI works CORRECTLY.", so not sure what our issue is.
At this point, we clearly need to address the WMI issues on our DEV box. A timeout on your end would just mask the issue we're experiencing... and even a 5 min timeout would essentially double the deploy time.
Because we always auto-start, we're considering the option of a custom script, but I want to see if we can solve the WMI issue before adding anything to the source repo.
Thanks again for the quick response!
-Adam
Support Staff 3 Posted by Robert Wagner on 23 Apr, 2017 09:51 PM
Hi Adam,
Thanks for the update, hope you get it sorted.
Robert W
4 Posted by adam_schueller on 27 Apr, 2017 11:39 PM
Hey Robert,
Final outcome: we managed to fix the WMI issues on the box. Not entirely sure what solved it -- we rebuilt the WMI repository, tweaked some other settings, and finally rebooted the machine (of course) and the query started working again.
With the 'Get-WMIObject win32_service' call working as expected, the deployments are back to their snappy selves. :)
In the future we'll know to check the health of WMI on the target boxes when we see this behavior again.
I appreciate the help; we've got some new debugging tools under our belts now.
Thanks,
-Adam
Paul Stovell closed this discussion on 02 Aug, 2017 09:37 AM.