Oct 06 2008
 

No matter how careful you are. No matter how many times you’ve checked your parameters and state conditions. No matter how many times you tested it out of production.

It happens.

Some misconfigured rule – or even an event that happens much differently in a production environment than in test – begins firing off alerts.  Maybe you don’t notice it right away – perhaps you haven’t set up notifications for this particular rule.

But then, one day you start getting emails from your RMS with scary event IDs like 2115 (Data source not receiving a response), 25017 (Backlogged event processing) and 29202 (Inconsistent database state).

So you decide to investigate, and open up the Operations Console.

Only… wow, it’s running a lot slower than it normally does. Insanely slow.
You click on Monitoring > Active Alerts – and then wait. And wait some more, as our once friendly green progress bar seems to start taunting you. So you lock the desktop and go chat up that new girl they hired. Wow, she’s pretty amazing, right? Funny and smart as a whip, too.

Feeling happy and content after working your suave IT skills on her, you literally float back to your desk and unlock your desktop. Wasn’t there something bothering you before? Oh well, must have not been all that important. You peek up from your cube and catch a glimpse at her, then move those eyes down and see your still open Operations Console. The evil green bar still chugging away. But then you also see why…

A rogue rule quickly causes the alert count to soar to over 140,000

And your nemesis, the green progress bar, still keeps going. That number is rising faster than your blood pressure right now.

Must be a bug, eh? Ok, well, we’ll just check it via SQL to be sure – so you open SQL Studio and run

And then you’re informed, without any of the gentleness of a WWII nurse as depicted in the movies, that you have a lot of open alerts.

Over 250,000 alerts according to SQL

Wow! You better fix this!
And you better do it on the RMS, because it’s taking forever from your desktop.
You already have a general idea of which rule did it – that active alerts panel should be filled with it. So your first stop is to get back to the authoring panel and either disable that rule or set up some proper alert suppression. Then we just have to deal with cleanup.

You turn, as always, to our friend PowerShell to help us out. Surely the easiest and most obvious solution to this problem is to run

Then just wait for the nightly alert grooming to happen, or nudge it along with a SQL exec p_AlertGrooming

Only, when you try to do it, you get an OutOfMemory exception.

Out of memory!

Now what to do?! The console is cripplingly slow – if you had to close the alerts that way your company would have gone bankrupt during the Dot Com Re-Burst of 2799! And when you try with PowerShell, you’ve run out of memory!

That’s where I was, until I talked to an unnamed friend((If you want to be named, just let me know. Better to err on the side of caution and all that)) from MS who really helped me out. That, combined with hindsight, allows me to help you out as well!

How To Clean Up an Alert Storm

  1. Try the console. We’re going to assume it’s running slower than <Insert joke about large celebrities in the 1980s doing something they’re known for>, so we’ll move on.
  2. Try the Command Shell. $alerts = Get-Alert |? {($_.ResolutionState -ne 255) -and ($_.Name -match "Rule name if you know the naughty one")}  – Running out of memory still?
  3. Try the same command, only instead of piping it to Where-Object, use the built-in -Criteria filter.
    $alerts = Get-Alert -Criteria 'ResolutionState = 0 AND Name LIKE ''%Rule Name%'''
  4. Still OOM? Try running both of those commands on the RMS, or another management server. Pick the one with the most memory, and hope for the best.
  5. Still receiving Out of Memory exceptions? Let’s stop using the OS to manage our memory. Open RegEdit and navigate to HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Config Service – change the value of Should Manage Memory from False to True. Stop and restart the config service.
  6. Now try your command again. It will take some time, but it will complete. Now let’s try running the whole script to fix this:

    (Alternatively, you can use the Resolve-Alert cmdlet, but from testing it’s not quite fast enough to keep up with the next step)
  7. Now when you ran step 6, it probably gave you a lot of errors when attempting to update the alert. That’s because there’s a small window of freshness to your alert object, and if you don’t update it within that window it becomes stale and unable to be used. To fix that, change the ForEach to look like this:
    ForEach ($alert in $alerts) {
        $freshAlert = Get-Alert $alert.Id
        $freshAlert.ResolutionState = 255
        $freshAlert.Update("")
    }
    That will grab a fresh version of that alert and update it.
  8. But what if you have thousands upon thousands of alerts? The above solutions could conceivably take days to run. Don’t worry, there’s a way around that, too.
    Before I show you, please note that this METHOD IS NOT SUPPORTED BY MICROSOFT and use of this method could possibly BLACKLIST YOUR OpsMgr INSTALL. It is the answer occasionally given out, though, much to the dismay of the product group, so use that information how you’d like.
  9. Connect to your operations manager database and run the following update. This one updates every rule, but you could narrow it down with an additional AND WHERE RuleName = “My Rule Name”
  10. When that’s completed, you’ll need to update the TimeResolved via:

    Make TimeResolved be some day in the past so it will groom them out.
  11. Either wait overnight until the grooming jobs kick off or run exec p_AlertGrooming, as mentioned earlier.
  12. You’re done. Now don’t do it again!
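The fresh-alert loop from step 7 can be wrapped up so that one stale or missing alert doesn't stop the whole cleanup. This is only a sketch and not the script from the original post: Close-StormAlert and its parameters are made-up names, and the re-fetch is passed in as a script block so the loop can be tried without a management group. In the Command Shell that script block would presumably wrap Get-Alert -Id.

```powershell
# Hypothetical helper (not an OpsMgr cmdlet): closes each alert by re-fetching
# a fresh copy first, so a stale object never reaches Update().
function Close-StormAlert {
    param(
        [object[]]    $Alerts,          # alerts returned by an earlier Get-Alert
        [scriptblock] $GetFreshAlert,   # takes an alert Id, returns a fresh alert
        [string]      $Comment = 'Closed during alert storm cleanup'
    )

    $closed = 0
    $failed = 0
    foreach ($alert in $Alerts) {
        try {
            $fresh = & $GetFreshAlert $alert.Id
            $fresh.ResolutionState = 255      # 255 = Closed
            [void]$fresh.Update($Comment)
            $closed++
        }
        catch {
            # Keep going; one bad alert should not stop the cleanup.
            $failed++
        }
    }

    # Report how the pass went.
    New-Object PSObject -Property @{ Closed = $closed; Failed = $failed }
}

# Assumed usage in the Command Shell (the rule name is a placeholder):
# $alerts = Get-Alert -Criteria 'ResolutionState = 0 AND Name LIKE ''%Rule Name%'''
# Close-StormAlert -Alerts $alerts -GetFreshAlert { param($id) Get-Alert -Id $id }
```

The Closed and Failed counts give you a rough idea of whether a second pass is worth running.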

Sep 24 2008
 

I’ve talked about this before, calling it the hidden ‘nag mode’ inside of SCOM, but I really need to find out whether it’s intended or not.

Please see the following bug report I filed @Connect.

Essentially, if you grab an alert object via Get-Alert, then call the Update method, one of two things will happen depending on what parameters you fiddled with.

Open up the command console and grab an alert, something like $oneAlert = Get-Alert | Select -First 1
Now, change something in that alert, such as the resolution state ($oneAlert.ResolutionState = 111), then call $oneAlert.Update("") (or comment it with $oneAlert.Update("Changed resolution state")).

What happens?

As you’d expect, not much: just the alert resolution state was changed. Now grab another alert and call the update method alone ($oneAlert.Update("")) or with only a comment ($oneAlert.Update("Testing an issue")).

What happens?

Whatever notification channel is attached to that alert – usually email – will now fire again. So is it a bug, or not? If it isn’t a bug, then that’s excellent – we have more tools at our disposal and can now easily add that nag mode. If it’s a bug, that means it will be fixed, and as such shouldn’t be used.

There are additional bugs I need to file, regarding how a lot of the OpsMgr cmdlets, while stating they support the common parameters, actually don’t. And how the filter parameter should follow basic syntax and use “-Filter” instead of “-Criteria”. But that’s for another post.

PS: It’s my birthday on Friday. Perhaps you’d like to buy something from my wish list? Or at least enter the contest. The next 2 will have considerably better prizes, I promise!

Sep 18 2008
 

I’ve been working on several things all in parallel, and I’ll give you a little insight into them all.

But first, don’t forget to enter the Pavleck.NET Contest for your chance to win one of two Amazon.com gift cards. We’re currently at a paltry 10 entries, so your chances are pretty good to say the least. If I get a good turnout for this, I’ll make this a regular event – I already have next month’s prize ready, too. A rare copy of System Center Operations Manager 2007 Unleashed – signed by all the authors. Not many of these exist, but I have one for you!

What I’m working on:

  • Writing a small MOM 2005 to SCOM 2007 migration report script. Examines your agents from MOM and compares them with what’s installed in SCOM.
  • Attempting to write a small service that will handle custom alert notifications by matching alert names to notification groups through the SDK. A simple XML file is used to create the configuration, and it’s as easy to set up as this:
  • Work on SCOPE continues, including a partial command list – feel free to add to it.
  • As does an article (With a nifty flow chart!) of the steps to take to handle an alert storm, from the console all the way to SQL – get that system back in action!
Sep 12 2008
 

Edit: 09/13/2008 – On the advice of Pete Zerger, updated script to include a throttling mechanism to prevent an overload if an alert storm occurs. Also changed things around to make it a more generic ‘run remote executable’ instead of run remote sound.

A question was recently asked on the MOM Mailing List over at myITforum.com.

That question was, quite to the point:
How to create a audible alert? I like to create one for the critical alerts..

I’d been working earlier with a script that would go out and disable the run time tracing, stop it, then delete the log files. So I already knew what would work – a simple PowerShell script that uses WMI’s process create method on a remote machine.

A caveat lector before I continue; while this solution will technically work, I haven’t tested it formally. Additionally, you’ll need to contend with permission issues that arise as well. If you’re running the OpsMgr services under a named account, you’ll need to give that same account local administrator access on whichever machine you plan to run this call against. If you’re using ‘Local System’ you’ll have to either add the RMS\Local System account to the remote machine’s admin group or embed credentials inside the WMI call((Be careful when doing this. I haven’t included directions for that because it’s just a nightmare waiting to happen. I can give you a jumping off point though.))

First, the script. It’s small and basic. It wants to know the machine you want to run the command on, the command, and because this is a sound player, the path to the WAV file. It then creates the process via WMI, and decodes the return code. If it’s 0, everything is fine. If it’s anything else, the process creation failed and it writes an event to the Operations Manager event log, which you can create an additional rule to look for.
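For a rough idea of what the WMI part looks like, here is a minimal sketch. It is not the downloadable script: the two function names are made up for illustration, the throttling and event log pieces are left out, and it assumes Windows PowerShell, where the [wmiclass] accelerator is available. The return codes are the ones documented for the Create method of Win32_Process.

```powershell
# Maps the documented Win32_Process.Create return codes to readable text.
function Get-ProcessCreateMessage {
    param([int] $ReturnValue)

    switch ($ReturnValue) {
        0       { 'Successful completion' }
        2       { 'Access denied' }
        3       { 'Insufficient privilege' }
        8       { 'Unknown failure' }
        9       { 'Path not found' }
        21      { 'Invalid parameter' }
        default { "Unrecognized return value $ReturnValue" }
    }
}

# Hypothetical wrapper: starts a command line on a remote machine through WMI
# and returns the decoded result of the Create call.
function Start-RemoteCommand {
    param([string] $ComputerName, [string] $CommandLine)

    $process = [wmiclass]"\\$ComputerName\root\cimv2:Win32_Process"
    $result  = $process.Create($CommandLine)
    Get-ProcessCreateMessage -ReturnValue $result.ReturnValue
}

# Assumed usage; the machine name, player and WAV path are placeholders:
# Start-RemoteCommand -ComputerName 'NOC-PC01' -CommandLine 'C:\Tools\player.exe C:\Sounds\alert.wav'
```

Anything other than "Successful completion" is what you would want to log and alert on.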

Download SCOM-RunRemoteExecutable.ps1

To implement this, open the Operations Console and go to Administration > Settings > Notification
Click on the Command tab, then click on add. Fill it out as you normally would:

Then click on OK, and you’ll see it with the rest of your commands:

Now to finish it up you’ll need to create a new notification recipient. Right-click on Notifications and select new recipient.

Make the display name something that designates that it runs a command; I used “Sound Audible Alert”. And because the NOC isn’t manned 24/7, I limited the notification time to weekdays from 8am to 6pm. You can also adjust this from the devices tab, but I’m not going to include an email or other devices, so I prefer to set it in the general tab; this way it’s obvious even with a casual glance what the settings are.

After that, click on the “Notification Devices” tab, then click “Add”.
In the resulting popup, select our new notification command and enter anything for the delivery address – I used NA, because for this particular command we don’t require any additional information – but OpsMgr still needs something in there. Hit next, keep the schedule at always unless you’re adding additional channels, next again, name the device – I used “Send Audible Alert”

Click OK, and you’re set. Treat it like any other notification recipient – either create a new rule just for this, or edit an existing subscription and add our new recipient to it.

As you can see, using PowerShell inside of Operations Manager makes it very flexible and powerful. We can run all manner of things in response to alerts, from playing a simple sound file all the way up to initiating disaster recovery scenarios and intense system diagnostics – both things which I’ll be showing you later on as we explore the Notification Command Channel together.

Sep 10 2008
 

When it comes to notifications, we have many options – except one that people have asked about, a nag mode. Something that will re-send an email after a certain amount of time to make sure it’s taken care of.

Well, it does exist in OpsMgr.

Either intentionally or unintentionally as a bug, if you call the Update method on an alert without changing any criteria, the notification bound to the alert will re-fire. This will happen whether you add a comment with the update (Update("Updating the alert")) or not (Update("")).

To enable this secret nag-mode, it’s as simple as writing a PowerShell script that runs every X hours. In that script you’ll just need to do a Get-Alert with the criteria you’re looking for – in the example I’m just going to have it return all alerts older than 4 hours, and update them.

It’s very simple though – how simple? Like this:
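The original listing isn't reproduced here, so what follows is only a sketch of the idea. Select-NagAlert is a made-up helper name, and it assumes that the alert objects expose ResolutionState and TimeRaised properties and that TimeRaised is in UTC. The filtering is kept separate from the OpsMgr calls so it can be tried against plain objects first.

```powershell
# Hypothetical helper: picks open alerts raised more than $OlderThanHours ago.
function Select-NagAlert {
    param(
        [object[]] $Alerts,
        [int]      $OlderThanHours = 4,
        [datetime] $Now = [datetime]::UtcNow   # assumes TimeRaised is stored in UTC
    )

    $cutoff = $Now.AddHours(-$OlderThanHours)
    foreach ($alert in $Alerts) {
        # 255 = Closed; anything else still needs attention.
        if ($alert.ResolutionState -ne 255 -and $alert.TimeRaised -lt $cutoff) {
            $alert
        }
    }
}

# Assumed usage in the Command Shell, run from a scheduled task every X hours:
# Select-NagAlert -Alerts (Get-Alert -Criteria 'ResolutionState = 0') |
#     ForEach-Object { $_.Update('') }   # nothing changed, so the notification re-fires
```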

You can expand this as much as you’d like. Match against NetBiosComputerName to only nag for those critical core servers, match it against the monitoring object to ensure critical monitors are being addressed. Multiple management groups? Match against that. You see where I’m going with this. In fact, you can find out everything you can match against by just running Get-Alert | select -first 1 – there’s all the fields available.

Sep 05 2008
 

I’ve never had a ‘proper’ test/development environment for System Center products. I’ve used both client systems and VMs I’d spin up through VirtualBox. That will be changing.

I placed an order a few days ago for a new server – featuring a 2.8GHz quad-core Xeon and 12GB of RAM, it will begin a brand new environment – and I’ll be screencasting all of the most relevant parts of it.

We’ll cover sizing, install, deployment, security – and move on to extending OpsMgr by utilizing the SDK service, designing custom reports – and a lot more.

In the meantime, while I painfully await the arrival of a shiny box from DHL, I’m working on the framework for 2 new side projects – one of which I think you’ll all be quite happy with; the much awaited SCOPE – System Center OpsMgr PowerShell Extensions – a collaborative operation between me, Marco Shaw, Cameron Fuller, Pete Zerger and – you, possibly. We’re at the very early stages of SCOPE, and could definitely use people now and down the road – especially C# programmers and those familiar with creating PowerShell snapins. If you’d like to help, send me an email (jeremy@pavleck.net) and let me know what you can do.

Aug 05 2008
 

It’s been a while. I’ve actually been terribly busy at my current client, implementing and fine-tuning my Alert Resolution State notification workflow. I’m currently expanding it to hold a few dozen different teams, creating PropertyBags to send performance data (number of alerts changed per category, total time script ran, total alerts, etc.), and adding more robust failure checks – that way it can also alert if it fails for any reason.

I’ve also been playing around with PowerShell and 37 year old code.

Super Star Trek - in PowerShell!

Yes, that’s Super Star Trek. Just a little time-waster I work on while I’m mulling over a problem or two. And huge thanks to Jaykul of course, for all of his Powershell knowledge. I can’t do it without him and the crew in #powershell!

Are you on Twitter? If so, be sure to follow OpsMgr to stay on top of the most recent SCOM posts out there! And while you’re at it, feel free to follow me as well – I can always use more friends.

Until next time.

Jul 07 2008
 

By default, you can’t really specify a failover management server in OpsMgr. Why? Not really sure, though I think it’s a ploy to ensure you set up the OpsMgr Active Directory Integration, which will handle this for you.

No fret though, we can still do it – it’ll just take a little bit of actual effort.

First, we need to define our Primary and Failover management servers. This isn’t something you can just programmatically grab, so you’ll need to know the names yourself.

In my $PROFILE, I’ve set them to be defined to 2 variables with the following:

Now that we have that set, it’s simple to do the rest. First, let’s grab all of the servers that don’t have a failover management server set.

What the above does is call the GetFailoverManagementServers() method on each agent. If they have a failover, it will return data and thus $True. If there aren’t any failovers, it will return nothing – which is the same as $False. So we look for all the ones that don’t return anything.

If you’re curious, you can see just how many servers are missing failovers with

- in my case it was 63.

Now, we just run a quick snippet that adds the failover server to the agent:

That will crunch away while it does its thing; we’re redirecting output to $null so we don’t have to see agents scrolling over and over. When it returns you to a prompt, you’re done. If you’d like to verify that you did indeed set all of the agents to have a failover, we can check real quick:

And that’s that. All of your agents have a primary and failover server.
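Since the snippets themselves aren't shown above, here is a minimal sketch of the same flow. Select-AgentWithoutFailover is a made-up helper built around the GetFailoverManagementServers() call described earlier, and the server names in the usage comments are placeholders. The Set-ManagementServer parameters are the ones I would expect in the OpsMgr 2007 Command Shell, so check them with Get-Help before relying on this.

```powershell
# Hypothetical helper: returns the agents whose failover list is empty.
function Select-AgentWithoutFailover {
    param([object[]] $Agents)

    foreach ($agent in $Agents) {
        # @() guards against a single-item (non-collection) return value.
        if (@($agent.GetFailoverManagementServers()).Count -eq 0) {
            $agent
        }
    }
}

# Assumed usage in the Command Shell (server names are placeholders):
# $primary  = Get-ManagementServer | Where-Object { $_.Name -eq 'ms01.example.com' }
# $failover = Get-ManagementServer | Where-Object { $_.Name -eq 'ms02.example.com' }
# $missing  = @(Select-AgentWithoutFailover -Agents (Get-Agent))
# $missing.Count        # how many agents have no failover server
# foreach ($agent in $missing) {
#     Set-ManagementServer -AgentManagedComputer $agent -PrimaryManagementServer $primary -FailoverServer $failover | Out-Null
# }
```

Running the helper again afterwards and getting nothing back is the quick verification.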

Screen shot of SCOM Command Shell showing steps to setup failover agents

But wait, you have a lot of remotely managed devices too? Monitoring SNMP on a bunch of different servers – what happens for that?

Well, we can’t set up a failover agent (from what I’ve seen; if I’m wrong, please let me know). But we can proactively write a script that will change the proxy agent on the devices, and run it as needed.

This was written in a response to this query on the newsgroups, and is only a cursory look into it. There may be other ways of doing this – and I’d love to hear it. As it stands, I’m not sure how to set them back to a management server as the monitor.

Firstly, we’ll have to pick an agent managed computer to use as the new proxy agent. You can’t use a management server for this, because they aren’t “Agent Managed” and you can’t use Set-ManagementServer because the devices aren’t “Remote Managed Computers”.  I have a separate agent-managed server on my network I call “Timex” because it acts like a watcher node. So I’ll go ahead and use him.

Then gather a list of our current remotely managed devices

Now just loop through it, setting the device to use the proxy agent we just instantiated:

That will loop through things changing the proxy server that it uses. When it’s done, we can verify it by running:

If it outputs nothing, then they’ve all been changed. Simple as that!

SCOM: Setting the proxy agent for a device via command shell

Jul 03 2008
 

Edit 09/10/2008: Fixed the script, fixed the reference in point 14.

I haven’t seen this solution offered as a way to send more customized alerts, and am fairly excited about it. Some of the previous solutions involve using the command shell to create an alert notification. This is fine, except if you open the subscription in the GUI – once you’ve done that, you’ve essentially undone all that work and created a ‘catch all’ that sends an alert on any event. Why? Well, the GUI itself isn’t designed for the custom settings that can be done in PowerShell. This makes it fairly difficult to add people or change the alert – not acceptable to me.

After messing around with authoring console and creating classes based on event viewer errors and other equally exotic methods I came upon something that works wonderfully. The catch? You can only create 254 rules this way.

What am I talking about? Some powershell scripts and the alert resolution states!

SCOM Administration MG Settings - Alerts

By default, there are only 2 states defined – 0 for New, and 255 for Closed. They are always there, and can not be deleted. This leaves 1 – 254 as user definable states. We can use these to make one-to-one events.

Let me start off by saying that this isn’t an ideal solution, but it is the most readable and elegant solution for this particular problem. You probably shouldn’t do this on a single rule basis, but target it more at a wildcard match. You do have a naming convention for your rules and monitors, right? If not, this is the perfect reason to get one. I’ll typically use a convention of <Product Type>-<Product>-<Version (If multiple)>-<Rule>. So if I had a rule targeted at exchange, I’d have a rule similar to “EMAIL – Exchange – Exchange 2007 – Search for ‘Jeremy is fired’ in execs mail”. Then when I’m using an exotic config to send an alert, such as this one, I can better fine tune alerts.

Remember, the more you move away from the “Out of box” functionality with OpsMgr, the more you should be documenting. Or even better, a wiki. Just make a reference to the wiki in the description, and people will know exactly what you’re trying to do – that’s for another post though.

Let’s get on with it, shall we?

I’m going to create a situation. I have a custom application which logs to the Application log. There’s one particular event that only one group in the organization cares about – all they want is a notification of this one single event and nothing else. How do we do it?

The Cliff’s Notes version of what we’ll be accomplishing today:

  • Create a custom Resolution State
  • Define a new rule
  • Deploy a PowerShell script to the RMS to update the resolution state of matching alerts
  • Create a notification subscription which responds to our particular state

Now, for the complete steps

  1. First go to Authoring > Management Pack Objects > Rules – Right click and “Create new rule”
  2. Under rule type, select Alert Generating, Event Based, NT Event Log (Alert) and select a management pack to use.
    System Center Operations Manager 2007 Create Rule Wizard
  3. Enter a rule name that is distinctive enough that no other rules will have that same name.  Then enter a description, rule category and choose a target. You can go with the shotgun approach and pick “All Computers” here if you’d like.
    SCOM - Authoring - Create Rule Wizard
  4. Now walk through the rest of the wizard and configure your event log settings – for this test I’m using the Application log, Event ID of 926 and event source of “Pavleck.NET Test”. But you can put whatever you want here ace, it’s up to you.
  5. Configure your alert. It should automatically copy over the rule name as the alert name. The alert name is what we’ll actually be alerting on, so it’s important that you remember what it is, and ensure it’s distinctive enough to not match something that already exists.
    SCOM - Authoring - Create Rule Wizard - Configuring the alert
  6. Now we can work on the other parts of this while our rule is propagating across the environment.
  7. Go to Administration > Settings > Alerts – this is where we’ll define a new Alert Resolution State to use. Click on “New…” and name your state and choose an ID for it, I used 10 in this example.
    SCOM - Alert Resolution States - Adding a new state
  8. Click Apply, then Ok and now we’re done with part 2.
  9. Let’s go ahead and test and see if our rule works, just open a command window and use some EventCreate.exe magic.
  10. Open up the alert console for whatever machine you ran that on and you should see our new alert in there – yay, we did something!
  11. Now we’ll add the magic that changes the alert resolution state. It’s a fairly simple script, and it’s meant to be that way. For simplicity’s sake, we’ll be running this script as a timed response from the RMS. Depending on how your particular environment is set up, you could also run it inside of the rule itself, as an additional response to “Create Alert”. But that only works well if you only plan on doing this sparingly, otherwise it makes more sense to run this from the RMS and add onto the script as needed.
    First, download SCOM-UpdateResolution.ps1 here (Or view it after the jump) and edit the alert name, resolution state and RMS to what matches your environment.
  12. Now we’ll need to go and create a new rule. Rule type is Timed Commands > Execute a command. Give it a name and description. I’ve set the rule category to “Maintenance” as that makes the most sense to me.
  13. For the schedule, I’ve set mine to run every 2 minutes. This means there will be a delay of that much between alerts and notifications, but that’s acceptable to me. Then hit next.
  14. Configure the command line execution settings as shown – remembering to use instead of “&”. I’ve set the timeout to 45 seconds.
  15. Hit create and that’s almost all of it – all we need to do now is to create the alert subscription. Go to administration, right click on Subscriptions and choose “Create new notification subscription”
  16. Step through it like normal, choosing all groups, all classes. When you get to the Alert Criteria page, uncheck “New” and “Closed” and check our new resolution state. If you keep ‘closed’ in there, it will pertain to all alerts that close. That’s one drawback to this method, you won’t get closed alerts.
    Alert Criteria Pane of the Notification Subscription wizard, showing our custom resolution selected
  17. Finish it up as you normally would, then let’s test it! Create a few more test events, and let’s see if it works.

That’s all there is to it. This works, reliably and 100% of the time. It’s extremely flexible and easy to follow for someone just walking into your environment.

By using a single PowerShell script and targeting the RMS computer group, you make sure that you have only one simple script to edit. And by mirroring the files and directory paths to any other management servers in your environment, you keep this method working if you ever need to promote one of them to an RMS.
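To make the moving parts a bit more concrete, here is a minimal sketch of the kind of logic the resolution state script needs. It is not SCOM-UpdateResolution.ps1 itself: Set-CustomResolutionState is a made-up name, the alert name pattern is a placeholder, and the alert objects are assumed to expose Name, ResolutionState and Update() as they do in the Command Shell.

```powershell
# Hypothetical helper: moves matching new alerts into a custom resolution state.
function Set-CustomResolutionState {
    param(
        [object[]] $Alerts,
        [string]   $NamePattern,    # wildcard match against the alert name
        [int]      $State = 10      # the custom state ID chosen in step 7
    )

    # 0 (New) and 255 (Closed) are reserved, so only 1-254 are usable.
    if ($State -lt 1 -or $State -gt 254) {
        throw "Resolution state must be between 1 and 254, got $State"
    }

    $updated = 0
    foreach ($alert in $Alerts) {
        if ($alert.ResolutionState -eq 0 -and $alert.Name -like $NamePattern) {
            $alert.ResolutionState = $State
            [void]$alert.Update("Resolution state set to $State by script")
            $updated++
        }
    }
    $updated    # number of alerts that were changed
}

# Assumed usage in the Command Shell (the name pattern is a placeholder):
# Set-CustomResolutionState -Alerts (Get-Alert -Criteria 'ResolutionState = 0') -NamePattern 'Pavleck.NET Test*'
#
# A test event like the one in step 9 could be raised with something like:
# eventcreate /L APPLICATION /T ERROR /ID 926 /SO "Pavleck.NET Test" /D "Test event"
```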

Jun 23 2008
 

In MOM 2005, virtually everything was a rule. A rule looked for an event in the event viewer, a line in a log file, a return code from a script, etc. and fired off an alert (or did another action). It was essentially ‘dumb’, because it had no idea whether an event it raised was ever fixed. It just fired them off every time it saw it.

Enter OpsMgr 2007. It introduced us to an old concept, the ‘monitor’. The monitor is a multi-state event. It watches for multiple items; something will set a particular item into a failed or degraded mode, and there is a corresponding event that marks it as being healthy again. This is wonderful, as it helps minimize the number of open alerts sitting in your system at any given time. Fewer open alerts means we have more relevant information to look at.

When it comes to core Windows monitors, it works beautifully and 100% of the time. If you cross a memory threshold, an event is created and an alert goes out (If you’ve set it up to alert). When the memory drops below this threshold, then the monitor marks that particular object as being in a Healthy state again and, if you’ve allowed it to, it auto-closes the alert.

When this doesn’t work beautifully and 100% of the time is when you need to rely on 3rd party agents and management packs. I’ll use the HP Management Packs as an example, because that’s what I’ve been facing recently.

The way OpsMgr knows about hardware events that happen on an HP machine is because the HP agents themselves will place an event in the Event Viewer and/or send an SNMP trap about it. Works flawlessly to create an event in SCOM about an unhealthy object. What doesn’t work perfectly is the corresponding event that marks that system as being healthy again.

The reason for this seems to depend on the exact configuration of a server, the version of the HP agents, and the actual event itself. If there is an event, such as a power supply failing, the log is populated and SCOM creates an event saying “Power Supply #1 degraded.”. When that power supply is replaced, it won’t necessarily auto-resolve the event, because instead of seeing “Power Supply #1 Healthy”, the HP agents might instead log “Power Supply (Serial number: FD30401104-P) Inserted into Bay #0”. The monitor isn’t looking for that, and so it isn’t aware that that is the corresponding ‘good’ event, and the event stays open.

So theoretically you could replace a failing piece of hardware, such as a Power Supply, which doesn’t auto-resolve and then in the future have that same PSU die, which won’t cause a new alert and literally leave you ‘powerless’ to know what is going on.

Now, in a normal deployment of OpsMgr this isn’t too large of a concern. There are always eyes on the console or emails being sent. Someone will see it, fix it, then ensure the event is closed.

The current situation I’m in, however, doesn’t work this way. SCOM is being used consoleless to monitor a group of monitoring tools. Essentially it’s here to keep ‘them’ honest, and to ensure there’s another level of defense to protect us and let us know when a failure has occurred.

Because of this, those slight discrepancies in the HP agents and the HP management pack aren’t acceptable. But OpsMgr really doesn’t have a way of being run without anyone paying attention to it – or does it?

It actually does. What I’ve set up at this site is a PowerShell script which runs every 4 hours and resolves all the open HP alerts. The HP Agents themselves will run a self-check every hour or so, and log that “Power Supply #1” is still failed. Because we’ve already cleared that alert, SCOM will pick it up again and re-fire the event, the alert, and all that jazz. In essence, we’ve created a ‘nag’ feature in SCOM.

This is beneficial in our case, because the current setup of OpsMgr where I’m at is mainly there to watch the other monitoring tools. This ‘nag’ lets us know that the problem was either not taken care of, or was not alerted on – thus ‘keeping them honest’.

How we do all this is very simple – the OpsMgr Command Shell has almost everything we need.

We’ll use Get-Alert to bring back a list of all open HP events, and Resolve-Alert to close them, adding a comment that we automated this.

To find the HP alerts, we need to match against the MonitoringObjectFullName property inside the alert. Through trial and error, I noticed that every single HP object began with “HewlettPackard”. So we’ll match against that, picking all alerts that don’t have a resolution state of 255 (Closed).

From there, we cycle through the alert array, passing each one to Resolve-Alert, along with a -comment – in my case I used “Closed by Powershell – see (link) for more details” with a link to the internal Wiki.
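Stripped of the reporting extras, the core of that can be sketched as below. This is an illustration rather than the script from this post: Select-HpOpenAlert is a made-up helper name, and the filtering is kept apart from the OpsMgr cmdlets so it can be checked against plain objects.

```powershell
# Hypothetical helper: picks open alerts raised against HP objects.
function Select-HpOpenAlert {
    param([object[]] $Alerts)

    foreach ($alert in $Alerts) {
        # 255 = Closed; HP objects were observed to start with "HewlettPackard".
        if ($alert.ResolutionState -ne 255 -and
            $alert.MonitoringObjectFullName -like 'HewlettPackard*') {
            $alert
        }
    }
}

# Assumed usage in the Command Shell:
# Select-HpOpenAlert -Alerts (Get-Alert) |
#     Resolve-Alert -Comment 'Closed by Powershell - see (link) for more details'
```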

And that’s really all that there is to it. Mind you, I’ve done a lot more in the script, as you’ll see below. It measures how long it took to bring up the alerts, counts how many there were per severity, the repeat count, etc., then creates a PropertyBag and submits all the information to OpsMgr for reporting. It then also logs it to the event viewer.

Download SCOM-Resolve-HardwareAlerts.ps1

This script is best set up to run every 4 hours or so. It’s set up as a generic ‘timed script’ inside of SCOM. If you’d like more info on setting up SCOM to work with PowerShell properly, see Brian Wren’s post here.

Here’s the script: