Chapter 2. Pipelines

Introduction

One of the fundamental concepts in a shell is called the pipeline. It also forms the basis of one of PowerShell's most significant advances. A pipeline is a big name for a simple concept—a series of commands where the output of one becomes the input of the next. A pipeline in a shell is much like an assembly line in a factory: it successively refines something as it passes between the stages, as shown in Example 2.1, “A PowerShell pipeline”.

Example 2.1. A PowerShell pipeline

Get-Process | Where-Object { $_.WorkingSet -gt 500kb } | Sort-Object -Descending Name

In PowerShell, you separate each stage in the pipeline with the pipe (|) character.

In Example 2.1, “A PowerShell pipeline”, the Get-Process cmdlet generates objects that represent actual processes on the system. These process objects contain information about the process's name, memory usage, process ID, and more. As the Get-Process cmdlet generates output, it passes each object along. At the same time, the Where-Object cmdlet works directly with those process objects, keeping only those that use more than 500 KB of memory. It passes each matching object along as soon as it has tested it, which lets the Sort-Object cmdlet also work directly with those objects and sort them by name in descending order.

This brief example illustrates a significant advancement in the power of pipelines: PowerShell passes full-fidelity objects along the pipeline, not their text representations.
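To see that properties survive each stage, the following minimal sketch runs the same kind of pipeline over three in-memory objects. The sample names and sizes are invented for illustration and simply stand in for the output of Get-Process.

```powershell
# Hypothetical sample data standing in for process objects
$sample = @(
    (New-Object PSObject -Property @{ Name = "alpha";   WorkingSet = 300kb }),
    (New-Object PSObject -Property @{ Name = "bravo";   WorkingSet = 900kb }),
    (New-Object PSObject -Property @{ Name = "charlie"; WorkingSet = 700kb })
)

# Filter and sort on real properties; no text parsing is involved
$result = $sample |
    Where-Object { $_.WorkingSet -gt 500kb } |
    Sort-Object -Descending Name

# Each result is still an object, so its properties remain available
$result | Foreach-Object { "{0} uses {1} bytes" -f $_.Name, $_.WorkingSet }
```

Because the last stage still receives objects, it can read the Name and WorkingSet properties directly instead of extracting them from formatted text.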

In contrast, traditional shells pass data as plain text between the stages. Extracting meaningful information from plain-text output turns the authoring of pipelines into a black art. Expressing the previous example in a traditional Unix-based shell is exceedingly difficult, and it is nearly impossible in cmd.exe.

Traditional text-based shells make writing pipelines so difficult because they require you to deeply understand the peculiarities of output formatting for each command in the pipeline, as shown in Example 2.2, “A traditional text-based pipeline”.

Example 2.2. A traditional text-based pipeline

lee@trinity:~$ ps -F | awk '{ if($5 > 500) print }' | sort -r -k 64,70
UID        PID  PPID  C    SZ   RSS PSR STIME TTY             TIME CMD
lee       8175  7967  0   965  1036   0 21:51 pts/0       00:00:00 ps -F
lee       7967  7966  0  1173  2104   0 21:38 pts/0       00:00:00 -bash

In this example, you have to know that, for every line, group number five represents the memory usage. You have to know another language (that of the awk tool) to filter by that column. Finally, you have to know the column range that contains the process name (columns 64 to 70 on this system) and then provide that to the sort command. And that's just a simple example.

An object-based pipeline opens up enormous possibilities, making system administration both immensely simpler and more powerful.

Filter Items in a List or Command Output

Problem

You want to filter the items in a list or command output.

Solution

Use the Where-Object cmdlet to select items in a list (or command output) that match a condition you provide. The Where-Object cmdlet has the standard aliases where and ?.

To list all running processes that have "search" in their name, use the -like operator to compare against the process's Name property:

Get-Process | Where-Object { $_.Name -like "*Search*" }

To list all directories in the current location, test the PsIsContainer property:

Get-ChildItem | Where-Object { $_.PsIsContainer }

To list all stopped services, use the -eq operator to compare against the service's Status property:

Get-Service | Where-Object { $_.Status -eq "Stopped" }

Discussion

For each item in its input (which is the output of the previous command), the Where-Object cmdlet evaluates that input against the script block that you specify. If the script block returns True, then the Where-Object cmdlet passes the object along. Otherwise, it does not. A script block is a series of PowerShell commands enclosed by the { and } characters. You can write any PowerShell commands inside the script block. In the script block, the $_ variable represents the current input object. For each item in the incoming set of objects, PowerShell assigns that item to the $_ variable, and then runs your script block. In the preceding examples, this incoming object represents the process, file, or service that the previous cmdlet generated.

This script block can contain a great deal of functionality, if desired. It can combine multiple tests, comparisons, and much more. For more information about script blocks, see the section called “Write a Script Block”. For more information about the type of comparisons available to you, see the section called “Comparison Operators”.
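As a small illustration of combining tests, the sketch below filters on two conditions at once with the -and operator. The service names and statuses are made-up sample data, not the output of Get-Service.

```powershell
# Hypothetical sample data standing in for service objects
$services = @(
    (New-Object PSObject -Property @{ Name = "WinRM";     Status = "Stopped" }),
    (New-Object PSObject -Property @{ Name = "Spooler";   Status = "Stopped" }),
    (New-Object PSObject -Property @{ Name = "WinDefend"; Status = "Running" })
)

# Keep only the items that pass both tests
$matching = @($services | Where-Object {
    ($_.Status -eq "Stopped") -and ($_.Name -like "Win*")
})

$matching | Foreach-Object { $_.Name }
```

Wrapping the pipeline in @( ) is a design choice that keeps $matching an array even when only one item passes the filter.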

For simple filtering, the syntax of the Where-Object cmdlet may sometimes seem overbearing. The section called “Program: Simplify Most Where-Object Filters” shows a script that can make simple filtering (such as the previous examples) easier to work with.

For complex filtering (for example, the type you would normally rely on a mouse to do with files in an Explorer window), writing the script block to express your intent may be difficult or even infeasible. If this is the case, the section called “Program: Interactively Filter Lists of Objects” shows a script that can make manual filtering easier to accomplish.

For more information about the Where-Object cmdlet, type Get-Help Where-Object.

Group and Pivot Data by Name

Problem

You want to easily access items in a list by a property name.

Solution

Use the Group-Object cmdlet (which has the standard alias group) with the -AsHash and -AsString parameters. This creates a hashtable, a collection that maps each key to a value so that you can look up an item directly by its key. The selected property (or expression) provides the keys in that hashtable.

PS > $h = dir | group -AsHash -AsString Length
PS > $h

Name                           Value
----                           -----
746                            {ReplaceTest.ps1}
499                            {Format-String.ps1}
20494                          {test.dll}

PS > $h["499"]


    Directory: C:\temp


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        10/18/2009   9:57 PM        499 Format-String.ps1


PS > $h["746"]


    Directory: C:\temp


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        10/18/2009   9:51 PM        746 ReplaceTest.ps1

Discussion

In some situations, you might find yourself repeatedly calling the Where-Object cmdlet to interact with the same list or output:

PS > $processes = Get-Process
PS > $processes | Where-Object { $_.Id -eq 1216 }

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     62       3     1012       3132    50     0.20   1216 dwm


PS > $processes | Where-Object { $_.Id -eq 212 }

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
    614      10    28444       5484   117     1.27    212 SearchIndexer

In these situations, you can instead use the -AsHash parameter of the Group-Object cmdlet. When you use this parameter, PowerShell creates a hashtable to hold your results, which creates a map between the property you are interested in and the object it represents:

PS > $processes = Get-Process | Group-Object -AsHash Id
PS > $processes[1216]

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     62       3     1012       3132    50     0.20   1216 dwm


PS > $processes[212]

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
    610      10    28444       5488   117     1.27    212 SearchIndexer

For simple types of data, this approach works well. Depending on your data, though, the -AsHash parameter alone can run into difficulties.

The first issue you might run into arises when the value of a property is $null. Hashtables in PowerShell (and the .NET Framework that provides the underlying support) do not support $null as a key, so you get a misleading error message:

PS > "Hello",(Get-Process -id $pid) | Group-Object -AsHash Id
Group-Object : The objects grouped by this property cannot be expanded
since there is a duplication of the key. Please give a valid property and
try again.

A second issue arises when more complex data gets stored within the hashtable. This can unfortunately be true even of data that appears to be simple:

PS > $result = dir | Group-Object -AsHash Length
PS > $result

Name                           Value
----                           -----
746                            {ReplaceTest.ps1}
499                            {Format-String.ps1}
20494                          {test.dll}

PS > $result[746]
(Nothing appears)

This missing result is caused by a type mismatch between the keys in the hashtable and the value you typed. This is normally not an issue in hashtables that you create yourself, because you provided all of the information to populate them. In this case, though, the Length values stored in the hashtable come from the directory listing and are of the type Int64, while the 746 you typed is an Int32. An explicit cast resolves the issue, but takes a great deal of trial and error to discover:

PS > $result[ [int64] 746 ]


    Directory: C:\temp


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        10/18/2009   9:51 PM        746 ReplaceTest.ps1
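The same mismatch can be reproduced without a directory listing. The following sketch builds a hashtable by hand with a single Int64 key; the key and file name are taken from the example above, and the variable names are illustrative only.

```powershell
# Build a hashtable whose only key is an Int64, as Group-Object did above
$table = @{}
$table[[int64] 746] = "ReplaceTest.ps1"

# A plain 746 is an Int32, so this lookup finds nothing
$plainLookup = $table[746]

# Casting the key to Int64 finds the entry
$castLookup = $table[[int64] 746]

"Plain lookup found a value: {0}" -f ($plainLookup -ne $null)
"Cast lookup returned: $castLookup"
```

The two keys print identically, which is why the problem is hard to spot from the hashtable's display alone.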

It is difficult to avoid both of these issues, so the Group-Object cmdlet also offers an -AsString parameter to convert all of the values to their string equivalent. With that parameter, you can always assume that the values will be treated as (and accessible by) strings:

PS > $result = dir | Group-Object -AsHash -AsString Length
PS > $result["746"]


    Directory: C:\temp


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---        10/18/2009   9:51 PM        746 ReplaceTest.ps1

For more information about the Group-Object cmdlet, type Get-Help Group-Object. For more information about PowerShell hashtables, see the section called “Create a Hashtable or Associative Array”.
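When several items share the same property value, they end up under the same key. The sketch below illustrates this with three strings grouped by length; it assumes a version of PowerShell in which Group-Object accepts both -AsHash and -AsString, and the sample words are arbitrary.

```powershell
$words = "one", "two", "three"

# Group the strings by their Length property, with string keys
$byLength = $words | Group-Object -AsHash -AsString Length

# Each value is the collection of items that share that key
$byLength["3"]
$byLength["5"]
```

Because each value is a collection, you can index into it again (for example, to pick the second item stored under a key).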

Program: Simplify Most Where-Object Filters

The Where-Object cmdlet is incredibly powerful, in that it allows you to filter your output based on arbitrary criteria. For extremely simple filters (such as filtering based only on a comparison to a single property), though, the syntax can get a little ungainly:

Get-Process | Where-Object { $_.Handles -gt 1000 }

For this type of situation, it is easy to write a script (as shown in Example 2.3, “Compare-Property.ps1”) to offload all the syntax to the script itself:

Get-Process | Compare-Property Handles gt 1000
Get-ChildItem | Compare-Property PsIsContainer

With a shorter alias, this becomes even easier to type:

PS > Set-Alias wheres Compare-Property
PS > Get-ChildItem | wheres Length gt 100

Example 2.3, “Compare-Property.ps1” implements this "simple where" functionality. Note that supplying a nonexistent operator as the $operator parameter generates an error message.

Example 2.3. Compare-Property.ps1

param($property, $operator = "eq", $matchText = "$true")

Begin { $expression = "`$_.$property -$operator `"$matchText`"" }
Process { if(Invoke-Expression $expression) { $_ } }
      

For more information about running scripts, see the section called “Run Programs, Scripts, and Existing Tools”.
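To make the Begin block of Compare-Property.ps1 easier to follow, this sketch builds the same expression string outside the script with sample values. The values Handles, gt, and 1000 are only examples.

```powershell
# The same string-building technique as the script's Begin block
$property = "Handles"
$operator = "gt"
$matchText = "1000"

# The backtick keeps $_ and the inner quotes literal in the result
$expression = "`$_.$property -$operator `"$matchText`""
$expression
```

The resulting text is an ordinary Where-Object style comparison. Since the script hands this text to Invoke-Expression, anything supplied as a parameter is evaluated as PowerShell code, so the technique is best reserved for input you trust.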

Program: Interactively Filter Lists of Objects

There are times when the Where-Object cmdlet is too powerful. In those situations, the Compare-Property script shown in the section called “Program: Simplify Most Where-Object Filters” provides a much simpler alternative. There are also times when the Where-Object cmdlet is too simple—when expressing your selection logic as code is more cumbersome than selecting it manually. In those situations, an interactive filter can be much more effective.

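Example 2.4, “Select-FilteredObject.ps1” implements this interactive filter. The script writes a numbered line of text for each incoming object to a temporary file, opens that file in Notepad, and waits for you to close it. It then passes along only the objects whose lines you left in the file. It uses several concepts not covered yet in the book, so feel free to just consider it a neat script for now. To learn more about a part that you don't yet understand, look it up in the table of contents or the index.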

Example 2.4. Select-FilteredObject.ps1

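##############################################################################
## Select-FilteredObject.ps1
##
## Writes each incoming object to a temporary file, opens that file in
## Notepad, and then outputs only the objects whose lines you kept.
##############################################################################

begin
{
    ## Create a temporary file to hold the text form of each object
    $filename = [System.IO.Path]::GetTempFileName()
    
    ## Header text written at the top of the file
    $header = @"

"@

    $header > $filename

    ## Track the incoming objects and the index assigned to each one
    $objectList = @()
    $counter = 0
}

process
{
    ## Write one "index: text" line per object, and remember the object
    "{0}: {1}" -f $counter,$_.ToString() >> $filename

    $objectList += $_
    $counter++
}

end
{
    ## Open the file in Notepad and wait for the user to close it
    $processStartInfo = New-Object System.Diagnostics.ProcessStartInfo "notepad"
    $processStartInfo.Arguments = $filename
    $process = [System.Diagnostics.Process]::Start($processStartInfo)
    $process.WaitForExit()

    ## Output the object for each numbered line that remains in the file
    foreach($line in (Get-Content $filename))
    {
        if($line -match "^(\d+?):.*")
        {
            $objectList[$matches[1]]
        }
    }

    ## Clean up the temporary file
    Remove-Item $filename
}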

For more information about running scripts, see the section called “Run Programs, Scripts, and Existing Tools”.
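The selection step in the End block of Select-FilteredObject.ps1 can be tried without Notepad. In this sketch, the $keptLines array is a hypothetical stand-in for the contents of the temporary file after you have deleted the lines you do not want, and the fruit names are arbitrary sample objects.

```powershell
# Objects collected during the Process block (sample data)
$objectList = "apple", "banana", "cherry"

# Hypothetical file contents after the user deleted the line for "banana"
$keptLines = "# instructions", "0: apple", "2: cherry"

$selected = foreach($line in $keptLines)
{
    # Only lines that start with "number:" identify an object
    if($line -match "^(\d+?):.*")
    {
        $objectList[[int] $matches[1]]
    }
}

$selected
```

Lines that do not begin with a number and a colon are skipped, so any instructions at the top of the file do not affect the result. The explicit [int] cast is a choice made for this sketch to make the index conversion visible.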

Work with Each Item in a List or Command Output

Problem

You have a list of items and want to work with each item in that list.

Solution

Use the Foreach-Object cmdlet (which has the standard aliases foreach and %) to work with each item in a list.

To apply a calculation to each item in a list, use the $_ variable as part of a calculation in the script block parameter:

PS > 1..10 | Foreach-Object { $_ * 2 }
2
4
6
8
10
12
14
16
18
20

To run a program on each file in a directory, use the $_ variable as a parameter to the program in the script block parameter:

Get-ChildItem *.txt | Foreach-Object { attrib -r $_ }

To access a method or property for each object in a list, access that method or property on the $_ variable in the script block parameter. In this example, you get the list of running processes called notepad, and then wait for each of them to exit:

$notepadProcesses = Get-Process notepad
$notepadProcesses | Foreach-Object { $_.WaitForExit() }

Discussion

Like the Where-Object cmdlet, the Foreach-Object cmdlet runs the script block that you specify for each item in the input. A script block is a series of PowerShell commands enclosed by the { and } characters. For each item in the set of incoming objects, PowerShell assigns that item to the $_ variable, one element at a time. In the examples given by the solution, the $_ variable represents each file or process that the previous cmdlet generated.

This script block can contain a great deal of functionality, if desired. You can combine multiple tests, comparisons, and much more. For more information about script blocks, see the section called “Write a Script Block”. For more information about the type of comparisons available to you, see the section called “Comparison Operators”.
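For instance, a script block can hold several statements and a conditional. The following sketch is purely illustrative and uses a small range of numbers as its input.

```powershell
$report = 1..4 | Foreach-Object {
    # Several statements can run for each item
    $square = $_ * $_

    if($square % 2 -eq 0)
    {
        "$_ squared is $square (even)"
    }
    else
    {
        "$_ squared is $square (odd)"
    }
}

$report
```

Whatever the script block emits for each item becomes the output of the Foreach-Object cmdlet, which is why $report ends up holding one string per input number.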

Note

The first example in the solution demonstrates a neat way to generate ranges of numbers:

1..10

This is PowerShell's array range syntax, which you can learn more about in the section called “Access Elements of an Array”.

The Foreach-Object cmdlet isn't the only way to perform actions on items in a list. The PowerShell scripting language supports several other keywords, such as for, (a different) foreach, do, and while. For information on how to use those keywords, see the section called “Repeat Operations with Loops”.
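As a rough comparison, the sketch below doubles the same list three ways: with the Foreach-Object cmdlet, with the foreach keyword, and with the for keyword. The variable names are illustrative.

```powershell
$numbers = 1..5

# Foreach-Object cmdlet: items arrive through the pipeline as $_
$fromCmdlet = $numbers | Foreach-Object { $_ * 2 }

# foreach keyword: you name the loop variable yourself
$fromForeach = foreach($number in $numbers) { $number * 2 }

# for keyword: you manage the index yourself
$fromFor = for($i = 0; $i -lt $numbers.Count; $i++) { $numbers[$i] * 2 }

"$fromCmdlet"
"$fromForeach"
"$fromFor"
```

All three produce the same values. The cmdlet form fits naturally at the end of a pipeline, while the keyword forms tend to read better when the list is already stored in a variable.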

For more information about the Foreach-Object cmdlet, type Get-Help Foreach-Object.

For more information about dealing with pipeline input in your own scripts, functions, and script blocks, see the section called “Access Pipeline Input”.

Automate Data-Intensive Tasks

Problem

You want to invoke a simple task on large amounts of data.

Solution

If only one piece of data changes (such as a server name or user name), store the data in a text file. Use the Get-Content cmdlet to retrieve the items, and then use the Foreach-Object cmdlet (which has the standard aliases foreach and %) to work with each item in that list. Example 2.5, “Using information from a text file to automate data-intensive tasks” illustrates this technique.

Example 2.5. Using information from a text file to automate data-intensive tasks

PS > Get-Content servers.txt
SERVER1
SERVER2
PS > $computers = Get-Content servers.txt
PS > $computers | Foreach-Object { Get-WmiObject Win32_OperatingSystem -Computer $_ }

SystemDirectory : C:\WINDOWS\system32
Organization    :
BuildNumber     : 2600
Version         : 5.1.2600

SystemDirectory : C:\WINDOWS\system32
Organization    :
BuildNumber     : 2600
Version         : 5.1.2600

If it becomes cumbersome (or unclear) to include the actions in the Foreach-Object cmdlet, you can also use the foreach scripting keyword as illustrated by Example 2.6, “Using the foreach scripting keyword to make a looping statement easier to read”.

Example 2.6. Using the foreach scripting keyword to make a looping statement easier to read

$computers = Get-Content servers.txt

foreach($computer in $computers)
{
    $system = Get-WmiObject Win32_OperatingSystem -Computer $computer

    if($system.Version -eq "5.1.2600")
    {
        "$computer is running Windows XP"
    }
}

If several aspects of the data change per task (for example, both the WMI class and the computer name for computers in a large report), create a CSV file with a row for each task. Use the Import-Csv cmdlet to import that data into PowerShell, and then use properties of the resulting objects as multiple sources of related data. Example 2.7, “Using information from a CSV to automate data-intensive tasks” illustrates this technique.

Example 2.7. Using information from a CSV to automate data-intensive tasks

PS > Get-Content WmiReport.csv
ComputerName,Class
LEE-DESK,Win32_OperatingSystem
LEE-DESK,Win32_Bios
PS > $data = Import-Csv WmiReport.csv
PS > $data

ComputerName                          Class
------------                          -----
LEE-DESK                              Win32_OperatingSystem
LEE-DESK                              Win32_Bios


PS > $data |
>>     Foreach-Object { Get-WmiObject $_.Class -Computer $_.ComputerName }
>>


SystemDirectory : C:\WINDOWS\system32
Organization    :
BuildNumber     : 2600
Version         : 5.1.2600

SMBIOSBIOSVersion : ASUS A7N8X Deluxe ACPI BIOS Rev 1009
Manufacturer      : Phoenix Technologies, LTD
Name              : Phoenix - AwardBIOS v6.00PG
SerialNumber      : xxxxxxxxxxx
Version           : Nvidia - 42302e31

Discussion

One of the major benefits of PowerShell is its capability to automate repetitive tasks. Sometimes, these repetitive tasks are action-intensive (such as system maintenance through registry and file cleanup) and consist of complex sequences of commands that will always be invoked together. In those situations, you can write a script to combine these operations to save time and reduce errors.

Other times, you need only to accomplish a single task (for example, retrieving the results of a WMI query) but need to invoke that task repeatedly for a large amount of data. In those situations, PowerShell's scripting statements, pipeline support, and data management cmdlets help automate those tasks.

One of the options given by the solution is the Import-Csv cmdlet. The Import-Csv cmdlet reads a CSV file and, for each row, automatically creates an object with properties that correspond to the names of the columns. Example 2.8, “The Import-Csv cmdlet creating objects with ComputerName and Class properties” shows the results of a CSV that contains a ComputerName and Class header.

Example 2.8. The Import-Csv cmdlet creating objects with ComputerName and Class properties

PS > $data = Import-Csv WmiReport.csv
PS > $data

ComputerName                         Class
------------                         -----
LEE-DESK                             Win32_OperatingSystem
LEE-DESK                             Win32_Bios


PS > $data[0].ComputerName
LEE-DESK

As the solution illustrates, you can use the Foreach-Object cmdlet to provide data from these objects to repetitive cmdlet calls. It does this by specifying each parameter name, followed by the data (taken from a property of the current CSV object) that applies to it.
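The explicit mapping can be tried without a CSV file or a remote computer. In this sketch, Get-SampleReport is a hypothetical function that stands in for a cmdlet such as Get-WmiObject, and ConvertFrom-Csv parses in-memory text in place of Import-Csv reading a file.

```powershell
# Hypothetical stand-in for the cmdlet you want to call repeatedly
function Get-SampleReport($Class, $Computer)
{
    "Querying $Class on $Computer"
}

# The same rows as WmiReport.csv, held in memory
$csvLines = "ComputerName,Class",
    "LEE-DESK,Win32_OperatingSystem",
    "LEE-DESK,Win32_Bios"

$data = $csvLines | ConvertFrom-Csv

# Map each property of the current row to the parameter it belongs to
$results = $data | Foreach-Object {
    Get-SampleReport -Class $_.Class -Computer $_.ComputerName
}

$results
```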

While this is the most general solution, many cmdlet parameters can automatically retrieve their value from incoming objects if any property of that object has the same name. This can let you omit the Foreach-Object and property mapping steps altogether. Parameters that support this feature are said to accept their value from the pipeline by property name. The Move-Item cmdlet is one example of a cmdlet with parameters that support this, as shown by the Accept pipeline input rows in Example 2.9, “Help content of the Move-Item showing a parameter that accepts value from pipeline by property name”.

Example 2.9. Help content of the Move-Item showing a parameter that accepts value from pipeline by property name

PS > Get-Help Move-Item -Full
(...)
PARAMETERS

    -path <string[]>
        Specifies the path to the current location of the items. The default
        is the current directory. Wildcards are permitted.

        Required?                    true
        Position?                    1
        Default value                <current location>
        Accept pipeline input?       true (ByValue, ByPropertyName)
        Accept wildcard characters?  true

    -destination <string>
        Specifies the path to the location where the items are being moved.
        The default is the current directory. Wildcards are permitted, but
        the result must specify a single location.

        To rename the item being moved, specify a new name in the value of
        Destination.

        Required?                    false
        Position?                    2
        Default value                <current location>
        Accept pipeline input?       true (ByPropertyName)
        Accept wildcard characters?  True
        (...)

If you purposefully name the columns in the CSV to correspond to parameters that take their value from pipeline by property name, PowerShell can do some (or all) of the parameter mapping for you. Example 2.10, “Using the Import-Csv cmdlet to automate a cmdlet that accepts value from pipeline by property name” demonstrates a CSV file that moves items in bulk.

Example 2.10. Using the Import-Csv cmdlet to automate a cmdlet that accepts value from pipeline by property name

PS > Get-Content ItemMoves.csv
Path,Destination
test.txt,Test1Directory
test2.txt,Test2Directory
PS > dir test.txt,test2.txt | Select Name

Name
----
test.txt
test2.txt


PS > Import-Csv ItemMoves.csv | Move-Item
PS > dir Test1Directory | Select Name

Name
----
test.txt


PS > dir Test2Directory | Select Name
Name
----
test2.txt
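Your own functions can accept pipeline input by property name, too. The following sketch assumes a hypothetical function named Move-SampleItem that only reports what it would do, so that it can run without touching any files; the CSV rows mirror ItemMoves.csv.

```powershell
# Hypothetical function whose parameters bind by property name
function Move-SampleItem
{
    param(
        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [string] $Path,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [string] $Destination
    )

    process { "Would move $Path to $Destination" }
}

# Column names match the parameter names, so no mapping step is needed
$moves = "Path,Destination",
    "test.txt,Test1Directory",
    "test2.txt,Test2Directory" | ConvertFrom-Csv

$plan = $moves | Move-SampleItem
$plan
```

PowerShell fills in -Path and -Destination for each row because the incoming objects have properties with those names.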

For more information about the Foreach-Object cmdlet and foreach scripting keyword, see the section called “Work with Each Item in a List or Command Output”. For more information about working with CSV files, see the section called “Import CSV and Delimited Data from a File”. For more information about working with Windows Management Instrumentation (WMI), see Chapter 28, Windows Management Instrumentation.

Program: Simplify Most Foreach-Object Pipelines

Problem

You want to access methods and retrieve properties of each pipeline object without the overhead required by the Foreach-Object cmdlet.

Solution

Use the Invoke-Member script to avoid the need for script blocks and pipeline variables ($_) for simple property and method access.

Example 2.11. Invoke-Member.ps1


[CmdletBinding(DefaultParameterSetName= "Member")]
param(

    [Parameter(ParameterSetName = "Method")]
    [Alias("M","Me")]
    [switch] $Method,

    [Parameter(ParameterSetName = "Method", Position = 0)]
    [Parameter(ParameterSetName = "Member", Position = 0)]
    [string] $Member,

    [Parameter(
        ParameterSetName = "Method", Position = 1,
        Mandatory = $false, ValueFromRemainingArguments = $true)]
    [object[]] $ArgumentList = @(),

    [Parameter(ValueFromPipeline = $true)]
    $InputObject
    )

begin
{
    Set-StrictMode -Version Latest
}

process
{
    if($psCmdlet.ParameterSetName -eq "Method")
    {
        $inputObject.$member.Invoke(@($argumentList))
    }
    else
    {
        $inputObject.$member
    }
}

        

Discussion

As shown in the section called “Automate Data-Intensive Tasks”, the Foreach-Object cmdlet supports literally the entire PowerShell scripting language when working with objects in a pipeline. However, the syntax and non-alphabetic characters required for simple expressions can sometimes feel overbearing.

Note

In addition to the Foreach-Object cmdlet, you can use the -ExpandProperty parameter of the Select-Object cmdlet to retrieve the value of properties:

Example 2.12. Select-Object expanding property values

PS > "Hello","World" | Select-Object -ExpandProperty Length
5
5
          

While its main intent is to include the properties of nested objects as though they were properties of the parent object, it is a useful shortcut for this situation as well.

To remove this syntax overhead, the Invoke-Member script supports simple method and property access as its main (and only) function. To make this even easier to type, give it a short alias, such as:

PS > Set-Alias :: Invoke-Member
PS > dir | :: Length
907
1425
1641
2057
2286
1854
11220
1562
248
985
560
524
      

For an example of applying this type of simplification to the Where-Object cmdlet, see the section called “Program: Simplify Most Where-Object Filters”.
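The Invoke-Member script relies on PowerShell's ability to look up a member whose name is stored in a variable. This short sketch shows that technique in isolation; the string "Hello" and the member names are arbitrary examples.

```powershell
$member = "Length"
$method = "ToUpper"

# Property access by name, as in the script's property branch
$length = "Hello".$member

# Method access by name, as in the script's -Method branch
$upper = "Hello".$method.Invoke()

$length
$upper
```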

Intercept Stages of the Pipeline

Problem

You want to intercept or take some action at different stages of the PowerShell pipeline.

Solution

Use the New-CommandWrapper script given in the section called “Program: Enhance or Extend an Existing Cmdlet” to wrap the Out-Default command, and place your custom functionality in that.

Discussion

For any pipeline, PowerShell adds an implicit call to the Out-Default cmdlet at the end. By adding a command wrapper over this cmdlet, you can heavily customize the pipeline processing behavior.

When PowerShell creates a pipeline, it first calls the BeginProcessing() method of each command in the pipeline. For advanced functions (the type created by the New-CommandWrapper script), PowerShell invokes the Begin block. If you want to do anything at the beginning of the pipeline, then, put your customizations in that block.

For each object emitted by the pipeline, PowerShell sends that object to the ProcessRecord() method of the next command in the pipeline. For advanced functions (the type created by the New-CommandWrapper script), PowerShell invokes the Process block. If you want to do anything for each element in the pipeline, then, put your customizations in that block.

Finally, when PowerShell has processed all items in the pipeline, it calls the EndProcessing() method of each command in the pipeline. For advanced functions (the type created by the New-CommandWrapper script), PowerShell invokes the End block. If you want to do anything at the end of the pipeline, then, put your customizations in that block.
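The order of the three stages is easy to observe with a small function of your own. The sketch below uses a hypothetical function named Trace-Pipeline, which is not part of New-CommandWrapper, to record when each block runs.

```powershell
# Hypothetical function that reports each pipeline stage
function Trace-Pipeline
{
    begin   { "begin" }
    process { "process: $_" }
    end     { "end" }
}

$trace = 1..3 | Trace-Pipeline
$trace
```

The begin block runs once before any input arrives, the process block runs once for each incoming object, and the end block runs once after the input is exhausted.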

For two examples of this approach, see the section called “Automatically Capture Pipeline Output”, and the section called “Invoke Dynamically-Named Commands”.

For more information about running scripts, see the section called “Run Programs, Scripts, and Existing Tools”.

Automatically Capture Pipeline Output

Problem

You want to automatically capture the output of the last command without explicitly storing its output in a variable.

Solution

Invoke the Add-ObjectCollector script, which in turn builds upon the New-CommandWrapper script.

Example 2.13. Add-ObjectCollector.ps1


Set-StrictMode -Version Latest

New-CommandWrapper Out-Default `
    -Begin {
        $cachedOutput = New-Object System.Collections.ArrayList
     } `
    -Process {
        if($_ -ne $null) { $null = $cachedOutput.Add($_) }
        while($cachedOutput.Count -gt 500) { $cachedOutput.RemoveAt(0) }
     } `
    -End {
        $uniqueOutput = $cachedOutput | Foreach-Object {
            $_.GetType().FullName } | Select -Unique
        $containsInterestingTypes = ($uniqueOutput -notcontains `
            "System.Management.Automation.ErrorRecord") -and
            ($uniqueOutput -notlike `
                "Microsoft.PowerShell.Commands.Internal.Format.*")
        
        if(($cachedOutput.Count -gt 0) -and $containsInterestingTypes)
        {
            $GLOBAL:ll = $cachedOutput | % { $_ }
        }
    }

        

Discussion

The example in the Solution builds a command wrapper over the Out-Default command by first creating an ArrayList during the Begin stage of the pipeline.

As each object passes down the pipeline (and is processed by the Process block of Out-Default), the wrapper created by Add-ObjectCollector adds the object to the ArrayList.

Once the pipeline completes, the Add-ObjectCollector wrapper stores the saved items in the $ll variable, making them always available at the next prompt.
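The Process block also keeps the saved output from growing without bound by discarding the oldest items once the list holds more than 500. The sketch below shows that trimming technique on its own, with a limit of 5 chosen only to keep the illustration small.

```powershell
$limit = 5   # illustrative; the script above uses 500
$cache = New-Object System.Collections.ArrayList

1..12 | Foreach-Object {
    # Add the newest item, then drop the oldest while over the limit
    $null = $cache.Add($_)
    while($cache.Count -gt $limit) { $cache.RemoveAt(0) }
}

$cache -join ","
```

Only the most recent items remain in the list once the loop finishes.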

Capture and Redirect Binary Process Output

Problem

You want to run programs that transfer complex binary data between themselves.

Solution

Use the Invoke-BinaryProcess script to invoke the program. If it is the source of binary data, use the -RedirectOutput parameter. If it consumes binary data, use the -RedirectInput parameter.

Example 2.14. Invoke-BinaryProcess.ps1

##############################################################################

<#

.SYNOPSIS
Invokes a process that emits or consumes binary data.

.EXAMPLE
PS >Invoke-BinaryProcess binaryProcess.exe -RedirectOutput |
       Invoke-BinaryProcess binaryProcess.exe -RedirectInput
  
#>

param(
    [string] $ProcessName,
    
    [Alias("Input")]
    [switch] $RedirectInput,
    
    [Alias("Output")]
    [switch] $RedirectOutput,
    
    [string] $ArgumentList)

$processStartInfo = New-Object System.Diagnostics.ProcessStartInfo
$processStartInfo.FileName = (Get-Command $processname).Definition
$processStartInfo.WorkingDirectory = (Get-Location).Path
if($argumentList) { $processStartInfo.Arguments = $argumentList }
$processStartInfo.UseShellExecute = $false

$processStartInfo.RedirectStandardOutput = $true
$processStartInfo.RedirectStandardInput = $true

$process = [System.Diagnostics.Process]::Start($processStartInfo)

if($redirectInput)
{
    ## Cast the pipeline input to a byte array so that it binds to Stream.Write
    $inputBytes = [byte[]] @($input)
    $process.StandardInput.BaseStream.Write($inputBytes, 0, $inputBytes.Count)
    $process.StandardInput.Close()
}
else
{
    $input | % { $process.StandardInput.WriteLine($_) }
    $process.StandardInput.Close()
}

if($redirectOutput)
{
    $byteRead = -1
    do
    {
        $byteRead = $process.StandardOutput.BaseStream.ReadByte()
        if($byteRead -ge 0) { $byteRead }
    } while($byteRead -ge 0)
}
else
{
    $process.StandardOutput.ReadToEnd()
}
        

Discussion

When PowerShell launches a native application, one of the benefits it provides is allowing you to use PowerShell commands to work with the output. For example:

PS > (ipconfig)[7]
   Link-local IPv6 Address . . . . . : fe80::20f9:871:8365:f368%8
PS > (ipconfig)[8]
   IPv4 Address. . . . . . . . . . . : 10.211.55.3

PowerShell enables this by splitting the output of the program on its newline characters, and then passing each line independently down the pipeline. This includes programs that use the Unix newline (\n) as well as the Windows newline (\r\n).

If the program outputs binary data, however, that reinterpretation can corrupt the data as it gets redirected to another process or file. For example, some programs communicate with each other through complicated binary data structures that must not be modified along the way. This is common among image editing utilities and other non-PowerShell tools designed for pipelined data manipulation.

We can see this through a simple C# program, BinaryProcess.exe, that either emits binary data or consumes it. Here is the C# source code for the application:

using System;
using System.IO;

public class BinaryProcess
{
    public static void Main(string[] args)
    {
        if(args[0] == "-consume")
        {
            using(Stream inputStream = Console.OpenStandardInput())
            {
                for(byte counter = 0; counter < 255; counter++)
                {
                    byte received = (byte) inputStream.ReadByte();
                    if(received != counter)
                    {
                        Console.WriteLine(
                            "Got an invalid byte: {0}, expected {1}.",
                            received, counter);
                        return;
                    }
                    else
                    {
                        Console.WriteLine(
                            "Properly received byte: {0}.", received);
                    }
                }
            }
        }

        if(args[0] == "-emit")
        {
            using(Stream outputStream = Console.OpenStandardOutput())
            {
                for(byte counter = 0; counter < 255; counter++)
                {
                    outputStream.WriteByte(counter);
                }
            }
        }
    }
} 

When we run it with the -emit parameter, PowerShell breaks the output into three objects:

PS > $output = .\binaryprocess.exe -emit
PS > $output.Count
3

We would expect this output to contain the numbers 0 through 254, but we see that it does not:

PS > $output | Foreach-Object { "------------";
    $_.ToCharArray() | Foreach-Object { [int] $_ } }
------------
0
1
2
3
4
5
6
7
8
9
------------
11
12
------------
14
15
16
17
18
19
20
21
22
(...)
255
214
220
162
163
165
8359
402
225

At byte 10 (the line feed character), PowerShell sees the end of a line and starts a new element. It does the same at byte 13 (the carriage return). Things appear to get even stranger at the higher numbers, where PowerShell decodes the bytes as text in the console's encoding and so turns them into unrelated Unicode characters such as 8359 and 402.
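
The splitting can be approximated without running an external program. The following is only a sketch: it builds a string from character codes 0 through 254 and splits it on line endings with the -split operator, which mimics (but is not) the logic PowerShell applies to native command output. It does not reproduce the character decoding seen in the higher numbers.

```powershell
## Build a string whose characters have the codes 0 through 254.
$text = -join (0..254 | Foreach-Object { [char] $_ })

## Split on \r\n, \r, or \n, roughly as line-based reading would.
$parts = $text -split "\r\n|\r|\n"

$parts.Count
$parts[1].ToCharArray() | Foreach-Object { [int] $_ }
```

The count is 3, and the second element holds only the characters 11 and 12, because the characters 10 and 13 were consumed as line separators.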

The solution resolves this behavior by managing the output of the binary process directly. If you supply the -RedirectInput parameter, the script assumes an incoming stream of binary data and passes it to the program directly. If you supply the -RedirectOutput parameter, the script assumes that the output is binary data, and likewise reads it from the process directly.
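
To see why reading bytes directly avoids the problem, here is a minimal sketch that uses the same ReadByte loop as the script, but against an in-memory stream rather than a real process. The MemoryStream stands in for the process's output stream and is an assumption made for illustration.

```powershell
## Stand-in for a process's output stream: the bytes 0 through 254.
$data = [byte[]] (0..254)
$stream = New-Object System.IO.MemoryStream
$stream.Write($data, 0, $data.Length)
$stream.Position = 0

## Read one byte at a time until ReadByte reports the end of the stream (-1).
$received = New-Object System.Collections.ArrayList
do
{
    $byteRead = $stream.ReadByte()
    if($byteRead -ge 0) { $null = $received.Add($byteRead) }
} while($byteRead -ge 0)

$received.Count
$received[10]
$received[13]
```

All 255 values arrive intact, including 10 and 13, because nothing along the way treats them as text.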
