Creating and manipulating text has long been one of the primary tasks of scripting languages and traditional shells. In fact, Perl (the language) started as a simple (but useful) tool designed for text processing. It has grown well beyond those humble roots, but its popularity provides strong evidence of the need it fills.
In text-based shells, this strong focus
continues. When most of your interaction with the system happens by
manipulating the text-based output of programs, powerful text processing
utilities become crucial. Text-parsing tools such as awk, sed, and grep form the keystones of text-based systems
management.
In PowerShell's object-based environment, this traditional tool chain plays a less critical role. You can accomplish most of the tasks that previously required these tools much more effectively through other PowerShell commands. However, being an object-oriented shell does not mean that PowerShell drops all support for text processing. Dealing with strings and unstructured text continues to play an important part in a system administrator's life. Since PowerShell lets you manage the majority of your system in its full fidelity (using cmdlets and objects), the text processing tools can once again focus primarily on actual text processing tasks.
Use PowerShell string variables to give you a way to store and work with text.
To define a string that supports variable expansion and escape characters in its definition, surround it with double quotes:
$myString = "Hello World"
To define a literal string (that does not interpret variable expansion or escape characters), surround it with single quotes:
$myString = 'Hello World'
String literals come in two varieties:
literal (nonexpanding) and
expanding strings. To create a literal string,
place single quotes ($myString = 'Hello
World') around the text. To create an expanding string, place
double quotes ($myString = "Hello
World") around the text.
In a literal string, all the text between the
single quotes becomes part of your string. In an expanding string,
PowerShell expands variable names (such as
$myString) and escape sequences (such as
`n) with their values (such as the
content of $myString and the newline
character, respectively).
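As a rough illustration of the difference (the variable names here are only for demonstration), the same text produces different strings depending on the quotes you choose:

```powershell
$name = "World"

# Double quotes: $name and `n are replaced with their values
$expanding = "Hello $name`n"

# Single quotes: every character is kept exactly as typed
$literal = 'Hello $name`n'

$expanding.Length    # 12 (Hello World, plus one newline character)
$literal.Length      # 13 (the 13 characters between the quotes)
```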
For a detailed explanation of the escape sequences and replacement rules inside PowerShell strings, see the section called “Strings”.
One exception to the "all text in a literal string is literal" rule comes from the quote characters themselves. In either type of string, PowerShell lets you place two of that string's quote characters together to add the quote character itself:
$myString = "This string includes ""double quotes"" because it combined quote characters."
$myString = 'This string includes ''single quotes'' because it combined quote characters.'
This helps prevent escaping atrocities that would arise when you try to include a single quote in a single-quoted string. For example:
$myString = 'This string includes ' + "'" + 'single quotes' + "'"
This example shows how easy PowerShell makes it to create new strings by adding other strings together. This is an attractive way to build a formatted report in a script but should be used with caution. Due to the way that the .NET Framework (and therefore PowerShell) manages strings, adding information to the end of a large string this way causes noticeable performance problems. If you intend to create large reports, see the section called “Generate Large Reports and Text Streams”.
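One common alternative, sketched here with the .NET StringBuilder class, avoids repeatedly copying a growing string. The loop and its contents are purely illustrative, and the section referenced above may take a different approach.

```powershell
# Illustrative only: build a three-line report without using +=
$builder = New-Object System.Text.StringBuilder

foreach($number in 1..3)
{
    # AppendLine() returns the builder itself, so discard that output
    [void] $builder.AppendLine("Line $number")
}

$report = $builder.ToString()
$report
```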
You want to create a variable that holds text with newlines or other explicit formatting.
Use a PowerShell here string to store and work with text that includes newlines and other formatting information.
$myString = @"
This is the first line
of a very long string. A "here string"
lets you create blocks of text
that span several lines.
"@
PowerShell begins a here
string when it sees the characters @" followed by a newline. It ends the string
when it sees the characters "@ on
their own line. These seemingly odd restrictions let you create strings
that include quote characters, newlines, and other symbols that you
commonly use when you create large blocks of preformatted text.
These restrictions, while useful, can sometimes cause problems when you copy and paste PowerShell examples from the Internet. Web pages often add spaces at the end of lines, which can interfere with the strict requirements of the beginning of a here string. If PowerShell produces an error when your script defines a here string, check that the here string does not include an errant space after its first quote character.
Like string literals, here strings may be literal (and use single quotes) or expanding (and use double quotes).
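A minimal sketch of the two kinds of here string follows. The $user variable and its value are hypothetical and exist only to show where expansion happens.

```powershell
$user = "Lee"   # hypothetical value, for illustration only

# Expanding here string: $user is replaced with its value
$expanding = @"
Hello, $user.
"@

# Literal here string: $user stays exactly as typed
$literal = @'
Hello, $user.
'@

$expanding   # Hello, Lee.
$literal     # Hello, $user.
```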
In PowerShell version one, here strings were frequently used as the equivalent of block comments to disable lines in a script. PowerShell version two now supports this fully through multiline comments. For more information, see the section called “Comments”.
You want to place special characters (such as tab and newline) in a string variable.
In an expanding string, use PowerShell's escape sequences to include special characters such as tab and newline.
PS > $myString = "Report for Today`n----------------"
PS > $myString
Report for Today
----------------
As discussed in the section called “Create a String”, PowerShell strings come in two varieties: literal (or nonexpanding) and expanding strings. A literal string uses single quotes around its text, while an expanding string uses double quotes around its text.
In a literal string, all the text between the
single quotes becomes part of your string. In an expanding string,
PowerShell expands variable names (such as $ENV:SystemRoot) and escape sequences (such as `n) with their values (such as the SystemRoot environment variable and the
newline character).
Unlike many languages that use a backslash character (\) for escape sequences, PowerShell uses a backtick (`) character. This stems from its focus on system administration, where backslashes are ubiquitous in path names.
For a detailed explanation of the escape sequences and replacement rules inside PowerShell strings, see the section called “Strings”.
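The following sketch shows a few escape sequences in expanding strings. The variable names and the path are made up for the example.

```powershell
# `t becomes a tab character in an expanding string
$row = "Name`tValue"

# Backslashes need no escaping, which keeps paths readable
$path = "C:\temp\new"

# Two backticks produce one literal backtick
$tick = "Escape with ``"

$row.Length     # 10
$path.Length    # 11
$tick           # Escape with `
```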
You want to place dynamic information (such as the value of another variable) in a string.
In an expanding string, include the name of a variable in the string to insert the value of that variable.
PS > $header = "Report for Today"
PS > $myString = "$header`n----------------"
PS > $myString
Report for Today
----------------
To include information more complex than just the value of a variable, enclose it in a subexpression:
PS > $header = "Report for Today"
PS > $myString = "$header`n$('-' * $header.Length)"
PS > $myString
Report for Today
----------------
Variable substitution in an expanding string is a simple enough concept, but subexpressions deserve a little clarification.
A subexpression is the dollar sign character, followed by a PowerShell command (or set of commands) contained in parentheses:
$(subexpression)
When PowerShell sees a subexpression in an
expanding string, it evaluates the subexpression and places the result
in the expanding string. In the solution, the expression '-' * $header.Length tells PowerShell to make
a line of dashes $header.Length
long.
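Subexpressions are also what you need when you want a property or an array element inside a string. A small sketch, using a made-up $items list and assuming the default output field separator (a space):

```powershell
$items = "apple","pear","plum"   # illustrative data

# The subexpressions are evaluated first, then placed in the string
$good = "Found $($items.Count) items; the first is $($items[0])."
$good    # Found 3 items; the first is apple.

# Without $( ), only $items expands, and .Count is left as plain text
$surprise = "Found $items.Count items"
$surprise    # Found apple pear plum.Count items
```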
Another way to place dynamic information inside a string is to use PowerShell's string formatting operator, which follows the rules of .NET string formatting:
PS > $header = "Report for Today"
PS > $myString = "{0}`n{1}" -f $header,('-' * $header.Length)
PS > $myString
Report for Today
----------------
For an explanation of PowerShell's formatting
operator, see the section called “Place Formatted Information in a String”. For more
information about PowerShell's escape characters, type Get-Help About_Escape_Characters or type
Get-Help
About_Special_Characters.
You want to prevent PowerShell from interpreting special characters or variable names inside a string.
Use a nonexpanding string to have PowerShell interpret your string exactly as entered. A nonexpanding string uses the single quote character around its text.
PS > $myString = 'Useful PowerShell characters include: $, `, " and { }'
PS > $myString
Useful PowerShell characters include: $, `, " and { }
If you want to include newline characters as well, use a nonexpanding here string, as in Example 5.1, “A nonexpanding here string that includes newline characters”.
Example 5.1. A nonexpanding here string that includes newline characters
PS > $myString = @'
>> Tip of the Day
>> -------------
>> Useful PowerShell characters include: $, `, ', " and { }
>> '@
>>
PS > $myString
Tip of the Day
-------------
Useful PowerShell characters include: $, `, ', " and { }
In a literal string, all the text between the
single quotes becomes part of your string. This is in contrast to an
expanding string, where PowerShell expands variable names (such as
$myString) and escape sequences (such as
`n) with their values (such as the
content of $myString and the newline
character).
Nonexpanding strings are a useful way to manage files and folders that contain special characters that might otherwise be interpreted as escape sequences. For more information about managing files with special characters in their name, see the section called “Manage Files That Include Special Characters”.
As discussed in the section called “Create a String”, one exception to the "all text in a literal string is literal" rule comes from the quote characters themselves. In either type of string, PowerShell lets you place two of that string's quote characters together to include the quote character itself:
$myString = "This string includes ""double quotes"" because it combined quote characters."
$myString = 'This string includes ''single quotes'' because it combined quote characters.'
You want to place formatted information (such as right-aligned text or numbers rounded to a specific number of decimal places) in a string.
Use PowerShell's formatting operator to place formatted information inside a string.
PS > $formatString = "{0,8:D4} {1:C}`n"
PS > $report = "Quantity Price`n"
PS > $report += "---------------`n"
PS > $report += $formatString -f 50,2.5677
PS > $report += $formatString -f 3,9
PS > $report
Quantity Price
---------------
    0050 $2.57
    0003 $9.00
PowerShell's string formatting operator
(-f) uses the same string formatting
rules as the String.Format() method
in the .NET Framework. It takes a format string on its left side, and
the items you want to format on its right side.
In the solution, you format two numbers: a
quantity and a price. The first number ({0}) represents the quantity and is
right-aligned in a box of 8 characters (,8). It is formatted as a decimal number with
4 digits (:D4). The second number
({1}) represents the price, which you
format as currency (:C).
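To make the pieces of a format item easier to see, the following sketch wraps each result in square brackets. The values are arbitrary, and the examples deliberately avoid formats such as :C, whose output depends on the current culture.

```powershell
"[{0,8}]"  -f "right"    # [   right]  (right-aligned in 8 characters)
"[{0,-8}]" -f "left"     # [left    ]  (a negative width left-aligns)
"[{0:D4}]" -f 42         # [0042]      (integer padded to 4 digits)
"[{0:X}]"  -f 255        # [FF]        (hexadecimal)
```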
If you find yourself hand-crafting text-based reports, STOP! Let PowerShell's built-in commands do all the work for you. Instead, emit custom objects so that your users can work with your script as easily as they work with regular PowerShell commands. For more information, see the section called “Create and Initialize Custom Objects”.
For a detailed explanation of PowerShell's formatting operator, see the section called “Simple Operators”. For a detailed list of the formatting rules, see Appendix D, .NET String Formatting.
Although primarily used to control the layout of information, the string-formatting operator is also a readable replacement for what is normally accomplished with string concatenation:
PS > $number1 = 10
PS > $number2 = 32
PS > "$number2 divided by $number1 is " + $number2 / $number1
32 divided by 10 is 3.2
The string formatting operator makes this much easier to read:
PS > "{0} divided by {1} is {2}" -f $number2, $number1, ($number2 / $number1)
32 divided by 10 is 3.2
In addition to the string formatting operator,
PowerShell provides three formatting commands (Format-Table, Format-Wide, and Format-List) that let you easily generate
formatted reports. For detailed information about those cmdlets, see
the section called “Formatting Output”.
You want to determine if a string contains another string, or want to find the position of a string within another string.
PowerShell provides several options to help you search a string for text.
Use the -like operator to determine whether a string
matches a given DOS-like wildcard:
PS > "Hello World" -like "*llo W*"
True
Use the -match operator to determine whether a string
matches a given regular expression:
PS > "Hello World" -match '.*l[l-z]o W.*$'
True
Use the Contains() method to determine whether a
string contains a specific string:
PS > "Hello World".Contains("World")
True
Use the IndexOf() method to determine the location of
one string within another:
PS > "Hello World".IndexOf("World")
6
Since PowerShell strings are fully featured
.NET objects, they support many string-oriented operations directly. The
Contains() and IndexOf() methods are two examples of the many
features that the String class
supports. To learn what other functionality the String class supports, see the section called “Learn About Types and Objects”.
To search entire files for text or a pattern, see the section called “Search a File for Text or a Pattern”.
Although they use similar characters, simple wildcards and regular expressions serve significantly different purposes. Wildcards are much simpler than regular expressions, and because of that, more constrained. While you can summarize the rules for wildcards in just four bullet points, entire books have been written to help teach and illuminate the use of regular expressions.
A common use of regular expressions is to
search for a string that spans multiple lines. By default, the . character
in a regular expression does not match a newline, so patterns such as .* do not
search across lines. You can use the singleline (?s) option to instruct them to do so:
PS > "Hello `n World" -match "Hello.*World"
False
PS > "Hello `n World" -match "(?s)Hello.*World"
True
Wildcards lend themselves to simple matches, while regular expressions lend themselves to more complex matches.
For a detailed description of the -like operator, see the section called “Comparison Operators”. For a detailed description of the
-match operator, see the section called “Simple Operators”. For a detailed list of the regular
expression rules and syntax, see Appendix B, Regular Expression Reference.
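A few details are easy to miss when choosing among these options. The sketch below (with arbitrary sample text) shows that the .NET methods are case-sensitive while the PowerShell operators are not, that IndexOf() signals "not found" with -1, and that a successful -match fills the automatic $matches variable with the captured groups.

```powershell
"Hello World".Contains("world")    # False: .NET methods are case-sensitive
"Hello World" -like "*world*"      # True: PowerShell operators ignore case by default
"Hello World".IndexOf("Earth")     # -1 when the text is not found

if("Hello World" -match '(\w+) (\w+)')
{
    # $matches[0] is the whole match; [1] and [2] are the groups
    $secondWord = $matches[2]
    $secondWord                    # World
}
```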
One difficulty sometimes arises when you try to store the result of a PowerShell command in a string, as shown in Example 5.2, “Attempting to store output of a PowerShell command in a string”.
Example 5.2. Attempting to store output of a PowerShell command in a string
PS > Get-Help Get-ChildItem
NAME
Get-ChildItem
SYNOPSIS
Gets the items and child items in one or more specified locations.
(...)
PS > $helpContent = Get-Help Get-ChildItem
PS > $helpContent -match "location"
False
The -match
operator searches a string for the pattern you specify but seems to fail
in this case. This is because all PowerShell commands generate objects.
If you don't store that output in another variable or pass it to another
command, PowerShell converts it to a text representation before it displays
it to you. In Example 5.2, “Attempting to store output of a PowerShell command in a
string”, $helpContent is a fully featured object, not
just its string representation:
PS > $helpContent.Name
Get-ChildItem
To work with the text-based representation of
a PowerShell command, you can explicitly send it through the Out-String cmdlet. The Out-String cmdlet converts its input into the
text-based form you are used to seeing on the screen:
PS > $helpContent = Get-Help Get-ChildItem | Out-String
PS > $helpContent -match "location"
True
For a script that makes searching textual command output easier, see the section called “Program: Search Formatted Output for a Pattern”.
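The same distinction applies to any command, not only Get-Help. As a quick sketch, using Get-Command purely as an example source of objects, you can check what a variable holds before and after Out-String:

```powershell
# The command returns an object, not text
$command = Get-Command Get-Date
$command -is [string]        # False

# Out-String produces the text you would normally see on screen
$text = $command | Out-String
$text -is [string]           # True
$text -match "Get-Date"      # True
```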
PowerShell provides several options to help you replace text in a string with other text.
Use the Replace() method on the string itself to
perform simple replacements:
PS > "Hello World".Replace("World", "PowerShell")
Hello PowerShell
Use PowerShell's regular expression -replace operator to perform more advanced
regular expression replacements:
PS > "Hello World" -replace '(.*) (.*)','$2 $1'
World Hello
The Replace() method and the -replace operator both provide useful ways to
replace text in a string. The Replace() method is the quickest but also the
most constrained. It replaces every occurrence of the exact string you
specify with the exact replacement string that you provide. The -replace operator provides much more
flexibility, since its arguments are regular expressions that can match
and replace complex patterns.
Given the power of the regular expressions it
uses, the -replace operator carries with it some
pitfalls of regular expressions, as well.
First, the regular expressions that you use
with the -replace operator often
contain characters (such as the dollar sign that represents a group
number) that PowerShell normally interprets as variable names or escape
characters. To prevent PowerShell from interpreting these characters,
use a nonexpanding string (single quotes) as shown by the
solution.
Another, less common, pitfall is wanting to use characters that have special meaning to regular expressions as part of the text you want to replace. For example:
PS > "Power[Shell]" -replace "[Shell]","ful"
Powfulr[fulfulfulfulful]
That's clearly not what we intended. In regular expressions, square brackets around a set of characters mean "match any of the characters inside of the square brackets." In our example, this translates to "Replace the characters S, h, e, and l with 'ful'."
To avoid this, we can use the regular expression escape character to escape the square brackets:
PS > "Power[Shell]" -replace "\[Shell\]","ful"
Powerful
However, this means knowing all of the regular
expression special characters, and modifying the input string.
Sometimes, we don't control that, so the
[Regex]::Escape() method comes in handy:
PS > "Power[Shell]" -replace ([Regex]::Escape("[Shell]")),"ful"
Powerful
For more information about the -replace operator, see the section called “Simple Operators” and Appendix B, Regular Expression Reference.
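Two further differences between the approaches are worth a quick sketch: case sensitivity, and the use of named groups in the replacement text. The sample strings are arbitrary.

```powershell
# The Replace() method is case-sensitive, so nothing changes here
"Hello World".Replace("world", "PowerShell")     # Hello World

# The -replace operator ignores case by default
"Hello World" -replace "world", "PowerShell"     # Hello PowerShell

# Use -creplace when you want a case-sensitive regular expression
"Hello World" -creplace "world", "PowerShell"    # Hello World

# Named groups can be referenced as ${name} in the replacement
$reordered = "2010-03-20" -replace '(?<y>\d+)-(?<m>\d+)-(?<d>\d+)', '${d}/${m}/${y}'
$reordered                                       # 20/03/2010
```

As in the solution, single quotes keep PowerShell from treating ${d} and friends as variables.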
You want to split a string based on some literal text, or a regular expression pattern.
Use PowerShell's -split
operator to split on a sequence of characters or specific string:
PS > "a-b-c-d-e-f" -split "-c-"
a-b
d-e-f
To split on a pattern, supply a regular expression as the first argument:
PS > "a-b-c-d-e-f" -split "b|[d-e]"
a-
-c-
-
-f
In PowerShell version one, the
String.Split() and
[Regex]::Split() methods were the two options
available for splitting strings. While still available in PowerShell
version two, PowerShell's -split operator provides a
more natural way to split a string into smaller strings. When used with
no arguments (the unary split operator), it splits
a string on whitespace characters.
Example 5.3. PowerShell's unary split operator
PS > -split "Hello World `t How `n are you?"
Hello
World
How
are
you?
When used with an argument, it treats the argument as a regular expression, and then splits based on that pattern.
PS > "a-b-c-d-e-f" -split 'b|[d-e]'
a-
-c-
-
-f
If the split pattern avoids characters that have special meaning in a regular expression, you can use it to split a string based on another string.
PS > "a-b-c-d-e-f" -split '-c-'
a-b
d-e-f
If the split pattern has characters that have
special meaning in a regular expression (such as the
. character that represents 'any character'), use the
-split operator's SimpleMatch
option:
Example 5.4. PowerShell's SimpleMatch split option
PS > "a.b.c" -split '.'
(A bunch of newlines. Something went wrong!)
PS > "a.b.c" -split '.',0,"SimpleMatch"
a
b
c
For more information about the
-split operator's options, see Get-Help
about_split.
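The second argument in Example 5.4 (the 0) is the maximum number of substrings to return, where 0 means "no limit." The sketch below shows that limit in action, along with [Regex]::Escape() as one possible alternative to SimpleMatch for literal delimiters. The input strings are arbitrary.

```powershell
# Limit the result to two substrings; the remainder stays intact
$limited = "a,b,c,d" -split ",", 2
$limited
# a
# b,c,d

# Escape the delimiter so that . is treated as a literal period
$parts = "a.b.c" -split ([Regex]::Escape("."))
$parts
# a
# b
# c
```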
While regular expressions offer an enormous
amount of flexibility, the -split operator gives you
ultimate flexibility by letting you supply a script block for the split
operation. For each character, it invokes the script block and splits the
string based on the result. In the script block, $_
represents the current character. For example, to split a string on even
numbers:
Example 5.5. Using a script block to split a string
PS > "1234567890" -split { ($_ % 2) -eq 0 }
1
3
5
7
9
To split an entire file by a pattern, use the
-Delimiter parameter of the
Get-Content cmdlet.
For more information about the -split operator, see the section called “Simple Operators” and Get-Help
about_split.
You want to combine several separate strings into a single string.
Use PowerShell's unary
-join operator to combine separate strings into a
larger string using the default empty separator:
PS > -join ("A","B","C")
ABC
If you want to define the string that PowerShell uses to
combine the strings, use PowerShell's binary
-join operator.
PS > ("A","B","C") -join "`n"
A
B
C
In PowerShell version one, the
[String]::Join() method was the primary option
available for joining strings. While still available in PowerShell
version two, PowerShell's -join operator provides a
more natural way to combine strings. When used with no arguments (the
unary join operator), it joins the list using the
default empty separator. When used between a list and a separator (the
binary join operator), it joins the strings using
the provided separator.
Aside from its performance benefit, the
-join operator solves an extremely common difficulty
that arises from trying to do it by hand.
When first writing the code to join a list with a separator (for example, a comma and a space), you usually end up leaving a lonely separator at the beginning or ending of the output:
PS > $list = "Hello","World"
PS > $output = ""
PS >
PS > foreach($item in $list)
>> {
>> $output += $item + ", "
>> }
>>
PS > $output
Hello, World,
You can resolve this by adding some extra logic
to the foreach loop:
PS > $list = "Hello","World"
PS > $output = ""
PS >
PS > foreach($item in $list)
>> {
>> if($output -ne "") { $output += ", " }
>> $output += $item
>> }
>>
PS > $output
Hello, World
Or, save yourself the trouble and use the
-join operator directly:
PS > $list = "Hello","World"
PS > $list -join ", "
Hello, World
For a more structured way to join strings into larger strings or reports, see the section called “Place Formatted Information in a String”.
Use the ToUpper() and ToLower() methods of the string to convert it
to uppercase and lowercase, respectively.
To convert a string to uppercase, use the
ToUpper() method:
PS > "Hello World".ToUpper()
HELLO WORLD
To convert a string to lowercase, use the
ToLower() method:
PS > "Hello World".ToLower()
hello world
Since PowerShell strings are fully featured
.NET objects, they support many string-oriented operations directly. The
ToUpper() and ToLower() methods are two examples of the many
features that the String class
supports. To learn what other functionality the String class supports, see the section called “Learn About Types and Objects”.
Neither PowerShell nor the methods of the
.NET String class directly support
capitalizing only the first letter of a word. If you want to
capitalize only the first character of a word or sentence, try the
following commands:
PS > $text = "hello"
PS > $newText = $text.Substring(0,1).ToUpper() +
>>     $text.Substring(1)
>> $newText
>>
Hello
One thing to keep in mind as you convert a string to uppercase or lowercase is your motivation for doing it. One of the most common reasons is for comparing strings, as shown in Example 5.6, “Using the ToUpper() method to normalize strings”.
Example 5.6. Using the ToUpper() method to normalize strings
## $text comes from the user, and contains the value "quit"
if($text.ToUpper() -eq "QUIT") { ... }
Unfortunately, explicitly changing the
capitalization of strings fails in subtle ways when your script runs in
different cultures. Many cultures follow different capitalization and
comparison rules than you may be used to. For example, the Turkish
language includes two types of the letter "I": one with a dot, and one
without. The uppercase version of the lowercase letter "i" corresponds
to the version of the capital I with a dot, not the capital I used in
QUIT. Those capitalization rules
cause the string comparison code in Example 5.6, “Using the ToUpper() method to normalize strings” to fail in the
Turkish culture.
To compare some input against a hard-coded
string in a case-insensitive manner, the better solution is to use
PowerShell's -eq operator without
changing any of the casing yourself. The -eq operator is case-insensitive and
culture-neutral by default:
PS > $text1 = "Hello"
PS > $text2 = "HELLO"
PS > $text1 -eq $text2
True
For more information about writing culture-aware scripts, see the section called “Write Culture-Aware Scripts”.
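If you need more control over the comparison, a few related options exist. The sketch below assumes a hypothetical $text value standing in for user input; it is one way to approach the problem, not the only one.

```powershell
$text = "quit"   # hypothetical user input

# -eq ignores case; -ceq is its case-sensitive counterpart
$text -eq "QUIT"     # True
$text -ceq "QUIT"    # False

# The .NET ordinal comparison ignores case without applying culture rules
[String]::Equals($text, "QUIT", [StringComparison]::OrdinalIgnoreCase)   # True

# If you truly need uppercase text, the invariant culture gives stable results
$text.ToUpperInvariant()   # QUIT
```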
You want to remove leading or trailing spaces from a string or user input.
Use the Trim() method of the string to remove all
leading and trailing whitespace characters from that string.
PS > $text = " `t Test String`t `t"
PS > "|" + $text.Trim() + "|"
|Test String|
The Trim()
method cleans all whitespace from the beginning and
end of a string. If you want just one or the other, you can also call
the TrimStart() or TrimEnd() method to remove whitespace from the
beginning or the end of the string, respectively. If you want to remove
specific characters from the beginning or end of a string, the Trim(), TrimStart(), and TrimEnd() methods provide options to support
that. To trim a list of specific characters from the end of a string,
provide that list to the method, as shown in Example 5.7, “Trimming a list of characters from the end of a string”.
Example 5.7. Trimming a list of characters from the end of a string
PS > "Hello World".TrimEnd('d','l','r','o','W',' ')
He
At first blush, the following command that
attempts to trim the text "World"
from the end of a string appears to work incorrectly:
PS > "Hello World".TrimEnd(" World")
He
This happens because the TrimEnd() method takes a list of characters
to remove from the end of a string, not a word. PowerShell automatically converts
a string to a list of characters if required, and in this case
converts your string to the characters
W, o, r, l, d,
and a space. TrimEnd() then strips characters from the end of the string
for as long as each one appears in that list, stopping only at the e in Hello.
These are in fact the same characters as were used in
Example 5.7, “Trimming a list of characters from the end of a string”, so it has
the same effect.
If you want to replace text anywhere in a string (and not just from the beginning or end), see the section called “Replace Text in a String”.
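If what you really want is to remove a specific word (rather than a set of characters) from the end of a string, one possible approach is to anchor a regular expression to the end of the string, or to check for the suffix explicitly. The $text and $suffix names below are illustrative.

```powershell
$text = "Hello World"
$suffix = " World"

# Option 1: an escaped pattern anchored to the end of the string with $
$viaRegex = $text -replace ([Regex]::Escape($suffix) + '$'), ''
$viaRegex        # Hello

# Option 2: test for the suffix, then keep everything before it
$viaSubstring = $text
if($text.EndsWith($suffix))
{
    $viaSubstring = $text.Substring(0, $text.Length - $suffix.Length)
}
$viaSubstring    # Hello
```

Note that the -replace operator ignores case, while EndsWith() does not, so the two options differ for input such as "Hello WORLD".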
You want to control the way that PowerShell displays or formats a date.
To control the format of a date, use one of the following options:
The Get-Date cmdlet's -Format parameter:
PS > Get-Date -Date "05/09/1998 1:23 PM" -Format "dd-MM-yyyy @ hh:mm:ss"
09-05-1998 @ 01:23:00
PowerShell's string formatting (-f) operator:
PS > $date = [DateTime] "05/09/1998 1:23 PM"
PS > "{0:dd-MM-yyyy @ hh:mm:ss}" -f $date
09-05-1998 @ 01:23:00
The object's ToString() method:
PS > $date = [DateTime] "05/09/1998 1:23 PM"
PS > $date.ToString("dd-MM-yyyy @ hh:mm:ss")
09-05-1998 @ 01:23:00
The Get-Date cmdlet's -UFormat parameter, which supports Unix
date format strings:
PS > Get-Date -Date "05/09/1998 1:23 PM" -UFormat "%d-%m-%Y @ %I:%M:%S"
09-05-1998 @ 01:23:00
Except for the -UFormat parameter of the Get-Date cmdlet, all date formatting in
PowerShell uses the standard .NET DateTime format strings. These format
strings let you display dates in one of many standard formats (such as
your system's short or long date patterns), or in a completely custom
manner. For more information on how to specify standard .NET DateTime
format strings, see Appendix E, .NET DateTime Formatting.
If you are already used to the Unix-style date
formatting strings (or are converting an existing script that uses a
complex one), the -UFormat parameter
of the Get-Date cmdlet may be
helpful. It accepts the format strings accepted by the Unix date command, but does not provide any
functionality that standard .NET date formatting strings cannot.
When working with the string version of dates
and times, be aware that they are the most common source of
internationalization issues—problems that arise from running a script on
a machine with a different culture than the one it was written on. In
North America "05/09/1998" means "May 9, 1998." In many other cultures,
though, it means "September 5, 1998." Whenever possible, use and compare
DateTime objects (rather than
strings) to other DateTime objects,
as that avoids these cultural differences. Example 5.8, “Comparing DateTime objects with the -gt operator” demonstrates this
approach.
Example 5.8. Comparing DateTime objects with the -gt operator
PS > $dueDate = [DateTime] "01/01/2006"
PS > if([DateTime]::Now -gt $dueDate)
>> {
>> "Account is now due"
>> }
>>
Account is now due
PowerShell always
assumes the North American date format when it interprets a DateTime constant such as [DateTime] "05/09/1998". This is for the
same reason that all languages interpret numeric constants (such as
12.34) in the North American
format. If it did otherwise, nearly every script that dealt with dates
and times would fail on international systems.
For more information about the Get-Date cmdlet, type Get-Help Get-Date. For more information
about dealing with dates and times in a culturally aware manner, see
the section called “Write Culture-Aware Scripts”.
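When a date must travel as text (for example, in a data file), one way to sidestep the ambiguity described above is to state the format and culture explicitly. The following sketch uses the .NET ParseExact() method with the invariant culture; the date and format strings are illustrative.

```powershell
$invariant = [System.Globalization.CultureInfo]::InvariantCulture

# Parse text using an explicit, unambiguous format
$date = [DateTime]::ParseExact("1998-05-09 13:23", "yyyy-MM-dd HH:mm", $invariant)
$date.Month    # 5

# Write it back out the same way, regardless of the machine's culture
$date.ToString("yyyy-MM-dd HH:mm", $invariant)   # 1998-05-09 13:23

# The standard "s" (sortable) format is also culture-independent
$date.ToString("s")                              # 1998-05-09T13:23:00
```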
One of the strongest features of PowerShell is its object-based pipeline. You don't waste your energy creating, destroying, and recreating the object representation of your data. In other shells, you lose the full-fidelity representation of data when the pipeline converts it to pure text. You can regain some of it through excessive text parsing, but not all of it.
However, you still often have to interact with low-fidelity input that originates from outside PowerShell. Text-based data files and legacy programs are two examples.
PowerShell offers great support for two of the three text-parsing staples:
Sed: Replaces text. For that functionality,
PowerShell offers the -replace
operator.
Grep: Searches text. For that functionality,
PowerShell offers the Select-String cmdlet, among others.
The third traditional text-parsing tool,
Awk, lets you chop a line of text into more
intuitive groupings. PowerShell offers the Split() method on strings, but that lacks some
of the power you usually need to break a string into groups.
The Convert-TextObject script presented in Example 5.9, “Convert-TextObject.ps1” lets you convert text streams into a
set of objects that represent those text elements according to the rules
you specify. From there, you can use all of PowerShell's object-based
tools, which gives you even more power than you would get with the
text-based equivalents.
Example 5.9. Convert-TextObject.ps1
param(
    [string] $delimiter,
    [string] $parseExpression,
    [string[]] $propertyName,
    [type[]] $propertyType
)

function Main(
    $inputObjects, $parseExpression, $propertyType,
    $propertyName, $delimiter)
{
    $delimiterSpecified = [bool] $delimiter
    $parseExpressionSpecified = [bool] $parseExpression

    if($delimiterSpecified -and $parseExpressionSpecified)
    {
        Usage
        return
    }

    if(-not $($delimiterSpecified -or $parseExpressionSpecified))
    {
        $delimiter = "\s+"
        $delimiterSpecified = $true
    }

    foreach($inputObject in $inputObjects)
    {
        if(-not $inputObject) { $inputObject = "" }
        foreach($inputLine in $inputObject.ToString())
        {
            ParseTextObject $inputLine $delimiter $parseExpression `
                $propertyType $propertyName
        }
    }
}

function Usage
{
    "Usage: "
    " Convert-TextObject"
    " Convert-TextObject -ParseExpression parseExpression " +
        "[-PropertyName propertyName] [-PropertyType propertyType]"
    " Convert-TextObject -Delimiter delimiter " +
        "[-PropertyName propertyName] [-PropertyType propertyType]"
    return
}

function ParseTextObject
{
    param(
        $textInput, $delimiter, $parseExpression,
        $propertyTypes, $propertyNames)

    $parseExpressionSpecified = -not $delimiter

    $returnObject = New-Object PSObject

    $matches = $null
    $matchCount = 0

    if($parseExpressionSpecified)
    {
        ## Populates $matches with the regular expression groups
        [void] ($textInput -match $parseExpression)
        $matchCount = $matches.Count
    }
    else
    {
        $matches = [Regex]::Split($textInput, $delimiter)
        $matchCount = $matches.Length
    }

    if(-not $matchCount)
    {
        return
    }

    ## Group 0 of a regular expression match is the whole match, so skip it
    $counter = 0
    if($parseExpressionSpecified) { $counter++ }

    for(; $counter -lt $matchCount; $counter++)
    {
        $propertyName = "None"
        $propertyType = [string]

        if($parseExpressionSpecified)
        {
            ## Match groups are one-based, name and type arrays are zero-based
            $propertyName = "P$counter"

            if($counter -le $propertyNames.Length)
            {
                if($propertyNames[$counter - 1])
                {
                    $propertyName = $propertyNames[$counter - 1]
                }
            }

            if($counter -le $propertyTypes.Length)
            {
                if($propertyTypes[$counter - 1])
                {
                    $propertyType = $propertyTypes[$counter - 1]
                }
            }
        }
        else
        {
            $propertyName = "P$($counter + 1)"

            if($counter -lt $propertyNames.Length)
            {
                if($propertyNames[$counter])
                {
                    $propertyName = $propertyNames[$counter]
                }
            }

            if($counter -lt $propertyTypes.Length)
            {
                if($propertyTypes[$counter])
                {
                    $propertyType = $propertyTypes[$counter]
                }
            }
        }

        Add-Note $returnObject $propertyName `
            ($matches[$counter] -as $propertyType)
    }

    $returnObject
}

function Add-Note ($object, $name, $value)
{
    $object | Add-Member NoteProperty $name $value
}

Main $input $parseExpression $propertyType $propertyName $delimiter
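The delimiter-based path through the script can be illustrated in isolation. The following is a minimal sketch rather than a replacement for the script: the sample line, the property names, and the property types are all hypothetical values chosen for the demonstration.

```powershell
# Hypothetical input line and the names and types to apply to its columns
$line = "notepad 1234 12.5"
$names = "Name", "Id", "Cpu"
$types = [string], [int], [double]

# Split on runs of whitespace, the same default delimiter the script uses
$parts = [Regex]::Split($line, "\s+")

# Attach each piece to a new object as a typed note property
$record = New-Object PSObject
for($i = 0; $i -lt $parts.Length; $i++)
{
    $record | Add-Member NoteProperty $names[$i] ($parts[$i] -as $types[$i])
}

# The Id column is now a real integer, so arithmetic works as expected
$record.Id + 1    # 1235
```

Because each property carries a real type, the resulting object can be sorted, filtered, and compared like any other PowerShell object.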
You want to write a script that generates a large report or large amount of data.
The best approach to generating a large amount of data is to take advantage of PowerShell's streaming behavior whenever possible. Opt for solutions that pipeline data between commands:
Get-ChildItem C:\ *.txt -Recurse | Out-File c:\temp\AllTextFiles.txt
rather than collect the output at each stage:
$files = Get-ChildItem C:\ *.txt -Recurse
$files | Out-File c:\temp\AllTextFiles.txt
If your script generates a large text report
(and streaming is not an option), use the StringBuilder class:
$output = New-Object System.Text.StringBuilder
Get-ChildItem C:\ *.txt -Recurse |
Foreach-Object { [void] $output.Append($_.FullName + "`n") }
$output.ToString()
rather than simple text concatenation:
$output = ""
Get-ChildItem C:\ *.txt -Recurse | Foreach-Object { $output += $_.FullName }
$output
In PowerShell, combining commands in a
pipeline is a fundamental concept. As scripts and cmdlets generate
output, PowerShell passes that output to the next command in the
pipeline as soon as it can. In the solution, the Get-ChildItem commands that retrieve all text
files on the C: drive take a very
long time to complete. However, since they begin to
generate data almost immediately, PowerShell can pass that data on to the
next command as soon as the Get-ChildItem cmdlet produces it. This is true
of any commands that generate or consume data and is called
streaming. The pipeline completes almost as soon as
the Get-ChildItem cmdlet finishes
producing its data and uses memory very efficiently as it does
so.
The second Get-ChildItem example (that collects its data)
prevents PowerShell from taking advantage of this streaming opportunity.
It first stores all the files in an array, which, because of the amount
of data, takes a long time and an enormous amount of memory. Then, it sends
all those objects into the output file, which takes a long time as
well.
Most commands can consume data
produced by the pipeline directly, as illustrated by the Out-File cmdlet. For those commands,
PowerShell provides streaming behavior as long as you combine the
commands into a pipeline. For commands that do not support data coming
from the pipeline directly, the Foreach-Object cmdlet (with the aliases
foreach and %) still lets you work
with each piece of data as the previous command produces it, as shown in
the StringBuilder example.
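One way to see streaming in action is to record the order in which two pipeline stages run. The following is a small sketch under the assumption that a three-element range stands in for a slow command such as Get-ChildItem; the $log list is a hypothetical helper used only to capture the order of events.

```powershell
# Hypothetical log that records what each pipeline stage does, in order
$log = New-Object System.Collections.ArrayList

1..3 |
    Foreach-Object { [void] $log.Add("produce $_"); $_ } |
    Foreach-Object { [void] $log.Add("consume $_") }

# If the pipeline streams, each item is consumed before the next is produced
$log -join ", "
# produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
```

The interleaved order shows that the second stage starts working before the first stage has finished, which is what keeps memory use low for large data sets.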
When you generate large reports, it is common to store the entire report into a string, and then write that string out to a file once the script completes. You can usually accomplish this most effectively by streaming the text directly to its destination (a file or the screen), but sometimes this is not possible.
Since PowerShell makes it so easy to add
more text to the end of a string (as in $output += $_.FullName), many initially opt for that
approach. This works great for small-to-medium strings, but causes
significant performance problems for large strings.
As an example of this performance difference, compare the following:
PS > Measure-Command {
>> $output = New-Object Text.StringBuilder
>> 1..10000 |
>> Foreach-Object { $output.Append("Hello World") }
>> }
>>
(...)
TotalSeconds : 2.3471592
PS > Measure-Command {
>> $output = ""
>> 1..10000 | Foreach-Object { $output += "Hello World" }
>> }
>>
(...)
TotalSeconds : 4.9884882
In the .NET Framework (and therefore
PowerShell), strings never change after you create them. When you add
more text to the end of a string, PowerShell has to build a
new string by combining the two smaller strings.
This operation takes a long time for large strings, which is why the
.NET Framework includes the System.Text.StringBuilder class. Unlike
normal strings, the StringBuilder
class assumes that you will modify its data, an assumption that allows
it to adapt to change much more efficiently.
You want to simplify the creation of large amounts of repetitive source code or other text.
Use PowerShell's string formatting operator
(-f) to place dynamic information inside of a
pre-formatted string, and then repeat that replacement for each piece of
dynamic information.
Code generation is a useful technique in nearly any technology that produces output from some text-based input. For example, imagine having to create an HTML report to show all of the processes running on your system at that time. In this case, "code" is the HTML code understood by a web browser.
HTML pages start with some standard text
(<html>, <head>,
<body>), and then you would likely include the
processes in an HTML <table>. Each row would
include columns for each of the properties in the process you're working
with.
Generating this by hand would be mind-numbing and error-prone. Instead, you can write a function to generate the code for the row:
function Get-HtmlRow($process)
{
$template = "<TR> <TD>{0}</TD> <TD>{1}</TD> </TR>"
$template -f $process.Name,$process.ID
}
Then generate the report in milliseconds, rather than hours:
"<HTML><BODY><TABLE>" > report.html
Get-Process | Foreach-Object { Get-HtmlRow $_ } >> report.html
"</TABLE></BODY></HTML>" >> report.html
Invoke-Item .\report.html
In addition to the formatting operator, you
can sometimes use the String.Replace method:
$string = @'
Name is __NAME__
Id is __ID__
'@
$string = $string.Replace("__NAME__", $process.Name)
$string = $string.Replace("__ID__", $process.Id)
This works
well (and is very readable) if you have tight control over the data
you'll be using as replacement text. If it is at all possible for the
replacement text to contain one of the special tags
("__NAME__" or
"__ID__", for example), then they will
also get replaced by further replacements and
corrupt your final output.
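The corruption described above is easy to reproduce. In this sketch, the template and the replacement values are hypothetical, with the name deliberately chosen to look like one of the tags.

```powershell
$template = "Name is __NAME__, Id is __ID__"
$name = "__ID__"    # hypothetical replacement text that looks like a tag
$id = 1234

# Sequential replacement: the second pass also rewrites the inserted name
$sequential = $template.Replace("__NAME__", $name).Replace("__ID__", "$id")
$sequential    # Name is 1234, Id is 1234

# The formatting operator fills every placeholder in a single pass
$singlePass = "Name is {0}, Id is {1}" -f $name, $id
$singlePass    # Name is __ID__, Id is 1234
```

Because the formatting operator never rescans the text it has already inserted, the name survives intact.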
To avoid this issue, you can use the Format-String script:
Example 5.10. Format-String.ps1
<#
.SYNOPSIS
Replaces text in a string based on named replacement tags
.EXAMPLE
PS >.\Format-String "Hello {NAME}" @{ NAME = 'PowerShell' }
Hello PowerShell
#>
param($string, [hashtable] $replacements)
$currentIndex = 0
$replacementList = @()
foreach($key in $replacements.Keys)
{
$string = $string.Replace("{$key}", "{$currentIndex}")
$replacementList += $replacements[$key]
$currentIndex++
}
$string -f $replacementList
PowerShell includes several commands for code
generation that you've probably used without recognizing the "code
generation" aspect of it. The ConvertTo-Html cmdlet
applies code generation of incoming objects to HTML reports. The
ConvertTo-Csv cmdlet applies code generation to CSV
files. The ConvertTo-Xml cmdlet applies code
generation to XML files.
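As a brief illustration of one of these cmdlets acting as a code generator, the following sketch converts a single object to CSV text. The object and its values are hypothetical, and the [PSCustomObject] syntax assumes PowerShell version 3 or later.

```powershell
# Hypothetical object standing in for real command output
$item = [PSCustomObject] @{ Name = "notepad"; Id = 1234 }

# ConvertTo-Csv generates a header line followed by one line per object
$csv = $item | ConvertTo-Csv -NoTypeInformation
$csv
# "Name","Id"
# "notepad","1234"
```

The cmdlet derives the header from the property names, so the generated text follows whatever shape the incoming objects have.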
Code generation techniques seem to come up
naturally when you realize you are writing a report, but are often
missed when writing source code of another programming or scripting
language. For example, imagine you need to write a C# function that
outputs all of the details of a process. The
System.Diagnostics.Process class has a lot of
properties, so that's going to be a long function. Writing it by hand is
going to be difficult, so you can have PowerShell do most of it for
you.
For any object (for example, a process that
you've retrieved from the Get-Process command), you
can access its PsObject.Properties property to get a
list of all of its properties. Each of those has a
Name property, so you can use that to generate the C#
code:
$process.PsObject.Properties |
Foreach-Object {
'Console.WriteLine("{0}: " + process.{0});' -f $_.Name }
This generates over 60 lines of C# source code, rather than having you do it by hand:
Console.WriteLine("Name: " + process.Name);
Console.WriteLine("Handles: " + process.Handles);
Console.WriteLine("VM: " + process.VM);
Console.WriteLine("WS: " + process.WS);
Console.WriteLine("PM: " + process.PM);
Console.WriteLine("NPM: " + process.NPM);
Console.WriteLine("Path: " + process.Path);
Console.WriteLine("Company: " + process.Company);
Console.WriteLine("CPU: " + process.CPU);
Console.WriteLine("FileVersion: " + process.FileVersion);
Console.WriteLine("ProductVersion: " + process.ProductVersion);
(...)
Similar benefits come from generating bulk SQL statements, repetitive data structures, and more.
PowerShell code generation can help you
with large-scale administration tasks even when PowerShell is not
available. Given a large list of input (for example, a complex list of
files to copy), you can easily generate a cmd.exe
batch file or Unix shell script to automate the task. Generate the
script in PowerShell, then invoke it on the system of your
choice!
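A batch file generator along these lines might look like the following sketch. The file list, the destination folder, and the output file name are all assumptions made for illustration.

```powershell
# Hypothetical list of files to copy and an assumed destination folder
$files = "C:\data\report1.txt", "C:\data\report2.txt"
$destination = "D:\backup"

# Start the batch file, then generate one copy command per input file
$batchLines = @("@echo off")
$batchLines += $files | Foreach-Object { 'copy "{0}" "{1}"' -f $_, $destination }

# To save the result, pipe it to a file, for example:
# $batchLines | Out-File copyfiles.cmd -Encoding ASCII
$batchLines
```

The generated lines are plain cmd.exe commands, so the resulting file needs nothing but the command shell on the target machine.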