Learning Perl: 2.2. Strings

2.2. Strings

Strings are sequences of characters (like hello). Strings may contain any combination of any characters.[*] The shortest possible string has no characters. The longest string fills all of your available memory, though you wouldn't be able to do much with that. This is in accordance with the principle of "no built-in limits" that Perl follows at every opportunity. Typical strings are printable sequences of letters, digits, and punctuation in the ASCII 32 to ASCII 126 range. However, the ability to have any character in a string means you can create, scan, and manipulate raw binary data as strings, something with which many other utilities would have great difficulty. For example, you could update a graphical image or compiled program by reading it into a Perl string, making the change, and writing the result back out.

[*] Unlike C or C++, there's nothing special about the NUL character in Perl because Perl uses length counting, not a null byte, to determine the end of the string.
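As a small illustration of the footnote's point, the sketch below puts a NUL byte in the middle of a string and asks Perl for the length. The variable names are made up for this example, and pack is used only as a convenient way to build a string from byte values.

```perl
use strict;
use warnings;

# A string with a NUL byte in the middle; "\0" is the character with code 0.
my $with_nul = "ab\0cd";

# Perl counts characters itself, so the NUL is just another character.
my $len = length $with_nul;

# Arbitrary byte values fit in a string; pack "C*" builds one byte per number.
my $bytes = pack "C*", 0, 255, 10, 65;
my $count = length $bytes;

print "$len $count\n";   # prints "5 4"
```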

Like numbers, strings have a literal representation, which is the way you represent the string in a Perl program. Literal strings come in two different flavors: single-quoted string literals and double-quoted string literals.

2.2.1. Single-Quoted String Literals

A single-quoted string literal is a sequence of characters enclosed in single quotes. The single quotes are not part of the string itself but are there to let Perl identify the beginning and the ending of the string. Any character other than a single quote or a backslash between the quote marks (including newline characters, if the string continues onto successive lines) stands for itself inside a string. To get a backslash, put two backslashes in a row; to get a single quote, put a backslash followed by a single quote:

    'fred'    # those four characters: f, r, e, and d
    'barney'  # those six characters
    ''        # the null string (no characters)
    'Don\'t let an apostrophe end this string prematurely!'
    'the last character of this string is a backslash: \\'
    'hello\n' # hello followed by backslash followed by n
    'hello
    there'    # hello, newline, there (11 characters total)
    '\'\\'    # single quote followed by backslash

The \n within a single-quoted string is not interpreted as a newline but as the two characters backslash and n. Only when the backslash is followed by another backslash or a single quote does it have special meaning.
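One way to check these rules is to count characters. The following is a minimal sketch; the variable names are invented for the example, and the print statements use a double-quoted "\n" only to end each output line.

```perl
use strict;
use warnings;

my $not_a_newline   = 'hello\n';   # h, e, l, l, o, backslash, n
my $quote_and_slash = '\'\\';      # a single quote followed by a backslash
my $one_backslash   = '\\';        # two backslashes in the source, one in the string

print length($not_a_newline), "\n";     # prints 7
print length($quote_and_slash), "\n";   # prints 2
print length($one_backslash), "\n";     # prints 1
```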

2.2.2. Double-Quoted String Literals

A double-quoted string literal is similar to the strings you may have seen in other languages. Once again, it's a sequence of characters, though this time enclosed in double quotes. But now the backslash takes on its full power to specify certain control characters or any character through octal and hex representations. Here are some double-quoted strings:

    "barney"        # just the same as 'barney'
    "hello world\n" # hello world, and a newline
    "The last character of this string is a quote mark: \""
    "coke\tsprite"  # coke, a tab, and sprite

The double-quoted literal string "barney" means the same six-character string to Perl as does the single-quoted literal string 'barney'. It's like what you saw with numeric literals, where 0377 was another way to write 255.0. Perl lets you write the literal in the way that makes more sense to you. Of course, if you wish to use a backslash escape (like \n to mean a newline character), you'll need to use the double quotes.

The backslash can precede different characters to mean different things (generally called a backslash escape). The nearly complete[*] list of double-quoted string escapes is given in Table 2-1.

[*] Recent versions of Perl have introduced Unicode escapes, which we aren't going to show you here.

Table 2-1. Double-quoted string backslash escapes

Construct    Meaning
\n           Newline
\r           Return
\t           Tab
\f           Formfeed
\b           Backspace
\a           Bell
\e           Escape (ASCII escape character)
\007         Any octal ASCII value (here, 007 = bell)
\x7f         Any hex ASCII value (here, 7f = delete)
\cC          A "control" character (here, Ctrl-C)
\\           Backslash
\"           Double quote
\l           Lowercase next letter
\L           Lowercase all following letters until \E
\u           Uppercase next letter
\U           Uppercase all following letters until \E
\Q           Quote non-word characters by adding a backslash until \E
\E           End \L, \U, or \Q
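To see a few of these escapes in action, the sketch below pairs a double-quoted string with the value it should produce and counts the mismatches. The @cases list and the choice of sample strings are assumptions made for this illustration; the character codes assume an ASCII platform.

```perl
use strict;
use warnings;

# Each pair is [ double-quoted string, expected result ].
my @cases = (
    [ "\x41",                'A'               ],   # hex escape
    [ "\101",                'A'               ],   # octal escape
    [ "\007",                "\a"              ],   # octal 007 is the bell
    [ "\cC",                 chr(3)            ],   # Ctrl-C
    [ "\e",                  chr(27)           ],   # ASCII escape
    [ "\x7f",                chr(127)          ],   # delete
    [ "\ufred",              'Fred'            ],   # uppercase next letter
    [ "\lFRED",              'fRED'            ],   # lowercase next letter
    [ "\Ufred\E and barney", 'FRED and barney' ],   # uppercase until \E
    [ "\LFRED\E AND BARNEY", 'fred AND BARNEY' ],   # lowercase until \E
    [ "\Qa.b*c\E",           'a\.b\*c'         ],   # backslash non-word characters
);

# grep in scalar context returns how many pairs differ.
my $mismatches = grep { $_->[0] ne $_->[1] } @cases;
print "mismatches: $mismatches\n";   # prints "mismatches: 0"
```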


Another feature of double-quoted strings is that they are variable interpolated, meaning that some variable names within the string are replaced with their current values when the strings are used. You haven't formally been introduced to what a variable looks like yet, so we'll get back to this later in this chapter.

2.2.3. String Operators

String values can be concatenated with the . operator. (Yes, that's a single period.) This doesn't alter either string, any more than 2+3 alters either 2 or 3. The resulting (longer) string is then available for further computation or assignment to a variable:

    "hello" . "world"       # same as "helloworld"
    "hello" . ' ' . "world" # same as 'hello world'
    'hello world' . "\n"    # same as "hello world\n"

The concatenation must be explicitly requested with the . operator, unlike in some other languages where you merely have to stick the two values next to each other.
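A short sketch of the point that concatenation leaves its operands alone. Variables are only introduced later in the chapter, so treat the names here as placeholders chosen for the example.

```perl
use strict;
use warnings;

my $first  = "hello";
my $second = "world";

# Build a new string; neither operand is modified.
my $joined = $first . ' ' . $second;

print "$joined\n";           # prints "hello world"
print "$first $second\n";    # prints "hello world": operands are unchanged
print length($joined), "\n"; # prints 11
```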

A special string operator is the string repetition operator, consisting of the single lowercase letter x. This operator takes its left operand (a string) and makes as many concatenated copies of that string as indicated by its right operand (a number):

    "fred" x 3       # is "fredfredfred"
    "barney" x (4+1) # is "barney" x 5, or "barneybarneybarneybarneybarney"
    5 x 4            # is really "5" x 4, which is "5555"

That last example is worth spelling out. The string repetition operator wants a string for a left operand, so the number 5 is converted to the string "5" (using rules described in detail in the next section), giving a one-character string. This new string is then copied four times, yielding the four-character string 5555. If you had reversed the order of the operands, as 4 x 5, you would have made five copies of the string 4, yielding 44444. This shows that string repetition is not commutative.

The copy count (the right operand) is first truncated to an integer value (4.8 becomes 4) before being used. A copy count of less than one results in an empty (zero-length) string.
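The rules for the repetition operator can be checked directly. This is a minimal sketch with invented variable names; the copy counts 2.8 and 0 are picked to exercise the truncation and "less than one" cases described above.

```perl
use strict;
use warnings;

my $thrice    = "fred" x 3;    # three copies of "fred"
my $digits    = 5 x 4;         # the number 5 becomes the string "5" first
my $swapped   = 4 x 5;         # five copies of "4": not the same as 5 x 4
my $truncated = "ab" x 2.8;    # copy count truncated to 2
my $empty     = "ab" x 0;      # count below one gives the empty string

print "$thrice\n";             # prints "fredfredfred"
print "$digits $swapped\n";    # prints "5555 44444"
print "$truncated\n";          # prints "abab"
print length($empty), "\n";    # prints 0
```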

2.2.4. Automatic Conversion Between Numbers and Strings

For the most part, Perl automatically converts between numbers and strings as needed. How does it know whether a number or a string is needed? It all depends on the operator being used on the scalar value. If an operator expects a number (as + does), Perl will see the value as a number. If an operator expects a string (like . does), Perl will see the value as a string. You don't need to worry about the difference between numbers and strings; use the proper operators, and Perl will make it all work.

When a string value is used where an operator needs a number (say, for multiplication), Perl automatically converts the string to its equivalent numeric value as if it had been entered as a decimal floating-point value.[*] So "12" * "3" gives the value 36. Trailing nonnumeric characters and leading whitespace are discarded, so "12fred34" * " 3" will give 36 without any complaints.[†] At the extreme end of this, something that isn't a number at all converts to zero. This would happen if you used the string "fred" as a number.

[*] The trick of using a leading zero to mean a non-decimal value works for literals but never for automatic conversion. Use hex() or oct() to convert those kinds of strings.

[†] Unless you request warnings, which we'll discuss in a moment.
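The conversions described here can be tried out as follows. This is a sketch rather than a complete treatment: the variable names are made up, and numeric warnings are switched off inside one block only so that the deliberately sloppy strings convert silently.

```perl
use strict;
use warnings;

my $product = "12" * "3";            # both strings convert cleanly

my ($sloppy, $not_a_number);
{
    no warnings 'numeric';           # keep this demonstration quiet
    $sloppy       = "12fred34" * " 3";   # trailing junk and leading space dropped
    $not_a_number = "fred" + 0;          # nothing numeric at all converts to zero
}

# A leading zero means octal only in a literal, not in a converted string.
my $decimal = "0377" + 0;            # treated as decimal 377
my $octal   = oct("0377");           # explicit octal conversion
my $hex     = hex("ff");             # explicit hex conversion

print "$product $sloppy $not_a_number\n";   # prints "36 36 0"
print "$decimal $octal $hex\n";             # prints "377 255 255"
```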

Likewise, if a numeric value is given when a string value is needed (say, for string concatenation), the numeric value expands into whatever string would have been printed for that number. For example, if you want to concatenate the string Z followed by the result of 5 multiplied by 7,[‡] you can say it this way:

[‡] You'll see about precedence and parentheses shortly.

    "Z" . 5 * 7 # same as "Z" . 35, or "Z35"
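A runnable version of this expression, with a few companions, might look like the sketch below. The extra cases and variable names are illustrative additions; they rely on multiplication and division binding more tightly than the . operator.

```perl
use strict;
use warnings;

my $label   = "Z" . 5 * 7;       # multiplication happens first, then concatenation
my $grouped = "Z" . (5 * 7);     # parentheses make the same grouping explicit
my $quarter = "n=" . 1 / 4;      # the number becomes the string it would print as
my $whole   = "n=" . 10 / 2;     # a whole-number result prints without ".0"

print "$label $grouped\n";       # prints "Z35 Z35"
print "$quarter $whole\n";       # prints "n=0.25 n=5"
```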

In other words, you don't have to worry about whether you have a number or a string (most of the time). Perl performs all the conversions for you.[§]

[§] And if you're worried about efficiency, don't be. Perl generally remembers the result of a conversion so it's done only once.
