en/eBook/07.3.md
22 additions & 22 deletions
@@ -1,24 +1,24 @@
# 7.3 Regexp

-Regexp is a complicated but powerful tool for pattern match and text manipulation. Although its performance is lower than pure text match, it's more flexible. Base on its syntax, you can almost filter any kind of text from your source content. If you need to collect data in web development, it's not hard to use Regexp to have meaningful data.
+Regexp is a complicated but powerful tool for pattern matching and text manipulation. Although it does not perform as well as pure text matching, it's more flexible. Based on its syntax, you can filter almost any kind of text from your source content. If you need to collect data in web development, it's not hard to use Regexp to retrieve meaningful data.

-Go has package `regexp` as official support for regexp, if you've already used regexp in other programming languages, you should be familiar with it. Note that Go implemented RE2 standard except `\C`, more details: [http://code.google.com/p/re2/wiki/Syntax](http://code.google.com/p/re2/wiki/Syntax).
+Go has the `regexp` package, which provides official support for regexp. If you've already used regexp in other programming languages, you should be familiar with it. Note that Go implements the RE2 standard except for `\C`. For more details, follow this link: [http://code.google.com/p/re2/wiki/Syntax](http://code.google.com/p/re2/wiki/Syntax).

-Actually, package `strings` does many jobs like search(Contains, Index), replace(Replace), parse(Split, Join), etc. and it's faster than Regexp, but these are simple operations. If you want to search a string without case sensitive, Regexp should be your best choice. So if package `strings` can achieve your goal, just use it, it's easy to use and read; if you need to more advanced operation, use Regexp obviously.
+Go's `strings` package can actually do many jobs like searching (Contains, Index), replacing (Replace), parsing (Split, Join), etc., and it's faster than Regexp. However, these are all simple operations. If you want to perform a case-insensitive search, Regexp should be your best choice. So, if the `strings` package is sufficient for your needs, just use it, since it's easy to use and read; if you need to perform more advanced operations, use Regexp.
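
To make the trade-off above concrete, here is a minimal sketch of a case-insensitive search. The helper name `containsFold` and the sample text are assumptions made for illustration; they do not come from the chapter. The sketch relies on the `(?i)` flag from the RE2 syntax and on `regexp.QuoteMeta` to treat the search word literally.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// containsFold reports whether s contains word, ignoring case.
// It is a hypothetical helper: QuoteMeta escapes the word so that it is
// matched literally, and the (?i) flag enables case-insensitive matching.
func containsFold(s, word string) bool {
	re := regexp.MustCompile(`(?i)` + regexp.QuoteMeta(word))
	return re.MatchString(s)
}

func main() {
	text := "Build Web Application with GOLANG"

	// strings.Contains is case-sensitive, so this search misses.
	fmt.Println(strings.Contains(text, "golang")) // false

	// The regexp-based helper finds the word regardless of case.
	fmt.Println(containsFold(text, "golang")) // true
}
```

Compiling the pattern on every call keeps the sketch short; code that searches repeatedly would likely compile once and reuse the resulting `*regexp.Regexp`.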

-If you remember form verification we talked before, we used Regexp to verify if input information is valid there already. Be aware that all characters are UTF-8, and let's learn more about Go `regexp`!
+If you recall form verification from previous sections, we used Regexp to verify the validity of user input. Be aware that all characters are UTF-8. Let's learn more about the Go `regexp` package!

## Match

-Package `regexp` has 3 functions to match, if it matches returns true, returns false otherwise.
+The `regexp` package has 3 functions for matching. Each returns true if the input matches the pattern and false otherwise.

    func Match(pattern string, b []byte) (matched bool, error error)
    func MatchReader(pattern string, r io.RuneReader) (matched bool, error error)
    func MatchString(pattern string, s string) (matched bool, error error)

-All of 3 functions check if `pattern` matches input source, returns true if it matches, but if your Regex has syntax error, it will return error. The 3 input sources of these functions are `slice of byte`, `RuneReader` and `string`.
+All 3 functions check whether `pattern` matches the input source, returning true if it does. However, if your Regex has a syntax error, an error is returned. The 3 input sources of these functions are `[]byte` (slice of byte), `io.RuneReader` and `string`.
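
As a rough illustration of the three input sources, the sketch below runs one pattern through `Match`, `MatchReader` and `MatchString`. The helper `matchAll`, the pattern and the sample input are assumptions chosen for this example only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchAll is a hypothetical helper that runs the same pattern against the
// three input kinds and returns the results in this order:
// []byte, io.RuneReader, string.
func matchAll(pattern, input string) ([3]bool, error) {
	var out [3]bool
	var err error
	if out[0], err = regexp.Match(pattern, []byte(input)); err != nil {
		return out, err
	}
	// *strings.Reader implements io.RuneReader.
	if out[1], err = regexp.MatchReader(pattern, strings.NewReader(input)); err != nil {
		return out, err
	}
	if out[2], err = regexp.MatchString(pattern, input); err != nil {
		return out, err
	}
	return out, nil
}

func main() {
	res, err := matchAll(`^[0-9]+$`, "2024")
	fmt.Println(res, err) // [true true true] <nil>

	// An unbalanced parenthesis is a syntax error, so an error comes back.
	_, err = matchAll(`(`, "2024")
	fmt.Println(err != nil) // true
}
```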

-Here is an example to verify IP address:
+Here is an example of how to verify an IP address:

    func IsIP(ip string) (b bool) {
        if m, _ := regexp.MatchString("^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}$", ip); !m {
@@ -27,7 +27,7 @@ Here is an example to verify IP address:
        return true
    }

-As you can see, using pattern in package `regexp` is not that different. One more example, to verify if user input is valid:
+As you can see, using patterns in the `regexp` package is not that different. Here's one more example, which verifies whether user input is valid:

    func main() {
        if len(os.Args) == 1 {
@@ -40,13 +40,13 @@ As you can see, using pattern in package `regexp` is not that different. One mor
        }
    }

-In above examples, we use `Match(Reader|Sting)` to check if content is valid, they are all easy to use.
+In the above examples, we use `Match(Reader|String)` to check whether content is valid; they are all easy to use.

## Filter

-Match mode can verify content, but it cannot cut, filter or collect data from content. If you want to do that, you have to use complex mode of Regexp.
+Match mode can verify content, but it cannot cut, filter or collect data from it. If you want to do that, you have to use the complex mode of Regexp.
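
A small sketch of what "collect" and "cut" can look like in practice, assuming two hypothetical helpers, `extractNumbers` and `stripTags`, and made-up input strings. It uses only `FindAllString` and `ReplaceAllString` on a compiled pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractNumbers collects every run of digits found in s.
func extractNumbers(s string) []string {
	re := regexp.MustCompile(`[0-9]+`)
	// -1 means there is no limit on the number of matches returned.
	return re.FindAllString(s, -1)
}

// stripTags removes anything that looks like an HTML tag.
func stripTags(s string) string {
	re := regexp.MustCompile(`<[^>]*>`)
	return re.ReplaceAllString(s, "")
}

func main() {
	fmt.Println(extractNumbers("order 66, item 1138")) // [66 1138]
	fmt.Println(stripTags("<p>Hello, <b>Go</b>!</p>")) // Hello, Go!
}
```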

-Sometimes we need to write a crawl, here is an example that shows you have to use Regexp to filter and cut data.
+Let's say we need to write a crawler. Here is an example that shows how to use Regexp to filter and cut data.

    package main
@@ -95,7 +95,7 @@ Sometimes we need to write a crawl, here is an example that shows you have to us
        fmt.Println(strings.TrimSpace(src))
    }

-In this example, we use Compile as the first step for complex mode. It verifies if your Regex syntax is correct, then returns `Regexp` for parsing content in other operations.
+In this example, we use Compile as the first step for complex mode. It verifies that your Regex syntax is correct, then returns a `Regexp` for parsing content in other operations.

Here are some functions to parse your Regexp syntax:
@@ -104,9 +104,9 @@ Here are some functions to parse your Regexp syntax:
    func MustCompile(str string) *Regexp
    func MustCompilePOSIX(str string) *Regexp

-The difference between `ComplePOSIX` and `Compile` is that the former has to use POSIX syntax which is leftmost longest search, and the latter is only leftmost search. For instance, for Regexp `[a-z]{2,4}` and content `"aa09aaa88aaaa"`, `CompilePOSIX` returns `aaaa` but `Compile` returns `aa`. `Must` prefix means panic when the Regexp syntax is not correct, returns error only otherwise.
+The difference between `CompilePOSIX` and `Compile` is that the former restricts the pattern to POSIX syntax and uses leftmost-longest matching, while the latter uses leftmost-first matching. For instance, for Regexp `a|ab` and content `"ab"`, `CompilePOSIX` finds `ab` but `Compile` finds `a`. The `Must` prefix means the function panics when the Regexp syntax is not correct, whereas the versions without it return an error instead.
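
The following sketch shows one pattern where the two matching modes disagree, along with the error-versus-panic behavior of the `Must` variants. The helpers `firstMatch` and `mustCompilePanics` and the pattern `a|ab` are assumptions picked for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch is a hypothetical helper that compiles pattern in both modes
// and returns what each one finds first in s.
func firstMatch(pattern, s string) (string, string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", "", err
	}
	rePOSIX, err := regexp.CompilePOSIX(pattern)
	if err != nil {
		return "", "", err
	}
	return re.FindString(s), rePOSIX.FindString(s), nil
}

// mustCompilePanics reports whether regexp.MustCompile panics for pattern.
func mustCompilePanics(pattern string) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	regexp.MustCompile(pattern)
	return false
}

func main() {
	// Leftmost-first stops at the first alternative that matches ("a"),
	// while leftmost-longest prefers the longer alternative ("ab").
	first, longest, _ := firstMatch(`a|ab`, "ab")
	fmt.Println(first, longest) // a ab

	// Compile reports bad syntax as an error value.
	_, err := regexp.Compile(`[a-z`)
	fmt.Println(err != nil) // true

	// MustCompile panics on the same input.
	fmt.Println(mustCompilePanics(`[a-z`)) // true
}
```

`MustCompile` tends to suit patterns that are fixed at build time, such as package-level variables, while `Compile` is the safer choice for patterns that come from user input.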

-After you knew how to create a new Regexp, let's see this struct provides what methods that help us to operate content:
+Now that we know how to create a new Regexp, let's see how the methods provided by this struct can help us operate on content:

    func (re *Regexp) Find(b []byte) []byte
    func (re *Regexp) FindAll(b []byte, n int) [][]byte
@@ -127,7 +127,7 @@ After you knew how to create a new Regexp, let's see this struct provides what m
-These 18 methods including same function for different input sources(byte slice, string and io.RuneReader), we can simplify it by ignoring input sources as follows:
+These 18 methods include identical functions for different input sources (byte slice, string and io.RuneReader), so we can simplify this list by ignoring input sources as follows:

    func (re *Regexp) Find(b []byte) []byte
    func (re *Regexp) FindAll(b []byte, n int) [][]byte
@@ -194,13 +194,13 @@ Code sample:
        fmt.Println(submatchallindex)
    }

-As we introduced before, Regexp also has 3 methods for matching, they do exactly same thing as exported functions, those exported functions call these methods underlying:
+As we've previously introduced, Regexp also has 3 methods for matching. They do exactly the same thing as the exported functions. In fact, those exported functions call these methods under the hood:
-At this point, you learned whole package `regexp` in Go, I hope you can understand more by studying examples of key methods, and do something interesting by yourself.
+At this point, you've learned the whole `regexp` package in Go. I hope that you can understand more by studying examples of key methods, so that you can do something interesting on your own.