How to Use String Substitution in Ruby

Splitting a string is only one way to manipulate string data. You can also make substitutions to replace one part of a string with another string. For instance, in an example string (foo,bar,baz), replacing "foo" with "boo" would yield "boo,bar,baz." You can do this and many more things using the sub and gsub methods of the String class.

Many Options for Ruby Substitution

The substitution methods come in two varieties. The sub method is the more basic of the two and comes with the fewest surprises. It simply replaces the first instance of the designated pattern with the replacement.

Whereas sub replaces only the first instance, the gsub method replaces every instance of the pattern with the replacement. In addition, both sub and gsub have sub! and gsub! counterparts. Remember, by convention, methods in Ruby that end in an exclamation point modify the object in place instead of returning a modified copy. Note that sub! and gsub! return nil when no substitution was made.
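
As a rough illustration of the difference between the four methods, here is a minimal sketch. The variable names and the sample string are made up for this example and are not part of the original article.

```ruby
original = "foo,bar,foo"

# sub returns a new string with only the first match replaced.
first_only = original.sub("foo", "boo")

# gsub returns a new string with every match replaced.
every_one = original.gsub("foo", "boo")

puts first_only   # boo,bar,foo
puts every_one    # boo,bar,boo
puts original     # foo,bar,foo (the receiver is untouched)

# The bang versions change the receiver itself.
in_place = original.dup
in_place.gsub!("foo", "boo")
puts in_place     # boo,bar,boo

# When nothing matches, the bang versions return nil.
no_match = in_place.sub!("xyz", "abc")
p no_match        # nil
```

Because sub! and gsub! can return nil, it is usually safer not to chain further method calls onto their return value.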

Search and Replace

The most basic usage of the substitution methods is to replace one static search string with one static replacement string. In the above example, "foo" was replaced with "boo." This can be done for the first occurrence of "foo" in the string using the sub method or with all occurrences of "foo" using the gsub method.

    a = "foo,bar,baz"
    b = a.sub( "foo", "boo" )   # replace the first "foo" with "boo"
    puts b

    gsub$ ./1.rb
    boo,bar,baz

Flexible Searching

Searching for static strings can only go so far. Eventually, you'll run into cases where a subset of strings or strings with optional components will need to be matched. The substitution methods can, of course, match regular expressions instead of static strings. This allows them to be much more flexible and match virtually any text you can dream up.

This example is a little more real world. Imagine a set of comma-separated values. These values are fed into a tabulation program over which you have no control (closed source). The program that generates these values is closed source as well, but it's outputting some badly formatted data. Some fields have spaces after the comma, and this is causing the tabulator program to break.

One possible solution is to write a Ruby program to act as "glue," or a filter, between the two programs. This Ruby program will fix any problems in the data formatting so the tabulator can do its job. The fix is quite simple: replace a comma followed by one or more spaces with just a comma.

    STDIN.each do |l|
      l.gsub!( /, +/, "," )   # a comma plus one or more spaces becomes a bare comma
      puts l
    end

    10, 20, 30
    12.8, 10.4,11

    gsub$ cat data.txt | ./2.rb
    10,20,30
    12.8,10.4,11
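
The same regular expression can be tried on plain strings without piping a file through STDIN. The sketch below is only an illustration: the helper name squeeze_after_commas is hypothetical, and it uses the non-destructive gsub rather than gsub! so the input string is left alone.

```ruby
# Hypothetical helper (not from the article): removes the spaces
# that follow a comma, using the same pattern as the filter above.
def squeeze_after_commas(line)
  line.gsub(/, +/, ",")
end

puts squeeze_after_commas("10, 20,   30")    # 10,20,30
puts squeeze_after_commas("12.8, 10.4,11")   # 12.8,10.4,11

# With sub instead of gsub, only the first comma is cleaned up.
puts "10, 20, 30".sub(/, +/, ",")            # 10,20, 30
```

Fields that already have no space after the comma do not match the pattern, so they pass through unchanged.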

Flexible Replacements

Now imagine this situation. In addition to the minor formatting errors, the program that produces the data writes numbers in scientific notation. The tabulator program doesn't understand this notation, so you're going to have to replace it. Obviously, a simple gsub with a fixed replacement string won't do here, because the replacement text is different for every number matched.

Luckily, the substitution methods can take a block in place of the replacement argument. Each time the search string is found, the text that matched the search string (or regex) is passed to this block. The value returned by the block is used as the substitution string. In this example, a floating point number in scientific notation form (such as 1.232e4) is converted to a normal number with a decimal point. The string is converted to a number with to_f, then the number is formatted using a format string.

    STDIN.each do |l|
      # Rewrite each number in scientific notation as a fixed-point number.
      l.gsub!( /-?\d+\.\d+e-?\d+/ ) do |n|
        "%.3f" % n.to_f
      end

      l.gsub!( /, +/, "," )
      puts l
    end

    2.215e-1, 54, 11
    3.15668e6, 21, 7

    gsub$ cat floatdata.txt | ./3.rb
    0.222,54,11
    3156680.000,21,7
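
The block form can also be tried on individual strings. The following is a small sketch under some assumptions: the constant SCI_NOTATION and the method expand_scientific are names invented for this example, and the sample values are chosen so the expected output is easy to check by hand.

```ruby
# Same pattern as in the filter above, stored in a constant
# (the constant name is an assumption for this sketch).
SCI_NOTATION = /-?\d+\.\d+e-?\d+/

# Hypothetical helper: each match is passed to the block, and the
# block's return value becomes the replacement text.
def expand_scientific(line)
  line.gsub(SCI_NOTATION) { |match| "%.3f" % match.to_f }
end

puts expand_scientific("1.5e3, 54, 2.5e-1")   # 1500.000, 54, 0.250
puts expand_scientific("-2.5e2,7")            # -250.000,7
```

Plain numbers such as 54 do not match the pattern and are left as they are. The pattern as written only recognizes a lowercase e, so a value like 1.5E3 would also pass through unchanged.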

Not Familiar With Regular Expressions?

Let's take a step back and look at that regular expression. It looks cryptic and complicated, but it's actually very simple. Regular expressions can seem cryptic if you're not familiar with them; once you are, they're a straightforward and natural way of describing text. This one is made up of a handful of elements, several of which have quantifiers.

The primary element here is the \d character class. This will match any digit, the characters 0 through 9. The quantifier + is used with the digit character class to signify that one or more of these digits should be matched in a row. You have three groups of digits: the first two are separated by a "." and the third is set off by the letter "e" (for exponent).

The second element floating around is the minus character, which uses the "?" quantifier. This means "zero or one" of these elements. So, in short, there may or may not be a negative sign at the beginning of the number or of the exponent.

The two other elements are the . (period) character and the e character. The period is written as \. in the pattern because an unescaped period matches any character. Combine all this, and you get a regular expression (or set of rules for matching text) that matches numbers in scientific form (such as 12.34e56).
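
One way to make the pieces described above visible is Ruby's extended (x) regex mode, which ignores whitespace inside the pattern and allows a comment on each line. This is only a sketch: the constant and method names are assumptions made for the example, and the pattern is the same one used in the filter, just spread out.

```ruby
# The article's pattern written in extended mode, one element per line.
SCI_NOTATION_VERBOSE = /
  -?     # optional minus sign on the number
  \d+    # one or more digits
  \.     # a literal period, escaped because a bare period matches any character
  \d+    # one or more digits
  e      # the letter e, marking the exponent
  -?     # optional minus sign on the exponent
  \d+    # one or more digits of exponent
/x

# Hypothetical helper: returns the first matching substring, or nil.
def first_scientific_number(text)
  text[SCI_NOTATION_VERBOSE]
end

p first_scientific_number("12.34e56")                  # "12.34e56"
p first_scientific_number("reading: -1.232e4 volts")   # "-1.232e4"
p first_scientific_number("12e5")                      # nil
```

The last call returns nil because the pattern requires digits on both sides of a period, so a number written without a decimal point is not recognized.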

Translated from: https://www.thoughtco.com/string-substitution-in-ruby-2907752