2018-10-10 22:03:03 +00:00
|
|
|
|
---
|
|
|
|
|
id: 594faaab4e2a8626833e9c3d
|
2020-12-16 07:37:30 +00:00
|
|
|
|
title: 使用转义标记字符串
|
2018-10-10 22:03:03 +00:00
|
|
|
|
challengeType: 5
|
|
|
|
|
videoUrl: ''
|
2021-01-13 02:31:00 +00:00
|
|
|
|
dashedName: tokenize-a-string-with-escaping
|
2018-10-10 22:03:03 +00:00
|
|
|
|
---
|
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
# --description--
|
2018-10-10 22:03:03 +00:00
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
<p>编写一个函数或程序,可以在分隔符每次未被转义出现的位置拆分字符串。</p><p>它应该接受三个输入参数:</p> <b>字符串</b> <b>分隔符字符</b> <b>转义字符</b> <p>它应该输出一个字符串列表。</p><p>拆分规则:</p>由分隔符分隔的字段将成为输出列表的元素。应保留空字段,即使出现在开头和结尾也是如此。<p>转义规则:</p>“被转义”意味着其前面出现了一个自身未被转义的转义字符。当转义字符位于没有特殊含义的字符之前时,它仍然被视为转义符(但不会做任何特殊操作)。用于转义某些内容的每次出现的转义字符都不应成为输出的一部分。<p>证明您的函数满足以下测试用例:给定字符串</p><pre>one^|uno||three^^^^|four^^^|^cuatro|</pre>使用<pre>|</pre>作为分隔符、<pre>^</pre>作为转义字符,您的函数应输出以下数组:<p></p><pre>['one|uno', '', 'three^^', 'four^|cuatro', '']
</pre>
|
2018-10-10 22:03:03 +00:00
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
# --hints--
|
2018-10-10 22:03:03 +00:00
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
`tokenize`是一个函数。
|
2018-10-10 22:03:03 +00:00
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
```js
|
|
|
|
|
assert(typeof tokenize === 'function');
|
2018-10-10 22:03:03 +00:00
|
|
|
|
```
|
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
`tokenize`应该返回一个数组。
|
2018-10-10 22:03:03 +00:00
|
|
|
|
|
|
|
|
|
```js
|
2020-12-16 07:37:30 +00:00
|
|
|
|
assert(typeof tokenize('a', 'b', 'c') === 'object');
|
2018-10-10 22:03:03 +00:00
|
|
|
|
```
|
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
`tokenize("one^|uno||three^^^^|four^^^|^cuatro|", "|", "^")`应返回`["one|uno", "", "three^^", "four^|cuatro", ""]`
|
2018-10-10 22:03:03 +00:00
|
|
|
|
|
|
|
|
|
```js
|
2020-12-16 07:37:30 +00:00
|
|
|
|
assert.deepEqual(tokenize(testStr1, '|', '^'), res1);
|
2018-10-10 22:03:03 +00:00
|
|
|
|
```
|
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
`tokenize("a@&bcd&ef&&@@hi", "&", "@")`应返回`["a&bcd", "ef", "", "@hi"]`
|
2018-10-10 22:03:03 +00:00
|
|
|
|
|
|
|
|
|
```js
|
2020-12-16 07:37:30 +00:00
|
|
|
|
assert.deepEqual(tokenize(testStr2, '&', '@'), res2);
|
2018-10-10 22:03:03 +00:00
|
|
|
|
```
|
2020-08-13 15:24:35 +00:00
|
|
|
|
|
2021-01-13 02:31:00 +00:00
|
|
|
|
# --seed--
|
|
|
|
|
|
|
|
|
|
## --after-user-code--
|
|
|
|
|
|
|
|
|
|
```js
|
|
|
|
|
// Fixtures shared by the hint assertions: raw input strings and the
// token lists the solver's tokenize() must produce for them.
const testStr1 = 'one^|uno||three^^^^|four^^^|^cuatro|';
const res1 = ['one|uno', '', 'three^^', 'four^|cuatro', ''];

// TODO add more tests
const testStr2 = 'a@&bcd&ef&&@@hi';
const res2 = ['a&bcd', 'ef', '', '@hi'];
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## --seed-contents--
|
|
|
|
|
|
|
|
|
|
```js
|
|
|
|
|
// Challenge seed: deliberately incomplete starting point for the learner.
// NOTE(review): the stub returns `true` on purpose so the hint tests fail
// until the learner implements the tokenizer.
function tokenize(str, sep, esc) {
  return true;
}
|
|
|
|
|
```
|
|
|
|
|
|
2020-12-16 07:37:30 +00:00
|
|
|
|
# --solutions--
|
|
|
|
|
|
2021-01-13 02:31:00 +00:00
|
|
|
|
```js
|
|
|
|
|
// tokenize :: String -> Character -> Character -> [String]
// Splits `str` on every unescaped occurrence of `charDelim`.
// `charEsc` escapes the character that follows it and is itself
// dropped from the output; empty fields (including leading and
// trailing ones) are preserved.
function tokenize(str, charDelim, charEsc) {
  const tokens = [];
  let current = '';
  let escaped = false;

  for (const ch of str) {
    // A delimiter or escape char only acts specially when it is
    // not itself escaped by the previous character.
    const isBreak = !escaped && ch === charDelim;
    const isEscChar = !escaped && ch === charEsc;

    if (isBreak) {
      // Unescaped delimiter: close the current field, start a new one.
      tokens.push(current);
      current = '';
    } else if (!isEscChar) {
      // Ordinary character (or an escaped special one): keep it.
      // An active escape char itself is consumed, never emitted.
      current += ch;
    }

    // The escape applies to exactly one following character.
    escaped = isEscChar;
  }

  // The final field ends at end-of-string rather than at a delimiter.
  tokens.push(current);
  return tokens;
}
|
|
|
|
|
```
|