
id	title	challengeType	forumTopicId	dashedName
594faaab4e2a8626833e9c3d	Tokenize a string with escaping	1	302338	tokenize-a-string-with-escaping

--description--

Write a function or program that can split a string at each non-escaped occurrence of a separator character.

It should accept three input parameters:

The string
The separator character
The escape character

It should output a list of strings.

Rules for splitting:

The fields that were separated by the separators become the elements of the output list.
Empty fields should be preserved, even at the start and end.
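
As a small illustration of the empty-field rule, and leaving escaping aside entirely, JavaScript's built-in `String.prototype.split` already keeps empty fields at the start, in the middle, and at the end. The input string below is a made-up example, not one of the challenge's tests.

```javascript
// Hypothetical input chosen only to show empty fields at both ends and in the middle.
const fields = '|a||b|'.split('|');

console.log(fields); // [ '', 'a', '', 'b', '' ]
```

A plain `split` cannot tell escaped separators from unescaped ones, so it only covers the splitting rules, not the escaping rules.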

Rules for escaping:

"Escaped" means preceded by an occurrence of the escape character that is not already escaped itself.
When the escape character precedes a character that has no special meaning, it still counts as an escape (but does not do anything special).
Each occurrence of the escape character that was used to escape something should not become part of the output.
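
One way to read the escaping rules is as a single pass over the string that carries one boolean flag. The sketch below is only an illustration of that reading, not the reference solution. The name `splitEscaped` is hypothetical, and it assumes that an escape character at the very end of the string, with nothing left to escape, is simply dropped.

```javascript
// splitEscaped is a hypothetical name used for illustration only.
function splitEscaped(str, sep, esc) {
  const fields = [];
  let current = '';
  // True when the previous character was an escape character that was not itself escaped.
  let escaped = false;

  for (const ch of str) {
    if (escaped) {
      // An escaped character is always taken literally, whatever it is.
      current += ch;
      escaped = false;
    } else if (ch === esc) {
      // Start an escape; the escape character itself is not copied to the output.
      escaped = true;
    } else if (ch === sep) {
      // An unescaped separator closes the current field, even if it is empty.
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }

  // The field after the last separator is kept, even if it is empty.
  // Assumption: a dangling escape at the end of the input is discarded.
  fields.push(current);
  return fields;
}

console.log(splitEscaped('a^|b|c^^|', '|', '^')); // [ 'a|b', 'c^', '' ]
```

In the example call, `^|` yields a literal `|`, `^^` yields a literal `^`, and the final unescaped `|` produces a trailing empty field.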

Demonstrate that your function satisfies the following test case:

Given the string

one^|uno||three^^^^|four^^^|^cuatro|

and using | as a separator and ^ as escape character, your function should output the following array:

  ['one|uno', '', 'three^^', 'four^|cuatro', '']

--hints--

tokenize should be a function.

assert(typeof tokenize === 'function');

tokenize should return an array.

assert(Array.isArray(tokenize('a', 'b', 'c')));

tokenize('one^|uno||three^^^^|four^^^|^cuatro|', '|', '^') should return ['one|uno', '', 'three^^', 'four^|cuatro', '']

assert.deepEqual(tokenize(testStr1, '|', '^'), res1);

tokenize('a@&bcd&ef&&@@hi', '&', '@') should return ['a&bcd', 'ef', '', '@hi']

assert.deepEqual(tokenize(testStr2, '&', '@'), res2);

--seed--

--after-user-code--

const testStr1 = 'one^|uno||three^^^^|four^^^|^cuatro|';
const res1 = ['one|uno', '', 'three^^', 'four^|cuatro', ''];

// TODO add more tests
const testStr2 = 'a@&bcd&ef&&@@hi';
const res2 = ['a&bcd', 'ef', '', '@hi'];

--seed-contents--

function tokenize(str, sep, esc) {
  return true;
}

--solutions--

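// tokenize :: String -> Character -> Character -> [String]
function tokenize(str, charDelim, charEsc) {
  // Fold over the characters, carrying the escape state, the field being
  // built, and the list of completed fields.
  const dctParse = str.split('')
    .reduce((a, x) => {
      const blnEsc = a.esc; // was the previous character an active escape?
      const blnBreak = !blnEsc && x === charDelim; // unescaped separator
      const blnEscChar = !blnEsc && x === charEsc; // unescaped escape character

      return {
        esc: blnEscChar,
        // A separator starts a new field; an active escape character is dropped.
        token: blnBreak ? '' : (
          a.token + (blnEscChar ? '' : x)
        ),
        // On a separator, the finished field is appended to the list.
        list: a.list.concat(blnBreak ? a.token : [])
      };
    }, {
      esc: false,
      token: '',
      list: []
    });

  // The field after the last separator is appended, even if it is empty.
  return dctParse.list.concat(
    dctParse.token
  );
}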
