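JavaScript: negative lookbehind equivalent?
Is there a way to achieve the equivalent of a negative lookbehind in JavaScript regular expressions? I need to match a string that does not start with a specific set of characters.
I can't seem to find a regex that does this without failing when the matched part is at the beginning of the string. Negative lookbehinds seem to be the only answer, but JavaScript doesn't have them.
EDIT: This is the regex that I would like to work, but it doesn't: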
(?<!([abcdefg]))m
So it would match the 'm' in 'jim' or 'm', but not in 'jam'.
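Lookbehind assertions were accepted into the ECMAScript specification in 2018. So far, they have only been implemented in V8. So if you are developing for a Chrome-only environment (such as Electron) or for Node, you can start using lookbehinds today.
Positive lookbehind usage: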
console.log(
"$9.99 €8.47".match(/(?<=\$)\d+(\.\d*)?/) // Matches "9.99"
);
Negative lookbehind usage:
console.log(
"$9.99 €8.47".match(/(?<!\$)\d+(?:\.\d*)/) // Matches "8.47"
);
Platform support:
- ✔ V8
- ❌ Mozilla Firefox (SpiderMonkey) is working on it.
- ❌ Microsoft was working on it for Chakra, but the next version of Edge will be built on Chromium and will therefore support it.
- ❌ Apple Safari (WebKit) is working on it.
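Since support differs between engines, one option is to probe for it at runtime. The following is a minimal sketch, not an established API: the helper name supportsLookbehind is hypothetical. It relies only on the fact that the RegExp constructor throws a SyntaxError for patterns the engine cannot parse.

```javascript
// Hypothetical helper: reports whether this engine can parse lookbehind syntax.
function supportsLookbehind() {
  try {
    // Build the pattern from a string so that an engine without lookbehind
    // throws a catchable SyntaxError here, instead of failing to parse
    // the whole script because of a regex literal.
    new RegExp("(?<!a)b");
    return true;
  } catch (err) {
    return false;
  }
}

console.log(supportsLookbehind());
```

Code that must run on older engines could use such a check to choose between a native lookbehind and one of the workarounds discussed in this thread.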
Since 2018, Lookbehind Assertions are part of the ECMAScript language specification.
// positive lookbehind
(?<=...)
// negative lookbehind
(?<!...)
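Applied to the case from the question, the negative form might look like the sketch below. It assumes an engine that supports lookbehind; the variable names are only illustrative.

```javascript
// Requires an engine with lookbehind support.
// Matches an "m" that is not immediately preceded by one of a..g.
const mNotAfterAToG = /(?<![a-g])m/;

const outcomes = ["jim", "m", "jam"].map(s => mNotAfterAToG.test(s));
console.log(outcomes); // [ true, true, false ]
```

The lone "m" matches because a negative lookbehind succeeds when there is nothing before the current position.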
Answer pre-2018
As JavaScript supports negative lookahead, one way to do it is:
- reverse the input string
- match it against a reversed regex
- reverse and reformat the matches
const reverse = s => s.split('').reverse().join('');
const test = (stringToTests, reversedRegexp) => stringToTests
.map(reverse)
.forEach((s,i) => {
const match = reversedRegexp.test(s);
console.log(stringToTests[i], match, 'token:', match ? reverse(reversedRegexp.exec(s)[0]) : 'Ø');
});
Example 1:
Following @andrew-ensley's question:
test(['jim', 'm', 'jam'], /m(?!([abcdefg]))/)
Outputs:
jim true token: m
m true token: m
jam false token: Ø
Example 2:
Following @neaumusic's comment (match max-height but not line-height, the token being height):
test(['max-height', 'line-height'], /thgieh(?!(-enil))/)
Outputs:
max-height true token: height
line-height false token: Ø
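If the position of the token in the original string is needed, the index found in the reversed string has to be mapped back. A possible sketch follows; the helper names reverseString and indexOfReversedMatch are hypothetical, and the regex is assumed to have no g flag so that exec always starts from the beginning.

```javascript
const reverseString = s => s.split('').reverse().join('');

// Hypothetical helper: index of the token in the ORIGINAL string, or -1.
// reversedRegexp must be written against the reversed input and be non-global.
function indexOfReversedMatch(original, reversedRegexp) {
  const match = reversedRegexp.exec(reverseString(original));
  if (match === null) return -1;
  // A match starting at i with length len in the reversed string
  // starts at (length - i - len) in the original string.
  return original.length - match.index - match[0].length;
}

console.log(indexOfReversedMatch('max-height', /thgieh(?!(-enil))/));  // 4
console.log(indexOfReversedMatch('line-height', /thgieh(?!(-enil))/)); // -1
```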
Let's suppose you want to find every int that is not preceded by unsigned.
With support for negative lookbehind:
(?<!unsigned )int
Without support for negative lookbehind:
((?!unsigned ).{9}|^.{0,8})int
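Basically, the idea is to grab the n preceding characters and use a negative lookahead to exclude the unwanted ones, while also matching the case where there are fewer than n preceding characters. (Here n is the length of the lookbehind.)
So the regex in question: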
(?<!([abcdefg]))m
would translate to:
((?!([abcdefg])).|^)m
You might need to play with capturing groups to find the exact spot in the string that interests you, or to replace a specific part with something else.
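As a rough sketch of what working with the capturing groups can look like, the function below collects the positions of every m the translated regex accepts. The function name findAllM is hypothetical. Note that this emulation consumes the preceding character, so it can miss a match that directly follows another match (for instance the first m in "mm"), which a real lookbehind would not.

```javascript
// Emulates (?<![a-g])m without lookbehind.
// Group 1 holds the consumed preceding character, or the empty string
// when the match sits at the very start of the input.
function findAllM(input) {
  const re = /((?![a-g]).|^)m/g;
  const positions = [];
  let match;
  while ((match = re.exec(input)) !== null) {
    // Skip over the consumed prefix to get the position of "m" itself.
    positions.push(match.index + match[1].length);
  }
  return positions;
}

console.log(findAllM("jim m jam")); // [ 2, 4 ]
console.log(findAllM("m"));         // [ 0 ]
```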
Mijoja's strategy works for your specific case but not in general:
js>newString = "Fall ball bill balll llama".replace(/(ba)?ll/g,
function($0,$1){ return $1?$0:"[match]";});
Fa[match] ball bi[match] balll [match]ama
Here's an example where the goal is to match a double-l but not if it is preceded by "ba". Note the word "balll" -- true lookbehind should have suppressed the first 2 l's but matched the 2nd pair. But by matching the first 2 l's and then ignoring that match as a false positive, the regexp engine proceeds from the end of that match, and ignores any characters within the false positive.
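For comparison, on an engine that does support lookbehind, the behavior described above can be checked directly. This is only a sketch with illustrative variable names; it shows the second pair of l's in "balll" being matched.

```javascript
// Requires an engine with lookbehind support.
const sample = "Fall ball bill balll llama";
const withLookbehind = sample.replace(/(?<!ba)ll/g, "[match]");

console.log(withLookbehind); // Fa[match] ball bi[match] bal[match] [match]ama
```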
Use
newString = string.replace(/([abcdefg])?m/, function($0,$1){ return $1?$0:'m';});
You could define a non-capturing group by negating your character set:
(?:[^a-g])m
...which would match every m NOT preceded by any of those letters.
This is how I achieved str.split(/(?<!^)@/) for Node.js 8 (which doesn't support lookbehind):
str.split('').reverse().join('').split(/@(?!$)/).map(s => s.split('').reverse().join('')).reverse()
Works? Yes (unicode untested). Unpleasant? Yes.
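The same one-liner can be wrapped in a small function to make the steps easier to follow. The names reverseChars and splitAtNonLeadingAt are hypothetical. As a variation on the original, this sketch spreads the string instead of calling split(''), so that surrogate pairs stay intact; combining character sequences are still not handled.

```javascript
// Reverse by code points rather than UTF-16 code units.
const reverseChars = s => [...s].reverse().join('');

// Emulates str.split(/(?<!^)@/): split on every "@" except one at index 0.
function splitAtNonLeadingAt(str) {
  return reverseChars(str)   // a leading "@" becomes a trailing "@"
    .split(/@(?!$)/)         // split on "@" unless it is the last character
    .map(reverseChars)       // restore each piece
    .reverse();              // restore the order of the pieces
}

console.log(splitAtNonLeadingAt("@a@b"));  // [ '@a', 'b' ]
console.log(splitAtNonLeadingAt("a@b@c")); // [ 'a', 'b', 'c' ]
```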
following the idea of Mijoja, and drawing from the problems exposed by JasonS, i had this idea; i checked a bit but am not sure of myself, so a verification by someone more expert than me in js regex would be great :)
var re = /(?=(..|^.?)(ll))/g
// matches empty string position
// whenever this position is followed by
// a string of length equal or inferior (in case of "^")
// to "lookbehind" value
// + actual value we would want to match
, str = "Fall ball bill balll llama"
, str_done = str
, len_difference = 0
, doer = function (where_in_str, to_replace)
{
str_done = str_done.slice(0, where_in_str + len_difference)
+ "[match]"
+ str_done.slice(where_in_str + len_difference + to_replace.length)
len_difference = str_done.length - str.length
/* if str smaller:
len_difference will be positive
else will be negative
*/
} /* the actual function that would do whatever we want to do
with the matches;
this above is only an example from Jason's */
/* function input of .replace(),
only there to test the value of $behind
and if negative, call doer() with interesting parameters */
, checker = function ($match, $behind, $after, $where, $str)
{
if ($behind !== "ba")
doer
(
$where + $behind.length
, $after
/* one will choose the interesting arguments
to give to the doer, it's only an example */
)
return $match // empty string anyhow, but well
}
str.replace(re, checker)
console.log(str_done)
my personal output:
Fa[match] ball bi[match] bal[match] [match]ama
the principle is to call checker at each point in the string between any two characters, whenever that position is the starting point of:
--- any substring of the size of what is not wanted (here 'ba', thus ..) (if that size is known; otherwise it must be harder to do perhaps)
--- --- or smaller than that if it's the beginning of the string: ^.?
and, following this,
--- what is to be actually sought (here 'll').
At each call of checker, there will be a test to check if the value before ll is not what we don't want (!== 'ba'); if that's the case, we call another function, and it will have to be this one (doer) that will make the changes on str, if the purpose is this one, or more generically, that will get in input the necessary data to manually process the results of the scanning of str.
here we change the string so we needed to keep a trace of the difference of length in order to offset the locations given by replace, all calculated on str, which itself never changes.
since primitive strings are immutable, we could have used the variable str to store the result of the whole operation, but i thought the example, already complicated by the replacings, would be clearer with another variable (str_done).
i guess that in terms of performances it must be pretty harsh: all those pointless replacements of '' into '', this str.length-1 times, plus here manual replacement by doer, which means a lot of slicing... probably in this specific above case that could be grouped, by cutting the string only once into pieces around where we want to insert [match] and .join()ing it with [match] itself.
the other thing is that i don't know how it would handle more complex cases, that is, complex values for the fake lookbehind... the length being perhaps the most problematic data to get.
and, in checker, in case of multiple possibilities of nonwanted values for $behind, we'll have to make a test on it with yet another regex (to be cached (created) outside checker is best, to avoid the same regex object to be created at each call for checker) to know whether or not it is what we seek to avoid.
hope i've been clear; if not don't hesitate, i'll try better. :)
Using your case, if you want to replace m with something, e.g. convert it to uppercase M, you can negate the set in a capturing group.
match ([^a-g])m, replace with $1M
"jim jam".replace(/([^a-g])m/g, "$1M")
// jiM jam
([^a-g]) will match any char not (^) in the a-g range, and store it in the first capturing group, so you can access it with $1.
So we find im in jim and replace it with iM, which results in jiM.
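One gap in this approach is an m at the very start of the string, which has no preceding character for ([^a-g]) to consume, even though the question asks for that case to match. A possible adjustment, offered here as a sketch rather than as part of the original answer, is to add ^ as an alternative inside the group:

```javascript
// (^|[^a-g]) accepts either the start of the string or one character
// outside a..g; $1 puts whatever was consumed back in front of "M".
const upperCased = "m jim jam".replace(/(^|[^a-g])m/g, "$1M");

console.log(upperCased); // M jiM jam
```

Because the preceding character is still consumed, two adjacent m's (as in "mm") would only have the first one replaced, unlike with a real lookbehind.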
As mentioned before, JavaScript allows lookbehinds now. In older browsers you still need a workaround.
I bet my head there is no way to find a regex without lookbehind that delivers the result exactly. All you can do is work with groups. Suppose you have a regex (?<!Before)Wanted, where Wanted is the regex you want to match and Before is the regex that counts out what should not precede the match. The best you can do is negate the regex Before and use the regex NotBefore(Wanted). The desired result is the first group $1.
In your case Before=[abcdefg], which is easy to negate: NotBefore=[^abcdefg]. So the regex would be [^abcdefg](m). If you need the position of Wanted, you must group NotBefore too, so that the desired result is the second group.
If matches of the Before pattern have a fixed length n, that is, if the pattern contains no repetitive tokens, you can avoid negating the Before pattern and use the regular expression (?!Before).{n}(Wanted), but you still have to use the first group, or use the regular expression (?!Before)(.{n})(Wanted) and use the second group. In this example, the pattern Before actually has a fixed length, namely 1, so use the regex (?![abcdefg]).(m) or (?![abcdefg])(.)(m). If you are interested in all matches, add the g flag, see my code snippet:
function TestSORegEx() {
var s = "Donald Trump doesn't like jam, but Homer Simpson does.";
var reg = /(?![abcdefg])(.{1})(m)/gm;
var out = "Matches and groups of the regex " +
"/(?![abcdefg])(.{1})(m)/gm in \ns = \"" + s + "\"";
var match = reg.exec(s);
while(match) {
var start = match.index + match[1].length;
out += "\nWhole match: " + match[0] + ", starts at: " + match.index
+ ". Desired match: " + match[2] + ", starts at: " + start + ".";
match = reg.exec(s);
}
out += "\nResulting string after statement s.replace(reg, \"$1*$2*\")\n"
+ s.replace(reg, "$1*$2*");
alert(out);
}
This effectively does it
"jim".match(/[^a-g]m/)
> ["im"]
"jam".match(/[^a-g]m/)
> null
Search and replace example
"jim jam".replace(/([^a-g])m/g, "$1M")
> "jiM jam"
Note that the negative look-behind string must be 1 character long for this to work.
/(?![abcdefg])[^abcdefg]m/gi
yes this is a trick.
This might help, depending on the context:
This matches the m in jim but not jam:
"jim jam".replace(/[a-g]m/g, "").match(/m/g)
Reference URL: https://stackoverflow.com/questions/641407/javascript-negative-lookbehind-equivalent