Perl Regular Expression \K Trick


Regular expressions are a frequently useful tool in our profession, and Perl is probably the most advanced arena for testing your ability to wield them.  That’s because Perl has the most feature-rich regular expressions out there (that I know of, anyway).  There’s always some new trick to learn about Perl regexes.

Case in point: \K.  Let’s say you want to replace the end of every line that begins with ‘Parent Commit:’, where that string is followed by whitespace and a forty-character hash.  You want to replace the hash.  But you have to hold on to the beginning of the string.  Here’s one way to go about it:

s/^Parent Commit:\s+[0-9a-f]{40}$/Parent Commit: $new_hash/gi

This works, but repeating ‘Parent Commit’ is duplication we would like to avoid.

s/^(Parent Commit:)\s+[0-9a-f]{40}$/$1 $new_hash/gi

Here we capture the beginning of the string so that we can insert it into the replacement.  This saves us from manually copying the text, but (and maybe this is just me) having to capture that text is annoying.  It kinda feels like a waste of a group.
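To see the two versions side by side, here is a minimal sketch that runs both against a single made-up line.  The hashes and the variable names ($line, $old_hash, $repeated, $captured) are assumptions for illustration only, not taken from any real repository.

```perl
use strict;
use warnings;

# Hypothetical sample data: two made-up forty-character hashes.
my $old_hash = 'a' x 40;
my $new_hash = 'b' x 40;

# One input line, with three spaces after the colon.
my $line = "Parent Commit:   $old_hash";

# Version 1: repeat the literal prefix in the replacement.
(my $repeated = $line) =~ s/^Parent Commit:\s+[0-9a-f]{40}$/Parent Commit: $new_hash/gi;

# Version 2: capture the prefix and put it back with $1.
(my $captured = $line) =~ s/^(Parent Commit:)\s+[0-9a-f]{40}$/$1 $new_hash/gi;

print "$repeated\n";
print "$captured\n";
```

Both substitutions produce the same line here.  Note that both also rewrite the whitespace after the colon as a single space, since the \s+ is part of what gets replaced.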

Enter \K, available since Perl 5.10.  When Perl reaches this escape it keeps everything matched so far out of the final match: the pattern to its left still has to match, but the match is treated as starting at the \K.  In the context of s///, that means our replacement won’t touch anything before the \K, because Perl no longer counts it as part of the matched text.  So we can write the regex above in the form

s/^Parent Commit:\s+\K[0-9a-f]{40}$/$new_hash/gi

After the \K, only the hash counts as the match.  The ‘Parent Commit:\s+’ part still has to be there for the line to match, but it is left out of what gets replaced, so the effect is like performing

s/[0-9a-f]{40}$/$new_hash/gi

only on lines that begin with ‘Parent Commit:’, with the initial part of the line left intact after the replacement.  This way we don’t need to repeat ‘Parent Commit’ or use a capture group to prevent it from getting replaced.
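Here is a small sketch of the \K version at work, under a few assumptions of mine: the input is one multi-line string (so I add the /m flag, which lets ^ and $ match at each line boundary), and the hashes and the names $text, $with_k, $bare and $matched are made up for the example.  It also runs the bare hash-only substitution for comparison.

```perl
use strict;
use warnings;

# Hypothetical sample data, invented for this example.
my $old_hash = 'a' x 40;
my $new_hash = 'b' x 40;

my $text = join "\n",
    "Commit: $old_hash",
    "Parent Commit:\t$old_hash",    # a tab after the colon
    "Author: someone";

# /m is my addition: ^ and $ now match at the start and end of each line.
(my $with_k = $text) =~ s/^Parent Commit:\s+\K[0-9a-f]{40}$/$new_hash/gim;

# The bare version has no prefix to check, so it rewrites every trailing hash.
(my $bare = $text) =~ s/[0-9a-f]{40}$/$new_hash/gim;

print "$with_k\n---\n$bare\n";

# The text before \K must still match, but it is not part of the match itself.
my $matched = '';
if ("Parent Commit: $old_hash" =~ /^Parent Commit:\s+\K[0-9a-f]{40}$/i) {
    $matched = $&;    # holds only the hash, not the prefix
}
print "matched text: $matched\n";
```

With \K, only the ‘Parent Commit:’ line changes, and the tab after the colon survives because the original whitespace is never replaced.  The bare version changes the ‘Commit:’ line as well, which is why the prefix still needs to be in the pattern.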

Anyone have any other regex tricks or tips?  Please share if you do.


About Lance Cleveland

I started my high-tech career in the early '80s as a computer technician. I became a lead engineer at a Boston-area database company a few years later. When the Internet was just starting to show up on people's radar I quit my corner-office job and founded ProActive Web Marketing, my first startup company. That was the genesis of several successful startup companies including Time Magazine award winner The Lobster Net. After a brief retirement in my mid-30s I co-founded the software consulting firm Cyber Sprocket Labs. In addition to being "man of all hats" at Charleston Software Associates, I currently serve on the board or as technical adviser for several companies including Musiplicity, Model Locate, and Advanced Media Ltd. In the past I consulted for Data General, Kimberly Clark, Kraft, Philip Morris, Rich Foods, Telefonica, Arbitron, and a half-dozen other Fortune 500 companies. I've appeared as a keynote speaker for the USVI Economic Development Summit, showed up as a lead interviewee for Microsoft infomercials, and have been cited as a performance advertising, Internet retail, and cybercrime expert in The Wall Street Journal and New York Times. I currently spend most of my time hanging with friends & family while hacking WordPress plugins.

Code geek. Dad. Husband. Rum Lover. Not necessarily in that order.

