World of Software

Refactoring 025 – Decompose Regular Expressions | HackerNoon

News Room
Last updated: 2025/03/31 at 9:19 PM
News Room Published 31 March 2025

Make Regular Expressions Testable and Understandable

TL;DR: You can break down a complex validation regex into smaller parts to test each part individually and report accurate errors.

Problems Addressed 😔

https://hackernoon.com/how-to-find-the-stinky-parts-of-your-code-part-xxv

https://hackernoon.com/how-to-find-the-stinky-parts-of-your-code-part-i-xqz3evd

https://hackernoon.com/how-to-find-the-stinky-parts-of-your-code-part-xxxvii

https://hackernoon.com/how-to-find-the-stinky-parts-of-your-code-part-xx-we-have-reached-100

https://hackernoon.com/how-to-find-the-stinky-parts-of-your-code-part-ix-7rr33ol

Steps 👣

  1. Analyze the regex to identify its logical components.
  2. Break the regex into smaller, named sub-patterns for each component.
  3. Write unit tests for each sub-pattern to ensure it works correctly.
  4. Combine the tested sub-patterns into the full validation logic.
  5. Refactor the code to provide clear error messages for every failing part.

Sample Code 💻

Before 🚨

function validateURL(url) {
  const urlRegex =
    /^(https?:\/\/)([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})(\/.*)?$/;
  // Cryptic and untestable
  return urlRegex.test(url);
}

After 👉

// Step 1: Define individual regex components
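const protocolPattern = /^(https?:\/\/)/;
// Each dot-separated label needs at least one character, so "domain..com" fails
const domainPattern = /^[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.[a-zA-Z]{2,}$/;
const pathPattern = /^\/.*$/;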

// Step 2: Write unit tests for each component
describe("Protocol Validation", () => {
  test("should pass for http://", () => {
    expect(protocolPattern.test("http://")).toBe(true);
  });

  test("should pass for https://", () => {
    expect(protocolPattern.test("https://")).toBe(true);
  });

  test("should fail for invalid protocols", () => {
    expect(protocolPattern.test("ftp://")).toBe(false);
  });
});

describe("Domain Validation", () => {
  test("should pass for valid domains", () => {
    expect(domainPattern.test("example.com")).toBe(true);
    expect(domainPattern.test("sub.domain.org")).toBe(true);
  });

  test("should fail for invalid domains", () => {
    expect(domainPattern.test("example")).toBe(false);
    expect(domainPattern.test("domain..com")).toBe(false);
  });
});

describe("Path Validation", () => {
  test("should pass for valid paths", () => {
    expect(pathPattern.test("/path/to/resource")).toBe(true);
    expect(pathPattern.test("/")).toBe(true);
  });

  test("should fail for invalid paths", () => {
    expect(pathPattern.test("path/to/resource")).toBe(false);
    expect(pathPattern.test("")).toBe(false);
  });
});

// Step 3: Validate each part and report errors
function validateURL(url) {
  if (!protocolPattern.test(url)) {
    throw new Error("Invalid protocol. Use http:// or https://.");
  }

  const domainStartIndex = url.indexOf("://") + 3;
  const domainEndIndex = url.indexOf("/", domainStartIndex);
  const domain = domainEndIndex === -1 ? 
        url.slice(domainStartIndex) :
        url.slice(domainStartIndex, domainEndIndex);

  if (!domainPattern.test(domain)) {
    throw new Error("Invalid domain name.");
  }

  // Without a "/" after the domain there is no path to validate
  const path = domainEndIndex === -1 ? "" : url.slice(domainEndIndex);
  if (path && !pathPattern.test(path)) {
    throw new Error("Invalid path.");
  }

  return true;
}

// Step 4: Add integration tests for the full URL validation
describe("Full URL Validation", () => {
  test("should pass for valid URLs", () => {
    expect(validateURL("https://lesluthiers.com/tour/")).toBe(true);
    expect(validateURL("https://bio.lesluthiers.org/")).toBe(true);
  });

  test("should fail for invalid URLs", () => {
    expect(() => validateURL("ftp://mastropiero.com")).
      toThrow("Invalid protocol");
    expect(() => validateURL("https://estherpsicore..com")).
      toThrow("Invalid domain name");
    // The extracted path always starts with "/", so only a line break fails pathPattern
    expect(() => validateURL("https://book.warren-sanchez.com/a\nb")).
      toThrow("Invalid path");
  });
});
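
Step 4 in the list above mentions combining the tested sub-patterns. One possible way to do that, as a minimal sketch, is to build the full expression from each part's `source` so the small pieces stay the single source of truth. The names `protocolPart`, `domainPart`, `pathPart`, `anchored`, and `fullUrlPattern` are illustrative and do not come from the original article. The parts are left unanchored here so they can be concatenated, and the domain part is assumed to reject empty labels such as `domain..com`.

```javascript
// Unanchored building blocks (hypothetical names, for illustration only)
const protocolPart = /https?:\/\//;
const domainPart = /[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*\.[a-zA-Z]{2,}/;
const pathPart = /\/.*/;

// Wrap a part in ^...$ so it can be unit tested in isolation
const anchored = (part) => new RegExp(`^(?:${part.source})$`);

// Compose the full pattern from the very same parts
const fullUrlPattern = new RegExp(
  `^(?:${protocolPart.source})(?:${domainPart.source})(?:${pathPart.source})?$`
);

console.log(anchored(domainPart).test("example.com")); // true
console.log(fullUrlPattern.test("https://lesluthiers.com/tour/")); // true
console.log(fullUrlPattern.test("ftp://mastropiero.com")); // false
```

The composed expression only answers yes or no, so it suits a quick overall check; the step-by-step function is still what produces a specific error message.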

Type 📝

Safety 🛡️

This refactoring is safe if you follow the steps carefully.

Testing each component ensures that you catch errors early.

Why is the Code Better? ✨

The refactored code is better because it improves readability, maintainability, and testability.

Breaking down the regex into smaller parts makes it easier to understand what each part does.

You can also report specific errors when validation fails, which helps users fix their input.
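
As a hedged illustration of specific error reporting, the sketch below collects every failing part instead of stopping at the first one, so a user can fix all problems in one pass. The rule list `urlPartRules`, the helper `splitUrl`, and the function `collectUrlErrors` are hypothetical names introduced for this example, and the splitter is a naive assumption, not a full URL parser.

```javascript
// Hypothetical rule list: each entry pairs a URL part with its sub-pattern
const urlPartRules = [
  { part: "protocol", pattern: /^https?:\/\/$/, message: "Use http:// or https://." },
  {
    part: "domain",
    pattern: /^[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*\.[a-zA-Z]{2,}$/,
    message: "Invalid domain name.",
  },
  { part: "path", pattern: /^(?:\/.*)?$/, message: "Invalid path." },
];

// Naive splitter (an assumption of this sketch, not a full URL parser)
function splitUrl(url) {
  const separatorIndex = url.indexOf("://");
  if (separatorIndex === -1) {
    return { protocol: "", domain: url, path: "" };
  }
  const rest = url.slice(separatorIndex + 3);
  const slashIndex = rest.indexOf("/");
  return {
    protocol: url.slice(0, separatorIndex + 3),
    domain: slashIndex === -1 ? rest : rest.slice(0, slashIndex),
    path: slashIndex === -1 ? "" : rest.slice(slashIndex),
  };
}

// Returns every failure message instead of throwing on the first one
function collectUrlErrors(url) {
  const parts = splitUrl(url);
  return urlPartRules
    .filter(({ part, pattern }) => !pattern.test(parts[part]))
    .map(({ part, message }) => `${part}: ${message}`);
}

console.log(collectUrlErrors("ftp://estherpsicore..com"));
// [ 'protocol: Use http:// or https://.', 'domain: Invalid domain name.' ]
console.log(collectUrlErrors("https://lesluthiers.com/tour/")); // []
```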

This is also a great opportunity to apply the Test-Driven Development technique, gradually increasing complexity by introducing new subparts.
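
To make the test-first idea concrete, here is a small sketch that introduces one new subpart. The optional port requirement, the `portCases` table, and `portPattern` are assumptions made up for this example and are not part of the original validation. A plain loop stands in for a test runner so the snippet runs on its own.

```javascript
// Hypothetical new requirement: a port such as ":8080"
// Test-first: these cases are written before the pattern exists
const portCases = [
  [":80", true],
  [":8080", true],
  [":", false],
  ["8080", false],
  [":123456", false],
];

// Simplest pattern that satisfies the cases: a colon plus one to five digits
const portPattern = /^:\d{1,5}$/;

for (const [input, expected] of portCases) {
  if (portPattern.test(input) !== expected) {
    throw new Error(`Port case failed for "${input}"`);
  }
}
console.log("All port cases pass");
```

Each new requirement adds a few rows to the case table first, and the pattern grows only as far as those rows demand.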

How Does it Improve the Bijection? 🗺️

By breaking down the regex into smaller, meaningful components, you create a closer mapping between the Real-World requirements (e.g., “URL must have a valid protocol”) and the code.

This reduces ambiguity and ensures the code reflects the problem domain accurately.

Limitations ⚠️

This approach might add some overhead for very simple regex patterns where breaking them down would be unnecessary.

Refactor with AI 🤖

You can use AI tools to help identify regex components.

Ask the AI to explain what each part of the regex does, then guide you in breaking it into smaller, testable pieces. For example, you can ask, “What does this regex do?” and follow up with, “How can I split it into smaller parts?”.

It’s 2025. No programmer should write new regular expressions anymore.

You should leave this mechanical task to AI.

Suggested Prompt:

  1. Analyze the regex to identify its logical components.
  2. Break the regex into smaller, named sub-patterns for each component.
  3. Write unit tests for each sub-pattern to ensure it works correctly.
  4. Combine the tested sub-patterns into the full validation logic.
  5. Refactor the code to provide clear error messages for every failing part.

Level 🔋

See also 📚

Credits 🙏

Image by Gerd Altmann on Pixabay


This article is part of the Refactoring Series.
