What Is a Regular Expression (Regex)? A Regex Tutorial from Scratch for Developers, 2026

A regular expression, or regex for short, is one of the most powerful tools a developer has for working with text: searching, validating, replacing, and extracting data from complex strings.

Many developers avoid regex because it looks like a secret code, but its logic is consistent and learnable. This article teaches regex from the first character up to the advanced patterns used in real work, with examples in several languages.

What Is Regex?

A regular expression (regex) is a pattern that describes the set of strings you want to find or match in text. The idea was formalized in the 1950s by the mathematician Stephen Kleene and brought into computing by Ken Thompson, who built it into the QED editor and later into ed on Unix.

Today regex is available in almost every programming language, including Python, JavaScript, Go, Java, C#, PHP, and Ruby, as well as command-line tools such as grep, sed, and awk. Understanding regex lets you:

Find text that matches a given pattern
Validate data such as email addresses, phone numbers, and URLs
Extract data from complex strings
Search and replace by pattern
Parse data from log files, HTML, and CSV
Clean data

Basic Regex Syntax

Literal Characters

The simplest regex is a literal search. The pattern hello matches the text "hello". Ordinary letters and digits (a-z, A-Z, 0-9) are literal characters.

Metacharacters

Metacharacters have a special meaning in regex instead of standing for themselves:

Metacharacter	Meaning	Example
`.`	Any single character (except newline)	`h.t` matches "hat", "hot", "hit"
`^`	Start of the string (or of each line in multiline mode)	`^Hello` matches "Hello" at the start
`$`	End of the string (or of each line in multiline mode)	`end$` matches "end" at the end
`\`	Escape: turns a metacharacter into a literal	`\.` matches an actual "."
`|`	Or (alternation)	`cat|dog` matches "cat" or "dog"

Remember: to match a metacharacter literally, put \ in front of it. To match a dot use \. and to match $ use \$. Outside a character class, the characters that need escaping are . ^ $ * + ? ( ) [ ] { } | \
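
When the text to search for comes from a variable, escaping by hand is error-prone. The sketch below uses Python's re.escape() for that; the search term and sample text are made up for illustration.

```python
import re

# Hypothetical search term containing several metacharacters: $ . ( )
term = "$5.00 (approx.)"
text = "Listed price: $5.00 (approx.) per unit"

# re.escape() backslash-escapes the characters that are special in a pattern.
found = re.search(re.escape(term), text)

# Without escaping, "." means "any character", so "5x00" matches as well.
unescaped_hits = re.findall(r"5.00", "5.00 5x00")
escaped_hits = re.findall(re.escape("5.00"), "5.00 5x00")

print(found.group())    # $5.00 (approx.)
print(unescaped_hits)   # ['5.00', '5x00']
print(escaped_hits)     # ['5.00']
```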

Character Classes

A character class lists the characters accepted at one position, written inside []

Pattern	Meaning	Example matches
`[abc]`	a, b, or c	"a", "b", "c"
`[a-z]`	Lowercase letters a to z	"a", "m", "z"
`[A-Z]`	Uppercase letters A to Z	"A", "M", "Z"
`[0-9]`	Digits 0 to 9	"0", "5", "9"
`[a-zA-Z0-9]`	Letters and digits	"a", "Z", "5"
`[^abc]`	Anything except a, b, c (negation)	"d", "1", "!"
`[^0-9]`	Anything except a digit	"a", "!", " "

Shorthand Character Classes

Regex provides shorthands for the most common character classes:

Shorthand	Equivalent	Meaning
`\d`	`[0-9]`	Digit
`\D`	`[^0-9]`	Not a digit
`\w`	`[a-zA-Z0-9_]`	Letter, digit, or _ (word character)
`\W`	`[^a-zA-Z0-9_]`	Not a word character
`\s`	`[ \t\n\r\f]`	Whitespace (space, tab, newline)
`\S`	`[^ \t\n\r\f]`	Not whitespace

These equivalences hold in ASCII mode. In Python 3, \d, \w, and \s on str patterns are Unicode-aware by default (for example, \d also matches Thai digits) unless re.ASCII is set.

# Character class examples
\d\d\d        # three digits: "123", "456"
\w+           # a word: "hello", "user_1"
\d{3}-\d{4}   # part of a phone number: "081-1234"
[aeiou]       # English vowels
[ก-๙]         # Thai characters from ก to ๙ (letters, vowel and tone marks, Thai digits)
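
The `[0-9]` equivalence for `\d` depends on the engine and its flags. A small sketch of the difference in Python 3, where str patterns are Unicode-aware unless re.ASCII is passed; the sample string is made up for illustration.

```python
import re

thai_digits = "\u0e51\u0e52\u0e53"   # Thai digits for 1, 2, 3
sample = f"{thai_digits} 456"

# Default for str patterns: \d matches any Unicode decimal digit.
unicode_hits = re.findall(r"\d+", sample)

# re.ASCII restricts \d to [0-9] (and \w, \s to their ASCII sets).
ascii_hits = re.findall(r"\d+", sample, re.ASCII)

print(len(unicode_hits))  # 2
print(ascii_hits)         # ['456']
```

If a validator must accept only ASCII digits, writing `[0-9]` explicitly or passing re.ASCII avoids surprises.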

Quantifiers

A quantifier says how many times the preceding character or group must appear:

Quantifier	Meaning	Example
`*`	0 or more times	`ab*c` matches "ac", "abc", "abbc"
`+`	1 or more times	`ab+c` matches "abc", "abbc" (not "ac")
`?`	0 or 1 time (optional)	`colou?r` matches "color", "colour"
`{n}`	Exactly n times	`\d{4}` matches four digits, "2026"
`{n,}`	n or more times	`\d{2,}` matches two or more digits
`{n,m}`	Between n and m times	`\d{2,4}` matches 2 to 4 digits
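
The table rows can be checked directly in Python. This is a minimal sketch; the test strings are chosen for illustration.

```python
import re

candidates = ("ac", "abc", "abbc")

# * allows zero b's, + requires at least one.
star = [s for s in candidates if re.fullmatch(r"ab*c", s)]
plus = [s for s in candidates if re.fullmatch(r"ab+c", s)]

# ? makes the preceding character optional.
spellings = re.findall(r"colou?r", "color colour colr")

# {2,4} takes 2 to 4 digits at a time, so a longer run is split up.
runs = re.findall(r"\d{2,4}", "7 42 2026 123456")

print(star)       # ['ac', 'abc', 'abbc']
print(plus)       # ['abc', 'abbc']
print(spellings)  # ['color', 'colour']
print(runs)       # ['42', '2026', '1234', '56']
```

The last result shows why a bounded quantifier usually needs anchors or word boundaries around it when the goal is "a number of 2 to 4 digits" rather than "2 to 4 digits anywhere".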

Anchors

Anchors do not match characters; they match positions in the text:

Anchor	Meaning	Example
`^`	Start of the string (or of each line in multiline mode)	`^Start` matches "Start" at the beginning
`$`	End of the string (or of each line in multiline mode)	`end$` matches "end" at the end
`\b`	Word boundary	`\bcat\b` matches "cat" but not "cats"
`\B`	Not a word boundary	`\Bcat\B` matches the "cat" inside "concatenate"

# Anchor examples
^Hello         # "Hello world" (match) | "Say Hello" (no match)
world$         # "Hello world" (match) | "world peace" (no match)
^Hello world$  # only the exact string "Hello world"
\bword\b       # "word" as a whole word, not inside "password"
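
Whether ^ and $ refer to the whole string or to each line depends on a flag. A short Python sketch of that difference and of \b, using made-up sample text:

```python
import re

text = "Hello world\nSay Hello\nHello again"

# Default: ^ matches only at the very start of the string.
default_hits = re.findall(r"^Hello", text)

# re.MULTILINE: ^ and $ also match at the start and end of each line.
multiline_hits = re.findall(r"^Hello", text, re.MULTILINE)
line_end_hits = re.findall(r"world$", text, re.MULTILINE)

# \b keeps the match from starting or ending inside a longer word.
whole_words = re.findall(r"\bword\b", "word password words word.")

print(default_hits)    # ['Hello']
print(multiline_hits)  # ['Hello', 'Hello']
print(line_end_hits)   # ['world']
print(whole_words)     # ['word', 'word']
```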

Groups and Capturing

Parentheses () do two jobs in regex: grouping and capturing.

Capturing Groups

Pattern: (\d{4})-(\d{2})-(\d{2})

Input: "2026-04-10"

Group 0 (the whole match): "2026-04-10"

Group 1: "2026"

Group 2: "04"

Group 3: "10"

Pattern: (https?://)([^/]+)(/.*)?

Input: "https://siamcafe.net/blog/page1"

Group 1: "https://"

Group 2: "siamcafe.net"

Group 3: "/blog/page1"
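
In Python the same groups are read from the match object. A minimal sketch using the two patterns above:

```python
import re

date = re.search(r"(\d{4})-(\d{2})-(\d{2})", "2026-04-10")
print(date.group(0))   # 2026-04-10  (whole match)
print(date.groups())   # ('2026', '04', '10')

url_re = re.compile(r"(https?://)([^/]+)(/.*)?")
with_path = url_re.search("https://siamcafe.net/blog/page1")
no_path = url_re.search("https://siamcafe.net")

print(with_path.groups())  # ('https://', 'siamcafe.net', '/blog/page1')
# An optional group that did not take part in the match is None.
print(no_path.groups())    # ('https://', 'siamcafe.net', None)
```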

Non-Capturing Groups (?:)

Sometimes you need a group but do not need its value. Use (?:)

(?:http|https):// groups without capturing

Compare with (http|https):// which groups and captures

Use it when a group exists only so a quantifier can apply to it

(?:ab)+ # matches "ab", "abab", "ababab"

Named Groups

Give a group a name to make the pattern easier to read:

Python and Go syntax: (?P<name>pattern)

JavaScript syntax: (?<name>pattern) (Go also accepts this form from version 1.22)

Pattern: (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})

Input: "2026-04-10"

match.group("year") = "2026"

match.group("month") = "04"

match.group("day") = "10"

Backreferences

A backreference refers back to an earlier captured group:

\1 = refers to group 1

Pattern: (\w+)\s+\1

Matches repeated words: "the the", "is is"

Pattern: (['"])(.*?)\1

Matches quoted text whose opening and closing quotes are the same

"hello" or 'hello' (but not "hello')

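Both backreference patterns can be tried in Python. In this sketch the word boundaries (\b) around the repeated-word pattern are an addition, so a match cannot start in the middle of a longer word; the sample sentences are made up.

```python
import re

# Repeated words: \1 must equal whatever group 1 captured.
repeated = re.findall(r"\b(\w+)\s+\1\b", "this is is the the test")

# Quoted text: \1 forces the closing quote to be the same as the opening one.
quote_re = re.compile(r"""(['"])(.*?)\1""")
sample = """say "hello" or 'hi' but not "oops'"""
quoted = [m.group(2) for m in quote_re.finditer(sample)]

print(repeated)  # ['is', 'the']
print(quoted)    # ['hello', 'hi']
```
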
Alternation

| works like a logical OR:

cat|dog matches "cat" or "dog"

red|green|blue matches "red", "green", or "blue"

(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day matches the days of the week

\.(jpg|jpeg|png|gif|webp)$ matches image file extensions
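
Alternation has the lowest precedence of any regex operator, so anchors bind to a single alternative unless the alternatives are grouped. A small Python sketch of that pitfall and of the patterns above; the helper name is_image is hypothetical.

```python
import re

IMAGE_EXT = re.compile(r"\.(jpg|jpeg|png|gif|webp)$")

def is_image(filename: str) -> bool:
    """Return True if the name ends with one of the listed extensions."""
    return IMAGE_EXT.search(filename) is not None

DAY = re.compile(r"(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day")
days = [m.group(0) for m in DAY.finditer("Tuesday and Saturday, not Someday")]

# ^cat|dog$ means "(^cat) or (dog$)", so it matches inside "catalog".
loose = re.search(r"^cat|dog$", "catalog")
# Grouping makes both anchors apply to every alternative.
strict = re.search(r"^(?:cat|dog)$", "catalog")

print(is_image("photo.jpeg"), is_image("archive.png.zip"))  # True False
print(days)                                                # ['Tuesday', 'Saturday']
print(loose.group(), strict)                               # cat None
```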

Lookahead and Lookbehind

Lookahead and lookbehind are zero-width assertions: they test a condition without consuming characters, which makes them useful for checking the surrounding context.

Pattern	Name	Meaning
`(?=...)`	Positive lookahead	Followed by ... (without consuming it)
`(?!...)`	Negative lookahead	Not followed by ...
`(?<=...)`	Positive lookbehind	Preceded by ...
`(?<!...)`	Negative lookbehind	Not preceded by ...

Positive Lookahead: (?=...)

\d+(?= บาท) # digits followed by " บาท" (baht)

"100 บาท" → matches "100"

Negative Lookahead: (?!...)

\d+(?!\d) # a run of digits whose end is not followed by another digit

Positive Lookbehind: (?<=...)

(?<=\$)\d+ # digits preceded by $

"$100" → matches "100"

Negative Lookbehind: (?<!...)

(?<!\d)\d{3}(?!\d) # exactly three digits that are not part of a longer number

Checking password strength with lookahead

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%]).{8,}$

Requires: a lowercase letter, an uppercase letter, a digit, a special character from !@#$%, and at least 8 characters
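
Each lookahead in the password pattern is an independent check that starts from the beginning of the string, which is why the order of the required characters does not matter. A minimal Python sketch; the function name is_strong and the sample passwords are hypothetical.

```python
import re

PASSWORD_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%]).{8,}$"
)

def is_strong(password: str) -> bool:
    """True if the password satisfies every lookahead and the length rule."""
    return PASSWORD_RE.fullmatch(password) is not None

print(is_strong("Abcdef1!"))    # True
print(is_strong("abcdefg1!"))   # False (no uppercase letter)
print(is_strong("Abcdefgh1"))   # False (no special character)
print(is_strong("Abc1!"))       # False (shorter than 8)
```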

Greedy vs Lazy Matching

By default regex quantifiers are greedy: they match as much as possible. Adding ? after a quantifier makes it lazy, so it matches as little as possible.

Greedy (default): matches as much as possible

Pattern: <.*>

Input: "<b>hello</b>"

Match: "<b>hello</b>" (the whole string!)

Lazy: matches as little as possible

Pattern: <.*?>

Input: "<b>hello</b>"

Match 1: "<b>"

Match 2: "</b>"

Another example

".*" → greedy: matches from the first " to the last "

".*?" → lazy: matches from the first " to the nearest "

Lazy quantifiers:

*? : 0 or more (as few as possible)

+? : 1 or more (as few as possible)

?? : 0 or 1 (prefers 0)

{n,m}? : n to m (prefers n)

Rule of thumb: when working with HTML or XML, prefer lazy .*? (or a negated class such as [^>]*) over greedy .* so that a match does not run across tags. Better still, use an HTML parser instead of regex to parse HTML.
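
The greedy and lazy behaviour is easy to compare side by side in Python. A minimal sketch with a made-up one-line HTML snippet:

```python
import re

html = "<b>hello</b>"

greedy = re.findall(r"<.*>", html)     # runs to the last ">"
lazy = re.findall(r"<.*?>", html)      # stops at the nearest ">"
negated = re.findall(r"<[^>]*>", html) # cannot cross a ">" at all

print(greedy)   # ['<b>hello</b>']
print(lazy)     # ['<b>', '</b>']
print(negated)  # ['<b>', '</b>']
```

The negated class gives the same result as the lazy form here and leaves the engine less room to backtrack.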

Common Regex Patterns

Regex patterns that come up regularly in real software development:

Email Validation

Basic

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

Matches: user@example.com, first.last@company.co.th

Does not match: @example.com, user@, user@.com

Phone Number (Thailand)

Thai mobile number

^0[689]\d{8}$

Matches: 0812345678, 0912345678, 0612345678

Mobile number with hyphens

^0[689]\d-\d{3}-\d{4}$

Matches: 081-234-5678

Either format (hyphens optional)

^0[689]\d[-]?\d{3}[-]?\d{4}$

URL

# Basic URL
^https?://[^\s/$.?#].[^\s]*$

# URL with protocol, domain, optional port, and path (including any query string)
^(https?://)([\w.-]+)(:[0-9]+)?(/[^\s]*)?$

IP Address

# IPv4 (basic: also accepts numbers above 255)
^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$

# IPv4 (strict: each octet 0-255, exactly three dots)
^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
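
One way to sanity-check a strict IPv4 pattern is to build it from a single octet sub-pattern and compare it against Python's standard ipaddress module. This is a sketch; the names OCTET, is_ipv4_regex, and is_ipv4_stdlib are hypothetical.

```python
import ipaddress
import re

# One octet, 0-255. Reused four times with exactly three dots in between.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
STRICT_IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

def is_ipv4_regex(value: str) -> bool:
    return STRICT_IPV4.fullmatch(value) is not None

def is_ipv4_stdlib(value: str) -> bool:
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False

for candidate in ("192.168.1.1", "256.1.1.1", "1234", "1.2.3.4."):
    print(candidate, is_ipv4_regex(candidate), is_ipv4_stdlib(candidate))
```

For production validation the standard library call is usually the simpler choice; the regex form is mainly useful when extracting addresses from free text.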

Date Format

# YYYY-MM-DD
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$

# DD/MM/YYYY
^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/\d{4}$
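
These date patterns check the shape of the string only, so "2026-02-30" passes. A sketch of combining the format check with a calendar check through datetime.strptime; the function name is_real_date is hypothetical and re.ASCII is added so that \d means [0-9] only.

```python
import re
from datetime import datetime

ISO_DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])", re.ASCII)

def is_real_date(value: str) -> bool:
    """Format check with regex, then calendar check with strptime."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True

print(is_real_date("2026-04-10"))  # True
print(is_real_date("2026-02-30"))  # False (right shape, no such day)
print(is_real_date("2026-13-01"))  # False (rejected by the regex)
```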

Thai ID Card

# Thai national ID: 13 digits (format only; the check digit is not verified)
^[1-8]\d{12}$
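
A regex can confirm the shape of the ID but not its check digit. The sketch below assumes the commonly described checksum rule for Thai national IDs (weights 13 down to 2 on the first 12 digits, then modulo 11); the function name and the sample numbers are for illustration only and are not real IDs.

```python
import re

THAI_ID_FORMAT = re.compile(r"[1-8][0-9]{12}")

def thai_id_checksum_ok(id_number: str) -> bool:
    """Format check by regex, then the assumed mod-11 check-digit rule."""
    if not THAI_ID_FORMAT.fullmatch(id_number):
        return False
    digits = [int(ch) for ch in id_number]
    # Weights 13, 12, ..., 2 apply to the first 12 digits.
    total = sum(d * w for d, w in zip(digits[:12], range(13, 1, -1)))
    expected = (11 - total % 11) % 10
    return expected == digits[12]

print(thai_id_checksum_ok("1234567890121"))  # True
print(thai_id_checksum_ok("1234567890123"))  # False (wrong check digit)
```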

Regex in Python (re module)

Python ships with the re module, which supports every feature covered so far:

import re

text = "Contact: somchai@email.com or call 081-234-5678"

# re.search(): find the first occurrence of the pattern
match = re.search(r'\d{3}-\d{3}-\d{4}', text)
if match:
    print(match.group())  # "081-234-5678"

# re.findall(): find all occurrences, returned as a list
emails = re.findall(r'[\w.+-]+@[\w-]+\.[\w.]+', text)
print(emails)  # ['somchai@email.com']

# re.match(): match only at the start of the string
if re.match(r'^Contact', text):
    print("Starts with Contact")

# re.sub(): replace
cleaned = re.sub(r'\d{3}-\d{3}-\d{4}', '[REDACTED]', text)
print(cleaned)  # "Contact: somchai@email.com or call [REDACTED]"

# re.split(): split a string on a pattern
parts = re.split(r'[,;\s]+', "apple, banana; cherry  date")
print(parts)  # ['apple', 'banana', 'cherry', 'date']

# Named groups
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
match = re.search(pattern, "Date: 2026-04-10")
if match:
    print(match.group("year"))   # "2026"
    print(match.group("month"))  # "04"
    print(match.group("day"))    # "10"

# Compile a pattern that is reused (re also caches recent patterns, so the gain is modest)
email_pattern = re.compile(r'[\w.+-]+@[\w-]+\.[\w.]+')
emails = email_pattern.findall(text)

# Flags
re.IGNORECASE  # ignore case (re.I)
re.MULTILINE   # ^ and $ match at every line (re.M)
re.DOTALL      # . also matches newline (re.S)
re.VERBOSE     # allow whitespace and comments in the pattern (re.X)

# VERBOSE example: writing a readable regex
pattern = re.compile(r"""
    ^                   # start of string
    (?P<protocol>https?://)  # protocol
    (?P<domain>[\w.-]+)      # domain name
    (?P<port>:\d+)?          # port (optional)
    (?P<path>/[^\s]*)?       # path (optional)
    $                   # end of string
""", re.VERBOSE)

match = pattern.search("https://siamcafe.net:8080/blog/page1")
print(match.group("domain"))  # "siamcafe.net"
print(match.group("port"))    # ":8080"

Regex in JavaScript

// Creating a regex
const pattern1 = /\d+/g;                  // Literal syntax
const pattern2 = new RegExp('\\d+', 'g');  // Constructor

const text = "Order #123 and Order #456";

// test(): check whether there is a match
console.log(/\d+/.test(text));  // true

// match(): find all matches
const matches = text.match(/\d+/g);
console.log(matches);  // ["123", "456"]

// matchAll(): iterate over matches with their groups (ES2020)
const regex = /Order #(?<num>\d+)/g;
for (const match of text.matchAll(regex)) {
    console.log(match.groups.num);  // "123", "456"
}

// replace(): substitute
const result = text.replace(/\d+/g, 'XXX');
console.log(result);  // "Order #XXX and Order #XXX"

// replace() with a function
const result2 = text.replace(/\d+/g, (match) => {
    return String(Number(match) * 2);
});
console.log(result2);  // "Order #246 and Order #912"

// split()
"a,b,,c".split(/,+/);  // ["a", "b", "c"]

// Flags
// g: global (find all matches)
// i: case-insensitive
// m: multiline
// s: dotAll (. also matches newline)
// u: unicode
// d: hasIndices (ES2022)

Regex in Go

package main

import (
    "fmt"
    "regexp"
)

func main() {
    text := "Contact: somchai@email.com or 081-234-5678"

    // Compile the patterns
    emailRe := regexp.MustCompile(`[\w.+-]+@[\w-]+\.[\w.]+`)
    phoneRe := regexp.MustCompile(`\d{3}-\d{3}-\d{4}`)

    // FindString: first match
    email := emailRe.FindString(text)
    fmt.Println(email)  // somchai@email.com

    // FindAllString: all matches
    phones := phoneRe.FindAllString(text, -1)
    fmt.Println(phones)  // [081-234-5678]

    // MatchString: check whether there is a match
    matched := emailRe.MatchString(text)
    fmt.Println(matched)  // true

    // ReplaceAllString: substitute
    cleaned := phoneRe.ReplaceAllString(text, "[PHONE]")
    fmt.Println(cleaned)

    // Named groups
    dateRe := regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`)
    match := dateRe.FindStringSubmatch("Date: 2026-04-10")
    for i, name := range dateRe.SubexpNames() {
        if name != "" {
            fmt.Printf("%s: %s\n", name, match[i])
        }
    }
}

Note: Go's regexp package follows the RE2 syntax and design. It does not support lookahead, lookbehind, or backreferences, in exchange for a guarantee that matching runs in time linear in the input length, so there is no catastrophic backtracking. If you need lookahead in Go, use a third-party library such as github.com/dlclark/regexp2.

Regex on the Command Line

grep: search in files

# Search for a pattern in a file
grep "error" app.log
grep -i "warning" app.log          # Case-insensitive
grep -n "TODO" *.py                # Show line numbers
grep -r "import os" ./src/         # Recursive search
grep -E "[0-9]{3}-[0-9]{4}" data.txt  # Extended regex (-E); ERE has no \d
grep -P "(?<=price: )\d+" data.txt # Perl-compatible regex (-P, GNU grep; supports lookaround)
grep -c "error" app.log            # Count matching lines
grep -v "debug" app.log            # Invert (lines that do not match)
grep -oE "[0-9]+" data.txt         # Print only the matching part

sed: search and replace

# Substitute (s/pattern/replacement/)
sed 's/old/new/' file.txt          # Replace the first occurrence on each line
sed 's/old/new/g' file.txt         # Replace every occurrence (global)
sed -i 's/old/new/g' file.txt      # Edit the file in place (GNU sed; BSD/macOS sed needs -i '')
sed -E 's/[0-9]+/NUM/g' file.txt   # Replace every number (-E enables extended regex)

# Delete matching lines
sed '/^#/d' config.txt             # Delete comment lines
sed '/^$/d' file.txt               # Delete empty lines

# Print only selected lines
sed -n '10,20p' file.txt           # Lines 10-20
sed -n '/error/p' app.log          # Only lines containing error

awk: advanced text processing

# Print selected columns
awk '{print $1, $3}' data.txt     # Columns 1 and 3

# Filter by pattern
awk '/error/ {print $0}' app.log  # Lines containing error

# Compute
awk '{sum += $2} END {print sum}' data.txt  # Sum of column 2

# Custom separator
awk -F',' '{print $1}' data.csv   # Use , as the separator

# Regex match
awk '$3 ~ /^[0-9]+$/ {print $1, $3}' data.txt  # Column 3 is numeric

Regex Performance and Security

Catastrophic Backtracking

Backtracking regex engines (the kind used in Python, JavaScript, Java, and C#) can become extremely slow, up to exponential time, when certain patterns meet certain inputs.

Examples of dangerous patterns

(a+)+$ # very slow on an input such as "aaaaaaaaaaaaaaa!"

(a|a)+$ # same problem

(.*a){10} # nested quantifiers

Why is it slow?

The engine backtracks through every possible way to split the input

"aaaa!" against (a+)+$ → tries: (aaaa), (aaa)(a), (aa)(aa), (aa)(a)(a), ...

The number of combinations grows exponentially with the input length
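
The growth can be made concrete by counting the ways a run of n "a" characters can be cut into consecutive non-empty groups, which is the search space a pattern like (a+)+ may have to explore before giving up. This is an illustrative model under that assumption, not a measurement of any particular engine; the function name count_splits is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Ways to split n characters into one or more non-empty groups."""
    if n == 0:
        return 1  # nothing left to split: one finished arrangement
    # The first group takes k characters; the rest is split recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))

for n in (4, 10, 20, 30):
    print(n, count_splits(n))   # equals 2 ** (n - 1)
```

For n = 4 the count is 8, matching the list above once it is written out in full; at n = 30 it is already over 500 million.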

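ReDoS: Regular Expression Denial of Service

ReDoS is an attack that sends input crafted to make a regex run so slowly that the server stops responding. It is a real threat: regex backtracking was behind the Stack Overflow outage in 2016 and the Cloudflare outage in 2019. Ways to defend against it:

Avoid nested quantifiers and overlapping alternatives: for example (a+)+, (a*)*, (a|a)*
Use atomic groups or possessive quantifiers to cut off backtracking: for example (?>a+) or a++ (available in Java and PCRE, and in Python from 3.11; not available in JavaScript)
Limit input length: check the string length before passing it to a regex
Set a timeout: cap how long a regex is allowed to run
Use a linear-time engine: for example RE2, or Go's regexp package, which does not backtrack
Test your patterns: use a tool such as regex101, which includes a debugger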
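
Two of the defenses above, capping the input length and removing the nested quantifier, can be sketched in a few lines of Python. The limit MAX_INPUT_LEN and the helper safe_search are hypothetical names; pick a limit that fits your own data.

```python
import re

MAX_INPUT_LEN = 256  # hypothetical cap; tune for the field being validated

# a+$ accepts the same strings as (a+)+$ but has no nested quantifier.
SIMPLE = re.compile(r"a+$")

def safe_search(pattern: re.Pattern, text: str, max_len: int = MAX_INPUT_LEN):
    """Reject over-long input before the regex engine ever sees it."""
    if len(text) > max_len:
        raise ValueError(f"input longer than {max_len} characters")
    return pattern.search(text)

print(safe_search(SIMPLE, "xaaa").group())   # aaa
print(safe_search(SIMPLE, "aaa!"))           # None
```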

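# Python: set a timeout around a regex call (Unix only, main thread only)
import re
import signal

def timeout_handler(signum, frame):
    raise TimeoutError("Regex timeout")

pattern = r'(a+)+$'            # example of a risky pattern
user_input = 'a' * 30 + '!'    # example of a hostile input

signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(5)  # 5-second timeout

try:
    re.search(pattern, user_input)
except TimeoutError:
    print("Regex took too long - possible ReDoS")
finally:
    signal.alarm(0)  # cancel the pending alarm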

Regex Debugging Tools

Tools that help you write and test regex:

regex101.com

One of the most complete sites for testing regex:

Supports several flavors (Python, JavaScript, Go, Java, PHP)
Explains every token in the pattern
Shows detailed match information
Has a debugger that steps through the match
Lets you save and share patterns
Includes a quick reference for the syntax

Regexr.com

Another good site with a clean, easy interface and community patterns to learn from. A good fit for beginners.

IDE Built-in

Modern editors and IDEs such as VS Code, IntelliJ, and Sublime Text have regex support built into Find & Replace. Open Replace (Ctrl+H in VS Code and Sublime Text, Ctrl+R in IntelliJ on Windows and Linux), turn on regex mode, and you can search and replace with regex right away.

Practical Use Cases

Log Parsing

import re

# Parse Apache/Nginx Log
log_pattern = re.compile(
    r'(?P<ip>[\d.]+) - - '
    r'\[(?P<date>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<url>[^"]+) HTTP/[\d.]+" '
    r'(?P<status>\d+) (?P<size>\d+)'
)

log_line = '192.168.1.1 - - [10/Apr/2026:13:55:36 +0700] "GET /api/users HTTP/1.1" 200 1234'
match = log_pattern.search(log_line)
if match:
    print(f"IP: {match.group('ip')}")
    print(f"URL: {match.group('url')}")
    print(f"Status: {match.group('status')}")

Data Validation

import re

def validate_input(data):
    validators = {
        'email': r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$',
        'phone_th': r'^0[689]\d{8}$',
        'thai_id': r'^[1-8]\d{12}$',
        'url': r'^https?://[^\s/$.?#].[^\s]*$',
        'password': r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$',
        'date_iso': r'^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$',
        'ipv4': r'^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$',
    }

    errors = []
    for field, pattern in validators.items():
        if field in data:
            # fullmatch() also rejects a trailing newline, which $ alone would allow
            if not re.fullmatch(pattern, data[field]):
                errors.append(f"Invalid {field}: {data[field]}")
    return errors

Search and Replace

# Convert dates from MM/DD/YYYY to YYYY-MM-DD
text = "Date: 04/10/2026, Due: 05/15/2026"
result = re.sub(
    r'(\d{2})/(\d{2})/(\d{4})',
    r'\3-\1-\2',
    text
)
print(result)  # "Date: 2026-04-10, Due: 2026-05-15"

# Strip HTML tags
html = "<p>Hello <b>world</b></p>"
text = re.sub(r'<[^>]+>', '', html)
print(text)  # "Hello world"

# Mask personal data
text = "Card: 4111-1111-1111-1111"
masked = re.sub(r'(\d{4}-)(\d{4}-)(\d{4}-)(\d{4})', r'\1****-****-\4', text)
print(masked)  # "Card: 4111-****-****-1111"

Web Scraping (Simple)

# Extract URLs from text
text = "Visit https://siamcafe.net and http://example.com/page?id=1"
urls = re.findall(r'https?://[^\s<>"]+', text)
print(urls)  # ['https://siamcafe.net', 'http://example.com/page?id=1']

# Extract prices from text ("บาท" is the Thai word for baht)
text = "iPhone 50,000 บาท, AirPods 5,990 บาท"
prices = re.findall(r'([\d,]+)\s*บาท', text)
print(prices)  # ['50,000', '5,990']

# Extract hashtags
text = "สวัสดี #python #regex #programming"
hashtags = re.findall(r'#(\w+)', text)
print(hashtags)  # ['python', 'regex', 'programming']

Regex Best Practices

Good practices for writing and using regex:

Start with a simple pattern: write the simple version first and add complexity step by step. Do not try to write the perfect pattern on the first attempt
Use verbose mode: comment the pattern, especially long ones (Python: re.VERBOSE)
Test edge cases: try the empty string, very long input, input with special characters, and input that should not match
Avoid catastrophic backtracking: watch for nested quantifiers such as (a+)+
Compile patterns you reuse: in Python, re.compile() gives a reused pattern a name and skips the per-call cache lookup. The speedup is usually small because re caches compiled patterns internally
Use raw strings: in Python, always write r'pattern' so that Python does not interpret the backslashes
Use named groups: when a pattern has several groups, names are easier to read than numbers
Do not use regex for everything: for some tasks plain string methods (split, startswith, endswith) are faster and easier to read
Do not parse HTML with regex: use an HTML parser such as BeautifulSoup, cheerio, or goquery. Regex cannot correctly parse nested HTML
Document the pattern: a complex regex should come with a comment saying what it matches, plus sample inputs that match and that do not

Summary

Regular expressions are a tool every developer should learn. You do not need to memorize all of the syntax. It is enough to understand the core concepts and know where to look things up. What matters most is practicing on real data.

Start with simple patterns, such as finding phone numbers or email addresses, and add complexity gradually. Keep regex101.com close at hand, and remember that regex is not the answer to every problem: sometimes a plain string method or a dedicated parser is the better choice.

In 2026, regex remains as relevant as ever. However far AI advances, understanding how pattern matching works will help you solve text-processing problems faster.
