Generic selectors
Exact matches only
Search in title
Search in content
Post Type Selectors

Regular expresion to CFG

To convert a regular expression to a context-free grammar (CFG), you can follow a set of standard conversion rules.

Here are the rules for converting a regular expression to a CFG:

1. Terminal Symbols:

Each character or symbol in the regular expression becomes a terminal symbol in the CFG.

2. Start Symbol:

Create a new start symbol for the CFG.

3. Concatenation:

For every concatenation (ab) in the regular expression, add a new production rule in the CFG. The left-hand side of the rule should be a non-terminal symbol representing the concatenation, and the right-hand side should be the concatenation of the non-terminal symbols representing the individual characters/symbols.

4. Union:

For every union (a + b) in the regular expression, add a new production rule in the CFG. The left-hand side of the rule should be a non-terminal symbol representing the union, and the right-hand side should have two alternatives, one for each character/symbol in the union.

5. Kleene Star:

For every Kleene star (a*) in the regular expression, add a new production rule in the CFG. The left-hand side of the rule should be a non-terminal symbol representing the Kleene star, and the right-hand side should have two alternatives. One alternative should be the non-terminal symbol representing the character/symbol, followed by the non-terminal symbol representing the Kleene star itself. The other alternative should be ε (epsilon), indicating an empty string.

6. Parentheses:

If the regular expression contains parentheses, treat the contents within the parentheses as a separate expression and apply the conversion rules recursively.

By following these rules, you can convert a regular expression into an equivalent context-free grammar (CFG). Note that the resulting CFG may contain additional non-terminal symbols and production rules compared to the original regular expression.

Practice problem on Regular expresion to CFG

Q1. Write given CFG for R.E (011 + 1)*(01)* (RGPV 2020). 

Here’s the context-free grammar (CFG) for the given regular expression (011 + 1)(01):

Non-terminal symbols:
S: Start symbol
A: Represents the regular expression (011 + 1)*
B: Represents the regular expression (01)*

Terminal symbols:
0, 1: Binary digits

Production rules:

  1. S -> AB
  2. A -> 011A
  3. A -> 1A
  4. A -> ε (epsilon)
  5. B -> 01B
  6. B -> ε (epsilon)

Explanation of the CFG:

The CFG consists of two non-terminal symbols, A and B, which represent the two parts of the regular expression. Non-terminal symbol S is the start symbol, which represents the entire regular expression (011 + 1)(01).

The production rules define the valid derivations for each non-terminal symbol. Rule 1 states that the start symbol S can be derived as AB, meaning it can be split into parts A and B. Rule 2 defines that A can be derived as 011A, indicating that the regular expression (011 + 1) can repeat zero or more times. Rule 3 states that A can be derived as 1A, allowing the regular expression 1 to repeat zero or more times. Rule 4 allows A to be derived as ε (epsilon), indicating that it can be empty.

Similarly, Rule 5 defines that B can be derived as 01B, indicating that the regular expression 01 can repeat zero or more times. Rule 6 allows B to be derived as ε (epsilon), indicating that it can be empty.

By applying these production rules, you can generate strings that match the given regular expression.

Q2. Regular Expression: (a + b)*

A. CFG:

Start Symbol: S
Terminal Symbols: a, b
Production Rules:
S -> Sa
S -> Sb
S -> ε (epsilon)

Q3. Regular Expression: ab

A. CFG:

Start Symbol: S
Terminal Symbols: a, b
Production Rules:
S -> aS
S -> bS
S -> ε (epsilon)

Q4. Regular Expression: (aa + b)*c

A. CFG:

Start Symbol: S
Terminal Symbols: a, b, c
Production Rules:
S -> SS
S -> a
S -> b
S -> c

Q5. Regular Expression: (a + b)c*

A. CFG:

Start Symbol: S
Terminal Symbols: a, b, c
Production Rules:
S -> aS
S -> bS
S -> cS
S -> ε (epsilon)

Q6. Regular Expression: (abc*)+

A. CFG:

Start Symbol: S
Terminal Symbols: a, b, c
Production Rules:
S -> T+
T -> aT
T -> bT
T -> cT
T -> ε (epsilon)