Problem
Design a command-line tool that:
- takes a target word
- scans a folder of Shakespeare text files
- reports, for each file:
  - how many times the word appears
  - which line numbers contain the word
This is essentially a simplified grep + counting utility.
Approach
- Traverse files in the directory.
- Read each file line-by-line.
- Match the word (optionally case-insensitive).
- Track counts and line numbers.
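The matching step can be sketched on its own. This is a minimal illustration that assumes whole-word matching with the regex word boundary `\b`; the helper name `build_pattern` and the sample line are hypothetical and chosen only for demonstration.

```python
import re


def build_pattern(word: str, case_sensitive: bool = False) -> re.Pattern:
    """Compile a whole-word pattern for `word` (hypothetical helper)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.escape keeps regex metacharacters in the search word literal.
    return re.compile(r"\b" + re.escape(word) + r"\b", flags)


pattern = build_pattern("love")
line = "Love looks not with the eyes; lovely love."
hits = pattern.findall(line)
print(hits)  # ['Love', 'love']  ("lovely" is skipped by the word boundary)
```

Note that `\b` only behaves as expected when the search word starts and ends with a letter, digit, or underscore, which holds for ordinary words.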
Python (reference)
import os
import re


def search_in_file(path: str, pattern: re.Pattern):
    """Return (total matches, line numbers containing at least one match)."""
    count = 0
    lines = []
    # errors='ignore' drops undecodable bytes instead of raising.
    with open(path, 'r', encoding='utf-8', errors='ignore') as f:
        for i, line in enumerate(f, 1):  # line numbers are 1-based
            hits = pattern.findall(line)
            if hits:
                count += len(hits)
                lines.append(i)
    return count, lines


def search_in_folder(folder: str, word: str, case_sensitive: bool = False):
    """Search every .txt file under folder (recursively)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # \b is a regex word boundary, so "love" does not match "lovely".
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', flags)
    result = {}
    for root, _, files in os.walk(folder):
        for name in files:
            if not name.endswith('.txt'):
                continue
            path = os.path.join(root, name)
            c, ls = search_in_file(path, pattern)
            # Files with no matches are left out of the result.
            if c > 0:
                result[path] = {"count": c, "lines": ls}
    return result
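The problem asks for a command-line tool, so a thin entry point is still needed. The following is one possible sketch using the standard `argparse` module. The argument names (`word`, `folder`, `--case-sensitive`) and the output format are assumptions, not part of the original problem, and a compact copy of the search is included only so the block runs on its own.

```python
import argparse
import os
import re


def search_in_folder(folder, word, case_sensitive=False):
    """Return {path: (count, line_numbers)} for .txt files containing `word`."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags)
    result = {}
    for root, _, files in os.walk(folder):
        for name in files:
            if not name.endswith(".txt"):
                continue
            path = os.path.join(root, name)
            count, lines = 0, []
            with open(path, encoding="utf-8", errors="ignore") as f:
                for i, line in enumerate(f, 1):
                    n = len(pattern.findall(line))
                    if n:
                        count += n
                        lines.append(i)
            if count:
                result[path] = (count, lines)
    return result


def main(argv=None):
    """Parse arguments, run the search, and print one report line per file."""
    parser = argparse.ArgumentParser(
        description="Count a word in a folder of .txt files.")
    parser.add_argument("word", help="word to search for")
    parser.add_argument("folder", help="folder containing the text files")
    parser.add_argument("--case-sensitive", action="store_true",
                        help="match case exactly (default: ignore case)")
    args = parser.parse_args(argv)

    results = search_in_folder(args.folder, args.word, args.case_sensitive)
    for path in sorted(results):
        count, lines = results[path]
        print(f"{path}: {count} occurrence(s) on lines {lines}")
    return 0

# In a real script, finish with:
#   if __name__ == "__main__":
#       raise SystemExit(main())
```

Saved as, say, `wordcount.py` (a hypothetical file name) with the guard enabled, it could be invoked as `python wordcount.py love ./shakespeare`. Passing `argv` explicitly to `main` also makes the entry point easy to call from tests.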
Complexity
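- Time: O(N), where N is the total number of characters across the scanned files; each line is read and matched once.
- Space: O(M), where M is the number of matched line numbers stored, plus one line of text held in memory at a time.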