Array Number Frequency Statistics Algorithm Optimization - In-Place Marking & Counting Strategies | OAVOService Interview Support

Problem Background

In data analysis and system monitoring, frequency statistics is a fundamental and important operation. Especially in resource-constrained embedded systems or high-concurrency scenarios, how to efficiently count the occurrence frequency of numbers within a given range directly affects system performance.

OAVOService Interview Support Team has found through long-term big tech algorithm interview coaching that frequency statistics problems frequently appear in programming interviews at Google, Amazon, Microsoft, and other companies, serving as classic topics to assess candidates' understanding of time-space tradeoffs and array manipulation techniques.

Problem Description

Problem 3 — Print Frequency of Numbers from 1 to N

Given an array of N integers containing only integers from 1 to N. Some numbers may appear multiple times, some numbers may not appear.

Write code to print out the count of times each number from 1 to N appears in the array.

Examples:

PrintFrequency(new int[] {3, 1, 2});
Output:
1 : 1
2 : 1
3 : 1

PrintFrequency(new int[] {2, 4, 4, 6, 6, 6});
Output:
1 : 0
2 : 1
3 : 0
4 : 2
5 : 0
6 : 3

Core Assessment Points:

Array traversal and counting algorithms
Space complexity optimization techniques
In-place array modification strategies
Boundary condition handling

Solution Analysis

Solution 1: Standard Counting Array (Recommended)

def print_frequency_standard(arr):
    """
    Standard counting array solution
    Time Complexity: O(n)
    Space Complexity: O(n)
    """
    if not arr:
        return
    
    n = len(arr)
    # Create counting array, index 0 corresponds to number 1
    count = [0] * n
    
    # Count frequency of each number
    for num in arr:
        if 1 <= num <= n:  # Boundary check
            count[num - 1] += 1
    
    # Print results
    for i in range(n):
        print(f"{i + 1} : {count[i]}")

# Optimized version: support arbitrary range
def print_frequency_range(arr, min_val, max_val):
    """
    Frequency statistics supporting arbitrary number ranges
    """
    count = [0] * (max_val - min_val + 1)
    
    for num in arr:
        if min_val <= num <= max_val:
            count[num - min_val] += 1
    
    for i in range(len(count)):
        print(f"{i + min_val} : {count[i]}")

Solution 2: In-Place Marking Method (Advanced Optimization)

def print_frequency_inplace(arr):
    """
    In-place marking method - O(1) extra space
    Time Complexity: O(n)
    Space Complexity: O(1)
    
    Core idea: Use positive/negative signs of array elements to mark access status
    """
    if not arr:
        return
    
    n = len(arr)
    
    # Step 1: Mark appearing numbers
    for i in range(n):
        # Get current number (may already be marked as negative)
        num = abs(arr[i])
        
        if 1 <= num <= n:
            # Mark corresponding position as negative (indicates number appeared)
            if arr[num - 1] > 0:
                arr[num - 1] = -arr[num - 1]
    
    # Step 2: Count frequencies
    # Re-traverse to calculate occurrence count of each number
    frequency = [0] * n
    for num in arr:
        abs_num = abs(num)
        if 1 <= abs_num <= n:
            frequency[abs_num - 1] += 1
    
    # Step 3: Restore original array (optional)
    for i in range(n):
        arr[i] = abs(arr[i])
    
    # Print results
    for i in range(n):
        print(f"{i + 1} : {frequency[i]}")

Solution 3: Hash Table Method (Universal)

from collections import defaultdict, Counter

def print_frequency_hash(arr):
    """
    Hash table solution - suitable for arbitrary number ranges
    Time Complexity: O(n)
    Space Complexity: O(k), where k is the number of distinct numbers
    """
    if not arr:
        return
    
    # Method 1: Using Counter
    frequency = Counter(arr)
    n = len(arr)
    
    for i in range(1, n + 1):
        print(f"{i} : {frequency.get(i, 0)}")
    
    # Method 2: Manual counting
    # count_map = defaultdict(int)
    # for num in arr:
    #     count_map[num] += 1
    
    # for i in range(1, n + 1):
    #     print(f"{i} : {count_map[i]}")

Advanced Optimization Strategies

1. Chunked Processing for Large Arrays

def print_frequency_chunked(arr, chunk_size=1000):
    """
    Process large arrays in chunks to reduce memory peak
    Suitable for memory-constrained scenarios
    """
    if not arr:
        return
    
    n = len(arr)
    total_count = [0] * n
    
    # Process in chunks
    for start in range(0, len(arr), chunk_size):
        end = min(start + chunk_size, len(arr))
        chunk = arr[start:end]
        
        # Process current chunk
        chunk_count = [0] * n
        for num in chunk:
            if 1 <= num <= n:
                chunk_count[num - 1] += 1
        
        # Accumulate to total count
        for i in range(n):
            total_count[i] += chunk_count[i]
    
    # Print results
    for i in range(n):
        print(f"{i + 1} : {total_count[i]}")

2. Parallel Processing Optimization

from concurrent.futures import ThreadPoolExecutor
import threading

def print_frequency_parallel(arr, num_threads=4):
    """
    Multi-threaded parallel processing
    Suitable for frequency statistics of large arrays
    """
    if not arr:
        return
    
    n = len(arr)
    chunk_size = len(arr) // num_threads
    results = [None] * num_threads
    lock = threading.Lock()
    
    def process_chunk(thread_id, start_idx, end_idx):
        chunk_count = [0] * n
        for i in range(start_idx, end_idx):
            num = arr[i]
            if 1 <= num <= n:
                chunk_count[num - 1] += 1
        
        with lock:
            results[thread_id] = chunk_count
    
    # Start threads
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        futures = []
        for i in range(num_threads):
            start = i * chunk_size
            end = len(arr) if i == num_threads - 1 else (i + 1) * chunk_size
            future = executor.submit(process_chunk, i, start, end)
            futures.append(future)
        
        # Wait for all threads to complete
        for future in futures:
            future.result()
    
    # Merge results
    total_count = [0] * n
    for chunk_result in results:
        if chunk_result:
            for i in range(n):
                total_count[i] += chunk_result[i]
    
    # Print results
    for i in range(n):
        print(f"{i + 1} : {total_count[i]}")

3. Streaming Processing for Large Files

def print_frequency_streaming(file_path):
    """
    Stream processing for number frequencies in large files
    Suitable for datasets too large to load into memory at once
    """
    frequency = {}
    max_num = 0
    
    with open(file_path, 'r') as file:
        for line in file:
            numbers = map(int, line.strip().split())
            for num in numbers:
                frequency[num] = frequency.get(num, 0) + 1
                max_num = max(max_num, num)
    
    # Print results
    for i in range(1, max_num + 1):
        print(f"{i} : {frequency.get(i, 0)}")

Performance Benchmark Testing

import time
import random

def benchmark_frequency_methods():
    """
    Performance benchmark testing
    """
    # Generate test data
    test_sizes = [1000, 10000, 100000]
    
    for size in test_sizes:
        arr = [random.randint(1, size) for _ in range(size)]
        
        print(f"\nTest array size: {size}")
        
        # Test standard counting method
        start_time = time.time()
        print_frequency_standard(arr.copy())
        standard_time = time.time() - start_time
        
        # Test in-place marking method
        start_time = time.time()
        print_frequency_inplace(arr.copy())
        inplace_time = time.time() - start_time
        
        # Test hash table method
        start_time = time.time()
        print_frequency_hash(arr.copy())
        hash_time = time.time() - start_time
        
        print(f"Standard counting: {standard_time:.4f}s")
        print(f"In-place marking: {inplace_time:.4f}s")
        print(f"Hash table: {hash_time:.4f}s")

OAVOService Interview Tips

Common Interview Variations

Space complexity constraints - Require O(1) space solution
Handle negative/zero numbers - Extend to arbitrary integer ranges
Top-K frequencies - Output only the K most frequent numbers
Real-time updates - Support dynamic addition/removal of numbers

Interview Discussion Points

Algorithm selection criteria - Choose appropriate algorithm based on data characteristics
Boundary condition handling - Empty arrays, duplicate elements, out-of-bounds values
Scalability considerations - How to support larger data scales
Practical application scenarios - Log analysis, user behavior statistics, etc.

Complexity Comparison

Method	Time Complexity	Space Complexity	Use Case
Standard Counting	O(n)	O(n)	Known and small number range
In-Place Marking	O(n)	O(1)	Space-constrained, small range
Hash Table	O(n)	O(k)	Large range, sparse distribution
Chunked Processing	O(n)	O(n/chunks)	Memory-limited large arrays

OAVOService Professional Interview Support - We deeply analyze various algorithm interview topics, providing real-time interview assistance for Google, Amazon, Microsoft, Meta, and other top companies. With rich interview experience and algorithm optimization techniques, we help you stand out in programming interviews.

If you're preparing for algorithm interviews, learn about our customized algorithm interview training solutions — from basic data structures to advanced optimization techniques, comprehensively enhancing your programming interview competitiveness.

Tags: Frequency Statistics, Array Algorithms, Space Optimization, In-Place Operations, Algorithm Interviews, OAVOService