Canvas Signature Extract - Building Computer Vision for Cheque Clearing
How we built a signature extraction tool using HTML5 Canvas and computer vision for automatic cheque clearing at Ensemble Matrix. From pixel manipulation to a production deployment that worked with real bank data.
During my time at Ensemble Matrix, I worked on a fascinating project: automating cheque clearing using computer vision. One of the most challenging parts was extracting signatures from scanned cheques for verification. This involved building a signature extraction tool that could isolate signatures from noisy backgrounds, handle various cheque formats, and work reliably with real bank data.
The Challenge
Banks process thousands of cheques daily. Manual signature verification is slow, error-prone, and expensive. We needed to build a system that could automatically extract signatures from scanned cheques and compare them with stored signature specimens for verification.
Understanding the Problem
Signature extraction from cheques isn’t just about finding ink on paper. Real-world cheques have:
Challenges
- Background noise (logos, watermarks, lines)
- Varying paper quality and scanning resolution
- Multiple signatures on one cheque
- Overlapping text and signature areas
- Different ink colors and intensities
- Skewed or rotated scanned images
Requirements
- Accurate signature boundary detection
- Background noise removal
- Support for multiple image formats
- Real-time processing capability
- Consistent output quality
- Integration with existing bank systems
Technical Architecture
Our signature extraction system had three main components: image preprocessing, signature detection, and extraction. Here’s how we structured it:
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Image Upload   │───▶│  Preprocessing   │───▶│   Signature     │
│  (Frontend)     │    │  (Canvas)        │    │   Detection     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │                       │
                                ▼                       ▼
                       ┌──────────────────┐    ┌─────────────────┐
                       │  Noise Removal   │    │   Boundary      │
                       │  & Filtering     │    │   Detection     │
                       └──────────────────┘    └─────────────────┘
                                │                       │
                                ▼                       ▼
                       ┌──────────────────┐    ┌─────────────────┐
                       │   Signature      │───▶│  Final Output   │
                       │   Extraction     │    │  (Clean Sig)    │
                       └──────────────────┘    └─────────────────┘
HTML5 Canvas Setup
We used HTML5 Canvas for client-side image processing. This allowed real-time preview and reduced server load:
class SignatureExtractor {
constructor(canvasId) {
this.canvas = document.getElementById(canvasId);
this.ctx = this.canvas.getContext('2d');
this.originalImageData = null;
this.processedImageData = null;
// Processing parameters
this.threshold = 128;
this.minSignatureArea = 1000;
this.maxSignatureArea = 50000;
this.noiseThreshold = 5;
}
// Load image from file input
loadImage(file) {
return new Promise((resolve, reject) => {
const img = new Image();
img.onload = () => {
// Resize canvas to match image
this.canvas.width = img.width;
this.canvas.height = img.height;
// Draw image to canvas
this.ctx.drawImage(img, 0, 0);
// Store original image data
this.originalImageData = this.ctx.getImageData(0, 0, img.width, img.height);
URL.revokeObjectURL(img.src); // release the object URL once the image is drawn
resolve(img);
};
img.onerror = reject;
img.src = URL.createObjectURL(file);
});
}
// Reset to original image
reset() {
if (this.originalImageData) {
this.ctx.putImageData(this.originalImageData, 0, 0);
}
}
}
Image Preprocessing Pipeline
Step 1: Grayscale Conversion
// Convert image to grayscale
convertToGrayscale() {
const imageData = this.ctx.getImageData(0, 0, this.canvas.width, this.canvas.height);
const data = imageData.data;
for (let i = 0; i < data.length; i += 4) {
// Calculate grayscale value using luminance formula
const gray = Math.round(0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2]);
data[i] = gray; // Red
data[i + 1] = gray; // Green
data[i + 2] = gray; // Blue
// Alpha channel (data[i + 3]) remains unchanged
}
this.ctx.putImageData(imageData, 0, 0);
return imageData;
}
Step 2: Noise Reduction
// Apply Gaussian blur to reduce noise
applyGaussianBlur(radius = 1) {
const imageData = this.ctx.getImageData(0, 0, this.canvas.width, this.canvas.height);
const data = imageData.data;
const width = this.canvas.width;
const height = this.canvas.height;
// Create Gaussian kernel
const kernel = this.createGaussianKernel(radius);
const kernelSize = kernel.length;
const half = Math.floor(kernelSize / 2);
const newData = new Uint8ClampedArray(data);
for (let y = half; y < height - half; y++) {
for (let x = half; x < width - half; x++) {
let r = 0, g = 0, b = 0;
for (let ky = 0; ky < kernelSize; ky++) {
for (let kx = 0; kx < kernelSize; kx++) {
const pixelY = y + ky - half;
const pixelX = x + kx - half;
const pixelIndex = (pixelY * width + pixelX) * 4;
const weight = kernel[ky][kx];
r += data[pixelIndex] * weight;
g += data[pixelIndex + 1] * weight;
b += data[pixelIndex + 2] * weight;
}
}
const index = (y * width + x) * 4;
newData[index] = Math.round(r);
newData[index + 1] = Math.round(g);
newData[index + 2] = Math.round(b);
}
}
const blurredImageData = new ImageData(newData, width, height);
this.ctx.putImageData(blurredImageData, 0, 0);
return blurredImageData;
}
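The createGaussianKernel helper called above isn't shown in the listing. A minimal sketch, written here as a standalone function (in the class it would be a method), assuming a standard 2D Gaussian normalized so the weights sum to 1:

```javascript
// Hypothetical implementation of the createGaussianKernel helper referenced
// above. Builds a (2*radius+1) x (2*radius+1) kernel from the 2D Gaussian
// and normalizes it so the weights sum to 1.
function createGaussianKernel(radius, sigma = radius || 1) {
  const size = 2 * radius + 1;
  const kernel = [];
  let sum = 0;
  for (let y = 0; y < size; y++) {
    kernel[y] = [];
    for (let x = 0; x < size; x++) {
      const dx = x - radius;
      const dy = y - radius;
      const weight = Math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma));
      kernel[y][x] = weight;
      sum += weight;
    }
  }
  // Normalize so blurring doesn't change overall brightness
  for (let y = 0; y < size; y++) {
    for (let x = 0; x < size; x++) {
      kernel[y][x] /= sum;
    }
  }
  return kernel;
}
```

Normalization matters here: an unnormalized kernel would brighten or darken the image with every blur pass.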
Step 3: Adaptive Thresholding
// Adaptive thresholding for varying lighting conditions
applyAdaptiveThreshold(blockSize = 15, C = 10) {
const imageData = this.ctx.getImageData(0, 0, this.canvas.width, this.canvas.height);
const data = imageData.data;
const width = this.canvas.width;
const height = this.canvas.height;
const newData = new Uint8ClampedArray(data);
const half = Math.floor(blockSize / 2);
for (let y = half; y < height - half; y++) {
for (let x = half; x < width - half; x++) {
let sum = 0;
let count = 0;
// Calculate mean in local neighborhood
for (let dy = -half; dy <= half; dy++) {
for (let dx = -half; dx <= half; dx++) {
const pixelIndex = ((y + dy) * width + (x + dx)) * 4;
sum += data[pixelIndex];
count++;
}
}
const mean = sum / count;
const threshold = mean - C;
const currentIndex = (y * width + x) * 4;
const pixelValue = data[currentIndex];
const binary = pixelValue < threshold ? 0 : 255;
newData[currentIndex] = binary;
newData[currentIndex + 1] = binary;
newData[currentIndex + 2] = binary;
}
}
const thresholdedImageData = new ImageData(newData, width, height);
this.ctx.putImageData(thresholdedImageData, 0, 0);
return thresholdedImageData;
}
Pro Tip: Adaptive Thresholding
Regular thresholding fails when cheques have uneven lighting or shadows. Adaptive thresholding calculates the threshold locally for each pixel based on its neighborhood, giving much better results with real-world scanned documents.
Signature Detection Algorithm
Connected Component Analysis
After preprocessing, we need to find connected components (groups of connected black pixels) that could be signatures:
// Find connected components using flood fill
findConnectedComponents() {
const imageData = this.ctx.getImageData(0, 0, this.canvas.width, this.canvas.height);
const data = imageData.data;
const width = this.canvas.width;
const height = this.canvas.height;
const visited = new Array(width * height).fill(false);
const components = [];
for (let y = 0; y < height; y++) {
for (let x = 0; x < width; x++) {
const index = y * width + x;
const pixelIndex = index * 4;
// Check if pixel is black (signature) and not visited
if (data[pixelIndex] === 0 && !visited[index]) {
const component = this.floodFill(data, visited, x, y, width, height);
if (this.isValidSignatureComponent(component)) {
components.push(component);
}
}
}
}
return components;
}
// Flood fill algorithm to find connected pixels
floodFill(data, visited, startX, startY, width, height) {
const stack = [{x: startX, y: startY}];
const component = {
pixels: [],
minX: startX,
maxX: startX,
minY: startY,
maxY: startY,
area: 0
};
while (stack.length > 0) {
const {x, y} = stack.pop();
const index = y * width + x;
if (x < 0 || x >= width || y < 0 || y >= height || visited[index]) {
continue;
}
const pixelIndex = index * 4;
if (data[pixelIndex] !== 0) { // Not black
continue;
}
visited[index] = true;
component.pixels.push({x, y});
component.area++;
// Update bounding box
component.minX = Math.min(component.minX, x);
component.maxX = Math.max(component.maxX, x);
component.minY = Math.min(component.minY, y);
component.maxY = Math.max(component.maxY, y);
// Add neighbors to stack
stack.push({x: x + 1, y}, {x: x - 1, y}, {x, y: y + 1}, {x, y: y - 1});
}
return component;
}
Signature Component Validation
// Determine if a component is likely a signature
isValidSignatureComponent(component) {
const area = component.area;
const width = component.maxX - component.minX + 1;
const height = component.maxY - component.minY + 1;
const aspectRatio = width / height;
const density = area / (width * height);
// Signature validation criteria
return (
area >= this.minSignatureArea &&
area <= this.maxSignatureArea &&
aspectRatio >= 0.5 && aspectRatio <= 4.0 && // Not too narrow or wide
density >= 0.1 && density <= 0.8 && // Not too sparse or dense
width >= 50 && height >= 20 // Minimum dimensions
);
}
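To make the criteria concrete, here is the same check as a standalone function with the constructor's default area limits inlined, exercised on two illustrative components (the numbers are made up, not real cheque data):

```javascript
// Standalone version of isValidSignatureComponent with the constructor
// defaults (minSignatureArea = 1000, maxSignatureArea = 50000) inlined.
function isValidSignature(component, minArea = 1000, maxArea = 50000) {
  const width = component.maxX - component.minX + 1;
  const height = component.maxY - component.minY + 1;
  const aspectRatio = width / height;
  const density = component.area / (width * height);
  return (
    component.area >= minArea && component.area <= maxArea &&
    aspectRatio >= 0.5 && aspectRatio <= 4.0 &&
    density >= 0.1 && density <= 0.8 &&
    width >= 50 && height >= 20
  );
}

// A 200x50 px blob with 5000 set pixels: aspect ratio 4.0, density 0.5 — accepted
const signatureLike = { minX: 10, maxX: 209, minY: 30, maxY: 79, area: 5000 };
// A 10x10 speck with 80 set pixels: far below the minimum area — rejected as noise
const speck = { minX: 0, maxX: 9, minY: 0, maxY: 9, area: 80 };
```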
Morphological Operations
// Remove noise using morphological operations
applyMorphologicalOps() {
// Erosion followed by dilation (opening) to remove noise
this.erode(1);
this.dilate(1);
// Dilation followed by erosion (closing) to fill gaps
this.dilate(1);
this.erode(1);
}
// Erosion operation
erode(iterations = 1) {
for (let i = 0; i < iterations; i++) {
const imageData = this.ctx.getImageData(0, 0, this.canvas.width, this.canvas.height);
const data = imageData.data;
const newData = new Uint8ClampedArray(data);
const width = this.canvas.width;
const height = this.canvas.height;
for (let y = 1; y < height - 1; y++) {
for (let x = 1; x < width - 1; x++) {
const index = (y * width + x) * 4;
// Check if any neighbor is white (background)
let hasWhiteNeighbor = false;
for (let dy = -1; dy <= 1; dy++) {
for (let dx = -1; dx <= 1; dx++) {
const neighborIndex = ((y + dy) * width + (x + dx)) * 4;
if (data[neighborIndex] === 255) {
hasWhiteNeighbor = true;
break;
}
}
if (hasWhiteNeighbor) break;
}
if (hasWhiteNeighbor) {
newData[index] = 255; // Set to white
newData[index + 1] = 255;
newData[index + 2] = 255;
}
}
}
this.ctx.putImageData(new ImageData(newData, width, height), 0, 0);
}
}
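The dilate counterpart called by applyMorphologicalOps isn't shown either. A sketch of the grow step, written as a pure function on the RGBA pixel array so the kernel logic is easy to see; in the class it would mirror erode() and write the result back with putImageData:

```javascript
// Dilation: the mirror image of erosion. If any 3x3 neighbor is black
// (foreground), the pixel becomes black, growing signature strokes and
// closing small gaps.
function dilate(data, width, height) {
  const newData = new Uint8ClampedArray(data);
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const index = (y * width + x) * 4;
      let hasBlackNeighbor = false;
      for (let dy = -1; dy <= 1 && !hasBlackNeighbor; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const neighborIndex = ((y + dy) * width + (x + dx)) * 4;
          if (data[neighborIndex] === 0) {
            hasBlackNeighbor = true;
            break;
          }
        }
      }
      if (hasBlackNeighbor) {
        newData[index] = 0; // set to black
        newData[index + 1] = 0;
        newData[index + 2] = 0;
      }
    }
  }
  return newData;
}
```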
Production Optimizations
Performance Improvements
// Optimized processing using Web Workers
processImageWithWorker(imageData) {
return new Promise((resolve, reject) => {
const worker = new Worker('/js/signature-worker.js');
worker.postMessage({
imageData: imageData,
parameters: {
threshold: this.threshold,
blurRadius: 1,
minArea: this.minSignatureArea,
maxArea: this.maxSignatureArea
}
});
worker.onmessage = (e) => {
const {processedImageData, signatures} = e.data;
resolve({processedImageData, signatures});
worker.terminate();
};
worker.onerror = reject;
});
}
// Batch processing for multiple signatures
async processBatch(files) {
const results = [];
const batchSize = 5; // Process 5 images at a time
for (let i = 0; i < files.length; i += batchSize) {
const batch = files.slice(i, i + batchSize);
const batchPromises = batch.map(file => this.processSignature(file));
try {
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
} catch (error) {
console.error('Batch processing error:', error);
}
}
return results;
}
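The signature-worker.js script referenced above isn't shown in the article. A minimal sketch of what such a worker might contain; the file name and message shape are assumptions taken from the calling code, and only the global-threshold step is implemented here:

```javascript
// Hypothetical sketch of signature-worker.js. A full worker would also run
// blur and connected-component detection off the main thread.
function thresholdPixels(data, threshold) {
  const out = new Uint8ClampedArray(data.length);
  for (let i = 0; i < data.length; i += 4) {
    // Binarize on the red channel (the image is already grayscale)
    const v = data[i] < threshold ? 0 : 255;
    out[i] = out[i + 1] = out[i + 2] = v;
    out[i + 3] = data[i + 3]; // keep alpha
  }
  return out;
}

// Message handler, active only inside an actual Worker context
if (typeof self !== 'undefined' && typeof window === 'undefined') {
  self.onmessage = (e) => {
    const { imageData, parameters } = e.data;
    const pixels = thresholdPixels(imageData.data, parameters.threshold);
    self.postMessage({
      processedImageData: { data: pixels, width: imageData.width, height: imageData.height },
      signatures: [] // component detection would populate this
    });
  };
}
```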
Real-World Challenges and Solutions
Challenge: Multiple Signatures
Bank cheques often carry several marks: the account holder's signature, a witness signature, and bank stamps. We needed to identify which one was the primary signature.
Solution: Implemented signature ranking based on size, position, and ink density. Primary signatures are typically larger and in specific cheque areas.
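A sketch of what such a ranking might look like. The weights and the "expected region" below are illustrative assumptions, not the production values:

```javascript
// Illustrative signature ranking: combines relative size, ink density, and
// proximity to where the drawer's signature usually sits (bottom-right of
// the cheque). The weights and target zone are made up for this example.
function rankSignatures(components, chequeWidth, chequeHeight) {
  const scored = components.map(c => {
    const w = c.maxX - c.minX + 1;
    const h = c.maxY - c.minY + 1;
    const sizeScore = (w * h) / (chequeWidth * chequeHeight); // larger is better
    const density = c.area / (w * h);
    const centerX = (c.minX + c.maxX) / 2 / chequeWidth;
    const centerY = (c.minY + c.maxY) / 2 / chequeHeight;
    // Distance from the assumed primary-signature zone (~80% right, ~75% down)
    const positionScore = 1 - Math.hypot(centerX - 0.8, centerY - 0.75);
    const score = 0.4 * positionScore + 0.35 * sizeScore + 0.25 * density;
    return { component: c, score };
  });
  return scored.sort((a, b) => b.score - a.score);
}
```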
Challenge: Background Removal
Cheque backgrounds have complex patterns, logos, and security features that interfere with signature detection.
Solution: Used frequency domain filtering and template matching to identify and remove known background patterns before signature extraction.
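The template-matching half of that solution can be sketched with a simple mean-squared-difference score; the frequency-domain filtering half isn't shown here, and the function below is an illustration rather than the production code:

```javascript
// Mean squared difference between a known background template and a region
// of the grayscale cheque at (offsetX, offsetY); 0 means a perfect match.
// Images are plain row-major grayscale arrays here for simplicity.
function matchScore(image, imgWidth, template, tplWidth, tplHeight, offsetX, offsetY) {
  let sum = 0;
  for (let y = 0; y < tplHeight; y++) {
    for (let x = 0; x < tplWidth; x++) {
      const diff = image[(offsetY + y) * imgWidth + (offsetX + x)] -
                   template[y * tplWidth + x];
      sum += diff * diff;
    }
  }
  return sum / (tplWidth * tplHeight);
}
```

Regions scoring below a tuned threshold would be treated as known background (logo, security pattern) and blanked to white before extraction runs.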
Performance Results
- Processing Speed: ~2-3 seconds per cheque on average hardware
- Accuracy: 94% successful signature extraction rate
- False Positives: Reduced from 15% to 3% with morphological filtering
- Memory Usage: Optimized to handle 10MB+ scanned images efficiently
Integration with Banking Systems
// API integration for signature verification
class SignatureAPI {
constructor(baseURL, apiKey) {
this.baseURL = baseURL;
this.apiKey = apiKey;
}
async extractAndVerify(chequeImage, customerSignatures) {
try {
// Extract signature from cheque
const extractor = new SignatureExtractor('processing-canvas');
await extractor.loadImage(chequeImage);
const signatures = await extractor.extractSignatures();
// Send to verification service
const verificationResults = await Promise.all(
signatures.map(sig => this.verifySignature(sig, customerSignatures))
);
return {
success: true,
signatures: signatures,
verificationResults: verificationResults,
confidence: this.calculateOverallConfidence(verificationResults)
};
} catch (error) {
return {
success: false,
error: error.message
};
}
}
async verifySignature(extractedSignature, storedSignatures) {
const response = await fetch(`${this.baseURL}/verify`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${this.apiKey}`
},
body: JSON.stringify({
signature: extractedSignature,
references: storedSignatures
})
});
return response.json();
}
}
Conclusion
Building a signature extraction system for real banking applications taught me that computer vision in production is 80% handling edge cases and 20% core algorithms. The clean, academic examples work great in controlled environments, but real cheques are messy, varied, and full of surprises.
Key lessons learned:
- Preprocessing is everything - 70% of the work was in image cleanup
- Adaptive algorithms outperform fixed parameters - Real-world data varies too much
- Performance matters - Banks need fast processing for thousands of cheques daily
- Error handling is crucial - Failed extractions should degrade gracefully
The system successfully processed over 50,000 cheques in production, significantly reducing manual verification time and improving accuracy. It proved that browser-based computer vision can be both powerful and practical for enterprise applications.
References
- OpenCV.js Documentation - Computer vision in JavaScript
- Canvas API Reference - HTML5 Canvas documentation
- Digital Image Processing by Gonzalez - Classic computer vision textbook