Project Story

My Experience Making Awesomegrad

Building a micrograd implementation in Rust - because sometimes you need to reinvent the wheel, but in a memory-safe way.


After watching Andrej Karpathy’s micrograd video, I was inspired to build my own automatic differentiation engine. But instead of Python, I decided to implement it in Rust. This turned out to be both challenging and incredibly educational.

Why Rust?

Rust’s memory safety guarantees and zero-cost abstractions make it perfect for building systems that need to be both fast and reliable. Plus, I wanted to learn more about systems programming while building something practical.

Key Benefits of Rust for This Project

  • Memory safety without garbage collection
  • Zero-cost abstractions
  • Excellent performance
  • Strong type system prevents many bugs
  • Great tooling and ecosystem

The Core Architecture

The heart of any automatic differentiation system is the computational graph. Each operation creates a node that knows how to compute its value and how to propagate gradients backward.

Key Components

  1. Value: Represents a scalar value with gradient
  2. Operations: Add, multiply, tanh, etc.
  3. Backward Pass: Chain rule implementation
  4. Neural Network: Layers and training loop

The Value Struct

The core of the system is the Value struct, which represents a scalar value in the computational graph:

use std::cell::RefCell;
use std::rc::Rc;

struct Value {
    data: f64,                         // the scalar value
    grad: f64,                         // accumulated gradient, dL/d(data)
    children: Vec<Rc<RefCell<Value>>>, // input nodes that produced this one
    backward: Option<fn(&mut Value)>,  // local chain-rule step for this op
}
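To make the pieces concrete, here is a minimal, self-contained sketch of how a node like this might be created and how its local backward step could run. The names (`Node`, `leaf`, `add`) are illustrative stand-ins, not the real awesomegrad API:

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Node = Rc<RefCell<Value>>;

struct Value {
    data: f64,
    grad: f64,
    children: Vec<Node>,
    backward: Option<fn(&mut Value)>,
}

// A leaf node has no children and no backward step.
fn leaf(data: f64) -> Node {
    Rc::new(RefCell::new(Value { data, grad: 0.0, children: vec![], backward: None }))
}

// An add node records its inputs and a local chain-rule step.
fn add(a: &Node, b: &Node) -> Node {
    Rc::new(RefCell::new(Value {
        data: a.borrow().data + b.borrow().data,
        grad: 0.0,
        children: vec![Rc::clone(a), Rc::clone(b)],
        // d(a+b)/da = d(a+b)/db = 1, so the output grad passes straight through.
        backward: Some(|v| {
            for child in &v.children {
                child.borrow_mut().grad += v.grad;
            }
        }),
    }))
}

fn main() {
    let (a, b) = (leaf(2.0), leaf(3.0));
    let c = add(&a, &b);
    c.borrow_mut().grad = 1.0;               // seed dL/dc = 1
    let step = c.borrow().backward.unwrap(); // fn pointers are Copy
    step(&mut c.borrow_mut());
    assert_eq!(c.borrow().data, 5.0);
    assert_eq!(a.borrow().grad, 1.0);
    assert_eq!(b.borrow().grad, 1.0);
}
```

A full engine would also need a topological sort over the graph so that each node's backward step runs only after all of its consumers have contributed to its gradient.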

Challenges Faced

Ownership and Borrowing

Rust’s ownership system was the biggest challenge. Managing references between nodes in the computational graph required careful thought about lifetimes and borrowing rules.

Memory Management

Unlike in Python, where garbage collection handles cleanup automatically, I had to think explicitly about ownership and reference lifetimes. Using Rc and RefCell was necessary for the shared graph nodes, but it added complexity.
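A small sketch of why the `Rc<RefCell<...>>` combination shows up at all: several downstream operations can hold the same input node, and each must accumulate into its gradient during the backward pass. (This toy struct is simplified, not the full graph node.)

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Value {
    data: f64,
    grad: f64,
}

fn main() {
    let x = Rc::new(RefCell::new(Value { data: 2.0, grad: 0.0 }));

    // Rc::clone bumps a reference count; no deep copy of the node is made.
    let seen_by_add = Rc::clone(&x);
    let seen_by_mul = Rc::clone(&x);

    // Each "operation" mutates the shared node through RefCell,
    // with the borrow rules checked at runtime instead of compile time.
    seen_by_add.borrow_mut().grad += 1.0; // contribution from one path
    seen_by_mul.borrow_mut().grad += 3.0; // contribution from another path

    assert_eq!(x.borrow().grad, 4.0);     // contributions accumulate
    assert_eq!(Rc::strong_count(&x), 3);  // three owners of the same node
    assert_eq!(x.borrow().data, 2.0);
}
```

The trade-off is that an aliasing mistake becomes a runtime panic (`already borrowed`) rather than a compile error, which is exactly the complexity mentioned above.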

Type System

Rust’s strong type system caught many bugs at compile time, but also required more explicit type annotations and trait implementations.
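One concrete example of leaning on the type system: operator overloading goes through explicit trait implementations such as `std::ops::Add`, so `value + value` only compiles once you have spelled out what it means. A minimal sketch (graph bookkeeping omitted; the struct here is illustrative):

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Value {
    data: f64,
}

// Overloading `+` requires an explicit trait impl; the compiler
// rejects `Value + Value` until this exists.
impl Add for Value {
    type Output = Value;

    fn add(self, rhs: Value) -> Value {
        Value { data: self.data + rhs.data }
    }
}

fn main() {
    let sum = Value { data: 2.0 } + Value { data: 3.0 };
    assert_eq!(sum, Value { data: 5.0 });
}
```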

Key Learnings

Rust Patterns

Learned about Rc, RefCell, traits, and how to structure code to work with Rust’s ownership system.

Automatic Differentiation

Deepened understanding of how gradients flow through computational graphs and the chain rule.
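The chain rule can be sanity-checked numerically. For example, for f(x) = tanh(w·x) the analytic derivative df/dx = (1 − tanh²(w·x))·w should agree with a central finite difference (the function names here are illustrative, not from the project):

```rust
// f(x) = tanh(w * x)
fn f(w: f64, x: f64) -> f64 {
    (w * x).tanh()
}

// Chain rule: df/dx = tanh'(w*x) * d(w*x)/dx = (1 - tanh(w*x)^2) * w
fn df_dx(w: f64, x: f64) -> f64 {
    let t = (w * x).tanh();
    (1.0 - t * t) * w
}

fn main() {
    let (w, x, eps) = (0.7, 1.3, 1e-6);
    // Central difference approximates the derivative to O(eps^2).
    let numeric = (f(w, x + eps) - f(w, x - eps)) / (2.0 * eps);
    assert!((numeric - df_dx(w, x)).abs() < 1e-4);
    println!("analytic {:.6} vs numeric {:.6}", df_dx(w, x), numeric);
}
```

This is essentially what the backward pass does node by node: each operation multiplies the incoming gradient by its local derivative.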

Systems Programming

Gained appreciation for memory management and performance considerations in systems programming.

Testing

Rust’s testing framework is excellent. Unit tests and integration tests helped catch bugs early.

Performance Comparison

While the Rust implementation was more complex to write, it offered significant performance benefits:

Benchmarks (training a small neural network)

  • Python micrograd: ~2.5 seconds
  • Rust awesomegrad: ~0.8 seconds
  • Memory usage: ~3x less in Rust

What I’d Do Differently

If I were to rebuild this project, I would revisit some of the design choices described above, in particular the Rc and RefCell based graph representation, which accounted for much of the complexity.

Conclusion

Building awesomegrad in Rust was a challenging but rewarding experience. It taught me not just about automatic differentiation, but also about systems programming, memory management, and the trade-offs between different programming languages.

The project reinforced my belief that sometimes the best way to learn is to rebuild something from scratch. While the result might not be as polished as existing solutions, the learning experience is invaluable.
