
Binary Neural Networks

WINLAB Summer 2021 Research Internship

Overview

This project is part of the Rutgers WINLAB Summer 2021 Research Internship program, bringing together researchers, graduate students, and undergraduate students. Traditionally, power consumption has been an often-overlooked metric in the training and execution of neural networks, but that paradigm is beginning to shift as the large computing systems used in deep learning continue to grow in scale [1]. It is well established that integer units take up far less physical chip space than floating-point units [2][3], and they consume far less power as a result. Through our research at WINLAB, we investigated the use of integer and binary fixed-point number implementations in two neural networks trained on the MNIST-digits and MNIST-fashion datasets to see what effect their use has on accuracy and training time.

Objectives

energy

Integer processing units use up to 60% less physical chip space on a CPU than floating-point units. Since fixed-point numbers can be built from integers and bit-wise operations, a fixed-point implementation unlocks this potential for greater energy efficiency.

accuracy

Establishing and maintaining comparable accuracy while transitioning to fixed-point numbers is vital before this method can be implemented at scale.

speed

Our fixed-point implementations of neural networks should require the same training time as their floating-point counterparts, allowing for power-saving benefits with next to no drawbacks.
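Since the objectives above lean on the idea that fixed-point numbers are just integers plus bit-shifts, here is a minimal Go sketch of a Q16.16 format (16 integer bits, 16 fractional bits). The type name, the 16/16 split, and the helper functions are illustrative choices for this page, not the exact code used in the project.

package main

import "fmt"

// Fixed is a Q16.16 fixed-point value: 16 integer bits and 16 fractional
// bits, stored in an ordinary 32-bit signed integer.
type Fixed int32

const fracBits = 16
const one Fixed = 1 << fracBits

// FromFloat and ToFloat are only used at the boundaries (loading data, printing).
func FromFloat(f float64) Fixed  { return Fixed(f * float64(one)) }
func (x Fixed) ToFloat() float64 { return float64(x) / float64(one) }

// Add is plain integer addition; Mul widens to 64 bits so the intermediate
// product cannot overflow, then shifts the extra fractional bits back out.
func (x Fixed) Add(y Fixed) Fixed { return x + y }
func (x Fixed) Mul(y Fixed) Fixed { return Fixed((int64(x) * int64(y)) >> fracBits) }

func main() {
    a, b := FromFloat(1.5), FromFloat(-2.25)
    fmt.Println(a.Mul(b).ToFloat()) // prints -3.375
}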

Weekly Timeline

week 1 and week 2

  • Introductions
     

  • Read existing literature on non-floating-point implementations of neural networks
     

  • Familiarized ourselves with Golang
     

  • Gathered accuracy data using previous years' floating-point neural networks
     

  • Continued implementation of fixed-point linear algebra framework
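Our fixed-point linear algebra framework lives in the team's Go code base; the snippet below is only a sketch, under the assumption of a Q16.16 representation stored in int32, of the core operation such a framework needs: a matrix-vector product for a dense layer. MatVec and its accumulation strategy are illustrative, not the project's exact API.

package main

import "fmt"

const fracBits = 16

// MatVec multiplies an m-by-n matrix by a length-n vector, with every entry
// stored as a Q16.16 fixed-point value in an int32. Accumulation is done in
// int64 and the fractional bits are shifted out once per output element,
// which avoids intermediate overflow.
func MatVec(mat [][]int32, vec []int32) []int32 {
    out := make([]int32, len(mat))
    for i, row := range mat {
        var acc int64
        for j, w := range row {
            acc += int64(w) * int64(vec[j])
        }
        out[i] = int32(acc >> fracBits)
    }
    return out
}

func main() {
    one := int32(1 << fracBits) // 1.0 in Q16.16
    mat := [][]int32{{one, 2 * one}, {-one, one / 2}}
    vec := []int32{3 * one, one}  // (3.0, 1.0)
    fmt.Println(MatVec(mat, vec)) // [327680 -163840], i.e. (5.0, -2.5)
}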

week 3

  • Implemented code to track weight maxima and minima for each training iteration
     

  • Began implementing the ReLU activation function in place of Sigmoid to move away from floating-point arithmetic
     

  • Made plots of absolute maxima and minima of the weights in both weight matrices in our neural network to investigate dynamic range
     

w1_range.png
w2_range.png

The figures above show that the dynamic range (the minimums and maximums of the individual weights in the neural network) increases with time, but not rapidly and not by much. The range stays small; roughly [-10, 10] should be plenty for accuracy, which bodes well for finite-precision fixed-point numbers.
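Tracking this range amounts to scanning each weight matrix once per training iteration and logging the extremes. A minimal Go sketch of that bookkeeping (weightRange and the matrix layout are our own illustrative choices):

package main

import "fmt"

// weightRange returns the smallest and largest entry of a weight matrix.
// Logging these two numbers once per training iteration is enough to plot
// the dynamic range over time, as in the figures above.
func weightRange(w [][]float64) (min, max float64) {
    min, max = w[0][0], w[0][0]
    for _, row := range w {
        for _, v := range row {
            if v < min {
                min = v
            }
            if v > max {
                max = v
            }
        }
    }
    return min, max
}

func main() {
    w1 := [][]float64{{0.3, -1.7}, {2.4, 0.1}}
    lo, hi := weightRange(w1)
    fmt.Printf("iteration range: [%.2f, %.2f]\n", lo, hi) // [-1.70, 2.40]
}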

week 4

  • Began periodically checking accuracy of the GONN model during training using a validation set
     

  • Plotted absolute maxima and minima of the weights in the weight matrices for each epoch alongside accuracy to determine whether there was any connection between dynamic range and accuracy
     
  • Found accuracy plateau at relatively small dynamic range, which bodes well for fixed-point implementation
Hidden Range, Output Range, Accuracy vs. Epoch(1).png

Because reasonably sized fixed-point numbers are small, anywhere from 32 to 128 bits depending on the specific implementation, this plateau means that it is possible to maintain model accuracy even with finite precision.
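To make that concrete: if the weights stay inside roughly [-10, 10], only a handful of integer bits are needed, and the rest of a 32-bit word is free for fractional precision. The small Go sketch below does that back-of-the-envelope calculation; bitsForRange and the 32-bit word size are assumptions for illustration.

package main

import (
    "fmt"
    "math"
)

// bitsForRange returns how many integer bits a signed fixed-point format
// needs to represent values in [-maxAbs, maxAbs], and how many bits of a
// word of the given size are then left over for the fraction.
func bitsForRange(maxAbs float64, wordBits int) (intBits, fracBits int) {
    intBits = int(math.Ceil(math.Log2(maxAbs))) + 1 // +1 for the sign bit
    fracBits = wordBits - intBits
    return intBits, fracBits
}

func main() {
    i, f := bitsForRange(10, 32)
    // A [-10, 10] range needs only 5 integer bits, leaving 27 fractional
    // bits (a resolution of about 7.5e-9) in a single 32-bit word.
    fmt.Printf("integer bits: %d, fractional bits: %d\n", i, f)
}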

week 5

  • Added layers to network to test dynamic range under different circumstances
     

  • Created range and accuracy vs. training iteration graphs as we prototyped different fixed-point networks and ReLU functions
     

  • Created GitHub repository to bring team onto one code base
     

  • Encountered an error caused by overflow when using the Sigmoid function with our fixed-point implementation, which resulted in steep drop-offs in accuracy

week 6

Fixed Point Ranges and Accuracy.png

Broken Sigmoid

Fixed Point Ranges and Accuracy Adjusted ReLU 1.png

Nearly Fixed Sigmoid

Fixed Point Ranges and Accuracy ReLU Works.png

Fixed Sigmoid

We fixed the Sigmoid function by adding code to conditionally handle different edge cases, including overflow.
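The exact fix lives in our code base, but the core idea can be sketched: saturate inputs whose true Sigmoid output is already essentially 0 or 1 before doing any arithmetic that could overflow, and approximate the middle region purely in integer math. The Go example below uses a piecewise-linear ("hard") Sigmoid on a Q16.16 value as an illustrative stand-in, not our actual implementation.

package main

import "fmt"

type Fixed int32

const fracBits = 16
const one Fixed = 1 << fracBits

// Sigmoid approximates the logistic function on Q16.16 inputs without any
// floating-point math or exp(), so there is nothing left to overflow:
// x <= -2 -> 0, x >= 2 -> 1, otherwise 0.5 + x/4.
func Sigmoid(x Fixed) Fixed {
    switch {
    case x >= 2*one:
        return one
    case x <= -2*one:
        return 0
    default:
        return one/2 + x/4
    }
}

func main() {
    for _, f := range []float64{-5, -1, 0, 1, 5} {
        x := Fixed(f * float64(one))
        fmt.Printf("sigmoid(%g) ~ %.3f\n", f, float64(Sigmoid(x))/float64(one))
    }
}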

  • Analyzed upper and lower bounds for weight precision by observing the precision needed to maintain relative order of the output scores
     

  • Continued testing of various weight cutoffs
     

  • Continued working on our fixed-point ReLU implementation (see the sketch after this list)
     

  • Began exploring other, more complicated datasets to test our network on, ultimately deciding on MNIST Fashion
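In contrast to Sigmoid, ReLU needs no approximation at all in fixed point, and weight cutoffs can be emulated by masking away low-order fractional bits. The Go sketch below shows both; ReLU, Truncate, and the Q16.16 layout are illustrative assumptions rather than the project's exact functions.

package main

import "fmt"

type Fixed int32

const fracBits = 16
const one Fixed = 1 << fracBits

// ReLU needs no approximation in fixed point: it is a single comparison.
func ReLU(x Fixed) Fixed {
    if x < 0 {
        return 0
    }
    return x
}

// Truncate zeroes the lowest `drop` fractional bits of a weight, one way to
// experiment with coarser weight precision ("cutoffs").
func Truncate(x Fixed, drop uint) Fixed {
    mask := Fixed(-1) << drop // ones everywhere above the dropped bits
    return x & mask
}

func main() {
    w := Fixed(float64(one) * -1.23456)
    fmt.Println(ReLU(w))                                         // 0
    fmt.Printf("%.4f\n", float64(Truncate(-w, 8))/float64(one))  // ~1.2344
}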
     


week 7

  • Completed the fixed-point neural network and ran it on the MNIST Fashion dataset
     

  • Tested the ReLU activation function and found it was not accurate; we must add another layer to the network to fix that, as well as look for potential bugs
     

  • Experimented with truncating bits from floating-point numbers to see how much precision is really needed to maintain the accuracy of our networks
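The truncation experiment can be set up in a few lines of Go by clearing the low bits of the IEEE 754 mantissa and re-running the network with the coarsened weights. truncateMantissa below is an assumed helper written for illustration, not our exact code.

package main

import (
    "fmt"
    "math"
)

// truncateMantissa zeroes the lowest n bits of a float64's 52-bit mantissa,
// simulating a lower-precision number while keeping the float64 type.
func truncateMantissa(f float64, n uint) float64 {
    bits := math.Float64bits(f)
    mask := ^uint64(0) << n // ones everywhere except the n lowest bits
    return math.Float64frombits(bits & mask)
}

func main() {
    w := 0.123456789
    for _, n := range []uint{0, 20, 35, 45} {
        fmt.Printf("%2d bits dropped: %.12f\n", n, truncateMantissa(w, n))
    }
}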

week 8

  • Collected accuracy data for fixed-point models over the MNIST Digit and MNIST Fashion datasets
     

  • Compared fixed-point network's accuracy to floating-point network's accuracy
     

  • Started implementation of a Softmax layer, which will test the extremes of our fixed-point implementation due to the large values returned by the exp() function
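The standard guard against exp() blowing past a finite range is to subtract the maximum score before exponentiating, which leaves the resulting probabilities unchanged; whether that is enough for a fixed-point format is exactly what the Softmax layer will test. Below is a floating-point reference sketch of that trick, written as an assumption about the approach rather than the finished layer.

package main

import (
    "fmt"
    "math"
)

// softmax returns exp(x_i - max(x)) / sum_j exp(x_j - max(x)). Subtracting
// the maximum makes every exponent <= 0, so exp() never returns a huge
// value, which is the property a finite-range fixed-point layer relies on.
func softmax(scores []float64) []float64 {
    max := scores[0]
    for _, s := range scores {
        if s > max {
            max = s
        }
    }
    out := make([]float64, len(scores))
    var sum float64
    for i, s := range scores {
        out[i] = math.Exp(s - max)
        sum += out[i]
    }
    for i := range out {
        out[i] /= sum
    }
    return out
}

func main() {
    fmt.Println(softmax([]float64{100, 101, 103})) // same result as for {0, 1, 3}
}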

Fixed Point Ranges and Accuracy (MNIST Fashion)(1).png

The graph starts after the first 100 data points of training. This once again shows that improvements in accuracy do not necessarily require a larger dynamic range, even for problems more complicated than MNIST Digits.

week 9

  • Continued implementation of Softmax layer
     

  • Tested additional datasets to evaluate effectiveness and observe the different dynamic ranges that different datasets produce

Internship end

The current trajectory of machine learning research has produced a level of abstraction in which machine learning engineers need not concern themselves with low-level hardware details. Our investigation into using fixed-point numbers in neural networks shows promise, and in doing so could bridge the existing gap and bring nuanced hardware implementations into an up-to-date software stack. Even though the dynamic weight range is not stable and continues to grow with time, our preliminary results indicate that it will remain small enough to allow reasonably sized fixed-point numbers without significant loss in accuracy. These results show that fixed-point numbers are viable, with their power-saving and potentially time-saving benefits. This topic is worth further investigation, especially experimenting with different kinds of neural network architectures such as convolutional neural networks and recurrent neural networks.

bits  trunc graph.png
bits trunc chart.png

This graph plots the accuracy of the neural network against the number of bits truncated from the floating-point number. Reading from right to left, the accuracy of the model remains very high until we truncate about 35 bits from the mantissa. This is a very good result, because it means that very high precision is not necessary for accurate neural networks. Once again, this is evidence that a limited-precision fixed-point implementation can work.


Our Team

dan m.png

Daniel Maevsky

Undergraduate Sophomore

Honors Academy student at Rutgers University - New Brunswick School of Engineering

Studying Electrical and Computer Engineering, Computer Science, and Math

sachin.jpg

Sachin Mathew

Undergraduate Senior

Honors Academy student at Rutgers University - New Brunswick School of Engineering

Studying Electrical and Computer Engineering and Computer Science

dan chen.jpg

Daniel Chen

Undergraduate Sophomore

Honors Academy student at Rutgers University - New Brunswick School of Engineering

Studying Electrical and Computer Engineering

tommy.jpg

Thomas Forzani

Undergraduate Sophomore

Honors Academy student at Rutgers University - New Brunswick School of Engineering

Studying Electrical and Computer Engineering and Computer Science

serena.jpg

Serena Zhang

High School Junior

Student at Basis Independent Silicon Valley

Interested in studying Computer Science

prof martin.jpg

Richard Martin

Project Mentor

Associate Professor at Rutgers University

Department of Computer Science

Class years are as of the 2021 internship.
