Analog Computing for Neural Networks: Computation in the Physical Domain

Yatin Taneja
Mar 9

Analog computing uses continuous physical quantities such as voltage and current to execute computations directly within the hardware substrate, an approach that differs fundamentally from the discrete binary logic of contemporary digital processors. This direct execution bypasses the multiple digital abstraction layers inherent in standard processor architectures, allowing physical phenomena to represent mathematical relationships as soon as the circuit settles. Neural network operations map naturally onto these analog phenomena: Ohm’s law performs the multiplications and Kirchhoff’s current law performs the additions that make up the weighted summations required for artificial intelligence tasks. By encoding information in the amplitude of electrical signals rather than as a sequence of binary bits, analog systems achieve a level of computational density and energy efficiency that remains difficult for digital logic gates to replicate. The underlying physics of electricity flowing through circuits performs the necessary arithmetic without clock-driven transistor switching, creating a seamless integration of data storage and data processing. Crossbar arrays enable dense, parallel analog matrix-vector multiplication by encoding weights as conductance values at the intersections of wire grids within the array structure.

A crossbar array consists of a two-dimensional grid of vertical and horizontal wires with programmable resistive elements situated at each intersection point where the rows cross the columns. Input voltages applied to the horizontal rows generate currents that flow through the resistive devices according to their programmed conductance, effectively multiplying the input value by the stored weight. The vertical columns collect these currents from all connected rows, summing them together through Kirchhoff’s current law to yield the result of the vector-matrix multiplication in a single physical step. This architecture allows thousands of multiplication-and-accumulation operations to occur simultaneously as electrons traverse the circuit, providing a massive throughput advantage over serial digital processing. In-memory computing describes an architectural framework where computation occurs within or adjacent to the memory cells themselves, eliminating the necessity for data movement between separate memory and processor units. Traditional von Neumann architectures suffer from significant latency and energy penalties because data must constantly shuttle back and forth between the memory unit and the central processing unit, creating a bandwidth limitation known as the memory wall.
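
The read operation described above can be sketched numerically. This is a minimal, idealized model that ignores wire resistance, noise, and converter effects; the function name `crossbar_mvm` and the example values are illustrative assumptions, not taken from any specific chip.

```python
import numpy as np

def crossbar_mvm(voltages, conductances):
    """Ideal crossbar read: rows carry voltages, columns collect currents.

    voltages:     shape (rows,), volts applied to each horizontal wire
    conductances: shape (rows, cols), siemens stored at each intersection
    returns:      shape (cols,), amperes collected on each vertical wire
    """
    voltages = np.asarray(voltages, dtype=float)
    conductances = np.asarray(conductances, dtype=float)
    # Ohm's law at every intersection: I_ij = V_i * G_ij
    device_currents = voltages[:, None] * conductances
    # Kirchhoff's current law on every column: I_j = sum_i I_ij
    return device_currents.sum(axis=0)

# Hypothetical 3x2 array: three input rows, two output columns.
V = np.array([0.2, 0.1, 0.0])            # volts
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6],
              [5e-6, 6e-6]])             # siemens
I = crossbar_mvm(V, G)                   # amperes, identical to V @ G
```

In hardware the two steps inside the function are not sequential: every device conducts at once and the column wire performs the sum, which is why the whole product appears in one read.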

In-memory computing addresses this inefficiency by storing synaptic weights directly within the computational elements, ensuring that the data required for a calculation is physically present at the location where the calculation takes place. This reduction in data movement decreases the energy and latency limitations built into standard von Neumann architectures, allowing for substantially higher performance per watt for workloads dominated by matrix operations. Analog substrates offer orders of magnitude improvements in energy efficiency for inference tasks compared to digital GPUs or TPUs because they avoid the energy cost of charging and discharging capacitive loads for every single logical operation. Digital systems require thousands of transistors switching states to perform a high-precision multiplication, whereas an analog crossbar performs the same function through a single resistive element drawing current from a supply line. The energy consumption in an analog system is primarily determined by the resistive losses and the capacitance of the wires, both of which scale much more favorably than the dynamic switching power of advanced digital nodes. This efficiency advantage makes analog computing particularly attractive for deployment in power-constrained environments and large-scale data centers where electricity costs constitute a major operational expense.

Analog matrix multiplication exploits the physical relationship where input voltages applied across resistive elements produce output currents proportional to the weighted sum of the inputs and their corresponding conductances. Crossbar arrays implement this function via a grid of programmable resistors that act as variable weights, where the conductance of each resistor determines the strength of the connection between a specific input neuron and a specific output neuron. Each column sums currents from all rows to yield a vector-matrix product in a single step, so the core calculation of a neural network layer completes, to a first approximation, in the time the array takes to settle rather than in a time proportional to the number of multiply-accumulate operations. Within a single array, this physical parallelism means that computation time does not grow linearly with the number of operations; settling time and wire parasitics do increase with array dimensions, but far more slowly than the operation count, giving a nearly fixed latency that digital systems struggle to match. Weight storage occurs directly in non-volatile memory devices like phase-change memory or resistive RAM, which retain their programmed states even when power is removed from the system. This co-location enables computation and storage to exist within the same physical device, removing the distinction between the memory register and the arithmetic logic unit found in digital processors.
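
Because a conductance cannot be negative, signed weights need some encoding before they can be stored. The sketch below assumes one common option, a differential pair of devices per weight, which the text above does not specify; the conductance range `G_MIN`/`G_MAX` and both function names are hypothetical.

```python
import numpy as np

G_MIN, G_MAX = 1e-6, 25e-6   # assumed programmable conductance range, siemens

def weights_to_conductance_pair(W):
    """Map signed weights onto two non-negative conductance arrays."""
    W = np.asarray(W, dtype=float)
    w_max = np.max(np.abs(W))
    # Siemens per unit weight, chosen so the largest |w| reaches G_MAX.
    scale = (G_MAX - G_MIN) / w_max if w_max > 0 else 1.0
    g_pos = G_MIN + scale * np.clip(W, 0.0, None)    # carries positive parts
    g_neg = G_MIN + scale * np.clip(-W, 0.0, None)   # carries negative parts
    return g_pos, g_neg, scale

def differential_read(v, g_pos, g_neg, scale):
    """Subtract the two column currents and rescale to weight units."""
    return (v @ g_pos - v @ g_neg) / scale

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))
x = rng.uniform(0.0, 0.2, size=8)        # row voltages, volts
g_pos, g_neg, scale = weights_to_conductance_pair(W)
y = differential_read(x, g_pos, g_neg, scale)   # matches x @ W in the ideal case
```

The common offset `G_MIN` cancels in the subtraction, so only the difference between the two devices carries the weight.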

Phase-change memory stores data via reversible amorphous-crystalline phase transitions in chalcogenide glass materials, offering tunable conductance by adjusting the ratio between the disordered amorphous state and the ordered crystalline state within the active volume of the cell. Resistive RAM functions as a two-terminal device whose resistance switches between high and low states based on the formation or dissolution of a conductive filament caused by the application of a specific voltage stress. Conductance is the measurable electrical property used to store synaptic weights in analog neural networks, serving as the physical analog to the numerical weights in software models. The precision of these weights depends on the ability to reliably program and maintain specific conductance levels within the memory device, a task complicated by material imperfections and stochastic switching behavior. Conductance drift refers to the time-dependent degradation or fluctuation of stored conductance values that occurs after programming, particularly in phase-change materials where structural relaxation causes the resistance to change slowly over time. This drift poses a key reliability challenge requiring calibration or adaptive training schemes to ensure that the accuracy of the neural network does not degrade significantly during operation.
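
To give a feel for the size of the drift problem, the following sketch assumes the power-law drift model commonly reported for phase-change devices, G(t) = G0 * (t / t0)^(-nu), and a single global gain estimated from reference cells. The drift exponent of about 0.05, its spread, the array size, and all variable names are illustrative assumptions rather than measured values.

```python
import numpy as np

def drift(g0, t, nu, t0=1.0):
    """Assumed power-law drift: G(t) = G0 * (t / t0) ** (-nu), for t >= t0."""
    return g0 * (t / t0) ** (-nu)

rng = np.random.default_rng(0)
rows, cols, n_ref = 64, 16, 32

# Hypothetical array: programmed conductances and per-device drift exponents.
g0 = rng.uniform(1e-6, 25e-6, size=(rows, cols))
nu = rng.normal(0.05, 0.005, size=(rows, cols))

# Reference cells programmed to a known value, drifting with the same statistics.
ref_g0 = np.full(n_ref, 10e-6)
ref_nu = rng.normal(0.05, 0.005, size=n_ref)

t = 1e4                                   # seconds since programming
g_t = drift(g0, t, nu)
ref_t = drift(ref_g0, t, ref_nu)

# One global output gain, estimated by re-reading the reference cells.
gain = ref_g0.sum() / ref_t.sum()

v = rng.uniform(0.0, 0.2, size=rows)      # row voltages
i_ideal = v @ g0                          # column currents right after programming
i_raw = v @ g_t                           # column currents after drift
i_comp = gain * i_raw                     # drifted currents with gain correction

def rel_err(a, b):
    return np.linalg.norm(a - b) / np.linalg.norm(b)

err_raw = rel_err(i_raw, i_ideal)
err_comp = rel_err(i_comp, i_ideal)
```

A single gain removes the average drift but not the device-to-device spread in the exponent, which is one reason finer-grained calibration or drift-aware training is also studied.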

Mitigation strategies for conductance drift include periodic refresh cycles where weights are read and reprogrammed to their target values and algorithmic compensation during inference where the software model adjusts its expectations based on measured drift characteristics. Advanced control circuits monitor the state of reference cells within the array to track environmental variations and aging effects, allowing the system to dynamically adjust input voltages or output scaling factors to maintain computational fidelity. These techniques add complexity to the peripheral circuitry yet remain necessary to achieve the required precision for complex inference tasks. By implementing closed-loop feedback mechanisms, analog compute systems can maintain stable operation over extended periods despite the intrinsic instability of the underlying memory materials. Key components of an analog accelerator include programmable resistive elements arranged in crossbar arrays, input digital-to-analog converters that translate digital data into voltage signals, output analog-to-digital converters that convert summed currents back into digital values, and peripheral control logic that manages the data flow and programming operations. The system workflow converts digital inputs into analog voltages and applies them to the rows of the crossbar array, allowing the physics of the array to perform the matrix multiplication.
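
The digital-to-analog-to-digital workflow can be sketched as follows. This is a simplified model under stated assumptions: uniform quantizers stand in for the converters, activations are normalized to [-1, 1], and the ADC full scale is set to the worst-case column current. The names `quantize` and `analog_layer` and the 8-bit defaults are hypothetical.

```python
import numpy as np

def quantize(x, bits, full_scale):
    """Uniform quantizer clipped to [-full_scale, +full_scale]."""
    levels = 2 ** (bits - 1) - 1
    step = full_scale / levels
    return np.clip(np.round(x / step), -levels, levels) * step

def analog_layer(x, G, dac_bits=8, adc_bits=8, v_max=0.2):
    """Digital in, digital out; the matrix product happens in between."""
    # 1. DAC: activations in [-1, 1] become row voltages in [-v_max, v_max].
    v = quantize(x, dac_bits, 1.0) * v_max
    # 2. Crossbar: column currents are the analog matrix-vector product.
    i = v @ G
    # 3. ADC: full scale assumed equal to the largest possible column current.
    i_max = v_max * np.abs(G).sum(axis=0).max()
    # Divide by v_max to express the result in the units of x @ G.
    return quantize(i, adc_bits, i_max) / v_max

rng = np.random.default_rng(1)
G = rng.uniform(1e-6, 25e-6, size=(32, 8))   # siemens
x = rng.uniform(-1.0, 1.0, size=32)          # normalized activations
y = analog_layer(x, G)
max_err = np.max(np.abs(y - x @ G))
```

Sizing the ADC for the worst-case current keeps the sketch free of clipping but wastes resolution on typical inputs; real designs usually pick a tighter range and accept occasional saturation.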

Resulting currents undergo summation along the columns and conversion back to digital outputs by precision analog-to-digital converters situated at the edge of the array. This hybrid approach combines the efficiency of analog computation with the convenience of digital interfaces for data storage and communication with host systems. Training can occur off-chip in high-precision digital systems, with weights programmed into the analog array subsequently, a method known as ex-situ training that isolates the analog hardware from the complexities of the backpropagation algorithm. Off-chip training allows researchers to use mature software frameworks and powerful GPUs to optimize the network parameters before mapping them onto the constrained analog hardware. On-chip training involves incremental conductance updates using specialized pulse schemes that modify the resistance of the devices gradually, based on the error signals computed during the backward pass. This approach requires precise control over the voltage pulses applied to each cell to ensure symmetric updates for weight increases and decreases, a challenge that has limited the widespread adoption of fully analog training.
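
The symmetry problem in on-chip updates can be illustrated with a toy device model. The sketch assumes a "soft bounds" response, in which each pulse moves the conductance by a fraction of the remaining distance to the relevant limit; this is a simplification chosen for illustration, and the step sizes and the name `apply_pulses` are hypothetical.

```python
G_MIN, G_MAX = 0.0, 1.0   # normalized conductance range

def apply_pulses(g, n_pulses, step_up=0.05, step_down=0.05):
    """Apply |n_pulses| identical pulses; positive potentiates, negative depresses."""
    for _ in range(abs(n_pulses)):
        if n_pulses > 0:
            # Potentiation slows down as the device approaches G_MAX.
            g = g + step_up * (G_MAX - g)
        else:
            # Depression slows down as the device approaches G_MIN.
            g = g - step_down * (g - G_MIN)
    return g

g_start = 0.8
g_up = apply_pulses(g_start, +1)      # 0.8 + 0.05 * 0.2
g_after = apply_pulses(g_up, -1)      # one pulse up, then one pulse down
net_change = g_after - g_start        # nonzero: the two pulses do not cancel
```

An ideal weight update would leave `net_change` at zero. Here one "up" and one "down" pulse leave the device below where it started, so gradient noise alone pulls weights toward the middle of the range unless the pulse scheme or the training algorithm compensates.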

Scalability is limited by device variability, noise, converter precision, and parasitic resistances in large arrays, all of which introduce errors into the computation. Device-level variability in PCM and RRAM leads to inconsistent weight programming across different cells on the same chip, necessitating redundancy or calibration overhead to ensure uniform performance. Error resilience varies by application: inference tolerates more noise than training, allowing analog systems to operate effectively on recognition tasks even if the effective precision of the calculations is relatively low. This tolerance relaxes the precision requirements for inference workloads, permitting designers to use smaller converters and lower-precision memory cells to save area and power. Early analog computers used operational amplifiers and passive components to solve differential equations, providing high-speed solutions for scientific problems before digital computers became dominant. Digital systems displaced these early machines due to their flexibility in handling diverse logic operations and their precision in representing arbitrary numbers without noise accumulation.

A resurgence began in the 2010s, driven by AI workload demands and the end of Dennard scaling, which had historically provided consistent improvements in digital power efficiency. This shift prompted exploration of non-von Neumann architectures capable of performing matrix operations with greater energy efficiency than standard CMOS logic. IBM’s 2018 demonstration of analog AI using PCM crossbars marked a crucial shift from theoretical proposals to functional prototypes, showing that multi-layer neural networks could run on analog hardware with accuracy comparable to digital systems. This project used phase-change memory devices to store weights and performed matrix multiplication entirely in the analog domain before digitizing the results for subsequent layers. Adoption of RRAM and other emerging memories accelerated after 2020 due to their compatibility with standard CMOS fabrication processes, which lowers the barrier to integration with existing semiconductor manufacturing lines. These developments signaled a transition from academic curiosity to serious industrial investment in analog computing technologies.

Device-level variability in PCM and RRAM leads to inconsistent weight programming, meaning that setting a specific resistance value often results in a distribution of values around the target due to material randomness and process variations. This inconsistency necessitates redundancy or calibration overhead to correct for errors after programming, reducing the effective density of the array. Analog-to-digital and digital-to-analog conversion remains a bottleneck because high-precision converters consume significant power and area, which limits array size and speed. The energy cost of converting signals between the digital and analog domains can negate the efficiency gains of the analog computation itself if the array is too small or the precision requirements are too high. Thermal noise and signal crosstalk degrade signal integrity in large-scale arrays, introducing random fluctuations in the output currents that can obscure the actual results of the computation. These issues constrain practical problem sizes for current analog chips because maintaining an adequate signal-to-noise ratio becomes increasingly difficult as the array dimensions grow.
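
One way to picture the calibration overhead mentioned above is an iterative write-and-verify loop, a widely used programming strategy that the text does not name explicitly. The sketch assumes that every write attempt lands on the target with an independent multiplicative Gaussian error, which is a deliberate simplification; the error magnitude, tolerance, and the name `program_with_verify` are hypothetical.

```python
import numpy as np

def program_with_verify(target, rng, write_sigma=0.10, tol=0.05, max_iters=50):
    """Rewrite out-of-tolerance cells until all verify or the budget runs out."""
    g = np.zeros_like(target)
    pending = np.ones(target.shape, dtype=bool)
    iterations = 0
    while pending.any() and iterations < max_iters:
        # Write: each attempt lands near the target with random relative error.
        noise = rng.standard_normal(int(pending.sum()))
        g[pending] = target[pending] * (1.0 + write_sigma * noise)
        # Verify: read back and keep only cells still outside the tolerance.
        pending = np.abs(g - target) > tol * target
        iterations += 1
    return g, iterations

rng = np.random.default_rng(2)
target = rng.uniform(1e-6, 25e-6, size=(32, 32))   # siemens
g, iterations = program_with_verify(target, rng)
worst_rel_err = np.max(np.abs(g - target) / target)
```

The loop trades programming time and energy for accuracy: a tighter `tol` raises the number of passes, which is the overhead that reduces effective throughput and density in practice.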

Economic viability hinges on integration with existing semiconductor fabs, as building new dedicated facilities for novel materials is prohibitively expensive for most companies. New materials such as chalcogenides for PCM complicate standard process flows because they require deposition and etching steps that are not typically available in logic-optimized foundries. Digital systolic arrays offer high precision and programmability by using a grid of processing elements that pass data rhythmically to their neighbors, minimizing data access costs. These digital systems still suffer from memory-wall limitations and high energy per operation compared to analog approaches because they rely on charging capacitive nodes for every arithmetic operation. Optical neural networks provide low-latency linear operations using light interference patterns, yet they face challenges in implementing nonlinear activations and in integration density because the wavelength of light is large compared to transistor features. Stochastic computing uses probabilistic bit streams for approximate arithmetic with very simple hardware, but this method lacks compatibility with standard neural network training frameworks, which expect precise numerical values.

The industry has largely passed over these alternatives for large-scale analog neural network deployment due to insufficient efficiency gains or poor scalability relative to the potential of resistive memory-based analog computing. While optical systems offer speed and stochastic computing offers simplicity, neither matches the combination of density, energy efficiency, and CMOS compatibility offered by electronic crossbar arrays using non-volatile memory. Exponential growth in model size demands improvements in compute efficiency beyond what digital scaling can deliver, forcing engineers to revisit analog techniques that were previously discarded in favor of digital logic. Energy costs dominate AI infrastructure total cost of ownership, making efficiency a primary driver for hardware architecture decisions in modern hyperscale data centers. Analog approaches reduce joules per inference by avoiding data movement and using low-energy physics for multiplication, directly addressing the rising energy demands of large language models and generative AI. Edge AI applications require ultra-low-power inference, where analog in-memory computing offers advantages by enabling complex processing within tight thermal budgets.

Mythic AI ships analog compute-in-memory chips for edge vision applications, using flash memory technology to perform matrix operations with minimal power consumption. Tesla focuses on digital architectures for high-throughput training rather than fully analog solutions because training requires higher precision than current analog devices can reliably provide. IBM and Samsung have demonstrated lab-scale PCM-based analog accelerators that show promising results for specific workloads like image classification and speech recognition. Commercial deployments remain focused on fixed-function inference such as keyword spotting or image classification, where the network parameters do not change frequently after deployment. PCM relies on germanium-antimony-tellurium alloys, which are niche materials with limited global supply chains, posing potential risks for mass-production scalability compared to the silicon dioxide and metals used in conventional chips. RRAM uses transition metal oxides, which are more compatible with standard CMOS processes, making them a more attractive option for large-scale integration with logic circuits.

Geopolitical control over semiconductor equipment affects manufacturing access because advanced lithography tools are concentrated in specific regions, restricting the ability of some companies to produce advanced analog chips. Wafer-scale integration favors regions with advanced foundry capabilities such as Taiwan and South Korea, where companies like TSMC and Samsung have established infrastructure for complex process integration. IBM leads in PCM-based analog AI research with a strong intellectual property portfolio covering device structures, array architectures, and compensation algorithms. Samsung and TSMC explore RRAM and embedded non-volatile memory for analog compute integration, aiming to build these capabilities directly into their advanced logic nodes. Startups target edge inference with proprietary analog architectures optimized for low power and small form factors, often focusing on specific market segments like surveillance cameras or industrial sensors. NVIDIA and AMD remain focused on digital accelerators while investing in hybrid analog-digital co-design research to explore potential future integration points without disrupting their current dominant business models.

Software stacks must adapt to support weight programming and drift compensation, requiring new layers of abstraction that hide hardware complexity from application developers. Compilers need to map neural networks to physical crossbar constraints such as tile size and converter resolution, breaking large models into chunks that fit onto the available hardware resources. Infrastructure for testing and characterization of analog compute chips lags behind digital standards because traditional automated test equipment is designed for binary pass/fail criteria rather than for measuring analog performance distributions. Traditional metrics such as FLOPS are insufficient for evaluating analog performance because they do not account for the precision limitations or the parallelism inherent in the physical computation. New key performance indicators include TOPS/W and effective bits of precision, which provide a more accurate picture of the actual utility delivered by the hardware for AI workloads. Benchmark suites must include noise and variability effects to reflect real-world analog performance rather than idealized simulations that ignore physical imperfections.
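
The compiler's tiling task can be sketched in a few lines. The example assumes fixed 128x128 physical tiles and digital accumulation of partial sums between tiles; both choices, along with the name `tiled_mvm`, are illustrative assumptions, and each tile is computed ideally here rather than with a device model.

```python
import numpy as np

def tiled_mvm(x, W, tile_rows=128, tile_cols=128):
    """Split a large weight matrix into crossbar-sized tiles and combine results."""
    n_in, n_out = W.shape
    y = np.zeros(n_out)
    n_tiles = 0
    for r in range(0, n_in, tile_rows):
        for c in range(0, n_out, tile_cols):
            # Each slice would be programmed onto one physical crossbar.
            tile = W[r:r + tile_rows, c:c + tile_cols]
            # Partial sums from tiles sharing output columns are added digitally.
            y[c:c + tile_cols] += x[r:r + tile_rows] @ tile
            n_tiles += 1
    return y, n_tiles

rng = np.random.default_rng(3)
W = rng.standard_normal((300, 200))   # a layer larger than one tile
x = rng.standard_normal(300)
y, n_tiles = tiled_mvm(x, W)          # 3 row blocks x 2 column blocks
```

Every row block adds another converter pass and a digital addition per output, so tile size directly shapes how much of the analog efficiency survives at the system level.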

Energy-per-inference becomes the primary metric for edge and data center sustainability reporting as organizations seek to reduce the environmental impact of their AI operations. Integration of analog compute with photonic interconnects will reduce data movement losses between chips by using light to transmit signals between separate analog tiles or different packages. Self-calibrating arrays will use embedded reference cells and real-time feedback loops to continuously adjust operating parameters, compensating for temperature changes and aging effects. Hybrid digital-analog training algorithms will account for device non-idealities during backpropagation by incorporating noise models into the software optimization process. 3D stacking of resistive memory layers will increase density without increasing planar footprint by building vertical structures where logic tiers sit beneath multiple layers of memory arrays. The field will depend on solving system-level challenges such as converter efficiency and software compatibility before widespread adoption can occur in general-purpose computing markets.

Success will rely on embracing controlled imperfection through co-design rather than mimicking digital precision, accepting that analog hardware will introduce noise that algorithms must be robust enough to tolerate. This philosophy represents a pivot away from the deterministic perfection demanded by digital computing towards a probabilistic approach reminiscent of biological neural systems. Superintelligence systems will require massive, energy-efficient inference capabilities to evaluate vast hypothesis spaces in real time, exceeding the capacity of current digital hardware to provide economically viable solutions. Analog substrates will enable continuous, low-latency reasoning loops unattainable with digital batch processing because they can process streams of sensory data without the overhead of clock cycles or memory fetches. Physical-domain computation will allow superintelligent agents to interact directly with sensorimotor environments without digital mediation, effectively blurring the line between the computer and the physical world it inhabits. By interfacing directly with sensors and actuators through analog signals, these systems eliminate the latency and quantization errors introduced by analog-to-digital conversion at the periphery.

Superintelligence will exploit analog noise and drift as computational resources for stochastic sampling rather than treating them as defects to be eliminated. Random fluctuations in device conductance can be captured to generate true random numbers or to perform Monte Carlo simulations natively in hardware, accelerating probabilistic reasoning tasks required for decision-making under uncertainty. Crossbar arrays will serve as native substrates for embodied cognition where perception and action are physically unified through the continuous flow of electrical signals representing environmental stimuli and motor commands. This tight coupling between perception and action mirrors biological nervous systems where reflex arcs operate with minimal synaptic delay. Analog neuromorphic systems will form the sensory and reflex layers of superintelligent architectures, handling high-bandwidth raw data streams from cameras and microphones with extreme power efficiency. Digital systems will handle symbolic reasoning while analog systems manage real-time interaction, creating a heterogeneous architecture that applies the strengths of both approaches.

This division of labor allows the digital components to focus on abstract planning and logic manipulation while the analog components process the massive influx of low-level sensory information required to ground the intelligence in physical reality. The integration of these distinct modalities will define the next generation of artificial intelligence hardware.