Adapted from the webinar How to Debug PCI Express Power Management and Dynamic Link Behaviors by Patrick Connally and Gordon Getty

Introduction

Power management is a key consideration for PCI Express® (PCIe®). Consequently, PCIe specifies the L1 low-power state. When a link is in L1, no data transfer takes place in either direction, so a PCIe device in L1 consumes less power than it does in the active L0 state.

L1 substates (designated L1.1 and L1.2, with the original L1 renamed L1.0) offer even deeper power savings than L1, an especially important feature for laptops, tablets and other battery-powered devices. Device designers must measure power consumption during low-power states to evaluate the tradeoffs and optimize performance.

These measurements are difficult to make with either a protocol analyzer or an oscilloscope alone. Protocol analyzers can trigger on sequences of events and take very long captures, but they cannot capture analog events. Oscilloscopes can capture analog events, but their acquisitions are very short and difficult to correlate to protocol events. Fortunately, the two instruments complement each other well when examining events, such as L1 substate transitions, that are triggered by higher-layer processes but affect the physical layer. The Teledyne LeCroy CrossSync™ PHY for PCIe software framework synchronizes triggering, acquisition and analysis on the two instruments to provide total link visibility.

Examples in this application note demonstrate how to perform power-consumption measurements for L1 substates based on logic states in the data-link layer or the physical layer’s logical subblock and the corresponding waveforms at the physical layer’s electrical subblock.

Overview of L1 Substates

A device enters the L1 state through one of two mechanisms: Active State Power Management (ASPM) or PCI Power Management (PCI-PM). A device indicates its support for L1 substates and entry mechanisms in its configuration space, and it uses the clock-request signal (CLKREQ#, asserted when low) for entry into and exit from an L1 substate.

The data-link layer in the PCIe protocol stack handles link-management tasks such as the initialization of flow-control credits, the update of flow-control credits while the link is active in the L0 state, and the acknowledgement (ACK) and negative-acknowledgement (NAK) mechanisms that ensure packet integrity across the link. The data-link layer also manages requests to enter L1 and its substates for low-power operation.

Figure 1. Configuration space indicating support for L1 substates and entry mechanisms.
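
Software can read the same support information that the configuration-space view shows. The sketch below decodes the low five bits of the L1 PM Substates Capabilities register, using the bit layout defined in the PCIe Base Specification. The function name and the example register values are hypothetical, and reading the register from a real device is outside the scope of this sketch.

```python
# Bit positions in the L1 PM Substates Capabilities register
# (offset 0x04 within the L1 PM Substates Extended Capability).
L1SS_CAP_BITS = {
    "PCI-PM L1.2": 0,
    "PCI-PM L1.1": 1,
    "ASPM L1.2": 2,
    "ASPM L1.1": 3,
    "L1 PM Substates": 4,
}


def decode_l1ss_capabilities(reg_value: int) -> dict:
    """Return a mapping of feature name -> True if the support bit is set.

    Hypothetical helper for illustration; `reg_value` is the 32-bit
    register value obtained by whatever means the platform provides.
    """
    return {
        name: bool((reg_value >> bit) & 1)
        for name, bit in L1SS_CAP_BITS.items()
    }


# Example register values (made up for illustration, not read from a device).
all_supported = decode_l1ss_capabilities(0x1F)
l1_2_only = decode_l1ss_capabilities(0x15)

for name, supported in l1_2_only.items():
    print(f"{name:16s} {'supported' if supported else 'not supported'}")
```

A device that reports L1.2 support through either mechanism is a candidate for the power measurements described in the rest of this note.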

Measuring L1 Substate Power

To make L1 substate power measurements, use a PCIe protocol analyzer such as the Teledyne LeCroy Summit T54 along with an oscilloscope such as the Teledyne LeCroy LabMaster 10Zi-A. In addition, an interposer monitors communications with the device under test and provides data to both the protocol analyzer and the oscilloscope.

Figure 2. Test setup for L1 substate measurements using CrossSync PHY for PCIe software.

We want to measure how much power a device consumes in each state: during L0 (before going into the low-power state), during the low-power substate, and then in L0 when it comes back out of the low-power substate. We examine these three sections on a timing diagram and measure the power the device consumes in each, which requires a different probing setup from the one used for timing measurements.

Figure 3. Power measurements during L0, during the low-power L1 substate and on the return to L0.

Setting Up Trigger and Acquisition

The oscilloscope should be configured to acquire four signals: high-speed PCIe upstream data on C2 and downstream data on C3, and rail voltage and current on C1 and C4. It can be configured with a fairly long timebase at a reduced sample rate (we used 10 GS/s), since we care primarily about the power signals on C1 and C4.
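
As a rough check on acquisition memory, the sketch below multiplies sample rate by capture duration. The 10 GS/s rate comes from the setup above. The 20 ms capture window is a hypothetical figure chosen to bracket an L1.2 residency of roughly 16 ms with some L0 activity on either side, and the function name is illustrative only.

```python
def samples_required(sample_rate_hz: float, capture_s: float) -> int:
    """Number of sample points per channel for one acquisition."""
    return round(sample_rate_hz * capture_s)


SAMPLE_RATE_HZ = 10e9  # 10 GS/s, as used in this note
CAPTURE_S = 20e-3      # assumption: 20 ms window, not a value from the note

points = samples_required(SAMPLE_RATE_HZ, CAPTURE_S)
print(f"{points / 1e6:.0f} MS per channel")
```

The same arithmetic shows why a reduced sample rate helps here: halving the rate halves the memory needed for the same capture window.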

Configure the protocol analyzer to trigger the oscilloscope at the beginning of the low-power state, when CLKREQ# is deasserted.

Figure 4. Trigger setup for L1 substate power measurements.

Probing and Setting Up Math Traces

The probing setup for power measurements differs slightly from that for timing measurements. Using high-bandwidth differential probes, we probe the lane 0 upstream data and lane 0 downstream data (C2 and C3), so we can see the activity on at least one of the high-speed lanes. We also probe rail voltage by means of the CrossSync PHY interposer, which provides access to the device’s 3.3 V power-supply rail (we used the Teledyne LeCroy RP4030 Active Voltage Rail Probe on C1). A series shunt enables the current measurement (C4), allowing us to look at rail voltage and current in real time.

Note that although the device can be probed at the interposer, it is not powered from the interposer; it still draws its power from the host, which lets us perform dynamic power analysis.

We set up an oscilloscope math function trace (F2) to calculate power consumption as measured rail voltage multiplied by measured rail current (C1*C4).
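
The same calculation can be reproduced offline on exported waveform data. The sketch below is a minimal illustration of the point-by-point product and its average, not the oscilloscope's implementation. The function names are hypothetical, and the sample values are made up for illustration rather than taken from a measurement.

```python
def instantaneous_power(voltage_v, current_a):
    """Point-by-point product of rail voltage (C1) and rail current (C4)."""
    if len(voltage_v) != len(current_a):
        raise ValueError("voltage and current traces must have equal length")
    return [v * i for v, i in zip(voltage_v, current_a)]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Made-up samples: two points with the link active, two with it idle.
c1_volts = [3.3, 3.3, 3.3, 3.3]
c4_amps = [0.5, 0.5, 0.05, 0.05]

f2_watts = instantaneous_power(c1_volts, c4_amps)
active_avg_w = mean(f2_watts[:2])
idle_avg_w = mean(f2_watts[2:])

print(f"active: {active_avg_w:.3f} W, idle: {idle_avg_w:.3f} W")
```

Averaging the product trace over a selected region, rather than multiplying the average voltage by the average current, keeps the result correct when the rail voltage droops as the current changes.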

Figure 5. CrossSync PHY interposer providing access to the device’s 3.3-V supply rail (shown in enlarged view top right).

Measuring Power Consumption

Our first power measurement takes place with the link in L0, before the device has entered L1. The math function multiplies C1 by C4 to calculate power consumption in the L0 state prior to CLKREQ# deassertion. In our example, the value is 2.144 W. On entering L1.2 for a period of about 16 ms, the upstream and downstream signals become electrically idle, and power consumption drops to less than 200 mW, a reduction of more than a factor of 10.

Figure 6. Math function F2 multiplies C1 times C4 to calculate power consumption in the L0 state prior to CLKREQ# deassertion.
Figure 7. Average power consumption reduced to 195 mW during the approximately 16 ms that the link remains in L1.2.

As the link exits L1.2, the downstream port can become active while the upstream port remains idle, resulting in about half a watt of power consumption. When the link is fully back in the active state with both upstream and downstream ports active, power consumption rises to 2.352 W.

Figure 8. Average power consumption of 520 mW with the downstream port active and the upstream port idle.
Figure 9. Average power consumption of 2.352 W with the link back in the active state.
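
The four averages reported in this section can be compared with a few lines of arithmetic. The sketch below uses the measured values from this note and treats the approximately 16 ms L1.2 residency as exactly 16 ms, which is a simplifying assumption. The variable names are illustrative.

```python
# Average power in each phase, in watts, as reported in this note.
POWER_W = {
    "L0 before entry": 2.144,
    "L1.2": 0.195,
    "exit (downstream active, upstream idle)": 0.520,
    "L0 after exit": 2.352,
}

L1_2_DURATION_S = 16e-3  # assumption: "about 16 ms" treated as exactly 16 ms

# Ratio of L0 power to L1.2 power.
reduction_factor = POWER_W["L0 before entry"] / POWER_W["L1.2"]

# Energy over the L1.2 residency, compared with staying in L0 for the
# same length of time. Results are in millijoules.
energy_l1_2_mj = POWER_W["L1.2"] * L1_2_DURATION_S * 1e3
energy_l0_mj = POWER_W["L0 before entry"] * L1_2_DURATION_S * 1e3
energy_saved_mj = energy_l0_mj - energy_l1_2_mj

print(f"reduction factor: {reduction_factor:.1f}x")
print(f"energy in L1.2:   {energy_l1_2_mj:.2f} mJ")
print(f"energy saved:     {energy_saved_mj:.2f} mJ")
```

Under these assumptions, the L1.2 residency uses roughly one eleventh of the energy the link would have used by remaining in L0 for the same interval.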

Conclusion

The L1 substates can provide considerable power savings on PCIe links. The combination of the oscilloscope, protocol analyzer, interposer and CrossSync PHY for PCIe software provides an effective way to navigate to any point in link operation, conduct power-consumption measurements there, and investigate conditions such as voltage rail drop and other power-integrity issues.

More information on the Teledyne LeCroy CrossSync PHY software can be found on our website.