Wrong information from link_status tool

Hello,

We did a couple of tests with loopback, and afterwards the link-status tool seems to be confused, or the tests had some side effects.

Below is the output of the report.py tool:

[talt@cn141 cru]$ python cru-sw/COMMON/report.py -i 04:00.0
=========================================================================================
>>> Firmware short hash=0xe311a62d
>>> Firmware is dirty status= True
>>> Firmware build date and time=20181123  160316
>>> Altera chip ID=0x00540186-0x2855fb0c
=========================================================================================
TTC clock is selected
------------------------------
               GBT      GBT  Internal	           	  Datapath	  Enabled in
 Link ID   TX mode  RX mode  loopback	    GBT mux	      mode	    datapath
-----------------------------------------------------------------------------------------
Link  0 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  1 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  2 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  3 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  4 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  5 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  6 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  7 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link  8 :      GBT      GBT       YES	TTC:PATTERN	continuous	    Disabled
Link  9 :      GBT      GBT       YES	TTC:PATTERN	continuous	    Disabled
Link 10 :      GBT      GBT        NO	TTC:PATTERN	continuous	    Disabled
Link 11 :      GBT      GBT        NO	TTC:PATTERN	continuous	    Disabled
Link 12 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link 13 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link 14 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link 15 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link 16 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link 17 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link 18 :      GBT       WB        NO	TTC:PATTERN	continuous	     Enabled
Link 19 :      GBT      GBT       YES	TTC:PATTERN	continuous	    Disabled
Link 20 :      GBT      GBT       YES	TTC:PATTERN	continuous	    Disabled
Link 21 :      GBT      GBT       YES	TTC:PATTERN	continuous	    Disabled
Link 22 :      GBT      GBT        NO	TTC:PATTERN	continuous	    Disabled
Link 23 :      GBT      GBT        NO	TTC:PATTERN	continuous	    Disabled
------------------------------
24 link(s) found in total
------------------------------

The linkstat.py tool gives the following output:

[talt@cn141 cru]$ python cru-sw/COMMON/linkstat.py -i04:00.0
globalgen: | ttc240freq 240.47 MHz | lcl240freq 240.47 MHz | ref240freq 240.47 MHz | clk not ok cnt     0 | 
Wrapper 0: | ref freq0 240.47 MHz  | ref freq1 0.00 MHz  | ref freq2 0.00 MHz  | ref freq3 0.00 MHz  | 
24 link(s) found in total
 Link  0 : UP
 Link  1 : UP
 Link  2 : UP
 Link  3 : UP
 Link  4 : UP
 Link  5 : UP
 Link  6 : UP
 Link  7 : UP
 Link  8 : UP
 Link  9 : UP
 Link  10 : DOWN
 Link  11 : DOWN
 Link  12 : UP
 Link  13 : UP
 Link  14 : UP
 Link  15 : UP
 Link  16 : UP
 Link  17 : UP
 Link  18 : UP
 Link  19 : UP
 Link  20 : UP
 Link  21 : UP
 Link  22 : DOWN
 Link  23 : DOWN
 Status: 20/24 link is up

This setup has 8 + 7 FECs connected to it, so links 0-7 and links 12-18 should be up. All other links should be down.
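The comparison between expected and reported link states can also be scripted. Below is a minimal sketch, assuming the linkstat.py output keeps the `Link N : UP/DOWN` line format shown above; `parse_linkstat` and `unexpected_links` are hypothetical helper names and not part of cru-sw.

```python
import re

# Matches lines such as " Link  10 : DOWN" from the linkstat.py output above.
LINK_RE = re.compile(r"^\s*Link\s+(\d+)\s*:\s*(UP|DOWN)\s*$")


def parse_linkstat(text):
    """Return {link_id: True if UP, False if DOWN} from linkstat.py output."""
    states = {}
    for line in text.splitlines():
        m = LINK_RE.match(line)
        if m:
            states[int(m.group(1))] = (m.group(2) == "UP")
    return states


def unexpected_links(states, expected_up):
    """Return (links up that should be down, links that should be up but are not)."""
    expected_up = set(expected_up)
    stray_up = sorted(l for l, up in states.items() if up and l not in expected_up)
    missing = sorted(l for l in expected_up if not states.get(l, False))
    return stray_up, missing


if __name__ == "__main__":
    # Shortened sample; in practice the full linkstat.py output would be passed in.
    sample = " Link  7 : UP\n Link  8 : UP\n Link  10 : DOWN\n Link  12 : UP\n"
    expected = list(range(0, 8)) + list(range(12, 19))  # links 0-7 and 12-18
    print(unexpected_links(parse_linkstat(sample), expected))
```

Links that are missing from the parsed output are treated as not up, so a truncated output shows up as a mismatch rather than passing silently.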

This is the status after executing the standalone-startup.py tool:

python ./cru-sw/COMMON/standalone-startup.py -i 4:0.0 -c ttc --pon-upstream --onu-address 1 -g wb -l 0,1,2,3,4,5,6,7,12,13,14,15,16,17,18 -x ttc -t pattern -m continuous

It looks like not all internal registers have been initialized/reset properly and we see some left-over settings from previous tests.

Cheers,
Torsten

Hello Torsten,

What is the output if you run the linkstat script with the watch command?

watch -n 0.1 python cru-sw/COMMON/linkstat.py -i04:00.0

Are links 8-11 and 19-23 UP continuously? I think it can sometimes happen that the error counters don’t change, and in those cases the link is detected as UP.

Cheers,
Tuan

Hello Tuan,

Links 8 and 9 and links 19, 20 and 21 are always shown as up. The PLL is shown as locked and the stream as locked to data. The error counters are not increasing.

		Link #21: Wrapper 0 - Bank 3 - Link 3
-----------------------------------------------------------------
                Status	|                     Counters		
    PLL locked: YES	|          Not ready: 1523956412 	
Locked to data: YES	| Not locked to data: 2444623087 	
TX clock frequency: 240.47 MHz

We tested a bit more. We can actually get it back into the original state by including those links in the standalone-startup script, reconfiguring the CRU with it, and then reconfiguring again with the actual set of links we are using.
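The two-pass workaround can be written down as follows. This is only a sketch that prints the two command lines; it reuses the flags from the standalone-startup.py call shown earlier in this post, and whether the first pass should use exactly these flags is an assumption. `startup_command` is a hypothetical helper, not part of cru-sw.

```python
# Links actually read out, and the links left in loopback by the earlier tests.
ACTIVE_LINKS = list(range(0, 8)) + list(range(12, 19))
STALE_LINKS = [8, 9, 19, 20, 21]


def startup_command(links):
    """Build the standalone-startup.py call from this post for a given link list."""
    link_arg = ",".join(str(l) for l in sorted(links))
    return [
        "python", "./cru-sw/COMMON/standalone-startup.py",
        "-i", "4:0.0", "-c", "ttc", "--pon-upstream", "--onu-address", "1",
        "-g", "wb", "-l", link_arg, "-x", "ttc", "-t", "pattern", "-m", "continuous",
    ]


if __name__ == "__main__":
    # Pass 1: include the stale links so that the script touches them again.
    print(" ".join(startup_command(ACTIVE_LINKS + STALE_LINKS)))
    # Pass 2: reconfigure with only the links that are really in use.
    print(" ".join(startup_command(ACTIVE_LINKS)))
```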

So it seems that standalone-startup.py works “incrementally” and doesn’t overwrite the settings of a link that is not included. This is dangerous, since the actual state of the CRU then depends on the state it was in before the script was executed.

Cheers,
Torsten

Hello Torsten,

Looking at it again, actually that was the expected behaviour. I just noticed that they were reported to be in internal loopback mode, so they would always appear as up.
Okay, we will see what should be the default state and always start the calibration from there.
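A check like this can be automated by reading the loopback column of the report.py table. Below is a minimal sketch, assuming the nine-field row layout shown in the report.py output earlier in the thread; `parse_report` and `loopback_links` are hypothetical helpers and not part of cru-sw.

```python
def parse_report(text):
    """Return {link_id: row dict} from the per-link table printed by report.py."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Rows look like: Link 8 : GBT GBT YES TTC:PATTERN continuous Disabled
        if len(parts) == 9 and parts[0] == "Link" and parts[1].isdigit() and parts[2] == ":":
            rows[int(parts[1])] = {
                "tx_mode": parts[3],
                "rx_mode": parts[4],
                "loopback": parts[5] == "YES",
                "gbt_mux": parts[6],
                "datapath_mode": parts[7],
                "enabled": parts[8] == "Enabled",
            }
    return rows


def loopback_links(rows):
    """Links whose UP status says little because internal loopback is on."""
    return sorted(link for link, row in rows.items() if row["loopback"])


if __name__ == "__main__":
    sample = (
        "Link  7 :      GBT       WB        NO\tTTC:PATTERN\tcontinuous\t     Enabled\n"
        "Link  8 :      GBT      GBT       YES\tTTC:PATTERN\tcontinuous\t    Disabled\n"
    )
    print(loopback_links(parse_report(sample)))
```

Splitting on any whitespace keeps the parser indifferent to the mix of tabs and spaces in the table.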

Cheers,
Tuan

Hello Tuan,

Yes, that was my point :wink: This is why I said that the links being in loopback are a leftover from the tests Pippo and I did yesterday, and that they remained in that state despite running standalone-startup a few times afterwards.

Cheers,
Torsten

Ciao,
I don’t think the tool is working incrementally.
There are so many detectors that it is difficult to have a default configuration.

When you run standalone-startup, it changes the configuration of the card (link and datapath wrapper) for all the enabled links.

If you change the number of links included in the run, the other links are left untouched, and I think this is correct, as we don’t know which configuration we should bring back.

But indeed we could make the -lb option act globally, so that it always affects all the links, whether or not they are enabled in the run … but for the rest I think the last configuration applied to a link should stay there until the link is included in another run and its configuration is changed.

Hi Pippo,

Alright, agreed. Especially for the TPC we need the links in WB mode in order to communicate with the SCA, so we would always need to include all the links, even if we don’t want to read out every link. So keeping it as it is is OK.

However, it would be nice to have some kind of “default/Reset” script or option which puts the CRU in its initial state. Incremental changes are nice to have but sometimes one wants to have a “fresh” start without the need of rebooting the host.

Cheers,
Torsten

OK, I’ll open a git issue and discuss it with Olivier.

P.S. Anyway, rebooting the host only has an effect on the PCIe … the rest is reset when you run cru_readout.py -stop

Ciao, the hard reset is actually already implemented.

python standalone-startup.py -i 21:0.0 -c local -g wb -lb -l 1,2,3,4 -x ddg -m packet
[root@alio2-bld4-lab011 COMMON]# python report.py -i 21:0.0

LOCAL clock is selected
------------------------------
               GBT      GBT  Internal                     Datapath        Enabled in
 Link ID   TX mode  RX mode  loopback       GBT mux           mode          datapath
-----------------------------------------------------------------------------------------
Link  0 :      GBT      GBT        NO           DDG     continuous          Disabled
Link  1 :      GBT       WB       YES           DDG         packet           Enabled
Link  2 :      GBT       WB       YES           DDG         packet           Enabled
Link  3 :      GBT       WB       YES           DDG         packet           Enabled
Link  4 :      GBT       WB       YES           DDG         packet           Enabled
Link  5 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  6 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  7 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  8 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  9 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 10 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 11 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 12 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 13 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 14 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 15 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 16 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 17 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 18 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 19 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 20 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 21 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 22 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 23 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
------------------------------
24 link(s) found in total
------------------------------
globalgen: | ttc240freq 0.00 MHz | lcl240freq 240.47 MHz | ref240freq 240.47 MHz | clk not ok cnt     2 |
Wrapper 0: | ref freq0 240.47 MHz  | ref freq1 0.00 MHz  | ref freq2 0.00 MHz  | ref freq3 0.00 MHz  |


[root@alio2-bld4-lab011 COMMON]# python reset-clock-tree.py -i 21:0.0
[root@alio2-bld4-lab011 COMMON]# python report.py -i 21:0.0
LOCAL clock is selected
------------------------------
               GBT      GBT  Internal                     Datapath        Enabled in
 Link ID   TX mode  RX mode  loopback       GBT mux           mode          datapath
-----------------------------------------------------------------------------------------
Link  0 :      GBT      GBT        NO           DDG     continuous          Disabled
Link  1 :      GBT      GBT        NO           DDG     continuous          Disabled
Link  2 :      GBT      GBT        NO           DDG     continuous          Disabled
Link  3 :      GBT      GBT        NO           DDG     continuous          Disabled
Link  4 :      GBT      GBT        NO           DDG     continuous          Disabled
Link  5 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  6 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  7 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  8 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link  9 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 10 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 11 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 12 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 13 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 14 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 15 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 16 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 17 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 18 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 19 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 20 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 21 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 22 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
Link 23 :      GBT      GBT        NO       TTC:CTP     continuous          Disabled
------------------------------
24 link(s) found in total
------------------------------
globalgen: | ttc240freq 0.00 MHz | lcl240freq 240.47 MHz | ref240freq 240.47 MHz | clk not ok cnt     3 |
Wrapper 0: | ref freq0 240.47 MHz  | ref freq1 0.00 MHz  | ref freq2 0.00 MHz  | ref freq3 0.00 MHz  |

As you can see, after executing the hard-reset script the CRU went back to the “DEFAULT” configuration.
Only the GBT_MUX is untouched (but the loopback and the status of the links are cleared).
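To see at a glance which settings a reset changed, two report.py outputs can be compared field by field. Below is a minimal sketch, assuming the row layout of the tables above; `parse_rows`, `diff_reports` and the field names are hypothetical and not part of cru-sw.

```python
# Names chosen here for the six value columns of the report.py table.
FIELDS = ("tx_mode", "rx_mode", "loopback", "gbt_mux", "datapath_mode", "enabled")


def parse_rows(text):
    """Return {link_id: {field: value}} for every table row in a report.py output."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 9 and parts[0] == "Link" and parts[1].isdigit() and parts[2] == ":":
            rows[int(parts[1])] = dict(zip(FIELDS, parts[3:]))
    return rows


def diff_reports(before, after):
    """Return {link_id: {field: (old, new)}} for every field that changed."""
    old_rows, new_rows = parse_rows(before), parse_rows(after)
    changes = {}
    for link in sorted(set(old_rows) & set(new_rows)):
        changed = {
            f: (old_rows[link][f], new_rows[link][f])
            for f in FIELDS
            if old_rows[link][f] != new_rows[link][f]
        }
        if changed:
            changes[link] = changed
    return changes


if __name__ == "__main__":
    before = "Link  1 :      GBT       WB       YES           DDG         packet           Enabled"
    after = "Link  1 :      GBT      GBT        NO           DDG     continuous          Disabled"
    for link, changed in diff_reports(before, after).items():
        print(link, changed)
```

Only links present in both outputs are compared, and unchanged fields such as the GBT mux on link 1 do not appear in the result.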