February 7, 2017 at 2:21 pm #883
This is an update to the issue first noted here. It concerns our support board (SB) acquiring non-representative data at ~20 MB/s (via QuickUSB).
For context and background, we have one fully operational OpenPET crate (Support Board) after implementing the fix discussed by Roger here via our custom DB firmware. Note that this operational board was manufactured post-Jan 2016. Our other support board seemed to have a different issue after our move from Chicago (where it ran successfully) to Philadelphia. However, we placed aside the original crate and its support board in favor of debugging the slot 3 & 7 timing issue above.
We tested this Support Board for a full week and did not have an issue. This was with different OpenPET versions, DBs, slot numbers, firmware/DB flashes, and signal inputs. Alas, after all this testing, the problem returned today after a simple DB exchange. In its simplest form, we get the issue when trying to run a simple 16-channel singles mode run:
$ openpet -c 3 0x8002 2
2017-02-07 11:59:56,423 INFO [S] 0x0003 0x8002 0x00000002
2017-02-07 11:59:56,625 INFO [R] 0x8003 0x8002 0x00000002
$ openpet -c 7 0x8002 2
2017-02-07 12:00:05,108 INFO [S] 0x0007 0x8002 0x00000002
2017-02-07 12:00:05,311 INFO [R] 0x8007 0x8002 0x00000002
$ openpet -a 5 -o def_singles.dat
2017-02-07 12:00:31,427 INFO Starting qusb data acquisition...
2017-02-07 12:00:35,848 INFO 0.587s remaining.
2017-02-07 12:00:36,446 INFO QUSB rate is 45.319 MB/s
2017-02-07 12:00:36,447 INFO Stopping qusb data stream...
2017-02-07 12:00:36,450 INFO QUSB data stream has stopped.
2017-02-07 12:00:36,450 INFO Stopping qusb data queue ...
2017-02-07 12:00:36,453 INFO QUSB data queue has stopped.
2017-02-07 12:00:36,453 INFO 19.933 MB/s written to disk, def_singles.dat
$ ls -lhrt def_singles.dat
-rwx------+ 1 CSPECT mkpasswd 101M Feb 7 12:00 def_singles.dat
In the above example, we had a single DB in slot #2 and we are using OpenPET v2.1. The (4kB) header appears correct but the data output for (nearly) all 32-bit words is
0000 0000 0000 0000 0000 0000 0000 0110 = 6 [dec]
Other runs would give 0100 = 4 [dec] in the four LSBs of each word. Note that for some runs there would occasionally be a correctly formatted word but rarely would a full event be recorded.
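For anyone wanting to quantify how much of a file is affected, here is a minimal Python sketch that skips the 4 kB header and counts how often each 32-bit word appears. The function names are hypothetical, and the little-endian word order is an assumption on my part (pass `byteorder=">"` if the file is big-endian); only the 4 kB header size and the 0110/0100 values come from the description above.

```python
import struct
from collections import Counter

HEADER_BYTES = 4096  # the 4 kB header described above


def word_histogram(raw, header_bytes=HEADER_BYTES, byteorder="<"):
    """Count each 32-bit word found after the header.

    byteorder "<" (little-endian) is an assumption, not a documented fact.
    Trailing bytes that do not fill a whole word are ignored.
    """
    payload = raw[header_bytes:]
    n_words = len(payload) // 4
    words = struct.unpack(f"{byteorder}{n_words}I", payload[: n_words * 4])
    return Counter(words)


def stuck_fraction(hist, stuck_values=(0x4, 0x6)):
    """Fraction of words equal to one of the suspicious constant values."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(hist[v] for v in stuck_values) / total
```

Usage would be along the lines of reading `def_singles.dat` in binary mode, passing the bytes to `word_histogram`, and looking at `most_common(5)` on the result together with `stuck_fraction`.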
Additional tests showed the problem occurred in different DB slots and for different DBs. Moreover, the problem persists even when there is no signal input (where we expect just to get the 4kB header).
Acquiring data with our custom firmware was consistent with the above, where (nearly) every word would be replaced by 0110 or 0100. However, we could observe the data stream being correct via Signal Tap with the USB-blaster to the DB JTAG connection. We activated a static pattern pulser via our custom firmware, and we did observe the rare data word (e.g. 0x1FEEEEEA) that was consistent with our input pattern but that occurred out of sync and was mostly overwritten by the 0110 and 0100 output.
Has this issue been observed before? How would you suggest we proceed in debugging it further?
February 7, 2017 at 3:22 pm #887
We have not observed this issue before.
Let me confirm a few things first. You have two Support Boards (SBs): an old SB and a new SB (post 2016 fab). The main suspect here is the old SB. Is that correct? Also, the new SB is fully functional using all DBs, is that correct? Do you have a HostPC board?
Regarding the suspicious SB we will start with simple diagnostics:
0- Visual inspection of LEDs:
A- If you have a HostPC board all LEDs should be lit. See http://openpet.lbl.gov/img_4693/
B- If you don’t have a HostPC, compare both new and old SB LEDs on the back. Do they look the same? Do they look like this http://openpet.lbl.gov/img_4620/ ?
These LEDs confirm that the voltages on the board are within spec.
1- JTAG functionality:
A- Using default SB jumpers does Altera Quartus Programmer recognize the Main FPGA?
B- Look at https://openpet-developers-guide.readthedocs.io/en/latest/system_troubleshooting.html?highlight=jtag#system-troubleshooting
Configure your SB jumpers as shown in figures 68 and 67. Does Quartus Programmer recognize all three FPGAs (Main, IO1, and IO2)?
2- Confirm that your QuickUSB module is flashed to the latest version and functioning correctly. Try swapping the two modules you have. Please wear ESD friendly attire 🙂
A final note: when testing singles mode please use version v2.3.1. Use
$ openpet -c 0x0010 0x0800 0
to verify the version.
Please report back here and we will provide more suggestions later on.
Faisal
February 8, 2017 at 11:23 am #888
The main suspect here is the old SB. Is that correct? Also, the new SB is fully functional using all DBs, is that correct? Do you have a HostPC board?
Yes, the suspect board is the pre-Jan2016 version that we have had since 2014. Yes, fully functional on the newer SB and we have 14 fully tested DBs. No, we do not have a HostPC board; we just have the Quick-USB mounted on the SB for PC communication/data transfer.
Our setup shows the same LEDs, modulo an additional LED on DN 13 (which relates to the QuickUSB). See here for a photo of the SB LEDs. The jumpers (J10 & J19) are also highlighted in the photo and are consistent with the correct setup and our other SB.
We appear to be able to use the JTAG to program without issue. Simple communication:
$ jtagconfig -n
1) USB-Blaster [USB-0]
  020F40DD EP3C40/EP4CE(30|40)
    Node 0C006E00 JTAG UART #0
    Node 19104600 Nios II #0
  Design hash AF3C215664A671B7FEC6
$ nios2-terminal
nios2-terminal: connected to hardware target using JTAG UART on cable
nios2-terminal: "USB-Blaster [USB-0]", device 1, instance 0
nios2-terminal: (Use the IDE stop button or Ctrl-C to terminate)
OpenPET LBNL SupportBoard-Main FPGA
Loading children FPGAs with bitstream stored on EPCS
QUSB Waiting
nios2-terminal: exiting due to ^C on host
Loading OpenPET v2.3.1 flash for SB
$ pwd
/cygdrive/c/OpenPET/v2.3.1/supportboard
$ ./flashBoard.sh
[Default] CDUC firmware and software will be flashed.
[OpenPET] Was your Support Board manufactured after January 2016? [y/n] n
[OpenPET] Using default USB cable .
[OpenPET] Running "quartus_pgm -c 1 -m jtag -o ipv;./bin/CDUC/sb64.jic"
Info: *******************************************************************
Info: Running Quartus II 32-bit Programmer
Info: Version 13.1.0 Build 162 10/23/2013 SJ Full Version
Info: Copyright (C) 1991-2013 Altera Corporation. All rights reserved.
Info: Your use of Altera Corporation's design tools, logic functions
Info: and other software and tools, and its AMPP partner logic
Info: functions, and any output files from any of the foregoing
Info: (including device programming or simulation files), and any
Info: associated documentation or information are expressly subject
Info: to the terms and conditions of the Altera Program License
Info: Subscription Agreement, Altera MegaCore Function License
Info: Agreement, or other applicable license agreement, including,
Info: without limitation, that your use is for the sole purpose of
Info: programming logic devices manufactured by Altera and sold by
Info: Altera or its authorized distributors. Please refer to the
Info: applicable agreement for further details.
Info: Processing started: Wed Feb 08 11:00:34 2017
Info: Command: quartus_pgm -c 1 -m jtag -o ipv;./bin/CDUC/sb64.jic
Info (213045): Using programming cable "USB-Blaster [USB-0]"
Info (213011): Using programming file ./bin/CDUC/sb64.jic with checksum 0x497888 8A for device EP3C40@1
Info (209060): Started Programmer operation at Wed Feb 08 11:00:35 2017
Info (209016): Configuring device index 1
Info (209017): Device 1 contains JTAG ID code 0x020F40DD
Info (209007): Configuration succeeded -- 1 device(s) configured
Info (209018): Device 1 silicon ID is 0x16
Info (209044): Erasing ASP configuration device(s)
Info (209023): Programming device(s)
Info (209021): Performing CRC verification on device(s)
Info (209011): Successfully performed operation(s)
Info (209061): Ended Programmer operation at Wed Feb 08 11:01:59 2017
Info: Quartus II 32-bit Programmer was successful. 0 errors, 0 warnings
Info: Peak virtual memory: 187 megabytes
Info: Processing ended: Wed Feb 08 11:01:59 2017
Info: Elapsed time: 00:01:25
Info: Total CPU time (on all processors): 00:00:04
[OpenPET] Done programming board. Please reboot chassis.
[OpenPET] Press [Enter] to close this window
I have not tested this yet. What do you do in the cases where you want to jump across pins? For example, from the Main TDO to IO1 TDI (i.e. J62 to J69). I have the jumper/shunts to do the 8-pairs of pins but how do I physically handle the blue & green lines on figure 68?
Quick-USB driver correctly set to v2.15.2 (see below). Note that this PC, USB cables, etc. are all common in our tests between the two SBs. Before we swap the Quick-USB, are there any other tests we can run to confirm the interface between the output of the SB and Quick-USB card is working correctly? The limited diagnostic options provided by Bitwise show a fully working/communicating module that is otherwise (modulo serial number) identical to the setup in the other crate/SB. Note that in previous tests I have swapped the Quick-USB modules on the SBs and the problem stayed on this older/original SB (independent of swap).
Using OpenPET v2.3.1 I tried the suggested command (verbose option turned on):
$ openpet -v -c 0x0010 0x0800 0
2017-02-08 13:53:39,153 DEBUG OpenPET v2.3.1
2017-02-08 13:53:39,157 DEBUG Found QUSB-0
2017-02-08 13:53:39,158 DEBUG QUSB DLL Version: v2.15.2
2017-02-08 13:53:39,160 DEBUG QUSB Driver Version: v2.15.2
2017-02-08 13:53:39,161 DEBUG QUSB Firmware Version: v2.15.2
2017-02-08 13:53:39,163 DEBUG QUSB Writing to register(s):
2017-02-08 13:53:39,164 DEBUG =0x0001
2017-02-08 13:53:39,167 DEBUG =0xc000
2017-02-08 13:53:39,168 DEBUG =0x0002
2017-02-08 13:53:39,171 DEBUG =0x8010
2017-02-08 13:53:39,177 INFO [S] 0x0010 0x0800 0x00000000
2017-02-08 13:53:39,177 DEBUG Sending:
2017-02-08 13:53:39,178 DEBUG ID 0x0010
2017-02-08 13:53:39,180 DEBUG SRC 0x4000
2017-02-08 13:53:39,180 DEBUG DST 0x0800
2017-02-08 13:53:39,180 DEBUG PAYLOAD 0x00000000
2017-02-08 13:53:39,384 DEBUG Received [retries 1/20]:
2017-02-08 13:53:39,387 DEBUG ID 0x8010:
2017-02-08 13:53:39,388 DEBUG SRC 0x0800:
2017-02-08 13:53:39,391 DEBUG DST 0x4000:
2017-02-08 13:53:39,394 DEBUG PAYLOAD 0x00000004:
2017-02-08 13:53:39,397 INFO [R] 0x8010 0x0800 0x00000004
2017-02-08 13:53:39,398 DEBUG Done.
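To compare the QuickUSB versions between the two crates without reading the logs by eye, something like the following Python sketch could pull the three version strings out of the verbose output. The helper names are hypothetical, the regular expression is based only on the log lines shown in this thread, and treating "all three versions equal" as the condition to check is my assumption rather than a documented requirement.

```python
import re

# Matches lines such as "QUSB Driver Version: v2.15.2" from `openpet -v` output.
VERSION_RE = re.compile(r"QUSB (DLL|Driver|Firmware) Version: (v[\d.]+)")


def qusb_versions(log_text):
    """Return a dict like {"DLL": "v2.15.2", ...} for versions found in the log."""
    return {m.group(1): m.group(2) for m in VERSION_RE.finditer(log_text)}


def versions_consistent(versions):
    """True when DLL, driver, and firmware versions are all present and equal."""
    return len(versions) == 3 and len(set(versions.values())) == 1
```

Running `qusb_versions` on the saved output from each SB and comparing the two dicts would show any mismatch directly.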
February 8, 2017 at 11:52 am #892
- This reply was modified 5 months, 2 weeks ago by Dale. Reason: formating/code block edit
1-B- Use female-to-female jumper cable wires. The shorter the cable the better.
I typed the wrong command for the version. Please try this:
openpet -v -c 0x0011 0x0800 0
We need to verify that the IO FPGAs are OK. Once you confirm that jtagconfig can see them, then do the following:
0- Connect JTAG to SB with default jumper configuration
1- Download http://openpet.lbl.gov/wp-content/uploads/2017/02/debug_img.zip
2- unzip and note directory
3- Open Quartus -> Open-> dropdown menu Files of Types: select “All files (*.*)” -> open sng_rnd.stp
4- Program the device using the sof image in zip
5- Count to 10
6- Highlight the instance then run analysis.
7- Acquire data as usual using the commands you pasted below. Signal tap should trigger. Send a screenshot of the waveforms. If it doesn’t trigger we need to debug the IO FPGAs.
Other questions to answer:
A- Is the SB PCB arched or bowed?
B- Are the main power cables securely screwed?
C- Basically, do a visual inspection for screws penetrating the PCB, blown capacitors, darker than usual soldermask, etc.
Finally, your jumper configuration looks OK.
February 8, 2017 at 12:44 pm #894
- This reply was modified 5 months, 2 weeks ago by Faisal.
Quick Reply on the 0011 command:
$ openpet -v -c 0x0011 0x0800 0
2017-02-08 15:35:54,602 DEBUG OpenPET v2.3.1
2017-02-08 15:35:54,609 DEBUG Found QUSB-0
2017-02-08 15:35:54,614 DEBUG QUSB DLL Version: v2.15.2
2017-02-08 15:35:54,615 DEBUG QUSB Driver Version: v2.15.2
2017-02-08 15:35:54,617 DEBUG QUSB Firmware Version: v2.15.2
2017-02-08 15:35:54,618 DEBUG QUSB Writing to register(s):
2017-02-08 15:35:54,621 DEBUG =0x0001
2017-02-08 15:35:54,622 DEBUG =0xc000
2017-02-08 15:35:54,625 DEBUG =0x0002
2017-02-08 15:35:54,628 DEBUG =0x8010
2017-02-08 15:35:54,635 INFO [S] 0x0011 0x0800 0x00000000
2017-02-08 15:35:54,637 DEBUG Sending:
2017-02-08 15:35:54,638 DEBUG ID 0x0011
2017-02-08 15:35:54,638 DEBUG SRC 0x4000
2017-02-08 15:35:54,640 DEBUG DST 0x0800
2017-02-08 15:35:54,641 DEBUG PAYLOAD 0x00000000
2017-02-08 15:35:54,845 DEBUG Received [retries 1/20]:
2017-02-08 15:35:54,848 DEBUG ID 0x8011:
2017-02-08 15:35:54,849 DEBUG SRC 0x0800:
2017-02-08 15:35:54,851 DEBUG DST 0x4000:
2017-02-08 15:35:54,854 DEBUG PAYLOAD 0x00002031:
2017-02-08 15:35:54,855 INFO [R] 0x8011 0x0800 0x00002031
2017-02-08 15:35:54,858 DEBUG Done.
February 9, 2017 at 8:08 am #895
Here is the result of the jtagconfig command using the JTAG jumper-pin chain for debugging on the SB.
$ jtagconfig -n
1) USB-Blaster [USB-0]
  020F40DD EP3C40/EP4CE(30|40)
    Node 0C006E00 JTAG UART #0
    Node 19104600 Nios II #0
  Design hash F0FBFE7DE08E4246C75C
  020F40DD EP3C40/EP4CE(30|40)
    Node 19104600 Nios II #0
    Node 0C006E00 JTAG UART #0
  Design hash 45736B774D4EBB3D57D1
  020F40DD EP3C40/EP4CE(30|40)
    Node 19104600 Nios II #0
    Node 0C006E00 JTAG UART #0
  Design hash 45736B774D4EBB3D57D1
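As a rough way to check the "are all three FPGAs visible?" question automatically, the Python sketch below counts the devices listed in saved `jtagconfig -n` output. It is only a sketch built around the output format seen in this thread: it treats any 8-hex-digit token that does not directly follow the word `Node` as a device ID, and the function name is hypothetical.

```python
import re

HEX8 = re.compile(r"[0-9A-Fa-f]{8}")


def jtag_devices(output):
    """Return (jtag_id, device_name) pairs found in `jtagconfig -n` text.

    Assumption: device lines look like "020F40DD EP3C40/EP4CE(30|40)" and
    node lines look like "Node 0C006E00 JTAG UART #0", as in the output above.
    """
    tokens = output.split()
    devices = []
    for i, tok in enumerate(tokens):
        is_id = HEX8.fullmatch(tok) is not None
        follows_node = i > 0 and tokens[i - 1] == "Node"
        if is_id and not follows_node and i + 1 < len(tokens):
            devices.append((tok.upper(), tokens[i + 1]))
    return devices
```

With the debug jumper chain the list should have three entries (Main, IO1, IO2); with the default jumpers, one.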
We will transition back to the default SB (green) jumper setup and follow up with your suggested test above.
February 9, 2017 at 12:00 pm #896
I cannot seem to progress to take singles data at step #7. Programming the SB appears to prevent me from running the normal OpenPET commands. For example:
$ openpet -v -c 1 0x8002 0xABCD0123
2017-02-09 14:49:56,233 DEBUG OpenPET v2.3.1
2017-02-09 14:49:56,237 DEBUG Found QUSB-0
2017-02-09 14:49:56,240 DEBUG QUSB DLL Version: v2.15.2
2017-02-09 14:49:56,240 DEBUG QUSB Driver Version: v2.15.2
2017-02-09 14:49:56,240 DEBUG QUSB Firmware Version: v2.15.2
2017-02-09 14:49:56,240 DEBUG QUSB Writing to register(s):
2017-02-09 14:49:56,242 DEBUG =0x0001
2017-02-09 14:49:56,243 DEBUG =0xc000
2017-02-09 14:49:56,243 DEBUG =0x0002
2017-02-09 14:49:56,244 DEBUG =0x8010
2017-02-09 14:49:56,250 INFO [S] 0x0001 0x8002 0xABCD0123
2017-02-09 14:49:56,250 DEBUG Sending:
2017-02-09 14:49:56,250 DEBUG ID 0x0001
2017-02-09 14:49:56,252 DEBUG SRC 0x4000
2017-02-09 14:49:56,252 DEBUG DST 0x8002
2017-02-09 14:49:56,253 DEBUG PAYLOAD 0xABCD0123
2017-02-09 14:50:00,255 WARNING Controller Unit is not responding. Try again or restart.
2017-02-09 14:50:00,259 WARNING Controller Unit is replying with zeros.
2017-02-09 14:50:00,260 DEBUG Received [retries 20/20]:
2017-02-09 14:50:00,265 DEBUG ID 0x0000:
2017-02-09 14:50:00,266 DEBUG SRC 0x0000:
2017-02-09 14:50:00,269 DEBUG DST 0x0000:
2017-02-09 14:50:00,270 DEBUG PAYLOAD 0x00000000:
2017-02-09 14:50:00,273 INFO [R] 0x0000 0x0000 0x00000000
2017-02-09 14:50:00,275 DEBUG Done.
Prior to reprogramming, we do not have this issue:
$ openpet -c 1 0x8002 0xABCD0123
2017-02-09 14:29:45,680 INFO [S] 0x0001 0x8002 0xABCD0123
2017-02-09 14:29:45,882 INFO [R] 0x8001 0x8002 0xABCD0123
$ openpet -c 3 0x8002 2
2017-02-09 14:29:54,727 INFO [S] 0x0003 0x8002 0x00000002
2017-02-09 14:29:54,930 INFO [R] 0x8003 0x8002 0x00000002
$ openpet -c 7 0x8002 2
2017-02-09 14:30:00,953 INFO [S] 0x0007 0x8002 0x00000002
2017-02-09 14:30:01,157 INFO [R] 0x8007 0x8002 0x00000002
$ openpet -a 6 -o pre_LBL_test2.dat
2017-02-09 14:31:23,793 INFO Starting qusb data acquisition...
2s remaining.
2017-02-09 14:31:29,815 INFO QUSB rate is 45.303 MB/s
2017-02-09 14:31:29,816 INFO Stopping qusb data stream...
2017-02-09 14:31:29,819 INFO QUSB data stream has stopped.
2017-02-09 14:31:29,819 INFO Stopping qusb data queue ...
2017-02-09 14:31:29,821 INFO QUSB data queue has stopped.
2017-02-09 14:31:29,822 INFO 0.0 MB/s written to disk, pre_LBL_test2.dat
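The difference between the working and non-working command logs can be checked mechanically. In every successful exchange shown in this thread, the reply ID is the command ID with the top bit set (0x0001 gives 0x8001, 0x0011 gives 0x8011) and the address field is echoed, while the failed exchange comes back as all zeros. The Python sketch below encodes that observation; it is inferred purely from these logs, not from the OpenPET documentation, and the function names are hypothetical.

```python
import re

# Matches "[S] 0x0001 0x8002 0xABCD0123" and "[R] 0x8001 0x8002 0xABCD0123".
FRAME_RE = re.compile(
    r"\[([SR])\]\s+0x([0-9A-Fa-f]{4})\s+0x([0-9A-Fa-f]{4})\s+0x([0-9A-Fa-f]{8})"
)


def parse_frames(log_text):
    """Return (direction, id, address, payload) tuples in order of appearance."""
    return [
        (m.group(1), int(m.group(2), 16), int(m.group(3), 16), int(m.group(4), 16))
        for m in FRAME_RE.finditer(log_text)
    ]


def classify_reply(sent, received):
    """Classify a reply as 'ok', 'no-response', or 'unexpected'.

    Assumption (from the logs only): a good reply has id == sent id | 0x8000
    and echoes the address field; an all-zero reply means no response.
    """
    _, s_id, s_addr, _ = sent
    _, r_id, r_addr, r_payload = received
    if (r_id, r_addr, r_payload) == (0, 0, 0):
        return "no-response"
    if r_id == (s_id | 0x8000) and r_addr == s_addr:
        return "ok"
    return "unexpected"
```

Feeding a saved session log through `parse_frames` and classifying each `[S]`/`[R]` pair makes it easy to spot the moment the controller stops answering.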
The programming itself does not seem to be an issue; it shows 100% success:
Info (209060): Started Programmer operation at Thu Feb 09 14:45:33 2017
Info (209016): Configuring device index 1
Info (209017): Device 1 contains JTAG ID code 0x020F40DD
Info (209007): Configuration succeeded -- 1 device(s) configured
Info (209011): Successfully performed operation(s)
Info (209061): Ended Programmer operation at Thu Feb 09 14:45:35 2017
What am I missing here?
February 10, 2017 at 8:45 am #897
Alas, the SB went back to fully working yesterday and I have not been able to perturb the system back to the error mode. I will make a few adjustments to see if I can get the problem to occur for debugging. I will repeat the 1-B test if I am able.
I have uploaded a photo album of the pictures taken so far of the SB: see SB inspection photos.
With respect to your inspection questions:
A – The board is not warped or bowed
B – Main power cables are firmly connected. Direct measurement of voltages are correct and stable and the DB LEDs also all light up correctly when in.
C – No blown capacitors that I can see. Likewise, no issues with screws or with the board being damaged or stressed. As shown in the linked photos, there is a little bit of coloration around the Main FPGA. I cleaned it and removed dust elsewhere using a fine paint-brush cleaned on an alcohol wipe. This seemed to improve its appearance, so it may have been a bit of “stain” on the board there.
February 13, 2017 at 11:49 am #898
I am not sure what's causing the issue yet. Please let us know once it goes bad again.
February 13, 2017 at 1:17 pm #899
- This reply was modified 5 months, 1 week ago by Faisal.