At first I thought that I had the PHY autonegotion business right and it was all about getting the emac driver to work at the new speed, but there is more involved than that.
When I examine the startup messages more closely I see that a fair number of rx and tx packets have been transmitted. Also I see that Kyu is sometimes (but not always) able to fetch the symbol table via TFTP.
The next thought is that I am doing something after the symbol table fetch that is the problem. So I need to find where I fetch the symbols table. I look at main.c and see that it calls "shell_init() with notes about that needing the network to fetch the symbol table. This routine is in symbols.c and does little more than using tftp_fetch() to get the symbol table. Initialization rattles along, eventually launching a thread to run user_init(). This in turn launches another thread to run shell_main() which is in tests.c. It is hard to imagine how any of this could cause the network to get jammed up (and only get jammed up in 100 Mbit mode).
Also, sometimes we get the symbol table and sometimes we don't. I try setting "ping" going while I reboot Kyu. We get and respond to a few ICMP packets for a short interval after the autonegotiation and even while the symbol table transfer is going on.
This starts to look like some kind of race that is being exacerbated by the faster network speed. But who can say at this point?
An existing facility "n 11" calls emac_debug() and displays lots of interesting information. It shows the entirety of both the Rx and Tx buffer rings. I currently have 16 entries in each of the Rx and Tx rings. This is handy for debug (and I may use 8 during debug). Linux uses 256.
cd /u1/BSD/NetBSD-git (segfaults!!) ctags -R cd /u1/BSD/NetBSD-git ctags -R . cd /u1/BSD/NetBSD-git/sys/arch/arm/sunxi vi sunxi_emac.cThis is interesting, but has not yet led me to a solution.
What I see is that the link goes from UP to DOWN. Curiously I am able to transfer a few packets after the PHY status indicates the link is down. My little Netgear switch still shows 100 Mbit even after the PHY decides that the link is down. Along with the links status going to DOWN, the values in the PHY "peer" register all go to zero.
I try a variety of things. What I find most useful is looking at the datasheets for the two PHY chips (the RTL8211 and the 8710a chip) and contemplating the bits in the registers. Some wording in the RTL8211 sheet makes me wonder about the setting of the ANEG_ENABLE bit. I set it, then trigger autonegotion, then when autonegotiation finishes I clear it. I decide not to clear it and just leave it set. Now everything works! My take on this is that it is telling the PHY chip to ignore the setting of the speed and duplex bits in the control register and just obey what has been determined by autonegotiation.
Wireshark shows Kyu sending two packets, each with 500 byte payloads, back to back. Then an ACK comes almost immediately from the linux side. Then there is a bit of a delay (on the order of .002 to .006 seconds) before Kyu sends another pair. If this delay was .005 seconds on the average, this would total to 5 seconds. I wonder if we see this on the BBB where 100 Mbit seems to run as fast as it should.
It isn't that hard to look at the BBB. I use "kyucon1" to get a console to ttyUSB1, then stop wireshark and change the mac address in the capture filter "ether host xxxx". and I am able to get the data I want. The 1M transfer goes in 0.15 seconds. Now I see Kyu sending sometimes 3, 2, or rarely 1 packet with a 500 byte payload back to back. In general, the delay from when linux sends the ACK to when Kyu again sends a packet is now very small. 0.00005 seems typical. Taking 1000 delays like this gives us 0.05 seconds rather than 5.0 seconds.
So, the new idea now is to just send a bunch of packets and see how long it takes to send them. I try this the stupid way by calling emac_send in a loop 1000 times. This could never work because there are only 64 slots in the TX ring and when it gets full, the routine will just print a message, do nothing and return. What does happen is I get:
Kyu, ready> n 12 PANIC: do_irq, resume Kyu, ready> l Thread: name ( &tp ) state sem pc sp pri Thread: shell (400a7a28) READY I 400045fc 40568000 11 Thread: n_wrapper (400a737c) FAULT J 40010c90 40576000 14 Thread: net-in (400a7658) SEM I net-inq 4001112c 40570000 20 .... Kyu, ready> u n_wrapper Thread n_wrapper, dump (-) Thread: n_wrapper (400a737c) FAULT J 40010c90 40576000 14 state: FAULT stack: 40576000 mode: JMP CPU: 0 pc : [<40010c90>] lr : [<40010c90>] sp : 40577f20 ip : 60000013 fp : 40577f3c r10: 00000002 r9 : 79f38ee0 r8 : 7dfb4878 r7 : 79f3ac90 r6 : 40000000 r5 : 00000002 r4 : 4005c100 r3 : 400a73c0 r2 : 00000000 r1 : 00000000 r0 : 00000001 cpsr: 600000d3 Flags: nZCv IRQs off FIQs off Mode SVC_32 PC = 40010c90 ( change_thread+194 ) Called from resched+230 -- 400110b8 Called from thr_block+7c -- 40010258 Called from kyu_panic+a4 -- 4000e5bc Called from panic+1c -- 4000e5e4 Called from do_irq+c4 -- 4000117c Called from irq+34 -- 40000434 Called from board_net_send+18 -- 40002f88 Called from test_fast+94 -- 4000b434 Called from n_wrapper+30 -- 4000a5d0 Called from thr_exit -- 4000ffa0
The smart way to do this involves coding up a new routine in emac.c Maybe call it "emac_send_block()". The idea is that it will kick off the send, then block on a semaphore. When the Tx interrupt happens to indicate something (I may have choices) relating to the Tx being finished, it will unblock the semaphore. I can do several of these in series and time them -- and with the resolution of the CCNT counter even just doing one would do the job.
So I did this and asked it to send 10 packets with 1000 byte payloads. They end up 1014 bytes on the wire with ethernet headers. The times in CCNT units were:
data sent = 9467587 data sent = 9464027So, we sent 10,140 bytes in .0095 seconds. Call this 8 bits per byte and we get 8.5 megabits per second. This is disappointing. In fact the times shown by wireshark indicate it takes just under .001 seconds per 1014 byte packet, confirming this measurement using the CCNT register.
Kyu / [email protected]