Where things stand today
As of today, the CPLD is a solved problem. All the design goals detailed in my prior post are satisfied and have been tested up to a ~5MHz input clock.
The CPLD can now...
- Coordinate traffic as the bus master for both the SPI buses.
- Read IMUs individually.
- Read IMUs in continuous blocks.
- Dynamically switch clock sources.
- Aggregate and collate IRQ signals and send them over the SPI bus dedicated for this purpose.
- Map any single IRQ signal onto the CPU wakeup pin.
- Detect digit addition and removal without host intervention.
- Conserve power and reduce EMI by turning off individual digit buses when not needed.
- Isolate electrical failures to the affected digit and maintain operation of unaffected digits.
On to the software battles...
My first priority is to streamline the firmware's treatment of the STM32F7 hardware to eliminate the need for the constant transfer intervention that the prototype glove needed to keep bytes moving. Once that is done, I can jack up the clock rate and validate the CPLD at full speed.
This is the only reason I am not posting detailed statistics today.
I have no reason to believe that the design won't scale to a 20MHz input clock, but I have to actually see it to believe it.
There is also an untested internal register that is concerned with digit-level power saving, but it isn't important to overall function, and I have no reason to believe it will not work.
Collected dev log
What follows is the sum of posts, notes, and Slack reports over the course of the last month related to this task.
2016.05.30: Firmware progress report.
Ported IMU drivers, MQTT client, TCP, UDP support in firmware.
Much-needed condensation of constants and debug methods pertaining to bus transactions. I am due to write up a new blog post. But until then, here is my informal log of compile sizes over the course of last night. GCC was given -O2 for each build.
bss    dec      Notes
9392   384560   Starting point.
9392   383888   After splitting uint16_t _class_state into two 8-bit fields.
9392   383888   After condensing 4 bools (RNBase) into the above-mentioned field.
9392   383896   After condensing 5 bools (CPLDDriver).
9392   383760   After striking old interrupt logic. New baseline.
9392   382320   After condensing 1 bool (ADP8866).
9392   379660   After condensing 4 bools (i2c-device).
9216   289088   More miscellaneous bool condensation...
9232   289192   After enum conversion and BusOp state abstraction.
9216   289136   Condensed BTQueueOp into BusOpcode enum.
9216   289132   Condensed SPIQueueOp into BusOpcode enum.
9216   288600   CPLD audit results in cuts. New baseline.
9256   288540   Condensed SPIBus op state defs into BusOp enum.
9260   288512   Condensed I2C opcode into BusOp enum.
9260   288748   STM32F7 I2CAdapter compiles again following condensation.
9300   288716   Further condensation of SPIBusOp into BusOp.
9300   288700   More inline migrations into BusOp. About to extend another op class.
9300   288684   Preparing to move I2CBusOp into conformance with new base class.
9300   288412   I2CBusOp now extends BusOp.
9380   288692   CPLD no longer extends SPIDeviceWithRegisters.
9380   288988   Added some CPLD tests.
9380   289992   Widespread preprocessor work. Repaired console parser.
9380   275832   Same code base. Built without console support.

9380   289992   // With console support flag....
9380   275832   // Without console support.

That's resting stack load, and total firmware size, respectively. ~14KB of ballast that isn't needed in anything but a debug build.
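The "bool condensation" entries in the log above amount to folding several one-byte bool members into individual bits of a single flags field. Here is a minimal sketch of that idea. The class name, flag names, and accessors are hypothetical stand-ins, not the actual ManuvrOS or Digitabulum code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits. Each one replaces what used to be a separate bool member.
constexpr uint8_t FLAG_INITIALIZED = 0x01;
constexpr uint8_t FLAG_BUS_BUSY    = 0x02;
constexpr uint8_t FLAG_IRQ_PENDING = 0x04;
constexpr uint8_t FLAG_VERBOSE     = 0x08;

// Illustrative driver class (not from the real code base).
class ExampleDriver {
  public:
    // Read one flag (or test whether any flag in a combined mask is set).
    bool flag(uint8_t mask) const {
      return (_flags & mask) != 0;
    }

    // Set or clear one flag without disturbing the others.
    void flag(uint8_t mask, bool on) {
      assert(mask != 0);  // A zero mask would be a caller bug.
      _flags = on ? static_cast<uint8_t>(_flags | mask)
                  : static_cast<uint8_t>(_flags & ~mask);
    }

  private:
    uint8_t _flags = 0;   // Four former bools now share this single byte.
};
```

With this layout, four flags cost one byte of object state rather than four, and the accessors stay cheap enough to inline.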
[4:19 AM] jspark311 Ok... much cruftiness was lost. r1 simplified the firmware dramatically.
Now that the bus-ops are contained for the moment, I can turn my attention to the god-awful SPIDeviceWithRegisters class (which I am about to rip out and shit-can for good).
Too much complexity.
No longer required to deal with the heterogeneous bus topology of r0.
PR'ing... Not a bad 24 hours....
ManuvrOS: +1,763 −1,062 (lines in commit)
Digitabulum: +459 −346 (lines in commit)
And the firmware running on the new r1 board....
==< Kernel >===================================
-- bootstrap_completed yes
-- Digitabulum v0.0.1    Build date: May 26 2016 15:43:57
--
-- Current datetime
-- millis()              0x00000001
-- micros()              0x00000000
-- _ms_elapsed           0
-- getStackPointer()     0x200484f4
-- stack grows down
--
-- Queue depth:          0
-- Preallocation depth:  6
-- Prealloc starves:     0
-- events_destroyed:     0
-- burden_of_being_specific  0
-- idempotent_blocks     0
-- Lagged schedules      0
-- Total schedules:      1
-- Active schedules:     0
-- Subscribers: (10 total):
   0: Kernel
   1: CPLDDriver
   2: LegendManager
   3: I2CAdapter
   4: ADP8866
   5: RN4677
   6: SDCard
   7: IREmitter
   8: HapticStrap
   9: PMU
That's all of the drivers. The first 6 are nearly complete, and the PMU (power-management unit) can do some basic clock-scaling. The largest remaining task is direct hardware support.
jspark311: I can potentially make you an emulator in firmware now. lol
drquinn: woah, wouldn't a firmware emulator be the actual glove?
jspark311: lol yeah. That's how close I'm getting.
drquinn: nice! can't wait to see where its at
jspark311: I have TCP, UDP, MQTT support in ManuvrOS now, so you could just connect with the browser.
drquinn: so you'd just have to snap on some TCP connector to the glove?
jspark311: No hardware involved... the firmware experiences a TCP socket in the same way as it does a bluetooth connection. So when the time comes, we just drop one transport class in for the other.
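The "drop one transport class in for the other" idea can be sketched roughly as below. This is an assumption-laden illustration, not the ManuvrOS API: the Transport interface, both fake transport classes, and sendHello() are hypothetical names, and the fakes just record bytes rather than touching a socket or a radio.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interface: all that the session layer is allowed to know about a link.
class Transport {
  public:
    virtual ~Transport() = default;
    virtual const char* name() const = 0;
    virtual bool write(const uint8_t* buf, std::size_t len) = 0;
};

// Shared stand-in behavior: record outbound bytes instead of sending them.
class RecordingTransport : public Transport {
  public:
    bool write(const uint8_t* buf, std::size_t len) override {
      assert(buf != nullptr);
      sent.insert(sent.end(), buf, buf + len);
      return true;
    }
    std::vector<uint8_t> sent;
};

// A real version would wrap a TCP socket.
class FakeTcpTransport : public RecordingTransport {
  public:
    const char* name() const override { return "tcp"; }
};

// A real version would wrap the bluetooth module's driver.
class FakeBluetoothTransport : public RecordingTransport {
  public:
    const char* name() const override { return "bluetooth"; }
};

// Session-level code depends only on the interface, so either transport works.
bool sendHello(Transport& link) {
  static const uint8_t msg[] = {'h', 'e', 'l', 'l', 'o'};
  return link.write(msg, sizeof(msg));
}
```

Because sendHello() never names a concrete class, swapping the TCP link for the bluetooth one is a change at the construction site only.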
2016.06.06: CPLD debugging in-progress
That's the r1 main board with no battery, plugged into USB, a logic probe, and the JTAG dongle for the CPLD. The small purple PCB on the right-hand edge is the debug harness for the digits.
Things that work:
- Buffered clock selection (external vs internal), synchronous reset.
- Basic chip-select functionality
- Tri-state SPI1 CS logic (after some argument).
- Address demux and cycle logic
- IRQ aggregation state machine.
Things that don't work yet:
- SPI bus-master behavior is inconsistent.
- Software. Much of the infrastructure is improved, but the logical hook-up to the hardware SPI peripheral isn't there yet.
This is the part of the dev-cycle where two complex systems meet on a strand of copper wire. Getting this piece right will take some more time. But once I've finished, I can give some real-world transfer-rate benchmarks.
2016.06.07: CPLD debugging, major milestone
I can now reliably address registers inside the CPLD, which now releases the bus as expected. Another few nights of testing before I move to make the firmware use the SPI hardware, rather than bit-banging the GPIO.
2016.06.13: More CPLD progress
Validated: clock-enable logic for the internal registers. I can now set configuration registers.
Validated: IRQ selection logic works. I can set the CPLD wakeup IRQ to any specific IRQ signal. One of these was wired to the config register for testing. I can trigger the wakeup_isr by listening to the appropriate signal, and then setting that signal in the config register.
I improved the pipelining of some components of the transfer state-machine, which saved some CPLD complexity. As it turns out, much of my observed inconsistency was software-related, and that is now fixed.
Can't read from the main IMU yet...... But I think I know why. Operations are repeatable. But I still see a trailing bit in the IMU access. Fortunately, simulation shows the same thing.
Total logic elements   535 / 570 ( 94 % )
Total pins             76 / 76 ( 100 % )
UFM blocks             1 / 1 ( 100 % )
2016.06.13: CPLD patent filed
I am now waiting on the USPTO to write me back and tell me what I screwed up.
But in all seriousness... it's a tremendous relief to be at this point.
How big a relief?
I started designing r1 about 13 months ago, and received the populated main-boards in October (8 months ago).
Arguably, the CPLD contains the majority of the complexity of this project. Once (if?) I publish its schematics, you won't have to take my word for it. We're talking about BGA100 parts on a 6-layer PCB with buried vias, accounting for ~70% of the trace length on the board. The odds of recovery from a mistake made here approach zero.
Even partial-failure of this subsystem would mean a basically useless main board, and (~$2600 + 2 months) blown on a moment's oversight.
For an engineer that touches the physical world, "invoking GCC" might cost more than he makes in six months. So when you meet someone who routinely codes for 10+ hours in C, and their code not only compiles the first time, but does exactly what they meant it to do, this is one possible explanation for how they got that way.
And when you fund a project on your own dime, there are only so many stupid small mistakes you can make before economics kills your project. Any given millimeter of trace might be the thing that vaporizes thousands of dollars and the possibility of timely market entry.
Do Not. Fuck. Up.
So until I could (in)validate the CPLD, nothing more could be decided about the future. And nearly everything else must be built before this particular piece could be tested.
So after about one year of experimenting on a working r0, planning, designing, and finally porting firmware, I proved my hardest hardware lemma to myself. The CPLD works.
This is not to say validation is complete. But it has passed enough tests to convince me that the design goals articulated here will all be met (if they aren't already). Software is going to be much cleaner. Hell... the initial working ported code already is cleaner, despite being a sad assemblage of hackery.
Why the patent?
I've never held the US patent system in high esteem as implemented. But until we have a blockchain-driven patent system that manages itself and pays inventors based on reference in open-source designs, the idea of purely defensive patents will have to make do. Provided my filing goes through, I will begin publishing full CPLD schematics and images alongside the firmware.
Current Quartus report:
Flow Status            Successful - Mon Jun 13 18:51:33 2016
Revision Name          DigitabulumCPLD
Device                 EPM570F100C5
Timing Models          Final
Total logic elements   562 / 570 ( 99 % )
Total pins             76 / 76 ( 100 % )
UFM blocks             1 / 1 ( 100 % )
2016.06.19: Hardware timer/SPI
Hardware SPI is working for R/W of CPLD registers. Timer peripheral is also working to drive the CPLD's external clock. This marks the end of the bit-banged SPI.
2016.06.23: [3:02 AM] CPLD is at v4
I will test the IMU access tomorrow.
All checked in.
PR'd. Merged. Sleep.
This version is going to be so awesome...
2016.06.23: [1:00 AM] Still no IMUs...
buf *(0x20010df8) 0x80 0x01 0x01 0x8f 0x00 0x00 0x00 0x00 0x00 0xff 0xff 0xff 0x00 0x00 0x00 0x00
No IMU reads yet..... buf is correct, but buf should be an ID byte, and buf[10, 11] should be zero.
2016.07.02: First IMU read!
[3:25 PM] It's almost done.
-----SPIBusOp 0x20010d28 (RX)------------
xfer_state  COMPLETE
err         NONE
param_len   4
params      0x80 0x01 0x01 0x8f
buf_len     5
buf *(0x20010c78)  0x00 0xff 0xff 0xff 0x68
That's the main PCB gyro/acc.
-----SPIBusOp 0x20010da8 (RX)------------
xfer_state  COMPLETE
err         NONE
param_len   4
params      0x91 0x01 0x01 0x8f
buf_len     5
buf *(0x20010c78)  0x00 0xff 0xff 0xff 0x3d
And there is the magnetometer!
0x68 and 0x3d are their WHO_AM_I bytes.
Now I have to figure out some SPI peripheral stuff to cut the 4 bytes of slack leading up to the data.
-----SPIBusOp 0x20010d48 (RX)------------
xfer_state  COMPLETE
err         NONE
param_len   4
params      0x80 0x01 0x01 0x8f
buf_len     1
buf *(0x20010c78)  0x68
[4:49 PM] That's more like it.... That was: Do a read request at the first IMU address (0x80). The transfer should be 0x01 byte long, replicated across 0x01 sensors with a sensor register address of 0x8F.
2016.07.03: Fine polish
[3:39 PM] IRQ subsystem is validated. I am still fighting with spanned ops.
But I can read/write from IMUs now.
Just not read large swaths at once.
I think I have one more pipeline bug to work out...
[6:35 PM] CPLD fully validates. Have not yet tested the frequency limits. But it will be high enough, I'm sure.
-----SPIBusOp 0x20010d3c (RX)------------
xfer_state  COMPLETE
err         NONE
param_len   4
params      0x80 0x01 0x22 0x8f
buf_len     34
buf *(0x20010c78)  0x68 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x3d 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
Not all the bytes are shown, because I over-ran my terminal width. But they just repeat 0xff until the end because there are no digits attached.
All the 0xFF represent gaps in the sensor package.
0x68 (position 0) and 0x3d (position 17) are the main PCB's IMU. Read both in a single bus op.
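Those positions line up with the sensor addresses seen in the earlier single reads: the gyro/acc answered at 0x80 and the magnetometer at 0x91, and 0x91 - 0x80 = 17. Assuming that relationship holds in general (it is an inference from these two data points, not a documented rule), the offset of any sensor's data in a spanned read could be computed as in this sketch. The function name and the -1 convention are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Inferred layout: each sensor in the span contributes size_per_sensor bytes,
// in address order, starting from the initial sensor address.
// Returns the byte offset of the target sensor's data within the receive buffer,
// or -1 if the target is not covered by the span.
int spannedOffset(uint8_t initial_sensor, uint8_t size_per_sensor,
                  uint8_t sensor_count, uint8_t target_sensor) {
  assert(size_per_sensor != 0);  // A zero-length per-sensor read makes no sense.
  if (target_sensor < initial_sensor) {
    return -1;                   // Target sits before the start of the span.
  }
  const int index = target_sensor - initial_sensor;
  if (index >= sensor_count) {
    return -1;                   // Target sits past the end of the span.
  }
  return index * size_per_sensor;
}
```

For the bus op shown above (initial sensor 0x80, one byte per sensor, 0x22 sensors), a target of 0x80 maps to offset 0 and a target of 0x91 maps to offset 17, matching the dump.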
-- SPI2 (online) --------------------
-- Buffer index:  20
-- IRQ buffer:    1
-- IRQ service:   disabled
-- _irq_data_0:   00000000000000000018
-- _irq_data_1:   00000000000000000008
Those are the double-buffer IRQ reads. I toggled a bit by writing the config register to simulate an IRQ signal changing. I have verified that this also works with all signals, and that the signals are being faithfully imported from the digit interface.
I cut some critical slack that was causing Quartus to optimize improperly, and that solved the last of the IRQ aggregator problems.
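One thing a double-buffered read makes easy is change detection: XOR the two buffers, and every set bit in the result marks a signal that differed between the two reads. The sketch below assumes 10-byte buffers (each dump shows 20 hex digits) and uses a hypothetical function name. It illustrates the general technique and is not a copy of how the firmware handles these buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed buffer size, based on the 20 hex digits in each IRQ dump.
constexpr std::size_t IRQ_BUF_LEN = 10;

// XORs two IRQ snapshots into diff and returns how many signal bits changed.
// All three pointers must reference at least len bytes.
int irqDiff(const uint8_t* a, const uint8_t* b, uint8_t* diff, std::size_t len) {
  assert(a != nullptr && b != nullptr && diff != nullptr);
  int changed = 0;
  for (std::size_t i = 0; i < len; i++) {
    diff[i] = static_cast<uint8_t>(a[i] ^ b[i]);
    // Count set bits by repeatedly clearing the lowest one.
    for (uint8_t m = diff[i]; m != 0; m = static_cast<uint8_t>(m & (m - 1))) {
      changed++;
    }
  }
  return changed;
}
```

Fed the two snapshots above (last bytes 0x18 and 0x08), this reports a single changed bit, with 0x10 in the final byte of the diff.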
The protocol is one of simple framing.
<Initial-Sensor> <Size-per-sensor> <Sensor-count> <Sensor-register>
...is basically the frame. 4 bytes.
I was able to save ~26 flip-flops by cutting the two middle params down to 6-bit. Valid parameters can never exceed those bounds, and supporting them was a waste of precious area.
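The 4-byte frame and the 6-bit limit on its two middle fields can be sketched from the host side as follows. The function name and the choice to reject oversized values outright are assumptions for illustration. I am also assuming that each field still travels as a full byte whose upper two bits must be zero, which matches the dumps above but is not spelled out anywhere.

```cpp
#include <cassert>
#include <cstdint>

// Largest value that fits in a 6-bit field.
constexpr uint8_t MAX_6BIT = 0x3F;

// Fills out[0..3] with <Initial-Sensor> <Size-per-sensor> <Sensor-count> <Sensor-register>.
// Returns false (leaving out untouched) if either middle field exceeds 6 bits.
bool buildFrame(uint8_t initial_sensor, uint8_t size_per_sensor,
                uint8_t sensor_count, uint8_t sensor_register, uint8_t* out) {
  assert(out != nullptr);
  if (size_per_sensor > MAX_6BIT || sensor_count > MAX_6BIT) {
    return false;
  }
  out[0] = initial_sensor;
  out[1] = size_per_sensor;
  out[2] = sensor_count;
  out[3] = sensor_register;
  return true;
}
```

Calling buildFrame(0x80, 0x01, 0x01, 0x8f, frame) reproduces the params of the single WHO_AM_I read shown earlier, and a sensor count of 0x22 reproduces the spanned read.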
Total logic elements   550 / 570 ( 96 % )
Total pins             76 / 76 ( 100 % )
UFM blocks             1 / 1 ( 100 % )
We still have 20 flip-flops to spend. The CPLD has exactly enough space left over to count to 1 mebi (2^20 = 1,048,576). Which isn't that much.
Going to add some previously-stripped features, increment to CPLD r9, re-burn, validate, and then call it done.
What I wouldn't have given to have these features in the prototype glove... This thing is going to kick so much ass...
[11:16 PM] And that, gentlemen, concludes the development of the CPLD. We are at r9. Committing...
Revision Name          DigitabulumCPLD
Top-level Entity Name  r1-CPLD
Device                 EPM570F100C5
Timing Models          Final
Total logic elements   553 / 570 ( 97 % )
Total pins             76 / 76 ( 100 % )
UFM blocks             1 / 1 ( 100 % )
CPLD debugging setup
That's the main board with an attached battery on the left. JTAG pod, O-scope and logic probes hooked up to the breadboard on the right.