Saturday, August 13, 2011

Shades of Sanity

Eureka!
SPI is a godsend. Everything's forgiven!

1) 8 + 1 colors at 18.868 ms per frame, about 53 frames/sec.

The new code is approximately 2.5x faster, which may not sound like a lot, but that's what separates a boring or flickering display from a nice, crisp multicolor display. The main time thief is the "if ( *f++ >= lx)" part: the code runs at about 128 frames/sec with a fixed value instead of "b", so I figure it will be possible to improve the code further. For reference, old Williams games used 4+1 shades, while newer Stern games use 16 shades. Considering the hardware used, I'm quite satisfied with the current performance.
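
To make the shading scheme concrete, here is a minimal sketch in plain C of what the "if ( *f++ >= lx)" test does across the passes. It assumes DISPLAY_MAX_BRIGHTNESS is 8 (the "8 + 1 colors" above); MAX_BRIGHTNESS and lit_passes are hypothetical names used only for illustration and are not part of the display code.

```c
#include <stdint.h>

#define MAX_BRIGHTNESS 8  /* assumption: same value as DISPLAY_MAX_BRIGHTNESS */

/* Count the passes in which a pixel with the given brightness is lit.
   Pass lx (1..MAX_BRIGHTNESS) lights the pixel when value >= lx. */
int lit_passes(uint8_t value)
{
    int lit = 0;
    for (uint8_t lx = 1; lx < MAX_BRIGHTNESS + 1; lx++) {
        if (value >= lx) {
            lit++;
        }
    }
    return lit;
}
```

A pixel with value 3 is lit in passes 1 to 3 and dark in passes 4 to 8, so its on-time is 3/8 of a frame. Value 0 is never lit, which is the "+ 1" shade.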


Feel free to look over and possibly improve the code:

void writeDisplay()
{
    /*
    Write all shades in one go (i.e. around 2.3 ms per color).
    */

    register byte x, y, lx, b;
    register byte *f;

    cli(); // Disable interrupts

    for (lx = 1; lx < DISPLAY_MAX_BRIGHTNESS + 1; lx++) // repeat for each color/shade
    {
        f = &frame[0][0];


        // row 1 -------------
        bitClear(PORTF, 2);          // set 3rd bit (bit 2) of PORTF to LOW
        for (x = 0; x < 16; x++)     // send full 128-bit row using SPI
        {
            if ( *f++ >= lx) b  = B10000000; else b = 0;
            if ( *f++ >= lx) b |= B01000000;
            if ( *f++ >= lx) b |= B00100000;
            if ( *f++ >= lx) b |= B00010000;
            if ( *f++ >= lx) b |= B00001000;
            if ( *f++ >= lx) b |= B00000100;
            if ( *f++ >= lx) b |= B00000010;
            if ( *f++ >= lx) b |= B00000001;

            SPI.transfer(b);
        }
        bitSet(PORTF, 2);            // set 3rd bit (bit 2) of PORTF to HIGH (column latch)
        bitSet(PORTF, 4);            // set 5th bit of PORTF to HIGH (mark first row)
        bitClear(PORTF, 5);          // set 6th bit of PORTF to LOW
        bitClear(PORTF, 3);          // set 4th bit of PORTF to LOW
        bitSet(PORTF, 3);            // set 4th bit of PORTF to HIGH
        bitSet(PORTF, 5);            // set 6th bit of PORTF to HIGH


        // rows 2-31 ------------
        for (y = 0; y < DISPLAY_MAX_ROWS - 2; y++)
        {
            bitClear(PORTF, 2);          // set 3rd bit (bit 2) of PORTF to LOW
            for (x = 0; x < 16; x++)     // send full 128-bit row using SPI
            {
                if ( *f++ >= lx) b  = B10000000; else b = 0;
                if ( *f++ >= lx) b |= B01000000;
                if ( *f++ >= lx) b |= B00100000;
                if ( *f++ >= lx) b |= B00010000;
                if ( *f++ >= lx) b |= B00001000;
                if ( *f++ >= lx) b |= B00000100;
                if ( *f++ >= lx) b |= B00000010;
                if ( *f++ >= lx) b |= B00000001;

                SPI.transfer(b);
            }
            bitSet(PORTF, 2);            // set 3rd bit (bit 2) of PORTF to HIGH (column latch)
            bitClear(PORTF, 4);          // set 5th bit of PORTF to LOW (the rest of the rows are not leading)
            bitClear(PORTF, 5);          // set 6th bit of PORTF to LOW
            bitClear(PORTF, 3);          // set 4th bit of PORTF to LOW
            bitSet(PORTF, 3);            // set 4th bit of PORTF to HIGH
            bitSet(PORTF, 5);            // set 6th bit of PORTF to HIGH
        }
 
 
        // row 32 ---------------
        bitClear(PORTF, 2);          // set 3rd bit (bit 2) of PORTF to LOW
        for (x = 0; x < 15; x++)     // send the first 15 bytes (120 bits) of the row using SPI
        {
            if ( *f++ >= lx) b  = B10000000; else b = 0;
            if ( *f++ >= lx) b |= B01000000;
            if ( *f++ >= lx) b |= B00100000;
            if ( *f++ >= lx) b |= B00010000;
            if ( *f++ >= lx) b |= B00001000;
            if ( *f++ >= lx) b |= B00000100;
            if ( *f++ >= lx) b |= B00000010;
            if ( *f++ >= lx) b |= B00000001;
            SPI.transfer(b);
        }
        // last byte of the frame: the final pixel is read without advancing f
        if ( *f++ >= lx) b  = B10000000; else b = 0;
        if ( *f++ >= lx) b |= B01000000;
        if ( *f++ >= lx) b |= B00100000;
        if ( *f++ >= lx) b |= B00010000;
        if ( *f++ >= lx) b |= B00001000;
        if ( *f++ >= lx) b |= B00000100;
        if ( *f++ >= lx) b |= B00000010;
        if ( *f   >= lx) b |= B00000001;
        SPI.transfer(b);
        bitSet(PORTF, 2);            // set 3rd bit (bit 2) of PORTF to HIGH (column latch)
        bitClear(PORTF, 4);          // set 5th bit of PORTF to LOW (the rest of the rows are not leading)
        bitClear(PORTF, 5);          // set 6th bit of PORTF to LOW
        bitClear(PORTF, 3);          // set 4th bit of PORTF to LOW
        bitSet(PORTF, 3);            // set 4th bit of PORTF to HIGH
        bitSet(PORTF, 5);            // set 6th bit of PORTF to HIGH
    } // End of color/shade

    sei(); // Enable interrupts

    bitClear(PORTF, 5);          // set 6th bit of PORTF to LOW
}
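
The eight-line "if" block appears four times in writeDisplay(). One possible cleanup, sketched here in plain C under the assumption that each row is 128 one-byte pixels, is to move the packing into a helper. pack8 and pack_row are hypothetical names and not part of the code above. A loop like this may well be slower on the MCU than the hand-unrolled version, so it would need to be timed before replacing anything.

```c
#include <stdint.h>

/* Pack eight brightness values into one byte; the first pixel ends up in bit 7.
   A bit is set when the pixel's brightness is at least `level`. */
static uint8_t pack8(const uint8_t *px, uint8_t level)
{
    uint8_t b = 0;
    for (int i = 0; i < 8; i++) {
        b = (uint8_t)((b << 1) | (px[i] >= level));
    }
    return b;
}

/* Pack one 128-pixel row into the 16 bytes that would be sent over SPI. */
void pack_row(const uint8_t *row, uint8_t level, uint8_t out[16])
{
    for (int x = 0; x < 16; x++) {
        out[x] = pack8(row + 8 * x, level);
    }
}
```

With a helper like this, each row section would reduce to a loop calling SPI.transfer(pack8(f, lx)) and advancing f by 8, followed by the PORTF latch sequence.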



7 comments:

  1. It looks like you are effectively doing the fairly costly bitmap-to-planar conversion on the fly. Is there any reason you can't convert your graphics to planar format in advance? This would allow you to copy data directly from your buffer to the SPI peripheral without all the bit shuffling. I'm working on a similar project (https://sites.google.com/site/tristansideas/electronics/pinball-display-clock/) using planar graphics and DMA to copy directly to SPI.

  2. Hi there!

    Yes, I realized this as well. In the new version I made a separate buffer that gets incrementally built and then drawn "no questions asked". It also uses the double buffer concept where it keeps drawing one frame until the next is finished and then gets swapped. The actual drawing is now around 950 ns per color with the buffer being built separately.

    But I ran out of RAM. :)
    So the next step is ordering a beefier MCU (Chipkit Max32) that is almost out-of-the-box compatible with Arduino. This allows me to do multitasking as well, with the display and lights running on a separate MCU.

    How would you solve different colors inside one object with your version? I figure you would have to create different layers for each color and object?
    The main benefit with my version is that it allows me to use a single input buffer with lots of color and/or additional data in the spare bits of the image.

  3. It should say: "The actual drawing is now around 950 us per color with the buffer being built separately.".

    950 ns, that would be sweet! :D

  4. It seems you are making great progress. I'll be watching with interest.

    For the clock I plan to have the images consist of a series of bit-planes. So for a 16 level gray-scale image there would be 4 planes, each of 16384 bits. At this stage I'm not sure if I will shoot for 16 levels, 4 may be enough.

    The downside of bit-planes is that horizontal scrolling of pixels requires bit shifts instead of the byte copies you can do with chunky pixels, but at least you are effectively moving multiple pixels in each plane per shift.

    If I end up using sprites I also plan to pre-calculate the horizontal shifted positions of the sprites so it doesn't have to be done at runtime.

    Hopefully the use of planes also means one plane can be manipulated while another is drawn. Time will tell just how much I can get away with.

    I'm interested in where you source the images you are using to test the display. The plan for the clock is to use an external SPI Flash ROM or SD card to store animations. It should be possible to use DMA to copy animations from the Flash or SD directly to the display.

  5. At the moment I'm using either hardcoded arrays or my own text-to-sprite function. It's not beautiful, but it works well enough for testing. :)

    The free graphics program GIMP can export graphics files as C header files, so I might use that in the future. Or I'll simply write my own function (on a standalone computer) to create the arrays the way I want them.

    Since I've begun migrating to a Chipkit Max32, I've probably got more than enough flash memory to store all animations without an external flash drive. In theory, at least...

    It would be really sweet, however, to read directly from flash to display. Let me know if you make any progress with this! :)

  6. Great Project!

    I love pinball dot matrix displays and have worked on many projects controlling them. I usually go down the FPGA route for speed!!! But I recently got it working with an LPC2103 microcontroller.

    Keep up the good work!!!!

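The pre-built buffer idea discussed in the comments above can be sketched roughly as follows. This is only an illustration in plain C, assuming a 128 x 32 frame of one-byte pixels and 8 shades; ROWS, COLS, SHADES, PLANE_BYTES and build_planes are hypothetical names, and the double-buffer swap is left out.

```c
#include <stdint.h>

#define ROWS 32                         /* assumption: 32 display rows */
#define COLS 128                        /* assumption: 128 pixels per row */
#define SHADES 8                        /* assumption: 8 shades plus "off" */
#define PLANE_BYTES (ROWS * COLS / 8)   /* 512 bytes per shade */

/* Convert a frame of ROWS * COLS brightness bytes into one packed bitmap per
   shade. planes[s] holds the bits for level s + 1, first pixel in bit 7, so
   drawing a shade becomes a plain copy of PLANE_BYTES bytes to SPI. */
void build_planes(const uint8_t *frame, uint8_t planes[SHADES][PLANE_BYTES])
{
    const uint8_t *f = frame;
    for (int i = 0; i < PLANE_BYTES; i++, f += 8) {
        for (int s = 0; s < SHADES; s++) {
            uint8_t level = (uint8_t)(s + 1);
            uint8_t b = 0;
            for (int k = 0; k < 8; k++) {
                b = (uint8_t)((b << 1) | (f[k] >= level));
            }
            planes[s][i] = b;
        }
    }
}
```

Under these assumptions one set of planes takes 8 x 512 = 4096 bytes, and a double-buffered pair takes 8192 bytes on top of the 4096-byte input frame, which gives a feel for how quickly the RAM goes.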