Jump to content
ppulle

Assembly Moving Average Algorithm Pic18f4550

Recommended Posts

Hi,

I'm looking for a bit of help to save some time re-inventing the wheel.

 

I'm sampling a number of sensors, three accelerometer, three magnetometer and two ADC channels. The ADC and Accelerometer channels give me unsigned int data and the magnetometer gives me signed ints. I have a PIC18F4550

 

I'd like to store these in a table, and do a moving average filter on them.

 

eg:

int aMagXArray[8]; //allocate arrays of 8 values each

//....

//Now sample the samples, putting them in variables like SensorMagX, SensorMagY etc

aMagXArray[nPos] = SensorMagX;

aMagYArray[nPos] = SensorMagY;

etc

nPos++;

nPos = nPos & 0x07; //ie keep nPos from 0..7

 

then report the moving average

eg:

nTot = 0;

for(i=0;i<8;i++)

nTot = nTot + aMagXArray;

nMagXAveragedValue = nTot / 8;

etc repeat for the other seven sensors

 

...a pretty standard moving average alogorithm, store the samples in a table with a rotating pointer and average the table when you want a value

 

Unfortunately all this array referencing is taking a big chunk of code space, which I don't have due to all the other code in my PIC. Using pointers isn't much help either.

 

So I'd like to re-write this in assembly. It's been a while since I've played with indirect addressing, hence this post.

 

Is there a 18F4550 assembly algorithm that will sum integers (unsigned or signed) that are sitting in an array? The sum can then be divided to give the moving average.

 

Presumably I'd have to allocate the arrays in a single page to simplify things. Does the Sourceboost compiler automatically ensure that arrays are allocated in one page?

 

Thanks for any help.

Phil

Share this post


Link to post
Share on other sites

Hi Phil,

 

Arrays will be put where the linker thinks there is space. If you assign a fixed address to your arrays then you can put them into the same page. Any variables you use to manipulate these arrays, fix them into the same page as the arrays to reduce some page switching code.

 

Instead of treating your arrays as circular buffers and testing for pointer wraparound, consider using the arrays as a stack. Use something like memmove or memcpy (or assembler equiv) to shift everything up an int then insert your new sample into the first array location only.

 

Declare unsigned int arrays where you are storing unsigned ints to make adding up a bit quicker for these data types.

 

When dividing by 8, ensure the compiler is not calling a 16 bit division routine but is simply doing 3 shifts to the right. If it is then do the shifting yourself. Same for the signed ints.

 

Get out that assembler manual and see what you can do.

 

Cheers

 

Reynard

Share this post


Link to post
Share on other sites

Hi Reynard,

Thanks for the response. I've been fiddling around getting my headspace back into assembler mode.

 

Firstly, I'm not sure there is a memmove opcode for the PIC18F4450....I'd love to be wrong about this, in the meantime I'll stick with the circular buffer technique.

 

The shift right for divisors of multiples of 2 I'm already up on.....in fact if you look at the opcode created by sourceboost it'll optimise divisions by doing multiple shifts anyway. Not sure about right shifting signed ints....I think you need to preserve the high order bit. In any case for 2 byte ints with 18F4450 you need to do two byte shifts, making sure on the high byte you clear the carry, rotate right through carry so the low order bit gets saved in the carry then rotate right through carry on the low order byte (with the saved carry).

 

The indirect addressing vs paging isn't an issue I've found out by reading the datasheet. I'll assign the address of the array to one of the FSRx pointers and use the POSTINCx register to access the elements. From the datasheet the INDFx, POSTINCx, PREINCx...etc registors all use 12 bit addressing......so paging isn't a problem.

 

Here's a trick I found when trying to make my code more efficient.

I originally had code simliar to:

nSensorX = read_sensor_adc(SENSOR_NUMBER) ;//which does an ADC read on the sensor specified and returns an unsigned int.

aSensorX[circular_buffer_pointer] = nSensorX;

or

aSensorX[circular_buffer_pointer] = read_sensor_adc(SENSOR_NUMBER); ///where aSensorX is an array

all well and good...however with lots of sensors you end up with lots of code being generated the does array handling....bloating code out as each array access gets handled seperately.

 

I found it more efficient (though less readable) to re-write my sensor reading function to use a pointer

 

void read_sensor_adc(unsigned char nSensorNum, unsigned int *dest);

....

read_sensor_adc(SENSOR_NUMBER, aSensorX+circular_buffer_pointer);

 

this way code is only generated for array/pointer access once in the function and doesn't need to be repeated for each sensor. Only useful if you are going to put your values in an array.

 

 

Phil

Share this post


Link to post
Share on other sites

Hi Phil,

 

You seem to be on right track for getting that code efficiency up.

 

Memmove and memcpy are standard library functions and not op-codes. Writing your own version for your application would probably be more efficient and quicker.

 

To shift signed ints just use a C statement. The code produced will probably be as good as you can write and takes care of the signd and carrys.

 

The compiler may on occasion use FSR0 inside build in and library functions so keep an eye on that as called functions may not preserve it. The interrupt preserves FSR0.

 

When using FOR loops, it is sometime faster to use a decrementing iteration counter. Testing for zero can be quicker than doing a compare with a constant or variable.

 

Good luck squeezing out that last byte and microsecond.

 

Cheers

 

Reynard

Share this post


Link to post
Share on other sites
Hi,

I'm looking for a bit of help to save some time re-inventing the wheel.

 

I'm sampling a number of sensors, three accelerometer, three magnetometer and two ADC channels. The ADC and Accelerometer channels give me unsigned int data and the magnetometer gives me signed ints. I have a PIC18F4550

 

I'd like to store these in a table, and do a moving average filter on them.

 

eg:

int aMagXArray[8]; //allocate arrays of 8 values each

//....

//Now sample the samples, putting them in variables like SensorMagX, SensorMagY etc

aMagXArray[nPos] = SensorMagX;

aMagYArray[nPos] = SensorMagY;

etc

nPos++;

nPos = nPos & 0x07; //ie keep nPos from 0..7

 

then report the moving average

eg:

nTot = 0;

for(i=0;i<8;i++)

nTot = nTot + aMagXArray;

nMagXAveragedValue = nTot / 8;

etc repeat for the other seven sensors

 

I wouldn't recompute the total each time. I'd do the following:

 

/* Zero nTotMagX and xMagArray at program initialization */

 

nTotMagX -= aMagXArray[nPos]; /* remove oldest value from sum */

aMagXArray[nPos] = SensorMagX; /* save new value */

nTotMagX += SensorMagX; /* add new value to sum */

nPos = (nPos + 1) & 7;

nMagXAveragedValue = nTotMagX / 8;

 

This does assume you have enough RAM for the extra global variables, nTotMagX (and nTotMagY).

 

If you wanted to use a pointer instead of nPos to step through the array:

 

/* Program initialization */

int *pOldestMagX = &aMagXArray[0];

 

//...

 

nTotMagX -= *pOldestMagX; /* remove oldest value from sum */

*pOldestMagX = SensorMagX; /* save new value */

nTotMagX += SensorMagX; /* add new value to sum */

 

if ( ++pOldestMagX == &aMagXArray[8] )

{

pOldestMagX = &aMagXArray[0];

}

 

 

I don't know which would be better. You'd just have to look at what the compiler produced in each case.

 

Orin.

Share this post


Link to post
Share on other sites

What about this code? It generates code (almost) as small as assembly:

 

#include <system.h>

unsigned char samples[8];
unsigned char count = 0;
unsigned short sum = 0;

void add_sample( unsigned char val )
{
//Replace oldest sample and update total
sum -= samples[count];
//Add new value (code below is equal to 'samples[count] = val')
asm
{
	movf _val, W
	movwf _indf
}
sum += val;
//Advance counter
if( ++count == sizeof(samples) )
	count = 0;
}

inline void get_average( unsigned char &average )
{
average = sum/sizeof(samples);
}

void main()
{
unsigned char average, i = 0;

//test
while( 1 )
{
	add_sample( ++i );
	get_average( average );
}
}

Regards,

Pavel

Share this post


Link to post
Share on other sites

Thanks for the responses....the idea of keeping a running track of the total sum, and removing it before adding the new value is great...saves needing code to sum over the array, plus saves time....(got plenty of RAM left, very tight on ROM). Does new value insertion and summation in one step as well.

 

Thanks for the input Pavel, that's an interesting approach....however I'm wondering how future proof it will be. It assumes the FSR register is maintained after the array access...also you don't specify which indf register to use, indf0, indf1 or indf2 (for PIC18F4550). No biggie, could just add some extra code to explicitly load the FSR.

 

eg

unsigned int *ptrSamplesEnd;

unsigned int aSamples[8];

unsigned int *ptrSamplePos;

unsigned int nTotalSamples;

 

ptrSamplesEnd = &aSamples[8]; //point to one value over the end of the Samples array

 

 

nTotalSamples -= *ptrSamplePos; //remove the old sample

*ptrSamplePos++ = nSample; //add the new one

if (ptrSamplePos = ptrSampleEnd) ptrSamplePos = aSamples;

 

or in asm (note I haven't checked this....just coding off my head here, hope I got my high/low order bytes right)

void add_sample(unsigned int nSample)

{

asm

{

movf _ptrSamplePos+1,W

movwf _fsr0h

movf _ptrSamplePos,W

movwf _fsr0l ;set up the FSR

 

movf _postinc0,W

subwf _nTotalSamples

movf _postdec0,W

subwfb _nTotalSamples+1 ;do int subtraction, subtract first low byte, then high byte with borrow.

;Use postinc/postdev to get fsr back to the array element position we want

 

movf _nSample,W

movwf _postinc0

movf _nSample+1,W

movwf _indf0 ;we could do the comparison below on fsr0h and fsr0l using another postinc....but we'd have to load it back to ptrSamplePos anyway, might as well increment ptrSamplePos directly

 

infsnz _ptrSamplePos,F ;increment low byte of ptrSamplePos

incf _ptrSamplePos+1 ;and increment the high byte if the low byte went 0xff->0x00

 

movf _ptrSamplePos+1,W ;now see if we've pointed off the end of the

cpfseq _ptrSamplesEnd+1

bra finish_add ;not equal so ptrSamplePos+1<>ptrSamplesEnd+1

movf _ptrSamplesPos,W ;if we're here, high bytes were equal...so we check low bytes

cpfseq _ptrSamplesEnd

bra finish_add ;no so we finish

 

movf _aSamples+1,W ;ptrSamplePos high and low were equal to ptrSamplesEnd (high and low)...so we reset it to start of array.

movwf _ptrSamplesPos+1

movf _aSamples+1,W

movwf _ptrSamplesPos

 

finish_add:

}

//now nTotalSamples is the sum of all ints in the array aSamples and we can do our average correctly.

}

 

 

question? could we have used the movff opcode in some places above

eg

movff _aSamples+1,_ptrSamplesPos+1

movff _aSamples, ptrSamplesPos

would still take four cycles, but might be clearer than fiddling around with the W register.

Share this post


Link to post
Share on other sites
...Thanks for the input Pavel, that's an interesting approach....however I'm wondering how future proof it will be. It assumes the FSR register is maintained after the array access...also you don't specify which indf register to use, indf0, indf1 or indf2 (for PIC18F4550). No biggie, could just add some extra code to explicitly load the FSR...

 

For code like:

 

sum -= samples[count];
samples[count] = val;

 

compiler should detect that pointer registers (like FSR) get initialised using the same pointer and optimise out code in the second statement. Altually BoostC already does this but for some reasons it doesn't do it for this particular case. Will investigate and fix. So the answer to your question is that in future BoostC releases there will be no need to replace 'samples[count] = val' with assembly. Compiler will generate optimised code.

 

Regards,

Pavel

Share this post


Link to post
Share on other sites

Thanks for looking into that Pavel, I've got enough to go on with....now I've got to incoorporate this into a single routine that will take a set of eight samples, making this an eight tap finite impulse response filter with two byte precision.

Share this post


Link to post
Share on other sites

Hi,

 

I've got a nice little asm code adding sensors, performing array totals etc and doing everything up to the dividing by 8 to take the average.

 

Rather than figure it out for myself, I thought I'd just re-use the code that sourceboost generates to do a divide by 8 on a signed int.

 

However the results were not as expected.

 

eg, compile and step through this in the debugger

 

int k;

int z;

k = 0;

z = 0;

while(1)

{

k = k - 1;

z = k / 8;

}

 

for different values of k, I get the following results for z

k=-1,z=-1

k=-2,z=-1

k=-3,z=-1

k=-4,z=-1

k=-5,z=-1

k=-6,z=-1

k=-7,z=-1

k=-8,z=-1

k=-9,z=-2

etc

k=-16,z=-2

k=-17,z=-3

etc

etc

 

Am I missing something here.....it looks like the optimisation to do a divisor that is a multiple of 2 isn't handling signed ints very well. I would assume for k=-1...-7 -> k/8=0 , k=-8...-15 -> k/8=-1, k=-16...-24 ->k/8=-2 etc

 

Does anyone know asm code sequence that will do a signed int division by eight properly? It works for k>0...I don't want to add extra code to check the sign and artificially add one to correct the division.

 

Phil

Share this post


Link to post
Share on other sites

Hi Phil,

 

Dividing signed ints by powers of 2 requires special treatment and fudge factors because of rounding problems.

 

I suggest converting the negative int to a positive value then dividing by 8 then make negative again.

 

BoostC is not the only compiler that gets this one wrong for signed variables.

 

Cheers

 

Reynard

Share this post


Link to post
Share on other sites

Thanks for the heads up on how compilers deal with signed ints.......it's sure to be a trap for the unwary. I was hoping there'd be a generic bit fiddling algorithm that'd handle that sort of thing.

 

Anyway I've taken everyones advice, subtracting the old history value before adding the new one to quickly calculate the totals as well as converting negative ints to positive (and back again) before doing the division. Thanks to all who've helped.

 

Below is my contribution to the sourceboost world.....a 12 bit precision, signed, eight channel, eight tap finite impulse response filter....aka moving average on eight inputs.

 

Any comments on stripping more bytes from the code would be appreciated, I'm sure there are more efficient ways of doing some things like converting negative to positive numbers, dividing and adding 16bit ints etc.

 

In any case it compiles to 164 bytes, which is fine for my application, and a lot less than I thought it would be.

 

Phil

 

 

 

#define SENSOR_CHANNELS 8
#define TAP_COUNT 8
#define NEXT_ROW_OFF 16
#define TAP_MASK 0eh
int aSamples[SENSOR_CHANNELS * TAP_COUNT ]@100;
int aSensors[SENSOR_CHANNELS]@0x20;
int aTotals[SENSOR_CHANNELS]@0x40;
unsigned char nSamplePos;		


//FUNCTION to implement an eight channel eight tap finite impulse response filter (aka Moving average filter)
//eight separate sensor inputs can be averaged
//INPUTS:
//	aSensors := array of eight int sensor inputs (will be modified after the function)
//OUTPUTS:
//  aSensors := averaged sensor inputs based on previous inputs
//USES:
//  aSamples := array of histories of each sensor 
//  aTotals := array of total values of the histories, used in calculating the averages
//	nSamplePos := byte position in the history array, nb we keep it at 0,2,4..14 not element (eg 0,1,2,3)
//RESTRICTIONS:
//	TAP_COUNT must be a multiple of 2
//  ints are used, so precision is 16bits - sign bit - divisor bit count, eg 16-1-3 = 12 bits for 8 tap
//  so this code can only be used for values +-4096 max
//  need to manually modify division section to do division if 8 tap is not used
//  TAP_MASK should be defined as SENSOR_CHANNELS*2-1
void av_samples(void)
{
unsigned char nChannel;

asm 
{

	//Setup the indirect registers
	movlw	LOW(_aSamples)		;fsr0 := aSamples + nSamplePos
	movwf	_fsr0l
	movlw	HIGH(_aSamples)
	movwf	_fsr0h		
	movf	_nSamplePos,W
	addwf	_fsr0l,F			;add the nSamplePos to get to old history sample
	btfsc	_status,C
	incf	_fsr0h,F

	movlw	LOW(_aTotals)			;fsr1 := aTotals
	movwf	_fsr1l
	movlw	HIGH(_aTotals)
	movwf	_fsr1h	

	movlw	LOW(_aSensors)			;fsr2 := aSensors
	movwf	_fsr2l
	movlw	HIGH(_aSensors)
	movwf	_fsr2h

	movlw	SENSOR_CHANNELS				;we are doing eight channels
	movwf	_nChannel				

AV_SAMP_LOOP:
	movf	_postinc0,W			;get old sample value, low byte
	subwf	_postinc1,F			;and subtract from total, low byte
	movf	_postdec0,W			;ie Total := Total - OldSample
	subwfb	_postdec1,F

	movf	_postinc2,W			;Insert new sample into history array
	movwf	_postinc0
	movf	_postdec2,W
	movwf	_postdec0			;aSamples[nSamplePos/2,nChannel] := aSensor[nChannel]

	movf	_postinc2,W
	addwf	_postinc1,F			;add new sensor value to totals
	movf	_postdec2,W			;nTotal := Total + NewSample
	addwfc	_postdec1,F

	movf	_postinc1,W
	movwf	_postinc2
	movf	_postdec1,W			;gee it'd be nice if microchip implemented a predecX operand
	movwf	_indf2				;move total to samples array

	btfss	_postdec2,7			;check if value is negative
	bra		AV_SAMPLES_SHIFT

	comf	_postinc2,F
	comf	_postdec2,F			;invert value
	movlw	1
	addwf	_postinc2,F
	clrf	_wreg,F
	addwfc	_postdec2,F

AV_SAMPLES_SHIFT:
	bcf		_status,C			;now everything is positive so we can divide by eight
	rrcf	_preinc2,F
	movf	_postdec2,W
	rrcf	_indf2,F			;divide by 2

	bcf		_status,C			;divide by 2
	rrcf	_preinc2,F
	movf	_postdec2,W
	rrcf	_indf2,F

	bcf		_status,C
	rrcf	_preinc2,F
	movf	_postdec2,W
	rrcf	_indf2,F			;divide by 2 for a total division of 8

	movf	_postinc1,W
	btfss	_postdec1,7			;check if total was negative again
	bra		AV_SAMPLES_NEXT

	comf	_postinc2,F			;invert back to negative if the original total was negative
	comf	_postdec2,F
	movlw	1
	addwf	_postinc2,F
	clrf	_wreg,F
	addwfc	_postdec2,F


AV_SAMPLES_NEXT:
	movlw	2
	addwf	_fsr1l,F			;move the aTotals pointer along to next int
	movlw	0
	addwfc	_fsr1h,F
	movlw	2
	addwf	_fsr2l,F			;move the aSensors pointer along to next int
	movlw	0
	addwfc	_fsr2h,F

	movlw	NEXT_ROW_OFF
	addwf	_fsr0l,F
	movlw	0
	addwfc	_fsr0h,F			;move to the next row of samples for next channel

	decfsz	_nChannel,F
	bra		AV_SAMP_LOOP		;do next channel

	movlw	TAP_MASK
	incf	_nSamplePos,F
	incf	_nSamplePos,F
	andwf	_nSamplePos,F		;increment and mask nSamplePos to do circular buffer

}
}
void main()
{
unsigned char i;
int k;	//value we're putting in the sensors array. Note when it goes above/below +-4096 the code explodes
int n;	//a counter to see how many passes we've done

k = 0;
n = 0;
while (1)
{	
	for(i=0;i<SENSOR_CHANNELS;i++)
		aSensors[i] = k;
	k = k - 1;
	n = n + 1;
	av_samples();

}
}

Share this post


Link to post
Share on other sites

Hi Phil,

 

Good to see that you have found a solution. I do a similar thing using an X/Y accelerometer on its side as an inclinometer for my field archery. I use the stack method I mentioned earlier. Time and code space is not a problem for me. I use a lookup table to convert the average X channel in to degrees then put it onto a small 2 digit LED display. It is all build into a long plastic box (120x30x30mm) with a hole at each end to look through and pick out my target. The display is inside the box under the front hole so I can get an immediate reading while still looking at the target. I get +/-63 degrees which covers all the courses I shoot on.

 

It may be an idea if the compiler/preprocessor looked at the data types it is dealing with and calling up a real signed division routine instead of optimizing them into shifts for divisors of powers of 2.

 

Cheers

 

Reynard

Share this post


Link to post
Share on other sites
Hi,

I'm looking for a bit of help to save some time re-inventing the wheel.

 

I'm sampling a number of sensors, three accelerometer, three magnetometer and two ADC channels. The ADC and Accelerometer channels give me unsigned int data and the magnetometer gives me signed ints. I have a PIC18F4550

 

I'd like to store these in a table, and do a moving average filter on them.

 

eg:

int aMagXArray[8]; //allocate arrays of 8 values each

//....

//Now sample the samples, putting them in variables like SensorMagX, SensorMagY etc

aMagXArray[nPos] = SensorMagX;

aMagYArray[nPos] = SensorMagY;

etc

nPos++;

nPos = nPos & 0x07; //ie keep nPos from 0..7

 

then report the moving average

eg:

nTot = 0;

for(i=0;i<8;i++)

nTot = nTot + aMagXArray;

nMagXAveragedValue = nTot / 8;

etc repeat for the other seven sensors

 

...a pretty standard moving average alogorithm, store the samples in a table with a rotating pointer and average the table when you want a value

 

Unfortunately all this array referencing is taking a big chunk of code space, which I don't have due to all the other code in my PIC. Using pointers isn't much help either.

 

So I'd like to re-write this in assembly. It's been a while since I've played with indirect addressing, hence this post.

 

Is there a 18F4550 assembly algorithm that will sum integers (unsigned or signed) that are sitting in an array? The sum can then be divided to give the moving average.

 

Presumably I'd have to allocate the arrays in a single page to simplify things. Does the Sourceboost compiler automatically ensure that arrays are allocated in one page?

 

Thanks for any help.

Phil

 

 

I have a different method and wonder what all of you think.

 

Take the old average and multiply by 7. Add the new value and divide by 8. This is your new average.

 

All reading ever taken contribute to the new average, the last one has a weight of 1/8 or .13, the eighth one back a weight of .04 the 16 th back a weight of .01.

 

For averaging over a tighter period multiply by 3 and divide by 4. You can figure out other values.

 

 

Use a power of 2 for the division so it will be done with ( hopefully ) a shift.

 

To multiply by 7 you can combine shifts and adds ( look at the compiled output to see how efficient ) if it is faster than what the compiler comes up with.

 

Now you have no array or pointers to deal with.

 

What do you think? Should we model this with a spread sheet and compare to other methods.

Share this post


Link to post
Share on other sites
I have a different method and wonder what all of you think.

 

Take the old average and multiply by 7. Add the new value and divide by 8. This is your new average.

 

All reading ever taken contribute to the new average, the last one has a weight of 1/8 or .13, the eighth one back a weight of .04 the 16 th back a weight of .01.

 

For averaging over a tighter period multiply by 3 and divide by 4. You can figure out other values.

 

 

Use a power of 2 for the division so it will be done with ( hopefully ) a shift.

 

To multiply by 7 you can combine shifts and adds ( look at the compiled output to see how efficient ) if it is faster than what the compiler comes up with.

 

Now you have no array or pointers to deal with.

 

What do you think? Should we model this with a spread sheet and compare to other methods.

Russ that's an interesting idea. My first thought (without any testing) is that you are accumulating the rounding error with each new running average. 8 times the old average will not give you the sum of the 8 samples that were used to calculate the old average. Consequently, 7 times won't give you the old sum less one average sample, either.

 

Further, as you pointed out, it's not the same running average even without the accumulated rounding loss. A sample 16 cycles back (or 9 or 50) will still have some weight with this model. Independent of his speed and memory requirements, it really depends on the application whether this method is better or worse (even if you ignore the rounding loss).

 

That's my .02 without really thinking about it too much.

 

-Henry

Share this post


Link to post
Share on other sites
I have a different method and wonder what all of you think.

 

Take the old average and multiply by 7. Add the new value and divide by 8. This is your new average.

 

All reading ever taken contribute to the new average, the last one has a weight of 1/8 or .13, the eighth one back a weight of .04 the 16 th back a weight of .01.

 

For averaging over a tighter period multiply by 3 and divide by 4. You can figure out other values.

 

 

Use a power of 2 for the division so it will be done with ( hopefully ) a shift.

 

To multiply by 7 you can combine shifts and adds ( look at the compiled output to see how efficient ) if it is faster than what the compiler comes up with.

 

Now you have no array or pointers to deal with.

 

What do you think? Should we model this with a spread sheet and compare to other methods.

Russ that's an interesting idea. My first thought (without any testing) is that you are accumulating the rounding error with each new running average. 8 times the old average will not give you the sum of the 8 samples that were used to calculate the old average. Consequently, 7 times won't give you the old sum less one average sample, either.

 

Further, as you pointed out, it's not the same running average even without the accumulated rounding loss. A sample 16 cycles back (or 9 or 50) will still have some weight with this model. Independent of his speed and memory requirements, it really depends on the application whether this method is better or worse (even if you ignore the rounding loss).

 

That's my .02 without really thinking about it too much.

 

-Henry

 

 

I did not think about the rouding, but I should have pointed out that at least the math should be done in enough precision so there would not be overflow problems. Perhaps some of the math could be at a factor of 2 up to keep an extra bit of precision around. That is of course if we think that the method is worth improving.

Share this post


Link to post
Share on other sites

Hi All,

 

I'll leave it to others to check out the method russ has come up with (I've got my solution and moved on to other stuff)...it does look interesting however.

 

I don't think rounding (in terms of just the multiplication) will be an issue if the numbers are limited to the precision of the accumulator (ie for 16bit signed limit the additional samples to 4096 ie 16-sign bit-divisor bits)....however the division gives me a bit of a worry, as there will nearly always be a remainder...that's probably the main area where loss of accuracy will occur. Note that we're doing an average and effectively throwing away bits to smooth things out anyway...so maybe its a case of try it and see.

 

My main concern is that a divide by seven operation (or divide by one less than the multiplication for other selections) will by it's nature never be a clean shifting operation, hence you'll need a progressive subtraction routine (ie division) added to the code...increasing the code length and processing time. Alternatively if you keep the division a multiple of 2 (eg 8) you'll need a progressive addition (ie multiplication) to do the +1....either way it's not an even multiply or divide. If the 8x8bit multiply opcode on the 18F4450 is used, maybe you can multiply by 9 using the opcode and divide by 8 using shifts and not call a separate routine....in which case you'd have a 9 tap filter....of course that only works with 8 bit numbers.

 

Phil

Share this post


Link to post
Share on other sites
Hi All,

 

I'll leave it to others to check out the method russ has come up with (I've got my solution and moved on to other stuff)...it does look interesting however.

 

I don't think rounding (in terms of just the multiplication) will be an issue if the numbers are limited to the precision of the accumulator (ie for 16bit signed limit the additional samples to 4096 ie 16-sign bit-divisor bits)....however the division gives me a bit of a worry, as there will nearly always be a remainder...that's probably the main area where loss of accuracy will occur. Note that we're doing an average and effectively throwing away bits to smooth things out anyway...so maybe its a case of try it and see.

 

My main concern is that a divide by seven operation (or divide by one less than the multiplication for other selections) will by it's nature never be a clean shifting operation, hence you'll need a progressive subtraction routine (ie division) added to the code...increasing the code length and processing time. Alternatively if you keep the division a multiple of 2 (eg 8) you'll need a progressive addition (ie multiplication) to do the +1....either way it's not an even multiply or divide. If the 8x8bit multiply opcode on the 18F4450 is used, maybe you can multiply by 9 using the opcode and divide by 8 using shifts and not call a separate routine....in which case you'd have a 9 tap filter....of course that only works with 8 bit numbers.

 

Phil

 

 

I have done some test with a spreadsheet. Round off error is indeed a problem. Scaling the values up by a couple of powers of 2 may solve this problem. Most of the math is 16 bit and I would like it to work well for 10 bit a to d. More info when I look into it some more. The division is by 8 not 7 so the shift is fine, but for the loss of bits off the right hand end.

Share this post


Link to post
Share on other sites
Hi All,

 

I'll leave it to others to check out the method russ has come up with (I've got my solution and moved on to other stuff)...it does look interesting however.

 

I don't think rounding (in terms of just the multiplication) will be an issue if the numbers are limited to the precision of the accumulator (ie for 16bit signed limit the additional samples to 4096 ie 16-sign bit-divisor bits)....however the division gives me a bit of a worry, as there will nearly always be a remainder...that's probably the main area where loss of accuracy will occur. Note that we're doing an average and effectively throwing away bits to smooth things out anyway...so maybe its a case of try it and see.

 

My main concern is that a divide by seven operation (or divide by one less than the multiplication for other selections) will by it's nature never be a clean shifting operation, hence you'll need a progressive subtraction routine (ie division) added to the code...increasing the code length and processing time. Alternatively if you keep the division a multiple of 2 (eg 8) you'll need a progressive addition (ie multiplication) to do the +1....either way it's not an even multiply or divide. If the 8x8bit multiply opcode on the 18F4450 is used, maybe you can multiply by 9 using the opcode and divide by 8 using shifts and not call a separate routine....in which case you'd have a 9 tap filter....of course that only works with 8 bit numbers.

 

Phil

 

 

I have done some test with a spreadsheet. Round off error is indeed a problem. Scaling the values up by a couple of powers of 2 may solve this problem. Most of the math is 16 bit and I would like it to work well for 10 bit a to d. More info when I look into it some more. The division is by 8 not 7 so the shift is fine, but for the loss of bits off the right hand end.

 

 

I almost suggested this method in my original reply. It's a filter of the form:

 

Filtered value = (New value * p) + ((1-p)*Filtered Value)

 

where 0 < p < 1.

 

The suggested filter has p = 1/8. It scales the right hand side by 8 to get:

 

Filtered value * 8 = New value + (7 * Filtered Value)

 

As for roundoff errors, aren't we talking about a truncation error if the divide is done by a simple shift? I'd add 4 before the shift to round rather than truncate the result.

 

Note that 7 * Filtered value might be better calculated as ((Filtered value << 3) - Filtered Value). It might generate less code than ((Filtered value <<2) + (Filtered value <<1) + Filtered Value) for a multiply by 7.

 

Yes, this filter does still have a significant contribution from the 8th previous sample, as would picking p = 1/4. I'd be tempted to use p = 1/2 which makes the filter:

 

Filtered value = (Filtered value + New value + 1) / 2

 

which is much easier to calculate.

 

Orin.

Share this post


Link to post
Share on other sites

Join the conversation

You are posting as a guest. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...

×
×
  • Create New...