STM8 Precise Cycle Delay
12 Sep 2018, 08:32am TZ +05:30
This started as a quest to design the perfect cycle delay
for one of my favorite MCUs, the STM8.
In the process I learned a lot about the custom pipeline and instruction execution of a
proprietary MCU such as the STM8S.
Fig: STM8S Board
After the 8051 days (1996-2000) had passed, I did not get to touch assembly language. It was always C that I programmed in for the MCUs, with the occasional tinkering in Makefiles and ld-files.
The STM8S is a family of MCUs from ST Microelectronics.
These are among the best low-cost debuggable MCUs.
That means you can flash-program them like normal MCUs,
and they also provide first-class debugging features.
This works great for iterative firmware development.
As a lazy firmware designer, I love this feature.
I don't need to wait for the full program,
or add special debug prints,
as I can view the memory or the variables in question via the debugger.
Well, let's look into what we want to do here.
The idea is to create a simple loop-based cycle delay. #
Yes, that's the gist of it. It's simple at first look. But as we progress we find out how different things can be between C and assembly.
To begin, let's get our setup ready:
Setup #
Hardware #
For my work, I used the cheap STM8S103F3P3 based board from Aliexpress.
You can find a lot of them using the search term STM8 development board.
Fig: Aliexpress snapshot of the search for “STM8 development board”
I am sure you can get similar boards from Banggood and eBay.
Note: my STM8S103F3P3 board also has an LED on Port B5 in Active-High configuration. #
Compiler #
We will be using the FREE COSMIC STM8 compiler toolchain.
Fig: Cosmic STM8 Compiler
You can get the FREE fully functional compiler by sending a license request to the email address provided.
Here is the official word from ST Micro on this.
IDE #
I have not found any better IDEs for STM8 than the ST Micro STVD.
Fig: STVD-STM8 the IDE for STM8 MCUs
It requires a few steps to set up the IDE and link it to the Cosmic Compiler.
In the ST Visual Develop window follow the menu sequence to find the Options:
Tools -> Options
Then in the Options window go to the Toolset tab.
Select STM8 Cosmic and, in the Root path, select the installation folder where you can find the cxstm8.exe file.
Typically it should be: C:\Program Files (x86)\COSMIC\FSE_Compilers\CXSTM8
Some permission dialogs may pop up after this. Just accept them and press OK to complete the setting.
Fig: STVD Settings to connect it to the Cosmic Compiler
Idea #
Well, now we are all set with the setup. Let's work out a plan of action.
We will be designing a simple do-while loop
for wasting time based on cycles.
Fig: Simple Do while Loop - Microchip Developer Help
Source: http://microchipdeveloper.com/tls2101:do-while-loop
Stage 1: Simple Do-While-Loop in code #
Function delay_cycles
#
|
|
Main #
|
|
Let's look at the assembly it generates, especially for the while loop.
|
|
Let's look at the delay_cycles function assembly:
|
|
Now from the above function we can try to estimate 2 items:
- Loop Cycle count = Lcy
- Total Cycle count = Tcy
These would help us calculate the time delay it would produce.
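Both counts turn into time once they are divided by the CPU clock. Here is a minimal host-side sketch in C of that conversion, assuming a 16 MHz clock. The names `STM8_FCLK_HZ` and `cycles_to_us` are made up for illustration and are not part of any ST or COSMIC library.

```c
#include <stdint.h>

/* Assumed CPU clock in Hz (16 MHz); adjust to the actual setup. */
#define STM8_FCLK_HZ 16000000UL

/* Hypothetical helper: convert a CPU cycle count into microseconds. */
static double cycles_to_us(uint32_t cycles)
{
    return (double)cycles * 1000000.0 / (double)STM8_FCLK_HZ;
}
```

For instance, `cycles_to_us(16)` gives 1.0, since 16 cycles at 16 MHz take one microsecond.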
Where to find cycle count details: STM8 CPU Programming Manual a.k.a. PM0044 #
Here Lcy would be calculated from line #103 to #112:
- ldw x,(OFST+1,sp) = 3
- subw x,#1 = 2
- ldw (OFST+1,sp),x = 2
- ldw x,(OFST+1,sp) = 2
- jrne L54 = 2 (With Flush)
Hence Lcy = (3+2+2+2+2) * LoopCount - 1 = 11 * LoopCount - 1
Now for the Tcy calculation we need to add the additional pieces:
- pushw x = 2
- popw x = 2
- ret = 6
- ldw x,#1000 (While Calling) = 3
- call _delay_cycles (While Calling) = 6
Hence Tcy = Lcy + (2+2+6+3+6) = 19 + Lcy = 19 + 11 * LoopCount - 1
In our example LoopCount is 1000.
Let's then calculate the actual Tcy by substituting the value.
Tcy = 19 + 11 * 1000 - 1 = 11018
At the frequency of 16MHz = 16000000Hz we have a cycle time of
t = 1/f = 1/16MHz = 6.25e-8 Seconds
Hence the total duration should be
t(delay_cycles{1000}) = Tcy * t = Tcy * 1/f = 11018 * 6.25e-8 = 6.88625e-4 Seconds
Which is actually t(delay_cycles{1000}) = 688.625 Micro Seconds
Let's now look at what the scope says:
Fig: Scope Plot of Stage 1: Simple Do-While-Loop in code
Well surprise! The delay On time is 756.1uS and Off time is 759.8uS #
Since our LED is active High, it's only logical that the ON time is less than the OFF time.
Let's take the On time and work our way forward.
Total Deviation = 756.1uS - 688.625uS = 67.475uS
That would be about 1080 cycles of difference.
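To compare the scope reading with the estimate in the same units, the measured time can be converted back into CPU cycles. Here is a rough host-side sketch in C, assuming the 16 MHz clock and the LoopCount of 1000 used above. The helper names are hypothetical.

```c
/* Assumed CPU clock in MHz, i.e. cycles per microsecond. */
#define FCLK_MHZ 16.0

/* Hypothetical helper: CPU cycles that fit in a measured time given in microseconds. */
static double us_to_cycles(double time_us)
{
    return time_us * FCLK_MHZ;
}

/* Hypothetical helper: average measured cycles spent per loop pass. */
static double cycles_per_pass(double time_us, unsigned int loop_count)
{
    return us_to_cycles(time_us) / (double)loop_count;
}
```

With the measured ON time, `us_to_cycles(756.1)` comes to about 12098 cycles, or roughly 12.1 cycles per pass against the 11 estimated from the listing. That may hint at about one extra cycle per iteration rather than a fixed offset, though the LED toggle code also sits inside the measured time.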
And you thought directly calculating timing would work.
Well there are multiple reasons:
- Compiler dependency
- Post-Linking rearrangement
- you pick…
I could not figure out how this magic was happening. In fact I tried many times, but the result was the same:
- Reducing the libraries
- Changing compilation options
- Trying to write the function code directly in main
The results kept me in the dark. There was no hope of fixing this. If someone can help with this, it would be nice.
Stage 2: Inline-Assembly in code #
After some head scratching I gave up on this simple idea.
From past experience I knew about the delay_ms function on AVRs.
This function exists in the standard library libc of AVR.
Looking into the Arduino/hardware/tools/avr/avr/include/util/ directory,
I found the delay.h file. It contained some interesting insight into
how loop delays are calculated. Most of the parts were written in assembly.
That part was hard.
So I took help from Google and it brought me to: https://github.com/Hoksmur/stm8_routines
Here too I found some interesting delay code.
|
|
Since we are targeting the COSMIC C compiler, it's best we only look at the __CSMC__ option.
|
|
This is a nice, small piece of cycle delay code.
They were kind enough to provide the macro T_COUNT.
This macro can generate cycle counts from input microsecond delays.
However, this function was not without its quirks. Let's look at the final code to understand what changed:
|
|
Here are the important changes:
- @inline is a special indication needed in the COSMIC C Compiler to generate inline functions.
- In the macro T_COUNT the operator precedence was not correct. Hence explicit brackets were added to make it clear. A small change in the ordering was also made, to make sure the results stay in range.
- Changed from F_CPU to FCLK for the system clock, as that is the define chosen for the system clock fcpu in actual frequency value terms, e.g. FCLK=16000000 as part of the compiler pre-processor directives.
Here is a peek at the compiler settings:
Fig: Compiler Settings for the FCLK
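The two macro fixes described above (explicit brackets, and dividing before multiplying so the result stays in range) can be sketched as follows. This is not the listing from the project: the macro name `T_COUNT_SKETCH` and the constants for cycles per loop pass and fixed overhead are assumptions chosen for illustration, and `FCLK` is defined in the snippet only so that it builds on a PC.

```c
/* On the target, FCLK comes from the compiler pre-processor settings. */
#ifndef FCLK
#define FCLK 16000000UL
#endif

/* Assumed cost model: 3 cycles per loop pass plus 3 cycles of fixed overhead. */
#define LOOP_CYCLES      3UL
#define OVERHEAD_CYCLES  3UL

/*
 * Microseconds -> loop count.
 * FCLK is divided by 1000000 first, so the product stays small enough
 * for a 32-bit unsigned long; every operand is bracketed explicitly.
 */
#define T_COUNT_SKETCH(us) \
    ((((FCLK / 1000000UL) * (unsigned long)(us)) - OVERHEAD_CYCLES) / LOOP_CYCLES)
```

Under these assumptions `T_COUNT_SKETCH(100)` evaluates to 532 at 16 MHz. Dividing first only works cleanly when FCLK is a whole number of MHz, and the requested delay has to be at least 1 microsecond so the subtraction does not wrap around.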
Now let's look at the newly generated asm code:
|
|
Initially we set the period to 1000; that's what the ldw x,#1000 instruction shows.
Like we did earlier, let's list out the instruction cycles:
- ldw x,#1000 = 4
- nop = 1
- decw X = 1
- jrne L6 = 1 normal / 2 in jump
- nop = 1
Hence Lcy = (1+2) * LoopCount - 1 = 3 * LoopCount - 1
And Tcy = Lcy + (4+1+1) = 6 + Lcy = 6 + 3 * LoopCount - 1
Let's then calculate the actual Tcy by substituting the value.
Tcy = 6 + 3 * 1000 - 1 = 3005
At the frequency of 16MHz = 16000000Hz we have a cycle time of
t = 1/f = 1/16MHz = 6.25e-8 Seconds
Hence the total duration should be
t(_delay_cycl{1000}) = Tcy * t = Tcy * 1/f = 3005 * 6.25e-8 = 1.878e-4 Seconds
Let's now look at what the scope says (this time we are using a Saleae Logic instead of the Tek):
Fig: Plot of the New _delay_cycl function
We observe that:
- ON-Time = 1.908e-4 Seconds
- OFF-Time = 1.908e-4 Seconds
- Period = 3.816e-4 Seconds
Which is very close to the expected 1.878e-4 Seconds!
This is a great achievement; we are very close.
But in actual practice we need some way to specify time in microseconds rather than absolute cycles.
This is where the T_COUNT macro comes in handy.
Stage 3: Making the _delay_cycl function useful #
Let's review our setup, this time in its entirety:
- Quick look at the files we are using:
Fig: Files used in the Project
- Now where to get the Drivers & Examples:
STM8S Standard Peripheral Library STSW-STM8069 v2.3.0
STM8S Code Examples STSW-STM8026 v1.02
- For the correct files to use in the project, look at the StdPeriph_Template\STVD\Cosmic directory,
as you would need to get the correct stm8_interrupt_vector.c file for compilation to work.
Other files like stm8s_conf.h, stm8s_it.h and stm8s_it.c are available at the root of the template directory.
Actual Main Code Listing:
|
|
We would focus on this part alone:
|
|
This is where we will introduce the change.
Let's alter it using the T_COUNT macro:
|
|
This means we are going to wait for 100 microseconds.
Surprise!
Any sane programmer would notice the problem here:
|
|
Why that additional bracket in front of (T_COUNT?
That appears to be a bug in the COSMIC compiler's pre-processor!
So we need to modify the #define and the location where the macro is called.
Let's not worry about this too much, since this would be buried in the API
we create around the _delay_cycl function.
Let's look at our results:
Fig: Results of 100uS delay
That would be a 102.2uS ON and OFF period, very close for practical use!
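A quick way to judge "very close" is to express the overshoot as a percentage of the requested delay. Here is a small host-side sketch in C; the function name is hypothetical.

```c
/* Hypothetical helper: overshoot of a measured delay relative to the requested one, in percent. */
static double delay_error_percent(double requested_us, double measured_us)
{
    return (measured_us - requested_us) * 100.0 / requested_us;
}
```

For the 100uS request and the 102.2uS measured, this gives about 2.2 percent, which is roughly 35 CPU cycles at 16MHz.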
SUCCESS At Last !
This was the story of how I got to make the perfect STM8S cycle delay function.
I am very thankful to Mr. Oleg Terentiev for publishing the library
https://github.com/Hoksmur/stm8_routines
It was the source of this effort and the path to solving the problem.
Folks, if you have any insights, please share. It took a while to complete this full story ;-).
*~~ Completed on 20th October 2018*