Stopwatch Tutorial Is Finally Posted! Lobster Productions
This simple tutorial shows you how to use StopWatch, a diagnostic tool in C#, to easily determine how long it takes (part of) your application to run. This is a simple way to check the speed of.
Introduction
- Stopwatch class. The Stopwatch class may look like a primitive class that does some date math behind the scenes, but that is not the case. With the Stopwatch class it is possible to make very accurate measurements if the operating system and the computer hardware support a high-resolution performance counter.
On this page, we will present a stopwatch design. It is similar to the design in the Xilinx ISE tutorial. We will tackle it 'the MyHDL way' and take it from spec to implementation.
This is an extensive example, and we will use it to present all aspects of a MyHDL-based design flow. It's also relatively advanced. If you have difficulties understanding the material on this page, consider reading the first chapters of the manual or the earlier examples in this Cookbook first.
Specification
Compared to the design in the Xilinx ISE tutorial, our design is somewhat simplified. The intention is not to avoid complexity, but merely to make the code and the explanations better fit on a single web page. In particular, our stopwatch will only have three digits: two digits for the seconds, and one for the tenths of a second. Also, we will not consider clock generation issues and simply assume that a 10 Hz clock is available.
The interface of the stopwatch design looks as follows:
Architecture
A stopwatch system is naturally partitioned as follows:
- a subsystem that counts time, expressed as digits in bcd (binary coded decimal) code
- a subsystem that displays the count, by converting each bcd digit to a 7 segment led display
A natural partitioning often works best, and that's how we will approach the design. We will first design a time counter and then a bcd to led convertor.
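The partitioning can be pictured with a small behavioral model in plain Python. This is not MyHDL code and not the design presented on this page: the names count_step and to_leds are hypothetical, and the wrap-around after 59.9 seconds is an assumption that the specification does not state.

```python
# Plain-Python behavioral sketch of the two subsystems (not MyHDL).
# Assumption: the count wraps from 59.9 s back to 00.0 s.

def count_step(tens, ones, tenths):
    """Advance a (tens, ones, tenths) bcd time count by one 10 Hz tick."""
    if tenths < 9:
        return tens, ones, tenths + 1
    if ones < 9:
        return tens, ones + 1, 0
    if tens < 5:
        return tens + 1, 0, 0
    return 0, 0, 0


def to_leds(digits, encode):
    """Convert each bcd digit to a display pattern using `encode`."""
    return tuple(encode(d) for d in digits)


# Usage: count 125 ticks (12.5 seconds) starting from zero.
state = (0, 0, 0)
for _ in range(125):
    state = count_step(*state)
print(state)  # (1, 2, 5)
print(to_leds(state, encode=str))
```

Keeping the counter and the display conversion as separate functions mirrors the split into two subsystems: each part can be checked on its own before the two are combined.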
Time counter design
Approach
One of the goals of the MyHDL project is to promote the use of modern software development techniques for hardware design. One such technique is the concept of unit testing, a cornerstone of extreme programming (XP).
Unit testing means writing a dedicated test for each building block of a design, and aggregating all tests in a regression test suite using a unit test framework. Moreover, the XP idea is to write the unit test first, before the actual implementation. This makes sure that the test writer concentrates on all aspects of the high level specification, without being influenced by lower level implementation details.
At the start of an implementation, the existing unit test will fail, and it will continue to do so until a valid implementation is achieved. The unit test thus serves as a metric for completion. Moreover, seeing the unit test fail on incomplete or invalid designs enhances the confidence in the quality of the test itself. This is of crucial importance when making design changes later on.
Unit test
To write a unit test for a building block, we need two things: the specification and the interface. The specification was described in previous sections. The interface of the time counter looks as follows:
The actual implementation is left open for now. We will first write the test, using the interface.
The following code is the unit test for the time counter subsystem:
dut is the design under test. clkgen is a clock generator. action defines the stopwatch state, based on a rising edge on either of the input signals startstop or reset. counter maintains the expected time count. monitor is the actual test: it asserts that the actual time count from the design equals the expected time count. Finally, stimulus defines a number of test cases for the stopwatch. Note that it has an inner for loop over signals, as a concise way to define test patterns. This is straightforward in Python. But think for a moment about how you would do it in Verilog or VHDL.
Also in stimulus, note the yield clock.negedge statement. This statement synchronizes signal changes with the falling clock edge. This is needed to avoid race conditions when signals change 'simultaneously' with the rising clock edge. This is commonly done in digital tests. As you can expect, this statement was not present in the first version of the test: it was added after the test was run against the implementation and found to fail occasionally, even when the implementation was believed to be correct. This shows that in practice there may be a good reason why a test needs to be adapted to get everything working. But in any case it is better to start with a 'general' unit test that is not influenced by an implementation.
Our unit test is now ready to run. We could actually run it directly against an implementation. However, we will use it via the unit testing framework py.test instead. The framework provides the following functionality:
- it redefines the Python assert statement for extensive error reporting
- it looks up and runs each method whose name starts with 'test_'
- it looks up test modules by searching for modules whose name starts with 'test_'
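The naming conventions listed above can be illustrated with a minimal sketch of a test module. The module name test_adder.py and the function add are hypothetical stand-ins; the code uses only plain Python, so it also runs without the framework installed.

```python
# Contents of a hypothetical module test_adder.py.
# `add` stands in for the code under test.

def add(a, b):
    return a + b


def test_add_small_numbers():
    # A plain assert statement is all that is needed.
    assert add(2, 3) == 5


def test_add_is_commutative():
    for a, b in [(1, 2), (0, 7), (-4, 4)]:
        assert add(a, b) == add(b, a)


if __name__ == "__main__":
    # Without the framework, the tests can still be called directly.
    test_add_small_numbers()
    test_add_is_commutative()
    print("all tests passed")
```

Because both the module name and the function names start with 'test_', a framework that follows these conventions can discover and run them without any registration code.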
There's a lot more to say about py.test and you are probably also curious where to get it from. You can find that info further on this page, in the section More about py.test.
Design
The following is an implementation of the time counter, in file TimeCount.py:
py.test confirms that this is a valid implementation:
bcd to led convertor design
Approach
For the design of the bcd to led convertor, we will follow a similar approach as before. We will write a unit test first, and then use it to complete the design.
We first put the encoding data in a separate module, seven_segment.py, to make it reusable. The appropriate data structure for the encoding is a dictionary:
Unit test
This is the unit test, in test_bcd2led.py:
This test asserts that the led output from the design matches the appropriate encoding for a digit.
Design
Here is an implementation, in bcd2led.py:
Note how we derive the tuple code from the encoding dictionary. We need a tuple because that's the data structure that the Verilog convertor supports. It maps tuple indexing to a case statement to support ROM inferencing by synthesis tools.
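Deriving a tuple from a dictionary might look like the following sketch. How bcd2led.py actually builds code is not reproduced here, so this is only one possible approach; the two-entry dictionary is a hypothetical excerpt used for brevity.

```python
# Hypothetical two-entry excerpt of an encoding dictionary.
encoding = {0: '1111110', 1: '0110000'}

# Build a tuple indexed by digit; each entry is the pattern as an integer.
code = tuple(int(encoding[digit], 2) for digit in sorted(encoding))

print(code)          # (126, 48)
print(bin(code[1]))  # 0b110000
```

A tuple is immutable and indexed by position, which is what makes it suitable as a lookup table: code[digit] always returns the same constant for the same digit.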
When we run py.test, we get the following output:
Note that when run with no arguments, py.test finds and runs all test modules. This is done recursively through all subdirectories, making it straightforward to run a full regression test suite.
Top level design
The top-level design in StopWatch.py is just an assembly of the previously designed modules:
Implementation
Automatic conversion to Verilog
To go to an implementation, we first convert the design to Verilog automatically, using MyHDL's toVerilog function:
The resulting Verilog code is included in full:
Note how the Verilog convertor expands the hierarchical design into a 'flat netlist of always blocks'. The Verilog output is really an intermediate step towards an implementation. The whole design is flat and contained in a single file, which may make it easier to hand it off to back-end synthesis and implementation tools.
Note also how the convertor expands tuple indexing in MyHDL into a case statement in Verilog.
Synthesis
We will synthesize the design with Xilinx ISE 8.1. We first create a project in the ISE environment, add the source of the Verilog file to it, and we are ready to go.
The following is extracted from the synthesis report. It shows how the synthesis tool recognizes higher-level functions such as ROMs and counters:
How these blocks are actually implemented depends on the target technology and the capabilities of the synthesis tool.
You can review the full FPGA synthesis report here.
FPGA implementation
The FPGA implementation report can be reviewed here.
CPLD implementation
The same design was also targeted to a CPLD technology. The detailed report can be viewed here.
More about py.test
To verify the stopwatch design, we have been using py.test. However, this is not the only unit testing framework available for Python. In fact, the standard unit testing framework that comes with Python is the unittest module. The unittest framework is presented in the MyHDL manual, and is used to verify MyHDL itself. On the other hand, py.test is not part of the standard Python library currently. Why then did we use py.test in this case?
The reason is that I believe that py.test will be a better option in the future. As demonstrated on this page, py.test is non-intrusive. The only thing we need to do for basic usage is to obey some simple naming conventions and to use the assert statement for testing - things we might want to do without a testing framework anyway. In contrast, unittest requires us to wrap our tests into dedicated subclasses and to use special test methods. This can be especially awkward with MyHDL, because MyHDL hardware is typically described using top-level and embedded functions, not classes and methods.
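The difference in style can be seen in a small side-by-side sketch. The function double is a hypothetical stand-in for the code under test; the unittest calls are from the standard library.

```python
import unittest


def double(x):
    """Hypothetical function under test."""
    return 2 * x


# unittest style: a TestCase subclass and a special assertion method.
class DoubleTest(unittest.TestCase):
    def test_double(self):
        self.assertEqual(double(4), 8)


# py.test style: a plain function and a plain assert statement.
def test_double():
    assert double(4) == 8


if __name__ == "__main__":
    test_double()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DoubleTest)
    result = unittest.TextTestRunner().run(suite)
    print(result.wasSuccessful())
```

Both tests check the same property; the second form simply needs less scaffolding, which is the point made in the paragraph above.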
In short, it is much easier to develop unit tests with py.test than it is with unittest, in particular in the case of MyHDL code. However, py.test also has its disadvantages:
- As py.test is not part of the standard Python library, it has to be installed separately.
- py.test is currently not distributed in a conventional way such as a tarfile. It is part of the py.lib library that has to be checked out from a subversion repository. This requires the installation of a subversion client.
- The use of the assert statement for unit testing is controversial in Python. The assert statement is originally intended for programmer usage, to make programs safer. However, in my opinion the use of assert for testing is natural and warranted.
- py.test uses a lot of 'magic' behind the scenes to modify Python's behavior for its purposes, such as extensive error reporting.
However, I believe that the benefits are far more important than the disadvantages. Moreover, some disadvantages may disappear over time. Consequently, I plan to promote py.test as the unit testing framework of choice for MyHDL in the future.
More info on the usage and installation of py.test can be found here.
Before we dive into the different ways of measuring time in Python, it helps to understand the various types of time in the computing world. The first type is called CPU time or execution time, which measures how much time a CPU spent executing a program. The second type is called wall-clock time, which measures the total time it takes to execute a program on a computer. The wall-clock time is also called elapsed time or running time. Compared to the CPU time, the wall-clock time is often longer, because the CPU executing the measured program may also be executing other programs' instructions during the same period.
Another important concept is the so-called system time, which is measured by the system clock. System time represents a computer system's notion of the passing of time. One should remember that the system clock could be modified by the operating system, thus modifying the system time.
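The difference between CPU time and wall-clock time can be observed with a short sketch. It uses time.perf_counter and time.process_time, which are available in Python 3.3 and later and are not the functions discussed in the rest of this text; the sleep duration and loop size are arbitrary choices.

```python
import time

# Wall-clock time: a high-resolution clock that includes time spent asleep.
wall_start = time.perf_counter()
# CPU time: only the time this process actually spent executing.
cpu_start = time.process_time()

time.sleep(0.2)                             # waiting uses almost no CPU
total = sum(i * i for i in range(200000))   # computing does use CPU

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print("wall-clock: %.3f s" % wall_elapsed)
print("CPU:        %.3f s" % cpu_elapsed)
```

The wall-clock figure includes the 0.2 seconds spent sleeping, while the CPU figure should cover little more than the summation.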
Python's time module provides various time-related functions. Since most of the time functions call platform-specific C library functions with the same name, the semantics of these functions are platform-dependent.
Two useful functions for time measurement are time.time and time.clock. time.time returns the time in seconds since the epoch, i.e., the point where the time starts. For any operating system, you can always run time.gmtime(0) to find out what the epoch is on the given system. For Unix, the epoch is January 1, 1970. Python on Windows reports the same epoch, even though native Windows file times count from January 1, 1601. time.time is often used to benchmark a program on Windows. While time.time behaves the same on Unix and on Windows, time.clock has different meanings. On Unix, time.clock returns the current processor time expressed in seconds, i.e., the CPU time spent executing the current process so far. On Windows, it returns the wall-clock time expressed in seconds elapsed since the first call to this function, based on the Win32 function QueryPerformanceCounter. Another difference between time.time and time.clock is that time.time could return a lower value than a previous call if the system clock has been set back between the two calls, while time.clock always returns non-decreasing values.
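Both points, the epoch and the risk of a clock that is set back, can be checked with a few lines. Note that time.clock was deprecated in Python 3.3 and removed in Python 3.8, so this sketch uses time.monotonic (Python 3.3 and later) as the non-decreasing clock; that substitution is an assumption of the example, not something the text above covers.

```python
import time

# The epoch is whatever time.gmtime(0) reports on this system.
epoch = time.gmtime(0)
print(time.strftime("%Y-%m-%d %H:%M:%S", epoch))

# time.time() follows the system clock, so it may jump if the clock is reset.
# time.monotonic() is guaranteed never to go backwards.
first = time.monotonic()
second = time.monotonic()
print(second >= first)  # True
```

For measuring intervals, a clock that cannot go backwards avoids negative or inflated durations when the system time is adjusted during a measurement.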
Here is an example of running time.time and time.clock on a Unix machine:
    # On a Unix-based OS
    >>> print(time.time(), time.clock())
    >>> time.sleep(1)
    1359147653.31 0.02168
time.time() shows that the wall-clock time has passed approximately one second while time.clock() shows the CPU time spent on the current process is less than 1 microsecond. time.clock() has a much higher precision than time.time().
Running the same program under Windows gives back completely different results:
    # On Windows
    >>> print(time.time(), time.clock())
    >>> time.sleep(1)
    1359147764.04 1.01088769662
Both time.time() and time.clock() show that the wall-clock time passed approximately one second. Unlike on Unix, time.clock() does not return the CPU time; instead it returns the wall-clock time with a higher precision than time.time().
Given the platform-dependent behavior of time.time() and time.clock(), which one should we use to measure the 'exact' performance of a program? Well, it depends. If the program is expected to run on a system that dedicates more than enough resources to it, e.g., a dedicated web server running a Python-based web application, then measuring the program using time.clock() makes sense, since the web application will probably be the major program running on the server. If the program is expected to run on a system that also runs lots of other programs at the same time, then measuring the program using time.time() makes sense. More often than not, we should use a wall-clock-based timer to measure a program's performance, since it often reflects the production environment.
Instead of dealing with the different behaviors of time.time() and time.clock() on different platforms, which is often error-prone, Python's timeit module provides a simple way for timing. Besides calling it directly from code, you can also call it from the command-line.
For example:
    # On a Unix-based OS
    10000 loops, best of 3: 365 usec per loop
    % python -mtimeit -n 10000 'map(lambda x: x^2, range(1000))'
    # On Windows
    C:\Python27> python.exe -mtimeit -n 10000 "[v for v in range(10000)]"
    C:\Python27> python.exe -mtimeit -n 10000 "map(lambda x: x^2, range(1000))"
    # In IDLE
    >>> total_time = timeit.timeit('[v for v in range(10000)]', number=10000)
    3.60528302192688  # total wall-clock time to execute the statement 10000 times
    0.00036052830219268796  # average time per loop
    >>> total_time = timeit.timeit('[v for v in range(10000)]', number=10000)
    3.786295175552368  # total wall-clock time to execute the statement 10000 times
    0.0003786295175552368  # average time per loop
Which timer is timeit using? According to timeit’s source code, it uses the best timer available:
    if sys.platform == "win32":
        # On Windows, the best timer is time.clock
        default_timer = time.clock
    else:
        # On most other platforms the best timer is time.time
        default_timer = time.time
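The timer selection shown above comes from older Python versions. On Python 3.3 and later, timeit.default_timer is time.perf_counter on every platform, which can be checked directly. The list comprehension being timed is just an arbitrary workload.

```python
import time
import timeit

# The timer that timeit uses by default on this interpreter.
print(timeit.default_timer)

# On Python 3.3 and later this is the same object as time.perf_counter.
print(timeit.default_timer is time.perf_counter)

# default_timer can be used directly as a portable stopwatch.
start = timeit.default_timer()
squares = [v * v for v in range(10000)]
elapsed = timeit.default_timer() - start
print("elapsed: %.6f s" % elapsed)
```

Using timeit.default_timer instead of naming a clock explicitly keeps measurement code independent of the platform-specific choice.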
Another important mechanism of timeit is that it disables the garbage collector during execution, as shown in the following code:
    gcold = gc.isenabled()
    gc.disable()
    timing = self.inner(it, self.timer)
    if gcold:
        gc.enable()
If garbage collection should be enabled to measure the program's performance more accurately, e.g., when the program allocates and de-allocates lots of objects, then you should enable it during the setup:
    >>> timeit.timeit('[v for v in range(10000)]', setup='gc.enable()', number=10000)
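To see whether the garbage collector matters for a given statement, the two settings can be timed side by side. This is a minimal sketch: the repetition count of 200 is an arbitrary choice to keep the run short, and the setup string imports gc explicitly so that it does not rely on names already present in the timing namespace.

```python
import timeit

stmt = '[v for v in range(10000)]'

# Default behaviour: garbage collection is switched off while timing.
without_gc = timeit.timeit(stmt, number=200)

# Re-enable the collector in the setup statement.
with_gc = timeit.timeit(stmt, setup='import gc; gc.enable()', number=200)

print("gc disabled: %.4f s" % without_gc)
print("gc enabled:  %.4f s" % with_gc)
```

For a statement that allocates only a list of integers, the two numbers are likely to be close; the difference tends to show up for code that creates many container objects.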
Except for very special cases, you should always use the module timeit to benchmark a program. In addition, it is valuable to remember that measuring the performance of a program is always context-dependent, since no program executes in a system with boundless computing resources, and an average time measured over a number of loops is always better than a single time measured in one execution.
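Taking several measurements instead of one can be sketched with timeit.repeat, which returns one total time per measurement. The statement and the counts below are arbitrary choices for illustration.

```python
import timeit

number = 1000   # executions per measurement
repeats = 5     # independent measurements

# timeit.repeat returns a list with one total time per measurement.
totals = timeit.repeat('[v for v in range(1000)]',
                       number=number, repeat=repeats)

per_loop = [total / number for total in totals]
print("best:    %.2f usec per loop" % (min(per_loop) * 1e6))
print("average: %.2f usec per loop" % (sum(per_loop) / repeats * 1e6))
```

Reporting both the best and the average value gives a feel for how much the system load disturbs the measurement.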