Jerry's blog: 2008

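Wednesday, December 31, 2008

Author's fee

A paper I submitted during my master's studies unexpectedly earned me an author's fee (from 电子与信息学报, the Journal of Electronics & Information Technology). It was only 400 yuan, but it was still a pleasant surprise.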

Monday, December 22, 2008

Tip: Reset and Chirp process when using the USB3500 as a USB host's PHY

Section 7.14, "USB Reset and Chirp", of the USB3500 datasheet describes the reset and chirp process in detail with a figure and accompanying text. However, if you follow the figure literally, you may not get the correct result.
As shown in Figure 7.9 of the datasheet, you first set xcvrselect = 2'b01, termselect = 1'b1 and opmode = 2'b00 before reset to detect device attachment (T0~T2). You then set xcvrselect = 2'b00, termselect = 1'b0 and opmode = 2'b00 to drive SE0 on the bus (T2~T3), and wait for Chirp K on the bus before moving to the next state. The problem is that, with these xcvrselect/termselect/opmode settings, Chirp K can never be observed during reset. Why?
The USB3500 gives no direct access to DP/DM; the bus status is only available through linestate. Section 7.1, "Linestate", of the datasheet gives the meaning of linestate under different conditions. According to Table 7.2, when XCVRSEL[1:0] = 00, TERMSELECT = 0 and OPMODE = 00/01, the only valid linestate values are 00 (SE0) and 01 (J state). So with xcvrselect = 2'b00, termselect = 1'b0 and opmode = 2'b00 during T2~T3, linestate reports the J state even when the device is driving Chirp K on the bus.
Solution:
From Table 7.2, opmode must be 2'b10 for Chirp K to be visible on linestate. There are two ways to solve the problem.
1. Set xcvrselect = 2'b00, termselect = 1'b0 and opmode = 2'b10 during T2~T3. Because opmode = 2'b10 (disable bit stuffing and NRZI encoding) only matters when the host itself drives Chirp J/Chirp K, this setting is harmless while the host drives SE0 on the bus. With it, the design reads the correct Chirp K from linestate at T3.
This method is simple, but it deviates slightly from the datasheet.
2. During T2~T3 the bus state is SE0 and then becomes Chirp K, but because opmode = 2'b00, linestate reports 2'b01 (J state). The transition itself is still detectable: when linestate changes from 2'b00 to 2'b01 between T2 and T3, set opmode = 2'b10 and then observe Chirp K on linestate.
Both methods work and produce a correct reset and chirp process.
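
The second method can be sketched as a small behavioural model. This is only an illustration of the behaviour described in this post, not a transcription of Table 7.2: the function names and the string bus states ("SE0", "CHIRP_K", "CHIRP_J") are made up for the example.

```python
# Simplified model of what this post describes; not taken verbatim
# from the USB3500 datasheet. Names and bus-state strings are hypothetical.
SE0, J, K = 0b00, 0b01, 0b10

def observed_linestate(opmode, bus):
    """Linestate seen by the host with xcvrselect=2'b00, termselect=1'b0."""
    if bus == "SE0":
        return SE0
    if opmode == 0b10:
        return K if bus == "CHIRP_K" else J
    # opmode 2'b00/2'b01: only SE0 and J are reported, so Chirp K reads as J
    return J

def detect_chirp_k(bus_samples):
    """Method 2: start with opmode=2'b00, switch to 2'b10 on a 00 -> 01 change."""
    opmode = 0b00
    prev = SE0
    for i, bus in enumerate(bus_samples):
        ls = observed_linestate(opmode, bus)
        if opmode == 0b00 and prev == SE0 and ls == J:
            opmode = 0b10  # bus activity seen: look again with chirp decoding
            ls = observed_linestate(opmode, bus)
        if ls == K:
            return i       # index of the sample where Chirp K is confirmed
        prev = ls
    return None
```

Calling detect_chirp_k(["SE0", "SE0", "CHIRP_K", "CHIRP_K"]) reports the first Chirp K sample, whereas observed_linestate(0b00, "CHIRP_K") on its own only ever shows the J state.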

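Wednesday, December 3, 2008

[Repost] Commands used by Xilinx ISE and how to use the command line

Original: http://www.edacn.net/bbs/viewthread.php?tid=14174
Because a project I am working on uses several large Xilinx FPGAs, running synthesis, map, P&R and programming-file generation for each FPGA has become a real problem. (Embarrassingly, nobody in the group has modular design experience.) Although today's workstations are fairly powerful, generating each programming file still takes several hours. So I decided to do all the compilation from the command line. Below are a few notes that I hope will be useful. (I don't know whether Westor covered this in his detailed ISE book, since I haven't read it; if he already has, then :-(. $#@%%N#%&*^$###@#) Enough chatter, on to the topic.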
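1. Introduction to the commands
A complete Xilinx FPGA build involves the following commands:
XST:
Short for Xilinx Synthesis Technology, the free synthesis tool bundled with Xilinx ISE. (Heh, the purchase of Synplify Pro is still being negotiated.) After synthesis finishes, you can open the ".syr" file with any text editor to see the details of the synthesis run and its report.
NGDBuild:
This command is simply translate, the first step of implementation. It merges all the netlists and design constraints and generates an ngd file for the map tool. The NGDBuild report is the file with the ".bld" extension.
MAP:
MAP maps the ngd file generated by NGDBuild onto the specific FPGA device and produces an NCD file for PAR. The map report, with the ".mrp" extension, can be opened with any text editor.
PAR:
Place & Route. I won't say more about this one, lest someone throw bricks at me. The PAR report has the ".par" extension.
TRCE:
This produces the timing report we care about most. TRCE analyzes your FPGA design and generates a timing report with the ".twr" extension. You can open it with any text editor or with Xilinx's Timing Analyzer, which is more intuitive and recommended for beginners.
Bitgen:
As the name suggests, Bitgen generates the programming file.
2. Usage example
Now that the commands Xilinx provides have been introduced, let's see how to use them. Xilinx gives us a very useful tool: View Command Line Log File. It is located under Design Entry Utilities in the Process for Source window.
Double-click View Command Line Log File and what do you see? Ha, statements like the following appear in the text editor window on the right: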
xst -intstyle ise -ifn __projnav/fpga_a.xst -ofn fpga_a.syr
ngdbuild -intstyle ise -dd d:\projects\hardware\hy_multifpga_nwpci_vhd_new_improvetiming/_ngo -uc ./source/hy/fpga_a.ucf -aul -p xc2v6000-ff1152-4 fpga_a.ngc fpga_a.ngd
map -intstyle ise -p xc2v6000-ff1152-4 -cm area -pr b -k 4 -c 100 -tx off -o fpga_a_map.ncd fpga_a.ngd fpga_a.pcf
par -w -intstyle ise -ol high -t 1 fpga_a_map.ncd fpga_a.ncd fpga_a.pcf
trce -intstyle ise -e 1000 -l 1000 -xml fpga_a fpga_a.ncd -o fpga_a.twr fpga_a.pcf
bitgen -intstyle ise -f fpga_a.ut fpga_a.ncd
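Right, these are the commands you have been using all along. When we run Synthesis, Map and PAR in ISE, ISE automatically calls the commands described above; ISE is really just a GUI. :-)
If double-clicking View Command Line Log File produces the following message instead:
Warning: This process is used to display the running command log file that records some application command lines.
it means that you have not implemented your design yet, or that you have cleaned the design. (The project has a Cleanup Project Files action that clears it.)
Now for the last step. With any text editor, copy and paste the command history above, save it as a ".bat" batch file, and run it from the Windows command line. Note that the .bat file must be placed in your project directory.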
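
As an alternative to a plain .bat file, the same sequence can be driven from a small script that stops at the first failing step. The sketch below is only an illustration: the function name ise_flow and its parameters are made up, the "_ngo" directory is an assumed relative path, and the flags are a subset of those in the log above, so copy the exact options from your own command line log.

```python
import subprocess

def ise_flow(design, part, ucf, dry_run=True):
    """Build (and optionally run) the ISE command sequence for one design.

    Hypothetical helper: flags are a subset of those in the command log above.
    """
    cmds = [
        ["xst", "-intstyle", "ise", "-ifn", f"__projnav/{design}.xst",
         "-ofn", f"{design}.syr"],
        ["ngdbuild", "-intstyle", "ise", "-dd", "_ngo", "-uc", ucf,
         "-p", part, f"{design}.ngc", f"{design}.ngd"],
        ["map", "-intstyle", "ise", "-p", part,
         "-o", f"{design}_map.ncd", f"{design}.ngd", f"{design}.pcf"],
        ["par", "-w", "-intstyle", "ise", "-ol", "high",
         f"{design}_map.ncd", f"{design}.ncd", f"{design}.pcf"],
        ["trce", "-intstyle", "ise", "-e", "1000", "-l", "1000",
         f"{design}.ncd", "-o", f"{design}.twr", f"{design}.pcf"],
        ["bitgen", "-intstyle", "ise", "-f", f"{design}.ut", f"{design}.ncd"],
    ]
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError and aborts the remaining steps
            subprocess.run(cmd, check=True)
    return cmds
```

With dry_run=True the function only prints the commands, which is a convenient way to compare them against the log before launching a build that takes hours.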
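3. A few notes
a. If you are going to run several projects back to back, I recommend running Cleanup Project Files first. In my experience, Synthesis and PAR sometimes fail if the project is not cleaned. This causes a small problem, though: cleaning may delete some existing files, such as *.xst under the __projnav directory (the XST configuration file, which XST reads during synthesis to get its settings), *.prj (the project file, which lists the modules used by your design) and *.ut (the Bitgen configuration file). Before using the command line, make sure these three kinds of files exist. I usually keep backups of them so that I can pull them out, tweak them and reuse them.
b. Under Xilinx\doc\usenglish\books\docs in the Xilinx installation directory there are PDF files describing these commands, which you can read if needed. After studying them hard for a while, though, I found that this is not really necessary; deal with problems when they come up.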
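
The check in note (a) is easy to automate before starting a long run. This is a minimal sketch: the helper name missing_inputs is made up, and the locations of the .prj and .ut files (project root) are assumptions, since only the .xst file is stated to live under __projnav.

```python
from pathlib import Path

def missing_inputs(project_dir, design):
    """Return the names of the config files from note (a) that are absent.

    Hypothetical helper; the .prj and .ut locations are assumed.
    """
    root = Path(project_dir)
    expected = [
        root / "__projnav" / f"{design}.xst",  # XST configuration
        root / f"{design}.prj",                # list of design sources
        root / f"{design}.ut",                 # Bitgen options
    ]
    return [p.name for p in expected if not p.is_file()]
```

An empty list means all three files are in place; otherwise restore the listed files from the backup copies before launching the batch run.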
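4. Summary
This is my first original post, so it surely has many shortcomings; corrections are welcome. The topic is fairly simple and could be explained in a few sentences, but given my own painful experience as a beginner, I believe a more detailed explanation is better. At the very least it helps newcomers understand ISE itself a bit more. If you find it even slightly useful, please leave a reply, so that all this typing was not in vain. Hehe.
Please credit the author, spriteice, when reposting.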

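[Reposter's note] Below is a reply from 丁丁 that I also found useful:
1. Archive can be used for version management, but the files are too large. In general, version control works best when you keep all the original inputs. For an FPGA these include the HDL code, the synthesis project (.prj) or script (.tcl), the synthesis constraints (.sdc), the implementation constraints (.ucf), the netlists of all cores (.edn) and the implementation script (.bat); also record the versions of the synthesis and implementation tools. If none of these change, re-running the flow is guaranteed to give exactly the same result. That is a key point of version control: as long as repeatability is guaranteed, none of the results need to be kept except the bit file you actually use.
2. The time tag is the time stamp found in the header of netlist files and bit files. It records when the file was generated and does not affect the internal function.
3. Test pins can be inserted with FPGA Editor by editing the .ncd file directly and then running bitgen. If you need to insert many test pins, a script (.scr, I think) can do it automatically, after which only bitgen has to be run. This avoids both the time wasted on re-running place and route and its impact on timing. FPGA Editor can of course make many other modifications, which I won't go into here.
4. Here is an example from an old script:
Command lines for synthesis and implementation: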
synplify_pro -batch ../script/chip_syn.tcl
ngdbuild -dd ./_ngo -uc ../script/chip_par.ucf -p xc2s200-fg456-5 .\rev_1\chip.edf chip.ngd
map -p xc2s200-fg456-5 -timing -cm speed -detail -ir -pr b -o chip_map.ncd chip.ngd chip.pcf
par -w -ol med chip_map.ncd chip.ncd chip.pcf
bitgen -w -f ../script/chip_par.ut chip.ncd
copy chip.bit ..\bit\
The chip_par.ut file here holds the options for the bitgen command; you can also do without it and put all the options on the command line.
Next is chip_syn.tcl:
project -new
#add_file options
add_file -constraint "../script/chip_syn.sdc"
add_file -verilog "../src/src1.v"
add_file -verilog "../src/src2.v"
add_file -verilog "../src/src3.v"
add_file -verilog "../src/src4.v"
add_file -verilog "../src/src5.v"
add_file -verilog "../src/src6.v"
add_file -verilog "../src/src7.v"
add_file -verilog "../src/src8.v"
#device options
set_option -technology SPARTAN2
set_option -part XC2S200
set_option -package FG456
set_option -speed_grade -5
#compilation/mapping options:
set_option -default_enum_encoding default
set_option -symbolic_fsm_compiler 0
set_option -resource_sharing 0
set_option -use_fsm_explorer 0
#map options
set_option -frequency 50.000
set_option -fanout_limit 100
set_option -disable_io_insertion 0
set_option -pipe 0
set_option -fixgatedclocks 0
set_option -retiming 0
set_option -modular 0
set_option -update_models_cp 0
set_option -verification_mode 0
#simulation options
set_option -write_verilog 0
set_option -write_vhdl 0
#automatic place and route (vendor) options
set_option -write_apr_constraint 0
#set result format/file last
project -result_file "rev_1/chip.edf"
#implementation attributes
set_option -vlog_std v2001
project -run
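This is just a personal suggestion. Any form of version control will do; if you keep an archive, you should also know how it was produced, so that the same thing can be rebuilt next time. If you keep the command lines, you should also record the versions of all the software involved.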

Wednesday, November 26, 2008

ncvlog: *E,DUPIDN (t.v,1795|25): identifier 'xx' previously declared [12.5(IEEE)]

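This error states clearly that the signal (xx) has already been declared, so the fix is of course to remove the duplicate declaration. But if no explicit declaration of the signal (for example, wire xx;) can be found earlier in the file, check whether the signal has already been used earlier, for example as a connection when instantiating another module.

Solution: explicitly declare the signal at the point where it is first used.
More generally: when you hit this ncvlog error, first search for an earlier explicit declaration and, if there is one, fix it; if there is none, consider whether the signal is used somewhere earlier.

Tip:
In Verilog, an identifier that has not been explicitly declared is given the default type wire by the compiler, which amounts to declaring the signal. When the compiler later reaches the explicit declaration, it treats it as a duplicate.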
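
A quick way to hunt for this pattern is to look for wire declarations that come after the first use of the same name. The sketch below is a rough text scan, not a Verilog parser: the function name late_declarations and the sample module are made up, and it only handles simple scalar "wire name;" declarations.

```python
import re

# Hypothetical sample: xx is first used as a port connection (implicit net),
# then explicitly declared further down, which is the situation described above.
VERILOG = """
module top;
  sub u_sub (.a(xx));
  wire xx;
endmodule
"""

def late_declarations(src):
    """Return names whose explicit 'wire' declaration follows their first use."""
    late = []
    for m in re.finditer(r"\bwire\s+(\w+)\s*;", src):
        name = m.group(1)
        first_use = re.search(r"\b%s\b" % re.escape(name), src)
        if first_use.start() < m.start(1):
            late.append(name)
    return late

print(late_declarations(VERILOG))
```

For the sample above the scan reports xx; moving the "wire xx;" line above the instantiation makes the list empty.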

Thursday, November 13, 2008

Avoid Using Clock Gating In FPGA Emulation

Clock gating is used in ASIC design to save power. There, EDA tools balance the clocks that derive from the same source clock and differ only in their enable signals. In an FPGA design, no such balancing is done among related clocks. The skew among them may be large or small and cannot be predicted, which may lead to unexpected results.
Besides, an FPGA has no standalone AND/OR/latch cells; its logic is built from LUTs, flip-flops and buffers.

Tuesday, November 11, 2008

Use the same coding style (all RTL or all handcode) for generated clocks to avoid RTL simulation errors

There are two generated clocks, clk1 and clk2, both divided down from the source clock (clk_src). In the STA constraints clk1 and clk2 are synchronous, so after CTS they are guaranteed to be synchronous.

clk1 is written in RTL:
always @ (posedge clk_src or negedge rst_)
  if (!rst_)
    clk1 <= 1'b0;
  else
    clk1 <= clk1_d;

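clk2 is handcoded:
DFCNQHVTD1 DNTCLK2 (.Q(clk2), .D(clk2_d), .CP(clk_src), .CDN(rst_));
DFCNQHVTD1 is a cell from the technology library.

The two pieces of code are functionally identical. In RTL simulation, if the DFCNQHVTD1 model has no delay, clk1 and clk2 have the same phase, i.e. their edges are aligned. But if the DFCNQHVTD1 model includes a delay (CP-->Q) in RTL simulation, clk1 and clk2 are offset slightly, which may cause the RTL simulation to fail.

So it is best to produce all generated clocks either entirely in RTL or entirely by handcode to avoid this problem.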

Friday, November 7, 2008

Novas Debussy for Linux终极破解

Novas公司的Debussy类似于软件行业的SourceInsight，对于代码的浏览，跟踪（trace），调试（debug）等都有很好的支持，使用非常方便，是IC前端设计人员必备的一个软件。
破解的方法是在网上找的，在这儿记录一下，同时根据我的经验做了点修改。
具体过程：
1、打开一个terminal，进入debussy/platform/LINUX/bin/
2、启动gdb, gdb debussy回车
3、设置断点， break snsCheckOut回车, 程序将返回一个地址，我的是0x8dfd8b0
4、使用objdump反汇编，输出如下（同时包含机器码和汇编代码，方便定位）
objdump -d --start-address=0x8dfd800 --stop-address=0x8dff900 debussy
08dfd8b0 <snsCheckOut>:
8dfd8b0: 55 push %ebp
8dfd8b1: 89 e5 mov %esp,%ebp
8dfd8b3: 81 ec 0c 30 00 00 sub $0x300c,%esp
8dfd8b9: 57 push %edi
8dfd8ba: 56 push %esi
8dfd8bb: 53 push %ebx
8dfd8bc: 8a 4d 18 mov 0x18(%ebp),%cl
8dfd8bf: 88 8d fb cf ff ff mov %cl,0xffffcffb(%ebp)
8dfd8c5: 8a 4d 28 mov 0x28(%ebp),%cl
5、打开KHEX编辑器，打开debussy文件（platform下的那个35M的）搜索55 89 e5 81 ec 0c 30 00 00 57 56（这组数字根据版本不同可能有所差别，只要定位上面08dfd8b0 <snsCheckOut>:后的代码即可），将头三位改为31 c0 c3 ,存盘退出。
修改后如下：
08dfd8b0<snsCheckOut>:
8dfd8b0: 31 c0 xor %eax,%eax
8dfd8b2: c3 ret
8dfd8b3: 81 ec 0c 30 00 00 sub $0x300c,%esp
8dfd8b9: 57 push %edi
8dfd8ba: 56 push %esi
8dfd8bb: 53 push %ebx
8dfd8bc: 8a 4d 18 mov 0x18(%ebp),%cl
8dfd8bf: 88 8d fb cf ff ff mov %cl,0xffffcffb(%ebp)
8dfd8c5: 8a 4d 28 mov 0x28(%ebp),%cl
7、再起debussy，一切ok.无须license，也不会trace几次就退出了。All N0v@$ products for all Operating systems can be cr@cked easily without license file.Just force the return value of the procedure "snsCheckOut" to "0" or fool the "compare and jump" instruction after calling "snsCheckOut"

about objdump:
-d --disassemble Display the assembler mnemonics for the machine instructions from objfile. This option only disassembles those sections which are expected to contain instructions.
--start-address=address Start displaying data at the specified address. This affects the output of the -d, -r and -s options.
--stop-address=address Stop displaying data at the specified address. This affects the output of the -d, -r and -s options.

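Sunday, July 20, 2008

Installing Adobe Reader on RHEL

I downloaded the latest RPM (8.1.2) from Adobe's official website, but my system (RHEL3) is too old and ran into package dependency problems. I had to settle for version 7.0 instead. The installation went smoothly, with no problems.

Installing the NVIDIA graphics driver on RHEL

The screen on my freshly installed RHEL3 flickered quite a bit, which was uncomfortable to look at. I tried various ways of adjusting the refresh rate, none with any effect. So it was time for the ultimate solution: installing NVIDIA's graphics driver. Back when I used Debian, I had to reinstall the graphics driver after every kernel upgrade, and each install seemed to need some linux-headers-xx package plus a few settings, which was a real nuisance. So I put off installing the driver until the last moment. I downloaded the matching driver (NVIDIA-Linux-x86-173.14.09-pkg1.run) from NVIDIA's official website; running it directly with sh simply worked, with no prompt about recompiling the kernel or the like. After starting the graphical interface with startx, the display was much finer and no longer flickered. Very nice; the official driver is clearly worth installing.
Here are the installation steps (NVIDIA FX5200):
1. Download the matching driver from NVIDIA's official website, choosing it according to your card model;
2. Leave the graphical interface and switch to text mode by typing in a terminal: init 3
3. Log in as root
4. Go to the directory containing the driver file and run: sh NVIDIA-Linux-x86-173.14.09-pkg1.run
5. Follow the prompts to install
6. When it finishes, enter graphical mode again with: startx
That's it. The installer modifies the configuration files itself (for example XF86Config), so no manual editing is needed afterwards.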

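Moving from Debian to Redhat

When I first came into contact with Linux I used Redhat; I can't remember which version. RPM package management is convenient, but the package dependency problems that crop up from time to time can be very frustrating. Later, on someone's recommendation, I switched to Debian, whose apt-get package management saves the user a lot of trouble. To install a piece of software you just run apt-get install xx, which feels wonderful.
But recently, in order to install and use a set of EDA tools, I had to go back to Redhat, this time RHEL3, because the tools were developed for this platform and are most stable on it. They might work on other platforms too, but I don't want to spend my days fiddling with compatibility problems. So I didn't even try installing them on Debian and went straight to RHEL3.
As a Linux hobbyist with a few years of experience, I know I'm not particularly keen on chasing the latest technology, nor do I get any joy from endless software upgrades. All I need is a platform that works, just a platform.
So Redhat it is. What I want to use is not Linux itself but that set of EDA tools.

After installing RHEL3, a few problems got in the way of everyday use, mainly:
1. The screen flickers, which is hard on the eyes. This is probably a refresh rate problem.
2. There is no Adobe Reader. xpdf is still a bit unpleasant to use.
3. Chinese text is displayed as little boxes.
4. There is no Chinese input method.
5. The audio driver does not seem to be set up properly; there is no sound.

I hope to solve these problems one by one over the coming days; then it will be a fairly satisfying platform :P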

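Tuesday, July 15, 2008

Using the open-source CVS for personal code management

Although SVN is all the rage now, my company uses CVS, so I installed CVS on my home computer as well. I'm recording the setup here for future reference.
Linux systems generally ship with CVS. Before first use, do the following to initialize the CVS environment:
1. Create the cvsroot directory. To avoid accidentally deleting the whole directory, I put it under /home: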
$:mkdir /home/cvsroot
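2. To ease later management, create a group and add the accounts that will use the CVS server to it (xx stands for an existing account name):
#:groupadd cvs
#:gpasswd -a xx cvs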
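3. Change the group of the newly created cvsroot directory to cvs:
#: cd /home/cvsroot
#: chgrp -R cvs .
4. Change the permissions of the cvsroot directory to give group members read and write access:
#: chmod 770 .
5. Initialize the CVS repository:
cvs -d /home/cvsroot init
6. To avoid specifying the repository location every time, define the CVSROOT environment variable in the shell startup file.
If you use bash, add this to .bashrc:
CVSROOT=/home/cvsroot;export CVSROOT
If you use csh, add this to .cshrc:
setenv CVSROOT /home/cvsroot

Once these steps are done, you can use cvs checkout, cvs commit, cvs update and so on for version control and management of your code.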

Monday, July 7, 2008

Output delay of SRAM generated by Block Memory Generator v2.4 (Xilinx)

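The default latency is 1T: after the address is applied, the corresponding data appears at the output one clock cycle later. Under Optional Output Registers, selecting Register Output of Memory Primitives adds one clock cycle of latency, and selecting Register Output of Memory Core also adds one clock cycle. If both options are selected, the total output latency is 3T.

When designing, choose the SRAM output latency according to the timing requirements of the data path.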
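
The rule above reduces to a one-line calculation, shown here as a small sketch. The function and parameter names are made up for illustration; they are not Block Memory Generator option names.

```python
def bram_read_latency(reg_primitives=False, reg_core=False):
    """Read latency in clock cycles, following the rule described in this post.

    reg_primitives: 'Register Output of Memory Primitives' is selected
    reg_core:       'Register Output of Memory Core' is selected
    """
    return 1 + int(reg_primitives) + int(reg_core)

# All four option combinations, for reference
for prim in (False, True):
    for core in (False, True):
        print(prim, core, bram_read_latency(prim, core))
```

A testbench or downstream pipeline can use the same number to decide how many cycles to delay the read-valid signal.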

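Monday, June 30, 2008

potential glitch in asynchronous reset

How to avoid glitches when using an asynchronous reset
1. Glitches
For a simple AND or OR gate, a glitch may occur when the inputs switch to opposite logic levels at the same time. Take assign rst = rst_a & rst_b; as an example: when rst_a goes 1-->0 (or 0-->1) while rst_b goes 0-->1 (or 1-->0), an upward spike (what we call a glitch) may appear. Likewise, an OR gate may glitch when its inputs switch to opposite levels at the same time, but there the spike is downward rather than upward.
Whether a glitch actually occurs depends on the relative timing of the input transitions, which is generally unpredictable. So we should avoid logic that can glitch as far as possible, or use logic whose glitches do not affect the load circuit.
2. Asynchronous reset
In digital design we often use an asynchronous reset to bring a circuit to its initial state. The asynchronous reset signal may be generated from several related signals through combinational logic such as AND or OR gates. As noted above, an OR gate can produce downward glitches (but not upward ones), and an AND gate can produce upward glitches (but not downward ones). So if the asynchronous reset is active high, combining the related signals with an OR gate avoids glitches that would falsely assert the reset; likewise, if the asynchronous reset is active low, combining them with an AND gate avoids them. The precondition, of course, is that all the signals that generate the asynchronous reset are themselves glitch-free.
3. Conclusion: if the asynchronous reset is active high, use an OR gate; if it is active low, use an AND gate.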
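
The direction of the glitch can be checked with a toy time-step model. This is only a sketch with made-up names: each input flips once at a chosen time step, gate delay is ignored, and the output is simply recomputed at every step.

```python
def gate_waveform(op, a_edge, b_edge, a_from, b_from, t_end=4):
    """Output of a 2-input gate sampled at t = 0 .. t_end-1.

    Input a starts at a_from and flips at time a_edge; likewise for b.
    op is "and" or "or". Gate delay is ignored (toy model).
    """
    out = []
    for t in range(t_end):
        a = a_from if t < a_edge else 1 - a_from
        b = b_from if t < b_edge else 1 - b_from
        out.append(a & b if op == "and" else a | b)
    return out

# AND gate: rst_a falls at t=2, rst_b rises at t=1 (b arrives early)
print(gate_waveform("and", a_edge=2, b_edge=1, a_from=1, b_from=0))
# OR gate: rst_a falls at t=1, rst_b rises at t=2 (b arrives late)
print(gate_waveform("or", a_edge=1, b_edge=2, a_from=1, b_from=0))
```

In this model the AND output is 0 before and after the transitions but pulses to 1 in between, while the OR output is 1 before and after but dips to 0 in between, matching the upward and downward glitches described above. Swapping the order of the two edges removes the spike, which is why the glitch depends on relative timing.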

Tuesday, March 25, 2008

Xilinx/Virtex-5 DCM problem: the minimum frequency of the input clock

I have a 24 MHz input clock and need to generate a 48 MHz clock using a DCM. During PAR, the following warnings appeared:
WARNING:Timing:3325 - Timing Constraint
"TS_CLK24 = PERIOD TIMEGRP "CLK24" 42 ns HIGH 50%;"
fails the maximum period check for input clock clk48_gen/CLKIN_IBUFG_OUT to DCM_ADV clk48_gen_DCM_INST because the period constraint value (42000 ps) exceeds the maximum internal period limit of 31251 ps.Please reduce the period of the constraint to remove this timing failure.
...

That means the minimum input clock frequency of the DCM is about 32 MHz (1 / 31251 ps ≈ 31.998976 MHz), while my input clock is only 24 MHz. I do not know whether the DCM will operate normally under this condition. The most direct check is to bring the 48 MHz clock out to a pin and observe it on a logic analyzer. My FPGA board will be ready in a few days; I will test it then and update this post.
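
The period-to-frequency conversion behind these numbers is sketched below; the helper name is made up, and the two periods are the ones quoted in the warning.

```python
def period_ps_to_mhz(period_ps):
    """Convert a clock period in picoseconds to a frequency in MHz."""
    return 1e6 / period_ps

dcm_min_mhz = period_ps_to_mhz(31251)     # maximum internal period limit from the warning
constraint_mhz = period_ps_to_mhz(42000)  # the 42 ns PERIOD constraint

print(f"DCM minimum input frequency: {dcm_min_mhz:.6f} MHz")
print(f"Constrained input frequency: {constraint_mhz:.3f} MHz")
print("Below DCM minimum:", constraint_mhz < dcm_min_mhz)
```

Note that the 42 ns constraint is a rounded-up 24 MHz period (the exact period would be about 41.667 ns); either way the input is below the limit.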

The following explanation is from a post I found on the Xilinx forum:
These warnings usually indicate that the period constraint specified in your design violates one of the Spec'ed operating ranges in the datasheet for that device (ie, faster than the Max Frequency of a DCM, etc.). This doesn't necessarily mean that it won't work, but it does mean that it's outside of the range that Xilinx has tested and guarantees. You might take a look in the datasheet for the device you're targetting to see if the frequency you're requesting is outside of the range for either it's driver or load.
