Tag Archives: assembly
Low level code / Assembly
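The C/C++ Low Level Curriculum blog series gives a brief introduction to the assembly code that common C++ constructs compile to (on x86). Note, however, that the author makes a fairly significant mistake: the stack-related assembly examples in the blog all use cdecl, yet the accompanying explanations all describe them as stdcall. The most distinctive feature of stdcall, which uses callee clean-up, is that the function returns with a ret xx instruction that pops the arguments off the stack as it returns, whereas cdecl, which uses caller clean-up, returns with a plain ret. The versions of GCC and MSVC I have on hand both default to cdecl. Looking at the actual code, even when both use cdecl, the assembly GCC generates differs somewhat from MSVC's. For example, MSVC, immediately upon entering a function, …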
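To make the ret versus ret xx distinction concrete, here is a minimal sketch that declares the same function under both conventions. The DEMO_CDECL and DEMO_STDCALL macros and the function names are hypothetical, introduced only for this illustration; the convention keywords only have an effect on 32-bit x86, so the macros expand to nothing elsewhere. The assembly in the comments is the typical shape of the code, not the exact output of any particular compiler version.

```cpp
// Hypothetical helper macros: pick the compiler-specific spelling on 32-bit x86,
// and expand to nothing on other targets where these conventions do not apply.
#if defined(_MSC_VER) && defined(_M_IX86)
#  define DEMO_CDECL   __cdecl
#  define DEMO_STDCALL __stdcall
#elif defined(__GNUC__) && defined(__i386__)
#  define DEMO_CDECL   __attribute__((cdecl))
#  define DEMO_STDCALL __attribute__((stdcall))
#else
#  define DEMO_CDECL
#  define DEMO_STDCALL
#endif

// cdecl: caller clean-up.
//   callee ends with:   ret
//   caller, after call: add esp, 8   ; drop the two 4-byte arguments
int DEMO_CDECL add_cdecl(int a, int b) { return a + b; }

// stdcall: callee clean-up.
//   callee ends with:   ret 8        ; return and pop the two 4-byte arguments
//   caller, after call: (no stack adjustment)
int DEMO_STDCALL add_stdcall(int a, int b) { return a + b; }
```

Compiling this for a 32-bit x86 target with assembly output enabled (for example `-m32 -S` on GCC, or `/FA` on MSVC) is one way to check which return instruction each function actually ends with.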
Memory ordering & memory barrier
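I was recently quite surprised to find that out-of-order execution on (symmetric multiprocessing) ARM gives almost no guarantees about the order of memory reads and writes (yes, I know this is old news). x86 is better in this respect, but it still has one rule: Loads may be reordered with older stores to different locations. The first thing that came to mind was the antique Peterson lock (nowadays mostly used to illustrate why processors need instructions such as test-and-set). The code is as follows: Sure enough, on both ARM and x86, as long as the machine has multiple cores, the program produces wrong results. The remedy is to use memory barrier (or memory fence) processor instructions to force an ordering on memory operations. In the program above, these are the two commented-out lines of assembly in the lock function; dmb is the ARM instruction and mfence is the x86 instruction. Although the Peterson lock has long been a relic, even today there are still those who stumble over this …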
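As an illustration of the fix described above, here is a minimal sketch of a two-thread Peterson lock. It is not the post's original code: instead of hand-written dmb or mfence instructions it uses C++11 std::atomic with the default sequentially consistent ordering, which leaves it to the compiler to emit whatever barriers the target needs. The names PetersonLock and run_counter_demo are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Two-thread Peterson lock. All atomic operations use the default
// std::memory_order_seq_cst, so the store to flag[self] and victim cannot be
// reordered after the loads in the spin loop (the reordering that breaks the
// plain-variable version on multi-core x86 and ARM).
struct PetersonLock {
    std::atomic<bool> flag[2];
    std::atomic<int> victim;

    PetersonLock() : victim(0) {
        flag[0].store(false);
        flag[1].store(false);
    }

    void lock(int self) {
        assert(self == 0 || self == 1);
        const int other = 1 - self;
        flag[self].store(true);   // announce interest
        victim.store(self);       // let the other thread go first on a tie
        while (flag[other].load() && victim.load() == self) {
            std::this_thread::yield();  // spin until it is our turn
        }
    }

    void unlock(int self) {
        flag[self].store(false);
    }
};

// Each of two threads increments a plain (non-atomic) counter under the lock.
// Returns the final counter value; with a correct lock it is 2 * iterations.
long run_counter_demo(int iterations) {
    PetersonLock lock;
    long counter = 0;
    auto worker = [&](int self) {
        for (int i = 0; i < iterations; ++i) {
            lock.lock(self);
            ++counter;
            lock.unlock(self);
        }
    };
    std::thread t0(worker, 0);
    std::thread t1(worker, 1);
    t0.join();
    t1.join();
    return counter;
}
```

Sequential consistency everywhere is the simplest choice that is clearly correct; weaker orderings with an explicit fence are possible but easy to get wrong, which is the same trap the post describes.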
Posted in Computer and Internet, Programming and Algorithm
Tagged assembly, parallel computing