My question is: did you use the RDTSC instruction to measure the performance of these two test cases? Here is the pseudocode: ... RaisePriorityToREALTIME EnterCriticalSection...
In the first example, I suggest you use a restrictive scope {} to ensure that the scope of vecAp, vecBp, vecCp is limited to the immediately following for loop. IOW, place { before the declaration of vecAp...
Quote: Nadav S. wrote: Hi, I noticed there are two popular ways of moving data into ymm registers when writing intrinsics. I'll use a simple vector addition example to clarify my question. Assuming...
Sergey Kostrov, thank you for your reply. I did not use the RDTSC clock to measure performance. My code runs on large vectors and the entire function is wrapped by a loop that runs for thousands of...
Hi jimdempseyatthecove, thanks for the advice about declaring the pointers inside the loop. What about the two options I stated in my question? Any thoughts about which one is better?
>>>I pasted the assembler code below and to me the two options look very similar>>> Those two vector addition operations written in a high-level language can, at the machine-code level, be...
>>>I did not use the RDTSC clock to measure performance>>> Those two assembly loops contain almost the same instructions; the only difference is related to various general purpose...
Let me ask you a question -- how are you going to write more complex code using pointers? Where will you store intermediate values? Your example is too simple to understand the differences in writing...
Thanks for the assembler code. >>...I pasted the assembler code below and to me the two options look very similar even in assembler (both have 6 commands inside the loop), so I still...
Looking at the generated code, you find some subtle differences that lead to different loop sizes. 1st) 000000013F5A1090 vmovaps ymm0,ymmword ptr [rbx+rax] vecAp++;...